May 2025
(Editor’s note: even by normal standards, there was a lot of ML this month.) One takeaway is that I need to carve out some evenings to understand what the alignment/interp teams at Anthropic are currently trying to do.
- https://huggingface.co/spaces/jane-street/puzzle a call to adventure
- https://www.anthropic.com/news/tracing-thoughts-language-model anthro research
- https://www.anthropic.com/research/reasoning-models-dont-say-think more anthro research
- https://transformer-circuits.pub/2025/attribution-graphs/biology.html even more anthro research
- https://transformer-circuits.pub/2025/attribution-graphs/methods.html even even more anthro research
- https://www.anthropic.com/research/values-wild even even even more anthro research
- https://transformer-circuits.pub/2025/attention-update/index.html even even even even more anthro research
- https://www.anthropic.com/engineering/claude-code-best-practices anthro pragmatics
- https://www.benkuhn.net/pjm/ an anthro PM guide to PMing
- https://www.darioamodei.com/post/the-urgency-of-interpretability anthro CEO exhortations
- https://helentoner.substack.com/p/nonproliferation-is-the-wrong-approach a former OpenAI board member lays out her stall
- https://huyenchip.com/2025/01/07/agents.html an overview of agents
- https://talyarkoni.org/blog/2018/10/02/no-its-not-the-incentives-its-you/ exhortations to academia
- https://arxiv.org/abs/2312.06942 Redwood research on control
- https://www.alignmentforum.org/posts/kcKrE9mzEHrdqtDpE/the-case-for-ensuring-that-powerful-ais-are-controlled motivation for the above
- https://www.redwoodresearch.org/reading-list An AI control reading list
- https://arxiv.org/abs/2504.10374 A further Redwood control paper
- https://thingofthings.substack.com/p/against-and-then-for-internal-loci Ozy, as always, being smart and thoughtful
- https://dynomight.net/paper/ a guide to being productive when writing notes
- https://www.nybooks.com/articles/1983/08/18/report-from-a-besieged-city/ a poem
- https://blackbird-archive.vcu.edu/v7n2/nonfiction/levis_l/strange.htm an article on Zbigniew Herbert
- https://medium.com/airbnb-engineering/how-airbnb-measures-listing-lifetime-value-a603bf05142c Good DS
- https://pydantic.dev/logfire telemetry
- https://www.dwarkesh.com/p/questions-about-ai Dwarkesh is possibly the best placed person on earth to ask these qs