May 2024
Writing/meta-professional
- https://fellowship.rootsofprogress.org/programs/ summer program for writers 🧿
- https://vickiboykis.com/2024/02/28/gguf-the-long-way-around/ boykis post
- https://www.cell.com/cell/fulltext/S0092-8674(24)00304-0 article on choosing the way you spend your time
- https://www.lesswrong.com/posts/KgFNwBuaDpfGSJktM/a-dozen-ways-to-get-more-dakka more dakka
- https://danfrank.ca/things-i-tell-myself-to-be-more-agentic/ as in url
ML
- https://arxiv.org/abs/2405.00208 A Primer on the Inner Workings of Transformer-based Language Models
- https://www.youtube.com/watch?v=Bg1LQ_jWliU mamba video
- https://www.quantamagazine.org/how-do-machines-grok-data-20240412/ popsci grokking thing
- https://blog.mozilla.ai/exploring-llm-evaluation-at-scale-with-the-neurips-large-language-model-efficiency-challenge/ boykis evals explanation, using PEFT and mistral/llama
- https://www.deeplearning.ai/short-courses/getting-started-with-mistral/ guide to using mistral
- https://www.kaggle.com/code/awsaf49/birdclef24-kerascv-starter-train birdclef contest 🧿
- https://www.kaggle.com/automl-grand-prix automl grand prix 🧿
- https://github.com/InternLM/xtuner peft framework
- https://www.climatechange.ai/events/summer_school2024 - climate change ai! 🧿
- https://podcasts.apple.com/us/podcast/what-if-dario-amodei-is-right-about-a-i/id1548604447?i=1000652234981 amodei v klein
- https://www.youtube.com/watch?v=WaQlGeVa0-c connor leahy ai podcast
- https://github.com/mistralai/mistral-common mistral tooling
- https://github.com/google-research/tuning_playbook google's ml tuning advice
- https://arxiv.org/abs/2404.19756 KAN networks
- https://arxiv.org/abs/2306.03819 LEACE - alternative to the rimsky refusal ablation thing
- https://github.com/stas00/ml-engineering mle open book
- https://github.com/GistNoesis/FourierKAN/blob/main/fftKAN.py KAN extension
- https://arxiv.org/abs/2402.04362 eleuther paper
- https://ai.stanford.edu/~kzliu/blog/unlearning machine unlearning
- https://colab.research.google.com/drive/1ieDJ4LoxARrHFqxXWif8Lv8e8aZTgmtH local RAG
- https://arxiv.org/abs/2312.04709 how to guess a gradient
- https://arxiv.org/abs/2403.19647 sparse feature circuits paper (c.f. https://features.baulab.info/)
- https://transformer-circuits.pub/2023/monosemantic-features/index.html#setup-interface the monosemanticity paper
- https://transformer-circuits.pub/2022/toy_model/index.html#strategic-ways-out the toy models of superposition paper
- https://transformer-circuits.pub/ circuits 🧿
- https://www.lesswrong.com/posts/DtdzGwFh9dCfsekZZ/sparse-autoencoders-work-on-attention-layer-outputs applying SAEs to attention layer outputs
- https://www.lesswrong.com/posts/rZPiuFxESMxCDHe4B/sae-reconstruction-errors-are-empirically-pathological counter to SAEs, about the errors
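The SAE links above all rest on the same forward pass, which can be sketched in a few lines of numpy — a minimal untrained toy, with illustrative dimensions, not any particular paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 8, 32  # overcomplete dictionary: more features than model dims

W_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
b_enc = np.zeros(d_sae)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)
b_dec = np.zeros(d_model)

def sae(x):
    """One SAE forward pass: sparse ReLU feature code, then linear reconstruction."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)  # feature activations (nonnegative)
    x_hat = f @ W_dec + b_dec               # reconstruction of the input
    return x_hat, f

x = rng.normal(size=(4, d_model))           # stand-in for cached model activations
x_hat, f = sae(x)
mse = ((x - x_hat) ** 2).mean()
l1 = np.abs(f).sum(axis=-1).mean()
loss = mse + 1e-3 * l1                      # reconstruction + sparsity penalty
```

The "reconstruction errors" post above is about the `x - x_hat` term: even when the MSE looks small, substituting `x_hat` back into the model can behave pathologically.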
- https://eugeneyan.com/writing/evals/ eugene on evals
- https://thegradient.pub/mamba-explained/ more MAMBA
- https://arxiv.org/ftp/arxiv/papers/2312/2312.00752.pdf the MAMBA paper
- https://arxiv.org/pdf/2403.00745 activation patching
- https://twitter.com/swyx/status/1769920689832972574 talk about claude 3 🧿
- https://www.youtube.com/watch?v=zduSFxRajkE 🧿 the goat karpathy talking about tokenizers
- https://www.youtube.com/watch?v=VMj-3S1tku0&list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&index=2 the goat karpathy talking about backprop
- https://jaykmody.com/blog/gpt-from-scratch/ 🧿 a thing of beauty
- https://nnsight.net/ transformerlens alternative
- https://arxiv.org/abs/2404.16014 improvement to SAEs
- https://arxiv.org/abs/2404.15255 handbook for activation patching
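The core loop of activation patching can be sketched on a toy two-layer net: cache an activation from a clean run, re-run on a corrupted input while substituting the cached value, and check how much of the clean output is restored. Everything here is illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 4)), rng.normal(size=(4, 1))

def forward(x, patch=None):
    """Toy two-layer net; optionally overwrite the hidden activation."""
    h = np.tanh(x @ W1)
    if patch is not None:
        h = patch  # patch in a cached activation from another run
    return (h @ W2).item(), h

clean_x, corrupt_x = np.ones(4), -np.ones(4)
clean_out, clean_h = forward(clean_x)
corrupt_out, _ = forward(corrupt_x)
# patched run: corrupted input, but the clean hidden activation
patched_out, _ = forward(corrupt_x, patch=clean_h)
```

In a real model you sweep the patch over layers and positions and score each site by how much of the clean behavior it recovers; here the hidden layer is the only site, so patching it restores the clean output exactly.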
- https://github.com/Giskard-AI/giskard evals framework
- https://arxiv.org/abs/2404.08801 megalodon
- https://huggingface.co/papers/2404.09173 feedback loops for arbitrarily long sequences
- https://twitter.com/lambdaviking/status/1780246351520887281 ssm dunk
- https://arxiv.org/pdf/1607.06450 OG layernorm paper
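The operation the layernorm paper introduces fits in a few lines — normalize each sample over its feature dimension, then apply a learned scale and shift (a minimal numpy sketch, not the paper's code):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    """Normalize each row over its features, then scale by gamma and shift by beta."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)  # eps avoids division by zero
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))
out = layer_norm(x, gamma=np.ones(8), beta=np.zeros(8))
```

Unlike batchnorm, the statistics are per-sample, so it behaves identically at train and inference time and doesn't depend on batch size.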
- https://www.3blue1brown.com/lessons/backpropagation 3b1b neural net explainer vids
- https://www.youtube.com/watch?v=dsjUDacBw8o&list=PL7m7hLIqA0hoIUPhC26ASCVs_VrqcDpAz&index=4 nanda transformer walkthrough
- https://www.youtube.com/watch?v=KV5gbOmHbjU nanda transformer maths walkthrough
- https://websim.ai/ weird thing
- https://arxiv.org/abs/2402.19427 griffin, a mamba competitor
- https://jalammar.github.io/illustrated-transformer/ explanation of transformers
- https://transformer-circuits.pub/2021/framework/index.html circuits explainer
- https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=_Jzi6YHRHKP1JziwdE02qdYZ circuits explainer 2
- https://blog.eleuther.ai/rotary-embeddings/ eleuther cooking about transformer embeddings
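The rotary embedding trick from the eleuther post can be sketched directly: treat the feature vector as pairs of coordinates and rotate each pair by a position-dependent angle, so that dot products between queries and keys depend only on their relative offset. A minimal "rotate-halves" numpy version, not eleuther's code:

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Apply rotary position embeddings to x of shape (seq, d), d even.

    Pair i is (x[:, i], x[:, i + d/2]), rotated by positions * base**(-i / (d/2)).
    """
    seq, d = x.shape
    half = d // 2
    freqs = base ** (-np.arange(half) / half)     # per-pair rotation frequencies
    angles = positions[:, None] * freqs[None, :]  # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

rng = np.random.default_rng(0)
q = rng.normal(size=(1, 8))
k = rng.normal(size=(1, 8))
```

The key property: `rope(q, [m]) · rope(k, [n])` is a function of `n - m` only, which is why attention scores become translation-invariant in position.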
- https://www.oreilly.com/library/view/ai-engineering/9781098166298/ new chip huyen book
- https://www.kaggle.com/code/awsaf49/aimo-kerasnlp-starter AIMO example notebook
- https://www.kaggle.com/code/awsaf49/aes-2-0-kerasnlp-starter AES 2.0 (automated essay scoring) example notebook, same author
- https://www.anthropic.com/research/probes-catch-sleeper-agents - anthropic research note
- https://huggingface.co/datasets/HuggingFaceFW/fineweb enormous text data set
- https://arxiv.org/abs/2404.14219 the phi3 paper
- https://martin.zinkevich.org/rules_of_ml/rules_of_ml.pdf ml best practices 🧿
- https://www.answer.ai/posts/2024-04-26-fsdp-qdora-llama3.html jeremy howard new PEFT
- https://arxiv.org/abs/2312.01037 eleuther linear probes
- https://twitter.com/p_nawrot/status/1783812669251731851 project idea
- https://docs.google.com/document/d/e/2PACX-1vQD8IlBotGdBxp3BnXkSjk8bNZlPV_0EH9ZA6wHd5dNf-BLSiwXUinvgv8ZoBEnNyTCF-chWO30NRw0/pub AGI big doc
- https://arxiv.org/abs/2404.15702 weird new model
- https://arena3-chapter1-transformer-interp.streamlit.app/ ARENA work (c.f. the MATS course)
- https://www.alignmentforum.org/posts/64MizJXzyvrYpeKqm/sparsify-a-mechanistic-interpretability-research-agenda sharkey’s idea for mech interp progress
- http://incompleteideas.net/book/RLbook2020.pdf book on RL
- https://arxiv.org/pdf/2203.11355 origami in N dimensions
- https://www.alignmentforum.org/posts/7fxusXdkMNmAhkAfc/finding-sparse-linear-connections-between-features-in-llms - older SAE work
- https://www.neelnanda.io/mechanistic-interpretability/prereqs - MATS prep - a guide to mech. interp. pre-reading
- https://www.neelnanda.io/mechanistic-interpretability/getting-started - an improvement on the above
- https://www.lesswrong.com/posts/5spBue2z2tw4JuDCx/steering-gpt-2-xl-by-adding-an-activation-vector - steering vectors
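The steering-vector recipe is simple enough to sketch: take the difference of mean activations between two contrasting prompt sets, then add a scaled copy of that vector into the residual stream at inference. A toy numpy version with synthetic "activations" standing in for cached model states:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
# hypothetical cached residual-stream activations for two contrasting prompt sets
pos_acts = rng.normal(loc=1.0, size=(32, d))
neg_acts = rng.normal(loc=-1.0, size=(32, d))

# steering vector = difference of mean activations (the diff-of-means recipe)
steer = pos_acts.mean(axis=0) - neg_acts.mean(axis=0)

def steered(resid, coeff=3.0):
    """Add the scaled steering vector to a residual-stream activation."""
    return resid + coeff * steer
```

The coefficient is the main knob: too small does nothing, too large degrades coherence, so in practice it's swept per layer and behavior.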
- https://www.lesswrong.com/users/technicalities - a review of current agendas in alignment research, from 5 months ago.
- https://www.lesswrong.com/posts/fJE6tscjGRPnK8C2C/decoding-intermediate-activations-in-llama-2-7b - the pre-work for the current refusal and sycophancy steering work
- https://www.lesswrong.com/posts/zt6hRsDE84HeBKh7E/reducing-sycophancy-and-improving-honesty-via-activation the sycophancy work from above (c.f. LEACE from Belrose)
- https://graphcore-research.github.io/posts/gemma/ guide to end-to-end transformers with RoPE
DS
- https://dtkaplan.github.io/Lessons-in-statistical-thinking/ stats book
- https://www.andrewheiss.com/blog/2024/03/21/demystifying-ate-att-atu/ causal inference
- https://www.fharrell.com/post/assume/ stats assumptions
- https://avehtari.github.io/ActiveStatistics/index.html stats book
- https://m-clark.github.io/book-of-models/ stats models book
- https://talks.andrewheiss.com/2024-04-25_ksu-bayes/ OLS extensions
- https://ecogambler.netlify.app/blog/interpreting-gams/ GAMs
SWE
- https://martinheinz.dev/blog/110 zsh config stuff (for my dotfiles, which i should update)
- https://pyinfra.com/ pyinfra stuff
- https://arxiv.org/pdf/1804.06826 citadel volta microarch paper
- https://loglog.games/blog/leaving-rust-gamedev/?ref=thediff.co thing about rust gamedev