
Coding agents struggle to do AI research. Our analysis of the NanoGPT speedrun shows they rarely attempt ambitious ideas, opting for hyperparameter tweaks instead. We're releasing our framework as a benchmark here! x.com/intology/statu…
Stay tuned 🤫 x.com/classiclarryd/…