Discussion about this post

blake harper

You’re far too charitable, bro.

> “The more defensible forecast is that AI will make frontier research teams much more productive and may dramatically increase the number of experiments they can plan, code, and analyze”

I’m sorry, but I’m deep in this stuff and I have not seen evidence for that. Experiments are expensive; they can’t just run them willy-nilly on low-credibility guesses. Coding an experiment is rarely the bottleneck in R&D. OpenAI even admitted in the GPT 5.5 system card that the model is terrible when evaluated against their own internal RSI benchmark.

Their whole IPO depends on convincing the Street that RSI is imminent, because it’s the only way they maintain pricing power in a world where open source just commoditizes it all. That’s why Jack has to parrot this stuff about RSI, because from what I hear their S-1 prep is not looking good.

For those who want an even more skeptical take that looks at the automation opportunity within each phase of the model development lifecycle, I wrote about that here: https://tailwindthinking.substack.com/p/the-ai-bubbles-favorite-fairy-tale

I know it’s fashionable to assign determinate probabilities to this stuff, but I truly don’t understand how this is anything but speculation. It’s not epistemically responsible to assign low probabilities to events that involve compounding uncertainties and assumptions. Better to just say “we don’t yet have reason to believe this is possible in any determinate way” and leave it at that. Assigning a determinate probability means you have some reason to believe the obstacles will be cleared; and if you do, you owe us an account of precisely how that 1-in-10 scenario would come about.

Sam Tobin-Hochstadt

I wish people had an actual definition of RSI that we could check. If it’s “beat the human baseline on PostTrainBench,” then we might well reach that by 2028. If it’s “Opus 7 could create Opus 8 successfully and it would be better than 7,” then how would we ever tell, since Anthropic is obviously not going to try that out? If it’s “AI makes (some parts of) AI research 10x faster,” then probably that’s already happened.
