A penguin with a jetpack: can scale just save us all?

Published: January 18th, 2026

One of the big revelations in the history of AI was Richard Sutton's bitter lesson. In essence, it states that instead of teaching machines how "we think we think", we should build universal programs capable of learning and improving by themselves. Many dead ends in AI were the result of smart people coming up with clever algorithmic ideas that improved performance on a particular problem (image captioning, translation, etc.). Unfortunately, none of these generalized to other tasks, and so they never constituted progress towards a general form of intelligence. Hans Moravec pointed out something similar as early as 1998 when he stated that "the performance of AI machines tends to improve at the same pace that AI researchers get access to faster hardware". He also quite accurately extrapolated this outwards and predicted "that the required hardware will be available in cheap machines in the 2020s" to match the human brain.

Scale is the fundamental assumption of modern AI: make your model bigger and its performance will improve across tasks, with entirely new capabilities emerging along the way. We can see glimpses of this even in AI4Science, where Apple recently showed that you can essentially recover >90% of AlphaFold's performance by deleting all of the complicated, really clever architecture and replacing everything with really large transformers. Modern hyperscalers operate at the edge of data scaling (the oft-forgotten other half of the Chinchilla scaling laws) and more or less at the edge of parameter scaling. We have scaled inference time to what can reasonably be expected from a user (or to yield any additional gain), with the exception of multi-agent systems (MAS), which also scale inference time by distributing it across multiple agents and which we are still very much in the process of scaling. And we are still scaling RL (or "self play") wherever possible.
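To make the "other half" of scaling concrete: the Chinchilla result is often summarized by two rules of thumb, training compute C ≈ 6·N·D FLOPs (for N parameters and D training tokens) and a compute-optimal ratio of roughly 20 tokens per parameter. A minimal sketch under those approximations (not the exact fitted law from the paper):

```python
import math

def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Split a FLOP budget into (params N, tokens D) under the rough
    approximations C = 6 * N * D and D = 20 * N."""
    # Substituting D = 20*N into C = 6*N*D gives C = 120 * N^2,
    # so N = sqrt(C / 120) and D = 20 * N.
    n_params = math.sqrt(compute_flops / 120)
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Chinchilla's own budget of ~5.76e23 FLOPs lands near the paper's
# 70B-parameter / 1.4T-token configuration.
n, d = chinchilla_optimal(5.76e23)
```

Plugging in a budget near Chinchilla's own recovers roughly 70B parameters and 1.4T tokens, which is the sense in which frontier labs are now data-bound: doubling parameters without doubling tokens wastes compute.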

With enough power, everything lifts off. Continuing this aeronautic metaphor: the question today is whether we have moved beyond designing bird-like wing-suits for humans to engineering jet-powered airplanes, or whether we have strapped a jet-pack to a penguin and are now cheering that it can finally fly.

For now we just know that it flies. Of course, at the beginning everything feels like a penguin with a jetpack; iteration 0 is never perfect. Perhaps the more concrete questions should be: how much bigger can that jet-pack get? What are the fundamental capability limits of the penguin (if any such limits even exist)? This relates to a question posed by AG Wilson that I have been thinking about anyway: if sufficiently large and flexible models (with a soft simplicity bias) can learn the symmetries of scientific problems by themselves, should we just let them learn everything independently, or should we still hard-encode what we know about the world?

For now I remain impressed by how much training on the entire internet can give you in terms of capabilities. But if you combine all the knowledge of Google with a black box capable of acting on that information, you should expect that black box to be very, very capable. That does not prove it is amazingly good at anything beyond the large (but limited) information pool it has at its disposal.

I for one enjoy using LLMs for coding a lot. But I also let them generate my recipes, simply because they come without ads. I could have found the source the recipe was scraped from just as easily, but because that website is not VC-subsidized, it carries ads. Similarly, I enjoy LLMs for transcribing PDFs into text, for translation, or even for spell checking. Not because there was no comparable software prior to AI, but because this current solution does it all at once and comes without ads or friction.

It has not had this impact on my scientific work so far. Yes, it helps me generate much better code; I have become both much more efficient (doing more) and much more capable (doing new things) thanks to it. I have also become a much lazier thinker and more accepting of sloppy work. But I have yet to use it productively for my wet-lab work, for coming up with orthogonally creative ideas, or for long-term planning that goes beyond platitudinous advice. Much is still left to be desired, and one wonders whether a bigger jet-pack is the answer to it all.
