
AI’s Exponential Rise Defies Bubble Narrative

Key Points

  • Humans consistently misjudge exponential growth, so we tend to dismiss rapid AI advances—just as we downplayed COVID’s spread—because day‑to‑day changes feel normal.
  • Julian Schrittwieser (formerly of AlphaGo and MuZero, now at Anthropic) argues that internal data shows AI productivity could increase ten‑fold within 18 months, with frontier labs seeing no sign of a slowdown, making “bubble” claims essentially unfounded.
  • Expert forecasts often miss exponential curves (e.g., solar‑panel installations) because intuition is anchored to linear trends, leading credentialed skeptics to underestimate transformative technologies.
  • There’s a substantial information gap: AI researchers inside companies observe fast, detailed progress, while outsiders see only surface‑level imperfections and assume development is lagging.
  • Highlighting Julian’s podcast and essay is crucial, as it offers a rare view into these internal metrics and counters the prevailing narrative that AI’s growth is stagnant.

Full Transcript

**Source:** [https://www.youtube.com/watch?v=SW1s22kJ15g](https://www.youtube.com/watch?v=SW1s22kJ15g)
**Duration:** 00:20:14

## Sections

- [00:00:00](https://www.youtube.com/watch?v=SW1s22kJ15g&t=0s) **Exponential Blindness and AI Bubble** — The speaker contends that, as with COVID‑19, people misjudge AI’s trajectory by fixating on present imperfections rather than its exponential growth, citing Julian Schrittwieser’s claim of a 10× productivity surge within 18 months and declaring the “bubble” narrative misguided.
- [00:04:58](https://www.youtube.com/watch?v=SW1s22kJ15g&t=298s) **Beyond Hype: AI Progress Metric** — The speaker argues that, contrary to bubble fears, concrete evidence of AI advancement lies in the growing number of work hours AI systems can reliably operate, a metric Julian identifies as economically meaningful and difficult to game.
- [00:10:17](https://www.youtube.com/watch?v=SW1s22kJ15g&t=617s) **Beyond Leaderboards: Measuring Real AI Capability** — The speaker criticizes optimizing for public benchmarks, citing Goodhart’s law, and argues that evaluation should prioritize agents’ ability to perform useful, real‑world work over leaderboard scores, while noting that reinforcement learning grounded in massive amounts of human text remains a viable growth path.
- [00:15:38](https://www.youtube.com/watch?v=SW1s22kJ15g&t=938s) **Julian's AI Timeline & Deployment** — Julian predicts AI will work full eight‑hour days by mid‑2026, match human experts by year‑end and surpass them by 2027, urging firms to deploy AI strategically on tasks where humans are weakest, as illustrated by solo founders leveraging AI to fill their gaps.

## Full Transcript
What if we're making the same mistake today with AI that we made with COVID in 2020, looking at today's imperfections instead of exponential growth rates? Julian Schrittwieser, formerly of AlphaGo and MuZero and now at Anthropic, says the data shows we have 18 months until 10x productivity, and the frontier labs are not seeing any slowdown whatsoever. In other words, the bubble is fake news. Let's get into what he said in his hour-long podcast and the even better essay he wrote last month. I'm going to give it to you very quickly in just a few minutes, and I'm going to give you my take as well. Let's jump right in.

Okay. First, why is the bubble narrative backwards? Julian's argument is that humans are very bad at understanding exponential growth. When something doubles regularly, we consistently fail to grasp what's coming, because today does not feel any different from yesterday. And so bubble skeptics will see AI continuing to make mistakes on tasks and conclude it will never work, or, more recently, will see multi-hour work from AI agents, inconceivable at the beginning of 2025, and say we're not making fast enough progress. This is the same cognitive error, Julian argues, that led people to dismiss COVID as just a flu even as case counts were doubling every week. We humans tend to extend the patterns we see, and we are very bad at seeing exponentials. And this includes smart people, right? During any time of rapid transformation, simple math will often beat expert intuition. The joke is that the straight line on the graph will beat expert intuition, because experts tend to anchor on how fast things have changed in the past. A great example of this, one of my personal favorites, is the solar rate of install.
If you look at the graph of the solar rate of install, it is an exponential curve; it goes straight up. But if you look at the graph of projections for solar, it misses every time, because experts just cannot get an exponential correct. This leads to widespread skepticism from credentialed researchers whenever we're in an exponential situation, because they struggle to understand that we really are seeing extremely rapid progress; all of their expertise is calibrated to a different curve, a more normal curve, a more gradual rate of adoption. Julian argues that's just not what we're seeing with AI, and I would agree.

The last piece, I think, is the most important for his thesis. Julian agrees that from the outside it may look like slower progress, but from his position inside Anthropic, and as a longtime frontier AI researcher, it does not look like that. And so there's this tremendous information gap between what's visible from the inside and what's visible from the outside. That's why I wanted to do this summary for you: I don't think Julian's interview and post are getting enough attention. We need to understand, in a lot more detail, how AI companies see their internal measurements. So this can create a strange situation, right? People building these systems will talk about explosive growth in metrics that matter, while outsiders are still debating: is this true, when will this be deployed, and so on. One example of this, which Julian mentions and which I think is very important, is how long AI agents are able to work without supervision. You'll remember that when Anthropic, the company Julian works at, launched Sonnet 4.5, they launched it with a claim that it built Slack in 30 hours, right? I talked about that.
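The solar-forecasting failure mode described above can be shown with a toy calculation. The numbers below are invented purely for demonstration, not real install data: extending the most recent increment linearly falls further behind a doubling process with every step.

```python
# Toy illustration (made-up numbers, not real solar data): why linear
# extrapolation systematically undershoots a process that doubles at
# a fixed rate.

history = [1, 2, 4, 8, 16]  # e.g., installs per year, doubling annually

# Linear forecast: extend the most recent year-over-year increment.
last_step = history[-1] - history[-2]                       # 8
linear_next3 = [history[-1] + last_step * k for k in (1, 2, 3)]

# Exponential forecast: extend the observed growth ratio.
ratio = history[-1] / history[-2]                           # 2.0
exp_next3 = [history[-1] * ratio ** k for k in (1, 2, 3)]

print(linear_next3)  # [24, 32, 40]
print(exp_next3)     # [32.0, 64.0, 128.0]
```

Three steps out, the linear forecast is already off by more than a factor of three, and the gap keeps compounding, which is the shape of the "experts miss the exponential" error.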
People tend to sort of wave their hands and say, well, what did the official test results say? Is it extending the horizon, etc.? And I think the point is really that you continue to see anecdotes like that, very, very long, multi-day runs, and even a few months ago you weren't seeing that. So the point isn't any given model's specific measure on a given test. The point is whether the AI tide is coming in, and how rapidly. And Julian's argument is that the tide is coming in really fast. It's silly to argue about whether we've hit a wall. We keep seeing these stories, from the inside in his perspective, and from the outside if we're looking, of better and better performance, particularly on things that matter a lot, like whether AI can operate autonomously. And that is the root of his perspective: internal researchers are seeing things in ways that external journalists and news anchors just don't see.

I think that's a really important take. And if we talk about why the AI bubble narrative is backwards, I think that's the root of it. If you are in San Francisco, if you visit San Francisco, if you talk to people who are on the cutting edge of AI research, none of them are saying that we are hitting a wall. And it's not just the people who have gigantic piles of equity, like Sam Altman, who hold that position, who would have the incentive, right? It is people who are just researchers, like Julian, who are simply trying to understand how to make systems work better. They're also saying we're not hitting a wall. And it's very odd for someone like me to be in a position where I talk a lot to people on the outside who don't understand that, and who ask me frequently: is there an AI bubble, Nate?
I've got to say, I see the same thing Julian sees. I don't see evidence of a bubble. And I'm really glad he did this podcast and this blog post, because I think he articulates it really well.

Let's dive in, though, to what isn't hype. What is the evidence that suggests we are not in a bubble and that AI continues to progress? I think that often gets lost in arguments over how much we're spending on data centers, how much we're spending as a percentage of capital expenditure at big companies, how much electricity we'll need. Let's leave that to the side for now and look at evidence of progress in AI models. Julian's argument here is that what I referred to previously, how long AI can get work done, is actually the core metric for measuring whether we're making meaningful progress. I think that's really interesting for two reasons. One, it means that maybe the major model makers are converging on a metric that is not easily gamed. That would be great news, because we've had a lot of metrics where you get up to 90 or 95% on this test or that test and everyone sort of rolls their eyes. SWE-bench is an example of that. People are arguing over ARC-AGI all the time and how to make it harder so you can't game it as easily, math tests, and so on. So the breakthrough insight is really that there is a correlation between what matters economically and the number of hours AI can work. If AI can work longer, Julian argues, then we will get more value from AI. And so it's not just whether you can delegate work; it's whether you can delegate work and effectively get an answer, not just for quick responses but for long-term work.
METR, the organization Julian cites as the bar for measuring this, has tracked that we have gone from handling 15-minute tasks to 2-hour tasks in just 7 months. That's how fast things are moving. And yes, is the 30-hour task I described about rebuilding Slack an outlier compared to two hours? It absolutely is. But the overall tide is coming in, and we've moved from 15 minutes to 2 hours very quickly. And so Julian's argument here is essentially that the number of hours AI can work autonomously is so tightly correlated with economically useful work that we should view them as the same metric and not use other metrics. Now, he doesn't go so far as to say we shouldn't use other metrics; I'm going so far as to say I don't think a lot of other metrics are useful. And I think he's correct here.

What separates real progress from bubble hype is that we can prove, with an independent organization like METR, that that duration has doubled roughly every 7 months for a while, and that we are on a 7-month doubling track going forward. It's not just after-the-fact spin. It's a forecast that has stood since the beginning of the year. I remember talking with researchers and others in the AI space who were saying back in January: we are seeing a doubling curve every six or seven months for AI on autonomous tasks, and we expect it to continue through 2025 into 2026. That is a falsifiable prediction, which is a big deal, because you can say, well, it didn't come true. In this case they made the call and it has come true. We are seeing exactly what they predicted: the ability of autonomous agents to do work is extending and extending and extending. And by the way, this is not just about Claude.
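Taking the figures in this section at face value (a 15-minute starting horizon and a 7-month doubling time, both as cited in the video; this is back-of-envelope arithmetic, not METR's actual methodology), the projection is simple to write down:

```python
import math

# Assumed figures from the video: 15-minute horizon, doubling every 7 months.
start_minutes = 15
doubling_months = 7

def horizon_after(months: float) -> float:
    """Projected task horizon in minutes after `months` of steady doubling."""
    return start_minutes * 2 ** (months / doubling_months)

def months_to_reach(target_minutes: float) -> float:
    """Months of steady doubling until the horizon reaches `target_minutes`."""
    return doubling_months * math.log2(target_minutes / start_minutes)

print(horizon_after(21))        # three doublings: 120.0 minutes (2 hours)
print(months_to_reach(8 * 60))  # 35.0 months from 15 minutes to an 8-hour day
```

The arithmetic is the whole point of the "straight line on the graph" argument: a fixed doubling time turns 15-minute tasks into full workdays in a small, countable number of doublings.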
I know Julian works at Anthropic, but Codex is also capable of very long-running workloads. I saw a story on X where Codex was asked by a researcher to do a 60-hour task and was able to work autonomously for 60 hours. Again, just one anecdote. I'm not saying it's going to score 60 hours on the METR test, which technically measures the human-equivalent amount of time, not just the AI amount of time, but the point is that the tide is coming in, right? We see longer and longer autonomous tasks becoming something AI can do.

Now, a skeptic would argue: look, Nate, you just talked about SWE-bench. You talked about how you can game tests. Maybe AI labs have optimized for this test. But OpenAI just released a completely different evaluation called GDPval, which is 1,300-plus real work tasks across 44 different professions, graded by very experienced professionals, not by OpenAI, and the graders could not tell whether they were rating human or AI work. It was a double-blind test, and the result was the same pattern of exponential improvement on an entirely made-up exam that did not exist when the models were trained. In other words, two independent measurement systems, one created entirely after the models it was measuring, showed the same doubling pattern. You're measuring something real. And that's sort of Julian's argument. Julian actually specifically praised OpenAI for publishing GDPval even though Anthropic's model is the one that did the best, right? And Julian rightly said that's a sign of integrity from the OpenAI team. They're not trying to trumpet their own model and say it did the best. They're saying that in this case Opus 4.1 did the best on that test. I'm sure, you know, wait a month and we'll see.
Now, you might think, having heard me talk about benchmarks a lot, that what I'm saying is we should trust benchmarks, but I've said pretty clearly that benchmarks are not the thing to pay a lot of attention to. So what's the solution there? Well, the answer is that you want to pick benchmarks that are not easy to game. That's why I've called out GDPval and the METR horizon, and Julian calls both of those out as well. It is absolutely possible to optimize for public leaderboards on tests that aren't real, tests that are easy to game, and that is what we see in the real-world performance on GDPval of Grok 4 and Gemini 2.5 Pro, both of which have topped many public leaderboards and both of which perform poorly on GDPval's real-world tasks. It's an example of Goodhart's law, right? When you optimize for a metric, you game the metric instead of building the capability. So what Julian is arguing for is that we want to build meaningful capabilities, measure them in a meaningful way, and show that they do real work.

If we step back and look at the whole measurement conversation, I think where this leads is that we need to be having much simpler conversations about measurement. We need to be talking about whether agents can do useful work much more than we do, and much less about whether a model has scored a one or a two or a three or a four on some public leaderboard somewhere, because it doesn't matter. And I'm really grateful to Julian for saying it, because it's really true. Overall, across the technical side of things, Julian also doesn't see a wall. And he talks about it in a fair bit of detail.
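Goodhart's law, as applied to leaderboards above, is easy to demonstrate in miniature. Everything below is made up for illustration: a "model" tuned to a leaked, fixed benchmark scores perfectly on it while failing unseen tasks, whereas a model with the underlying capability scores well on both.

```python
# Toy Goodhart's-law demo (entirely synthetic): the task is "given x, answer
# 2x". One model memorizes the public benchmark; the other learned the rule.

public_benchmark = [(x, x * 2) for x in range(10)]    # leaked, fixed tasks
real_world = [(x, x * 2) for x in range(100, 120)]    # unseen tasks

memorized = {x: y for x, y in public_benchmark}       # benchmark-tuned "model"

def gamed_model(x):
    return memorized.get(x, 0)    # perfect on the leak, useless elsewhere

def capable_model(x):
    return x * 2                  # actually learned the capability

def accuracy(model, tasks):
    return sum(model(x) == y for x, y in tasks) / len(tasks)

print(accuracy(gamed_model, public_benchmark),  accuracy(gamed_model, real_world))
print(accuracy(capable_model, public_benchmark), accuracy(capable_model, real_world))
```

The two models are indistinguishable on the public leaderboard (both score 1.0) and completely distinguishable on held-out work, which is exactly why an eval like GDPval, built after training on unseen tasks, carries more signal.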
He talks about the idea that you can pursue reinforcement learning and continue to grow by drawing on massive amounts of human-written text. And that's true, and it's really important for forming good models. One of the things he calls out is that this body of excellent human text, like scientific papers or high-quality books, enables Anthropic in particular to pursue pre-training that gives both efficiency and safety benefits. That's a fancy way of saying that if you start with a strong, clean corpus of human knowledge, you don't necessarily have the contamination issues you get if you, say, throw Reddit in there, which is something that has actually largely been removed from a lot of the frontier models. People don't know that, but Reddit has mostly been purged out, because it didn't end up being a high-quality source of data. It's not completely gone, but it's something like 1 to 2% of the total.

Another technical point he made that I think is really significant: he went back to his own experience building machine learning systems in 2016. He was part of the AlphaGo project, which built at Google an AI system that could play Go, a game harder to play than chess. And there's a famous story, if you're familiar with the history of AI, called move 37. Julian talks about it because it's important for us today. At the time, AlphaGo was trying to learn to beat the best human players, and it had not yet done so. And then, at one point in a real-life game with a real-life Go master, it played a 37th move that was considered a mistake by all of the commentators. It violated basic strategy, and everyone thought it was a disaster.
But later, as the game progressed, the masters realized that AlphaGo understood the game on a level they didn't, and that the move was part of what led to the game being won. The masters then realized that move 37 was actually brilliant. So Julian talks about this idea of a move 37, and what he suggests is that as we extend the length of time agents can work, the amount of useful work they can do, we are going to get to a move-37 moment for AI. We don't know going in what that will be; we inherently can't. But it is something we need to be prepared for. We should expect something like a Nobel Prize-level scientific breakthrough, in 2027 or 2028, somewhere in there, where AI can search a solution space faster and more effectively than human intuition, and we get something world-changing that would not have been possible at all without AI.

The last technical piece I want to talk about is this idea of an implicit world model, and Julian talks about that a fair bit. One of the things I think is really significant is that modern LLMs do something similar to the AlphaGo system I described: they predict consequences well enough to plan multi-step solutions. One of the things Julian talks about in the podcast is that if you can plan multi-step solutions effectively enough, as early AI did with AlphaGo, planning moves as chess engines did back in the day, by the 1990s with Deep Blue, right, the suggestion is that LLMs are doing the same thing. If they're able to plan multi-step solutions autonomously as agents and conduct work for many hours, at that point it's not really just next-token prediction.
It's using predictions to search possible action sequences and then to plan strings of moves. I do think gaming as a metaphor helps here, because you can actually see the strings of moves ahead in AlphaGo or in chess. And it's a similar thing, right? The Go board and the chess board both depend on the right moves in the right sequence, and understanding those sequences helps you unlock strategy. That's what we learned as we built AI systems that can play those games. In a similar way, we are now learning that LLMs get good enough at next-token prediction that they can predict possible action sequences as a whole. And that's a significant shift we are starting to see, one I think is really going to come out in 2026.

Okay, let's look at what the timeline looks like here. One of the things Julian is big on is falsifiable claims; he wants to make claims that can be proven wrong and put his own skin on the line. I'm just going to say them back and let you think about them. He's predicting AI will be working full 8-hour days without human intervention by mid-2026, will be matching human expert performance across many industries by the end of 2026, and will be routinely exceeding human experts by the end of 2027. So what he's saying is that in the next 12 to 30 months we can prove him wrong or prove him right. And really, all he's doing is drawing those straight lines on a graph. So if we ladder up, what does this mean? Number one, it means we need to think very carefully about where we want to deploy AI systems.
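The prediction-plus-search idea from a moment ago can be sketched in miniature: a hand-written transition model standing in for a learned world model, plus brute-force search over short action sequences. The actions, states, and goal here are all invented for illustration; real systems use learned predictors and far smarter search (Monte Carlo tree search, in AlphaGo's case), so this is only a toy demonstration of the principle that prediction enables multi-step planning.

```python
from itertools import product

# Toy "world model": predicts the consequence of each action on an integer
# state. The actions ("double", "inc") and the task are made up.
def predict(state: int, action: str) -> int:
    return state * 2 if action == "double" else state + 1

def plan(start: int, goal: int, max_depth: int = 5):
    """Search short action sequences, simulating each with the predictive
    model, and return the first (shortest) sequence predicted to hit goal."""
    for depth in range(1, max_depth + 1):
        for seq in product(["double", "inc"], repeat=depth):
            state = start
            for action in seq:
                state = predict(state, action)   # simulate, don't act
            if state == goal:
                return list(seq)
    return None

print(plan(1, 6))  # ['double', 'inc', 'double']: 1 -> 2 -> 3 -> 6
```

Nothing here ever "acts" in the world; the plan is found purely by rolling the predictive model forward, which is the sense in which good-enough prediction stops being mere next-token guessing and becomes planning.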
I want to suggest that you think about this as: where are you strong on a task versus where are you weak, because AI is going to be at a point very soon where it can pick up the things you are weak at. I already see this with solo founders. They're able to have AI cover their weak spots in ways that solo founders have traditionally just had to suffer through. And I think that's going to be true for other workers as well. Another point I think is important: we need to start deciding how we want to talk and think about the idea that 10x more work can be done, because it's not at all clear that the 10x more work will only be valuable to a few people. We can choose to be 10x more productive for ourselves. We can choose to jump in where we want to and excel at our skill sets in ways that were not possible before, because we have AI assistance.

And so I think one of the things that is hard to do is talk about what this means for society. But one thing that's easy to do, that you and I can do right now, is talk about what it means for us. Right? If you're a business owner, what does it mean to think about how you empower your employees and help them do 10x more? If you're an employee, what does it mean to spread your wings, to sort of have a mech suit and do a whole lot more than you could before, and have confidence in that? Because at root we come back to the idea that you play to your strengths; you're going to have expertise that is really, really deep in a particular area. And, as Julian argues and I agree, it is the human-AI collaboration around that deep expertise that unlocks value. And so I don't see a world of wholesale replacement.
And I think it's interesting that this researcher at Anthropic is not arguing for wholesale replacement either. He sees a world of human-AI collaboration as well. But we need to get ready for that, right? We need to think about what that means. I say this all the time, but the preparation window for all of this is closing fast. If you can draw straight lines on a chart and they're going up, we have to get ready for this now. And that's one of the things I talk about all the time on this podcast. You have to get ready now. There's not another time; it will not get easier if you wait six months.

If I can leave you with one thing about this whole AI bubble conversation, keep in mind that every cloud provider out there, Microsoft, Amazon, Google, all of them, is doing everything they can to get GPUs in the door to meet demand for AI. The demand is there. If the demand from businesses is that high, it is hard to argue that we are in a bubble. In fact, on October 28th, when Amazon laid off 30,000 workers, there was a direct line from that, not to AI and automation. People are going to claim it's all about AI automating your work away. No, it's a direct line to freeing up cash to buy more GPUs for more cloud compute, because the demand for AI from other companies is so great that the capital expenditure ratio at these big cloud companies is getting out of control. So they're trying to bring down their fixed costs, and that means salaries. It's not that AI is automating away roles while the existing people just get more stressed. It's that you need to free up cash in order to have money to buy GPUs to meet the demand for AI from businesses. That's not a bubble.
It is a genuinely hard time for the 30,000 people who were cut at Amazon, and there may be other stories like that as well. But I want to get the narrative really clear: that is not an AI automation story. That is a story about securing cloud compute to deal with surging demand. Companies are just making financial decisions to reduce fixed costs for the public markets ahead of the next quarterly report. And that's what companies do. So, summing all of this up: Julian doesn't think we're in a bubble. I don't think we're in a bubble either.