# Will Million‑GPU Clusters Arrive?

**Source:** [https://www.youtube.com/watch?v=GP4UrwbzLT8](https://www.youtube.com/watch?v=GP4UrwbzLT8)
**Duration:** 00:39:07

## Summary

- Industry leaders agree that a one‑million‑GPU cluster is unlikely to appear in the next three years, citing a forthcoming reset in ROI expectations that will drive more pragmatic scaling strategies.
- AI companies have historically chased scale by amassing ever more data and compute, a formula that has fueled massive growth in data‑center demand and a projected $250 billion in infrastructure spending by 2030.
- Experts warn that usable training data is approaching saturation, meaning larger models no longer benefit proportionally from additional data, prompting a shift toward higher inference‑time compute.
- This evolving landscape will shape discussions about the next wave of AI developments, including the role of agents and the path toward artificial general intelligence.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=0s) **Skepticism About Million‑GPU Clusters** - Experts on the Mixture of Experts podcast argue that a one‑million‑GPU cluster is unlikely within three years, foreseeing a reset in scaling expectations and a shift toward more rational, energy‑aware AI infrastructure.
- [00:03:08](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=188s) **Questioning the Scale-First Paradigm** - The speaker debates whether to keep relying on ever‑larger data and compute for machine learning—recognizing past successes and personal enthusiasm for the engineering challenges of scaling—while urging a reassessment of the motivations and limits of this approach.
- [00:06:12](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=372s) **Compact Distributional Learning for AI** - The speaker argues that AI should emulate animal-like efficient learning by building and updating compact causal distributions instead of relying on massive observational data, advocating a return to reinforcement‑learning–style approaches exemplified by AlphaZero and AlphaFold.
- [00:09:15](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=555s) **Debating the Limits of AI Scaling** - Panelists argue that merely increasing model size won’t sustain performance gains, questioning whether scaling has already hit a wall and if 2025 will mark the point where larger models cease to be the primary driver of AI advancement.
- [00:12:17](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=737s) **Brain Size, Architecture, and Human Intelligence** - A neuroscientist explains that human cognitive superiority stems from a distinctive mix of brain architecture, scaling, and environmental factors rather than sheer brain size.
- [00:15:23](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=923s) **Inference Compute Scaling & Overbuild** - The speakers argue that as businesses realize ROI from AI inference, demand will drive massive investment in inference hardware, leading to a temporary over‑building of data‑center capacity similar to early internet infrastructure.
- [00:18:25](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=1105s) **Integrating AI Agents into Products** - The panel debates hiring dedicated AI sales teams versus embedding AI agents directly into products, stressing seamless integration, realistic expectations, and the gap between current capabilities and lofty promises.
- [00:21:50](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=1310s) **Beyond Prompt: Building Controllable AI Workflows** - The speaker cautions that relying on lengthy prompts to direct AI agents is unsustainable and argues for explicit control points, system‑ and model‑level rules, and bounded autonomy to achieve robust, reliable real‑world automation.
- [00:24:53](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=1493s) **Practical LLM Agents for Workflow Automation** - The speaker argues that realistic, task‑specific LLM agents—not full AGI—will become increasingly useful for automating spreadsheet and other application workflows over the next year.
- [00:28:02](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=1682s) **AI Headlines, AGI Definition, and Progress** - The speaker argues that current AI hype eclipses the unclear definition of AGI, emphasizing incremental, domain‑specific integrations such as coding assistants while cautioning that true general intelligence remains distant and its development will not follow a simple linear trajectory.
- [00:31:25](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=1885s) **Debating AGI Timelines and Hype** - The speaker questions optimistic AGI predictions, contrasting genuine belief with possible marketing hype, using financial market analogies and referencing Anthropic’s Dario Amodei.
- [00:34:33](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=2073s) **Realist Concerns Over LLM Impact** - Panelists discuss the dangers of unrestricted LLMs, the debate over AI timelines, and how customers weigh adoption benefits against existential worries.
- [00:37:39](https://www.youtube.com/watch?v=GP4UrwbzLT8&t=2259s) **Responsible AI Strategy for Enterprise** - The speaker stresses that enterprises must adopt AI with robust protocols, safety measures, and governance, pressuring model providers to build responsible tooling rather than merely accelerating development.

## Full Transcript
Will we see a one million GPU cluster
opening up sometime in the next three years?
Kate Soule is a director of technical
product management at Granite.
Kate, welcome back to the show.
What do you think?
No, I really don't think so.
Anthony Annunziata is director
of AI Open Innovation.
Anthony, welcome to the show for the first time.
What's your take?
I don't think so either.
And then we've got a very special guest,
Naveen Rao is VP of AI at Databricks.
I think our first external
guest on Mixture of Experts.
Naveen, what do you think?
Unlikely.
Um, I think there will be a reset in
terms of, uh, expectations, ROIs, and
that's probably going to drive a little
more rationality into building this out.
All right.
All that and more on today's Mixture of Experts.
I'm Tim Hwang, welcome to Mixture of Experts.
Each week, MOE brings you the insights you need
to navigate the ever-changing, ever-unpredictable
world of artificial intelligence.
Today we're going to be talking about
2025, what the future holds for agents,
what the future holds for AGI, but first
let's talk about the future of scale.
AI companies have basically been chasing scale.
Unless you've been living under a rock, that
won't be something that's unfamiliar to you.
And kind of where that has been most
prominent has been in data centers and power.
McKinsey just came out with a report that
estimated that global demand for data centers
could triple by 2030 with generative AI
driving huge increases in energy consumption.
And, you know, their estimate,
which is mind boggling, right, is
that spend will be $250 billion, um,
for this infrastructure by 2030.
And so I guess, maybe, Kate,
maybe I'll kick it to you first.
Can you give our listeners a little bit of
intuition for like why all these companies
are chasing scale and why that's been
important to the history of AI so far?
Yeah, sure thing, Tim.
So if you think of how these models
have been trained and evolved over time, it's
basically been a really simple formula
of taking as much data as you can get.
Adding as much compute as you have
access to and training a model for as
long as you can afford it in order to
maximize performance and send that out.
So, you know, to date, the recipe for
scale has been a mixture of getting
more data and getting more compute.
And obviously that's going to continue
to drive, uh, costs and potentially
drive demand for data centers.
I think there's going to be some
interesting things that start to emerge,
though, that are going to maybe break
some of the trends that we've seen.
For one, we're just running out of data.
Uh, we're seeing all the data being, you
know, used no matter the model size; it's
no longer scaling proportionally to size.
And there's only so much, uh, data
out there that's worth training on.
We're also seeing a lot more compute
starting to be spent at inference
time instead of just training time.
So as we continue to max out what we can
bake, pre-bake, into the model as it starts
to train, we're starting to see are there
other places, like when the model runs
inference, that we could spend some extra
compute to try and boost performance.
So that also might start to
break some of those trends.
That's great.
Well, Naveen, maybe I'll turn to you because
I know in your opening comment when we
were talking a little bit before the show,
you were saying that, hey, look, you know,
maybe scale is not all you need, right?
And that we're going to have to kind of
really evaluate how we do machine learning.
And I guess maybe to kind of take Kate's
comment there, you know, like, why shouldn't
we believe like it's been working so far?
Like, why shouldn't it keep working?
Basically, like, it feels like, you
know, we had these huge successes, just
kind of doing the dumb thing, which
is add more data and add more compute.
Um, why is, why is now different?
Right?
Like, you know,
Well, I think you also got
to look at the motivations.
I mean, I was a scale
maximalist for a long time.
I mean, I, I started the first AI
chip company, uh, back in 2014, and
we built it for scale from day one.
It was designed to be a
scale out sort of a thing.
And, uh, I'll offer a different explanation.
I mean, yes, everything Kate said is
correct, but there's also another motivation.
As an engineer, it's a really freaking
cool problem to scale something
bigger and bigger and bigger.
It's just cool.
And I've been
seduced by that myself.
Like, oh, this is cool.
I want to build that, you know?
And like, there's, there's interesting
challenges that get presented each time.
Like the, you know, the, the,
the latency starts to matter.
How do I deal with that?
Can I come up with new strategies?
So.
It's one of these things that's like,
it's like an intellectual pursuit.
I'm like, I'm going to keep going
bigger and bigger and bigger.
And you know, it, it is a cool problem.
It is a fun problem, but at some
point you've got to solve problems,
not just for their own sake.
And, uh, and I think that's
what we've come to come to now.
Like Kate said, we have run out of data, but
also the paradigm of simply trying to train on
more data isn't going to yield more results.
And I'm happy to go into why, um, because.
These, these things are essentially
conditional probability estimators, and
you can never uncover every conditional
probability in the data you have.
You will, you will, I've said it many times,
you will get to the eventual heat death of the
universe before you will uncover all of those.
So I think there, there is always going
to be some, um, return from getting bigger
and more data, but like it's diminishing,
uh, for real world applications.
So you need a new paradigm.
Yeah, for sure.
And do you want to talk a little bit
about what you think that new paradigm is?
I mean, in some ways, it's like
the multi billion dollar question.
But, you know, while we're speculating around
2025, kind of curious if like you've got
intuitions on, okay, if not this, like it
actually turns out the data is failing us,
which is kind of a crazy thing to say, but it's
like, well, what, what comes after data, right?
Data is kind of what we know in
some sense, if you think about
like trying to train these models.
Yeah, I think there's several facets to it.
Okay. So I'll say on the algorithmic
side, like, it's intuitive.
If anyone's been around, you know, a
child learning or even trying to train
an animal, um, you don't train it
through exhaustive, um, observation.
You don't put a kid in front of
every observation of how to do a task
and then expect them to learn it.
That's exactly, that's what we're
asking right now of an AI model.
So, we actually do it through a trial
and error of actually performing
something, getting a reward or an anti
reward for, uh, for, for performance.
You're talking reinforcement learning.
Yeah, I mean, that's, that's a big part of it.
We do some form of reinforcement
learning with neural networks.
Uh, to be clear, uh, it's kind of a
weak version, but it is, it is there.
So this concept does exist, but it's also
predicated on this huge, you know, set of
distributions that has been trained upon.
And what animals tend to do is
actually be much more efficient.
They, they observe some, they build some, some
baseline distributions and then they act and
update these distributions kind of all at once.
So I, I think something towards
that end is going to be the answer.
There's no doubt in my mind it has to work
that way, because this way we can be much
more compact with our representations.
We can actually discern causality.
Causality may or may not exist from
a physics standpoint, but the reality
is, it's a more compact way to describe
how the world tends to work, right?
And so I think this is something that
has to be uncovered, uh, in our models.
We can't make it just hugely observational.
It's not going to work.
Yeah, that's super fascinating and
it's actually kind of funny to think
that like the recent history of deep
learning is kind of out of order, right?
Like I remember in the AlphaGo era, it was
like everything's going to be reinforcement
learning and then that kind of just
like sort of petered out as all of these
other approaches kind of had success.
But almost you're saying like we kind of got
to get back to that, like that's actually true.
I actually think AlphaFold and AlphaZero
were very much on the right track.
I think we didn't have the
scale part figured out yet.
Um, but honestly, I think
it was the right approach.
I'd present also a complementary
perspective, which is maybe a simple one.
So research is hard.
Research in AI is hard.
When you find something that works, uh,
people jump on it and they run, right?
So that's what happened a couple years ago.
Um, and when that happens, there's
kind of irrational exuberance
almost, right, in the research community.
Sometimes we think decisions are made, uh,
more deeply than that, but sometimes you
just find something that works, and you
push it as hard as you can until it stops
working as well, or until other things catch
up, including, you know, costs and ROI.
100%.
Yeah, for sure.
And I think what's interesting, I
mean, your title is, uh, looking
at specifically open innovation.
And I do think that is one thing to talk
a little bit about, is, you know,
traditionally, and by traditionally,
I mean, like, the last 36 months, right?
Like a lot of the breakthroughs have
been, you have access to this really big
computer that no one else has access to.
And I guess, I don't know, Anthony,
if your predictions about how
these dynamics change, right?
Like if scale is no longer the thing that
really gets the breakthrough, are there
just more opportunities elsewhere now?
Like, there's going to be more people who
can, you know, kind of advance the state
of the art here without necessarily having
to have access to a million‑GPU cluster.
Yeah, I think so.
Uh, just taking a little bit of what Naveen
was saying, um, innovation at the architectural
level, innovation at the feedback level,
innovation in how AI systems are built,
like there's a huge opportunity for that.
Um, in the open community, in universities,
in players that I think have been left
on the sidelines, that have struggled to
catch up with the scale story, right,
just because of the centricity of compute.
I think we're going to see even more of that.
I think it's really important.
I think the other side of it, the product of a
couple years of just pushing ahead really hard,
is that we have great, um, open models
out there, uh, that are very capable.
And you've already seen a
flourishing of innovation with them.
But, um, there's a lot more to go
just with what we've built already.
And, and, you know, what's
going to continue to come out.
For sure.
I want to force the panel to make
some concrete predictions here, right?
I think one of the interesting things
about scale is there's always the dream.
If you like flip over a few more
cards, you know, maybe the model is
just going to get that much better.
And so, like, it feels like the gas
could run out of the scale car before
we realize that kind of scale is broken.
But I'm kind of curious, like, is 2025
the year where scale sort of breaks?
Like we're just like, actually, it turns
out that this is not going to work anymore.
I think it already broke.
You think it already broke?
Why? Why do you think that?
Show me a bigger model than
GPT-4. No one built one.
And there's a good reason for it, right?
They probably have built one, but it
didn't do anything all that special.
Right? Right.
Uh, and I think that's been the
issue, is I think it's already...
I asked for hot takes, and it feels like
you're really delivering for us in the opinion.
Yeah, there you go, right?
Yeah.
Show me something bigger than
1.6 trillion parameters.
I mean, not to say that there won't be
a way that that does yield advantages,
but there's got to be more to it.
It's not the only ingredient, scale is not
the only ingredient, you need that plus
something else and maybe then you'll get some
super intelligence or whatever you want to
call it, but we haven't cracked that yet.
Uh, Kate, Anthony, I don't know if
you'd agree, has scale already failed?
Right? Like, are we already living in
a kind of post‑scale world, basically?
I mean, I think there's an important
part of the story that we haven't covered
yet, which is part of the advantage of
scale right now is being able to then
boost the performance of smaller models.
So maybe the performance of how far we
can push the top of the spectrum has been
maxed out to some degree, just on pure
size alone, but I think there's still
a lot more to talk about in terms of
how to scale the performance and the amount
of performance you can pack into fewer and
fewer parameters on the smaller model sizes,
using those large models as teacher models,
as synthetic data generators, as, uh, you
know, using them in our AIF workflows in
order to better improve smaller models.
So we've seen a trend, right?
If you look at what, you know, a model
could do last year, you could do that if
it took 70 billion parameters or a hundred
billion or a trillion parameters last
year, you can do many of those same tasks
in fewer than 8 billion parameters today.
I don't think we've maxed out that curve
of downsizing and packing more and more
performance into smaller and smaller models.
Yeah, the commercial dynamics of that are
really interesting, because, you know, the
rhetoric has often been: we're gonna train this
massive model, and then we're gonna sell an API
against it, right? Like, basically, that it's
gonna be an external phenomenon. Okay, you're
almost presaging a world where, like, each of
the big labs will have their gigantic model,
but it'll kind of be for internal purposes,
almost. It's like for minting things: the smaller
models that really are absolutely the commercial action.
I think there's like this huge competitive
advantage that model providers have simply
by having their own in house large model
to boost and create the smaller models
that everyone's actually going to use.
No one wants to run a trillion
parameter model for real tasks at
inference time, as cool as it is.
It's cool.
Everyone wants to say they have it,
but no one wants to actually use it
in real world applications, right?
It's the smaller models that
will be much more cost effective.
I'll offer another set of data.
So I'm a neuroscientist,
uh, um, from grad school.
And, you know, I think I, I like to
look at biology as a, as a blueprint
for many of these things, because over
4 billion years of evolution, you know,
some, some interesting things came about.
And, uh, if you look at brains,
scale was not all you needed.
Uh, humans do not have the largest
brains in the animal kingdom.
You know, brains do scale with body size, so
blue whales have the largest brain by mass.
Um, and very likely more neurons as well.
Dolphins have very large brains as well.
So, uh, there are, and elephants.
So there are lots of mammals that have larger
brains than us, but clearly haven't had
the same impact on the world as we've had.
So, I mean, there are several reasons
for that, but I actually argue there are
some architectural differences
in their brains that lead to this.
And we came up with the right, um, mix of
scale, architecture, and environment to
actually, you know, build human intelligence.
Yeah, I like that.
It's almost like the adage like super
intelligence is not all you need, right?
It's basically like yeah, like you might
have a huge brain, but actually its impact
may be actually quite limited in some ways
Yeah, yeah, that's a whole other topic
I'd love to love to dive into if you
want, but like I don't even know what
the hell super intelligence is, right?
Like how do we even define this?
I mean, I have some definitions, but I
think everyone's like talking about,
oh, it's a foregone conclusion.
It's happening in two years.
Like guys, we haven't even
solved regular intelligence.
You can't even define it.
So we will, we will definitely get to that.
I guess, Anthony, maybe I'll turn to you on
predictions and we'll close out this segment
is so, I mean, I would just observe, right?
Like the contracts to build these massive
data centers are happening now, right?
Like regardless of what's
happening in scale lands.
Hardware is data centers are certainly scaling.
So is it kind of like, we're going to see
in 36 years, or 36 months, basically, just
these huge facilities kind of mothballed?
Like, we're gonna have big empty data
centers, is that kind of the future?
No, I don't think that's going to happen.
I think there'll be some correction,
but I think it'll be a smooth correction.
Also, I think what's really important
is we you know, we focused a bit on
the training part of scaling, right?
So the scaling of deployment, whether they're
medium sized models, small models, APIs to
big models or whatnot, will absolutely depend
on the availability of cloud data centers.
So I think the trend is reasonable,
maybe it's inflated a bit, but
I don't think it's going to, uh...
I would agree with that as well.
So if we look at, again, where there is
opportunities to add something that's not
scale into the equation to try and improve
and boost performance, I think we're starting
to see there's a lot more innovation that
we can do at a single model, regardless of
its scale at runtimes, allowing it to run
multiple times, generate multiple answers is
a very basic example, um, in order to boost
performance for any, any given inference.
And if that trend continues, then we have
a whole much larger population that's
going to be driving up inferencing costs.
And they only have to pay for their small
fraction or part of it versus training, right?
You have to get these big model providers
to dump tons and billions of dollars
into building these compute centers.
But if everyone can start to see that
lift and, you know, have their own, uh,
ROIs that they can take advantage of, I
think that's going to continue to drive
the investment at inference time compute.
Even more so than what we have today, you know,
any given API call could, you know, cost 10
times what it does today just because it could
be worth it from a performance gain perspective.
Yeah, for sure.
I think that'll be really sort of
interesting to see as you build these
big data centers being like, we're going
to do the mother of all training runs.
And it's like actually like
we need it for inference.
So, yeah, I think that's a whole scale AI
practically that, you know, really just started
and I think, uh, yeah, I fully agree with Kate.
There's some parallels to, uh,
the internet build out as well.
Like, I think a lot of, there was a lot
of talk around, around 2000 timeframe when
the stock market crashed, that, oh my God,
did we overbuild, you know, a bunch of
network infrastructure, blah, blah, blah.
And, you know, in the fullness
of time, none of that was true.
It was underbuilt, if anything.
Uh, but it, you know, it took a few years.
There was an overbuild for a short
period, like maybe two or three years.
Until all the demand caught up.
And I think you're absolutely right.
That's where we're going to end up,
probably, is like, it's not going to be
like these data centers are way, way fallow,
but everyone's gonna, there's gonna be a
bunch of articles that say, like, everything
was overbuilt, the bubble's burst, and
then in two years it'll all make sense.
Wouldn't that be a nice change?
High availability of GPU
compute at reasonable prices.
I think it's already true, honestly.
It's true, it's true.
Dream the impossible dream.
I'm going to move us on to our second segment.
Um, so if there is one word that
has characterized enterprise and AI
in 2024, it has been, uh, agents.
Agents, agents, agents.
Uh, even on this show, it's become a little bit
of an in joke that like, agents need to come up
at least once during the course of the episode.
And there is news out, that Salesforce, uh,
is planning on hiring one thousand salespeople
to support its push into the agents market.
Um, as we kind of get into November here
and start thinking about 2025, I just want
to ask, like, is the future really agents?
Like, are we going to continue to live?
Like, I'm going to have to
hear more about agents in 2025.
Um, I guess maybe Naveen, I'll kick it
over to you first, is like, how do you
think this market's going to evolve?
And are we, are we about to like, is hiring
a thousand salespeople justified here?
Well, honestly, no, um, because I know the state
of the art of where agents are, but, you know,
it's a great headline and that's what they do.
I mean, Salesforce is great
at this and it's fine.
You know, it's going to make them try to appear
as more of a big AI player and that's what
they're going for, uh, with that statement.
So, uh, I think it'll serve their needs.
I don't think it's actually necessary
because when an agent's really good,
when an agent really works, you're not
going to have to do much to sell it.
Honestly, it'll just automate
things, but we're not there yet.
And I think that's where.
The hype is a little bit ahead.
And again, it's going to be one of these things.
It'll be a big disillusionment
in the next two years.
And then it'll come back slowly and it'll
actually be super useful in three or four years.
That's kind of how this is all going to work.
So you'll have to join us in November 2025.
And then maybe then Naveen
will be like, eh, maybe.
Yeah, exactly.
But I, but I think, you know, it's not,
you're not going to need a thousand
people just to focus on agents.
It's just going to, it's going to be
something that's going to be amazing for
the products and people will use it and
their, and their, their sales infrastructure
should be able to handle such a thing.
I mean, at Databricks, we have similar problems.
You know, we've actually decided, should we
hire, like, we've gone through this: do we hire
a whole bunch of people to sell AI, or should
we try to, like, layer it into the product?
We've actually done a mix.
We haven't hired a thousand,
but we have hired some people.
And, uh, you know, it comes with mixed
success, because what you need to do is
really integrate it into how people use
the tools and make it somewhat invisible.
And then it will sell itself.
Yep.
For sure.
Anthony, I see you nodding vigorously.
I mean, maybe I can ask you to go a little bit more into what Naveen said, that the promises right now are not necessarily matching up with where we are. I'm curious if you've got thoughts on where the gaps are at the moment.
Yeah.
A few thoughts.
So look, first, you know, what is an AI agent?
What is an agent in general?
Like there's a large
spectrum of what that means.
I think if you look at some of the announcements, like the one you referenced, the use of "agent" reflects a relatively early version in terms of the level of task automation and execution.
So if by agent we mean, you know, a chat
experience that has a bit more of a lookup
and, you know, search capability and the
ability to ask sort of questions to get
the right data and things like that, right?
A little bit more interactive, a little bit
more, you know, kind of implicit reasoning.
I think we've already seen that.
I think that'll, you know,
steadily and incrementally grow.
If instead we're talking about an agent, like
give it a goal and it'll go off and interact
with a large variety of systems and execute
without any supervision, like no way, right?
There's so many steps with compounded error
across that whole, that whole environment.
Like, we can't even get high-accuracy basic Q&A in many industry domains yet.
Like no way we're going to get that
level of automated agent execution.
So I think, like any story, there's a piece of it that's valid and true and will grow. And there's a long tail of research that has to be done to get to the full fruition of what an agent might mean.
Yeah, for sure.
And it's kind of funny. I mean, one of the adages about AI is that we never know what we're talking about. People just say AI to refer to everything, like, do you mean linear regression? That's not AI, you know?
And it sounds like, Anthony, you're kind of arguing that's happening in the agent market, where the word has become so broad that it's like, are you just talking RAG? Because if that's the case, then sure, agents exist.
That's right.
I agree.
There's a definite stretching of the definition here.
Okay, maybe I can ask you to jump in. Anthony had these two pictures of the world. One is: is it just a chatbot that looks things up for you? And the other one is: you tell the agent to do something, and it does the whole thing in the real world.
Are you kind of an optimist?
Like, do you think we're going to
get there to that second vision?
Or is that going to be like way,
way off from your point of view?
Yeah.
So, I'm pretty skeptical of the broad definition of agent as it exists today. An agent is really just a long prompt right now. It's a multi-page prompt where you're asking a model very nicely to do five different things, to always think in a specific order, and to call APIs a specific way. And it works pretty well. Pretty well, but, you know, it's not controllable.
There's no real thought yet in my mind on
like what are the control points that need
to be inserted along an agentic workflow
in order to have any degree of robustness
and reliability deployed out in the world.
And, you know, ultimately, I think there's a lot of work to do to transition from "here's a four-page word vomit of everything I want an agent to do, and it goes off and does the thing" to "here is a very controllable program that I've executed, with very clear rules, some of which are at the system level, some of which are at the model level, that can go out and execute a series of tasks within a certain degree of freedom," not unlimited.
And I worry that right now everyone's just so amazed that if I give a model four pages' worth of instructions, it can do a reasonable job. It can. I mean, I can't read four pages' worth of instructions and remember everything I'm supposed to do. So, yeah, it's really impressive.
And I see a lot of excitement
and hype being built around it.
But if we're not careful, we're just going to keep going down this road of how do I cram more and more instructions into this prompt, and not really focus on what control points are needed for AI-enabled workflows to be automated out in the world, and whether chat even needs to be a part of them.
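The contrast being drawn here, a multi-page free-form prompt versus a controllable program with explicit control points, can be sketched roughly in code. This is a minimal illustration only; the function names and the validation rule are invented for the example and don't correspond to any real product or API:

```python
# Sketch of an agentic workflow governed by explicit control points,
# rather than one long free-form prompt. All names are illustrative.

def call_model(step_prompt: str) -> str:
    # Stand-in for a real LLM call; here it just returns a canned string.
    return f"result for: {step_prompt}"

def checkpoint(output: str, must_contain: str) -> str:
    # A control point: a deterministic, system-level check that an output
    # must pass before it is allowed to drive the next step.
    if must_contain not in output:
        raise ValueError(f"control point failed: {must_contain!r} missing")
    return output

def run_workflow(task: str) -> list[str]:
    # The step order and per-step checks live in code (system level);
    # only a narrow per-step instruction goes to the model.
    steps = [("plan: " + task, "result"),
             ("execute: " + task, "result"),
             ("summarize: " + task, "result")]
    return [checkpoint(call_model(prompt), check) for prompt, check in steps]

print(len(run_workflow("draft the email")))  # 3 validated steps
```

The point of the sketch is only structural: the degrees of freedom are bounded by code the operator controls, not by how nicely the prompt asks.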
"Agent" also, I think, really connotes having a conversation or a dialogue, and a lot of the opportunities for AI, where we're going to be incentivized to build AI, are not necessarily chat-based. So there's just a lot of evolution that's going to be needed for agents to really find their application and actually get traction.
Yeah, it's pretty interesting to hear that the chat thing might actually be a kind of mistake of history, and that in the long-term evolution of this stuff, it's actually kind of a bad interface.
Well, I, I agree.
If I'm writing an email, I don't want
to, like, talk to somebody multiple
times about what the email should have.
I want to, like, have just a short
little, you know, box I put some
info in and an email comes out.
I mean, chat has been an obsession
in AI for decades, right?
Like, it's like a life definition.
Yeah. And these kinds of things.
Eliza.
That's what I was thinking.
That's right.
Yeah. Well, again, it's a little bit
like Naveen was saying earlier.
I mean, it feels really cool.
That's actually a really
strong motivator, for sure.
It is.
Um.
And I think part of this is also that we haven't built the models that do what I was saying, right? About actually trying to uncover causality. You can't build something that has, quote unquote, agency unless it understands the intrinsic causal nature of the world, right? I do this and that happens. These models don't have that. They can basically pick up on patterns and extract different sorts of patterns, but they don't actually understand causal relationships.
So, Naveen, I know you're on the show for the first time. I'm starting to get a sense of your vibe, which is that you're grumpy about AI. I'm wondering if I can push you on that.
I'm actually very hopeful. I've devoted the last 15 or 18 years of my life to this field. I'm not grumpy. I'm just a realist.
Well, I think in the spirit of
realism, can I push you on your
predictions around agents in 2025?
But like, what's the bull case?
Like, what do you think the most
impactful thing on agents is going to
be in the next 12 months, if anything?
Yeah, I think, uh, if you basically narrow
the definition a bit, we actually get
something that's, that's super useful, right?
To be clear, an LLM, the thing that can summarize and do all the things they do, is actually super useful. It doesn't mean it's AGI or whatever. I kind of hate that term, but it is something that is super useful.
And I think, um, you know, as Kate said,
the interface is not necessarily a chatbot.
Like, what I want is something that, when I'm in an Excel spreadsheet, can impute values or describe things. You know, there are so many ways you can add value to those experiences.
That's what we can do now.
And so, being able to automate: okay, I want to copy all these cells and then apply this formula across the rows. There are all these kinds of tasks that we do. If I could just say, hey, do this for me, that's an agentic workflow, if you will, but it's not thinking on its own. I'm telling it what to do; it just has to carry it out within the framework of the app.
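That narrow, app-framed notion of an agentic workflow, where the user states the task and the app carries it out within a fixed set of operations, can be sketched as a whitelisted dispatcher. The spreadsheet model and operation names here are made up purely for illustration:

```python
# A constrained "agentic" executor: the model (or user) can only request
# operations the app explicitly exposes. All names are illustrative.

sheet = {"A1": 1, "A2": 2, "A3": 3}

def copy_cells(src_cells, dst_prefix):
    # Copy the listed source cells into a new column, e.g. A1..A3 -> B1..B3.
    for i, c in enumerate(src_cells, start=1):
        sheet[f"{dst_prefix}{i}"] = sheet[c]

def apply_formula(cells, fn):
    # Apply a formula to each listed cell in place.
    for c in cells:
        sheet[c] = fn(sheet[c])

# Whitelist of operations the "agent" may invoke; anything else is refused.
OPS = {"copy_cells": copy_cells, "apply_formula": apply_formula}

def execute(request):
    op, args = request["op"], request["args"]
    if op not in OPS:
        raise ValueError(f"operation {op!r} not permitted")
    OPS[op](*args)

# "Copy all these cells, then apply this formula across the rows."
execute({"op": "copy_cells", "args": (["A1", "A2", "A3"], "B")})
execute({"op": "apply_formula", "args": (["B1", "B2", "B3"], lambda x: x * 10)})
print(sheet["B2"])  # 20
```

The design choice mirrors the point in the conversation: the "agent" isn't thinking on its own, it only sequences operations the app already defines.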
So I think that's what we're going
to, we're going to see more of.
And you know, inside Databricks,
we're seeing a lot of this now.
In fact, we've been using LLMs and generative AI to improve the experience of Databricks itself, like actually finding bugs in your SQL code and being able to fix it for you, or propose a fix. These kinds of things are actually big time savers.
So I think that's what we're going
to see in 2025 is more of that.
It is going to drive demand for
compute and everything, but it's
not, you know, super intelligence.
That's what I'm grumpy about.
Maybe, to put a fine point on it.
I'm glad you said that, Naveen, because
it's always a good sign when a panelist
is like, I really dislike that term.
Moving us on to the third segment
of today, let's talk a little bit
about superintelligence and AGI.
This is the last segment I kind
of wanted to focus on, uh, just
as we kind of look towards 2025.
And of course, it's part of the
narrative of where AI is going.
The Information reported that OpenAI is seeing the rate of improvement in GPT slowing over time. And I think I caught an interview with Ilya where he basically said, hey, maybe progress is actually slowing down.
And I wanted to put those rumblings and concerns next to some of what we're hearing from leaders in the industry, right? So Sam Altman did a blog post where he predicted that superintelligence is potentially a thousand days away. And Anthropic recently warned that these systems are advancing so quickly, we need serious targeted regulation in the next 18 months.
And so, you know, I guess, Anthony, I'll kick it to you first: what are we to make of this? Is AGI on the way? How do we square a lot of what we've been talking about this episode, which is that it's going to get harder, with some pretty strong claims that, hey, we're about to have ultra-powerful systems in the next thousand days?
Look, we're talking about it.
So the headlines work, right?
Like, it's a compelling topic.
It, uh, attracts the public's attention.
It's like the superhero obsession
or whatever you want to call it.
Um, I think a lot of it is that, right?
Look, where are we today?
I don't even know what a working
definition of AGI is at this point.
Um, I can propose my own, but I think what
we're gonna start to see really matter is
more and more ways that AI is integrated and
embedded and helps in specific contexts, right?
So Naveen mentioned some, certainly coding
assistants, embedded coding assistants,
um, have made a lot of progress.
It's kind of an early set of use cases.
We've seen lots of utility.
We'll see a lot more of that.
Look, in terms of when AI reaches some level of general intelligence, even if we take a definition of that being equivalence to the human capacity not only to know things, but to reason, to perceive, I mean, that's a very long way off, I'd say.
Yeah, I mean, I guess it depends on what you define as a long way off. I think we will get there. It's just going to be harder than we think.
So, everybody's perception is very linear. It's like, okay, this thing has been going, and every year it gets better and better and better.
So then by two more years, we're
going to have this other thing.
That's not actually how these technologies seem to evolve, and that's never really been true. We overestimate the technology in the short term, but underestimate it in the long term, because these things actually work on exponentials. So a 5 or 10 percent improvement year on year actually adds up a lot, very fast, once you get to like year seven.
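The arithmetic behind that point is simple compounding: a fixed year-on-year improvement multiplies rather than adds, for a total factor of (1 + rate) raised to the number of years:

```python
# Year-on-year improvement compounds: total gain is (1 + rate) ** years.

def compounded(rate: float, years: int) -> float:
    return (1 + rate) ** years

for rate in (0.05, 0.10):
    print(f"{rate:.0%}/yr over 7 years -> {compounded(rate, 7):.2f}x")
# 5%/yr gives about 1.41x after 7 years; 10%/yr about 1.95x,
# nearly a doubling, and the curve only steepens from there.
```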
So I think what we're going to see
is in 10 years, we very well might
have something that does reason and
actually does understand causality.
My prediction has been: within 30 years, I think there's a 95 percent chance we will solve that. Within 10 years, I think it's like a 30 percent chance. So my bounds are 10 to 30 years from now, but I think that's not that long, right?
So you're kind of saying, you know, there are people alive today who will definitely see that.
Yeah, totally.
Right. I mean, which I think is very cool. But it's not something that's going to happen next year. I think that's just a hype train, to be honest with you.
We haven't solved fundamental problems yet.
We will see around that precipice a
year ahead of time pretty clearly.
And right now, it's not super clear.
So to me, it doesn't feel credible to say that.
Well, I think we're also conflating a lot of things, like cause and effect and causal understanding versus superintelligence. There are causal models out in the world today that can help break down and isolate cause-and-effect relationships, particularly in areas like drug discovery, that are widely used. So are we just talking about getting models to better understand causal reasoning, or are you talking about sentience in every stretch of the word, a model that has a personality and goes off and does things of its own will, so to speak? I think, in general, those aspirations are really more around marketing, and I don't think there are even necessarily the right economic incentives to develop that, versus developing cause-and-effect reasoning and better tools for handling language and doing different tasks.
Absolutely.
You know, I think the next three to ten-plus years is more realistic.
Yeah, there's this adage in financial markets that the market can stay irrational longer than you can stay solvent.
And I was joking with a friend
recently, it's like, AGI can be imminent
longer than you can stay solvent.
It's like, it's just around the
corner, everybody believe me,
it's just around the corner.
Um, I guess, Anthony, maybe to go
back to you, I mean, I, I want to
kind of challenge the idea that it is
potentially just all marketing, right?
I think one of the really interesting comments came out of this essay that Dario Amodei, who runs Anthropic, wrote, which we were talking about a few episodes ago. He's writing about the future of AI and how it's going to change the world and everything.
A lot of people say, ah, marketing.
And, you know, some people kind of
looked up, you know, like his writings
from when he was like a grad student.
And he's like still writing
about this stuff, right?
Um, and I do think that that is kind of
an interesting thing that I would love
to kind of get your thoughts on is like,
it almost feels like in order to be able
to look past all of the current problems
with the technology, you almost kind of
have to be a true believer in some sense.
And like, in some ways, like, I
actually don't know if it is marketing
coming out of some of these companies.
Like, I think they do genuinely
believe that it is imminent.
I don't know how you think about that.
I think AI is going to change the world.
I think it's going to change it incrementally,
practically, and pretty quickly.
And it already is.
Um, in all the practical ways we've
talked about specific applications,
integrated with software, integrated with
capabilities that we want assistance with.
Um, no, I wouldn't say that
people are like disingenuous.
Uh, I just think that there's this cultural
kind of continued obsession with, you
know, intelligent, super anything, right?
Uh, and it's interesting and it's fun.
Another, more negative side of that is the whole existential debate, which has hopefully started to die back, I think. But you saw that a year and a half ago, especially, with a lot of heat.
Yeah, I'd say it's just such a natural attractor, it's hard not to bring it up. But look, I don't know, maybe I'm just too much of a pragmatist.
I just try to focus on all the ways that AI is
actually helping and will help, like every day,
like on this podcast probably before too long.
It's just to me, that's how the world
changes, not with some super intelligence.
Well, and what's interesting is, I agree with you. I don't think they're being disingenuous. I think people really believe it, and that's fine.
Um, but we want to pull back
and contextualize a bit.
Like, do you care that the airplane
was invented in 1903 instead of 1910?
Does it really matter?
It doesn't, right?
I mean, we're splitting hairs a little bit. Like, why am I right and someone else is wrong? It actually doesn't matter. If it's three years, as Dario says, or it's 10 years, when you look back after 50 years, it doesn't matter, right?
Right.
Because of exponentials, you know?
So I think it's okay.
It's okay that we're exuberant and we believe.
I also think some of Anthropic's warnings, so to speak, and the need for safety and better understanding of these problems, aren't necessarily predicated on the arrival of superintelligence. Dumb intelligence can be pretty dangerous if it's out in the world, right, if we're starting to give LLMs all these API calls and the ability to impact the world and pull real data into their decision making.
So as we talk about it being genuine, I think from that perspective it absolutely is true, and something everyone should be aware of, regardless of whether this is, quote, AGI or superintelligence or not.
Yeah, for sure.
Yeah, I think that these last few comments
are really interesting because I think all
three of you kind of would picture yourself
as like realists in the world of AI.
But where we're almost landing is: look, we all agree this technology is going to be a huge deal. We're just splitting hairs over whether it's 10 years, 20 years, or, you know, two months from now, which I think is a pretty interesting outcome.
Yeah.
Maybe the final comment I'll throw in, because I'm curious to get all of your thoughts on this, is to talk about realism. All three of you are talking to customers that are in the market.
People that need to just basically
wake up in the morning and be like,
is this technology going to be better
than what I currently use in my stack?
And should I implement it?
Like, do you hear from customers, "And by the way, Anthony, should I be worried that this technology is going to destroy the world?"
I'm curious how much of this is chin-stroking media discussion, how much of it is actually influencing enterprise decisions and discussions happening on the ground, or whether those two are basically completely separate worlds in some sense.
Certainly. Lots of customers are concerned and ask questions about accuracy, about trust in systems, about how to implement specific use cases with a high quality of output that they can trust in deployment, that they can save money or make money on and not have a big liability with. And there are lots of challenges across the board, in all sorts of domains, right, in health and finance and legal and many, many areas. I hear very little, if any, questions about, you know, helping AI destroy the world, right?
These big existential kind of, if I deploy
AI, am I going to, you know, contribute to
the robot army that takes over humanity?
Like, none of that stuff, right?
It's very practical.
It's business focused as it should be.
Right?
That's what I hear.
That's so interesting to me: we think about the AI discussion as one block, but in practice it's actually these pretty distinct fora in which these discussions are happening.
And okay, Naveen, if you've got thoughts
on this, on like what you're hearing from
customers, and whether or not this AGI
stuff even kind of like registers at all.
Yeah, I mean, I think Anthony nailed it.
It's very practically grounded.
That being said, I think the motivations are
such that I don't want to be the one who didn't
jump on the train and made the company get
left behind, whatever that company is, right?
And so there's a lot of top-down push for adopting AI, coming even from the boards.
I've spoken to multiple boards of very large
public companies and, you know, that this
is a discussion front and center there.
And it's really about this is the next
technology transition, we have to be part of it.
No one's really talking about like, is it
going to take over the world or whatever.
It's just like, how do we, how do
we craft a strategy such that we
are, we are part of this new world?
Yeah, I'd echo both of those statements. Overall, what I'm really optimistic about, honestly, is that a lot of the conversations with enterprises are about how do I take advantage, but also how do I make sure I've got the right protocols, the right control points, the right safety measures in place, because their bottom line is ultimately at risk with the deployment. And I think that provides a lot of really helpful and healthy pressure on model providers to develop the solutions needed for a more responsible and governed approach, versus just build as far and fast as you can.
And so I think that is ultimately going to help us create a lot of the tooling that's needed, so that it isn't necessarily the concern that AGI is going to take over the world. We will hopefully have built the right controls and processes to have very well-governed AI systems.
Yeah, I can't think of a better
note to end on than that, Kate.
So, thank you.
Um, so I'm going to wrap us up for today.
Uh, Kate, as always, thanks
for coming on the show.
Really appreciate it.
And, Anthony and Naveen, we hope to have you back on the show in the future.
Thanks for joining us.
If you enjoyed what you heard, you
can get us on Apple Podcasts, Spotify,
and podcast platforms everywhere.
And we'll see you next
week on Mixture of Experts.