Generative AI Takes Center Stage at IBM Think
Key Points
- IBM Think’s research keynotes introduced a “new wave of computing” that expands beyond classical and quantum paradigms to include generative computing models.
- The conference announced the launch of Watsonx Orchestrate, delivering more than 150 enterprise‑ready AI agents for immediate use.
- The keynote’s light‑hearted moments—such as Arvind marching a mascot penguin across the stage—showed a playful side that resonated with the audience.
- Tim Hwang’s Mixture of Experts podcast highlighted related AI news, including a New York Times story on AI hallucinations and recent OpenAI organizational moves.
- Kate Soule revealed her new book, **AI Value Creators**, which expands on generative AI strategies discussed at Think, and offered a free download link for podcast listeners.
Sections
- Generative Computing Takes Center Stage - Panelists highlight IBM Think’s research keynotes, celebrating the debut of generative computing, the launch of over 150 enterprise‑ready AI agents on Watsonx Orchestrate, and the light‑hearted mascot moment that energized the audience.
- From Prompt Engineering to Programmatic AI - The speaker critiques massive, brittle essay prompts as unsustainable, advocating the adoption of software‑engineering abstractions and clear control flow to integrate LLM capabilities into maintainable, production‑scale systems.
- Flexible Hybrid AI Model Overview - The speaker highlights a modular approach to AI, describing IBM's new Granite hybrid expert models that are memory‑efficient, fast, and support long context lengths as a complement to larger models.
- Rise of Hallucinations in Reasoning Models - The hosts discuss a recent New York Times article highlighting a surge in hallucinations among newer reasoning AI models, reference model‑card data showing the trend, and acknowledge they lack a clear explanation for why it’s occurring.
- Persistent Hallucinations in AI - The speakers discuss the ongoing problem of AI model hallucinations, questioning optimistic predictions of a near‑term fix and concluding that hallucinations are likely to remain a recurring challenge despite future techniques.
- Managing LLM Hallucinations in Business - The speakers debate how hallucinations affect different downstream applications, argue that reliability needs depend on use‑case, and acknowledge that hallucinations will persist despite research advances.
- Grounded AI and Truth Constraints - The speaker argues that as models get smarter their factual reliability drops, calling for new hybrid architectures that explicitly enforce truth constraints and noting the rumored $3 billion OpenAI acquisition of Windsurf as evidence that AGI hype may be more marketing than substance.
- Wrappers vs Integrators in AI - The speakers debate whether emerging AI firms are merely GPT “wrappers” or genuine system integrators, noting OpenAI’s dominance in model building, the scarcity of robust integration, and how this perspective supports a $3 billion price.
- Vertical Integration and AI Moats - The speakers argue that as AI models become commoditized, firms will pursue competitive advantage by building end‑to‑end, vertically integrated ecosystems that generate high switching costs, likening future AI companies to Apple’s hardware‑software model.
- Next Week's Mixture of Experts - The host announces that the upcoming episode of the Mixture of Experts series will air next week.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=5M-VD9F1W7A](https://www.youtube.com/watch?v=5M-VD9F1W7A)
**Duration:** 00:27:56
Chapter timestamps:
- [00:00:00](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=0s) Generative Computing Takes Center Stage
- [00:03:05](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=185s) From Prompt Engineering to Programmatic AI
- [00:06:11](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=371s) Flexible Hybrid AI Model Overview
- [00:09:18](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=558s) Rise of Hallucinations in Reasoning Models
- [00:12:22](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=742s) Persistent Hallucinations in AI
- [00:15:30](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=930s) Managing LLM Hallucinations in Business
- [00:18:36](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=1116s) Grounded AI and Truth Constraints
- [00:21:45](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=1305s) Wrappers vs Integrators in AI
- [00:24:51](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=1491s) Vertical Integration and AI Moats
- [00:27:54](https://www.youtube.com/watch?v=5M-VD9F1W7A&t=1674s) Next Week's Mixture of Experts
What's the most exciting thing to come out of IBM Think this year?
Kate Soule is Director of Technical Product Management for Granite.
Kate, welcome back.
Uh, what's your pick for IBM Think?
My pick is the research keynotes.
We talked about a new wave of computing, so we've got traditional classical
computing, we've got quantum computing, and at Think we announced a new way of
building with models: generative computing.
It's really exciting.
Kaoutar El Maghraoui is a Principal Research Scientist and Manager
for Hybrid Cloud platform.
Kaoutar, welcome back.
What was your favorite?
My favorite was also the generative computing part, but also the launch
of a lot of AI agents on, uh, our watsonx Orchestrate
platform: over 150 enterprise-ready
AI agents.
That's really huge.
Yeah, that is huge.
And we will talk about that.
And finally, last but not least, is Skyler Speakman, Senior Research Scientist.
Skyler, watching, uh, the conference,
What was your favorite?
Yeah, a non-technical take on this is just how much fun they
were having during the keynote.
Arvind marched a mascot penguin across the stage and the crowd loved it.
Uh, so it was really cool to see people having fun, um, up on stage
during his keynotes.
Penguins, agents, and programming: all that and more on today's Mixture of Experts.
I am Tim Hwang and welcome to Mixture of Experts.
Each week, MOE brings together the smartest, most talented, most wonderful
experts in all of artificial intelligence, uh, to talk a little bit about the
biggest news, uh, in the sector.
And this is a big episode.
We've got a lot that we need to talk about as per usual, a really
fascinating story coming outta the New York Times about AI and hallucination.
A bunch of news coming out of OpenAI, uh, in terms of its corporate organization
and its recent acquisition of Windsurf.
But first, uh, I wanted to start with IBM Think, which was the
big IBM conference of the year.
Tons and tons of announcements and things to go through.
But I think, uh, the one that was most important to me, of course, and the one
I do wanna start with, is: Kate, I realize, uh, you have a book coming out.
That was also kind of announced at IBM Think, so maybe I'll
just start there for the plug.
Yeah, no thanks Tim.
So we did, uh, release a book.
I've got it here with me.
It's called AI Value Creators.
Really excited, uh, to be able to share it more broadly.
A lot of what we talked about at Think particularly, uh, in some of the future
looking sessions like on generative computing, we actually have whole
chapters dedicated to, in the book.
It's really all about how can, you know,
folks looking to not just build with generative AI, but kind of
build a competitive moat with generative AI, get the most value
and, and invest in strategic places.
So really, really excited for folks to check it out.
We actually have a download link for all of our Mixture of Expert
listeners, so we'll include that in the show notes and would love any,
uh, feedback the team has, uh, as they, they read through the content.
That's great.
And Kate, I guess for those who are kind of just getting their head
around generative computing.
What's the general concept there?
Do you wanna give us like a little bit of a flavor of how, you know, it
sounds like it's a big part of the keynote, it's a big part of the book.
Just kind of interested in how all these pieces are fitting together and
well, what is generative computing?
Yeah, so I think at the end of the day, it's really just trying to
bring some of generative AI back to the realm of computer science.
You know, if you look at how we've emerged building, uh, applications and agents with
LLMs today, it's all basically a form of
prompt engineering where we end up with these really massive, you
know, pages and pages of prompts.
We call them essay prompts in our book, and they can be
very difficult to maintain.
These prompts are very brittle.
You look at how they're written, they're kind of like over optimized and force
fit for a specific model, and it's just not very, uh, sustainable, secure.
There's all sorts of issues.
If we think about how we build in a more computer science forward
discipline, you know, there needs to be abstractions for key activities
that we want a model to take on.
And there needs to be ways to set, you know, clear control flow of how
we build programs. You know, instead of asking a model to first do
this, then do this, then do this,
you know, we can actually
build a lot of that in regular code.
We don't need to ask a model to do everything.
So it's really about how can we take some of these best practices from
software engineering and computer science and bring in all the power
that models have to be able to express, uh, natural language and run functions
in natural language and bring them together in a much more maintainable way.
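To make the contrast concrete, here is a minimal Python sketch of the pattern Kate is describing: small, scoped model calls with the control flow living in ordinary code rather than in one giant essay prompt. The `llm()` helper and the ticket-handling task are hypothetical stand-ins, not IBM's generative computing API.

```python
# A minimal sketch: small, scoped model calls, with control flow in code
# instead of one giant "essay prompt". llm() is a hypothetical stand-in
# for any chat-completion call (watsonx, OpenAI, a local model).

def llm(prompt: str) -> str:
    """Placeholder: wire this to your model provider of choice."""
    raise NotImplementedError

def summarize(ticket_text: str) -> str:
    # One small, testable responsibility per call.
    return llm(f"Summarize this support ticket in two sentences:\n{ticket_text}")

def classify(summary: str) -> str:
    return llm(f"Label this summary as 'billing', 'bug', or 'other':\n{summary}").strip().lower()

def handle_ticket(ticket_text: str) -> dict:
    # The program, not the prompt, decides what happens next.
    summary = summarize(ticket_text)
    label = classify(summary)
    if label not in {"billing", "bug", "other"}:
        label = "other"  # programmatic guardrail, no extra model call needed
    return {"summary": summary, "label": label}
```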
Nice.
Yeah, that really is, I think the future is just like now moving
into like, how do we make this production, you know, at scale.
So it's very exciting to see.
Absolutely.
And I think there's a lot also that goes on when you start to build
things in a little bit more structure where you can take advantage of a
lot of techniques that are coming out in the field around inference
scaling and inference time compute.
So instead of running one big, massive prompt once, how do you
break it up into smaller parts, run multiple generations and use that to
create an even richer response often in far less time, far less compute.
Uh, and so all of that and more we, we really get into in the book.
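As a rough illustration of the inference-time-compute idea Kate mentions, here is a hedged sketch of running several smaller generations and combining them in code instead of one massive prompt run once; the `llm()` helper, the sample count, and the voting rule are illustrative assumptions, not anything announced at Think.

```python
# Sketch of inference scaling via self-consistency voting: sample several
# short answers and keep the most common one, rather than trusting a single
# pass over one huge prompt.
from collections import Counter

def llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model provider")

def answer_with_voting(question: str, n: int = 5) -> str:
    # n small generations, aggregated programmatically.
    answers = [llm(f"Answer concisely: {question}") for _ in range(n)]
    best, _count = Counter(a.strip() for a in answers).most_common(1)[0]
    return best
```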
That's great.
Yeah.
Well, I encourage everybody to check it out.
Um, I think the next one I want to touch on is Kaoutar.
You have already won the MOE award for mentioning agents first in the episode.
Um, but, uh, but it is genuinely exciting.
I mean, in some ways it's no surprise that IBM would be announcing a, a
kind of like product leap in agents.
But do you wanna talk a little bit about what's happening and,
and why you find it exciting?
Yes, definitely.
So IBM, you know, at Think introduced, you know, over 150 pre-built AI agents, um,
through the watsonx Orchestrate platform.
And I, I thought that's really huge, you know, enabling, you
know, basically enterprises to deploy AI driven workloads rapidly.
So these agents, they're designed to, to be kind of prebuilt, uh, and you can integrate them
seamlessly with popular enterprise tools like Salesforce and Workday and Adobe, which
allows, you know, businesses to automate tasks and also enhance productivity.
So, you know, this is, you know, kind of showcasing our approach, IBM's approach, to
support the creation of custom AI agents,
which I think is also very important, relying first on the Granite models
as well as models from Meta and Mistral.
So it's also a modular approach that provides you flexibility,
that also facilitates, you know, tailoring, you know, your solutions
for diverse business needs.
I think that that was also very, very important.
Um.
So basically, you know, this flexibility it provides is not just about, you
know, one, you know, uh, one approach, but, you know, you can integrate different
models, you know, in a, in a flexible and modular way, and it allows you also
to customize, in addition to the prebuilt, existing AI agents that
you can just add and, uh, customize.
Yeah, for sure.
And I did wanna touch on that.
I mean, Skyler, before we talk about the mascot,
which I do want to hear more about.
But, um, I guess, uh, Kate, uh, the mention of Granite, I guess
you've been name checked, so I do gotta kind of bring it back to you.
Um, there, I understand there is an announcement coming out about
Granite actually from IBM Think.
so on Friday actually.
So we did a sneak, uh, preview.
We didn't tell anyone we were gonna do this.
We released a preview of our Granite 4 models, and we got to
talk about them a lot at Think.
That was also a really exciting part of the conference.
These models, if you, we can, um, post a link to the blog that talks
about the new architecture behind them.
But basically they're a mixture of experts hybrid, uh, model.
So they are very fast, very efficient.
The tiny preview that we just released only takes 15 gigs of memory,
so, uh, even running, you know, 128K context length
with multiple concurrencies.
So we think these models are gonna be really efficient and excellent
counterpoints to complement much larger models that are being deployed.
You know, having those bigger models and then the smaller efficient Granite
models working together hand in hand.
I really like the emphasis here on smaller, domain-specific models and
also the energy efficiency.
'cause you know, if you see these models, they, you know, the, the sizes,
they range from three to 20 billion parameters as opposed to what you see,
like, uh, you know, trillion parameters or many billion parameters in the other,
in the open source or in other models.
So it's, it's, you know, the, the key thing here is, you know,
how do you build these things that are optimized for specific industries
and offer a cost-effective and efficient alternative to the
larger general-purpose models?
So I really like, you know, the, uh, focus on the efficiency here.
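For listeners who want to try the Granite 4.0 tiny preview Kate describes, here is a hedged example using Hugging Face Transformers; the model identifier is an assumption based on IBM's naming, so check the blog post linked in the show notes for the exact ID and hardware requirements.

```python
# Hedged example: loading the Granite 4.0 tiny preview with Transformers.
# The model_id below is an assumed identifier, not confirmed in the episode.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-tiny-preview"  # assumed identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Explain mixture-of-experts language models in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```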
Yeah, for sure.
So, Skyler, uh, curious if you wanna tell us more about the mascot, but I
think in general, like, I, I thought what was very striking about your response
was you're like, it's so much fun.
Uh, which I think is actually like an important part of all this.
Um, but, yeah, wanted
to kind of hear what you saw.
Yeah, I know.
I think that just sort of captures it.
They had kind of this transition from having these Ferrari race
car team members up on stage talking about how they're using IBM tech.
And then there was this, uh, pivot to IBM's relationship with Red Hat,
and of course, Linux more broadly, and a penguin mascot just starts
walking across the back of the stage.
Great.
So hats off to whoever had that planned.
Maybe it was last minute.
Maybe that's been someone's dream for a, for a year.
I don't know.
But I thought it was, uh, I thought it was well done.
Yeah, for sure.
And I do like, it's like one of the things I'm really fascinated by is
like how all the companies that are kind of in the AI space are kind of
coming up with their own brands about how they present AI stuff, right?
Like some companies are very serious and some companies are very
technical, uh, uh, like in, like, in kind of like a very granular,
kind of like almost academic way.
And it's, it's kind of fun seeing IBM kind of take like a certain
level of fun in terms of like how to present and talk about this stuff.
So it's very cool.
I'm gonna move us on to, uh, our next topic.
Um, super interesting article that kind of hit the New York Times, uh,
I believe this week or last week.
Um, focusing on sort of the kind of rise of hallucinations with, um,
the emergence of reasoning models.
Um, and we haven't talked about hallucinations on the show for a
little while, but obviously it kind of remains a sort of big question
and a big problem that people are sort of working on in the space.
Um, and I guess maybe Skyler, maybe I'll stay with you: do you have
an intuition for why it seems so?
The article seemed to argue that like reasoning models are like
newly hallucinatory in a way that we are learning to deal with.
And is that, is that the case?
And do you have an intuition for why?
Hallucinations themselves are not new.
Um, it does appear that they are on the rise.
There was this great contra-position: they had asked, uh, you
know, a spokesperson for comment.
They said, no, they're, they're not on the rise.
But if you go and check the receipts and look at the model cards that
OpenAI also produces, you do see o4-mini hallucinating more than
o3 and o3 hallucinating more than o1. It is like definitely on the rise.
Yes, it is.
Um, and, but they're also very clear to say they don't know why.
And I, I'm also gonna draw a blank.
Sorry.
I'm not quite sure.
I don't have any really gut instincts as to why those are
increasing while accuracies are going up.
Uh, they're getting better at math, but hallucinations are also increasing,
so it is something that really does need a lot more attention paid to it.
Yeah, and I think this is one of the really interesting things is like, I feel
like the AI era is teaching us all the ways in which intelligence is very lumpy.
You know, like the model gets really good at one thing, but, and you kind of expect
that it'll be good at everything else in a well-rounded way, but like that kind of.
It doesn't seem to be the case.
Um, I guess, uh, Kate, like I'm curious if you've got
intuitions or similar like Skyler.
You're like, I, I don't know.
It's just weird.
Yeah, I mean, I will, uh, give, give my thoughts obviously.
I think there's a lot that's still left to be discovered,
but to me it seems like it's a
kind of classic example of just misaligned incentives.
So we've got, you know, these models are going through extensive reinforcement
learning pipelines in order to improve the model's verbosity among other
things to get it to say more and to try and craft these well-rounded
responses that humans will prefer.
And, you know, there is some degree of, you know,
any human likes to hear people who are persuasive speakers talk.
We're not very good at fact checking things, and we don't
naturally resonate with something that is just black and white.
The answer is X. We wanna know why, we wanna hear more and more thought,
and we question things less when we hear that, uh, thought, um, process.
And that's a little bit counter to a different objective function
that was originally solved for, which is much more: get the answer
exactly correct.
And that's how pre-reasoning models were.
That was certainly the focus.
And so I expect there's just some, you know, misalignment in those objective
functions and we're trying to solve for a lot of different things, and we're
now having these really verbose thought processes that are much harder to check
for factual accuracy when that training data is created, and that, you know,
just innately is going to promote having more chances to hallucinate in any
given response than, you know, just "the answer's X."
Kaoutar.
Are you, um, optimistic, uh, in the end with all this?
I remember a few years ago I was talking to a researcher who was like,
don't worry, in like 18 months
there will just be no more hallucinations.
We're gonna just crack the problem.
It's solved, right?
Like clearly there's gonna be less and less hallucinations
and it's just gonna be done.
And I guess kind of what's interesting about this article is almost the
idea that like hallucinations might be kind of like a thing that keeps
coming back as the technology advances.
Um, and I guess from where you're sitting, I mean, do you feel like yeah,
maybe in 2030, you know, we won't even be talking about hallucination anymore
'cause it's kind of a solved problem?
Or is this really something persistent that we're gonna be
dealing with for a long time?
Yeah, I think it's gonna persist.
Uh, maybe we'll have, you know, different techniques or
methods, or maybe hybrid approaches where we also need to do factual checks.
So what's happening here is these models, they use probabilities, you
know, and not logic to predict these responses.
And reinforcement learning helps in math and coding, but also causes the model,
like Kate mentioned, to forget, you know, some of these, uh, consistencies.
And, you know, the, the reasoning models.
They take these multi-step approaches to the problem solving.
Each step introduces also this compound effect of hallucination.
So the tools today, they can't keep up.
So of course a lot of work in research to build tools to trace, you know, the
AI output back to the training data.
But these systems are very, you know, complex too, too
large to fully understand.
And the explanations even that are shown to the user sometimes
they really don't reflect the model's actual internal process.
So what are really these, the broad implications here?
So accuracy is kind of eroding here.
Even as the LLMs become more powerful in cognitive tasks, their grip on the factual
reliability, you know, is loosening here.
And of course this has a lot of enterprise concerns.
And so I think the challenge still remains unresolved.
You know, despite quite
many efforts from OpenAI, Google, DeepSeek, and others, there is no clear fix.
So hallucination appears to be, you know, kind of an intrinsic limitation
of the current model architectures.
So what I'm thinking is we need kind of hybrid approaches, not just relying
on the model, but see if we can,
you know, combine that with other systems to, to do this reasoning, symbolic
reasoning, combine them with symbolic reasoning systems or factual checking.
So hopefully that can kind of resolve these issues that we find.
Yeah, and I did wanna get into that. Like, I mean, Kaoutar, I think
you point out quite rightly, like, from an enterprise standpoint, if I'm
a company that's about to implement this stuff, I'm reading in the New
York Times that these great new models that people are trying to
pitch me on, like, hallucinate more.
I mean, Skyler, what's, what's to be done, right?
I think
Kaoutar is kind of throwing out like maybe we need more symbolic approaches, like,
what is the kind of toolkit of things that we do to try to kind of deal with
this, particularly in a setting where, you know, a business is trying to
implement this, they need the reliability.
I think that point right there at the end is very important.
Which use case are these being built for? Hallucinations during your Google search?
It's annoying, but it's not, not game-breaking.
Uh, using a tool in order to improve some sort of legal argument or medical
diagnosis, incredibly important.
So I, I think these, these hallucinations will always be with us.
Um, I did think it would be on a downward trend,
Tim, as you had said earlier. I am surprised they're going up, because
there are teams of researchers working on this problem and
they seem to be falling behind the pace
of progress of the LLMs, if we're just kind of, you know, reading the
hallucination rates as they increase.
Um, so I think what's probably the most important part here is:
what's your downstream use case?
And are hallucinations game-breaking in those?
If they are, then, then there will be some serious pause about how you really
roll out AI into your workflows.
Um, if you're using it to, to speed up a, uh, internet query,
um, I think we're gonna have some entertaining hallucinations for
another five years to come yet.
And if I can make a plug for generative computing, like I think this is exactly
the type of thing we're trying to solve and to wrap our heads around
for real deployed use cases, how do we set up workflows so that it's not
just a model giving carte blanche to go and create tons of chain of thought,
do a bunch of actions, hallucinate some things, give a response back,
but instead, how can you have
very programmatic control steps with checks, where you're validating the
outputs programmatically, uh, and where you really reduce the scope of
what the model does at any one point in time, so that you can really try
and reduce your risks of hallucination and other safety issues. And a key
part of that is also bringing in additional layers of security.
So for example, we've got Granite Guardian models, which can detect hallucinations
in any grounded response or function call.
So there's all sorts of tools that you can start to layer in if you're
not taking what I call, like, the YOLO prompt approach, where you just
create one big approach, one big prompt, throw it at the model, and, you know,
fingers crossed, hope for the best.
But if you start to break this out, it takes a little bit more work to set up,
but it gives you so much more control over the risks and the performance at any
given part in the process that I think it will be, you know, really critical for
real-life enterprise deployments.
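Here is a minimal sketch of the programmatic-check pattern Kate describes: generate a grounded answer, then run a separate detector over it before anything downstream acts on it. Both helper functions are hypothetical stand-ins; a Granite Guardian model is one way you might implement the hallucination check.

```python
# Sketch of a programmatic validation step wrapped around a grounded answer.
# Both helpers are hypothetical stand-ins to be wired to real models.

def generate_grounded_answer(question: str, documents: list[str]) -> str:
    raise NotImplementedError("call your RAG pipeline / model here")

def looks_hallucinated(answer: str, documents: list[str]) -> bool:
    raise NotImplementedError("call a hallucination-detection model here")

def answer_or_escalate(question: str, documents: list[str]) -> str:
    answer = generate_grounded_answer(question, documents)
    if looks_hallucinated(answer, documents):
        # The surrounding program, not the prompt, decides the fallback.
        return "Unable to verify an answer from the provided documents."
    return answer
```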
Yeah.
I think this is still like one of the kind of funniest ironies I think of the
AI era is, you know, you've built a thing that's like, it's in the computer, but
it doesn't really behave like computing.
And like there's all this work now to kind of like put it back in the box and make it
behave like a more traditional computer.
'cause you need it for all sorts of like very practical, you know, reliability
reasons, security reasons, safety reasons.
Like there are prompts out there where it says in all caps, do not hallucinate.
Like that's not computer science.
Like this is, we've lost all, you know, uh, grounding to reality here.
That's not how computer science is done.
So we need to get to a better way of working.
Yeah.
It is the fact that we're seeing right now, the smarter these models
are getting at reasoning, the less we can trust them on facts.
So, hallucinations, you know, they may require more than just reinforcement
learning as it is being used today.
So, like, uh, Kate mentioned it, we really need new architectures and
new programming paradigms that really explicitly encode truth constraints,
or modular hybrid systems that combine LLMs with verifiable databases
or symbolic logic engines, you know, and that's, you know, I think at the core of
what generative computing is trying to do.
I wanna move us to the last story of today.
Uh, it was announced, or rather it was leaked ultimately, um, that OpenAI
is about to make an acquisition of Windsurf, um, which is, uh, effectively
kind of a coding environment.
Um, and the number that has been leaked is that the
acquisition would be $3 billion, right?
Which would make it the biggest OpenAI acquisition to date.
And obviously just like a gigantic acquisition, uh, in its own right.
Um, and.
You know, I guess maybe Kate, to go back to you, like, some people were saying
online that this is kind of like, in some ways, like, evidence that a lot of
this AGI stuff is marketing, right?
Because if you really believe that AGI was about to come about, why would
you spend $3 billion on, you know, essentially like a text editor with,
like, some AI components added to it?
Um, and, and so yeah, kind of curious about like how you size that up.
Like, do you buy that argument, which is like, yeah, it kind of seems
like maybe OpenAI is speaking out of two sides of its mouth here,
so.
I think OpenAI probably is speaking out of many different
sides of its mouth at all times.
But, um, I do think that it makes a lot of sense and I don't
think it's mutually exclusive.
Uh, so if you look at how OpenAI became the behemoth it is today,
they released a chat interface.
They found a UI that all of a sudden made their models relevant to the
mass consumers, and then they had
millions of people all of a sudden using that interface, generating
data that they use to bootstrap their way, like rocket ship their way
into really high performance models.
And I think what we're seeing is the killer use case of 2025 and probably
for a while is coding assistance.
And they don't have their own UI, their own access to developers in that arena.
So they're losing that advantage that
gave them this amazing starting point and position. And so I see it very much as
their, you know, and it makes total sense
they would spend this type of money on it, as their way to try and regain some of
that advantage and to better understand how their users are using the models and
figuring out how to continue to improve the models moving forward.
Skyler, this is like a little bit of a weird outcome though, right?
Because I, I can remember when, like, ChatGPT first came out and everybody
was doing kind of like startups around AI, people were like, oh, well, you're
just like a thin wrapper around GPT.
Or like, you know, that's not a real company, that's just a wrapper around
GPT. But, like, $3 billion? Like, it really does feel like these
wrappers are, are quite valuable now.
Right.
And it's kind of almost like an inversion from what we thought,
you know, earlier in the game.
Um, is that the right interpretation?
I think
while we're talking about doublespeak, or talking about both sides
of your mouth, I think on one hand you can call it a wrapper.
I think on the other hand you can view Windsurf or some of these other,
uh, companies as integrators, and OpenAI is great at model building.
Um, but they haven't, as Kate's pointed out, they haven't really
integrated into other spaces.
They had a great chat bot interface.
Um, and I think while these models are continuing to grow, integration is
the complementary scarce factor that's lagging behind. And so, yes, wrapper
or integrator, depending on which way you really view it.
Um, I do think, I, I do think OpenAI knows where it sits in
terms of the model building game.
Um, and they probably saw a bit of a, a bit of a weakness in their
own structure of how do we actually deploy this on people's machines?
That's not
a chat interface.
And so again, maybe thinking of this more as, uh, integrating systems into
the, uh, language models, uh, rather than a wrapper, is probably how you
come up with the $3 billion, as opposed to the, uh, just-a-wrapper take.
Um, how, how it plays out, we don't know.
Uh, but I do think there's this interesting take on the difference
between building models and then actually integrating those into workflows.
And this might be OpenAI covering its bases on the latter.
Yeah.
I love the idea that, kind of like, a valuable wrapper is an integrator.
Yeah.
It's like, yes.
That's, when, when, once you get valuable enough, like, that's what
you've transformed into. Um, Kaoutar, where's, where does this all go?
Right? Because it kind of suggests,
like, this sort of vertical integration in the space, where, you know, coding
assistance obviously is like a really big use case, as Kate mentioned.
And so it kind of makes sense that the model provider would eventually
kind of like get one of those, right.
And it would be vertically integrated.
Like, I'm kind of thinking about, like, are there other domains you think
that an OpenAI might be interested in?
Because I think what's interesting about AI right, is of course
that it can be applied across all these different domains.
And so it's kind of like, well, maybe it's not gonna be a $3 billion
acquisition, but like, where else could they be going, I guess,
that they might want to kind of create this sort of, like, you know, setup where
they both control the model layer and then also the application layer?
Yeah, that's a very good point.
And I think the, the example that Windsurf showed us here is they built
this sticky developer workflow and, uh, additional trust layer over GPT.
Like, you know, what we all were referring to as the wrapper.
And here, OpenAI's reaction,
it's just, they don't want just to own the model, but also the
developer experience and the ecosystem.
So it, it seems like we
enter here a phase where these verticalized copilots, for
example, for finance, for law, for science, for medical, et cetera,
they're the new battleground.
And owning the UX layer is a very strategic approach here, and I think
that's what's, you know, it's a smart play that OpenAI is doing, because,
as the model layer commoditizes here,
The moat is the ecosystem and the developer tooling.
And especially as we are moving more into these agentic AI systems, this vertical
integration becomes very important if you really want to have a strategic advantage
and be competitive in the marketplace.
Yeah, and I think it kind of leads to a world, um,
where it kind of feels like maybe OpenAI is gonna become, like, they're gonna
almost like take the Apple model, right?
Where like everything's vertically integrated, you know, they build the
hardware, they have like, you know, apps that are like, definitely their
apps and it's just kind of end to end.
Um, I mean, Kate, do you think that's gonna be the sort of future of AI where
you almost have like, kind of like some companies that are like Apple, other
companies that are just like kind of, you know, it's like the ThinkPad, right?
It's like a, a piece of a computer that you can run anything on?
No, I, I definitely agree.
And building on Kaoutar, I really like how you framed it.
As you know, we're starting to see
commoditization at the model layer.
And I think for a lot of, you know, tasks like coding assistance, we are
absolutely hitting a point where many models are gonna start to converge on
very similar layers of performance.
And so then how do you differentiate?
You make really high switching costs.
Or rather, how do you develop your competitive moat?
You make really high switching costs, so that, once you're kind of
in the ecosystem, you're not gonna switch over to whoever's, you know, offering the
same offering for a few cents cheaper.
And from that perspective, I think OpenAI and I think other providers are
going to continue to invest in that.
And that's why it's really important we continue to support a robust
open source ecosystem in order to
make sure that we have kind of a
diversity of technology, of thought, and ultimately are optimizing the
efficiency of generative AI and trying to continue to bring down costs and,
and push advantages and make sure that we don't just get kind of locked into
these, uh, single-provider ecosystems.
Yeah, for sure.
Skyler, any thoughts on this?
An analogy I've heard once before was, I don't know, you go back 30 years and
people defined their compute experience by what OS they used, you know, are you
Windows or are you Mac, and then that
converged. And then it was what browser you use that identified your user
experience, and those have converged.
Um, right now we're in the space where people, you know, swear by one
particular, uh, LLM and I do think that will eventually converge as well.
There will be small nuances here and there, but at least from
a consumer perspective, I do see some, uh, converging.
Um, so yeah, we've seen it before happening over technology
where that sort of decision defined your compute experience.
And then fast forward five years and you can see that actually a lot
of the options are pretty similar.
Um, I can see that sort of progression happening, um, here with
your chatbot of choice.
Yeah.
It kind of makes me think a little bit about, if you remember
that old commercial, like, oh, I'm, I'm a Mac, I'm a PC.
Yep. It's like, I'm waiting for that commercial.
That'll be like,
I'm a, I'm an open, I'm an OpenAI coding assistant.
You know, like, I'm an open-source coding assistant.
Uh, well more to come soon.
Um, as always, action-packed, a lot to cover, uh, way more to
cover than we have time for.
Um, but as always, thanks for joining us.
Skyler, great to see you again, Kaoutar, Kate, always great
to have you, uh, on the show.
And, uh, thanks to all you listeners.
Uh, if you enjoyed what you heard, you can get us on Apple Podcasts, Spotify,
and podcast platforms everywhere.
And we'll see you next week on Mixture of Experts.