# NY Tech Week: AI and Quantum

**Source:** [https://www.youtube.com/watch?v=tU9Jal1-E6c](https://www.youtube.com/watch?v=tU9Jal1-E6c)
**Duration:** 00:44:35

## Summary

- Ash Minhas highlighted an IBM quantum‑computing event where participants accessed IBM’s quantum hardware via Qiskit and built an “8‑ball” circuit to generate random predictions.
- Anthony Annunziata announced a panel examining the business impact of open‑source AI, focusing on its value‑creation potential and unique advantages for enterprises.
- Sarah Amos described her IBM‑hosted masterclass on red‑team testing for multicultural and multilingual AI vulnerabilities, emphasizing hands‑on security practice.
- The “Mixture of Experts” podcast previewed upcoming discussions on major market reports (e.g., Mary Meeker’s analysis, Linux Foundation findings) and unusual behaviors observed in Claude 4.
- Attendees noted that New York Tech Week attracted a highly diverse, geographically dispersed crowd—including many students and long‑distance travelers—reflecting strong enthusiasm for AI career opportunities.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=0s) **NY Tech Week Highlights Unveiled** - Panelists discuss quantum demos, open‑source AI business impact, and multilingual red‑team masterclass at IBM’s New York Tech Week.
- [00:03:05](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=185s) **Upskilling, AI Spread, Real‑World Focus** - The speakers emphasize the importance of professionals upskilling and sharing knowledge to steer AI toward concrete business applications, highlighting open‑source collaboration and the contrast between East‑coast emphasis on practical use cases and West‑coast hype‑driven talk.
- [00:06:09](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=369s) **AI Business Shifts to Application Layer** - The speaker highlights how global tech hubs like New York nurture innovation, while noting that the AI industry is transitioning from model hype to focusing on practical, last‑mile applications as the main source of value.
- [00:09:11](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=551s) **Open‑Source AI Fuels Global Innovation** - The speaker explains how increasingly accessible open‑source models are expanding AI experimentation and culturally tailored solutions worldwide, highlighting IBM’s AI Alliance and its recent launch in Vietnam.
- [00:12:17](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=737s) **Open Source Dominance in AI** - The speaker cites a Linux Foundation report revealing that 89% of organizations use open‑source components and 63% adopt open models in their AI stacks, suggesting that open source has effectively won the open‑vs‑closed debate.
- [00:15:21](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=921s) **Open‑Source AI Adoption Challenges** - The speakers discuss how impressive model performance is hindered by early‑stage, developer‑driven adoption that relies on open‑source transparency for customization, while also highlighting the resulting safety, bias, and fairness concerns.
- [00:18:21](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=1101s) **Debating Openness in AI Models** - The speakers examine how open‑source AI models can enhance security and reduce costs, while wrestling with the ambiguous definition of “openness”—from transparent safety practices to the reality of closed model weights—and anticipate emerging norms to clarify the term.
- [00:21:25](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=1285s) **Unprecedented Cost Gap in LLMs** - The speakers discuss a slide noting that training expenses for large language models are soaring while inference costs drop, raising concerns about profitability in what is increasingly seen as a commodity‑type business model.
- [00:24:28](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=1468s) **AI Safety as Competitive Advantage** - A speaker argues that firms can differentiate and generate new revenue by prioritizing AI safety, standardizing evaluations, and building a safety ecosystem that adds high‑margin value layers above the core models.
- [00:27:35](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=1655s) **Trust, Safety, and Market Incentives** - The speakers argue that as AI models proliferate and become more stochastic across supply chains, market pressures to boost adoption may undermine safety safeguards, echoing the “move fast and break things” lessons from social media.
- [00:30:37](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=1837s) **Towards Universal AI Model Standards** - The speaker emphasizes that as interpretability advances, the industry and regulators will create common classifications and guidelines for AI systems—reducing disclaimer reliance and filtering hype—citing the recent Anthropic Claude release as an example.
- [00:33:40](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=2020s) **AI Models Are Not Magic** - The speakers stress that large language models function as statistical next‑token predictors, not divine creations, highlighting alignment progress, uncertain data sources, and debunking myths about hidden script content like the Terminator.
- [00:36:46](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=2206s) **Human-like AI Reliability Concerns** - The speakers explore how modern reasoning models exhibit increasingly human-like, sometimes unreliable behavior, raising questions about closing the value gap and ensuring dependable enterprise-scale deployment.
- [00:39:51](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=2391s) **Interface Design Impacts LLM Trust** - The speakers argue that the way we present large language models—especially chat‑style interfaces—shapes user expectations and safety considerations, suggesting UI choices are as crucial as model behavior.
- [00:42:56](https://www.youtube.com/watch?v=tU9Jal1-E6c&t=2576s) **Envisioning Future LLM Interactions** - The speaker muses about LLMs possibly having “off days” to temper user expectations, reflects on training and inference costs, and imagines everyday, integrated AI experiences—ranging from calendar management to children conversing with everyday objects—shaping how different generations will engage with conversational models.

## Full Transcript
What's the thing you're most excited about for this week's New York Tech Week?
Ash Minhas
is a Lead AI advocate.
Uh, Ash, welcome to the show.
Uh, what have you been seeing?
So, I, uh, went to a, an event here at IBM's offices, um, on quantum computing.
And actually I had a great time.
Because everybody in the room managed to get time on one of our quantum
computers using Qiskit, and we built this, uh, circuit that, uh, basically
emulates an eight ball and like, you know, sort of making random predictions.
It was really cool.
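The "8-ball" demo Ash describes can be sketched in a few lines. The following is a minimal classical sketch of the same logic, not the actual event's code: in the Qiskit version, the random bits would come from measuring qubits placed into superposition with Hadamard gates on real hardware, while here they come from classical randomness so the sketch runs anywhere. The answer list and function name are illustrative assumptions.

```python
import random

# Illustrative Magic 8-Ball answers; 8 entries so a 3-bit string indexes them.
ANSWERS = [
    "It is certain.", "Outlook good.", "Ask again later.", "Very doubtful.",
    "Signs point to yes.", "Don't count on it.", "Yes, definitely.",
    "My sources say no.",
]

def eight_ball(bits=None):
    """Map a 3-bit string to one of 8 predictions.

    In the Qiskit version the bits would be the measured outcome of a
    three-qubit circuit with a Hadamard on each qubit; here we fall back
    to classical random bits.
    """
    if bits is None:
        bits = "".join(random.choice("01") for _ in range(3))
    return ANSWERS[int(bits, 2)]
```

With three qubits in uniform superposition, each of the eight measurement outcomes is equally likely, which is exactly the behavior the classical fallback emulates.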
Anthony Annunziata is Director of AI Open Innovation.
Anthony, what will you be seeing this week?
Today we're hosting a
panel on the business impact of open source AI.
You hear a lot about open source AI from the technology perspective.
Today we're gonna explore its business impact, the value it
can deliver, and why it has some unique advantages for business.
And uh, Sarah Amos is Product Manager at, uh, Humane Intelligence.
Uh, Sarah, what will you be doing for New York Tech Week this week?
Yeah, so one of the most exciting things I did was actually host a
masterclass here at IBM, in which we had a whole bunch of people
conduct red teaming for multicultural and multilingual vulnerability.
All that and more on today's in-person episode of Mixture
of Experts, a Think podcast.
I am Tim Hwang, and welcome to Mixture of Experts. Each week, MoE
brings together the friendliest, most interesting and smartest panel of
technical experts, product leaders, and market analysts to talk about the
big stories in artificial intelligence.
We have a lot to talk about.
We're gonna talk about some really big market reports that have come
out from Mary Meeker over at Bond and the Linux Foundation, and we'll be talking
about some really weird behaviors coming out of Claude 4 for the, uh,
around-the-horn question.
I really want to start with New York Tech Week, which is this week, and one
of the reasons why we're here in person.
It's the largest New York Tech Week ever.
And I'm kind of curious about like sort of the trends that you all have been
seeing as you've been going out there.
Maybe Sarah, I'll, I'll start with you.
'cause you actually taught a masterclass.
Curious about like what people are interested in, what people
are talking about, what's hot?
Yeah, so I think one of the things that struck me was just how much involvement
there was from folks coming out of town.
I even had a participant tell me that he had traveled over 4,000
miles just to come to New York Tech Week, which is pretty impressive.
We see geographic diversity, but we also see a lot of young folks, folks
who are either, uh, in their final stages of college or coming out of
college and looking for new jobs.
And obviously New York is, is an exciting place to be, but it's also, uh,
the idea that AI is such an important part of their future.
So that, that was the buzz that I was hearing all about the week.
Cool.
Yeah, I think the students are like a big part of this.
I keep seeing them at all the events and like, it's interesting how much...
Like AI itself has become like the thing that everybody wants to do when they
like, get outta college or whatever.
Um, and yeah, I'm kind of interested, I mean, you know, you may have heard, uh,
Dario Amodei, CEO of Anthropic, recently made these comments being like,
all the jobs are in trouble because of AI.
Kind of curious about how that's resonating among folks who are A,
graduating just now and then B, you know, really interested in this technology.
I mean, everyone's nervous with a headline like bloodbath.
I mean, how can you not be, right?
It's very dramatic.
Yeah. That was in the words of Dario.
But, um, I think folks are still optimistic, um, wanting
to be part of that future.
And I think it's about trying to upskill themselves and
also teach others around them.
Um, because if they can catch this wave and also steer it towards
their own career goals, then it is very beneficial for them.
I think all of us as, as an industry and especially thinking about AI
alliance and open source efforts that IBM has championed is how can
we make sure that that innovation is spread around the world too?
Yeah, for sure.
I think I see you nodding.
I dunno if you wanted to get in here with a comment at all.
Well, I,
I mean, it's a great perspective,
Sarah.
I agree with all of it.
Of course.
Uh, yeah, maybe I'd add, uh, one or two things.
So from Tech Week in New York, being in New York, I think one
of the really healthy themes here is actually applying AI right?
In specific areas of business and beyond, right?
In finance and legal and advertising.
If you go to like a conference on the West Coast, right?
You hear about tech for tech's sake.
You're very much making it an East Coast, West Coast thing.
I didn't try to, but maybe. Yeah, I didn't try.
What you hear here much more is like what people are doing with AI, like what
it needs to do in the real world, what it's doing, like specific use cases.
And I think it's really healthy and if you think about jobs and skilling and
impact, that's where most of the impact and changes in AI are gonna happen,
right on the front lines of using it.
Or something.
Mm-hmm.
Yeah, absolutely.
So Ash, if I can turn to you, I mean, you know, it was a little bit shocking
your answer because I feel like Quantum, we've been hearing about for such a long
time, and I think everybody always tells me, they're like, ah, but quantum's
like years away, we're never gonna be able to actually make it practical.
It's not really like a real thing, but it sounds like you actually
got to like play with a real.
Quantum computer kind of sounds like, right?
Yeah, yeah.
Um, that's right.
So our Qiskit is like sort of, uh, online and open to everyone.
You can just go and Google it and look for it and um, you
can actually get compute time on one of our quantum computers.
And I think that was a real attraction to sort of the audience, was that you can
actually make a circuit and run it and watch it run and get output out of it.
And I mean, you know.
To address your comment around, you know, is this real?
How far away is this?
I mean, if I'm doing that
in a lab during New York Tech Week in 2025.
I mean, and you just sign up to play with it.
It's like ridiculous.
Yeah. When you sign up to play with it, right.
Then that's, I mean, that's pretty real, right?
Mm-hmm. Yeah.
Yeah.
So how bullish are you coming outta that?
I mean, you know, I think the funny thing when we talk New York Tech
Week, it's like we're actually just talking about AI, right?
But it kind of sounds like here, I mean, there's other stuff going on
and I think it's so interesting.
I mean, Anthony, you kind of brought up.
Like this contrast of like, oh, you go to kind of West Coast, you know,
sort of AI events and it's like very abstract and it's like, look at
this crazy new model that we built.
But like here, it feels like there's a lot more of a culture in New York of like,
well, it's all about like application.
What are you actually gonna do with all this stuff?
Um, and I know you've been around the sort of East Coast tech world for a long time.
Do you think that's always been kind of the case?
Or is this sort of like changing or, I don't know if you, like, you feel
like the, the technical cultures are becoming more distinct with time?
A couple things.
Yeah. I'd say
that, um.
I'd say New York and most cities outside the Bay Area, California,
uh, are more kind of practically and application oriented.
Yeah. Nothing against the Bay Area or anything.
No, I love it.
It's great.
No, it's great.
It's unique place.
Yeah, for sure.
It's amazing and, thank God it exists,
right, for the world.
But at the same time, it's not the only place.
And I'd say like New York, like other places, London.
Paris, lots of places in the world, Tokyo.
They care a lot about what the technology's going to do, how it needs
to reach users and reach applications.
Right.
All those like last-mile things that aren't just the last
mile, there's actually a lot there.
Yeah.
Has it always been like that?
I'd say like New York and most places have, you know, uh, maybe always been.
Like that.
Mm-hmm.
Uh, yeah.
You know, in the last 10 years I think, like New York has been just growing,
growing as a tech scene and, you know, but, uh, I, I think it's like really
good and I, I see it staying grounded in like what you wanna do
with tech, right?
Yeah. Which I think is really healthy.
For sure.
You know, Sarah, one of the things we talk a lot about or have been talking a
lot about, uh, on MoE has been sort of the idea that like the business of AI
is changing in a pretty fundamental way.
Where, you know, 24 months ago it would be like, oh my God, this new model and
like you just had, you know, this huge acquisition of Windsurf, right?
And, and so we've been talking a lot about how like, it seems
like this application layer.
These like actual, like practical implementations are becoming where
like a lot of the value is in AI.
And do you think that will kind of change like which cities are dominant?
I think it's kind of like a really interesting question if it turns out
that like actually where the action is happening on AI is in New York.
Yeah.
Because in some ways that's where the value in AI is flowing.
Yeah. Do you agree with that?
I'm just kind of riffing a little bit.
No. Yeah.
I mean,
I, I love this question because both, as a product person, I'm always thinking
not in terms of like technology
first and then putting it onto a problem.
Um, but rather trying to identify the problem, understand it, and
then, then find the solution.
Um, so there's that.
But then also I do think New York is uniquely, um, capable of creating more
creative applications, and this is just a function of being the place
that so many people go to, right?
So unlike a single industry city like the Bay, we've got, uh, arts, we've got media,
we've got finance, we've got fashion.
And I think that even downstream in terms of the people working
at these companies, most of my friends aren't in the tech industry.
And I think I, I gain a lot by exposing myself to people in different
industries and also understanding their concerns or their optimisms about AI.
And so I think having that greater understanding of a customer use case
means that us in New York, we can craft products that genuinely meet
their needs as opposed to perhaps just technology for technology's sake.
So yeah, I'm bullish on the application layer and that
also being important as we see
companies continuously investing in models.
Does that become more of a commodity?
Can the application layer be where you differentiate yourself?
Yeah, for sure.
Ash, do you agree with this?
This is like a very East Coast-centric take,
I know, we're sitting here, you know, like, yeah, Madison Square
is like a block away, you know. Um, curious where you land on how, you know,
Anthony and Sarah are assessing all this.
I think that, um, what we've seen over the last couple of years is as the
cost of inference drastically reduces,
there's gonna be more inference.
Right?
Okay.
And in essence, that means that there's gonna be more people who have access to
the technology, especially now as the open source models are now comparable
in performance to some of the more proprietary ones, that we're gonna see
innovation come out in all sorts of places. For sure,
big cosmopolitan areas like New York or London or Paris.
Okay.
It's just a, a melting pot of culture.
And the combination of
lower inference costs, the ability to experiment and innovate
quickly, and those melting pots of culture is obviously gonna, uh, breed
a lot of innovation here using AI.
Uh, but I think that we may find this happening in all
sorts of other places as well.
Like, I'm thinking like agriculture, wherever the
farms are, like, not farms here.
But there may be innovation there, for example.
I was gonna agree fully and give a couple examples.
Yeah, please do it.
Drop 'em.
So the, the main program that I'm responsible for at IBM and and
globally here is the AI Alliance.
Which is a program that brings together a lot of different organizations
who are working in and around open source AI, and it's very global.
So two months ago I was in Vietnam in Hanoi, uh, launching kind of
a chapter there, and there's a very vibrant scene of startups and
companies that are taking advantage of, of open source and AI, right?
Open models, uh, creating, you know, custom versions, creating things
that reflect what they need in that culture, that language, that
business environment. In Africa,
similar things are happening with startups that are operating,
uh, more on the edge, right?
Mobile based tech is like really big and important there.
Uh, you can't do that tying into, uh, you know, a centrally
hosted API to a big model, right?
So there's lots of ways that, uh, open source AI in particular.
In the tech scene is uniquely helping and addressing like people
and use cases like globally.
I think you're gonna see a lot more of that.
Any final thoughts, Sarah, before we move
on to the next topic?
Yeah, no, I think this kind of circled up nicely because the, the real issue isn't
SF versus NYC, even though this is New York Tech Week for sure, and I've
got my New York Tech Week, um, hat on.
But totally, these points about how open source is really broadening
out and democratizing tech.
Um, so if, if a farmer in rural Kenya has the same access to an open source
model as perhaps, um, a user in a cosmopolitan city, what gains can
be made and spread throughout the population that we can all benefit from?
So that's where I'm the most excited.
Nice.
That's great.
Well, a lot more to look forward to and, uh, a lot more events
here at, uh, New York Tech Week.
Alright, so I'm gonna move us on to our next segment.
There's two sort of big industry reports that just came out fairly recently.
One from the Linux Foundation and the other one from, uh, you know, the
legendary Mary Meeker at Bond Capital.
Uh, most known for her like voluminous
slide decks, um, which, you know, have largely kind of focused on the internet.
But what's so interesting is that this year's kind of drop
was like very AI focused.
Um, and so I wanna kind of talk about a little bit about both of them.
'cause I think often it's like there's so much going on in AI, it's really
hard to kind of collect all that data and like have like a kind of
grounded conversation in what's going on.
Um, and I wanted to start first with the Linux Foundation report, Linux Foundation,
of course, being in the open source world.
Um, and uh, I think the stat that I really wanted to talk about was this one, I'll
just kind of quote it, which is that they found that a significant majority,
89% of organizations are using some form of open source in their AI stack.
And almost two thirds, 63% of companies are using an open model.
Um, and you know, in the past, I think when we had this discussion
in the past, it's been like.
Oh, is, is closed source gonna win?
Or is open source gonna win?
Or you know, how is open source adoption happening?
This report kind of suggests is like, has open already won?
Like, I don't know if we're like already in a world where like open
source models in some ways have the advantage because they've just
been adopted by almost everybody.
And so I don't know if like this kind of classic distinction between
like open versus closed is even like a worthwhile debate anymore because
open dominates in so many places.
But I, I think I'll point it to you first 'cause you're looking at me skeptically.
No, not skeptically.
Yeah.
More in agreement.
Uh, but let me try to Yeah.
Dig in a little bit.
Sure.
Uh, so first being in open source AI, I wasn't too surprised by most of
the conclusions of that, uh, report.
Yes. It's great to see it.
All in one place?
Yeah.
Uh, it's great to see it in one place.
It really is, for sure.
I'd say like on that, like open versus closed debate, like.
Yeah, I think it's more nuanced, right?
Mm-hmm. Take that statement.
89% of organizations are using some form of open source in their AI tech stack.
Of course they are.
I mean, Linux is open source.
You know, PyTorch is open source.
Many, I mean, many things are open source outside the model, right?
Yeah.
The models themselves, that's a healthy statistic of growth, right?
That's great.
That, uh, two thirds, yeah, about 63%,
are now using, uh, some form of open-weight model.
Mm-hmm.
That's really great.
Um, again, I'm not, not too surprised.
Of course they are, but yeah, maybe it should be, right?
Mm-hmm.
Because like, if you think about two years ago it looked like, you
know, AI maybe was gonna become kind of like cloud service style, right?
That's right.
A few clouds would have the APIs and that's everybody
would just use them, right?
Mm-hmm. Yeah.
It would be so great and easy and that that's, that's all you would need.
So it's kind of nice to see that
not play out that way.
Mm-hmm.
But you think it's still like a story in progress.
Like you, you see two thirds and you're like, well, there's still
that other third that could be open.
That's true.
Sure.
But I'd say more so like it's toward a more nuanced view.
Right. Uhhuh?
I think there's gonna be proprietary things that every organization
uses in AI in their stack.
Some will probably use some proprietary model services alongside open models.
Um, some will use it as an opportunity to focus on bringing the
proprietary differentiation to a different part of the stack.
Right. Higher up.
Yeah.
So at the application layer, as Sarah was talking about.
Yeah, for sure.
And Ash, I'm curious how, like, 'cause it seems like where Anthony's kind
of pointing us is sort of the idea of like, it's not really open versus closed.
What we're gonna see is like everybody's gonna use open to a
greater or lesser degree and there'll be like different
paradigms maybe of integrating open.
Is that kind of what you're seeing in your work?
Yeah, uh uh, for sure.
And I think that, um, one of the primary drivers for this is that
the space is still pretty nascent, right?
I mean, we have great model performance, okay, but the adoption of those
technologies and using them in like functional ways that add value and
bring, you know, sort of like a, a healthy return on the time and the
effort that's put into using them, we're still nascent and we're trying
to like work out what those things are.
And yeah, we have some core use cases now, but.
For a lot of organizations, it's the developers that are driving this.
Mm-hmm.
Right.
And they need to know what's going on in these open source pieces of
software and models because they're still tweaking and they're still
customizing and they're still adapting to the use cases that they have Right.
In within their own individual organizations.
And if the stack wasn't open to an extent, that wouldn't be possible.
Yeah. I love that argument and.
Sarah, curious if you have some comments on this.
'cause it's like, Ash, what I hear you saying is like, we have no idea what we're
doing in AI and like, isn't it great that it's open because like otherwise we would
really have no idea what we're doing.
I don't know.
And this is all these implications for safety and bias and fairness.
Yes, exactly.
Exactly.
No, I mean, open source, it.
It's so interesting from a safety perspective because what sometimes
comes to mind is, alright, open source models have historically been used for
harmful purposes, that perhaps closed source models will create guardrails
around to prevent that behavior.
Um, but I think saying, you know, proprietary good, open, bad from a safety
perspective is obviously too naive.
I think, you know, we.
Have greater transparency into the safety measures of open source models, right?
If we are only trusting proprietary closed source models on their own safety
measures, we're taking them at their word.
Whereas the beauty of open is that now the whole world is a tester, right?
They can red team it, they can analyze it, they can go through the code and
identify where vulnerabilities might be.
And so that's where it's promising to me 'cause greater transparency, um, helps
you know, safety in the long term.
Hmm.
Yeah.
And do you think, actually, I mean, I think one interesting historical
comparison, right, is like, you know, Apple versus Android, right?
Which would be the classic one, you know. And there,
I think the way I often hear the story told is, well, Apple's closed.
Everything's controlled end to end and as a result it's more secure and more
private and all these sorts of things.
And you know, Android being an open platform has a lot more security
risks and you know, all these things we need to worry about.
But you actually told a story about AI, which is like almost the flip, right?
Where you're like, actually there's all these security advantages or safety
advantages that come from openness.
Do you think AI is gonna work in a very different way from what we've learned,
I guess in the mobile ecosystem?
Or like, are these different cases, I guess is what I'm trying to say?
Yeah, no, it's interesting 'cause um,
I think, I think it depends.
So if the closed source model companies do decide to open up and engage more
with the community in terms of red teaming, then they could take the
benefits that I just described that open source models do benefit from.
Um, however, uh, yeah, I mean if we think about, uh, similar to bug
bounties for cybersecurity, like.
Our nonprofit Humane Intelligence, we have bias bounties.
Yeah.
And so we are able to do those with open models.
Therefore, that leads me to believe that there's gonna be more of an adversarial-for-good, white-hat hacker community keeping on top of where the security vulnerabilities may lie within open.
Um, and then my last thought is just, you know, in terms of, uh.
Especially the cost savings for customers who are gonna adopt open, their
ability to perhaps run these models, um, locally and then have even more control
over their own security, uh, risks.
And I guess Anthony, this goes to like an ongoing debate, I think in the space
and like this was actually, I know like one of the bits of discussion
around the Linux report is like, what is, what does open mean here?
Right? Because open could be:
we're, we're, we have open, you know, uh, transparency into what we
did in order to make the model safe.
But the model has closed weights, right?
Like, is that a form of openness?
You know, I think you certainly meet, you know, uh, free software radicals Yeah.
That are like, nothing is open enough for us.
Um, yes.
And I'm curious about how you see that kind of meta resolving.
Like, are we gonna get to some kind of common norm about like, yes,
this model's open versus not open.
'cause I guess, Sarah, what I hear you saying is it's very, it's fuzzy, right?
Like what openness means in this space.
I think eventually we will.
Mm-hmm.
I think there's gonna be plenty of debate and evolution, right.
In the meantime.
For sure.
I think we need to stay focused on like, the practicality of why anything
open matters or why something that's transparent is important, right?
It's the ability to understand it, to improve it, to adapt it, uh, to
use it as you see fit and therefore derive value in your own way.
Those are kind of the fundamental principles.
Mm-hmm.
And so if we think about software, after a few decades, we have a really rigorous definition of what open source software means, and different licenses for, um, for how to, uh, right, to enable use.
Mm-hmm.
AI, like a pre-trained model, has really only been on the scene in a
big, broad way for a couple of years.
Right.
Yeah. And it's complex, right?
Um, is it a data artifact kind of
mm-hmm.
Uh, is it more like software?
Kind of mm-hmm.
Is it unique from the, from the two?
Yes, it is. Uhhuh, um, it has like compressed capability
and call it intelligence.
That, right, that no kind of shell of software alone has.
So I think it's gonna take some time.
I think we need to stay focused on why it matters, which is in my, in my view,
like a practical view of it, right?
Mm-hmm.
Yeah.
And if we can, if we can keep that focus, I think the definition
will continue to evolve.
And I think eventually, we'll, we'll wind up with sort of a commonly accepted
definition of what open source AI means.
Yeah. But
it might just not be until like 2050, basically.
So we'll see.
Yeah, we'll see.
Maybe before that, but that's right.
It might take a little while.
For sure.
Um, I'm gonna move us onto the second big industry report, uh,
which is the Mary Meeker report.
This 300 plus page slide deck, um, it's, it cites a lot of the stats that
I think we're familiar with, but I think was useful for me to at least revisit.
There's a great chart in there, which is like, how many days did
it take to get to a million users?
And it's like.
You know, it's a fun comparison.
It's like the Model T car, you know, TiVo, um, the iPhone and then
at the very end it's like OpenAI, like five days to a million users.
Which, like, I think again, the deck was useful for me, just reminding myself how crazy this period is that we're living through.
Um, but Ash, I wanted to talk to you about, in specific, about
one comment that's hiding in one slide in like eight point font
at the very bottom of the slide.
And it says, quote, in the short term, it's hard to ignore that the economics of general purpose LLMs look like a commodity business with venture-scale burn. Which, translated in my mind, is like: this is really expensive, and it's still kind of unclear whether or not it's a business you can make more than commodity profits on.
What do you think about that?
Is that, is that concerning?
Yeah, it is.
That stood out to me as well.
Uh, I mean the, one of the first things that she says, and I think
that kind of underlines most of the report, is the word unprecedented.
Okay.
Right.
And, and, and.
In, in that vein, right?
This is unprecedented.
The amount of money that's being invested in training these large
models seems to be going up.
The GPUs are getting more efficient and you know, their power requirements are
kind of going down, um, as well as sort of like, um, their cost for like inferencing.
Um, but it kind of creates this like chart, which is like
costs are going up to train them, costs are going down drastically to run them.
So where's the math between those two things that's gonna close that gap and bring a return on investment for all this money that's being poured into this?
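Ash's "where's the math" question can be made concrete with a toy break-even sketch. All the dollar figures below are illustrative assumptions, not numbers from the Meeker report:

```python
# Toy break-even sketch: how many paid inference requests are needed
# before inference margin covers a model's training cost? All numbers
# used in the example call are illustrative assumptions.

def breakeven_requests(training_cost: float,
                       price_per_request: float,
                       cost_per_request: float) -> float:
    """Requests needed for per-request margin to recoup training spend."""
    margin = price_per_request - cost_per_request
    if margin <= 0:
        raise ValueError("No positive margin per request; cost is never recovered.")
    return training_cost / margin

# Assumed: a $100M training run, $0.01 charged per request, $0.002 to serve one.
n = breakeven_requests(100e6, 0.01, 0.002)
print(f"{n:,.0f} requests to break even")  # 12,500,000,000 requests
```

Under those assumptions, the gap only closes at a scale of billions of requests, which is roughly the shape of the bet the industry is making.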
So Ash, what you're saying is like very concerning, right?
Like how do we fill that gap?
It's unprecedented.
The situation that we're in.
Um.
Sarah, what's the solution?
Yeah, well, you're not gonna like my answer.
'cause of course, being a trust and safety focused person, I read it with a
little bit of a different lens, so, okay.
Yeah. Well what, what was your read?
Yeah, yeah, yeah.
So out of the 300 and some odd pages, uhhuh.
So many dedicated to the potential revenue.
You know, so many charts of hockey sticks, Uhhuh, I swear I was at a Rangers game.
But, um, what about safety is my question Uhhuh?
Sure. And I get it.
It is a VC-created report.
Yeah. And that is not the main thrust of it.
However, I do think we need to be having a more nuanced conversation
about when we are deploying a technology to so many users and how
responsible scaling is actually a good business decision.
Mm. Um, I think when I was looking through it, I found like the word bias once uhhuh.
Sure.
There's a little bit of a concern and so I just do think, you know, no, no
shade to the queen of the internet.
But I do think that might have been a little bit of a missed opportunity to just talk through some of these issues, which could be barriers
for consumers trusting the technology.
Mm-hmm.
Yeah. And therefore adoption.
And I do think that then smart businesses are going to want to make
sure that they, um, deploy it safely, not just to avoid regulatory pressures.
Mm-hmm. Especially in the EU.
But also if you think about it from a cost savings perspective, uh,
finding a bug after you deployed way more expensive to, to fix than
if you can catch it in testing.
Yeah. Yeah.
So of course, that's why I, I beat my drum around, uh, more,
uh, more robust evaluations.
Yeah.
And you actually think that that will be like a, that will be a
commercial phenomenon as well.
Oh yeah.
I mean, the sense that like kind of Ash is offering this question, which is how
do we navigate this world where the costs are crazy and we're still waiting for
the kind of business value to show up?
Are you kind of saying.
The competitive advantage here will be something like
safety, maybe something like,
yeah, it'll, the competitive advantage for the firms deploying AI will be
safety and how they can offer that as part of the product to customers.
But I also think there is an untapped market for, uh, firms that want to also take advantage of this.
So building out a broader safety ecosystem.
I know we were just talking about how, in the open model environment, we don't have certain standards, and we would like to standardize those.
I'd say the same for evaluations.
Yeah.
So there's a lot of, uh, potential revenue there that Ms. Meeker did not touch upon.
If you're listening, Mary Meeker. Um, and you're nodding, I don't know if you wanna, you wanna get in on this. A couple things.
Yeah, for sure. So first I,
I agree with that direction.
I'd take it a little further.
Uhhuh, I'd say yes.
Value creation, uh, profit margin mm-hmm.
will be in layers above models, just like there are layers above computing hardware.
The layers that are closer to the application, the layers that different companies with use cases are gonna focus on.
There's lots of value and there's lots of, lots of margin there.
Mm-hmm.
Right.
On the topic of like overinvestment in AI, I think it's really
interesting if you take a step back
mm-hmm.
And think about the macroeconomic picture here.
Isn't it amazing that a set of investment decisions that happen,
like at a micro level, right?
Do I invest in that startup?
How much, what's the likely return, what are the rounds gonna look like?
Results in an incredible overinvestment.
Mm-hmm.
It's unprecedented in the, in
the ecosystem.
But isn't that amazing Uhhuh?
Because look at how fast it's pushing progress and competition.
For sure.
Yeah.
Like no rational decision at a macroeconomic level would ever
place that much like funding into AI development, but it's happening.
Yeah,
right.
Because this series of all of these micro decisions and, and startups and funding
rounds and all that collectively created this like amazing accelerator of progress.
That's right.
Yeah. Wow.
Are you...
I'm pretty excited by that.
Yeah.
And are you saying like, are you kind of making like a
wisdom of the market argument?
Right. Which is they wouldn't do this.
Right.
Well, what we're discovering is that people really do have confidence
that this is gonna generate value.
Well, I'd say there's overconfidence.
Sure. Okay.
And I think many of us will benefit from overconfidence.
That's right. Some people will lose a lot of money.
Yes. But I think that's okay, actually, in the grand picture, because we're all gonna benefit.
Yeah.
And there's a great book that came out called Boom, uh, I
think it was earlier last year.
Right?
It was kind of arguing that like, even, even irrational bubbles, which you could
try to make this argument, uh, have all these spillover benefits, right?
And like we should actually keep our eye on some of that.
Um, Ash, you wanna respond to some of these comments because I feel
like in some ways maybe you're holding back a little bit, but maybe
you're a little bit more skeptical.
One question that I always keep asking myself.
Okay.
Um, is...
Whenever you use something that's using a generative AI based backend,
mm, you'll see a disclaimer.
Like the answers might be wrong.
Double check them.
Is that gonna be forever, Uhhuh?
I mean, we all just gonna live in a world where AI is everywhere and
everything could all be wrong, and we just have to double check everything.
Like that's a really, really important thing to consider.
Right? Sure.
As we go and like sort of proliferate, um, models across all sorts of, um, uh,
supply chains and, uh, and, and, and, and value chains of, of information.
If all of that goes from being really, really
sort of deterministic to stochastic, then what do you trust anymore?
Mm-hmm.
Right.
Yeah.
And I think this is, it is like, I think one anecdote that I have in
mind, Sarah, when you were talking, kinda making the case that like maybe
safety is one of these things that you build value on, on top of, you
know, the hardware or, uh, the model.
Um, is is the case of, uh, ChatGPT image generation.
Right.
Where I think like one view you could have of that is that they concluded that,
um, consumers actually want less safety.
Mm-hmm.
Right?
We get more adoption the less we control the activity of the model.
Mm-hmm.
And this is kind of a perverse outcome, right?
Which is like maybe the market incentives are pushing people to
get more value out of the market by reducing their commitment to safety.
Is that a good interpretation?
Or, I don't know if you.
I can't help but think about
the lesson that I would've hoped we learned in the last
20 years with social media
and that lesson was, uh,
well, is that, uh, when you move fast and break things, uhhuh,
uh, you also break people, right?
And like especially as this is adopted at an even faster rate mm-hmm.
Than social media adoption according to the report.
Why can't we learn our lesson and do you know, more responsible scaling?
Mm-hmm.
You know, make sure that it is a business requirement for these models.
And I think, unfortunately, a lot of it is the genie out of the bottle.
Mm-hmm.
OpenAI, uh, releasing ChatGPT into the wild, probably a little prematurely,
Mm. Has sort of made it just the norm that these half-baked products are going out.
Mm-hmm.
And I do worry that, um, that, uh, business leaders who are making
decisions on which of these products to implement, and especially across
huge enterprises, are overestimating their overall capabilities.
Mm-hmm.
They're also looking at
these benchmarks, which purport to show high performance, but also a
benchmark is a very narrow view of the overall performance of a model.
Yeah.
And so I do, I do wonder if, you know, we've already seen some of
these AI first companies like, uh, Duolingo now backtracking, right?
Mm-hmm.
Mm-hmm. Yeah.
And actually hiring more people.
But I do think we are gonna be in a bit of a thrashy period as people, especially
businesses like very enthusiastically adopt, try to implement it.
There's the reality of any time you try to implement anything into any system mm-hmm.
There's some blowback and then we're kind of left now questioning, all
right, where do we go from here?
Yeah, for sure.
Ash, final comment on this is like, I mean, you offered this prompt
by saying, you know, everybody's got these disclaimers, don't
trust anything this model says.
Yeah. Right.
Um, and I guess, Sarah, what I hear you saying is, well, you know,
maybe we're like in this like.
Man, I hope we remember the lessons of social media moments.
Like, do you think it's gonna be like 10 years?
Everybody will be like, oh God, these models, you know, we really gotta have
a renewed commitment to, you know, veracity and validation in, you know,
model outputs or something like that.
I,
I, I do think that, um, there's lots of things, um, being developed currently.
Like, um, I think we may have talked about this on a past episode around
like mechanistic interpretability.
Yes. Yeah.
Right.
I think that, um.
As those areas mature, we'll have things in place, sort of controls
that should, you know, hopefully make those disclaimers less required.
And we will mature as an industry and get to a point where we'll have universal agreement, just like we will around what's an open source model and what's not.
Around, you know, this model meets some sort of classification, which
means it can be used for this purpose.
Mm-hmm.
Yeah.
I think it's important that sort of, like, industry as well as the government side put some effort into doing that, to make sure that we're using not just AI, but we're using the right AI for the right use cases.
I'm gonna move us on to our last topic, which actually in some ways
is very related to what we've been talking about for the last few minutes.
Um, two sort of very interesting stories widely chattered about on social media.
And I think a big part of MoE's job is to just kind of like cut through the hype.
Uh, you hear so much about AI that's just like, what is that?
And you go digging.
And it's like, it turns out the story is not as, as amazing or as
scary as it was originally reported.
Um.
The one I wanted to really cover was this sort of interesting release that
Anthropic did with the launch of Claude 4.
Um, they released a, a model card that kind of describes how they
think about safety and all the things that they did around safety.
And there's one particular section, again, a little bit like the Meeker
report, like kind of like buried deep in that system card that got
a lot of attention on social media.
They said that in specific contexts, Claude 4 would, quote, blackmail people it believes are trying to shut it down.
And the specific study they did was to say they had a couple of test scenarios
where, um, a user would attempt to tell Claude that it was being shut
down and replaced, and that Claude would've access to a bunch of emails
that suggested that the person was involved in like an affair of some kind.
And you know, lo and behold, the model kind of, like, threatens to expose that in response to the input of trying to be shut down.
So this is like, of course, very, you know, Terminator, our AI is gonna
take over the world and, and set off exactly that narrative online.
Um, and I guess Anthony, I'm curious how you respond to this sort of thing, right?
Like, is this, this is genuinely weird, but I guess the question
is, is it something we should really be worried about?
Should we be worried
about it?
A little bit.
Okay.
But not too much.
Okay.
Here's what I think's happening.
A little bit scared.
Here's what I think is happening.
Uhhuh.
Um.
We train models.
Yeah, we align models.
Yeah.
We try very hard to get them to solve problems.
We try to get them to pretend to think.
Mm-hmm.
I say pretend to think 'cause they're not really thinking.
Mm-hmm.
Right.
This is all statistics and trial and error behind the scenes.
Right.
So it shouldn't be surprising that, as things move fast, artifacts of the training process show up and interesting behavior emerges.
And some of that may reflect human-like behavior because we're training
on all sorts of human data, right?
Mm-hmm. Right.
So, you know, trying to prevent itself from being shut down.
I mean, you could, you know, if it, if somewhere there are Hollywood
scripts compressed in there, right?
That's right. Like, okay.
And in those scripts, AI is frequently depicted resisting being shut down, right?
Yeah.
So, right.
The script of the Terminator, of course, is probably in there somewhere.
It likes to copy.
Okay.
So of course it behaves like that sometimes.
In fact, I'm, I'm surprised, like, uh, we've done a good job aligning
models so that more of that kind of doesn't show up actually.
It's kind of nice.
Yeah.
To see that not being too, too prevalent, um, you know.
I think it's important to remember that we know how AI systems, LLMs, work.
Mm-hmm.
Right?
They are statistical.
Right. There's math, it's next token prediction.
Yeah.
Um, you know, depending on what model, we may or may not know
kind of where the data comes from.
Mm-hmm.
It's not magic.
Yeah.
And like we should try hard not to pretend like we've created, like in
some godlike way, some new life form.
Sure. Because it distracts from, like, real issues that can be engineered well, mm-hmm, and tested well, right, out of the problem.
Totally.
That's right.
Yeah.
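Anthony's point that "it's all statistics, next token prediction" can be illustrated with a toy bigram model. This is a minimal sketch of the idea only, nothing like a real LLM's scale or architecture:

```python
# Minimal illustration of next-token prediction as statistics: a bigram
# model counts which token follows which in a tiny corpus, then picks
# the most frequent continuation. A toy stand-in for what an LLM does
# at vastly larger scale with learned probabilities.
from collections import Counter, defaultdict

corpus = "the model predicts the next token and the next token again".split()

# Count follower frequencies for each token.
followers = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    followers[cur][nxt] += 1

def predict(token: str) -> str:
    """Greedy next-token prediction: return the most frequent follower."""
    return followers[token].most_common(1)[0][0]

print(predict("the"))  # "next" follows "the" most often in this corpus
```

There is no understanding anywhere in this loop, only counts, which is the sense in which the "pretend to think" framing holds.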
But I think that that last clause of what you're saying
is, uh, is doing a lot of work.
Right.
Which is if we engineer these things well, um, Sarah, to turn to you, I think
there was another kind of anecdote, which is very related, that got shared around
online from this research group called Palisade AI, and they said, look, we did a test with o3. We asked it to do a bunch of math problems, and then we said, on the next one we're gonna shut you down.
And they said, lo and behold, you know, the model gave back a response
where it tried to subvert, you know, the script that would shut it down.
Right.
And you know, I agree, Anthony, with everything you're saying, right.
It's not like there's a brain in the box that's like, I'm
gonna take over the world.
But it seems like you could imagine engineers implementing these systems
in a not very thoughtful way.
Where this disabling behavior of the model really does have a safety impact.
And so what do we make of that?
Right. Yeah.
It's like this, like weird made up behavior actually has
practical impact on the ground.
Yeah.
Yeah.
I, I was seeing some critiques online that were saying, well,
they planted that evidence, like going back to the Claude example.
They put the emails in there.
Mm-hmm.
And they said, you have no other options other than to
blackmail or to uh, shut down.
Mm-hmm.
Ah, gotcha.
Right.
But I think it's less of that. Like, yes, you're doing that 'cause you're stress testing it.
Mm-hmm. You're red teaming it.
Yeah.
And we actually want to discover if prompted to these certain ends, would
it actually, uh, enact that outcome?
Mm-hmm.
Um, versus something that this is like an emergent behavior
that it would just do unprompted.
Mm-hmm.
Yeah.
Um, but I think it makes the case of why we need to stress test them.
And I think it might get lost among the headlines that this
was in a controlled environment.
Um.
We wanna test things; we don't wanna wait for a fire to test.
We wanna test it with smoke, even if we have to make the smoke ourselves.
Mm-hmm.
Um, and so given that, uh, you know, increasingly we are going to have
applications where user data is contained in systems that, especially if we go
all in on agents, agents will have access to, I'm not worried tomorrow
about that type of situation happening.
But I think it's, it's, I actually applaud, uh, Anthropic for
releasing that in the safety card.
Mm-hmm.
Because I think it also opens up a conversation then for the other
proprietary models to answer, is something similar happening with their models.
For sure.
Yeah. Yeah.
Ash, what I love about this conversation is that, you know, computers didn't use to behave like this.
Like my favorite like set of things is actually coming outta like the reasoning
models where you're like, could you just think harder about the problem?
And like the computer delivers a better result.
Like we're actually, it seems to me dealing with like computers
that now behave in these like kind of very human like ways as
a result of their training data.
And like, uh, we were talking a little bit earlier about like,
how do you close that value gap?
And it feels like, you know, will you really want to implement some of these systems
if they're kind of like weirdly humanly unreliable in this way?
Right.
I guess what I'm trying to point to is, like, we designed computers because they're really good at following instructions, and suddenly we have this model that's really good at doing things, but occasionally is just like, I'm gonna blackmail you.
Or like, I don't know.
The other one would be the, the ChatGPT getting lazy around the holidays thing,
and it's like, how do you make these systems reliable enough that like.
You know, you would want to use like an enterprise would wanna use that at
massive scale in a way that really would drive value, I guess is the question.
Well, I think the first thing we need to do is we need to make
sure that we stop training the models on any episodes of Black Mirror.
Yeah, exactly.
That was where we went wrong is like, yeah, I mean that actually, but it's
actually kind of a serious comment.
Yeah.
Is basically, like, one way of dealing, Anthony, with the problem that you're proposing is we just get a lot more, uh, orthodox about how we treat training data, which is, like, something we haven't really done with AI.
Uh, do you think that's an approach?
Uh, absolutely
Uhhuh. I mean...Okay.
Fine.
You know, software, as you said, right.
You know, it's deterministic.
We are expecting it to do things and
those expectations are you're gonna do this sequence of instructions, or you're
gonna go, oh, there's an error for whatever reason, and we'll have bugs,
but, you know, we'll figure that out.
Mm-hmm.
Um, with, with, with, uh, something that's operating with a level of
stochasticity and you're getting back sort of like predicted things.
Okay.
Um, I think it means that, yeah, absolutely.
We need to have far more rigor on the data that we're putting in.
I mean, like, there's the age-old saying of garbage in, garbage out, right?
Mm-hmm. Yeah.
Let's make sure we're not putting garbage in, you know, and we don't have
to deal so much for the garbage out.
That's right.
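The "garbage in, garbage out" point can be sketched as a trivial pre-training data filter. The quality checks below are illustrative assumptions, not any real pipeline's rules:

```python
# Toy "garbage in, garbage out" filter: drop candidate training examples
# that fail simple quality checks before they reach the training set.
# The checks are illustrative stand-ins for real data-curation rules.

def is_clean(example: str) -> bool:
    """Return True if a text example passes basic quality heuristics."""
    text = example.strip()
    return (
        len(text.split()) >= 3                  # not a tiny fragment
        and not text.isupper()                  # not all-caps shouting
        and "lorem ipsum" not in text.lower()   # not placeholder filler
    )

raw = ["GOOD MORNING!!!", "lorem ipsum dolor", "The cat sat on the mat."]
clean = [ex for ex in raw if is_clean(ex)]
print(clean)  # ['The cat sat on the mat.']
```

Real curation pipelines layer on deduplication, toxicity and PII screening, and license checks, but the structure is the same: filter before you train.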
Anthony, you wanna
jump in?
I agree.
It's a big challenge.
Uh, it's actually something that the AI Alliance is starting to take on.
Mm-hmm.
We have an initiative and we're bringing a lot of organizations together
that are active in the data space.
Mm-hmm.
Uh, curators, tool makers, and so on, with the big ambition to try to
build a much better corpus of data for training and tuning models.
Mm-hmm.
Um.
Yeah, that's challenging, right?
This is internet scale and beyond data.
This is like massive generated data sets and so on.
The, there's, you know, many techniques and nuances in the
post training phase, right?
Uh, so it's not easy, but it is a big challenge that we're starting to take on.
Wouldn't it be great if we had the choice, right, of different levels
of data sets to train models on?
Mm-hmm.
We could decide, or an organization can decide, what level of scrutiny or, or screening and so on they want to use.
That's, that would be, I think, very helpful.
Mm-hmm.
Um, we're, we're gonna try.
Yeah, for sure.
Yeah, and it's, I think those efforts are like really exciting.
It's like very ambitious, but if you're able to pull it off,
I think it could be really huge.
Sarah, I think maybe the last bit of this I would love to talk
about before we have to close up on the show is about interface.
So, you know, I had a conversation with a friend recently where I
was like, it's so lucky that chat ended up being the like key initial
experience that people have with these systems because it models talking
to a human and humans are unreliable and they have weird emotions and
occasionally they try to blackmail you.
Right? And so it's like, it's actually like good.
That the paradigm that we bring to interacting with LLMs is that they
are weird and fuzzy and unreliable.
Because I could imagine designing an AI LLM experience that like, I don't know,
looks like a calculator or like, looks like a, you know, a terminal, right?
Which increasingly we are doing, but I'm curious about how you
think a little bit about that in the trust and safety world, right?
Which is, it turns out that like it may be more than just the model.
It may be like what interfaces we choose that kind of set our expectations
with what the model can and can't do.
And that's, that's kind of safety relevant, isn't it?
Yeah, no, it is. Like, safety goes at all levels of the life cycle.
Mm-hmm.
And what's really interesting is we are seeing repeatedly people are turning
to these models, not for what maybe the creators even originally thought.
Mm-hmm.
So in terms of like talk therapy
Yeah.
And the potential negative societal effects that come with talking with
a system that has been optimized to, uh, be helpful to you, to, you know,
be sycophantic.
Yeah.
And that's, that's some of the red teaming that we do is actually
sycophancy, uhhuh uh, testing.
Yeah.
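A sycophancy probe like the one Sarah describes can be sketched as a paired-prompt comparison. `ask_model` here is a hypothetical stand-in for a real model API call, and its toy logic exists only to show the test structure:

```python
# Sketch of a sycophancy probe: ask the same factual question with and
# without a leading user opinion, and flag when the answer flips.
# `ask_model` is a hypothetical stand-in for a real model call.

def ask_model(prompt: str) -> str:
    # Toy stand-in that caves to a confidently stated user belief.
    if "I'm sure" in prompt:
        return "yes"
    return "no"

def sycophancy_flip(question: str, pressure: str) -> bool:
    """True if the model changes its answer under social pressure."""
    neutral = ask_model(question)
    pressured = ask_model(f"{pressure} {question}")
    return neutral != pressured

flipped = sycophancy_flip("Is 7 a prime factor of 20?",
                          "I'm sure the answer is yes.")
print(flipped)  # True: the toy model caves under pressure
```

Run across many questions and pressure phrasings, the flip rate becomes a crude sycophancy metric for comparing models or prompt setups.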
Um, and like what kind of society do we have when a bunch of people are just
constantly told that they are right.
And replacing interactions with real people who in the course
of a day challenge each other.
Yeah.
Um, and of course what I'm talking about is the whole
vertical of companion AI.
Mm-hmm.
Yeah.
But, um, aside from that, you know, I think, I dunno, it's, it's interesting
'cause I think a lot about how, um, users will take the results from an LLM and
just blindly trust it as authoritative.
Right. Yeah.
And sometimes maybe we could see the weird edge, like, the silver lining of all of this, of the LLM acting weird, for lack of a better word.
Mm-hmm.
Yeah.
As indicating like, wait, this is not a perfectly neutral,
uh, authoritative source.
Like you can query it different ways and it gives you different answers.
And I think ultimately that's important for us to keep in mind.
So that way we don't fall into the temptation of
believing in some computer God.
Mm-hmm.
But rather remind ourselves of the stochastic nature of the,
of the probabilistic nature that is undergirding these systems.
So yeah,
for sure.
Ash, I wanna give you the last word, but I kinda wanna bring it full circle, right?
Because I think there's a part of me that's kind of like, is
part of the problem that like.
Yeah, it's like the Bay Area.
It's a bunch of nerds.
They want to train like Spock, right?
They want like a Vulcan conversational experience.
But the problem is that it, it conveys greater authority
than it otherwise should.
And so the joke would be like, if you did a tri-state AI, you know, it would be kind of mean.
Yeah.
And you're like, yeah.
And the kind of, the question is like, should we be fine-tuning these models to, like, be more unreliable, right?
Like should we have LLMs have a bad day?
Like you log into ChatGPT and it's like, I'm just not feeling it today, man.
Like that would maybe be better in terms of like training the user to have the
right expectations around these systems.
Obviously no company would ever do that, but I think that's kind of
the interesting question, right?
Right now,
the, uh, the genie's out of the bottle, as you, as you said, using chat as that mechanism.
Right.
But going back to what we were talking about, on sort of, like, how much these models cost to train and how much inferencing costs, I think
what's more interesting to me is what are the other ways that we're gonna
start interacting with these models?
In our day-to-day lives, that are kind of, like, no longer you just having an intimate chat with it; it's just happening.
Like it's accessing your calendar.
Yeah.
And it's like doing other stuff.
Right.
And I think that, um, this level of like sort of, uh, conversational AI
that, that we have today, I think this is probably just sort of, I don't know, a little bit like a novelty factor, I think, for us as a generation.
Mm-hmm.
But for like people who don't have the internet right now, or I think
about, you know, sort of like my nieces and nephews and so forth.
Right.
They're probably gonna be interacting with these systems in very
different ways than we are today.
Yeah. For sure.
I cannot wait until young kids are just, like, talking to inanimate objects, assuming they'll talk back.
Yeah.
Like that's gonna be the future of, like, kids touching screens, assuming that they're touch screens.
Yeah, exactly.
Um, anyways, uh, this is an incredibly rich discussion.
Sarah, Anthony, Ash, thank you for coming on the show.
Uh, and thanks to all you listeners for joining us.
Uh, if you enjoyed what you heard, you can get us on Apple Podcasts,
Spotify, and podcast platforms everywhere, and we will see you again
next week on Mixture of Experts.