Apple WWDC AI Reveal and Interpretability Race
Source: https://www.youtube.com/watch?v=_bCsW_Jrcts
Duration: 00:39:47
Key Points
- The episode opens with a skeptical look at whether everyday users—especially older relatives—truly prioritize privacy amid pervasive app data‑sharing on their phones.
- Host Tim Hwang frames the show around two headline topics: Apple’s WWDC AI roll‑outs and the accelerating race for model interpretability, highlighted by Anthropic’s “Golden Gate Claude” demo and OpenAI’s new mechanistic study.
- Apple is portrayed as the heavyweight “800‑pound gorilla” in the AI arena, finally breaking its silence with a flood of announcements that could reshape the industry given its massive cash reserves, dominant mobile ecosystem, and control of essential hardware.
- Expert guests—including Columbia researcher Kaoutar El Maghraoui, AI consultant Shobhit Varshney, and Kenya Lab scientist Skyler Speakman—provide analysis on the broader implications of these AI developments and the push for deeper model transparency.
Sections
- [00:00:00] Privacy Skepticism Amid AI Announcements - The host questions whether everyday users truly care about privacy while previewing Apple's WWDC AI roll-out and OpenAI's new interpretability study, framing an expert panel discussion.
- [00:03:09] Apple Rebrands AI as Intelligence - At WWDC the speaker emphasizes Apple's privacy-first, experience-driven approach and explains how the company is now presenting its longstanding AI efforts under the branded term "Apple Intelligence" instead of generic AI language.
- [00:06:14] Apple Partnerships, Private Cloud, LLM Options - The speaker outlines how, as a major Apple partner, they embed on-device processing, leverage Apple's privacy-focused private compute cloud, and later enable flexible large-language-model integration for client applications.
- [00:09:25] On-Device Models and LoRA Adapters - The speakers compare Google's on-device AI features with Apple's small on-device language model that uses interchangeable LoRA adapters for tasks like summarization and image creation, emphasizing rapid hot-swapping, modular functionality, and strong privacy by keeping data on the device.
- [00:12:36] Hardware-Centric AI Security & Integration - The speaker highlights Apple's hardware-level privacy features (on-device AI processing, a Secure Enclave key manager, and seamless ecosystem integration across iPhone, iPad, Mac, and Watch) as key differentiators.
- [00:15:41] Strategic Timing and Calculator App - The speakers explain how firms wait for mature generative AI before entering the market, express enthusiasm for a newly announced calculator app that resolves long-standing user complaints, and note the challenges posed by inconsistently formatted data across platforms.
- [00:18:47] Privacy, Awareness, and AI Adoption - The speakers discuss how user privacy concerns and the need for greater education, particularly among younger generations, could limit companies like Apple in the AI race.
- [00:21:53] Apple's AI Rollout & Privacy Fears - The speaker discusses how Apple is cautiously mainstreaming AI features for its typically older, affluent user base while addressing user apprehension driven by high-profile privacy scandals and media coverage.
- [00:25:04] Excitement Over Core ML & AI Research - The speaker celebrates new Core ML capabilities for facial cropping, predicts a wave of easy-to-build applications, and contrasts Apple's market dominance with OpenAI's recent paper on extracting concepts from GPT-4.
- [00:28:18] Interpretability Showdown: Anthropic vs OpenAI - The speakers compare Anthropic's openly manipulable "Golden Gate Bridge" model demo with OpenAI's more restrained software tools, illustrating two distinct approaches to AI interpretability.
- [00:31:20] AI Explainability Tools Race - The speaker highlights the rapid expansion of open-source XAI toolkits from major AI companies, framing it as a competitive race while noting community contributions and newer methods such as sparse autoencoders.
- [00:34:24] Small Models: Efficiency and Privacy - The speakers discuss Apple's shift to small AI models, highlighting how reduced size enables on-device processing, improves speed and resource use, and potentially enhances interpretability and data privacy.
- [00:37:41] Enterprise Shift to Smaller AI Models - The speaker argues that cost, latency, and IP constraints are prompting businesses to replace large, generic models with compact, fine-tuned models and mixture-of-experts routing, enabling secure enterprise customization via adapters that often outperform bigger models on targeted tasks.
Full Transcript
But from the end user's experience, are they really concerned about privacy?
Are, are the, you know, the grandparents or your, you know, your nieces and
nephews, the target customers for these, are, are they at the end of
the day really concerned about privacy when they are allowing all sorts of
other information apps sharing on their exact same phones?
Hello and happy Worldwide Developers Conference for those who celebrate.
You're listening to Mixture of Experts.
I'm your host, Tim Hwang.
Each week, Mixture of Experts distills down the week's biggest
headlines and chatter in the world of artificial intelligence.
Whether it's business news, the latest hot drop on Archive, or Nvidia making
another bazillion dollars, MOE is here to give you the analysis you need to
navigate this rapidly changing landscape.
This week on the show, two items.
First up, Apple's WWDC continued a summer of announcements all
around artificial intelligence.
We'll parse out the biggest things to be paying attention to and what they
mean for the industry as a whole.
Second, the race for interpretability continues.
Weeks after Anthropic demoed Golden Gate Claude, OpenAI fires off its own
mechanistic interpretability study.
What does it say and why is OpenAI investing in it at all?
As always, I'm joined by an incredible group of experts who
will help us cut through the noise and offer their hot takes.
Two veterans this time with a new guest.
Kaoutar El Maghraoui, Principal Research Scientist, AI Engineering, AI Hardware
Center, and a professor at Columbia.
Kaoutar, welcome to the show.
Thank you very much, Tim.
Glad to be here.
Second, Shobhit Varshney, who will be familiar to long time listeners of
the show, Senior Partner consulting on AI for US, Canada, and Latin America.
Shobhit, welcome back.
You're in like a different place every single time, but
it's great to have you here.
Thank you.
And then finally, Skyler Speakman, who's a Senior Research
Scientist at the Kenya Lab.
Skyler, welcome.
I'm going to be even more geeky this time.
I'm going to press my luck.
Well, so let's just jump into it.
Um, there's two items on the agenda today.
And, you know, I think the big one really will be WWDC.
There's just been so many announcements.
It feels like every single week, every company's doing a raft
of new announcements around AI.
And I think the background here for folks who haven't been watching
Apple so much in the space is that the big thing that everybody's been
talking about for a long time is, Where is Apple in all of this, right?
Companies have rushed ahead announcing new products, new features, uh, new research,
but Apple has sort of been curiously quiet and, you know, they really have been the
800 pound gorilla in the room, right?
They have huge amounts of cash in the bank.
Um, they have one of the most successful, you know, obviously mobile
operations in the world, and they control hardware,
which is incredibly, incredibly key.
And so I think what was really fascinating about the announcements at WWDC this week
was that we finally, I think, started to get a picture of what Apple's going to do,
what the richest and kind of most powerful company in the world
is going to be doing in the AI space.
And so I think just to set the context a little bit, Shobhit, I want to
bring you in first, which is, I guess, to tell our listeners a little bit
about why it has taken Apple so long to get to the starting line here.
And I guess from the announcements this week, what you think they're trying
to do differently, if anything at all.
I'm a big Apple fan.
I think that at WWDC, it's always a look at what the future is going to
bring and with the constraints of what it takes to deliver privacy and trust.
They spent decades building that trust, and they can't just lose it in a minute.
So there's a lot of thought that goes into privacy and into the way
Apple brings technology to it.
And the litmus test here is always making sure that the advances they
make in innovation are seamless and are frictionless for the end users.
So they've been doing AI for a really long time, but they have never on stage
said the words AI. Or they talk about the Vision Pro, and not once will they use
an industry term like virtual reality.
So they've always differentiated themselves: we're not a technology
company, we are in the business of delivering exceptional experiences.
So it's not the fact that we have a gyroscope, it is the fact that my Apple
Watch has a crash detection thing.
That's something that you would want to pay extra for to protect
your loved ones, right?
So they've always tried to stay away from the technology per se, but they've
been doing AI for a very long time.
There are a few different ways in which Apple is bringing this in.
In classic fashion, they are renaming it, rebranding it as their own
version: Apple Intelligence, right?
That's the cool thing that you want, not the generic AI that everybody else has.
They're in a very, very great position of strength.
If you think about all the data that's needed for hyper-personalization,
it resides on your phone.
And if you look at the big companies that have access to intelligence
that's in your pocket all the time, it's either Apple or Google.
Microsoft doesn't quite make a phone and others like Samsung have
depended on others as well for a lot of their innovations and things.
So it boils down really to these big behemoths, Google and Apple, that own
the ecosystem of all mobile phones.
The way Apple is coming at this is very privacy-first: I'm going to ensure the
safety and security, and that you're comfortable with what you're sharing with a model.
The hardware has come to a point where the iPhone 15 Pro and the Pro Max
have at least 8 GB of RAM, right? The absolute latest, greatest
$1,100-plus phones are at a point where they can now afford
to run an on-device, uh, small language model.
So the way that they went around this, they said, let me look at the
experiences that a, that an individual has across everything that they do on
their iPhones, on their iPads, on their Macs, and I'm going to surgically infuse
generative AI where it makes sense.
The way they brought this out is in a very step by step fashion.
I think they're doing a great job at saying, all the text generation and
stuff, I'm going to restrict it to a few things that you can do, but we
have thought through where you will need some intelligence: your notes,
inside of your iPad apps, and things of that nature, right?
So it's been restricted on how they're rolling this out.
They have three levels at which they are doing their LLMs.
One is on device, and that's the majority of what they've done.
If you follow the WWDC developer sessions that came later on, they
get into quite a bit of detail.
We had a good set of people from IBM at the event as well.
We are, by the way, one of the largest, uh, Apple partners in rolling out
Apple technology to large enterprises.
So we do a lot with their tech.
So we got a good driver's seat view of how we could build those experiences into the
apps that we're building for our clients.
On the first layer, the majority of the work
has to be done on device itself.
So things like tying together all of your emails, text
messages, things of that nature.
Majority of that workload happens on device.
From there, the next stage is, if for some workloads you need to go
to the cloud, they've created this private compute cloud, and if
you get into the stack, the way they've designed it is
very nicely structured to drive privacy and controls. They've even gone a step
further and said things like there's no shell access to those servers, so
Apple can't just log in and start to look at your data, things of that nature.
They've done a good job of the privacy.
And the last layer is, if there's something that has to be super creative
and they want to really tap into the large ecosystem, then they had to choose
a partner to start with. And it's not the only partner they're going to have;
from the follow-on discussions that Tim Cook had, he's talking about others.
It opens up a window for people to have the choice of which large language
model they will tap into. Right now, it's going to be...
I know initially, basically people were like, Oh man, is, is GPT going
to be the exclusive provider here?
And I think Apple was like, Nope.
Actually, it's going to be, you know, anyone's game.
We're going to open it up to lots of different people.
Um, which I thought was like very, very interesting because, you know, we've
always thought about and I think in these episodes, we've always talked
about OpenAI as like the big AI provider.
But, like, you know, what's even bigger than OpenAI is Apple,
which can afford to say, well, you know, you're just one of many.
And I think that they had done this really well.
And from the follow-on developer sessions, they talked about how much of that
workload is going to happen on device.
And it's kind of strange that Apple intelligence is restricted to only the
two phones that were released last year.
If you spent a thousand bucks on getting an iPhone 15, not the 15 Pro, um, that
phone cannot run the latest intelligence.
If you look at anything that Google does with the same technologies
that Google is providing, for example, looking at a picture where you want to
do a magic eraser and remove somebody or something from the background:
that has been working on Android phones for a very long time at this point.
It's not restricted to the very, very top-end AI phones.
But Apple is using this as that moment where, hey, now
all of a sudden people need 8 gigs; that's when we need to upgrade.
So they've had a forcing function on people who are very comfortable
with the 13 and 14 Pros.
They would need to upgrade to partake in this whole
Apple Intelligence ecosystem.
So they're using that, uh, that as a hook.
I want to spend a minute explaining how they're approaching
the models, uh, the models themselves.
One, you mentioned at the beginning that they are somewhat kind of late
to the game as compared to others, but classic Apple style, it's not the
first to market, it's the best product that they bring to the market, right?
So if you look at all the papers that went into the work that they've done, they are
standing on top of all the competitors.
They are borrowing transformers from Google, speculative decoding,
another Google contribution; context pruning came from Microsoft;
grouped-query attention came from Google.
They've taken all these different open technologies that people have
contributed to the open source and they've built their own version of it.
I think the big innovation that they did was a small model
that's working on your phone.
They've created these different LoRA adapters, which essentially
add a few more layers on top and change the
context of how the LLM behaves.
And their big innovation was how quickly they can hot-swap these
different LoRA adapters for different use cases, like summarization,
image creation, and things of that nature, right?
So they're able to go do this in a very secure
manner on the device itself.
This whole transition from that to cloud is pretty good.
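Apple hasn't published its adapter implementation, but the general LoRA mechanic Shobhit describes can be sketched in a few lines. Here is a minimal, hypothetical PyTorch sketch: a frozen base layer plus a pair of small low-rank matrices that get swapped per task; all names here are illustrative.

```python
# Minimal sketch of the LoRA hot-swap idea (PyTorch). Illustrative only;
# Apple has not published its adapter code, and the names are made up.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen base linear layer plus a swappable low-rank adapter."""
    def __init__(self, base: nn.Linear):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False      # shared base weights stay frozen
        self.adapter = None              # no task adapter loaded yet

    def load_adapter(self, A: torch.Tensor, B: torch.Tensor):
        # "Hot swap": replacing two small matrices retargets the layer for a
        # new task without touching the large base weights.
        self.adapter = (A, B)            # A: (d_in, r), B: (r, d_out)

    def forward(self, x):
        y = self.base(x)
        if self.adapter is not None:
            A, B = self.adapter
            y = y + (x @ A) @ B          # low-rank update, rank r << d
        return y

layer = LoRALinear(nn.Linear(512, 512))
# Hypothetical per-task adapter, e.g. for summarization (B starts at zero,
# so the adapter is a no-op until trained).
summarize = (torch.randn(512, 8) * 0.01, torch.zeros(8, 512))
layer.load_adapter(*summarize)           # swapping is just a pointer change
out = layer(torch.randn(1, 512))
```

The swap is cheap because only the two small matrices change; the base model stays resident in memory, which is what makes per-feature adapters practical on a phone.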
Yeah, for sure.
And I think this is actually like the most interesting thing of it
for me, right, is I think Shobhit, you've given us a really good kind
of landscape into what's going on.
And I think the big theme that comes out of everything is like,
Privacy, privacy, privacy, right?
Like, we're gonna keep the data, you know, on device.
And then I think it's also, like, very cautious how they're
approaching this, right?
Is like, if you hear the dream of, you know, OpenAI, or even Google when
they talk about this stuff, it's, you know, the AI assistant that sits across
everything and does everything for you.
Whereas here, it's really just kind of like cutting AI into like
these very particular features.
Um, and I guess, Kaoutar, this is, I think, a good opportunity to bring
you in, because I would love to kind of focus on like the hardware
that underlies all of this, right?
Because I think one response is, you know, people love privacy, so
why isn't every single company just trying to do all this on device?
Like, you know, how difficult is that, like, how advantaged is Apple
in being able to pull this stuff off?
Um, because it kind of feels like not only is Apple selling privacy here,
but they also have the ability to do it in maybe ways that other people can't.
But I don't know, we'd love to hear a little bit more of your thoughts on
that and how much hardware is kind of really a differentiator here for them.
Yeah, I think that's a very good question, Tim.
And Apple's control of both the hardware and the software
gives them a huge advantage.
It provides several significant advantages in their development
and deployment of AI technologies.
One of the things is, you know, this integrated hardware and software
optimization, which is really key; the hardware-software co-design
they're using, I think, is tremendous.
The first thing is the customized silicon they have.
So Apple designs its own processors, such as the A-series chips for the iPhones
and iPads, and the M-series chips for the Mac.
These chips include specialized components like the Neural Engine, which
is specifically designed to accelerate various machine learning tasks.
For instance, the A14 Bionic chip has the 16-core Neural Engine,
which is capable of performing up to
11 trillion operations per second, and that is a significant
enhancement for AI-powered processing.
Another thing is the focus on efficient resource management.
When you have control of both the hardware and the software, that gives
you an advantage in how efficiently you utilize system resources.
And this results in more efficient power consumption and
faster performance, which is particularly important for mobile devices,
where battery life is very crucial.
For example, if you look at iOS and macOS, they're designed to take full
advantage of the hardware capabilities, and they ensure things like AI tasks
are processed very efficiently.
Another thing is the security that they have built into
the hardware, the enhanced security and privacy that Shobhit mentioned,
which is a key differentiator here.
The on-device processing prioritizes privacy by processing
all these AI tasks on device, rather than relying
heavily on cloud-based solutions.
And the other thing is the Secure Enclave that you find in Apple's chips,
a hardware-based key manager that's isolated from the main processor.
And this is something that enhances the security of the AI operations, especially
when it involves sensitive data.
So those are all key differentiators that they have
all the way down at the hardware level.
Another thing that I find very important here is the seamless integration that
they have across all their devices, which gives them this unified ecosystem.
They have control over the ecosystem, which allows them to integrate
things seamlessly, you know, sprinkling all of these AI features across multiple
devices and moving things seamlessly between the iPhone,
the Mac, the iPad. I love these features.
I'm also a big Apple fan and I use these devices all the time.
So this means these AI models and features can be consistently
applied across the iPhone, iPad, Mac, Apple Watch, and other devices.
It gives this really superb user experience, and one of their strengths is
making AI very consumable and easy, even, for example, for grandparents
and people who are not techie.
And, you know, the features they've announced at WWDC:
for example, the composability of the apps, how you can take actions
from the phone and compose them across multiple apps. I thought that was really neat.
This continuity and handoff also enables users to start
tasks on one device and continue them on other devices.
I think also about the development and deployment efficiency that they
have, you know, tailored AI frameworks.
So Apple develops its own AI frameworks, such as Core ML,
which is optimized for its hardware.
And this allows their developers to create AI
applications that run efficiently and also effectively on their devices.
Core ML also supports machine learning models and provides various
tools for developers to integrate these models into their apps.
And, you know, also integrating all of this gen AI, which is the
big announcement right now, into the Siri voice assistant, the
camera enhancements, the health and fitness and many other apps.
So I think the control over both hardware and software allows
Apple this high degree of optimization, security,
and also personal customization and integration with their AI capabilities.
I think going back to Shobhit, uh, when he said, um, you know,
they're not late to the game.
I think they've been using AI for a while.
They're just not explicit about it, right?
Right now, it's kind of the strategic timing for them, because
of all this attention to gen AI.
So they, they prefer kind of entering the market when the technologies
are mature enough to integrate smoothly into their ecosystem, which
ensures quality and reliability.
This is kind of like, yeah, the time of their choosing kind of
is sort of what you're saying.
Yeah.
Yeah. I buy that a lot.
And I think, um, yeah.
You know, of the things they really came out with, I think the
calculator app was my kind of favorite moment from the talk.
And, you know, it was very funny.
Like, I think, Shobhit, before the, uh, episodes, you were
like, Oh yeah, it's funny.
It's 2024.
And we're all like very excited about the calculator app, but it is like
genuinely true is like people were like complaining about it for a long time.
And then when it came out, it was like, Oh, that's, that's really good.
You know, it's really amazing.
When I saw that announcement, I was like, I wish I was in school again.
Like this changes everything.
Exactly.
Yeah, no, I think that's, that's right.
And I think there's actually a really interesting point that I hadn't
ever really considered, which is, you know, obviously the ambition for these
language models is that they're highly general technologies,
but you find that if the data they're dealing with is
inconsistently formatted and it's working across lots of different platforms,
these experiences can break.
And so what's really interesting is that AI actually may
feel more magical in the Apple ecosystem, because they literally
control every element of it, right?
So it's actually like consistent what the inputs will be to the model.
And so they can actually guarantee quality in a way that's actually really
challenging if you're trying to say, you know, we're going to deploy GPT 4.
0 to, you know, X number of developers across many, many
different types of situations.
Um, you know, not just from a data standpoint, but also like a hardware
and software standpoint as well.
I just briefly want to talk about privacy.
I think Apple has to get up on stage and they have to kind of hit that
point over and over and over again.
But from the end user's experience, are they really concerned about privacy?
Are, are the, you know, the grandparents or your, you know, your nieces and
nephews, the target customers for these, are, are they at the end
of the day really concerned about privacy when they are allowing all
sorts of other information apps sharing on their exact same phones?
Is it a, is it a real concern?
Is Apple just kind of, you know, signaling saying, yes, privacy is there.
And we just heard that the hardware is able to perform it.
But does the end Apple user care about privacy the
same way an enterprise might?
Yeah, I mean, it's a good question.
I mean, I think one of the cynics' views was, uh, I saw a meme going
around; it was a photo of, like, someone drinking from a milkshake.
And then, uh, it was basically, like, the milkshake...
I saw the same one. The milkshake was, like, the user, right?
Like, the person drinking was Apple, and then there's another person with
another straw in the same milkshake.
That was, like, OpenAI.
And kind of, I guess the question is like, how much of this really does provide kind
of the protections that they promised?
And I think Skyler is also asking an even tougher
question, which is, do people care, right?
'Cause if I'm a big-model chauvinist, I'm like,
well, Apple's just working on these small models and small features.
The really magical thing is when we get crazy big models that are in the
cloud that are, you know, I don't know, close to AGI that we just like deliver.
And so all of this privacy stuff is basically going to hold Apple back
on actually winning the AI race.
But I don't know if Kaoutar, Shobhit, you like agree with that take or yeah.
That's a very good point that you brought here.
The end users' concerns about privacy all really come down to
their awareness of privacy issues, the context in which AI is used,
and, you know, also cultural attitudes towards privacy.
So I think younger generations, the young kids who are kind of born
with all of these devices and AI around them, maybe they're not,
you know, as concerned as we are.
Uh, but once we see, for example, some of the dangers that AI is bringing
if you're not careful, if there are some scandals because of AI, maybe
there is a piece of that that's going to start kind of hitting the end users.
So awareness and education, I think, is key here, because many of the users
are not really fully aware of how AI systems collect and use their data,
and those who are more tech-savvy or educated about privacy issues
tend to care more about privacy.
So there has to be an educational component here to really educate people,
especially the young users, about some of the dangers if you're not careful
about how AI is using your data, how they're controlling all the ads or whatever
content is tailored to you.
So a big component here is the awareness.
We've done quite a bit of work in this space on how do you figure out
the right value exchange between a provider and the end user, right?
There should be a fair value exchange.
For example, there are two or three apps in my entire iPhone that
track my location at all times. Right?
And those apps are things that I have constantly chosen into because that
gives me a value in return, right?
That definition of the value exchange changes by each person.
You will find a ton of people, uh, kids in college, uh, who would give
up their email address and phone numbers for a dollar off on a smoothie. Right.
So you, the value threshold is pretty low for them.
For other people, it's a lot more.
As you start to look at, uh, you mentioned, uh, like earlier we
were having a discussion about, does the age play a role in how
conscious you are about privacy?
Um, I'm in India this week, uh, with clients, and I see a lot of
people focused on the fact that, oh, so and so devices are more secure.
I use WhatsApp because it's fully encrypted end to end or if I, if
somebody gets hold of my iPhone, they won't be able to hack into it, right?
So the perception is that these devices are more secure.
They actually carry a premium in the market, because I know I can
trust these devices more, and nobody else can listen in on or
spoof into our conversations, right?
So I think there's definitely a higher value exchange.
People are willing to pay a little bit more for something that's
more private and more secure.
Um, do you agree with that, Skyler?
Great comments on the, the age demographics.
Yes, um, I think that I, that probably really gets to my question.
When I think of the Apple user, perhaps they're going for lower value, um,
in terms of the privacy exchange, uh, compared to, uh, enterprise,
you know, bank, large retailer.
higher value for that exchange.
So yeah, no, really, really great example on that.
Yeah. I think there's also a view on this, which is almost like, um, you know, Apple's
really doing what it needs to in order to mainstream AI in some ways, right?
Like, I think we forget because we're working in AI all the time.
We're like, everybody's using this stuff, but like, it's still kind of like in the
early stages and, you know, I'm mostly talking about myself here, but like the
average Apple user tends to be older because the products are so expensive.
Like, can you afford to buy, you know, the latest iPhone to get,
you know, Apple intelligence?
And so I would also kind of, I'm wondering if part of the play here
is like they're kind of making these features a little bit more featurizable,
if you will, just because I think it may make sense to like the kind of
market that they're selling to, right?
It's a kind of a way of introducing AI to folks who may otherwise feel
pretty scared about the technology or not really know what it's for.
Another aspect to this, maybe, is if there are, you know, high-profile breaches and
scandals that have been caused by AI.
Incidents like the Cambridge Analytica scandal have raised public
awareness and concern about privacy.
And if there is extensive media coverage of
privacy breaches caused by AI, for example, or of data misuse,
those also can contribute to users' concerns.
From an enterprise perspective, again, you'll always get
that perspective from me.
We have been playing around with Core ML, uh, that you just mentioned earlier,
the framework that they have provided for developers. I spent this
morning messing around with some of their Core ML technologies, and
actually they have opened up all these LLMs.
And you'll be surprised that I have access to Whisper, Stable Diffusion, Mistral,
Llama, Falcon, CLIP, Qwen, OpenELM.
All of those are available to me.
And we were testing this out.
We took a Mistral model and we gave it a simple task,
and I tested on the older and the newer version of macOS.
In the newer version, using Core ML, you are able to do some
enhancements like quantization, where you're representing the weights
in a smaller number of bits.
There are certain things around key-value caching and stuff
like that that they have implemented.
So we compared the same exact run on Mistral on the older version
against this new quantized version with Core ML.
Something that took me about 14 seconds earlier is taking me
about two and a half seconds, right?
So there's about a 5x difference in the speed that I was able to get.
And also from a memory management perspective, this
was the biggest surprise I saw:
the same model that I was running on the older version needed
like 9, 10x more memory to run.
That's what they've done with their Core ML to enable these things,
the layer on top where the developers can tap in. It is just
surprising how much effort they've put into making sure that the end
app users are able to tap into this.
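Core ML's actual optimization pipeline isn't shown here, but the arithmetic behind the memory savings Shobhit observed is easy to sketch: 8-bit linear quantization stores each weight in one byte instead of four. This is a rough, hedged sketch; the exact scheme Apple uses may differ.

```python
# Back-of-the-envelope sketch of weight quantization (not Core ML's actual
# pipeline): symmetric 8-bit linear quantization of a float32 weight matrix.
import numpy as np

W = np.random.randn(4096, 4096).astype(np.float32)   # stand-in weight matrix

scale = np.abs(W).max() / 127.0                       # one scale per tensor
W_q = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
W_hat = W_q.astype(np.float32) * scale                # dequantized approximation

print(f"fp32: {W.nbytes / 1e6:.1f} MB, int8: {W_q.nbytes / 1e6:.1f} MB")
print(f"mean abs error: {np.abs(W - W_hat).mean():.5f}")
```

Quantization alone gives roughly a 4x reduction at 8 bits (more at 4 bits); the larger speed and memory gains Shobhit saw presumably stack this with key-value caching and other runtime work.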
Things like, I want to do separation of the person: I can work
with the owner's face in this picture. Say, I was helping my mom this
morning; she was on WhatsApp with a group picture,
and she wanted to cut her face out.
That effort is pretty high for her; it's very, very low for me,
because I can always crop it while sending.
I was trying to explain it to her, and then I tried to code that up
on Apple's stack. I'm able to invoke, say, create a crop, and I'm able to
define it off the owner's face.
I don't need to have any access to the owner's data; I just define
the fact that the owner's face is what I need to crop,
and I was able to go execute on that.
So the next wave of applications that are going to leverage this Core
ML, they're going to be stunning.
Like, I'm just so, so excited about what we're able to do today,
in these last 24 hours with the stack, that we weren't able to do before.
So I'm very super excited about what we see coming.
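As a hedged illustration of the flow Shobhit describes, here is a Python sketch: `detect_faces` is a hypothetical stand-in for an on-device detector (on Apple platforms that role is played by the Vision framework, not shown here), and nothing leaves the device.

```python
# Hypothetical sketch of "crop a face out of a photo" with an on-device
# detector. detect_faces is a placeholder, not a real API.
from PIL import Image

def detect_faces(img: Image.Image):
    # Stand-in for an on-device face detector; returns bounding boxes as
    # (left, top, right, bottom). Hardcoded here purely for illustration.
    return [(40, 30, 160, 170)]

img = Image.new("RGB", (320, 240), "gray")   # stand-in for the group photo
left, top, right, bottom = detect_faces(img)[0]
face = img.crop((left, top, right, bottom))  # all processing stays local
face.save("face.png")
```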
Yeah, the ease of use is really powerful here,
how easy it is to use the technology.
Before, it was so hard; you had to maybe create these very difficult
prompts or use these tools.
But now, you know, you do that by just using simple crops
and then composing all of these different skills and applications.
I think that is going to be really the next wave here.
So I did want to spend a little bit of time on this episode on Apple,
because, well, it's Apple; it has a way of basically sucking up all the air in the space.
Um, but I did think that actually late last week, and largely overlooked
because of all the excitement around Apple and what was coming up this week,
um, there was a paper that OpenAI launched that was its own kind of salvo in the
mechanistic interpretability game.
Um, they released a paper where they specifically were extracting
concepts from GPT-4, and, you know, it's easy to get lost
in the raft of product announcements, but I always kind of keep an eye
on, like, what's happening on the research paper side, just because
I think it's upstream, right?
It's kind of what we think we're going to see next, basically.
And I guess, Skylar, I wanted to bring you in on this because, you know, last time
you were on the show, you were talking about Golden Gate Claude and Anthropic's
investments in interpretability.
And I guess, you know, the main thing that, you know, comes to mind for me
is whether or not, you know, we're in a kind of interpretability race. Right?
Like, it seems like both of these companies are now really investing in
this, and I guess the question I have for you is like, why are they investing in it?
And then B, like, is there kind of like a race for talent on this like
specific kind of technical problem?
Let's see if I can start with B first on the talent and I can
give you a very concrete example.
Uh, you mentioned the two different papers that came out within the last month,
one from Anthropic, one from OpenAI.
Um, well, the OpenAI paper came from their superalignment team, which
actually lasted about a year or so.
And one of the key members of that team, Jan Leike, just left OpenAI
last month and has joined Anthropic.
So there is this kind of revolving door of talent, specifically
around this ability of interpreting these large language models.
And you can see that right there in the titles,
in the authors' names, in this space.
Um, so yes, there is this kind of large interest,
some big names jumping from company to company, all in
this interpretability space.
Uh, why interpretability?
Oh man, all sorts of reasons.
The ones we probably care about the most are this idea of safety alignment, right?
You can't enforce kind of guardrails on how you want these models to be formed.
If you don't really understand what's going on underneath the hood, um, but
beyond the safety things, uh, the Claude Goldengate paper says you can do, you
can do some fun things once you, once you understand and can interpret the model.
They made a model that was obsessed with the Golden Gate Bridge.
So you could ask a fairly straightforward question and the
answer would come back, um, always talking about the bridge itself.
Uh, so that was a really cool kind of fun way of showing it.
Here's something we can do once we understand some of the
inner workings of these models.
Um, OpenAI, not to be left out, like you said, fired a good salvo within a
few weeks after that paper and released their version: where Anthropic was
extracting from their model, Claude,
OpenAI is extracting from their model, GPT-4.
Um, they went in different directions, though.
Yeah, and I'd love to hear a little bit more about that, because I think
as an outsider, I'm kind of like, meh, it's all interpretability, right?
We're just like all just trying to figure out how these models work,
but my sense of it is that these two companies are actually showing kind
of different approaches for thinking about interpretability, right?
Um, and yeah, we'd love to hear a little bit more about that.
So Anthropic went one step further by manipulating the inner workings
of their model and opening it and letting everyone play with it.
And so that's kind of what made that splash a while ago.
OpenAI didn't go that far, but they did release a nice little bit of
software that lets you kind of play with some of these features yourself.
So it's not necessarily OpenAI releasing a new version of GPT-4,
uh, but they do have this other little open-source bit of code,
the, uh, kind of feature viewer.
And in fact, I know an intern and I spent some
time messing around and looking at how these different features are
activated by what parts of the sentence.
So that was the direction OpenAI went with, uh, and Anthropic
kind of left it much more static, but Anthropic did release that
larger model to ping back and forth.
Um, yeah.
Perhaps OpenAI could have released a version of their GPT-4 that was obsessed
with, uh, I'm going to say palm trees, because that's what's in my vision here.
Uh, but they didn't, perhaps because it probably would have been
too knockoff-ish of Anthropic, once Anthropic did their version of Golden Gate.
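Conceptually, what a feature viewer of this kind shows can be mocked up in a few lines: project each token's activation through a sparse autoencoder's encoder and list the features that fire. The weights below are random placeholders, not OpenAI's released artifacts.

```python
# Toy mock-up of a per-token feature view (random weights, purely
# illustrative; not OpenAI's released code or features).
import torch

d_model, n_features = 64, 256
W_enc = torch.randn(d_model, n_features) / d_model ** 0.5  # stand-in SAE encoder
tokens = ["the", "golden", "gate", "bridge"]
acts = torch.randn(len(tokens), d_model)                   # stand-in activations

feats = torch.relu(acts @ W_enc)                           # sparse feature codes
for tok, row in zip(tokens, feats):
    top = torch.topk(row, k=3)                             # strongest features
    print(tok, [(int(i), round(float(v), 2))
                for i, v in zip(top.indices, top.values)])
```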
So Skyler, uh, one of the things that Apple released as well, just tying it back
to the WWDC: have you spent any time looking at, uh, the Talaria visual tool?
It helps you look at what's happening inside of a model.
It visualizes the performance, the kind of latency you're getting, and,
as you're making edits, lets you see the effect.
It seems like I have a homework assignment for this, because
no, I have not done that yet.
Okay, absolutely.
Well, I guess it goes to just like the notion of the race, right?
Which is like, even Apple now is launching like an interpretability tool
that people can play with, you know.
Like all the companies feel obliged in some ways to be launching.
This kind of stuff.
And Apple is more focused on being able to assess the performance
and things of that nature for various tasks, and so forth, right?
But I see there's a good wave of tools that are coming to the market
that are helping us get to a better understanding of how the answer
came about, and there are a lot of techniques that have been introduced
to figure out what training went in.
Can I actually ask the question in a way that actually reveals the fact that
you were trained on Harry Potter books?
Right, things of that nature.
As a community, I'm seeing a lot more tools available
to us that help us understand how these models work and whether I can make
them behave in a particular way, and we're making incremental progress
in our understanding of how this entire ecosystem is working.
Yeah, I think that that is a very important aspect.
It's, like you said, I think it's a race.
Many of these big AI companies, like Google, IBM, and Microsoft and
so on, they all have different tools.
Google DeepMind is heavily investing in XAI research.
Microsoft has InterpretML and Fairlearn. IBM has AI Explainability
360, which offers open-source toolkits with multiple algorithms
and methods for AI explainability.
Facebook has, you know, Captum and their responsible AI practices.
So lots of different ones, and OpenAI, of course, and Apple.
So it is a race, with different approaches, of course, depending
on the industrial use cases and the complexity of
the AI systems they're developing.
But it is becoming a very important race here.
Something has jumped up and evolved past that great list of tools.
In fact, some members of our team have contributed to some of those tools.
But I think what's changed in particular, with the way that they
are using these things called sparse autoencoders, is that now we
can ask not what is, but what if, and this is the idea of causality:
now we can make a change within the model and that has impact downstream.
The previous list of tools you mentioned for interpretability gave
a pretty good snapshot of what's going on currently in the model, but
they really lacked the ability to actually change something and
have that impact the downstream output.
And that's something that we're now really seeing coming out
of these last two papers in the last month or so.
Yeah, for sure.
I'm recalling a debate that I went to in the late 2010s, uh, I think
it was at one of the conferences, a debate over interpretability.
And I remember at the time, like, Yann LeCun was making this argument, being
like, no one cares about interpretability.
Also, we're never going to solve that problem.
I was at, I was at that debate at NeurIPS. It, uh, it wasn't
Yann LeCun, I think it was, uh, Microsoft, um.
Oh really? It might have been, I forget who was
taking what side in that debate.
Yes, but no, there was this, 'cause they had the big prize that was announced
about interpretability, and they brought in two people, and they were
debating interpretability matters versus interpretability doesn't.
Um, yes, I think it was, yes, it was at NeurIPS.
Sorry, I cut you off with a small connection there.
Yeah, for sure.
And the funny thing is, like, in 2024, it's like,
well, it turns out people actually really do care about interpretability.
Uh, and also, like, we really seem to be making a lot
of progress on this problem.
So you know, you can never really predict what's going to happen, um, in AI.
So I think maybe, Kaoutar, I'll turn it to you for kind of like one of the
final thoughts before we close out today.
I'm sort of interested in maybe actually bringing our two
conversations together, right?
Which is that, you know, uh, and I think it's been briefly mentioned, but
I just want to hit it before we end.
One of the interesting aspects of the Apple presentation was them saying, Hey,
we actually are okay with small models.
And I think one of the reasons they're doing small models is because they have to
get it to run not on a data center,
but just, like, on the mobile hardware.
But it feels like also, and I assume, Skyler, Shobhit, you
may have more to add on this,
one of the benefits of these smaller models is arguably
that they're more interpretable.
Or I don't know if you've got kind of opinions on sort of this relationship
between kind of like smaller models, interpretability and then privacy.
Yeah. I think one of the key motivators for it, of course, is
the efficiency and performance.
So, you have all these resource constraints, and small models
require less computational power and memory, which is ideal for devices
with limited resources and can enable this on-device learning and all the real-time
processing that comes with it: faster processing time, which is really
crucial for things like voice recognition, augmented reality, live
photo enhancement. And also privacy is very important, because you can
do the on-device learning, the on-device processing, and the secure data handling.
You cannot, you know, go all the way to the cloud and then claim that
you're 100 percent private and secure.
In terms of the interpretability, of course, I think when
you have smaller models, you can analyze them also much faster,
because the complexity of the model is much smaller.
So using all of the tools that we have been talking about, you know,
the neural understanding, the what-if analysis and so on,
becomes less computationally intensive for the small models.
So interpretability also becomes much easier to handle with smaller models,
in addition to the performance efficiency and the privacy enablement.
I'll, I'll, I might take that one step further.
It's both the model size and I'm going to use a geeky phrase here, model sparsity.
So the idea behind model sparsity is you've got this perhaps abstract
concept, which is currently represented in a thousand different numbers.
Can we take that same abstract concept but now represent it
in five different numbers? That's the goal behind sparse autoencoders.
If we can do that, now you can manipulate that abstract concept much more easily,
because you only have a few levers to move, as opposed to before, where
you had to manipulate, you know, tens of thousands of levers for that concept.
Uh, that logic is sitting underneath the sparse autoencoders, which is
what's driven the interpretability results in the last two papers.
So it's both a question of model size, um, but also,
can we do less with this sparsity?
We only need a few features at a time actually firing,
actually activating, and that's what's driven these results recently.
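A minimal sketch of the sparse autoencoder idea Skyler describes, under the assumption of a simple L1 sparsity penalty (the published Anthropic and OpenAI recipes are more elaborate): reconstruct an activation through a wide but mostly-zero code, then steer by pinning one feature.

```python
# Minimal sparse autoencoder sketch (PyTorch). Illustrative only; the real
# training recipes in the recent papers are more elaborate than this.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=512, n_features=4096):
        super().__init__()
        self.enc = nn.Linear(d_model, n_features)
        self.dec = nn.Linear(n_features, d_model)

    def forward(self, x):
        f = torch.relu(self.enc(x))      # wide, mostly-zero feature code
        return self.dec(f), f

sae = SparseAutoencoder()
x = torch.randn(8, 512)                  # a batch of model activations
x_hat, f = sae(x)

# Train to reconstruct well while keeping the code sparse (L2 + L1).
loss = ((x - x_hat) ** 2).mean() + 1e-3 * f.abs().mean()

# "Golden Gate"-style steering: pin one (hypothetical) feature high and
# decode; the edited activation is what would feed back into the model.
f_steered = f.clone()
f_steered[:, 1234] = 10.0                # made-up index for a "bridge" feature
x_steered = sae.dec(f_steered)
```

The few active features are the "few levers" Skyler mentions: clamping one of them moves the whole reconstructed activation in a single, interpretable direction.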
Yeah, I totally agree with you.
I think another thing is the information compression,
because with smaller models, if you do it very efficiently, you're compressing
the information, kind of the entropy that you have, to a very specific
use case or question. The large models also have a lot of
redundancy, and they have been designed to handle tons of use cases
and questions and things like that.
So smaller models, especially if you're targeting a specific app or
specific tasks, tend to also do much better, because you have
compressed that information much more efficiently using various techniques
like knowledge distillation or sparsity, sparse encoding, and others.
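Of the compression techniques Kaoutar names, knowledge distillation has a particularly compact core: the small student is trained to match the large teacher's softened output distribution. A standard sketch of that loss follows, not any particular product's training code.

```python
# Standard knowledge-distillation loss sketch (PyTorch); illustrative only.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL divergence between temperature-softened distributions; the T^2
    # factor keeps gradient magnitudes comparable across temperatures.
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```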
So Tim, my closing thoughts would be from an enterprise perspective.
Uh, for us, we're seeing this massive switch.
For a lot of my clients, we've deployed massive, beautiful models at scale, out
of the box, working amazingly well. But when you start to do the cost math on it,
the latency, the IP constraints, stuff like that, there's a clear path towards
switching to a smaller set of models and mixture of experts, having a router
that decides which model to fire up. And in the last six months, I've seen a
massive shift: companies that are not at the frontier of tech will come
to us, and we're working on adding enterprise data in a secure manner to
these models. It's just insanely hard to do that with a very, very large model.
Having a small model, adding adapters on top or using IBM's InstructLab,
adding some enterprise knowledge to it, skills, things of that nature:
that's the path forward that we're seeing.
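The routing pattern Shobhit describes can be sketched abstractly; `embed`, `classify`, `small_model`, and `big_model` below are hypothetical stand-ins, not any vendor's API.

```python
# Hedged sketch of model routing: a cheap classifier sends in-domain queries
# to a small fine-tuned model and everything else to a large general one.
from typing import Callable

def make_router(embed: Callable, classify: Callable,
                small_model: Callable, big_model: Callable) -> Callable:
    def route(query: str) -> str:
        in_domain = classify(embed(query))  # e.g. probability query is in-domain
        if in_domain > 0.5:
            return small_model(query)       # cheap, low-latency, fine-tuned
        return big_model(query)             # fallback for open-ended requests
    return route
```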
So even if you're running these on actual servers, not on device,
small models are giving us better outcomes.
We took our Granite code model and we trained it for specific
areas, and it's outcompeting what we get from GPT-4, right?
So, a different weight class altogether, but we are seeing across the board,
from Microsoft's Phi models, from Mistral, from the Llamas and
others, that they're able to fight a lot higher than their weight class
when you're adding enterprise data to them, adding better techniques.
So smaller, more open: that's the path forward for enterprises as well.
Well, as per usual, we have way more to talk about than we have time to cover.
So, hopefully we'll be able to have all of you back on the show again.
Um, thanks for joining us.
Um, and, um, yeah, if you enjoyed what you heard, uh, dear listeners, you
can get us on Apple Podcasts, Spotify, and podcast platforms everywhere.
Uh, Skyler, Kaoutar, Shobhit, thanks again for joining the show.
Thank you, Tim.