OpenAI vs Google Showdown
Key Points
- The “Mixture of Experts” podcast episode focuses on the latest showdown between OpenAI and Google, dissecting their recent flood of announcements and what they signal for the AI industry.
- Host Tim Hwang is joined by returning panelists Shobhit Varshney (senior AI consulting partner) and Chris Hay (distinguished engineer/CTO of customer transformation), plus first‑time guest Brian Casey (director of digital marketing), who is slated to give a lengthy monologue on AI and search.
- The discussion organizes the news around three major themes: multimodality (both firms pushing models that handle video, image, audio, and text inputs), latency and cost reductions (faster, cheaper inference that could unlock new downstream applications), and a flagship Google reveal that could become many users’ first exposure to the company’s next‑gen AI offering.
- Throughout, the panel debates which announcements are truly impactful versus hype, aiming to clarify which technologies are “cool” and which are “cringe” for developers and enterprises.
Full Transcript
# OpenAI vs Google Showdown
**Source:** [https://www.youtube.com/watch?v=T6DGGHlkYa0](https://www.youtube.com/watch?v=T6DGGHlkYa0)
**Duration:** 00:41:00
## Sections
- [00:00:00](https://www.youtube.com/watch?v=T6DGGHlkYa0&t=0s) **AI Showdown Intro: OpenAI vs Google** - The host opens the “Mixture of Experts” podcast, previews a debate on the week’s major OpenAI and Google announcements, and introduces returning panelists Shobhit Varshney and Chris Hay alongside first‑time guest Brian Casey.
## Full Transcript
[Music]
hello and welcome to mixture of experts
I'm your host Tim Hwang each week
mixture of experts brings together a
world-class team of researchers product
experts Engineers uh and more to debate
and distill down the biggest news of the
week in AI today on the show The Open Ai
and Google showdown of the week who's up
Who's down who's cool who's cringe what
matters and what was just hype we're
going to talk about the huge wave of
announcements coming out of both
companies this week and what it means
for the industry as a whole so for
panelists today on the show I'm ably
supported by an incredible panel uh two
veterans who have joined the show before
and a a new uh contestant has joined the
ring um so first off uh Shobhit Varshney he's
the senior partner Consulting for AI in
US Canada and LatAm welcome back to
the show thanks for having me back Tim
love this yeah definitely glad to have
you here uh Chris Hay who is a
distinguished engineer and the CTO of
customer transformation Chris welcome
back hey nice to be back yeah glad to
have you back uh and joining us for the
first time is Brian Casey who is the
director of digital marketing who has
promised a 90-minute monologue uh on AI
and search summaries which I don't know
if we're gonna get to but we're gonna
have him have a say Brian welcome to the
show we'll have to suffer through Shobhit
and Chris for a little bit and then
we'll get to the monologue but thank you
for joining yeah exactly
exactly um well great well let's just go
ahead and jump right into it so
obviously there were a huge number of
announcements this week open AI came out
of the gate with its kind of raft of
announcements uh Google IO is going on
and they did their set of announcements
and so really more things I think were
debuted promised coming out then we're
going to have the chance to cover on
this episode but sort of from my point
of view and I think I wanted to use this
as a way of organizing the episode there
were kind of three big themes coming out
of Google and open AI this week we sort
of take in turn and use to kind of make
sense of everything so I think the first
thing is multimodality Right both
companies are sort of obsessed with
their models taking video input and
being able to make sense of it and going
from you know image to audio text to
audio um and I want to talk a little bit
about that second thing is latency and
costs right everybody touted the fact
that their models are going to be
cheaper and they're going to be way
faster right and you know I think if
you're from the outside you might say
well it's kind of a difference in kind
things get faster and cheaper but I
think what's happening here really
potentially might have a huge impact on
Downstream uses uh of AI and so I want
to talk a little bit about that
Dimension and sort of what it means um
and then finally uh I've already kind of
previewed a little bit um Google made
this big announcement that I think is
almost literally going to be like many
people's very first experience with llms
in full production uh Google basically
announced that going forwards uh the US
market and then globally uh those users
of Google search will start seeing AI
summaries at the top of each of their
sort of search results um that's a huge
change we're going to talk a little bit
about what that means and um if it's
good I think is a really big question uh
so looking forward to diving into it
[Music]
all right so let's talk a little bit about
multimodal first so there's two showcase
demos from Google and open Ai and I
think both of them kind of roughly got
at the same thing which is that in the
future you're going to open up your
phone you're going to turn on your
camera and then you can wave your camera
around and your AI will basically be
responding in real time and so Shobhit I
want to bring you in because you were
the one who kind of flagged this being
like we should really talk about this
because I think the big question that
I'm sort of left with is like you know
where do do we think this is all going
right it's a really cool feature but
like what kind of products do we think
it's really going to unlock and maybe
we'll start there but I'm sure I mean
this topic goes into all different
places so I'll give you the floor to
start so Monday and Tuesday were just
phenomenal inflection points for the
industry altogether is getting to a
point where an AI can make sense of all
these different modalities it's an
insanely tough problem we've been at
this for a while we've not gotten it
right we spent all this time trying to
create pipelines to do each of these
speech to text and understand and then
text it takes a while to get all of the
processing done the fact that in 2024 we
are able to do this what a time to be
alive man uh I just feel that we are
getting finally getting to a point where
your phone becomes an extension of of
your eyes of your listening in and stuff
like that right and that is a that has a
profound impact on some of the workflows
in our daily lives now within IBM I
focus a lot more on Enterprises so I'll
give you more of an Enterprise a view of
how these Technologies are actually
going to make a make a difference or not
in both cases Gemini and and OpenAI's
4o and by the way in my case the o does not
stand for Omni for me 4o means oh
my God it was really really that good so
U we're getting to a point where there
are certain workflows that we do with
Enterprises like you are looking at
transferring Knowledge from one person
to the other and usually you're looking
at a screen and you have a bunch of here
is what I did how I solved for it yeah we
used to spend a lot of time trying to
capture all of that and what happened in
the desktop classic BPO processes these
are billions of dollars of work that
happens right yeah and I think I'll pause
you there like I'm curious if you can
explain because again this is not my
world I'm sure for a lot of listeners
it isn't their world as well how did it
used to be done right like so if you're
you're trying to like automate a bunch
of these workflows
is it just people writing scripts for
every single task or like I'm just kind
of curious about what it looks like yeah
so Tim let's let's pick a more concrete
example say you are outsourcing a
particular piece of work and you have
Finance documents coming in you're
comparing it against other things you're
finding errors you're going to go back
and send the email things of that nature
right so we used to spend a lot of time
documenting the current process and then
we look at that 7 29 step process and
say I'm going to call an API I'm going
to write some scripts and all kinds of
issues used to happen along the way
unhappy paths and so forth so the whole
process used to be codified in some some
level of code and then it's
deterministic it does one thing in a
particular flow really well and you
cannot interrupt it you can't just barge
in and say no no no this is not what I
wanted can you do something else so
we're now finally getting to a point
where that knowledge work that work that
used to get done in a process that will
start getting automated significantly
with announcements from both Google and
uh OpenAI so far people would solve it
as a decision step-by-step flowchart but
now we're at a paradigm shift where I can
in the middle of it interrupt and I can
say hey see what's on my desktop and
figure it out I've been playing I've
been playing around with with OpenAI's
4o its ability to go look at a
video of a screen and things of that
nature it's pretty outstanding we are
coming to a point where the the speed at
which the inference is happening is so
quick then now you can physically we can
actually bring them into your workflows
early it was just take so long it was
very clunky it was very expensive so you
couldn't really justify adding AI into
those workflows it'd be you do labor
Arbitrage or things of that nature
versus trying to automate it so these
kind of workflows infusing AI in doing
this entire process is a phenomenal
unlock one of my clients is um a big CPG
company and uh as we walk into the
aisles they do things like planograms
where you're looking at a picture of the
shelf and these consumer product
goods companies would give us a
particular format in which you want to
keep different chips and drinks and so
on so forth and each of those labels are
turned around or they are in a different
place you have to audit and say am I
placing things on the shelf the right way
like the consumer product goods company wanted
to that's called planogram adherence here
so earlier we used to take pictures a
human would go in and note things and
say yes I have enough of the bottles in
the right order then we started to take
pictures and analyzing it you start to
run into real world issues you don't
have enough space to back up and take a
picture or you go to the next aisle and
the lighting is very different and stuff
like that so AI never quite scaled and
this is the first time now we're looking
at models like Gemini and others where I
can just walk past it and just create a
video and just feed the whole 5 minute
video in with this context length of 2
million plus tokens and stuff it can actually
ingest it all and note what's missing yeah
right so those those kind of things that
were very very difficult to do for us
earlier those are becoming a piece of cake
the big question here is how do I make
sure that the phenomenal AI stuff that we're
seeing is grounded in the Enterprise so it's
my data my planogram style or my
processes my documents not getting
Knowledge from elsewhere so in all the
demos one of the things that I was
missing was how do I make it go down a
particular path that I want right if the
answer is not quite right how do I
control it so I think a lot more around
how do I bring this to my Enterprise
clients and deliver value for them those are
some of the open questions Chris I
totally I do want to get into that I see
Chris coming off mute though so I don't
want to break his role I don't know if
Chris and you got kind of a view on this
or if you disagree you're like ah it's
actually not that impressive uh Google
Glasses back baby yeah yeah
no so I I think I think multimodality is
a huge thing and Shobhit covered it
correctly right there's so many use
cases in the Enterprise but also in uh
consumer based uh scenarios and I think
one of the things we really need to
think about is we've been working with
llms for so long now which has been
great but the 2D text space isn't enough
for generative AI it's it's we want to
be able to interact real time we want to
be able to interact with audio um you
know and you can take that to things
like contact centers where you want to
be able to transcribe that audio you
want to then have AIS be able to respond
back in a human way and you want to chat
with the assistants like like you saw on
the open AI demo you know you don't want
to be sitting there go well you know my
conversation is going to be as fast as
my fingers can type you want to be able
to say hey you know what do you think
about this what about that and you want
to imagine new scenarios so you want to
say what what does this model look like
what does this image look like you know
tell me what this is and you want to be
able to interact with the world around
you and to be able to do that you need
multimodal uh models and and
therefore like in the Google demo where
you know yeah she picked up the glasses
again you know so I jokingly said Google
Glasses back but but it really is it's
if you're going and having a shopping
experience retail and you want to be
able to look at what the price of a
mobile phone is for example you're not
going to want to stop get your phone out
type type type you just want to be able
to interact with an assistant there and
then or see in your glasses what the
price is and I give the mobile phone
example for a reason which is the price
that I pay for a mobile phone isn't the
same price as you would pay right
because it's all contract rates and if I
go and speak if I want to get the price
of how much am I paying for that phone
it takes an advisor like 20 minutes cuz
they have to go look up your contract
details Etc they have to look up what
the phone is and then they do a deal mhm
in a world of multimodality where you've
got something like glasses on it can
recognize the object it knows who you
are and then it can go and look up what
uh what the price of the phone is for
you and then be able to answer questions
that are not generic questions but
specific about you your contract to you
right exactly that that is where
multimodality is going to start start to
come in kind of sounds like right yeah
totally I mean Chris if I have you right
I mean this is one of the questions I
want to pitch to both you show and you
Chris on this is you know actually my
mind goes directly back to Google Glass
like the the bar where the guy got beat
up for wearing Google Glass years ago
that was like around the corner from
where I used to live in San
Francisco oh wow and you know there's
just been this dream and obviously all
the open AI demos uh and Google demos
for that matter are all very consumer
right that you're walking around with
your glasses and you're looking around
the world and you know get prices and
that kind of thing this been like a
long-standing Silicon Valley dream and
it's been very hard to achieve and I
guess one thing I want to run by you is
like and the answer might just be both
or we don't know is like if you're more
bullish on the B2B side or on the
B2C side right because I hear what
Shobhit's saying and I'm like oh okay I can
see why Enterprises really get a huge
bonus from this sort of thing um and and
I guess it's really funny to me because
I think there's one point of view which
is everybody's talking about the
consumer use case but the actual
near-term impact may actually be more on
the Enterprise side but I don't know if
you guys buy that or if you really are
like this is the era of Google Glass you
know it's it's back baby so so I can
start first Tim um we have been working
with Apple Vision quite a bit um within
IBM with our clients and a lot of those
are Enterprise use cases in a very
controlled environment so things that
where things break in the consumer world
you don't have a controlled environment
you have Corner cases that happen a lot
right in an Enterprise setting if I'm
help if I'm wearing my my vision Pros
for two hours at a stretch doing I'm a
mechanic I'm fixing things right that's
a place where I need additional input
and I can't go look at other uh things
like pick up my cell phone and work on
it I'm underneath I'm I'm fixing
something in the middle of it right
those use cases because the environment
is very controlled I can do AI with
higher accuracy it's repeatable I know I
can start trusting the answers because I
have enough data coming out from it
right so you're not trying to solve
every problem but I think we'll see a
higher uptake of these devices uh by the way
I love the the Ray-Ban glasses from Meta
as well great great to do something
quick but when you don't want to switch
but I think we are moving to a point
where Enterprises will go deliver these
at scale the tech starts to get better
and adoption is going to come over on
the B2C side but in the consumer goods
we'll have multiple attempts at this
like we had with Google Glass and
stuff it'll take a few attempts to get
better on the Enterprise side we will
learn and make the models a lot better
but I think there's insane amount of
value that we're delivering to our
clients with apple Vision Pro today in
Enterprise settings I think it's going
to follow that pattern totally yeah and
it's actually interesting I hadn't
really thought about this until Shobhit's example is
like um basically like the phone is
almost not as big of competition in the
Enterprise setting right whereas like
the example that Chris gave was like
literally you're trying to be like is
this multim modal device faster than
using my phone in that interaction which
is like a real competition but if it's
something like a mechanic you know they
don't have they don't they can't just
pull out their phone um Chris any final
thoughts on this and then I want to move
us to our next topic yeah and I was just
going to give another kind of use case
scenario I I often think of things like
the oil rig example so
a real sort of Enterprise space where
you're wandering around and you have to
go and do safety checks on various
things and most of their time if you
think of the days before the mobile
phone or before the tablet what they
would have to do is go look at the part
do the inspection the visual inspection
and then walk back to a PC to go fill
that in and then these days you do that
with a tablet on the rig right but but
then actually you need to find a
component you're going to look at you
have to do the defect analysis you want
to be able to take pictures of that you
need the geolocation of where that part
is so that the next person can find it
and then you want to be able to see the
notes that they had before on this and
then you got to fill in the safety form
right so they have to fill in a ton of
forms so there's a whole set of
information if you just think about AI
just having you know even your phone or
glasses pick either to be able to look
at that part be able to have the notes
contextualized in that geospatial space
be able to fill in that form be able to
do an analysis with AI it's it's got a
huge impact on Enterprise cases and
probably multimodality in that sense has
probably got a bigger impact I would say
in the Enterprise cases than the
Consumer spaces even today and I and I
think that's something we really need to
think about the other one is and again I
know you wanted this to be quick there
Tim is the clue in generative AI is the
generative part right so actually I can
create images I can create audio I can
create music things that don't exist
today so and with the text part of
something like an llm then I can create
new creative stuff I can create DevOps
pipelines Docker files whatever so there
comes a part where I want to visualize
the thing that I create I don't want to
be copying and pasting from one system
to another right that's not any
different from the oil rig scenario so
as I start to imagine new new business
processes new pipelines new uh Tech
processes I then want to be able to have
the real-time visualization of that at
the same time or be able to interact
with that and that's why multimodality
is is really important probably more so
in the Enterprise space yeah that's
right I mean I think some of the
experiments you're seeing with like
Dynamic visualization generation are
just like very cool right uh because
then you basically have you can say like
here's how I want to interact with the
data the system kind of just generates
it right on the Fly um which I think is
very very exciting
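As a concrete illustration of the planogram-audit idea Shobhit described earlier — a multimodal model ingests a shelf walkthrough video and you ground its output against your own planogram — here is a minimal sketch of the downstream grounding step. The model call is stubbed out, and every product name, slot layout, and detection result is invented for illustration, not taken from any real system:

```python
# Sketch of grounding a multimodal model's shelf observations against an
# enterprise planogram. The model call is a stub; names/layout are made up.
from dataclasses import dataclass

@dataclass
class ShelfObservation:
    position: int  # slot index on the shelf, left to right
    product: str   # product the model says it saw ("" if the slot looked empty)

def detect_products_from_video(video_path: str) -> list[ShelfObservation]:
    """Placeholder for a long-context multimodal model call that would
    take the aisle walkthrough video and return per-slot detections."""
    return [
        ShelfObservation(0, "cola-12oz"),
        ShelfObservation(1, "chips-bbq"),
        ShelfObservation(2, ""),            # slot the model flagged as empty
        ShelfObservation(3, "chips-salt"),  # wrong product for this slot
    ]

def audit(planogram: dict[int, str], seen: list[ShelfObservation]) -> list[str]:
    """Compare model output against the expected planogram; report issues."""
    issues = []
    for obs in seen:
        expected = planogram.get(obs.position)
        if not obs.product:
            issues.append(f"slot {obs.position}: missing {expected}")
        elif obs.product != expected:
            issues.append(f"slot {obs.position}: found {obs.product}, expected {expected}")
    return issues

planogram = {0: "cola-12oz", 1: "chips-bbq", 2: "chips-salt", 3: "cola-diet"}
problems = audit(planogram, detect_products_from_video("aisle_walkthrough.mp4"))
```

The point of the sketch is the grounding: the model supplies perception, but the audit logic compares it against "my data, my planogram" rather than trusting the model's own world knowledge.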
all right so next up I want to talk
about latency and cost so this is
another big Trend you know I think it
was very interesting that both companies
went out of their way to be like we've
got this offering and it's way cheaper
for everybody um which I think suggests
to me that you know these big huge
competitors in AI all recognize that
like your your per token cost is going
to be this huge bar to getting the
technology more distributed um so
certainly one of the ways they sold 4o
was that it was cheaper and as good as
GPT-4 right everybody was kind of like
okay well why do I pay for pro anymore
if I'm just going to get this for for
free and then Google's bid of course was
Gemini 1.5 flash right which is okay
it's going to be cheaper and faster
again um and I know Chris you threw this
uh sort of topic out so I'll kind of let
you have the first say but I think the
main question I'm left with is like what
are the downstream impacts of this right
for someone who's not really paying
attention to AI very closely like is
this just matter of like it's getting
cheaper or do you think like these are
actually these economics are kind of
changing how the technology is actually
going to be rolled out
I think latency and smaller models and
tokens are probably one of the most
interesting challenges we have today so
if you think about like GPT-4 and
everybody was talking like oh that's a
1.8 trillion model or whatever it is
that's great but the problem with these
large models is every layer that you
have in the neural network is adding
time to get a response back and not not
only time but cost so if you look at the
demo that open AI did for example what
was really cool about that demo was the
fact that when you were speaking to the
assistant it was answering pretty much
instantly right and that is the real
important part and when we look at
previous demos what you would have to do
if you were having a voice interaction
is you'd be stitching together kind of
three different pipelines you need to do
uh Speech to Text then you're going to
run that through the model and then
you're going to do text to speech on the
way back so you're getting latency latency
latency before you you get a response
and that timing that it would take
because it's not in the sort of 300
millisecond mark it was too long for a
human being to be able to interact so
you got this massive pause so actually
latency and the kind of tokens per
second becomes the most important thing
if you want to be able to interact with
models quickly and be able to have those
conversations and that's sort of why
also multimodality is really important
because if I can do this in one model as
well then it means that I'm not sort of
jumping pipelines all the time so the
smaller you can make the model the
faster it's going to be now if you look
at the GPT-4o model I don't know
if you've played with just a text mode
it is lightning fast when it comes back
very fast now yeah it's and noticeably
so like it's just like it feels like
every time I'm in there's like these
improvements right so and and this is
what you're doing you're sort of trading
off reasoning versus uh speed of the
model right and and as we move into kind
of agentic platforms as we move into
multimodality you need that latency to
be super super sharp because you're not
going to be waiting all the time so
there is going to be scenarios where you
want to move back to a bigger model that
is fine um but you're going to be paying
the cost and that cost is going to be
the cost uh the price of the tokens in
the first place but also the speed of
the response and I think this is the
push and pull that model creators are
going to be playing against all of the
time and and and therefore if you can
get a similar result from a smaller
model and you can get a similar result
from a faster model and a cheaper model
then you're going to go for that but in
those cases where it's not then you may
need to go to the larger model to kind
of reason so this this is really
important totally yeah I think there's a
bunch of things to say there I mean I
think one thing that you've pointed out
clearly is that like this makes
conversation possible Right like that
you and I can have a conversation in
part because I have low latency is kind
of the way to think about it and like
now that we're reaching kind of human
like parity on latency you know finally
these models can kind of Converse in a
certain way the other one is actually I
really thought about that there is kind
of this almost like thinking fast and
slow thing where basically like the
models can be faster but they're just
not as good at reasoning um and then
there's kind of this like deep thinking
mode which actually is like slower in
some ways so Tim uh the way we are
helping Enterprise clients again have
that kind of focus in in life there's a
split there's a there's there are two
ways of looking at applying gen AI in the
industry right now one is at the use
case level you're looking at the whole
workflow end to end seven different
steps the other is going and looking at
it at a subtask level right so I'll just
take pick an example I'll walk you
through it so say I have an invoice that
comes in and I'm taking an application
I'm pulling something out of it I'm
making sure that that's as for the
contract I'm going to send you an email
saying your invoice is paid right so some
sort of a flow like that right so say it
is seven steps just very simplified
right I'm going to pull things from the
backend systems using apis step number
three I'm going to go call a fraud
detection model that has been working
great for three years step number four
I'm extracting things from a paper right
an invoice that came in that extraction
I used to be doing with OCR 85% accuracy
humans will do the Overflow of it at
that point we're taking a pause and
saying we have reason to believe that
llms today can look at an image and
extract this with higher accuracy yeah
say we get up to 94% so that's nine
points higher accuracy of pulling things
out so we pause at that point and say
let's create a set of constraints for
step number four to find the right
alternatives and the constraint could be
what's the latency like we just spoke
how quickly I need the result or can
this take 30 seconds and I'll be okay
with it second could be around cost if
I'm doing this a thousand times I have a
cost envelope to work with versus a
human doing it if I'm doing it a million
times I can invest a little bit more if
I can get accuracy out of it right
so the ROI becomes important then you're
looking at security constraints around
does this data have any PHI
data PII data that really can't leave the
cloud I have to bring things closer or
is this something that is military grade
secrets and has to be on-prem right so
you have certain constraints around that so
you come up with a list of five six
constraints and then that lets you
decide what kind of an llm will
actually check off all these different
constraints and then you you start
comparing and bringing it in so the
split that we're seeing in the market is
one way with llm agents and with these
multimodal models they're trying to
accomplish the entire workflow end
to end like you saw with Google's
returning the shoes right it's taking an
image of it is going and looking at your
Gmail to find the receipt starting the
return giving you a QR code with the
whole return process done so it just
figured out how to go create the entire
endtoend workflow but where the
Enterprises are still focused is more on
the subtask level at that point we are
saying this step step number four is
worth switching and I have enough evals
before and after I have enough metrics
to understand and I can control that I
can audit that much better the thing
that from an Enterprise perspective
these end to end multimodal models it'll
be difficult for us to explain to the SEC
for example why we rejected somebody's
benefits on a credit card things of that
nature so I think in the in the
Enterprise World we're going to go down
the path of let me Define the process
I'm going to pick small models to
Chris's point to do that piece better
and then eventually start moving over to
hey now let me make sure that that
framework of evals and all of that stuff
can be applied to end-to-end multimodal models
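The subtask-level selection walked through above — define a constraint envelope for step number four (latency, cost, accuracy, data residency) and keep only the models that check every box — could be sketched roughly like this. All model names, latencies, costs, and accuracy numbers are illustrative placeholders, not benchmarks of any real system:

```python
# Sketch of constraint-based model selection for one workflow subtask.
# Every candidate entry and number below is invented for illustration.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    latency_s: float      # typical response time in seconds
    cost_per_call: float  # dollars per invocation
    accuracy: float       # measured on our own eval set
    on_prem: bool         # can it run inside our environment?

def pick_model(candidates, max_latency_s, max_cost, min_accuracy, require_on_prem):
    """Keep candidates inside the constraint envelope, then take the
    most accurate survivor (None if nothing qualifies)."""
    ok = [c for c in candidates
          if c.latency_s <= max_latency_s
          and c.cost_per_call <= max_cost
          and c.accuracy >= min_accuracy
          and (c.on_prem or not require_on_prem)]
    return max(ok, key=lambda c: c.accuracy, default=None)

candidates = [
    Candidate("big-multimodal", latency_s=8.0, cost_per_call=0.020, accuracy=0.96, on_prem=False),
    Candidate("small-vision",   latency_s=1.2, cost_per_call=0.002, accuracy=0.94, on_prem=True),
    Candidate("legacy-ocr",     latency_s=0.5, cost_per_call=0.001, accuracy=0.85, on_prem=True),
]

# invoice-extraction step: PII must stay on-prem, 30 s is acceptable,
# and we need to beat the legacy OCR's 85% accuracy
choice = pick_model(candidates, max_latency_s=30.0, max_cost=0.01,
                    min_accuracy=0.90, require_on_prem=True)
```

This captures the trade Chris and Shobhit describe: the biggest model is excluded not for quality but because it blows the cost and residency constraints, so a smaller, auditable model wins that one step.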
I guess I do want to maybe bring in
Brian here you like release the Brian on
this conversation um because I'm curious
about like kind of like the marketer
view on all this right because I think
there's one point of view which is yes
yes Chris and Shobhit like this is all nerd stuff
right like yeah I know it's like latency
and cost and speed and whatever the big
thing is that you can actually talk to
these AIS right and I guess I'm kind of
curious from your point of view about
like I mean one really big thing that
came out of like the open AI
announcements was we're going to use
this latency thing largely to kind of
create this feature that just feels a
lot more human and lifelike um than you
know typing and chatting with an AI and I
guess I'm kind of curious about like you
know what you think about that move
right like is that ultimately like going
to help the adoption of AI is it just
kind of like a weird sci-fi thing that
open AI wants to do and also I mean I
think if if you've got any thoughts on
you know how it impacts the Enterprise
as well was just like do companies
suddenly say oh I understand this now
right it's because it's like the AI from
her I can buy this um just kind of
interesting thinking about like the the
sort of surface part of this because it
actually will really have a big impact
on the market as well it's kind of like
the technical advances are driving the
the marketing of this I I mean I do
think when you when you look at like
some of the initial reviews of I want to
say like the Humane Pin and the Rabbit like I
remember one of the one of the scenarios
that was being demoed
was I think I think he was looking at a
car and he was asking a question about
it and the whole interaction took like
20 seconds there and he went through he
was just showing that he could do the
whole thing on his phone in the same
amount of time but the thing that I was
thinking about when I was watching that
was like he just did like 50 steps on
his phone that was awful as opposed to
just pushing a button and asking a
question and it was like it was very
clear that the ux interaction of just
like like asking the question and
looking at the thing was a way better
experience than pushing the 50 buttons
on your phone but the 50 buttons still
won just cuz it was faster to do 50
buttons than to you know deal with the
latency impact of um of where we were
before and so it actually it reminded me
a lot of just the way I remember
hearing Spotify talk early
about the way that they thought about
latency and the things that they did to
just make the first 15 seconds of a song
land um essentially so that it felt like
you know a like a file that you had on
your device because I think from their
perspective they if it felt like every
time you wanted to listen to a song that
was buffering as opposed to sitting on
your device you were never going to
really adopt that thing because it's a
horrible experience relative to just
having the file locally and so they put
in all this work so that it felt the
same and that wound up being a huge part
of how the technology and the product
ended up getting adopted
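The Spotify move Brian describes, making the opening of a song feel like a local file, is essentially head-chunk caching: keep the first few seconds on the device, start playback from that instantly, and stream the remainder in the background. A minimal sketch of the idea, with every name (`head_cache`, `stream_rest`, `play`) invented for illustration:

```python
import threading
import time

# Invented example data: song id -> pre-cached opening chunk (the "first 15 seconds")
head_cache = {"song-42": b"opening-audio-bytes"}

def stream_rest(song_id: str, buf: list) -> None:
    """Stand-in for fetching the remainder of the track over the network."""
    time.sleep(0.1)  # simulated network latency
    buf.append(b"rest-of-track-bytes")

def play(song_id: str) -> list:
    """Start 'playback' instantly from the cached head, stream the rest behind it."""
    buf = [head_cache[song_id]]  # available immediately, zero network wait
    t = threading.Thread(target=stream_rest, args=(song_id, buf))
    t.start()
    t.join()  # in a real player the fetch would overlap playback
    return buf

print(play("song-42"))
```

In a real client the background fetch overlaps playback rather than being joined immediately; the join here just keeps the sketch deterministic.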
and you know I do think there's a lot of
a lot of stuff we're doing that is
almost like I don't want to say back
office but like just Enterprise
processes around how people do things
operational things
but there are plenty of ways where
people are thinking about the way that
we do more with like agents in terms of
how that involves like customer
experience whether it's support
interactions whether it's like bots on
the site you can just clearly imagine
that that's going to play a bigger role
in customer experience going forward
and if you feel like every time you ask
a question that you're waiting 20
seconds to get a response from this
thing like the other
person on the end of that interaction is
just getting madder and madder and
madder the entire time where the more it
feels like you're talking to a person and
that they're responding to you as fast
as you're talking I think the more
likely it is that people are going to
accept that as an interaction model um
and so I do think that that latency and
like making that feel to you like to
your point about human beings
being zero latency um I think that's a
necessary condition for a lot of these
interaction models and so it's going to
be super important going forward and to
me it's also when I think about the
Spotify thing it's like are people
going to do interesting things to solve
for the first 15 seconds of an
interaction as opposed to the
entire interaction like you know can you
get there was a lot of talk about like
OpenAI's model I want to say like
responding with like sure or just like
some space filling entry point um so it
like it could catch up with the rest of
the dialogue so I
think people will prioritize that a lot
because it'll matter a lot I love the
idea that like to save cost
basically OpenAI is like for the
first few turns of the conversation we
deliver the really fast model so it
feels like you're really having like a
nice flowing conversation and then
basically once you build confidence they
like fall back to like the slower model
that has better results where you're
like oh this person is a good
conversationalist but they're also smart
too right is like kind of what they're
trying to do by kind of playing with
model delivery um so we got to talk
about search but Chris I saw you go off
mute so do you want to do a final quick
hit on the question of latency before we
move on no I was just going to come in
on what Brian was saying there and
what you were saying Tim I totally
agree it was always doing this hey and
then repeat the question so I I wonder
if underneath the hood as you say
there's a much smaller classifier model
that is just doing that hey piece and
then as you say there's probably a
slightly larger model actually analyzing
the real thing so I I do wonder if
there's two small models or a small
model and a slightly larger model in
between there for that interaction so
it's super interesting but maybe the
thing I wanted to add to that is we
don't have that voice model in our hands
today we only have the text model so I
wonder once we get out of the demo
environment and then maybe in three weeks'
time or whatever we have that model
whether that's going to be super
annoying every time we ask a question
it's going to go hey and then repeat the
question back so it's cool for a demo
but I wonder if that will actually be
super annoying in two weeks' time
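Chris's hunch here, a tiny model that instantly produces the "hey" acknowledgment while a larger model works on the substantive answer, can be sketched as a two-stage pipeline. This is purely speculative and self-contained: `fast_ack` and `slow_answer` are stub functions simulating model calls, not any real OpenAI or Google API.

```python
import threading
import time
import queue

def fast_ack(question: str) -> str:
    """Stand-in for a tiny, low-latency model: instant space-filling reply."""
    return f"Hey! So, about '{question}' ..."

def slow_answer(question: str, out: "queue.Queue[str]") -> None:
    """Stand-in for the larger, slower model producing the real answer."""
    time.sleep(0.2)  # simulated inference latency
    out.put(f"Here is a considered answer to: {question}")

def respond(question: str) -> list:
    """Emit the filler immediately, then append the real answer when ready."""
    out: "queue.Queue[str]" = queue.Queue()
    worker = threading.Thread(target=slow_answer, args=(question, out))
    worker.start()
    transcript = [fast_ack(question)]  # the user hears this with near-zero latency
    worker.join()
    transcript.append(out.get())
    return transcript

print(respond("what car is this?"))
```

The design choice is the same one the panel attributes to Spotify: optimize the first moment of the interaction so the wait for the rest is masked.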
all right so last topic that we got
a few minutes on uh and this is like
Brian's big moment so Brian get get
yourself ready for this I mean Chris you
can get yourself ready because
apparently Brian's gonna you know you
know everyone else can leave the meeting
yeah blow our eyebrows off here with his
with his uh with his rant so the the
setup for this is that basically Google
announced uh that AI generated overviews
will be rolling out to US users and then
everybody uh in the near future and I
think there's two things to set you
up Brian I think the first one is this
is what we've been talking about right
like is AI going to replace search here it
is you know here it is consuming the
preeminent search engine so I think it's
like we're here right this is happening
and then the other one is like I'm a little
nostalgic you know someone who grew up
with Google um you know I'm like the 10
Blue Links you know like the search
engine you know it's like a big part of
how I experienced and grew up with the
web and um you know this seems to me
like kind of a big shift in how we
interact with the web as a whole and so
I do want you to kind of first talk a
little about what you think it means for
the market um and uh and how you think
it's going to change the economy of the
web yeah so I
follow two communities I would say
pretty closely online I follow the tech
community pretty closely and then
as somebody who works in marketing I
follow the SEO community um and they
have very different reactions to uh to
what's going on I think your first
question though of um you know is this
the equivalent of swallowing the web um
and what's funny is from
the minute sort of ChatGPT arrived on
the scene people were proclaiming the
death of search now for what it's worth
if you've worked in marketing or on the
Internet for a while people have
proclaimed the death of search as like
an annual event for the last like
25 years and so um this is just like
par for the course on some level but
what's interesting to me is that you had
this product chat GPT which is fastest
growing consumer product ever 100
million users faster than anybody else
and what was interesting is it sort of
like speedran the sort of
growth cycle that usually takes years or
decades like well maybe not decades but
like it takes a long time for most
consumer companies to do what they did
the interesting thing about that is if
it was going to totally disrupt search
you would have expected it to show up
and happen sooner than it would have
with other products that maybe would
have had a slower sort of growth
trajectory um but that didn't happen
like as somebody who watches their
search traffic super closely like
there's been no chaotic drop off of this
like people have continued to use search
engines and like one of the reasons I
think that that happened is because
people actually misunderstood um like
like the equivalent of like ChatGPT and
Google as competitors um with one
another I know Google and open AI
probably are on some level but I don't
know that those two products are and the
reason I was thinking about that is like
if ChatGPT didn't you know within
the within basically the time plan we've
had so far uh disrupt Google the
question is like why why didn't that
happen and I think you could have a
couple different hypotheses for that
like one you could say the form factor
wasn't right it wasn't text that was
going to do it it was we needed Scarlett
Johansson
on your phone and that's the thing
that's going to do it and so they're
maybe leaning into that thought process
a little bit you could say it was
hallucinations like oh the content is
just not accurate uh yeah right so
that's a possibility around it you could
say just like learned consumer behavior
people have been using this stuff for 20
years it's going to take a while to get
them to do something different you could
say Google's advantages in distribution
so it's like we're on the phone we got
browsers um it's really hard to you know
get the level of penetration that we
have I think all of those probably play
some role but my biggest belief is that
it's actually impossible to separate
Google from the internet itself um
Google's kind of like the operating
system for the web so to disrupt Google
you actually are not disrupting search
you have to disrupt the internet um and
it turns out that that's an incredibly
High bar uh to have to disrupt because
you're not only dealing with search
you're dealing with the capabilities
whether it's Banks or Airlines or you
know retail whatever it is of every
single website that sits on the opposite
end of the internet it turns out that
that's like an enormous amount of
capability um that's built up there and
so I look at that and say
like for as much as like I think this
this technology has brought to the table
hasn't done that thing um yet and so
because it hasn't done that there hasn't
been some dramatic shift there the thing
that Google search is not good at though
um and I think you see it in a little
bit in terms of how they described what
they think the utility of AI overviews
um will be is that it's not good at complex
multi-part questions of saying like if
you're trying to plan if you're doing
anything from like doing a buying
decision for a large Enterprise product
or like planning your kid's birthday
party like you're going to have to do
like 25 queries along the way there and
you just you've just accepted and
internalized that you have to do 25 queries I
like that is like basically like search
is one shot right like you just say it
and then responses come back so there's
no yeah sorry go ahead yeah yeah and so
like the way I was thinking about llms
is they're kind of like internet SQL
um in a way where you can ask this like
much more complicated question and then
you can actually describe the way that
you want the output of that thing to
look it's like I want to compare these
three products on these three dimensions
go get me all this data and that would
have been 40 queries um at one point but
now you can do it in one and search is
terrible at doing that right now you
have to go cherry-pick each one of those
data points but the interesting thing is
that that's also maybe the most valuable
query to a user um because you save 30 minutes
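That "internet SQL" idea, one request that states both the complex question and the shape you want the answer in, can be sketched as a prompt builder. The products and dimensions below are made up and no model is actually called; the point is only the shape of the query:

```python
import json

def comparison_prompt(products: list, dimensions: list) -> str:
    """Build one 'internet SQL'-style query: the question plus the output schema."""
    # Declare the output shape, the way a SQL SELECT declares its columns
    schema = {p: {d: "<value>" for d in dimensions} for p in products}
    return (
        f"Compare {', '.join(products)} on {', '.join(dimensions)}. "
        f"Respond with JSON exactly matching this shape:\n"
        f"{json.dumps(schema, indent=2)}"
    )

# Hypothetical example: what would otherwise be dozens of separate searches
print(comparison_prompt(
    ["Laptop A", "Laptop B", "Laptop C"],
    ["price", "battery life", "weight"],
))
```

One prompt like this replaces the cherry-picking of each data point across many searches, which is the gap in one-shot search the panel is pointing at.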
and so I think Google looks at
that and says
um if we cede that particular space of
complex queries to some other platform
like that's a long-term risk for us and
then if it's a long-term risk for them
what it ends up being is a long-term
risk for the web um I think so I
actually think it was incredibly
important that Google bring this type of
capability into into the web even if it
ends up being disruptive a little bit
from a Publisher's perspective because
what it does is at least preserves some
of the dynamic we have now of like the
web still being an important thing and I
hope that used to your point I have like
present and past Nostalgia for it I
would say yeah exactly so I think it's I
think it's important that it continues
to evolve if we all want the web to
continue to persist as like a healthy
Dynamic Place yeah for sure no I think
that's a that's a great take on it and
you know Google always used to say look
we measure our success based on how fast
we get you off our website right and I
think kind of Brian what you're pointing
out which I think is is very true is
that like what they never said was
there's this whole set of queries we
never surface that you know you really
have to kind of keep keep searching for
right and like that's that ends up being
kind of like a the the the search volume
of the future that everybody wants to to
capture um well uh so Brian I think we
also had a little intervention from AI
the thumbs up thing we were joking about
that before the show it's just
yeah my ranking for worst AI feature of
all time um but it'll make up the
thumbnail on the video that's
right yeah exactly um well great so
we've got just a few minutes left Shobhit
Chris any final parting shots on
this topic sure so I I'm very bullish I
think AI overviews um have a lot of
future as long as there's a good
mechanism of incorporating feedback and
making it hyper personalized a simple
query like I want to go have dinner
tonight say I tell you I'm looking
for a Thai restaurant yeah if you look if
I go on OpenTable or Yelp or Google
and try to find that there's a
particular way in which I think through
it the filters that I apply are very
different from how Chris would do it right
so the way I make a decision if
somebody's making that decision for me
great the reason why TikTok works so
much better than Netflix on average I
think I I was um listening to a video by
Scott and he mentioned that we spend
about 155 minutes a week browsing
Netflix on average in the US
something of that nature like a pretty
excessive amount of time versus TikTok has
just completely taken that fallacy of
choice out for you when you go on
TikTok the videos that they have picked
there's just so many data points the
17-second average video 16 minutes of viewing time
across your TikTok engagement and you
have so many data points coming out of
it 71 of them every few seconds
right so they have hyper personalized it
based on how you interact with things
right because they're not asking
you to go pick a channel or a choice of that
nature just showing you the next next
next thing in the sequence hence the
stickiness they've understood the brains
of teenagers and that
demographic really really well I think
that's the direction that Google will go
into it'll start hyper personalizing
based on all the content that they're
reading and finding out where the
receipt of my shoes is they know what I
actually ended up ordering at a
restaurant that I went to right so the
full feedback loop coming into the
Google ecosystem I think it's going to
be brilliant if they get to a point
where they just make a prediction on
which restaurant is going to work for me
based on everything they know about me that's
right yeah I mean the future is they
just going to book it for you and a car
is going to show up and you're going to
get in it's going to take you some place
right uh so they'll send a
confirmation from your email exactly
right uh Chris 30 seconds you've got the
last word 30 seconds search is going to
be a commodity and I think as we see the
AI assistant era I dare you yeah but it
will be a commodity because we are going
to interact with search via these
assistants it's going to be Siri on my
phone which will be enhanced by uh AI
technology it's going to be Android and
Gemini's version on there we we are not
going to be interacting with Google
search in the way we do today with
browsers that is going to be
commoditized and we're going to be
dealing with these assistants who are
going to go and fetch those queries for
us so I I think that's going to be
upended and and at the heart of that is
going to be latency and multimodality as
we said so uh I think they've got to pivot
or they're going to be disrupted yeah I
was going to say just like if that
happens what's interesting is that all
of the advantage Google has actually
vanishes like and then it's an even
playing field against every other llm
which is you know that's a very
interesting Market situation in that at
that point yeah I'm gonna pick that up
next week that's a very very good topic
we should get more into it um great
well we're at time uh Shobhit Chris uh
thanks for joining us on the show again
uh Brian we hope to see you again
sometime um and to all you out there in
radio land if you enjoyed what you heard
you can get us on Apple podcasts Spotify
and podcast platforms everywhere and
we'll see you next week for Mixture of
Experts