OpenAI vs Google Showdown
Key Points
- The “Mixture of Experts” podcast episode focuses on the latest showdown between OpenAI and Google, dissecting their recent flood of announcements and what they signal for the AI industry.
- Host Tim Hwang is joined by returning panelists Shobhit Varshney (senior AI consulting partner) and Chris Hay (distinguished engineer/CTO of customer transformation), plus first‑time guest Brian Casey (director of digital marketing), who is slated to give a lengthy monologue on AI and search.
- The discussion organizes the news around three major themes: multimodality (both firms pushing models that handle video, image, audio, and text inputs), latency and cost reductions (faster, cheaper inference that could unlock new downstream applications), and a flagship Google reveal that could become many users’ first exposure to the company’s next‑gen AI offering.
- Throughout, the panel debates which announcements are truly impactful versus hype, aiming to clarify which technologies are “cool” and which are “cringe” for developers and enterprises.
Full Transcript
# OpenAI vs Google Showdown
**Source:** [https://www.youtube.com/watch?v=T6DGGHlkYa0](https://www.youtube.com/watch?v=T6DGGHlkYa0)
**Duration:** 00:41:00
## Sections
- [00:00:00](https://www.youtube.com/watch?v=T6DGGHlkYa0&t=0s) **AI Showdown Intro: OpenAI vs Google** - The host opens the “Mixture of Experts” podcast, previews a debate on the week’s major OpenAI and Google announcements, and introduces returning panelists Shobhit Varshney and Chris Hay alongside first‑time guest Brian Casey.
## Full Transcript
[Music]
hello and welcome to mixture of experts
I'm your host Tim Hwang each week
mixture of experts brings together a
world-class team of researchers product
experts Engineers uh and more to debate
and distill down the biggest news of the
week in AI today on the show The Open Ai
and Google showdown of the week who's up
Who's down who's cool who's cringe what
matters and what was just hype we're
going to talk about the huge wave of
announcements coming out of both
companies this week and what it means
for the industry as a whole so for
panelists today on the show I'm ably
supported by an incredible panel uh two
veterans who have joined the show before
and a a new uh contestant has joined the
ring um so first off uh Shobhit Varshney he's
the senior partner Consulting for AI in
US Canada and LatAm welcome back to
the show thanks for having me back Tim
love this yeah definitely glad to have
you here uh Chris Hay who is a
distinguished engineer and the CTO of
customer transformation Chris welcome
back hey nice to be back yeah glad to
have you back uh and joining us for the
first time is Brian Casey who is the
director of digital marketing who has
promised a 90-minute monologue uh on AI
and search summaries which I don't know
if we're gonna get to but we're gonna
have him have a say Brian welcome to the
show we'll have to suffer through Shobhit
and Chris for a little bit and then
we'll get to the monologue but thank you
for joining yeah exactly
exactly um well great well let's just go
ahead and jump right into it so
obviously there were a huge number of
announcements this week open AI came out
of the gate with its kind of raft of
announcements uh Google IO is going on
and they did their set of announcements
and so really more things I think were
debuted promised coming out then we're
going to have the chance to cover on
this episode but sort of from my point
of view and I think I wanted to use this
as a way of organizing the episode there
were kind of three big themes coming out
of Google and open AI this week we sort
of take in turn and use to kind of make
sense of everything so I think the first
thing is multimodality Right both
companies are sort of obsessed with
their models taking video input and
being able to make sense of it and going
from you know image to audio text to
audio um and I want to talk a little bit
about that second thing is latency and
costs right everybody touted the fact
that their models are going to be
cheaper and they're going to be way
faster right and you know I think if
you're from the outside you might say
well it's kind of a difference in kind
things get faster and cheaper but I
think what's happening here really
potentially might have a huge impact on
Downstream uses uh of AI and so I want
to talk a little bit about that
Dimension and sort of what it means um
and then finally uh I've already kind of
previewed a little bit um Google made
this big announcement that I think is
almost literally going to be like many
people's very first experience with llms
in full production uh Google basically
announced that going forwards uh the US
market and then globally uh those users
of Google search will start seeing AI
summaries at the top of each of their
sort of search results um that's a huge
change we're going to talk a little bit
about what that means and um if it's
good I think is a really big question uh
so looking forward to diving into it
[Music]
all right so let's talk a little bit about
multimodal first so there's two showcase
demos from Google and open Ai and I
think both of them kind of roughly got
at the same thing which is that in the
future you're going to open up your
phone you're going to turn on your
camera and then you can wave your camera
around and your AI will basically be
responding in real time and so Shobhit I
want to bring you in because you were
the one who kind of flagged this being
like we should really talk about this
because I think the big question that
I'm sort of left with is like you know
where do do we think this is all going
right it's a really cool feature but
like what kind of products do we think
it's really going to unlock and maybe
we'll start there but I'm sure I mean
this topic goes into all different
places so I'll give you the floor to
start so Monday and Tuesday were just
phenomenal inflection points for the
industry altogether is getting to a
point where an AI can make sense of all
these different modalities it's an
insanely tough problem we've been at
this for a while we've not gotten it
right we spent all this time trying to
create pipelines to do each of these
speech to text and understand and then
text it takes a while to get all of the
processing done the fact that in 2024 we
are able to do this what a time to be
alive man uh I just feel that we are
getting finally getting to a point where
your phone becomes an extension of of
your eyes of your listening in and stuff
like that right and that is a that has a
profound impact on some of the workflows
in our daily lives now within IBM I
focus a lot more on Enterprises so I'll
give you more of an Enterprise a view of
how these Technologies are actually
going to make a make a difference or not
in both cases Gemini and and OpenAI's
4o and by the way in my case the o does not
stand for Omni for me 4o means oh
my God it was really really that good so
U we're getting to a point where there
are certain workflows that we do with
Enterprises like you are looking at
transferring Knowledge from one person
to the other and usually you're looking
at a screen and you have a bunch of here
is what I did how I solved for it yeah we
used to spend a lot of time trying to
capture all of that and what happened in
the desktop classic BPO processes these
are billions of dollars of work that
happens right yeah and I think I'll pause
you there like I'm curious if you can
explain because again this is not my
world I'm sure for a lot of listeners
it isn't their world as well how did it
used to be done right like so if you're
you're trying to like automate a bunch
of these workflows
is it just people writing scripts for
every single task or like I'm just kind
of curious about what it looks like yeah
so Tim let's let's pick a more concrete
example say you are outsourcing a
particular piece of work and you have
Finance documents coming in you're
comparing it against other things you're
finding errors you're going to go back
and send the email things of that nature
right so we used to spend a lot of time
documenting the current process and then
we look at that 7 29 step process and
say I'm going to call an API I'm going
to write some scripts and all kinds of
issues used to happen along the way
unhappy paths and so forth so the whole
process used to be codified in some some
level of code and then it's
deterministic it does one thing in a
particular flow really well and you
cannot interrupt it you can't just barge
in and say no no no this is not what I
wanted can you do something else so
we're now finally getting to a point
where that knowledge work that work that
used to get done in a process that will
start getting automated significantly
with announcements from both Google and
uh OpenAI so far people would solve it
as a decision step-by-step flowchart but
now we're at a paradigm shift where I can
in the middle of it interrupt and I can
say hey see what's on my desktop and
figure it out I've been playing I've
been playing around with with OpenAI's
4o its ability to go look at a
video of a screen and things of that
nature it's pretty outstanding we are
coming to a point where the the speed at
which the inference is happening is so
quick then now you can physically we can
actually bring them into your workflows
early it was just take so long it was
very clunky it was very expensive so you
couldn't really justify adding AI into
those workflows it'd be you do labor
Arbitrage or things of that nature
versus trying to automate it so these
kind of workflows infusing AI in doing
this entire process is a phenomenal
unlock one of my clients is um a big CPG
company and uh as we walk into the
aisles they do things like planograms
where you're looking at a picture of the
shelf and these consumer product
goods companies would give us a
particular format in which you want to
keep different chips and drinks and so
on so forth and each of those labels are
turned around or they are in a different
place you have to audit and say am I
placing things on the shelf the right way
like the consumer product goods company wanted
to that's called planogram adherence here
so earlier we used to take pictures a
human would go in and note things and
say yes I have enough of the bottles in
the right order then we started to take
pictures and analyzing it you start to
run into real world issues you don't
have enough space to back up and take a
picture or you go to the next aisle and
the lighting is very different and stuff
like that so AI never quite scaled and
this is the first time now we're looking
at models like Gemini and others where I
can just walk past it and just create a
video and just feed the whole 5 minute
video in with this context length of 2
million plus tokens and stuff it can actually
ingest it all and note what's missing yeah
right so those those kind of things that
were very very difficult to do for us
earlier those are becoming a piece of cake
the big question here is how do I make
sure that the phenomenal AI stuff that we're
seeing is grounded in the Enterprise so it's
my data my planogram style or my
processes my documents not getting
Knowledge from elsewhere so in all the
demos one of the things that I was
missing was how do I make it go down a
particular path that I want right if the
answer is not quite right how do I
control it so I think a lot more around
how do I bring this to my Enterprise
clients and deliver value for them those are
some of the open questions Chris I
totally I do want to get into that I see
Chris coming off mute though so I don't
want to break his role I don't know if
Chris and you got kind of a view on this
or if you disagree you're like ah it's
actually not that impressive uh Google
Glasses back baby yeah yeah
no so I I think I think multimodality is
a huge thing and Shobhit covered it
correctly right there's so many use
cases in the Enterprise but also in uh
consumer based uh scenarios and I think
one of the things we really need to
think about is we've been working with
llms for so long now which has been
great but the 2D text space isn't enough
for generative AI it's it's we want to
be able to interact real time we want to
be able to interact with audio um you
know and you can take that to things
like contact centers where you want to
be able to transcribe that audio you
want to then have AIS be able to respond
back in a human way and you want to chat
with the assistants like like you saw on
the open AI demo you know you don't want
to be sitting there go well you know my
conversation is going to be as fast as
my fingers can type you want to be able
to say hey you know what do you think
about this what about that and you want
to imagine new scenarios so you want to
say what what does this model look like
what does this image look like you know
tell me what this is and you want to be
able to interact with the world around
you and to be able to do that you need
multimodal uh models and and
therefore like in the Google demo where
you know yeah she picked up the glasses
again you know so I jokingly said Google
Glasses back but but it really is it's
if you're going and having a shopping
experience retail and you want to be
able to look at what the price of a
mobile phone is for example you're not
going to want to stop get your phone out
type type type you just want to be able
to interact with an assistant there and
then or see in your glasses what the
price is and I give the mobile phone
example for a reason which is the price
that I pay for a mobile phone isn't the
same price as you would pay right
because it's all contract rates and if I
go and speak if I want to get the price
of how much am I paying for that phone
it takes an advisor like 20 minutes cuz
they have to go look up your contract
details Etc they have to look up what
the phone is and then they do a deal mhm
in a world of multimodality where you've
got something like glasses on it can
recognize the object it knows who you
are and then it can go and look up what
uh what the price of the phone is for
you and then be able to answer questions
that are not generic questions but
specific about you your contract to you
right exactly that that is where
multimodality is going to start start to
come in kind of sounds like right yeah
totally I mean Chris if I have you right
I mean this is one of the questions I
want to pitch to both you show and you
Chris on this is you know actually my
mind goes directly back to Google Glass
like the the bar where the guy got beat
up for wearing Google Glass years ago
that was like around the corner from
where I used to live in San
Francisco oh wow and you know there's
just been this dream and obviously all
the open AI demos uh and Google demos
for that matter are all very consumer
right that you're walking around with
your glasses and you're looking around
the world and you know get prices and
that kind of thing this been like a
long-standing Silicon Valley dream and
it's been very hard to achieve and I
guess one thing I want to run by you is
like and the answer might just be both
or we don't know is like if you're more
bullish on the B2B side or on the
B2C side right because I hear what
Shobhit's saying and I'm like oh okay I can
see why Enterprises really get a huge
bonus from this sort of thing um and and
I guess it's really funny to me because
I think there's one point of view which
is everybody's talking about the
consumer use case but the actual
near-term impact may actually be more on
the Enterprise side but I don't know if
you guys buy that or if you really are
like this is the era of Google Glass you
know it's it's back baby so so I can
start first Tim um we have been working
with Apple Vision quite a bit um within
IBM with our clients and a lot of those
are Enterprise use cases in a very
controlled environment so things that
where things break in the consumer world
you don't have a controlled environment
you have Corner cases that happen a lot
right in an Enterprise setting if I'm
help if I'm wearing my my vision Pros
for two hours at a stretch doing I'm a
mechanic I'm fixing things right that's
a place where I need additional input
and I can't go look at other uh things
like pick up my cell phone and work on
it I'm underneath I'm I'm fixing
something in the middle of it right
those use cases because the environment
is very controlled I can do AI with
higher accuracy it's repeatable I know I
can start trusting the answers because I
have enough data coming out from it
right so you're not trying to solve
every problem but I think we'll see a
higher uptake of these devices uh by the way
I love the the Ray-Ban glasses from Meta
as well great great to do something
quick but when you don't want to switch
but I think we are moving to a point
where Enterprises will go deliver these
at scale the tech starts to get better
and adoption is going to come over on
the B2C side but in the consumer goods
we'll have multiple attempts at this
like we had with Google Glass and
stuff it'll take a few attempts to get
better on the Enterprise side we will
learn and make the models a lot better
but I think there's insane amount of
value that we're delivering to our
clients with apple Vision Pro today in
Enterprise settings I think it's going
to follow that pattern totally yeah and
it's actually interesting I hadn't
really thought about this until Shobhit's example is
like um basically like the phone is
almost not as big of competition in the
Enterprise setting right whereas like
the example that Chris gave was like
literally you're trying to be like is
this multim modal device faster than
using my phone in that interaction which
is like a real competition but if it's
something like a mechanic you know they
don't have they don't they can't just
pull out their phone um Chris any final
thoughts on this and then I want to move
us to our next topic yeah and I was just
going to give another kind of use case
scenario I I often think of things like
the oil rig example so
a real sort of Enterprise space where
you're wandering around and you have to
go and do safety checks on various
things and most of their time if you
think of the days before the mobile
phone or before the tablet what they
would have to do is go look at the part
do the inspection the visual inspection
and then walk back to a PC to go fill
that in and then these days you do that
with a tablet on the rig right but but
then actually you need to find a
component you're going to look at you
have to do the defect analysis you want
to be able to take pictures of that you
need the geolocation of where that part
is so that the next person can find it
and then you want to be able to see the
notes that they had before on this and
then you got to fill in the safety form
right so they have to fill in a ton of
forms so there's a whole set of
information if you just think about AI
just having you know even your phone or
glasses pick either to be able to look
at that part be able to have the notes
contextualized in that geospatial space
be able to fill in that form be able to
do an analysis with AI it's it's got a
huge impact on Enterprise cases and
probably multimodality in that sense has
probably got a bigger impact I would say
in the Enterprise cases than the
Consumer spaces even today and I and I
think that's something we really need to
think about the other one is and again I
know you wanted this to be quick there
Tim is the clue in generative AI is the
generative part right so actually I can
create images I can create audio I can
create music things that don't exist
today so and with the text part of
something like an llm then I can create
new creative stuff I can create DevOps
pipelines Docker files whatever so there
comes a part where I want to visualize
the thing that I create I don't want to
be copying and pasting from one system
to another right that's not any
different from the oil rig scenario so
as I start to imagine new new business
processes new pipelines new uh Tech
processes I then want to be able to have
the real-time visualization of that at
the same time or be able to interact
with that and that's why multimodality
is is really important probably more so
in the Enterprise space yeah that's
right I mean I think some of the
experiments you're seeing with like
Dynamic visualization generation are
just like very cool right uh because
then you basically have you can say like
here's how I want to interact with the
data the system kind of just generates
it right on the Fly um which I think is
very very exciting
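As a concrete illustration of the planogram-audit idea Shobhit described earlier — a multimodal model ingests a shelf walkthrough video and you ground its output against your own planogram — here is a minimal sketch of the downstream grounding step. The model call is stubbed out, and every product name, slot layout, and detection result is invented for illustration, not taken from any real system:

```python
# Sketch of grounding a multimodal model's shelf observations against an
# enterprise planogram. The model call is a stub; names/layout are made up.
from dataclasses import dataclass

@dataclass
class ShelfObservation:
    position: int  # slot index on the shelf, left to right
    product: str   # product the model says it saw ("" if the slot looked empty)

def detect_products_from_video(video_path: str) -> list[ShelfObservation]:
    """Placeholder for a long-context multimodal model call that would
    take the aisle walkthrough video and return per-slot detections."""
    return [
        ShelfObservation(0, "cola-12oz"),
        ShelfObservation(1, "chips-bbq"),
        ShelfObservation(2, ""),            # slot the model flagged as empty
        ShelfObservation(3, "chips-salt"),  # wrong product for this slot
    ]

def audit(planogram: dict[int, str], seen: list[ShelfObservation]) -> list[str]:
    """Compare model output against the expected planogram; report issues."""
    issues = []
    for obs in seen:
        expected = planogram.get(obs.position)
        if not obs.product:
            issues.append(f"slot {obs.position}: missing {expected}")
        elif obs.product != expected:
            issues.append(f"slot {obs.position}: found {obs.product}, expected {expected}")
    return issues

planogram = {0: "cola-12oz", 1: "chips-bbq", 2: "chips-salt", 3: "cola-diet"}
problems = audit(planogram, detect_products_from_video("aisle_walkthrough.mp4"))
```

The point of the sketch is the grounding: the model supplies perception, but the audit logic compares it against "my data, my planogram" rather than trusting the model's own world knowledge.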
all right so next up I want to talk
about latency and cost so this is
another big Trend you know I think it
was very interesting that both companies
went out of their way to be like we've
got this offering and it's way cheaper
for everybody um which I think suggests
to me that you know these big huge
competitors in AI all recognize that
like your your per token cost is going
to be this huge bar to getting the
technology more distributed um so
certainly one of the ways they sold 4o
was that it was cheaper and as good as
GPT-4 right everybody was kind of like
okay well why do I pay for pro anymore
if I'm just going to get this for for
free and then Google's bid of course was
Gemini 1.5 flash right which is okay
it's going to be cheaper and faster
again um and I know Chris you threw this
uh sort of topic out so I'll kind of let
you have the first say but I think the
main question I'm left with is like what
are the downstream impacts of this right
for someone who's not really paying
attention to AI very closely like is
this just matter of like it's getting
cheaper or do you think like these are
actually these economics are kind of
changing how the technology is actually
going to be rolled out
I think latency and smaller models and
tokens are probably one of the most
interesting challenges we have today so
if you think about like GPT-4 and
everybody was talking like oh that's a
1.8 trillion model or whatever it is
that's great but the problem with these
large models is every layer that you
have in the neural network is adding
time to get a response back and not not
only time but cost so if you look at the
demo that open AI did for example what
was really cool about that demo was the
fact that when you were speaking to the
assistant it was answering pretty much
instantly right and that is the real
important part and when we look at
previous demos what you would have to do
if you were having a voice interaction
is you'd be stitching together kind of
three different pipelines you need to do
uh Speech to Text then you're going to
run that through the model and then
you're going to do text to speech on the
way back so you're getting latency latency
latency before you you get a response
and that timing that it would take
because it's not in the sort of 300
millisecond mark it was too long for a
human being to be able to interact so
you got this massive pause so actually
latency and the kind of tokens per
second becomes the most important thing
if you want to be able to interact with
models quickly and be able to have those
conversations and that's sort of why
also multimodality is really important
because if I can do this in one model as
well then it means that I'm not sort of
jumping pipelines all the time so the
smaller you can make the model the
faster it's going to be now if you look
at the GPT-4o model I don't know
if you've played with just a text mode
it is lightning fast when it comes back
very fast now yeah it's and noticeably
so like it's just like it feels like
every time I'm in there's like these
improvements right so and and this is
what you're doing you're sort of trading
off reasoning versus uh speed of the
model right and and as we move into kind
of agentic platforms as we move into
multimodality you need that latency to
be super super sharp because you're not
going to be waiting all the time so
there is going to be scenarios where you
want to move back to a bigger model that
is fine um but you're going to be paying
the cost and that cost is going to be
the cost uh the price of the tokens in
the first place but also the speed of
the response and I think this is the
push and pull that model creators are
going to be playing against all of the
time and and and therefore if you can
get a similar result from a smaller
model and you can get a similar result
from a faster model and a cheaper model
then you're going to go for that but in
those cases where it's not then you may
need to go to the larger model to kind
of reason so this this is really
important totally yeah I think there's a
bunch of things to say there I mean I
think one thing that you've pointed out
clearly is that like this makes
conversation possible Right like that
you and I can have a conversation in
part because I have low latency is kind
of the way to think about it and like
now that we're reaching kind of human
like parity on latency you know finally
these models can kind of Converse in a
certain way the other one is actually I
really thought about that there is kind
of this almost like thinking fast and
slow thing where basically like the
models can be faster but they're just
not as good at reasoning um and then
there's kind of this like deep thinking
mode which actually is like slower in
some ways so Tim uh the way we are
helping Enterprise clients again have
that kind of focus in in life there's a
split there's a there's there are two
ways of looking at applying gen AI in the
industry right now one is at the use
case level you're looking at the whole
workflow end to end seven different
steps the other is going and looking at
it at a subtask level right so I'll just
take pick an example I'll walk you
through it so say I have an invoice that
comes in and I'm taking an application
I'm pulling something out of it I'm
making sure that that's as for the
contract I'm going to send you an email
saying your invoice is paid right so some
sort of a flow like that right so say it
is seven steps just very simplified
right I'm going to pull things from the
backend systems using apis step number
three I'm going to go call a fraud
detection model that has been working
great for three years step number four
I'm extracting things from a paper right
an invoice that came in that extraction
I used to be doing with OCR 85% accuracy
humans will do the Overflow of it at
that point we're taking a pause and
saying we have reason to believe that
llms today can look at an image and
extract this with higher accuracy yeah
say we get up to 94% so that's nine
points higher accuracy of pulling things
out so we pause at that point and say
let's create a set of constraints for
step number four to find the right
alternatives and the constraint could be
what's the latency like we just spoke
how quickly I need the result or can
this take 30 seconds and I'll be okay
with it second could be around cost if
I'm doing this a thousand times I have a
cost envelope to work with versus a
human doing it if I'm doing it a million
times I can invest a little bit more if
I can get accuracy out of it right
so the ROI becomes important then you're
looking at security constraints around
does this data have any PHI
data PII data that really can't leave the
cloud I have to bring things closer or
is this something that is military grade
secrets and has to be on-prem right so
you have certain constraints around that so
you come up with a list of five six
constraints and then that lets you
decide what kind of an llm will
actually check off all these different
constraints and then you you start
comparing and bringing it in so the
split that we're seeing in the market is
one way with llm agents and with these
multimodal models they're trying to
accomplish the entire workflow end
to end like you saw with Google's
returning the shoes right it's taking an
image of it is going and looking at your
Gmail to find the receipt starting the
return giving you a QR code with the
whole return process done so it just
figured out how to go create the entire
endtoend workflow but where the
Enterprises are still focused is more on
the subtask level at that point we are
saying this step step number four is
worth switching and I have enough evals
before and after I have enough metrics
to understand and I can control that I
can audit that much better the thing
that from an Enterprise perspective
these end to end multimodal models it'll
be difficult for us to explain to the SEC
for example why we rejected somebody's
benefits on a credit card things of that
nature so I think in the in the
Enterprise World we're going to go down
the path of let me Define the process
I'm going to pick small models to
Chris's point to do that piece better
and then eventually start moving over to
hey now let me make sure that that
framework of evals and all of that stuff
can be applied to end-to-end multimodal models
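The subtask-level selection walked through above — define a constraint envelope for step number four (latency, cost, accuracy, data residency) and keep only the models that check every box — could be sketched roughly like this. All model names, latencies, costs, and accuracy numbers are illustrative placeholders, not benchmarks of any real system:

```python
# Sketch of constraint-based model selection for one workflow subtask.
# Every candidate entry and number below is invented for illustration.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    latency_s: float      # typical response time in seconds
    cost_per_call: float  # dollars per invocation
    accuracy: float       # measured on our own eval set
    on_prem: bool         # can it run inside our environment?

def pick_model(candidates, max_latency_s, max_cost, min_accuracy, require_on_prem):
    """Keep candidates inside the constraint envelope, then take the
    most accurate survivor (None if nothing qualifies)."""
    ok = [c for c in candidates
          if c.latency_s <= max_latency_s
          and c.cost_per_call <= max_cost
          and c.accuracy >= min_accuracy
          and (c.on_prem or not require_on_prem)]
    return max(ok, key=lambda c: c.accuracy, default=None)

candidates = [
    Candidate("big-multimodal", latency_s=8.0, cost_per_call=0.020, accuracy=0.96, on_prem=False),
    Candidate("small-vision",   latency_s=1.2, cost_per_call=0.002, accuracy=0.94, on_prem=True),
    Candidate("legacy-ocr",     latency_s=0.5, cost_per_call=0.001, accuracy=0.85, on_prem=True),
]

# invoice-extraction step: PII must stay on-prem, 30 s is acceptable,
# and we need to beat the legacy OCR's 85% accuracy
choice = pick_model(candidates, max_latency_s=30.0, max_cost=0.01,
                    min_accuracy=0.90, require_on_prem=True)
```

This captures the trade Chris and Shobhit describe: the biggest model is excluded not for quality but because it blows the cost and residency constraints, so a smaller, auditable model wins that one step.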
I guess I do want to maybe bring in
Brian here you like release the Brian on
this conversation um because I'm curious
about like kind of like the marketer
view on all this right because I think
there's one point of view which is yes
yes Chris and Shobhit like this is all nerd stuff
right like yeah I know it's like latency
and cost and speed and whatever the big
thing is that you can actually talk to
these AIS right and I guess I'm kind of
curious from your point of view about
like I mean one really big thing that
came out of like the open AI
announcements was we're going to use
this latency thing largely to kind of
create this feature that just feels a
lot more human and lifelike um than you
know typing and chatting with an AI and I
guess I'm kind of curious about like you
know what you think about that move
right like is that ultimately like going
to help the adoption of AI is it just
kind of like a weird sci-fi thing that
open AI wants to do and also I mean I
think if if you've got any thoughts on
you know how it impacts the Enterprise
as well was just like do companies
suddenly say oh I understand this now
right it's because it's like the AI from
her I can buy this um just kind of
interesting thinking about like the the
sort of surface part of this because it
actually will really have a big impact
on the market as well it's kind of like
the technical advances are driving the
the marketing of this I I mean I do
think when you when you look at like
some of the initial reviews of I want to
say like the Humane Pin and the Rabbit like I
remember one of the one of the scenarios
that was being demoed
was I think I think he was looking at a
car and he was asking a question about
it and the whole interaction took like
20 seconds there and he went through he
was just showing that he could do the
whole thing on his phone in the same
amount of time but the thing that I was
thinking about when I was watching that
was like he just did like 50 steps on
his phone that was awful as opposed to
just pushing a button and asking a
question and it was like it was very
clear that the ux interaction of just
like like asking the question and
looking at the thing was a way better
experience than pushing the 50 buttons
on your phone but the 50 buttons still
won just cuz it was faster to do 50
buttons than to you know deal with the
latency impact of um of where we were
before and so it actually it reminded me
a lot of just the way I remember
hearing Spotify talk early
about the way that they thought about
latency and the things that they did to
just make the first 15 seconds of a song
land um essentially so that it felt like
you know a like a file that you had on
your device because I think from their
perspective they if it felt like every
time you wanted to listen to a song that
was buffering as opposed to sitting on
your device you were never going to
really adopt that thing because it's a
horrible experience relative to just
having the file locally and so they put
in all this work so that it felt the
same and that wound up being a huge part
of how the technology and the product
ended up getting adopted
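The Spotify move Brian describes, making the opening of a song feel like a local file, is essentially head-chunk caching: keep the first few seconds on the device, start playback from that instantly, and stream the remainder in the background. A minimal sketch of the idea, with every name (`head_cache`, `stream_rest`, `play`) invented for illustration:

```python
import threading
import time

# Invented example data: song id -> pre-cached opening chunk (the "first 15 seconds")
head_cache = {"song-42": b"opening-audio-bytes"}

def stream_rest(song_id: str, buf: list) -> None:
    """Stand-in for fetching the remainder of the track over the network."""
    time.sleep(0.1)  # simulated network latency
    buf.append(b"rest-of-track-bytes")

def play(song_id: str) -> list:
    """Start 'playback' instantly from the cached head, stream the rest behind it."""
    buf = [head_cache[song_id]]  # available immediately, zero network wait
    t = threading.Thread(target=stream_rest, args=(song_id, buf))
    t.start()
    t.join()  # in a real player the fetch would overlap playback
    return buf

print(play("song-42"))
```

In a real client the background fetch overlaps playback rather than being joined immediately; the join here just keeps the sketch deterministic.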
and you know I do think there's a lot of
a lot of stuff we're doing that is
almost like I don't want to say back
office but like just Enterprise
processes around how people do things
operational things
but there are plenty of ways where
people are thinking about the way that
we do more with like agents in terms of
how that involves like customer
experience whether it's support
interactions whether it's like bots on
the site you can just clearly imagine
that that's going to play a bigger role
in customer experience going forward
and if you feel like every time you ask
a question that you're waiting 20
seconds to get a response from this
thing like the other
person on the end of that interaction is
just getting madder and madder and
madder the entire time where the more it
feels like you're talking to a person and
that they're responding to you as fast
as you're talking I think the more
likely it is that people are going to
accept that as an interaction model um
and so I do think that that latency and
like making that feel to you like to
your point about human beings
being zero latency um I think that's a
necessary condition for a lot of these
interaction models and so it's going to
be super important going forward and to
me it's also when I think about the
Spotify thing it's like are people
going to do interesting things to solve
for the first 15 seconds of an
interaction as opposed to the
entire interaction like you know can you
get there was a lot of talk about like
OpenAI's model I want to say like
responding with like sure or just like
some space filling entry point um so it
like it could catch up with the rest of
the dialogue so I
think people will prioritize that a lot
because it'll matter a lot I love the
idea that like to save cost
basically OpenAI is like for the
first few turns of the conversation we
deliver the really fast model so it
feels like you're really having like a
nice flowing conversation and then
basically once you build confidence they
like fall back to like the slower model
that has better results where you're
like oh this person is a good
conversationalist but they're also smart
too right is like kind of what they're
trying to do by kind of playing with
model delivery um so we got to talk
about search but Chris I saw you go off
mute so do you want to do a final quick
hit on the question of latency before we
move on no I was just going to come in
on what Brian was saying there and
what you were saying Tim I totally
agree it was always doing this hey and
then repeat the question so I I wonder
if underneath the hood as you say
there's a much smaller classifier model
that is just doing that hey piece and
then as you say there's probably a
slightly larger model actually analyzing
the real thing so I I do wonder if
there's two small models or a small
model and a slightly larger model in
between there for that interaction so
it's super interesting but maybe the
thing I wanted to add to that is we
don't have that voice model in our hands
today we only have the text model so I
wonder once we get out of the demo
environment and then maybe in three weeks'
time or whatever we have that model
whether that's going to be super
annoying every time we ask a question
it's going to go hey and then repeat the
question back so it's cool for a demo
but I wonder if that will actually be
super annoying in two weeks' time
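Chris's hunch here, a tiny model that instantly produces the "hey" acknowledgment while a larger model works on the substantive answer, can be sketched as a two-stage pipeline. This is purely speculative and self-contained: `fast_ack` and `slow_answer` are stub functions simulating model calls, not any real OpenAI or Google API.

```python
import threading
import time
import queue

def fast_ack(question: str) -> str:
    """Stand-in for a tiny, low-latency model: instant space-filling reply."""
    return f"Hey! So, about '{question}' ..."

def slow_answer(question: str, out: "queue.Queue[str]") -> None:
    """Stand-in for the larger, slower model producing the real answer."""
    time.sleep(0.2)  # simulated inference latency
    out.put(f"Here is a considered answer to: {question}")

def respond(question: str) -> list:
    """Emit the filler immediately, then append the real answer when ready."""
    out: "queue.Queue[str]" = queue.Queue()
    worker = threading.Thread(target=slow_answer, args=(question, out))
    worker.start()
    transcript = [fast_ack(question)]  # the user hears this with near-zero latency
    worker.join()
    transcript.append(out.get())
    return transcript

print(respond("what car is this?"))
```

The design choice is the same one the panel attributes to Spotify: optimize the first moment of the interaction so the wait for the rest is masked.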
all right so last topic that we got
a few minutes on uh and this is like
Brian's big moment so Brian get get
yourself ready for this I mean Chris you
can get yourself ready because
apparently Brian's gonna you know you
know everyone else can leave the meeting
yeah blow our eyebrows off here with his
with his uh with his rant so the the
setup for this is that basically Google
announced uh that AI generated overviews
will be rolling out to US users and then
everybody uh in the near future and I
think there's two things to set you
up Brian I think the first one is this
is what we've been talking about right
like is AI going to replace search here it
is you know here it is consuming the
preeminent search engine so I think it's
like we're here right this is happening
and then the other one is like I'm a little
nostalgic you know someone who grew up
with Google um you know I'm like the 10
Blue Links you know like the search
engine you know it's like a big part of
how I experienced and grew up with the
web and um you know this seems to me
like kind of a big shift in how we
interact with the web as a whole and so
I do want you to kind of first talk a
little about what you think it means for
the market um and uh and how you think
it's going to change the economy of the
web yeah so I
follow two communities I would say
pretty closely online I follow the tech
community pretty closely and then
as somebody who works in marketing I
follow the SEO community um and they
have very different reactions to uh to
what's going on I think your first
question though of um you know is this
the equivalent of swallowing the web um
and what's funny is from
the minute sort of ChatGPT arrived on
the scene people were proclaiming the
death of search now for what it's worth
if you've worked in marketing or on the
Internet for a while people have
proclaimed the death of search as like
an annual event for the last like
25 years and so um this is just like
par for the course on some level but
what's interesting to me is that you had
this product chat GPT which is fastest
growing consumer product ever 100
million users faster than anybody else
and what was interesting is it sort of
like speedran the sort of
growth cycle that usually takes years or
decades like well maybe not decades but
like it takes a long time for most
consumer companies to do what they did
the interesting thing about that is if
it was going to totally disrupt search
you would have expected it to show up
and happen sooner than it would have
with other products that maybe would
have had a slower sort of growth
trajectory um but that didn't happen
like as somebody who watches their
search traffic super closely like
there's been no chaotic drop off of this
like people have continued to use search
engines and like one of the reasons I
think that that happened is because
people actually misunderstood um like
like the equivalent of like ChatGPT and
Google as competitors um with one
another I know Google and open AI
probably are on some level but I don't
know that those two products are and the
reason I was thinking about that is like
if ChatGPT didn't you know within
the within basically the time plan we've
had so far uh disrupt Google the
question is like why why didn't that
happen and I think you could have a
couple different hypotheses for that
like one you could say the form factor
wasn't right it wasn't text that was
going to do it it was we needed Scarlett
Johansson
on your phone and that's the thing
that's going to do it and so they're
maybe leaning into that thought process
a little bit you could say it was
hallucinations like oh the content is
just not accurate uh yeah right so
that's a possibility around it you could
say just like learned consumer behavior
people have been using this stuff for 20
years it's going to take a while to get
them to do something different you could
say Google's advantages in distribution
so it's like we're on the phone we got
browsers um it's really hard to you know
get the level of penetration that we
have I think all of those probably play
some role but my biggest belief is that
it's actually impossible to separate
Google from the internet itself um
Google's kind of like the operating
system for the web so to disrupt Google
you actually are not disrupting search
you have to disrupt the internet um and
it turns out that that's an incredibly
High bar uh to have to disrupt because
you're not only dealing with search
you're dealing with the capabilities
whether it's Banks or Airlines or you
know retail whatever it is of every
single website that sits on the opposite
end of the internet it turns out that
that's like an enormous amount of
capability um that's built up there and
so I look at that and say
like for as much as like I think this
this technology has brought to the table
hasn't done that thing um yet and so
because it hasn't done that there hasn't
been some dramatic shift there the thing
that Google search is not good at though
um and I think you see it in a little
bit in terms of how they described what
they think the utility of AI overviews
um will be is that it's not good at complex
multi-part questions of saying like if
you're trying to plan if you're doing
anything from like doing a buying
decision for a large Enterprise product
or like planning your kid's birthday
party like you're going to have to do
like 25 queries along the way there and
you just you've just accepted and
internalized that you have to do 25 queries I
like that is like basically like search
is one shot right like you just say it
and then responses come back so there's
no yeah sorry go ahead yeah yeah and so
like the way I was thinking about llms
is they're kind of like internet SQL
um in a way where you can ask this like
much more complicated question and then
you can actually describe the way that
you want the output of that thing to
look it's like I want to compare these
three products on these three dimensions
go get me all this data and that would
have been 40 queries um at one point but
now you can do it in one and search is
terrible at doing that right now you
have to go cherry-pick each one of those
data points but the interesting thing is
that that's also maybe the most valuable
query to a user um because you save 30 minutes
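That "internet SQL" idea, one request that states both the complex question and the shape you want the answer in, can be sketched as a prompt builder. The products and dimensions below are made up and no model is actually called; the point is only the shape of the query:

```python
import json

def comparison_prompt(products: list, dimensions: list) -> str:
    """Build one 'internet SQL'-style query: the question plus the output schema."""
    # Declare the output shape, the way a SQL SELECT declares its columns
    schema = {p: {d: "<value>" for d in dimensions} for p in products}
    return (
        f"Compare {', '.join(products)} on {', '.join(dimensions)}. "
        f"Respond with JSON exactly matching this shape:\n"
        f"{json.dumps(schema, indent=2)}"
    )

# Hypothetical example: what would otherwise be dozens of separate searches
print(comparison_prompt(
    ["Laptop A", "Laptop B", "Laptop C"],
    ["price", "battery life", "weight"],
))
```

One prompt like this replaces the cherry-picking of each data point across many searches, which is the gap in one-shot search the panel is pointing at.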
and so I think Google looks at
that and says
um if we cede that particular space of
complex queries to some other platform
like that's a long-term risk for us and
then if it's a long-term risk for them
what it ends up being is a long-term
risk for the web um I think so I
actually think it was incredibly
important that Google bring this type of
capability into into the web even if it
ends up being disruptive a little bit
from a Publisher's perspective because
what it does is at least preserves some
of the dynamic we have now of like the
web still being an important thing and I
hope that used to your point I have like
present and past Nostalgia for it I
would say yeah exactly so I think it's I
think it's important that it continues
to evolve if we all want the web to
continue to persist as like a healthy
Dynamic Place yeah for sure no I think
that's a that's a great take on it and
you know Google always used to say look
we measure our success based on how fast
we get you off our website right and I
think kind of Brian what you're pointing
out which I think is is very true is
that like what they never said was
there's this whole set of queries we
never surface that you know you really
have to kind of keep keep searching for
right and like that's that ends up being
kind of like a the the the search volume
of the future that everybody wants to to
capture um well uh so Brian I think we
also had a little intervention from AI
the thumbs up thing we were joking about
that before the show it's just
yeah my ranking for worst AI feature of
all time um but it'll make up the
thumbnail on the video that's
right yeah exactly um well great so
we've got just a few minutes left Shobhit
Chris any final parting shots on
this topic sure so I I'm very bullish I
think AI overviews um have a lot of
future as long as there's a good
mechanism of incorporating feedback and
making it hyper personalized a simple
query like I want to go have dinner
tonight say I tell you I'm looking
for a Thai restaurant yeah if you look if
I go on OpenTable or Yelp or Google
and try to find that there's a
particular way in which I think through
it the filters that I apply are very
different from how Chris would do it right
so the way I make a decision if
somebody's making that decision for me
great the reason why TikTok works so
much better than Netflix on average I
think I I was um listening to a video by
Scott and he mentioned that we spend
about 155 minutes a week browsing
Netflix on average in the US
something of that nature like a pretty
excessive amount of time versus TikTok has
just completely taken that fallacy of
choice out for you when you go on
TikTok the videos that they have picked
there's just so many data points the
17-second average video 16 minutes of viewing time
across your TikTok engagement and you
have so many data points coming out of
it 71 of them every few seconds
right so they have hyper personalized it
based on how you interact with things
right because they're not asking
you to go pick a channel or a choice of that
nature just showing you the next next
next thing in the sequence hence the
stickiness they've understood the brains
of teenagers and that
demographic really really well I think
that's the direction that Google will go
into it'll start hyper personalizing
based on all the content that they're
reading and finding out where the
receipt of my shoes is they know what I
actually ended up ordering at a
restaurant that I went to right so the
full feedback loop coming into the
Google ecosystem I think it's going to
be brilliant if they get to a point
where they just make a prediction on
which restaurant is going to work for me
based on everything they know about me that's
right yeah I mean the future is they
just going to book it for you and a car
is going to show up and you're going to
get in it's going to take you some place
right uh so they'll send a
confirmation from your email exactly
right uh Chris 30 seconds you've got the
last word 30 seconds search is going to
be a commodity and I think as we see the
AI assistant era I dare you yeah but it
will be a commodity because we are going
to interact with search via these
assistants it's going to be Siri on my
phone which will be enhanced by uh AI
technology it's going to be Android and
Gemini's version on there we we are not
going to be interacting with Google
search in the way we do today with
browsers that is going to be
commoditized and we're going to be
dealing with these assistants who are
going to go and fetch those queries for
us so I I think that's going to be
upended and and at the heart of that is
going to be latency and multimodality as
we said so uh I think they've got to pivot
or they're going to be disrupted yeah I
was going to say just like if that
happens what's interesting is that all
of the advantage Google has actually
vanishes like and then it's an even
playing field against every other llm
which is you know that's a very
interesting Market situation in that at
that point yeah I'm gonna pick that up
next week that's a very very good topic
we should get more into it um great
well we're at time uh Shobhit Chris uh
thanks for joining us on the show again
uh Brian we hope to see you again
sometime um and to all you out there in
radio land if you enjoyed what you heard
you can get us on Apple podcasts Spotify
and podcast platforms everywhere and
we'll see you next week for Mixture of
Experts