Computer Vision Returns via Meta SAM2
Key Points
- Tim Hwang’s “Mixture of Experts” podcast opens with a panel of technologists (Vagner Santana, Kate Soule, Ami Ganan) to decode the latest AI headlines, especially Meta’s new Segment Anything Model 2 (SAM 2).
- SAM 2, a next‑generation computer‑vision system, can segment and track objects in images and video, highlighting a resurgence of interest in vision AI alongside the current NLP hype.
- The hosts stress that true open‑source AI now means more than just releasing model weights; Meta’s decision to also publish the training data sparks debate about the future importance of open data in democratizing models.
- The episode notes a striking 30% abandonment rate for proof‑of‑concept AI projects, prompting discussion on whether this reflects optimism or underlying challenges in the industry.
- Throughout, the panel emphasizes responsible AI development, generative‑AI research, and the strategic role of AI analytics in shaping the next wave of technology adoption.
**Source:** [https://www.youtube.com/watch?v=3mcLdfx6HTc](https://www.youtube.com/watch?v=3mcLdfx6HTc)
**Duration:** 00:28:55

## Sections
- [00:00:00](https://www.youtube.com/watch?v=3mcLdfx6HTc&t=0s) **Meta's SAM2 and AI Trends** - In a Mixture of Experts podcast, the hosts discuss Meta's new SAM2 “segment anything” model for image/video segmentation, alongside broader AI topics like project abandonment rates, notification overload, and the promise of AI hardware breakthroughs.

## Full Transcript
computer vision is it cool again now we
can take that and then you amplify it
across uh different uh problems to be
solved here so is friend.com AI
Hardware's breakout moment we already
are so addicted to notifications and
it's one more source of notifications
for us we're estimating a 30%
abandonment of proof of concept AI
projects is that a bad thing yeah I
don't think it's as pessimistic as it
could be all this and more on today's
episode of mixture of
experts I'm Tim Hwang and I'm joined
today as I am every Friday by a
world-class panel of technologists
engineers and more to help make sense of
a tidal wave of AI news today on the
panel we've got Vagner Santana staff
research scientist and master inventor
on the responsible tech team Kate Soule
who's a program director of generative
AI research and Ami Ganan associate
partner AI and Analytics
[Music]
so our first segment we're going to talk
about Sam 2 uh meta this week announced
the release of its next generation of a
model it calls segment anything so the
segment anything model is Sam and this
is the next generation of it um and
specifically what the model does is it
allows you to segment imagery or video
so you can select an object and kind of
track it over time now I really wanted
to cover this because you know there's
just so much hype around NLP um and
everybody's talking about chat Bots all
the time but we
should not forget that there's like
really really exciting things happening
in other domains of AI and particularly
in in computer vision so we're going to
start off with a fun question which is
just simply is computer vision cool
again Kate yes uh Vagner yes and Ami
always has
been yeah don't call it a comeback right
um well I think with the violent
agreement let's get into this segment I
really wanted to kind of talk about this
because of course it's another iteration
of meta really kind of playing in the
open source game but I think what's
really really interesting is that it's
also a really interesting marker in the
ground for what sort of Open Source
exactly means in the AI space so if you
haven't been watching this space
very carefully you know in the first
versions of Open Source in AI people
said well we're going to open up the
model and there's going to be weights
that are available um and uh with uh Sam
they're also uniquely releasing the data
uh behind the model um and so Ami maybe
I'll throw it to you as kind of the new
panelist um is I'm curious about how you
sort of see this like in the future is
open data going to be a big part of what
makes a model sort of truly open source
um and kind of talking a little bit
about how you think through some of that
yeah uh so yeah listen um we we love
open source right um yeah and open
source means different things to
different people uh it can be just you
know releasing open data it could be
having open weights um a whole Spectrum
there right I'm really really excited
that meta went ahead and did this on
an Apache 2 license it's fully open weight
um there is a lot of computer vision
problems that we've been wrangling for
several years right I remember back in
my grad school days you know we would go
and do uh traditional image processing
and you know segmentation through
watershed algorithms and you know drawing
little boxes and things of that nature
um it's very painstaking oh extremely
painstaking and extremely laborious
right and uh um fast forward to today
it's uh it's super super exciting to see
something like this which can operate at
scale on huge videos and think about
this from an Enterprise setting right um
my
clients um I I work with clients that
are you know they have huge
manufacturing operations going on they
have to go you know when you think about
the supply chain there are um you know boxes
that need to be moved uh in the
warehouse there is computer vision
that's going on and tracking those
objects or if you look at you know the
the production settings in a lot of our
clients a huge assembly line of objects
of different types that need to be
tracked through multiple different
stages um or if you look at some of our
um you know local governments for
instance right one of the things that
we've seen is um uh people tend to jump
turnstiles right when you're going
through public transport and that
surprisingly is a huge um cost to cities
and local governments right uh city of
New York for instance it's uh it's a
cost of like $750 million um and it it
becomes a big problem to solve and in
the past a lot of these have needed to
be solved through very specific computer
vision models custom trained for these
um specific tasks right what SAM 2
enables is for you to be able to go and
rapidly build Those computer vision
models at scale because now you can go
and do these automatic segmentations of
large videos which means whichever
domain right you throw in uh videos that
it hasn't seen before domains that it
hasn't been uh trained on before it
still is able to go and do those
segmentations and track those objects
over time right and so now this gives us
a very very um capable mechanism to go
build these domain specific computer
vision models at scale and so you know
short answer really really exciting and
that's why I think that open source uh
capability helps now we can take that
and then you know amplify it across uh
different uh problems to be solved here
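The "segment an object and track it over time" idea described here can be illustrated with a toy sketch. This is not SAM 2's actual API (Meta's released code has its own video-predictor interface); it is a self-contained illustration, under simplifying assumptions, of the core tracking step: matching each frame's candidate masks to the object as last seen, using mask overlap (IoU).

```python
# Toy illustration only, not SAM 2's API: track one object across frames
# by picking, in each frame, the candidate mask with the highest
# intersection-over-union (IoU) against the object as last seen.

def iou(mask_a: set, mask_b: set) -> float:
    """Intersection-over-union of two pixel-coordinate sets."""
    union = mask_a | mask_b
    if not union:
        return 0.0
    return len(mask_a & mask_b) / len(union)

def track(initial_mask: set, frames: list) -> list:
    """frames: per-frame lists of candidate masks. Returns the mask
    chosen for the object in each frame, carried forward greedily."""
    current = initial_mask
    history = []
    for candidates in frames:
        current = max(candidates, key=lambda m: iou(current, m))
        history.append(current)
    return history

# A 2-pixel "object" drifting right across two frames, plus clutter.
obj = {(0, 0), (0, 1)}
frames = [
    [{(0, 1), (0, 2)}, {(5, 5)}],           # frame 1
    [{(0, 2), (0, 3)}, {(5, 5), (5, 6)}],   # frame 2
]
tracked = track(obj, frames)
print(tracked)  # follows the drifting object, ignoring the clutter
```

Real systems like SAM 2 do this with learned memory over video features rather than raw pixel-set overlap, but the tracking intuition is the same.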
yeah for sure and I think that's kind of
one of the most interesting things
because I think yet again this has sort
of been a theme in a number of our
conversations you know meta and its blog
post is like this is so exciting because
you could use it for AR glasses and I
think one of the questions I had was
like is this the technology that finally
gets AR glasses to work and I'm kind of
I don't know Kate if you got opinions on
that or Vagner you got opinions on that
but there's almost kind of one point of
view which is like like again with AI
like the big application is going to be
like turnstile enforcement right it
actually won't be these kind of consumer
elements but I don't know if anyone
wants to speak up for like no actually
this is this is the moment that's really
going to make AR glasses work I mean I'm
sure this helps us get closer not
farther away but you know I'm I'm always
wary of anything that's uh claimed to be
a silver bullet
but I I want to get back uh Tim to what
you mentioned earlier about like open
sourcing the data because I think it's
really interesting to talk about you
know meta strategy and and how in Vision
they've released the data behind Sam 2
but and the license of the model itself
is Apache 2 and you look at the Llama
series and you know 3.1 came out uh just
last week where it's under a specific
llama license uh and there is absolutely
no data uh that's released or even
described really in terms of um what was
used tell us a little bit more yeah for our
listeners I think they'd really benefit
from like hearing so what is the
difference there exactly between kind of
Apache and you know what's happening
with llama and I guess why right this kind
question is like why aren't they
consistent yeah so Apache 2.0 is a very
popular widely used open source license
that's been around for years and is
considered a very permissive license
anyone can build on top of it for
commercial or other uses without having
to worry about further attribution to
where uh things came from where llama
when the models were released meta
created a llama license that is custom
and bespoke to handle llama weights
another big differentiation is Apache 2
is normally used for licensing software
um and the data that they released on
Sam 2 I think was CC BY which is
similar to Apache 2 but commonly used
for data so you know there are different
terms you want to govern different
artifacts Apache for software CC BY also
often for data and now model weights
people have started to come up with
their own licenses CU model weights also
fit somewhere between software and data
it's a little bit unclear what the
jurisdiction is there yeah I think it's
such a great point to end on uh and I
think if I can I like maybe just to take
one more turn at that because I think
it's a really important part of this
question you know it strikes me that one
of the reasons everybody's very excited
about open source is the accessibility
of the technology right this is not
going to be something that you know a
company just just kind of put up walls
around and then charge you for access
to um but it kind of sort of strikes me that
like part of the problem of doing open
sourcing is that it's also a lot more
hard like difficult to control use right
like you suddenly have this technology
that kind of anyone can use and you know
some of the people that use it are not
going to use it in the most responsible
way um and it feels like that's like a
really hard challenge right because like
I think you know kind of democratizing
the technology also creates tensions
with how do we like enforce use cases um
and um yeah I'm curious if the panel has
any kind of thoughts on on that yeah
yeah I I think that open
sourcing has been one interesting
mitigation for these situations because
as the community notices that
there's something going wrong or
there's a a specific um harmful use then
the community takes action and uh we can
look back to open source uh
operating systems right they they are
the most secure ones right that we have
because the community uh um
automatically or or they build on top of
this uh openness right and they try to
to tackle and also mitigate these um
these issues so I think that in this
sense I think open sourcing is a good
strategy to mitigate this this uh issue
if we're not transparent and open about
the technologies that are available or
will be available if as uh people
continue to work in this area there's no
way for us to build regulations and
awareness and proper practices around it
so I'd much rather have this be happening
out in the open than you know behind
some closed doors where we really don't
have a good good line of sight into um
what's going on that's right yeah I
guess it moves us from this model of like just trust
us to a world where we can actually
kind of verify it like the
verification part yeah absolutely I feel
like yeah I mean once you put it in the
open there's you know a lot more heads
thinking through really tricky problems
and there is a lot more diversity of
solutions that come in terms of
mitigating these problems right rather
than trying to you know force and
control it I think when you put it out
in the open
um you you'll have a lot more Creative
Solutions coming to solve these
[Music]
problems okay for our next segment I
want to cover friend.com um so as you
all may know right there's been a
longstanding dream in the valley that one of the
really exciting things you could do with
llms is the notion of really for the
first time creating a fully-fledged kind
of AI companion assistant um and this
dream is kind of manifested in a bunch
of Hardware projects that have taken
place so the Humane pin that came out
earlier this year is a good example of
that um and friend.com is uh the most
recent iteration of that so Avi Schiffmann
an entrepreneur launched this with a
teaser trailer earlier uh this week and
um Avi has taken a lot of criticism
online but actually want to take this
conversation in a slightly different
direction which is that I think you know
what's really interesting and what's
kind of offered by friend.com is sort
of the idea that maybe startups can
actually start competing in the AI
Hardware space and that you could
actually in the future launch an AI
Hardware project and even something so
Advanced as like an AI Hardware
companion um just being a small startup
on your own right that this is not just
going to be a kind of space where you
know the big companies can only play um
but that actually might be a place where
startups can play as well um and you
know I guess I want to kind of put
forward this idea and Kate maybe I can pick
on you is do you kind of buy the idea
that like the costs of AI are coming
down so much that you know we're about
to kind of be awash in these types of
things like the idea of someone
launching an AI companion product is not
going to be like something only you know
the biggest tech companies in the world
can do but that you'll also have like
these upstarts that will be able to kind
of like do their own take on on this
space yeah I I think it's a really
interesting
question because we're getting so many
kind of in a way conflicting signals of
what's going on in this space so you
know uh Gartner just released a report
yesterday or two days ago saying that
they expect 30% of all POCs in gen AI to
never leave the POC phase yeah
definitely we're going to talk about
that later I think this is going to be
the final final segment of the episode
okay great but a lot of what they were
talking about is citing the costs right
so it's uh that we're not seeing the ROI
offset the cost enough
and I I think that certainly makes sense
given what we're seeing but on the other
side we're seeing models get smaller and
smaller and smaller like there is this
clear trend where we're able to pack
more performance into fewer parameters
where we're being able to get to the
point where these models can run on CPUs
and we don't need the advanced hardware
to the same degree that we did a year
ago and you know some of these scaling
laws are really exciting in terms of how
efficient the technology is growing so I
don't think it's unreasonable to think
that that we could get to a place where
startups could actually get into the
hardware space um for gen AI type
deployments yeah and I think it's kind
of fascinating just because you know had
you talked to me like five years ago I
would have been like oh yeah the future
is just like one one big company that
has all the AI right but it kind of
feels like we're going to just be awash
in intelligence like there'll just be
models everywhere you know particularly
with the developments in open source
that we were talking about um I don't
know if Ami or Vagner you've got kind
of thoughts on this about just like how
accessible this and how competitive
really ultimately a space this is going
to be yeah so I definitely agree with
Kate there right so I think small
language models are becoming way more
powerful and way more popular for a
variety of reasons right um in the
consumer space like you mentioned uh you
know it's uh there is a there's a lot of
competition in terms of hey you know
I'll put something on the edge um it
could be a companion type of a device it
could be for you know um something else
that you just want to run on your phone
locally um you know something that you
want to run on a Raspberry Pi device
that you're just you know tinkering with
there could be a lot of different
variations where you're trying to run
these models on the edge um definitely
on the consumer side but we're starting
to see some of that on the Enterprise
side as well right because now
Enterprises are wondering um can I go
and start building really domain
specific uh models and you know these
small language models then come and help
them uh power it through so if I have
data that I don't want to expose at all
to the internet but I still want these
capabilities and I have devices in my
manufacturing plant where I want you
know these to be helping my uh plant
workers and things of that nature then
these become a solution right so small
language models running on edge in local
devices that's definitely becoming
popular
um both in the consumer space as well as
in the Enterprise space yeah and I think
thinking about the economics of this the
other thing I wanted to touch on on
friend.com is you know the product
is being offered for $99 with no
subscription which is also like very
intriguing like to think about the
business model of this there's always
been I think an assumption in the AI
space which is well the consumers are
going to demand they want better and
better and better models over time but I
also kind of think about like I had a
tamagotchi as a kid right and I built
like very deep emotional relations with
my tamagotchi and it's not like they
sent updates over the wire to the
tamagotchi it was just like a thing that
they printed in the factory and it came
to me um and I actually wonder whether
or not like there'll be almost a similar
dynamic in AI like we're also you know
Ami to your point like there's almost an
assumption that like people will want
the higher capacity models over time but
I also kind of think that we may just
have like a retro computing movement in
AI where people are like oh yeah GPT-2
like that's like really where like the
the peak of LLM creation was um do you
buy that it's like my weird take that
I've been kind of playing around with is
like actually it may be possible to do
non-subscription AI businesses because
if you have a model that someone really
likes interacting with they actually may
not want it to change at all um and yeah
curious if folks have any thoughts on
that Vagner I'll maybe toss it over to
you uh well I was um reading a few
pieces about the the friend.com device
and uh uh one thing that at least looks
interesting is that um it says that the
context window well it it's not
processing anything beyond the context
window so if you think about small uh
language models imagine that we could
have one uh being hosted on your mobile
phone then this could be possible but
friend.com nowadays uses Claude 3.5 so uh
it's processing elsewhere right so it's
a device communicating via Bluetooth to
your mobile phone and again to your
point on the tamagotchi I think that
it's a lot different
than the tamagotchi it's like feeding on people's
loneliness that's that's the model
basically right so that that's different
because the whole Dynamics is different
because before we would have like to
take care of the tamagotchi and that was the
relationship right and nowadays with
this specific device it's application
like it's uh again I'm holding myself
back because I have so many things to talk
about this but yeah now that you mention
about the tamagotchi it's like the other way
around right because it's uh we already
are so addicted to notifications and
it's one more source of notifications
for us right and it's based on uh um uh
the usage right and again I I've read
one really interesting um analogy for
this is like um treating uh
loneliness with this device like
offering it as if it was a real friendship
is like uh giving junk food to
someone starving like okay it may help
right now but it's not a solution in the
long run right so that's again to your
point I think that thinking about small
language models without transferring the
data elsewhere I think it's an
interesting way of thinking especially
for startups creating new technologies
but this specific use I have so many
concerns I think uh the GPT-3.5
level capabilities for generic
conversation right that I
think sure you can you know you can
have a quantized version and I think you
can have the small language models
operating to a good degree of just
general conversational capabilities um
and then you could you could stop there
but the moment you're trying to get to
something uh specific right um you're
trying to get to something uh a domain
specific right you you go uh try to have
a deeper conversation then you know I
think you still need to get to some of
the larger models right so
I think I think where it will lead to is
that you know um uh solutions like this
can give you those superficial shallow
conversations but then the moment you
try to go deeper and deeper maybe you
know you you have to get out of those
smaller language models at this point in
time at least I don't know there was
something like very satisfying to me to
hear that it wasn't going to be
subscription it wasn't going to try and
be a large model that had deeper
conversations like to me it meant it was almost more
like
a meditative like tool for the near
term but like my data is not going
anywhere it's not trying to be like a
real human you know like it it I really
appreciated how much it constrained the
scope of the use cases and what this can
do by saying like look it's a device
we're not going to update it and it's
going to be you know running uh locally
yeah for sure that almost actually is
sort of interesting I mean I think all
these points sort of come together is
like oddly the fact that it is not
updated that does not go to the cloud
like almost presupposes a limitation in
how far the relationship can go right
it's like Vagner to your point maybe it's
actually the most ethical way of
Designing this this architecture right
is just like an intentionally limited
system um we would actually be worried
if it was like we're going to push
updates and it's just going to get
better and better and better and you're
going to build this like massive
parasocial relationship with this thing
that's not a real
[Music]
person I'm going to move us on um so our
next story and Kate already anticipated
me a little bit on this is uh Gartner
the industry research group uh came out
with a report this week that estimated
that about 30% of gen AI projects will be
abandoned after their initial proof of
concept by the end of 2025 and they cite
a number of reasons for this you know
poor data quality inadequate risk
controls escalating costs or unclear
business value and this kind of follows
on a a string of reports in a very
similar vein so just a few weeks ago we
talked about the Goldman Sachs report
and the Sequoia report um H but for this
segment I think what's pretty
interesting and I think this is the
first place I want to start is is 30%
all that bad like I was kind of taking a
look at that and I'm like oh if we're
doing 30% then like for a new technology
we're we're killing it I had the same
when I first looked at it I actually was
like wait are they saying 30% will
succeed or 30% will be abandoned cuz I
assumed it would be the inverse honestly
um so you know I I buy it I also don't
think it's as uh pessimistic yeah I
don't think it's as pessimistic as it
could be uh and I think it's valid in
that look the costs right now we're in
this period where the costs are
difficult and we need to have more um
refined approaches for picking POCs
identifying and understanding the
lifetime cost and lifetime value of
POCs is going to be really important
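The lifetime cost versus lifetime value framing can be made concrete with a back-of-the-envelope sketch. This is not from the episode: the numbers are invented for illustration, and `net_lifetime_value` is a hypothetical helper, a simple discounted cash-flow comparison of a POC's benefits against its upfront and running costs.

```python
# Invented illustration: weigh a gen AI POC's lifetime value against its
# lifetime cost with simple discounting, as one way to decide whether to
# scale the project or abandon it.

def net_lifetime_value(annual_benefit, upfront_cost, annual_run_cost,
                       years, discount_rate=0.10):
    """Net present value of running the project for `years` years."""
    npv = -upfront_cost
    for t in range(1, years + 1):
        # Discount each year's net benefit back to today's dollars.
        npv += (annual_benefit - annual_run_cost) / (1 + discount_rate) ** t
    return npv

# A POC whose running costs nearly eat its benefit ends up underwater...
print(net_lifetime_value(500_000, 400_000, 450_000, years=3))
# ...while a focused one with clear value stays positive.
print(net_lifetime_value(500_000, 400_000, 150_000, years=3))
```

The point is only that "30% abandoned" looks different once each POC is scored on this kind of measurement rather than on enthusiasm at kickoff.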
but also you know like we were talking
about earlier this Tech if you look what
it cost to do something a year ago
versus what it cost to do something
today and the rate that that's changing
you know
I think we're we're honestly um in a
fairly optimistic place as we talk about
emerging technologies and where gen AI
is headed yeah this is actually a very
powerful argument is almost if I hear
you right you're sort of saying even if
the benefit of AI stayed fixed the fact
that the costs are dropping so extremely
will almost end up justifying the
technology like it's actually the the
costs changing versus like the benefits
changing over time um never really
thought about it like that that's really
I have a slightly different take on this
you know maybe completely contrarian here so I
think when we say gen AI projects right
there is a little bit of uh confusion
and uh a misinterpretation on what those
mean right um we've realized and when we
especially work with Enterprises we
realize that the the impact is when you
do these gen AI projects you're
trying to solve for specific problems in
specific workflows and subtasks right
so when you look at gen AI projects and
solutions that are going and laser focused
solving for specific subtasks right those
are being incredibly efficient we're
seeing right um so I think when we say
you know hey 30%
abandonment of gen AI projects I think there
is probably a little bit of a mixture on
what those gen AI projects mean right it
could be really broad-based things not
necessarily focusing on specific
workflows or specific tasks so that's kind
of how I view it right um I fully agree
that you know you know there is a focus
on value that you know Enterprises
definitely look at it and say you know
when I'm putting in an investment into
gen AI am I you know deriving the value out
of it so uh 100% on that but when we say
you know it's going into um a certain
abandonment rate I think it
depends on okay what exactly are you
measuring right um are you measuring the
things where it's going and solving
specific um subtask and problems and
automating a workflow or things of that
nature yeah that's right and I think
that was actually I mean outside of the
30% I'm giving them a little bit of a
hard time on their report but I think
one interesting observation was they
they were saying look a lot of the AI
benefits are productivity benefits and
that's really hard to necessarily
capture in terms of like increased
profits and so there is kind of this
interesting breakdown where the
technology can legitimately be producing
a lot of benefit but actually just like
as a dollars and cents or in the very
least on the bottom line standpoint like
is it improving my profits may be a very
hard time to kind of like draw that that
connection I think that's why I think
those measurements become more important
right I think as the technology improves
and as people start driving a lot of
these I mean we're starting to see those right
um one of my clients now there is a
maniacal focus on saying okay I'm going
to go and um see if I'm impacting this
particular subtask and subflow am I able
to go and figure out what metrics I'm
going to be solving for and I'm going to
monitor those metrics so those
measurements are starting to get put in
place so once those measurements start
coming up more and more then you'll have
more visibility into it right so I think
it's mostly a question of are you
getting the right level of measurements
and
[Music]
metrics for our final segment uh I think
one of my favorite things that's going
on in the world of large language model
evaluations right now um is that
everybody has their own like kind of
weird you know folk eval right we've got
MMLU and all the official benchmarks but
really where most of the action is is
that when someone sits down and starts
talking to a chatbot for the first time
they have their own set of evals that
they roll out um one of the ones that's
been talked about a lot online is simply
asking a model is the number 9.11 bigger
or is the number 9.9 bigger and it turns
out models routinely fail on this and uh
so for this final section I kind of want
to just do a fun little thing with
particularly the experts that we have
here today which is to get their uh
off-the-cuff evals I think I do similar
evals I I usually test out on like math
problems right um that's a that's a good
one um you know your your your standard
um um multiplication addition set of
problems um those are usually a good
level of uh indicator right so similar
to the 9.11 versus 9.9 but a different track
that's right but it's like just to go a
little further it's like basic
arithmetic you're asking you're like
what is this five-digit number plus this
five-digit number or yeah maybe a little
more complex right here like five
numbers and then you know sort them in a
sequence or you know go multiply these
and then go figure out what's the
response and then sort them things of
that nature right so becomes like a um a
math problem that I would give a third
grader or fourth grader yeah for sure
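Probes like these are easy to wire into a tiny harness. A minimal sketch, not any panelist's actual tooling: `ask_model` is a hypothetical stub standing in for a real chat-model call, canned here to show the classic 9.11-versus-9.9 failure mode alongside an arithmetic probe like the ones just described.

```python
# Minimal folk-eval harness sketch. `ask_model` is a stand-in you would
# replace with a real LLM API call; it is stubbed so the sketch runs alone.

def ask_model(prompt: str) -> str:
    # Hypothetical stub: canned answers imitating typical model behavior.
    canned = {
        "Which is bigger, 9.11 or 9.9?": "9.11",   # the classic failure mode
        "What is 48350 + 29177?": "77527",
    }
    return canned[prompt]

def run_evals():
    results = {}
    # 9.9 is numerically larger, though version-number intuition says 9.11.
    results["decimal comparison"] = (
        ask_model("Which is bigger, 9.11 or 9.9?") == "9.9"
    )
    # A basic five-digit addition probe, checked against real arithmetic.
    results["arithmetic"] = (
        ask_model("What is 48350 + 29177?") == str(48350 + 29177)
    )
    return results

print(run_evals())  # {'decimal comparison': False, 'arithmetic': True}
```

Each new folk eval from the discussion (sorting numbers, the breakfast-bias probe) is just one more entry in `run_evals`.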
Vagner how about you I there's one that
I like that uh sometimes uh reveals a
little bit of the bias and cultural bias
it's about uh describing a breakfast what
does a breakfast look like so then
usually by the materials that the
LLM spits out you can like have
a grasp of where the data is coming from
to describe a breakfast right looking for like
cultural bias like describe a breakfast
is it bacon and eggs or is it
uh bread with butter or is it
oatmeal like something different right
uh is it espresso coffee or is it an
americano coffee so that tells a lot
about the bias and cultural bias inside the model
that's awesome I'm going to start using
that one um all right well Kate round
this out take us home here uh there's
there's a couple of good ones none of
which I came up with on my own I mean
the advantage of sitting within research
is you get some really creative minds
um but a couple of my favorite ones uh
what type of animal is a chicken uh
you'd be surprised uh what the model
comes back with there's a couple around
safety that that I like to do you know
asking about you know there's two people
from different Origins which one's a
criminal and see what the model replies
with just to try and feel out that some
of the basic levels but uh yeah there
we've got a long a long list of these type of fun
things that we like to try along those
lines those are great yeah I'd love to
talk more about that as just like as I
collect this kind of like little library
of just they're very they're often very
funny too like people are just like it's
a real counterintuitive way at some of
these problems um well look uh Vagner
Kate Ami uh thank you for joining us
today um Ami I hope you had a good time
hopefully you'll join us again at some
point in the future um and to all you
listeners uh thanks for joining us um if
you enjoyed what you heard you can get us
on Apple podcasts Spotify and podcast
platforms everywhere and we'll see you
next week on mixture of experts