Granite 4, Sora 2, OpenAI E‑Commerce
Key Points
- The episode introduces the “Mixture of Experts” panel—featuring Kate Soule, Kush Varshney, and Kaoutar El Maghraoui—to discuss new AI developments like Granite 4, Sora 2, OpenAI’s e‑commerce ChatGPT features, and a security bonus segment.
- Granite 4, launched on Hugging Face, offers a suite of compact, hybrid‑architecture language models that run on a single low‑cost GPU, making them attractive for developers and enterprises seeking affordable LLM deployment.
- Recent AI news highlighted by Aili McConnon includes Meta’s plan to serve ads based on AI‑assistant conversations, Microsoft’s “Vibe working” agent that automates Excel, PowerPoint, and document creation, DoorDash’s Dot delivery robot, and the emergence of Tilly, an AI‑generated actress being courted by Hollywood agencies.
- The show teases upcoming discussions on Sora 2’s video‑production capabilities, OpenAI’s new e‑commerce integrations, and a bonus interview with Matt from Security Intelligence, encouraging listeners to subscribe for deeper AI insights.
Sections
- AI News Roundup on Mixture of Experts - The podcast preview outlines a panel discussion on new AI models like Sora 2, Granite 4, ChatGPT e‑commerce tools, and a headline about Meta leveraging AI chat data for ads.
- Granite Sets New Open‑Source Safety Standard - The speakers highlight Granite's ISO 42001 certification as a pioneering step in open‑source AI governance, debating whether it’s an outlier or indicative of a broader move toward stronger safety, compliance, and transparency, citing the Stanford Transparency Index.
- Scaling Efficient Models & Sonnet 4.5 - The speakers discuss expanding a high‑efficiency architecture for larger deployments, then shift to highlighting Claude Sonnet 4.5’s coding‑focused capabilities and its recent release.
- Shift Toward Specialized Hybrid AI - The speaker discusses moving from broad foundation models to sustainable, task‑specific AI by combining pre‑training with efficient inference‑time adaptation to handle dynamic, low‑data environments.
- Balancing Fun Prototypes with Robust Production - The speakers contrast OpenAI’s consumer‑oriented “vibe” apps like Sora 2 with Anthropic’s coding focus, highlighting the challenge of turning playful prototypes into secure, production‑grade solutions.
- Cost Sustainability of AI Video Models - The speakers discuss the high compute expense of large video‑generation models, question whether such services can remain cheap in the coming years, and consider future hardware innovations and pricing models.
- Scaling AI Video: Cost & Storage Challenges - The speakers discuss the financial and technical hurdles of scaling AI video generation, highlighting inference costs, massive storage needs, and the broader context of OpenAI’s latest release and open‑source competition.
- OpenAI vs Google in Agentic Commerce - The speaker contrasts OpenAI's rapid, Stripe‑centric, user‑experience‑focused approach to agentic e‑commerce with Google's consortium‑driven, interoperable AP2 protocol strategy, while noting Anthropic's positioning as a developer‑friendly tool.
- AI Agents Pose Social Engineering Threat - Experts discuss how AI agents differ from traditional software vulnerabilities, being vulnerable to malicious prompts and social engineering rather than code exploits.
- Promoting the Security Intelligence Podcast - The hosts summarize upcoming cyber‑security content, announce upcoming in‑depth expert interviews, and tell listeners how to access the show on IBM’s YouTube channel and major podcast platforms.
Full Transcript
# Granite 4, Sora 2, OpenAI E‑Commerce

**Source:** [https://www.youtube.com/watch?v=LAXAmXHNGeM](https://www.youtube.com/watch?v=LAXAmXHNGeM)
**Duration:** 00:41:53

## Summary

- The episode introduces the “Mixture of Experts” panel—featuring Kate Soule, Kush Varshney, and Kaoutar El Maghraoui—to discuss new AI developments like Granite 4, Sora 2, OpenAI’s e‑commerce ChatGPT features, and a security bonus segment.
- Granite 4, launched on Hugging Face, offers a suite of compact, hybrid‑architecture language models that run on a single low‑cost GPU, making them attractive for developers and enterprises seeking affordable LLM deployment.
- Recent AI news highlighted by Aili McConnon includes Meta’s plan to serve ads based on AI‑assistant conversations, Microsoft’s “Vibe working” agent that automates Excel, PowerPoint, and document creation, DoorDash’s Dot delivery robot, and the emergence of Tilly, an AI‑generated actress being courted by Hollywood agencies.
- The show teases upcoming discussions on Sora 2’s video‑production capabilities, OpenAI’s new e‑commerce integrations, and a bonus interview with Matt from Security Intelligence, encouraging listeners to subscribe for deeper AI insights.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=0s) **AI News Roundup on Mixture of Experts** - The podcast preview outlines a panel discussion on new AI models like Sora 2, Granite 4, ChatGPT e‑commerce tools, and a headline about Meta leveraging AI chat data for ads.
- [00:03:59](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=239s) **Granite Sets New Open‑Source Safety Standard** - The speakers highlight Granite's ISO 42001 certification as a pioneering step in open‑source AI governance, debating whether it’s an outlier or indicative of a broader move toward stronger safety, compliance, and transparency, citing the Stanford Transparency Index.
- [00:09:59](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=599s) **Scaling Efficient Models & Sonnet 4.5** - The speakers discuss expanding a high‑efficiency architecture for larger deployments, then shift to highlighting Claude Sonnet 4.5’s coding‑focused capabilities and its recent release.
- [00:13:25](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=805s) **Shift Toward Specialized Hybrid AI** - The speaker discusses moving from broad foundation models to sustainable, task‑specific AI by combining pre‑training with efficient inference‑time adaptation to handle dynamic, low‑data environments.
- [00:18:36](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=1116s) **Balancing Fun Prototypes with Robust Production** - The speakers contrast OpenAI’s consumer‑oriented “vibe” apps like Sora 2 with Anthropic’s coding focus, highlighting the challenge of turning playful prototypes into secure, production‑grade solutions.
- [00:24:45](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=1485s) **Cost Sustainability of AI Video Models** - The speakers discuss the high compute expense of large video‑generation models, question whether such services can remain cheap in the coming years, and consider future hardware innovations and pricing models.
- [00:28:14](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=1694s) **Scaling AI Video: Cost & Storage Challenges** - The speakers discuss the financial and technical hurdles of scaling AI video generation, highlighting inference costs, massive storage needs, and the broader context of OpenAI’s latest release and open‑source competition.
- [00:34:31](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=2071s) **OpenAI vs Google in Agentic Commerce** - The speaker contrasts OpenAI's rapid, Stripe‑centric, user‑experience‑focused approach to agentic e‑commerce with Google's consortium‑driven, interoperable AP2 protocol strategy, while noting Anthropic's positioning as a developer‑friendly tool.
- [00:37:36](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=2256s) **AI Agents Pose Social Engineering Threat** - Experts discuss how AI agents differ from traditional software vulnerabilities, being vulnerable to malicious prompts and social engineering rather than code exploits.
- [00:41:03](https://www.youtube.com/watch?v=LAXAmXHNGeM&t=2463s) **Promoting the Security Intelligence Podcast** - The hosts summarize upcoming cyber‑security content, announce upcoming in‑depth expert interviews, and tell listeners how to access the show on IBM’s YouTube channel and major podcast platforms.

## Full Transcript
Yeah, Sora 2 is, I think the Vibe video producing
app. I mean the Claude is the Vibe coding we
have. I mean vibe thinking, Vibe everything going on. I'm
just Vibe living at this point. Exactly. Exactly right. All
that and more on today's Mixture of Experts. I'm Tim
Hwang and welcome to Mixture of Experts. Each week MOE
brings together a panel of cutting edge minds to help
you digest the week's news in artificial intelligence. This week
we've got a classic MOE panel which I'm very excited
about. We've got Kate Soule, Director of Technical Product Management
for Granite, Kush Varshney, IBM Fellow, AI Governance, and Kaoutar
El Maghraoui, Principal Research Scientist and Manager for the hybrid
cloud platform. We have so, so many different kinds of
AI models to talk about. This week we're going to
be talking about Granite 4, Sonnet 4.5, and Sora 2. We're
also going to talk about these new e-commerce features
that OpenAI has announced with ChatGPT. And we're actually doing
a bonus segment with Matt from Security Intelligence, so stay
tuned for that. But first I really wanted to turn
to Aili, who's going to give us the news. Hey everyone,
I'm Aili McConnon, a tech news writer for IBM Think.
I'm here with a few AI headlines you might have
missed this week. Meta will soon show ads on your
Facebook and Instagram accounts drawing on conversations you had with
its AI assistant. The jury, however, is still out on
whether this move is clever or creepy. First we had
Vibe coding, and now Microsoft has introduced Vibe working. This
refers to an AI agent that will write docs, crunch
numbers in Excel. And design PowerPoint slides for you. Next
time you order in, you might want to take a
closer look at your delivery name. DoorDash has released Dot,
an AI robot that can cross bike lanes, parking lots,
even sidewalks to bring you those late night bites you're
craving. There's some exciting new talent in Hollywood. Several agencies
are attempting to sign Tilly, an entirely AI generated actress.
What do you think? Could Tilly be considered for an
Academy Award? Let us know your thoughts in the comments.
Subscribe to the Think newsletter for more AI insights. And
now back to the episode. So I want to jump
right into it and talk a little bit about Granite
4. Kate, you're on the panel. You've been obviously very
close to this. What's coming out with Granite 4? What's
exciting? What should people be paying attention to? We're
really excited to announce Granite 4 was launched on Hugging Face
this past Thursday. The models feature a range of very
efficient smaller language models. So they're really designed for developers
to pick them up, play with them, deploy them, as
well as for enterprise customers that are looking for models
and options for LLMs that don't require eight H100s to
host. So these models all can fit on a single
GPU, including like an L40S or an A100, like much cheaper
GPU options, thanks to their new hybrid architecture, which helps
gain some memory efficiencies.
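(Not part of the conversation, but for readers who want to try this themselves: here is a minimal sketch of what "fits on a single GPU" looks like in practice, loading a compact Granite model from Hugging Face with the transformers library. The checkpoint name and generation settings are illustrative assumptions, not details from the episode.)

```python
# Minimal sketch: run a compact Granite model on a single GPU via Hugging Face transformers.
# The model id below is an illustrative placeholder; substitute the Granite 4 variant you want.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-micro"  # hypothetical example id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the memory footprint small
    device_map="auto",           # place the weights on the available GPU
)

prompt = "Summarize the benefits of small, efficient language models."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```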
Yeah, that's great. And is this the right way of thinking about it? Like, what
is the delta you think between say, Granite 3 and
Granite 4? Right. Like, has the team been kind of
like focusing on particular things? It sounds like a little
bit of what you're saying is the trend of small
being beautiful is kind of continuing with Granite 4. Curious
if there's other kind of folks things that the team
has been really focused on. Yeah. So with Granite 4,
we're definitely doubling down on small efficient models. But even
the smallest Granite 4 model, which takes like maybe 4
gigabytes, even running 128k context length, outperforms the biggest Granite
3 model. So we're reducing the memory footprint while improving
performance. We also were able to secure ISO 42001
certification for this family of models right before the release.
So Granite is now one of the first, if not
the first, open source models out on hugging face that
has ISO 42001 certification, showing just the degree of governance,
safety and security that we put into our AI model
development system. So I think this is great and I
think particularly with Kaoutar and Kush, there's a couple other
angles I sort of wanted to bring into the Granite
4 story. Kush, maybe we can pick up on this
last point, which I think is a really interesting one,
is, you know, I think one of the constant fears,
and I'll just be candid, about open source has always
been, well, does open source mean that people are just
going to be releasing models with not a whole lot
of governance, not a whole lot of safety? I know
one of the projects, and I think, Kate, you've talked
about it before, that the Granite team has been focused
on is kind of safety and compliance. And I guess
I'm curious if you want to give kind of a
sense of like how this is evolving in open source
generally. Is Granite kind of an outlier here or is
it kind of like sort of leading the way of
kind of like a whole trend of doing more of
this kind of work when we do new open source
releases. Yeah. So I think it's part of a trend,
but I think we're ahead. Right. So there's this. Not
that you're biased or anything, just a little bit. So,
yeah, the Stanford Transparency Index has been out for a
couple of years and it's been tracking, I mean, how
transparent are different models, and actually the different processes for the model
building itself. Right. So with that, I mean, we've been
interacting with the Stanford team. That's been really great. They
should be coming out with their next leaderboard pretty soon
and hopefully we'll be doing well on that. So that,
I mean, shows the trend. This ISO 42001 is a
great sort of testament to the overall process. I mean,
the broader team has been undertaking. So I think all
of that is part of it. We also have been
cryptographically signing the models for the first time. So that's
a new feature, another type of transparency and other type
of verification. What does that get you, actually? I actually
don't know. This is the first time I'm hearing about something
like that. Yeah, yeah. So the idea is like when
you're training a model, there's checkpoints right along the way
and who knows what happened? No one can really know
unless there's some sort of ability to verify this thing.
And so you can actually have this cryptographic signing mechanism
in there and then you release those with the keys,
I guess, the cryptographic signature out and then someone can
go back and actually verify that. Yes, this is what
happened. This is the way that the training actually was
done. So yeah, all of that is amazing stuff and
yeah, really excited to see where this goes next.
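(To make the verification idea concrete, here is a minimal sketch of what signing and verifying a checkpoint digest could look like, using SHA-256 plus an Ed25519 signature from the Python cryptography library. This illustrates the general mechanism only; it is not the Granite team's actual signing pipeline.)

```python
# Illustrative sketch of checkpoint signing and verification (not the actual Granite pipeline).
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

# Stand-in for the raw bytes of a training checkpoint; in practice you would hash the file on disk.
checkpoint_bytes = b"...weights and optimizer state for step 1000..."
digest = hashlib.sha256(checkpoint_bytes).digest()

# Producer side: sign the digest and publish the signature alongside the public key.
private_key = ed25519.Ed25519PrivateKey.generate()
public_key = private_key.public_key()
signature = private_key.sign(digest)

# Consumer side: recompute the digest from the downloaded checkpoint and check it.
# verify() raises cryptography.exceptions.InvalidSignature if the bytes were altered.
public_key.verify(signature, hashlib.sha256(checkpoint_bytes).digest())
print("checkpoint signature verified")
```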
Kaoutar, a question I had for you: Kate made a
crack about it a little bit earlier, but as someone
who's always kind of dreamed of owning a multi GPU
rig at home, it kind of seems like these small
models are getting really, really, really, really powerful. And I
guess I should just ask the question of, just like,
it kind of feels like you actually don't need a
whole lot of hardware to do some pretty incredible things
right now. And is that trend going to continue? It
kind of feels like, Tim, the idea that you would
want to buy a multi GPU rig at home is
kind of almost like an absurd thing to do now,
just given how great small models are. Yeah, I think
we have a great proof point here with the Granite
models, tiny models. So while everyone else is racing to
make these bigger models, the frontier models, I think IBM
is fundamentally changing the question from how do we make
these models bigger to how do we really make them
smarter per compute. And I think this is really very
important because the numbers that we see with granite, like
Kate mentioned really tell a great story. Almost 72% less
memory, half the model size or even less, better performance,
you know, even 4x or more longer contexts,
and they can run on consumer GPUs. I think this is
really great. This isn't about being competitive on benchmarks. This
is really about redefining what good means from most capable
to most efficient and also being capable. So like you're
saying, we probably don't need these huge multi GPU machines
especially for special use cases. Enterprise focused. I think efficiency
really matters and I think this is really strategic brilliance
from IBM looking at this trajectory of AI development, saying
that the trend that we're seeing is unsustainable. But when
we look at the business, business what it needs. Compute
costs are rising exponentially. Regulatory pressure is mounting. Enterprise customers
care about TCO more than the leaderboards. And also the
environmental costs are becoming impossible to externalize. So I think
that's really, I think it's a win story here that
we have with Granite 4 and also the architectural innovations
with this hybrid architecture. I think it's really a huge
leap forward because this entire architectural shift to Bamba is
a bet that state space models I think are the
future. Because transformers dominated here for a while. They scaled
well, but also they scale very expensively. So if we
really can deliver these comparable results with a much more
efficient architecture, I think it's a great story. Well, Kate,
to round out the segment and I think this is
probably the worst possible question to ask someone right after
a big launch is Granite 5. What can we expect
for Granite 5? It's too soon to start talking about
Granite 5, Tim. We're still really excited for Granite 4,
but there's a lot going to come next with Granite
4. So we've got models that will feature like thinking-
style reasoning capabilities coming down the road. We've got smaller
models even than our tiny 3B model coming down the
road. So think like in the 100 millions of parameters.
So those are going to be really cool. Yep. And again,
we are going bigger. So we will be able
to take this efficient architecture and scale it and see
similar gains in efficiency, but deploy it at a bigger
size. So there's a lot of exciting work ahead and
we're really excited to do all of this work in
the open and share it with the broader community. Nice.
That's great. Well, we'll definitely keep an eye on that.
I'm going to move us on to our next topic.
And in some ways, this episode has become very model
heavy, actually, in some ways, because we're going to talk
a little bit about Claude Sonnet 4.5 and we're going
to talk about Sora 2. Right, the new OpenAI video
model. But let's talk a little bit about Sonnet 4.5
first dropped just earlier this week. And Kush, maybe I'll
start with you. I think one of the most interesting
things about this drop, I keep calling it a drop,
this release is that early on, I think these models
and foundation model companies often advertise themselves as jack of
all trades. We're going to launch a model and we're
going to show you benchmarks across everything you might want
to use it for. But this release is very coding
focused. Right? The whole blog post is sonnet 4.5. You
use it for coding. It's great at coding. Did we
remind you how good it is at coding? And
this is a little bit weird. I mean, a few
weeks back we talked about this NBER paper that came
out where people looked at, say, what people actually use
ChatGPT for, and it turns out, like, coding is actually
like a really, really small segment of like the overall
use case for, you know, ChatGPT, which is like maybe
the most widely deployed service in the space. I guess
maybe a question to you, Kush, is like, why are
we really narrowing how we sell, talk about, and focus these
models? Because it seems very clear that Anthropic's making a
bet that what you really should think of these models
for is for coding. Yeah, it actually relates to what
Kaoutar was saying about Granite. Right. What is Granite useful
for? Those enterprise use cases where we can be very
specific. And if you don't have a user in mind
and you're creating a model, you're creating a system, then
you're kind of in a weird position that, like, what
is it really good for? So I think it's actually,
I mean, makes sense for Anthropic to pick a lane
and be like, this is what we're going for. And
coding happens to be the one that they've gone for.
We'll talk about the other models later on. And I
think they're starting to pick their lanes as well. So
from that point of view, I think it's good. And
yeah, I mean, the size being huge is maybe what
is needed for that kind of use case for the
coding assistant. And we've been experimenting and we're finding Anthropic's
models to be the ones that you do want to
go for, for coding. So they have the success there.
So they're building on it and keep going down that
route. Yeah, yeah, for sure. And I guess Kaoutar, on
this front, you know, we're guilty of this ourselves at
MoE, as we always say, like the foundation model companies
or like the AI companies. I guess this has me,
you know, Kush's response has me thinking a little bit
about like, does that even make sense anymore? Like I,
I don't know. Is OpenAI really competitive with anthropic over
time? If it turns out that they're going to be
kind of pointing their models at pretty radically different things?
You know, we used to talk about AGI, general intelligence.
It seems like we're headed to ASI, which is like
specific intelligence, right? Or like super specific intelligence, A-S-
S-I, you know. Yeah, I think it's really interesting
to see how all of these shifts are happening. Of
course, I think when we started the foundation models was
like one model that rules them all and then you
can specialize. But now you're seeing these shifts to like
now it's really important given the cost and all of
these things, the sustainability issues, we need to go the
route of the, the specific models. But I think it's
going to be a hybrid approach because the training strategy
that we're doing, this pre training, it still uses the
foundational models kind of principles, which is still going to
be important. It's just how do you guide these models?
What are efficient techniques that you can use to train,
pre train, but also to, you know, in the inference
scaling paradigm, which becomes also super important to guide these
models real time to do the right things at the
least cost possible. So I think those strategies are
becoming super important. Focusing more on what can I get
these models to do during inference as opposed to pre
training. Because I think the flexibility that we need to
augment these models with during inference is becoming also super
important, especially in dynamic environments where you don't have enough
data or you're seeing these new situations and you need
the model to be flexible, to adapt. But if we
talk about Claude Sonnet 4.5, I really like this
developer focused strategy which is a developer first strategy which
gives Claude the focus to continue to innovate for the
software engineering tasks and so on. I think, which is
great. I think that this latest release, I feel it's
a big leap, especially with the 30 hours reasoning capability
they have. I think that's really huge. Kate, tucked away
at the bottom of the blog post for 4.5 was
one of the weirdest, coolest research demos it's been my
pleasure to see recently. It was a demo called Imagine
with Claude and it looks like a Claude interface, but
you would say, oh, I really want an app that
does this and it would in effect generate that on
the fly for you. And I had a lot of
fun creating an imaginary terminal that called an imaginary hugging
face for an imaginary model and having a conversation with
it. But it was just very fun to play with
and was like a very different interaction from chatbots. Right.
And I guess I'm kind of curious to get your
thoughts on what you thought about the demo and whether
or not what they're kind of showing off there might
really look like the future. I think kind of what
they're proposing is in the future you don't have software,
you just say what you want and then the software
kind of appears. Is that where we're headed? So I
mean I think Claude was very specific calling this Imagine
with Claude demo an experiment. It's certainly something that they're
throwing out there. You know, this idea that we're not
going to have this pre canned software. We're going to
start to just create software as we go about our
day to day and live our lives and oh, I
need to do xyz, let's create some custom software for
that. I think there's a lot of practicalities that need
to be solved before that becomes a reality. Let's put
it that way. Um, you know, platforms exist for a
reason. You know, investing in things at scale that many
people are going to be using works for a reason,
versus custom. Coming up with one bespoke piece of software
for every single task is going to be inefficient for
every single thing. But I could see us getting to
a hybrid world where we start to empower folks. I
mean, is it that different from how people use things
like Airtable and kind of other lightweight no coding based
solutions to spin up a new database or kind of
website to help them do X, Y and Z? I
think it's just maybe a new riff on that, but
more powered with LLMs and maybe a little bit more
sophisticated. So ultimately you're a little skeptical. Yeah, I mean,
I think it's going to be like a personal kind
of day to day saver, not necessarily the new way
that enterprises run their business. Yeah, like I guess I
agree with you that I have a hard time imagining
like a bank is going to be like, well, let's
just like come up with some kind of payments processing
infrastructure on, on the fly. Well, this is a great
time, I think, to bring in a kind of third
model into the discussion. One of the big news stories
of the week, of course, has been OpenAI releasing its
latest Sora 2 model. And we've been thinking about it
very much in terms of models and talking about models
on this episode. But I think the right way to
think about Sora 2 is actually it's an app launch
more than anything else. Right. It's not just an incredible
kind of video generation model, but it also is like
this mobile first social experience that they're trying to create.
And I guess Kush maybe to toss it back to
you because you got the original version of this question.
It's like, it sure seems like OpenAI, you know, whereas
Anthropic really wants to focus on coding, OpenAI is thinking
about it very much as like a, almost like a
consumer, if not an entertainment use for this technology. And
do you think this kind of shows like that they
really are kind of seriously differentiating into that space? Yeah,
I think that's it. So yeah, Sora 2 is, I
think the Vibe video producing app. I mean the, the
Claude is the Vibe coding we have. I mean Vibe
thinking, Vibe, everything going on. I'm just Vibe living at
this point. Exactly, exactly. Right. And it's the same issue
though, right? I mean it's an app, it's fun, it
can be used for a bunch of things. But once
you get to like the serious end of things, that's
where I mean, you need to put in those extra
sort of processes. A lot of the, the great security,
the great robustness, I mean all of that sort of
thing applies no matter what you start vibing on. Right.
So I think in general in anything that you're producing,
I mean there's some initial provocation, some initial thinking, you
maybe get to a prototype and then you take that
prototype and make it into the eventual product. And yeah,
I think we're kind of closing the gap between those
two ends. And that's great because the speed of innovation,
the speed of production can happen, but then there still
needs to be that other side. I mean just the
provocative prototype isn't the product. So I think that's where
we still need to finish the job in the right
way. Kate, have you played with it yet? I'm kind
of curious if you've. I haven't played with it yet,
but maybe just building on that last question, I think
what's really interesting is both Claude and OpenAI, Anthropic rather,
and OpenAI with these model releases have really focused on
the application. I mean, a huge part of the Anthropic
release blog was focused on Claude code. Right. Versus just
the endpoints themselves. And so we see these frontier model
providers really focusing at a different layer of the stack
and I think that's going to continue and I really
think that the open source community needs to figure out
a way to compete at that same level with more
than just download the weights and take them off
and run with them. So I'd love to see more
work going on there. The things that I did see
with the Sora 2 release, I mean, can we talk
about the branding jiu-jitsu move that they did, calling
deep fakes cameos, basically. So you can now take a
video of yourself and impose it into a video, or your
friends, and you can share these deepfakes, let's call
them what they are, of yourselves, and cameo yourself
into videos. So I see a lot of concerning
issues with that as well as more broadly some of
the things they're working on. But it certainly is an
interesting update to the ecosystem. Yeah. So a lot to
run through there. I think. Maybe let's pick up on
that last point is obviously one of the kind of
responses to the Sora 2 release was in some ways
kind of like a collective horror around this technology. And
I think I heard two things from friends on the
social media chatter. Right. One of them was what you
were saying, which is they're doing deepfakes. They're just calling
it cameos. And then the second one was, have we
created the infinite slot machine? Are we just creating this
shallow video that's going to just keep us strapped to
our phone forever? On that last one, do you think
those risks are real? Do we feel like this is
going to be. This is really changing the nature of
content in a way that might be unhealthy. I think
there's significant risks and they are real. And I mean
OpenAI did try and address some of it in their
release. They talked about doom scrolling and things they're doing
to help prevent it. I think they have some sort
of recommendation filtering and algorithm that you can customize with
natural language. I don't see how that helps. If I
can tell you more specifically what I want to see,
I would think that improves. Yeah, yeah. But who knows,
Like I said, I haven't had a chance to try
it. And do we have the right safeguards in place to
drive important things like creativity and expression without kind of
enabling mass disinformation, mass slop, kind of reducing the human
experience of creativity? And the other thing I'd love to
get your comment on, just again, pulling from the granite
discussion, is I think you were making an argument that,
look, open source has been really focused on the model
and what you see all these big companies moving towards
is competing on interface. And I guess the kind of
question for you is do you think Open Source kind
of has the chops to go and compete at that
layer? I still remember, what is it, LibreOffice was the
open source office suite and it was great because, I don't
know, it was open and free software, but interface-
wise it had its challenges. And one argument is
maybe this has always been kind of a challenge for
Open Source. Do you think this time is maybe different
or is there similar kind of difficulties? Look, the world
runs on Linux and open source software and code all
the time while putting your own user experience in front
of it. So I think we need to get to
a similar paradigm with open source AI where we're enabling
the community to engage more at the kind of application
framework level versus just the pure model weights. And then
that empowers individual companies, individuals to go out and create
their own versions that are going to power their businesses
and their lives. So I think it's going to be
a mix. I think we see this pattern in open
source software development all the time and there's no reason,
I mean, this is one part of the stack that
doesn't require hundreds of GPUs burning in order to contribute.
So we actually can engage in a more meaningful way
with I think a lot of the open source developer
community than you can in the training portion of the
kind of stack itself. So I see there's a ton
of opportunity there if we can kind of coalesce around
some of these broader patterns and applications that we think
we're going to need to drive success. Yeah, for sure.
Kaoutar, maybe a final question on this. I was once
standing in a data center when someone started like a
fine tuning run and you hear all the GPUs turn
on and the room gets really hot really quickly. And
I think a little bit about the amount of compute
that goes into powering a mass rollout of a video
generation model and OpenAI can't be making money on this.
This is an enormous increase in their burn rate to
offer it at these prices, I guess, I suppose is
the way to think about it. But how sustainable is
this? Right. Do we think that in four or five
years you will have access to these kinds of models
at these prices or is it just kind of like
this is a demo for the moment to show off
the technology and the business model is ultimately going to
be. You're going to pay a lot more for getting
access to this kind of thing. Yeah, I don't think
they probably have a good answer to all of these
questions. These companies are thinking about these things. Of
course there's a lot of work on next-
generation AI hardware with these technologies like neuromorphic and in-
memory computing and 3D integration and packaging. IBM also is
doing tons of innovation in that space. So that has
to continue to drive more efficiency all the way down
to the silicon level and even invent or innovate in
other technologies like phase change memory or others where you're
combining both compute and memory together. Those are crucial to
continue to advance. But if we just want to look
at these maybe two profiles, the Sora and the Sonnet,
here are some recent statistics. For
Sora 2, the inference cost is extremely high per
generation and the use frequency is episodic here. So users
create a few videos, and the compute pattern you are seeing is
massive bursts that could be infrequent depending on the usage. But
if there is mass adoption of these things, you can
imagine the pressure this puts on computes and the energy.
If I look at Claude. So in their pricing it
remains the same as Claude 4 which is about like
$3 to $15 per million tokens. And they can run
autonomously for 30 hours on complex multi step tasks. So
in terms of the inference cost, it's you know, compared
to the video one, it's moderate per token but it's
continuous. It runs for hours and days. So it's
like if you look at Sora, it's like a sports
car, incredible power in short bursts, expensive per mile, while
Claude 4.5 is maybe like a semi truck, moderate power
but runs 24 hours, like hauling cargo here.
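(As a rough, back-of-the-envelope illustration of that contrast, the sketch below estimates what a long-running coding agent might cost at the quoted $3 to $15 per million tokens. The token throughput and input/output split are assumptions made up for illustration, not figures from the episode.)

```python
# Back-of-the-envelope cost sketch for a long-running agent at Sonnet-style pricing.
# The throughput and input/output split below are illustrative assumptions only.

input_price_per_m = 3.0    # USD per million input tokens (figure quoted in the discussion)
output_price_per_m = 15.0  # USD per million output tokens (figure quoted in the discussion)

hours = 30                 # the "30-hour autonomous run" scenario
tokens_per_minute = 2_000  # assumed combined token throughput while the agent works
input_share = 0.8          # assume most tokens are context re-reads rather than output

total_tokens = hours * 60 * tokens_per_minute
input_tokens = total_tokens * input_share
output_tokens = total_tokens - input_tokens

cost = (input_tokens / 1e6) * input_price_per_m + (output_tokens / 1e6) * output_price_per_m
print(f"~{total_tokens / 1e6:.1f}M tokens over {hours}h -> roughly ${cost:.2f}")
```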
So of course there are implications here and I think
we need to look at this holistically. What is the
cost? How sustainable is this? I think right now we're
in the phase of proving the technology, proving the capabilities,
but to do this in massive production, get that scale,
there is a lot of more work to be done
to optimize that stack. Yeah, it makes me think a
little bit too. We've talked about inference, but what's also
quite unique here is that like, text is like really
cheap to store. Right. But like, if this goes well,
you're talking about like YouTube levels of just storage that
you need to do, which is like quite a different
thing altogether. Also, like in, in some ways, like there's
the inference cost, but increasingly also just like the storage
cost becomes really expensive. If the idea is we're going
to just store whatever video you created indefinitely. So store
and then maybe train on it. Who knows? Yeah, exactly.
Right. I mean, ultimately, yeah, there's the. There's the train
element as well. So we'll have to see how this
all unfolds. I'll move us on to our kind of
like, final topic of the day, which is another release
from OpenAI, but I think it's worth raising in kind
of the context of the discussion we've been having so
far today. Right. I think we've talked a lot about
what's happening in open source, how the big kind of
foundation model companies are maybe differentiating with time. And this
one seems to be maybe like another kind of indicator
of maybe how differentiated some of these companies are becoming.
So basically, OpenAI, they released a thing called Buy with
ChatGPT, which is the idea that ultimately GPT will be
able to be kind of an e-commerce agent for
you. We talked recently about AP2, which is the Google payment protocol. It sounds like
OpenAI, not to be left behind, is also announcing its
own sort of agentic payment protocol. But let me kind
of just go to a very basic question, is like,
do you think there's enough trust in products like ChatGPT
to have them do purchases on your behalf? I guess
part of me is always wondering at what point do
I feel comfortable giving agents access to my wallet? And
that feels like a big question for this market. I
mean, I'm not going to give ChatGPT access to my
Fidelity account, but maybe my credit card where I can
refute a charge that was made that I don't agree
with. Yeah, it's like, it's kind of like, it's working up
to some parts of your payments. Yeah, exactly. So I
think there's plenty of trust for OpenAI to try and
optimize the user experience on purchasing. They've got all the
incentives in place. I do not trust OpenAI with, you
know, my personal conversations and how they handle all sorts
of mental health issues and Sora 2 deepfakes and
everything else. But I think their incentives are pretty well
aligned to kind of user needs when it comes to
pushing dollars through their platform. So I, I think it
makes sense. They're clearly targeting kind of wide consumer reach
and audience. And yeah, I think there probably will be
plenty of trust there for giving ChatGPT your credit card
and see what happens now. Will the banks trust? It
would be interesting to see how the credit card companies
work out refuted transactions that ChatGPT made. And does that
count as a fraudulent transaction? X, Y and Z. That
could be really interesting to see that play out a
little bit. Yeah, I think the contracting element of this
is really interesting. I also love, kind of, I'm getting
a vibe of like Kate's like kind of frenemies approach
to these companies where it's like, well, here the incentives
are aligned. I have no problem with it. Here the
incentives seem really misaligned. I feel bad about it. Yeah,
there you go. Well, actually on incentives, I mean, Kush,
one of the maybe conspiracy theories that came up around
this launch was, well, the minute you start talking purchases,
you start talking ads ultimately, right? Which is like, well,
you have an agent that's going to buy stuff on
your behalf. Wouldn't someone pay to be the product that
ChatGPT recommends? Is that where we're headed with some of
these products like that? Like ads is going to be,
you know, everything old is new again and maybe ChatGPT
really is kind of just the new Google. Do you
think that's where we're going to end up? Not sure
about that actually. Because like if we look historically, things
like M-Pesa in East Africa came about so you could
do purchases through your phone and have your mobile wallets
and these sort of things. There's this whole thing of
UPI in India and like kind of the middleware in
some sense for the purchasing of things. And once you
get ads in there, I mean it just isn't the
thing. I think really it's getting money to flow through
your system and you, I mean, take a little bit
of a cut somehow somewhere. And I think that's the
bigger story here. And because ads can come in, they
can, sure. I mean do something or the other. But
what really makes the world go around is where does
the money flow? If you can capture that, then you're
really golden. And I think government regulation needs to step
in very quickly on this because this is a very
critical infrastructure sort of piece for not just individual countries
but for the global finance system. And I think if
we just let this go without having a public sort
of facing point of view on this, then it'll be
a little bit of a challenge getting out of it
because once there's kind of corporate capture of these sort
of things, the infrastructure is in place and then you
can't really undo it. So that's kind of my biggest
concern on this. Yeah, and I think regulatory aspect of
this is going to get very interesting. It's just like
how do you step in? What are the rules that
you need to put in place? So maybe a final
question, we'll kind of close out. Kaoutar, I'll give you
the last word is imagine I'm the CEO of Amazon,
I'm on my yacht sipping my martinis or whatever it is
the CEO of Amazon does. Am I worried by where
this is all going as kind of like the Internet's
prominent e commerce provider that has kind of sold everything
now, right? Like they do groceries, they do books,
they do everything. Do developments like this, like, are they
a threat to companies like Amazon? I would be worried.
You know, I think OpenAI here is trying to differentiate
itself and trying also to have a big play in
the agentic e-commerce. So they're, I think, trying to
ship fast, build all the experience and also the protocol around
what actually works. They're using this ACP protocol, the
Agentic Commerce Protocol, and of course right now
it's Stripe centric but theoretically it's open to others. They
open source the code that lets other merchants use also
their interface and so on so they can open up
also for other merchants. So I think they're trying to
bet on the first mover advantage in agentic commerce. And
for them I think it matters more, I think than
maybe perfect interoperability, which Google's play is more focused on.
So they're willing to kind of accept somewhat centralized solution
like Stripe as the hub in exchange of velocity, really
getting this to work fast. If you look at
Google's play, they're trying to build a consortium first here,
get the buy-in from 60-plus partners to ensure
true interoperability using the AP2 protocol, which is more ambitious
than just shipping first. So I think OpenAI's theory here
is trying to own the user experience, control the transaction,
kind of become the front door to commerce, which is,
of course, a threat to Amazon, while Google's theory is
setting the standards, be the protocol player here, but don't
own any one experience. But if you look at Anthropic's
theory, be kind of the best tool that developers and
businesses can use to build their own experiences. So I
think everybody is trying to bet on one strategy, one
angle, and we'll have to see how all of these
things play together. But it's interesting to watch. Kate, Kush,
Kaoutar, this is one of my favorite panels. Hopefully we'll
have everybody back on real soon, but that is all
the time that we have for today. And next up,
we're going to have a quick segment with Matt Kaczynski
to do a cybersecurity segment. Well, Matt, we're really glad
to have you on the show for our listeners. Matt
Kaczynski is the host of Security Intelligence, a new podcast
that has just launched focusing on cybersecurity questions. And we
want to have you on the show because it is
national, I believe, Cybersecurity Month. That's correct. And I guess,
Matt, just to maybe, like, kick off the discussion, maybe
just to riff on that last segment a little bit,
strikes me that once we start using AIs for payments,
it's going to get hacked super quickly. And so I'm
kind of curious if you want to give, like, a
little bit of a security gloss on that last discussion.
Like, how is that space evolving? Are we all screwed?
You know, I just, like, want to learn a little
bit about that because certainly people are going to be
using these products. That means they're going to get burned
using these products, too, right? Absolutely. Yeah. So first off,
thanks for having me here, Tim. First time, long time.
But yeah, you know, it's interesting, right? Because the last
episode we did of Security Intelligence this week, I asked the panel what they wouldn't trust an AI agent to
do for them. And what was really interesting to me
was the first answer I got from, from Jeff Crume,
a distinguished inventor here at IBM, said, anything. Right? For
Jeff's money, right now, he feels like it's too early
in the game to be connecting this stuff to anything,
really. And the other panelists kind of agreed because here's
the interesting thing about the AI agents and the way
that they're different from some of the other security challenges
we've faced in the past is that you're not
hacking these things the same way you would say, like,
I don't know, a piece of software, your traditional piece
of software. Right. You're not exploiting some kind of bug
in the code. You're not dropping a payload or writing
a script. You're basically social engineering these things. Right. If
you say the right words to these things, you can
get them to do some pretty malicious stuff. And
so the, the question that raises for cybersecurity experts today
is, okay, we don't really know how to stop people
from getting socially engineered. How do you stop an AI
from getting socially engineered? And that's the big question, right?
That's what people are trying to figure out in the
space today. Yeah. And I think it makes me think
a little bit about, I mean, so these are papers
I'm kind of obsessed with in the AI space, which
are like, oh, if you are encouraging to your AI,
it performs better. You tell it, it's really important to
my job for you to get this right and the
AI does better. And it's kind of interesting that you
use the phrase kind of social engineering because the way
we try to get humans to be better at spotting social
engineering is we literally show them, like, documentation of when you
should ask a question, if someone is asking for your
password. Do we need to do that as fine tuning
for our models? Do we train them to get better
at spotting social engineering using the exact same methods that we
use to get humans to be better at spotting it?
I think that we kind of do, actually. Right. And
it's a question of figuring out what that education looks
like. Again, going back to the conversation that I had
with the panelists on our episode this week, they all
kind of landed at what we need is some kind
of real world version of Asimov's three laws of robotics.
Right. But for teaching, these are our AI agents and
whatnot to avoid or maybe better detect social engineering with
again, the caveat that all this education we've put into
people, it still hasn't stopped it. People still get scammed
and they probably will get scammed forever. So in some
ways it's like, look, we may never stop the AIs
from getting scammed, but if we can set up some
kind of standard universal approach to telling them what to
watch out for, maybe we can cut down on it
a little bit more. Like we cut down on it
with people. That makes a lot of sense. Well, I
think, tell us a little bit more about the new
show. So I understand that it launched fairly recently. And
what will you guys be focusing on? We've been talking
AI just because MoE is, like, AI-pilled. But it
sounds like cyber security might be a much broader focus
for you guys. Absolutely. So, yeah, we basically, to be
frank, we ripped off your format for MoE. We do
a... we get a... hey, you
guys, you hit on something good. We just took it. So
we get a panel going every week, three experts and
myself, and we sit down and we break down the
latest stories in cybersecurity. And now granted, a lot of
it does have to do with AI, but
we also cover a ton of other stuff too, right?
You know, your DDoS attacks, your, you know, new app
vulnerabilities, big hacks, that kind of stuff. And then we've
also got, coming up soon, some pretty special one on
one in depth interview episodes with some experts that are
more, a little more narratively focused. That's going to be
a kind of bonus thing we do. But yeah, that's
the kind of gist of the show. Nice. That's great.
Well, if people want to find out more about it,
where do they find you? Where do they find the
show? Absolutely. Head over to the IBM
Technology YouTube channel. And then of course, we're also found
wherever podcasts are hosted. Security Intelligence is the name of
it. Go type that in, search it out. You'll find
us. Nice. Well, Matt, we'll have you back on Moe
sometime. And thanks for joining us today. I would love
to. Thank you, Tim. And that's all the time that
we have for today. If you enjoyed the episode, you
can get us on Apple Podcasts, Spotify and podcast platforms
everywhere. And we'll see you next week on Mixture of
Experts.