Synthetic Data, AI Agents, Safety

Key Points

  • The episode opens with a debate on whether AI progress will increasingly reduce human‑in‑the‑loop tasks by improving agents, or whether the impact will depend more on specific use‑case requirements and the limits of abstraction.
  • Nvidia’s recent launch of the Nemotron‑4 340B model family, engineered specifically for synthetic data generation, highlights a shift toward using artificially created datasets to scale and accelerate LLM training.
  • The show will examine the latest enterprise‑focused AI agents, discussing current capabilities, real‑world deployments, and the most significant business implications they may bring.
  • Former OpenAI chief scientist Ilya Sutskever has founded Safe Superintelligence Incorporated, sparking conversation about its potential role as a new competitor in the AI safety and alignment space.
  • The panel—featuring product manager Maya Murad, generative‑AI research director Kate Soule, and IBM fellow Kush Varshney—offers diverse perspectives on research, product development, and governance challenges shaping today’s AI landscape.

# Synthetic Data, AI Agents, Safety

**Source:** [https://www.youtube.com/watch?v=G8lnNR1-rsw](https://www.youtube.com/watch?v=G8lnNR1-rsw)
**Duration:** 00:43:46

## Sections

- [00:00:00](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=0s) **AI Agents, Synthetic Data, and Superintelligence Startup** - In the opening of the “Mixture of Experts” podcast, host Tim Hwang and product manager Maya Murad preview a discussion on Nvidia’s Nemotron‑4 340B LLM for synthetic data, the shifting role of enterprise AI agents, and Ilya Sutskever’s new Safe Superintelligence venture.
- [00:03:03](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=183s) **Nvidia's Synthetic Data Strategy** - The speaker speculates that Nvidia releases massive open‑source models like the 340‑billion‑parameter version mainly to generate synthetic data for training smaller, production‑ready models, because running such huge models directly is prohibitively costly.
- [00:06:12](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=372s) **Enterprise Interest in Synthetic Data** - The panel discusses how enterprises are increasingly seeking synthetic data to augment limited datasets, reduce costs, and address policy concerns in the era of generative AI.
- [00:09:19](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=559s) **Nemotron's Open Licensing Breakthrough** - The speaker explains that Nvidia's Nemotron release is notable for its permissive model terms that allow commercial use of synthetic data—unlike other restrictive AI licenses—marking a potentially pivotal shift in the industry.
- [00:12:23](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=743s) **Understanding AI Agents and Modular Systems** - The speaker explains that AI agents go beyond basic model inputs and outputs, requiring modular system engineering—integrating retrievers, external tools, and models—to build effective enterprise AI applications.
- [00:15:28](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=928s) **Evolving AI Agent Paradigms** - The speakers trace the shift from a single, monolithic model to step‑by‑step pipelines and now to autonomous agents that self‑generate instructions and interact with external services, emphasizing the challenge of reliably connecting models to those tools.
- [00:18:41](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=1121s) **Enterprise AI Agents Amid Legacy Systems** - The speakers argue that while a single, turnkey AI model is a dream, real‑world deployment must navigate legacy infrastructure, making controlled, low‑risk enterprise agents a practical first step.
- [00:21:49](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=1309s) **Enterprise Preference: Agentic vs Programmatic AI** - The speaker notes that most enterprises currently stick to systematic, programmatic RAG solutions with only limited experimentation in fully autonomous agents, and hypothesizes that narrow, high‑value tasks favor programmatic approaches while broader, less defined problems may benefit from agentic methods.
- [00:25:01](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=1501s) **Human‑in‑the‑Loop Trust Debate** - The speakers discuss whether advancing AI models will reduce the need for human oversight, emphasizing current safety concerns and the role of organizational culture in trusting autonomous systems.
- [00:28:07](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=1687s) **Ilya Sutskever’s New AI Venture** - The panel debates the freshly launched Safe Superintelligence Inc., questioning whether Ilya Sutskever’s startup can realistically compete with industry giants given its ambitious superintelligence goals and the enormous capital required.
- [00:31:12](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=1872s) **AGI Dreams vs SaaS Realities** - The speakers debate whether pursuing superintelligence is a coherent mission while acknowledging the financial pull toward day‑to‑day B2B SaaS applications.
- [00:34:18](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=2058s) **Balancing Long‑Term AI Vision with Immediate Needs** - The speaker invokes the Haudenosaunee seventh‑generation principle to argue that while future AI risks merit consideration, we should prioritize protecting current systems and focusing on practical, diversified AI solutions rather than fixating solely on superintelligence.
- [00:37:42](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=2262s) **Balancing Superintelligence and Business Speed** - The speakers examine how AI firms like OpenAI and Anthropic reconcile lofty general‑purpose AI ambitions with practical business pressures, highlighting Anthropic’s launch of Claude 3.5 Sonnet which, like rivals, prioritizes fast model performance over pure intelligence.
- [00:40:55](https://www.youtube.com/watch?v=G8lnNR1-rsw&t=2455s) **Beyond Monolithic AI Models** - The speakers contend that large language models such as GPT‑4 function as composite systems—often mixtures of smaller expert models—signaling that future AI advances will rely on modular, system‑level architectures rather than single monolithic models.

## Full Transcript
0:00Are we just waiting for these, like, agents to get better and better? 0:03And humans will have to do less and less in the loop. 0:05Or, you know, is it less dependent on the model's capability 0:10and more dependent on kind of the use case, like where is it headed? 0:14Is there a limit to how much we can abstract away? 0:27Hello and good morning. 0:28From my hotel room in San Francisco, 0:30you're listening to Mixture of Experts, and I'm your host, Tim Hwang. 0:33Each week, Mixture of Experts brings together a stellar group from research, 0:36engineering, product, sales, and more to tackle and discuss 0:38the biggest trends in artificial intelligence. 0:41So this week on the show, we've got three stories. 0:42First, Nvidia announced the launch of Nemotron-4 340B, 0:46an LLM specifically designed to aid in the creation of synthetic data. 0:50How big of a deal is it, and what does it say 0:52about the next stage of AI training? 0:54Second, we'll talk about recent developments for agents in the enterprise. 0:57Are agents a reality now? 0:58And what can we expect the biggest impacts to be? 1:01And third, and finally, just earlier this week, former OpenAI chief scientist 1:04Ilya Sutskever launches a company, Safe Superintelligence Incorporated. 1:08What is it? And does it have a chance of becoming a new contender in the space? 1:11As always, I'm joined by an incredible group of panelists 1:13that will help us navigate what has been another action-packed week in AI. 1:17So joining us for the first time today is Maya Murad, product manager for AI incubation. 1:22Maya, welcome to the show. 1:23Thanks for having me. 1:25And then we've got two veterans 1:26who are joining that we haven't seen for some time, 1:28but I'm very excited to have both of them back. 1:29Kate Soule is a program director for generative AI research. 1:33Kate, welcome back. 1:35Hey, everybody. Great to be here. 1:36And finally, Kush Varshney, who was part of the very first episode. 
1:40So he's the OG, for a mixture of experts. 1:43He is an IBM fellow working on issues surrounding AI governance. 1:47Yeah, O.G., I guess it's a great designation. 1:51Yes, it's some kind of distinction. 1:53So, Kush, welcome back. 1:59Well, great. So I think the first, story I really wanted to dive into was, 2:03you know, I caught in the kind of constant, 2:05you know, wave of papers and releases coming out. 2:09a launch that happened from Nvidia late last week on Friday. 2:12And because it was a Friday launch, I think it kind of 2:14got lost in the news cycle. 2:16But I did want us to kind of focus on it. 2:18Nvidia released a model, a class of models, 2:21called Nemotron-4 340B and it's a set of models 2:25that are specifically designed for synthetic data generation. 2:29and I think it's so interesting because if you're not familiar 2:31with the background here, right, 2:32the way we train LLMs, get them to do their magic. 2:36A lot of it relies on data. 2:38and in the past, the way people have done this is literally getting lots 2:41and lots of real world text, to train and improve their models. 2:46And so, you know, the dream of synthetic data, I think is very fascinating. 2:49It's kind of like the idea that in the future, 2:51we actually won't even need text from the real world. 2:53It'll just be like stuff that another AI model generates. 2:57and I guess, Kate, I want to throw it to you first, 2:59cause I know you've thought a lot about this 3:00area and been kind of doing some work in the space. 3:03I'm kind of curious, just as a first cut. 3:05Like, why is Nvidia getting into this area? 3:08I'm really kind of curious about why this kind of hardware company 3:11is saying what we're going to be launching these models. 3:13And one of the models we're launching is, is a synthetic data model. 3:15So I guess the first question I'll throw to you is like, do you have any thoughts on that? 3:19Conspiracy theories. 
3:20Maybe not conspiracy theories, but just like why it is 3:22that they're investing in this space at all. 3:25So, I mean, I 3:26think there's, some more straightforward answers 3:30and then maybe some side answers on why they're working in this space. 3:33I mean, one example of what you might think of why Nvidia is working, 3:38and released the 340B model into the open source, 3:43is for synthetic data specifically is because they're recognizing that 3:47no one wants to run inference on a 340 billion parameter model 3:50for real tasks like the value of models the size. 3:53I think originally there was a lot of excitement. 3:56Everyone wanted to build as big a model as possible and see how far they could push the field, 4:00but the reality is like no one's actually going to go and deploy this in production 4:03and hit a 340 billion parameter model every single time you want to run. 4:06Inference is just too costly. 4:08But there is a lot of value in running inference on this model. 4:13Once using the data that you create 4:16to train a much smaller model and then deploying that out in production. 4:21And so I think part of this might be just the field is starting to find new ways 4:25to add great value to these really big models they invested on early on. 4:29because they do take a while to train, but it actually makes a ton of business 4:32sense for Nvidia to, if you think about it, 4:34because customers need their customers need to get 4:37models hosted on their compute running as soon as possible. 4:40And what we're seeing is the easiest, not the easiest, but 4:44one of the most exciting. 4:45And, most powerful ways to start to take models, improve them, 4:49use them for their use case and get them out into the world 4:53and deploy them in production is to align them using synthetic data. 4:56So more and more customers are using and consumers are using synthetic data 5:01in order to actually take models and tune them for your use cases 5:05and for your task. 
5:06And so if Nvidia can help customers with that cycle and help get models out 5:09into production faster, you know, ultimately that's going to create 5:13some some good drag for their their products. 5:16Yeah, it's such an interesting dynamic. 5:18I think particularly in that first point that you're making is, you know, Nvidia 5:21like the company that I identify with like really big compute, right. 5:25Like, you know, every release is like bigger and bigger. 5:27And the models that you can theoretically 5:29run on them are bigger and bigger and bigger. 5:30But I guess sometimes you're kind of saying 5:31here Nvidia is like conceding the reality that like most people actually 5:35are not going to be doing that. 5:36It's just like such an expensive thing. 5:38And so almost they have to win the synthetic data game, or 5:41in the very least kind of like support that use case just because it isn't. 5:44It isn't what most people are doing to create 5:46like the biggest, biggest, biggest models in the whole world. 5:48Yeah. I mean the model, if you look at the, supported intended uses, it's 5:53not just intended for synthetic data generation. 5:55It's a perfectly reasonable chat model. 5:57It can be used for chat use cases. 5:59But if you look at how it's been marketed by Nvidia, every single press 6:03release and blog and paper is all about the synthetic data aspect. 6:06And so I think they're recognizing that this really is 6:08the only viable way to try and get value out of models of size. 6:13And it's also a really exciting way. 6:14I mean, it's just where the field is headed 6:15and where people are seeing a lot of, a lot of use and value. 6:18Yeah, for sure, Maya I'm kind of curious, 6:20So I can bring you into the discussion. 6:21So I think one of the reasons 6:23we want to bring you on is my understanding 6:24is that you've done a ton of work with kind of enterprises, right? 6:27And getting them to integrate new kinds of technology. 
6:30I guess I'm curious about what you're sort of seeing out in the space as like 6:32synthetic data becoming, you know, a bigger part of the discussion. 6:35Do people want it? 6:37I'm just kind of curious about like, 6:38what the market demand for, for this is looking like over time. 6:41I think it's a very exciting space. 6:43And it's a it's a need that existed prior to the rise of generative AI. 6:48So customers are they have their own data, but they're also limited to it. 6:52And it's costly to create task specific data. 6:54So there is a really strong premise of can I start with only 6:58a few examples and augment that data set to for various 7:02use cases to train my own model to evaluate on a use case to red team? 7:07so there's a lot of value to be drawn out of it, 7:10and it's one of the top customer inquiries we get. 7:13yeah, that's really interesting. 7:15Yeah. I wonder if, like in the future, 7:17I just I'm curious if any of the panel is kind of view on this as like, 7:21you know, we've we've so identified and I think this has actually been 7:23certainly something on the policy side. 7:24People have been like, oh, you know, AI is so data hungry. 7:28And so therefore it is kind of privacy invasive, right. 7:31Like it just needs all of this data to get working. 7:34Synthetic data for me has always been like, well, maybe in the future, 7:36like actually real world data is actually not going to be 7:38that big of a deal because you have a few examples 7:41and then you scale up with synthetic and there you go. 7:43I mean, is that a fantasy or are we headed to that world? 7:46I mean, I think in terms of using synthetic data 7:48as a tool to protect privacy, it certainly has a lot of value. 7:52I don't think we have quite, 7:54you know, there's a lot of work to be done in that space. 7:56in order to really take advantage of the promise that it offers. 8:00But my I've mentioned that even before, 8:04like, LLMs became around, we've been using, synthetic data. 
8:08And I think that's very much true. 8:09Like we've looked at, for example, 8:11a couple of years ago, using synthetic tabular data just to create 8:15privacy protected versions of sensitive data sets where you can mask information. 8:21so there's a lot of different kind of really cool applications that 8:24I think are going to start to converge 8:25with synthetic data around privacy, how that impacts generative AI training. 8:29and maybe help drive the space forward more. 8:31I know, Kush, you do a lot of work in this space. 8:33Maybe you can comment. 8:34Yeah. No, I mean, I think the the privacy aspects are an important part of it. 8:40I think the exciting thing is actually going back 8:42to the fundamentals of probability. 8:44I know in the first couple episodes we covered Kolmogorov and other friends, 8:49from that time and just the ability to sample like really high 8:52dimensional, spaces, with just a few examples 8:57is like mind blowing in some ways, even though it feels very normal. 9:01Like with five data points, you can sample this like 9:04trillion dimensional space and cover it so well. 9:08I mean, I think that's pretty crazy. So, 9:11yeah, I mean, this is just changing how work is done. 9:16I think, in the whole generative AI space. 9:19Yeah, for sure. 9:20I think it's really, 9:21yeah, it's a really interesting way of sort of thinking about it. 9:24And yeah, I think we want to do more collaborative type sections, 9:27like it feels like that was actually really good for us to go so, so wonky. 9:31but I think it's actually important, I think in terms of like, exposing, 9:34you know, what's actually really going on under the hood. 9:36I mean, I guess one question for you is, 9:38given that this has been a long term research objective, right. 9:42Do you see the Nemotron release this kind of largely 9:44sort of incremental like, is this a big deal this launch, or is it 9:47pretty much just like the most recent salvo in the race for synthetic, data? 
9:53So honestly, I think what I 9:56what is the most exciting part about this 9:58release is the release of the model terms, 10:02where Nvidia is actually saying 10:04we want this model to be used to train other commercially viable models 10:10and make no claims over models trained with synthetic data created by Nemotron 10:14which is very different than almost any other model provider 10:18that's created, especially ones creating their own custom model terms. 10:21If you look at like the Llama license, you know, Gemma license 10:25and others that are being created, they all prohibit that type of use. 10:29So the model itself is exciting. 10:30And, you know, 10:31I think people will start experimenting with it 10:33and seeing how far they can take it in the next couple of weeks. 10:36But what they did that is really totally different from 10:40how anyone else is behaving is actually the permissions in which it was released. 10:44it's like the legal terms are the real innovation here. 10:47I'm sure there's 10:48lots of other innovation too, but that's what I'm sure you're excited about it. 10:52yeah. Yeah, for sure. 10:54And I think it's kind of I mean, you know, Nvidia is 10:56kind of playing a dangerous game here though, right? 10:58Because there's a bunch of people who are making these models proprietary 11:00because they want to build businesses at the synthetic data layer. 11:04I guess Nvidia is kind of saying like, we sort of don't care. 11:06Like we would rather just have everybody have access to this. 11:09And, you know, for us to sort of enable all the secondary commercial models 11:14rather than creating like a market around synthetic data, specifically. 11:18I mean, you got to use Nvidia compute to, 11:21to generate synthetic data on a 340 billion parameter model. 11:25Then you have to use compute to train it, and then you have to 11:27use compute to host it. 11:28So I think they're playing a good long game there. 
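The workflow Kate describes, running inference on a very large model once in order to create training data for a much smaller production model, can be sketched in a few lines. Everything here is a hypothetical stand-in: `teacher_generate` represents a call to a large hosted "teacher" model (such as a 340B-parameter model behind an API), and no real model API is used.

```python
def teacher_generate(seed: dict, variant: int) -> dict:
    """Placeholder for one inference call to a large 'teacher' model.
    A real implementation would prompt the model to paraphrase or extend
    the seed example; here we fake it so the sketch runs offline."""
    return {
        "question": f"{seed['question']} (paraphrase {variant})",
        "answer": seed["answer"],
    }

def augment(seeds: list[dict], n_per_seed: int) -> list[dict]:
    """Expand a handful of hand-written examples into a larger synthetic
    training set, which would then be used to tune a smaller 'student'
    model that is cheap enough to serve in production."""
    synthetic = [
        teacher_generate(seed, i)
        for seed in seeds
        for i in range(n_per_seed)
    ]
    return seeds + synthetic

seeds = [{"question": "What is our vacation policy?", "answer": "20 days per year."}]
dataset = augment(seeds, n_per_seed=4)
print(len(dataset))  # 1 seed -> 5 training examples
```

The expensive big-model inference happens only during data generation, which is exactly why, as discussed above, serving cost pushes teams toward training small models on the teacher's outputs rather than deploying the teacher itself.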
11:32yeah I saw this great tweet recently, which is, you know, the adage of like, oh, 11:35when there's a gold rush, everybody should be selling shovels, 11:37but it's almost kind of like Nvidia is like when everybody's selling shovels, 11:40you need to be selling shovel making machines. 11:42And I was like, 11:43that's actually like such a good way of kind of capturing 11:45what it is that they're what they're doing in the space. 11:53There's a very broad topic I 11:54think we want to cover, on this second agenda item, which is agents. 11:58Right. This is like the jargon in the space shift so quickly. 12:01It's like it feels like a few months ago, people were like, oh, agents are coming. 12:05And then now it's kind of like agents are here and everybody's working on it. 12:07And, I guess, you know, Maya, I think part of the first thing 12:12I want to kind of bring you in on is just 12:14if you want to explain to our listeners what are agents exactly? 12:16And like, why is it a jump from what we've had in the past? 12:19because I think the definition and the distinctions have not always 12:21been so clear. 12:23And so I think even for me, I don't really know. 12:25but we're just like, it's a good place to start. 12:27Like, what are agents and why are they different? 12:29Yeah, absolutely. 12:30So the way that I like to explain what our agents is to first contextualize 12:34where we are in terms of building applications with AI. 12:37So in 2022, all about foundation models, all about large language models. 12:42I think we've learned since that simply inputs and outputs from 12:46a model is not going to unlock those high impact, enterprise use cases. 12:52So if I want to understand what is my vacation policy 12:55or if I want to retrieve real data, I have to build a system around it. 12:59And actually, Berkeley came out with a really good paper 13:01this year talking about compound AI systems. 
13:05and it's a reflection of the fact that you need to go back 13:08to plain old systems engineering to build AI applications. 13:12So modular components that are fit to solve certain problems. 13:15So you have a retrievers, you have external tools, 13:18and then you have your model interacting with those to solve a problem. 13:22And we're still in this world and agents still operate 13:25within the space of an AI system. 13:29the way most of AI systems are built, for example, 13:31retrieval, augmented generation is one of the most popular kinds 13:34is me as a developer, I prescribe the flow. 13:37So take the user query, run it through a search, retrieve 13:41the results, feed it a model, and then the model generates an answer. 13:45So this is a flow that has been pre described and it's fixed. 13:49If I give it something else it's not going to work because it was pre 13:53described to solve problems maybe related to a vacation policy. 13:57and agentic approach in a system 14:00is that the LLM can reason through how to solve the problem 14:03and understand what is the path to chart to to answer a query. 14:08And this is done through two capabilities and that are building on top of large 14:11language models. 14:12So one, large language models have really great reasoning capabilities. 14:16And they're improving as LLMs are getting 14:19bigger and stronger and are seeing more data. 14:22And it's operating the same way we as humans do it. 14:24So if I give you a complex question like how many times can the UK 14:28fit in the US if you if you, I ask you to give me a 14:32an answer on the top of your head, 14:34you're not going to get you're most likely won't get it right 14:36unless you're a really great geography buff. 14:39but we as humans, 14:40the way we think about it is we break down the problem into smaller parts. 14:43So let me find the size of the UK. 14:45Let me find the size of the US. 14:47What tools do I have at my disposal to find this? 
14:50Maybe Wikipedia is a trusted source, 14:52and then let me do some math to divide the bigger by the smaller. 14:55And this is exactly how an agent would reason about it. 14:58And there's no magic behind it. 15:00You're just literally prompting the model to say, think step by step, 15:04create a plan. 15:05And then for each part of the plan, you have access to tools. 15:09so that's the other part of it, the ability to act and to call on tools. 15:12So a tool could be an API that interfaces with Wikipedia. 15:16It could be a calculator, it could be a piece of program 15:19that can run a script for you. 15:21And when you combine all of those together, you're actually giving 15:24a lot more autonomy to the model, to how to solve the problem. 15:28You're not scripting how the solution would be. 15:30The model takes care about how to solve it. 15:32And this is what we mean by an autonomous agent. 15:34So long answer. 15:36But I think this is helpful to contextualize that. 15:38It's a continuation of where we are with systems design. 15:41Yeah, that's really useful. 15:43It sort of feels like we've kind of moved through these three acts already. 15:47Right. And like lasts, you know, 24 months, right. 15:49Where like my the way you kind of phrased it was the first act was everybody 15:53thought we'd have one big model 15:55and do inputs, outputs and like problem solved, right. 15:57Like we're done. 15:58And then it feels like act two was, oh my God, that doesn't work at all. 16:02What we need to do is kind of like prescribe all of these steps 16:05and then kind of like insert the model into the process. 16:09And then I think act three is kind of it sounds like a little bit of where 16:11we're going now with agents is, well, we kind of go back to that big model state. 
16:16But the trick is, I guess, that we're allowing the model to like, 16:19develop these step by step instructions on their own and then I guess, 16:22enable them to kind of like reach out and interact with all of these systems. 16:27so it, it strikes me that the big challenge here is 16:30can the model actually touch all these outside services? 16:33because, you know, we've had stuff like chain of Thought for a very long time. 16:36That interface seems to have been the difficult one. Right? 16:38Which is like, how do you actually get the model 16:40to go and interact with all these services? 16:42Because I suppose, in fact, it's kind of like 16:44get the agent to write an API on the fly. 16:47Is that kind of how people are approaching it? Yeah. 16:49So models are being trained to be better at generating 16:53a correct output to interface with an API or to run a piece of code. 16:57There's also, actually good privacy 17:00and security considerations here when you move to the agentic space. 17:04So prompt injections would be quite scary. 17:07And then I think I pass it on to Kush to maybe talk more about that. 17:10But the other part, as well as code execution, 17:14we've heard from even like I think the founder of OpenDevin, said 17:17the first time we run it, like all the systems in the file 17:19system were deleted because of like the agent behaving. 17:22I have to be like really careful and how you architect that. 17:24And yeah, I don't know if you want to comment 17:26more on like some of the security privacy considerations with agents. 17:32Yeah. 
17:33 So clearly, when these things are interacting with various cyber-physical systems, there is a risk of different types of prompt injection attacks, and what we call indirect prompt injection attacks: things can either get corrupted because something out in the real world has a problem, or the models themselves go out and actually cause things to happen in the real world. So yes, it is a danger, but the promise is also really great, and I think we can build some of these security and privacy guardrails into the way these decisions actually interact.

18:17 And as Tim said, the fact that the code itself, the API code, is being generated on the fly is the most exciting part of this, because code has traditionally been written for a fixed purpose. Here, if it's being generated, you can compose things in a creative way, and I think that's a unique thing.

18:44 Yeah. All of these things seem to point to the struggle of deploying AI in the real world. Act one, as I described it, is almost the AI person's dream: a complete vacuum where one model does everything out of the box, input to output, and we're done. Each step since then seems to have shown that there are actually all these legacy systems you need to deal with. Maya, is it right, from Kush's last comment, that this is one reason agents are potentially really good in the enterprise?

19:16 Because I guess what I imagine here is: your main worry about agents on the consumer side is that they're going out into the world doing all sorts of things, subject to all these inputs from the public that might be attempts to manipulate them. But in the enterprise you control a lot more of the variables, right? You could have an agent that just operates within a business, and that feels like it might constrain the problem in a way that makes agents more viable. Do you agree?

19:42 Yeah. So in general, when you're taking on a new piece of technology, especially one with more autonomy in how it operates, starting with low-risk areas, maybe back-end services in a company, would be a good place to start. Then you gradually expose it more as you architect systems to safeguard it from threats, or guardrail its internal behavior so it doesn't end up wiping your system. With all technology, you start in the less risky place and gradually add more risk, and I think that's a sound approach.

20:16 And then, in this world, I talked about compound AI systems. On one side you have a programmatic way of doing things; on the other side, the agentic way. I don't think it's going to be one or the other; the two are going to talk to each other. For some problems, especially narrow problems where there's a very specific, repeatable way of solving them and you're not going to get a query out of left field that doesn't fit the solution you have in hand, go for efficiency.

20:48 You go for the programmatic approach. For problems that can be solved in many ways, for example software engineering problems, an agentic approach can help, because there are multiple paths to a solution. These two will come together to solve problems in the enterprise. It's not going to be one or the other, and you're always going to apply a systems mindset: how can I solve the problem most cost-efficiently, with the right guardrails around it?

21:16 Totally. There's a really big question here: for a given enterprise, if you broke down all the tasks they need to do on a given day, what is structured and what requires an agent? How that market separates is actually a big question. I genuinely don't know: if you took an average enterprise in America and said, let's map all your business processes, do they tend to be quite routine, in the sense of "we just need to do this in this very systematic, programmatic way"? Or do they require an agent to go out and exercise some innovation? Do you have any early signal on that, from the clients you work with, the people you talk to? Are enterprises still mostly favoring this very structured approach, or are there particular kinds of businesses saying, "oh yeah, the agent, that's exactly what we need"?

22:09 Yeah, I would say most enterprises are still in the systematic approach. RAG is a very popular use case; most enterprises have built programmatic RAG, and there are degrees of agenticness you can build into it.
22:21 So maybe you could have a self-reflection loop: you take the output the RAG system provided, have the model reason about whether it actually solves the problem at hand, and maybe loop one more time. We're seeing some companies dabble, but not fully embrace the full autonomy of an agent solving a problem, for multiple reasons. One is that we're still trying to understand which sets of problems are better suited to a fully agentic approach versus the programmatic approach.

22:52 I have a hypothesis: if you have a narrowly defined solution that can unlock a lot of business value, say you're interacting with a database and there's a narrow set of commands you want to apply, you can engineer all the fallback loops by hand, and if solving it unlocks a million dollars for the company, go the programmatic way. Whereas if you have a system that's going to get so many different queries that you need different trajectories to solve them, and you can't even think through all the ways it could fail (running a pilot program uncovers some, but there's a lot more to figure out), that's where having more autonomy in the system could be useful. But we're still in the early days, still trying to thread this tension between the two. And cost efficiency will come into the equation as well: for some problems you're willing to spend more for one percent more accuracy, and for others you're going to be a lot tighter about the right solution to implement.

23:51 Well, and isn't there a third dimension to all of this? What do you want to handle programmatically, what can you start to free up for a more agentic approach, but then what always needs a human in the loop: humans and agents working together, versus just humans separately. I'm curious, Maya, if you have any thoughts on that other layer of it.

24:12 I think great agent systems, where we are right now, might need a human in the loop, and it would be great to have the system know when to plug in the human. With the agentic frameworks out there, you can prescribe for the model to call in a human for help. So it depends on the risk of harm or failure, and on the availability of an expert; that's what you would consider in deciding how to bake it in. Most autonomous agents we've seen have a strong element of human in the loop: if you look at the GitHub workspace assistants, all of them require you to revise the agent's plan before it executes, and then you see every step implemented, so you have many avenues for recourse, for changing how the problem is being solved. Right now we're leaning strongly toward having a human in the loop, because it's early days and there are many ways this could go wrong. I don't know if this answers your question.

25:17 Well, I was just going to ask: are we just waiting for these agents to get better and better, so that humans have to do less and less in the loop? Or is it less dependent on the model's capability and more dependent on the use case? Where is it headed? Is there a limit to how much we can abstract away?

25:36 Yeah, I think these are great questions. Me personally, I'm always in favor of having people engaged, for safety.
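The two mechanisms the panel describes here, a self-reflection loop over a draft answer and a prescribed hand-off to a human when risk is high or confidence stays low, can be sketched roughly as follows. This is a toy illustration, not the API of any real agent framework: `run_with_oversight`, `reflect`, and `ask_human` are invented names, and the threshold and loop count are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Draft:
    text: str
    confidence: float  # 0.0 to 1.0, produced by the reflection step


def run_with_oversight(
    query: str,
    answer: Callable[[str], str],          # the model call (a stub here)
    reflect: Callable[[str, str], float],  # scores how well a draft answers the query
    ask_human: Callable[[str, str], str],  # hand-off to an available expert
    high_risk: bool = False,
    threshold: float = 0.7,
    max_loops: int = 2,
) -> str:
    draft = Draft(answer(query), 0.0)
    for _ in range(max_loops):
        draft.confidence = reflect(query, draft.text)
        if draft.confidence >= threshold:
            break
        # Self-reflection loop: retry with feedback before involving a person.
        draft.text = answer(query + "\nPrevious attempt judged insufficient: " + draft.text)
    # Escalate when the task is high-risk or confidence never cleared the bar.
    if high_risk or draft.confidence < threshold:
        return ask_human(query, draft.text)
    return draft.text


# Toy stand-ins to show the control flow:
result = run_with_oversight(
    query="refund order 1234",
    answer=lambda q: "issue refund",
    reflect=lambda q, d: 0.9,
    ask_human=lambda q, d: "human approved: " + d,
    high_risk=True,  # refunds move money, so always route past a person
)
# result == "human approved: issue refund"
```

The design choice worth noting is that escalation depends both on a property of the task (`high_risk`) and on the model's own judgment (`confidence`), mirroring the point above that the risk of harm and the availability of an expert together determine when a person is pulled in.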
25:46 It's an open question whether these models are going to get better at what they do and implement what they need more autonomously. Where the technology is right now, you definitely need a human in the loop.

25:59 Yeah, and the meta point here is that it feels like we're really talking about trust. There's a hypothesis I've been chasing: whenever people talk about AI, it's almost "the AI is coming", and what does the AI enterprise look like? But the fact of the matter is that this depends so much on the culture of a given company, and the degree to which they trust autonomy. What you might anticipate is that where a company lands on the spectrum between the programmatic approach and the agentic approach will be almost entirely defined by how much management trusts the technology, and by their general behavior. If a company is already very structured with its employees ("here's the big list of things you need to do"), it will be no surprise that when they implement AI, it's programmatic in that way. Whereas for enterprise cultures that just give people goals and let them figure out how to get there, it will similarly be no surprise when they end up working with agents, so long as there's enough trust in the technology.

27:06 Yeah, there's the risk appetite. With larger enterprises, like we said earlier, you might start in low-risk places, like your HR systems, your back-end systems.

27:19 And then you would bring it more consumer-facing. It will be interesting to see this year what the appetite of enterprises is to adopt more agentic behavior.

27:29 Yeah, that'll be fascinating to see, especially because some of those determinations of which functions even count as low-risk will depend a lot on what else is happening in the world. You can imagine a very high-profile failure basically lowering people's appetite for implementing agents for a long time. There's almost a Google Glass effect: one wearable is an enormous failure in the market, and then it's difficult to convince people to put anything on their face for the next decade. So there's a bit of a crossfade: all these companies are making their own decisions, but they're doing it in a soup of what they see in the press, in the open, and in what their competitors are doing.

28:13 Well, great. To bring us home, I want to go to our third topic. There's a company that was launched just this week, I think 48 hours ago, 24 hours ago, by the name of Safe Superintelligence Incorporated. If you haven't been following this soap opera, the background is that Ilya Sutskever, formerly the chief scientist and one of the key players at OpenAI, who played a role in the board drama and kerfuffle that happened not too long ago, left the company; the circumstances are still disputed. He has reemerged with a new company that promises to finally deliver on the dream of superintelligence. And Kush, I think you threw yourself in front of the bus to talk about this topic.
29:00 So I'm curious, just as a starter: how big a deal is this? There's my point of view, which is that it's becoming increasingly clear that in order to play in the AI space you need very, very deep pockets, and there's a part of me that despairs at competitiveness in this space. I don't agree with Ilya; I'm very skeptical of all the superintelligence stuff. But man, do I hope a small startup can go up against the big folks. Is that even a possibility? Does this company, SSI, have a chance to become a player in the space?

29:33 Yeah. I think this is just one of multiple rounds that keep happening, where you start with idealism and then face reality. OpenAI started with an ideal; then Anthropic started with some ideals; and now this is the next one. What's happening is that the deep pockets are the important aspect, because you need so much investment to even get off the ground, and at some point the question becomes: wherever the money came from, what return are they getting? It's hard to stay isolated and be only a research organization, to have that single goal, the one thing on your roadmap, because people want to get paid too. And there's this conflict between scaling and caring: you can scale, which is what capitalism is all about, or you can care, focus on one person, one issue, and really go deep on it. You can't really do both.

30:52 And so, I think that's where the conflict is. Is this going to be successful? Maybe, but I think it'll be just one more round of this. For a couple of years, two or three years, whatever it is, they'll keep their point of view. But then someone will be at the table asking to be fed. So we'll see what happens as the two forces interact.

31:22 Yeah. I have a friend who observes that you start out trying to deliver on AGI, and then you find yourself saying, "we've got to do B2B SaaS." You're eventually dragged toward it even if you just want to fund the dream, because where the money comes from is these very day-to-day applications.

31:41 Kush, we actually haven't talked about this in previous episodes: do you buy their mission? Is the goal of superintelligence something we should even be chasing? Is it a coherent goal? There are real critiques people have made in that space. As someone who thinks about AI governance, who researches these issues, how do you size up this idea of a company whose promise, even to an investor, is: you put money in, and we deliver superintelligence? It's like the old DeepMind mission: we solve intelligence first, and after that we solve everything. Is that the right way of approaching these problems?

32:21 Yeah, no, it's a great question. Let me give somewhat of a historical perspective, at least for me. The first time I heard about superintelligence at all was December of 2015.
32:35 This was at the NeurIPS conference. There was a whole-day symposium, which they don't do anymore, but there was one, titled "The Algorithms Among Us." There were a lot of different things; it was about the societal benefits of AI, things I was thinking I would be really interested in. And then I show up, and one of the presentations is Nick Bostrom talking about superintelligence. Through that whole day, the word "safety" kept coming up again and again, and no one was defining what safety even is, what they meant by it. I came home and tried to figure out what safety means to me, and wrote something about it as well: minimizing the probability of harms and risks, and minimizing the possibility of unexpected harms, that sort of thing, which lends itself to the more clear-and-present harms, the things that affect society now.

33:44 At the same time, in 2016, the now-famous paper "Concrete Problems in AI Safety" came out; Dario Amodei was the first author on that one. That somehow just caught people's attention, and it became kind of a religious thing: this existential risk, this big question of what AI is going to do to humanity a hundred generations into the future. And to me, yes, we do need to think a little bit into the future. There's a concept called the seventh-generation principle, which comes from the Haudenosaunee tradition: you can think 150, 200 years into the future and consider what might happen, the consequences.

34:41 But that far into the future, in advance, is a little pretentious, in my opinion. So, superintelligence: there are risks, of course. But I would much rather, from a personal, societal, and enterprise perspective, focus on what we can do, where we take things, and what we protect, now.

35:06 Yeah. What I'm trying to reconcile, and it may just be that we end up talking about how the broad universe of AI will continue to diversify and look very different in different places, is this: we've spent maybe the last forty minutes talking about trends that are almost the very opposite of superintelligence, of what Ilya is working on. It turns out a lot of people don't really need huge, gigantic models; that's what the synthetic data stuff is pushing toward, and similarly, a lot of the issues businesses are dealing with are things like how to query a database effectively. So maybe these two worlds, which were very similar for a while ("LLMs are going to deliver on superintelligence, and oh, by the way, we can also do B2B SaaS"), will see their technical agendas drift further apart over time.

35:55 Yeah. At least from my perspective in the agent space, the types of agents coming out are very narrowly defined to solve a specific task, and then you have instantiations of narrow agents: an agent that focuses on data analysis collaborating with an agent that can do reporting. And this is kind of the path forward.
36:14 That's what is gaining adoption and traction, for several reasons. It's more democratic: you can work with open-source models, smaller models that you can self-host, so you have full control over the system you're building. And I think this is the opposite view from the big monoliths taking a stab at superintelligence. At IBM Research, I think I know which camp we're in: betting on giving you, as a developer, more autonomy, letting you control which models you use and bring your own data, and going about it through more narrow applications. I have an open question in my head of how broad and general-purpose we can go, and whether there's a limit to what would be useful to society, like Kush mentioned.

36:58 Just echoing what Maya said, I don't think the two are going to be very well aligned, incentive-wise, moving forward. Look at where we're going to be incentivized to develop, given the types of tasks: you don't need super-general intelligence. But I do question whether this should be developed in a proprietary, closed company, versus in a more academic consortium or other groups that might be better incentivized and have better priorities, when we talk about what society could actually benefit from in developing this type of technology, versus some man behind a curtain with some VC money going at it.

37:42 Pay no attention to the man behind the curtain.

37:45 Yeah, it's so competitive. And I think part of this divergence is that some of these companies, OpenAI, and Anthropic is another great example, were basically instantiated by people who really believe in, and want to work on, massively agentic systems that are incredibly general-purpose; they would call it superintelligence. But in practice they've also had to deliver on the day-to-day of being a business. To what degree can these businesses keep those two objectives in line, or even work on things that achieve both ends? I think the question we're asking is whether at some point the research and product-development agendas become quite different.

38:27 Our producer Han has actually just dropped in that Anthropic launched a new model in the space this morning: Claude 3.5 Sonnet. One of the things everybody's observing and chattering about this morning is that it's very similar to what Google and OpenAI have done over the last few weeks, which is speed: they want to launch models that are very, very fast. It's interesting to ask why you would work on speed. It's not necessarily because you think the superintelligence needs to be really fast; rather, speed opens up all these other interesting consumer applications, like talking on the phone with your AI while you walk down the street. I think it's a really good example of how, if anything, the recent announcements this summer have all pointed toward the practical rather than the speculative. But I don't know if the panel has thoughts on that, or whether they saw the recent Claude release; I think it was literally ninety minutes ago.
39:27 No, I haven't seen the Claude release. But on that point, maybe countering what I said before, which could be interesting: think about the types of models we're working with today, the transformer class of technology. They're not efficient at all; they require huge amounts of energy, and there's a lot of question whether this is really the type of technology that will actually drive superintelligence. So if we're being incentivized to focus on things like efficiency and speed, will that unlock, and maybe help us discover, new types of models, new architectures, that could serve as a much better basis for potential general intelligence systems?

40:06 Yeah, that's right. So, unexpectedly, in chasing B2B SaaS we come out the other end saying, I guess we have superintelligence now. That's sort of what you're saying.

40:16 Well, maybe it just helps us get a little more diverse in where we're investing.

40:23 Right. And to go back to something we said earlier: we used to have all this promise that monolithic models would solve all your problems and achieve superintelligence, and I spoke a little about how it's actually systems that are bringing a practical angle. There are some really interesting papers and studies coming out showing that if you compare best-in-class models like GPT-4o against a systems approach, agentic or not, using smaller models, you're actually on a Pareto-efficient curve: achieving better accuracy than GPT-4 in a much more cost-efficient way.

40:55 And when you talk to enterprise customers, this is the selling point: you're able to do more, more accurately and more cheaply. I wonder whether in the future we'll get monolithic models that are better out of the box, or whether the power will come from a systems approach.

41:16 Well, even the definition of a monolithic model is different now. We're seeing that a given model is actually a mixture of experts, say eight smaller 8-billion-parameter models, sometimes fused together, sometimes more independent. So what is a model, or a monolithic model, versus a system? Those lines continue to blur.

41:39 Yeah, I think we're very much past the monolithic model. I think we can all safely say that GPT-4 is much closer to a system than to a monolithic model.

41:49 Yeah, the era of the big model is actually already over; we're not in that world anymore. And it's also true to form: if you're one of those people who uses the evolution of the human mind as a way of forecasting where AI is going, there's a view that the human brain was glued together over many millennia, one piece bolted onto another, bolted onto another. So in some ways it should be no surprise that if we're working on general intelligence as something that resembles a human mind, you end up with a model that is ultimately a bunch of pieces running around. A bunch of kids in a trench coat, actually, is how we achieve general intelligence. Yeah.

42:27 And even the safety work that Ilya was leading on superalignment, where a smaller model controls the bigger model and makes sure it doesn't go haywire, is again this sort of architecture, this view that you're going to have a bunch of things working together. The way they described it was weak-to-strong generalization: you have a weak model that's controlling the strong model. But I think a better way to think about it is a wise model controlling the strong model; there's some aspect of wisdom coming in, and different properties of different components can actually keep things under control. Just like our wise host Tim here keeps all of us under control: there are rules, and there are reasons for all of us to exist and work together.

43:26 That's great. Well, thanks to all the listeners for joining us again; please join us next week. As always, if you enjoyed what you heard, you can get us on Apple Podcasts, Spotify, and better podcast platforms everywhere. So, Maya, Kate, Kush, thanks for joining us, and we hope to have you back on the show at some point in the future.

43:44 Thank you. Thanks, everyone. Thank you.

43:46 That's great.