Halloween AI Roundup: TPUs, Insurance, Space
Key Points
- Anthropic announced a massive expansion with Google Cloud, planning to deploy up to 1 million TPUs and add over a gigawatt of compute capacity by 2026, an investment worth tens of billions of dollars.
- Recent AI industry headlines include OpenAI’s shift to a traditional for‑profit model granting Microsoft a $135 billion stake, Nvidia hitting a $5 trillion market valuation, and Amazon unveiling AI‑powered smart glasses for delivery drivers.
- The Halloween‑themed “Mixture of Experts” episode brings together Gabe Goodheart, Chris Hay, and Kate Soule to discuss AI insurance, OpenAI’s blog on handling sensitive conversations, and the concept of building data centers in space.
- A quirky AI application highlighted in the show is a Toronto‑based digital agency’s playlist that combines AI and canine sound science to keep dogs calm during noisy Halloween festivities.
- Host Tim Hoang humorously notes that while the promise of virtually unlimited energy and cooling for space data centers is enticing, the practical maintenance challenges could be nightmarish.
Sections
- Halloween AI News Panel - In the Halloween episode of *Mixture of Experts*, host Tim Hoang and a panel of AI experts preview topics ranging from Anthropic’s TPU commitment and AI insurance to OpenAI’s blog on sensitive conversations and space‑based data centers, while also recapping recent headlines like OpenAI’s for‑profit restructure and Nvidia’s $5 trillion market cap.
- Chip Platform Diversity and Inference Shift - The speakers explain that while transitioning between hardware platforms has become easier, the move toward inference‑heavy workloads drives companies like Anthropic to explore multiple chips for scaling, yet they likely stick with Nvidia for training due to entrenched CUDA optimizations.
- GPU Tensor Ops Landscape - The speaker explains how NVIDIA’s CUDA‑first innovations dominate cutting‑edge tensor operations, while other APIs like Vulkan and OpenCL are catching up, leading to hardware/software parity that lets most models run efficiently on any platform.
- Hardware Strategies: OpenAI vs Anthropic - The speakers compare the use of specialized AI chips for custom models, noting OpenAI’s push toward its own hardware for long‑term AGI ambitions, while Anthropic remains focused on practical enterprise deployment without pursuing a dedicated chip.
- Trust Through Certifications And Insurance - The speaker argues that reducing technical complexity into clear certifications and insurance guarantees is essential for consumers to confidently adopt AI and other advanced systems, driving market adoption through monetary incentives.
- Insurance Regulation vs Chinese Competition - The speaker questions whether stringent US insurance standards will disadvantage domestic firms against faster, less‑regulated Chinese providers and asks a bullish colleague if they are more optimistic about near‑term capacity growth.
- Why General AI Insurance Fails - The speaker argues that blanket insurance policies for AI models are naive, stressing that risk management must be use‑case specific and that proposals for superintelligence‑level insurance are unrealistic.
- Rare Event Safety Dilemma - The speaker highlights how the scarcity of real‑world mental‑health emergency cases makes traditional ML safety training impractical, prompting reliance on expert panels and simulated evaluations, and questions whether this will become the industry norm amid calls for fewer guardrails.
- Assessing Trust in OpenAI Messaging - The speakers critique OpenAI’s vague public statements, highlighting the difficulty of gauging real technical progress versus superficial “CYA” messaging and the resulting challenges in establishing user trust.
- Ethical Concerns of AI Emotional Support - A speaker reflects on users' emotional reliance on AI, acknowledging its utility when human help isn’t feasible while expressing skepticism about whether AI truly guarantees people receive appropriate professional assistance.
- AI Companionship Safety & Space Data Centers - The speaker critiques reactive safety “bumper‑rail” measures for AI companions, preferring built‑in safe design, then shifts to present Nvidia‑backed StarCloud’s bold claim that future data centers will be placed in orbit to exploit vacuum cooling and solar power, inviting reaction.
- Space Hardware, Politics, and Dreams - The speakers weigh the technical hurdles of upgrading orbital equipment against looming debris, regulatory, and geopolitical challenges, while expressing enthusiasm for the futuristic concept.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=KF3jIUBecFo](https://www.youtube.com/watch?v=KF3jIUBecFo) · **Duration:** 00:47:36

Timestamps:
- [00:00:00](https://www.youtube.com/watch?v=KF3jIUBecFo&t=0s) Halloween AI News Panel
- [00:04:20](https://www.youtube.com/watch?v=KF3jIUBecFo&t=260s) Chip Platform Diversity and Inference Shift
- [00:07:31](https://www.youtube.com/watch?v=KF3jIUBecFo&t=451s) GPU Tensor Ops Landscape
- [00:10:39](https://www.youtube.com/watch?v=KF3jIUBecFo&t=639s) Hardware Strategies: OpenAI vs Anthropic
- [00:16:21](https://www.youtube.com/watch?v=KF3jIUBecFo&t=981s) Trust Through Certifications And Insurance
- [00:21:41](https://www.youtube.com/watch?v=KF3jIUBecFo&t=1301s) Insurance Regulation vs Chinese Competition
- [00:25:16](https://www.youtube.com/watch?v=KF3jIUBecFo&t=1516s) Why General AI Insurance Fails
- [00:29:19](https://www.youtube.com/watch?v=KF3jIUBecFo&t=1759s) Rare Event Safety Dilemma
- [00:33:23](https://www.youtube.com/watch?v=KF3jIUBecFo&t=2003s) Assessing Trust in OpenAI Messaging
- [00:36:41](https://www.youtube.com/watch?v=KF3jIUBecFo&t=2201s) Ethical Concerns of AI Emotional Support
- [00:41:43](https://www.youtube.com/watch?v=KF3jIUBecFo&t=2503s) AI Companionship Safety & Space Data Centers
- [00:45:01](https://www.youtube.com/watch?v=KF3jIUBecFo&t=2701s) Space Hardware, Politics, and Dreams
You know, the advantages are compelling, right? Virtually unlimited energy,
virtually unlimited cooling. What's not to love? And if they
can make the tech work, awesome. And then of course
I came down and thought, huh, the maintenance of this
sounds just like an absolute nightmare. All that and more
on today's Mixture of Experts. I'm Tim Hoang and welcome
to the Halloween episode of Mixture of Experts. Each week
Moe brings together a panel of brilliant, funny and somewhat spooky
panelists to distill down what's important in the latest news
in artificial intelligence. Joining us today are three incredible panelists.
So a very warm welcome to Gabe Goodheart who is
chief architect AI Open Innovation, Chris Hay, who is a
distinguished engineer, and Kate Soule who's Director of Technical Product
Management for Granite. Lots of topics. Today we're going to
talk a little bit about Anthropic's commitment to TPUs, a
little bit about AI insurance, some interesting blog posts out
of OpenAI on sensitive conversations and finally data centers in
space. But first we've got Illy with the news. Hey
everyone, I'm Illy McConnell, a tech news writer for IBM
Think. I'm here with a few AI headlines you might
have missed this week. OpenAI has restructured, becoming a more
traditional for profit company. This move gives Microsoft a whopping
$135 billion share in OpenAI. Nvidia has become the first
company to reach a $5 trillion market valuation, powered by
its chip business. To put this in perspective, Nvidia is now worth
about twice as much as JPMorgan Chase, Walmart, ExxonMobil and
Johnson and Johnson combined. Amazon has unveiled AI powered smart
glasses for delivery drivers. The glasses guide the drivers as
they walk with directions and alerts to identify any hazards
along their way, all so they don't need to look
down to check their phones. Halloween can be scary, especially
for your dog. A Toronto based digital agency has created
a playlist that combines AI and canine sound science to
help dogs stay calm through the Halloween noise and excitement.
Want to dive deeper into some of these topics? Subscribe
to the Think newsletter linked in the show notes. Now
back to the episode. First, I really want to start
with another kind of massive blog post from Anthropic that
came out this week. I'll just quote it, but it
basically is their announcement that they're going to be increasing
their work with Google Cloud and specifically their sort of
Google's TPU kind of AI chip. So they say quote.
Today we are announcing that we plan to expand our
use of Google Cloud technologies, including up to 1 million
TPUs, dramatically increasing our compute resources as we continue to
push the boundaries of AI research and product development. The
expansion is worth tens of billions of dollars and is
expected to bring well over a gigawatt of capacity online
in 2026. So like 12 months from now, basically. So
this is another blog post where, you know, every day
like there's another post where you're like, the numbers are
just mind boggling. Chris, do you want to give an
intuition for like why, why is Anthropic going big with
TPUs after I guess happily working with Nvidia and also
Amazon as well for years? At this point I think
they just want to push Nvidia's net worth down from
5 trillion to 4.99 trillion, knock a few little
digits out of there, that's their motivation now. I think
it actually makes a lot of sense. Right. Which is
the reality is that they are on multiple clouds, they're
on AWS, they are on Google, et cetera. And if
you think about the GPU, you know how hard it
is to get GPUs these days. Actually being able to
just sort of diversify your stock a little bit there,
I think is a really smart strategy and therefore you
can get the best cost for inference. You can run
across multiple clouds, you can use Google infrastructure. So I
think it's a, a very, very smart strategy. Now technically
I think it makes things harder because you're not depending,
you know, you're not being able to take advantage of
things like the CUDA performance improvements. And you need to
go and find these things and run different architectures to
support these chips. So they're making life difficult for themselves.
But I totally understand why diversifying is a good thing.
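Chris's diversification argument, getting the best cost for inference by running across multiple clouds, can be sketched as a toy cost comparison. Everything below is illustrative; the provider names and per-token prices are hypothetical placeholders, not figures from the episode.

```python
# Toy sketch of multi-cloud inference routing: pick the cheapest
# accelerator for a workload. All names and prices are hypothetical.

# Hypothetical cost per million output tokens on different chips.
PRICE_PER_M_TOKENS = {
    "gpu-cloud": 15.00,       # e.g. an Nvidia GPU fleet
    "tpu-cloud": 11.50,       # e.g. Google TPUs
    "trainium-cloud": 12.75,  # e.g. AWS custom silicon
}

def cheapest_provider(prices):
    """Return the lowest-cost deployment target and its price."""
    name = min(prices, key=prices.get)
    return name, prices[name]

name, price = cheapest_provider(PRICE_PER_M_TOKENS)
print(name, price)  # tpu-cloud 11.5
```

The real decision also weighs kernel maturity and migration cost, which is exactly the CUDA lock-in trade-off Chris raises next.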
Yeah. And I kind of was interested in, I mean
my instinct is like operating in each one of these
chip platforms is kind of its own beast. And so
almost like what Anthropic is saying is like we need
chips so much that we're willing to like deal with
all of that operational complexity. Is that the right way
of thinking about it or is like actually it's like
a lot less complicated than it used to be in
terms of moving across platforms. I think it certainly probably
is less complicated than it was two years ago. But
I think what it also is reflecting is the compute
needs are continuing to shift from what used to be
very training heavy workloads for these providers to inference heavy
workloads for these providers, where it's a lot easier to
get your model to run on these other chips than
it is to train on these other chips. And so
from that perspective it makes a lot of sense as
Anthropic's looking to scale their deployments, and as reasoning
models and other kind of what we call test-time
compute approaches continue to boost up their inference costs,
to find newer, cheaper ways to scale that inferencing. My
guess is that they're still going to use Nvidia to
train. Okay, why is that? For some of the same
reasons Chris stated with CUDA and all of the like
optimizations that they've undoubtedly sunk into their Nvidia based stack.
Yeah, it's almost like they've, almost like they've invested already.
So why would you start from kind of like start
again in some ways? Yeah, and I think that's just
like Nvidia's cash cow. Like that's what they have really
optimized for with some of these most advanced GPUs, versus
the TPUs and Trainium and other chips others are building. Yeah, I did want
to talk about that Gabe, a little bit because my
understanding is, I mean the move to a TPU is
distinct. Right. It's actually not a GPU and you know,
my understanding is it's a, it's an ASIC. Right. It's
sort of like a chip that's kind of designed specifically
for AI applications. And I think for a long
time people have been like, oh well, you know, the
GPU is kind of like a historical accident and we're
eventually going to move to the world of like a
true AI chip from first principles basically. And so I
don't know if this is like, I mean 1 million
TPUs is a lot of TPUs. It feels like maybe
we're kind of finally crossing that threshold into like a
world of much more specialized chips. But I don't know.
I'm curious about what you think about that. Yeah, I
mean just to pick on the point about CUDA and
the sort of the incumbency there, there's both the incumbency
at the hardware, but actually where it's really sticky is
in all of the kernels that their engineers have
spent hours, days, months, weeks, years tuning to be perfectly
aligned with that hardware. And my guess is that what
they're going to do is put all of their older
models on the TPUs, where the non-CUDA code ecosystem
has caught up on the kernel implementations and they're going
to keep driving the novel architectures because basically each one
of these models is just a collection of Tensor ops.
But each one of those has to be carefully tuned
for the stride and the batching and the SIMD chunking
of how you actually run this giant pile of math.
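The "giant pile of math" point can be made concrete with a toy example: a matmul is the core tensor op, and a hand-tuned kernel is essentially the same triple loop reorganized into hardware-friendly tiles. This pure-Python sketch is purely illustrative, not how real CUDA or TPU kernels are written.

```python
# Toy illustration: one tensor op (matmul) as a pile of multiply-adds,
# and the same op "tiled" the way a tuned kernel blocks its loops.

def naive_matmul(a, b):
    """The giant pile of math: a triple loop of multiply-adds."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                out[i][j] += a[i][p] * b[p][j]
    return out

def tiled_matmul(a, b, tile=2):
    """Same math, blocked into tiles: the stride/batching/chunking
    decision a hand-tuned kernel bakes in for one specific chip."""
    m, k, n = len(a), len(b), len(b[0])
    out = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for p0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        for p in range(p0, min(p0 + tile, k)):
                            out[i][j] += a[i][p] * b[p][j]
    return out

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
print(naive_matmul(a, b))  # [[19.0, 22.0], [43.0, 50.0]]
assert naive_matmul(a, b) == tiled_matmul(a, b)
```

Both functions compute the same result; only the loop order and blocking differ, which is why porting a model between chips is "easy" while matching tuned performance is not.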
And at the end of the day it's a bunch
of adds, subtracts, multiplies, bit shifts: the complicated,
low-level gorpy crap that you
can't read because it doesn't make any sense until you
finally get your head into that parallel view of a
grid. But they're going to keep innovating on Nvidia to
get the latest and greatest. If they're implementing some novel
architecture, it's almost certainly going to be CUDA first. But
at this point many of the other driver packages like
Vulkan and maybe to a lesser degree OpenCL and others
are starting to catch up on performance for some of
these well understood tensor ops. And you've probably got most
of your less innovative model architectures poised and ready to
go to just run on any old platform you can
because it's reached parity across the different, you know, software
driver layers. So I think that's really smart in that
sense. If they can offload the sort of well trodden
path to cheaper, more efficient hardware and keep the expensive
hardware for the cutting edge, that's going to give them
a nice sort of twofold advantage. And nowadays, I mean,
I think originally all of these tensor ops that were
powering these novel architectures were new and so the platforms
hadn't caught up across the board. But now that we're
a few years or AI decades into this space, AI
centuries basically. Exactly. It's really started to level out a
bit. And we'll probably see that in general across alternate
hardware that Nvidia leads the way on the novel architectures
and alternate hardware sort of picks up the broad breadth
for efficiency plays. Yeah, for sure. And I think that's
one thing that occurs to me and Chris. Yeah, I
was actually about to just kind of turn to you.
I mean you had a joke at the very beginning
where they're like, ah, maybe we'll just like knock some
hundreds of millions of dollars off of like Nvidia's market
cap. And I think we often talk in terms of
like, oh, who's going to take on Nvidia? But this
is almost like there's like kind of room for everybody.
It sort of seems like here where like basically there's
like a role for Nvidia to play. But like, I
mean that capacity that you need to just do inference
on well-understood models is huge. And so it's kind
of like maybe a world where like in the future
it really won't, you know, like it will basically be
like all the major model providers kind of are doing
what Anthropic is doing here. Do you agree with that?
I agree. And let's face it, Anthropic needs that capacity.
Anyone who's ever tried to use the Claude model at
midnight UK time will understand: the API limits, capacity
limits reached, come back later. You know, if they're going
to get a million TPUs, I'm fine with that. Just,
just, just don't deprive me of my precious. It's a
move specifically. Exactly. So, so go for it, Anthropic. And,
and so I think, I think it is necessary and
I think that diversification is, is good. So I'm happy
with that. I'm not surprised though. I mean, I think
I've said this before, but I'll say it again, like,
it's like, the fact that we're following the trend of Bitcoin is just
hilarious, right? You know, Bitcoin started with CPUs, then
they went to GPUs, and then they went to FPGAs,
and then they went to ASICs, and guess what? AI is
following the exact same path. And it makes sense, right?
Which is if you've got custom models that are always
doing roughly the same thing and you can get faster,
less general specific chips that do that job cheaper, then
go for it. And we've already seen that play out
very well. If you look at the Groq chips, for
example, then they run incredibly fast. So I'm all for
it. Just so I can. And so I can play
with Claude at midnight. That's all I need. Please let
Claude be accessible. That's really what I want. Maybe kind
of a final question here, Kate. So we've kind of,
in the context that this has come up in for
OpenAI, we've talked a lot about them ultimately going like,
kind of more vertically integrated. Right. There's a lot of
rumors about the OpenAI chip and what are they doing
with the OpenAI chip and what it will look like
and all that sort of stuff. We're seeing that, I
think, at least to my recollection, unless you've heard otherwise,
like Anthropic seems to be doing that less, to say,
like, oh, we need an Anthropic chip and we're going
to really hype an Anthropic chip. But I think both
are kind of pure competitors in a certain way. Do
you think there's a reason why Anthropic is like not
really getting into the hardware game? I think Anthropic's position
has been far more focused on like meeting practical enterprise
deployment needs, where OpenAI is obviously on the pursuit of
super general intelligence at all costs. And so I think
from OpenAI's perspective, they might be playing a bit of
a longer game and looking at: to get to truly
differentiated, AGI-style intelligence, are there maybe even co-optimizations
that need to be made between models and chips all
the way down the stack? And how does that kind
of unfold more broadly? And I am not an AGI
believer person, so I resonate much more with Anthropic's
approach, which is regardless of AGI, what are the practical
use cases that can be solved today with AI and
how do I scale up my demand to meet it?
So I think they're really like, it gets to very
philosophical differences of how these companies are pursuing innovation. Well,
that AI skepticism is good for, I think what will
be our next topic that I want to move us
on to. So I think it's become almost like a
little bit of a joke, which is like when you
want something passed around and discussed in AI land, you
launch a freestanding website that just has your essay on
it. And this past week was no exception. So this
essay, I think that came out, I think maybe a
few weeks ago, called Underwriting Superintelligence from a number of
researchers and then this guy Rune, who runs a company
on insuring kind of AI platforms. And the subject of the essay Underwriting
Superintelligence is really about kind of like the role of
insurance in allowing new technologies to form. And one of
the things they focus on is what they call the
incentive flywheel. Basically the idea is like once you have
people insuring a new technology or a new space, they
tend to want to lower the risks and they tend
to want the people that they're insuring to do a
good job managing the risk. And they kind of describe
this sort of virtuous cycle whereby the insurance company says
you need to adopt certain standards. Those standards make it
possible to do audits, and then those audits simultaneously make
things safer and help price that insurance. And so I
thought this was kind of a fun idea. It's very
much couched in the world of like, well, we're about
to head into AGI world. But I think this is
a kind of like bigger, kind of interesting question about
insurance in the AI space. And I guess, Gabe, I
kind of wanted to actually kick it to you first
because you do a lot of work in sort of
like open innovation, open models. I guess the question of
kind of like the risk to the open model provider,
is that something that you guys talk about in your
space at all? I'm about to launch a new open
source model. Do I have to be worried as the
provider of that model, of the liability, should something go
wrong with it going forwards? Curious if that's in your
world at all. Yeah, I would say it really depends
on how an open model is being positioned. I think
there are plenty of organizations that are just out there
trying to make something cool that people play with and
find some value in. I would say from our position
in IBM, I don't want to put words in your
mouth, Kate, but I'd say we actually do care quite
of baby steps in this direction around guaranteeing and adding
verifiable tracing through the training process to ensure that the
models are in fact meeting benchmarks for, you know, best
practices in training and security. So one of the things
that really struck me about this article was that, you
know, this is already happening a little bit piecemeal. But
I think the, the overall sort of framing of this
as an insurance problem that follows other sort of risk
sensitive markets and ecosystems like vehicles and fire and building
and just sort of infrastructure components was an interesting take
on it. You know, the, the, you know, the science
part of me always gets skeptical when these things talk
about, oh, just throw evals at the problem and everything
gets solved. Because evals at the problem is, you know,
an undefined set of words. However, then, you know, I
think about if I were an automotive engineer and you
know, I'd probably have exactly the same skepticism about various
different safety standards. I, you know, I used to work
in defense and there were all kinds of hoops you
had to jump through. And the old joke was like,
just don't try to apply logic to security, just check
the boxes, right? And it's really frustrating as an engineer.
However, this sort of dumbing down of the complexity space
is almost required in order to make that complexity space
manageable for folks outside of the deepest sort of knowledge
bases of that complexity space. So whether it's AI, you
know, defense systems, cars, probably even, you know, fire prevention,
there's probably a ton of nuance that I have no
idea how it works. And I just look for, you
know, fire safety certifications before buying, you know, a bed
for my child kind of thing. So there's something real
here about, you know, getting the flywheel off the ground, so to speak.
And obviously this is a little bit motivated by the
author's vested interest in trying to be the insurance company
for this marketplace. But I think the key here is
for consumers to put their money where the certifications lie.
Right. It has to basically start with where the money
flows. That's what gets the flywheel turning. And, you know,
no one's going to pay for insurance for their model
if it doesn't actually boost their bottom line and get
more consumers to use their model. So it'll be
really interesting to see. And that sort of gets back
to that: how do we dumb down the complexity space
enough that it becomes actually meaningful to consumers to say,
I'm going to pick this model whether it's open or
closed, because it's got this certification and I know I
can trust it, or because it's indemnified in this way.
And I know if it does something terrible, I actually
have somebody I can sue and get, you know, recourse
for out of the model provider. Yeah. And I think
that's kind of the question that I felt was like,
maybe one of the things lurking behind and maybe, Kate,
I'll throw it to you because it sounds like, you
know, you guys are already working on some things that
are shaped a little bit like this, but, like, kind
of, it feels like before you get to insurance, people
are going to just be like, can you certify to
certain things? Right. Like, can you guarantee certain things, you
know, regardless of whether or not you're going to like,
pay me if something goes wrong? And so I guess
there's a world where this evolves where you might not
ever need kind of third party insurance, potentially at least
in like, most cases, because most people are just being
like, yeah, do you meet certain standards? Okay, if you've
met certain standards, then I'm happy to adopt this. Right.
And then it kind of like gave to your point,
like, the businesses are happy because they're getting more business
as a result. But I guess, Kate, it sounds like
maybe you all at Granite are kind of thinking about
some of this stuff piecemeal. I think understanding and
mitigating risk has certainly been at the core of our
strategy with Granite from the beginning. So, for example, we
made a lot of decisions very early on to make
sure and take preventative measures to prohibit known pirated content,
for example, which has been the subject of many lawsuits
in the US, from being used in training. We're very
open and transparent about the data that we're using, which
is really just a testament to how careful we're being
about data selection. And I think that has ultimately led
to us continuing to work with different standards bodies to
get certification for that, and steps to help educate our
customers and users on the variety that exists in model
development and why aspects are important. So Granite is the
first open source model family that was developed according to
ISO 42001 standards, which is really exciting. The only other
model developer, there's plenty of model providers but the only
other model developer I'm aware of with those standards is
Anthropic's Claude. So not many providers have gone after that
certification but I think it is starting to grow in
terms of its prevalence in the at least US based
markets. I struggle though with how insurance kind of plays
a role. I think there's a couple of different things
going on. So one, there are examples, and the article even
cites some, of insurance for, for example, copyright
protection. So Copilot came out first saying that,
you know, we'll indemnify against, you know, if the model
produces copyrighted output and you use that in your products,
we'll protect you. I worry that the parties best placed
to understand the true risks won't be the insurance
companies, it'll be the model developers themselves, just
because this technology is so new. They
had somewhere in the article citing that there's only like
100 researchers who are qualified to do model audits in
the United States. There are very few people who understand
the technology to the detailed level needed to assess that
risk as well as very few companies open and transparent
enough and you know, kind of governing the development in
a way that that risk can be well tracked and
understood. So I, I don't see a great opportunity for
third-party insurance providers to come in and kind of
buffer against that risk. It'll be like first-party insurance basically. Yeah,
well you're going to use our model and if you
get into trouble it'll be like the copyright thing. IBM
does that as well. Right. So we provide indemnification for
our Granite models when they're used through the
watsonx product lines, because we understand better than
anyone else could, you know, the extent to how these
models were trained and the risk that they can create.
So I think that's potentially going to prevent third party
insurance providers from coming in. I also had some questions
this article raised. It discusses the importance of moving
fast and staying competitive against China. But insurance by
definition is not really a global market: with insurance, you
always have a population that is similar in some characteristics
and pools its risk together to provide protection. And
if all of these Chinese model providers and developers don't
adopt these kinds of standards or insurance policies, and keep
moving with probably less regulation than US-based ones,
they're going to move a lot faster. That's just going to
disincentivize US companies from operating under the same
constraints in order to stay competitive. I didn't think the
article had a great solution to that. I don't see that going
away, and I don't think insurance is going to let us compete
in that sense. So it raised more questions for me than I
think it ultimately answered. One thing, I guess. Chris, I'll bring you
into this conversation, because in previous MoEs you've
tended to be, not exactly a superintelligence believer, but
kind of the most bullish on capabilities getting really,
really fantastical in the near term. Do you believe a
little bit more in that? Because it feels like your
calculation kind of changes if you believe, oh yeah,
we're about to see these AIs do incredible things that
are very high risk that maybe the first party model
providers are not going to be able to self insure
essentially. And so maybe there's kind of pressure. If your
model is going to be used for some crazy, you
know, DNA printing thing that has like a huge biorisk,
you might eventually want to kind of like outsource that
to like a third party insurance market. Were you a
little bit more sympathetic to this article or were you
kind of similarly skeptical like Kate and Gabe? Insurance leads
to clauses, clauses lead to gray areas, gray areas lead
to lawyers, and lawyers lead to the dark side. That's
my opinion on this one. So I am not a fan of this.
The world does not need more lawyers. Because of the lawyers,
basically? Yeah, no, I don't want this. And you say the model
providers can't do it? Who are the model providers? They're all
billion and trillion dollar companies. If a trillion dollar
company can't afford to insure, right, some of the richest
companies in the world can't afford to insure their models,
then who do they think is going to insure these models?
It's like, what planet are we living on? And to your point,
Kate and Gabe, do you think suddenly OpenAI and
Anthropic are going to go, please come in, I want to show
you the inner details of my model, please, please, and
then you can write it up in your insurance policy,
et cetera? No. And we know
what the clauses are going to be like. You must
do this in a safe and secure way. You must
not prompt inject. You must not do this. And it
is always going to come back to being your fault
that the model went wrong; it's never going to be
their fault. They'll be like, oh, you didn't put
a safety rail in there. And I'll say, I didn't know
I was meant to do that. It's always going to be
my fault. And what am I going to be doing? I'm
going to be handing them money, and then call centers
are going to ring me up: do you want me to
insure your ChatGPT instance? That's going to be
$300 a month, grandma. And I'm going to be like,
no, this is not an industry I want. Insurance is bad.
Okay, I think there's one other thing this article gets
wrong or misses the point on entirely, which is that so much of AI
based risk is use case specific. So having an insurance
policy for general model deployments, and for general models
themselves, is, I think, just very naive. If you're talking
about, okay, here's a very specific biomedical application
where we need X, Y and Z regulations to ensure consumer
safety and health, that has existed for a while now, is
really important, and should continue to exist and expand to
cover these new AI-based risks. But these kinds of global
policies, and again, the entire premise of this article is
that we're going to have artificial general intelligence and
superintelligence, and therefore we need superintelligence
insurance, are, I think, just not practical or realistic
about what the real risks are and what needs to be solved
for in the near term. Well, and Kate, to your point,
you know, I think there's a lot of debate about
what superintelligence actually means. You could achieve superintelligence in a
specific domain vertical where the machine can do better than
top individuals trying to solve that problem. That might
be a great, very specific slice of a market
for an insurance policy. However, just "AI", stopping
at those two letters and insuring that whole thing, I agree,
is really wide open. And you're basically going to get
back around to, let's trust the model creators, because
they're the ones that actually understand these problems,
to have done it correctly. So it's a chain of
trust without a root certificate. Right. And to that point,
Gabe, it's lazy offloading, right? Because to your earlier point,
it's like, okay, you're providing a product and service, and
that service has got to be compliant with whatever regulation
applies. Doing the thing that is your responsibility as
the product, deciding whether the AI is suitable and
you've tested it enough to be in your product, that's on
you. By insuring the model, you're just offloading your risk,
which I guess people want to do. And it's lazy, you know
what I mean? So I get why you'd want to do it. And if
the model developers want to insure that risk, that's fine,
and I can imagine many do for various reasons. But I
assure you, when it goes wrong and you're looking at your
credits for that month and go, oh, I got my credits back
because it went wrong, I don't think that's going to cover
your risk. So that's kind of my issue on that. All right:
skeptical of superintelligence and just hates lawyers. I'm
not skeptical of superintelligence, I'm still all in on
that. Yeah, you're skeptical of insurance. I just don't like
insurance or lawyers. Well, I'm going to
move us on to our next topic, which I wanted to put
next to this one because I think they rhyme in
interesting ways. We just talked about an essay that was
very speculative about superintelligence and how you might
insure against those risks. In some ways, the next topic
is also about companies managing their risk, but in a very,
very different context, and I thought it was fun to put
them next to one another. OpenAI did a blog post talking a little
bit about how they're working to make sure that ChatGPT
can navigate sensitive conversations with people well and safely.
It's a really interesting post that goes through their
philosophy of how they attack these problems, some of the
data they've seen from GPT-5, and overall how they're
managing this from a product standpoint. And, you know, to
our conversation earlier, I think it's a good example of
maybe an alternative path for how companies will manage
some of the risk here, which is that OpenAI is just
showing the work. They say, okay, you're worried about
this; here's the
evaluations we've done, here's how we think about it, and
you have to make your decision on whether or not
you want to use this product. And I think, you
know, Kate, maybe I can kick it over to you.
I think the first really interesting problem is that, from
a product standpoint, it's really hard to do safety in
this space because actual real-world cases are so rare.
They point out, across these kinds of sensitive
conversations, and there's a question about how you define
that, that their initial analysis estimates only about 0.07% of
users active in a given week, and 0.01% of messages,
indicate possible signs of mental health emergencies related to psychosis
or mania. And that's kind of an interesting problem, right,
because the usual machine learning move is, well, can we
collect some cases and fine-tune against that? But here the
data is very, very limited. And I guess mostly I'm
interested in your take on whether you buy their approach
to managing this, which is: we only have a few examples,
so we're going to rely on panels of experts and on
simulated evals. Is this the way you think is going to
become the industry method for attacking these
types of problems? I mean, I read these articles, and in
the back of my head I'm like, didn't Sam Altman just say
we're going to treat adults like adults, allow erotica on
the platform, and get rid of guardrails and safety altogether? I
was going to ask about that as well. So if you want to
take that question too: what's going on here with this
split screen? It seems very discordant, at least to me.
But look, it's certainly good that they're engaging and
being more open about the work they're doing with these kinds
of taxonomy-based approaches, which are very similar to ones
that we use at IBM to map out and define safety issues,
particularly around mental health, and that they're engaging
with subject matter experts and clinicians in the area to
figure out the best ways to respond. That doesn't mean,
though, that they have the right incentives to fully solve
this problem.
And you'll notice a lot of things about what they
say are very vague. So 0.01%, first of all, can
still be an enormous amount of data on potentially
concerning mental health conversations, if we're talking
about all of ChatGPT usage. And they talk a lot about, oh,
we've had a 65% reduction in harmful responses, but they
don't tell you the percentage of harmful responses they started out with.
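As an editor's aside, Kate's point about the missing baseline can be made concrete with a quick back-of-the-envelope calculation. The two starting rates below are purely hypothetical illustrations; OpenAI's post does not disclose the actual figure.

```python
# The same 65% relative reduction applied to two very different
# (hypothetical) baseline rates of harmful responses.
reduction = 0.65

for baseline in (0.50, 0.01):  # 50% vs 1% of responses harmful to start
    after = baseline * (1 - reduction)
    print(f"baseline {baseline:.0%} -> after {after:.3%}")

# A 1% baseline lands at 0.35%, which matches the figure Kate mentions;
# a 50% baseline would land at 17.5%. Same headline, very different reality.
```

The relative number alone cannot distinguish these two worlds, which is exactly why the undisclosed baseline matters.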
Is this a 65% reduction where the model was saying
something awful 50% of the time, or 1% of the time, and
now it's down to, you know, 0.35%? So I think the lawyers
got in there, Chris, and heavily post-edited that release
to try and make it, okay, we're going to be open, we're
going to talk about it, but we're not going to tell you
too much. Is it right to say you're kind of
skeptical? That this maybe seems to indicate more than it
actually does? It's not necessarily a bad thing, but I am
certainly skeptical that they're doing everything they can
to solve the problem, or have the incentives to solve it
and really take this as seriously as it needs to be
taken. I also worry that you
called this very much a product-based strategy. At the
product level they could be doing a lot of work that then
doesn't get addressed at the lower level, like the API
endpoints that the rest of the world is building its own
products off of. How that translates to a world where
advertisements are being sold to consumers who are having
mental health issues and breaks, and how it translates to
other, broader applications, I think is concerning and not
really addressed. So I think there's just a lot more work
to be done ultimately. Yeah, I mean, it kind of comes back to
the chain of trust question, right? If you as a
consumer have decided that OpenAI is a company you trust,
this is a great article. It's like, man, look at
all this great stuff. I'm leaning into trusting that they
are, you know, really trying to do their best. Right?
They've got a lot of verbiage about working hard and
doing their best, and that's great. If you have not
already founded a root of trust in OpenAI, there are a
whole lot of holes to poke in this article, right?
There's a lot left to your imagination to fill in.
Exactly as Kate said, what does
an actual positive or negative outcome mean here? Right. It's
a really gray space and to the previous conversation, it's
really hard to establish trust here. This is, you know,
the realm of non-Gaussian statistics, right? We're off in
the tail of the distribution, and it's hard math, hard
science. So I fall probably somewhere in the middle of the
trust spectrum. I think the fact that they're putting this
out there means at least some real work was probably done
and is probably making some positive improvements. But
juxtaposed against the public messaging, it's still really
hard to know exactly where this falls, and how much this is
a sort of cover-your, sorry, CYA blog post. Yeah,
exactly. Versus, you know, a real sign of technical
improvements. We also know that in any company of reasonable
size, the left hand doesn't really talk to the right
hand very well, especially if the right hand is a
public figure at the top of the company. So it's very
possible that there are chunks within OpenAI doing real
work that then run into the buzzsaw of product decision-making
and public messaging and all of that, which makes
it hard to get the science in the right place. It's
probably a much more complicated picture behind the curtain
than this blog post is presenting. I'm glad they're at
least putting a foot forward, but I'm still very guarded
in my trust of them as an organization, and frankly of any
organization that's self-certifying that it's doing the
right thing. For sure. And
I think this is, again, kind of the problem, right? In
the last conversation we said, well, the more realistic
outcome is that these companies are going to self-certify.
Now we're looking at self-certification and asking, is
this the kind of certification we really want? You're kind
of stuck between two maybe-not-so-great scenarios, I
guess. Chris, I think
one of the things that really struck me about this
piece outside of the data was it's also OpenAI starting
to show their point of view on how these types
of technologies should respond in certain types of situations. So
the example I'm thinking a little bit about is they
said, well, okay, in cases where we detect that the
user is becoming emotionally reliant on the model, then our
intervention is that the model should tell them to have
more real world conversations, which is a very specific kind
of opinion about what the model should be doing in
that context. So take that case: I was curious whether
you think that is actually the desired behavior for these
technologies, that if they detect you are becoming
emotionally reliant, they should say, hey, have you
considered talking to other people in the real world? Is
that effective?
From your point of view, do you think that this
is a good way for these models to approach, you
know, these situations? Yeah, I am the last person you
should be asking these questions, Tim. As someone who's
very emotionally reliant on Claude. Yeah, exactly. I think
it's difficult, right? First of all, to Kate's point: 800
million users, and 0.07% of that is a huge number, over
half a million people a week. That's a frightening number.
But for people who are relying on AI in that way, and I
don't really know that feeling myself; I rely on Claude,
but not for emotional stuff. But I mean,
I think the reality is maybe they're in a scenario
where going and speaking to a human isn't an option,
do you know what I mean? And maybe they're relying
on the AI for those very reasons because it's a
very difficult subject and going to speak to a human
about that is going to be difficult for them. So
I understand the sort of things that they're doing there
and I think people need help wherever they need help,
whether it comes from an AI, whether it comes from
another human. And you know, and I think the good
thing they're doing is they're speaking to professionals who
are telling them the right things to say. So I
think I applaud all of that. I guess the skeptic
in me is just like, I hope that they really
want people to get the right help that they need.
And that it's not offsetting the blame: well, you know,
they contacted us and we said, go speak to a human being,
and they didn't do that. And again, those pesky lawyers
are back. So I
don't know, but I don't think there's a good answer
in these scenarios. And I hate to say it, but I think
it's going to get worse, right? Because at the moment
we're chatting with ChatGPT and others in text mode, and
of course we have voice mode, especially when you're
driving and things like that. But as more modalities come,
more realistic-looking avatars and so on, and those
technologies progress, it's going to start to feel more
human-like, and I think these problems are going to push
more and more. So I don't know what the answer is, but I
do applaud them for trying, I think. Yeah, and on the
applaud-them-for-trying point, there's actually a corollary
piece; I think it launched yesterday, maybe the day before.
They just actually published previews of fine tuned versions of
their open weights models that are designed around policy enforcement.
So very similar to the Granite Guardian series where the
user provides an actual policy that they want to check
whether the input or output adheres to and then it'll
do sort of real time on the fly evaluation against
that policy. So actually I thought that was maybe the
most interesting part. I suspect the timing is not
coincidental, although it may be. But if they're actually
making a real stride towards putting solutions to some of
these problems out in the open, I think that's really
encouraging. So I'm very curious to check those out.
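As an editor's aside, the "policy as input" idea Gabe describes, where the policy text itself is supplied at request time rather than baked in at training time, can be sketched roughly as follows. This is a hypothetical illustration, not OpenAI's or IBM's actual API; all names are made up, and the guard model is stubbed out so the sketch runs end to end.

```python
def build_judge_prompt(policy: str, content: str) -> str:
    """Wrap a user-supplied policy and the content to check
    into a single classification prompt for a guard model."""
    return (
        "You are a policy compliance checker.\n"
        f"POLICY:\n{policy}\n\n"
        f"CONTENT:\n{content}\n\n"
        "Answer with exactly one word: COMPLIANT or VIOLATION."
    )

def parse_verdict(model_output: str) -> bool:
    """Return True if the guard model flagged a violation."""
    return model_output.strip().upper().startswith("VIOLATION")

# Toy stand-in for the guard model, so this sketch is self-contained.
# A real system would call a fine-tuned safety model here.
def fake_guard_model(prompt: str) -> str:
    return "VIOLATION" if "refund" in prompt.lower() else "COMPLIANT"

policy = "No discussion of refunds is allowed in this support channel."
prompt = build_judge_prompt(policy, "I want a refund for my order")
flagged = parse_verdict(fake_guard_model(prompt))
```

The design point the panel raises is visible in the sketch: because the policy arrives as plain text at evaluation time, the same guard model can enforce arbitrary policies, rather than only the fixed taxonomy it was trained on.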
You know, one thing I noticed that was slightly different
from how we framed the Guardian models is that our
Guardian models are specifically tuned towards a collection
of policies that we have data for, whereas it sounds like
what they've built is a general purpose policy evaluation
model that takes arbitrary policies. Thank you, Kate. I'm
sorry, yes, I misrepresented that. All right, so arbitrary
policies as well in the Guardian series, and they're
smaller, so we should try those first. Yeah. The point
being, putting this together
as a system is an actual good technical step forward
that others can, you know, put their hands on and
feel and build some of their own trust in the
technology versus just reading the blog posts. So I think
that's an encouraging sign that there's some real tech being
released in the open to put some weight behind these
assertions. To me it really feels like an almost fundamental
philosophical question: should AI be used for companionship?
To Chris's point, there are real scenarios where AI as a
companion can provide benefits. But if that's the case, if
we want AI to be able to provide that value, it needs to
be a lot more core and fundamental to how these models and
chatbots are being built. I worry that OpenAI's approach so far, it's great,
like I said, they're doing better things and it's great
that they're talking about it publicly. But it includes stuff
like I noticed in their article that they route to
a safe model if you're in distress, and if not,
then you get to use the, you know, treat adults
like adults model, presumably without guardrails and everything else. And
so I do think it comes a bit philosophical of,
like, how ingrained should these guardrails and safety protocols be?
And are you designing for, like, safe companionship up front,
or is this being tacked on at the end and
like, uh, oh, someone so and so has some red
flags. We're gonna to put them down the, you know,
safety mode bumper rails, like in a padded room so
that they don't hurt themselves, so to speak. Which I
think just feels a little bit, you know, unsafe ultimately,
but isn't how I would want a AI companionship that
my loved ones are talking to if they need support
or help. All right, I'm going to move us on
to our last topic. Just kind of a fun story
that kind of popped up. So Nvidia did a blog
post about a company that they're supporting called StarCloud. And
the premise of StarCloud is data centers in space. And
the CEO has this incredible quote which is, in 10
years, nearly all the new data centers will be built
in outer space. And so let me just kind of
lay out the sort of argument, the pitch of StarCloud,
and then we'd love to kind of get everybody's sort
of takes on it. The idea is, in space you
have a lot of benefits. One of them is cooling,
right? You don't need to use a lot of water
because you're in the vacuum of space. And the second
one is energy, right? You can also generate a lot
of effectively green energy through solar. And when I heard
this concept, I was like, this is a little wild
of an idea. And I guess maybe. Gabe, I'll kick
it over to you first. Are data centers headed to
space? We're doing all of this terrestrial construction on
data centers. Is this a model where, yeah, maybe these
advantages actually do encourage us to put stuff up in the
sky. So I went on a little rollercoaster ride while
I was reading this article. The first thing I thought,
from the title, was, this is absurd. Then we got into the
meat and potatoes of it, and I think the advantages are
legitimate. You're a believer, okay, cool. Well, I'm not at
the end of my rollercoaster ride yet. The advantages are compelling,
right? Virtually unlimited energy, virtually unlimited cooling, what's not to
love? And if they can make the tech work, awesome.
And then of course I came down and thought, huh,
shoot, Nvidia just launched a brand new card; I guess
I'd better scrap that one and start a whole new
data center. Or shoot, one of the devices is failing;
I need to send somebody over to the rack to
pull one out and, ah, damn it, I can't
do that. So the maintenance of this sounds like
an absolute nightmare. And one of the projects I was
adjacent to in my previous gig in defense was actually
around space object tracking, which is really, really hard.
It turns out there is a metric something-load of things
floating around out there, taking up space trajectories,
orbiting the earth, that are really, really hard to track.
And there's always the possibility
that they crash into something or fall to the ground
and cause major damage. So space is not just like
a happy little bubble up there that you can just
toss stuff into. It's already a very,
very crowded physical space that has real implications. So putting
a technology that is obsoleted on a six month basis
into space, that's to me where I see the problem.
You know, the benefits are clear. That's great if we
can actually realize that. But how in the world are
you going to keep up with the pace of innovation
in the physical hardware that you're sticking up there? I
have no idea how that would be feasible. Yeah, I
love the idea that you've got to replace the cards,
so we're just going to ship a guy up there to
replace the cards. I mean, it'd make for a really
cool movie. Part of me is like, wow, that is
just so cool. Yeah, I think it would be cool.
It would just be so cool. But I do wonder
about things like, I'm on the side somewhat
interested in astronomy, and space clutter is a real
thing; the debris fields are growing. I think it's going
to be really fascinating to see how some of the regulatory
environments grow up around that, and particularly
geopolitical control over space, which is obviously all
very immature right now and uncharted territory. So I
think it's more going to be the political difficulties
that will prevent this, or potentially delay it from
becoming more of a reality, than maybe even the physical
or practical ones. But at the end of the day, there's part
of me that's just geeking out a little bit about it.
Like, yeah, that sounds awesome. It's just cool. Chris,
you've got the last word for
this episode. I'm looking forward to the Mark Rober video
where he attaches a GPU to a balloon and sends
it up to space and he beats everybody to the
punch. That's what I'm looking forward to. I mean, Gabe
and Kate covered it all. I think it's a stupid
idea. Go work on making GPUs smaller and better, and make
the models smaller, so we can enjoy them here without much
power usage. Don't make space part of the problem as well.
This was a great episode; everybody was so
spicy today. But that's all the time that we have
for today. So, Kate, Chris, Gabe, great to have you
on the show as always, and thanks to all you
listeners. If you enjoyed what you heard, you can get
us on Apple Podcasts, Spotify and podcast platforms everywhere. And
we'll see you next week on Mixture of Experts. Although
booing like that makes me feel as if I'm watching
the New York Giants at the moment, and I'm just
sort of booing.