# Brain vs AI: Shared Architecture, Divergent Power

**Source:** [https://www.youtube.com/watch?v=6l0x4qvrnqI](https://www.youtube.com/watch?v=6l0x4qvrnqI)
**Duration:** 00:06:31

## Summary

- Generative AI can assist everyday tasks—like improving a swimmer’s technique or applying artistic styles—but we must ensure its recommendations remain reliable and “sane.”
- Large language models share brain‑like structures: densely connected “neurons” (feed‑forward layers) akin to the prefrontal cortex, vector databases that function like the hippocampal memory system, and specialized modules (mixture‑of‑experts) comparable to the cerebellum’s task‑specific functions.
- Despite these similarities, the human brain is far more energy‑efficient, using roughly 0.3 kWh, whereas training and running LLMs consume thousands of kilowatt‑hours.
- The brain’s compact size (~1,200 cm³) contrasts sharply with the massive physical footprint of AI hardware, which includes miles of cabling and large GPU clusters.
- Information transmission also diverges: neurons communicate through complex chemical neurotransmitter signaling, while AI systems rely on binary floating‑point operations.
## Sections

- [00:00:00](https://www.youtube.com/watch?v=6l0x4qvrnqI&t=0s) **Brain‑AI Analogies and Risks** - The speaker likens human brain components such as the prefrontal cortex and hippocampus to LLM structures like feed‑forward layers and vector databases, warns that generative AI could produce unreliable outputs when used for everyday tasks like swim‑tech advice or art style transfer, and emphasizes the need to keep AI “sane.”
- [00:03:04](https://www.youtube.com/watch?v=6l0x4qvrnqI&t=184s) **Phased Training, Chain‑of‑Thought, Self‑Learning** - The speaker outlines a two‑stage training pipeline—unsupervised representation learning followed by supervised fine‑tuning—introduces chain‑of‑thought reasoning for transparency, and describes self‑learning via nested chain‑of‑thoughts, mixture‑of‑experts voting, and reinforcement learning to create dynamic, meta‑ground‑truth feedback.
- [00:06:14](https://www.youtube.com/watch?v=6l0x4qvrnqI&t=374s) **Teaching LLMs While Assisting Users** - The speaker notes that using LLMs to help friends with tasks—like drawing dogs or refining a swimming style—simultaneously enables the models to continue learning without destabilizing their internal coherence.

## Full Transcript
Generative AI algorithms, they're rapidly learning new domains.
But as they do so, the big question is, are they going to lose their minds?
Now, say for instance, I have a friend, Ravi.
He has a swim meet coming up and he wants to use a large language model to get hints on how to better his butterfly.
But perhaps these hints aren't the best.
And then I have another friend named Kevin who's working on a dog artwork.
And he wants a different style transferred into that piece.
And he wants to use a generative AI system to help him out.
Now both of these are really good ideas and use cases where we can use generative AI to help us out in our daily lives,
but we need to make sure as we do this that they don't lose their minds.
Well there's a lot of similarity between the human brain and large language models.
Now both of them, they have these neurons that are deeply connected together.
So in the brain, you have the prefrontal cortex and that's responsible for the different types of thinking that we have.
Now over in LLMs, within feed-forward neural networks, you have these densely packed
regions that can propagate forward and infer an output.
Now, the other part is called memory.
So within the human brain, we have what's called the hippocampus.
This is where we store our memories and we have to retrieve information in order to respond to our environment.
Now, LLMs are kind of similar because they use what's called a vector database.
So we can write vectors into it and then pull it out.
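The write-then-retrieve idea described here can be sketched in a few lines. This toy store and its `write`/`retrieve` methods are hypothetical stand-ins for a real vector database, which would use learned embeddings and approximate nearest-neighbor indexes rather than a brute-force cosine scan:

```python
import math

def cosine_similarity(a, b):
    # Standard cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

class ToyVectorStore:
    def __init__(self):
        self.entries = []  # list of (text, embedding) pairs

    def write(self, text, embedding):
        # "Write a vector into it" -- store the memory with its embedding.
        self.entries.append((text, embedding))

    def retrieve(self, query_embedding):
        # "Pull it out" -- return the stored text closest to the query.
        return max(self.entries,
                   key=lambda e: cosine_similarity(e[1], query_embedding))[0]

store = ToyVectorStore()
store.write("butterfly stroke tips", [0.9, 0.1, 0.0])
store.write("dog artwork styles", [0.0, 0.2, 0.9])
print(store.retrieve([0.8, 0.2, 0.1]))  # nearest memory: the swimming tips
```

The embeddings here are hand-picked three-dimensional vectors purely for illustration; a real system would produce them with an embedding model.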
Now the third aspect that's very similar between the two is what we call specialized regions, right?
And we could think about this within the domain of generative AI as a mixture of experts.
Now within the brain, we can also look at the cerebellum,
and the cerebellum helps us with balance and movement and such,
but each of these specialized areas has a certain function that can help us out.
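The specialized-regions analogy can be sketched as a tiny mixture-of-experts router: a gate scores each expert for the incoming query and dispatches to the best one. In real MoE layers the gate is a learned network over token embeddings; the keyword scoring below is a hand-written stand-in:

```python
# Two hypothetical "experts", each specialized for one domain,
# mirroring the cerebellum-style division of labor in the transcript.
EXPERTS = {
    "swimming": lambda q: f"swim expert handles: {q}",
    "art": lambda q: f"art expert handles: {q}",
}

def gate(query):
    # Score each expert for this query. A trained router would compute
    # these scores from learned features, not keyword counts.
    scores = {
        "swimming": query.count("stroke") + query.count("swim"),
        "art": query.count("style") + query.count("draw"),
    }
    return max(scores, key=scores.get)

def route(query):
    # Dispatch the query to the highest-scoring specialist.
    return EXPERTS[gate(query)](query)

print(route("improve my butterfly stroke"))  # goes to the swim expert
```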
So I told you how they're all similar, but now, how is the brain different?
Okay, so one of the differences here is power, right?
So the human brain, it only needs about 0.3 kilowatt hours of energy.
Now an LLM, it needs thousands of kilowatt hours in particular to train it.
And now the other difference between the two is also volume.
Now the human brain takes up only 1200 cubic centimeters.
And then when I compare that just to the cables alone
to put these supercomputers and clusters that have GPUs together,
right, the generative AI part could have miles of cables.
Now the other biggest difference is the way in which each of them pass messages, right?
So there's a complex series of messages.
So one of them is chemical, the other is binary.
So within the brain, we have this complex stew of neurotransmitters that relay messages back and forth,
whereas in generative AI, we have encoded floating points that use ones and zeros to pass that information back and forth.
Now, say my friend Kevin, he's learning how to draw better dogs.
Well, we still need to take some of the similarities and the differences and train these LLMs to help him draw better dogs.
Okay, why don't we jump into it?
So at the foundational level, we can begin this phased training approach.
Now, this training approach is broken down into two different components.
The first one being unsupervised learning,
where you don't provide any labels at all, but the model learns how to represent the data.
And the second component of this is supervised learning.
This is where you do provide the answer,
and then the model can backpropagate the error between the output and the answer to adjust the weights along the gradient.
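The two-stage pipeline above can be sketched with a single-weight toy model so the mechanics stay visible. Stage 1 learns an unlabeled statistic of the data as a stand-in for representation learning; stage 2 backpropagates the output-minus-answer error by gradient descent. Real LLM pretraining predicts next tokens over huge corpora; this only shows the shape of the idea:

```python
data = [1.0, 2.0, 3.0, 4.0]          # unlabeled "corpus"
labeled = [(1.0, 3.0), (2.0, 5.0)]   # (input, answer) pairs for fine-tuning

# Stage 1: unsupervised -- no labels; learn a representation of the data
# (here, simply centering inputs on their mean).
mean = sum(data) / len(data)
represent = lambda x: x - mean

# Stage 2: supervised -- provide the answer, backpropagate the error,
# and adjust the weights along the gradient of the squared error.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    for x, y in labeled:
        pred = w * represent(x) + b
        err = pred - y                # output minus provided answer
        w -= lr * err * represent(x)  # gradient step for the weight
        b -= lr * err                 # gradient step for the bias

print(round(w, 2), round(b, 2))  # converges to roughly w = 2.0, b = 6.0
```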
Now the second area that I wanted to mention is called chain of thought.
This is a step-by-step logical reasoning that can be used to even teach other models.
And this also provides transparency so that we can understand what's happening.
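A minimal sketch of that transparency benefit: instead of emitting only an answer, the solver emits its intermediate steps, producing a trace that a human can audit or another model could learn from. The arithmetic "task" here is a hypothetical stand-in for real model reasoning:

```python
def solve_with_chain_of_thought(a, b, c):
    # Compute (a + b) * c, recording each reasoning step along the way.
    steps = []
    partial = a + b
    steps.append(f"Step 1: add {a} and {b} -> {partial}")
    result = partial * c
    steps.append(f"Step 2: multiply {partial} by {c} -> {result}")
    return steps, result

steps, answer = solve_with_chain_of_thought(2, 3, 4)
for s in steps:
    print(s)          # the visible chain of thought
print("Answer:", answer)  # Answer: 20
```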
Now what you're seeing in the field that's beginning to emerge, it's called self-learning.
Now the self-learning aspect,
this is where we can take a lot of these chains of thought, nest them together, and have a mixture of experts learn them.
So they become experts in their own little area, right?
And what you can do is have each of those experts in the MoE vote,
and the more votes you have for a particular answer, that is going to be the right answer,
and that can be your quasi or meta ground truth that you then can send back right into the network so it learns.
Now this helps the models to branch off and learn even new skills, it can learn new capabilities,
and you'll even begin to see reinforcement learning that's being used within the field as well.
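The voting scheme described above can be sketched directly: each expert proposes an answer, and the majority answer becomes the quasi or meta ground truth that could be fed back into the network. The expert answers below are hard-coded stand-ins for real model outputs:

```python
from collections import Counter

def meta_ground_truth(expert_answers):
    # Tally the experts' votes; the answer with the most votes becomes
    # the quasi / meta ground-truth label.
    counts = Counter(expert_answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes

# Three hypothetical experts weigh in on Ravi's butterfly technique.
answers = ["kick from the hips", "kick from the hips", "kick from the knees"]
label, votes = meta_ground_truth(answers)
print(label, votes)  # majority answer, with 2 of 3 votes
```

In a full self-learning loop, `label` would be sent back into the network as a training signal, as the transcript describes.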
Now you might be thinking that some of these models are losing their mind.
Well, they're really not.
So as these models begin to teach themselves, we really need to be careful that they do produce good results.
Now, we try to use what's called a funnel of trust,
and we can use this funnel to help minimize the hallucinations or incorrect skills
that might be acquired through these three examples that I provided.
Now, one of these areas that I would like to highlight is called a large language model as a judge.
Now, this is where a model itself, it interprets the output of another model,
and if we want to follow what's called the Condorcet jury theorem,
we can stack together lots of these judge models together to create a jury,
right, and say each of these jury members has a better than one-half chance of getting it right,
and you keep adding more and more and more, then the jury's majority verdict is more than likely correct as well.
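The jury-theorem claim can be checked numerically: if each judge is independently correct with probability p > 0.5, the chance that a majority of n judges is correct grows toward 1 as n increases. Independence is the key assumption, and real judge models often share biases, so treat this as the idealized case:

```python
from math import comb

def majority_correct(p, n):
    # Probability that more than half of n independent judges are correct
    # (binomial tail from the smallest majority upward).
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

# With per-judge accuracy 0.6, the jury's majority accuracy climbs with n.
for n in (1, 5, 25):
    print(n, round(majority_correct(0.6, n), 3))
```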
Now the other area is called theory of mind.
And what we wanna do here is ensure that the output of these models,
they match your expectations so that the models understand,
wait a minute, my user or my agent that's trying to use me, they have their own expectations,
and so we want to be able to meet these agent mental models or user
mental models so that the models' outputs begin to align with them.
Now the other area is called machine unlearning.
With machine unlearning, we can begin to remove data in a systematic way.
So we can have this where we can create virtual lesions within an
MoE where one of the experts forgets the data that it was taught, right? This is called selective forgetting.
And this is very powerful in MoEs,
or even during retraining, we can shard and split the data so that a certain skill is no longer trained.
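The "virtual lesion" idea can be sketched at the routing level: in a toy mixture-of-experts, we disable one expert so its skill is never used again. Real machine unlearning edits or retrains weights; flipping a routing flag, as below, only illustrates the selective-forgetting effect:

```python
class ToyMoE:
    def __init__(self):
        # Two hypothetical experts, as in the earlier examples.
        self.experts = {
            "swim": lambda q: "swim advice for: " + q,
            "art": lambda q: "art advice for: " + q,
        }
        self.lesioned = set()

    def lesion(self, name):
        # "Virtual lesion": the expert stays in the model but is
        # never routed to again -- selective forgetting.
        self.lesioned.add(name)

    def answer(self, topic, query):
        if topic in self.lesioned or topic not in self.experts:
            return "skill unavailable (forgotten)"
        return self.experts[topic](query)

moe = ToyMoE()
print(moe.answer("art", "draw a dog"))   # art advice for: draw a dog
moe.lesion("art")
print(moe.answer("art", "draw a dog"))   # skill unavailable (forgotten)
```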
Now you might think that all this is pretty interesting
and it can help my friend Kevin draw better dogs
or can help Ravi begin to understand how to get a different style of swimming for the upcoming meet he might have,
but as we do this, we can begin to help LLMs learn without losing their minds.