
Memory-Centric AI Agent Architecture

Key Points

  • The speaker promises to deliver high‑level, practical insights over the next 10‑15 minutes that will help listeners build believable, capable, and reliable AI agents within the next six months.
  • They emphasize shifting from current stateless LLM applications to stateful ones by embedding persistent memory, which reduces prompt engineering and enables agents to form lasting relationships with users.
  • A brief historical overview shows the progression from early chatbots to retrieval‑augmented generation, then to reasoning and tool‑use capabilities, culminating in today’s discussion of AI agents and their varying levels of “agenticity.”
  • An AI agent is defined as a computational entity that perceives its environment, reasons via an LLM, takes action through tool use, and crucially depends on short‑term or long‑term memory to become reflective, proactive, reactive, and autonomous.

Sections

  • [00:00:00](https://www.youtube.com/watch?v=W2HVdB4Jbjs&t=0s) **Memory‑Driven Stateful AI Agents** - The speaker outlines a roadmap for creating believable, capable AI agents by shifting from stateless, prompt‑heavy designs to memory‑centric, persistent architectures, tracing the evolution from early chatbots through RAG to current reasoning capabilities.
  • [00:03:10](https://www.youtube.com/watch?v=W2HVdB4Jbjs&t=190s) **Memory as the Core of AI** - The speaker emphasizes that both short‑term and long‑term memory, in their various specialized forms, are fundamental to creating reflective, autonomous agents and advancing toward human‑level or superhuman artificial intelligence.
  • [00:06:25](https://www.youtube.com/watch?v=W2HVdB4Jbjs&t=385s) **AI Memory Management Overview** - The speaker outlines the core components of AI memory management (generation, storage, retrieval, integration, updating, and forgetting), emphasizing retrieval's primacy and the role of MongoDB in supporting diverse search methods beyond vectors in RAG pipelines.
  • [00:10:38](https://www.youtube.com/watch?v=W2HVdB4Jbjs&t=638s) **MongoDB Toolbox for LLM Memory** - The speaker describes how storing JSON schemas of tools and conversational data in MongoDB creates a scalable, searchable "toolbox" and memory system for language models, leveraging flexible document storage, various retrieval queries, and a forgetting mechanism.
  • [00:15:12](https://www.youtube.com/watch?v=W2HVdB4Jbjs&t=912s) **Cat Cortex Inspires AI** - The speaker reflects on early stock buying, then describes how Nobel‑winning research on the visual cortex of cats revealed hierarchical edge‑to‑shape processing that later inspired convolutional neural networks, positioning this natural intelligence as the blueprint for their company's secure, rapid AI product development.

Full Transcript

**Source:** [https://www.youtube.com/watch?v=W2HVdB4Jbjs](https://www.youtube.com/watch?v=W2HVdB4Jbjs)
**Duration:** 00:17:38
[0:15] In the next 10 to 15 minutes, here's my promise to you. I'm going to give you some information that will be high level, with some practical component to it. Within the next six months this information will be very relevant, and it will put you in the best position to build the best AI applications, to build the best agents: agents that are believable, capable, and reliable.

[0:48] So we're going to be talking about memory. We're going to be talking about the stateless applications we're building today and how we can make them stateful. We're going to be talking about the prompt engineering we're doing today and how we can reduce it by focusing on persistence. We're going to take the responses in our AI applications and make our agents build relationships with our customers, and all of it is going to be centered around memory.

[1:24] I'm going to do a very quick evolution of what we've been seeing for the past two to three years. We started off with chatbots, LLM-powered chatbots. They were great; ChatGPT came out in November 2022 and exploded. Then we went into RAG: we gave these chatbots more domain-specific, relevant knowledge, and it gave us more personalized responses. Then we began to scale the compute and the data we're giving to the LLMs, and they gave us emergent capabilities: reasoning, tool use. Now we're in the world of AI agents and agentic systems, and the big debate is: what is an agent? What is an AI agent? I don't like to go into that debate, because that's like asking "what is consciousness?" It's a spectrum.
[2:14] The agenticity (and that's a word now, agenticity) of an agent is a spectrum, so there are different levels. I came here and I saw Waymo, and to me it was pure sorcery; we don't have that in the UK. There are different levels of self-driving, and you can look at the agentic spectrum in that respect. We have a minimal agent, which is an LLM running in a loop, great. Then you have level four, the autonomous agent: a bunch of agents that have access to tools and can do whatever they want; they're not prompted in any way, or only minimally. But this is how I see things: it's a spectrum.

[2:55] So what is an AI agent? It's a computational entity with awareness of its environment through perception, cognitive abilities through an LLM, and the ability to take action through tool use. But the most important bit is that there is some form of memory, short-term or long-term.

[3:15] Memory is important. It's important because we're trying to make our agents reflective, interactive, proactive, reactive, and autonomous. And most of this, if not all of it, can be solved with memory.

[3:30] I work at MongoDB, and we're going to connect the dots, don't worry. So this is all nice and good; this is what you see if you double-click into what one AI agent is. But the most important bit to me is... I'll hold the slide, people are taking pictures. Sorry.

[3:47] All right, let's go. The most important bit is memory. And when we talk about memory, the easy way to think about it is short-term and long-term, but there are other distinct forms: conversational memory, entity memory, knowledge, data store, cache, working memory. We're going to be talking about all of that today. So these are the high-level concepts.

[4:06] But let me get a little bit meta.
[4:08] Why we're all here at this conference today is because of AI, right? We're all architects of intelligence. The whole point of AI is to build some form of computational entity that surpasses human intelligence or mimics it. Then with AGI, we're focused on making that intelligence surpass humans in all the tasks we can think of. And if you think about the most intelligent humans, what determines their intelligence is their ability to recall: it's their memory. So if AI, or AGI, is meant to mimic human intelligence, it's a no-brainer (no pun intended) that we need memory within the agents we're building today. Does anyone disagree? Good; I would have kicked you out.

[4:56] Okay, let's go. So, humans: in your brain right now, you have these different forms of memory, and that's what makes you intelligent. That's what makes you retain some of the information I'm going to be giving you today. There is short-term, long-term, working memory, semantic, episodic, and procedural memory. In your brain right now there is something called the cerebellum (I always get the word wrong), and that's where you store most of the routines and skills you can do. Can anyone here do a backflip? Really? Wow. You can see my excitement. The knowledge of that backflip is actually stored in that part of your brain. Or so I heard; 90% confidence on that, by the way. I'm not going to do one, but it's stored in that part of your brain. Now, you can actually mimic this in agents, and I'm going to show you how. But now we're talking about agent memory.
[5:56] Agent memory is the set of mechanisms we implement to make sure that state persists in our AI applications. Our agents are able to accumulate information, turn data into memory, and have it inform the next execution step. The goal is to make them more reliable, believable, and capable. Those are the key things.

[6:25] And the core topic we're going to be working on as AI memory engineers is memory management. We're going to be building memory management systems. Memory management is a systematic process of organizing all the information that you're putting into the context window. Yes, we have large context windows, but that's not for you to stuff all your data into. It's for you to pull in the relevant memory and structure it in a way that is effective, that allows the response to be relevant.

[6:58] So these are the core components of memory management: generation, storage, retrieval, integration, updating, deletion. There's a lie in there, because you don't delete memories. Humans don't delete their memories, except traumatic ones you want to forget. What we really should be looking at is implementing forgetting mechanisms within the memory management systems we're building. You don't want to delete memories, and different research papers are looking at how to implement some form of forgetting within agents.

[7:29] But the most important bit is retrieval. And I'm getting to the MongoDB part. This diagram is RAG. It's very simple, because we've been doing it as AI engineers. MongoDB is that one database that is core to RAG pipelines, because it gives you all the retrieval mechanisms. RAG is not just vectors; vector search is not all you need. You need other types of search, and we have that with MongoDB.
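The memory-management components just described (generation, storage, retrieval, updating, forgetting) can be sketched in a few lines. This is a minimal, illustrative Python sketch only: a plain list stands in for a MongoDB collection, the scoring is naive keyword overlap rather than real vector or text search, and names like `recall_recency` are assumptions, not any library's API.

```python
# Minimal sketch of a memory-management loop: generation/storage, retrieval,
# updating (recall strengthens a memory), and forgetting (decay + threshold).
# A Python list stands in for a MongoDB collection for illustration.

class MemoryManager:
    def __init__(self, decay=0.5):
        self.store = []          # stand-in for a MongoDB collection
        self.decay = decay       # how quickly unrecalled memories fade

    def generate(self, text, now):
        """Generation + storage: turn raw data into a memory document."""
        self.store.append(
            {"text": text, "created_at": now, "recall_recency": 1.0}
        )

    def retrieve(self, query, k=2):
        """Retrieval: naive keyword overlap, boosted by a recency signal."""
        terms = set(query.lower().split())

        def score(doc):
            overlap = len(terms & set(doc["text"].lower().split()))
            return overlap + doc["recall_recency"]

        hits = sorted(self.store, key=score, reverse=True)[:k]
        for doc in hits:          # updating: recalling refreshes the signal
            doc["recall_recency"] = 1.0
        return [d["text"] for d in hits]

    def forget(self, threshold=0.3):
        """Forgetting: decay every signal, drop memories below a threshold."""
        for doc in self.store:
            doc["recall_recency"] *= self.decay
        self.store = [d for d in self.store if d["recall_recency"] >= threshold]
```

In a real system the scoring would be a vector or hybrid search query against the database, but the lifecycle (store, score, refresh on recall, decay, drop) is the part the talk is describing.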
[7:58] Anything you can think of. You're going to be hearing a lot about MongoDB in this conference today. But this is what RAG is, and then you level up: you go into the world of agentic RAG, where you give the retrieval capability to the agent as a tool, and now it can choose when to call on information.

[8:19] There's a lot going on here. I'll send this to you somehow, or you can come to me and I'll LinkedIn it to you. Add me on LinkedIn and just ask for the slides and I'll send them to you: Richmond on LinkedIn.

[8:34] This is memory. MongoDB is the memory provider for agentic systems. And when you understand that we provide the developer, the AI memory engineer, the AI engineer, all the features they need to turn data into memory, to make agents believable, capable, and reliable, you begin to understand the importance of having a technology partner like MongoDB in your AI stack.

[9:03] This is the same image, just a bit more focused on the different memory types. I'm going to skip through this slide because I go into detail later. I'm also going to give you a library: I'm working on an open-source library. I'm ashamed of the name; I was trying to be cool when I came up with it. It's called Memoriz. You can type that into Google and you'll find it. It has the design patterns for all of these memory types that I'm showing you. There are different forms of memory in AI agents, and we'll look at how to make them work. So let's start with persona. Is anyone here from OpenAI? Leave. I'm joking.
[9:44] Well, a couple of months ago they gave ChatGPT a bit of personality. They didn't do a good job, but they are going in the right direction, which is: we are trying to make our systems more believable. We're trying to make them more human, trying to make them create relationships with the consumers, the users of our systems. Persona memory helps with that, and you can model it in MongoDB. If you spin up the Memoriz library, it helps you spin up all of these different memory types. So this is persona memory; this is what it would look like in MongoDB.

[10:32] Then there's the toolbox. The guidance from OpenAI is that you should only put the schemas of maybe 10 to 20 tools in the context window. But when you use your database as a toolbox, storing the JSON schemas of your tools in MongoDB, you can scale, because just before you hit the LLM you can fetch the relevant tools using any form of search. That's toolbox memory, and this is how you model it in MongoDB: you store all the information of your JSON schemas. Now you begin to understand that MongoDB gives you that flexible data model. The document data model is very flexible; it can adapt to whatever data, whatever model, whatever structure you want your data to take. And you have all the retrieval capabilities (graph, vector, text, geospatial, query) in one database.

[11:33] Conversation memory is a bit obvious: back-and-forth conversation with ChatGPT or with Claude. You can store that in MongoDB as well, as conversational memory.
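The toolbox idea above can be sketched as follows: tool JSON schemas stored as documents, with a toy relevance search standing in for MongoDB text or vector search. The schemas follow the common function-calling shape, but the tool names, descriptions, and the scoring function are all hypothetical.

```python
# Sketch of "toolbox memory": tool schemas live in the database, and only the
# most relevant ones are placed in the context window before calling the LLM.
# A list of dicts stands in for a MongoDB collection; the keyword-overlap
# scoring is a placeholder for a real text or vector search query.

TOOLBOX = [
    {
        "name": "get_weather",
        "description": "Fetch the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
        },
    },
    {
        "name": "convert_currency",
        "description": "Convert an amount between two currencies",
        "parameters": {
            "type": "object",
            "properties": {
                "amount": {"type": "number"},
                "from": {"type": "string"},
                "to": {"type": "string"},
            },
        },
    },
]

def select_tools(query, toolbox, k=1):
    """Return the k tool schemas most relevant to the query, so the LLM sees
    a handful of tools instead of the entire toolbox."""
    terms = set(query.lower().split())

    def score(tool):
        return len(terms & set(tool["description"].lower().split()))

    return sorted(toolbox, key=score, reverse=True)[:k]
```

The point of the pattern is that the toolbox can hold hundreds of schemas while the prompt only ever carries the few that the search step surfaces.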
[11:44] And this is what that would look like: a timestamp, a conversation ID, and you can see something there called recall recency, plus an associated conversation ID. That's my attempt at implementing some memory signals, and it feeds into the forgetting mechanism I'm trying to implement in my very famous library, Memoriz. I'm going to go through the next slides a bit quicker because I want to get to the end.

[12:12] Workflow memory is very important. You build your agentic system and it executes certain steps: step one, step two, step three, and it fails. But one thing you can do is treat the failure as experience; it's a learning experience. You can store it in your database. I see you nodding, like, "Yeah." You can store it in your database and then pull it in on the next execution to inform the LLM not to take that step, or to explore other paths. You can store and model that in MongoDB as well, because what you have with MongoDB is that memory provider for your agentic system, and this is what it looks like when you model it; an example of it, anyway.

[12:47] So we have episodic memory, we have long-term memory, we have an agent registry: you can store the information of your agents as well. This is how I do it; you can see the agent has tools, a persona, all the good stuff. There's entity memory as well. So there are different forms of memory, and the Memoriz library is very experimental and educational, but it encapsulates some of the memory implementations and design patterns that I'm thinking about on an everyday basis, that we're thinking about at MongoDB. So, MongoDB: you probably get the point by now, the memory provider for agentic systems.
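As a rough sketch of the conversation-memory and workflow-memory documents described above, here are the shapes they might take, written as Python dicts in the form they could be stored in MongoDB. Field names such as `recall_recency`, `associated_conversation_ids`, and `lesson` echo the talk but are assumptions, not the actual Memoriz schema.

```python
# Illustrative document shapes for two memory types from the talk.

# Conversation memory: one turn of a dialogue, with memory signals attached.
conversation_memory = {
    "conversation_id": "conv_042",
    "timestamp": "2025-05-01T10:15:00Z",
    "role": "user",
    "content": "Remind me which region my cluster runs in.",
    "recall_recency": 0.8,                        # signal for forgetting
    "associated_conversation_ids": ["conv_017"],  # related past threads
}

# Workflow memory: a failed execution recorded as a learning experience.
workflow_memory = {
    "workflow_id": "wf_007",
    "steps": ["fetch_invoice", "parse_totals", "post_to_ledger"],
    "failed_step": "parse_totals",
    "error": "unexpected PDF layout",
    "lesson": "OCR the document before parsing totals",
}

def hints_for_next_run(memories, workflow_id):
    """Pull past failures into the next execution so the LLM can be told
    which steps to avoid or which alternative paths to explore."""
    return [
        m["lesson"]
        for m in memories
        if m["workflow_id"] == workflow_id and m.get("failed_step")
    ]
```

On the next run, the returned lessons would be injected into the prompt alongside the workflow definition, which is the "failure as experience" loop the speaker describes.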
[13:25] There are tools out there that focus on memory management: MemGPT, Mem0, Zep. They're great tools, but after speaking to some of you folks and some of our partners and customers here, there is not one way to solve memory, and you need a memory provider to build your custom solution, to make sure the memory management systems you implement are effective.

[13:52] We really understand the importance of managing data and managing memory, and that's why earlier this year we acquired Voyage AI. They create the best (no offense, OpenAI) embedding models in the market today. Voyage AI's embedding models include text and multimodal models, and there are re-rankers, and this allows you to really solve, or at least reduce, AI hallucination within your RAG and agentic systems. What we're focused on, the mission for MongoDB, is to make the developer more productive by taking away the considerations and concerns around managing different data and all the process of chunking and retrieval strategies. We pull that into the database; we are redefining the database. That's why in a few months we're going to be pulling Voyage AI's embedding models and re-rankers into MongoDB Atlas, and you will not have to write chunking strategies for your data. I see a lot of people nodding; yeah, that's good.

[15:03] So MongoDB is a household name, to be honest. I watched the MongoDB IPO back when I was in university. I bought the stock when I was in university; I only had about £100.
[15:16] I was broke, but we are very focused, and we take it very seriously, on making sure you can build the best AI products and AI features very quickly, in a secure way. MongoDB is built for the change we are going to experience now, tomorrow, and in the next couple of years.

[15:35] I want to end with this. Do you know who these two guys are? Damn. Okay, this is Hubel and Wiesel. They won a Nobel Prize in 1981 for research on the visual cortex of cats. They experimented on cats in ways that probably wouldn't fly now, but back in the '50s and '60s things were a bit more relaxed. They found that the visual cortex, in both cats and humans, works by learning different hierarchies of representation: edges, contours, and abstract shapes. Now, people in deep learning will know that this is how a convolutional neural network works. The research these guys did inspired and informed convolutional neural networks: face detection, object detection. It all comes from neuroscience.

[16:27] So we are architects of intelligence. But there is a better architect of intelligence: nature. Nature created our brains, the most effective form of intelligence (well, more effective than some humans I meet) that we have today, and we can look inward to build these agentic systems.

[16:48] So last Saturday, myself and Tengyu, the chief AI scientist at MongoDB and also the founder of Voyage AI, sat down together; these three guys in the middle are neuroscientists. Kenneth has been exploring the human brain and memory for over 20 years. And over here is Charles Packer.
[17:07] He's the creator of MemGPT, now Letta. We are having these conversations, and once again we're mirroring nature by bringing neuroscientists and application developers together to solve problems and push us along the path to AGI. So that's my talk done. Check out Memoriz, and you can come talk to me about memory. Add me on LinkedIn if you want this presentation. Thank you for your time.