The Memory Problem in LLMs
Key Points
- Large language models, despite their intelligence, have extremely limited short‑term memory (only a few minutes or ~200 k tokens), which hampers their usefulness for longer, contextual tasks.
- Scaling memory to meet current user volumes (≈125 M daily active users of ChatGPT) would cost on the order of half a trillion dollars, making affordable long‑term memory (months or years) a major technical and economic challenge.
- This memory constraint forces us to reconsider which problems are feasible for LLMs and highlights the need for breakthroughs in memory architectures as a next critical research focus.
- The reliance on highly capable but memory‑limited AI partners may already be influencing how people think, communicate, and remember information, raising important societal and educational implications that deserve more discussion.
**Source:** [https://www.youtube.com/watch?v=BhjtZP4T0oA](https://www.youtube.com/watch?v=BhjtZP4T0oA)
**Duration:** 00:03:07

Sections
- [00:00:00](https://www.youtube.com/watch?v=BhjtZP4T0oA&t=0s) **LLM Memory Limitations Crisis** - The speaker warns that large language models' extremely short token memory is a dominant, costly issue, potentially requiring over half a trillion dollars to provide multi‑month memory for hundreds of millions of users, raising doubts about their practical usefulness despite growing intelligence.

Full Transcript
Today I want to talk very briefly about the memory problem for large language models. I believe this is going to be one of the dominant issues we need to discuss in 2025.

At the end of the day, large language models are becoming very intelligent, but they still have atrociously bad memory. A memory of 100,000 to 200,000 tokens is like having a PhD in your pocket that forgets a conversation from 10 minutes ago. It's not working well, and the problem is that if you do the math on the cost of memory, there is no easy solve given our current solution architectures.

I did the math (I wrote a Substack on this): it would take over half a trillion dollars to solve this problem just at the current daily active user count for ChatGPT, which is roughly 125 million. And that's growing all the time, so the problem is getting worse all the time. That's also assuming you don't want human-level memory, which lasts years; it assumes you would be happy with long-term memory that lasts several months.

I don't even know if we would be happy with that, to be honest with you, but it would still be vastly better than just a few minutes of memory, which is really what we have today. If the chat is going well, I burn through chats with Claude in about 10 minutes. Claude's great, but that's how long Claude lasts, and then we're done.
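For scale, the half-trillion-dollar figure implies a striking per-user number. A quick sanity check using only the two figures quoted above (both the dollar total and the user count are the speaker's estimates, not independently verified):

```python
# Back-of-envelope check of the numbers quoted in the talk.
# Assumptions (from the transcript, not verified here):
#   - total cost for multi-month memory: over $0.5 trillion
#   - ChatGPT daily active users: roughly 125 million

total_cost_usd = 0.5e12      # half a trillion dollars
daily_active_users = 125e6   # ~125 million users

cost_per_user = total_cost_usd / daily_active_users
print(f"~${cost_per_user:,.0f} per daily active user")  # prints ~$4,000
```

Roughly $4,000 per daily active user, before any growth in the user base, which helps explain why the speaker sees no easy solve under current architectures.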
I think one of the things we need to ask ourselves is: if you have this much intelligence, but it has this much short-term memory, what kinds of problems are useful to solve with that kind of intelligence, and what kinds of problems are we inherently limited from solving because memory is an issue? Even with o1 Pro's 200,000 tokens, you have very limited memory. Now, people are doing incredible things with it, so I'm not saying you can't find really cool problems to solve; you absolutely can.
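The "few minutes of memory" described above falls out of the fixed context window: once a conversation exceeds the token budget, the oldest turns are simply no longer visible to the model. A minimal illustrative sketch of that mechanic (not any vendor's actual implementation; the characters-per-token ratio is a rough rule of thumb):

```python
def trim_to_window(turns, max_tokens=200_000, chars_per_token=4):
    """Keep only the most recent turns that fit in a fixed token budget.

    Illustrative only: real tokenizers count tokens exactly; here we
    approximate with a crude characters-per-token ratio.
    """
    kept, used = [], 0
    for turn in reversed(turns):               # walk newest-to-oldest
        cost = len(turn) // chars_per_token + 1
        if used + cost > max_tokens:
            break                              # everything older is "forgotten"
        kept.append(turn)
        used += cost
    return list(reversed(kept))                # restore chronological order

# A tiny budget makes the effect obvious: the long early turn is dropped.
history = ["old " * 100, "recent"]
print(trim_to_window(history, max_tokens=20))  # prints ['recent']
```

However large the window, the same cliff exists; it just arrives later in the conversation.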
But it is making me wonder: is memory the next breakthrough we need to be looking for? And if it is, I don't see anything on the horizon that helps us solve it right now. I think that's probably worth talking about a lot more than we currently do, especially if we're using LLMs all the time and they have very short-term memories.

Is that going to affect the way we remember things? Does that change and shape the way we remember things? There are anecdotal stories coming out now of people changing their vocabulary and changing their thinking because of the way they interact with LLMs, especially early on, at formative stages in education.

If that continues, and we get used to working with these thinking partners that are very smart but have very, very limited short-term memory (like it has read everything in the world but cannot remember your conversation from 20 minutes ago), does that shape the way we remember things too? Maybe for good, maybe for ill. Maybe we're the ones who have to get better at remembering, because our thinking partner can't. I don't know, but to me it's one of the most interesting dynamics in large language models right now, and I think it deserves to be talked about more than it is. I wrote a Substack on that if you're interested; otherwise, enjoy the YouTube.