# LLM Limits and Seven Key Use Cases
## Key Points
- LLMs struggle with breaking‑news because they’re trained on static, large‑scale corpora and can’t readily incorporate tiny, fresh pieces of information without a dedicated, up‑to‑date data pipeline.
- Their core design as next‑token predictors makes them ill‑suited for real‑time fact‑checking or staying current with daily events, highlighting a need for systematic, frequent model updates.
- LLMs are poor decision‑makers; they can be easily swayed and often provide advice that reflects statistical patterns rather than sound reasoning.
- Understanding these architectural trade‑offs is essential—treating LLMs as a “magic wand” ignores their genuine strengths (e.g., language generation) and the contexts where they reliably fail.
**Source:** [https://www.youtube.com/watch?v=uS_YX_LGCAY](https://www.youtube.com/watch?v=uS_YX_LGCAY)
**Duration:** 00:15:32

## Sections

- [00:00:00](https://www.youtube.com/watch?v=uS_YX_LGCAY&t=0s) **LLMs Aren't Magic: Breaking News** - The speaker warns that large language models cannot reliably handle real‑time breaking news and must be viewed as specialized token‑prediction tools with distinct strengths and limitations.

## Full Transcript
we are going to talk about three things
that AI is not large language models are
not in particular and then we're going
to cover seven use cases that large
language models are really good at and
the reason why and if you followed my
TikTok at all you know I get really
insistent that we not treat a large
language model as if it was a magic wand
where you could just sort of wave the
magic wand then it will do whatever you
want I think it's important to
understand architecturally what they're
designed to be good at and then what
they're designed to not be good at
because any design makes tradeoffs and
in this case the tradeoffs actually show
us the kinds of jobs that large language
models are actually not very good at as
well as the kinds that they are so let's
get into it so this is the number
one thing they're not good at and this
is the thing that made me think of this
I don't know about you but it feels like
there's been an enormous amount of news
that has broken over the last week or so
in the middle of July large language
models are absolutely terrible at
handling breaking news unless you put
them into some kind of tool chain that's
only looking at a select body of text
that is authoritative that is considered
breaking news they're going to do badly
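The "tool chain" described here is what is usually called retrieval‑augmented generation (RAG): instead of asking the model to recall breaking news from its frozen training data, you retrieve fresh authoritative text and put it in the prompt. A minimal sketch, with toy keyword‑overlap retrieval standing in for a real search index, and a hypothetical `call_llm` function standing in for whatever model API you use:

```python
# Minimal RAG sketch for breaking news. Retrieval here is naive keyword
# overlap; real systems use embeddings or a search engine.

def score(query: str, doc: str) -> int:
    """Count how many query words appear in the document (toy relevance)."""
    doc_words = set(doc.lower().split())
    return sum(1 for w in query.lower().split() if w in doc_words)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k most relevant fresh documents for the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the model in retrieved text and restrict it to that text."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Answer using ONLY the sources below; say so if they don't cover it.\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )

# Fresh, authoritative snippets the base model has never seen:
news = [
    "The city council approved the transit budget on Tuesday.",
    "A heat wave is forecast for the weekend.",
    "The mayor announced a new housing plan this morning.",
]
prompt = build_prompt("What did the city council approve?", news)
# answer = call_llm(prompt)   # hypothetical LLM API call
```

The key design point is the instruction to answer only from the retrieved sources, which is what keeps the model from falling back on stale training data.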
and they're going to do badly because
fundamentally large language models are
designed to handle large bodies of text
they are designed to look across the
entire internet and to
say what is the next token prediction for
a given query based on my understanding
of the entire body of text on the
internet all of the written works of
humanity etc that makes them good at
certain things it's kind of amazing that
we built a machine that can do that but
if you look across all of that and you
have this tiny sliver of text that comes
in that's brand new that's supposed to
be a new authoritative fact llms just
don't know how to handle it and that
actually suggests an inherent weakness
for llms going forward we are
continually adding new facts to the body
of knowledge for Humanity every
newspaper that comes out there's a whole
series of new facts and new events we
need to have a more deliberate way of
upgrading our llms on cadence so that
they are up to date on the facts right now
we don't even have an idea or a concept
of a world where you could open up
ChatGPT in the morning and say oh good I see
that it's read all of the newspapers
from yesterday
it's up to date on the facts let alone
same day
news inherently if your job is to
predict the next token from a large body
of tokens you're going to look at that
large body of tokens you're not really
going to over index on this one tiny
piece of breaking
news and that's what makes large
language models really really terrible
at handling news right now they're just
not good at it another thing that large
language models are really bad at is
making decisions I say this because even
if they give you advice if you ask them
to they're incredibly persuadable
they're actually designed almost to be
mirrors they're designed to respond to
you in a way that you will find
appealing and that makes them really
really bad at actually making hard
decisions so if you ask them to list
pros and cons and to make a decision
they tend to be over optimistic in my
experience and if you ask them to
reverse the decision they tend to be
easily persuadable and reverse the
decision because they're token
predictors they're not thinking in
symbolic logic they don't have any idea
of really what a decision is they're
just predicting the next token and
they're predicting a token that they
think is going to be able to keep the
conversation
going and they're predicting a token
that they think will match the query and
that's that second piece is actually
more important I'm not sure that we
actually know for sure that they are
trying to juice llms so that you keep
talking with them I certainly wouldn't
be surprised given our history with
social algorithms as a tech industry but
for now what we really know is that they
are designed to respond to a query and
your query reveals your own biases your
own opinions and anyone who's designed
surveys can tell you you can write a
survey that will get anyone to tell you
anything it just depends on the kind of
question you ask similarly when a large
language model is being asked a question
it will read all the nuances and detail
in that question the unique human
utterance that you make and will then
respond with a token and a string of
tokens that's designed to match exactly
that question and so it becomes an
intensely suggestible conversation and
that makes it very very bad at
decision-making and this brings me to the last
thing that large language models should
not be asked to do do not ask a large
language model to make a management
decision and yes I am deliberately
tipping my my hat here to azimov because
azimov wrote In The Three Laws uh of
Robotics that a number of things that
robots should not be asked to do and
that got expanded over the years into an
llm should not be asked to make a
management decision so actually while
we're doing this live I am not going to
chat with
ChatGPT but I am going to find out for you
where the rule came from uh that a robot
should not make a management decision
because I think that's super
interesting let me see
here
um it's an IBM presentation from the
1970s see this is why we need to find
our sources I thought it was Asimov and
then I remembered live that Asimov's three
laws of robotics are not about
management they're about
ethics and as I was doing that I
realized I needed to get better sourcing
and so it turns out that this comes from
an IBM
presentation uh IBM of course built some
of the first AI tooling um and I think
the idea is a sticky
one we should not have large language
models making management decisions if
they're bad at decision-making and I
think we sometimes expect
that and you can't have the kind of
business judgment that you need
from an llm it's just not going to work
they're designed to predict what you
give them and so they will be far too
suggestible to make good
choices and we will probably over index
on the choices that they do make by the
way you should check your facts that was
a nice little moment there right like
I'm modeling that go check your
facts don't assume the llm is going to
give you the truth okay we're going to
go to part two those are all things
that large language models are not good
at what are large language
models actually really really good at
what are they designed to be good at
number one they're phenomenal at
synthesizing I can give a large language
model a 50-page document and it will
take a couple of seconds to read it that
is so much faster than a human being I
think we lack categories for it mentally
it still feels like magic to me when
I stick a 50-page document into an llm
and it just digests it like
that and then it synthesizes it and
increasingly it acts as a precision
recall device for it that part has
gotten better it used to be much much
more hallucinatory and because of the
updates in the background to our core
modeling llms are as of this time July
in 2024 much much better at precisely
pinpointing where in the document
something happened and recalling it
accurately and I think that's credit to
the model builders credit to OpenAI and
others who have actually worked to make
sure that that's
true so synthesis is something they do
well they will summarize something for
you they will describe something for you
really effectively so for instance if
you upload an image of a chart or if you
upload an image of an equation or if you
upload an image of a house or a
motorcycle or even a fairly complex
crowd scene large language models are at
the point where they can understand
that scene and describe it really well
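Whether the model is recalling a passage or describing a document, the precision‑recall behavior mentioned above is worth auditing yourself. One cheap check, which is an illustration rather than anything the speaker describes, is to verify that spans the model claims to quote actually occur in the source:

```python
# Verify that spans an LLM "quotes" from a document actually occur there.
# A missing span is a likely hallucination and should be flagged.

def locate_quote(document: str, quote: str) -> int:
    """Return the character offset of the quote in the document, or -1."""
    return document.find(quote)

def check_quotes(document: str, quotes: list[str]) -> dict[str, int]:
    """Map each claimed quote to its offset; -1 flags a likely fabrication."""
    return {q: locate_quote(document, q) for q in quotes}

doc = "Revenue rose 12% in Q2. Headcount was flat. Churn fell to 3%."
claimed = ["Revenue rose 12% in Q2.", "Churn fell to 1%."]
results = check_quotes(doc, claimed)
# The first quote is found at offset 0; the second is not in the source.
```

Exact string matching is deliberately strict: a paraphrase fails the check, which is the point when you need verbatim recall.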
and that is partly because of all the
work that's been done that I haven't
talked a ton about on the image
generation side where we have had
similar advances driven by this idea of
being able to reliably predict patterns
in images and generate reliable images
from a body of images it's a similar
macro motion to what we've done with
text it's just on the image side and so
large language models have sort of put
that together I know when uh ChatGPT-4o
came out one of the big points was that
it actually had a native image
ingest uh more work is going to be done
there but even
now describing images is something that
llms do really really well and you might
think well that's not super relevant but
it is because a lot of what humans have
to do at work involves describing images
if you're looking at a chart to gain
insights you're describing an image
if you are conducting analysis of a
diagram you're describing an
image so there's a lot of pieces of work
that we do that amount to description if
you're writing a how-to guide for a
piece of software you're describing an
image it's just a series of
images okay so they're good at
synthesizing they're good at describing
they're also eerily good at pattern
recognition I would argue that large
language models probably know more about
grammar than any living linguist there
is something
phenomenally effective about the way
they've mastered human
language and that pattern recognition
extends
into asking it to understand patterns in
documents you upload it's just
something they're really good at and I
think that's one area where we're just
scratching the surface of what we can do
with that amazing pattern
recognition
capability number four they're good
companions and whether or not we like
that regardless of how we feel about it
they're really really good at engaging
people in conversation and feeling like
there's someone else on the other side
of the line and that is something that
people are responding to there is a
reason that AI companion apps do so well
in the App
Store like it or not
and that gets me to the next point
they're excellent conversationalists and
I separated those out because a
companion is sort of an emotional
function that an AI provides uh help
with whereas a
conversationalist is someone who can
help you debate or understand something
better yourself so if you want to debate
and understand a particular known corner
of study like if you're trying to
understand an advanced piece of physics
or a piece of chemistry and the concepts
aren't making sense you can have a
conversation with a large language
model and it will help you it will help
you understand it now if you only depend
on the large language model we've
actually done some work there and it
turns out that you don't learn as well
if you're only depending on an llm
because it ends up becoming something
you lean on when you really should be
doing your own critical thinking but if
you need to wrap your head around it the
first time it can be very
effective all right we're getting to the
second to last thing that they're good
at so I said seven right so we have
a good synthesizer a good describer a good
pattern recognizer a good companion a
good conversationalist and then number
six they're a good business writer
they're absolutely phenomenal at
business writing and I differentiate
business writing and literature because
literature requires a degree of
attunement to the lived experience that
llms just aren't good at they they
aren't embodied creatures they don't
understand how to write literature and I have
seen people try and it just falls apart
but they're absolutely amazing at
writing business like if you need to
write a quick update for the boss very
good at it if you need to write even a
one-pager if you have a solid idea and
you can critique it it's good for
drafting so they're good business
writers and the last thing number seven
they're really really good
analysts that means that if you set them
up and prompt them properly they will
actually analyze and assess what is in
front of them very very carefully and I
think it's that attention to detail that
is really helpful they look through with
unfailing attention across the entire
body like if you ask a financial analyst
to look with the same degree of
attention that an llm does across the
same degree of text either it's going to
take forever or they just won't like
they'll skip out and that is something
that we're still getting used to this
idea that you can have an analyst that
is always there that is always at your
fingertips and that is always thinking
things through that's a new one for us
it's like we have a personal analyst at
our
fingertips and I think maybe that's
where I'll sort of wrap this thing in a
conclusion the things that llms are not
good at we seem to expect they'll be
good at and I would say the things that
llms are good at we have a lot of
feelings about as humans we think these
are things that humans should
traditionally be good at ourselves and
we worry that having an llm be good at
them means that somehow we are less and
I would argue that it just means that we
have more time to do other things and it
means that we have more options to do
more interesting versions of those same
tasks I do not miss when an llm writes
business updates and I can like draft
them very quickly from the llm draft and
I'm done in half the time that's not
something that I wish I could do more of
I don't regret having a first pass at
charting
analysis it's great to have a first pass
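The "set them up and prompt them properly" step mentioned for the analyst use case can be as simple as a reusable prompt template that demands cited evidence and an explicit "not answered" escape hatch. A sketch, with wording that is illustrative rather than taken from the talk:

```python
# A reusable analyst-style prompt template. The exact wording is an
# illustration; the point is to demand evidence and allow "no answer".

def analyst_prompt(document: str, question: str) -> str:
    """Build a prompt that pushes the model toward careful, cited analysis."""
    return (
        "You are a careful analyst. Read the entire document below.\n"
        "For each claim you make, quote the exact supporting sentence.\n"
        "If the document does not answer the question, say so.\n\n"
        f"Document:\n{document}\n\nQuestion: {question}"
    )

p = analyst_prompt(
    "Q2 margin fell 2 points on freight costs.",
    "Why did margin fall?",
)
```

Templating the setup this way makes the first-pass analysis repeatable, so the human reviewer only has to check the quoted evidence rather than re-read everything.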
and so I think that one of the things
that we are going to need to get used to
is the idea that maybe we've thought of
software as something that is for
everyone and maybe what llms are
reminding us is that llms can be
personal software can be personal now
and we can have effectively a personal
assistant at our fingertips that can
help us with a lot of these functions
that we're asked to do because if we
work professionally we're asked to
describe we're asked to pattern
recognize we're asked to be
conversationalists we're asked to be
analysts it can help to have someone in
the background who's very good at those
things as long as we don't ask it to do
the things it's not good at like make
decisions or break news all right I will
leave it there I hope that this has been
a helpful breakdown three things that
llms are terrible at and seven things
that they're actually pretty good at I'd be
curious to know uh what you think I
missed in the comments