# Beyond Chatbots: The AI Agent Spectrum

**Source:** [https://www.youtube.com/watch?v=obqjIoKaqdM](https://www.youtube.com/watch?v=obqjIoKaqdM)
**Duration:** 00:14:38

## Key Points

- The prevailing "Can I use an AI agent for this?" question is misguided because most tasks don't actually require a full-blown autonomous agent.
- AI solutions exist on a spectrum, from basic chat advice to fully autonomous agents, and we need a vocabulary for the intermediate steps.
- The speaker outlines six levels of AI assistance, starting with the "adviser" (simple prompt-based advice) and moving through increasingly interactive stages before reaching a true agent.
- Level 2, the "co-pilot," exemplifies the middle ground where AI offers real-time suggestions (e.g., GitHub Copilot for coding, interview support tools) while the human remains in control.
- Understanding this spectrum helps organizations avoid over-investing in complex agents and choose the cheapest, most effective AI approach for their specific problem.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=obqjIoKaqdM&t=0s) **Beyond Binary: AI Assistant Spectrum** - The speaker outlines a six-level spectrum of AI assistance, from basic chatbots to full agents, to help viewers recognize intermediate, cost-effective solutions and determine when an agent is truly needed.
- [00:04:23](https://www.youtube.com/watch?v=obqjIoKaqdM&t=263s) **Tool-Augmented AI Assistants** - The speaker argues that most teams need an LLM-based, tool-augmented assistant, not a fully autonomous agent, highlighting its cheap, easy deployment, multi-workflow benefits, and the emerging trend of startups acting as plug-in tools within this framework.
- [00:08:06](https://www.youtube.com/watch?v=obqjIoKaqdM&t=486s) **Semi-Autonomous AI in Customer Success** - The speaker outlines how a semi-autonomous AI system can resolve most routine customer-success cases on its own, reserving human intervention for exceptions and creating a scalable, efficient workflow.
- [00:12:38](https://www.youtube.com/watch?v=obqjIoKaqdM&t=758s) **Evaluating AI Task Automation Levels** - The speaker urges shifting from debating AI agents to systematically assessing each repeatable task's frequency, consistency, error impact, data location, and speed requirements to determine its appropriate automation level, recommending a level-three tool as a practical starting point and offering a reusable diagnostic prompt.

## Full Transcript
I am so tired of the AI agent discourse.
Everyone is asking, "Can I use an AI
agent for this?" I'm getting people
asking me, "Can I use an AI agent for
this?" And that is the wrong question to
ask about AI. You probably don't need an
agent for most of the things that you
think you do. And I want to spend this
video talking about the forgotten
spectrum. Okay? We talk about chat, ChatGPT, and we talk about agents. We
don't talk about anything in the middle.
But the proper way to think about this
is that if your problems are on a
spectrum, the solution space in AI is
also a spectrum. It is not binary, but
we mostly don't have a vocabulary for
it. And so I'm going to spend time in
this video walking you through the guide
to agents 101. Not how to build an agent, but the guide to progressing to
agents. The guide to understanding when
you need an agent, the guide to
understanding all the steps in between
before you get to an agent that are
cheaper and easier to implement for you.
So, let's get into it. I've put together six different levels of the actual spectrum of AI assistance here, from the very basic
chatbot all the way through to agent. I
have had experience working on projects
for all of them. I'm dividing it into
six because it's easy to remember. We've
got handy little labels for each of them
so you can remember them. And I'm going
to give you real examples for each of
them. The goal here is for you to come
out with a sense of how to shape a
solution with AI to a problem. So you
are not overinvesting. So when your CEO comes and says, "You know, I was reading on LinkedIn, this guy Nate was talking about agents. We should do agents." No, no. You should actually think through your problem space. You can share this video with him if he gets too excited. So, level one, what is level
one? It's what you're doing already. You
ask AI for advice. You do the work. This
is how the vast majority of people use
ChatGPT. For most people, this is the
free tier. For some people, this is the
20 buck a month tier. For a few people,
they're doing the 20 buck a month or
equivalent on Claude. You can do that
right now, right? And most people think
of this as the most basic version. I'm
not going to spend a lot of time here
because you already get it. We're
calling this the adviser, right?
Basically, the LLM gives you advice. The
value of that advice depends entirely on
your prompt. I've talked a ton about
prompting. We're going to move on. Level
two. This is where we get into new
territory. We don't have words for these in-between levels. And so, we're going to cover four in-between levels before
we get to the fully autonomous agent
stage. Level two, co-pilot. Pay
attention here. AI will suggest as you
do the work. So, GitHub Copilot can
write code while you type. Cluely is going to give you answers as you interview. Someone called it the Cluely
stare where you kind of stare off into
space for 3 seconds and then give a
perfect five paragraph answer. So, that
is what the co-pilot stage is. It's
becoming a really common pattern. And
here's what it's good for. It's good for
repetitive tasks that have known
patterns. It's like tab complete in
cursor kind of thing. It can get you
going 40 or 50% faster if the patterns
are super repetitive and you know what
they are. And so that might be good
enough, right? The co-pilot piece, where you have enough repeated patterns and you don't really need to do anything more than that. You just need something to help you find an extra gear in your own productivity. Great. That's what that co-pilot level is for. You are still the one driving. In GitHub Copilot, you are still the one framing up the code; in Cursor, if you're hitting tab complete, you're starting the line, right? Level three, it gets more interesting. This is a tool-augmented assistant. I do a lot of work teaching people about level three because I think it is one of the most significant jumps, the one that gets the "oh my gosh" reaction. Think of it as a curve of value: going from level one to two to three, what kind of value pop do you get? The jump from co-pilot to tool-augmented assistant is absolutely massive because it is multiplied by the number of tools that your chat assistant can get access to.
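To make that concrete, here is a toy sketch of the level-three pattern: a chat loop that can dispatch to registered tools. The tool names, the `pick_tool` heuristic, and the stub responses are all invented for illustration; a real assistant would use a model's function-calling API and real integrations instead.

```python
# Toy level-3 "tool-augmented assistant": an LLM-style loop with a tool registry.
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]

# Hypothetical tools; real ones would be web search, spreadsheets, MCP servers, etc.
TOOLS = {
    "calculator": Tool("calculator", "evaluate arithmetic",
                       lambda q: str(eval(q, {"__builtins__": {}}))),
    "web_search": Tool("web_search", "search the web",
                       lambda q: f"(stub) top results for: {q}"),
}

def pick_tool(request: str) -> Optional[str]:
    # Stand-in for the model's tool-selection step.
    if any(ch.isdigit() for ch in request):
        return "calculator"
    if request.lower().startswith("find"):
        return "web_search"
    return None

def assistant(request: str) -> str:
    name = pick_tool(request)
    if name is None:
        return f"(chat-only answer to: {request})"
    return TOOLS[name].run(request)

print(assistant("2 * 21"))            # routed to the calculator tool
print(assistant("find MCP servers"))  # routed to the web-search stub
```

Swapping the `pick_tool` stub for a real model's tool-choice step is, in essence, the whole jump from level one to level three.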
And so there's almost no end to the
value you can get here. I find a lot of
people think they're at this level one
chat advisor thing and they think they
want agents, but they don't. They
actually just want a tool-augmented assistant. And when they get one,
they're like, "Oh, wow. I had no idea it
could do this." Right? It can use Excel.
I had no idea. Right? And so this is for
a chat that can access data. It can
search the web. It can run calculations.
It can build assets. It can edit assets.
You can save your team dozens of hours a
week properly using these. And I'm just
going to say this is 10 times, 100
times, maybe a thousand times easier
than an enterprise agentic system to
install. It's so much cheaper, it's not
even funny. But people sleep on it
because it's not an agent. Well, it is
an agent. It's an LLM plus tools plus
the guidance you give. But people expect
agents to be like, you know, this completely autonomous Borg-like thing that goes and uses tools. Most of the time, what most individuals and teams actually benefit from is this level-three tool-augmented assistant. If they could properly implement a tool-augmented assistant for finance workflows, for marketing workflows, for product workflows, they'd go so far. And
you know what's interesting is
increasingly entire startups are
becoming tools inside this framework.
You can call an MCP server that has ChatPRD as a product person. It will just be there. It becomes part of a tool-augmented assistant in Cursor. Super easy. And the way we tend to think about
tools is limited when you have a world
where anything can be a tool. An LLM can
be a tool itself. You can have an LLM
call another AI. And so this is a very
powerful level. It gets you a lot of
bang for the buck, but it's not the last
one. If you find that you have types of
problems that are beyond the repetitive task, so beyond the co-pilot task, and they're not susceptible to just calling tools, which is really that level three, then you need more structure. That's when you get into a structured workflow. That's
when things start to get serious. Oftentimes in these cases, AI will do a step,
the human will review, AI will continue.
This is choreographed work. And so in this case, the example I have is that JP Morgan wrote up a case study on this.
It's a contract system. It saves an
absurd number of hours a year, but
that's really a function of their scale.
People often look at these big numbers.
I think for JP Morgan, it's a third of a
million hours saved. Okay, great. But
like that's a function of them being a
big company. It's not really the AI
there. The AI savings comes from good
design around the problem space. And in
this case, they recognized that what
they needed was not the ability to call
tools per se. They needed the ability to
structure a workflow because in
contracts review, it's got to be the
same thing. You have high liability.
It's got to be exactly correct. And so
the AI can do a step, but the human
needs to review. There are a lot of back-office business operations that fall into level four. And no, we're not done,
right? Like there's still more autonomy
that we can get to here. And people
again sleep on this piece because they
think, well, the goal should be AI
should do it all. Well, not necessarily.
Like if you're saving a third of a
million hours a year, I'd say you're
doing pretty good, right? If you're
saving a ton of time and your humans are
able to do the work and touch the work
in the right ways. This comes back to
something I've been emphasizing that we
forget in the agentic AI revolution,
which is that your goal when you are
designing AI systems at work should be
for your best humans to touch the work
more, not less. Your best humans should
be more fingertippy, more hands-on with
the work. They should not be feeling
more disconnected. And this is a way to
do that. There are lots of other ways to
do it, but I don't want you to forget
that principle. And I think we sometimes
do with AI agents. We sometimes think,
well, we're just going to sit back.
We're going to get the piña coladas out and we're just going to... No, no, we're not. We remain engaged with the work. We
remain focused on making our best impact
as a human while AI slides in around us
as like a mech suit. And there's a bunch
of ways to do that as we're discovering.
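The level-four choreography described above, where AI does a step, a human reviews, and the AI continues, can be sketched like this. The step names and the auto-approving reviewer are stand-ins, not details from the JP Morgan system.

```python
# Toy level-4 structured workflow: bounded AI steps with human review gates.
from typing import List

def ai_step(step: str, doc: str) -> str:
    # Stand-in for an LLM performing one bounded step of the workflow.
    return f"{doc} -> [{step}]"

def human_review(doc: str) -> bool:
    # In a real system this pauses for a reviewer; here we auto-approve.
    return True

def run_workflow(doc: str, steps: List[str]) -> str:
    for step in steps:
        doc = ai_step(step, doc)
        if not human_review(doc):
            raise RuntimeError(f"reviewer rejected output of step: {step}")
    return doc

# Hypothetical contract-review steps, chosen only to illustrate the shape.
print(run_workflow("contract.pdf",
                   ["extract_clauses", "flag_risks", "draft_summary"]))
```

The point of the pattern is that autonomy is bounded per step: the AI never advances past a gate a human has not approved.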
Let's move to level five. Let's say you
don't need the structured workflow. You
actually need some degree of autonomy
where the human isn't reviewing. We call
this semi-autonomous. Surprise,
surprise. The AI will handle routine cases independently. Humans will review exceptions and edge cases. Super popular
in customer success. You can find lots
of examples of this going wrong and
going right. But by and large, the nice
thing about customer success is each
individual case fits within a spectrum
of customer utterance or customer
frustration. And you can really cleanly
map it at scale to well, the AI can take
care of 98% of cases with these 15
workflows. And we can sort of use an
engineering team and build that out and
then the remaining 2% the humans will
do, right? It becomes super clean at
scale because of the way human
complaints about products map onto
typically a fairly normal distribution.
And so semi-autonomous systems are a
good fit here because you basically give
the AI the ability to solve the problem,
the tools to solve the problem, the
workflow guidance to solve the problem.
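A toy version of that level-five routing might look like the following. The workflows, confidence scores, and the 0.90 threshold are invented for illustration; a production system would use a trained classifier or an LLM over real ticket data.

```python
# Toy level-5 semi-autonomous triage: auto-resolve routine cases, escalate the rest.
from typing import Tuple

def classify(ticket: str) -> Tuple[str, float]:
    # Stand-in for an LLM classifier returning (workflow, confidence).
    known = {"refund": 0.97, "password": 0.99}
    for keyword, confidence in known.items():
        if keyword in ticket.lower():
            return keyword, confidence
    return "unknown", 0.30

def handle(ticket: str, threshold: float = 0.90) -> str:
    workflow, confidence = classify(ticket)
    if confidence >= threshold:
        return f"auto-resolved via {workflow} workflow"
    return "escalated to human queue"

print(handle("I need a refund for my order"))        # routine case
print(handle("The app crashes when I rotate it"))    # exception, goes to a human
```

The threshold is the dial the speaker describes: push it up and more cases reach humans; tune the workflows well and the AI absorbs the routine 98%.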
You build an agentic pipeline where it
can read and respond and you're in
business. And now we're getting to the
world where people are starting to think
of this as a real agent, quote unquote.
I hate that phrase, but people do talk about it as, "I want a real agent," right? I've had that. No, you don't. You want
real answers to business problems. And
if you don't, you're probably not asking
the right question and you're probably
not going to make it. You need to be
asking, "How can I solve this the right
way? How can I put my best humans more
in touch with the work? And oh, by the
way, is AI the right tool for the job?"
And there are so many ways that AI is
the right tool for the job. And we are
sleeping on that. We just think it's
either the chat or the agent. Now,
finally, we get to level six, fully autonomous. The AI does everything. This
is what people think AI agents are,
right? The humans monitor the metrics.
Okay, you can do that. It does work.
Some production systems are out there
that do that really well. But the deal
is this. You only need to do that if you
have compelling reasons why human
touches aren't relevant here, right, like why you need to automate the entire thing. One of the classic examples that
fast food continues to sort of go after
is the drive-through window, right? In
that case, you either are paying someone
to be at the drive-through window or you
are not. It is binary from a labor
perspective. And so, you really need the
AI to take over the entire thing. And we
have a number of cases where systems
have tried to do that and that hasn't
worked out. McDonald's has a case. I
think Taco Bell has a case. It is a
tough problem. And that is something
that you should think about when you get
to level six. Fully autonomous is a hard
problem and the last 2 or 3% of those
edge cases is extremely difficult and
takes a lot of investment to get over.
And so if your goal is fully autonomous
for everything, you should be thinking
actively about what your definition of
the full scope of the problem is because
if you can possibly scope it down, an
example would be we will handle almost
all of our customer queries on our shoe
site, but for 2 or 3% of them, we're
going to actually degrade it and say,
"Thanks, this is a ticket so a human can
look at it." That's almost fully
autonomous, but it's not quite. And it's sort of that in-between of level five and level six, where the AI decides to chat and manages a conversation but then sometimes goes to a human. The fully
autonomous bar is really hard. Amazon
tried to conquer it when they tried to do self-checkout in their little stores where you just pick it up and
walk out. It turned out they never got
there and that was with Amazon
resources. I'm not saying that it's
impossible to get there. I'm calling out
that this is a spectrum and as a good
system designer, a good solution
designer, as someone who cares about how
AI is is implemented, you should be
thinking, do we have to go all the way
to fully autonomous or can we design
something that is going to give us
almost all of the value if it doesn't
take that much investment. Another
example of how complex fully autonomous
is, we cannot roll out Waymo cars, self-driving cars, to every city like a rubber-stamp effort. Even though we have
them and they are fully autonomous in a
few cities, they have to relearn every
single city. Despite all of the training
on roads, we have to teach them the new
city map in detail. That's where we're
at right now. Fully autonomous is really
hard. So, here's what I want you to do.
I want you to stop asking, should we
build agents? And I want you to start
asking, what level does this specific
task need to be at? Think about a task
that you do repeatedly. How many times
is it done per month? How consistent is
it? What happens if there's an error?
Where does the data live? How fast does
it need to happen? These questions are
not random. They're actually the
questions you need to answer to give you
a sense of your level. And so, look, if you're not sure, I'm just going to tell you, I talked about this level-three tool-enabled chat assistant for a
reason. Most people end up at level
three for a lot of things. There's a lot
of other options on that spectrum. I
gave you a lot of range, but level three
is where a lot of people hang out. And
the thing is, if you are unsure, pick a
level you can try yourself that you
don't need stakeholder approval for and
see if it makes your workflow better.
See if you feel more empowered. Now, if
you want to go through the whole thing,
I absolutely did 100% build a prompt for
this. I built a prompt to help you
diagnose where you are at for given
tasks. So, this is not designed to be a one-and-done prompt. This is designed to
be a prompt you save and then you run
when you have a task where you're like,
is it an AI task? If it's an AI task,
what level is it at? I need some help.
You kind of get my sort of innate brain
on that. If that's something you're
interested in, you know where to find
it. And if it's not, please, for the
love of all of AI and all of your own
work, do not assume that it's just chat
and just agents. Please think about it
as a spectrum. I hope this video has
helped you to see that. Share this with someone who needs to see the problem more widely, because I think so many of the bad use cases we see in AI, the doom stories, the terrible implementations, come down to people not understanding this: that really it's not a light
switch. It's a spectrum of AI
implementations and you can talk about
them. You can develop the vocabulary for
them and you can make better choices.
So, here's to better AI implementations
that don't suck.
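As a final illustration, the diagnostic questions above (how often the task runs, how consistent it is, what an error costs, whether it must happen in real time) can be folded into a rough heuristic. Every threshold here is made up for the sketch, it is not the speaker's actual rubric, and it skips the data-location question for simplicity; his downloadable prompt covers far more nuance.

```python
# Toy heuristic mapping the diagnostic questions to a suggested level (1-6).
def suggest_level(runs_per_month: int, consistency: float,
                  error_cost: str, needs_realtime: bool) -> int:
    if runs_per_month < 5:
        return 1  # rare task: just ask for advice (the adviser level)
    level = 3     # tool-augmented assistant as the default starting point
    if consistency > 0.8 and runs_per_month > 100:
        # High-volume, highly consistent work can move toward autonomy,
        # but costly errors keep a human in the review loop (level 4).
        level = 5 if error_cost == "low" else 4
    if needs_realtime and error_cost == "low":
        level = 6  # only full autonomy removes the human from the loop entirely
    return level

# A high-volume, high-stakes task lands at the structured-workflow level.
print(suggest_level(runs_per_month=500, consistency=0.95,
                    error_cost="high", needs_realtime=False))  # -> 4
```

Notice how hard it is to reach level six here: it takes volume, consistency, a real-time requirement, and cheap errors all at once, which mirrors the video's point that full autonomy is rarely the right target.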