# Search‑Driven Tool Calling in LLMs

**Source:** [https://www.youtube.com/watch?v=pUUzXimhUuA](https://www.youtube.com/watch?v=pUUzXimhUuA)
**Duration:** 00:09:55

## Summary

- Effective research hinges on search, so multi‑agent systems must embed a robust search step to gather and refine information before answering.
- Large language models (LLMs) cannot retrieve real‑time data on their own; they rely on **tool calling**, where the LLM requests external services (web, databases, search APIs) defined as named tools with input specifications.
- In a tool‑calling workflow, the LLM sends a message plus a tool name to an application, which routes the request to the appropriate service, returns the results, and the LLM incorporates them into its response.
- Common pitfalls include **hallucination** (the LLM invents a non‑existent tool) and **poor tool selection** (choosing the wrong service, e.g., web instead of a database), problems that persist even with well‑designed tool sets.
- While many open‑source frameworks simplify defining search and web tools, careful design and validation are essential to mitigate these errors and ensure reliable research‑oriented LLM behavior.

## Sections

- [00:00:00](https://www.youtube.com/watch?v=pUUzXimhUuA&t=0s) **Search as Core of Tool‑Calling** - The passage explains that effective research in multi‑agent systems hinges on a central search step, requiring LLMs to use tool‑calling mechanisms that invoke external APIs or databases to retrieve up‑to‑date information.
- [00:03:11](https://www.youtube.com/watch?v=pUUzXimhUuA&t=191s) **LLM Tool-Calling Pitfalls** - The speaker explains the typical request‑routing flow for LLMs and then points out major drawbacks of current tool‑calling approaches—hallucinated tool names, poor tool selection (choosing the wrong database or API), and the added complexity that can cause incorrect or missing data.
- [00:06:16](https://www.youtube.com/watch?v=pUUzXimhUuA&t=376s) **MCP: Uniform AI Integration** - The speaker explains the Model Context Protocol (MCP), a standardized client‑server framework that lets LLMs reliably connect to external tools and knowledge sources, simplifying integration, enabling plug‑and‑play connectivity, and improving trustworthiness.
- [00:09:30](https://www.youtube.com/watch?v=pUUzXimhUuA&t=570s) **Standardized Search with MCP** - The speaker highlights how protocols like MCP are rapidly streamlining the integration and scaling of search tools for developers and data scientists, stressing that thoughtful review of search strategies is essential to fully exploit AI-driven research.

## Full Transcript
You literally can't spell the word research
without the word search.
And in a multi-agentic system
that imitates the way that we naturally perform research,
we start by defining our research objective,
then we make a plan,
then we gather the information
and perform a search over different data sets,
then we refine
based on the data that we gather
and in light of the plan,
and finally we generate or respond
with an answer.
You can see that search
is actually at the heart of this process.
And so careful consideration must be given to its implementation.
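The loop the speaker walks through (objective, plan, gather by searching, refine, respond) can be sketched as a tiny pipeline. This is only an illustration; every function name below is a placeholder invented for the sketch, not part of any framework mentioned in the talk.

```python
# Sketch of the research loop: objective -> plan -> gather (search) ->
# refine -> respond. All names here are invented placeholders.
def define_plan(objective: str) -> list[str]:
    # "then we make a plan": break the objective into search steps
    return [f"search for background on {objective}",
            f"search for recent work on {objective}"]

def gather(plan: list[str]) -> list[str]:
    # "gather the information and perform a search over different
    # data sets": every plan step triggers a search
    return [f"results for: {step}" for step in plan]

def research(objective: str) -> str:
    plan = define_plan(objective)
    findings = gather(plan)
    # "then we refine, based on the data that we gather and in light
    # of the plan": keep only findings relevant to the objective
    refined = [f for f in findings if objective in f]
    # "finally we generate or respond with an answer"
    return f"answer synthesized from {len(refined)} findings on {objective}"
```

Search sits at the heart of the sketch just as it does in the talk: the plan exists only to drive searches, and the answer is generated only from what those searches return.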
To understand how search agents are implemented today,
we first have to revisit the idea of tool calling.
Now remember,
LLMs by themselves can't search.
They're not connected to the internet.
They're not connected to databases or search APIs,
so they don't have access to retrieve real-time relevant information.
So tool calling is the process
by which the LLM invokes these services,
whether it's the web,
or a database,
or an API,
by way of an application
that exposes these services as tools.
Each of these tools will have a name,
they'll have a definition,
a tool definition,
and of course, they'll have some specification
of the expected input.
So in tool calling, the LLM
calls this application,
and then this application connects to these services.
And you can think of this database, for example,
and this search API as
the way that you could define or provide
custom knowledge sources to your tool
calling agent or tool calling LLM.
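The tool definitions the speaker describes, a name, a definition, and an expected-input specification, are commonly written as JSON-Schema-style entries. A minimal sketch; the tool names and schemas below are invented for illustration, not taken from any particular framework:

```python
# Illustrative tool registry: each tool has a name, a description, and a
# schema for the expected input. These entries are invented examples.
SEARCH_TOOLS = {
    "web_search": {
        "description": "Search the public web for up-to-date information.",
        "input_schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    "customer_db": {
        "description": "Look up records in an internal customer database.",
        "input_schema": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
        },
    },
}

def describe_tools() -> str:
    # Render the registry the way an application might present it to the LLM
    return "\n".join(f"- {name}: {spec['description']}"
                     for name, spec in SEARCH_TOOLS.items())
```

The database and search-API entries are exactly the "custom knowledge sources" the speaker mentions: the registry is how the application tells the LLM what it can call.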
Now how the LLM
does this is, in response to a query,
it sends a message
and also a tool name to the application.
Then the application formats and routes this request
to the appropriate service, which comes back
with data to the LLM,
and then the LLM uses that data to generate an answer.
In the
context of our research system, that will be
"here's what I found in response to this query or that query."
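The routing flow just described can be sketched as a small dispatcher. The service functions and tool names here are stand-ins invented for the sketch; a real application would call actual web, database, or search-API services.

```python
# Hypothetical dispatcher: the LLM emits a tool name plus a message, the
# application routes the call to the matching service and hands back data.
def web_search(query: str) -> str:
    return f"web results for {query!r}"         # would call a real web API

def db_lookup(query: str) -> str:
    return f"database rows matching {query!r}"  # would query a real database

SERVICES = {"web_search": web_search, "db_lookup": db_lookup}

def route_tool_call(tool_name: str, message: str) -> str:
    service = SERVICES[tool_name]  # route the request to the chosen service
    data = service(message)        # the service comes back with data
    # The LLM would now use this data to generate
    # "here's what I found in response to this query"
    return data
```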
Now tool calling is very common today,
and several open source frameworks make it easy
to define these search tools
and web tools in your app.
But there are a couple of issues with this approach.
So for example,
this LLM can hallucinate.
So in response to a query, when it generates this message
and this tool call, it could just make up a tool name for a tool that doesn't exist.
And so: no tool,
no data.
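One way an application can blunt this failure mode, sketched here under the assumption that it keeps a registry of known tool names, is to validate the name the LLM emits before dispatching anything. The names below are illustrative:

```python
# Guard against hallucinated tool names: reject any name that is not in
# the application's registry instead of silently returning no data.
KNOWN_TOOLS = {"web_search", "db_lookup", "search_api"}  # illustrative names

def validate_tool_call(tool_name: str) -> str:
    if tool_name not in KNOWN_TOOLS:
        raise ValueError(f"LLM requested unknown tool: {tool_name!r}")
    return tool_name
```

Raising an explicit error lets the application retry or re-prompt the LLM, rather than letting "no tool, no data" fail quietly.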
The next thing that could happen
is what I like to call poor selection.
This may also be a function of the LLM,
where some LLMs are better than others at function calling,
and so the LLM might actually choose the wrong tool
for a particular query.
So it may go to the web in order to fetch information when actually
it should have gone to a database or a search
API for that query.
But other times it may be out of the LLM's control.
So imagine if you
had a situation where you had multiple databases
and multiple search APIs defined in your application.
Then even the best LLM for the job
could accidentally choose the wrong tool.
And therefore get the wrong data.
And then one last drawback of
this approach is complexity.
What do I mean by that?
Well, as a data scientist or developer,
you are responsible for defining this application.
And so if the service provider changes some of the APIs
that underlie the tool definitions in your application,
then it can cause the whole application to break, therefore
causing your research agent to break.
And so the complexity is around
app maintenance.
Now enter
model context protocol.
Or MCP for short.
Think of MCP
as a standardized connector for LLMs,
much like how REST standardized
how we make calls to web APIs.
Just as REST offered a uniform way to interact with web services,
MCP provides a consistent method for integrating
AI models with external tools and knowledge sources.
So with MCP, you have
the concept of an MCP client,
where your LLM interacts
with the external services
through the client.
Then you have the concept of an MCP server.
And this MCP server connects
to external services.
Now the kicker here
is that the MCP server
that connects to the external services
is managed by the service provider.
And then on top of this,
the LLM connects through
the client.
It simplifies
the integration that we have over here in several key ways.
So the first is that you have a uniform interface.
Developers no longer need to create custom integrations for each tool.
MCP provides a standardized method
for connecting to various external services.
It's also plug and play.
So similar to how USB-C offers universal connectivity for multiple
devices, MCP allows AI models
to connect seamlessly and to switch between services, and that uses, or
takes advantage of, a feature of the protocol between the client and the server
called reflection.
And then this all leads to trustworthiness.
With MCP, LLMs are less likely to hallucinate
or choose the wrong tool, because now there's a standard way
and protocol to parse all the tools
via the client server connection.
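The discovery step the speaker calls reflection can be sketched concretely: MCP messages are JSON-RPC 2.0, and a client asks a server to enumerate its tools with a `tools/list` request. The server below is an in-memory stand-in for illustration; a real MCP server is run and maintained by the service provider, exactly as the talk describes.

```python
import json

# Sketch of MCP tool discovery ("reflection"): the client sends a JSON-RPC
# "tools/list" request, and the server advertises its tool catalogue.
# stand_in_server is a fake; a real server belongs to the service provider.
def make_list_tools_request(request_id: int) -> str:
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": "tools/list"})

def stand_in_server(raw_request: str) -> dict:
    request = json.loads(raw_request)
    if request["method"] != "tools/list":
        raise ValueError("unsupported method")
    # A real server would return its actual tool catalogue here
    return {"jsonrpc": "2.0", "id": request["id"],
            "result": {"tools": [{"name": "web_search"},
                                 {"name": "customer_db"}]}}

response = stand_in_server(make_list_tools_request(1))
tool_names = [tool["name"] for tool in response["result"]["tools"]]
```

Because the client learns the exact tool names from the server rather than guessing, there is far less room for the LLM to hallucinate a tool that doesn't exist.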
So search is evolving rapidly.
With standardized protocols like MCP,
integrating and scaling search capabilities becomes
far less burdensome for developers and data scientists.
So whether you're building and optimizing or simply using these systems,
carefully reviewing your search strategies will be key
to unlocking the full potential of AI-driven research.