Building a Watsonx.ai Multi‑Agent System
Key Points
- Multi‑agent systems can be built by “react prompting” a vanilla LLM into an autonomous agent, and chaining several specialist agents together to automate complex tasks.
- The tutorial uses the CrewAI framework (importing `Crew`, `Task`, and `Agent`) to orchestrate the agents and equips them with external tools like the Serper Dev Tool for web search and other file‑type integrations (CSV, PDF, GitHub, etc.).
- Watsonx.ai is accessed through the `watsonx LLM` class from `langchain_IBM`, with API credentials supplied via environment variables (`os.environ`) so the agents can query the IBM model.
- API keys for both Watsonx and the Serper Dev Tool are set up in the script, enabling the multi‑agent system to retrieve information from the internet and interact with the LLM service securely.
- Two separate Watsonx LLM instances are instantiated (each requiring a model ID and endpoint URL), laying the groundwork for a team of agents that can reason and complete tasks collaboratively within a 15‑minute demo.
Sections
- Building Multi‑Agent System with Watsonx - The speaker outlines a quick 15‑minute tutorial on creating a Watsonx.ai‑powered multi‑agent system using the CrewAI framework, importing necessary dependencies and tool integrations to ready the LLMs for task automation.
- Configuring LLM Parameters in watsonx.ai - The speaker demonstrates how to create a parameters dictionary, set the decoding method to greedy, define a max‑new‑tokens limit, specify the Meta Llama‑3‑70B‑instruct model ID, and paste the appropriate endpoint URL to run the model in watsonx.ai.
- Creating a Function-Calling LLM - The speaker walks through adding a second LLM for function calling (using IBM's merlinite‑7b) and assembling a researcher agent that utilizes both the base and function‑calling models.
- Setting Up Automated Research Crew - The speaker outlines configuring a researcher agent, defining a task with a bullet‑point summary output, assembling these into a crew object, and executing the crew to run the research, with the code to be shared on GitHub.
- Creating a Second Speech‑Writer Agent - The speaker demonstrates duplicating an existing agent, stripping internet access, and reconfiguring it as a senior speech‑writer with a tailored role, goal, and backstory.
- Debugging Agent Switch for Speech Writing - The team discovers they’re still using the researcher agent instead of the writer, quickly fixes the bug, restarts the timer, and hands off to the senior speech‑writer agent to draft a keynote.
Full Transcript
# Building a Watsonx.ai Multi‑Agent System **Source:** [https://www.youtube.com/watch?v=gUrENDkPw_k](https://www.youtube.com/watch?v=gUrENDkPw_k) **Duration:** 00:19:05 ## Summary - Multi‑agent systems can be built by “react prompting” a vanilla LLM into an autonomous agent, and chaining several specialist agents together to automate complex tasks. - The tutorial uses the CrewAI framework (importing `Crew`, `Task`, and `Agent`) to orchestrate the agents and equips them with external tools like the Serper Dev Tool for web search and other file‑type integrations (CSV, PDF, GitHub, etc.). - Watsonx.ai is accessed through the `watsonx LLM` class from `langchain_IBM`, with API credentials supplied via environment variables (`os.environ`) so the agents can query the IBM model. - API keys for both Watsonx and the Serper Dev Tool are set up in the script, enabling the multi‑agent system to retrieve information from the internet and interact with the LLM service securely. - Two separate Watsonx LLM instances are instantiated (each requiring a model ID and endpoint URL), laying the groundwork for a team of agents that can reason and complete tasks collaboratively within a 15‑minute demo. ## Sections - [00:00:00](https://www.youtube.com/watch?v=gUrENDkPw_k&t=0s) **Building Multi‑Agent System with Watsonx** - The speaker outlines a quick 15‑minute tutorial on creating a Watsonx.ai‑powered multi‑agent system using the CrewAI framework, importing necessary dependencies and tool integrations to ready the LLMs for task automation. - [00:03:09](https://www.youtube.com/watch?v=gUrENDkPw_k&t=189s) **Configuring LLM Parameters in watsonx.ai** - The speaker demonstrates how to create a parameters dictionary, set the decoding method to greedy, define a max‑new‑tokens limit, specify the Meta Llama‑3‑70B‑instruct model ID, and paste the appropriate endpoint URL to run the model in watsonx.ai. - [00:06:18](https://www.youtube.com/watch?v=gUrENDkPw_k&t=378s) **Creating a Function-Calling LLM** - The speaker walks through adding a second LLM for function calling (using IBM's merlinite‑7b) and assembling a researcher agent that utilizes both the base and function‑calling models. - [00:09:46](https://www.youtube.com/watch?v=gUrENDkPw_k&t=586s) **Setting Up Automated Research Crew** - The speaker outlines configuring a researcher agent, defining a task with a bullet‑point summary output, assembling these into a crew object, and executing the crew to run the research, with the code to be shared on GitHub. - [00:12:59](https://www.youtube.com/watch?v=gUrENDkPw_k&t=779s) **Creating a Second Speech‑Writer Agent** - The speaker demonstrates duplicating an existing agent, stripping internet access, and reconfiguring it as a senior speech‑writer with a tailored role, goal, and backstory. - [00:16:04](https://www.youtube.com/watch?v=gUrENDkPw_k&t=964s) **Debugging Agent Switch for Speech Writing** - The team discovers they’re still using the researcher agent instead of the writer, quickly fixes the bug, restarts the timer, and hands off to the senior speech‑writer agent to draft a keynote. ## Full Transcript
this is how to build a multi-agent system with watsonx.ai. AI multi-agent systems work
using react prompting they turn your vanilla LLM into an agent,
a system that can reason and complete tasks. Now if we put a bunch of them together you
can build up a team of specialist agents to automate tasks we're going to do that
in exactly 15 minutes but is this possible with all LLMs and is there a framework that can help?
We'll get to that but for now I've broken it down into three
steps and it begins with getting our LLMs ready.
All righty guys we have we' got 15 minutes on the timer let's kick things off so the
first thing that we need to do is import a number of dependencies
so we're going to be using crewAI it's a framework that allows you to do multi-agent
systems or build multi-agent systems so we're going to say from crewAI we are
going to be importing the crew task and agent and then we also want to give our
multi-agent system access to some tools so we're going to say from crewAI_tools
import Serper Dev Tool. Now the cool thing is that there's a number of different tools as
well right so if you wanted to use CSVs or docx files or GitHub or Json or MDX or PDF
you got the ability to do that as well let me know if we uh maybe do that in another video
okay then in order to build our LLMs and interact with our LLMs we're going to be
using watsonx so we need an interface to watsonx so we're going to say from langchain_IBM we're
going to import watsonx LLM so that's going to allow us to build our LLM interface and then
we also need OS so this is going to give us the ability to set our API keys for our environment
so we're going to do that first so we're going to say os.environ and then we want to go and set
our watsonx_API key value and we're going to set this to one that I got a little earlier so I'm
going to grab that API key from over there okay and let's zoom out a little bit and then we also
want to set our API key for our Serper Dev tool environment so this is actually going to allow
us to interact with the net so we can set that by running or passing through os.Environ then setting
Serper_API_key and then we're going to set that to the value that you can get from Serper.dev so you
can actually go and register for a free API key and get that up and running we've got a bit of an
issue there so that's because of our quotes. Okay cool, so we've now got our API key set we've got a
couple of dependencies imported we're now going to create the first LLM so we're actually going
to be using two LLMs, I'll get back to that in a sec, so to create our LLM we're going to create an
instance of the watsonx LLM class then we need a couple of parameters or keyword arguments.
So the first one is model ID and we're going to set that to a blank key for now
we also need the URL so because of the way that watsonx is spun up we've got
the ability to run this on a whole bunch of different environments plus we've also
got different multi- region availability, so if you wanted to go and run it in the
US or in Europe or in APAC you've got the ability to do that as well as on premises.
Okay so we've got model ID we've got URL we also
need to specify the parameters for the model so this is going to be how
we actually decode and we also want to be able to pass through our project ID.
So you can actually isolate your machine learning and data science projects using watsonx.ai.
So we're first up going to create a parameters dictionary so this basically dictates how we
actually go and decode from our model so we're going to create a new value called parameters
and set that to a blank dictionary how we're doing on time we got 12 minutes okay so our first value
is going to be decoding method, and we're going to set that equal to greedy. So we're just going
to use the greedy algorithm for now and then we also want to go and specify max new tokens.
So this gives you a little bit of cost control so it dictates
how many new tokens you're going to be generating.
So we're going to grab that parameters value and set that into our LLM,
and then we also need to specify our model ID. So our first LLM that we're
going to be using is llama 370b, so we've got the ability to actually go and check
that you can see that we've got the ability to use llama 370b instruct .
Feels kind of weird I know I've been using llama 2 for so long now so we're going to specify Meta
Llama and then it's going to be llama-3-70b-instruct,
and we also need to go and set the URL.
So we're going to go to over here so we've got the ability to copy this
URL so I'm just going to grab that and then paste this into my URL area
uh save that so it should reorganize it okay so I'm just going to get rid of everything
after the do so we don't need that. Okay so that is our LLM spun up we also need to pass
to our project ID so again that's available in a similar region so it's available from
your watsonx environment so the project ID is just over here so I'm actually got a project ID
I spun up a little bit earlier so I'm going to grab that and paste that in there okay so this
should be our LLM. So we've gone and specified our model ID, so we're going to use Llama 3,
our URL, our primesm and our project ID. Let's actually test it out so
if I run print LLM.invoke and then I pass through a prompt,
So this could be any prompt, but right now this is just a vanilla LLM.
So I could go and say who is Niels Bohr? you can
tell I've been getting into a little bit of physics lately.
So if we go and run this now so the file's called agent.py.
So to run it we can run python agent.py I'm going to pause the timer while we test it out.
We have exact 10 minutes left, okay, all right. Coincidence? Yes.
Okay so if this runs, so I'm going to pause it while it's generating,
becuase that's a little bit out of my control,
and let's see if we get a response to who is Niels Bohr?
Cross my fingers.
Okay, take a look so we've got a response so this is coming from :lama 370b right now.
So we've got a response so "Niels Bohr was a
Danish physicist who made significant contributions
to the development of atomic physics and quantum mechanics he's best known for his model of the
atom which posits that electrons exist in specific energy levels or shells around the nucleus."
Sounds pretty good .Okay so that is our first LLM created.
Now I did mention that we're going to need a second LLM
and that's going to handle our function calling.
So we're going to go and power on ahead. So we've got our LLM set up now we're actually going to go
on ahead create our second LLM and then build our agent. So we're effectively on to part two
alright the time is at 10 minutes. I'm going to kick this off let's go all right.
So we're going to need to just copy this
LLM over here so we're going to create the function calling LLM.
I'm going to speed up just a little bit,
not too much. So we're going to create a new variable called function calling LLM,
and for this LLM we're going to use a slightly
different model so we're going to use the IBM dash mistal AI,
and we're going to use the Merlinite model,
so merlinite dash 7b. Okay cool, so those are our two LLMs now set up.
Now we're going to create the agent create the agent and our first agent
is going to be a researcher but keep in mind eventually you could have a ton of
different agents they could all delegate and interact with each other. So first
agent is going to be a researcher and we're going to create an instance of the agent. So
there's a ton of keyword parameters that you need to pass through here.
The first one is the LLM. So this is going to be the
base LLM that we use to do all of our generation.
We then need a function calling LLM,
and that is going to be be our merlinite model that we just set
up there. So I'm going to paste that keyword argument in there,
and then we also need to specify three additional arguments so there's more than
three so role, goal, and backstory, and I've pre-written those prompts,
so we're going to spin those or pass those through in a second.
We also want to determine whether or not we want to allow delegation.
So eventually, I might do this in another video, but eventually you've got the ability
to allow these agents to hand off to each other and just sort of delegate,
and work out which agent is going to be able to do the task the best. We also
want to specify some tools so we want to give our agent access to the net,
and then the last thing is whether we want it to
be verbose or not. So I'm going to set that to one.
okay, so roll goal back, sorry, so our this basically dictates what our agent is good at.
So our agent is going to be a senior AI researcher,
the goal for that agent is going to be find promising
research in the field of quantum Computing,
and then the last backstory bit is going to be, so this agent got a little bit of a cross
collaboration between AI and Quantum research so our back story is you're
a veteran Quantum Computing researcher with a background in modern physics.
Okay so that's our agent we now need to give it some tools. So tools.
This one's pretty straightforward so we're going to create a tool
called search and that's going to be equal to the Serper dev tool so this
I believe inherits from Langchain it allows you to go and search the net.
We just need to grab that search value and pass it through here.
So that is our first agent now completely set up.
Now we actually want to go and fill this out a little a little bit more.
So we want to give it a task, so create a task,
and a task is effectively what we want our multi-agent system to do.
So our task first task so which is going to call task one, is going to be a task,
and then the task is going to have a couple of things. So we need a description of the task.
We need the expected output. How we doing on time? 7 minutes, okay. I think we're all right.
we're sounding all right. We also want the output file,
so this is actually going to output the results of the task
and then we want the agent that we want to complete that task
okay so the description,
again I've baked these a little bit earlier, so we're going to grab the
description search a net and find five examples of promising AI research.
going to paste that into description and I'll make all of this available via GitHub later
so you guys can grab it as well, and then the expected output is going to be a detailed bullet
point summary on each of the topics each bullet point should cover the topic background why the in
is useful. Our output file is just going to be the name of the output
file that we want to Output. So we're going to say task one output.txt and
the agent that we want is not a string the agent is going to be our researcher.
So which agent that we want to complete the task. Now
we need to pass through these tasks to the crew.
So put it all all together with the crew. So we're
actually going to hand it off to a crew and let the crew run it.
So we're going to say crew is equal to crew and then inside of there we want to specify agents,
and initially we're just going to have one agent but we're going to come back to that in a sec,
and then we also want to specify the tasks that we've got. So right now it's just task one,
and then we want to specify verbose equal to one, and then to we just need to kick it off,
so we're going to run crew or print crew.kickoff.
okay all right. right I'm going to pause it. So let's actually go and run it.
So I'm going to clear this for now, and we're going to run python.agentpy again,
and all things holding equal this should start
doing our research. Let's pause it all right 5 minutes 32 on the clock.
Let's see how we go.
if this works we should start this task so search the internet find five examples of AI research.
You should see it making calls out to the net and ask getting research back.
so you can see that this has generated some action input. So it's saying to use the search function
so search query and then run this command promising AI research and Quantum Computing,
and take a look we've got some responses back from the net.
So anything in orange is what we're actually getting
back from the internet so once it's collaborate or collated everything it needs,
it should generate the bullet point summary of the research that we want.
So let's just let that run for a little sec.
You can see that we've got a bunch of other research coming back there.
And take a look we've got our response back.
So you can see that we've got all of our output down here so
here are five examples a promising AI research.
So it's gone and done all of the research and then it's decided that
it's done so the agent now understood that it's complete because it's gone
through its thought process it's queried as much as it needs to,
and it's generated its response. So down here we've got five examples of
promising AI research in the field of quantum Computing.
So Quantum AI, Quantum machine learning, Quantum optimization,
Quantum RL, Quantum neural networks, and we've got all of the responses.
The cool thing is that it will output it as a text
file if you wanted to go and use it those are all of our bullet points.
You can go and use that I'll include that in the GitHub repo so you can see the responses as well.
Okay so that is I'm actually going to save this as the first example so you guys can grab it
because we're actually going to regenerate this in a second.
So far we've gone and created one agent, one task, right, but this isn't really a
multi-agent system. It's a single agent system.
So what we now need to go in ahead and do is generate our second agent
to be able to go and truly convert this into a multi-agent system. All
right we've got 5 minutes 32 left on the clock can see that let's let's go.
Let bring bring this home. Second agent, okay so what we're actually going to do now let me
zoom out a little bit so you can see that we've got our first agent and task pair.
We're actually going to copy this, and I'm just going to paste it down here again.
So we're now going to create the second agent and it's pretty much the same the only thing is that
we don't need to give our second agent access to the net so we're going to tweak this a little bit.
So let me zoom in. All right so we can get rid of this function calling bit,
get rid of the tools, and this agent is actually going to be a writer.
So we want this agent to actually write some keynote speeches for us.
So we're actually going to pass through a different role, goal, and backstory.
So I'm going to grab that out of here so our second role is a senior speech writer.
So let's say that you're writing speeches for a prominent executive you need to go
and write some speeches so the role is going to be senior speech writer,
the goal is to be right engaging witty Keynotes features from the provided research,
I'm going to paste that in, and then the backstory is going to be your veteran, no it's the this
should be tweaked a little bit, looks like I just went and copied and pasted when I was writing,
so you're a veteran Quantum Computing,
I'm going to say writer, with a background in modern physics,
but again you can go and tweak this a bunch more,
but effectively you could provide a bunch of background on how this person's a writer.
Okay so that's our backstory. Then we want to go and create our second
task. Tweak this and and let me know how you actually go.
All right so we're going to generate our second task here.
So our second task is write an engaging keynote speech on Quantum Computing. so
I'm going to paste this in the description,
and then our expected output is a detailed keynote speech with an intro
a body and a conclusion. So we're going to paste this in over here.
Perfect. And then our second output file is just going to be called task two output.
Okay so that is our second agent, so our second agent is a writer. Our second task
is to get that keynote speech written. So it's almost like a sequential agent right.
so the first agent's going to do the research, find all the research,
second agent's going to step in and then convert it into a keynote speech.
so we can actually go and pass through our second agent,
which is going to be our writer down here.
So this is what gives us the multi-agent feel right and eventually once we allow delegation,
if we did that in the next videol. Then we'd actually
be allowed to or be able to do all of the handoff.
So our tasks so maybe we convert agents into
a series on the channel you let me know in the comments below.
Okay so we've now gone and pass through our second task over here.
Go and pass through our second agent we go and run
this now. We should effectively go through that two-step process
so if we go and run this again, going to clear this,
and we should also get two outputs now, Right?
because our second output file is called task two output. So I'm going to run python agent.py.
I'm going to pause the timer so we've got 2 minutes 43.
This all blows up that's how much time we're going to have left 2
minutes 42 to to to solve this all right so let's let this run and see how we go
okay so you can see it's triggering the chain.
It's doing the research again so again. This is still the first agent,
so you can see that the working agent is the senior AI researcher.
We haven't got to the writer yet.
Okay so we're doing research.
Okay so you can see we've now I know we're still still still doing researcher there.
I thought we finished oh actually no we've got a bug.
Just realized all right I'm going to start the timer again so at 2:41
so I just noticed that we were using the same agent.
So inside of our second task we actually need this
to use the second agent because right now we're just using the base agent.
so we actually want this to be writer okay we're going to we're going to change the agent there,
and then let's kick off our timer again. So I'm going to clear this. All right.
It only took me 10 seconds to resolve. So we're at 2:22 we still got time to debug,
but we're probably going to speed through this so editors speed through this
okay so we're triggering our first bit from our senior AI researcher. Hopefully
once we get to that next debug stage, we're now onto our what is it what
do we call senior speech writer is going to pick up from there.
Okay take a look so we've now successfully gone and handed off to the senior speech
writer so this is our second agent kick in hopefully they just write our keynote speech.
It might be limited cuz we've only given it
500 tokens to write the speech but you sort to get the idea.
Fingers cross we now get a keynote speech written in under 15 minutes.
Not too bad all right take a look that's our keynote speech.
Not too bad. Alright. So we should so we can see that we've gone and
done the research. We've got a bullet point list over here.
Bullet point list over here and then we've got our
keynote speech over here. Here so if we actually go and take a look,
we've got our two outputs so task one output and task two output so this is
the next set of output from the second run so we've got all of our DOT points,
and if we go and take a look at our task two output let's read out a bit of our keynote speech.
Ladies and gentlemen esteemed guests and fellow innovators I'm thrilled to be speaking
with you today, at a revolution, about the revolutionary potential of quantum Computing.
as we stand at the threshold of a new era in technological advancement it is imperative
that we explore the possibilities that Quantum Computing has to offer
I don't know how witty this is, but it is a keynote speech.
In the realm of artificial intelligence Quantum Computing is poised to
unlock unprecedented capabilities Quantum...
I'm going to put this all in the GitHub repo so you can read it yourself but
you can see that we have in fact gone and written a bit of a keynote speech.
So we could definitely take this further but for now that's how
to build a multi-agent system with watsonx.AI
I'll catch you in the next one for