Build an MCP Server for LLM Tools
Key Points
- The Model Context Protocol (MCP), released by Anthropic in November 2024, standardizes how LLM agents communicate with external tools, eliminating the need for duplicated integrations across different frameworks.
- Building an MCP server lets you expose any existing API (e.g., a FastAPI employee churn predictor) as a universal tool that any LLM agent can call without custom wrappers.
- The tutorial demonstrates that a functional MCP server can be created in under 10 minutes using familiar Python tooling, showing the step‑by‑step process from project setup to endpoint exposure.
- MCP works with both paid and open‑source LLMs, and includes built‑in observability features so you can track which agents are invoking which tools.
- Once the MCP server is running, the same tool definition can be reused across any client or agent, dramatically simplifying integration and scaling of AI‑driven workflows.
Sections
- Rapid MCP Server Setup - A concise walkthrough showing how to create a Model Context Protocol (MCP) server in under ten minutes to standardize LLM tool integration while addressing paid model compatibility, observability, and deployment details.
- Setting Up Virtual Environment and Server - The speaker creates a Python virtual environment, installs the MCP CLI and requests packages using uv, generates a server.py file, and starts importing the necessary modules to build a FastMCP server.
- Calling Employee Churn Prediction API - The speaker walks through constructing a payload, extracting data from a list, and making a POST request with appropriate JSON headers to invoke an API that predicts whether an employee will churn.
- Choosing Transport Types for Inspector - The speaker walks through connecting to the inspector, copying its URL, and explains the difference between STDIO and server‑sent events transport options, guiding users on selecting the appropriate transport for their tool.
- Running LLM for Employee Churn - The speaker demonstrates configuring and executing an Ollama Granite 3.1 LLM to predict whether an employee will churn, linking sample data, setting server parameters, and running the Python agent within a timed demo.
Full Transcript
**Source:** [https://www.youtube.com/watch?v=EyYJI8TPIj8](https://www.youtube.com/watch?v=EyYJI8TPIj8)
**Duration:** 00:14:53
Section timestamps:
- [00:00:00](https://www.youtube.com/watch?v=EyYJI8TPIj8&t=0s) Rapid MCP Server Setup
- [00:03:04](https://www.youtube.com/watch?v=EyYJI8TPIj8&t=184s) Setting Up Virtual Environment and Server
- [00:06:06](https://www.youtube.com/watch?v=EyYJI8TPIj8&t=366s) Calling Employee Churn Prediction API
- [00:09:08](https://www.youtube.com/watch?v=EyYJI8TPIj8&t=548s) Choosing Transport Types for Inspector
- [00:12:14](https://www.youtube.com/watch?v=EyYJI8TPIj8&t=734s) Running LLM for Employee Churn
This is how to build an MCP server so you can connect your LLM agents into just about anything.
The Model Context Protocol was released by Anthropic in November 2024.
It addresses a lot of the issues that have been popping up around agents.
How?
Well, in order for agents to exist, they need tools, right?
But every framework or app or client tends to bring its own way of declaring these tools.
Now, this becomes a pain because
you might find yourself creating integrations repeatedly every time you want to use an AI capability.
This is where MCP comes in.
It standardizes how LLMs talk to tools.
So you can define your tool server once and use it everywhere.
I'm going to show you how to build your own in under 10 minutes,
But does it only work with paid LLMs? And how hard is it actually to build?
And what about observability?
Can I track what's using a specific tool?
We'll get to that.
I'm recording this after a big bowl of carbs, and without using Cursor, Copilot, or my
old mate Stack Overflow, I'm going to break it down into three straightforward steps.
Phase one, build the server.
Alrighty, so we are gonna go on ahead and build our very own MCP server.
And as usual, we're gonna set a bit of a timer.
So 10 minutes on the clock, let's kick this thing off.
So first, a little bit of background as to what we're going to be building an MCP server for.
So I built a machine learning API in this video, where we went and deployed it using FastAPI.
And you can see here that I've currently got it running locally via this specific endpoint.
Now, if I go and send this particular body: so it's predicting employee churn.
Years at company dictates how many years that particular employee has been at the company,
and then there's their satisfaction, their position, whether they're a manager or non-manager, and their salary,
which I've split up into an ordinal representation between one and five.
So if I send this off, it's gonna predict whether or not the employee is
likely to churn, so you can see down here that we've got a prediction of zero.
If we went and changed their employee satisfaction, you can see they're still not gonna churn.
What if their salary sucked?
So if we send that through, they're not gonna churn.
So maybe if they had fewer years, so maybe they're not ultra loyal.
So if I change their years at the company, take a look.
So we've now got a one.
So that represents the fact that they are gonna churn.
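The request body from the Postman demo can be sketched as a plain Python dict. The exact field names aren't spelled out in the video, so the keys below are assumptions; only the four attributes and the ordinal salary scale come from the demo.

```python
import json

# Hypothetical request body for the churn API; field names are assumed.
sample = {
    "years_at_company": 1,        # fewer years pushed the prediction to churn
    "employee_satisfaction": 0.1,
    "position": "Non-Manager",
    "salary": 2,                  # ordinal representation between one and five
}

# The MCP tool built later wraps the dict in a list before sending it.
body = json.dumps([sample])
```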
But how would we convert this into an MCP server so that we can expose it to all of our amazing AI agents?
Well, that's exactly what we're gonna do with our MCP servers.
So.
You can see that I've got my API running here.
So that's just what I've shown using fast API and I'll include a link to that as well.
But for now, we are going to focus on getting our MCP server up and running.
Okay, so we wanna go on ahead and do this.
So, all right, focus.
So we're gonna go uv init and we're going to create an employee project.
So this is going to now create a folder called employee.
It's got my pyproject.toml file, so on and so forth.
Then we need to go ahead and cd into that folder.
So we're now inside it and we want to go ahead and create a virtual environment.
So I'm gonna go uv venv.
So we've now got a virtual environment, and then we're actually gonna copy this command to activate it.
Boom, that is our virtual environment now created.
So if we jump in, you can see that we've got our virtual environment.
All of our project files are looking good.
Okay, what do we wanna go ahead and do now?
We need to install our dependencies.
So we're gonna go uv add, and we want the mcp CLI package.
And we also want requests.
Let's make sure I'm not covering that.
So we're actually going to be using the model context protocol.
So this is our big library over here, which is going to allow us to do all of this amazing stuff.
And there's a whole bunch of information about how this actually works.
We're mainly going to be using the Python SDK.
Okay, so that is, let's run that install.
Perfect, we're now installed.
All right, we can clear that.
Jumping back in, we also want to create a server file.
So I'm gonna go touch server.py, beautiful.
Okay, so if we jump in here now, we should have a server file.
Okay, that is the beginnings of our server.
Very basic at the moment, but we need to go on ahead and build this out.
So the first thing that we're gonna do is we're going to import our dependencies.
Oh God, the time.
How's that time?
Oh my Lord, seven minutes.
Okay, we are gonna need to punch it.
So we're gonna go from mcp.server.fastmcp, we are going to import FastMCP.
And then we need to import a bunch of other stuff.
So we're gonna import json.
We're gonna use this.
So this FastMCP class is going to be like the crux of our entire server.
So you'll see when I instantiate that in a second.
Then we want JSON, we're going to use that for some parsing later on.
We're going to import requests to actually make a request out to this API.
And then what do we want?
We need a little bit of typing assistance.
So we're gonna go from typing import List, because that's how we're gonna pass the input from that agent.
Okay, those are our dependencies now done.
We're then gonna create our server.
So I'm gonna say mcp is equal to FastMCP, and we're gonna call it churn and burn,
sort of in alignment with my desktop, right?
Okay, so that's our server created.
And then we wanna create a tool.
So create the tool,
and there's different resource types, right, or different capabilities that you are able to build inside of your MCP server.
So let me jump back; you've got the abilities down here.
So you can build resources, prompts, tools, you can handle sampling, transports.
We'll talk about that a little bit later.
Okay, so we are going to create a decorator.
So we're going to go @mcp.tool().
So this is going to wrap our function that's going to call out to our end point.
So then we're gonna create that specific tool.
So I'm gonna call it predict_churn, and it's gonna return a string, and we now need to handle the data that we're gonna take in.
So we're gonna take in an argument called data.
It's gonna be a list of dictionaries.
Okay, that's beautiful.
So then what we actually need to do is define a docstring.
Now I've gone and written this a little bit earlier.
So this is what we're gonna paste in.
So I'm gonna copy that and then let's read it for a sec, time permitting.
Okay, so this tool predicts whether an employee will churn or not; pass through the input as a list of samples.
So the arguments: data is the employee attributes used for inference.
Then there's the example payload, and you can see I've got a dictionary wrapped in a list.
It takes in the exact same variables that we had inside of Postman:
years at company, employee satisfaction, position, and salary.
And it's gonna return either one (churn) or zero (no churn).
Okay, so that's a doc string now created.
Now we wanna go on ahead and handle this.
So we're gonna create a variable for our payload, and we're just gonna grab the first
value that we have inside of our list, right?
So we're just accessing the dictionary, excluding the list.
Okay.
Then we need to make a call out to our API. Four minutes.
Okay.
This is not looking good.
So we're gonna go requests.post, 'cause remember, over here we're making a POST
request, and then we are gonna send it out to that URL.
But if you had an external URL, you'd be going out to that instead; let's paste it in there.
We need some headers.
I should have practiced some typing this morning.
We're going to set Accept to application/json.
And we are going to specify the content type.
Should have toggled word wrap.
There we go. All right.
The Content-Type is going to be application/json,
and then we want to pass through our data, which is going to be a JSON-dumped payload.
Okay, beautiful.
All right, so that's our response now set up.
Then what we're gonna do is we're going to return response.json() once the user calls out to this.
Okay, so, that is our tool now created.
Now, we just basically need to specify what we do when it gets run.
So if __name__ equals "__main__", we are going to run our MCP server.
So, mcp.run, and then our transport is going to equal stdio, so standard input/output.
All right, I'm pausing the timer.
All right. So we've got
two minutes and 47 seconds left, but that is our server now created.
So we're good to go.
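Put together, the server.py assembled above looks roughly like the following. It's a sketch rather than the video's exact file: the endpoint URL and field names are assumptions, the video's requests call is swapped for stdlib urllib so the sketch is dependency-free, and a minimal stand-in for FastMCP is defined if the mcp package isn't installed.

```python
import json
import urllib.request
from typing import Dict, List

try:
    from mcp.server.fastmcp import FastMCP  # installed via: uv add "mcp[cli]"
except ImportError:
    class FastMCP:
        """Minimal stand-in so the sketch runs without the MCP SDK."""
        def __init__(self, name: str):
            self.name = name
        def tool(self):
            def decorator(fn):
                return fn
            return decorator
        def run(self, transport: str = "stdio") -> None:
            pass

# Assumed local endpoint for the FastAPI churn model.
API_URL = "http://localhost:8000/predict"

# Create the server.
mcp = FastMCP("churn and burn")

@mcp.tool()
def predict_churn(data: List[Dict]) -> str:
    """Predict whether an employee will churn or not.

    Args:
        data: employee attributes used for inference, a dict wrapped in a
            list, e.g. [{"years_at_company": 1, "employee_satisfaction": 0.1,
            "position": "Non-Manager", "salary": 2}] (field names assumed).

    Returns:
        "1" for churn or "0" for no churn.
    """
    payload = data[0]  # grab the first value, excluding the list
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={"Accept": "application/json",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.dumps(json.load(response))
```

In the real file you'd finish with `if __name__ == "__main__": mcp.run(transport="stdio")`, matching the STDIO transport chosen above.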
All right, so we've knocked off phase one and we've got the server up and running, but how do we actually test that it's working?
Can we actually get to our tool?
Well, this brings us to phase two, testing out the server.
Okay, we're back.
So we've gone and created that server, but we haven't really tested it out yet.
So how do we go about doing this?
Well, I've got two minutes, 47 seconds left on the timer.
So let's keep this up.
So let me show you how to do this.
Okay, so we are currently inside of the employee folder.
We want to start off the dev server.
So this is going to give us access to the MCP inspector, where we can actually test out our tools.
We can go uv run mcp dev server.py.
This should start off the inspector.
I'm gonna pause it until we successfully get the inspector up.
Okay, that's our inspector up and running.
I can copy this URL here.
I'm going to go to a browser.
Nope, no, go back.
I'm not going to copy this again.
I'm now going to paste that in, beautiful.
All right, that's the inspector.
So if I connect down here, then if I go to tools up here, then if I go to list tools, that is our tool now running.
I'm gonna pause it, all right.
We're good.
We've got two minutes left.
All right, but let me sort of show you, right?
So over here, we've got our transport type.
So there's two different transport types available.
So there's standard input/output.
There is also SSE, which is server-sent events.
So this is important.
You'd probably use standard input/output when you're just connecting with local files or local tools.
When you're doing something more client-server related, you'd probably default over to server-sent events.
We're using STDIO, standard input/output, as dictated by the fact that right down here, under transport, we are specifying that.
Tools like Cursor can handle SSE and STDIO; I think Claude Desktop uses STDIO only.
And the capability that we're gonna use in a second, when it comes to using our agent, is STDIO.
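That rule of thumb can be captured in a tiny helper. This is purely illustrative, not code from the video; the string values match the transport names used above.

```python
def pick_transport(local_only: bool) -> str:
    """Choose an MCP transport: STDIO when connecting with local files or
    local tools, server-sent events (SSE) for client-server setups."""
    return "stdio" if local_only else "sse"

# The demo server runs locally, so it uses STDIO.
print(pick_transport(local_only=True))  # -> stdio
```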
Okay, enough of me blabbing.
So let's jump over to our inspector.
So to use our inspector, you just make sure you specify the right transport type, the right command, and the right argument.
And if you hit connect, you can see that we're connected.
Remember how I said there are different capabilities that you can expose in your MCP server?
They're all up here.
We're just interested in our tool.
And if we hit predict churn, we can actually go and test this out.
So if I switch to JSON over here, we can go and pass through an object.
So again, I've got one nicely formatted.
So we're gonna copy this over, chuck that in here.
All right, drum roll, please.
So now if we go and hit run tool, take a look.
We've got our prediction.
So we've successfully gone and determined that this particular employee with these particular values will not churn.
Now, if we went and changed it up: I'm just gonna change it in here, because you get this weird thing in the inspector, not ideal.
When I go and delete a value, so, let's say I deleted this, you can see we're getting errors,
because it's running syntax validation while I'm trying to edit.
So we'll just do that over here.
So let's say they had not that many years at the company, they weren't
all that satisfied, and their salary was in the lower quintile.
So if I paste that in now, take a look, that particular person will churn.
Okay, that is our tool now successfully running and our MCP server successfully working.
Right, so we've now established that our server actually works.
We've made a call to it and we're able to get a prediction back as
to whether or not somebody's likely to churn. So how do we bring this into an agent?
Well, that brings us to phase three, adding it into an agent.
Alrighty, last thing we've got to do, so I've got two minutes.
Let me bring that back up.
Two minutes left on the timer.
All right, so what we now need to go ahead and do is integrate this into our agent.
So I've gone and pre-written an agent using the BeeAI framework,
and I'll make this available via GitHub so you'll be able to test that out.
And this is all built using, let me show you the LLM.
So we're using Ollama, specifically the Granite 3.1 dense eight-billion-parameter
model, which was trained by an amazing research team.
So we are going to go on ahead and use this.
Now, right down the bottom, you can see that I've got this question.
So will this particular employee churn?
And I've got my employee sample over here.
So I've got the years at the company, the employee satisfaction, the position, and their salary.
So hopefully, fingers crossed, we can send this over.
So we're running out of time.
Got a minute 14 left.
Okay, let's punch this out.
So we need to go over to here.
We've got our standard input/output server params.
So we need to pass through our command.
So our command is going to be UV.
And then we are going to run it; we need to pass through the directory to specify where our server actually is.
And then I just want to go and grab the file path to our server.
So I'm going to copy this, copy path, boom, and then back into our agent.
And then, I'm gonna paste that in here
and then we need to go and set our run command: we are going to run
server.py, and just over here, I just want to get rid of server from the end of the path.
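The STDIO server params described here can be sketched like so. StdioServerParameters comes from the MCP Python SDK (a dataclass stand-in is used if it's absent), and the project path is a placeholder, not the presenter's actual path.

```python
from dataclasses import dataclass, field
from typing import List, Optional

try:
    from mcp import StdioServerParameters  # MCP Python SDK
except ImportError:
    @dataclass
    class StdioServerParameters:
        """Dataclass stand-in mirroring the SDK's fields."""
        command: str
        args: List[str] = field(default_factory=list)
        env: Optional[dict] = None

# Launch the server with uv from its project directory.
# The directory below is a placeholder; point it at your own employee project.
server_params = StdioServerParameters(
    command="uv",
    args=["--directory", "/path/to/employee", "run", "server.py"],
)
```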
Okay, so now if I go and run this, let me just go to another terminal, beautiful.
And if I go and run the Python single-flow agent, okay. We've got 30 seconds left.
Let's see how we go.
Drum roll, please.
Take a look.
There we go, I'm going to pause it.
We had 22 seconds left, not too bad.
Okay, sorry.
And let's quickly run through this.
So right down here, you can see that we've got a thought from our agent.
So the user wants to know if the employee will churn based on their attributes.
I need to use the predict churn tool.
And then right down, here, we've managed to get a prediction.
So over here, we've got a prediction of one, indicating that the employee should churn.
So what has our agent said?
This employee is predicted to churn, so we've successfully gone
and built out our MCP server and integrated it into an agent.
Almost forgot: observability.
All you need to do is import logging and add in this line here.
And this will give you the ability to see every tool call in your server logs.
And in the interest of interoperability, just to prove that this MCP server could be used elsewhere,
I can add it to, for example, Cursor, by using the following command,
which is effectively just the same command that we passed through for our agent. Then I can open up a chat,
paste in an example asking whether or not this particular person will churn, then switch to agent mode and hit send.
Then I should be able to run the tool by opening up the example and hitting run tool,
and if I scroll on down, you'll see that
I've got a prediction of one, which indicates that that particular employee is going to churn.
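For the Cursor side, registration typically goes through an mcp.json entry along these lines. The file layout follows Cursor's MCP configuration convention, and the server name and directory path are placeholders:

```json
{
  "mcpServers": {
    "churn-and-burn": {
      "command": "uv",
      "args": ["--directory", "/path/to/employee", "run", "server.py"]
    }
  }
}
```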
Same server, MCP everywhere.