
Key Layers of the AI Stack

Key Points

  • Building successful AI applications requires thinking about the entire AI stack—model, infrastructure, data, orchestration, and application layers—rather than just picking a powerful model.
  • The infrastructure layer matters because large language models often need GPU‑accelerated hardware, which can be provisioned on‑premises, via cloud services, or through hybrid solutions, and the choice impacts cost and scalability.
  • A dedicated data layer is essential to supplement model knowledge (e.g., recent scientific papers) since pre‑trained models have fixed knowledge cut‑offs, and the quality of this supplemental data directly affects solution relevance.
  • Orchestration and application layers coordinate complex workflows (splitting queries, fetching data, summarizing, reviewing) and define user interfaces and integrations, influencing the system’s speed, safety, and overall user experience.

Full Transcript

**Source:** [https://www.youtube.com/watch?v=RRKwmeyIc24](https://www.youtube.com/watch?v=RRKwmeyIc24)
**Duration:** 00:08:53

## Sections

- [00:00:00](https://www.youtube.com/watch?v=RRKwmeyIc24&t=0s) **Building AI Stack for Real-World Apps** - The speaker outlines the essential layers (model, infrastructure, data, and orchestration) required to turn large language models into practical, problem-solving applications, illustrated by an AI tool for drug-discovery research.
- [00:03:12](https://www.youtube.com/watch?v=RRKwmeyIc24&t=192s) **AI Stack: Deployment, Models, Data** - The speaker covers deployment options (cloud vs. local), model selection (open vs. proprietary, size, specialization), and data sources with processing pipelines, to help AI builders match infrastructure to their needs.
- [00:06:56](https://www.youtube.com/watch?v=RRKwmeyIc24&t=416s) **Application Layer: Interfaces and Integrations** - The speaker explains that the AI application layer adds usability through varied interfaces (text, image, audio, data) with features like revisions and citations, and through integrations that allow AI inputs and outputs to interact seamlessly with users' existing tools.

## Transcript
Whether you're building an experimental prototype for your own personal use or creating an application to power an entire organization, there are key components of the AI technology stack that you must get right to build AI systems that can do more than just generate answers, but solve real, meaningful problems. Say, for instance, I'm building an AI-powered application to help drug discovery researchers understand and analyze the latest scientific papers in their domain. Maybe it starts with a model that I recently heard about that is supposed to be better at highly complex tasks, like those of a PhD researcher.

Model is an important layer of the stack, but it's just one piece of the puzzle. There's also the infrastructure that the model will run on, because not all LLMs (large language models) can run on standard enterprise CPU-based servers, and not all are small enough to run on a laptop. So it matters what infrastructure you have access to and how you choose to deploy it.

Next is data, because in this example the whole point is to help scientists understand the latest papers in their field, and models typically have a knowledge cutoff date. So if we want to talk about papers from, say, the past three months, we have to provide the AI system with extra data. That will be the data layer.

Next would be the orchestration layer, because a complex task like this probably requires more than simply providing a large prompt to the AI system and getting a single output back. Instead, we'll want to break that user query up into different parts: plan how the AI solution is going to tackle the problem and what data it needs, then do the summarization and create an answer, and maybe even review that answer.

Finally is the application layer, because at the end of the day, there's a user using this tool.
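The five layers just walked through can be written down as a small configuration sketch for the hypothetical drug-discovery assistant. Everything concrete here (the class, the field names, the model and data-source labels) is illustrative, not something prescribed in the talk.

```python
from dataclasses import dataclass, field

@dataclass
class AIStack:
    """One record per layer of the stack described in the transcript."""
    model: str                                  # model layer: which LLM to use
    infrastructure: str                         # where it runs: on-prem, cloud, or local
    data_sources: list = field(default_factory=list)            # data layer
    orchestration_steps: list = field(default_factory=list)     # orchestration layer
    application_interfaces: list = field(default_factory=list)  # application layer

# Hypothetical configuration for the drug-discovery paper assistant.
paper_assistant = AIStack(
    model="hypothetical-research-llm",        # placeholder model name
    infrastructure="cloud-gpu",               # rented, scalable capacity
    data_sources=["recent-papers-index"],     # supplements the knowledge cutoff
    orchestration_steps=["plan", "retrieve", "summarize", "review"],
    application_interfaces=["text", "citations", "revisions"],
)

print(paper_assistant.orchestration_steps)
```

Writing the layers out like this makes the point of the talk concrete: the model is one field among five, and each of the others is a deliberate choice.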
So there will have to be an interface that defines what the inputs and outputs will be, and it might not be as simple as text in and text out. There's also the issue of integrations: will the results be integrated into other tools that this user uses?

It's important to understand all the layers of the AI stack, whether you're building a solution from scratch or using solutions that manage several of these layers for you as a service. This is because across the stack, from the hardware all the way up to the user interface level, the choices you make will have important implications for your solution's quality, its speed, its cost, and its safety.

When it comes to infrastructure, LLMs generally require AI-specific hardware, specifically GPUs, and these can be deployed in one of three ways. The first is on premises, assuming you have the means and resources to buy this kind of infrastructure yourself. The second option is cloud, which allows you to rent this capacity and scale it up or down as needed. The third is local, which usually means on your laptop. Not all laptops can support LLMs of every size, but there are certainly LLMs on the smaller end of the range that can run on the kind of GPUs available in a standard laptop.

The next layer is models. AI builders have plenty of choice when it comes to which model they use. One dimension to consider is whether the model is open versus proprietary. Another dimension is model size: alongside large language models there are also small language models, which are lighter weight and able to fit on more modest hardware, but might not have the same reasoning capacity as a large language model and instead be specialized for more specific things. Finally is specialization.
Specialization sometimes goes hand in hand with size. Some models perform better at things like reasoning, tool calling, or generating code; others have different language strengths. There are plenty of models to choose from (over 2 million already in model catalogs like Hugging Face) that can serve whatever mix of these needs an AI builder might have.

The next layer of the stack is data. This breaks up into a few components. First are the data sources themselves, used to supplement the model's knowledge. This layer also includes the pipelines that do any pre-processing or post-processing of that data, as well as any vector databases or retrieval systems you may use, which are the basis of retrieval-augmented generation (RAG). The vector database is where that external data is vectorized into embeddings and stored, so your model can retrieve that context quickly and augment its answers with knowledge the base model does not have. That matters because base models are usually trained on publicly available information, which might not be enough to accomplish the task at hand; you may need to supplement it with additional data.

The next layer is orchestration, because building an AI system that does something more complex than just generating text or answering questions requires breaking the initial user input down into smaller tasks. Those can start with thinking: using the model's reasoning ability to plan out how it will tackle the problem. They can also include execution, where the model does tool calling or function calling, as well as reviewing, where an LLM provides its own critique of the initially generated responses and initiates feedback loops to improve them.
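The retrieval and orchestration steps described above can be sketched end to end. This is a toy shape of the workflow, not a real implementation: the embeddings are hand-made three-dimensional vectors, and the `summarize` and `review` functions stand in for LLM calls.

```python
import math

# Toy vector store: maps document text to a pre-computed embedding.
# In a real RAG system these vectors would come from an embedding model.
DOC_STORE = {
    "Paper A: new protein-folding results": [0.9, 0.1, 0.0],
    "Paper B: GPU benchmarking methods":    [0.0, 0.2, 0.9],
    "Paper C: recent drug-binding assays":  [0.8, 0.3, 0.1],
}

def cosine(u, v):
    """Similarity between two embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def retrieve(query_vec, k=2):
    """Data layer: return the k documents most similar to the query embedding."""
    ranked = sorted(DOC_STORE, key=lambda d: cosine(query_vec, DOC_STORE[d]),
                    reverse=True)
    return ranked[:k]

def summarize(docs):
    """Stand-in for an LLM summarization call."""
    return "Summary of: " + "; ".join(docs)

def review(answer):
    """Stand-in for an LLM self-critique step; here it only checks non-emptiness."""
    return len(answer) > 0

def orchestrate(query_vec):
    # plan -> retrieve -> summarize -> review, as in the transcript
    docs = retrieve(query_vec)
    answer = summarize(docs)
    if not review(answer):                 # the feedback loop would retry here
        answer = summarize(retrieve(query_vec, k=3))
    return answer

print(orchestrate([1.0, 0.2, 0.0]))
```

Even in this toy form, the division of labor is visible: the data layer answers "what context is relevant?", and the orchestration layer decides the sequence of calls and whether to loop.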
This layer is evolving very quickly, with new protocols like MCP and new architectures for how best to orchestrate increasingly complex tasks.

Next is the application layer. The most widely used AI systems do follow a pretty simple design of text in and text out, but as we use these tools in our work and life, there are features that become critical for the actual usability of AI, and these factors make up the application layer. The first factor is interfaces. The most classic interface is text in and text out, but other modalities can be very valuable for certain tasks too, like image, audio, numerical datasets, and plenty of other custom data formats. Within the interface, it's also important to keep in mind features like revisions and citations, so that when the user sees what the model comes up with, they can edit it or inquire further. The second consideration is integrations, which cut both ways: allowing other tools the user relies on to send inputs to the AI system, and taking the model's outputs and automating how they flow into the tools the user works with day to day.

All together, these layers of the AI stack (from the hardware to the models, the data you use, how you orchestrate it, and the application and its usability) matter because when we have a clear understanding of how they fit together, we can see what's truly possible and make practical choices to design AI systems that are reliable, effective, and aligned to our real-world needs.
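One way to picture the revision and citation features mentioned for the application layer is a response object that carries its sources and keeps prior versions when the user edits. All names here are hypothetical sketches; the transcript does not prescribe any particular API.

```python
from dataclasses import dataclass

@dataclass
class AIResponse:
    text: str
    citations: list   # sources shown to the user alongside the answer
    history: list     # prior versions, so the user can revise and roll back

def answer_query(query):
    """Stand-in for the full stack (retrieval + generation)."""
    return AIResponse(
        text=f"Draft answer to: {query}",
        citations=["Paper A", "Paper C"],   # would come from the data layer
        history=[],
    )

def revise(response, new_text):
    """Application-layer revision: keep the old version, swap in the edit."""
    response.history.append(response.text)
    response.text = new_text
    return response

r = answer_query("latest binding assays?")
r = revise(r, "Edited answer")
print(r.text, r.citations, r.history)
```

The point is that revisions and citations are properties of the interface, not the model: the same underlying LLM output becomes far more usable once the application layer tracks where answers came from and lets the user change them.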