AI Agents Empower Mainframe Operations

Key Points

  • Combining AI agents with mainframe computing extends simple “Call Home” alerts into proactive, intelligent hardware and workload management.
  • Unlike narrow ML models or static LLMs, AI agents can perceive inputs, make informed decisions, and act—such as rebalancing loads or generating actionable reports.
  • An agent’s “memory” is split into context (the business goal like minimizing downtime or optimizing CPU usage) and knowledge (structured and unstructured data from sources like Call Home and SMF records).
  • By integrating this context and knowledge, agents can invoke appropriate tools to execute complex, context‑aware actions across the enterprise ecosystem.

Full Transcript

# AI Agents Empower Mainframe Operations

**Source:** [https://www.youtube.com/watch?v=IRa6XmG8QCQ](https://www.youtube.com/watch?v=IRa6XmG8QCQ)

**Duration:** 00:06:43

## Sections

- [00:00:00](https://www.youtube.com/watch?v=IRa6XmG8QCQ&t=0s) **AI Agents for Mainframe Proactive Management** - The speaker outlines how integrating AI agents with mainframe “Call Home” systems can transform simple hardware threshold alerts into intelligent, decision‑making tools that predict issues and automate preventative maintenance.
- [00:05:56](https://www.youtube.com/watch?v=IRa6XmG8QCQ&t=356s) **AI Enhancing Mainframe Operations** - The speaker urges moving beyond generic AI productivity and fraud‑detection applications toward using AI to automate and simplify day‑to‑day mainframe administration tasks, making SREs’ work more enjoyable.

## Full Transcript
0:00 What happens when you combine technology's foundation with its frontier? We're going to look at how to bring AI agents to mainframe computing.

0:05 Now, let's look at this background. We have enterprise systems and the way they're set up, your technology is broken up into various sysplexes, various environments where you're running your applications, you're running your business. And in this system, there's a facility for Call Home. And so, Call Home is going to send events like one where hardware is running hot, or there's an upcoming plausible problem with a part of your system. And so, the part can get ready, it can be set up, and you could be notified to say this upcoming problem can be avoided if we schedule this maintenance ahead of time. This ability to really deal with the hardware on a proactive basis is a critical part of enterprise computing.

1:05 But this is simple. These are simple events. This is a hardware threshold, a thermometer threshold, a simple environment. Imagine, let's think, what can we do? Bring in AI agents and we get a whole lot more information.

1:19 So, what is an AI agent? Well, the difference between AI agents and, previously, LLMs, or even before that, traditional ML models, is that AI agents can perceive inputs, make informed decisions, and then they act or they generate. And the way that looks against traditional models, like some of what you were mentioning earlier, would be that, previously, they were narrow purpose. You know, they raise a flag or they make a simple prediction, but they can't really do anything about it or handle the complexity or the context of the business using all these mainframes in their ecosystem.
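
The contrast drawn above, between a narrow model that only raises a flag and an agent that perceives, decides, and acts, can be sketched in a few lines of Python. This is an illustration only: `HardwareEvent`, its fields, the 80 °C limit, and the 5 °C margin are hypothetical and do not reflect the real Call Home event format or any product API.

```python
from dataclasses import dataclass


@dataclass
class HardwareEvent:
    """Hypothetical stand-in for a Call Home style event (not a real schema)."""
    component: str
    temperature_c: float


def threshold_alert(event: HardwareEvent, limit_c: float = 80.0) -> bool:
    """Narrow-purpose check: it raises a flag and does nothing else."""
    return event.temperature_c > limit_c


def agent_step(event: HardwareEvent, limit_c: float = 80.0) -> str:
    """One perceive/decide/act pass that ends in a concrete recommendation."""
    # Perceive: measure how far the reading is past the limit.
    overshoot = event.temperature_c - limit_c
    # Decide and act: choose a response that matches the severity.
    if overshoot <= 0:
        return "no action"
    if overshoot < 5:
        return f"schedule maintenance for {event.component}"
    return f"rebalance load away from {event.component} and schedule maintenance"
```

For a reading three degrees over the limit, `threshold_alert` returns only `True`, while `agent_step` returns a recommendation that names the component and the action to take.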
1:56 So, what an agent would do is take some kind of action, and that could look something like rebalancing loads across different systems, or generating reports that would enable a sysadmin to look through and take the right action, essentially a recommendation. And that takes into account multiple types of data. So we're going to look at what goes into an agent that enables it to take this action.

2:30 So the first thing that we're going to look at, we're going to call it memory. But we're going to break it down into two subparts. We're going to call them context and knowledge. So context in this case is going to be the business need for this agent. What exactly is it trying to optimize? Are we talking about minimizing downtime? Are we talking about preventing errors or, you know, managing CPU usage as an average? We don't want it to think unidimensionally, so we want to set a persistent context for the system that enables it to then go to this next source of information, which is the knowledge or the data that it can get from systems like Call Home or SMF records.

3:18 It's going to look through both unstructured and structured data, and arrive at an action that is governed by the tools that it has access to. So tools could be another model that is running on the structured data and producing a summary or an aggregation based on which it can decide what to do next. So there are other components in here. For instance, there is the summarization model, which could be its own agent, or there could be a problem identification subpart, which could be its own agent, followed by an action or a recommendation or a remedy component. So that's how all three of these components feed into the agent's ability to act on our problem.
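
As a rough sketch of how the three components described above (context, knowledge, and tools) might fit together, consider the following Python example. Every name here is an assumption made for illustration: `Context`, `Knowledge`, `summarize_cpu`, and `run_agent` are hypothetical, the lists are stand-ins for Call Home events and SMF-derived metrics rather than real record formats, and a production agent would typically use models in place of the simple rules shown.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable


@dataclass
class Context:
    """Persistent business goal the agent optimizes for (hypothetical fields)."""
    goal: str
    cpu_target_pct: float


@dataclass
class Knowledge:
    """Data the agent can read: stand-ins for Call Home and SMF-derived data."""
    call_home_notes: list[str] = field(default_factory=list)    # unstructured
    cpu_samples_pct: list[float] = field(default_factory=list)  # structured


def summarize_cpu(samples: list[float]) -> float:
    """Tool: aggregate structured samples into one average value."""
    return mean(samples)


def run_agent(context: Context, knowledge: Knowledge,
              tools: dict[str, Callable[[list[float]], float]]) -> str:
    """Combine context and knowledge, call a tool, and return a recommendation."""
    avg = tools["summarize_cpu"](knowledge.cpu_samples_pct)
    # Very crude scan of the unstructured notes for thermal warnings.
    hot = [n for n in knowledge.call_home_notes if "hot" in n.lower()]
    if avg > context.cpu_target_pct or hot:
        return (f"recommend: rebalance workload "
                f"(avg CPU {avg:.1f}%, {len(hot)} thermal note(s))")
    return f"recommend: no change (avg CPU {avg:.1f}%)"
```

The agent can only do what its `tools` dictionary allows, which mirrors the point that the action is governed by the tools the agent has access to. The summarization step could itself be replaced by a separate agent without changing `run_agent`.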
4:02 Now, if we take this environment and we take that first scenario that I talked about, we have very complex environments with multiple sysplexes running in the environment. Each one of these is managed on its own. You see the performance of each one, you manage the workload of each one. And so we manage each one independently. But when we think about the entire environment we're running, we really need to look across all of these. So if we can take this agent technology and apply it across all of the systems, we can get the information from all of them and make better decisions across them all.

4:49 Today, what people do, yeah, they shut off dev test. They just turn it off and say, sorry, you're out of luck. Well, maybe we don't have to do that. Maybe if we have the context of what's going on, the current environment, we can make better decisions with the agent technology to actually maybe not turn it off entirely but just reduce it appropriately so that we can have a better overall context.

5:15 And, more importantly, if we're doing this with agentic AI and it's able to take all of this information and perform actions, then my system programmers don't have to spend all that time looking at all this data, processing all this data themselves, going through all that RMF data and understanding how it's performing. Instead, they can spend the time doing more fun things like building up new systems or experimenting with new opportunities.

5:56 And so really and truly, instead of just focusing on AI as a general productivity improvement or the most common use case, which is fraud detection on mainframes, which is an absolutely wonderful use case, let's take it a step further. Let's use it in our internal systems, on our mainframes, to make our lives better and make the system programmer's or system administrator's or SRE's life, whichever way you want to call it, a lot more fun and remove those manual things that they're having to do today. Bringing this AI technology into the mainframe really can help make all of our lives better.
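
The idea of reducing dev test "appropriately" rather than shutting it off can be illustrated with a small calculation over a combined view of every sysplex. This is a minimal sketch under stated assumptions: the function name, the sysplex names, and the model of capacity as a single shared percentage are all hypothetical, and real workload management is considerably more involved.

```python
def devtest_allowance(prod_demand_pct: dict[str, float],
                      devtest_request_pct: float,
                      capacity_pct: float = 100.0) -> float:
    """Capacity dev/test may keep once production on every sysplex is served."""
    # Look across all sysplexes at once instead of one at a time.
    headroom = capacity_pct - sum(prod_demand_pct.values())
    # Clamp between zero (fully off) and the full request (no reduction).
    return max(0.0, min(devtest_request_pct, headroom))
```

With production demand of 40% and 35% on two hypothetical sysplexes and a dev test request of 30%, the function returns 25.0: dev test is trimmed rather than turned off. It drops to 0.0 only when production alone needs the full capacity.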