MACE Framework: Assessing Agentic AI Tools
Key Points
- Manis AI launched in March 2025 with hype that outpaced its early performance, leading to reliability, cost, and token‑usage complaints until the platform began stabilizing around June‑July.
- The speaker highlights a broader challenge in AI: naming and categorising capabilities is difficult because the technology is highly general‑purpose, yet clear terminology is essential for practical work.
- To address this, a new “MACE” framework is proposed for evaluating agentic AI tools, comprising four dimensions—Modality, Autonomy, Complexity, and Environment.
- The framework’s first two dimensions are explained: Modality (e.g., text, coding, workflow, research, multimodal—Manis falls in the last category) and Autonomy (ranging from reactive prompt‑responses to fully autonomous agents), providing a structured way to discuss and compare AI agents.
Sections
- Proposing a Framework for Agentic AI - After recounting Manis AI’s rocky launch and reliability issues, the speaker introduces a new framework to name, categorize, and assess agentic AI tools.
- Classifying Modern AI Agent Capabilities - The passage contrasts basic step‑by‑step models like Claude and ChatGPT with more advanced, multi‑agent systems that support sequential, branching, and dynamic replanning tasks across cloud, IDE, and platform runtimes, proposing a framework based on complexity, execution environment, autonomy, and modality to map current AI agents.
- Balancing Autonomous AI and Human Collaboration - It highlights the rise of fully autonomous AI agents like Manis, the challenges of cost and task selection, and the importance of hybrid workflows where humans intervene.
- Enterprise AI Agent Challenges - The speaker outlines the major technical hurdles for scaling AI agents in enterprises—including tool selection and fallback, memory and long‑context management, cross‑modal context handling, token‑cost optimization, and designing robust error‑recovery decision trees.
- Scaling Multi-Agent Orchestration Amid Ambiguous Intent - The speaker discusses the difficulty of interpreting vague user requests while ensuring privacy, compliance, and scalability in enterprise‑level multi‑agent orchestration platforms like Manis.
- Manis: Unique Enterprise Agent Platform - The speaker explains that Manis stands apart from other AI agents—such as ChatGPT, Claude, and Google offerings—by tackling hard engineering challenges to deliver reliable, scalable, enterprise‑focused intelligence through its MACE framework.
- AI-Driven Process Mapping & Prototyping - The speaker explains how the AI tool Manis enables operations teams and consultants to rapidly map workflows, generate documentation, and build technical proof‑of‑concepts, delivering fast, low‑cost first drafts that save weeks of manual effort and cut expenses by up to 90%.
- Manis AI Agent Launch Forecast - The speaker predicts major AI companies will soon release a Manis‑type agent, highlighting its ability to dramatically cut costs on high‑price specialized tasks and generate new revenue streams for model providers.
Full Transcript
Source: https://www.youtube.com/watch?v=8m2-WKhidYk
Duration: 00:27:02
Section timestamps: 00:00:00 Proposing a Framework for Agentic AI · 00:03:30 Classifying Modern AI Agent Capabilities · 00:06:52 Balancing Autonomous AI and Human Collaboration · 00:10:50 Enterprise AI Agent Challenges · 00:14:10 Scaling Multi-Agent Orchestration Amid Ambiguous Intent · 00:17:37 Manis: Unique Enterprise Agent Platform · 00:21:31 AI-Driven Process Mapping & Prototyping · 00:25:17 Manis AI Agent Launch Forecast
Manis AI launched in March of 2025 and
I didn't talk about it very much and the
reason why is that it was another one of
those cases like Devin where the hype
video ran way ahead of what people in
practice were actually able to do. And
so Reddit forums filled up and Twitter
complaint threads started. And the
long and the short of it was that after
launch in March through about June or
July, there were a lot of issues with
reliability, with cost, and with the opacity
of token consumption. That is starting to
shift. It is shifting enough and the
platform is stabilizing enough that I
think it's worth having a wider
conversation. But before we do that, I
want to talk about what I think is
actually one of the key challenges when
we think and talk about AI and agents in
particular: naming things. It is really
hard to name an AI capability because AI
is such a slippery technology. It's
general purpose. It can do anything. And
so naming and categorizing what these
different things do becomes both really
important to get work done and also not
at all obvious. And
so before we dive into the capabilities
of Manis itself and kind of why I think
the platform is stabilizing and the use
cases it supports, I want to take a
second to talk about a proposed
framework for how we assess Agentic AI
tools. As far as I know, we haven't
really had a good framework for this.
That's why I'm proposing one. I want to
go through it. Tell me where it's wrong.
Tell me where it's better. Let's dive
in. I'm calling this the MACE framework.
MACE stands for modality, autonomy,
complexity, and environment. I think
those four things are all dimensions
that we need to assess agentic AI tools
on and that we've really lacked the
language for assessing them on
previously. Let's dive into each of
these. Number one, what is the primary
modality of this tool? And there's at
least five different things you can look
at there. Text agents. Examples of that
would be Claude, ChatGPT, Gemini. They
generate, they analyze text. Number two,
coding agents. Cursor, GitHub Copilot,
Claude Artifacts. Number three, workflow
agents. n8n, Zapier, Make, LangChain,
etc. Number four, research agents like
Deep Research or Perplexity.
And number five, multimodal agents.
Manis falls into that category.
There are probably other primary
modalities, but you get the idea, right?
It's basically asking: what is the primary
mode of this agent? That becomes a relevant thing.
Number two, autonomy. What is the
degree of proactive autonomy that this
agent brings to the table? It can be
reactive, so it responds to individual
prompts, again like Claude or ChatGPT in
the text window. It can be interactive,
so it might be multi-turn with human
guidance. You have that sometimes when
Deep Research comes back and asks you
a question. It can be semi-autonomous, so
it might execute plans with checkpoints;
an example of that is GitHub Copilot
Workspace, which will come back
and ask you along the way. Or it could be
completely autonomous: end-to-end
execution with very minimal intervention.
Manis and Devin are both in that
category. All right. What is the C in
MACE? Complexity handling.
It can handle simple tasks step by step.
Some of the non-reasoning models in
Claude and ChatGPT fall in this
category. It can handle sequential
multistep. I would argue that Claude
Code is a good example of sequential
multistep. It might handle branching
which is more complex. Good n8n
workflows will do that. Or it might do
dynamic replanning based on the results
of what it sees. Manis does that and
more advanced agent configurations can
do that as well. You can set up Claude
with multiple agents to do that in Claude
Code, for example.
What is the E? The execution
environment. Is it cloud contained, so
it runs in the provider's sandbox?
Claude and ChatGPT both do that in their application
interfaces. Is it integrated into your
IDE? So it works within the development
environment inside cursor for example.
Is it platform hosted with dedicated
agent runtime? n8n can be that. It
doesn't have to be that depending on how
you configure it. Is it infrastructure
spanning? Can it deploy or access
different external systems and use
complex tools? Manis can do that. You
can configure Claude Code to do that as
well. And when you look across
this, it's easy for you to say, well
Nate, you just said a bunch of things,
right? You said it needs to have
complexity defined, execution
environment defined, autonomy defined,
the modality defined. That's all well
and good, but what do we know about the
current generation of AI agents and
where they would fit? I want to suggest
that we have at least six different
categories, practical categories of AI
agents out there today that sort of fit
within this broader spectrum of use
cases.
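The four MACE dimensions described above can be written down as a small taxonomy. A minimal sketch in Python: the enum values follow the descriptions in this video, but the class and field names are my own, and the Manis classification at the end is just the example given here, not an official rating.

```python
from dataclasses import dataclass
from enum import Enum

class Modality(Enum):
    TEXT = "text"             # Claude, ChatGPT, Gemini
    CODING = "coding"         # Cursor, Claude Artifacts
    WORKFLOW = "workflow"     # n8n, Zapier, Make
    RESEARCH = "research"     # Deep Research, Perplexity
    MULTIMODAL = "multimodal" # Manis

class Autonomy(Enum):
    REACTIVE = 1        # responds to individual prompts
    INTERACTIVE = 2     # multi-turn with human guidance
    SEMI_AUTONOMOUS = 3 # executes plans with checkpoints
    AUTONOMOUS = 4      # end-to-end, minimal intervention

class Complexity(Enum):
    SIMPLE = 1       # step-by-step tasks
    SEQUENTIAL = 2   # linear multi-step
    BRANCHING = 3    # conditional paths
    DYNAMIC = 4      # replans based on results

class Environment(Enum):
    CLOUD_CONTAINED = "cloud"
    IDE_INTEGRATED = "ide"
    PLATFORM_HOSTED = "platform"
    INFRA_SPANNING = "infrastructure"

@dataclass
class MaceProfile:
    name: str
    modality: Modality
    autonomy: Autonomy
    complexity: Complexity
    environment: Environment

# Example classification based on the descriptions in the video
manis = MaceProfile("Manis", Modality.MULTIMODAL, Autonomy.AUTONOMOUS,
                    Complexity.DYNAMIC, Environment.INFRA_SPANNING)
print(manis)
```

Writing the framework down this way makes comparisons concrete: two agents that share a modality but differ on autonomy or environment are not really competitors.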
The first one, it's the simplest, right?
The conversational generators. ChatGPT,
Claude, Gemini all come to mind.
DeepSeek comes to mind. You use them
when you fundamentally need high-quality
text generation back.
The second class is coding assistants.
When you need to write code and you have
a feedback loop for it, depending on how
you configure it, Claude Code is a great
example here. Cursor does this, Windsurf
does this, etc.
You can't use these when you need sort
of broader system orchestration unless
you're going to configure them
specially. And so I think that the
exception to a lot of this is Claude
Code because it's such a malleable tool.
And so that's why it appears in a couple
of these. But I think that code
assistant is the good vanilla or generic
use case for Claude Code. Third class of
agent categories, workflow orchestrators
and Zapier and Make all fall in here. So
you're connecting known systems, you've
got predictable data flows. You may have
trouble with ambiguous inputs. These
systems tend to be somewhat brittle.
Number four, research synthesizer
agents. So Deep Research works here.
Perplexity has a deep research function
and you can also use Claude in its deep
research function. You put Opus 4.1 on
there. You have it search the web and
think hard.
You need current information compiled.
You need it analyzed. You need to act on
findings or integrate them with systems.
Typically I find that the acting part is
a problem with these, right? Like if you
need to actually take an action rather
than just read, don't use these. But if
you just need to develop the
information, it needs to be very high
quality. Research synthesizers are
really, really good and people are using
them for those use cases. Now, number
five, autonomous execution agents.
Manis and Devin obviously go in here,
and there are custom agents that also
work this way. There are people who are
running Claude Code continuously, for
example, and so it's been configured
specially to be an autonomous execution
agent.
More and more energy is going into this
category number five. That's part of why
I've called out Manis: because I think
it is a flagship toward a wider future
of autonomous AI execution, and it is
worth paying attention to on that basis
because the world is going to look more
like Manis in the future.
The challenge is managing the cost and
managing the complexity. You have to
know what kind of tasks you want to
entrust to an agent that is that
complex. The sixth category, hybrid
collaboration. And so there are a lot of
these where you want it to come
back and talk to you. You want it to
engage with you. I would say Cursor
Composer is a great example of this,
where there's some degree of human
judgment. There's AI capability. I feel
like Andrej Karpathy has done a great job
talking about that nuanced human
collaboration piece that happens with
good agent workflows. One of the things
he emphasized in a tweet a few weeks
ago is that as we build these AI agents,
probably too much focus right now is
going into bucket five with autonomous
execution. And we are sometimes missing
the realization that we need to have the
right time for the human to touch the
model or for the human to touch the work
because humans can bring tremendous
value especially seasoned experienced
humans who have domain knowledge. And it
is critical to give humans space to do
that. Well, all right, those are six
examples. You've got that mace framework
in your head. We've talked about how
these different agents all bucket
together. I hope you've gotten a better
sense of the landscape. I think we need
to have more of these conversations
around how we bucket these
intelligences. To me, one of the things
that really needs to happen is that we
need to have some degree of tagging
that goes with these names because
Claude Code is a great example. It
doesn't just code. It does a lot more
than code, but it was named Claude Code.
Manis happens to write code. It also
runs it. It also continues the workflow.
Calling it a general purpose agent is
fine, but it would be more precise to
talk about it as a multi-agent
orchestrator. I know that's a bit of a
handful, but the precise wording helps
us to name what agents to compare things
to because otherwise we end up making
inappropriate comparisons. I would not
right now compare the agent mode that
ChatGPT shipped with Manis. Those
are different architectures. They
have different capabilities. I would say
Manis is a whole lot better than the
agent mode that ChatGPT shipped, and
it's not close. And I'm not even sure
they're playing in the same ballpark
even though they're both called agents.
So if I were to think about this, the
first thing I would do is I would say
how do we talk about
the challenges associated with not only
naming, which I think we've done the
naming thing. We won't do any more on
the naming. But the challenges
associated with stabilizing these
technologies into reliable flavors that
companies can go and access because part
of why I am doing the work on naming and
talking about naming and part of why
I've talked a lot about the whole
ecosystem before getting into Manis
specifically is I think that
organizations need some predictability
to purchase and delivering that
predictability with a technology like AI
is actually quite challenging. You have
to solve complexity of orchestration,
right? You have to solve state
management across modalities. So if you
have different sub-agents and you're
trying to sell this as a bundle the way
Manis is, you have to be able to show
that each sub agent can maintain its own
state, but the orchestrator needs to be
able to have global coherence because
the enterprise will expect that. You
have to be able to show that state
complexity can continue to be maintained
despite task length and modality
extending.
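A loose sketch of that state split, assuming a design where each sub-agent keeps its own local history and the orchestrator promotes selected results into a shared layer for global coherence. All names here are hypothetical, not Manis's actual architecture.

```python
from dataclasses import dataclass, field

@dataclass
class SubAgentState:
    """Each sub-agent maintains its own local state."""
    name: str
    history: list = field(default_factory=list)

class Orchestrator:
    """Keeps a shared fact store so sub-agent results stay globally coherent."""
    def __init__(self):
        self.agents = {}        # name -> SubAgentState (local state)
        self.shared_facts = {}  # global coherence layer

    def record(self, agent, step, fact_key=None, fact_val=None):
        state = self.agents.setdefault(agent, SubAgentState(agent))
        state.history.append(step)                  # local state grows per agent
        if fact_key is not None:
            self.shared_facts[fact_key] = fact_val  # promoted to the global view

orch = Orchestrator()
orch.record("coder", "generated schema", "db_schema", "users(id, email)")
orch.record("writer", "drafted docs using the shared schema")
# The writer can read the coder's result from the shared layer:
print(orch.shared_facts["db_schema"])  # users(id, email)
```

The point of the split is auditable coherence: local histories can grow or be compacted per agent, while the shared layer is the small, consistent view the enterprise can inspect.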
Another example is tool selection. When
it's uncertain, how can you show the
enterprise what tool choice the agent
will execute on when it's not sure
what's going to happen? What does the
fallback look like? What does the error
handling look like?
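One way to make that auditable is an explicit fallback chain that records every tool decision. A hypothetical sketch with stub tools, not any vendor's actual API:

```python
def call_with_fallback(tools, query):
    """Try an ordered list of tools; log each outcome so an
    enterprise can audit which fallback fired and why."""
    audit_log = []
    for tool in tools:
        try:
            result = tool(query)
            audit_log.append((tool.__name__, "ok"))
            return result, audit_log
        except Exception as exc:
            audit_log.append((tool.__name__, f"failed: {exc}"))
    raise RuntimeError(f"all tools failed: {audit_log}")

def web_search(q):     # primary tool (stub that simulates an outage)
    raise TimeoutError("search API unavailable")

def cached_search(q):  # fallback tool (stub)
    return f"cached results for {q!r}"

result, log = call_with_fallback([web_search, cached_search], "Q3 revenue")
print(result)  # cached results for 'Q3 revenue'
```

The audit log is the answer to the enterprise question: when the agent wasn't sure, here is exactly which tool it tried, which failed, and which it fell back to.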
Memory management and context is another
big piece that these kinds of companies
including Manis have to solve
effectively. How do you handle long
workflows that accumulate huge
context? One of the biggest challenges
in AI right now is that enterprise
businesses bring enterprise-scale context
and it's very very difficult to bring
that to AI in a way that's reliable and
scalable.
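One pattern people reach for here is rolling summarization: compact the oldest history into a summary entry instead of dropping it. A minimal sketch; in a real system a model would write the summary, and the string slicing below is only a stand-in.

```python
def compact_context(messages, max_items=6, keep_recent=3):
    """If history exceeds the budget, fold the oldest messages into a
    summary entry so dependencies survive, and keep recent turns verbatim."""
    if len(messages) <= max_items:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = "SUMMARY: " + "; ".join(m[:20] for m in old)  # stand-in for an LLM-written summary
    return [summary] + recent

history = [f"step {i}: details of step {i}" for i in range(10)]
compacted = compact_context(history)
print(len(compacted))  # 4: one summary entry plus the three most recent steps
```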
You can't just truncate. You might lose
dependencies. You have to figure out how
you handle external memory, how you
handle summarization, and it's not
entirely intuitive how to do that at
enterprise scale. Another example of
challenges that these kinds of agents
need to solve: cross-modal context, and
how you avoid context bleed. So code
outputs might need to inform text
generation for a complex task. But you
have to account for the fact that they have
different context requirements and
different token economics, so you're not
spending code-priced tokens on text if
code is more expensive and so you're not
leaking requirements back and forth
between the two. Errors are another
challenge. How do you avoid a situation
where you go into an error loop when one
sub agent fails? That's a really hard
challenge. What does an error recovery
decision tree look like that an
enterprise can audit and understand?
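Sketched as code, an auditable error-recovery policy might bound retries for transient failures, refuse to retry unrecoverable ones, and escalate rather than loop. Everything here is hypothetical:

```python
def run_with_recovery(task, max_retries=2):
    """Explicit, auditable decision tree: retry transient errors up to a
    bound, never retry bad-input errors, escalate instead of looping."""
    decisions = []
    for attempt in range(max_retries + 1):
        try:
            return task(), decisions
        except TimeoutError:
            decisions.append(f"attempt {attempt}: transient error, retrying")
        except ValueError as exc:
            decisions.append(f"attempt {attempt}: unrecoverable input error: {exc}")
            break  # do not retry errors a retry cannot fix
    decisions.append("escalated to human review")
    return None, decisions

calls = {"n": 0}
def flaky_subagent():
    calls["n"] += 1
    if calls["n"] < 2:
        raise TimeoutError("tool timed out")
    return "done"

result, audit = run_with_recovery(flaky_subagent)
print(result, audit)  # done ['attempt 0: transient error, retrying']
```

The retry bound is what prevents the error loop, and the decision list is what an enterprise reviewer would read afterward.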
Resource predictability is another big
one. This has been one of the chief
issues that people have complained about
with Manis.
How do you predict what it's going to
cost if you're paying in credits? When a
credit is burned, is it the same value
for every action or not? People have
complained that some days it seems like
Manis burns more credits and some days
it seems like it burns fewer credits and
it's not predictable. Overall, it has
gotten much better since March and
that's part of why I'm talking about it
now. But it isn't at the degree of
enterprise predictability that it needs
to be yet. QA is another massive
challenge. How do you validate code and
not just code but engineering
configurations when the LLM designs all
of it with multi-agent orchestration?
That's really hard to do. That is one of
the reasons why Manis is more popular
right now with consultants, more
popular with independent builders than
it is with enterprises. Last but not
least, I want to talk a little bit about
user intent and model coordination.
It is really really difficult to handle
different model results consistently
over time if you have different sub-agents
working in the same way and some
of them are from different models. That
is not an easy and obvious task to do
but it's a task that many of these
builders are trying to handle underneath
the covers because of the unit economics
associated with token burn for models.
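That unit-economics pressure is why builders route work across models of different cost. A toy sketch of cost-based routing; the prices and task labels are made up for illustration.

```python
# Hypothetical per-1K-token prices; real prices vary by provider.
PRICES = {"small-model": 0.0005, "large-model": 0.015}

def route(task_kind):
    """Send routine steps to a small model; reserve the expensive
    model for hard reasoning, to protect margins."""
    return "large-model" if task_kind in {"planning", "code-review"} else "small-model"

def cost(task_kind, tokens):
    return PRICES[route(task_kind)] * tokens / 1000

# Routing 80% of traffic to the small model cuts the blended cost sharply:
blended = 0.8 * cost("summarize", 1000) + 0.2 * cost("planning", 1000)
print(round(blended, 4))  # 0.0034, versus 0.015 if everything went to the large model
```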
And so there was an article, I think
it mentioned Notion, where it said
basically 10 percentage points of
Notion's margin have been eaten up in
the last year just because Notion is
using AI models, and so AI model makers
are starting to eat SaaS margins. Well, if
you want to combat that you have to have
a multi-agent configuration, but your
multi-agent configuration needs to
actually work. And that's really hard.
And it gets harder when you think about
the second part of what I said, user
intent. How do you handle user intent
when users are not intentful? When they
aren't clear about what they want. And
you have to assume at the enterprise
level that on the one hand, you're going
to have engineers that are really clear.
And on the other hand, you're going to
have people who just say, "Make it good."
Good luck with that, right? Or, "Make a
dashboard." Well, good luck with that.
How do you interpret that? How do you
interpret that for the enterprise in a
way that is compliant with privacy,
that is able to handle all of the
challenges that come with building a
fully-fledged product, and in line with
the user's presumed intent? How do you
handle that with pushback and questions,
etc.? Everything I just described
covers the scaling challenges
associated with multi-agent
orchestrators like Manis, and it is why
they are hard to scale to the enterprise.
I didn't even get to the technical
scaling part. Scaling out the actual
system so it serves the enterprise. That's
another challenge. Why am I going over all the
hard things? This all explains the
challenge that Manis is trying to solve.
Why I believe it's important that we
talk about it and why I believe Manis's
current position makes sense. At the end
of the day, Manis is trying to get to a
point where they can scale multi-agent
orchestration for the enterprise. But to
do that, they're running the classic
startup playbook where they're starting
with indie builders, they're starting
with small startups, and then they're
going to gain the experience they need
to move into the enterprise space.
They are trying to solve all of these
problems in ways that are transparent to
the user and in ways that enable the
user to deliver value specifically where
Manis is good. I'm going to talk about
those use cases toward the end of this
video, but Manis' current position is
pretty simple. They've chosen to
optimize for reliability
and they've chosen to optimize for
capability.
And that explains the issue with cost.
The old engineering dilemma is you can't
optimize for reliability, capability,
and cost all at once. You've got
to pick two out of three, right? You can
be reliable and capable, but you're not
going to be cheap. You can be reliable
and cheap, but you're not going to be
capable. You can't have all
three. And in a sense, I think Manis
has one of the most transparent pricing
systems in the business because when the
tokens run out, you just buy more tokens
and they can allocate the compute to the
people who are willing to pay. They're
also following a very typical platform
evolution pattern. You have a demo phase
that happened in March. You have early
access which happened sort of roughly
April to June. People found edge cases.
People had reliability issues right on
schedule. They're now stabilizing.
They've fixed a number of those
problems. It's not perfect, but it's
good enough to start talking about. And
then going from there, they're going to
be optimizing and scaling into the
second half of this year.
The fundamental tension
that they are operating within is as
follows. Users want ChatGPT
simplicity. They want autonomous
execution and they want predictable
costs. They can't have it all. And what
you're getting is complicated workflows.
You are getting autonomous execution and
you're getting variable costs. And this
core tension explains why Manis
remains in the expensive specialist tool
category rather than a mainstream app
because at the end of the day solving
for the engineering challenges that
would enable ChatGPT simplicity and
predictable cost and autonomous
execution is non-trivial. Do you want to
know an example of why it's non-trivial?
Nobody else has launched a competitor
that really matches Manis from one of
the major model makers. The agent mode
from ChatGPT doesn't do it. Claude Code
is just in a separate category. I would
argue it's not the same thing.
Google hasn't launched something. Manis
is its own thing. Part of why is because
the engineering challenges they're
solving are really, really tough. I've
been through a few of those. Okay. So,
we've spent a lot of time talking about
a framework. We talked about the MACE
framework for intelligence and how you
talk about agentic intelligence. We
talked about some of the practical
categories with agents, conversational
generators, code assistants. We came all
the way to autonomous executors where I
would argue Manis sits. We talked a little
bit about the challenge that comes from
scaling intelligence for the enterprise
and sort of where we are in this moment.
I talked about state management. I
talked about context and how you manage
that, about error propagation, etc. All of
this is top of mind, and I wanted to
contextualize it, because now you
understand where Manis is: why Manis is
optimizing for reliability and
capability (quality has been an issue
with these agents, and they know that, and
they know people won't come back to a
challenger brand if they don't optimize
for it); where they are in their platform
evolution; why they are really positioned
as a specialist tool right now, though they
are getting better; and why they've had
trouble getting to the enterprise stage,
though I think they are poised to get
there if they continue to optimize. All
of that being said, in the present state,
in September of 2025,
what are the practical use cases where Manis
is likely to be useful today? I think
there are several sweet spots. I've
already seen this: I know people who
are using Manis for these things who are very
happy. And I think there's a pattern
we can see that suggests how these tools
tend to evolve, which we can apply in
other cases as well, because remember, the
whole purpose of this discussion, it's
not just Manis, is that we're trying to
understand how Manis exemplifies where
agentic tooling is going. So, use case
number one, high-value research and
analysis: monthly or quarterly industry
analysis for execs, competitive
intelligence briefings, due diligence
research packages, that kind of thing.
Manis wins here because the
cost is justifiable. If it costs a
hundred bucks to develop that report,
it's a lot cheaper than 2,000 bucks for
the consultant.
It combines web research. You have to
have a nicely formatted output. You have
to analyze the data. And human review is
expected anyway before you make
strategic decisions. So, it's not too
risky and the time savings can be huge.
If it takes two hours to make that
report, it saves you days and days of
work. Use case number two, content
marketing production pipelines. So, if
you're managing multiple clients as a
small agency, if you're a SaaS company
with regular content needs, Manis can
win here because they can scale content
production without a linear cost
increase. They can handle research,
analysis, and creation and formatting of
the content. The quality bar tends to be
good first draft versus publication
ready. And the ROI is really clear
because otherwise you're hiring content
writers. Use case number three, data
analysis and visualization for non-technical
teams. So, business analysts that don't
have coding skills. Are there any of
those left? Marketing teams analyzing
campaign performance. Small businesses
that just need ad hoc analysis. You kind
of get the idea. Manis eliminates the
need to learn Python, right? Or R or
hire a data scientist. It handles messy
data. It handles the analysis and it
handles the visualization piece. I can
think of other tools that do other parts
of that, but I don't know of any tools
that do all of those parts besides
Manis. Output quality will often exceed
Excel-based manual analysis and time to
insight is reduced. Now, with truly
large enterprise data sets, this is not
going to work and I'm not going to
pretend it will. And that's why I
emphasize sort of the small business use
case. Use case number four, process
documentation.
So operations teams that document
workflows, consultants that are
analyzing client process, creating
training materials, Manis wins because
you can map existing processes and
identify opportunities and create
documentation on the fly, right? And it
and it's very fast. It can save weeks of
manual process scraping and it can
provide immediately actionable
recommendations in a nice visualized
format. Again, not a super risky use
case. It saves a ton of time. Technical
proof of concept development is the
fifth one. So if you want to validate a
product idea, if you want to explore a
new integration, if you want to create
technical specs as a PM,
this goes beyond what you would get with
Lovable, because Manis can create the
prototype, the documentation and the
deployment in one big workflow. It can
handle multiple technical domains and at
the end of the day, you want speed
versus production ready code. Overall,
again, you see the same pattern. And I
want to call out the success patterns
here because I think that there's
something there's something to this that
we should understand when we see where
agents are at in the fall of 2025. One,
there needs to be economic
justification. All of the tasks that
I've described are $500 to $5,000 if
done manually, often in the thousands.
The Manis cost is going to be a
fraction of that, a tenth of that or
less. The time savings typically stretches
into days, and the quality
expectation tends to be make a fantastic
first draft not a perfect final product.
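The economic justification is simple arithmetic. A worked example using figures in the ranges quoted above; the specific $2,000 and $100 numbers are illustrative, not measured.

```python
manual_cost = 2000   # e.g. a consultant-produced research report
agent_cost = 100     # illustrative cost of an agent run, credits included
savings = manual_cost - agent_cost
savings_pct = 100 * savings / manual_cost
print(savings, f"{savings_pct:.0f}%")  # 1900 95%
```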
final product. Those are the sweet spots
agents in the fall of 2025.
Especially, if you notice, because these
are technically complex workflows. They
have five to 15 to 20 to 25 distinct
actions. They're combining research with
creation with formatting. They have
human review and refinement and they
have very clear deliverables. Now,
I think the thing to call out is that if
you have an agent that excels at complex
multi-domain workflows where the
alternative is hiring expensive
specialists,
what you have is still a premium
automation tool rather than a general
productivity app. We keep coming back to
this idea that there are certain buckets
in the agentic landscape that are more
specialized than others and where you're
going to be paying more as a result. And
one of the things I want you to take
away from this is that AI agents should
not be viewed as a singular bucket
anymore. You have agents that are going
to be positioned as general productivity
tools and you have agents that are going
to be positioned as specialist tools for
specialist tasks. Manis as it stabilizes
is looking more and more like a
specialist tool for a specialist task.
It's looking like a surgeon's scalpel
versus a Swiss Army knife. I'm sure they
would like to be a Swiss Army knife from
an economics perspective, but because of
the engineering challenges I've
identified earlier in this video, it's a
really hard position for a multi-agent
orchestrator to be in. That being said,
I want to end my video by talking about
it because I think that is where the
market is going and that is why it's
worth talking about Manis. Manis is like
the canary in the coal mine. They're the
ones that are showing us the way forward
on multi-agent orchestration, what it
looks like independently. They've shown
how you can start to stabilize a product
even for smaller businesses and
independent builders or for teams on
larger businesses that don't have too
heavy data needs, but
they don't have the scale that the major
model makers have and they haven't been
able to build out the kind of footprint
that would enable them to really harvest
unit economic gains and bring down the
cost curve.
And that's where I think we're going. I
think it is likely that we will see a
version of Manis from a major model
maker in the next few months. Maybe from
Google, maybe from Claude, maybe from
OpenAI, but the value that people see
with these complex use cases is
very high. If you were spending three,
four, $5,000 on this, yeah, you're going
to be willing to pay whatever it
costs to get this done with an AI agent
because it's so much cheaper. A tenth, a
fifth of the price, it's going to be so
much cheaper. Which, if you are looking
as a major model maker to recover some
of the cost associated with these
models, you want to have more reasons
for people to pay you 200 bucks. Maybe
this is, you know, the reason why you
have a 200 buck ad hoc task on top of
your 200-buck subscription. And some
people will pay for that because it's so
good. And you're going to see a lot more
of the economists at these major model
makers (and yes, they have economists)
looking for these kinds of specialist
tasks that enable them to scale margin
and that's where Manis is showing the
way. So if you have specialized tasks, if
you have the stuff that I have talked about,
where it's, you know, a $500 to $5,000 task,
and you know that you need to get that
task done no matter what, maybe try
Manis. It will probably save you a fair
bit of money and you will be willing to
pay the cost because it's so much
cheaper and the ROI is so clear,
especially if you're looking for an
excellent first draft. So, that's my
verdict on Manis. I waited to talk about
it till it started to stabilize. I feel
excited to talk about it now. I think
it's a great example of how AI agents
are developing specialized use cases in
fall 2025. And I'm excited to see where
Manis goes next. Cheers.