Notebook

Reply to Ken Stewart

This is superbly written (elevensies LOL) and this frame of task length barrier is a novel term to me. I really enjoyed how you’re thinking about this and working through the context barriers. The challenge I’ve had with agentic flows thus far has been they don’t feel terribly different to me than say RPA or workflow chains I’m used to.

I love elevenses. And it never stops me from having luncheon and afternoon tea either (especially when the King’s team alarm sounds, not getting fined again).

The majority of agentic flows you see today are being built by people who have recently read up on RPA and conventional workflow chains, or who come from that background, and they're leaning on it like a crutch. That's bad. What is good is when you leverage traditional system design, like RPA and workflow chains, to extend the reach that agentic flows have. So for the most part, everybody's getting agentic flows wrong. We saw it in the early days of gen AI automation, when people were creating workflows in [make.com](http://make.com/) or Zapier to manage their social media posting. They were just using gen AI as one task activity in the flow, instead of scripting that part or what have you. And that really did give the low-coders, the no-coders, and even non-technical people reach there.

So yes, I think because of the old way of thinking, because of being stuck or boxed in to a certain methodology, people are missing out on using generative AI in the correct way. There are a lot of workflows I've seen people demonstrate where I go, well, yeah, I built that with an RPA or workflow chain in the traditional way, and I can do it cheaper. You're not actually providing any kind of multiplying value there. And the agentic way of doing things is expensive. So if you're not getting that multiplier effect, that sort of exponential-graph effect, from implementing the agentic flow, then what you end up with is a very expensive RPA that's being run by an LLM instead.

On the opposite side of that, you have the other problem, and this is where the detractors of generative AI start popping up. What they attempt to do is let a generalist AI tool, like a large language model, act as an agent on our behalf, and it fails miserably. That's largely what this METR report is demonstrating: this year's large language models are very good at the small jobs that take under a minute to do, from an autonomous and agentic point of view, where you let them reason and let them decide how they do it. But they start failing substantially, certainly hitting a 50% failure rate, as you get higher through the timings of a task. So this is the opposite problem to the one you're describing. On one side you're seeing overpriced workflows being created, and on the other you have these equally expensive attempts at doing tasks agentically with horizontal, generalist large language models as the agent. They fail because they can't deal with the context windows, and they can't deal with how much knowledge they have and where the task and the goal sit within that large understanding of things.

You have to build vertical agents. And the problem is we’re configuring it in the wrong way. We’re configuring AI agents in very much the same way that we configured RPAs and workflows and the logic there, because we’re treating it like a machine. And we have to stop that. We have to think about configuring AI agents like we would teach an employee how to do their job, or teach a student or an apprentice how to do their job.

And therefore, what ends up happening if you create a true agentic workflow is that instead of writing an if statement, a case statement, or a switch statement, in either a programming language or natural language, you end up just giving it the policies and procedures like you’d give a human. It’s very unlikely that you’d give a human a rigid, logical if-statement script. There are only certain jobs where we do that, and yes, those jobs are 100% under threat, because they’re just following a script. But here we’re asking the AI agent to follow its skills, its knowledge of the tools it’s been given, and the knowledge base that all of us humans have access to, and to act on those. The agent then mediates between the large language model and those elements of knowledge and tooling, and that helps it decide which tooling to build and call, and what data it needs to go and find, just like a human would. That’s the key difference. So a lot of the frustration you’re seeing, with agentic flows being just glorified RPA, is a misunderstanding of how we’re supposed to be configuring this.
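To make the contrast concrete, here is a minimal Python sketch. Everything in it is hypothetical and only for illustration: the names `rpa_handle`, `VerticalAgent`, and `stub_decide`, the refund and tracking tools, and the policies are all made up. The `decide` function is a stand-in for the LLM call. Here it is a crude keyword match so the example runs without a model, which is of course itself rule-based. In a real agent that is the part you would hand to the model, together with the policies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


def rpa_handle(request: Dict[str, str]) -> str:
    """RPA style: every branch is written down in advance."""
    if request["type"] == "refund":
        return "refund issued"
    elif request["type"] == "tracking":
        return "tracking link sent"
    return "rejected"  # anything the script did not foresee


@dataclass
class VerticalAgent:
    """Agent style: policies and tools are handed over, the choice is delegated."""
    policies: List[str]                                  # plain-language guidance
    tools: Dict[str, Callable[[str], str]]               # the only tools it may use
    decide: Callable[[List[str], List[str], str], str]   # stand-in for the LLM

    def handle(self, message: str) -> str:
        choice = self.decide(self.policies, sorted(self.tools), message)
        if choice not in self.tools:
            return "escalated to a human"
        return self.tools[choice](message)


def stub_decide(policies: List[str], tool_names: List[str], message: str) -> str:
    """Hypothetical placeholder for a model call: matches tool names in the message."""
    text = message.lower()
    for name in tool_names:
        if name.replace("_", " ") in text:
            return name
    return "no_tool"


agent = VerticalAgent(
    policies=[
        "Refunds need an order number.",
        "Escalate anything you are unsure about.",
    ],
    tools={
        "issue_refund": lambda msg: "refund issued",
        "track_order": lambda msg: "tracking link sent",
    },
    decide=stub_decide,
)
```

The point of the shape is that `VerticalAgent.handle` contains no business branching at all. The only hard-coded rule is the guardrail that an unknown tool choice goes to a human.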

So the trade-off you talk about between a standard static code flow and a supposedly complex agentic one really only exists if the agentic flow lives inside the large language model. It doesn’t necessarily apply to a vertical agent that you’ve architected in the right way, using system instructions and access to the knowledge base as a human would have, so that it can decide its own if, case, and switch logic, including whether to create its own workflow with a different set of if, case, and switch statements. In our experience, out-of-the-box static code flows are not technical debt, because we pay for a service in which they get updated by the service provider, the vendor’s product support team. We supposedly don’t have to deal with that. But they’re less dynamic. They don’t always fit our use case, and they require some level of workaround, all those frustrations I know you’re aware of. So the trade-off doesn’t exist if you use vertical agents that are each constrained to a very specific task in the overall workflow you’re moving from a code flow to an agentic one.

The key difference between RPA-based flows and true AI-agentic workflows is that RPAs automate rule-based, repetitive tasks and mimic human actions in structured workflows. An RPA cannot make decisions, and it requires human intervention for exceptions. An AI-agentic workflow, by contrast, employs AI agents coordinated by another agent that orchestrates them, and it can learn, adapt, and make decisions dynamically to handle complex, unstructured tasks. The orchestrator can adjust the system instructions for its AI agents based on learnt information. Now, with RPAs we did this a little bit with data-driven dynamic RPA flows, but basically, if the data changed, the workflows were built in such a way that they would do an excessive amount of looping or reasoning to make those decisions quicker and better in those circumstances. But again, there’s still a limit to how dynamic an RPA can be with unstructured tasks compared with an AI agent system.
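As a rough illustration of an orchestrator adjusting its agents’ system instructions from learnt information, here is a small Python sketch. The `Agent` and `Orchestrator` classes and the `record_outcome` method are hypothetical names of my own, not any framework’s API, and the “learning” is reduced to appending a lesson after a failed run. A real system would have a model write and prune those lessons.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Agent:
    name: str
    system_instructions: List[str]


@dataclass
class Orchestrator:
    agents: Dict[str, Agent]
    lessons: List[str] = field(default_factory=list)  # everything learnt so far

    def record_outcome(self, agent_name: str, succeeded: bool, lesson: str) -> None:
        """After a failed run, fold the lesson into that agent's instructions."""
        if succeeded:
            return
        agent = self.agents[agent_name]
        if lesson not in agent.system_instructions:
            agent.system_instructions.append(lesson)
            self.lessons.append(lesson)


invoice_agent = Agent("invoices", ["Match each invoice to a purchase order."])
orchestrator = Orchestrator(agents={"invoices": invoice_agent})

# A failed run teaches the orchestrator something, and the agent's brief changes.
orchestrator.record_outcome(
    "invoices",
    succeeded=False,
    lesson="Credit notes have negative totals; do not flag them as errors.",
)
```

Nothing in the workflow itself was edited here. What changed is the brief the agent works from, which is closer to coaching an employee than to patching a script.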

But the trouble is, some of our AI practitioners and builders today are still building in the old way, not the new way. They’re putting AI agents on strict workflows as a means of creating the constraint that horizontal AI agents lack. The constraint of a vertical agent is obviously the right thing to aim for, but you don’t achieve it by rigidly boxing the agent into a workflow as you would an RPA system. You have to allow it to make decisions on what that workflow is and how it must change in certain circumstances.
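One way to picture constraint without a rigid workflow is to check the plan an agent proposes against its boundaries, rather than dictating the steps and their order. The sketch below is a hypothetical illustration under that assumption. `validate_plan`, `ALLOWED`, `REQUIRED`, and the step names are invented for the example.

```python
from typing import List, Set

# Hypothetical boundaries for a narrowly scoped refund agent.
ALLOWED: Set[str] = {"lookup_order", "issue_refund", "notify_customer"}
REQUIRED: Set[str] = {"notify_customer"}


def validate_plan(plan: List[str], allowed: Set[str], required: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the plan is acceptable.

    The order of steps is deliberately not checked: the agent decides that.
    """
    problems = []
    for step in plan:
        if step not in allowed:
            problems.append(f"step not permitted: {step}")
    for step in sorted(required - set(plan)):
        problems.append(f"required step missing: {step}")
    return problems
```

Two different plans, say `["lookup_order", "issue_refund", "notify_customer"]` and `["notify_customer", "lookup_order"]`, both pass, because the agent is free to choose and reorder its steps. A plan that reaches for a tool outside its vertical, or skips a required step, is sent back.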

We are absolutely still in the relatively early days with this leap step function, and the early pains of configuring and maintaining these outputs will grow easier over time from a practitioner standpoint, because there’ll be frameworks and established ways of doing it. But today there is nothing prescriptive that stops AI practitioners from building AI agents on the frameworks and fundamentals of an RPA. And this is the true crux of the problem: there is no reference point, no bible, like there is for RPA-based architecture or for SaaS architecture. There is no framework for how AI agentic systems should work. So people are easily going to slip into the old ways of doing things and just not understand what they’re doing. These are the early days and everyone’s learning. I’m learning and you’re learning. We’re all learning how to get this right. But we do need to break through this and understand that today it is possible to create a vertical agent that can get a 90 to 100% success rate on two-hour-long tasks. Manus AI is an example of a vertical agent that has been able to reach this far.

There seems to be a tradeoff between how complex our flows get (and necessary tools to create and manage) not to mention the technical debt incurred in upkeep after launch vs a standard code-flow that is static and needs less upkeep IME. The real crux of this seems to be 2 key points:

  1. GenAI (especially more narrowly scoped agents) seem to have a bit of craftiness to them to far exceed intuitive outputs of similar RPA-based flows (to be sure)

  2. we are still in relatively early days. With this leap step function, I have to believe early pains of configuring and maintaining said outputs has to grow easier (from a practitioner standpoint)

I once hired a former college professor to code a new blockchain licensing model. It was a disaster because he had all the knowledge, but none of the practical bits when it came to application. Love your framing!