(3) The Missing 'LLM Workflow' Quadrant in Our Vocabulary — Re-visiting ReAct Agent Application Scopes

Update (2026-04-30): To align with the AAP repository, two of the four quadrant names have been renamed: (2) Algorithmic Search Quadrant and (4) Autonomous Agentic Loop Quadrant. The Script Quadrant and LLM Workflow Quadrant remain unchanged. While terminology used as quadrant names in the body has been replaced with the new names, references to the ReAct pattern (Yao et al. 2022) or ReAct loops remain as they were.

Premise

In my previous article, I categorized business AI into four quadrants. The horizontal axis represents "Deterministic / Requires Semantic Judgment," and the vertical axis represents "Workflow Definable / Exploratory." To facilitate discussion throughout this article, I will assign short names to these four quadrants:

  • (1) Script Quadrant — Deterministic × Definable. Processes handled by scripts/pipelines.
  • (2) Algorithmic Search Quadrant — Deterministic × Exploratory. A* search / dynamic programming / MCTS / reinforcement learning (outside the scope of this article).
  • (3) LLM Workflow Quadrant — Semantic Judgment × Definable. Calling LLMs within a predefined workflow. This includes "workflow patterns" as described in Anthropic's "Building Effective Agents" (prompt chaining, routing, orchestrator-workers, etc.), specialized chat agents for conversational tasks, and single-function LLM tasks for batch processing.
  • (4) Autonomous Agentic Loop Quadrant — Semantic Judgment × Exploratory. Autonomous loops where the LLM itself determines the next action.
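The two axes and four names above can be written down as a small lookup. This is only an illustrative sketch: the `Quadrant` enum and the `classify` function are hypothetical names introduced here, and reducing each axis to a boolean is a simplifying assumption.

```python
from enum import Enum


class Quadrant(Enum):
    # Names follow the article's four quadrants.
    SCRIPT = "(1) Script Quadrant"
    ALGORITHMIC_SEARCH = "(2) Algorithmic Search Quadrant"
    LLM_WORKFLOW = "(3) LLM Workflow Quadrant"
    AUTONOMOUS_AGENTIC_LOOP = "(4) Autonomous Agentic Loop Quadrant"


def classify(needs_semantic_judgment: bool, workflow_definable: bool) -> Quadrant:
    """Map the two axes (hypothetically reduced to booleans) onto a quadrant."""
    if not needs_semantic_judgment:
        # Deterministic side of the horizontal axis.
        return Quadrant.SCRIPT if workflow_definable else Quadrant.ALGORITHMIC_SEARCH
    # Semantic-judgment side of the horizontal axis.
    return Quadrant.LLM_WORKFLOW if workflow_definable else Quadrant.AUTONOMOUS_AGENTIC_LOOP
```

For example, a task that needs semantic judgment but whose steps can be fixed in advance lands in `Quadrant.LLM_WORKFLOW`.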

The argument of the previous article was that the ReAct agent is truly necessary only for the Autonomous Agentic Loop Quadrant, and that the majority of business tasks can be handled within the Script Quadrant and LLM Workflow Quadrant.

This article examines the root cause of this issue. Why do agent vendors insist on overlaying the architecture of the Autonomous Agentic Loop Quadrant onto the entire business domain? Why are tasks that could be handled within the LLM Workflow Quadrant designed as autonomous loops instead?

This seems less like a technical choice problem and more of an issue with industry vocabulary: the LLM Workflow Quadrant lacks an independent name.

(3) The LLM Workflow Quadrant is missing from our vocabulary

When reading the discourse on agents, the narrative around business tasks generally converges into two categories:

  • Deterministic parts are left to the old methods, as before.
  • Everything else is handled by autonomous agents.

There is no place for the LLM Workflow Quadrant in between. There is no standard industry term that affirmatively names the architecture of "building workflows deterministically and calling LLMs only at points requiring semantic judgment." While similar points have been made in Anthropic's "Building Effective Agents" or Thoughtworks' critiques of "agentwashing," they primarily focus on negative warnings like "don't build agents where autonomy isn't needed." A term that affirmatively labels the LLM Workflow Quadrant as an independent design quadrant remains absent.
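To make "building workflows deterministically and calling LLMs only at points requiring semantic judgment" concrete, here is a minimal sketch using invoice matching as the example. Everything in it is an assumption for illustration: the `Invoice` and `PurchaseOrder` types, the `match_invoice` function, and especially `stub_judge`, which stands in for a real LLM call with a fixed prompt and does no actual model inference.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Invoice:
    vendor: str
    amount: int


@dataclass(frozen=True)
class PurchaseOrder:
    vendor: str
    amount: int


# Signature of the single semantic-judgment point: "are these the same vendor?"
VendorJudge = Callable[[str, str], bool]


def stub_judge(a: str, b: str) -> bool:
    """Hypothetical stand-in for an LLM call; a crude normalizer, not a model."""
    def normalize(s: str) -> str:
        return s.lower().replace(".", "").replace(",", "").replace(" inc", "").strip()
    return normalize(a) == normalize(b)


def match_invoice(invoice: Invoice, po: PurchaseOrder, judge: VendorJudge) -> str:
    # Deterministic check: amounts must agree exactly.
    if invoice.amount != po.amount:
        return "rejected: amount mismatch"
    # Deterministic check: identical vendor strings need no judgment.
    if invoice.vendor == po.vendor:
        return "matched: exact"
    # The only point where semantic judgment is invoked.
    if judge(invoice.vendor, po.vendor):
        return "matched: semantic"
    return "escalated: vendor mismatch"
```

The control flow is fixed by the designer; the judge only answers one narrowly scoped question, so its role in any result is known in advance.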

What happens when there is no affirmative vocabulary? After agent designers separate the deterministic parts from the rest, they treat the latter collectively as the "domain where the LLM judges autonomously." Because they lack a sharp distinction between the LLM Workflow Quadrant and the Autonomous Agentic Loop Quadrant, they default to overlaying the Autonomous Agentic Loop Quadrant architecture—the ReAct loop—onto the LLM Workflow Quadrant. The categorical error I mentioned in the previous article, where "most business tasks should be in the LLM Workflow Quadrant but are somehow implemented in the Autonomous Agentic Loop Quadrant," appears to be an inevitable consequence of this vocabulary void.

Consequences of treating the (3) LLM Workflow Quadrant as the (4) Autonomous Agentic Loop Quadrant

When the LLM Workflow Quadrant is processed with the architecture of the Autonomous Agentic Loop Quadrant, several symptoms emerge downstream. They look like separate problems, but they share one root. The four symptoms below can be read as different manifestations of an artificial "redirect inability" (the inability to trace a problem to its origin and hand it back to whoever is responsible), which the absence of vocabulary produces.

The exception handling bottleneck in RPA

There is a phenomenon long known in the world of business automation: when you extract the parts that can be written with deterministic rules using RPA, the remaining exception handling emerges as a bottleneck. The number of people assigned to handle exceptions increases, and maintenance costs balloon. As I wrote in the previous article, business processes consist of parts that can be written deterministically and parts that do not fit that model. The latter is the LLM Workflow Quadrant, and because RPA lacked the vocabulary to handle it, it only extracted the Script Quadrant, leaving the LLM Workflow Quadrant as manual labor on the front lines.

Now that LLMs are here, we should be able to explicitly separate the LLM Workflow Quadrant as single-function LLM tasks or specialized chat agents. However, since the industry vocabulary lacks the LLM Workflow Quadrant, it is ultimately dumped into the autonomous agents of the Autonomous Agentic Loop Quadrant. It appears that the exception handling bottleneck of the RPA era is recurring in the agent era under a new name.

Demands for stronger sandboxing

A ReAct loop lets the LLM decide dynamically what to execute next. In production, it is hard to predict in advance what the agent will do. To contain potential damage, strong sandboxing such as process isolation, microVMs, or WASM becomes a requirement.

Sandboxing itself is sound technology with merits of its own, and it is not the target of criticism here. What is concerning is how strongly it is demanded. If the LLM Workflow Quadrant were conceptualized independently and each LLM invocation were explicitly separated as a component with a fixed role, the authority boundary of each component could be designed naturally around the business unit it serves. An invoice-matching function does not need Firecracker. However, in a design where "an autonomous agent runs the entire business," you need maximum isolation because you cannot know what it will do. The sandbox strength being demanded looks like a compensating cost of treating the LLM Workflow Quadrant as the Autonomous Agentic Loop Quadrant.
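As a rough illustration of per-component authority boundaries, the sketch below gives each fixed-role component an explicit allowlist of actions. The `Component` type, the `perform` function, and the action names are all hypothetical; a real system would enforce such boundaries at the credential or network level rather than with an in-process check.

```python
from dataclasses import dataclass
from typing import FrozenSet


class PermissionDenied(Exception):
    """Raised when a component attempts an action outside its declared boundary."""


@dataclass(frozen=True)
class Component:
    name: str
    allowed_actions: FrozenSet[str]  # the component's authority boundary


def perform(component: Component, action: str) -> str:
    # Refuse anything the component was not explicitly granted.
    if action not in component.allowed_actions:
        raise PermissionDenied(f"{component.name} may not {action}")
    return f"{component.name}: {action}"


# Hypothetical example: invoice matching only ever needs to read two record types.
invoice_matcher = Component(
    "invoice_matcher",
    frozenset({"read_invoice", "read_purchase_order"}),
)
```

Because the role is fixed at design time, the allowlist can stay small, which is the sense in which such a component does not need maximum isolation.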

Structural distortion of Human-in-the-Loop

The "Loop" in HITL should ideally be a feedback loop where humans improve AI output. In reality, however, HITL functions in a different way. It becomes a structure where humans are constantly catching the judgments of the LLM Workflow Quadrant that the AI cannot fully process.

In organizations that mechanized the Script Quadrant via RPA, the number of exception handlers ultimately increased because the LLM Workflow Quadrant remained unmechanized. Even after AI agents are introduced, people are still needed to rein in the agents' autonomy and to confirm their output. The structure is the same for conversational systems (specialized chat agents): when experts cross-check the output of a legal or diagnostic chatbot, human confirmation is attached to every turn of the conversation. Instead of humans being liberated, the human role shifts toward filling in the imperfections of AI. The ideal that the term HITL holds up and the role humans actually play on the ground look like different things.

Artificially created accountability problems

Once production begins, responsibility for the system shifts to the operations manager. This is true for any business system. The problem is whether, when something happens, the operations manager can effectively redirect the responsibility—separating the origin of the problem and passing it back to the person in charge.

If the architecture of the LLM Workflow Quadrant is properly designed, the operations manager can redirect. The input/output schema for each LLM invocation is articulated, and the chain of decisions can be traced from the workflow logs. When a failure occurs, the operations manager can investigate and say, "This is an accuracy issue in function f, so send it to the model selection team," "This is a flaw in routing logic, so send it to the designer," or "This is abnormal upstream data, so send it to the data management team." The structure does not force the operations manager to shoulder the responsibility alone.
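The redirect described above can be sketched as a workflow whose steps each declare an output schema and an owner, so that the log alone identifies where a failure belongs. All names here (`Step`, `LogEntry`, `run_workflow`, `redirect_target`, and the example steps) are hypothetical, and the key-presence check is only a stand-in for whatever evaluation a real system would use to detect an accuracy problem.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    owner: str                     # who the problem is redirected to
    run: Callable[[Dict], Dict]
    output_keys: Tuple[str, ...]   # declared output schema (keys only)


@dataclass(frozen=True)
class LogEntry:
    step: str
    owner: str
    ok: bool
    detail: str


def run_workflow(steps: List[Step], payload: Dict) -> Tuple[Dict, List[LogEntry]]:
    """Run steps in a fixed order, logging each one; stop at the first schema violation."""
    log: List[LogEntry] = []
    for step in steps:
        output = step.run(payload)
        missing = [k for k in step.output_keys if k not in output]
        if missing:
            log.append(LogEntry(step.name, step.owner, False, f"missing output: {missing}"))
            break
        log.append(LogEntry(step.name, step.owner, True, "ok"))
        payload = {**payload, **output}
    return payload, log


def redirect_target(log: List[LogEntry]) -> Optional[str]:
    """Return the owner of the first failed step, or None if every step succeeded."""
    for entry in log:
        if not entry.ok:
            return entry.owner
    return None


def load_record(p: Dict) -> Dict:
    return {"text": p["raw"].strip()}


def classify_f(p: Dict) -> Dict:
    # Stub for LLM function f; returns nothing to simulate a failed judgment.
    return {}


def route(p: Dict) -> Dict:
    return {"queue": "billing" if p["category"] == "invoice" else "general"}


steps = [
    Step("load_record", "data management team", load_record, ("text",)),
    Step("classify_f", "model selection team", classify_f, ("category",)),
    Step("route", "workflow designer", route, ("queue",)),
]
_, log = run_workflow(steps, {"raw": " Invoice #1 "})
```

Here `redirect_target(log)` points at the model selection team, because the log shows that `classify_f` is the step whose output broke its declared schema.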

However, when the LLM Workflow Quadrant is thrown into a ReAct loop, this redirection fails. Even if you try to reconstruct "why this judgment was made" from the logs, the "thoughts" of a ReAct loop are difficult to interpret post-hoc. The judgments made by an agent's runtime are hard to clearly attribute to designers or model selectors. It appears that the structure Elish (2019) called the moral crumple zone—where the responsibility of an autonomous system is pushed onto "human operators with limited control capability"—is activated here. Responsibility that has nowhere to be redirected remains stuck to the operations manager.

This is not an inevitable consequence of design, but looks like an artificial redirect inability caused by treating the LLM Workflow Quadrant as the Autonomous Agentic Loop Quadrant. If you design the LLM Workflow Quadrant using its original architecture, redirection functions, and the operations manager does not have to bear the burden alone.

(4) The Autonomous Agentic Loop Quadrant is the true site of accountability problems

Most of the accountability discussions in the agent ecosystem—sandboxing, HITL, explainability, and governance—appear to be dealing with the confusion surrounding the LLM Workflow Quadrant. They have become discussions about patching up downstream the consequences of problems that originally did not need to occur.

Of course, accountability problems inherent to the Autonomous Agentic Loop Quadrant exist as well. ReAct loops are black boxes, and it is difficult to reconstruct the chain of judgments post-hoc. The trilogy of articles I wrote previously explored handling this through structured design—separating interfaces, append-only logs, and approval gates (or in organizational terms, separation of duties, four-eyes principle, and principle of least privilege). The conclusion was that by distributing responsibility across the structure, one can recover causal traceability and the location of accountability. For details, see "Can we trace causality after an accident?". Structured design works because it relies on the premise that the contribution of each component is identifiable and separable post-hoc.

However, the ReAct architecture undermines this premise. In the LLM Workflow Quadrant, each LLM invocation has a fixed role, so in case of failure, one can distinguish between "an accuracy issue in function f," "a flaw in routing logic," or "abnormal upstream data." A ReAct loop is different. The role of each iteration is determined at runtime, and the model's judgment, tool selection, history reference, and prompt context effect are all blended together. Since the output appears as a blend of multiple judgment elements, one cannot retrospectively separate their respective contributions when a result is incorrect.
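For contrast with a fixed-role workflow, here is a minimal sketch of the loop shape this paragraph describes. It is not the implementation from Yao et al. (2022): `react_loop`, `scripted_policy`, and the two tools are hypothetical, and the policy is a hard-coded stub standing in for an LLM that would choose each action from the accumulated history.

```python
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]
# A policy reads the whole history and returns (action_name, argument).
# The action name "finish" ends the loop with the argument as the answer.
Policy = Callable[[List[str]], Tuple[str, str]]


def react_loop(policy: Policy, tools: Dict[str, Tool], task: str,
               max_steps: int = 5) -> Tuple[str, List[str]]:
    history = [f"task: {task}"]
    for _ in range(max_steps):
        # What this iteration does is decided only now, from everything so far.
        action, arg = policy(history)
        if action == "finish":
            history.append(f"finish: {arg}")
            return arg, history
        observation = tools[action](arg)
        history.append(f"{action}({arg}) -> {observation}")
    return "", history  # step budget exhausted without an answer


def scripted_policy(history: List[str]) -> Tuple[str, str]:
    """Stub for the LLM: a fixed script keyed on history length."""
    if len(history) == 1:
        return "search", "precedents"
    if len(history) == 2:
        return "summarize", history[-1]
    return "finish", "proposal"


tools: Dict[str, Tool] = {
    "search": lambda q: f"results for {q}",
    "summarize": lambda text: "summary",
}
answer, trace = react_loop(scripted_policy, tools, "dispute case")
```

Unlike a workflow step, no line in this loop names a role or an owner. With a real model as the policy, each entry in `trace` reflects the model's judgment, the tool choice, and the prior history at once, which is the blending the paragraph refers to.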

As a concrete example, suppose a ReAct agent is tasked with handling a dispute case end to end, from case-law research and organizing the issues to presenting settlement terms to the opposing party (at present this is hard to imagine given the laws regulating legal practice and conflict-of-interest rules). Suppose it is later judged that "we agreed to unfavorable terms." The model's interpretation of precedents, its use of research tools, its application of patterns from past cases, and the prompt instruction to "prioritize risk avoidance" all chained together at runtime to produce that proposal. Even looking back through the logs, you cannot extract "which judgment led to the unfavorable agreement" as an independent factor. The settlement terms have already been presented, and once the other party accepts, they cannot be withdrawn. Why it happened cannot be explained post-hoc. Is there any human who can take responsibility for judgments by an agent that cannot be explained?

An answer to this question can only be reached after tracing the true nature of redirect inability. The artificial redirect inability created in the LLM Workflow Quadrant can be resolved through design changes. On the other hand, the redirect inability that occurs when ReAct is applied to business truly belonging to the Autonomous Agentic Loop Quadrant—business requiring autonomous judgment—stems from autonomy itself and thus cannot be resolved. This is not a problem of technological maturity, but an attribution gap that occurs in principle as a consequence of adopting autonomy.

That is why deciding whether to apply this architecture to business in the Autonomous Agentic Loop Quadrant is not a question of whether it is technically possible or not, but a judgment of whether to accept the attribution gap as a cost. If the business involves reversible consequences or has approval gates upstream, the cost is easier to accept. Conversely, in businesses where accountability units are predefined on the business side, it is difficult to fit blended outputs into those units, and the cost balloons.

The issue of who bears that cost overlaps here. Autonomy comes with responsibility. Humans can accept responsibility as legal entities, but agents are not currently subjects of legal responsibility (though they might be in the future). Thus, a void remains where there is neither a human who can accept responsibility for the judgments of an agent with an attribution gap, nor an agent as a subject capable of accepting it.

To summarize this section: apart from the artificial redirect inability of the LLM Workflow Quadrant, there is a separate issue, the attribution gap that arises when ReAct is applied to business that genuinely belongs to the Autonomous Agentic Loop Quadrant. The one-line statement in my previous article, "Accountability problems become seriously prominent in the Autonomous Agentic Loop Quadrant," referred to this.

Structure of Inversion

Lining up these observations, a distorted image of the agent ecosystem comes into view.

The industry is applying the architecture of the Autonomous Agentic Loop Quadrant to the LLM Workflow Quadrant—where redirecting responsibility should be achievable—thereby artificially creating redirect inability. Discussions regarding sandbox intensity requirements, HITL distortions, and the moral crumple zone-like concentration of responsibility are all arising here.

On the other hand, discussions regarding the principled redirect inability that occurs when ReAct is applied to tasks inherent to the Autonomous Agentic Loop Quadrant—the attribution gap that arises as the price of autonomy—have not deepened as much as the discussions aimed at patching up the confusion in the LLM Workflow Quadrant. While the former has descended into specific technical responses like sandboxes and HITL, the latter struggles to move beyond the awareness that "AI autonomy entails responsibility issues." Furthermore, whereas the former has redirect destinations (designers, model selection leads, data management leads) within the organization, in the latter, the very subject capable of accepting responsibility is currently absent, meaning there is nowhere for the redirect to stop.

As a result, the validity of accepting an attribution gap and the principled limits of accountability structures seem to be hidden in the shadow of the confusion surrounding the LLM Workflow Quadrant.

In short, discussion is concentrated on the artificial redirect inability (which is resolvable), while the principled redirect inability (which is unresolvable) has yet to be discussed in earnest.

If we had possessed the four quadrants from the beginning, this inversion might have been avoided. We would treat the LLM Workflow Quadrant with a workflow-type architecture (allowing operations managers to redirect responsibility). When mechanizing the Autonomous Agentic Loop Quadrant, we would do so while accepting the attribution gap as a cost. We would write the Script Quadrant deterministically (as no problems originally existed there). We would handle the Algorithmic Search Quadrant with classical AI/OR (where LLMs are unnecessary). Each quadrant has its own specific design guidelines, and mixing them causes breakdowns.

Conclusion

When we perceive business through the dichotomy of "parts that can be written deterministically" and "everything else," we end up pushing all of the "everything else" into ReAct. The four quadrants from my previous work were an attempt to introduce a third vocabulary—the LLM Workflow Quadrant—from the start. Once the LLM Workflow Quadrant can be named as an independent quadrant, it becomes easy to see that what most business operations actually need is the architecture of the LLM Workflow Quadrant.

What is missing from the discussion of agent design is a vocabulary that names the LLM Workflow Quadrant in the affirmative. Merely saying "autonomy is unnecessary" in the negative does not stop designers in the field from wavering between the Autonomous Agentic Loop Quadrant and the LLM Workflow Quadrant. Where there is no vocabulary, design decisions end up being made by process of elimination.

Moreover, the real accountability problem lies not in the confusion around the LLM Workflow Quadrant, but in the attribution gap that occurs when ReAct is applied to tasks inherent to the Autonomous Agentic Loop Quadrant. Discussions that patch up the confusion around the LLM Workflow Quadrant currently dominate, but the redirect inability that comes with mechanizing work through ReAct is an issue that will likely be examined in earnest from here on.

A related question is how to design the party that bears responsibility for the attribution gap. For autonomous driving, frameworks for social recovery that combine operator liability with insurance schemes are being explored (though those systems are still under construction). Whether ReAct agents will follow a similar path or need a different route is a discussion that has yet to enter the industry's shared vocabulary.

This is not to say that vendors or designers are at fault. It appears to be a matter of timing: a powerful technique called ReAct emerged while the industry's conceptual stock lacked an affirmative term for the LLM Workflow Quadrant. If the vocabulary is lagging, those of us in the field should coin affirmative names ourselves. My previous work and this article are one such attempt.


References

  • Yao, S., et al. (2022). "ReAct: Synergizing Reasoning and Acting in Language Models." arXiv:2210.03629.
  • Elish, M. C. (2019). "Moral Crumple Zones: Cautionary Tales in Human-Robot Interaction." Engaging Science, Technology, and Society 5: 40–60.