Agentic AI Fails in the Layer SR 26-2 Left to You
The Control Risk No One Owns Now Belongs to the Firm to Govern
The irony is hard to miss. SR 26-2 leaves each bank to determine how agentic AI should be governed through its own risk framework and architecture. Inside the workflow, the agent is doing its own version of that: resolving what its inputs mean before it acts.
That is where the hardest problem now sits: not in the model output or the execution record, but in the reasoning layer, where operational meaning forms before action.
In April 2026, the regulators issued SR 26-2, the revised interagency model risk guidance. Most attention went to what it tightened. For agentic systems, the consequential move is what it set aside. Generative and agentic AI models fall outside its scope because they are novel and rapidly evolving, leaving banking organizations to determine appropriate governance for systems the model-risk framework does not cover.[1]
Existing model risk guidance does not reach the layer where the control risk is forming. The governance question is not whether the model is valid, but whether the agent is acting under an authorized meaning, or more precisely, the operational interpretation under which it is permitted to act.
The Layer Where Agentic Systems Actually Fail
Agentic systems do not fail the way models fail. A model can fail statistically as the data shifts. Model risk teams know how to watch for that. An agent can fail earlier and far more quietly. Before it acts, it must settle what its inputs mean across systems never built to agree on terms like available, eligible, cleared, and authorized. That settlement is the reasoning layer, where Agentic Workflow Drift begins.
Consider a liquidity transfer run end-to-end by an agent across treasury, risk, ledger, and regulatory reporting. Treasury treats funds as available once they are unencumbered. Risk treats them as available only within an intraday limit. The ledger ties availability to settlement timing. Reporting applies its own eligibility test. Each definition is correct inside its own system. None was written to reconcile with the others.
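As a rough illustration of the mismatch described above, the four local definitions can be written as separate predicates over the same balance. This is a minimal Python sketch: the `Funds` fields, the function names, and the sample figures are all hypothetical and do not come from any bank system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Funds:
    """Hypothetical snapshot of a balance; every field is illustrative."""
    amount: float
    encumbered: bool
    intraday_limit_remaining: float
    settled: bool
    reporting_eligible: bool


# Each predicate mirrors one system's local definition of "available".
def treasury_available(f: Funds) -> bool:
    return not f.encumbered


def risk_available(f: Funds) -> bool:
    return f.amount <= f.intraday_limit_remaining


def ledger_available(f: Funds) -> bool:
    return f.settled


def reporting_available(f: Funds) -> bool:
    return f.reporting_eligible


DEFINITIONS = {
    "treasury": treasury_available,
    "risk": risk_available,
    "ledger": ledger_available,
    "reporting": reporting_available,
}


def evaluate(f: Funds) -> dict[str, bool]:
    """Apply every system's definition to the same funds."""
    return {name: rule(f) for name, rule in DEFINITIONS.items()}


def systems_agree(f: Funds) -> bool:
    """True only when all four systems give the same answer."""
    return len(set(evaluate(f).values())) == 1


# Assumed sample values, chosen so the definitions disagree.
funds = Funds(amount=5_000_000, encumbered=False,
              intraday_limit_remaining=2_000_000, settled=False,
              reporting_eligible=True)
verdicts = evaluate(funds)
print(verdicts)
print("agree:", systems_agree(funds))
```

With these sample values, treasury and reporting treat the funds as available while risk and the ledger do not. Each answer is correct locally, and nothing in the code decides which one governs.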
A person bridges that gap with accountable judgment. An agent resolves it through inference, synthesizes four definitions into one working version of available, moves the funds, and posts an entry that flows into the regulatory report. Nothing breaks. The transfer settles, the controls fire, the report ties out. The institution has moved money against a meaning of available that no treasurer or risk officer approved. No rule was broken, so no alert fires. Nothing failed, so no exception opens.
This is why the exposure is hard to place. It does not belong to model risk, because no output was wrong; to operational risk, because no process broke; or to conduct risk, because no person acted. Each function inspects its own layer and sees a clean result, because the drift lives in the layer none of them owns. Every surface reads green and the institution is still wrong. That condition is Invisible Failure: correct across every surface the institution can see, exposed in the one it cannot.
This is the control paradox of agentic AI: the cleaner the downstream evidence looks, the easier it becomes to miss the unauthorized interpretation that produced it.
Each function can report a clean result while the reasoning-layer failure remains outside its field of view. Source: Doyle-Spare (2026), supporting SSRN working papers
The Infrastructure Is Already in the Building
More controls on the same surfaces will not close this; the gap is structural, not operational. The layer itself has to be governed, and much of the foundation is already in place. Banks have built ontologies, semantic layers, and knowledge graphs, yet too often those assets are consulted at design time and set aside at runtime. Bring those assets forward: place a thin runtime control plane over them so authorized meaning is enforced at orchestration, not buried in a reference model no agent reads.
Vendor platforms, SaaS systems, workflow engines, and policy repositories already encode operational meaning, while headless, composable architectures increasingly expose those definitions through services and APIs. The answer is not to invent governance from scratch. It is to make that semantic infrastructure executable at runtime.
The institution already maintains the ontology, reasoning baseline, and knowledge graph. The missing layer is runtime evaluation, authorization, and evidence. Source: Doyle-Spare (2026), supporting SSRN working papers
The control plane measures how far an agent’s resolved meaning diverges from the meaning the institution authorized, a divergence captured by the Semantic Deviation Index. When divergence crosses an institutional threshold, a Deterministic Gate stops execution. The gate does not reason; it enforces. The control plane rests on the Agentic 3 C’s: Context, the authorized definitions the agent reasons against; Control, the gate that tests resolved meaning before execution; and Coordination, the discipline that holds meaning intact across agents. This is the Semantic Control Plane before action: verification before execution, not reconstruction once the damage is done.
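The measure-then-enforce pattern can be sketched in a few lines of Python. The scoring rule below is a toy stand-in, not the Semantic Deviation Index as defined in the working papers: it assumes a meaning can be written as a set of named conditions and simply counts how many authorized conditions the agent's resolved meaning drops or contradicts. The condition names and the threshold value are hypothetical.

```python
def deviation(authorized: dict[str, bool], resolved: dict[str, bool]) -> float:
    """Toy deviation score: the share of authorized conditions that the
    resolved meaning omits or contradicts. An illustrative assumption,
    not the Semantic Deviation Index formula."""
    if not authorized:
        raise ValueError("authorized meaning must define at least one condition")
    mismatches = sum(
        1 for key, required in authorized.items()
        if resolved.get(key) != required
    )
    return mismatches / len(authorized)


def gate_allows(score: float, threshold: float) -> bool:
    """Deterministic check: a fixed comparison with no inference."""
    return score <= threshold


# Hypothetical authorized meaning of "available" (the Context).
AUTHORIZED = {
    "unencumbered": True,
    "within_intraday_limit": True,
    "settled": True,
    "reporting_eligible": True,
}

# Hypothetical meaning the agent resolved; it dropped the settlement test.
resolved = {
    "unencumbered": True,
    "within_intraday_limit": True,
    "reporting_eligible": True,
}

THRESHOLD = 0.10  # assumed institutional threshold

score = deviation(AUTHORIZED, resolved)
if gate_allows(score, THRESHOLD):
    print(f"score={score:.2f}: execute transfer")
else:
    print(f"score={score:.2f}: execution stopped before action")
```

The point of the design is that the comparison runs before the transfer, and that the gate itself is an ordinary conditional whose result can be reproduced from the logged score and threshold.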
Once meaning becomes executable, governance can move from review after the fact to authorization before the act.
Every material agentic decision passes through the Deterministic Gate before execution: permit, hold to human review, or mandatory hold. Source: Doyle-Spare (2026), supporting SSRN working papers
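The three outcomes named in the caption can be expressed as a small deterministic function. This is a sketch under stated assumptions: it supposes that the outcome depends on a single deviation score between 0 and 1 and on two institution-set cutoffs. The `GatePolicy` type, its field names, and the cutoff values are hypothetical; the source does not specify how the gate maps scores to outcomes.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    PERMIT = "permit"
    HOLD_FOR_HUMAN_REVIEW = "hold_for_human_review"
    MANDATORY_HOLD = "mandatory_hold"


@dataclass(frozen=True)
class GatePolicy:
    """Hypothetical cutoffs; real values would be set by the institution."""
    review_above: float  # scores above this are routed to a person
    block_above: float   # scores above this are held outright

    def __post_init__(self) -> None:
        if not 0.0 <= self.review_above <= self.block_above <= 1.0:
            raise ValueError("expected 0 <= review_above <= block_above <= 1")


def decide(score: float, policy: GatePolicy) -> Decision:
    """Map a deviation score to one of three outcomes by fixed comparison."""
    if score > policy.block_above:
        return Decision.MANDATORY_HOLD
    if score > policy.review_above:
        return Decision.HOLD_FOR_HUMAN_REVIEW
    return Decision.PERMIT


policy = GatePolicy(review_above=0.05, block_above=0.30)  # assumed values
for s in (0.0, 0.10, 0.50):
    print(s, decide(s, policy).value)
```

Because the function is pure and the policy is immutable, the same score and policy always yield the same decision, which is what allows the outcome to be evidenced after the fact.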
Why This Is How Agentic AI Finally Runs End-to-End
This is not a brake on automation. It is what makes full automation defensible. Once meaning is governed at runtime, an institution can automate workflows end-to-end and reserve human attention for the exceptions the gate raises. This moves the institution from Human-in-the-Loop (HITL), with a person at every decision, to Human-on-the-Loop (HOTL), where people step in only to handle exceptions: cases where a runtime boundary is breached or a gate condition is not met.
A runtime control plane does not need to touch deterministic point-to-point integrations. Those do not need an agent and should not have one. Agentic systems belong on workflows that are already manual and judgment-heavy, where the work moves at the speed of people coordinating across systems. Because that is the baseline, well-designed runtime checks cost far less than the human coordination they replace. Real-time deviation measurement and deterministic gates are what keep workflow drift and reasoning-layer errors from breaking banking controls.
Self-Regulation Is the Regulation
None of this is an argument against agentic AI. The prize is large. Agentic AI can materially improve productivity and scale. The constraint is not the agent. It is the human review required when the institution cannot prove the meaning under which the agent was authorized to act. If every material decision still needs a person in the middle, the institution never reaches that scale. Runtime governance lifts that ceiling, letting agentic AI reach production with the traceability and supervisory confidence the sector requires.
The regulators did not prescribe a governance architecture for agentic reasoning. They left that determination to the institution. That makes governance of the reasoning layer a new control surface, self-imposed by definition. Institutions that can show, under examination, that agents reasoned from authorized meaning before acting will be ready for supervision, not retrofitting for it.
SR 26-2 does not eliminate governance pressure. It relocates it. In practical terms, for agentic AI in banking, self-regulation is now the regulation. The question is whether the institution writes that standard deliberately, or lets its agents write it silently, one correct but unauthorized transaction at a time.
[1] SR 26-2, Interagency Guidance on Model Risk Management (April 17, 2026), attachment, Footnote 3, which states that generative AI and agentic AI models are novel and rapidly evolving and accordingly are not within the scope of the guidance, and directs banking organizations to determine appropriate governance for systems not covered.
About the author
Maureen Doyle-Spare
Maureen Doyle-Spare is an independent practitioner and researcher in AI governance and banking controls. She is the originator of the Semantic Control Plane (SCP) runtime governance architecture and the connected frameworks of Agentic Workflow Drift, Agentic Workflow Subversion, the Semantic Deviation Index (SDI), the Deterministic Gate, the Agentic Blast Radius, the Semantic Audit Trail, and the Agentic 3 C’s Framework, collectively anchored by the Doyle-Spare Agentic Governance Model (AGM). The supporting working papers are available on SSRN: No. 6459612 (Agentic Workflow Drift and Agentic Workflow Subversion, March 2026), No. 6531238 (Semantic Deviation Index), and No. 6674761 (Agentic 3 C’s Framework).
This article is published under the author’s individual capacity as an Independent Practitioner and Researcher in AI Governance. No employer affiliation applies.
ORCID: 0009-0009-6655-1394
SSRN Author Page: https://papers.ssrn.com/sol3/cf_dev/AbsByAuth.cfm?per_id=10836296
OSF Research Program: https://osf.io/zuacj/
Correspondence: https://www.linkedin.com/in/maureendoylespare