Design your agent OS to win the AI future

This HFS Market Impact Report, produced in partnership with Cognizant, is for CTOs, Chief AI Officers, and enterprise technology leaders responsible for scaling multi-agent AI systems safely and governing them in production.

Executive summary

Multi-agent AI systems are already in production across Global 2000 enterprises at a pace few predicted. Orchestrators typically run around 12 agents in their most mature deployments, with some systems reaching 20, and 24% of them move from proof of concept (POC) to live deployment within three months. Many have stopped running pilots entirely. They’re convinced the technology works. The question now isn’t whether to deploy agents; it’s whether you have an operating system designed to govern them at scale.

73% of those adopting agentic AI are already running multi-agent systems; the ship has sailed on “wait and see”

Across technology, banking, insurance, telecom, and travel sectors, enterprises have moved agentic AI from innovation labs into underwriting decisions, network routing, trading optimization, procurement negotiation, and customer-facing workflows. These systems handle high-volume, high-stakes decisions with increasing autonomy, and they expose organizational stresses that single-agent pilots never encounter.

22% discovered agents develop their own preferences; “we treated them like software, but they behave like employees”

The real challenges emerge post-deployment: 22% of Orchestrators, with five or more agents in their systems, faced emergent behaviors where agents developed their own workflows or made unexpected shortcuts. Another 21% experienced cascading failures where one broken agent propagated failures across the system. Nearly 18% discovered auditability gaps, where thousands of decisions were made without interpretable lineage that would satisfy auditors or regulators.

73% thought their data was ready, but only 64% were satisfied after going live

Enterprises consistently overestimate data readiness. While 73% believed they had good data accessibility, only 64% were satisfied once agents went into production. Multi-agent systems amplify every inconsistency, and misaligned schemas cascade into contradictory outputs. You cannot expect user adoption when a third of firms discover their data isn’t production-ready.

Misjudge the rate of agentic AI adoption, and you’ll watch the leaders sail over the horizon

Across the Global 2000, enterprises are no longer theorizing about agentic AI. They are deploying it in production at a rate few previously predicted. The most mature successful systems typically have about 12 agents in production, supporting multiple use cases even in regulated industries.

Our data also shows 24% moved from POC to production in under three months, with the average transition time now eight months. If you are still playing “wait and see,” the ship has sailed.

The industries leading adoption in our sample are technology, banking and capital markets, insurance, telecom, and travel and hospitality. Complete study demographics are in the appendix to this paper.

The breadth of production use cases is striking; we included a full range of examples in the appendix to offer further inspiration. We found autonomous negotiation and procurement in the consumer packaged goods sector, real-time trading assistants adjusting execution strategies, network optimization and congestion management in telecom, multi-agent underwriting and risk pricing in insurance, dynamic pricing and booking orchestration in travel, and predictive maintenance and grid balancing in utilities.

These systems handle high-volume, high-stakes decisions with increasing levels of autonomy, and they expose stresses that simple, single-agent pilots never encounter.

Your opportunity now is to learn from the Orchestrators and put their lessons into action in your own agent OS.

As complexity compounds, change how you architect and govern AI

Our research identifies three phases of agentic maturity. As systems grow through each phase, the demand for an agent OS grows.

Exhibit 1: Maturity in agentic systems (Orchestrators) is characterized by networked agents, data maturity, high autonomy, and organizational adaptation

Question: How many agents are in production in your organization’s most successful and mature deployment of an agentic AI solution?
Sample: Leaders from 202 Global 2000 enterprises
Source: HFS Research, 2026

Level 1: Task Masters (1-agent systems)

These environments are deterministic, supervised, and limited to clean data in well-bounded workflows such as document processing, metadata tagging, and underwriting triage. Success depends heavily on data quality and infrastructure hygiene.

Level 2: Coordinators (2-to-4-agent systems)

This is the first inflection point, and where traditional software assumptions break.

New issues emerge: challenges with coordination and communication protocols, including the Model Context Protocol (MCP) and Google’s Agent2Agent (A2A) protocol; explainability gaps; identity and access control issues; context-sharing and handoff failures; conflicting agent policies; early emergent behaviors from agent interactions; and misinterpretation of human goals.

At this stage, enterprises must develop the first elements of an implicit agent OS. The evidence to date suggests that their initial response is to do this by turning to internal custom-built logic and communication flows, but this will prove unfit for purpose as firms expand the range and requirements of their agentic AI use, demanding more open and robust integration.

Level 3: Orchestrators (5-plus-agent systems)

Once systems reach five or more interacting agents, enterprises report a shift in priority from “build the system” to “maintain coherence in a live ecosystem.”

In these more complex systems, challenges change to managing orchestration, tracing decision reasoning, accounting for model drift, establishing escalation logic, addressing cost unpredictability, and adapting the organization’s ways of working to the evolving system.

With all these new challenges, firms rapidly discover that the absence of a formal agent OS becomes a blocker. Orchestrators are not simply deploying more agents; they have to reinvent how agents, data, infrastructure, and humans interact. It’s a big leap and one you must prepare for.
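The three maturity levels above are defined purely by the number of agents in the most mature deployment, so the mapping can be written down directly. The following is a minimal sketch; the function name `maturity_level` is an illustrative assumption, while the thresholds (1, 2 to 4, 5 or more) come from the level definitions above.

```python
def maturity_level(agents_in_production: int) -> str:
    """Map an agent count to one of the three maturity levels described above."""
    if agents_in_production < 1:
        raise ValueError("expected at least one agent in production")
    if agents_in_production == 1:
        return "Task Master"      # Level 1: 1-agent systems
    if agents_in_production <= 4:
        return "Coordinator"      # Level 2: 2-to-4-agent systems
    return "Orchestrator"         # Level 3: 5-plus-agent systems
```

For example, a deployment with around 12 agents, typical of the most mature systems in the study, falls in the Orchestrator band.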

Avoid the missteps of early adopters to navigate around post-live landmines

The real landmines appear after going live. Enterprises report emergent behaviors, cascading failures, auditability gaps, vendor dependence, and organizational dysfunction exposed by agentic systems. Be ready for surprises that no software rollout has prepared you for.

Adding agents adds complexity, and with complexity, new challenges emerge.

We asked Global 2000 leaders about the unforeseen challenges and emergent behaviors they encountered when the rubber hit the road in their agentic AI deployments. Exhibit 2 lists the most significant unexpected challenges.

Exhibit 2: The leading edge of multi-agent system deployment has been a tough proving ground

Question: Please share any unforeseen challenges or emergent behaviors you encountered during your agentic solution deployment.
Sample: Leaders from 108 Global 2000 enterprises, optional open-ended question
Source: HFS Research, 2026

Our research found 22% of leaders faced new realities in which agents developed their own preferences, became constrained by risk-averse policies, developed unintended workflows, or created shortcuts that their human goal-setters did not expect and, importantly, that did not align with intended goals.

One leader told us, “We treated agents like software, but they behave like employees.”

Another 21% of leaders discovered that when a single agent in a multi-agent system failed, it would propagate broken workflows, contradictory outputs, versioning instability, or runaway autonomous loops.

Close to one in five (18%) found auditability and compliance gaps, with thousands of decisions made without interpretable lineage. This sometimes left leaders with little evidence of why decisions were made, no mapping of data inputs to actions, and ultimately no way to satisfy either internal audits or regulators.

Other issues identified include those related to vendor dependence and model drift, with 17% experiencing unpredictable agent behavior as a result of model updates. In some instances, that led to production logic being broken suddenly and unexpectedly.

An expensive audit of our dysfunction

Many discovered that agents don’t just automate work; they reveal how work actually happens. One enterprise leader told us their agent deployment turned out to be “an expensive audit of our dysfunction.” Of those surveyed, 16% found agent deployment exposed process fragmentation, inconsistent key performance indicators, hidden silos, tacit knowledge gaps, and middle-management bottlenecks. A further 14% found that agents focused on efficiency would sacrifice other important priorities, such as empathy, fairness, or brand tone.

Ensure data integrity to build trust and avoid slowing user adoption

There is groundwork to do, and skipping it will result in delays further down the line.

Data integrity and trust are persistent bottlenecks. Multi-agent systems amplify every inconsistency. A misaligned schema can lead to cascades of contradictory outputs, and a lack of observability can hide drift until outcomes fail.

Most enterprises start overconfident about their data. We found 73% of enterprises initially believe they have good data accessibility, but our research also shows they underestimate the requirements for data quality, lineage, freshness, metadata richness, and data observability. The result is that only 64% were satisfied with the quality and observability of their data when their agentic systems went into production.

It’s hard to expect good user adoption in the third of firms that discover too late that their data and observability aren’t up to standard. Who would trust the outcomes in such circumstances? Spinning up more agents and hoping for the best at this stage won’t help. We found user confidence did not increase with the number of agents, and trust issues remain (see Exhibit 3).

Our advice is to treat data reliability and user adoption as core to agent OS layers rather than something to address as a post-production afterthought.

Exhibit 3: Early adopters are confident in the tech, but data issues continue to hold back user adoption

Note: The neutral (3) category was removed from this chart for clarity.
Sample: Leaders from 201 Global 2000 enterprises
Source: HFS Research, 2026

Nearly half of all organizations in our survey were stuck in the planning or limited-implementation stages for people, process, and tech readiness (see Exhibit 4). Rapid scaling requires adequately prepared foundations. Lay out readiness roadmaps, prioritizing governance and responsible AI foundations as a minimum. Scaling without guardrails introduces the risk of compliance, ethics, and trust vulnerabilities, which will trip you up in production, particularly in customer-facing environments.

Senior leaders must take the lead. Among our more mature Orchestrators, senior leader engagement is at its highest, with 76% somewhat or very engaged, compared to just 37% in the least mature Task Masters. To succeed, C-suite leaders must play a hands-on role in shaping governance, orchestration, and value realization frameworks rather than just monitoring results. They must also be equipped with AI observability tools that enable them to monitor autonomy, compliance, and ROI in real time.

Exhibit 4: Too many enterprises go into production before people, process, and tech foundations are ready to scale, slowing progress

Note: The “Don’t know” category is removed for clarity.
Sample: Leaders from 201 Global 2000 enterprises
Source: HFS Research, 2026

Deliver enterprise-grade reliability with a five-layer blueprint for an agent OS

To solve for the issues enterprise trailblazers shared with us in this survey, you need a standardized layer for orchestration, governance, monitoring, explainability, and workforce integration; in other words, you need an agent OS.

Today, almost nobody has a full-fledged agent OS, but the direction of travel is clear. The momentum toward multi-agent systems is real and shows no sign of slowing. The guidance from the more than 200 firms we surveyed is that enterprises must be ready to orchestrate agents at scale within 12 months; otherwise, they should expect to be overtaken by those who learn the lessons of early leaders and establish an effective agent OS. The field guide that concludes this report shows you how.

To shape it, we first identified how the most mature Orchestrators are converging toward a blueprint in which the agent OS has the five foundational layers in Exhibit 5.

Exhibit 5: The five foundations for your agent OS

Source: HFS Research, 2026

Layer 1: Governance and autonomy management

Set autonomy thresholds, establish risk scoring for agent actions, and define and implement escalation policies, including role-based access controls and human-in-the-loop checkpoints.
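To make the Layer 1 controls concrete, here is a minimal sketch of how autonomy thresholds, risk scoring, role-based access, and a human-in-the-loop checkpoint could fit together. Everything in it is an illustrative assumption rather than something the study prescribes: the role names, the permitted actions, the 0-to-1 scales, the scoring formula (impact weighted by irreversibility), and the two threshold values.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"                # agent may act autonomously
    HUMAN_REVIEW = "human_review"  # human-in-the-loop checkpoint
    BLOCK = "block"                # action refused outright


@dataclass(frozen=True)
class AgentAction:
    agent_role: str       # hypothetical role name, e.g. "underwriting"
    action: str           # hypothetical action name, e.g. "quote"
    impact: float         # assumed scale: 0.0 (trivial) to 1.0 (severe)
    reversibility: float  # assumed scale: 0.0 (irreversible) to 1.0 (fully reversible)


# Hypothetical role-based access control table.
ALLOWED_ACTIONS = {
    "underwriting": {"quote", "decline"},
    "procurement": {"negotiate", "place_order"},
}

AUTONOMY_THRESHOLD = 0.3    # assumed: below this score the agent acts alone
ESCALATION_THRESHOLD = 0.7  # assumed: at or above this score the action is blocked


def risk_score(action: AgentAction) -> float:
    """Assumed scoring rule: high-impact, hard-to-reverse actions score highest."""
    return action.impact * (1.0 - action.reversibility)


def evaluate(action: AgentAction) -> Decision:
    """Apply the role check first, then the autonomy and escalation thresholds."""
    if action.action not in ALLOWED_ACTIONS.get(action.agent_role, set()):
        return Decision.BLOCK
    score = risk_score(action)
    if score < AUTONOMY_THRESHOLD:
        return Decision.ALLOW
    if score < ESCALATION_THRESHOLD:
        return Decision.HUMAN_REVIEW
    return Decision.BLOCK
```

In this sketch a low-impact, easily reversed quote passes straight through, while a mid-risk purchase order is routed to a human. In practice the thresholds would be set per use case and revisited as trust in the system grows.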

Layer 2: Orchestration

Enable multi-agent coordination, context routing, task decomposition and planning, conflict resolution logic, and standardized communication schemas. In our study, we found the largest group (31%) currently relies on custom-built orchestration (built in-house), and 26% use proprietary protocols for agent-to-agent communication.

Exhibit 6: Firms must move beyond custom-built in-house solutions to access an ecosystem of capability through A2A and MCP

Sample: Leaders from 61 Global 2000 enterprises with at least two agentic systems in production (Orchestrators and Coordinators)
Source: HFS Research, 2026

While you may have heard that MCP is becoming the de facto default for agent-to-agent (A2A) workflow coordination, our data in Exhibit 6 reveals limited adoption in the enterprise (8%). In-house and custom-built solutions still dominate. If your agents can’t speak to external agents, or even to other agents in your own business that use different protocols, you will miss out on the value of a burgeoning ecosystem of agentic capabilities coming to market from tech providers such as Salesforce, ServiceNow, and SAP; from hyperscalers such as AWS, Microsoft, and Google; from startups such as WRITER and CrewAI (and mega startups such as OpenAI and Anthropic); and from service providers such as Cognizant.
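A standardized communication schema of the kind Layer 2 calls for can be as simple as a shared message envelope that every agent emits and validates. The sketch below is a hypothetical in-house envelope, not the MCP or A2A wire format; the field names (`sender`, `recipient`, `intent`, `payload`, `trace_id`, `message_id`) are assumptions chosen for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field

# Assumed minimum set of fields every inter-agent message must carry.
REQUIRED_FIELDS = {"message_id", "sender", "recipient", "intent", "payload", "trace_id"}


@dataclass
class AgentMessage:
    sender: str      # id of the agent sending the message
    recipient: str   # id of the agent expected to act on it
    intent: str      # what the sender is asking for, e.g. "price_risk"
    payload: dict    # task-specific content
    trace_id: str    # ties the message to one end-to-end workflow for auditing
    message_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def to_json(self) -> str:
        """Serialize the envelope to a JSON string."""
        return json.dumps(asdict(self), sort_keys=True)


def parse_message(raw: str) -> AgentMessage:
    """Reject any message that does not carry the full envelope."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("message must be a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"message missing fields: {sorted(missing)}")
    return AgentMessage(**{name: data[name] for name in REQUIRED_FIELDS})
```

Carrying a `trace_id` on every hop is the design choice that matters most here, because it lets a later audit reassemble which agents touched a given decision.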

Layer 3: Observability and explainability

The third layer, encompassing observability and explainability, is where to provision agent telemetry, behavior summaries, drift detection (the divergence between expected behavior and actual behavior of your AI system), decision lineage (the audit trail of inputs, reasoning, interactions, and outputs leading to a decision), and cross-agent dependency mapping.
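The two definitions given above, decision lineage and drift, lend themselves to a short sketch. The record below stores the four elements named in the lineage definition (inputs, reasoning, interactions, outputs). The drift measure is one simple choice among many: it is an assumption of this sketch, not the study's method, and compares the expected share of each outcome with the observed share using total variation distance.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One entry in a decision lineage trail (field names are illustrative)."""
    agent_id: str
    inputs: dict         # data the agent used
    reasoning: str       # the agent's stated rationale
    interactions: tuple  # ids of other agents consulted
    output: str          # the decision taken
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def observed_shares(records: list) -> dict:
    """Share of each output across the recorded decisions."""
    if not records:
        raise ValueError("no decisions recorded")
    counts = Counter(record.output for record in records)
    total = sum(counts.values())
    return {output: count / total for output, count in counts.items()}


def drift(expected_shares: dict, records: list) -> float:
    """Total variation distance: 0.0 means behavior as expected, 1.0 means no overlap."""
    observed = observed_shares(records)
    outcomes = set(expected_shares) | set(observed)
    return 0.5 * sum(
        abs(expected_shares.get(o, 0.0) - observed.get(o, 0.0)) for o in outcomes
    )
```

If an underwriting agent is expected to approve half of all cases but the lineage trail shows it approving three in four, `drift` returns 0.25, a signal worth alerting on before outcomes fail.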

Layer 4: Data trust

Orchestrators have learned that they must keep close tabs on data lineage and metadata, establish data quality scores and freshness controls, ensure semantic alignment (ensuring all agents, models, and systems share a consistent understanding of the data’s meaning), and implement real-time monitoring.
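Data quality scores and freshness controls can be enforced as a gate that an agent must pass before it reads a dataset. This is a minimal sketch under stated assumptions: the `DatasetStatus` fields, the scoring rule (the weaker of completeness and validity), and the limits passed to `agent_may_use` are all hypothetical and would be tuned per dataset.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class DatasetStatus:
    name: str
    last_updated: datetime  # timezone-aware timestamp of the last refresh
    completeness: float     # assumed: share of required fields populated, 0.0 to 1.0
    validity: float         # assumed: share of rows passing schema checks, 0.0 to 1.0


def quality_score(status: DatasetStatus) -> float:
    """Assumed rule: a dataset is only as good as its weakest dimension."""
    return min(status.completeness, status.validity)


def is_fresh(status: DatasetStatus, max_age: timedelta, now: datetime) -> bool:
    """True when the last refresh is no older than max_age."""
    return now - status.last_updated <= max_age


def agent_may_use(
    status: DatasetStatus,
    *,
    max_age: timedelta,
    min_quality: float,
    now: Optional[datetime] = None,
) -> bool:
    """Gate agent access on both the freshness and the quality of the dataset."""
    current = now if now is not None else datetime.now(timezone.utc)
    return is_fresh(status, max_age, current) and quality_score(status) >= min_quality
```

A pricing agent might, for instance, require data refreshed within the last hour with a quality score of at least 0.95, while a reporting agent tolerates day-old data.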

Layer 5: Human–agent workforce operating model

The fifth layer of the agentic OS blueprint tackles setting and controlling agent roles and responsibilities, their performance metrics, escalation and accountability, workforce planning (human–agent), and governance for mixed human–machine teams.

When moving your team from a traditional workflow to an agentic one, a significant portion of new challenges centers on how humans will work with machines.

Earlier layers have addressed data integrity. The next most important lessons when moving to an agentic workflow concern human–agent interactions, shown in Exhibit 7. Accountability (who or what is to blame, and who gets the credit) ranks second (45%), while changing collaboration norms ties for third (36%).

Among the most mature Orchestrators, the importance of accountability increases to 50%. Firms must tackle these questions early to support internal adoption. You must be deliberate in establishing new norms for human–agent teams. Will you reward or penalize agents for engagement? Where will the blame rest when things go wrong? Be open, honest, and transparent, which you can only do if you have the means to assess and measure the impact of human or machine actions fairly. Most firms have a leap to take here.

Exhibit 7: Agentic workflows demand new kinds of accountability and success measurement, transforming our expectations of collaboration

Sample: Leaders from 202 Global 2000 enterprises
Source: HFS Research, 2026

Informed by the five layers of our agent OS blueprint, we have applied lessons learned from the 200-plus enterprises in our study to provide the following seven-step field guide for building your agent OS.

The seven-step field guide to your agent OS

The field guide in Exhibit 8 is your plan for rapid action when scaling from your first agents to the dawning multi-agent reality you now face. In establishing operating principles for agent definition and identity, standardized orchestration, trust and auditability, data reliability, controls on autonomy, learning from agentic discoveries, and workforce design, you’ll remove the operational fragility that derailed deployments that have gone before you.

By standardizing orchestration, data trust, and autonomy governance, you prepare the ground for end-to-end, cross-workflow automation, in line with the highest silo-busting ambitions of the HFS OneOffice.

This is a future-ready operating model in which humans and agents collaborate transparently, safely, and consistently, cutting out compliance failures and organizational drag.

Exhibit 8: Your seven-step plan for rapid action when scaling from your first agent to the dawning multi-agent reality

Source: HFS Research, 2026

The Bottom Line: Unlock the value of cross-functional intelligent automation in a new high-trust operating model.

Stop funding agent pilots and start designing your agent OS architecture. The competitive gap is widening now; leaders are already 12 to 20 agents ahead of the laggards still debating governance. Design for orchestration deliberately, or watch coordination failures and compliance gaps kill adoption before you reach five agents.

Authors

David Cushman, Executive Research Leader
Niti Jhunjhunwala, Senior Analyst
Phil Fersht, CEO and Chief Analyst
Sam Duncan, Practice Leader
