Point of View

The problem is not tech; BFSI needs an operating model to truly scale AI

BFSI firms are not failing to scale AI because the technology is “not ready.” They are failing because they are trying to bolt probabilistic systems onto operating models built for deterministic work, while contending with fragmented ownership and legacy constraints.

The roundtable with BFSI executives, conducted in partnership with Sutherland, surfaced a common pattern: pilots look promising, but enterprise scaling runs into a wall of process, technology, data, and people debt, compounded by governance gaps and ROI ambiguity, derailing even the best-laid plans. BFSI CIOs must stop pumping money into more POCs and instead focus on redesigning decision rights, workflow ownership, evaluation, and controls so AI can safely run as part of the workforce.

The market is moving from “AI excitement” to “AI accountability”

BFSI CIOs have moved beyond debating whether AI works and are now wrestling with how to make it repeatable, governable, and value-producing at enterprise scale. Participants in the roundtable described a reality where AI has become a catch-all label, much like digital transformation previously, creating confusion, scattered investments, and misaligned expectations across functions (see Exhibit 1).

Exhibit 1: AI programs break down due to lack of clarity around technology, data, or workflow ownership

Sample: n=12 delegates
Source: HFS Research, 2026

A BFSI executive compared AI today to owning a garage full of high-performance cars: the promise of the powerful engines is real, but firms lack the “AI interstate system” needed to operate them at full power. At the same time, regulatory pressure and customer intolerance for opaque risk further constrain funding for indefinite experimentation without proof of value.

Another participant captured the core dilemma: there is “proof of concept” and “proof of promise,” but not enough “proof of value,” leading to “death by 1,000 POCs” as experiments proliferate without convergence. Building the path for AI at scale in BFSI is a collective responsibility. Regulators must create clearer lanes for responsible innovation, enterprises must redesign operating models, and providers must deliver production-grade, governed outcomes.

AI scaling is an agency and operating model problem before it is a technology problem

Leaders at the roundtable described AI as a powerful engine dropped into a system that lacks the “highways” needed to use it effectively. The root cause: BFSI organizations are structurally optimized for stewardship and risk control, not for rapid redesign of how work gets done.

Three failure modes kept recurring:

  1. Treating AI like a tool rollout instead of workflow redesign. Multiple leaders pointed out that AI, especially agentic AI, can’t be layered onto existing processes unless those processes have been reimagined. This is what shows up as “pilot success” followed by scale failure, when the AI hits real-world exceptions, policy constraints, and messy handoffs.
  2. Mistaking model performance for business performance. Leaders noted the tendency to obsess over model performance rather than business value. For probabilistic systems, classic software metrics (uptime, latency, handle time) matter. But for AI at scale, these are insufficient because outputs can be confident and well-formed yet wrong. This creates a trap: teams either under-invest in evaluation and risk controls or over-engineer evaluation so much that they never get to production.
  3. Scaling without a credible governance and economics model. Participants repeatedly stressed that weak governance (including lifecycle governance for agents) and the lack of shared “tokenomics” (how usage translates into real cost and how to benchmark that cost by process) often led to failures.
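
The “tokenomics” gap can be sketched with a simple cost model. Everything below is an illustrative assumption (the prices, token counts, and case volumes are hypothetical, not vendor rates); the point is only to show how per-token usage rolls up into a benchmarkable cost per business process.

```python
# Hedged sketch: translating token usage into a cost per business process.
# All prices, token counts, and volumes are illustrative assumptions,
# not real vendor rates or client figures.

def cost_per_case(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Dollar cost of one processed case (e.g., one claim or KYC review)."""
    return (input_tokens / 1000) * price_in_per_1k \
         + (output_tokens / 1000) * price_out_per_1k

def monthly_cost(cases_per_month: int, **case_kwargs) -> float:
    """Roll the per-case cost up to a process-level monthly run rate."""
    return cases_per_month * cost_per_case(**case_kwargs)

# Illustrative comparison: a large frontier model vs. a smaller domain model
# on the same workload (50,000 cases/month, ~4k input and 800 output tokens each).
large = monthly_cost(cases_per_month=50_000, input_tokens=4_000, output_tokens=800,
                     price_in_per_1k=0.01, price_out_per_1k=0.03)
small = monthly_cost(cases_per_month=50_000, input_tokens=4_000, output_tokens=800,
                     price_in_per_1k=0.001, price_out_per_1k=0.002)

print(f"Large model: ${large:,.0f}/month")  # $3,200/month under these assumptions
print(f"Small model: ${small:,.0f}/month")  # $280/month under these assumptions
```

Even with made-up numbers, the exercise illustrates why participants wanted cost benchmarked by process rather than by model: the same workload can differ by an order of magnitude depending on model choice.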
To operationalize AI, BFSI CIOs must shift from cataloguing problems to institutionalizing best practices

BFSI firms can’t scale AI by doing more pilots. They must scale by making a small set of practices repeatable across value streams. The leaders who reported meaningful progress are treating AI as an operating capability with common, reusable building blocks. They emphasized redesigned workflows, clean data definitions, fit-for-purpose architecture, risk-tiered governance, and outcome instrumentation as key to standardizing how AI moves from idea through production to a monitored run-state. Participants pointed to the following best practices as the difference between isolated wins and enterprise-scale adoption:

  1. Democratize AI, but separate “everyday productivity” from “production automation.” A compelling operating model pattern was “two-speed AI,” comprising (i) a broadly accessible enterprise AI environment for daily work (drafting, summarizing, analysis) to capture productivity and learning, and (ii) a much more controlled pathway for production-grade agents and automations with explicit governance gates. This approach balances speed and safety and prevents innovation from being trapped in a small central team.
  2. Treat AI as workforce augmentation with explicit supervision and identity. A BFSI leader described using “digital employees” with unique logins working alongside humans to automate complex tasks, explicitly managed by humans. This “AI as a supervised workforce” model creates clearer accountability and operational acceptance than “shadow bots.”
  3. Build evaluation discipline that matches probabilistic risk. Implementing robust evaluations appropriate to AI, while avoiding bogging pilots down with exhaustive measurement too early, was cited as essential for measuring value. Leaders called for eval frameworks that look beyond classic deterministic software checks and include data on the business outcomes enabled.
  4. Re-architect for cost and context. Two practical “in-the-trenches” points surfaced repeatedly. First, cost modeling matters: every token and output has a cost, and architecture choices determine feasibility at scale. Second, not everything needs a large language model (LLM); using smaller or domain-tuned models where appropriate can significantly reduce cost and increase reliability.
  5. Put AI “front and center” across transformation swim lanes. Turn AI into the horizontal thread that ties together modernization swim lanes (data modernization, CRM modernization, underwriting/operations transformation) instead of treating it as a parallel program. This makes scaling more likely as dependencies are handled as part of the broader roadmap, not an afterthought.
The Bottom Line: BFSI firms can’t pilot their way to AI at scale. The winners will redesign how work is owned, measured, governed, and improved, so that AI can operate as a controlled part of the workforce, tied to outcomes and trust.
