
Why FS institutions have more intelligence than they can act on

A conversation between Dheeraj Remella (Volt Active Data) and Ken Ballou (NewEnding)

Financial institutions have spent the last decade building serious analytical capability. Sophisticated fraud models, mature ML platforms, years of transaction history. And yet production failures keep happening. Fraud gets through. Reconciliations blow up. AI systems that performed well in demos fall short in live environments.

The gap is not intelligence. It is the ability to act on it in time.

Dheeraj Remella is Field CTO at Volt Active Data, where he has spent more than 13 years working with enterprises on the architectural challenges that emerge when decisions must be made correctly, at scale, in milliseconds. He works directly with customers and partners across financial services, telecom, and industrial systems, and speaks regularly on the intersection of agentic AI, deterministic execution, and operational trust.

Ken Ballou is the founder of NewEnding, an enterprise software advisory firm, and a recent addition to Volt’s banking, financial services, and insurance (BFSI) go-to-market team. Over more than 30 years he has worked with banks, insurers, asset managers, and fintech organizations based on Wall Street and in London and Toronto, with senior roles at Palantir Technologies, Sensedia, and Camunda along the way.

This conversation started from a shared observation: the institutions with the most invested in fraud and AI are often the ones most frustrated by what happens in production. What follows is their attempt to find the architectural root of that frustration.

Dheeraj: Ken, you have spent a long time working with banks and fintechs on fraud and payments. I want to understand something from the business side before we get into technology. These institutions have invested seriously in data and analytics over the last decade. When you talk to fraud leaders or payments executives, what are they actually frustrated about?

Ken: The frustration is consistent, and it has not really changed in the time I have been doing this. They have a lot of capability on paper. Sophisticated models, good data teams, years of transaction history to work with. And then something goes wrong in production: a fraud wave gets through, a reconciliation blows up, or a rail goes live and starts generating losses they did not anticipate. The post-mortem almost always reveals the same thing. The system knew something was wrong. It just could not act on it fast enough, or with enough confidence, to prevent the outcome.

The second frustration, which is newer, is AI. There has been enormous investment in building AI capability in FS over the last two or three years, and a lot of it is genuinely impressive. But the production results are not matching the demo results, and the gap is starting to create real internal skepticism about whether the investment is going to pay off.

Dheeraj: That gap between demo and production is exactly what I want to dig into. From a technology perspective, when I look at what these institutions have built, I see a very capable analytical layer. Great data warehouses, mature ML platforms. But I also see that those systems were designed for a specific purpose, and it is not the same purpose as making a fraud decision in 50 milliseconds. Is that a distinction that resonates when you are talking to business people, or does it sound like technical detail?

Ken: It resonates when you frame it around consequences rather than architecture. Nobody in a business conversation cares that the latency profile of a data warehouse is incompatible with a real-time authorization budget. They care that their fraud model flagged the right pattern and the transaction still cleared. Those are the same problem, but you have to arrive at it from the business end.

The clearest way I have found to explain it is: your system has two speeds. It has the speed at which it learns, which is the speed of your data platform and your model training pipeline. And it has the speed at which it decides, which is the speed of a transaction. Most FS institutions have built the first one very well and left the second one to chance.

Dheeraj: I want to stay on that for a moment, because I think “left it to chance” is a bit generous. What I see more often is that the decision speed problem has been patched rather than solved. Caching, feature stores refreshed every few seconds, rule engines fed by near-real-time pipelines. There is real engineering effort behind it. Why does that not work?

Ken: Because near-real-time is not the same as transactionally correct, and in FS the difference between those two things is a paid-out fraud loss or a regulatory finding.

Let me give you a concrete example. We had a client running a feature store that refreshed every 90 seconds. On average, their fraud features were 45 seconds stale when the rule evaluated them. Most of the time, that was fine. Then they started seeing coordinated synthetic identity attacks where the coordination window was under 60 seconds. The feature store looked clean, the rule approved, and by the time the next refresh arrived the pattern was obvious in hindsight. The system was well-engineered. It was just engineered for a slightly different problem than the one it was being asked to solve.
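The arithmetic behind this example can be sketched in a few lines of Python. The 90-second refresh interval and the 60-second attack window come from the conversation. The function names, and the assumption that reads and attack start times fall uniformly at random within a refresh cycle, are illustrative only.

```python
def mean_staleness(refresh_interval_s: float) -> float:
    """Average feature age when reads arrive uniformly within a refresh cycle."""
    return refresh_interval_s / 2.0


def prob_attack_unseen(refresh_interval_s: float, attack_window_s: float) -> float:
    """Chance that an attack starting at a uniformly random point in the cycle
    finishes before the next refresh, so every rule evaluation during the
    attack reads a snapshot taken before it began."""
    if attack_window_s >= refresh_interval_s:
        return 0.0
    return (refresh_interval_s - attack_window_s) / refresh_interval_s


print(mean_staleness(90))          # 45.0
print(prob_attack_unseen(90, 60))  # roughly one attack in three
```

Under these assumptions, about a third of 60-second attacks complete entirely between two refreshes, and the share grows as the coordination window shrinks.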

Dheeraj: That example is exactly the kind of thing that drove the architecture we ended up building. What you are describing is a state consistency problem. The fraud check is evaluating a view of the world that is already out of date. The fix, technically, is to make the decision where the state lives rather than moving state to where the decision is. Run the fraud logic inside the database, against in-memory data that reflects every committed transaction, rather than querying a feature store that is refreshed on a schedule.

Ken: And I understand that at a conceptual level, but I want to push back on the simplicity of it. The reason banks built these multi-component architectures is not because they did not know better. It is because the alternatives have historically been worse. Tightly coupled systems that hold everything in one place tend to become monoliths that are expensive to change and brittle under load. The engineers who built the feature store approach were making a sensible trade-off at the time.

Dheeraj: That is a fair point and worth taking seriously. The tight coupling concern is real. What changed is that there is now a class of database designed specifically for this problem: high-throughput transactional workloads where you need ACID consistency at the latency of a memory read. The reason the feature store approach was the right trade-off five years ago is that the alternative did not really exist at scale. You could have consistency, or you could have speed. The architecture we are talking about gives you both, and it does so through shared-nothing partitioning that scales linearly rather than through the kind of coordination overhead that makes monoliths brittle.

The stored procedure model is the specific thing worth understanding here. Instead of moving data to the logic, you run the logic where the data lives. A fraud check reads velocity counters, device history, account state, and blocklists in a single round trip, executes the rule, and writes the decision record, all inside the same transaction. No network hops on the critical path. No consistency window to exploit.
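As a rough illustration of the pattern, and not Volt's actual API, the Python sketch below keeps state and logic in one place and runs the read, the rule, and the decision write as a single atomic unit. The PartitionState class, the fraud_check function, the thresholds, and the use of a lock as a stand-in for serialized transactional execution are all assumptions made for the example.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class PartitionState:
    """Hypothetical in-memory state for one partition."""
    velocity: dict = field(default_factory=dict)    # account_id -> approved txn timestamps
    blocklist: set = field(default_factory=set)     # blocked device ids
    decisions: list = field(default_factory=list)   # decision records
    lock: threading.Lock = field(default_factory=threading.Lock)


def fraud_check(state, account_id, device_id, now, max_txns=3, window_s=60.0):
    """Read state, apply the rule, and record the decision in one atomic step."""
    with state.lock:  # stand-in for serialized, transactional execution
        # Keep only approved transactions still inside the velocity window.
        recent = [t for t in state.velocity.get(account_id, []) if now - t < window_s]
        if device_id in state.blocklist:
            decision = "DECLINE_BLOCKLIST"
        elif len(recent) >= max_txns:
            decision = "DECLINE_VELOCITY"
        else:
            decision = "APPROVE"
            recent.append(now)  # the next check sees this transaction immediately
        state.velocity[account_id] = recent
        state.decisions.append((account_id, device_id, now, decision))
        return decision


state = PartitionState()
results = [fraud_check(state, "acct-1", "dev-1", now=float(t)) for t in range(5)]
print(results)
```

Because each approval updates the counter inside the same critical section that read it, the fourth attempt within the window is declined without waiting for a refresh.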

Ken: The single round trip is the part I want to understand better. Because in practice, fraud logic is not one thing. It is hundreds or thousands of rules. Vendors like Sift and Sardine have built substantial platforms around this. Are you saying those go away?

Dheeraj: No, and that is an important distinction. The state plane and the rules logic are different things. What we are describing is a decisioning layer that holds the operational state, including velocity counters, the device graph, account balances, and behavioral baselines, and executes deterministic rules against that state in under 10 ms. The ML model scores from your existing vendors still feed in. A Sift score, a Sardine signal, an in-house model output. Those become inputs to the decision procedure rather than the decision itself. You are not replacing the intelligence you have built. You are giving it a foundation that can actually reach the transaction in time.
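One way to picture scores as inputs rather than verdicts is the hypothetical decision procedure below. The score names, the 0-to-1 scale, the thresholds, and the outcome labels are assumptions for illustration and do not reflect any vendor's real API.

```python
from typing import Mapping


def decide(
    vendor_scores: Mapping[str, float],  # hypothetical risk scores in [0, 1]
    txns_last_minute: int,               # current velocity counter
    available_balance: float,            # current account state
    amount: float,
    score_threshold: float = 0.8,
    velocity_limit: int = 5,
) -> str:
    """Combine external model scores with current operational state."""
    # Deterministic checks on current state run first.
    if amount > available_balance:
        return "DECLINE_INSUFFICIENT_FUNDS"
    if txns_last_minute >= velocity_limit:
        return "DECLINE_VELOCITY"
    # Model scores inform the outcome but never bypass the state checks.
    if vendor_scores and max(vendor_scores.values()) >= score_threshold:
        return "REVIEW"
    return "APPROVE"


print(decide({"vendor_a": 0.35, "in_house": 0.91},
             txns_last_minute=1, available_balance=500.0, amount=120.0))  # REVIEW
```

In this sketch a high model score routes the transaction to review, while balance and velocity checks are enforced regardless of what any model reports.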

Ken: That is a more honest framing than I usually hear from technology vendors, and it changes how I think about the problem. Because the real failure mode I see with fraud AI is not that the models are wrong. It is that by the time the model output has been assembled, evaluated, and turned into an action, the window has closed. If the decisioning layer is what closes that gap while keeping the vendor integrations intact, that is a different conversation than ripping out an existing stack.

Dheeraj: Let me give you the rails version of the same problem, because I think it makes the stakes clearer. A bank goes live on FedNow or SEPA Instant. The transaction is irrevocable once it clears. Their fraud and AML checks have to complete within a two-to-three-second bank-side processing window, and that covers balance check, limit enforcement, fraud scoring, AML screening, and sanctions lookup. In a multi-service architecture, those are sequential network hops. They can consume the entire window before a decision has been made.

Ken: I have sat in more than one post-mortem where that is exactly what happened. The institution launched on a real-time rail, volume grew faster than projected, and the processing chain started timing out under load. The engineering response was to reduce the scope of checks run in real time, which means they went live on a rail carrying irrevocable transfers while running partial fraud and AML controls. And then the UK's Payment Systems Regulator (PSR) comes along and says you need to demonstrate that your controls were operating correctly at the moment of a specific transaction, or you are liable for the reimbursement. That is a regulatory exposure that was created by an architectural trade-off.

Dheeraj: Right. And the architectural answer is to run all of those checks inside a single stored procedure, in one database round trip, so the combined latency is under 10ms rather than the sum of five sequential network hops. The sanctions tables from Refinitiv or Dow Jones load into the same system. AML rules execute against the same state as the balance check. There is no chain to time out because there is no chain.
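The budget arithmetic can be sketched as follows. The two-second budget is the lower end of the window mentioned in the conversation and the 10 ms figure is the bound Dheeraj cites. The per-hop latencies are hypothetical numbers chosen to represent a chain under load, not measurements.

```python
BUDGET_MS = 2000  # lower end of the two-to-three-second window

# Hypothetical per-check latencies under load, in milliseconds (not measured).
HOP_LATENCY_MS = {
    "balance_check": 350,
    "limit_enforcement": 300,
    "fraud_scoring": 600,
    "aml_screening": 500,
    "sanctions_lookup": 450,
}

SINGLE_PROCEDURE_MS = 10  # upper bound cited for the single-round-trip approach


def sequential_latency_ms(hops: dict) -> int:
    """Sequential calls add up: each hop waits for the previous one to finish."""
    return sum(hops.values())


def headroom_ms(budget_ms: int, used_ms: int) -> int:
    """Time left in the window; a negative value means the chain timed out."""
    return budget_ms - used_ms


chained = sequential_latency_ms(HOP_LATENCY_MS)
print(chained, headroom_ms(BUDGET_MS, chained))                          # 2200 -200
print(SINGLE_PROCEDURE_MS, headroom_ms(BUDGET_MS, SINGLE_PROCEDURE_MS))  # 10 1990
```

With these illustrative numbers the chained version overruns the window by 200 ms, which is the point at which teams start dropping checks, while the single-procedure version leaves almost the whole window unused.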

Ken: What you are describing is essentially moving the compliance obligation into the data layer rather than trying to satisfy it at the application layer. I have not thought about it in those terms before, but that reframe is useful. Because the compliance teams I work with are always trying to get engineering to build them something faster. The actual ask is that the data the compliance control runs against reflects the truth at the moment of the transaction, not a few seconds earlier.

Dheeraj: That is exactly it. And it connects back to where we started, with the AI story. The reason AI agents do not perform in production the way they do in demos is the same reason the fraud model fires too late and the AML check times out. The agent is reasoning from state that does not reflect the present. When we expose current operational state to an agent through structured queries rather than warehouse SQL, the recommendations it produces are grounded in what is actually true right now. The stale-state problem is the AI problem and the fraud problem and the rails problem. They all have the same architectural root.

Ken: I want to sit with that for a moment, because I think there is something important in it. The way the AI conversation usually goes in FS is that people debate which model is more accurate, or whether the agent’s reasoning is sophisticated enough, or how you govern the decisions it makes. Those are real questions. But if the agent is reading from a feature store that is 45 seconds stale, the governance question is almost beside the point. You are governing a decision that was made on incorrect inputs.

The business case, when I put it that way, is not really about AI at all. It is about whether the data layer can keep up with the decision the business needs to make. Everything else follows from that.

Dheeraj: That is probably the clearest summary of the problem I have heard from someone who was not already inside the technology conversation.

Ken: I have spent enough time watching expensive systems fail at inconvenient moments to recognize the pattern. The intelligence is usually there. Getting it to the decision before the window closes is the part that keeps not working.

Read our brief “The Real-Time Data Platform for Financial Services” to see how Volt approaches real-time decisioning for financial services, covering fraud prevention, payment rail readiness, and AI-assisted risk controls in production environments.

