Why You Can't Retrofit Real-Time Decisioning

Key Takeaways

Real-time decisioning is gaining popularity but faces challenges due to legacy architectures.
Real-time decisioning involves initiating an event without knowing the outcome, requiring decisions that alter state.
Key factors in real-time evaluation are average latency and long-tail latency, with the latter being crucial when deadlines exist.
Common mistakes include misunderstanding latency types, assuming component latencies are additive, and undercounting network trips.
Achieving consistent low latency requires starting with a time budget, minimizing layers, and potentially splitting the system into 'fast' and 'really fast' parts, with latency as the fundamental constraint.

Real-time decisioning has rapidly gained popularity within the tech zeitgeist, especially with its connection to streaming data.

It seems like everything these days either happens in real time or it doesn’t happen at all.

But companies are starting to run into a major issue with real-time data and real-time decisioning. They’re trying to fit a square peg into a round hole: that is, trying to achieve real-time data processing and decisioning with legacy architectures that just weren’t built for it. In doing so, they create massive technical debt and a total cost of ownership (TCO) that slowly erode their revenues and may even sink the entire company.

Before we look at why this has become such an issue, let’s define our terms.

WHAT IS “REAL-TIME DECISIONING” ANYWAY?

“Real-time decisioning” is a pretty loaded term. Let’s break it down, starting with the easy part: decisioning. Decisioning is when you start an event without knowing the outcome. Rearranging fields in a record or writing a log message are not “decisions”. Deciding whether to connect a phone call, whether to let someone join a game, or what level of difficulty their video game should switch to are examples of “decisions”.

HOW DO WE DEFINE REAL TIME?

There are two variables you need to consider when evaluating how ‘real time’ something is:

1. Average latency

Average latency is often measured in milliseconds, or thousandths of a second. An average human reaction time is between 150 and 300 milliseconds. A traditional ‘spinning rust’ hard drive will take a couple of milliseconds to return data. In an ideal environment and situation, a legacy RDBMS can do it in around 10ms, and Volt Active Data can do it in 1-2ms. Light will travel 186 miles during a millisecond. Below milliseconds you get into microseconds (millionths of a second) and nanoseconds (billionths of a second). Some exotic trading systems work in microseconds, and physicists often work with nanoseconds.

2. Long-tail latency

If you rely on an answer to control something in the real world, then long-tail latency matters to you. People often refer to ‘99th-percentile latency’: the time within which 99% of events complete. If your average latency is 2ms but your 99th percentile is 1000ms, then about one event in 100 will not have finished after a second. On the other hand, if you need events to finish within 1000ms, an average latency of 800ms with a 99th percentile of 950ms will be a lot better, as it means that no more than one in a hundred events takes longer than 950ms, and typically fewer still miss the 1000ms deadline. The reality is that if you have a deadline to do something, average latency isn’t really the metric to use. What matters is what proportion of events miss the deadline.

When enough events finish late to disrupt activity then you have a ‘long-tail latency’ problem. A social media feed that responds instantly to 99.9% of requests and takes 20 minutes for the remainder is not really an issue. But an ATM taking 20 minutes to dispense cash would be unacceptable. It’s much more likely to be an issue if you are changing something, and the change still happens, but long after you’ve given up waiting.
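The difference between average latency, 99th-percentile latency, and the proportion of events that miss a deadline can be made concrete with a small calculation. The sketch below is illustrative only: the function name `summarize` and both sample workloads are made up for this example, and it uses the simple nearest-rank definition of a percentile.

```python
def summarize(latencies_ms, deadline_ms):
    """Return (mean, nearest-rank p99, fraction of events over the deadline)."""
    ordered = sorted(latencies_ms)
    n = len(ordered)
    mean = sum(ordered) / n
    # Nearest-rank percentile: the smallest sample with at least
    # 99% of all samples at or below it. Integer math avoids float rounding.
    rank = (99 * n + 99) // 100
    p99 = ordered[rank - 1]
    missed = sum(1 for x in ordered if x > deadline_ms) / n
    return mean, p99, missed

# Hypothetical workloads, both measured against a 1000ms deadline.
spiky = [2] * 980 + [1500] * 20    # fast on average, but 2% are very slow
steady = [800] * 980 + [950] * 20  # slow on average, but never late

for name, samples in (("spiky", spiky), ("steady", steady)):
    mean, p99, missed = summarize(samples, deadline_ms=1000)
    print(f"{name}: mean={mean:.2f}ms p99={p99}ms missed={missed:.1%}")
```

In this made-up data the "spiky" workload has by far the better average, yet it is the only one that ever misses the deadline.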

In addition to latency, in the real-time decisioning world, some decision of value that alters state somewhere usually needs to happen. If so, scaling becomes a problem. If all you’re doing is pulling entries from a cache you can scale by cloning the cache, but if you’re allocating or spending something, you can’t.

Putting all of the above together, we can define “real time” as:

Occurring within a time frame that is short enough to have real-world impacts on business decisions and update key data before it goes stale. This time frame is usually single-digit milliseconds, although the length will vary depending on the environment and use case.

WHERE, WHY, AND HOW COMPANIES GET INTO TROUBLE WITH REAL TIME

The most common mistake we see is failing to understand the difference between average and long-tail latency when defining real-time data processing requirements. As the above example makes clear, you need a really clear understanding of actual deadlines, not hypothetical management ones, and what will happen if you fail to meet them.

Humans will generally wait a couple of seconds and then try again. A lot of devices will treat a latency SLA failure as if it were a transient network outage and try again immediately. The retry can then fail because it bumps into the original request, leading to a storm of retries at one-millisecond intervals and related chaos.

The second mistake we see is that people will assemble a stack of ‘best of breed’ components and assume that ‘latency’ will be the combined execution time of each layer. In practice, this doesn’t happen. If you get twice the expected average latency you’re doing well, and long-tail latency can be all over the place.
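One contributing factor to unpredictable long-tail latency in a layered stack can be shown with simple probability: a request is slow if any one of its layers has a slow moment. The sketch below assumes each layer is slow independently of the others, which is a simplification (real layers share networks, hosts, and garbage collectors), and the function name is hypothetical. It is not a model of the whole effect described above, only of this one piece.

```python
def chance_of_slow_request(per_layer_slow_fraction, layers):
    """Probability that at least one of `layers` independent layers is slow."""
    return 1.0 - (1.0 - per_layer_slow_fraction) ** layers

# Assumed figure for illustration: each layer is slow for 1% of its calls.
for layers in (1, 3, 5, 10):
    p = chance_of_slow_request(0.01, layers)
    print(f"{layers} layer(s): {p:.2%} of requests hit at least one slow layer")
```

Under these assumptions, a five-layer stack in which every layer meets a 99th-percentile target still sends nearly 5% of requests through at least one slow layer.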

The third mistake is to undercount the number of actual network trips. A lot of applications will require multiple round trips to solve a business problem. 5ms latency for a 40ms SLA is fine, until you need to do 7 round trips, and you’re looking at 35ms latency for your 40ms SLA. Edge computing can make this a really serious issue.
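The round-trip arithmetic can be written down as a quick budget check. This is a minimal sketch using the 5ms and 40ms figures from the paragraph above; the function names are made up for illustration, and it ignores any processing time spent between trips.

```python
def remaining_budget_ms(sla_ms, round_trip_ms, round_trips):
    """Time left in the SLA after spending it on network round trips."""
    return sla_ms - round_trip_ms * round_trips

def max_round_trips(sla_ms, round_trip_ms):
    """Largest number of whole round trips that fit inside the SLA."""
    return sla_ms // round_trip_ms

one_trip = remaining_budget_ms(40, 5, 1)     # a single 5ms trip
seven_trips = remaining_budget_ms(40, 5, 7)  # seven 5ms trips
print(one_trip, seven_trips, max_round_trips(40, 5))
```

With seven trips, only 5ms of the 40ms SLA is left for everything else the application has to do.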

THE TECH-STACK CONUNDRUM: YOU CAN’T “JUST ADD” REAL-TIME DECISIONING

A lot of problems in technology can be solved by using or adding more of something — hardware, bandwidth, etc. But it’s really, really hard to ‘add speed’ by adding another layer.

Caching might outwardly be the exception to this, but caching doesn’t involve decisions. Taking a decision usually means you need 100% up-to-date and accurate data. In any situation where your decision-making logic and data storage are in separate places, you run the risk of being ‘overtaken by events’: you read something, but it changes the moment you stop looking at it. You can solve that by double-checking, but that’s yet another trip.
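Being ‘overtaken by events’ can be sketched with a toy store whose data changes between the read and the write. Everything here is hypothetical and single-process: `Store`, `write_if_unchanged`, and `try_spend` are illustrative names, not any product's API, and the version check stands in for the ‘double-checking’ trip mentioned above.

```python
class Store:
    """Toy in-memory store; every successful write bumps a version number."""

    def __init__(self, balance):
        self.balance = balance
        self.version = 0

    def read(self):
        return self.balance, self.version

    def write_if_unchanged(self, new_balance, expected_version):
        """Apply the write only if nobody has written since our read."""
        if self.version != expected_version:
            return False
        self.balance = new_balance
        self.version += 1
        return True


def try_spend(store, amount, interfere=None):
    balance, version = store.read()  # trip 1: fetch the data
    if interfere:
        interfere()                  # another client acts after our read
    if balance < amount:
        return "declined"
    # Trip 2: the conditional write doubles as the double-check.
    if store.write_if_unchanged(balance - amount, version):
        return "approved"
    return "stale"


store = Store(100)
# A competing spend of 80 sneaks in between our read and our write.
outcome = try_spend(store, 80, interfere=lambda: try_spend(store, 80))
print(outcome, store.balance)
```

Without the version check, both spends would have been approved against the same 100 units. With it, the second one is caught, but only at the cost of the extra trip and a retry.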

In short, once you’ve built your solution you may find it really, really hard to reduce real-time latency without a rewrite. You can’t make an existing stack faster by adding more stuff, any more than you can make a finished cake bigger by piling more raw ingredients on top.

WHAT CAN YOU DO? HERE ARE A FEW TIPS

So how do you reach your goal of consistent low-latency real-time decisions if you can’t do it by adding ‘more’ of something?

There’s no easy answer.

You’ll need to start from the available time budget and architect with that in mind as a hard limit.
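Treating the time budget as a hard limit can be sketched as a deadline object that is created when a request arrives and consulted before each step. The `Deadline` class below is a hypothetical illustration, not a library API; the clock is injectable so the behaviour can be checked without real waiting, and the 40ms and 5ms figures are assumed for the example.

```python
import time


class Deadline:
    """Tracks how much of a request's time budget is left."""

    def __init__(self, budget_ms, clock=time.monotonic):
        self._clock = clock  # returns seconds; monotonic by default
        self._expires_at = clock() + budget_ms / 1000.0

    def remaining_ms(self):
        """Milliseconds left in the budget, never negative."""
        return max(0.0, (self._expires_at - self._clock()) * 1000.0)

    def can_afford(self, estimated_cost_ms):
        """True if a step with this estimated cost still fits in the budget."""
        return self.remaining_ms() >= estimated_cost_ms


# Usage with a fake clock so the example is deterministic.
now = [0.0]
deadline = Deadline(budget_ms=40, clock=lambda: now[0])
first_check = deadline.can_afford(5)   # at t=0ms, a 5ms step fits
now[0] = 0.038                         # pretend 38ms have elapsed
second_check = deadline.can_afford(5)  # only about 2ms left
print(first_check, second_check)
```

A step that cannot fit in the remaining budget is better skipped or answered with a fallback than started and finished late.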

One design goal should be to have as few different layers as possible, to avoid time being wasted on internal communications.

You may also find that different parts of the system operate on different timescales, in which case you might be able to split it into a ‘fast’ part and a ‘really fast’ part.

Using really fast CPUs might help, but will at best give you a 50% speedup when you could need 500%.

While there is no obvious, one-size-fits-all answer, once you accept that latency is the fundamental constraint in any real-time decisioning system, your choices will become simpler and better.
