SCOOBY-DOO and all related characters and elements © & ™ Hanna-Barbera and Warner Brothers.
My youngest child recently went through a brief ‘Scooby-Doo’ phase. After it ended I asked him why he had lost interest. He looked at me with all the fury a small boy can muster and hissed “It’s. Always. A. Man. In. A. Suit” at me through clenched baby teeth, before stomping off. This is a show which has made in excess of US$275M by using the same plot more than 40 times over.
So what has this got to do with databases? There’s a lot of talk right now about how 5G and the IIoT will increase volumes of ‘transactions’ driven by streaming events, but equally, there’s a lot of hand waving and vague definitions about what a ‘transaction’ actually is. As a consequence, I would argue people are overestimating what can be done with clever queues and underestimating how much high-volume transactional stream processing will be needed.
Just because a simple single item transaction can be scaled in a queue doesn’t mean your complicated business transaction can be scaled in a queue.
Normally we think in terms of ACID transactions, which are defined thus:
- Atomicity – either all of it happens or none of it happens; nobody ever sees part of it.
- Consistency – a transaction can't violate primary keys or any other constraints.
- Isolation – concurrent transactions don't influence each other; each behaves as if it were the only thing running at the time.
- Durability – once committed, the results of a transaction are permanent.
From a SQL perspective, a Scooby-Doo episode might look like this:
INSERT INTO wacky_plans VALUES (…);
INSERT INTO men_in_suits_in_handcuffs VALUES (…);
INSERT INTO solved_mysteries VALUES (…);
This set of actions does indeed constitute an ACID transaction, and it will happily live in a queue, but it's not representative of real-world problems. In Scooby-Doo the outcome is preordained by a show bible and is thus a foregone conclusion. Scooby never fails. You can't place bets in Vegas on Scooby-Doo's plot twists. You can insert the data involved into your favorite queuing platform without a moment's thought, and no further action is needed.
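The "all or nothing" part of ACID is easy to demonstrate. Here's a minimal sketch using Python's built-in sqlite3 module and a hypothetical `solved_mysteries` table: if one statement in the transaction fails, the earlier ones are rolled back with it.

```python
import sqlite3

# In-memory database with a hypothetical one-column table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE solved_mysteries (villain TEXT NOT NULL)")

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("INSERT INTO solved_mysteries VALUES ('man in a suit')")
        conn.execute("INSERT INTO solved_mysteries VALUES (NULL)")  # violates NOT NULL
except sqlite3.IntegrityError:
    pass

# The first insert was rolled back along with the failed one.
count = conn.execute("SELECT COUNT(*) FROM solved_mysteries").fetchone()[0]
print(count)  # 0 -- all or nothing
```

The first INSERT succeeded on its own, but because it shared a transaction with a failing statement, nobody ever sees it.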
Unlike the real world, Scooby has no external dependencies. He is thankfully immune to day-to-day problems involving shared, finite resources, such as parking spaces for the Mystery Machine, the availability of police officers to hand the bad guys to, expired Scooby Snacks or, for that matter, anybody else trying to solve the same mystery at the same time.
But in the real world, somebody would have to solve all these problems as the need arises. None of these things are certainties, and until the moment they are checked, the outcome is in doubt. Things are dynamic.
Not all transactions are easy to scale using queueing technologies:
- Non-dynamic transactions where the outcome is never in doubt are easily scalable.
- Dynamic transactions with unknown outcomes can be tough to scale, especially if they involve shared, finite resources. In this scenario, transactions will compete for these resources.
- ACID compliance isn't just about how data is stored; it also has to cover the decisions driven by that data.
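A dynamic transaction over a shared, finite resource can be sketched the same way: the decision ("is a space free?") and the state change ("claim it") have to happen as one atomic step, or two competitors end up with the same space. A minimal sketch with Python's sqlite3 module, using a hypothetical `parking_spaces` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parking_spaces (free INTEGER NOT NULL)")
conn.execute("INSERT INTO parking_spaces VALUES (1)")  # one space left
conn.commit()

def claim_space(conn):
    """Atomically check for and claim a space. The outcome is in doubt
    until the UPDATE runs -- this is a 'dynamic' transaction."""
    with conn:
        # The WHERE clause makes check-and-act a single atomic step.
        cur = conn.execute(
            "UPDATE parking_spaces SET free = free - 1 WHERE free > 0"
        )
        return cur.rowcount == 1  # True only if we actually got a space

print(claim_space(conn))  # True  -- first caller wins the space
print(claim_space(conn))  # False -- the resource is exhausted
```

Note what a queue can't do for you here: each caller's outcome depends on everyone who ran before it, so simply recording "claimed a space" events after the fact would happily hand out more spaces than exist.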
Right now the database industry is looking at 5G and the IIoT. I've had conversations with people in the telco business who casually stated they expect volumes to go up tenfold, and this is in a space where 500K TPS is already not unusual. There will be a whole series of new high-volume transactional problems to solve. Some of these may be "Scooby-Doo transactions", where you're effectively recording events that have already happened. But many will be the kind of transactions where decisions driving responses to events at high volumes must complete within milliseconds to be relevant. Because this ability directly impacts your bottom line, it calls for some very careful design decisions.
Handling hundreds of thousands of impactful, complex transactions per second
You have four choices:
- Try to solve this with a legacy RDBMS. This approach won't scale unless you somehow create a farm of RDBMS instances, shard the work across the farm, write a routing layer on top of it, and hire lots of DBAs to manage it.
- Try to solve this with NoSQL. While NoSQL products are beginning to offer transactions, their capabilities are currently far behind what the real world will ask of them. Approaches that involve overwriting entire copies of records with new ones will be vulnerable to scaling problems, especially when tight latency SLAs are involved, because conflicts result in multiple time-consuming retries.
- Try to solve this with a smart queuing product. Queuing product vendors have started to put SQL layers on top of their base functionality, implementing basic GROUP BY and time-based aggregation. This works well for simple scenarios like rendering a dashboard. But when action must be taken the moment these aggregates deviate from the norm, it is not enough to store aggregates for interested parties to query whenever they choose to. Ultimately this approach risks breaking down entirely if your use case grows to involve conditional logic or any other additional complexity.
- Use Volt Active Data. We don’t claim to solve every problem, but handling huge numbers of dynamic transactions with a predictable latency without compromising on data and decision veracity is an area we have an established track record in.
We know that the number of transactions will skyrocket over the next decade as machine-to-machine communications increase in volume. You need to fully understand what vendors mean when they claim to support high transaction volumes. Transactions that simply record state changes are much easier than ones that need to encapsulate the entire work spectrum of ingest-store-aggregate-measure-detect-decide-act.
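To make the distinction concrete: a "Scooby-Doo transaction" just records an event, while the full spectrum folds aggregation, detection and action into the handling of each individual event. A toy sketch in plain Python (the window size, threshold, and "throttle" action are all made up for illustration, not a real streaming engine):

```python
from collections import deque

WINDOW = 5          # hypothetical sliding-window size
THRESHOLD = 100.0   # hypothetical deviation threshold

window = deque(maxlen=WINDOW)
actions = []

def on_event(value):
    """Ingest -> store -> aggregate -> measure -> detect -> decide -> act,
    all inside the handling of a single event."""
    window.append(value)                             # ingest + store
    avg = sum(window) / len(window)                  # aggregate + measure
    if avg > THRESHOLD:                              # detect
        actions.append(f"throttle (avg={avg:.1f})")  # decide + act
    return avg

for v in [90, 95, 120, 130, 140]:
    on_event(v)

print(actions)  # three throttle actions fire as the average crosses 100
```

A system that only ingests and stores can defer everything after the first step; a system that must decide and act within milliseconds cannot, and that is the difference that matters when a vendor claims "high transaction volumes".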
Do you have Scooby-Doo transactions or complicated business transactions that need to be scaled in a queue? Get in touch and let’s chat about how Volt Active Data can help.