Home < Blog < NoSQL, RDBMS, and Contention for Shared Data: Part 2

NoSQL, RDBMS, and Contention for Shared Data: Part 2

Dec 21, 2017

4 min read

Key Takeaways

Traditional JDBC interactions have locking and latching overhead, which increases significantly in distributed systems.
NoSQL key-value interactions offer scalability but can be problematic for data access control, complexity, and accuracy in non-trivial use cases.
Moving computation to the data server using stored procedures can reduce overhead from simultaneous data access.
Eliminating uncommitted transactions and forcing requests to execute serially by partitioning data can significantly reduce resource contention.
Partitioning data by CPU core and using a single-threaded process per core can maximize transaction throughput.

Previously in part one of this blog series, I explained the problems with two methods of database interactions. Traditional JDBC-style interactions mean that locking and latching take up ~30% of compute time. This overhead skyrockets when you try to use a distributed system. A NoSQL key-value interaction allows for better scalability, but for non-trivial use cases there are problems with data access control, complexity, and accuracy. This brings us to our third approach:

Method #3: Move the problem to the data, do what needs to be done, send an answer back.

In the Hadoop ecosystem it’s taken for granted that moving computation is cheaper than moving data, and the same principle can be applied to transactional work. Both of the two methods we’ve looked at incur significant costs because multiple clients want simultaneous access to the same data at the same instant in time, and the resulting costs of preventing conflicts (with SQL + Sessions) or cleaning up conflicts (with a KV store) escalate exponentially as throughout and complexity increase. By doing all the ‘thinking’ on the server, we can reduce the overhead of simultaneous access, as we aren’t moving data across the wire all the time. So the logical thing to do is use some form of stored procedural logic on the server.

Stored Procedures? Seriously?

Stored procedures are in fact the most efficient way to handle high volume OLTP workloads.

Developers have traditionally disliked them because they made development more complicated and required more interaction with DBAs. They also usually have their own peculiar language which will somehow manage to be both complicated and limited, such as PL/SQL, which is one of the few computer languages where a “Hello World” program is non-trivial.

But their real problem is that legacy RDBMS implementations allow uncommitted transactions to exist while the client decides when or if a transaction should be committed. As we’ve already explained, this creates a massive server side overhead as well as resource contention issues. But if you make three changes you get some fairly astonishing results:

We can eliminate most ‘locking and latching’ by making it a rule that all calls to the database must commit or rollback by the end. This creates issues for some use cases, but for the vast majority of OLTP/IOT use cases works really well, as the trip to the database exists because somebody or (in the case of the IoT) something needs a clear answer to do their job, instead of wanting to mull things over and decide to commit or rollback on the client side at some arbitrary point in the future.
If we force requests to execute one at a time, we design out contention for resources. The legacy alternative leaves us with residual contention among requests that are ‘live’ on the server at the same time. Even with traditional stored procedure approaches, this will sooner or later become a problem, as figuring out what resource is supposed to be locked in what order is trickier than it looks. But by partitioning the data and then forcing each partition to execute serially, there is no longer any contention for resources. Bear in mind that if nobody is allowed to hold resources between calls, and that if each call is the only thing touching that partition of data, there is no reason why it shouldn’t run like greased lightning – the database doesn’t have to devote any time to keeping track of what other ‘sessions’ are seeing and the client doesn’t have to devote resources to dealing with requests overwriting each other or needing to be coordinated.
The final step is to partition the data by CPU core, which means that you can have a single threaded process running on each core handling request after request. Zero CPU cycles are spent figuring out what to do next. This is what we’ve done at Volt Active Data. On a consumer grade desktop I can easily get between 5-9K TPS per core, with each transaction updating multiple tables and being 100% ACID.

Conclusion

Since the legacy RDBMS’s architecture was nailed down in the 1980’s, dramatic advances in both computer architecture and anticipated transaction volumes have completely changed the world that we now work in. As a consequence, we need to realize that the real constraint for high performance write-centric OLTP systems is not RAM, or CPU, or network, but instead is contention for shared data. Volt Active Data is built around this assumption. To learn more about the differences in modern databases and how to pick the right one for your application, view our pre-recorded webinar “The Changing Landscape of the Database Industry.” Or, read our blog and checklist: 8 tips for evaluating a database in 2018.

About Author

David Rolfe

David Rolfe is Volt Active Data’s senior technologist in EMEA. He has 30 years of database industry experience with a focus on telcos.

Get Started with Volt

Architecture

Capabilities

Data Center Replication

In-Service Upgrades

Low Latency

Consistency

High Availability

Scalability

Page group one

Fraud Prevention

Hyper-Personalization

Private 5G Networks

Streaming Data

Edge-Based Deployments

Page group two

Industrial IoT

AI + ML

Business Support Systems

5G Streaming Mediation

The 6 Reasons BFSI Companies Need Real-Time Data Processing

From Tsunami to Transformation: 6 Key Takeaways from IoT Tech Expo North America 2025

Telco

BFSI

Intelligent Manufacturing

Smart Utilities

Supply Chain

Fantasy Sports

Retail

Resource Library

Blog

Partners

For Customers

Support

Professional Services

Documentation

For Developers

Developer Hub

Quick Start Guide

Developer Edition

About

Careers

News

Press Releases

Webinars & Events

Our Team

Contact Us

NoSQL, RDBMS, and Contention for Shared Data: Part 2

Key Takeaways

Method #3: Move the problem to the data, do what needs to be done, send an answer back.

Stored Procedures? Seriously?

Conclusion

About Author

Featured Resources

5 Reasons Volt Was Built for Telco-Grade Resiliency

The Real-Time Data Platform for Financial Services

Follow Us:

Categories

Power Real-Time BFSI Success

Guide to Streaming Data Platforms

Volt Active Data’s Top-10 Capabilities

Why Your Tech Stack Is About to Break (and How to Avoid It)

Test Drive the Only Lightening-Fast No-Compromise Real-Time Data Platform on the Planet

Guide to Private 5G Networks