FoundationDB’s Lesson: A Fast Key-Value Store is Not Enough

April 01, 2015

The sale and subsequent closure of FoundationDB cut short something of a grand experiment. FoundationDB, conceived as a key-value store, decided to add flexibility in the form of programming and query-model “Layers” on top of its core KV store. First up was SQL: software that ran on top of core FoundationDB and provided SQL relations, indexes and queries. A graph interface and possibly other “Layers” would follow.

So how did the SQL system work out? By FoundationDB’s own measurements, FDB-SQL was less than half as fast as MySQL running sysbench on a single machine. They do claim the system scales well as you add more machines, but crucially, they don’t give actual numbers for distributed sysbench performance, just “normalized performance.” I’d bet some money that absolute performance was not good. If the network overhead of the loopback interface was bottlenecking the single-machine test, an actual network with actual latency couldn’t have helped.

The lesson here circles back to the failed search for one-size-fits-all architectures. The architecture that made FoundationDB a good ACID-transactional key-value store for operations on small numbers of keys is the wrong architecture to implement an operational SQL store.

This came up the other day in a discussion about the FDB sale on Hacker News. I’m just going to quote my little rant below; the comments that prompted it appear first, in quotes.

“Sure, but isn’t that basically what everyone does? All of the relational databases I know use ordered key-value storage engines. FoundationDB is the same, except it’s distributed.”

“The point is to use FDB as a foundation. I heard how one of the FDB founders replaced SQLite’s B-tree storage engine with FoundationDB, which turned SQLite into a distributed SQL database. It’s much more difficult to make something like that if you had to tackle the hard distributed systems problems.”

So this is a really interesting point and I can see why it makes sense if you’ve not built a SQL engine. Why is building SQL on top of a KV store the wrong call? What’s the difference between MySQL’s or Postgres’s or Volt Active Data’s “storage engine” and what FDB had built?

First, I’m not claiming that putting SQL on top of a KV store like FDB is impossible, just that you’re going to have to either compromise the purity of the KV engine substantially, or you’re going to get slow SQL for anything other than single row CRUD.

It starts with metadata. FDB-SQL stores metadata in the KV store itself. This is great from a distributed correctness point-of-view. It’s also great from a simplicity point-of-view. If I trust the underlying system to be safe and consistent, then my metadata is also safe and consistent. But now I need to do a ton of reads before I run my SQL to know how to run my SQL. Where’s my data, for example? Compare this to Volt Active Data, which replicates metadata to all processing sites, and basically has a second state-machine to ensure that each processing site has the right metadata at the right time. Updating metadata is more expensive, but the fast-path of using the metadata is several orders of magnitude faster.
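
To make that concrete, here’s a toy sketch of the two strategies. The key layout, the in-memory kv dict, and build_plan() are inventions for illustration, not FDB-SQL’s or Volt Active Data’s actual code.

```python
def build_plan(table_id, columns, indexes):
    # Stand-in for real query planning; just bundles what we looked up.
    return {"table": table_id, "columns": columns, "indexes": indexes}

# --- Strategy 1: metadata lives in the KV store itself (FDB-SQL style) ---
kv = {
    ("meta", "tables", "users"): 7,
    ("meta", "columns", 7): ["id", "name", "email"],
    ("meta", "indexes", 7): ["users_by_email"],
}

def plan_with_kv_metadata(table_name):
    # Every query starts with reads just to learn how to read. On a real
    # cluster, each of these is a network round trip.
    table_id = kv[("meta", "tables", table_name)]  # read 1
    columns = kv[("meta", "columns", table_id)]    # read 2
    indexes = kv[("meta", "indexes", table_id)]    # read 3
    return build_plan(table_id, columns, indexes)

# --- Strategy 2: metadata replicated to every processing site ---
local_catalog = {"users": (7, ["id", "name", "email"], ["users_by_email"])}

def plan_with_local_metadata(table_name):
    # The fast path is a local lookup. The cost moves to catalog updates,
    # which have to be coordinated across every site.
    table_id, columns, indexes = local_catalog[table_name]
    return build_plan(table_id, columns, indexes)
```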

Then you get to locking. I believe FDB-KV uses per-key read/write sets to manage concurrency. This choice makes their two-key transaction benchmarks look great, but it scales poorly as the number of keys per transaction grows. SQL basically begs users to write operations that read and write lots of keys. To make scans fast (even common partial-index scans), you’re going to need coarser locks, like range locks, instead of per-key entries, or you’re going to need to relax consistency (ACID, not CAP).
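
Here’s a toy model of per-key optimistic validation that shows where the pain comes from. It’s my simplified reading of the approach, not FDB internals.

```python
class Txn:
    def __init__(self):
        self.read_set = set()   # one entry per key read
        self.write_set = {}     # one entry per key written

    def read(self, store, key):
        self.read_set.add(key)
        return store.get(key)

    def write(self, key, value):
        self.write_set[key] = value

def can_commit(txn, recently_committed_writes):
    # Commit succeeds only if nothing this transaction read was
    # overwritten in the meantime. The check is proportional to the size
    # of the read set: cheap for a two-key transaction, painful for a
    # SQL scan that touched a million keys.
    return txn.read_set.isdisjoint(recently_committed_writes)

# A SQL-style scan inflates the read set one key at a time:
store = {("users", i): {"id": i} for i in range(1_000_000)}
t = Txn()
rows = [t.read(store, k) for k in store]  # read set is now 1,000,000 keys
```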

Now we get to storage efficiency. If I have a SQL relation with a primary key and other columns, do I split it so the primary key is in the “key” and the other columns are in the “value” in the KV store? Do I store the whole record in the “value” and lose some efficiency? What about relations with no primary key? Say I’ve got a table that facilitates a many-to-many relationship and it’s just a set of integer pairs. How do I store that efficiently in a KV store? And what about that pesky metadata? Does the key need to include the identifier of the SQL relation (table)? Of course it does.
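
To make the overhead concrete, here’s one plausible encoding. The tuple packing and table-id prefix are illustrative, not any real system’s layout.

```python
import json

TABLE_IDS = {"users": 7, "follows": 8}  # the pesky metadata

def encode_row(table, pk_cols, row):
    # The key carries the table id plus the primary key; everything else
    # goes in the value. Every key pays the table-id prefix tax.
    key = (TABLE_IDS[table],) + tuple(row[c] for c in pk_cols)
    value = json.dumps({c: v for c, v in row.items() if c not in pk_cols})
    return key, value

# A normal table: PK in the key, remaining columns in the value.
encode_row("users", ["id"], {"id": 42, "name": "Ada", "email": "a@x.io"})
# -> ((7, 42), '{"name": "Ada", "email": "a@x.io"}')

# A many-to-many table of integer pairs has no non-key columns, so the
# whole record lands in the key and the value is dead weight.
encode_row("follows", ["src", "dst"], {"src": 1, "dst": 2})
# -> ((8, 1, 2), '{}')
```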

Secondary indexes. Oh man. To do them well on top of a KV store, you’re going to need pretty strong consistency to ensure that index records point to the right base record, and that there are no base records without an index record. This rules out the aforementioned trick of relaxing consistency to get faster many-key transactions. It’s also a metadata problem; I’ve got metadata I need to read to understand how to use the secondary index. More than that, when I update a record in the base relation, I’m going to need to find all affected secondary indexes and make sure they reflect the new information. That might mean lots of reads to the system to query metadata to update the secondary indexes. And again, we have the efficiency problem where I have to put a bit of extra stuff in each “key” to identify the secondary index the KV pair belongs to. Secondary indexes have locking/consistency, storage efficiency and metadata problems in this model.
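
Roughly, a single-row UPDATE ends up looking something like this. The Txn class and key layout are invented for illustration.

```python
class Txn:
    def __init__(self, store):
        self.store = store

    def read(self, key):
        return self.store.get(key)

    def write(self, key, value):
        self.store[key] = value

    def delete(self, key):
        self.store.pop(key, None)

def update_row(txn, table_meta, pk, new_row):
    base_key = (table_meta["id"], pk)
    old_row = txn.read(base_key)  # must read the old record first

    # Every affected secondary index needs a delete and a write in the
    # same transaction. Otherwise readers can see an index entry with no
    # base record, or a base record the index doesn't know about.
    for index in table_meta["indexes"]:  # found via earlier metadata reads
        old_val, new_val = old_row[index["col"]], new_row[index["col"]]
        if old_val != new_val:
            txn.delete((index["id"], old_val, pk))      # stale entry out
            txn.write((index["id"], new_val, pk), b"")  # new entry in

    txn.write(base_key, new_row)
```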

Finally, you get to data transfer. This is probably not a broadly applicable problem, but I think in the FDB implementation there is a lot of needless data transfer between KV cluster nodes and SQL processing nodes. Too much data gets collected and shipped for processing, rather than pushing that processing down to the data’s resting location. In FDB-SQL, if I just want one value from a large record, do I have to move the whole record over the network? Most SQL systems build processing DAGs for each SQL statement and can stream between operators in-process, or use efficient temp tables to buffer intermediate results.
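
A rough sketch of the difference; KVNode and its methods are stand-ins, not any real FDB interface.

```python
class KVNode:
    def __init__(self, records):
        self.records = records  # pretend these live across the network

    def fetch(self, key):
        return self.records[key]  # ships the whole record back

    def run_aggregate(self, keys, column):
        # "Pushdown": the node holding the data does the work and returns
        # a single number instead of N wide records.
        return sum(self.records[k][column] for k in keys)

node = KVNode({i: {"qty": i, "blob": "x" * 4096} for i in range(1000)})

# Without pushdown: ~4 MB of blobs cross the wire to sum 1,000 integers.
total = sum(node.fetch(k)["qty"] for k in node.records)

# With pushdown: one integer comes back.
total = node.run_aggregate(node.records.keys(), "qty")
```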

This falls out of making the SQL layer so separate. In fact, I think the FDB team working on SQL in Massachusetts was almost completely separate from the KV team in Virginia. If the SQL layer had worked, I can see how awesome this would have been. The “Layers” model sounds so appealing, but it’s technically quite hard.

So yes, I agree that under most SQL-relational systems there is a storage engine that smells much like a KV store. Still, that engine is much less “pure” than what they were going for at FDB. Locking, metadata, secondary indexes, native understanding of relations, moving the processing to the data: all are critical to get right if you want reasonable performance.

If FDB had had more time to continue down the SQL path, I imagine it would have meant much deeper hooks into the KV side of things. I’m sure with time it could have gotten a lot better. There’s a lot of research addressing some of the tradeoffs I’ve mentioned above, especially in the distributed transaction coordination space. Still, it’s always going to be easier to build the storage engine for the kinds of operations you want from day one.

I think the interesting thing here is that FoundationDB saw what Volt Active Data has seen over the past five or six years: fast data mutation and lookup-by-key are important, but they’re not enough. At high mutation rates, users want visibility into how their data is changing and how their business is affected, usually in real-time. A fast key-value store – even one with strong ACID properties – is all well and good, but without a powerful query language and real index support, it’s much less interesting. Quoting the Google F1 Paper:

“Features like indexes and ad hoc query are not just nice to have, but absolute requirements for our business.”

FoundationDB saw this coming and bought Akiban to add a SQL layer. But bolting SQL onto a KV store, even a really good one, isn’t trivial. The SQL layer shipped behind schedule and didn’t live up to expectations.

Cassandra is an interesting parallel. Its developers also saw the importance of powerful queries and added CQL to support a SQL-esque query layer. Unlike FDB-SQL, CQL is much more natively integrated into Cassandra. It’s still a much less powerful query system than a proper relational database with real relationships, secondary indexes and such, but it seems close enough to complement Cassandra’s other strengths.

I’ll close with congratulations to the FoundationDB team. I’ve met some excellent engineers on their team and wish them the best in their new roles.
