We thank Jeremy Winters of Volt Active Data partner Full360 for this guest post, which first appeared on Full360’s Elastic Ninja blog. It is published here in its entirety.
About a year ago, a Full360 gaming customer had a requirement to create an analytics data warehouse for in-game events. These events are generated by various mobile and desktop versions of their games, representing significant actions in the game such as wins, losses, level ups, and in-game currency transactions. The final target for the data was a Vertica database running on AWS EC2 instances… to be analyzed and visualized with Tableau.
The twist in this story for Full360 is that the events were not being collected yet… so I was tasked with determining the mechanism for capturing the events and staging them in S3 before they were loaded into the database. It was obvious that this mechanism needed to be an HTTP-based service of some sort. At the time I was familiar with building web services using JRuby, Sinatra, and Puma… a combination that allows for a truly concurrent solution that is relatively easy to build.
Around that time Full360 became a partner of Volt Active Data. I was browsing the Volt Active Data documentation and saw two features that took JRuby off the table and made Volt Active Data a slam dunk for this requirement.
To give a little background, Volt Active Data is an in-memory SQL database with an MPP, shared-nothing architecture. These principles are very similar to the architecture of Vertica, which is not surprising as they are both the brainchildren of Michael Stonebraker, who recently won the Turing Award for his contributions to database systems technology. A database like Vertica is intended for high-performance querying of very large datasets, while Volt Active Data is a bit of the opposite, designed for high throughput of complex transactional events. This throughput comes from design choices such as the in-memory model and partitioned stored procedures. Volt Active Data is an excellent tool for achieving a high throughput of intelligent operations such as personalization, fraud detection, and analytics that capitalize on the value of freshly generated data.
But… we’re not doing any of that fancy stuff here! We are simply collecting events for ingestion into a data warehouse.
The two features of Volt Active Data that rang a bell for me were the following:
- JSON Interface – Volt Active Data provides a simple REST-like HTTP interface that allows you to call procedures and pass parameters via the URL in a JSON format. This functionality can be enabled for any Volt Active Data stored procedure.
- Export tables – Volt Active Data is a SQL database that supports the standard SELECT, INSERT, and UPDATE statements (among others). Tables can be defined as export-only; such tables accept only INSERT statements. When a record is inserted, it is immediately queued for export to a text file in a directory defined on the Volt Active Data server. These files are rotated at an interval defined by the database configuration.
Defining the table as export is easy… you simply create the table definition as you normally would in a DDL statement, then declare it as an export table:
create table sessions (
  session_start timestamp
  ,user_id bigint
  ,session_id bigint
  ,b_user_id bigint
  ...
);

export table sessions;
Export functionality is enabled in the deployment.xml file that contains the resource definition of the Volt Active Data instance you are running. Note that the example below not only defines the export parameters, but enables the JSON API interface.
<?xml version="1.0"?>
<deployment>
  <cluster hostcount="1" sitesperhost="2" kfactor="0" />
  <httpd enabled="true">
    <jsonapi enabled="true" />
  </httpd>
  <!-- file export connector; the rotation interval ("period") is in minutes,
       and the nonce/outdir values below are illustrative -->
  <export enabled="true" target="file">
    <configuration>
      <property name="type">csv</property>
      <property name="nonce">sessions</property>
      <property name="outdir">/var/voltdb/export</property>
      <property name="period">5</property>
    </configuration>
  </export>
  <users>
    <user name="someuser" password="abc123" roles="device_role"/>
  </users>
</deployment>
When the database is running… any rows inserted into export-only tables are written directly to text files in a basic CSV format. To complement the export process, I built a simple job that checks for new files (after they are rotated) and uploads them to Amazon S3 with a filename based on a sortable timestamp and the server hostname, allowing multiple servers to collect events and write them safely into the same S3 bucket.
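The key-naming scheme can be sketched as follows. This is a minimal illustration of the idea, not the production job (which handles the actual upload via the AWS SDK); the path, hostname, and timestamp values are made up:

```python
import os
import socket
import time

def s3_key_for(path, hostname=None, ts=None):
    # Sortable UTC timestamp first, so an S3 listing returns files in
    # chronological order; hostname second, so multiple collectors
    # writing to the same bucket never produce colliding keys.
    hostname = hostname or socket.gethostname()
    ts = ts or time.strftime("%Y%m%d%H%M%S", time.gmtime())
    return "%s-%s-%s" % (ts, hostname, os.path.basename(path))

print(s3_key_for("/var/voltdb/export/sessions.csv",
                 hostname="collector-1", ts="20150601120000"))
```

Because the timestamp leads the key, a consumer can list the bucket with a simple prefix scan and load files in the order they were produced.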
Getting events into Volt Active Data is quite easy when using the JSON API. This API allows your application (in this case the mobile game) to send events to Volt Active Data by way of simple HTTP calls such as the example below…
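The call shape is a GET against the JSON API endpoint (/api/1.0/ on the HTTP port, 8080 by default) with the procedure name and a JSON-encoded parameter array as query arguments. A minimal sketch — the host and parameter values are hypothetical, and SESSIONS.insert is the default CRUD procedure Volt Active Data generates for the sessions table:

```python
import json
from urllib.parse import urlencode

def voltdb_call_url(host, procedure, parameters):
    # The JSON API takes the procedure name plus a JSON-encoded
    # parameter array as URL query arguments.
    query = urlencode({"Procedure": procedure,
                       "Parameters": json.dumps(parameters)})
    return "http://%s:8080/api/1.0/?%s" % (host, query)

# Insert one game event into the sessions export table
# (all values here are made up for illustration).
url = voltdb_call_url("volt.example.com", "SESSIONS.insert",
                      ["2015-06-01 12:00:00", 42, 1001, 7])
print(url)
```

The mobile client simply issues this URL as an HTTP GET; no driver or client library is required.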
Of course the example above uses plain HTTP… but in production we put an Amazon Elastic Load Balancer in front, terminating TLS/HTTPS to encrypt the traffic. The ELB also lets us spread the incoming events across multiple Volt Active Data servers running in different Availability Zones (isolated data centers), so an entire data center can fail without our losing the ability to collect events.
Each Volt Active Data server works in isolation, gathering events and writing them to files without awareness of the other servers. This customer typically gathers around 10M events per day. That volume is handled by two m1.large servers, neither of which ever goes past 20% CPU. If they were to require more horsepower, we could easily scale up the instances or add more of them to handle the load.
Behind the scenes we have a few other tricks, such as the way we generated all of the Volt Active Data event definitions (Java source files) using a metadata-driven Ruby template system… but that’s a story for another time!