In this post we describe Volt Active Data’s high availability architecture and how it interacts with bonded networks. Together they provide protection and continuity during network link failures. How do we know? We tested it, and below we describe the tests we ran to verify compatibility with various bonded network modes.
What Is a Bonded Network?
Bonded networking is a technique that allows multiple physical network interfaces to act as one logical interface, providing higher bandwidth and redundancy for high availability. It is also known as Channel Bonding, NIC Teaming, Link Aggregation Groups (LAG), Trunk Group, EtherChannel and 802.3ad. The Linux driver that bonds the server interfaces is transparent to all higher-level applications and shouldn’t require any extra configuration by applications.
Volt Active Data High Availability Feature
Volt Active Data allows a user to specify how many copies of each record the cluster will store. If you tell the database to store three copies, two machines can fail and the cluster will continue operating. Failure detection and failure handling are automatic and backed by strong consensus guarantees. Availability is configured by specifying K-safety, which is the number of cluster nodes that can be lost without interrupting the service.
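The relationship between stored copies and tolerated failures can be sketched in a few lines of Python. The function names are illustrative and are not part of any Volt Active Data API. The sketch assumes that a K-safety value of K corresponds to K + 1 copies of each record, which matches the three-copies example above.

```python
def copies_for_k_safety(k: int) -> int:
    """Copies of each record kept for a given K-safety value (assumed K + 1)."""
    if k < 0:
        raise ValueError("K-safety cannot be negative")
    return k + 1


def tolerated_failures(copies: int) -> int:
    """Nodes that can be lost while at least one copy of every record survives."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    return copies - 1


if __name__ == "__main__":
    for k in (0, 1, 2):
        copies = copies_for_k_safety(k)
        print(f"K={k}: {copies} copies, tolerates {tolerated_failures(copies)} node failure(s)")
```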
Volt Active Data uses heartbeats to verify the presence of other nodes in the cluster. If a heartbeat is not received within a specified time limit, that server is assumed to be down and the cluster reconfigures itself with the remaining nodes. This time limit is called the heartbeat timeout and is specified as an integer number of seconds.
While waiting for the heartbeats, any transactions that require access to data on the node or nodes that are not reachable will wait until the node(s) have either re-established connectivity and started heartbeating, or until the timeout expires and the Volt Active Data cluster decides to reconfigure itself to run without the extra copies of the missing data.
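To make the timeout behavior concrete, here is a minimal sketch of a heartbeat-based failure detector. It is not Volt Active Data's implementation; the class and method names are hypothetical, and it only illustrates the rule described above: a host is treated as down once no heartbeat has arrived for longer than the timeout.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat per host and reports hosts past the timeout."""

    def __init__(self, timeout_seconds: int) -> None:
        if timeout_seconds <= 0:
            raise ValueError("timeout must be a positive number of seconds")
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def record_heartbeat(self, host: str, now: Optional[float] = None) -> None:
        # 'now' can be injected for testing; otherwise use a monotonic clock.
        self._last_seen[host] = time.monotonic() if now is None else now

    def dead_hosts(self, now: Optional[float] = None) -> List[str]:
        current = time.monotonic() if now is None else now
        return sorted(
            host
            for host, seen in self._last_seen.items()
            if current - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=10)
    monitor.record_heartbeat("node-a", now=0.0)
    monitor.record_heartbeat("node-b", now=8.0)
    # At t=12 only node-a has been silent for more than 10 seconds.
    print(monitor.dead_hosts(now=12.0))
```

A link failover that completes before the timeout simply looks like a late heartbeat, which is why the timeout has to be longer than the worst failover you expect.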
Summary of Results
Volt Active Data can handle a bonded network failover as long as the network can re-establish connectivity within Volt Active Data’s configured heartbeat timeout period. If the heartbeat timeout is too short to cover the bonded network failover, one or more nodes can fail; as long as the K-safety configuration is high enough, the remaining cluster nodes will continue operating.
We tested five types of network bonding, and Volt Active Data was resilient to link failures in all of them. In addition, we verified that Volt Active Data ran correctly in those modes that also support load balancing. Each network switch had a different failover duration when testing high availability. In our controlled environment under a moderate workload, we found that a heartbeat timeout of 10 seconds was adequate to prevent unnecessary Dead Host detections.
The network switch configuration can dramatically affect the duration of failovers. As one example, in our tests we were connected to a switch with Spanning Tree enabled. A spanning tree topology change was triggered, which caused some ports on the switch to go through the listening and learning states and temporarily block traffic on those ports. The delay between the state change and the ports forwarding again was longer than our heartbeat timeout setting, and it caused a node to fail.
If you only require high availability with link redundancy and don’t need aggregation, we recommend setting up the bonded network for failover using active-backup mode. This networking mode has the simplest configuration and produced the most reliable failovers in our testing. After setting up the network, we recommend testing the duration of failover under load and setting Volt Active Data’s heartbeat timeout setting to a value that is long enough to prevent a node from being evicted from the cluster.
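One way to turn measured failover durations into a timeout value is sketched below. The function name and the safety factor of 2 are assumptions made for illustration, not a Volt Active Data recommendation. The result is rounded up because the heartbeat timeout is specified as an integer number of seconds.

```python
import math
from typing import Iterable


def recommend_heartbeat_timeout(failover_seconds: Iterable[float],
                                safety_factor: float = 2.0) -> int:
    """Suggest an integer timeout above the slowest measured failover."""
    samples = list(failover_seconds)
    if not samples:
        raise ValueError("need at least one measured failover duration")
    if safety_factor < 1.0:
        raise ValueError("safety_factor below 1.0 would undercut the measurements")
    # Scale the worst case by the margin, then round up to whole seconds.
    return max(1, math.ceil(max(samples) * safety_factor))


if __name__ == "__main__":
    measured = [0.4, 2.8, 3.1]  # hypothetical failover durations in seconds
    print(recommend_heartbeat_timeout(measured))
```

Measure under load and include repeated failovers in the samples, since the slowest case is what determines whether a node gets evicted.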
Bonded Network Modes
Linux Channel Bonding can be configured in one of seven modes, and each mode has benefits and limitations. Below is a chart of each mode, its features, and whether we verified that Volt Active Data is resilient to link failures.
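|Mode|Requires Specific Hardware Support|Requires Non-Standard or Proprietary Legacy Protocols|Standards Based|Load Balancing|Redundancy|Cross-Switch Redundancy|Verified Volt Active Data Compatible|
|---|---|---|---|---|---|---|---|
|Mode 0 balance-rr| | | | | | |Yes|
|Mode 1 active-backup| | | | | | |Yes|
|Mode 2 balance-xor| | | | | | |Not tested|
|Mode 3 broadcast| | | | | | |Not tested|
|Mode 4 802.3ad| | | | | | |Yes|
|Mode 5 balance-tlb| | | | | | |Yes|
|Mode 6 balance-alb| | | | | | |Yes|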
We didn’t test balance-xor and broadcast modes because we didn’t have the proper hardware. However, we believe that these should also pass.
Software Setup and Test Configuration
We ran tests on CentOS 7.4 and Ubuntu 16.04. We also tested Volt Active Data inside a VM using LXC containers that shared a bridged interface over the bonded interface and verified no adverse effects from the configuration.
We tested failover and performance by running the Volt Active Data “Voter” application on a 2-node, k=1 cluster, with a client that continuously generates transactions. We monitored for errors and found no transaction failures while repeatedly killing network links at various intervals. We also varied the Volt Active Data heartbeat timeout value to find the limitation of heartbeat timeout vs. link failover time.
Hardware and Driver Configuration
Our testing used the Ethernet Channel Bonding Driver v3.7.1 (April 27, 2011). This is the same kernel module used in Ubuntu 14.04, Ubuntu 16.04 and CentOS 7.4. In all modes tested, we used the default parameters for the v3.7.1 bonding driver on both Ubuntu and CentOS.
We compared two server and switch configurations:
- Ubuntu 16.04 attached to an IBM RackSwitch™ G8264, using dual Intel 10Gb copper interfaces. They used the kernel module ixgbe (Intel(R) 10 Gigabit PCI Express Network Driver), version 4.2.1-k, on Linux kernel 4.4.0-96-generic.
- CentOS 7.4 attached to a Dell PowerConnect 8024F, using dual Intel 10Gb copper interfaces. They used the kernel module ixgbe (Intel(R) 10 Gigabit PCI Express Network Driver), version 4.4.0-k-rh7.4, on Linux kernel 3.10.0-693.
We used module defaults for all of our tests. In particular, miimon was used for port failure detection.
We also verified proper failover in a Multi-Switch Configuration using a single server (Ubuntu 16.04) with interfaces attached to each switch above.
With one exception, we were able to configure our equipment so all failover tests worked correctly. When we repeatedly brought the interfaces up and down within 3 seconds of each other and Volt Active Data had a heartbeat timeout of 1 second, the network couldn’t keep up and a Dead Host detection was triggered. Increasing the heartbeat timeout to 2 seconds corrected this issue. In general, any link thrashing at intervals of 3 seconds or less caused the network to become unstable. This wasn’t specific to Volt Active Data.
We recommend that the network administrator determine the acceptable maximum duration of a multi-link failover and take this into consideration when configuring Volt Active Data’s heartbeat timeout.
In active-backup, continuously thrashing (bringing each interface up and down) every second severely impacted networking in our configuration. This affected the network in general, and Volt Active Data was impacted the same as if a single link had failed and recovered continuously. In our limited testing for our configurations with 2 bonded interfaces, we saw that the best case for any repeated failover was 3 seconds. If the user has interfaces that are constantly bouncing up and down, they will experience network issues that are outside the control of Volt Active Data.
Balance-rr is the only mode that will allow a single TCP/IP stream to utilize more than one interface’s worth of throughput. However, this can result in packets arriving out of order, causing TCP/IP’s congestion control system to kick in and retransmit packets. This could add extra latency to Volt Active Data transactions, so it should be measured to make sure that application performance meets all requirements.
Our switch hardware didn’t specifically support balanced round-robin traffic, sometimes called trunking, so we were not able to fully utilize two interfaces’ worth of traffic. However, even without hardware support, we were able to verify that throughput was not degraded in this mode and that link failover didn’t adversely affect Volt Active Data with a large enough heartbeat timeout.
The two hardware configurations (Dell/CentOS and IBM/Ubuntu) had significantly different failover durations. When this mode was used and some failovers took longer than 10 seconds, we had to use a heartbeat timeout of up to 30 seconds to maintain stability.
Tools for testing bonded network failover
It is important to set Volt Active Data’s heartbeat timeout to be longer than the time it takes for the failover to occur. This section describes some of the methods we used to measure the link failover duration.
A simple method is to start continuous pings across the bonded network to the host as well as an ssh session to the host. You may notice ping timeouts and/or missing sequence numbers. On the ssh session, if you type continuously you will notice a delay in the echo of your keystrokes. Measuring the ping timeouts and the typing delay will give you an estimate of how long the link failover can take. You should do more than one failover of each link in sequence. In our tests the first failover was usually very fast, but the second failover was often slower.
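The ping method above can be partly automated by looking for missing sequence numbers in captured ping output. The sketch below assumes the reply format printed by Linux ping (`64 bytes from ...: icmp_seq=N ...`) and a fixed send interval; the function names and the sample output are hypothetical.

```python
import re
from typing import List, Tuple

_SEQ = re.compile(r"icmp_seq=(\d+)")


def find_ping_gaps(ping_output: str) -> List[Tuple[int, int]]:
    """Return (first_missing_seq, missing_count) for each gap between replies."""
    seqs = []
    for line in ping_output.splitlines():
        # Count only successful replies, not "Destination Host Unreachable" lines.
        if "bytes from" not in line:
            continue
        match = _SEQ.search(line)
        if match:
            seqs.append(int(match.group(1)))
    gaps = []
    for prev, cur in zip(seqs, seqs[1:]):
        if cur - prev > 1:
            gaps.append((prev + 1, cur - prev - 1))
    return gaps


def longest_outage_seconds(ping_output: str, interval_seconds: float = 1.0) -> float:
    """Estimate the longest outage as lost pings multiplied by the send interval."""
    longest = max((count for _, count in find_ping_gaps(ping_output)), default=0)
    return longest * interval_seconds


SAMPLE_PING = """\
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.21 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.19 ms
64 bytes from 10.0.0.2: icmp_seq=6 ttl=64 time=0.25 ms
64 bytes from 10.0.0.2: icmp_seq=7 ttl=64 time=0.20 ms
"""

if __name__ == "__main__":
    print(find_ping_gaps(SAMPLE_PING))
    print(longest_outage_seconds(SAMPLE_PING))
```

The estimate is only as fine as the ping interval, so a shorter interval gives a tighter bound on the failover time.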
The ‘cat /proc/net/bonding/bond0’ command gives the most information about the state of the bond including which links are active/backup, some failover stats and any module parameters used. The output is different for each type of bonding. Likewise all of the link state changes on the host should be logged on the system.
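If you want to poll the bond state from a script during a failover test, the file can be parsed as simple `key: value` lines. This is a rough sketch: the sample text is an abbreviated, illustrative approximation of an active-backup bond, and because the real output differs per bonding mode the parser keeps whatever keys it finds rather than expecting a fixed set.

```python
from typing import Dict, List


def parse_bonding_status(text: str) -> Dict[str, object]:
    """Split bonding status text into bond-level fields and per-interface fields."""
    bond: Dict[str, str] = {}
    slaves: List[Dict[str, str]] = []
    current = bond
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key == "Slave Interface":
            # Every "Slave Interface" line starts a new per-interface section.
            current = {"Slave Interface": value}
            slaves.append(current)
        else:
            current[key] = value
    return {"bond": bond, "slaves": slaves}


# Illustrative sample only; read /proc/net/bonding/bond0 on a real host.
SAMPLE_BOND = """\
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: eth0
MII Status: up
Link Failure Count: 0

Slave Interface: eth1
MII Status: up
Link Failure Count: 1
"""

if __name__ == "__main__":
    status = parse_bonding_status(SAMPLE_BOND)
    print(status["bond"]["Currently Active Slave"])
    for slave in status["slaves"]:
        print(slave["Slave Interface"], slave["MII Status"], slave["Link Failure Count"])
```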
You may also want to monitor out of order or resent packets, as this may contribute to increased latency for Volt Active Data transactions.
>watch -n 5 'netstat -s | grep segments'
>netstat -s | grep segments
49049707 segments received
56121614 segments send out
6702 segments retransmited
1 bad segments received.
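Because these counters are cumulative, the change between two samples taken around a failover says more than the absolute values. The sketch below parses the netstat lines shown above and computes the share of segments retransmitted between two snapshots. The function names are hypothetical, and the "after" snapshot in the example is invented for illustration.

```python
import re
from typing import Dict


def parse_segment_counters(netstat_output: str) -> Dict[str, int]:
    """Extract sent and retransmitted TCP segment counters from netstat -s output."""
    # Tolerate both spellings that different netstat versions print.
    sent = re.search(r"(\d+) segments sen[dt] out", netstat_output)
    retrans = re.search(r"(\d+) segments retransmit+ed", netstat_output)
    if sent is None or retrans is None:
        raise ValueError("segment counters not found in netstat output")
    return {"sent": int(sent.group(1)), "retransmitted": int(retrans.group(1))}


def retransmit_ratio(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Fraction of segments retransmitted between two counter snapshots."""
    sent_delta = after["sent"] - before["sent"]
    if sent_delta <= 0:
        return 0.0
    return (after["retransmitted"] - before["retransmitted"]) / sent_delta


SAMPLE_BEFORE = """\
49049707 segments received
56121614 segments send out
6702 segments retransmited
1 bad segments received.
"""

if __name__ == "__main__":
    before = parse_segment_counters(SAMPLE_BEFORE)
    # Hypothetical second snapshot taken after a link failover.
    after = {"sent": before["sent"] + 10000, "retransmitted": before["retransmitted"] + 50}
    print(f"{retransmit_ratio(before, after):.2%} of segments retransmitted")
```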
After determining the failover time without Volt Active Data, you should try the link failover with Volt Active Data running. Volt Active Data will issue warnings at regular intervals when heartbeats are skipped. For example:
2017-09-26 15:03:19,466 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 10.005 seconds
2017-09-26 15:03:29,468 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 20.005 seconds
If the network interruption from the failover exceeds the heartbeat timeout (set to 30 seconds in this example), then surviving nodes will log an ERROR:
2017-09-26 15:03:39,467 ERROR [ZooKeeperServer] HOST: DEAD HOST DETECTED, hostname: proddb01
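When running many failover tests, it can help to scan the server log for these messages automatically. The sketch below assumes the log lines have exactly the wording shown in the examples above; the function name and the summary structure are hypothetical.

```python
import re
from typing import Dict, Iterable

_MISSED = re.compile(r"Have not received a message from host (\S+) for ([\d.]+) seconds")
_DEAD = re.compile(r"DEAD HOST DETECTED, hostname: (\S+)")


def summarize_heartbeat_log(lines: Iterable[str]) -> Dict[str, Dict[str, object]]:
    """Report, per host, the longest warned heartbeat gap and whether it was declared dead."""
    summary: Dict[str, Dict[str, object]] = {}
    for line in lines:
        missed = _MISSED.search(line)
        if missed:
            entry = summary.setdefault(missed.group(1), {"longest_gap": 0.0, "dead": False})
            entry["longest_gap"] = max(entry["longest_gap"], float(missed.group(2)))
            continue
        dead = _DEAD.search(line)
        if dead:
            entry = summary.setdefault(dead.group(1), {"longest_gap": 0.0, "dead": False})
            entry["dead"] = True
    return summary


SAMPLE_LOG = [
    "2017-09-26 15:03:19,466 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 10.005 seconds",
    "2017-09-26 15:03:29,468 WARN [ZooKeeperServer] HOST: Have not received a message from host proddb01 for 20.005 seconds",
    "2017-09-26 15:03:39,467 ERROR [ZooKeeperServer] HOST: DEAD HOST DETECTED, hostname: proddb01",
]

if __name__ == "__main__":
    for host, info in summarize_heartbeat_log(SAMPLE_LOG).items():
        print(host, info)
```

A host that shows long gaps without ever being declared dead suggests the timeout covers the failover but with little margin.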
By tuning your network and testing failover, you should be able to determine a heartbeat timeout value that enables Volt Active Data to maintain high availability during bonded network events.
Linux Bonding Kernel Module https://www.kernel.org/doc/Documentation/networking/bonding.txt
802.3ad LAG Overview http://www.ieee802.org/3/hssg/public/apr07/frazier_01_0407.pdf
Redhat Channel Bonding https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/s2-networkscripts-interfaces-chan.html
Ubuntu Channel Bonding https://help.ubuntu.com/community/UbuntuBonding
Dell PowerConnect Configuration Guide http://downloads.dell.com/manuals/all-products/esuprt_ser_stor_net/esuprt_powerconnect/powerconnect-8024f_user%27s%20guide_en-us.pdf
IBM System Networking RackSwitch G8264 Application Guide for Networking http://www-01.ibm.com/support/docview.wss?uid=isg3T7000679&aid=1