
Replication Plugin usage for Java

lkairies edited this page Jun 12, 2014 · 1 revision

BabuDB replication

Introduction

This guide refers to the replication plugin that will become available with the next major release (TBA 0.5.0). Since the replication mechanism has been separated from the core BabuDB functionality, its usability has also been improved. It no longer matters to your application whether it uses a single BabuDB instance or a system of multiple instances that are replicated among each other in the background: the database is always accessed through the same well-documented interface found in the package org.xtreemfs.babudb.api. The replication itself exposes no user-accessible interface at all.

Features

Using the replication plugin for BabuDB will provide the following features:

  • Better protection against data loss: database operations are continuously spread among the servers participating in the setup.
  • Higher availability: an integrated master fail-over mechanism keeps the system operational as long as at least a user-defined number of servers is up and running. You may still need a fail-over mechanism for your application itself, because if the server executing your application fails, your system becomes unavailable as well.
  • Completely transparent integration with BabuDB: any application written to run on BabuDB 0.5.0 without replication will also run on a replicated multi-instance BabuDB system.

Preconditions necessary to use replication

  • A server running a BabuDB instance that takes part in a replication setup must be able to connect to every other server of the setup via a TCP/IP connection with limited latency.
  • The system clocks of all servers must be loosely synchronized within the setup, ensuring a well-known maximal time drift (by default this drift is assumed to be less than 3 seconds).

Quickstart

This guide describes how to set up an example with two BabuDB instances running on the same host. The more common way to set up replicated BabuDB instances involves multiple hosts, but for a quickstart scenario this will do. A sample application based on this quickstart can be found here.

Preparations

  • Ensure that you put your latest BabuDB release, including its dependencies, into myBabuDBPath.
  • Unzip the release of the replication plugin into myPluginPath on your hard drive.
  • Check that you have installed a Sun/Oracle JRE of at least version 1.6.

Configuration

First of all, we need to configure the replication plugin by creating one configuration file for each BabuDB instance taking part in our setup. For the two instances of our example these are pluginConfig0.properties with the following content:

plugin.jar = myPluginPath/Replication-1.0.0_0.5.0.jar

# paths to libraries this plugin depends on
babudb.repl.dependency.0 = myPluginPath/lib/PBRPC.jar
babudb.repl.dependency.1 = myPluginPath/lib/Flease.jar
babudb.repl.dependency.2 = myPluginPath/extLib/protobuf-java-2.5.0.jar

# DB backup directory - needed for the initial loading of the BabuDB from the 
# master in replication context
babudb.repl.backupDir = /tmp/backup0

# number of servers that at least have to be up to date
babudb.repl.sync.n = 2

# it is possible to set the local address and port of this server explicitly. if not it will be
# chosen from the list of participants added right hereafter (default).
babudb.repl.localhost = localhost
babudb.repl.localport = 35666

# participants of the replication including the local address
babudb.repl.participant.0 = localhost
babudb.repl.participant.0.port = 35666
babudb.repl.participant.1 = localhost
babudb.repl.participant.1.port = 35667

and pluginConfig1.properties with similar content:

plugin.jar = myPluginPath/Replication-1.0.0_0.5.0.jar

# paths to libraries this plugin depends on
babudb.repl.dependency.0 = myPluginPath/lib/PBRPC.jar
babudb.repl.dependency.1 = myPluginPath/lib/Flease.jar
babudb.repl.dependency.2 = myPluginPath/extLib/protobuf-java-2.5.0.jar

# DB backup directory - needed for the initial loading of the BabuDB from the 
# master in replication context
babudb.repl.backupDir = /tmp/backup1

# number of servers that at least have to be up to date
babudb.repl.sync.n = 2

# it is possible to set the local address and port of this server explicitly. if not it will be
# chosen from the list of participants added right hereafter (default).
babudb.repl.localhost = localhost
babudb.repl.localport = 35667

# participants of the replication including the local address
babudb.repl.participant.0 = localhost
babudb.repl.participant.0.port = 35666
babudb.repl.participant.1 = localhost
babudb.repl.participant.1.port = 35667

Both files have to be saved to myBabuDBPath.
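
The plugin configuration uses the standard Java properties format, so it can be loaded and sanity-checked with java.util.Properties before starting the instances. The following self-contained sketch is purely illustrative (the checkConfig helper is not part of BabuDB); it inlines the quickstart keys instead of reading the file from disk:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Minimal sanity check for a replication plugin configuration:
// sync.n must not exceed the number of configured participants.
public class ReplConfigCheck {

    static int countParticipants(Properties p) {
        int participants = 0;
        while (p.containsKey("babudb.repl.participant." + participants)) {
            participants++;
        }
        return participants;
    }

    static void checkConfig(Properties p) {
        int syncN = Integer.parseInt(p.getProperty("babudb.repl.sync.n"));
        int participants = countParticipants(p);
        if (syncN > participants) {
            throw new IllegalArgumentException("babudb.repl.sync.n (" + syncN
                + ") exceeds the number of participants (" + participants + ")");
        }
        System.out.println(participants + " participants, sync.n = " + syncN);
    }

    public static void main(String[] args) throws IOException {
        // In a real setup you would load myBabuDBPath/pluginConfig0.properties;
        // the content is inlined here to keep the example self-contained.
        String content =
              "babudb.repl.sync.n = 2\n"
            + "babudb.repl.participant.0 = localhost\n"
            + "babudb.repl.participant.0.port = 35666\n"
            + "babudb.repl.participant.1 = localhost\n"
            + "babudb.repl.participant.1.port = 35667\n";
        Properties p = new Properties();
        try (Reader r = new StringReader(content)) {
            p.load(r);
        }
        checkConfig(p); // prints: 2 participants, sync.n = 2
    }
}
```
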

Startup

Now we need to extend the configuration of the individual BabuDB instances with a reference to the plugin's configuration and fire them up. This will be part of the initialization of our application:

ConfigBuilder builder0 = new ConfigBuilder();
builder0.setDataPath("/tmp/babudb0", "/tmp/babudb0/log").setLogAppendSyncMode(SyncMode.SYNC_WRITE);
builder0.addPlugin("myBabuDBPath/pluginConfig0.properties");
BabuDBConfig config0 = builder0.build();

ConfigBuilder builder1 = new ConfigBuilder();
builder1.setDataPath("/tmp/babudb1", "/tmp/babudb1/log").setLogAppendSyncMode(SyncMode.SYNC_WRITE);
builder1.addPlugin("myBabuDBPath/pluginConfig1.properties");
BabuDBConfig config1 = builder1.build();

BabuDB babuDB0 = BabuDBFactory.createBabuDB(config0);
BabuDB babuDB1 = BabuDBFactory.createBabuDB(config1);

Data stored in either instance will be replicated to the other, while you may treat both of them as if they were single instances, as described at BabuDB usage in Java.
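
To illustrate the transparency, the snippet below continues the startup code above and accesses babuDB0 through the regular non-replicated API. It is a sketch based on the org.xtreemfs.babudb.api interfaces of the 0.5.0 release; check the exact method signatures against the API documentation for your version:

```java
import org.xtreemfs.babudb.api.DatabaseManager;
import org.xtreemfs.babudb.api.database.Database;

// Illustrative continuation of the startup code: babuDB0 is used exactly
// as a single, non-replicated instance would be. The plugin spreads the
// resulting log entries to the other participant in the background.
DatabaseManager dbm = babuDB0.getDatabaseManager();
Database db = dbm.createDatabase("myDB", 1); // one index

byte[] key = "someKey".getBytes();
byte[] value = "someValue".getBytes();

// singleInsert returns a future-like result; get() blocks until the
// operation is stable according to babudb.repl.sync.n
db.singleInsert(0, key, value, null).get();

byte[] result = db.lookup(0, key, null).get();

babuDB0.shutdown();
```
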

Technical details

The following details need not concern you if you only want to use the replication plugin, but they will if you want to optimize your setup or develop dependent plugins.

The replication mechanism follows a master-slave replication model. Databases are replicated at the granularity of single operations. Any operation performed by the user causes a log entry to be written to the pending on-disk log file. These log entries are serialized and sent to the slaves via Google's Protocol Buffers RPC architecture. Mutual exclusion for the master is guaranteed by Flease, a quorum-based lease algorithm developed within the XtreemFS research project. The user specifies the behavior of the replication by deciding how many of the total number n of hosts participating in the replication setup have to be up-to-date at minimum.
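
The effect of babudb.repl.sync.n can be modeled as a simple commit rule: an operation is acknowledged to the client once it is stable on at least sync.n of the n participants, where the master's own on-disk log counts as one copy. The class below is a minimal, self-contained model of that decision, not the plugin's actual code:

```java
// Models the sync.n commit rule: a log entry may be acknowledged once
// at least syncN of the n participants (including the master itself)
// have it stable. Illustrative model, not BabuDB's implementation.
public class SyncPolicy {

    private final int syncN; // babudb.repl.sync.n

    public SyncPolicy(int syncN) {
        this.syncN = syncN;
    }

    /**
     * @param slaveAcks number of slaves that confirmed the entry
     * @return true if the entry may be acknowledged to the client
     */
    public boolean isStable(int slaveAcks) {
        // the master's own on-disk log counts as one up-to-date copy
        return 1 + slaveAcks >= syncN;
    }

    public static void main(String[] args) {
        SyncPolicy p = new SyncPolicy(2); // as in the quickstart configs
        System.out.println(p.isStable(0)); // false: master alone is not enough
        System.out.println(p.isStable(1)); // true: master + one slave
    }
}
```

With sync.n = 2 in a two-server setup, as configured above, the master therefore rejects writes while its single slave is unreachable.
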

Invariants

  • A participating server can hold the master privilege, obey such a master as its slave, or remain in a well-defined idle mode.
  • A slave obeys only one master at a time.
  • If a master cannot spread an operation to at least k out of n slaves, it rejects further incoming requests until the required number of servers becomes available/reachable again.
  • At any time, every participant can state a distinct LSN (Logentry Sequence Number) representing its progress. It is the identifier of the last entry written to its on-disk log file.
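
The progress comparison from the last invariant can be sketched as an ordered pair: a BabuDB LSN combines a view identifier (incremented on master fail-over) with a sequence number within that view, and participants compare LSNs lexicographically. The class below is a self-contained model of that ordering, not the actual BabuDB LSN class:

```java
// Self-contained model of a Logentry Sequence Number: a view identifier
// (bumped on master fail-over) plus a sequence number within the view.
// Participants compare LSNs lexicographically to determine who is ahead.
public final class Lsn implements Comparable<Lsn> {

    final int viewId;
    final long sequenceNo;

    Lsn(int viewId, long sequenceNo) {
        this.viewId = viewId;
        this.sequenceNo = sequenceNo;
    }

    @Override
    public int compareTo(Lsn other) {
        if (viewId != other.viewId) {
            return Integer.compare(viewId, other.viewId);
        }
        return Long.compare(sequenceNo, other.sequenceNo);
    }

    public static void main(String[] args) {
        Lsn beforeFailover = new Lsn(1, 4711);
        Lsn afterFailover = new Lsn(2, 1);
        // any entry of a newer view supersedes all entries of older views
        System.out.println(beforeFailover.compareTo(afterFailover) < 0); // true
    }
}
```
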

For even more technical details visit BabuDB's replication for Java.