Making docker swarm redundant

A basic Docker swarm comes with two options for storage: bind mounts or volumes. Both are persistent, but only on the node where the task runs, so if that node fails and the swarm starts the task on a new node, the data is lost. There are a few options to mitigate this with storage plugins for redundant storage. For my small Raspberry Pi Docker swarm, I will use replicated storage via GlusterFS and bind mounts.

There are GlusterFS volume drivers that you can use, but the support on the ARM platform is spotty at best. By mounting GlusterFS locally and replicating it to all nodes, we can use local bind mounts in the containers. This will be a stable and redundant solution for this use case. So first I install a USB stick in each node. To format the USB sticks with the XFS filesystem we need xfsprogs. The following steps need to be done on each node.

sudo apt-get install xfsprogs

The USB sticks show up as /dev/sda so we will format them with XFS. I had to use -f since some of the USB sticks contained old unused partitions.

sudo mkfs.xfs -i size=512 /dev/sda -f

Now create a directory to mount the USB stick to. I use /gluster/bricks/1, where the 1 represents the first node. So the second node would use /gluster/bricks/2, and so on. This is a simple way to see which node a brick belongs to.

sudo mkdir /gluster/bricks/1 -p

The -p flag creates the whole directory structure at once. Then we need to mount the USB stick to this folder via /etc/fstab. Remember to update the echo line with the correct node number on each node.

sudo su
echo '/dev/sda /gluster/bricks/1 xfs defaults 0 0' >> /etc/fstab
mount -a
exit

Now you can verify the mount with df -h. The last line should look something like this:

/dev/sda         15G   48M   15G   1% /gluster/bricks/1

Make sure we are up to date and install GlusterFS.

sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install glusterfs-server -y
sudo systemctl enable glusterd
sudo service glusterd start

Then the nodes need to probe each other. Make sure that the hostnames resolve from every node. You only need to run the probes from one node and they will all connect. Once you have two nodes in a cluster, any additional nodes have to be probed from one of the nodes already in the cluster, otherwise it will not work.

sudo gluster peer probe dockernode2
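In my case, with five nodes, I probed the remaining nodes from dockernode1 as well:

sudo gluster peer probe dockernode3
sudo gluster peer probe dockernode4
sudo gluster peer probe dockernode5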

You can then check the list of nodes and status on each node with:

pi@dockernode1:~ $ sudo gluster peer status
Number of Peers: 4

Hostname: dockernode2
Uuid: 4f07f1a4-f419-4b7a-854a-b43ed854765d
State: Peer in Cluster (Connected)

Hostname: dockernode3
Uuid: 6c411295-23d6-47d0-b0fe-611857819672
State: Peer in Cluster (Connected)

Hostname: dockernode4
Uuid: 906c156f-b19a-40a7-82f1-f7a247ca77e6
State: Peer in Cluster (Connected)

Hostname: dockernode5
Uuid: 65b9911f-82ff-4fff-a812-c722e8adbeac
State: Peer in Cluster (Connected)

Create replicated GlusterFS volume

Now that the Gluster cluster is in place, we need to create a replicated volume. It will hold the storage for all the swarm nodes, so I will name it rep_swarm_vol.

pi@dockernode1:~ $ sudo gluster volume create rep_swarm_vol \
> replica 5 \
> dockernode1:/gluster/bricks/1/brick \
> dockernode2:/gluster/bricks/2/brick \
> dockernode3:/gluster/bricks/3/brick \
> dockernode4:/gluster/bricks/4/brick \
> dockernode5:/gluster/bricks/5/brick
volume create: rep_swarm_vol: success: please start the volume to access data

Then we need to start the volume.

pi@dockernode1:~ $ sudo gluster volume start rep_swarm_vol
volume start: rep_swarm_vol: success

Then we can check the status of the volume.

pi@dockernode1:~ $ sudo gluster volume status rep_swarm_vol
Status of volume: rep_swarm_vol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dockernode1:/gluster/bricks/1/brick   49152     0          Y       27059
Brick dockernode2:/gluster/bricks/2/brick   49152     0          Y       7423
Brick dockernode3:/gluster/bricks/3/brick   49152     0          Y       1968
Brick dockernode4:/gluster/bricks/4/brick   49152     0          Y       28296
Brick dockernode5:/gluster/bricks/5/brick   49152     0          Y       2188
Self-heal Daemon on localhost               N/A       N/A        Y       27082
Self-heal Daemon on dockernode4             N/A       N/A        Y       28319
Self-heal Daemon on dockernode2             N/A       N/A        Y       7448
Self-heal Daemon on dockernode5             N/A       N/A        Y       2211
Self-heal Daemon on dockernode3             N/A       N/A        Y       1991

Task Status of Volume rep_swarm_vol
------------------------------------------------------------------------------
There are no active volume tasks

Local mount on all nodes

Then mount the Gluster volume on each Docker node, which in my case are also the Gluster nodes.

sudo su
mkdir /mnt/glusterfs
echo 'localhost:/rep_swarm_vol /mnt/glusterfs glusterfs defaults,_netdev,backupvolfile-server=localhost 0 0' >> /etc/fstab
mount -a
exit
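
You can verify the Gluster mount the same way as the brick mounts:

df -h /mnt/glusterfs

It should show localhost:/rep_swarm_vol mounted on /mnt/glusterfs.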

Now I can create a folder in /mnt/glusterfs for each container that needs persistent storage. It will replicate to every node, so if the swarm reschedules a container on another node it will run just fine. Since all the Docker nodes are also Gluster nodes, we can secure the Gluster volume from any outside access with:

sudo gluster volume set rep_swarm_vol auth.allow localhost
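
To give an idea of how the bind mounts are used, here is a rough sketch of a service with a folder on the Gluster mount. The folder name myapp, the nginx image and the /data target are just placeholders for whatever you actually run in your swarm:

sudo mkdir /mnt/glusterfs/myapp
docker service create --name myapp \
  --mount type=bind,source=/mnt/glusterfs/myapp,target=/data \
  nginx

Since /mnt/glusterfs is replicated to every node, the bind mount points at the same data no matter where the swarm schedules the task.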
