Enswitch storage on NFS with DRBD and Heartbeat

From Integrics Wiki
Revision as of 17:54, 22 May 2015 by Danthony (talk | contribs) (Cutover procedure)

Disclaimer

The following comes with no warranty whatsoever. I am not responsible for any data loss or other issues that may arise from following these instructions. Please make backups of all files and test this thoroughly in your lab environment before using it in production.

Overview

This document details the procedure for migrating a multi-machine Enswitch system from a single NFS storage server to a fault-tolerant cluster using NFS with DRBD and Heartbeat.

The procedure has been tested on Enswitch 3.11, but should work on most other versions. The NFS servers run Ubuntu 14.04 64-bit and the clients run Ubuntu 10.04 64-bit and 12.04 64-bit.


The servers are as follows:

enswitchnfs0 - current active NFS server

enswitchnfs1 - current backup NFS server

enswitchstorage0 - New NFS server 0

enswitchstorage1 - New NFS server 1

The Enswitch subnet is 10.0.0.0/24

The gateway IP address is 10.0.0.1

The shared IP for heartbeat is 10.0.0.109

Server configuration

Install Ubuntu 14.04 64-bit on enswitchstorage0 and enswitchstorage1. Make a partition for the OS and leave the rest of the disk empty for the DRBD volume. Do not create a swap partition; a swap file will be added later.


Update all OS packages on enswitchstorage0 and enswitchstorage1:

sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get autoremove
sudo init 6


Create swap file:

sudo dd if=/dev/zero of=/swapfile0 bs=1M count=2048
sudo chmod 0600 /swapfile0
sudo mkswap /swapfile0


Add the following line to /etc/fstab:

/swapfile0              none            swap            sw              0 0


Enable swap file:

sudo swapon -a
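
To confirm that the swap file is actually in use, the kernel's swap table can be inspected. The following is a minimal sketch, not part of the original procedure. The function name swap_active and its optional second argument are made up for illustration; the second argument only exists so the check can be run against a saved copy of /proc/swaps.

```shell
# Return success if the given swap file is listed as an active swap area.
# The optional second argument is a hypothetical override of the table to
# read; it defaults to /proc/swaps.
swap_active() {
    swapfile=$1
    table=${2:-/proc/swaps}
    # /proc/swaps has one header line, then one line per active swap area
    # with the path in the first column.
    awk -v f="$swapfile" 'NR > 1 && $1 == f { found = 1 } END { exit !found }' "$table"
}

# Usage on the storage servers:
#   swap_active /swapfile0 && echo "swap file is active"
```
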


Install additional software on enswitchstorage0 and enswitchstorage1:

sudo apt-get install ntp ifenslave


Install additional software on enswitchstorage0 and enswitchstorage1 (optional):

sudo apt-get install htop iotop bwm-ng tshark 


Add firewall rules on enswitchstorage0 and enswitchstorage1. The following can be used as the base for a firewall script:

# Clear all existing rules
iptables -F
iptables -A INPUT -m state --state RELATED,ESTABLISHED -m comment --comment "Allow packets from related and established connections" -j ACCEPT
iptables -A INPUT -i lo -m comment --comment "Allow all on lo interface" -j ACCEPT
iptables -A INPUT -s 10.0.0.0/24 -m comment --comment "Allow everything from Enswitch subnet" -j ACCEPT
iptables -A INPUT -m comment --comment "Log all unmatched packets" -j LOG
iptables -A INPUT -m comment --comment "Drop all unmatched packets" -j DROP


Add entries to /etc/hosts for each server on enswitchstorage0 and enswitchstorage1:

10.0.0.122   enswitchstorage0
10.0.0.123   enswitchstorage1
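
If the hosts entries are added from a script rather than by hand, it helps to make the step safe to repeat. The sketch below is one possible way to do that; the helper name add_line_once is an assumption and is not part of any Enswitch or Ubuntu tooling.

```shell
# Append a line to a file only if that exact line is not already present,
# so the step can be re-run without creating duplicates.
add_line_once() {
    line=$1
    file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root) on both storage servers:
#   add_line_once '10.0.0.122   enswitchstorage0' /etc/hosts
#   add_line_once '10.0.0.123   enswitchstorage1' /etc/hosts
```
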


Renumber the libuuid user and group to free up UID 100 and GID 101. Then change the ownership of the libuuid files to match the new IDs:

sudo chown libuuid:libuuid /usr/sbin/uuidd
sudo chown libuuid:libuuid /var/lib/libuuid
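
The renumbering itself, which has to happen before the chown commands above, is not spelled out. One way it might be done is with groupmod and usermod. The sketch below only prints the commands so they can be reviewed first. The function name is made up, and the new IDs (130 in the example) are placeholders: pick values that "getent passwd" and "getent group" show as unused on your servers.

```shell
# Print the commands that would move libuuid to new IDs.
# This is a dry run: nothing is changed until the printed lines are run.
# new_uid and new_gid are placeholders chosen by the administrator.
libuuid_renumber_plan() {
    new_uid=$1
    new_gid=$2
    echo "groupmod -g $new_gid libuuid"
    echo "usermod -u $new_uid -g $new_gid libuuid"
}

# Example: review the plan, then run each printed line with sudo.
libuuid_renumber_plan 130 130
```

usermod only fixes ownership of files inside the user's home directory, which is why the chown commands are still needed afterwards.
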


Add Enswitch user and group. There is no Enswitch code on these boxes, but this will make the file ownership show "enswitch:enswitch":

sudo adduser --system --group --no-create-home --home /var/lib/enswitch/home --disabled-password enswitch


Create a partition for the DRBD volume on enswitchstorage0 and enswitchstorage1. In this example we use /dev/sda2:

sudo fdisk /dev/sda


Install DRBD utilities:

sudo apt-get install drbd8-utils


Create /etc/drbd.conf on both servers with the following contents:

global { usage-count no; }
common { syncer { rate 100M; } }
resource drbd0 {
        protocol C;
        startup {
                wfc-timeout  15;
                degr-wfc-timeout 60;
        }
        net {
                cram-hmac-alg sha1;
                shared-secret "secret";
        }
        on enswitchstorage0 {
                device /dev/drbd0;
                disk /dev/sda2;
                address 10.0.0.122:7788;
                meta-disk internal;
        }
        on enswitchstorage1 {
                device /dev/drbd0;
                disk /dev/sda2;
                address 10.0.0.123:7788;
                meta-disk internal;
        }
}


Create volume on both servers:

sudo drbdadm create-md drbd0
sudo service drbd start


Initialize the volume by running the following on the primary server, in this case enswitchstorage0. You can run "watch cat /proc/drbd" to see the status of the initial sync:

sudo drbdadm -- --overwrite-data-of-peer primary all
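
Instead of watching /proc/drbd by hand, a script can wait until both disks report UpToDate. This is a rough sketch that assumes the DRBD 8.x status format, where each device line contains a field such as "ds:UpToDate/UpToDate". The function name and its optional file argument are made up for illustration.

```shell
# Succeed when every DRBD device line in the status file reports both the
# local and the peer disk as UpToDate.
# The optional argument is a hypothetical override; it defaults to /proc/drbd.
drbd_in_sync() {
    status=${1:-/proc/drbd}
    awk '
        BEGIN { devices = 0; ok = 0 }
        / ds:/ { devices++ }
        / ds:UpToDate\/UpToDate / { ok++ }
        END { exit !(devices > 0 && devices == ok) }
    ' "$status"
}

# Usage: poll every 10 seconds until the initial sync has finished.
#   until drbd_in_sync; do sleep 10; done; echo "sync complete"
```
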


Once the sync is complete, create a filesystem on /dev/drbd0:

sudo mkfs.ext4 /dev/drbd0


Install NFS server:

sudo apt-get install nfs-kernel-server

Configure NFS server:


Add the following line to /etc/exports on enswitchstorage0 and enswitchstorage1:

/var/lib/enswitch 10.0.0.0/24(rw,no_root_squash,async,no_subtree_check,fsid=0)


Re-export NFS shares:

sudo exportfs -ra


Install heartbeat:

sudo apt-get install heartbeat


Configure heartbeat:

Create /etc/ha.d/haresources:

On enswitchstorage0:

enswitchstorage0 IPaddr::10.0.0.109/24/eth0 drbddisk::drbd0 Filesystem::/dev/drbd0::/var/lib/enswitch::ext4 nfs-kernel-server

On enswitchstorage1:

enswitchstorage0 IPaddr::10.0.0.109/24/eth0 drbddisk::drbd0 Filesystem::/dev/drbd0::/var/lib/enswitch::ext4 nfs-kernel-server


Create /etc/ha.d/ha.cf:

On enswitchstorage0:

debug 1
debugfile /var/log/heartbeat_debug.log
logfile /var/log/heartbeat.log
logfacility local0

keepalive 1
deadtime 10
warntime 5
initdead 60

udpport 696
bcast eth0
ucast eth0 10.0.0.123
ping 10.0.0.1

auto_failback off
node enswitchstorage0
node enswitchstorage1
respawn hacluster /usr/lib/heartbeat/ipfail


On enswitchstorage1:

debug 1
debugfile /var/log/heartbeat_debug.log
logfile /var/log/heartbeat.log
logfacility local0

keepalive 1
deadtime 10
warntime 5
initdead 60

udpport 696
bcast eth0
ucast eth0 10.0.0.122
ping 10.0.0.1

auto_failback off
node enswitchstorage0
node enswitchstorage1
respawn hacluster /usr/lib/heartbeat/ipfail


Create /etc/ha.d/authkeys on enswitchstorage0 and enswitchstorage1:

auth 2
1 crc
2 sha1 YG89uXsBVF0ufX7iy8w10FRrThwB2zcs
3 md5 enswitch
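
The sha1 key shown above is only an example and should be replaced with your own value. A random key could be generated along these lines; this is a sketch that assumes the head, sha1sum and cut tools shipped with Ubuntu, and the function name is made up.

```shell
# Print a random 40-character hexadecimal string that can be used as the
# shared secret on the "sha1" line of /etc/ha.d/authkeys.
gen_authkey() {
    head -c 32 /dev/urandom | sha1sum | cut -d ' ' -f 1
}

gen_authkey
```

The same key must be placed in the authkeys file on both servers.
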

Change permissions on /etc/ha.d/authkeys:

sudo chmod 600 /etc/ha.d/authkeys
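
Once heartbeat is running on both servers (on Ubuntu 14.04 this would presumably be "sudo service heartbeat start", a step not listed here), exactly one of them should hold the shared IP. The sketch below checks the output of "ip -o -4 addr show" for a given address. The function name is made up for illustration.

```shell
# Succeed if the given IPv4 address appears in `ip -o -4 addr show` output
# read from standard input. The address is the fourth field, in CIDR form.
has_ip() {
    addr=$1
    awk -v a="$addr" '
        { split($4, parts, "/"); if (parts[1] == a) found = 1 }
        END { exit !found }
    '
}

# Usage on either storage server:
#   ip -o -4 addr show | has_ip 10.0.0.109 && echo "this node holds the shared IP"
```
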

Cutover procedure

On one of the current NFS boxes, mount the new NFS volume as /var/lib/enswitch2/ and rsync data:

sudo mkdir /var/lib/enswitch2
sudo mount -t nfs 10.0.0.109:/var/lib/enswitch /var/lib/enswitch2
sudo rsync -av --delete  /var/lib/enswitch/ /var/lib/enswitch2/


Unmount /var/lib/enswitch on all servers:

sudo umount /var/lib/enswitch


Rsync data one more time on old NFS server:

sudo rsync -av --delete /var/lib/enswitch/ /var/lib/enswitch2/
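
Before switching the clients over, a quick sanity check of the copy may be worthwhile. The sketch below only compares the number of regular files in the two trees, which is a cheap check and not a full verification. The function names are made up, and file names containing newlines would be miscounted.

```shell
# Count regular files under a directory tree.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Succeed if both trees contain the same number of regular files.
same_file_count() {
    [ "$(count_files "$1")" = "$(count_files "$2")" ]
}

# Usage on the old NFS server:
#   same_file_count /var/lib/enswitch /var/lib/enswitch2 && echo "file counts match"
```

For a stricter check, running the same rsync command with the additional -n (dry run) option lists anything that would still be transferred or deleted.
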


Add the following line to /etc/fstab on all other servers (commenting out the current line for the /var/lib/enswitch nfs mount):

10.0.0.109:/var/lib/enswitch        /var/lib/enswitch      nfs     rsize=32768,wsize=32768,hard,timeo=50,fg,actimeo=3,noatime,nodiratime,noauto    0 0

Mount new NFS volume on all Enswitch servers:

sudo mount /var/lib/enswitch
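
To confirm that a client is now using the new cluster rather than the old NFS server, the mount table can be checked for the shared IP. This is a minimal sketch; the function name and its optional third argument are made up for illustration.

```shell
# Succeed if the given mount point is mounted from the given host.
# The optional third argument is a hypothetical override of the mount
# table to read; it defaults to /proc/mounts.
mounted_from() {
    host=$1
    mountpoint=$2
    table=${3:-/proc/mounts}
    # Column 1 is the source (host:/path), column 2 is the mount point.
    src=$(awk -v m="$mountpoint" '$2 == m { s = $1 } END { print s }' "$table")
    case "$src" in
        "$host":*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on each Enswitch server:
#   mounted_from 10.0.0.109 /var/lib/enswitch && echo "mounted from the new cluster"
```
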


Restart Enswitch on all servers:

sudo enswitch restart


References

http://drbd.linbit.com/users-guide/s-resolve-split-brain.html

https://help.ubuntu.com/community/NFSv4Howto

https://help.ubuntu.com/lts/serverguide/drbd.html

https://www.howtoforge.com/high-availability-nfs-with-drbd-plus-heartbeat

https://help.ubuntu.com/community/HighlyAvailableNFS