Introduction
This article will cover the build of a two-node high-availability cluster using DRBD (RAID1 over TCP/IP), the Corosync cluster engine, and the Pacemaker resource manager on CentOS 6.4. There are many applications for this type of cluster - as a free alternative to RHCS for example. However, this example does have a couple of caveats. As this is being built in a lab environment on KVM guests, there will be no STONITH (Shoot The Other Node In The Head) (a type of fencing). If this cluster goes split-brain, there may be manual recovery required to intervene, tell DRBD who is primary and who is secondary, and so on. In a Production environment, we’d use STONITH to connect to ILOMs (for example) and power off or reboot a misbehaving node. Quorum will also need to be disabled, as this stack doesn’t yet support the use of quorum disks - if you want that go with RHCS (and use cman with the two_node parameter, with or without qdiskd).
This article, as always, presumes that you know what you are doing. The nodes used in this article are as follows:
- 192.168.122.30 - rhcs-node01.local - first cluster node - running CentOS 6.4
- 192.168.122.31 - rhcs-node02.local - second cluster node - running CentOS 6.4
- 192.168.122.33 - failover IP address
DRBD will be used to replicate a volume between the two nodes (in a Master/Slave fashion), and the hosts will eventually run the nginx webserver in a failover topology, with this example having documents being served from the replicated volume.
Ideally, four network interfaces per host should be used (1 for “standard” node communications, 1 for DRBD replication, 2 for Corosync), but for a lab environment a single interface per node is fine.
Let’s start the build …