Introduction

This article will cover the build of a two-node high-availability cluster using DRBD (RAID1 over TCP/IP), the Corosync cluster engine, and the Pacemaker resource manager on CentOS 6.4. There are many applications for this type of cluster - as a free alternative to RHCS for example. However, this example does have a couple of caveats. As this is being built in a lab environment on KVM guests, there will be no STONITH (Shoot The Other Node In The Head) (a type of fencing). If this cluster goes split-brain, there may be manual recovery required to intervene, tell DRBD who is primary and who is secondary, and so on. In a Production environment, we’d use STONITH to connect to ILOMs (for example) and power off or reboot a misbehaving node. Quorum will also need to be disabled, as this stack doesn’t yet support the use of quorum disks - if you want that go with RHCS (and use cman with the two_node parameter, with or without qdiskd).

This article, as always, presumes that you know what you are doing. The nodes used in this article are as follows:

192.168.122.30 - rhcs-node01.local - first cluster node - running CentOS 6.4
192.168.122.31 - rhcs-node02.local - second cluster node - running CentOS 6.4
192.168.122.33 - failover IP address

DRBD will be used to replicate a volume between the two nodes (in a Master/Slave fashion), and the hosts will eventually run the nginx webserver in a failover topology, with this example having documents being served from the replicated volume.

Ideally, four network interfaces per host should be used (1 for “standard” node communications, 1 for DRBD replication, 2 for Corosync), but for a lab environment a single interface per node is fine.

Let’s start the build …

Continue reading →

Toki Winter

Advanced UNIX for the experienced system administrator

Tag Archives: DRBD

Clustering with DRBD, Corosync and Pacemaker

Introduction