Tag Archives: High-Availability

Configuring GFS2 on CentOS 7

This article will briefly discuss how to configure a GFS2 shared filesystem across two nodes on CentOS 7. Rather than rehashing a lot of previous content, this article presumes that you have followed the steps in my previous article to configure the initial cluster and storage, up to and including the configuration of the STONITH device – but no further. All other topology considerations, device paths/layouts, etc. are the same, and the cluster nodes are still centos05 and centos07. The cluster name is webcluster, and the 8GB LUN is presented as /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7, upon which a single partition has been created: /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1.

First, install the lvm2-cluster and gfs2-utils packages:
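For example, something like the following on both nodes:

# Install clustered LVM and the GFS2 utilities
yum install -y lvm2-cluster gfs2-utils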

Enable clustered locking for LVM, and reboot both nodes:
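For example, on both nodes:

# Switch LVM to clustered locking (sets locking_type = 3 in /etc/lvm/lvm.conf)
lvmconf --enable-cluster
reboot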

Create clone resources for DLM and CLVMD, so that they can run on both nodes. Run pcs commands from a single node only:
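The commands look something like this (the resource names dlm and clvmd are arbitrary):

pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence \
    clone interleave=true ordered=true
pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence \
    clone interleave=true ordered=true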

Create an ordering and a colocation constraint, so that DLM starts before CLVMD, and both resources start on the same node:
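For example, using the clone names created above:

pcs constraint order start dlm-clone then clvmd-clone
pcs constraint colocation add clvmd-clone with dlm-clone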

Check the status of the clone resources:
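Both clones should report as Started on both nodes; for example:

pcs status resources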

Set the no-quorum-policy of the cluster to freeze so that, when quorum is lost, the remaining partition will do nothing until quorum is regained – GFS2 requires quorum to operate.
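For example:

pcs property set no-quorum-policy=freeze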

Create the LVM objects as required, again, from a single cluster node:
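Something along these lines, assuming a volume group named vg_gfs2 and a 4GB logical volume named lv_gfs2 (the names and size are arbitrary – note the -cy flag, which marks the volume group as clustered):

pvcreate /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1
vgcreate -Ay -cy vg_gfs2 /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1
lvcreate -L 4G -n lv_gfs2 vg_gfs2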

Create the GFS2 filesystem. The -t option should be specified as <clustername>:<fsname>, and the right number of journals should be specified (here 2 as we have two nodes accessing the filesystem):
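For example, with the cluster name webcluster and an arbitrary filesystem name of gfs2web:

mkfs.gfs2 -p lock_dlm -t webcluster:gfs2web -j 2 /dev/vg_gfs2/lv_gfs2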

We will not use /etc/fstab to specify the mount; instead, we'll use a Pacemaker-controlled resource:
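Something like the following (the resource name and the /mnt/gfs2 mount point are arbitrary):

pcs resource create gfs2fs ocf:heartbeat:Filesystem device=/dev/vg_gfs2/lv_gfs2 \
    directory=/mnt/gfs2 fstype=gfs2 options=noatime,nodiratime \
    op monitor interval=10s on-fail=fence clone interleave=true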

This is configured as a clone resource so it will run on both nodes at the same time. Confirm that the mount has succeeded on both nodes:
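For example, on each node:

mount | grep gfs2
df -hT /mnt/gfs2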

Note the use of noatime and nodiratime, which will yield a performance benefit. As per the Red Hat documentation, SELinux should be disabled too.

Next, create an ordering constraint so that the filesystem resource is started after the CLVMD resource, and a colocation constraint so that both start on the same node:
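For example:

pcs constraint order start clvmd-clone then gfs2fs-clone
pcs constraint colocation add gfs2fs-clone with clvmd-clone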

And we’re done.

We can even grow the filesystem online:
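Assuming some free extents remain in the volume group, something like the following extends the logical volume and then grows GFS2 while it is mounted:

lvextend -l +100%FREE /dev/vg_gfs2/lv_gfs2
gfs2_grow /mnt/gfs2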

Building a Highly-Available Apache Cluster on CentOS 7

This article will walk through the steps required to build a highly-available Apache cluster on CentOS 7. In CentOS 7 (as in Red Hat Enterprise Linux 7) the cluster stack has moved to Pacemaker/Corosync, with a new command line tool to manage the cluster (pcs, replacing commands such as ccs and clusvcadm in earlier releases).

The cluster will be a two-node cluster comprising nodes centos05 and centos07, and iSCSI shared storage will be presented from node fedora01. There will be an 8GB LUN presented for shared storage, and a 1GB LUN for fencing purposes. I have covered setting up iSCSI storage with SCSI-3 persistent reservations in a previous article. There is no need to use CLVMD in this example as we will be utilising a simple failover filesystem instead.

The first step is to add appropriate entries to /etc/hosts on both nodes for all nodes, including the storage node, to safeguard against DNS failure:
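For example (the addresses below are placeholders – substitute your own):

# /etc/hosts on both cluster nodes
<ip-of-centos05>   centos05
<ip-of-centos07>   centos07
<ip-of-fedora01>   fedora01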

Next, bring both cluster nodes fully up-to-date, and reboot them:
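For example, on both nodes:

yum update -y && reboot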

Continue reading

Clustering with DRBD, Corosync and Pacemaker

Introduction

This article will cover the build of a two-node high-availability cluster using DRBD (RAID1 over TCP/IP), the Corosync cluster engine, and the Pacemaker resource manager on CentOS 6.4. There are many applications for this type of cluster – as a free alternative to RHCS, for example. However, this example does have a couple of caveats. As it is being built in a lab environment on KVM guests, there will be no STONITH (Shoot The Other Node In The Head, a type of fencing). If this cluster goes split-brain, manual intervention may be required to recover: telling DRBD which node is primary and which is secondary, and so on. In a production environment, we'd use STONITH to connect to ILOMs (for example) and power off or reboot a misbehaving node. Quorum will also need to be disabled, as this stack doesn't yet support the use of quorum disks – if you want that, go with RHCS (and use cman with the two_node parameter, with or without qdiskd).

This article, as always, presumes that you know what you are doing. The nodes used in this article are as follows:

  • 192.168.122.30 – rhcs-node01.local – first cluster node – running CentOS 6.4
  • 192.168.122.31 – rhcs-node02.local – second cluster node – running CentOS 6.4
  • 192.168.122.33 – failover IP address

DRBD will be used to replicate a volume between the two nodes (in a Master/Slave fashion), and the hosts will eventually run the nginx webserver in a failover topology, with documents served from the replicated volume in this example.

Ideally, four network interfaces per host should be used (1 for “standard” node communications, 1 for DRBD replication, 2 for Corosync), but for a lab environment a single interface per node is fine.

Let’s start the build …

Continue reading

Secure MySQL Replication over SSL

MySQL is a popular open-source relational database management system. One of its core features is replication, and in this article I will show how to configure master and slave MySQL instances, and then configure replication from master to slave over SSL. Encryption helps protect the replication traffic from snooping. This type of replication has many uses: for example, disaster-recovery scenarios, where the slave can be switched to the master role in the event of a master outage; or performance, where all reads take place on the slave while writes and updates occur on the master. Replication can be configured without encryption, but encrypting with SSL is preferred as part of a defence-in-depth strategy – it's an extra layer of security.

This article already presumes a good working knowledge of MySQL. The master server is centosa with IP address 10.1.1.150, and is running a minimal installation of CentOS 6.4 x86_64. The slave, centosb, is running the same OS and has IP address 10.1.1.151. MySQL will be installed from the latest stable RPMs available at dev.mysql.com, rather than using the distribution-provided versions. The latest stable version available at the time of writing is 5.6.14.

This article will cover the configuration of an SSL-encrypted replicated environment from scratch – it does not cover the migration of an existing replicated configuration to an SSL-encrypted replicated configuration, or the migration of any existing data to a new slave.

Continue reading

MySQL Cluster: Adding New Data Nodes Online

MySQL Cluster has a pretty cool feature that allows you to add new data nodes whilst the cluster is online, thus avoiding any downtime. This is incredibly useful for scaling out the data nodes and adding additional node groups. In this article, I’ll show how to add two new data nodes to an existing cluster that has two data nodes defined. I’ll also explain what needs to happen after the configuration change to ensure that any existing data is correctly partitioned across the new nodes.

Continue reading

Solaris Cluster 4.1 Part Four: Highly Available Containers

Introduction

The previous article covered the configuration of two resource groups, each containing a failover zpool for use as the zonepath to a highly-available zone, and a failover IP address to be assigned to each zone. The two zones were also configured and installed, and we verified that they could be booted on either node of the cluster, provided that the storage had been failed over appropriately and was available on the node where the zone was being booted.

This final part in the series will cover the incorporation of the zone boot/shutdown/failover into the cluster framework, as well as the configuration of two iPlanet resources to illustrate how Solaris Cluster can manage SMF services deployed within a highly-available Solaris zone.

Highly-Available Zones

First, install the ha-zones data service, if you haven’t done so already. I installed the full cluster package suite, so already have all data services at my disposal:
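If it does need installing separately, the command is along these lines (the exact package FMRI here is an assumption – check pkg list 'ha-cluster*' for the precise name):

# Install the HA for Solaris Containers (ha-zones) data service – package name assumed
pkg install ha-cluster/data-service/ha-zones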

Register the SUNW.gds resource type:
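For example:

clresourcetype register SUNW.gds
clresourcetype list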

This is the Generic Data Service that is utilised by SUNWsczone (HA for Solaris Containers) for deploying highly-available zones. SUNWsczone supplies three mechanisms for highly-available zone deployment – sczbt (zone boot resource, used to start, stop and fail over zones), sczsh (zone script resource, used for deploying highly-available services within zones, with start/stop scripts to control them) and sczsmf (zone SMF resource, used for deploying highly-available services within zones, with SMF services to control them). We'll be using both sczbt and sczsmf.

Continue reading

Solaris Cluster 4.1 Part Three: Cluster Resources

Introduction

In my previous article, we ended up with a working cluster, with all appropriate cluster software installed. In this article, I'll start to configure cluster resources. I want to configure two resource groups, ha-zone-1-rg and ha-zone-2-rg. Each resource group will contain a highly-available failover filesystem, a highly-available failover IP address and a highly-available Solaris Zone. I'll illustrate the process for cloning a zone to save on installation time, as zones in Solaris 11 now use IPS and, unless you have a local IPS repository, will connect to http://pkg.oracle.com to download all appropriate packages during zone installation – not something you want to repeat too many times.

A summary of the resources/resource groups I’m looking to create is as follows:

  • ha-zone-1-rg – Resource group for the first set of failover resources
  • ha-zone-1-hasp – a SUNW.HAStoragePlus resource for the first failover zpool used for the zonepath for the first failover zone, ha-zone-1
  • ha-zone-1-lh-res – a SUNW.LogicalHostname resource for the first failover zone
  • ha-zone-1-res – a SUNW.gds resource, coupled with SUNWsczone/sczbt zone boot registration to create a highly-available zone, ha-zone-1
  • ha-zone-1-http-admin-smf-res – a SUNW.gds resource, coupled with SUNWsczone/sczsmf zone SMF service registration to create a highly-available iPlanet admin server instance
  • ha-zone-1-http-instance-smf-res – a SUNW.gds resource, coupled with SUNWsczone/sczsmf zone SMF service registration to create a highly-available iPlanet instance
  • ha-zone-2-rg – Resource group for the second set of failover resources
  • ha-zone-2-hasp – a SUNW.HAStoragePlus resource for the second failover zpool used for the zonepath for the second failover zone, ha-zone-2
  • ha-zone-2-lh-res – a SUNW.LogicalHostname resource for the second failover zone
  • ha-zone-2-res – a SUNW.gds resource, coupled with SUNWsczone/sczbt zone boot registration to create a highly-available zone, ha-zone-2

This article will cover a lot of ground, much more so than the previous two parts. By the end of the article, you will see two HA resource groups in action, each with a failover zpool and logical hostname resource. I’ll also install the two zones, but won’t make them HA as yet – that’ll be in the next part of the series, as will the configuration of the HA SMF iPlanet resources.
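To give a flavour of what's coming, a rough sketch of the commands for the first resource group might look like this (the zpool name ha-zone-1-pool is a placeholder):

# Create the first resource group
clresourcegroup create ha-zone-1-rg

# Register SUNW.HAStoragePlus (once per cluster) and add the failover zpool resource
clresourcetype register SUNW.HAStoragePlus
clresource create -g ha-zone-1-rg -t SUNW.HAStoragePlus \
    -p Zpools=ha-zone-1-pool ha-zone-1-hasp

# Add the logical hostname resource for the first zone's failover address
clreslogicalhostname create -g ha-zone-1-rg -h ha-zone-1 ha-zone-1-lh-res

# Bring the resource group online, enabling and managing its resources
clresourcegroup online -eM ha-zone-1-rg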

As always, ensure that you read the Oracle Solaris Cluster 4.1 documentation library for full details.

Let’s make a start …

Continue reading

Solaris Cluster 4.1 Part Two: iSCSI, Quorum Server, and Cluster Software Installation

Introduction

The previous article in this series covered the initial preparation of our two cluster nodes, and the storage server. This article follows on from this by performing more work on the storage server – configuring the iSCSI LUNs that’ll be exported to our cluster nodes as shared disk devices, as well as installing the Solaris Cluster Quorum Server software. Then we move onto the cluster nodes, and install Solaris Cluster 4.1. By the end of this article, you’ll see an operational cluster – although it won’t have any resources created just yet.

iSCSI Configuration

Before we can configure iSCSI (which now requires COMSTAR configuration in Solaris 11), the appropriate package group needs to be installed – group/feature/storage-server. Install this package group on the storage server:
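For example:

pkg install group/feature/storage-server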

This will install quite a few packages (including things like AVS, Infiniband, Samba, etc.) but is the recommended method in the Oracle documentation. In any case, it provides the packages we want: scsi-target-mode-framework and iscsi/iscsi-target – and meets any dependencies. As an aside, you can find out what package owns a file via pkg search -l <filename> or pkg search file::<filename>:

Once the packages are installed, enable the SCSI target mode framework SMF service:
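For example:

# Enable and verify svc:/system/stmf:default
svcadm enable stmf
svcs stmf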

At this point, I’ll add a second disk to the datapool zpool to ensure there’s plenty of capacity for ZFS volume creation:
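Something like the following, with the real device name substituted:

# Add a second disk to the existing pool (the device name is a placeholder)
zpool add datapool c2t1d0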

Let’s check how much free space we have:
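For example:

zfs list datapool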

OK – that’ll do – 39.6GB. Next, I’ll create two ZFS volumes, one for each zone that I’ll be deploying to the cluster. Each volume will be used as a failover zpool by the cluster, and will provide storage for a single failover zone. 8GB will suffice for each volume:
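For example (the volume names are arbitrary):

zfs create -V 8g datapool/ha-zone-1-vol
zfs create -V 8g datapool/ha-zone-2-vol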

ZFS volumes are datasets that represent block devices, and are treated as such. They are useful for things such as this (and swap space, dump devices, etc.).

Continue reading

Solaris Cluster 4.1 Part One: Initial Preparation

Introduction

This series of articles will cover the build of a two-node cluster running Solaris Cluster 4.1 on Solaris 11 x86. A few failover resource types will be introduced, and the end setup will have highly-available zones deployed to it. I’ll also install iPlanet on one of the zones, to illustrate how to incorporate an SMF service within a zone into the cluster-managed framework, after cloning it to save time in creating a second zone.

This is all being done under VMware Fusion 5.0 on Mac OS X. As VMware Fusion does not support disk sharing, I'll also build a third Solaris 11 node for use as an iSCSI target and Quorum Server. Later in the process, I'll remove the Quorum Server from the mix (since, on reflection, an iSCSI LUN can be used for this purpose). I've still kept the Quorum Server details in the article, however, as they're still of interest. Solaris 11 has totally changed the way iSCSI sharing happens, too. You have to configure COMSTAR – gone is zfs set shareiscsi=on :/

To start with, I installed Solaris 11/11 onto three VMs, clusternode1 and clusternode2 (each with 1.5GB RAM), and storagenode (with 1GB RAM). Both cluster nodes have a single 20GB disk for use as rpool, and the storagenode has a 20GB disk for rpool, and two additional 10GB disks for use as iSCSI LUNs on ZFS volumes. These LUNs will be presented to our cluster, and used for failover storage – which will then be used as the zonepaths for our highly-available zones. The node details are:

  • 10.1.1.70 – storageserver – iSCSI target and quorum server
  • 10.1.1.71 – ha-zone-1 – Zone to be provisioned to host iPlanet
  • 10.1.1.72 – ha-zone-2 – Zone to be provisioned to illustrate cloning
  • 10.1.1.80 – clusternode1 – Solaris Cluster 4.1 node
  • 10.1.1.90 – clusternode2 – Solaris Cluster 4.1 node

Each cluster node has four network interfaces, as follows:

  • net0 (e1000g0) – Public network
  • net1 (e1000g1) – Public network
  • net2 (e1000g2, vmnet2) – a private host-only network (with no other hosts on it)
  • net3 (e1000g3, vmnet4) – a private host-only network (with no other hosts on it)

net0 and net1 will be configured as an IPMP group with transitive probing. net2 and net3 will be used as private cluster interconnects. It is important that no other hosts are using the interconnect networks, otherwise the cluster installation software will detect the traffic and complain, as it could interfere with cluster communications.
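As a rough sketch of that IPMP configuration (the group name ipmp0 and the /24 prefix are assumptions; the address is clusternode1's from the list above – the full steps follow later in the series):

# Enable transitive probing for IPMP (no test addresses required)
svccfg -s svc:/network/ipmp setprop config/transitive-probing=true
svcadm refresh svc:/network/ipmp:default

# Group net0 and net1 into an IPMP group and assign the node's address to it
ipadm create-ip net0
ipadm create-ip net1
ipadm create-ipmp ipmp0
ipadm add-ipmp -i net0 -i net1 ipmp0
ipadm create-addr -T static -a 10.1.1.80/24 ipmp0/v4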

Let’s start with some basic preparation …

Continue reading

How to Cluster Oracle Weblogic 12c via the Command Line

In this article, I will show you how to create a two-node Weblogic 12c cluster using only the command line. Oracle Weblogic (formerly BEA Weblogic) is one of the most resilient, reliable and high-performance J2EE application servers that I’ve worked with. I’ve used it to host both custom applications, as well as commercial applications that required an enterprise-grade J2EE container to serve them.

The lab topology will be as follows:

Weblogic Server         Hostname   IP Address      Port   Cluster Name
AdminServer             dolan      172.16.18.169   7001   N/A
test_managed_server_1   dolan      172.16.18.169   7002   TestCluster
test_managed_server_2   gooby      172.16.18.172   7002   TestCluster

As you can see, AdminServer only runs on a single node. Once the managed servers have been configured and started for the first time by way of the administration server, they can be restarted independently – thus the loss of the administration server does not impact the general running of the cluster. It will, however, prevent administration of that cluster (configuration, deployments, etc.) until such time as the administration server is back online. There are ways around this (binding the administration server to a failover VIP provided by keepalived or similar), but for all but the most demanding usage a single administration server will suffice.

Both dolan and gooby run CentOS 6.3 x86_64. All steps should be run on both nodes unless otherwise noted.

Continue reading