Tag Archives: clustering

Configuring GFS2 on CentOS 7

This article briefly describes how to configure a GFS2 shared filesystem across two nodes on CentOS 7. Rather than rehashing previous content, it presumes that you have followed the steps in my previous article to configure the initial cluster and storage, up to and including the configuration of the STONITH device, but no further. All other topology considerations, device paths and layouts are the same, and the cluster nodes are still centos05 and centos07. The cluster name is webcluster, and the 8GB LUN is presented as /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7, upon which a single partition has been created: /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1.

First, install the lvm2-cluster and gfs2-utils packages:

# yum -y install gfs2-utils
# yum -y install lvm2-cluster

Enable clustered locking for LVM on both nodes, then reboot them:

# lvmconf --enable-cluster
# systemctl reboot
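
Once the nodes are back up, it can be worth confirming that clustered locking really is enabled before going any further. The following is a minimal sketch, assuming that lvmconf --enable-cluster records the change as locking_type = 3 in /etc/lvm/lvm.conf; the function names lvm_locking_type and check_clustered_locking are mine, purely for illustration:

```shell
# Print the active (uncommented) locking_type value from an lvm.conf file.
# Assumption: clustered locking is recorded as "locking_type = 3".
lvm_locking_type() {
    conf=${1:-/etc/lvm/lvm.conf}
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$conf"
}

# Report whether the given lvm.conf (default /etc/lvm/lvm.conf) uses
# clustered locking; returns non-zero if it does not.
check_clustered_locking() {
    if [ "$(lvm_locking_type "$1")" = "3" ]; then
        echo "clustered locking enabled"
    else
        echo "clustered locking NOT enabled"
        return 1
    fi
}
```

Running check_clustered_locking with no arguments on each node reads the default configuration file.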

Create clone resources for DLM and CLVMD, so that they can run on both nodes. Run pcs commands from a single node only:

# pcs resource create dlm ocf:pacemaker:controld \
>    op monitor interval=30s on-fail=fence clone interleave=true ordered=true
# pcs resource create clvmd ocf:heartbeat:clvm \
>    op monitor interval=30s on-fail=fence clone interleave=true ordered=true

Create an ordering and a colocation constraint, so that DLM starts before CLVMD, and both resources start on the same node:

# pcs constraint order start dlm-clone then clvmd-clone
# pcs constraint colocation add clvmd-clone with dlm-clone

Check the status of the clone resources:

# pcs status resources
 Clone Set: dlm-clone [dlm]
     Started: [ centos05 centos07 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ centos05 centos07 ]
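
If you would rather script this check than eyeball it, something along these lines may help. It is only a sketch: it assumes the output layout shown above (other pcs versions may format it differently), and the function name clones_started_on is hypothetical:

```shell
# Read "pcs status resources" output on stdin and check that every clone set
# lists each of the given nodes in its "Started:" line.
# Limitation: a clone set with no "Started:" line at all is not flagged.
clones_started_on() {
    awk -v nodes="$*" '
        BEGIN { n = split(nodes, want, " ") }
        $1 == "Clone" && $2 == "Set:" { set = $3 }
        $1 == "Started:" {
            for (i = 1; i <= n; i++) {
                ok = 0
                # Node names sit between the "[" and "]" fields.
                for (f = 3; f < NF; f++)
                    if ($f == want[i]) ok = 1
                if (!ok) { print set " is not started on " want[i]; bad = 1 }
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

Typical use would be pcs status resources | clones_started_on centos05 centos07, which prints nothing and exits zero when both clone sets are running everywhere.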

Set the no-quorum-policy of the cluster to freeze, so that when quorum is lost the remaining partition will do nothing until quorum is regained - GFS2 requires quorum to operate.

# pcs property set no-quorum-policy=freeze

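Create the LVM objects as required, again from a single cluster node. Note that the scsi-3... name used below carries the same identifier as the wwn-0x... name given earlier; both /dev/disk/by-id links refer to the same partition: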

# pvcreate /dev/disk/by-id/scsi-360014055f0cfae3d6254576932ddc1f7-part1
# vgcreate -Ay -cy vg_data /dev/disk/by-id/scsi-360014055f0cfae3d6254576932ddc1f7-part1
# lvcreate -L 1G -n lv_test vg_data

Create the GFS2 filesystem. The -t option should be specified as <clustername>:<fsname>, and the right number of journals should be specified (here 2 as we have two nodes accessing the filesystem):

# mkfs.gfs2 -p lock_dlm -t webcluster:testfs -j 2 /dev/vg_data/lv_test
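
To make the relationship between the cluster name, filesystem name and journal count explicit, here is a small sketch that prints (rather than runs) the mkfs.gfs2 command for a given set of values. The helper name gfs2_mkfs_cmd and its argument order are my own invention; only the mkfs.gfs2 options come from the command above:

```shell
# Print (rather than run) the mkfs.gfs2 command for a cluster filesystem.
# One journal is requested per node that will mount the filesystem.
gfs2_mkfs_cmd() {
    cluster=$1 fsname=$2 nodes=$3 device=$4
    case $nodes in
        ''|*[!0-9]*|0)
            echo "node count must be a positive integer" >&2
            return 1 ;;
    esac
    echo "mkfs.gfs2 -p lock_dlm -t ${cluster}:${fsname} -j ${nodes} ${device}"
}

# The values used in this article:
gfs2_mkfs_cmd webcluster testfs 2 /dev/vg_data/lv_test
```

Printing the command first gives you a chance to review it, which is no bad thing for something as destructive as mkfs.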

We will not use /etc/fstab to specify the mount; instead, we’ll use a Pacemaker-controlled resource:

# pcs resource create gfs2_res Filesystem device="/dev/vg_data/lv_test" \
>    directory="/mnt" fstype="gfs2" options="noatime,nodiratime" \
>    op monitor interval=10s on-fail=fence clone interleave=true

This is configured as a clone resource so it will run on both nodes at the same time. Confirm that the mount has succeeded on both nodes:

# pcs resource show
 Clone Set: dlm-clone [dlm]
     Started: [ centos05 centos07 ]
 Clone Set: clvmd-clone [clvmd]
     Started: [ centos05 centos07 ]
 Clone Set: gfs2_res-clone [gfs2_res]
     Started: [ centos05 centos07 ]
# mount | grep gfs2
/dev/mapper/vg_data-lv_test on /mnt type gfs2 (rw,noatime,nodiratime,seclabel)
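
The mount check needs to be run on each node. A minimal sketch of a helper for that follows; it assumes the mount output format shown above, and the name is_gfs2_mounted is hypothetical:

```shell
# Read mount(8) output on stdin; succeed only if a gfs2 filesystem is
# mounted on the given mount point.
is_gfs2_mounted() {
    awk -v mp="$1" '
        $2 == "on" && $3 == mp && $4 == "type" && $5 == "gfs2" { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

On a node, mount | is_gfs2_mounted /mnt exits zero when the filesystem is mounted, so it slots easily into a loop over both nodes (over SSH, say) or into a monitoring check.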

Note the use of noatime and nodiratime, which suppress access-time updates and so yield a performance benefit. As per the Red Hat documentation, SELinux should be disabled too.

Next, create an ordering constraint so that the filesystem resource is started after the CLVMD resource, and a colocation constraint so that both start on the same node:

# pcs constraint order start clvmd-clone then gfs2_res-clone
Adding clvmd-clone gfs2_res-clone (kind: Mandatory) (Options: first-action=start then-action=start)
# pcs constraint colocation add gfs2_res-clone with clvmd-clone
# pcs constraint show
Location Constraints:
Ordering Constraints:
  start dlm-clone then start clvmd-clone
  start clvmd-clone then start gfs2_res-clone
Colocation Constraints:
  clvmd-clone with dlm-clone
  gfs2_res-clone with clvmd-clone
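
The constraints form a simple chain: dlm-clone, then clvmd-clone, then gfs2_res-clone, with each resource ordered after and colocated with the one before it. As a sketch of that pattern, the hypothetical helper below prints the pcs commands for any such chain; it only echoes them, it does not run pcs:

```shell
# Print the ordering and colocation constraint commands for a chain of
# resources, each one started after (and placed with) its predecessor.
chain_constraints() {
    prev=$1
    shift
    for next in "$@"; do
        echo "pcs constraint order start $prev then $next"
        echo "pcs constraint colocation add $next with $prev"
        prev=$next
    done
}

# The chain built up over the course of this article:
chain_constraints dlm-clone clvmd-clone gfs2_res-clone
```

The four lines printed are the same four constraint commands used above.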

And we’re done.

We can even grow the filesystem online:

# lvextend -L+1G /dev/vg_data/lv_test
  Extending logical volume lv_test to 2.00 GiB
  Logical volume lv_test successfully resized
# gfs2_grow /dev/vg_data/lv_test
FS: Mount point:          /mnt
FS: Device:               /dev/mapper/vg_data-lv_test
FS: Size:                 262142 (0x3fffe)
FS: Resource group size:  65517 (0xffed)
DEV: Length:               524288 (0x80000)
The file system grew by 1024MB.
gfs2_grow complete.
# df -hT /mnt
Filesystem                  Type  Size  Used Avail Use% Mounted on
/dev/mapper/vg_data-lv_test gfs2  2.0G  259M  1.8G  13% /mnt
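
The figures reported by gfs2_grow look like block counts, and they are consistent with the 1GB to 2GB resize if you assume 4096-byte blocks (the GFS2 default block size; that assumption is mine, not something the output states). A quick bit of shell arithmetic, with a hypothetical helper name, shows the conversion:

```shell
# Convert a count of filesystem blocks to MiB.
# Assumption: 4096-byte blocks unless a different size is passed.
blocks_to_mib() {
    blocks=$1 bsize=${2:-4096}
    echo $(( blocks * bsize / 1024 / 1024 ))
}

blocks_to_mib 524288   # DEV: Length after lvextend -> 2048
blocks_to_mib 262142   # FS: Size before growing   -> 1023 (integer division)
```

So the device is 2048MB long and the filesystem was just under 1024MB before growing, which matches the reported growth of 1024MB.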

Building a Highly-Available Apache Cluster on CentOS 7

This article will walk through the steps required to build a highly-available Apache cluster on CentOS 7. In CentOS 7 (as in Red Hat Enterprise Linux 7) the cluster stack has moved to Pacemaker/Corosync, with a new command line tool to manage the cluster (pcs, replacing commands such as ccs and clusvcadm in earlier releases).

The cluster will be a two-node cluster comprising nodes centos05 and centos07, and iSCSI shared storage will be presented from node fedora01. There will be an 8GB LUN presented for shared storage, and a 1GB LUN for fencing purposes. I have covered setting up iSCSI storage with SCSI-3 persistent reservations in a previous article. There is no need to use CLVMD in this example as we will be utilising a simple failover filesystem instead.

The first step is to add entries for all nodes, including the storage node, to /etc/hosts on both cluster nodes, to safeguard against DNS failure:

# vi /etc/hosts
10.1.1.107  centos05
10.1.1.108  fedora01
10.1.1.111  centos07
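
If you prefer to script this rather than edit the file by hand, a sketch along the following lines keeps the change idempotent, so it can be run on both nodes (and re-run) without creating duplicate entries. The function name add_host_entry is hypothetical:

```shell
# Append "IP  name" to a hosts file unless the name already appears
# on an uncommented line. Safe to run repeatedly.
add_host_entry() {
    hosts_file=$1 ip=$2 name=$3
    if ! awk -v n="$name" '
            $1 ~ /^#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file"; then
        printf '%s  %s\n' "$ip" "$name" >> "$hosts_file"
    fi
}
```

For example, add_host_entry /etc/hosts 10.1.1.107 centos05, repeated for fedora01 and centos07 with their addresses.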

Next, bring both cluster nodes fully up-to-date, and reboot them:

# yum -y update
# systemctl reboot

Continue reading →

SCSI-3 Persistent Reservations on Fedora Core 20 with targetcli over iSCSI and Red Hat Cluster

In this article, I’ll show how to set up SCSI-3 Persistent Reservations on Fedora Core 20 using targetcli, serving a pair of iSCSI LUNs to a simple Red Hat Cluster that will host a failover filesystem for the purposes of testing the iSCSI implementation. The Linux IO target (LIO) (http://linux-iscsi.org/wiki/LIO) has been the Linux SCSI target since kernel version 2.6.38. It supports a rapidly growing number of fabric modules, and all existing Linux block devices as backstores. For the purposes of our demonstration, the important fact is that it supports operating as an iSCSI target. targetcli is the tool used to perform the LIO configuration. SCSI-3 persistent reservations are required for a number of cluster storage configurations for I/O fencing and failover/retakeover. Therefore, LIO can be used as the foundation for high-end clustering solutions such as Red Hat Cluster Suite. You can read more about persistent reservations here.

The nodes in the lab are as follows:

10.1.1.103 - centos03 - Red Hat Cluster node 1 on CentOS 6.5
10.1.1.104 - centos04 - Red Hat Cluster node 2 on CentOS 6.5
10.1.1.108 - fedora01 - Fedora Core 20 storage node

Installation

I’ll start by installing targetcli onto fedora01:

[root@fedora01 ~]# yum -y install targetcli

Let’s check that it has been installed correctly:

[root@fedora01 ~]# targetcli
Warning: Could not load preferences file /root/.targetcli/prefs.bin.
targetcli shell version 2.1.fb35
Copyright 2011-2013 by Datera, Inc and others.
For help on commands, type 'help'.
/> ls
o- / ...................................................................................... [...]
  o- backstores ........................................................................... [...]
  | o- block ............................................................... [Storage Objects: 0]
  | o- fileio .............................................................. [Storage Objects: 0]
  | o- pscsi ............................................................... [Storage Objects: 0]
  | o- ramdisk ............................................................. [Storage Objects: 0]
  o- iscsi ......................................................................... [Targets: 0]
  o- loopback ...................................................................... [Targets: 0]
  o- vhost ......................................................................... [Targets: 0]
/> exit
Global pref auto_save_on_exit=true
Last 10 configs saved in /etc/target/backup.
Configuration saved to /etc/target/saveconfig.json

Make sure that, before proceeding, any existing configuration is removed:

[root@fedora01 ~]# targetcli clearconfig confirm=true
All configuration cleared

Continue reading →

GFS2 Implementation Under RHEL

This article will demonstrate setting up a simple RHCS (Red Hat Cluster Suite) two-node cluster, with an end goal of having a 50GB LUN shared between two servers, thus providing clustered shared storage to both nodes. This will enable applications running on the nodes to write to a shared filesystem, perform correct locking, and ensure filesystem integrity.

This type of configuration is central to many active-active application setups, where both nodes share a central content or configuration repository.

For this article, two RHEL 6.1 nodes, running on physical hardware (IBM blades) were used. Each node has multiple paths back to the 50GB SAN LUN presented, and multipathd will be used to manage path failover and rebuild in the event of interruption.

Continue reading →

Clustering with DRBD, Corosync and Pacemaker

Introduction

This article covers the build of a two-node high-availability cluster using DRBD (RAID1 over TCP/IP), the Corosync cluster engine, and the Pacemaker resource manager on CentOS 6.4. There are many applications for this type of cluster, for example as a free alternative to RHCS. However, this example does have a couple of caveats. As it is being built in a lab environment on KVM guests, there will be no STONITH (Shoot The Other Node In The Head, a type of fencing). If the cluster goes split-brain, manual recovery may be required: intervening to tell DRBD which node is primary and which is secondary, and so on. In a Production environment, we’d use STONITH to connect to ILOMs (for example) and power off or reboot a misbehaving node. Quorum will also need to be disabled, as this stack doesn’t yet support the use of quorum disks; if you want that, go with RHCS (and use cman with the two_node parameter, with or without qdiskd).

This article, as always, presumes that you know what you are doing. The nodes used in this article are as follows:

192.168.122.30 - rhcs-node01.local - first cluster node - running CentOS 6.4
192.168.122.31 - rhcs-node02.local - second cluster node - running CentOS 6.4
192.168.122.33 - failover IP address

DRBD will be used to replicate a volume between the two nodes (in a Master/Slave fashion), and the hosts will eventually run the nginx webserver in a failover topology, with this example having documents being served from the replicated volume.

Ideally, four network interfaces per host should be used (1 for “standard” node communications, 1 for DRBD replication, 2 for Corosync), but for a lab environment a single interface per node is fine.

Let’s start the build …

Continue reading →

Configuring Transitive IPMP on Solaris 11

We all know the pain of configuring probe-based IPMP under Solaris, with a slew of test addresses being required, and a long line of ifconfig configuration in our /etc/hostname.<interface> files.

With Solaris 11, there is a new type of probe-based IPMP called transitive probing. This new type of probing does not require test addresses, as per the documentation: “Transitive probes are sent by the alternate interfaces in the group to probe the active interface. An alternate interface is an underlying interface that does not actively receive any inbound IP packets”.

In this article, I will configure failover (active/passive) IPMP on clusternode1 (the first node of a Solaris Cluster I’m building). Interface net0 has an address of 10.1.1.80 (configured at install time), and I’ll be adding this into an IPMP group ipmp0 along with a standby interface, net1. Make sure you are performing these steps via a console connection, as the original address associated with net0 will need to be removed before attempting to add it to an IPMP group.

First, ensure that there is an entry in /etc/hosts for the IP address you’re configuring IPMP for:

# grep '^10\.1\.1\.80' /etc/hosts
10.1.1.80    clusternode1

Next, ensure that automatic network configuration is disabled. In my case it already was, as I’d configured networking manually during the installation of Solaris 11:

# netadm list -p ncp -x
TYPE        PROFILE        STATE          AUXILIARY STATE
ncp         Automatic      disabled       disabled by administrator
ncp         DefaultFixed   online         active

Verify that the appropriate physical interfaces are available. In the following output, I’ll be bonding e1000g0 (net0) and e1000g1 (net1) into a failover IPMP group.

# dladm show-phys
LINK              MEDIA                STATE      SPEED  DUPLEX    DEVICE
net1              Ethernet             unknown    0      unknown   e1000g1
net2              Ethernet             unknown    0      unknown   e1000g2
net3              Ethernet             unknown    0      unknown   e1000g3
net0              Ethernet             up         1000   full      e1000g0

List the current addresses - from the output of ipadm show-addr I can see that I’ll need to delete net0/v4 and net0/v6, otherwise I’ll be unable to add net0 to the IPMP group.

# ipadm delete-addr net0/v4
# ipadm delete-addr net0/v6

As the net0 IP interface is already created, I only need to create the net1 interface:

# ipadm create-ip net1

I can then create the IPMP group, which I’ll call ipmp0, and add both interfaces to it:

# ipadm create-ipmp ipmp0
# ipadm add-ipmp -i net0 -i net1 ipmp0

Next, enable transitive probing, which is disabled by default:

# svccfg -s svc:/network/ipmp setprop config/transitive-probing=true
# svccfg -s svc:/network/ipmp listprop config/transitive-probing
config/transitive-probing boolean     true
# svcadm refresh svc:/network/ipmp:default

And configure the appropriate interface (in my case net1) to be a standby interface (as I’m using failover):

# ipadm set-ifprop -p standby=on -m ip net1

Now I can create my IPv4 address on the IPMP group:

# ipadm create-addr -T static -a clusternode1/24 ipmp0/v4
# ipadm show-addr ipmp0
ADDROBJ           TYPE     STATE        ADDR
ipmp0/v4          static   ok           10.1.1.80/24

Finally, fix the default route. I removed the existing route and added a new default route using the new and correct interface - ipmp0:

# route -p delete default 10.1.1.1
# route -p add default 10.1.1.1 -ifp ipmp0
# netstat -rn -f inet
Routing Table: IPv4
  Destination           Gateway           Flags  Ref     Use     Interface
-------------------- -------------------- ----- ----- ---------- ---------
default              10.1.1.1             UG        1          0 ipmp0
10.1.1.0             10.1.1.80            U         8        388 ipmp0
127.0.0.1            127.0.0.1            UH        2        554 lo0

You can use ipmpstat to verify the configuration and health of the IPMP group:

# ipmpstat -g
GROUP       GROUPNAME   STATE     FDT       INTERFACES
ipmp0       ipmp0       ok        10.00s    net0 (net1)
# ipmpstat -a
ADDRESS                   STATE  GROUP       INBOUND     OUTBOUND
::                        down   ipmp0       --          --
clusternode1              up     ipmp0       net0        net0
# ipmpstat -t
INTERFACE   MODE       TESTADDR            TARGETS
net1        transitive <net1>              <net0>
net0        routes     clusternode1        10.1.1.1
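
For scripted health checks, the ipmpstat -a output is easy enough to pick apart in plain shell. The sketch below assumes the column layout shown above; the function name ipmp_inbound_if is hypothetical, and it deliberately avoids awk options that the legacy /usr/bin/awk on Solaris may not accept:

```shell
# Read "ipmpstat -a" output on stdin and print the INBOUND interface
# currently carrying the given data address (only if the address is up).
ipmp_inbound_if() {
    while read -r address state group inbound outbound; do
        if [ "$address" = "$1" ] && [ "$state" = "up" ]; then
            echo "$inbound"
        fi
    done
}
```

With the group in the state shown above, ipmpstat -a | ipmp_inbound_if clusternode1 prints net0; after a failover I would expect it to name the standby interface instead.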

Let’s perform a failover test. I’ll disable net0 and ensure that the clusternode1 address fails over:

# ipadm disable-if -t net0
# ipmpstat -t
INTERFACE   MODE       TESTADDR            TARGETS
net1        routes     clusternode1        10.1.1.1

It works! (and my SSH connection is still active…) - net1 is now active with the correct IP address. Let’s fail it back:

# ipadm enable-if -t net0
# ipmpstat -t
INTERFACE   MODE       TESTADDR            TARGETS
net0        routes     clusternode1        10.1.1.1
net1        transitive <net1>              <net0>

The address has failed back to net0, and again my SSH connection is still active. I can now continue with clusternode2, and the rest of the cluster install.

MySQL Cluster: Adding New Data Nodes Online

MySQL Cluster has a pretty cool feature that allows you to add new data nodes whilst the cluster is online, thus avoiding any downtime. This is incredibly useful for scaling out the data nodes and adding additional node groups. In this article, I’ll show how to add two new data nodes to an existing cluster that has two data nodes defined. I’ll also explain what needs to happen after the configuration change to ensure that any existing data is correctly partitioned across the new nodes.

Continue reading →

Solaris Cluster 4.1 Part Four: Highly Available Containers

Introduction

The previous article covered the configuration of two resource groups, each containing a failover zpool for use as the zonepath to a highly-available zone, and a failover IP address to be assigned to each zone. The two zones were also configured and installed, and we verified that they could be booted on either node of the cluster, provided that the storage had been failed over appropriately and was available on the node where the zone was being booted.

This final part in the series will cover the incorporation of the zone boot/shutdown/failover into the cluster framework, as well as the configuration of two iPlanet resources to illustrate how Solaris Cluster can manage SMF services deployed within a highly-available Solaris zone.

Highly-Available Zones

First, install the ha-zones data service, if you haven’t done so already. I installed the full cluster package suite, so already have all data services at my disposal:

# pkg install ha-cluster/data-service/ha-zones

Next, register the SUNW.gds resource type:

# clresourcetype register SUNW.gds

This is the Generic Data Service that is utilised by SUNWsczone (HA for Solaris Containers) for deploying highly-available zones. SUNWsczone supplies three highly-available mechanisms for zone deployment - sczbt (zone boot - used to start/stop/failover zones), sczsh (zone script resource - used for deploying highly-available services within zones, with start/stop scripts to control them) and sczsmf (zone SMF resource, used for deploying highly-available services within zones, with SMF services to control them). We’ll be using both sczbt and sczsmf.

Continue reading →

Solaris Cluster 4.1 Part Three: Cluster Resources

Introduction

In my previous article, we ended up with a working cluster, with all appropriate cluster software installed. In this article, I’ll start to configure cluster resources. I want to configure two resource groups, ha-zone-1-rg and ha-zone-2-rg. Each resource group will contain a highly-available failover filesystem, a highly-available failover IP address and a highly-available Solaris Zone. I’ll illustrate the process for cloning a zone to save on installation time: zones in Solaris 11 now use IPS and, unless you have a local IPS repository, will connect to http://pkg.oracle.com to download all appropriate packages during zone installation - not something you want to repeat too many times.

A summary of the resources/resource groups I’m looking to create is as follows:

ha-zone-1-rg - Resource group for the first set of failover resources
    ha-zone-1-hasp - a SUNW.HAStoragePlus resource for the first failover zpool used for the zonepath for the first failover zone, ha-zone-1
    ha-zone-1-lh-res - a SUNW.LogicalHostname resource for the first failover zone
    ha-zone-1-res - a SUNW.gds resource, coupled with SUNWsczone/sczbt zone boot registration to create a highly-available zone, ha-zone-1
    ha-zone-1-http-admin-smf-res - a SUNW.gds resource, coupled with SUNWsczone/sczsmf zone SMF service registration to create a highly-available iPlanet admin server instance
    ha-zone-1-http-instance-smf-res - a SUNW.gds resource, coupled with SUNWsczone/sczsmf zone SMF service registration to create a highly-available iPlanet instance
ha-zone-2-rg - Resource group for the second set of failover resources
    ha-zone-2-hasp - a SUNW.HAStoragePlus resource for the second failover zpool used for the zonepath for the second failover zone, ha-zone-2
    ha-zone-2-lh-res - a SUNW.LogicalHostname resource for the second failover zone
    ha-zone-2-res - a SUNW.gds resource, coupled with SUNWsczone/sczbt boot registration to create a highly-available zone, ha-zone-2

This article will cover a lot of ground, much more so than the previous two parts. By the end of the article, you will see two HA resource groups in action, each with a failover zpool and logical hostname resource. I’ll also install the two zones, but won’t make them HA as yet - that’ll be in the next part of the series, as will the configuration of the HA SMF iPlanet resources.

As always, ensure that you read the Oracle Solaris Cluster 4.1 documentation library for full details.

Let’s make a start …

Continue reading →

Solaris Cluster 4.1 Part Two: iSCSI, Quorum Server, and Cluster Software Installation

Introduction

The previous article in this series covered the initial preparation of our two cluster nodes, and the storage server. This article follows on from this by performing more work on the storage server - configuring the iSCSI LUNs that’ll be exported to our cluster nodes as shared disk devices, as well as installing the Solaris Cluster Quorum Server software. Then we move onto the cluster nodes, and install Solaris Cluster 4.1. By the end of this article, you’ll see an operational cluster - although it won’t have any resources created just yet.

iSCSI Configuration

Before we can configure iSCSI (which now requires COMSTAR configuration in Solaris 11), the appropriate package group needs to be installed - group/feature/storage-server. Install this package group on the storage server:

# pkg install group/feature/storage-server

This will install quite a few packages (including things like AVS, Infiniband, Samba, etc.) but is the recommended method in the Oracle documentation. In any case, it provides the packages we want: scsi-target-mode-framework and iscsi/iscsi-target - and meets any dependencies. As an aside, you can find out what package owns a file via pkg search -l <filename> or pkg search file::<filename>:

# pkg search -l /usr/sbin/stmfadm
INDEX      ACTION VALUE            PACKAGE
path       file   usr/sbin/stmfadm pkg:/system/storage/scsi-target-mode-framework@0.5.11-0.175.1.0.0.24.2
# pkg search file::stmfadm
INDEX      ACTION VALUE            PACKAGE
basename   file   usr/sbin/stmfadm pkg:/system/storage/scsi-target-mode-framework@0.5.11-0.175.1.0.0.24.2

Once the packages are installed, enable the SCSI target mode framework SMF service:

# svcadm enable system/stmf
# svcs stmf
STATE          STIME    FMRI
online         21:28:18 svc:/system/stmf:default

At this point, I’ll add a second disk to the datapool zpool to ensure there’s plenty of capacity for ZFS volume creation:

# zpool add datapool c8t2d0
# zpool status datapool
  pool: datapool
 state: ONLINE
  scan: none requested
config:
        NAME      STATE     READ WRITE CKSUM
        datapool  ONLINE       0     0     0
          c8t1d0  ONLINE       0     0     0
          c8t2d0  ONLINE       0     0     0
errors: No known data errors

Let’s check how much free space we have:

# zpool list datapool
NAME       SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
datapool  39.8G   142M  39.6G   0%  1.00x  ONLINE  -

OK - that’ll do - 39.6GB. Next, I’ll create two ZFS volumes, one for each zone that I’ll be deploying to the cluster. Each volume will be used as a failover zpool by the cluster, and will provide storage for a single failover zone. 8GB will suffice for each volume:

# zfs create -V 8G datapool/ha-zone-1
# zfs create -V 8G datapool/ha-zone-2

ZFS volumes are datasets that represent block devices, and are treated as such. They are useful for things such as this (and swap space, dump devices, etc.).

Continue reading →

Toki Winter

Advanced UNIX for the experienced system administrator
