SCSI-3 Persistent Reservations on Fedora 20 with targetcli over iSCSI and Red Hat Cluster

In this article, I’ll show how to set up SCSI-3 Persistent Reservations on Fedora 20 using targetcli, serving a pair of iSCSI LUNs to a simple Red Hat Cluster that will host a failover filesystem, for the purpose of testing the iSCSI implementation.

The Linux IO target (LIO) (http://linux-iscsi.org/wiki/LIO) has been the in-kernel Linux SCSI target since kernel version 2.6.38. It supports a rapidly growing number of fabric modules, and all existing Linux block devices as backstores; for the purposes of this demonstration, the important fact is that it can operate as an iSCSI target. targetcli is the command-line tool used to configure LIO. SCSI-3 persistent reservations are required by a number of clustered storage configurations for I/O fencing and failover/retakeover, which makes LIO a suitable foundation for high-end clustering solutions such as the Red Hat Cluster Suite. Persistent reservations themselves are defined in the SCSI Primary Commands (SPC-3) standard.

The nodes in the lab are as follows:

  • 10.1.1.103 – centos03 – Red Hat Cluster node 1 on CentOS 6.5
  • 10.1.1.104 – centos04 – Red Hat Cluster node 2 on CentOS 6.5
  • 10.1.1.108 – fedora01 – Fedora 20 storage node

Installation

I’ll start by installing targetcli onto fedora01:
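
On Fedora 20 a single yum command does it (dnf had not yet replaced yum):

    yum -y install targetcli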

Let’s check that it has been installed correctly:
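
For example, confirm the package is present and that the shell runs:

    rpm -q targetcli
    targetcli ls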

Before proceeding, make sure any existing configuration is removed:
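
targetcli’s clearconfig command wipes the entire LIO configuration, so only run it on a target with nothing worth keeping:

    targetcli clearconfig confirm=true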

Backing Store Configuration

I have attached a second disk to fedora01 that I will bring under LVM control. I’ll then create the iSCSI backing stores as logical volumes on this second disk.

I’ll create the physical volume first, followed by a volume group – vg_data – on the new physical volume. I’ll then create two logical volumes, lv_fs_failover and lv_fs_fence. lv_fs_failover will be used by the cluster as a failover filesystem, and lv_fs_fence will be used as the fence device by the cluster via SCSI fencing and SCSI-3 persistent reservations.
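
Assuming the second disk appears as /dev/sdb on fedora01 (adjust the device name and the logical volume sizes to suit your environment):

    # on fedora01 – /dev/sdb is the newly attached second disk in this example
    pvcreate /dev/sdb
    vgcreate vg_data /dev/sdb
    lvcreate -n lv_fs_failover -L 8G vg_data
    lvcreate -n lv_fs_fence -L 1G vg_data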

Let’s verify that everything has been created as expected:
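
The standard LVM reporting commands are enough to check:

    pvs
    vgs
    lvs vg_data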

targetcli Configuration

Time to launch targetcli and begin the configuration. We cd to /backstores/block – this is where we create the backing stores on the LVM logical volumes:
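
From a root shell on fedora01 (the cd is typed at the targetcli prompt):

    targetcli
    cd /backstores/block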

Start by creating new backing stores on the previously created logical volumes – try to give the storage objects meaningful names:
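
I’m calling the storage objects fs_failover and fs_fence; still inside targetcli:

    create name=fs_failover dev=/dev/vg_data/lv_fs_failover
    create name=fs_fence dev=/dev/vg_data/lv_fs_fence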

Next, cd to /iscsi and disable discovery_auth – as this is only a lab we will not require authentication. This means that anyone who knows your initiator IQNs will be able to connect. We still configure ACLs on the connecting initiator IQNs, however.
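
Still inside targetcli:

    cd /iscsi
    set discovery_auth enable=0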

Create the iSCSI target:
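
The IQN below is just an example – running create with no argument will make targetcli generate a well-formed IQN for you:

    create iqn.2014-05.com.example:fedora01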

This has the effect of creating a portal on 0.0.0.0:3260 too.

Next, create the LUNs on the backing stores previously configured:
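
Using the example IQN and the storage object names chosen earlier:

    cd /iscsi/iqn.2014-05.com.example:fedora01/tpg1/luns
    create /backstores/block/fs_failover
    create /backstores/block/fs_fence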

Verify their creation:
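
An ls from the root of the targetcli tree shows the whole hierarchy, including the new LUNs:

    cd /
    ls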

Next, we’ll create our (minimal) ACLs, so that only the appropriate initiator IQNs can connect to the LUNs:

First, on both cluster nodes, verify the current IQNs:
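
On CentOS 6 the initiator name lives in /etc/iscsi/initiatorname.iscsi, which is provided by the iscsi-initiator-utils package:

    # on centos03 and centos04
    cat /etc/iscsi/initiatorname.iscsi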

Create the ACLs:
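
Back in targetcli on fedora01, substituting the IQNs read from the two cluster nodes in the previous step (the values below are placeholders):

    cd /iscsi/iqn.2014-05.com.example:fedora01/tpg1/acls
    create iqn.1994-05.com.redhat:centos03-initiator
    create iqn.1994-05.com.redhat:centos04-initiator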

Finally, save the configuration:
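
From the targetcli root:

    cd /
    saveconfig
    exit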

Enable and start the target service:
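
Fedora 20 uses systemd, so:

    systemctl enable target.service
    systemctl start target.service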

Add an iptables rule so that the initiators can connect:
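
iSCSI uses TCP port 3260; something along these lines on fedora01 (persist it however you normally manage firewall rules):

    iptables -I INPUT -p tcp --dport 3260 -j ACCEPT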

Create the /var/target/pr directory so that persistent reservations work correctly:
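
A simple mkdir on fedora01 is all that is needed:

    mkdir -p /var/target/pr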

Failure to do this will result in persistent reservation error messages appearing in dmesg.

Initiator Configuration

On each cluster node, discover the new target:
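
On both centos03 and centos04, installing the initiator tools first if they are not already present:

    yum -y install iscsi-initiator-utils
    iscsiadm -m discovery -t sendtargets -p 10.1.1.108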

Start the iscsi service on both cluster nodes:
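
On CentOS 6:

    service iscsi start
    chkconfig iscsi on    # log back in automatically after a reboot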

You can now use targetcli on the storage node to verify that the initiators are both logged in:
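
Depending on the targetcli version, a sessions command is available at the root of the shell; if the build shipped with Fedora 20 lacks it, iscsiadm -m session on the cluster nodes shows the same information from the initiator side:

    sessions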

Verify that the new LUNs are visible on the cluster nodes:
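
lsscsi (install it if needed) lists the new LIO-ORG devices – the model column carries the backstore name – and iscsiadm confirms which /dev/sdX each one became:

    yum -y install lsscsi
    lsscsi
    iscsiadm -m session -P 3 | grep "Attached scsi disk"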

As you can see, /dev/sdb maps to the lv_fs_failover logical volume on fedora01 and /dev/sdc maps to lv_fs_fence.

From one cluster node, create a partition on /dev/sdb:
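
fdisk is interactive; the keystrokes below create a single primary partition spanning the disk:

    fdisk /dev/sdb
    # n – new partition
    # p – primary
    # 1 – partition number, then accept the default first and last sectors
    # w – write the partition table and exit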

Create an ext4 filesystem:
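
From the same node:

    mkfs.ext4 /dev/sdb1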

We now have an ext4 filesystem for the cluster to manage, and another device (/dev/sdc) to be used as a fence device by the cluster.

Cluster Preparation

Before starting to configure the Red Hat Cluster software, a few things need to be done. First, install NTP on both nodes:
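
On both centos03 and centos04:

    yum -y install ntp
    chkconfig ntpd on
    service ntpd start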

Next, for lab purposes, disable iptables and ip6tables, and switch SELinux into permissive mode:
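
On both nodes (lab only – don’t do this in production):

    service iptables stop && chkconfig iptables off
    service ip6tables stop && chkconfig ip6tables off
    setenforce 0
    sed -i 's/^SELINUX=.*/SELINUX=permissive/' /etc/selinux/config    # persist across reboots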

Modify /etc/hosts on both nodes, adding an entry for each cluster node:
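
Using the addresses from the lab layout above:

    10.1.1.103   centos03
    10.1.1.104   centos04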

Next, verify multicast connectivity with the omping utility. Start by installing it:
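
On both nodes:

    yum -y install omping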

Run the command simultaneously on both nodes, and verify that multicast connectivity is established:
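
Run it on both nodes at the same time, passing the full list of participating nodes; each node should report multicast responses from the other:

    omping centos03 centos04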

Install the appropriate packages required for Red Hat Cluster on both nodes:
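
A package set along these lines covers cman, rgmanager, ricci, the ccs utility and the fence agents (exact package names may vary with your repository layout):

    yum -y install cman rgmanager ricci ccs fence-agents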

On both nodes, enable the appropriate services:
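
On both nodes:

    chkconfig ricci on
    chkconfig cman on
    chkconfig rgmanager on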

Set a password for the ricci user:
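
On both nodes – ccs authenticates against ricci with this password:

    passwd ricci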

And start the ricci daemon:
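
Again on both nodes:

    service ricci start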

We can now begin cluster configuration.

Cluster Configuration

The following commands should only be issued on one node of the cluster unless otherwise specified. centos03 will be configured first, and the configuration will then be synced across to centos04. The ccs utility is used to administer cluster configuration from the command line.

Create the cluster – in our case it’s called testcluster:
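
ccs prompts for the ricci password set earlier:

    ccs -h centos03 --createcluster testcluster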

Add both nodes to the configuration, each having a single quorum vote:
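
For example:

    ccs -h centos03 --addnode centos03 --votes 1
    ccs -h centos03 --addnode centos04 --votes 1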

Set the fence daemon properties as appropriate for your environment:
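
The delay values below are only examples – tune them for your environment:

    ccs -h centos03 --setfencedaemon post_fail_delay=0 post_join_delay=30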

The post_fail_delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node (a member of the fence domain) after the node has failed. The post_join_delay parameter is the number of seconds the fence daemon (fenced) waits before fencing a node after the node joins the fence domain.

Set the cman daemon properties as appropriate for the correct operation of a two-node cluster. For this, we need to set the two_node parameter to 1, and expected_votes to 1, so that the cluster remains quorate upon failure of a node:
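
That is:

    ccs -h centos03 --setcman two_node=1 expected_votes=1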

Next, add the SCSI fencing method to both nodes:
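
I’m calling the method SCSI here; the name itself is arbitrary:

    ccs -h centos03 --addmethod SCSI centos03
    ccs -h centos03 --addmethod SCSI centos04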

We can now create our fence device on /dev/sdc:
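
Something along these lines – the fence device name scsi_fence and the log file path are my own choices:

    ccs -h centos03 --addfencedev scsi_fence agent=fence_scsi devices=/dev/sdc \
        logfile=/var/log/cluster/fence_scsi.log aptpl=1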

Note that APTPL (Activate Persist Through Power Loss) is enabled, which is another feature that Linux IO supports. We also enable logging for the fence agent.

Next, the appropriate fence instances are added. The unfence instances are automatically created:
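
Using the fence device and method names from above:

    ccs -h centos03 --addfenceinst scsi_fence centos03 SCSI
    ccs -h centos03 --addfenceinst scsi_fence centos04 SCSI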

Create a failover domain:
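
The domain name fs_domain is my own choice:

    ccs -h centos03 --addfailoverdomain fs_domain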

And add the two cluster nodes as failover domain nodes:
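
For instance:

    ccs -h centos03 --addfailoverdomainnode fs_domain centos03
    ccs -h centos03 --addfailoverdomainnode fs_domain centos04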

Add a service to the failover domain:
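
The service name fs_service is arbitrary; relocate-on-failure suits a simple failover filesystem:

    ccs -h centos03 --addservice fs_service domain=fs_domain recovery=relocate autostart=1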

Next, add the filesystem resource itself:
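
The resource name cluster_fs is again my own; the device, mount point and filesystem type come from the earlier steps:

    ccs -h centos03 --addresource fs name=cluster_fs device=/dev/sdb1 mountpoint=/mnt fstype=ext4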

Finally, add a subservice tying the filesystem resource back to the service. Subservices are mostly useful when resource ordering is required, but I’ll add one anyway:
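
Tying the cluster_fs resource defined above into fs_service:

    ccs -h centos03 --addsubservice fs_service fs ref=cluster_fs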

Ensure cman is started on both nodes:
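
On both nodes:

    service cman start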

Update the active running configuration, and sync to centos04:
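
From centos03:

    ccs -h centos03 --sync --activate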

Start all cluster services and resources, and verify the cluster is quorate:
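
ccs can start everything, and clustat reports quorum and service state:

    ccs -h centos03 --startall
    clustat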

Check that the filesystem is mounted on centos03:
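
On centos03:

    df -h /mnt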

We can see /dev/sdb1 mounted on /mnt. Reboot centos03 and the filesystem should fail over to centos04:
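
After rebooting centos03, check from centos04:

    clustat
    df -h /mnt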

All is working as expected! You can further verify cluster status with the following commands:
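
clustat and cman_tool give a good overview, and if the sg3_utils package is installed you can also inspect the SCSI-3 registrations on the fence device directly:

    clustat
    cman_tool status
    cman_tool nodes
    # view the persistent reservation keys registered on the fence LUN (from either node)
    sg_persist --in --read-keys --device=/dev/sdc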