Solaris Cluster 4.1 Part Two: iSCSI, Quorum Server, and Cluster Software Installation

Introduction

The previous article in this series covered the initial preparation of our two cluster nodes and the storage server. This article continues with more work on the storage server: configuring the iSCSI LUNs that’ll be exported to our cluster nodes as shared disk devices, and installing the Solaris Cluster Quorum Server software. Then we move on to the cluster nodes and install Solaris Cluster 4.1. By the end of this article, you’ll see an operational cluster - although it won’t have any resources created just yet.

iSCSI Configuration

Before we can configure iSCSI (which now requires COMSTAR configuration in Solaris 11), the appropriate package group needs to be installed - group/feature/storage-server. Install this package group on the storage server:
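A minimal sketch of this step, run as root on the storage server. The run helper is purely illustrative: it prints the command and only executes it when RUN=1 is set.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# Package group named in the text above.
run pkg install group/feature/storage-server
```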

This will install quite a few packages (including things like AVS, Infiniband, Samba, etc.) but is the recommended method in the Oracle documentation. In any case, it provides the packages we want, scsi-target-mode-framework and iscsi/iscsi-target, and satisfies their dependencies. As an aside, you can find out which package owns a file via pkg search -l <filename> or pkg search file::<filename>:

Once the packages are installed, enable the SCSI target mode framework SMF service:
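Something along these lines should do it, assuming the service can be addressed by the short name stmf. The run helper is an illustrative dry-run wrapper.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run svcadm enable stmf   # enable the SCSI target mode framework
run svcs stmf            # confirm the service reports "online"
```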

At this point, I’ll add a second disk to the datapool zpool to ensure there’s plenty of capacity for ZFS volume creation:
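A sketch of the pool expansion. The device name c7t2d0 is a hypothetical placeholder, so substitute the disk reported by format on your own storage server.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

POOL=datapool     # pool name from the article
NEW_DISK=c7t2d0   # hypothetical device name, replace with your own
run zpool add "$POOL" "$NEW_DISK"
```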

Let’s check how much free space we have:

OK - that’ll do - 39.6GB. Next, I’ll create two ZFS volumes, one for each zone that I’ll be deploying to the cluster. Each volume will be used as a failover zpool by the cluster, and will provide storage for a single failover zone. 8GB will suffice for each volume:
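A minimal sketch of the volume creation. The volume names failover-1-vol and failover-2-vol are assumptions for illustration; only the pool name and the 8GB size come from the text.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# -V creates a ZFS volume (a block device) of the given size.
for vol in failover-1-vol failover-2-vol; do   # hypothetical names
  run zfs create -V 8g "datapool/$vol"
done
```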

ZFS volumes are datasets that represent block devices, and are treated as such. They are useful for things such as this (and swap space, dump devices, etc.).

Next, use the stmfadm command to create two LUNs, one for each of our ZFS volumes:
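As a sketch, each LUN is backed by the raw zvol device under /dev/zvol/rdsk. The volume names below are the same hypothetical ones assumed for illustration, not names confirmed by the article.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

for vol in failover-1-vol failover-2-vol; do   # hypothetical names
  run stmfadm create-lu "/dev/zvol/rdsk/datapool/$vol"
done
```

On a real system, each create-lu call reports the GUID of the new logical unit, which is needed later when creating views.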

And verify their creation:
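One way to check, using the verbose listing (the run helper is an illustrative dry-run wrapper):

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# Lists every logical unit with its GUID, backing data file and size.
run stmfadm list-lu -v
```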

Whilst we can use CHAP for security of our iSCSI LUNs, on my test network I can just lock down access to the initiator IQNs. First, on each cluster node, enable the iSCSI initiator service:
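A sketch of this step, to be run as root on both cluster nodes:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run svcadm enable network/iscsi/initiator
```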

Get the iSCSI initiator IQNs from both cluster nodes by issuing a command such as the following:
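For example (the initiator node name in the output is the IQN you need):

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run iscsiadm list initiator-node
```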

I’ve obfuscated the IQNs here… Back on the storage node, create a host group:
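A minimal sketch, using the failover-1-group name that the views are tied to later in this section:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

HG=failover-1-group
run stmfadm create-hg "$HG"
```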

and add both initiator IQNs to it:
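In sketch form, with placeholder IQNs. Both values below are made up for illustration, so paste in the initiator node names reported by your own cluster nodes.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

HG=failover-1-group
NODE1_IQN=iqn.1986-03.com.sun:01:example-node1   # placeholder IQN
NODE2_IQN=iqn.1986-03.com.sun:01:example-node2   # placeholder IQN

for iqn in "$NODE1_IQN" "$NODE2_IQN"; do
  run stmfadm add-hg-member -g "$HG" "$iqn"
done
```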

Verify the hostgroup:

Then create a view to each of our LUNs, locked down to the new failover-1-group hostgroup:
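A sketch of the view creation. The GUID values are deliberately obvious placeholders: use the GUIDs that stmfadm reported for your own logical units.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

HG=failover-1-group
LU1_GUID=REPLACE_WITH_LU1_GUID   # placeholder, not a real GUID
LU2_GUID=REPLACE_WITH_LU2_GUID   # placeholder, not a real GUID

# -h restricts the view to initiators in the named host group.
for guid in "$LU1_GUID" "$LU2_GUID"; do
  run stmfadm add-view -h "$HG" "$guid"
done
```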

And verify the views:
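The views for a logical unit can be listed by GUID. Again, the GUID below is a placeholder:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

LU1_GUID=REPLACE_WITH_LU1_GUID   # placeholder, not a real GUID
run stmfadm list-view -l "$LU1_GUID"
```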

Next, enable the iSCSI target service:
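A minimal sketch, run on the storage server. The -r flag also enables any services the target service depends on.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run svcadm enable -r svc:/network/iscsi/target:default
```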

And then create the iSCSI target:
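With no arguments, itadm generates a target IQN for you. A sketch:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run itadm create-target
```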

Again, I’ve obfuscated my actual IQN a little. Verify that the target has been created:

On each cluster node, configure target discovery. I used sendtargets rather than static configuration, but either is easy to configure. Substitute the appropriate IP address for your iSCSI target:
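A sketch of sendtargets discovery. The address 192.0.2.10 is a documentation placeholder and 3260 is the standard iSCSI port; both are assumptions to replace with your own values.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

TARGET_IP=192.0.2.10   # placeholder address of the storage server
run iscsiadm add discovery-address "$TARGET_IP:3260"
run iscsiadm modify discovery --sendtargets enable
```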

You should now be able to see each of the LUNs when you issue iscsiadm list target -S on the cluster nodes:

Keep in mind which LUN you intend to use for which zpool - as you’ll give yourself an administrative headache if you start mismatching them back to the ZFS volumes on the storage server. Issue the following command on the storage server to get a map:
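One command that gives such a map is sbdadm list-lu, which prints each GUID alongside its backing source. This is a suggestion and may not be the exact command used when this article was written.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# One line per logical unit: GUID, data size, and source zvol path.
run sbdadm list-lu
```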

Next, from only ONE of the cluster nodes, create an fdisk partition on each of the disks (as we’re using Solaris on x86). Note that I append p0 to the device path; on x86, p0 refers to the whole disk, which is the device fdisk operates on:
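A sketch of this step. The disk names are hypothetical placeholders (take the real ones from the iscsiadm list target -S output), and the -B option, which creates a single Solaris partition spanning the disk, is my assumption of a sensible default.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

for disk in c0tEXAMPLE1d0 c0tEXAMPLE2d0; do   # placeholder disk names
  run fdisk -B "/dev/rdsk/${disk}p0"
done
```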

Again, from only ONE of the cluster nodes, add an EFI label to each of the disks. Use format -e to do this, otherwise a standard SMI label will be applied:
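format is interactive, so only the invocation can be sketched; the comments outline the steps as I understand them rather than a captured session.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# Inside format (expert mode), for each iSCSI disk:
#   1. select the disk from the menu
#   2. enter "label"
#   3. choose the EFI label when prompted for the label type
run format -e
```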

Verify the VTOCs with prtvtoc:
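For instance, with the same placeholder disk names. Addressing the disk through slice 0 is an assumption here.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

for disk in c0tEXAMPLE1d0 c0tEXAMPLE2d0; do   # placeholder disk names
  run prtvtoc "/dev/rdsk/${disk}s0"
done
```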

You may need to run devfsadm on the “other” cluster node so that the new fdisk partition is recognised and the new VTOC is available. In any case, the cluster nodes will receive reconfiguration reboots during cluster software installation, before any use is made of these devices. At this point, you should see two sessions logged in to the iSCSI target (issue this on the storage server):

At this point, the storage configuration is complete. The two LUNs are presented to the two cluster nodes over iSCSI and the appropriate partitioning and labelling has occurred. We can now make use of these devices for highly-available failover storage once the cluster software has been installed.

Quorum Server Installation

The quorum server installation is very straightforward. First, verify that the ha-cluster publisher is configured:

Next, install the ha-cluster-quorum-server-full package:
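A minimal sketch, run as root on the storage server:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run pkg install ha-cluster-quorum-server-full
```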

The default quorum server configuration is fine for our needs:

So, enable the cluster/quorumserver SMF service:
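In sketch form, assuming the abbreviated service name given above resolves on your system:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run svcadm enable cluster/quorumserver
run svcs cluster/quorumserver   # should report "online"
```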

And verify that it’s working:
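One way to check is with clquorumserver show, where + means all configured quorum servers. The full path is used because /usr/cluster/bin may not be in your PATH on this host.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run /usr/cluster/bin/clquorumserver show +
```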

This output will change to show node reservations once we connect the cluster to it.

The quorum server is done!

Cluster Software Installation

Next, the full cluster software stack will be installed on both cluster nodes. Once you’ve checked with pkg publisher that the ha-cluster repository is available, install the package set with the following command:
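A sketch, assuming the ha-cluster-full group package is the one that delivers the full stack. Run it as root on both cluster nodes.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run pkg publisher               # confirm ha-cluster is listed
run pkg install ha-cluster-full
```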

Verify that you have two spare interfaces for use as cluster interconnects:
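Two commands that together show this, offered as one reasonable way to check rather than the only one: dladm lists the physical links, and ipadm shows which of them already carry addresses.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

run dladm show-phys    # all physical links (net0, net1, ...)
run ipadm show-addr    # links that already have IP addresses configured
```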

OK - from the above I can see that net2 and net3 aren’t being used, as planned. I’ll use those.

Add /usr/cluster/bin to your PATH at this point, by editing your shell’s initialisation file(s) as appropriate. I’ll edit and source my ~/.profile:
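For a Bourne-compatible shell, the lines to append to ~/.profile would look something like this (adapt the syntax if you use a different shell):

```shell
# Append the cluster binaries directory to the search path.
PATH=$PATH:/usr/cluster/bin
export PATH
```

Source the file with ". ~/.profile" (or log in again) for the change to take effect.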

From clusternode1, install the new cluster via scinstall. This is an interactive program that asks straightforward questions about your cluster name, node names, cluster interconnect configuration, and so on. The major points to note are: I chose the Custom installation type so I could have fine control over all aspects of the installation; I named the cluster spiffy-cluster; I added clusternode2 as an additional cluster node; I chose net2 and net3 for my cluster interconnects (and allowed scinstall to use auto-discovery to detect and configure the interfaces on the second node); and I disabled automatic quorum device selection (as I want to use a Quorum Server). Since late in the Cluster 3.x series, a separate slice dedicated to the /.globaldevices filesystem is no longer required; a LOFI-backed filesystem is created by default instead. Here is the output of my session - a lot of the time, as you can see, scinstall correctly assumes sensible defaults.

If you see messages regarding the cluster check failing, take a look at the messages in the /var/cluster/logs/install/cluster_check/checkresults.txt file. They may be important, or you may be able to ignore them, but ensure you read the logs! Anything specified as a “violation” must be addressed, or it may prevent proper cluster operation.

After scinstall has configured and rebooted both nodes, the cluster will form! Congratulations, you now have a cluster. Granted, it isn’t doing anything useful at the moment, and it’s still in installmode because no quorum device has been configured yet. We need to connect it to our quorum device.

You can verify that the cluster is in installmode by running the following command on one of the cluster nodes:
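A sketch using the global cluster properties listing; look for the installmode property in the output.

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# Shows cluster-wide properties, including installmode.
run cluster show -t global
```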

To add the quorum server, run another interactive program called clsetup from one of the nodes. This will guide you through the process of adding the quorum server. Output from my session follows.

You can now verify that installmode is disabled:

And confirm that the cluster is quorate by running the clquorum status command:

Back on the quorum server, you can run clquorumserver show to view the quorum reservation made by the quorum server itself, as well as the registrations made by the two nodes.

As a final verification, let’s confirm that our shared storage is visible:
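The DID device listing is one way to do this:

```shell
# Illustrative dry-run helper: prints the command; runs it only if RUN=1.
run() { echo "+ $*"; if [ "${RUN:-0}" = 1 ]; then "$@"; fi; }

# Lists each DID device with the node and full device path behind it.
run cldevice list -v
```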

We can see above that the d1 and d2 DID devices are available to both nodes, and the device paths correspond to the LUNs we created earlier.

Conclusion

This article has covered a large piece of configuration work. First, the iSCSI LUNs and the iSCSI target were built, with access restricted to the IQNs of the two initiators. A quorum server was built, and Solaris Cluster 4.1 was installed on both cluster nodes. Finally, the cluster quorum was configured and installmode disabled, giving us a complete cluster ready for the creation and configuration of resource groups and resources.

In the next article, the cluster will be put to use, with failover filesystems, logical hostnames (“floating IP addresses”) and other resources being created.