Configuring GFS2 on CentOS 7

This article will briefly discuss how to configure a GFS2 shared filesystem across two nodes on CentOS 7. Rather than rehashing a lot of previous content, it presumes that you have followed the steps in my previous article to configure the initial cluster and storage, up to and including the configuration of the STONITH device – but no further. All other topology considerations, device paths/layouts, etc. are the same, and the cluster nodes are still centos05 and centos07. The cluster name is webcluster, and the 8GB LUN is presented as /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7, upon which a single partition has been created: /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1.

First, install the lvm2-cluster and gfs2-utils packages:
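On both nodes, for example:

    yum install -y lvm2-cluster gfs2-utils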

Enable clustered locking for LVM, and reboot both nodes:
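Something like the following, on both nodes:

    lvmconf --enable-cluster
    reboot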

Create clone resources for DLM and CLVMD, so that they can run on both nodes. Run pcs commands from a single node only:
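For example (dlm and clvmd here are simply the resource names I’ve chosen):

    pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence clone interleave=true ordered=true
    pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence clone interleave=true ordered=true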

Create an ordering and a colocation constraint, so that DLM starts before CLVMD, and both resources start on the same node:
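For example, matching the resource names used above:

    pcs constraint order start dlm-clone then clvmd-clone
    pcs constraint colocation add clvmd-clone with dlm-clone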

Check the status of the clone resources:
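For example:

    pcs status resources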

Set the no-quorum-policy of the cluster to freeze, so that when quorum is lost the remaining partition will do nothing until quorum is regained – GFS2 requires quorum to operate.
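This is a single cluster property, for example:

    pcs property set no-quorum-policy=freeze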

Create the LVM objects as required, again, from a single cluster node:
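For example – the volume group and logical volume names (vg_cluster, lv_gfs2) are just the ones I’m using here:

    pvcreate /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1
    vgcreate -Ay -cy vg_cluster /dev/disk/by-id/wwn-0x60014055f0cfae3d6254576932ddc1f7-part1
    lvcreate -l 100%FREE -n lv_gfs2 vg_cluster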

Create the GFS2 filesystem. The -t option should be specified as <clustername>:<fsname>, and the correct number of journals should be specified (here 2, as we have two nodes accessing the filesystem):
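For example, using gfs01 as the filesystem name:

    mkfs.gfs2 -p lock_dlm -t webcluster:gfs01 -j 2 /dev/vg_cluster/lv_gfs2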

We will not use /etc/fstab to specify the mount; instead, we’ll use a Pacemaker-controlled resource:
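For example, using /mnt/gfs2 as the mount point:

    mkdir -p /mnt/gfs2    # on both nodes
    pcs resource create gfs2fs ocf:heartbeat:Filesystem device="/dev/vg_cluster/lv_gfs2" directory="/mnt/gfs2" fstype="gfs2" options="noatime,nodiratime" op monitor interval=10s on-fail=fence clone interleave=true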

This is configured as a clone resource so it will run on both nodes at the same time. Confirm that the mount has succeeded on both nodes:
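For example, on each node:

    mount | grep gfs2
    df -h /mnt/gfs2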

Note the use of noatime and nodiratime, which will yield a performance benefit. As per the Red Hat documentation, SELinux should be disabled too.

Next, create an ordering constraint so that the filesystem resource is started after the CLVMD resource, and a colocation constraint so that both start on the same node:
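For example:

    pcs constraint order start clvmd-clone then gfs2fs-clone
    pcs constraint colocation add gfs2fs-clone with clvmd-clone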

And we’re done.

We can even grow the filesystem online:
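For example, assuming space has been added to the volume group, extend the logical volume and then grow the mounted filesystem:

    lvextend -L +2G /dev/vg_cluster/lv_gfs2
    gfs2_grow /mnt/gfs2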

Building a Highly-Available Apache Cluster on CentOS 7

This article will walk through the steps required to build a highly-available Apache cluster on CentOS 7. In CentOS 7 (as in Red Hat Enterprise Linux 7) the cluster stack has moved to Pacemaker/Corosync, with a new command line tool to manage the cluster (pcs, replacing commands such as ccs and clusvcadm in earlier releases).

The cluster will be a two-node cluster comprising nodes centos05 and centos07, and iSCSI shared storage will be presented from node fedora01. There will be an 8GB LUN presented for shared storage, and a 1GB LUN for fencing purposes. I have covered setting up iSCSI storage with SCSI-3 persistent reservations in a previous article. There is no need to use CLVMD in this example, as we will be utilising a simple failover filesystem instead.

The first step is to add appropriate entries to /etc/hosts on both nodes for all nodes, including the storage node, to safeguard against DNS failure:
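For example (the addresses below are placeholders for my lab network):

    10.1.1.105    centos05
    10.1.1.107    centos07
    10.1.1.108    fedora01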

Next, bring both cluster nodes fully up-to-date, and reboot them:
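For example:

    yum update -y
    reboot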

Continue reading

Nagios Plugin – check_mem

check_mem is a simple Nagios plugin to check memory utilisation on Linux servers. I have written both a bash and a Ruby version of this script.

bash version:
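A stripped-down sketch of this kind of plugin – taking warning and critical thresholds as a percentage of memory in use and returning the standard Nagios exit codes – looks something like this:

    #!/bin/bash
    # Minimal sketch of a memory check plugin.
    # Usage: check_mem.sh -w <warn%> -c <crit%>
    WARN=80
    CRIT=90
    while getopts "w:c:" opt; do
        case $opt in
            w) WARN=$OPTARG ;;
            c) CRIT=$OPTARG ;;
        esac
    done

    # Work out memory in use from /proc/meminfo (free + buffers + cache counts as available)
    total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    avail=$(awk '/^MemFree:|^Buffers:|^Cached:/ {sum += $2} END {print sum}' /proc/meminfo)
    used_pct=$(( (total - avail) * 100 / total ))

    if [ "$used_pct" -ge "$CRIT" ]; then
        echo "MEM CRITICAL - ${used_pct}% used"; exit 2
    elif [ "$used_pct" -ge "$WARN" ]; then
        echo "MEM WARNING - ${used_pct}% used"; exit 1
    else
        echo "MEM OK - ${used_pct}% used"; exit 0
    fi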

Ruby version:


Configuring and Deploying MCollective with Puppet on CentOS 6

The Marionette Collective (MCollective) is a server orchestration/parallel job execution framework available from Puppet Labs (http://docs.puppetlabs.com/mcollective/). It can be used to programmatically execute administrative tasks on clusters of servers. Rather than directly connecting to each host (think SSH in a for loop), it uses publish/subscribe middleware to communicate with many hosts at once. Instead of relying on a static list of hosts to command, it uses metadata-based discovery and filtering and can do real-time discovery across the network.

Getting MCollective up and running is not a trivial task. In this article I’ll walk through the steps required to set up a simple MCollective deployment. The middleware of choice, as recommended by the Puppet Labs documentation, is ActiveMQ. We’ll use a single ActiveMQ node for the purposes of this article; for a production deployment, you should definitely consider a clustered ActiveMQ configuration. Again, for the sake of simplicity, we will only configure a single MCollective client (i.e. our “admin” workstation). For real-world applications you’ll need to manage clients as per the standard deployment guide.

There are four hosts in the lab – centos01, which is our Puppet Master and MCollective client; centos02, which will be the ActiveMQ server and an MCollective server; and centos03 and centos04, which are both MCollective servers. All hosts already run Puppet clients, which I’ll use to distribute the appropriate configuration across the deployment. All hosts are running CentOS 6.5 x86_64.

Continue reading

SCSI-3 Persistent Reservations on Fedora Core 20 with targetcli over iSCSI and Red Hat Cluster

In this article, I’ll show how to set up SCSI-3 Persistent Reservations on Fedora Core 20 using targetcli, serving a pair of iSCSI LUNs to a simple Red Hat Cluster that will host a failover filesystem for the purposes of testing the iSCSI implementation. The Linux IO target (LIO) (http://linux-iscsi.org/wiki/LIO) has been the Linux SCSI target since kernel version 2.6.38. It supports a rapidly growing number of fabric modules, and all existing Linux block devices as backstores. For the purposes of our demonstration, the important fact is that it supports operating as an iSCSI target. targetcli is the tool used to perform the LIO configuration. SCSI-3 persistent reservations are required for a number of cluster storage configurations for I/O fencing and failover/retakeover. Therefore, LIO can be used as the foundation for high-end clustering solutions such as Red Hat Cluster Suite. You can read more about persistent reservations here.

The nodes in the lab are as follows:

  • 10.1.1.103 – centos03 – Red Hat Cluster node 1 on CentOS 6.5
  • 10.1.1.104 – centos04 – Red Hat Cluster node 2 on CentOS 6.5
  • 10.1.1.108 – fedora01 – Fedora Core 20 storage node

Installation

I’ll start by installing targetcli onto fedora01:
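For example:

    yum install -y targetcli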

Let’s check that it has been installed correctly:
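For example, listing the (currently empty) configuration tree:

    targetcli ls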

Make sure that, before proceeding, any existing configuration is removed:
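For example:

    targetcli clearconfig confirm=True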

Continue reading

Building a Highly-Available Load Balancer with Nginx and Keepalived on CentOS

In this post I will show how to build a highly-available load balancer with Nginx and keepalived. There are issues running keepalived on KVM VMs (multicast over the bridged interface), so I suggest you don’t do that. Here, we’re running on physical nodes, but VMware machines work fine too. The end result will be a high-performance, scalable load-balancing solution which can be further extended (for example, to add SSL support).

First, a diagram indicating the proposed topology. All hosts are running CentOS 6.5 x86_64.

As you can see, there are four hosts. lb01 and lb02 will be running Nginx and keepalived and will form the highly-available load balancer. app01 and app02 will be simply running an Apache webserver for the purposes of this demonstration. www01 is the failover virtual IP address that will be used for accessing the web application on port 80. My local domain name is .local.

Continue reading

Installing Nagios under Nginx on Ubuntu 14.04 LTS

Nagios is an excellent open source monitoring solution that can be configured to monitor pretty much anything. In this article, I’ll describe how to install Nagios under Nginx on Ubuntu 14.04 LTS.

First of all, check that the system is fully up to date:
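For example:

    apt-get update
    apt-get -y dist-upgrade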

Next, install the build-essential package so that we can build Nagios and its plugins from source:
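For example:

    apt-get install -y build-essential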

Install Nginx, and verify that it has started:
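Something like:

    apt-get install -y nginx
    service nginx status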

Install libgd2-xpm-dev, php5-fpm, spawn-fcgi and fcgiwrap:
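For example:

    apt-get install -y libgd2-xpm-dev php5-fpm spawn-fcgi fcgiwrap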

Next, create a nagios user:
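For example:

    useradd nagios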

Issue the following commands to create a nagcmd group, and add it as a secondary group to both the nagios and www-data users:
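For example:

    groupadd nagcmd
    usermod -a -G nagcmd nagios
    usermod -a -G nagcmd www-data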

Download the latest Nagios core distribution from http://www.nagios.org/download – at the time of writing this was version 4.0.7.

Continue reading

Implementing Git Dynamic Workflows with Puppet

Puppet is the obvious choice for centralised configuration management and deployment, but what happens when things go wrong (or you have the need to test changes)? A typo in a manifest or module, or an accidental deletion, and all hell could break loose (and be distributed to hundreds of servers). What’s needed is integration with a version control system.

I thought about using Subversion, but instead I decided to get with the times and look at implementing a git repository for version control of my Puppet manifests and modules. Whilst I was at it, I decided to make use of Puppet’s dynamic environment functionality. The end goal was to be able to take a branch of the master Puppet configuration, and have that environment immediately available for use via the --environment=<environment> option to the Puppet agent.

An example will help clarify. Suppose I’m working on a new set of functionality, and don’t want to touch the current set of Puppet modules and inadvertently cause change in production. I could do this:
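For example, using a branch called testing (the repository itself is set up later in this article):

    git clone /opt/git/puppet.git    # or the equivalent ssh URL
    cd puppet
    git checkout -b testing
    # ...make changes to modules/manifests...
    git push origin testing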

and then run my Puppet agent against this new testing code:
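For example:

    puppet agent --test --environment=testing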

It would be a pain to have to update /etc/puppet/puppet.conf each time I create a new environment, so it is much easier to use dynamic environments, where a variable ($environment) is used in the configuration instead of static paths. See the Puppet Labs documentation for more detail.

First, edit /etc/puppet/puppet.conf – mine looks like this after editing; yours may be different:
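The relevant settings look something like this:

    [main]
        environment = production
        manifest    = $confdir/environments/$environment/manifests/site.pp
        modulepath  = $confdir/environments/$environment/modules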

As you can see, I set a default environment of production, and then specify paths to the manifest and modulepath directories, using the $environment variable to dynamically populate the path. Production manifest and modulepath paths will end up being $confdir/environments/production/manifests/site.pp and $confdir/environments/production/modules respectively. As new environments are dynamically created, the $environment variable will be substituted as appropriate.

Next, I moved my existing Puppet module and manifest structure around to suit the new configuration:
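Something along these lines:

    mkdir -p /etc/puppet/environments/production
    mv /etc/puppet/manifests /etc/puppet/environments/production/manifests
    mv /etc/puppet/modules /etc/puppet/environments/production/modules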

And restarted Apache (as I run my puppetmaster under Apache HTTPD/Passenger):
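For example:

    service httpd restart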

I then ran a couple of agents to ensure everything was still working:
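For example, on each agent:

    puppet agent --test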

They defaulted, as expected, to the Production environment.

Next, I installed git on my puppetmaster:
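For example:

    yum install -y git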

After this I created a root directory for my git repository:
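For example:

    mkdir -p /opt/git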

/opt is on a separate logical volume in my setup. Next, create a local git repository from the existing Puppet configuration:
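For example:

    cd /etc/puppet
    git init
    git add .
    git commit -m "Initial import of Puppet configuration"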

And clone a bare repository from this commit:
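For example:

    git clone --bare /etc/puppet /opt/git/puppet.git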

This cloned repository is where people will clone their own copies of the code, make changes, and push them back to – this is our remote repository.

All of the people making changes are in the wheel group, so set appropriate permissions across the repository:
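For example:

    chgrp -R wheel /opt/git/puppet.git
    chmod -R g+rwX /opt/git/puppet.git
    find /opt/git/puppet.git -type d -exec chmod g+s {} \;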

We can now clone the repository, make changes, and push them back up to the remote repository. But we still need to add the real functionality. Two git hooks need to be added – an update hook, to perform some basic syntax checking of the Puppet code being pushed and reject the update if the syntax is bad, and a post-receive hook, to check the code out into the appropriate place under /etc/puppet/environments, taking into account whether this is an update, a new branch, or a deletion of an existing branch. I took the update script from projects.puppetlabs.com and made a slight alteration (as it was failing on import statements), and took the Ruby from here and the shell script from here, plus some of my own sudo shenanigans, to come up with a working post-receive script.

Here is /opt/git/puppet.git/hooks/update:
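In essence it boils down to something like this (the full version, based on the puppetlabs example as noted above, handles a few more cases):

    #!/bin/bash
    # Minimal sketch of an update hook that syntax-checks changed Puppet manifests.
    refname="$1"
    oldrev="$2"
    newrev="$3"

    zero="0000000000000000000000000000000000000000"
    # Skip branch creation/deletion -- there is nothing sensible to diff against
    if [ "$oldrev" = "$zero" ] || [ "$newrev" = "$zero" ]; then
        exit 0
    fi

    rc=0
    for file in $(git diff --name-only "$oldrev" "$newrev" | grep '\.pp$'); do
        # Pull the new version of the file out of the pushed revision and validate it
        tmp=$(mktemp --suffix=.pp)
        git show "$newrev:$file" > "$tmp" 2>/dev/null
        if ! puppet parser validate "$tmp"; then
            echo "Puppet syntax check failed for $file -- rejecting update" >&2
            rc=1
        fi
        rm -f "$tmp"
    done
    exit $rc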

And here is /opt/git/puppet.git/hooks/post-receive:
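Again, a simplified sketch of what it does – depending on whether the branch was pushed, created or deleted, the corresponding environment under /etc/puppet/environments is checked out or removed:

    #!/bin/bash
    # Minimal sketch of a post-receive hook mapping branches to Puppet environments.
    # Assumes the puppet user can read the repository and that the pusher can sudo to it.
    ENVDIR=/etc/puppet/environments
    REPO=/opt/git/puppet.git
    zero="0000000000000000000000000000000000000000"

    while read oldrev newrev refname; do
        branch=${refname#refs/heads/}
        # The master branch maps to the production environment
        if [ "$branch" = "master" ]; then
            env="production"
        else
            env="$branch"
        fi

        if [ "$newrev" = "$zero" ]; then
            # Branch deleted: remove the corresponding environment
            sudo -u puppet rm -rf "$ENVDIR/$env"
        else
            # Branch created or updated: (re)check it out into its environment
            sudo -u puppet mkdir -p "$ENVDIR/$env"
            sudo -u puppet git --git-dir="$REPO" --work-tree="$ENVDIR/$env" checkout -f "$branch"
        fi
    done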

As previously discussed, all admins working with Puppet are members of the wheel group, so I made sure they could run commands as puppet so that the sudo commands in the post-receive hook would work:
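For example, via visudo:

    %wheel ALL=(puppet) NOPASSWD: ALL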

I also removed my Puppet account from lockdown for this:

With all these changes in place, I can now work as expected, and dynamically create environments with all the benefits of version control for my Puppet configuration.

GFS2 Implementation Under RHEL

This article will demonstrate setting up a simple RHCS (Red Hat Cluster Suite) two-node cluster, with an end goal of having a 50GB LUN shared between two servers, thus providing clustered shared storage to both nodes. This will enable applications running on the nodes to write to a shared filesystem, perform correct locking, and ensure filesystem integrity.

This type of configuration is central to many active-active application setups, where both nodes share a central content or configuration repository.

For this article, two RHEL 6.1 nodes, running on physical hardware (IBM blades) were used. Each node has multiple paths back to the 50GB SAN LUN presented, and multipathd will be used to manage path failover and rebuild in the event of interruption.

Continue reading

User, Group and Password Management on Linux and Solaris

This article will cover the user, group and password management tools available on the Linux and Solaris Operating Systems. The specific versions covered here are CentOS 6.4 and Solaris 11.1, though the commands will transfer to many other distributions without modifications (especially RHEL and its clones), or with slight alterations to command options. Check your system documentation and manual pages for further information.

Knowing how to manage users effectively and securely is a requirement of financial standards such as PCI-DSS, and information security management systems such as ISO 27001.

In this article, I will consider local users and groups – coverage of naming services such as NIS and LDAP is beyond its scope but may be covered in a future article. This article also presumes some prior basic system administration experience with a UNIX-like operating system.

Continue reading