How to Get Started with Solaris Containers

Disclaimer: This post was originally posted in 2008 as an article on the now-defunct website zazzybob.com. While the software version and actual commands used may vary, the concepts are still similar and give a general idea of how to approach a given problem.

Solaris Containers, available starting with Solaris 10, allow us to partition a physical server into one or more logical units. Whilst Containers are a form of virtualisation, they are not virtualisation in the traditional sense (multiple OS instances with VMware, or hardware partitioning with LDoms).

The container can be thought of more like a “chroot” environment (in the case of sparse zones) where system resources are also in effect “chrooted” so that processes cannot run away and consume all of the resources of the physical parent (the global zone), thus rendering the system inoperable. Only a single instance of Solaris 10 is ever installed (in the global zone), making package and patch management simple. Just apply the patch to the global zone, and all child zones will use the same binary set.

Some, or all, of the parent’s filesystems can be mounted read-write or read-only within the zone. Special care must be taken when mounting a global zone filesystem read-write, as the child zone may be able to cause a denial of service to the global zone by filling a disk.

Up until recently, Solaris Containers could only inherit the global (i.e. physical parent) zone’s TCP/IP stack. Now, we can assign exclusive physical interfaces to the Container, but in this article I’ll be creating a dynamic link aggregation of three NICs in the global zone, and then allowing the zones to create virtual interfaces on this aggregation.

There are many more features of Solaris Containers that I’ve not had time to mention. Sun do a perfectly good job of documenting these, and the other elementary topics not covered in this discussion, over at docs.sun.com.

We’re running this setup on a T2000 (SPARC T1 with 4 cores, 4 threads per core - i.e. 16 logical cores) with 8 GB of RAM. Ensure you partition so that your zoneroot (i.e. the filesystem on which you will store your zones) has plenty of free capacity. The system has 4 x e1000g interfaces - I will assign a single interface for the exclusive use of the global zone. I will then assign the remaining three interfaces to an aggregation.

I’ll employ resource controls, and bind a dynamic pool of between 2 and 15 logical cores to the Containers (using Fair Share Scheduling to assign CPU shares on a zone-by-zone basis).

The terms “Containers” and “zones” are used interchangeably and at my whim. They both refer to the same thing.

Initial Configuration

Dynamic Link Aggregation

The first thing to do is aggregate three of the parent’s physical NICs into a dynamic link - this will create a high throughput link on which the zones can reside.

First, configure your switch. Our Cisco switches support EtherChannel, so configuring the aggregation switch-side with LACP was straightforward. I will not cover switch configuration here.

Once your NICs are connected to a configured switch, you can start configuring the aggregation.

If you’re using the ipge driver for your NICs, they will be seen as legacy devices, and will not be appropriate for aggregation. On a T2000, you can transition to the e1000g driver via patch 123334-02.

Check if the patch is installed

# showrev -p | grep 123334-02

If not, download from sunsolve.sun.com, and install it

# patchadd 123334-02

Once this is complete, you can perform the transition

# svcadm milestone single-user
# /usr/sbin/e1000g_transition

Answer “y” when asked to proceed, and “n” when asked about halting the system. Once the transition is complete, you can reboot:

# shutdown -y -g0 -i6

Once the system comes up, you can begin configuring the aggregation. First, ensure that all NICs are now using the e1000g driver.

# dladm show-dev
e1000g0         link: up        speed: 100   Mbps       duplex: full
e1000g1         link: down      speed: 0     Mbps       duplex: half
e1000g2         link: down      speed: 0     Mbps       duplex: half
e1000g3         link: down      speed: 0     Mbps       duplex: half

Good, we can now check that the types are no longer legacy devices

# dladm show-link
e1000g0         type: non-vlan  mtu: 1500       device: e1000g0
e1000g1         type: non-vlan  mtu: 1500       device: e1000g1
e1000g2         type: non-vlan  mtu: 1500       device: e1000g2
e1000g3         type: non-vlan  mtu: 1500       device: e1000g3

If they were still legacy, you’d see type: legacy
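
A minimal sketch of how this check might be scripted, assuming the Solaris 10 `dladm show-link` layout shown above. The function name `has_legacy_links` is my own, and the `ipge0` sample line is a hypothetical pre-transition entry added purely for illustration.

```shell
#!/bin/sh
# has_legacy_links: reads `dladm show-link` output (Solaris 10 layout, as
# shown above) on stdin and prints the name of every link whose type is
# "legacy". Exits 0 if at least one legacy link was found, 1 otherwise.
has_legacy_links() {
    awk '
        {
            # Look for the "type:" label and inspect the value after it.
            for (i = 1; i < NF; i++) {
                if ($i == "type:" && $(i + 1) == "legacy") {
                    print $1
                    found = 1
                }
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Sample input: the e1000g0 line is copied from the output above; the
# ipge0 line is hypothetical.
sample='e1000g0         type: non-vlan  mtu: 1500       device: e1000g0
ipge0           type: legacy    mtu: 1500       device: ipge0'

printf '%s\n' "$sample" | has_legacy_links
```

On a real system the function would be fed from the command itself, for example `dladm show-link | has_legacy_links`, and any names it prints are links that still need the driver transition.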

Notice that the show-dev subcommand displays link information, whilst show-link shows device information? Hmm…

If this all checks out, you’re ready to create your aggregation

# dladm create-aggr -P L3 -l active -d e1000g1 -d e1000g2 -d e1000g3 -u 00:14:4f:6a:ce:c4 1

Here we specify the port selection policy used to spread outbound traffic (layer 3, reported as L3 in the output below), and an LACP mode of active (this must match the switch configuration). Next, we define the devices to use in the aggregation, the MAC address to assign, and finally a unique key that identifies the aggregation. With a key of 1, our configured aggregation will have an “aggr1” logical device name.

Check that the aggregation has been formed correctly

# dladm show-aggr -L
key: 1 (0x0001) policy: L3      address: 0:14:4f:6a:ce:c4 (fixed)
               LACP mode: active       LACP timer: short
   device    activity timeout aggregatable sync  coll dist defaulted expired
   e1000g3   active   short   yes          yes   yes  yes  no        no     
   e1000g2   active   short   yes          yes   yes  yes  no        no     
   e1000g1   active   short   yes          yes   yes  yes  no        no
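
A rough sketch of how the same check could be automated, assuming the nine-column port table that Solaris 10 prints for `dladm show-aggr -L` (as above). The function name `unsynced_ports` is hypothetical, and matching rows by column count is a simplification that would need revisiting if the output layout differs on your release.

```shell
#!/bin/sh
# unsynced_ports: reads `dladm show-aggr -L` output on stdin and prints
# every member device whose sync, coll or dist column is not "yes".
# Exits 0 when every port is healthy, 1 otherwise.
unsynced_ports() {
    awk '
        # Port rows have exactly nine columns; skip the column header row.
        NF == 9 && $1 != "device" {
            if ($5 != "yes" || $6 != "yes" || $7 != "yes") {
                print $1
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Sample input copied from the output above.
sample='key: 1 (0x0001) policy: L3      address: 0:14:4f:6a:ce:c4 (fixed)
               LACP mode: active       LACP timer: short
   device    activity timeout aggregatable sync  coll dist defaulted expired
   e1000g3   active   short   yes          yes   yes  yes  no        no
   e1000g2   active   short   yes          yes   yes  yes  no        no
   e1000g1   active   short   yes          yes   yes  yes  no        no'

if printf '%s\n' "$sample" | unsynced_ports; then
    echo "all ports in sync"
fi
```

If a port is listed, the usual suspects are a mismatched LACP mode or a port missing from the channel group on the switch side.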

Check that the aggregation can be plumbed

# ifconfig aggr1 plumb
# ifconfig aggr1
aggr1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
       inet 0.0.0.0 netmask 0 
       ether 0:14:4f:6a:ce:c4

Make the interface persistent with a dummy IP

# echo "0.0.0.0" > /etc/hostname.aggr1

And test that the changes persist across a reboot

# shutdown -y -g0 -i6

Now we can move on to configuring our dynamic processor pool

Resource Management

We will provision resources as follows: the zones will have a dynamic processor pool of between 2 and 15 logical cores, which will vary as utilisation calls for it. The default pool will have a minimum of 1 logical core allocated to it. The default pool will be left available for exclusive use of the global zone, again so that zones cannot consume all resources. We’ll use the TS scheduler for the default pool, but will use FSS (Fair Share Scheduling) to assign weighted CPU shares to our zones.
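
Before touching poolcfg it can be worth sanity-checking the numbers. The following is only a sketch of that arithmetic, using the figures from this article (16 logical CPUs, a zone processor set of 2 to 15, and at least 1 CPU kept back for the default pool); the function name `check_pset_plan` and its messages are my own invention.

```shell
#!/bin/sh
# check_pset_plan TOTAL ZONE_MIN ZONE_MAX DEFAULT_MIN
# Sanity-checks a processor set plan before any poolcfg commands are run.
check_pset_plan() {
    total=$1; zmin=$2; zmax=$3; dmin=$4
    if [ "$zmin" -gt "$zmax" ]; then
        echo "error: pset.min ($zmin) is greater than pset.max ($zmax)"
        return 1
    fi
    # The zone pset at its largest, plus the default pool's minimum,
    # must fit within the CPUs the machine actually has.
    if [ $((zmax + dmin)) -gt "$total" ]; then
        echo "error: zone pset could starve the default pool"
        return 1
    fi
    echo "ok: default pool keeps between $((total - zmax)) and $((total - zmin)) CPUs"
}

# Figures from this article.
check_pset_plan 16 2 15 1
```

With these figures the default pool, and therefore the global zone, always retains at least one logical CPU no matter how busy the zones become.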

First, enable the pools and pools/dynamic SMF services

# svcadm enable svc:/system/pools:default
# svcadm enable svc:/system/pools/dynamic:default

Once done, we can set the default pool scheduler to TS

# poolcfg -c 'modify pool pool_default ( string pool.scheduler="TS" )'

Instantiate the configuration

# pooladm -c

Next, we define our zone processor set (zone_pset) and the pool that the zones will use (zone_pool), and configure the pool scheduler to be FSS

# poolcfg -c 'create pset zone_pset ( uint pset.min=2; uint pset.max=15 )'
# poolcfg -c 'create pool zone_pool'
# poolcfg -c 'associate pool zone_pool ( pset zone_pset )'
# poolcfg -c 'modify pool zone_pool ( string pool.scheduler="FSS" )'
# pooladm -c

Once this has been completed, we can begin provisioning our Containers.

Container Provisioning

I will provision the container as follows. The container will be a sparse root zone, and will inherit read-only filesystems (such as /lib and /usr) from the global zone. The container will mount my home directory from the global zone, and will do so read-write (/home is on a different partition in the global zone). This allows me to have a consistent home environment across all zones on the server, as well as the global zone itself. It will be assigned 100 CPU shares.
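
CPU shares are relative rather than absolute: under FSS, when zones in the same pool compete for CPU, each is entitled to its own shares divided by the total shares of the zones that are actively competing. A minimal sketch of that arithmetic follows; the function name `cpu_entitlement` is my own, and the second zone with 300 shares is hypothetical.

```shell
#!/bin/sh
# cpu_entitlement SHARES TOTAL_ACTIVE_SHARES
# Prints the percentage of the pool's CPU a zone is entitled to while
# every zone counted in TOTAL_ACTIVE_SHARES is competing for CPU.
cpu_entitlement() {
    awk -v s="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * s / t }'
}

# test-zone (100 shares) is the only busy zone in the pool.
cpu_entitlement 100 100

# test-zone competing with a hypothetical zone holding 300 shares.
cpu_entitlement 100 $((100 + 300))
```

This prints 100.0 and then 25.0. The absolute number of shares therefore matters only in relation to what the other zones in the pool have been given.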

Configure the zone using zonecfg

# zonecfg -z test-zone
> create -F
> set zonepath=/var/zones/test-zone
> set autoboot=true
> set pool=zone_pool
> add net
> set address=192.168.x.x
> set physical=aggr1
> end
> add rctl
> set name=zone.cpu-shares
> add value ( priv=privileged,limit=100,action=none )
> end
> add fs
> set dir=/home/kevin
> set special=/home/kevin
> set type=lofs
> set options=[rw,nodevices]
> end
> verify
> commit
> exit

Stepping through this - we create the zone “test-zone”, set the zonepath (the directory in which this zone’s files, including its root filesystem, will be stored), set the zone to automatically boot when the global zone boots, and bind the zone to the zone_pool resource pool.

Next, we add a network interface, bound to aggr1 in the global zone, and assign an IP address. Once your zone is in operation, you’ll see an interface alias (aggr1:1) in the global zone.

A resource control is added next, assigning our zone 100 FSS CPU shares. My home directory is then imported, from /home/kevin in the global zone (special) to /home/kevin in the child zone (dir). See that we specify “rw” (read-write) in our options.
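
The same configuration can be captured as a list of zonecfg subcommands, which is handy when provisioning more than one zone. This is only a sketch: the function name `zone_commands` is hypothetical, and it assumes each zone lives in its own directory under /var/zones, matching the /var/zones/test-zone/root paths used later in this article.

```shell
#!/bin/sh
# zone_commands NAME ADDRESS
# Prints the zonecfg subcommands from the interactive session above, with
# the zone name and IP address substituted in.
zone_commands() {
    cat <<EOF
create -F
set zonepath=/var/zones/$1
set autoboot=true
set pool=zone_pool
add net
set address=$2
set physical=aggr1
end
add rctl
set name=zone.cpu-shares
add value (priv=privileged,limit=100,action=none)
end
add fs
set dir=/home/kevin
set special=/home/kevin
set type=lofs
set options=[rw,nodevices]
end
verify
commit
EOF
}

# The address is left as the placeholder used throughout this article.
zone_commands test-zone 192.168.x.x
```

On a Solaris host the output could be saved to a file and handed to zonecfg with its -f option (for example `zonecfg -z test-zone -f <file>`) instead of typing the session by hand.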

Once the configuration is verified and committed, we can install the zone.

# zoneadm -z test-zone install

Once installed, the zone will appear in the output of zoneadm list -i (without -i or -c, only running zones are listed)

# zoneadm list -i
global
test-zone

Once this has completed, we can boot the zone. However, I like to perform some post install steps on the zone.

Zone Post Install

First, I disable the /home automount. I dislike automount - I don’t use NFS.

Here, I show a handy technique you can use to administer the zone from the global zone - the zone’s filesystems will appear under

/<zone_path>/<zone_name>/root

For example, on our zone the auto_master file can be reached from the global zone at

/var/zones/test-zone/root/etc/auto_master
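
As a small illustration of the path rule above, here is a helper that maps a path inside the zone to its location in the global zone. The function name `zone_file` is hypothetical, and it assumes you already know the zone's zonepath.

```shell
#!/bin/sh
# zone_file ZONEPATH PATH_INSIDE_ZONE
# Prints where a file inside the zone can be found from the global zone.
zone_file() {
    # ${1%/} drops a single trailing slash from the zonepath, if present.
    printf '%s/root%s\n' "${1%/}" "$2"
}

zone_file /var/zones/test-zone /etc/auto_master
```

This prints /var/zones/test-zone/root/etc/auto_master. If you are unsure of a zone's zonepath, `zonecfg -z test-zone info zonepath` will report it.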

Disable the NFS4 domain question from appearing on the zone’s initial boot

# touch /var/zones/test-zone/root/etc/.NFS4inst_state.domain

I also create a sysidcfg file which, as with JumpStart, allows for an unattended first boot of the zone.

# cat > ~root/sysidcfg
system_locale=C
timezone=Your/TimeZone
terminal=ansi
security_policy=NONE
root_password=eNcRyPtEdPa55w0rD
timeserver=localhost
name_service=NONE
network_interface=primary { hostname=test-zone
                            netmask=255.255.255.0
                            default_route=192.168.x.1
                            protocol_ipv6=no }
nfs4_domain=dynamic
^D
# chmod 400 ~root/sysidcfg
# cp -p ~root/sysidcfg /var/zones/test-zone/root/etc

Once this is done, you can boot the zone

# zoneadm -z test-zone boot

If you hadn’t performed the post install, you’d have to use the zlogin command to connect to the console of the zone, and run through the initial boot dialog

# zlogin -C -e'#.' test-zone

If you don’t need the console, connect a virtual tty to the zone, and start working!

# zlogin test-zone

Automation

I have written two scripts to automate the zone provisioning process.

configure_pset_pool.sh - This script is designed to be run on a freshly installed Solaris 10 global zone (on a 16 thread T2000) and will set up the zone resource pools described in this article.

provision_zone.sh - An incredibly detailed script that will not only provision zones as above, but also perform a large number of additional post-installation steps (for example setting up user accounts, a few security steps, etc).

Instead of typing everything above (except the Link Aggregation), I could have achieved the same with

# configure_pset_pool.sh
# provision_zone.sh -z test-zone -p aggr1 -i 192.168.x.x -a -b

But where would the fun have been in that?!

Conclusion

I hope this article has proved to be informative, and has highlighted methods for deploying Solaris Containers on your hardware, as well as controlling basic zone resources.

Toki Winter

Advanced UNIX for the experienced system administrator
