Cluster restart procedure summary

This procedure describes how to restart the entire management cluster. Restarting a single cluster member is out of scope, as it is already covered in the Hypervisor evacuation procedure section.

Management of the compute nodes and service routers is also out of this section’s scope. We assume here that the cluster is ready to be taken down; before proceeding, check that:

  • LNet and IP routers are halted. If these run on VMs managed by fleet, stop them with fleetctl stop to make sure they won’t be restarted at boot (see the sketch after this list).

  • Compute nodes are halted

  • Any additional VMs that are managed by fleet but not located on the management hypervisors are halted.

  • The cluster is in good shape; you can use the beginning of the Hypervisor evacuation procedure section as a way to check this.
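
If the LNet and IP routers are fleet-managed VMs, here is a minimal sketch of stopping them beforehand (the pcocc-vm-lnet[1-2] unit names are only an assumption; adapt them to your actual naming):

(top1)# fleetctl stop --no-block pcocc-vm-lnet1.service pcocc-vm-lnet2.service
(top1)# fleetctl list-units --fields unit,sub --no-legend | grep lnet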

This procedure is fairly simple, but the order of operations matters. Moreover, any incident or unexpected behavior encountered along the way should be investigated immediately, as it can impede the cluster start-up process.

All the commands listed in this procedure must be executed on a top hypervisor (top1 for instance).

Shutdown steps are:

  • Shutting down management VMs

  • Shutting down islet workers

  • Shutting down remaining GlusterFS clients

  • Stopping GlusterFS processes

  • Shutting down remaining management hypervisors

Startup steps are pretty much the exact opposite:

  • Starting up top management hypervisors

  • Starting up GlusterFS processes

  • Starting up islet workers

  • Starting up management VMs

Cluster shutdown procedure

VM shutdown

The shutdown procedure starts by shutting down all VMs, using fleetctl to shut them down all at once.

First, collect the VMs to shut down. Note that service unit names for VMs are formatted as pcocc-vm-${VM}, so any command that lists VMs should be able to format its output this way.

If present, you can use the fleet clustershell group source to list VMs (in their different states) and to format the output the right way:

(top1)# nodeset -ll -s vm
@vm:running admin[1-2],batch[1-3],[...]
(top1)# nodeset -f @vm:running
admin[1-2],batch[1-3],[...]
(top1)# nodeset -O pcocc-vm-%s.service -e @vm:running
pcocc-vm-admin1.service pcocc-vm-admin2.service pcocc-vm-auto1.service pcocc-vm-batch1.service pcocc-vm-batch2.service pcocc-vm-batch3.service

If not, use the following bash snippet to list running VMs:

(top1)# fleetctl list-units --fields unit,sub --no-legend | grep '^pcocc-vm-' | grep running | awk '{print $1}'

Note

Keep this list of service names on your workstation or in a file on the top hypervisors. It may be reused during start-up.
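
For instance, a minimal sketch of saving it to a file (the path is arbitrary and assumes the vm group source is present):

(top1)# nodeset -O pcocc-vm-%s.service -e @vm:running > /root/vm-units-before-shutdown.txt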

Warning

This will shut down all the hosted services; you will lose DNS, for instance. So double-check beforehand that everything else is already shut down. For DNS, any record that must remain available even when the DNS servers are down (the GlusterFS servers, for instance) must be tagged as critical in confiture’s configuration, like the following:

addresses:
  top[1-3]:
    default: [ adm ]
    eq: 10.0.0.[1-3]
    adm:  10.1.0.[1-3]
    data: 10.5.0.[1-3]
    critical: True

Now, shut down these VMs (except the DNS VMs) by reusing the previous command output, so either:

(top1)# fleetctl stop --no-block $(nodeset -O pcocc-vm-%s.service -e @vm:running -x ns[1-3])

Or:

(top1)# fleetctl list-units --fields unit,sub --no-legend | grep '^pcocc-vm-' | grep running | awk '{print $1}' | grep -v 'ns[1-3]' | xargs fleetctl stop --no-block

Check that all VMs are shut down or in the process of shutting down using fleetctl list-units. If some VMs seem to be still running and not shutting down (active/running state in fleet), stop the fleet daemon on the hosting hypervisor (systemctl stop fleet), shut down the VM again and restart the fleet daemon (systemctl start fleet).
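
As an illustration of that last case, assuming a unit pcocc-vm-foo1 stuck on hypervisor worker12 (both names are hypothetical):

(top1)# fleetctl list-units --fields unit,machine,sub --no-legend | grep pcocc-vm-foo1
(top1)# ssh worker12 systemctl stop fleet
(top1)# fleetctl stop --no-block pcocc-vm-foo1.service
(top1)# ssh worker12 systemctl start fleet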

Then shut down the DNS VMs:

(top1)# fleetctl stop --no-block pcocc-vm-ns1 pcocc-vm-ns2 pcocc-vm-ns3

Again, check that all VMs are shut down.

You can now shut down the islet workers:

(top1)# clush -bw @island_worker poweroff

GlusterFS shutdown

In order to properly shut down the GlusterFS processes, make sure that there are no GlusterFS clients left.

Unmount GlusterFS on the remaining hypervisors:

(top1)# clush -bw @top,@worker umount -a -t fuse.glusterfs
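
To double-check that nothing is left mounted, a quick sketch (the echoed message is arbitrary):

(top1)# clush -bw @top,@worker 'grep fuse.glusterfs /proc/mounts || echo "no glusterfs mount left"'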

Check that no clients are still connected:

(top1)# gluster volume status all clients
[...]

Note

Hypervisors themselves may appear as clients. This is because of the Self-heal daemon that acts like a GlusterFS client. You can safely ignore them.

When all clients are disconnected, you can stop the GlusterFS bricks:

(top1)# gluster volume stop volspoms1
(top1)# gluster volume stop volspoms2
(top1)# clush -bw @top,@worker systemctl stop glusterd
(top1)# clush -bw @top,@worker pkill glusterfs
(top1)# clush -bw @top,@worker pkill glusterfsd
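
You can then verify that no gluster process is left on the hypervisors, for instance:

(top1)# clush -bw @top,@worker 'pgrep -l gluster || echo "no gluster process left"'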

Now you can shut down the remaining management hypervisors:

(top1)# clush -bw @top,@worker -x top1 poweroff
(top1)# poweroff

Cluster start-up procedure

GlusterFS startup

Start the top and worker hypervisors using IPMI or by powering them on manually. Then wait until they are all reachable.

Check that GlusterFS peers are correctly communicating:

(top1)# gluster peer status
(top1)# time fping $(nodeset -e @top,@worker)

Warning

Anything that slows down communications must be investigated. For instance, DNS timeouts and/or failures must be avoided.
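
As a quick sanity check of name resolution on the hypervisors at this stage (top[1-3] are taken from the configuration example above; adapt to your node names):

(top1)# time clush -bw @top,@worker 'getent hosts top1 top2 top3'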

Start up GlusterFS volumes:

(top1)# gluster volume start volspoms1
(top1)# gluster volume start volspoms2

Check that volumes are correctly healed:

(top1)# gluster volume heal volspoms1
(top1)# gluster volume heal volspoms2
(top1)# gluster volume heal volspoms1 info
(top1)# gluster volume heal volspoms2 info

Note

The gluster volume heal VOLUME info command may take some time to complete. You may also see higher-than-usual network usage in the meantime; this is expected.
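
To follow the healing progress until the pending entry count reaches zero, a possible sketch (the interval is arbitrary):

(top1)# watch -n 30 'gluster volume heal volspoms1 info | grep "Number of entries"'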

Once GlusterFS seems healthy, you can start islet workers using nodectrl:

(top1)# nodectrl poweron @island_worker
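
Before going further, you can check that the islet workers come back up, mirroring the earlier reachability check:

(top1)# time fping $(nodeset -e @island_worker)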

VM startup

When all the hypervisors are reachable, start the management VMs. You can reuse the list of service names you shut down previously, or just start all VMs. Filter out any VMs you do not want to start yet (like the LNet routers).

First, start the fleet daemons:

(top1)# clush -bw @top,@worker,@island_worker "systemctl start fleet"
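
Before starting any VM, you can check that every hypervisor has rejoined the fleet cluster; the two counts below should match if all of them run a fleet daemon:

(top1)# fleetctl list-machines --no-legend | wc -l
(top1)# nodeset -c @top,@worker,@island_worker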

Second, start the DHCP VMs:

(top1)# fleetctl start --no-block pcocc-vm-infra1 pcocc-vm-infra2

Then, the DNS VMs:

(top1)# fleetctl start --no-block pcocc-vm-ns1 pcocc-vm-ns2 pcocc-vm-ns3
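
Once the DNS VMs are up, a quick check that resolution works again (admin1 is just taken from the earlier VM listing example):

(top1)# getent hosts admin1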

And finally, start all the remaining VMs:

(top1)# fleetctl start --no-block $(nodeset -O pcocc-vm-%s.service -e '@vm:*')

Or:

(top1)# fleetctl list-units --fields unit,sub --no-legend | grep '^pcocc-vm-' | awk '{print $1}' | xargs fleetctl start --no-block
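
To spot units that have not reached the running state yet, a quick check:

(top1)# fleetctl list-units --fields unit,active,sub --no-legend | grep -v running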

To validate that everything is in good shape, you can use the beginning of the Hypervisor evacuation procedure section.

Moreover, any monitoring alert about management services or VMs should be investigated here.