Virtual machines operations¶
Architecture¶
Ocean’s infrastructure services are all hosted inside virtual machines. Virtual machine drives are stored in a sharded GlusterFS volume, and the VMs are launched using a particular mode of the Pcocc tool. Pcocc uses qemu to launch VMs and requires the following components or services:
an etcd backend to store, among other things, the state of managed VMs
a shared access to VM images
a bridged access to the required networks.
When virtual machines are launched with pcocc using systemd, pcocc performs a heartbeat every 30 seconds over a qemu-agent connection (meaning that the agent must be running within the VM).
The result of the heartbeat is passed to the systemd service watchdog. This means that the VM heartbeat can be disabled or tuned per VM through systemd configuration (WatchdogSec). This mechanism ensures VM availability.
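As an illustration of per-VM tuning, the sketch below prints a systemd drop-in that overrides WatchdogSec. It is a minimal example, not Ocean's actual configuration: the helper name watchdog_dropin and the drop-in path in the usage comment are assumptions, and on a fleet-managed unit the same setting may instead belong in the unit file submitted to fleet.

```shell
# Print a systemd drop-in that overrides the watchdog timeout of a service.
# The argument is passed as-is to WatchdogSec=; a value of 0 disables the
# watchdog (this is systemd's documented default behaviour).
watchdog_dropin() {
    printf '[Service]\nWatchdogSec=%s\n' "$1"
}

# Hypothetical usage for the VM "admin1" (the drop-in location is an
# assumption, adapt it to where the unit is really managed):
#   dir=/etc/systemd/system/pcocc-vm-admin1.service.d
#   mkdir -p "$dir" && watchdog_dropin 120 > "$dir/watchdog.conf"
#   systemctl daemon-reload
```

The new timeout is assumed to apply the next time the service is started, which for a pcocc service means restarting the VM.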
Hypervisor-level availability is achieved using fleet. fleet can be seen as a cluster-wide systemd manager that reacts to hypervisor incidents:
+------------+------------+-------------+------------+
|   Agent    |   Agent    |    Agent    |   Agent    |
+------------+------------+-------------+------------+
|  Pcocc VM  |  Pcocc VM  |  Pcocc VM   |  Pcocc VM  |
+------------+------------+-------------+------------+
|         SystemD         |         SystemD          |
+-------------------------+--------------------------+
|                       Fleet                        |
+----------------------------------------------------+
No HA is provided within the virtual machines.
Incident response¶
The following table summarizes which component reacts to a given incident.
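+-------------------+--------------------+------------------------------------------------------------+
| Incident          | Reacting component | Response                                                   |
+===================+====================+============================================================+
| A service crashes | SystemD            | None (can be restarted automatically if configured)        |
+-------------------+--------------------+------------------------------------------------------------+
| Hang              | Pcocc              | Notifies the underlying SystemD watchdog.                  |
|                   |                    | See Missed heartbeat.                                      |
+-------------------+--------------------+------------------------------------------------------------+
| Missed heartbeat  | SystemD            | Kills (SIGABRT) the pcocc process, which is restarted      |
|                   |                    | automatically.                                             |
+-------------------+--------------------+------------------------------------------------------------+
| QEMU/Pcocc crash  | SystemD            | Restarts the process automatically after a 15-second       |
|                   |                    | pause.                                                     |
+-------------------+--------------------+------------------------------------------------------------+
| Fleet crash       | SystemD            | Restarts fleet automatically. No reaction if the restart   |
|                   |                    | takes less than the agent TTL (30 secs); otherwise, see    |
|                   |                    | Hypervisor crash.                                          |
+-------------------+--------------------+------------------------------------------------------------+
| Fleet stopped     | Fleet              | Stops all locally launched services and reschedules them.  |
+-------------------+--------------------+------------------------------------------------------------+
| Network partition | Fleet              | Stops all locally launched services if it cannot report    |
|                   |                    | to etcd. Treated as a crash by the other cluster members.  |
+-------------------+--------------------+------------------------------------------------------------+
| Hypervisor crash  | Fleet              | Reschedules launched services on other available           |
|                   |                    | hypervisors.                                               |
+-------------------+--------------------+------------------------------------------------------------+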
Monitoring¶
Pcocc VM¶
Using the pcocc ps command, you can list your launched VMs:
# pcocc ps
ID NAME USER PARTITION NODES DURATION TIMELIMIT
-- ---- ---- --------- ----- -------- ---------
518 batch1 root N/A top1 4 days, 3:08:53 N/A
519 lb2 root N/A worker3 4 days, 3:08:52 N/A
520 i54dkless1 root N/A islet55 4 days, 3:08:52 N/A
521 ns2 root N/A worker1 4 days, 3:08:52 N/A
522 batch2 root N/A top3 4 days, 3:08:51 N/A
523 i0conf2 root N/A top2 4 days, 3:08:51 N/A
524 admin1 root N/A top1 4 days, 3:08:50 N/A
525 i0conf1 root N/A worker1 4 days, 3:08:50 N/A
526 i54log1 root N/A islet55 4 days, 3:08:49 N/A
527 nsrelay1 root N/A top3 4 days, 3:08:49 N/A
528 admin2 root N/A top2 4 days, 3:08:49 N/A
529 infra1 root N/A top1 4 days, 3:08:48 N/A
530 ns1 root N/A worker1 4 days, 3:08:47 N/A
531 i0log1 root N/A top2 4 days, 3:08:46 N/A
532 infra2 root N/A top1 4 days, 3:08:46 N/A
533 lb1 root N/A top3 4 days, 3:08:45 N/A
534 db1 root N/A top3 4 days, 3:08:41 N/A
535 ns3 root N/A top3 4 days, 3:08:39 N/A
536 webrelay1 root N/A top3 4 days, 3:08:36 N/A
545 irene271b root N/A irene271 4 days, 2:31:23 N/A
546 irene271a root N/A irene271 4 days, 2:31:19 N/A
547 i54conf2 root N/A islet55 4 days, 2:29:45 N/A
548 i54conf1 root N/A islet55 4 days, 2:29:43 N/A
549 i54dkless2 root N/A islet55 3 days, 20:47:32 N/A
294 siteprep2 root N/A top1 19 days, 23:26:38 N/A
551 i38dkless1 root N/A islet39 2 days, 16:15:06 N/A
550 i38dkless2 root N/A islet38 2 days, 16:15:07 N/A
To see if a VM is still alive, you can either ping the agent within the VM using the job ID:
# pcocc agent ping -j 518
1 VMs answered in 0.30s
Or run a command using the pcocc agent:
# pcocc agent run -j 518 df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda1 20961280 3218320 17742960 16% /
devtmpfs 1598056 0 1598056 0% /dev
tmpfs 1621840 0 1621840 0% /dev/shm
tmpfs 1621840 16892 1604948 2% /run
tmpfs 1621840 0 1621840 0% /sys/fs/cgroup
/dev/mapper/system-var 25149444 827132 24322312 4% /var
top1-data:/volspoms2 5366088704 24639872 5341448832 1% /volspoms2
top1-data:/volspoms1 8049133056 2124307712 5924825344 27% /volspoms1
tmpfs 1621840 0 1621840 0% /tmp
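The ping check above can be applied to every launched VM by combining it with pcocc ps. The following is a minimal sketch: the function names are hypothetical, and it assumes that pcocc agent ping exits with a non-zero status when the agent does not answer, which should be verified on your installation.

```shell
# Read `pcocc ps` output on stdin and print one job ID per line.
# Only lines whose first column is numeric are kept, which skips the
# header and the dashed separator.
pcocc_job_ids() {
    awk '$1 ~ /^[0-9]+$/ { print $1 }'
}

# Ping the agent of every launched VM and print the job IDs that did not
# answer. ASSUMPTION: `pcocc agent ping` returns non-zero on failure.
unresponsive_vms() {
    pcocc ps | pcocc_job_ids | while read -r id; do
        # stdin is redirected so the ping cannot consume the ID list
        pcocc agent ping -j "$id" </dev/null >/dev/null 2>&1 || echo "$id"
    done
}
```

Running unresponsive_vms on a hypervisor prints nothing when all agents answer.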
Use pcocc console to monitor the VM console, with the -l option to get history:
# pcocc console -J admin1 vm0
CentOS Linux 7 (Core)
Kernel 3.10.0-957.21.3.el7.x86_64 on an x86_64
admin1 login:
# pcocc console -J admin1 -l vm0
[...] LESS MODE [...]
To exit the pcocc console, press Ctrl-C three times within 2 seconds.
Pcocc on SystemD¶
Ocean’s Pcocc deployment creates one systemd service per VM, named after the VM with the pcocc-vm- prefix.
Even if services are managed using fleet, systemd services are still visible on the VM hypervisor.
# systemctl status pcocc-vm-infra1.service
pcocc-vm-infra1.service - Fleet service for pcocc VM infra1
Loaded: loaded (/run/fleet/units/pcocc-vm-infra1.service; linked-runtime; vendor preset: disabled)
Active: active (running) since Thu 2019-10-03 07:34:54 CEST; 4 days ago
Process: 44874 ExecStartPre=/usr/sbin/prepare-ocean-image.sh infra1 (code=exited, status=0/SUCCESS)
Main PID: 44912 (pcocc)
Status: "Watchdog successful at 2019-10-07 11:00:56.930856"
CGroup: /system.slice/pcocc-vm-infra1.service
├─44912 /usr/bin/python /usr/bin/pcocc -vv alloc -E sleep infinity -c 4 -m 1000 -J infra1 infra1
├─45675 /usr/bin/python /usr/bin/pcocc -vv internal launcher -E sleep infinity infra1
├─45891 /usr/bin/python /usr/bin/pcocc -vv internal run
├─45901 sleep infinity
└─46022 qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host -S -rtc base=utc -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vgamem_mb=16 -device virtio-scsi-pci,id=scsi0 -object iothread,id=ioth-drive0 -device scsi-hd,id=scsi-hd-drive0,bus...
Oct 07 10:58:51 top1 pcocc[44912]: DEBUG:root:Sending agent sync {"execute":"guest-sync", "arguments": { "id": 430683190 }}
Oct 07 10:59:05 top1 pcocc[44912]: DEBUG:etcd.client:Writing to key /pcocc/global/users/root/batch-local/heartbeat/529 ttl=60 dir=False append=False
Oct 07 10:59:22 top1 pcocc[44912]: DEBUG:root:Sending agent sync {"execute":"guest-sync", "arguments": { "id": 508227119 }}
Oct 07 10:59:35 top1 pcocc[44912]: DEBUG:etcd.client:Writing to key /pcocc/global/users/root/batch-local/heartbeat/529 ttl=60 dir=False append=False
Oct 07 10:59:53 top1 pcocc[44912]: DEBUG:root:Sending agent sync {"execute":"guest-sync", "arguments": { "id": 143384388 }}
Oct 07 11:00:05 top1 pcocc[44912]: DEBUG:etcd.client:Writing to key /pcocc/global/users/root/batch-local/heartbeat/529 ttl=60 dir=False append=False
Oct 07 11:00:24 top1 pcocc[44912]: DEBUG:root:Sending agent sync {"execute":"guest-sync", "arguments": { "id": 817870353 }}
Oct 07 11:00:35 top1 pcocc[44912]: DEBUG:etcd.client:Writing to key /pcocc/global/users/root/batch-local/heartbeat/529 ttl=60 dir=False append=False
Oct 07 11:00:55 top1 pcocc[44912]: DEBUG:root:Sending agent sync {"execute":"guest-sync", "arguments": { "id": 236969055 }}
Oct 07 11:01:06 top1 pcocc[44912]: DEBUG:etcd.client:Writing to key /pcocc/global/users/root/batch-local/heartbeat/529 ttl=60 dir=False append=False
systemctl status shows the VM state and the latest pcocc logs, and reports when the last heartbeat occurred (Status: line).
Simple monitoring can be done using the systemctl is-failed or systemctl is-active commands:
# systemctl is-active pcocc-vm-admin1
active
# systemctl is-failed pcocc-vm-admin1
active
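Note that systemctl is-failed prints the current unit state (here "active") whatever the result; the answer is carried by its exit status, which is 0 only when the unit is failed. A monitoring script should therefore test the exit status rather than parse the output. A minimal sketch, with a hypothetical function name:

```shell
# Print the name of every given unit that systemd reports as failed.
# `systemctl is-failed` exits with status 0 only when the unit is in the
# failed state; --quiet suppresses the printed state.
failed_units() {
    for unit in "$@"; do
        if systemctl is-failed --quiet "$unit"; then
            echo "$unit"
        fi
    done
}

# Example: failed_units pcocc-vm-admin1 pcocc-vm-infra1
```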
The global state of the systemd daemon can be obtained with the systemctl is-system-running command:
# systemctl is-system-running
running
Fleet cluster¶
The fleet cluster can be monitored with the fleetctl list-* commands:
list-machines
Reports registered fleet members.
# fleetctl list-machines
MACHINE HOSTNAME IP METADATA
21f839cc... islet39 10.1.0.39 hostname=islet39,role=islet
24d45a17... islet12 10.1.0.12 hostname=islet12,role=islet
29b0afbf... top2 10.1.0.2 hostname=top2,role=top
3280f87e... top3 10.1.0.3 hostname=top3,role=top
551bb3cf... islet55 10.1.0.55 hostname=islet55,role=islet
5b285f6d... islet13 10.1.0.13 hostname=islet13,role=islet
5e1a5a8f... islet38 10.1.0.38 hostname=islet38,role=islet
70b8f750... irene271 10.8.0.171 hostname=irene271,role=router
944cd548... top1 10.1.0.1 hostname=top1,role=top
9dde1b72... worker3 10.1.0.6 hostname=worker3,role=top
a1ff44e6... worker1 10.1.0.4 hostname=worker1,role=top
a858697c... islet54 10.1.0.54 hostname=islet54,role=islet
c8014c48... worker2 10.1.0.5 hostname=worker2,role=top
list-unit-files
Reports registered services and their state.
# fleetctl list-unit-files
UNIT HASH DSTATE STATE TARGET
pcocc-vm-admin1.service 7b97047 launched launched 944cd548.../top1
pcocc-vm-admin2.service 4faba09 launched launched 29b0afbf.../top2
pcocc-vm-batch1.service f31950f launched launched 944cd548.../top1
pcocc-vm-batch2.service aa142d2 launched launched 3280f87e.../top3
pcocc-vm-db1.service a0820a4 launched launched 3280f87e.../top3
pcocc-vm-i0conf1.service 8a1579d launched launched a1ff44e6.../worker1
pcocc-vm-i0conf2.service 285ac8a launched launched 29b0afbf.../top2
pcocc-vm-i42conf2.service 0dfffef loaded inactive -
pcocc-vm-i42dkless1.service 4dce60d loaded inactive -
pcocc-vm-i42dkless2.service da95002 loaded inactive -
pcocc-vm-i42log1.service 79a68f4 loaded inactive -
[...]
Fleet service state (or destination state - DSTATE) can be the following:
- inactive
Registered service but not scheduled anywhere
- loaded
Scheduled on a cluster member
- launched
Launched on a cluster member
list-units
Reports scheduled services and their execution state
# fleetctl list-units
UNIT MACHINE ACTIVE SUB
pcocc-vm-admin1.service 944cd548.../top1 active running
pcocc-vm-admin2.service 29b0afbf.../top2 active running
pcocc-vm-batch1.service 944cd548.../top1 active running
pcocc-vm-batch2.service 3280f87e.../top3 active running
pcocc-vm-db1.service 3280f87e.../top3 active running
pcocc-vm-i0conf1.service a1ff44e6.../worker1 active running
pcocc-vm-i0conf2.service 29b0afbf.../top2 active running
pcocc-vm-i0log1.service 29b0afbf.../top2 active running
pcocc-vm-i12conf1.service 24d45a17.../islet12 inactive dead
pcocc-vm-i12conf2.service 5b285f6d.../islet13 inactive dead
[...]
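For a quick cluster-wide check, the list-units output can be filtered to keep only the scheduled services that are not running. This is a minimal sketch with a hypothetical function name; it assumes the four-column layout shown above (UNIT, MACHINE, ACTIVE, SUB).

```shell
# Read `fleetctl list-units` output on stdin and print every unit whose
# execution state is not "active running", with its machine and state.
# The header line and short lines such as "[...]" are ignored.
units_not_running() {
    awk '$1 != "UNIT" && NF >= 4 && ($3 != "active" || $4 != "running") {
        print $1, $2, $3 "/" $4
    }'
}

# Example: fleetctl list-units | units_not_running
```

Whether an inactive unit is a problem depends on whether it is expected to be launched, so the result is a starting point for investigation rather than an alert by itself.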
Logs¶
Todo
Show how to interpret fleet and pcocc log files
Operations¶
Pcocc¶
Several operations are available on Pcocc VMs, such as snapshotting, reset, or command execution:
- Reset a VM using pcocc reset
# pcocc reset -J admin1 vm0
1 VMs reset in 0.1s
- Execute an arbitrary command inside the VM
# pcocc agent run -J admin1 hostname
admin1
- Snapshot a VM (safe for the VM)
# pcocc save -J admin1 --dest /tmp/admin1.qcow2
Copying drive data... 8% 00:05:29 (4609.88MB / 51200.00MB)
[...]
Fleet services¶
Fleet-managed services can be operated using the fleetctl command:
- Submit a new unit
# fleetctl submit /usr/share/doc/fleet-1.0.0_31_g58eadf1/examples/hello.service
Unit hello.service inactive
- Schedule a unit
# fleetctl load hello.service
Unit hello.service loaded on a858697c.../islet54
- Start a unit
# fleetctl start hello.service
Unit hello.service launched on a858697c.../islet54
- Stop a unit
# fleetctl stop pcocc-vm-monitor1.service
Unit pcocc-vm-monitor1.service loaded on c8014c48.../worker2
Successfully stopped units [pcocc-vm-monitor1.service].
- Destroy a unit (stop, unschedule and unload)
# fleetctl destroy hello.service
Destroyed hello.service
Fleet cluster¶
There is no way to directly change or mutate the fleet cluster state. To evacuate a hypervisor, stop the fleetd daemon on it; this triggers a rescheduling of all locally launched services on the other hypervisors.
Note that fleet never rebalances the cluster by itself: any hypervisor evacuation unbalances the cluster, and rebalancing is a manual process.
To rebalance, first gather the current load of all hypervisors (based on the Weight configuration of each service):
# clush -Bw $(fleetctl list-machines --no-legend --fields hostname | nodeset -f) -R exec "fleetctl list-units --no-legend --fields hostname,unit | awk '/%h/ {print \$2}' | xargs -r -n 1 fleetctl cat | sed -n 's/Weight=\(.*\)/\1/p' | paste -s -d+ | bc"
---------------
islet[12,38],top3,worker1 (4)
---------------
24000
---------------
islet[13,39] (2)
---------------
20000
---------------
worker[2-3] (2)
---------------
4000
---------------
irene271
---------------
120000
---------------
islet55
---------------
44000
---------------
top1
---------------
28000
---------------
top2
---------------
40000
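The summing step of the pipeline above (the sed, paste and bc stages) can also be written as a small helper, which is easier to reuse for a single hypervisor or a single unit. This is a sketch with a hypothetical function name; it assumes that weights appear in unit files as lines of the form Weight=N, as the sed expression above suggests.

```shell
# Read one or more unit files on stdin and print the sum of their
# Weight= values. Input without any Weight= line yields 0.
sum_weights() {
    awk -F= '$1 == "Weight" { total += $2 } END { print total + 0 }'
}

# Example for a single unit:
#   fleetctl cat pcocc-vm-admin1.service | sum_weights
```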
To move a resource, use fleetctl unload and then fleetctl start: the first unloads the resource, the second loads and starts it again.
# fleetctl unload pcocc-vm-i0log1.service
Triggered unit pcocc-vm-i0log1.service unload
Successfully unloaded units [pcocc-vm-i0log1.service].
# fleetctl start pcocc-vm-i0log1.service
Triggered unit pcocc-vm-i0log1.service start
Triggered unit pcocc-vm-i0log1.service start
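The two commands above can be wrapped in a helper so that the start is only attempted when the unload succeeded. This is a minimal sketch with a hypothetical function name. Keep in mind that the VM is stopped while its service is unloaded, and that fleet, not the caller, chooses the hypervisor on which the service is scheduled again.

```shell
# Unload a fleet unit, then start it again so that fleet reschedules it.
# The start is skipped if the unload fails.
move_unit() {
    fleetctl unload "$1" && fleetctl start "$1"
}

# Example: move_unit pcocc-vm-i0log1.service
```

After the move, fleetctl list-units shows where the service was launched.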