Diskless
^^^^^^^^

The ``diskless`` pack provides high-level management of diskless compute nodes.
It ships with some basic actions that reuse the cluster's procedures
implemented using milkcheck. This pack does not include those implementations;
they must be provided in ``/etc/milkcheck/conf/``.

Actions
"""""""

**diskless.milkcheck.(boot|prepare|open|status)**

Actions that launch the ``boot``, ``prepare``, ``open`` or ``status`` actions
of the ``compute_dkless`` service. This milkcheck service must be implemented
in the system's milkcheck configuration files (``/etc/milkcheck/conf``).

The ``hosts`` parameter is required. Optionally, these actions accept the
``context`` parameter, which is a hash of variables to set in milkcheck.

**diskless.remediate**

This is a workflow that makes sure that the given nodes are correctly booted
and ready to be used. This action takes the following required parameters:

``hosts``
  The hosts to check and remediate.

``image``
  The diskless image name. Used as the ``iscsi_image`` variable of milkcheck's
  procedure.

``vmlinuz``, ``initrd``
  The names of the vmlinuz and initrd images. Used as the
  ``kexec_vmlinuz_name`` and ``kexec_initrd_name`` variables of milkcheck's
  procedure. Defaults to ``vmlinux`` and ``initrd``.

``concurrency``
  The action concurrency of this workflow.

``config_ttl``
  The maximum time to wait for the node to go through the cloudinit step of
  the boot process. Defaults to 30 minutes.

``boot_ttl``
  The maximum time to wait for the node to go through the POST+ipxe step of
  the boot process. Defaults to 10 minutes.

The workflow may look like the following pseudo-code:

.. uml::

   @startuml
   start
   :milkcheck compute_dkless boot_status;
   if (all ok) then (yes)
     stop
   else (no)
     if (node BMC is reachable) then (no)
       :power awake unreachable BMCs;
       :wait for BMC reachable;
       :power on powered-off nodes;
       :Wait for iPXE phone_home;
       if (OK ?) then (yes)
         :Wait for boot;
         :milkcheck compute_dkless boot;
         :Wait for cloud-init phone_home;
       else (no)
         :power sleep unreachable BMCs;
         :power awake unreachable BMCs;
         :wait for BMC reachable;
         :power on powered-off nodes;
         :Wait for iPXE phone_home;
         :Wait for boot;
         :milkcheck compute_dkless boot;
         :Wait for cloud-init phone_home;
       endif
     else (yes)
       if (node is powered off) then (yes)
         :power on powered-off nodes;
         :Wait for iPXE phone_home;
         if (OK ?) then (yes)
           :Wait for boot;
           :milkcheck compute_dkless boot;
           :Wait for cloud-init phone_home;
         else (no)
           :power sleep unreachable BMCs;
           :power awake unreachable BMCs;
           :wait for BMC reachable;
           :power on powered-off nodes;
           :Wait for iPXE phone_home;
           :Wait for boot;
           :milkcheck compute_dkless boot;
           :Wait for cloud-init phone_home;
         endif
       else (no)
         if (Node is reachable ?) then (no)
           fork
             :Wait for cloud-init phone_home;
           fork again
             :Wait for iPXE phone_home;
             if (OK ?) then (yes)
               :Wait for boot;
               :milkcheck compute_dkless boot;
               :Wait for cloud-init phone_home;
             else (no)
               :power sleep unreachable BMCs;
               :power awake unreachable BMCs;
               :wait for BMC reachable;
               :power on powered-off nodes;
               :Wait for iPXE phone_home;
               :Wait for boot;
               :milkcheck compute_dkless boot;
               :Wait for cloud-init phone_home;
             endif
           end fork
         else (yes)
           if (Node is in diskless trap ?) then (yes)
             :milkcheck compute_dkless boot;
             :Wait for cloudinit phone_home call;
           else (no)
             if (Node is in cloud-init ?) then (yes)
               :Wait for cloudinit phone_home call;
             else (no)
               :fail;
             endif
           endif
         endif
       endif
     endif
   endif
   stop
   @enduml

**diskless.wait_for.(ipxe|cloudinit)**

Actions that wait for a phone_home call. These actions use inquiries and
generated rules within a workflow to wait for a
**diskless.cloudinit.phone_home** or **diskless.ipxe.phone_home** trigger for
the given node.

These actions require that the **diskless.wait_(ipxe|cloudinit).arming** rule
is present and enabled.

The ``host`` parameter is required. The ``ttl`` parameter defines how long (in
minutes) this workflow should wait.
This requires inquiry garbage collection to be enabled (``purge_inquiries`` in
the ``garbagecollector`` section of *st2.conf*). ``ttl`` defaults to 10
minutes. Because of StackStorm internal timers, ``ttl`` values below 10 minutes
may not time out immediately.

The sequence of these actions is a bit tricky. Here is a quick sequence diagram
for **diskless.wait_for.ipxe** (the cloudinit one is much the same).

.. seqdiag::
   :desctable:

   seqdiag {
     A[label="Action", description="**diskless.wait_for.ipxe** action"]
     B[label="Rule A", description="**diskless.wait_ipxe.arming** rule"]
     C[label="Rule B", description="**diskless.wait_ipxe.arming.NODE** Rule"]
     D[label="Node", description="A booting node"]

     A --> B [label="Triggers by adding an inquiry"]
     B -> C [label="Creates a rule"]
     D --> C [label="Triggers"]
     C -> A [label="Responds to inquiry"]
     A -> C [label="Delete rule"]
     A -> D [label="Boot procedure"]
   }

Sensor
""""""

The **diskless.phone_home.sensor** is a simple sensor that listens for events
coming from the boot process. Nodes post data to this sensor through the
``phone_home`` *cloud-init* module and through a simple ``imgfetch`` *iPXE*
command. The ``phone_home`` configuration of *cloud-init* or *iPXE* itself is
not handled here; this sensor only listens for ``phone_home`` events.

This sensor is a Flask server listening on 32001/tcp. It fires a
**diskless.cloudinit.phone_home** trigger with the URL-encoded data sent by
*cloud-init*, or a **diskless.ipxe.phone_home** trigger when called from
*iPXE*.

Triggers
''''''''

**diskless.cloudinit.phone_home**

A trigger that indicates that a node has almost finished its cloud-init
process. The payload contains details about the node:

* ``pub_key_dsa``, ``pub_key_rsa`` and ``pub_key_ecdsa``: SSH host keys present
  on the node.
* ``instance_id``: cloud-init's instance-id, may be derived from the hostname.
* ``hostname``, ``fqdn``: node hostname and FQDN.

**diskless.ipxe.phone_home**

A trigger that indicates that a node is currently booting and is in the iPXE
step. The payload only contains the node hostname (``hostname``).
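
To make the shape of the **diskless.cloudinit.phone_home** payload more
concrete, here is a minimal Python sketch of how a URL-encoded body carrying
the fields listed above could be decoded. It is an illustration only: the
helper names ``encode_phone_home`` and ``decode_phone_home`` and the sample
host names are hypothetical and not part of this pack, and the sensor's actual
parsing code may differ.

```python
from urllib.parse import parse_qs, urlencode

# Field names taken from the trigger description above.
CLOUDINIT_FIELDS = (
    "pub_key_dsa",
    "pub_key_rsa",
    "pub_key_ecdsa",
    "instance_id",
    "hostname",
    "fqdn",
)


def encode_phone_home(fields):
    """Build a URL-encoded body from a dict of fields (hypothetical helper)."""
    return urlencode(fields)


def decode_phone_home(body):
    """Decode a URL-encoded body into a flat payload dict (hypothetical helper)."""
    parsed = parse_qs(body, keep_blank_values=True)
    # parse_qs returns a list of values per key; keep the first value of
    # each documented field and ignore anything else.
    return {key: parsed[key][0] for key in CLOUDINIT_FIELDS if key in parsed}


# Sample values are made up for the example.
body = encode_phone_home(
    {"hostname": "node1", "fqdn": "node1.example.com", "instance_id": "node1"}
)
payload = decode_phone_home(body)
```

In this sketch, fields that are not listed in the trigger description are
dropped, so the resulting dict only ever holds the documented keys.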