Troubleshooting PXE Booting, Node Provisioning

Introduction

As explained in this guide, the only thing Warewulf requires in order to provision a node is that the node is set to PXE boot. If a bootable local disk is present, you may need to change the boot order. This change is made in the BIOS of the cluster node and is the first step of the provisioning process.

Once a node boots to PXE, the process continues as follows:

  • PXE requests a BOOTP/DHCP address on the network.

  • The Warewulf controller’s DHCP server responds with a network configuration and the filename to boot.

  • PXE attempts to download the file named in the DHCP response via TFTP.

  • The downloaded file starts an iPXE stack, which contacts the Warewulf server for its configuration.

  • The Warewulf server generates the iPXE configuration, which specifies what else must be downloaded and how to boot.

  • The kernel, container image, kernel modules, and system overlay are all downloaded over HTTP (REST) from the Warewulf server.

  • iPXE executes the kernel and processes the overlays to provide a unified root file system.

  • Warewulf bootstraps the initialization of the cluster node’s operating system: file system (re)configuration, SELinux, and wwclient, which is started as a background daemon and sleeps until the network is ready.

  • Finally, the Warewulf bootstrap execs the container’s /sbin/init.
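One way to see how far a node gets through the network-facing steps above is to capture the relevant traffic on the Warewulf server. The following is a minimal sketch, not part of Warewulf itself: the function name pxe_capture_filter and the IFACE and HTTP_PORT variables are hypothetical, and the default HTTP port of 9873 is an assumption that should be checked against your own warewulf.conf.

```shell
#!/bin/sh
# Build a tcpdump filter covering the network-facing boot steps:
#   DHCP/BOOTP (UDP 67/68), TFTP requests (UDP 69), and HTTP to the Warewulf server.
# The HTTP port is an assumption (9873 is commonly the Warewulf 4 default);
# confirm it against your own warewulf.conf.
pxe_capture_filter() {
    http_port="${1:-9873}"
    printf 'udp port 67 or udp port 68 or udp port 69 or tcp port %s' "$http_port"
}

# IFACE is a placeholder for the provisioning interface.
# Nothing is captured unless it is set, e.g. IFACE=eth1 sh capture.sh (as root).
if [ -n "${IFACE:-}" ]; then
    tcpdump -n -i "$IFACE" "$(pxe_capture_filter "${HTTP_PORT:-}")"
fi
```

If DHCP requests appear but no TFTP request follows, the problem is likely in the DHCP response; if the TFTP request appears but no HTTP traffic follows, the iPXE download or startup is the place to look. Note that TFTP moves to an ephemeral port after the initial request, so this filter shows the request itself rather than the whole transfer.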

Problem

Node(s) do not boot after being set to boot from PXE.

Resolution

The resolution for this problem is not straightforward, as it could be caused by several different parts of the boot process. It is therefore best to start with a sanity check:

  • Ensure httpd, dhcpd, and tftp-server.socket (or xinetd) are all enabled and running:

systemctl status <service>

  • Ensure firewall rules are set correctly or the firewall is disabled. It is a good idea to stop the firewall completely to rule out any firewall problems:

systemctl stop firewalld

  • Check SELinux. If SELinux is enforcing, check this guide to enable SELinux if needed, or switch SELinux to permissive mode (which stops enforcement) to rule out any SELinux issues:

setenforce 0
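The checks in the list above can be gathered into a single read-only pass. This is a minimal sketch under some assumptions: the SERVICES variable and the check_service function are hypothetical names, and the default unit names are simply those from the checklist, which may differ on your distribution. It only reports state and does not stop the firewall or change the SELinux mode.

```shell
#!/bin/sh
# Read-only sanity check for the services in the checklist above.
# Unit names differ between distributions; these defaults come from the
# checklist and can be overridden, e.g. SERVICES="httpd dhcpd xinetd".
SERVICES="${SERVICES:-httpd dhcpd tftp-server.socket}"

# Print one status line per unit: "<unit> active=yes|no enabled=yes|no".
check_service() {
    unit="$1"
    if systemctl is-active --quiet "$unit" 2>/dev/null; then active=yes; else active=no; fi
    if systemctl is-enabled --quiet "$unit" 2>/dev/null; then enabled=yes; else enabled=no; fi
    printf '%s active=%s enabled=%s\n' "$unit" "$active" "$enabled"
}

for unit in $SERVICES; do
    check_service "$unit"
done

# Report firewall and SELinux state without changing either.
if systemctl is-active --quiet firewalld 2>/dev/null; then
    echo "firewalld is running"
else
    echo "firewalld is not running"
fi

if command -v getenforce >/dev/null 2>&1; then
    echo "SELinux mode: $(getenforce)"
fi
```

Any unit reported with active=no or enabled=no is a candidate for closer inspection with systemctl status <service>.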

Test TFTP Server

Another thing to check is that the TFTP service is working properly. The server must be able to get a file from the Warewulf directory:

tftp <tftpserverIPADDR> -c get /var/lib/warewulf/bin-x86_64-efi-snponly.efi

Note: When run on the server itself, this only tests that the TFTP server is working properly. To rule out network issues, we advise also running the command from a node on the same network as the Warewulf head node and checking that it successfully gets the file from the TFTP server.
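The TFTP test can also be wrapped so that success is judged by whether a non-empty file actually arrived, since the tftp client's exit status alone may not reflect a failed transfer. This is a minimal sketch: tftp_fetch_ok, TFTP_SERVER, and TFTP_FILE are hypothetical names, and it assumes a tftp client that accepts the same -c get syntax used above with an optional local file name.

```shell
#!/bin/sh
# Fetch a file over TFTP into a scratch directory and succeed only if a
# non-empty file was received. Returns 0 on success, 1 on failure,
# 2 if the scratch directory could not be created.
tftp_fetch_ok() {
    server="$1"
    remote="$2"
    tmp="$(mktemp -d)" || return 2
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$tmp" && tftp "$server" -c get "$remote" fetched.bin ) >/dev/null 2>&1
    if [ -s "$tmp/fetched.bin" ]; then rc=0; else rc=1; fi
    rm -rf "$tmp"
    return "$rc"
}

# TFTP_SERVER and TFTP_FILE are placeholders; nothing runs unless both are set.
if [ -n "${TFTP_SERVER:-}" ] && [ -n "${TFTP_FILE:-}" ]; then
    if tftp_fetch_ok "$TFTP_SERVER" "$TFTP_FILE"; then
        echo "TFTP fetch from $TFTP_SERVER succeeded"
    else
        echo "TFTP fetch from $TFTP_SERVER failed"
    fi
fi
```

Running the same script on the head node and on another node in the same network makes it easier to tell a TFTP service problem apart from a network problem.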

References & related articles

Node Provisioning