Resolving Timing and Dependency Issues in a systemd Service Unit for SR-IOV Configuration
Introduction
This article addresses an issue in which the SR-IOV configuration script succeeds when run manually but fails when systemd runs it automatically at boot. The likely root cause is a timing or dependency problem in the service unit.
Problem
The script /etc/sriov/setup_dpdk_nic.sh works when executed manually after a node restart but fails when run automatically by systemd. This points to a timing or dependency issue in service startup.
Symptoms
- Script executes successfully when run manually after node restart.
- Script fails when triggered automatically by systemd during the boot process.
Resolution
Update the service unit file as follows to address the timing and dependency issues:
[Unit]
Description=Sriov Config Service
After=network-online.target systemd-udevd.service
Wants=network-online.target
[Service]
Type=oneshot
ExecStart=/etc/sriov/setup_dpdk_nic.sh
RemainAfterExit=yes
TimeoutStopSec=90
#StartLimitInterval=120
#StartLimitBurst=3
[Install]
WantedBy=multi-user.target
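After the unit file is edited, systemd has to re-read it, and the unit has to be enabled for it to run at boot. The following is a minimal sketch of those steps. It assumes the unit is installed as /etc/systemd/system/sriov-configure.service: the unit name comes from the journalctl command in the Troubleshooting section, while the path is an assumption. The function name apply_sriov_unit is hypothetical. Run it as root.

```shell
#!/bin/sh
# Assumed install location of the unit file; adjust to your system.
UNIT_FILE=/etc/systemd/system/sriov-configure.service

# Hypothetical helper: validate, reload, and enable the unit.
apply_sriov_unit() {
    # Check the unit file for syntax and dependency problems.
    systemd-analyze verify "$UNIT_FILE" || return 1
    # Make systemd re-read unit files from disk.
    systemctl daemon-reload || return 1
    # Hook the unit into multi-user.target so it runs on every boot.
    systemctl enable sriov-configure.service
}
```

Call apply_sriov_unit once after saving the file, then reboot the node to confirm the script now succeeds during boot.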
Explanation of Changes:
- [Unit] Section:
  - After=network-online.target systemd-udevd.service: Orders the service after the network is online and after systemd-udevd has started. This is crucial for NIC-related configurations. After= controls ordering only; it does not start the listed units.
  - Wants=network-online.target: Pulls network-online.target into the boot sequence so that the After= ordering on it takes effect and the network is available when the service runs.
- [Service] Section:
  - RemainAfterExit=yes: Keeps the service in the active state after the script completes, aiding in tracking and status reporting.
  - TimeoutStopSec=90: Sets a 90-second timeout for the stop operation.
  - StartLimit parameters: Commented out because they limit how often the service may be started within a given interval, which is less relevant for a oneshot service.
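Ordering after systemd-udevd.service only guarantees that the udev daemon has started, not that a particular NIC has already appeared. If the script still races with device discovery, one option is a short wait loop at the top of the script. The sketch below is an illustration, not part of the original script: the function name wait_for_netdev and the interface name ens1f0 in the usage note are hypothetical.

```shell
#!/bin/sh
# wait_for_netdev IFACE [TIMEOUT_SECONDS] [SYSFS_NET_DIR]
# Polls once per second until the interface shows up under sysfs.
# Returns 0 when it exists, 1 if the timeout expires first.
wait_for_netdev() {
    iface=$1
    timeout=${2:-30}
    netdir=${3:-/sys/class/net}
    elapsed=0
    while [ ! -e "$netdir/$iface" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out waiting for $iface" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

In the script, a call such as "wait_for_netdev ens1f0 60 || exit 1" before the first command that touches the NIC makes the unit fail with a clear message in the journal instead of failing partway through.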
Benefits:
- Ensures the NIC configuration script runs at the correct time during the boot process.
- Improves service status reporting by keeping the service active after script execution.
Troubleshooting:
If the issue persists, provide the output of the following command for further review:
journalctl -u sriov-configure.service
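A few additional commands can help narrow down where in the boot sequence the unit ran. The sketch below wraps them in a hypothetical helper, collect_sriov_diagnostics. Which wait-online service applies depends on the network manager in use, so the last command checks both common ones as an assumption about the environment.

```shell
#!/bin/sh
# Hypothetical helper: gather boot-time diagnostics for the unit.
collect_sriov_diagnostics() {
    unit=${1:-sriov-configure.service}
    # Current state and the last few log lines.
    systemctl status "$unit" --no-pager
    # Full log for the unit from the current boot only.
    journalctl -b -u "$unit" --no-pager
    # Chain of units this one waited on, with timings.
    systemd-analyze critical-chain "$unit"
    # network-online.target only delays boot if a wait-online service is enabled.
    systemctl is-enabled NetworkManager-wait-online.service systemd-networkd-wait-online.service
}
```

For example, "collect_sriov_diagnostics > /tmp/sriov-diag.txt 2>&1" writes everything to a single file that can be attached for review.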
Root Cause
The problem likely stemmed from the script running before the network was fully online. By adjusting the service dependencies, we ensure the script runs at the appropriate time during the boot sequence.