NMI Watchdog on Linux


A Non-Maskable Interrupt (NMI) is the highest-priority interrupt and cannot be masked by software. It is typically raised for conditions that must not be ignored, such as unrecoverable hardware errors or a watchdog timeout. The NMI watchdog is available for the i386 and amd64 architectures. To use it, the operating system's kernel must support the APIC (Advanced Programmable Interrupt Controller).

The NMI watchdog can be enabled with the following kernel parameters:

kernel.nmi_watchdog=1 → (I/O APIC)
kernel.nmi_watchdog=2 → (Local APIC)

When the NMI watchdog is enabled, the system periodically generates an NMI. Each NMI invokes a handler in the Linux kernel that checks the number of interrupts. If the handler detects that the number of interrupts has not changed for a certain period of time, it assumes that the kernel is hung and triggers a kernel panic.
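The check described above can be pictured with a small user-space sketch. It is an illustration only: the real logic runs inside the kernel's NMI handler, and the function name, its arguments and the threshold are all hypothetical.

```shell
# Illustration only: the real check runs inside the kernel's NMI handler.
# All names here (watchdog_tick, the threshold argument) are hypothetical.
#
# watchdog_tick PREV_COUNT CUR_COUNT STALLED THRESHOLD
#   PREV_COUNT  interrupt count seen at the previous NMI
#   CUR_COUNT   interrupt count seen at this NMI
#   STALLED     consecutive NMIs so far without any change
#   THRESHOLD   how many unchanged NMIs count as a hang
# Prints the new STALLED value followed by "ok" or "hung".
watchdog_tick() {
    if [ "$2" -ne "$1" ]; then
        echo "0 ok"            # progress was made, reset the counter
        return 0
    fi
    stalled=$(( $3 + 1 ))
    if [ "$stalled" -ge "$4" ]; then
        echo "$stalled hung"   # the kernel would panic at this point
    else
        echo "$stalled ok"
    fi
}
```

For example, `watchdog_tick 100 100 4 5` reports a hang, because the count stayed at 100 for the fifth NMI in a row.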

Watchdog Timer

The following command shows the number of NMIs received by each CPU.

$ cat /proc/interrupts
       CPU0       CPU1       CPU2       CPU3             
NMI:    24         18         21         18  Non-maskable interrupts
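If you only need the NMI row, a small helper can pull it out of /proc/interrupts. This is a minimal sketch; the function name nmi_counts is my own, and the optional file argument is there only so the helper can be tried against a saved copy of the file.

```shell
# Print the NMI counter for each CPU, one "cpuN count" pair per line.
# Reads /proc/interrupts by default; a different file can be passed as
# the first argument (handy for testing against a saved copy).
nmi_counts() {
    awk '$1 == "NMI:" {
        # Fields 2..NF hold the per-CPU counters followed by the
        # description text; stop at the first non-numeric field.
        for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
            printf "cpu%d %s\n", i - 2, $i
    }' "${1:-/proc/interrupts}"
}
```

Running `nmi_counts` twice a few seconds apart and comparing the numbers is a quick way to see whether the watchdog is actually generating NMIs.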

The NMI watchdog can be useful for detecting server hangs and reducing downtime. However, it is highly recommended to analyze system performance before enabling it, because the NMI watchdog occasionally generates a high number of interrupts and this can reduce server performance.

You can follow these steps to enable and disable the NMI watchdog. I always recommend enabling it only after the performance analysis is done, because detecting kernel hangs at the same time can make the analysis of server performance more complicated.

Enable NMI:

kernel.nmi_watchdog=1 → (I/O APIC)
kernel.nmi_watchdog=2 → (Local APIC)
sysctl -w kernel.nmi_watchdog=1

Disable NMI:

kernel.nmi_watchdog=0
sysctl -w kernel.nmi_watchdog=0
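After changing the value, it is worth confirming what the kernel currently reports. The sketch below reads the same setting through /proc/sys/kernel/nmi_watchdog; the function name nmi_watchdog_state and its optional file argument are my own additions for illustration.

```shell
# Report whether the NMI watchdog is currently enabled.
# Reads /proc/sys/kernel/nmi_watchdog by default; another file can be
# passed as the first argument for testing.
nmi_watchdog_state() {
    f="${1:-/proc/sys/kernel/nmi_watchdog}"
    if [ ! -r "$f" ]; then
        echo "unknown"         # file missing or unreadable on this kernel
        return 0
    fi
    case "$(cat "$f")" in
        0) echo "disabled" ;;
        *) echo "enabled" ;;
    esac
}
```

Note that `sysctl -w` only changes the running kernel. To keep the value across reboots, the `kernel.nmi_watchdog=0` line is usually placed in /etc/sysctl.conf as well.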

This command only works on kernel version 2.6.18-238 and later. On older kernels you have to edit grub.conf and then reboot the server to disable the NMI watchdog.

# vim /boot/grub/grub.conf
kernel /vmlinuz-2.6.18-194.el5 ro root=/dev/VolGroup00/LogVol00 nmi_watchdog=0

The output below shows that the NMI watchdog is disabled, because no NMIs are counted on any CPU.
$ cat /proc/interrupts
       CPU0     CPU1    CPU2     CPU3             
NMI:    0        0        0        0   Non-maskable interrupts
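When the watchdog is controlled from grub.conf, you can also confirm after the reboot that the kernel really received the parameter. The following is a minimal sketch that looks for nmi_watchdog= in /proc/cmdline; the function name boot_nmi_watchdog and the optional file argument are hypothetical helpers, not part of any standard tool.

```shell
# Print the value of nmi_watchdog= from the kernel command line,
# or "unset" when the parameter is absent.
# Reads /proc/cmdline by default; another file can be passed for testing.
boot_nmi_watchdog() {
    # Put each boot parameter on its own line, then split on "=".
    tr ' ' '\n' < "${1:-/proc/cmdline}" | awk -F= '
        $1 == "nmi_watchdog" { v = $2; found = 1 }
        END { print (found ? v : "unset") }'
}
```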

It is also possible to send an NMI request to a server with the ipmitool utility. You can check this link for more detail.
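As a rough sketch of what that looks like, ipmitool's `chassis power diag` subcommand pulses a diagnostic interrupt (NMI) to the processors. The wrapper below is my own illustration: the function name send_nmi, the DRY_RUN switch, the lanplus interface and the host and user values are all assumptions. It only prints the command unless DRY_RUN=0 is set, because a real NMI may panic the target machine.

```shell
# send_nmi BMC_HOST BMC_USER
# Builds the ipmitool command that sends a diagnostic interrupt (NMI).
# With DRY_RUN=1 (the default here) the command is only printed.
# With DRY_RUN=0 it is executed and ipmitool prompts for the password.
send_nmi() {
    host="$1"
    user="$2"
    # Assumption: the BMC is reachable over the network via lanplus.
    set -- ipmitool -I lanplus -H "$host" -U "$user" chassis power diag
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `send_nmi bmc.example.com admin` prints the full command line so you can review it before running it for real with `DRY_RUN=0`.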

I'm an IT Infrastructure and Operations Architect with extensive experience and administration skills, working for Turk Telekom. I provide hardware and software support for IT infrastructure and operations tasks.
