i had quite a puzzling moment today when a server refused to boot up after routine maintenance.
order of events:
- a datacenter technician took the server down and added a Mellanox 25Gbit network card
- server booted up cleanly with Debian Buster and aged kernel 4.19.0-6
- i rebooted the server again and changed the cpu performance settings in bios > Advanced > CPU Configuration > Advanced Power Management Configuration: Power Technology went from Energy Efficient to Custom, Power Performance Tuning was set to BIOS Controls EPB, and ENERGY_PERF_BIAS_CFG mode to Maximum Performance
- server no longer booted:
- the debian install that used to work on this machine, loaded from ssd, froze just after grub, showing only "Loading Linux 4.19.0-6-amd64 ... Loading initial ramdisk ..."
- debian isos froze as well immediately after grub in a similar way
- gparted live gave me a kernel stacktrace
- on the other hand memtest was working and not showing any errors
- i’ve reverted the bios setting related to cpu performance – it did not help, debian was still freezing
- after long struggles i’ve figured out that passing noacpi acpi=off in the boot params made the server boot, but that wasn’t helpful: /proc/cpuinfo showed just one core
- spooky. at first i thought it was related to the mellanox network card; this describes a similar situation. but even after removing the card the system did not boot.
- finally i’ve gone to bios and selected Restore Optimized Defaults – surprisingly system booted up cleanly this time without any acpi-related kernel parameters
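
a quick way to confirm whether a machine is still running with an acpi workaround, and how many cpus the kernel actually sees, is to look at /proc/cmdline and /proc/cpuinfo. below is a minimal python sketch of that check. the helper names (acpi_params, count_processors) are made up for illustration, and it assumes the x86 layout of /proc/cpuinfo, where each logical cpu has its own "processor : N" line.

```python
from pathlib import Path


def acpi_params(cmdline: str) -> list:
    """Return the kernel command-line tokens that mention acpi."""
    return [tok for tok in cmdline.split() if "acpi" in tok.lower()]


def count_processors(cpuinfo: str) -> int:
    """Count 'processor : N' entries (one per logical cpu on x86)."""
    return sum(
        1
        for line in cpuinfo.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


if __name__ == "__main__":
    cmdline_path = Path("/proc/cmdline")
    cpuinfo_path = Path("/proc/cpuinfo")
    # Only meaningful on Linux, where both files exist.
    if cmdline_path.exists() and cpuinfo_path.exists():
        params = acpi_params(cmdline_path.read_text())
        cpus = count_processors(cpuinfo_path.read_text())
        print("acpi-related boot params:", params or "none")
        print("logical cpus seen by the kernel:", cpus)
```

with acpi=off on the command line, a multi-core box reporting a single cpu here matches what i saw; after the bios reset the first line should say "none" and the cpu count should be back to normal.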