Woke up today to the homeserver being unresponsive. Couldn’t SSH, no video out when I connected a monitor, and even the reset button didn’t do anything. Had to hold the power button to shut it down.
/var/log/syslog doesn’t show anything interesting, other than that the issue happened just after 4am. Log:
2026-02-27T03:55:01.481794-08:00 blackbox CRON[1743418]: (www-data) CMD (/usr/bin/php8.3 /mnt/MONSTERDRIVE/pixelfeddata/pixelfed/artisan schedule:run >> /dev/null 2>&1)
2026-02-27T04:00:00.198504-08:00 blackbox smartd[2126]: Device: /dev/sdd [SAT], CHECK POWER STATUS spins up disk (0x81 -> 0xff)
2026-02-27T04:00:00.291853-08:00 blackbox systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
2026-02-27T04:00:00.298344-08:00 blackbox systemd[1]: sysstat-collect.service: Deactivated successfully.
2026-02-27T04:00:00.298523-08:00 blackbox systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
2026-02-27T04:00:00.299608-08:00 blackbox kernel: kauditd_printk_skb: 8 callbacks suppressed
2026-02-27T04:00:00.299613-08:00 blackbox kernel: audit: type=1130 audit(1772193600.298:798916): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
2026-02-27T04:00:00.299615-08:00 blackbox kernel: audit: type=1131 audit(1772193600.298:798917): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='unit=sysstat-collect comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
2026-02-27T04:00:01.923610-08:00 blackbox kernel: audit: type=1101 audit(1772193601.922:798918): pid=1744810 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='op=PAM:accounting grantors=pam_permit acct="www-data" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
2026-02-27T04:00:01.923614-08:00 blackbox kernel: audit: type=1103 audit(1772193601.922:798919): pid=1744810 uid=0 auid=4294967295 ses=4294967295 subj=unconfined msg='op=PAM:setcred grantors=pam_permit,pam_cap acct="www-data" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
2026-02-27T04:00:01.923615-08:00 blackbox kernel: audit: type=1006 audit(1772193601.922:798920): pid=1744810 uid=0 subj=unconfined old-auid=4294967295 auid=33 tty=(none) old-ses=4294967295 ses=50544 res=1
2026-02-27T04:00:01.923615-08:00 blackbox kernel: audit: type=1300 audit(1772193601.922:798920): arch=c000003e syscall=1 success=yes exit=2 a0=7 a1=7fff81d75200 a2=2 a3=0 items=0 ppid=2654 pid=1744810 auid=33 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=50544 comm="cron" exe="/usr/sbin/cron" subj=unconfined key=(null)
2026-02-27T04:00:01.923616-08:00 blackbox kernel: audit: type=1327 audit(1772193601.922:798920): proctitle=2F7573722F7362696E2F43524F4E002D66002D50
2026-02-27T04:00:01.924259-08:00 blackbox CRON[1744811]: (www-data) CMD (/usr/bin/php8.3 /mnt/MONSTERDRIVE/pixelfeddata/pixelfed/artisan schedule:run >> /dev/null 2>&1)
2026-02-27T04:00:01.924614-08:00 blackbox kernel: audit: type=1105 audit(1772193601.923:798921): pid=1744810 uid=0 auid=33 ses=50544 subj=unconfined msg='op=PAM:session_open grantors=pam_loginuid,pam_env,pam_env,pam_permit,pam_umask,pam_unix,pam_limits acct="www-data" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
2026-02-27T04:00:01.925610-08:00 blackbox kernel: audit: type=1110 audit(1772193601.924:798922): pid=1744811 uid=0 auid=33 ses=50544 subj=unconfined msg='op=PAM:setcred grantors=pam_permit,pam_cap acct="www-data" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
2026-02-27T04:00:02.357616-08:00 blackbox kernel: audit: type=1104 audit(1772193602.356:798923): pid=1744810 uid=0 auid=33 ses=50544 subj=unconfined msg='op=PAM:setcred grantors=pam_permit acct="www-data" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
2026-02-27T09:23:35.786375-08:00 blackbox systemd-modules-load[904]: Inserted module 'dm_multipath'
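To pin down when the log goes quiet without scrolling through audit noise, something like the sketch below can find the longest silence between consecutive timestamped lines. It assumes every line starts with an ISO 8601 timestamp, as in the excerpt above; `largest_gap` is just an illustrative name.

```python
from datetime import datetime


def largest_gap(lines):
    """Return (before, after, seconds) for the longest silence between log lines.

    Lines that do not start with an ISO 8601 timestamp are skipped.
    Returns None when fewer than two timestamps are found.
    """
    stamps = []
    for line in lines:
        first_field = line.split(" ", 1)[0]
        try:
            stamps.append(datetime.fromisoformat(first_field))
        except ValueError:
            continue  # not a timestamped line

    best = None
    for prev, cur in zip(stamps, stamps[1:]):
        gap = (cur - prev).total_seconds()
        if best is None or gap > best[2]:
            best = (prev, cur, gap)
    return best


# Example with the last two lines of the excerpt (message text shortened):
sample = [
    "2026-02-27T04:00:02.357616-08:00 blackbox kernel: audit: type=1104",
    "2026-02-27T09:23:35.786375-08:00 blackbox systemd-modules-load[904]: Inserted module",
]
before, after, seconds = largest_gap(sample)
print(f"silent from {before} to {after} ({seconds / 3600:.2f} h)")
```

Pointed at the full /var/log/syslog (for example `largest_gap(open("/var/log/syslog"))`), it should land on the same window: the last entry at 04:00:02 and the first boot message at 09:23:35, roughly 5 hours 23 minutes apart.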
Would something like this be a direct hardware failure? Like a power supply hiccup or something? It happening at 4am coincides with my electric car starting to charge, but the server is on a dedicated 20A circuit and behind a battery backup. I also don’t see any power issues on my Sense monitor at that time though it has limited resolution.
Mainboard is a Supermicro H13SAE-MF and I’m using ECC RAM.
I’ve been running this hardware for over a year and never had this issue, but I’m running out of places to look.
Might be time to finally get IPMI working.
The reset button is basically just a signal to the CPU/BIOS that it should wipe memory and begin the boot process from scratch. If it was not working, that indicates the CPU was hard locked and not responding to any sort of input, not just an OS fault. Holding the power button triggers a hardware override on the board that drops the power-on signal to the PSU through the ATX connector, so it still works when the CPU is locked up.
Random shit happens, see if it does it again.
My go-to for random stability issues is to always run a full deep memtest to look for bad RAM and then a CPU stress test to see if it’s a random thermal or core issue. More often than not I find stability problems with just these two steps.

If it happens again and you have Magic SysRq enabled, you can do Magic SysRq-t, which may give you some idea of what the system is doing, since you’ll get stack traces. As long as the kernel can talk to the keyboard, it should be able to get that.
https://en.wikipedia.org/wiki/Magic_sysrq
You maybe can’t see anything on your monitor, but if the system is working enough to generate the stack traces and log them to the syslog on disk (like, your kernel filesystem and disk systems are still functional), you’ll be able to view them on reboot.
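Before relying on SysRq-t, it may be worth checking that the task dump is actually allowed from the keyboard. The sketch below reads the `kernel.sysrq` value and tests the "debugging dumps" bit (0x8) described in the kernel's SysRq documentation; a value of 1 enables everything. The function names are illustrative, and the 438 in the comment is only an example of a restricted mask that some distributions ship.

```python
SYSRQ_ENABLE_DUMP = 0x8  # "enable debugging dumps of processes etc."


def sysrq_allows_task_dump(value: int) -> bool:
    """True if a kernel.sysrq value permits SysRq-t from the keyboard.

    1 means every SysRq function is enabled, 0 means none are;
    anything else is treated as a bitmask.
    """
    return value == 1 or bool(value & SYSRQ_ENABLE_DUMP)


def read_sysrq(path="/proc/sys/kernel/sysrq") -> int:
    """Read the current kernel.sysrq setting (Linux only)."""
    with open(path) as fh:
        return int(fh.read().strip())


# Example masks: 438 leaves the dump bit clear, so SysRq-t would be ignored.
for mask in (0, 1, 8, 438):
    print(mask, sysrq_allows_task_dump(mask))
```

On the server itself, `sysrq_allows_task_dump(read_sysrq())` gives the answer for the running kernel. If it comes back false, the mask would need to be widened (via sysctl) before the next hang for the keyboard shortcut to do anything.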
If it can’t even do that, you might be able to set up a serial console and then, using another system running screen or minicom or something like that linked up to the serial port, issue Magic SysRq to that and view it on that machine.

Some systems have hardware watchdogs, where if a process can’t constantly ping the thing, the system will reboot. That doesn’t solve your problem, but it may mitigate it if you just want it to reboot if things wedge up. The watchdog package in Debian has some software to make use of this.

Solar flares and even the occasional stray neutron hitting your equipment can cause some weird issues. If it’s just a one-time occurrence and it doesn’t happen again, I wouldn’t worry too much about it.
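On the hardware watchdog idea: the Linux watchdog device works by having a userspace process write to it periodically, and if the writes stop, the hardware resets the machine. The sketch below shows only that "petting" loop. It is not a replacement for the watchdog package; `pet_watchdog` and its parameters are made up for illustration, and whether the board exposes /dev/watchdog at all depends on the driver.

```python
import time


def pet_watchdog(dev, beats, interval_s=10.0, sleep=time.sleep):
    """Write a keepalive byte to an open watchdog device `beats` times.

    `dev` is any binary file-like object. On a real system it would be
    /dev/watchdog opened unbuffered. After the last beat the magic
    character 'V' is written, which asks drivers that support
    "magic close" to disarm the timer instead of rebooting on close.
    """
    for _ in range(beats):
        dev.write(b"\0")  # any write counts as a keepalive
        dev.flush()
        sleep(interval_s)
    dev.write(b"V")  # magic close: disarm if the driver allows it
    dev.flush()


# Real use (needs root, and arms the watchdog the moment the device is opened):
#
#     with open("/dev/watchdog", "wb", buffering=0) as dev:
#         pet_watchdog(dev, beats=6, interval_s=10.0)
#
# If this process or the whole system wedges between beats, the hardware
# resets the machine once its timeout expires.
```

The interval has to stay comfortably below the watchdog's timeout, which varies by driver. A daemon that also checks system health before each keepalive, which is what the packaged tools do, is the more robust option.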
Wanted to say that - random shit does happen, even to the most stable systems. There’s a cutoff in consumer hardware where selecting for more stability, such as radiation hardening, simply isn’t worth the cost. Best you can do is ECC RAM.
For homeservers if you’re not always on-site to look at it, it’s a good idea to set a reset on kernel panic.
More ideal not to reboot of course, but I am often hundreds of miles from my servers so a kernel panic over something that wouldn’t have otherwise killed the system is something I’d rather live with and reboot.
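For the reset-on-panic setup, the relevant knob is the `kernel.panic` sysctl: 0 means the kernel sits there forever after a panic, a positive value means reboot after that many seconds, and a negative value means reboot immediately. A small sketch for checking what a box is currently set to is below. The related knobs it reads (`panic_on_oops`, `softlockup_panic`, `hung_task_panic`) may not exist on every kernel build, and the function names are illustrative.

```python
from pathlib import Path

PANIC_KNOBS = ("panic", "panic_on_oops", "softlockup_panic", "hung_task_panic")


def read_panic_settings(root="/proc/sys/kernel"):
    """Return {knob: int or None} for the panic-related sysctls under `root`."""
    settings = {}
    for name in PANIC_KNOBS:
        try:
            settings[name] = int((Path(root) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None  # knob missing or unreadable on this kernel
    return settings


def reboots_after_panic(settings) -> bool:
    """kernel.panic: 0 = hang forever, >0 = reboot after N s, <0 = reboot now."""
    value = settings.get("panic")
    return value is not None and value != 0
```

Calling `reboots_after_panic(read_panic_settings())` on the server shows whether a panic would currently leave it hung. Note that this only helps when the kernel actually panics. A CPU that is locked hard enough to ignore the reset button may never get that far, which is where a hardware watchdog comes in.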
Memory leak eating all your RAM, then locking up? Is it a one-time thing or is it a regular occurrence?
I think it’s the first time it’s happened since I upgraded my hardware over a year ago. 64 gigs of RAM and I rarely use more than 30% of it.
I still use swap for those rare moments I run out of RAM after all. Who knows, maybe some heavy cronjobs will clash or whatever.
Ghost in the machine 🤷
Impossible to tell and it sometimes happens.
Reset button not working, but power button working is quite odd.
Is it just the once that this happened? Can you reliably trigger it with the car charger? If yes, maybe worth plugging in a monitor while you trigger it to see what happens.
Are the server and chargers close to each other? Some kind of EMP effect? Seems unlikely, but who knows.
Reset button not working, but power button working is quite odd.
Yeah makes me think something hardware level.
Are the server and chargers close to each other? Can you reliably trigger it with the car charger?
No. The car charges every night. This is the first time this has happened.
Full disk maybe?
Most likely memory issues
In ECC memory?
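ECC doesn’t rule memory out, but it does mean errors should leave a trace. When the kernel’s EDAC driver is loaded for the memory controller, corrected and uncorrected error counts show up under /sys/devices/system/edac/mc. The sketch below just reads those counters; whether an EDAC driver binds on this particular board is an assumption, and `edac_error_counts` is an illustrative name.

```python
from pathlib import Path


def edac_error_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"corrected": n, "uncorrected": n}} from EDAC sysfs.

    Returns an empty dict when no EDAC memory controller is registered,
    which usually means no driver is loaded rather than "no errors".
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    counts = {}
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text())
            uncorrected = int((mc / "ue_count").read_text())
        except (OSError, ValueError):
            continue  # controller directory without readable counters
        counts[mc.name] = {"corrected": corrected, "uncorrected": uncorrected}
    return counts
```

A non-zero corrected count that keeps climbing would point at a marginal DIMM. An empty result means nothing is being reported, so it says nothing either way. The counters are kept by the driver, so they are worth checking before the next reboot rather than after.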








