adriangl f4eb804c59 fix: legolas NVMe suspend/resume crashes
Switch mem_sleep_default from deep (S3) to s2idle - the Phison E21
NVMe hangs on resume from the D3 transition S3 forces. Remove the
no-op d3cold_allowed udev rule (attribute doesn't exist) and the
broken nvme-resume-fix service (ExecStart redirect never ran). Add a
correct runtime-PM-off rule for the NVMe endpoint and its parent root port.
2026-06-18 10:36:39 +02:00

Development log

legolas: NVMe (Corsair MP600 ELITE / Phison E21) suspend/resume crashes

The drive hung on resume from S3 (deep) suspend: boot logs ended at "PM: suspend entry (deep)" with no matching resume entry, followed by a hard reboot. The Phison E21 can't recover from the D3 transition that S3 forces.

Investigation also found two existing workarounds were doing nothing:

  • The udev rule ATTR{d3cold_allowed}="0" was a no-op — that sysfs attribute doesn't exist on this Phison endpoint or its parent root port (0000:00:1d.0).
  • The nvme-resume-fix service was broken: ExecStart used > redirection without a shell, so echo just printed the string and never wrote to the rescan sysfs file. It never actually rescanned.
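
To illustrate the second failure mode, here is a minimal sketch of what such a service could look like in NixOS. It is a hypothetical reconstruction, not the removed unit: the unit ordering, the description, and the /sys/bus/pci/rescan path are assumptions. The commit removed the service outright rather than repairing it.

```nix
{ pkgs, ... }:
{
  # Hypothetical reconstruction for illustration; not the removed unit.
  systemd.services.nvme-resume-fix = {
    description = "Rescan the PCI bus after resume (illustrative)";
    wantedBy = [ "suspend.target" ];
    after = [ "suspend.target" ];
    serviceConfig = {
      Type = "oneshot";

      # Broken form: systemd executes echo directly and does not interpret
      # shell redirection, so ">" and the path reach echo as plain arguments
      # and nothing is written to sysfs.
      # ExecStart = "/run/current-system/sw/bin/echo 1 > /sys/bus/pci/rescan";

      # Working form: the redirection runs inside a shell.
      # The sysfs path is an assumption; the log only says "the rescan sysfs file".
      ExecStart = "${pkgs.runtimeShell} -c 'echo 1 > /sys/bus/pci/rescan'";
    };
  };
}
```

The only meaningful difference between the two forms is that the second hands the whole command line to a shell, which is what performs the redirection.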

Changes (hosts/legolas/hardware-configuration.nix):

  • Switched mem_sleep_default from deep (S3) to s2idle (suspend-to-idle, i.e. modern standby). s2idle avoids the deep D3 path that hangs the drive.
  • Removed the no-op d3cold_allowed rule and the broken rescan service.
  • Added a correct runtime-PM-off udev rule on both the NVMe endpoint (0000:6e:00.0) and its parent root port (0000:00:1d.0), keeping the PCIe wakeup disable.
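
The changes above might look roughly like the following in hosts/legolas/hardware-configuration.nix. This is a sketch under assumptions, not the committed text: the PCI addresses come from this log, but the exact udev match keys and the form of the wakeup disable (power/wakeup set to "disabled") are guesses at how the rule is written.

```nix
{ ... }:
{
  # Default to suspend-to-idle instead of S3 (deep).
  boot.kernelParams = [ "mem_sleep_default=s2idle" ];

  services.udev.extraRules = ''
    # NVMe endpoint: keep runtime PM off (power/control=on means "always on").
    # The power/wakeup assignment is an assumed form of the wakeup disable.
    ACTION=="add", SUBSYSTEM=="pci", KERNEL=="0000:6e:00.0", ATTR{power/control}="on", ATTR{power/wakeup}="disabled"

    # Parent root port: same treatment, so the link is not runtime-suspended either.
    ACTION=="add", SUBSYSTEM=="pci", KERNEL=="0000:00:1d.0", ATTR{power/control}="on", ATTR{power/wakeup}="disabled"
  '';
}
```

Matching on KERNEL with the full PCI address keeps the rule scoped to these two devices, in contrast to the removed rule, which targeted an attribute that does not exist on them.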

Note: runtime PM was already off on the endpoint (power/control=on, which keeps the device out of runtime suspend); that alone never fixed the crash because system suspend uses a separate code path from runtime PM. After rebuilding, verify that the kernel command line (/proc/cmdline) contains mem_sleep_default=s2idle; the previously running generation still showed a stale duplicate deep entry.