Recover from GPU resets #601

Closed
opened 2024-03-16 03:41:32 +01:00 by sevz · 5 comments
Owner

Description

While it seems relatively easy to implement, it requires a considerable number of lines, and I'm not sure we need it.

Sway PR: https://github.com/swaywm/sway/pull/8063

Member

Sounds like a reasonable thing to need, and GPUs probably won't just stop resetting.

With dwl the worst thing that can happen is a full system freeze or a crash that takes down all your applications. So avoiding that is definitely an improvement.

I’ll add that whenever dwl hangs or crashes, dmesg almost always shows a GPU reset as the cause. So +1000 to handling it, even at the cost of LOC.

+1
I had a freeze just yesterday during a system update. This caused multiple issues, the most notable being that I couldn't get past cryptsetup, as it was one of the packages that got corrupted. One quick search for a spare USB stick and a borrowed PC later, I got my system fixed.

I highly recommend prioritising stability over LOC. I like my programs to be small, but in this case I think it is best to go forward with this as a PR.

Member

By the way, does anybody know whether the GPU (AMD in my case) actually resets on its own, or whether we need some kernel parameters (see https://github.com/ROCm/ROCm/issues/616)? And does SDDM or any other login manager have to handle the reset as well?

sevz referenced this issue from a commit 2024-06-21 00:58:54 +02:00
sevz closed this issue 2024-06-21 01:01:14 +02:00
janetski referenced this issue from a commit 2024-09-22 19:44:40 +02:00
tomofthecorn referenced this issue from a commit 2025-08-15 20:14:42 +02:00
Reference
dwl/dwl#601