Prevent repetitive crashes by only allowing NVDA to restart from a crash three times in two minute.#19175
Conversation
…closes nvaccess#19133) Check a file for the last known crash time, and number of crashes within that time. if that time is less than 1 minute with 3 prior crashes, don't start the crash subsystem. This fixes nvaccess#19133 by ensuring that the crash subsystem cannot restore NVDA after another such crash, ensuring that issues that could cause NVDA to crash at boot do not cause it to boot in circles forever.
* Just record everything in if statement, no need for variable * clean up order of statements so that the code reads more correctly.
70643d8 to
ddca7de
Compare
|
Trying to figure out if tests are flakey, because output is foreign to me and they won't locally work due to needing something on localhost. pre-commit.ci run |
|
@seanbudd I haven't worked with NVDA's system tests yet. I can successfully run NVDA and build a launcher, but can't figure out how to rerun this test suite. I'm still new to the github interfaces presubmit checks, my previous job used very custom workflows. Am I missing something obvious in this tests failure or is this simply a flake? |
|
pre-commit.ci run |
|
@derekriemer you've got the following error in the during installation: |
|
@CyrilleB79 how do I find logs for the tests? I'm probably missing something obvious. I can now repro locally, should have run some manual tests, but I'm surprised I can't find that traceback when I click the startup/shutdown test. |
… rest of the file. I've never used dataclasses to collect consts like this so I originally forgot to instanciate, then started to refactor it so that each record returned a CrashStats to compare with consts but then realized the reviewer meant use a dataclass as a collection to wrap the crash stats consts, for dependency injection and ease of use.
seanbudd
left a comment
There was a problem hiding this comment.
Thanks @derekriemer - just minor suggestions
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
…ash handling logic there, call it from watchdog. This is step 1 of a refactor, which I'll complete after I merge the PR, and can write a test of the watchdog's testable code. * The watchdog would be split into a watchdog directory, exported members in __init__.py * the cancelable thread stuff would be part of a Cancelable file. * the crash stats code would live in crash_stats.py * the crash handler would be in crash_handler.py * the remainder would get split as necessary, such as init, terminate, etc being in watchdog.py and reexported
|
@seanbudd I've addressed review comments. Not sure how to mark review ready for another pass in gh. |
There was a problem hiding this comment.
Pull Request Overview
This PR implements crash loop detection to prevent NVDA from repeatedly restarting when it crashes on startup. The mechanism tracks crash timestamps and disables automatic crash recovery when 3 or more crashes occur within a 120-second window.
Key changes:
- Created new
crashHandler.pymodule with crash tracking logic and moved crash handling code fromwatchdog.py - Modified
watchdog.pyto check for crash loops before installing the crash handler - Updated changelog to document the new feature
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| user_docs/en/changes.md | Added changelog entry documenting crash loop detection feature |
| source/watchdog.py | Refactored to use new crash handler module and added crash loop detection logic |
| source/crashHandler.py | New module containing crash handling logic, crash tracking, and crash event persistence |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Sean Budd <seanbudd123@gmail.com>
|
how in the world do I view the reason for the pyrite failure? It runs successfully on my machine, and when I clicked through that mess, the pyrite process seemed to exit with no output, even when I view the raw logs. Is there some magic github panel I don't know about? |
…asn't saved, but wasn't marking it clearly and I had to manually tell vscode to revert it so it looked like the utils.crashHandler was correct.
|
@derekriemer - you can view here: https://github.com/nvaccess/nvda/actions/runs/19284281460/attempts/1#summary-55142044998 or via running |
H101
|
|
In case I suspect a flake, I push an empty commit to re-run the tests: |
|
Thanks @derekriemer |
…hen running in secure mode (#19238) Fixes #19216 Follow-up #19175 ### Summary of the issue: NVDA was failing to start on secure screens, unless the serviceDebug global parameter was set. ### Description of user facing changes: NVDA works on secure desktops again. ### Description of developer facing changes: None ### Description of development approach: The issue was caused by `utils._crashHandler.CrashStats.crashStatsPath` assuming that `globalVars.appArgs.logFileName` would always represent a path. Notwithstanding this, it also wrote to disk, even when `NVDAState.shouldWriteToDisk` returned `False` (i.e. in scure mode or running from the launcher). 1. Added several defensive measures to the new `utils._crashHandler` module. 2. Changed `watchdog.initialize` to no longer add `utils._crashHandler.crashHandler` as an unhandled exception filter when running in secure mode. ### Testing strategy: Ran from source, and executed the following in the python console to ensure automatic restarts on crash still work as expected: ```py import ctypes;ctypes.windll.kernel32.DebugBreak() ``` Modified NVDA to allow the scratchpad in secure mode, built a self-signed launcher, and installed it. Hit `alt+control+delete`, and observed that NVDA worked as expected. Added the following global plugin to the scratchpad of the user and system config: ```py from globalPluginHandler import GlobalPlugin import ctypes from tones import beep from core import postNvdaStartup class GlobalPlugin(GlobalPlugin): def __init__(self): super().__init__() postNvdaStartup.register(self.crash) def crash(self): beep(500,100) ctypes.windll.kernel32.DebugBreak() ``` Restarted NVDA, and observed that it started, a tone was heard, and it crashed, and this only happened 4 times. Hit `alt+control+delete`, and observed that NVDA started, a tone was heard, NVDA crashed, and did not restart. ### Known issues with pull request: When NVDA crashes on the secure desktop, no feedback is given to the user. However, this is not new.
Closes #19333 Fix-up of #19175. Summary of the issue: NVDA was crashing at startup when using --no-logging command line flag. Description of user facing changes: NVDA can start again when using the --nologging flag. Description of developer facing changes: N/A Description of development approach: When loading the crash list file, also test if its path is None since this is the case when using --no-logging flag.
Follow-up #19175 ### Summary of the issue: If NVDA crashes when running from source, git picks up `nvda_crash_stats.txt` as an untracked file. ### Description of user facing changes: None ### Description of developer facing changes: Git will ignore `nvda_crash_stats.txt`, meaning it's much less likely to be accidentally committed and doesn't clutter our git output. ### Description of development approach: Added `source/nvda_crash_stats.txt` to `.gitignore`. ### Testing strategy: Ran `git status`. Verified that `nvda_crash_stats` was no longer showing as an untracked file. ### Known issues with pull request: None
Link to issue number:
Closes #19133
Summary of the issue:
If NVDA crashes somehow on startup, the watchdog restarts NVDA. This issue limits how many times the watchdog will atttempt to recover from crashes by preventing the crash handler from booting if the application has been in a boot loop of 3 or more crashes in one minute.
Description of user facing changes:
When NVDA gets itself into a crash loop, which is very rare, it can crash many times very rapidly. So rapidly that a user's keyboard is effectively locked from use. This issue prevents NVDA from crashing on repeat. Since NVDA often boots at startup, failure to gracefully bail can lock the users computer up every time the user logs in.
Description of developer facing changes:
The watchdog prevents crash handling code from booting on the fourth crash, by simply recording the timestamp of the last 3 crashes in a text file along side the crash dmp file, and then using the timestamps in a rolling window to decide if NVDA should start the crash handler, or gracefully allow shutdown.
Description of development approach:
Testing strategy:
Tested with a portable copy locally. I don't see any tests for the watchdog.
Known issues with pull request:
Code Review Checklist: