Protect functions from using terminated OneCore instances by seanbudd · Pull Request #13569 · nvaccess/nvda

seanbudd · 2022-04-01T00:24:38Z

Link to issue number:

N/A

Summary of the issue:

NVDA can crash when resetting the configuration to factory defaults.
This is because NVDA terminates OneCore when resetting the config.
However, a race condition exists with the async function speak, where the python callback uses a terminated OneCore instance.

Steps to reproduce:

Run NVDA
In the Keyboard section of the Preferences dialog, turn "Speak command keys" on.
Reset your configuration to factory defaults
Note that NVDA crashes

Description of how this pull request fixes the issue:

Protects usages of the OneCore instance by confirming the instance has not been terminated.
Wait until speech callbacks are completed before re-initializing.

Testing strategy:

Manual testing

Run NVDA
In the Keyboard section of the Preferences dialog, turn "Speak command keys" on.
Reset your configuration to factory defaults
Note that NVDA does not crash.

Known issues with pull request:

Other, somewhat related bugs, are logged as Errors during this process. See: #13580

Change log entries:

Bug fixes

- NVDA no longer crashes when resetting the NVDA configuration to factory defaults while speak command keys is on. (#13569)

Code Review Checklist:

Pull Request description:
- description is up to date
- change log entries
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
API is compatible with existing add-ons.
Documentation:
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English

LeonarddeR · 2022-04-01T06:09:15Z

Is it necessary to apply a similar approach to the OCR code?

seanbudd · 2022-04-04T03:56:52Z

Is it necessary to apply a similar approach to the OCR code?

Maybe, but for a given handle uwpOcr_initialize and uwpOcr_terminate are always called sequentially. See uwpOcr.UwpOcr.recognize. I don't think you can reproduce a crash with OCR, at least from NVDA core.

Edit: Actually, this could be the cause of #9472 and #13189

AppVeyorBot · 2022-04-08T06:38:22Z

Build execution time has reached the maximum allowed time for your plan (60 minutes).

See test results for failed build of commit 4b1193ab51

feerrenrut · 2022-04-12T03:08:14Z

 using namespace winrt::Windows::Foundation::Collections;
 using winrt::Windows::Foundation::Metadata::ApiInformation;

+std::recursive_mutex InstanceManager::_instanceStateMutex;


It seems like it is only waitUntilReadyForInitialization / initializeNewInstance that requires this to be recursive?

feerrenrut · 2022-04-12T03:12:16Z

+	/*
+	When initialized, increases the count of active speech threads
+	waiting on a callback.
+	When exiting the scope of initialization, decreases the count
+	of active speech threads and deletes the OneCore instance if it has
+	been terminated and no speech callbacks are waiting.
+	*/


C++ docs (like python doc strings) are typically written on the line before the declaration, rather than within the definition.

It is also helpful to have reasoning for the implementation within the definition.

feerrenrut · 2022-04-12T03:18:21Z

+class InstanceManager {
+private:
+	static std::recursive_mutex _instanceStateMutex;


This and SpeakThreadGuard don't need to be in this header. They aren't used outside of the oneCoreSpeech.cpp compilation unit, and don't need to have their compilation tied to oneCoreSpeech. We'll likely need them separate for OCR.

feerrenrut · 2022-04-12T03:35:52Z

+	Waits until an instance has been fully terminated,
+	and then (once speech callbacks have finished) deleted.
+	*/
+	std::lock_guard g(_instanceStateMutex);


This can cause a deadlock. This is acquiring the lock potentially while there is an active SpeakThreadGuard due to OcSpeech::speak which can't terminate until it acquires the lock to delete the instance and signal _readyForInitialization

Instead, I'd make waitUntilReadyForInitialization transfer ownership of the lock back to the caller after it is acquired safely.

seanbudd · 2022-04-19T05:28:50Z

Closing as a new approach is required

Supersedes PR #13569 Summary of the issue: NVDA was crashing when resetting the configuration to factory defaults. NVDA terminates the ocSpeech native module when resetting the config, however there is a pending callback to a now deleted python function. Steps to reproduce: - Run NVDA - In the Keyboard section of the Preferences dialog, turn "Speak command keys" on. - Reset your configuration to factory defaults - Note that NVDA crashes Description of fix: Changes the design of the ocSpeech module. Both initialization and termination are blocked if a callback is about to happen. The "tokens" for prior initialization of ocSpeech are collected by the ocSpeech system. When an async task reaches completion, the origin token is checked for validity, the callback is not called for a now invalid token. Initialization now requires the callback to be specified.

seanbudd added 2 commits April 1, 2022 11:15

Throw runtime error if OneCore has terminated

7f83cfa

Protect the callback to python to prevent OneCore crash

769f3b6

seanbudd requested a review from a team as a code owner April 1, 2022 00:24

seanbudd requested review from feerrenrut and michaelDCurran and removed request for a team April 1, 2022 00:24

seanbudd changed the title ~~Prevent one core config crash~~ Protect functions from using terminated OneCore instances Apr 1, 2022

seanbudd marked this pull request as draft April 1, 2022 00:32

seanbudd self-assigned this Apr 1, 2022

Wait for speech to finish before terminating

e2f694f

seanbudd marked this pull request as ready for review April 4, 2022 04:51

Fix includes

25cba61