Unicode normalization follow up, adding character navigation and several fixes#16622
Conversation
|
Hi @LeonarddeR - just noting that #16616 hasn't been triaged yet. |
|
Thanks for pointing that out. Note that I was aware of the standpoint to delay this pr to 2024.4, but others might not have been. |
|
@seanbudd this seems quite stable, which points do you think need still clarification? If you need more community feedback, this needs to be merged into alpha. I don‘t think there is anythink open regarding speech in #16616. you can also keep this behavior in alpha until you think it is stable enough to bring it into Beta. At least this is how we dealt with new features in the past as well, see e.g. cancellable speech which was enabled only in alpha for a longer period of time until it finally reached the stable version. |
|
@seanbudd would you accept milestone 2024.3 for this PR if we change default value to disabled (see #16624 (comment))? This would at least avoid to ship 2024.3 with a buggy feature, closing #16624. A subsequent PR for 2024.4 could then switch the default value to enabled. |
|
@CyrilleB79 - yes as requested in #16624 (comment), however @LeonarddeR seems to suggest in #16624 (comment) that this is not possible for whatever reason |
|
@seanbudd I must have misunderstood you then. I thought you were asking in #16624 (comment) whether it would be possible to fix #16624 without changing character navigation behavior. That would be very difficult. If you'd agree changing this pr to disable normalization again by default but yet change character navigation to also normalize, that is certainly possibly and I can change the pr as requested by @CyrilleB79. |
|
That would be great yes. The normalization fix can go into 2024.3, but we don't want to turn it on by default until it has had some wider testing in a release |
See test results for failed build of commit 8e0d3c7fdb |
See test results for failed build of commit de6705d08d |
|
Tip For best results, initiate chat on the files or code changes. @LeonarddeR, after reviewing the implementations of Observations:
Proposed Refactoring:To reduce duplication, you can abstract the common parts of these functions into helper functions. Here’s a proposed refactoring:
This approach not only reduces duplication but also enhances maintainability by centralizing the error handling and memory management, making the code cleaner and easier to manage. Let me know if this refactoring aligns with your expectations or if you need further modifications! Analysis chainScripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Extract the implementations of calculateCharacterBoundaries and _calculateUniscribeOffsets from textUtils.cpp
ast-grep --lang cpp --pattern $'$_ $_(calculateCharacterBoundaries, _calculateUniscribeOffsets)($$$) {
$$$
}' --file textUtils.cpp
Length of output: 327 Script: #!/bin/bash
# Correctly extract the implementations of calculateCharacterBoundaries and _calculateUniscribeOffsets from textUtils.cpp
ast-grep --lang cpp --pattern $'$_ $_(calculateCharacterBoundaries, _calculateUniscribeOffsets)($$$) {
$$$
}'
Length of output: 112 Script: #!/bin/bash
# Fetch the entire content of textUtils.cpp
cat textUtils.cpp
Length of output: 65 Script: #!/bin/bash
# Search for the textUtils.cpp file in the repository to find the correct path
fd textUtils.cpp
Length of output: 49 Script: #!/bin/bash
# Fetch the entire content of textUtils.cpp from the correct path
cat nvdaHelper/local/textUtils.cpp
Length of output: 4852 |
|
@coderabbitai I didn't like your suggestions much. I refactored it in the last commit. |
|
Tip For best results, initiate chat on the files or code changes. @LeonarddeR, I've reviewed the latest changes in Using If there's anything specific you'd like feedback on or further improvements you're considering, feel free to let me know! Great work on these refinements. Analysis chainScripts executedThe following scripts were executed for the analysis: Script: #!/bin/bash
# Fetch the latest changes made to the textUtils.cpp file in the last commit
git diff HEAD~1 -- nvdaHelper/local/textUtils.cpp
Length of output: 3463 |
|
@LeonarddeR -is this ready for re-review? I'd encourage any change of policy/usage of feature flags to be moved to a separate discussion. |
|
@seanbudd Yes, I think the unit tests, including some edge cases, seem to prove that this works as advertised. |
This is because people didn't pay attention to the actual purpose of feature flags when reviewing the coresponding PRs. I think it makes sense to clearly not encourage using these feature flags outside of the advanced settings panel anymore, and state the actuual purpose in the policy docs. |
|
is this ready for re-review? |
|
Yes, sorry. Forgot to mark it as ready |
…symbol definition, the symbol replacement is spoken (#16950) Fixup for #16622 Summary of the issue: When unicode normalization of a character (e.g. ·) resulted into a character that had a symbol definition (e.g. ·, middle dot), the symbol definition wasn't applied to the normalization. This resulted in NVDA speaking nothing or only the word normalized. Description of user facing changes NVDA will now properly speak the · character (Greek Ano Teleia) as middle dot when normalizing. This also applies to other characters where normalization results in a character that's part of the symbol dictionary. Description of development approach When normalizing a character, ensure it is thrown through characterProcessing.processSpeechSymbol.
Link to issue number:
Fixes #16622
fixes #16640
Summary of the issue:
It has been discussed that normalization would also be helpful for character navigation. There's also an issue where character descriptions and symbol pronunciation didn't work correctly because normalization took place before symbol processing. Furthermore, for the UnicodeNormalizationOffsetConverter used for braille, it was discovered that diffing didn't turn out to be accurate enough.
Description of user facing changes
Report normalized when navigating by characterin the speech settings.Description of development approach
SuppressUnicodeNormalizationCommandthat allows you to suppress global normalization within a speech sequence. This command is used when creating a spelling sequence, because spelling has its own normalization logic now. It also ensures that when spelling a character or providing a character description (i.e. NVDA+. double press), normalization does not occur. It can even be used to disable character normalization altogether if necessary.UnicodeNormalizationOffsetConverterto use a new function in NVDAHelper local that uses uniscribe to give a list of offsets for character boundaries. This allows us to split a string into glyphs and then apply normalization to every glyph, which results in more reliable offset calculation and less complex code.Testing strategy:
a accutewhen navigating by character, thereby improving that behavior as wellKnown issues with pull request:
None known
Code Review Checklist:
Summary by CodeRabbit
New Features
Enhancements
Tests
Documentation