Skip to content

QuickNav Text paragraph navigation#16031

Merged
seanbudd merged 16 commits into
nvaccess:masterfrom
mltony:textNav
Feb 7, 2024
Merged

QuickNav Text paragraph navigation#16031
seanbudd merged 16 commits into
nvaccess:masterfrom
mltony:textNav

Conversation

@mltony

@mltony mltony commented Jan 11, 2024

Copy link
Copy Markdown
Contributor

Link to issue number:

Closes #15998

Summary of the issue:

Add QuickNav text paragraph navigation

Description of user facing changes

  • Added QuickNav gesture p to jump to next/previous text paragraph in browse mode.

Description of development approach

  • In BrowseModeTreeInterceptor added function _iterSimilarParagraph that finds next/previous paragraph that satisfies condition defined by a lambda function. I will reuse this function in later PRs to implement vertical navigation.
  • Added a new clause in BrowseModeTreeInterceptor._quickNavScript() that handles text paragraph case and calls function defined in the previous bullet.
  • For text criteria I ended up implementing a user configurable regex.
    • Its initial value is defined in DEFAULT_TEXT_PARAGRAPH_REGEX variable in configSpec.py:11.
    • It is user-configurable in Browse Mode page in NVDA options.
    • The rationale for using a regex instead of plain-text search is to reduce the number of false positives.
    • For example, for period character we check that it follows a word character \w and is followed by space character or end of string. There are a few more rules for edge cases in the code.
    • I don't include comma, colon and semicolon punctuation marks in the regex due to high number of false positives.
    • The regex also accounts for CJK languages by checking for full-width punctuation marks.

Testing strategy:

  • Tested manually on a bunch of webpages in Google Chrome and Mozilla Firefox.
  • Included a system test.

Known issues with pull request:

N/A

Code Review Checklist:

  • Documentation:
    • Change log entry
    • User Documentation
    • Developer / Technical Documentation
    • Context sensitive help for GUI changes
  • Testing:
    • Unit tests
    • System (end to end) tests
    • Manual testing
  • UX of all users considered:
    • Speech
    • Braille
    • Low Vision
    • Different web browsers
    • Localization in other languages / culture than English
  • API is compatible with existing add-ons.
  • Security precautions taken.

@mltony

mltony commented Jan 11, 2024

Copy link
Copy Markdown
Contributor Author

Also looking for feedback from the international community on how does this PR do with your languages. I'll leave this PR as a draft for a few days to collect and address this feedback. Specifically, tell us how does this PR work with your language and also would be great to hear punctuation rules in your language if different from Latin. Here are major languages that I'm interested in:

  • CJK languages. I included Full-width CJK punctuation marks but I don't speak any CJK languages, so wondering if any extra conditions need to be added for CJK clause.
  • Languages written in Arabic script - Arabic, Farsi.
  • Languages of India.
    Also feel free to provide feedback for any other languages you speak.

@Adriani90

Copy link
Copy Markdown
Collaborator

cc: @cary-rowen, @nishimotz, @dpy013, @josephsl for chinese, japanese and corean at least.

@amirsol81

Copy link
Copy Markdown

@mltony P and SHIFT+P conflict with the go to next/previous dialog commands in your BrowserNav add-on. Any chance of also updating BrowserNav?

@mltony

mltony commented Jan 14, 2024

Copy link
Copy Markdown
Contributor Author

@amirsol81, Right, I will soon remove P command from BrowserNav to avoid keystroke collisions. You should use built-in QuickNav O instead of BrowserNav P which does the same thing.

@seanbudd seanbudd added the conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review. label Jan 15, 2024
@mltony mltony mentioned this pull request Jan 16, 2024
5 tasks
@mltony mltony marked this pull request as ready for review January 16, 2024 01:12
@mltony mltony requested review from a team as code owners January 16, 2024 01:12
Comment thread source/browseMode.py Outdated
Comment thread source/browseMode.py Outdated
Comment thread source/browseMode.py Outdated
Comment thread source/gui/settingsDialogs.py Outdated
@nishimotz

Copy link
Copy Markdown
Contributor

The Windows Uniscribe API is used for multilingual processing of word and sentence segmentation.
It is likely already being utilized in NVDA's textInfos.
This API might also be applicable for paragraph processing.

Comment thread source/browseMode.py Outdated
Comment thread source/browseMode.py Outdated
Comment thread source/browseMode.py Outdated
Comment thread source/browseMode.py Outdated
Comment thread source/gui/settingsDialogs.py Outdated
Comment thread source/gui/settingsDialogs.py Outdated
Comment thread source/config/configSpec.py Outdated
Comment thread tests/system/robot/chromeTests.py Outdated
Comment thread user_docs/en/changes.t2t Outdated
Comment thread user_docs/en/userGuide.t2t Outdated
@seanbudd seanbudd marked this pull request as draft January 23, 2024 02:16

@CyrilleB79 CyrilleB79 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had pending comments from a week ago; sorry, I release it only now.
Regarding my general impression on the feature, I will add it in a separate message.

Comment thread source/config/configSpec.py Outdated
# We have a positive lookBehind \w that resolves to a text character in any language,
# coupled with negative lookBehind \d that excludes digits.
lookBehind=r'(?<=\w)(?<!\d)',
# In some cases quote or closing parenthesis might appear right before or right after text punctuation.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't single quotes sometimes used in English? ( ' or ’ )

Not being a native English writer/reader, I have not the answer.

@mltony mltony Jan 25, 2024

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neither am I. In my understanding single quotes are typically used as apostrophe in words that are glued together, like I'm or don't and also in what's left of Genetive case in English, like Martin's - I think that's called posessive form. I can come up with a sentence that would end with apostrophe followed by period, like:

Yesterday caucus was republicans'.

But not sure if that's even correct grammar, and if it is, then this type of sentences would be exceedingly rare, so probably not worth including into regex.

Comment thread source/config/configSpec.py Outdated

DEFAULT_TEXT_PARAGRAPH_REGEX = (
r"({lookBehind}{optQuote}{punc}{optQuote}{optWiki}{lookAhead}|{cjk})".format(
# Look behind clause ensures that we have a text character before text punctuation mark.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumption is not always correct. For example in French:
The correct usage is to use no-break space before question mark (?), exclaim (!), semi-colon (;) or colon (:). A very common usage is also to replace this no-break space by a normal space, or even with no space at all due to the influence of English writing.
E.g:
Comment ça va ?
or
Comment ça va ?
or
Comment ça va?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I right that in French you don't put a space before period, only before question mark and exclamation mark? If so then I adjusted regex by relaxing condition for exclamation mark and question mark. Hope that should work better.

Comment thread source/gui/settingsDialogs.py Outdated
wx.OK | wx.ICON_ERROR,
self,
)
self.textParagraphRegexEdit.SetFocus()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to #15897, focusing the invalid field causes an issue in case you have changed panel when you validate the settings dialog.
Thus, this instruction should be removed.

Comment thread source/gui/settingsDialogs.py Outdated

# Translators: This is the label for a textfield in the
# browse mode settings panel.
textParagraphRegexLabelText = _("Regular expression for text paragraph navigation")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this parameter should be here or in advanced settings. Who is likely to change this value?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough - moved this to advanced settings. Not sure who would want to change this, but it seems to me that we'd rather keep this configurable.

Comment thread source/gui/settingsDialogs.py Outdated
wxCtrlClass=wx.TextCtrl,
size=(self.scaleSize(300), -1),
)
self.textParagraphRegexEdit.SetValue(config.conf["virtualBuffers"]["textParagraphRegex"])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the value of the config be checked for validity before being set to this field? If not valid, an error could be logged and the value reset to default.

Of course, a wrong value would land in this field only if the user has manually edited the config file or the config through a command in the Python Console, which are not allowed scenarios.
But this would make this dialog more robust for any tester.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user edits NVDA config file manually - they're on their own risk. I already have a validator in isValid() function.

Comment thread source/browseMode.py
qn(
"textParagraph", key="p",
# Translators: Input help message for a quick navigation command in browse mode.
nextDoc=_("moves to the next text paragraph"),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if I have no better suggestion for now, I have the feeling that "Text paragraph" is too vague (here as well as in the GUI, the Change log and the User Guide).

There should be at least a place, maybe only in the User Guide, describing what a "Text paragraph" is really regarding this feature. If I understand correctly, as part of this functionality, a "Text paragraph" is a paragraph containing at least one sentence.

And, if possible, a more precise description of this option would be nice. @XLTechie, any idea?

"Move to the next text paragraph containing at least one sentence" may be more precise, but maybe something better should be thought.

Also in the user guide, an indication that it's an algorithm trying to cover the majority of situations, but which may fail in some rare cases would be interesting.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new section in the documentation explaining use cases and definition of text paragraph and also added a sentence regarding not being 100% accurate.

Comment thread source/gui/settingsDialogs.py Outdated
@CyrilleB79

Copy link
Copy Markdown
Contributor

Regarding the general experience of this feature and after testing with the add-on, I am not very comfortable using this feature. In many cases, I fear to miss some useful information. And even many times, I miss it, e.g.:

  • Hheadings have usually no punctuation and are skipped
  • In forums where people write quite informally and/or with spelling mistakes, I sometimes miss some short messages with no punctuation or misspelt (e.g. dot not followed by a space as usually required).

Though, I acknowledge the fact that this feature is needed. And the command to jump to a non-link block of text (N key) is probably less useful today than in the web as it was 20 years ago.

To summarize, N key catches too much text, and the current feature does not catch enough. I wonder if we could still enhance the algo to find interesting text. But I have no better suggestion, hence this comment so that any suggestion is made by other people if they have.

If no better suggestion is found, I am still in favor of merging this PR, since it seems to be an improvement for many people.

@CyrilleB79

Copy link
Copy Markdown
Contributor

By the way, if the regexp for a sentence separator is validated in this PR, we should consider it as also valid for #8518 if it is implemented one day.

@CyrilleB79

Copy link
Copy Markdown
Contributor

Writing also here:

Regarding key assignation:
My personal opinion is that, if #16050 works as advertised, it would be much more interesting to have default assigned keys for it and keep the commands of the current PR unassigned.

Maybe we should merge this PR and #16050 with unassigned gestures and test them during alpha stage. Then at the end of alpha stage, we can figure if one or both of these feature need to have an assigned gesture and add them.

mltony and others added 7 commits January 24, 2024 17:22
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
@AppVeyorBot

Copy link
Copy Markdown

See test results for failed build of commit b594079fd2

@CyrilleB79

Copy link
Copy Markdown
Contributor

I would have detailed comments to do on this PR, but first, there are more general questions to answer.

Question 1: Naming / documenting
As said earlier "Text paragraph navigation" is a bit general. And I feel that it may be easily confused with normal paragraph navigation (control+up/downArrow), which is configurable with the paragraph style parameter.
I have not been able to provide a better name but really would like to try to find one. The idea is: Navigation to paragraph containing an interesting content / containing writing.
@XLTechie have you any suggestion to name or describe this feature? Are you a user of the add-on?

In this PR, we want to navigate to paragraphs containing at least one sentence. The sentences are detected by the mean of a regexp detecting their end, mainly the punctuation. This regexp is made configurable in advanced options.

Question 2: In NVDA there is already an algorithm using the detection of sentence end in documentNavigation/sentenceHelper.py and used for the existing paragraph navigation feature. I think that both algoritms (or at least regex) should be commonalized.

Question 3: Does NV Access (cc @seanbudd) agree with the configurability of this regex in the advanced options?
For existing paragraph navigation, the regex to detect the end of sentence was not configurable, but it was not a very important point of the feature.

Question 4: Could this feature be generalized and be available not only in browse mode but also in any text? E.g. Word (browse mode off), WordPad. If it's too difficult though, it may not be worth spending time on this point since the main use case remains from far the navigation on the web.

@XLTechie

XLTechie commented Jan 25, 2024 via email

Copy link
Copy Markdown
Collaborator

@mltony

mltony commented Jan 25, 2024

Copy link
Copy Markdown
Contributor Author

@CyrilleB79

In many cases, I fear to miss some useful information.

That's a valid concern, however, the way I view this feature is that it allows you to quickly find the beginning of the article where you would want to start reading. Then you can use Control+up/down commands to read if you don't want to miss a single paragraph. TBH I don't use this command to actually read things - one example is that it skips paragraphs ending with colon - and I don't want to include colon in the regex since that creates too many false positives e.g. on wikipedia.

By the way, if the regexp for a sentence separator is validated in this PR, we should consider it as also valid for #8518 if it is implemented one day.

For actual sentence navigation there will be a few more tweaks needed, but yes, we can keep single regexp for both use cases.

if #16050 works as advertised, it would be much more interesting to have default assigned keys for it and keep the commands of the current PR unassigned.

Style navigation is a much more specialized command for advanced users. That's according to TextNav usage vs style navigation in BrowserNav. That's why I prefer text paragraph navigation to have a default key binding.

Naming

I am open to this as long as we can find a better name for it.

And I feel that it may be easily confused with normal paragraph navigation (control+up/downArrow), which is configurable with the paragraph style parameter.

I added a sentence in the doc emphasizing the difference.

Question 2: In NVDA there is already an algorithm using the detection of sentence end in documentNavigation/sentenceHelper.py and used for the existing paragraph navigation feature. I think that both algoritms (or at least regex) should be commonalized.

Let me prepare a PR for sentence navigation - it will come with the fanciest regex to cover most edge cases, then we can consider converging on that one.

Question 4: Could this feature be generalized and be available not only in browse mode but also in any text? E.g. Word (browse mode off), WordPad.

This feature should work in word in browse mode. WordPad doesn't appear to have browse mode, so maybe that's a separate question we can consider - why not bring browse mode to wordPad? Although given its limited usage it's probably not a very important issue.

@mltony

mltony commented Jan 25, 2024

Copy link
Copy Markdown
Contributor Author

@XLTechie

I can imagine this feature morphing into an AI recognition mechanism one day that will be much more accurate, but that isn't happening today.

Interesting that you mention that, because I had some thoughts along the same lines. A prototype can be implemented even today - we can ask ChatGPT to skip all the clutter - we'd have to define clutter for it, and then ask it to identify first readable paragraph in some way we can parse. Or alternatively ask ChatGPT to rewrite the page removing all the clutter. I wish OpenAI could create a free account for NVAccess - this way we'd be able to bring these features to all users without pushing them to set up their developer accounts, which would be too complicated for the majority. Or alternatively we can consider running LLM locally - but I haven't looked much into this, I suspect it's going to be too slow for a typical consumer laptop without GPU acceleration.

@mltony mltony marked this pull request as ready for review January 31, 2024 00:36
Comment thread source/browseMode.py Outdated
Comment thread source/config/configDefaults.py Outdated
Comment thread user_docs/en/userGuide.t2t Outdated
Comment thread user_docs/en/userGuide.t2t Outdated
Comment thread user_docs/en/userGuide.t2t Outdated

@seanbudd seanbudd left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mltony

@Qchristensen Qchristensen left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks interesting and useful, thanks!

@seanbudd seanbudd merged commit 9717b81 into nvaccess:master Feb 7, 2024
@nvaccessAuto nvaccessAuto added this to the 2024.2 milestone Feb 7, 2024
@XLTechie

XLTechie commented Feb 7, 2024 via email

Copy link
Copy Markdown
Collaborator

@CyrilleB79

Copy link
Copy Markdown
Contributor

Again, I have forgotten to publish pending comments on this PR...

Here are they:

  1. Use "end of sentence" instead of "text paragraph navigation", i.e. the mean instead of the feature for which it is used:
    • in advanced settings: label "Regular expression for text paragraph navigation" -> "Regular expression for sentence end detection"; the User Guide may indicate what this regex is used for, i.e. text paragraph navigation, sentence navigation, etc.
    • in the code: punctuationMarksRegex -> endOfSentenceRegex
    • in the config: "textParagraphRegex" -> "endOfSentenceRegex"
    • etc.

It's clearer to read the code

  1. I hade asked:

In NVDA there is already an algorithm using the detection of sentence end in documentNavigation/sentenceHelper.py and used for the existing paragraph navigation feature. I think that both algoritms (or at least regex) should be commonalized.

This factorization has not been done in this PR.

  1. "browse mode settings panel" instead of "advanced settings panel" mentioned in a translator's comment
    @mltony, I have not followed all and do not remember. Do you plan to implement sentence navigation soon, or is there something blocking (other than your availability of course)?
    If you implement sentence navigation for NVDA 2024.2, fixing these issues may fit in the sentence navigation PR. If not, maybe it would be worth fixing this for NVDA 2024.2, especially for the name in the config, to avoid compatibility issue if we change it in the future.

@mltony

mltony commented Feb 7, 2024

Copy link
Copy Markdown
Contributor Author

@CyrilleB79,
I plan to work on sentence navigation in the next couple of weeks. I'll address these comments then.

Nael-Sayegh pushed a commit to Nael-Sayegh/nvda that referenced this pull request Feb 15, 2024
Closes nvaccess#15998

Summary of the issue:
Add QuickNav text paragraph navigation

Description of user facing changes
Added QuickNav gesture p to jump to next/previous text paragraph in browse mode.
Description of development approach
In BrowseModeTreeInterceptor added function _iterSimilarParagraph that finds next/previous paragraph that satisfies condition defined by a lambda function. I will reuse this function in later PRs to implement vertical navigation.
Added a new clause in BrowseModeTreeInterceptor._quickNavScript() that handles text paragraph case and calls function defined in the previous bullet.
For text criteria I ended up implementing a user configurable regex.
Its initial value is defined in DEFAULT_TEXT_PARAGRAPH_REGEX variable in configSpec.py:11.
It is user-configurable in Browse Mode page in NVDA options.
The rationale for using a regex instead of plain-text search is to reduce the number of false positives.
For example, for period character we check that it follows a word character \w and is followed by space character or end of string. There are a few more rules for edge cases in the code.
I don't include comma, colon and semicolon punctuation marks in the regex due to high number of false positives.
The regex also accounts for CJK languages by checking for full-width punctuation marks.
seanbudd pushed a commit that referenced this pull request Mar 8, 2024
Fix-up of #16031
Fix-up of other older PRs.

Summary of the issue:
Various little issues in advanced settings panel:

in advanced settings, option regex for text paragraph navigation feature:
restoring default advanced settings do not restore this field
a wrong regex syntax in this field do not trigger the error message as expected
the code to restore defaults in advanced settings panel is working by chance for some combo boxes, just because the default option is the first one.
Description of user facing changes
restoring advanced settings will now restore the Regex for Text paragraph navigation option
a wrong regex syntax for the Regex for Text paragraph navigation will now trigger an error message
Also while at it, fixed:

a translator's comment
a call to self.Layout that probably needs to be called after all the controls are created
Adriani90 pushed a commit to Adriani90/nvda that referenced this pull request Mar 13, 2024
Closes nvaccess#15998

Summary of the issue:
Add QuickNav text paragraph navigation

Description of user facing changes
Added QuickNav gesture p to jump to next/previous text paragraph in browse mode.
Description of development approach
In BrowseModeTreeInterceptor added function _iterSimilarParagraph that finds next/previous paragraph that satisfies condition defined by a lambda function. I will reuse this function in later PRs to implement vertical navigation.
Added a new clause in BrowseModeTreeInterceptor._quickNavScript() that handles text paragraph case and calls function defined in the previous bullet.
For text criteria I ended up implementing a user configurable regex.
Its initial value is defined in DEFAULT_TEXT_PARAGRAPH_REGEX variable in configSpec.py:11.
It is user-configurable in Browse Mode page in NVDA options.
The rationale for using a regex instead of plain-text search is to reduce the number of false positives.
For example, for period character we check that it follows a word character \w and is followed by space character or end of string. There are a few more rules for edge cases in the code.
I don't include comma, colon and semicolon punctuation marks in the regex due to high number of false positives.
The regex also accounts for CJK languages by checking for full-width punctuation marks.
Adriani90 pushed a commit to Adriani90/nvda that referenced this pull request Mar 13, 2024
Fix-up of nvaccess#16031
Fix-up of other older PRs.

Summary of the issue:
Various little issues in advanced settings panel:

in advanced settings, option regex for text paragraph navigation feature:
restoring default advanced settings do not restore this field
a wrong regex syntax in this field do not trigger the error message as expected
the code to restore defaults in advanced settings panel is working by chance for some combo boxes, just because the default option is the first one.
Description of user facing changes
restoring advanced settings will now restore the Regex for Text paragraph navigation option
a wrong regex syntax for the Regex for Text paragraph navigation will now trigger an error message
Also while at it, fixed:

a translator's comment
a call to self.Layout that probably needs to be called after all the controls are created
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

conceptApproved Similar 'triaged' for issues, PR accepted in theory, implementation needs review.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Feature request: add QuickNav command to jump to next/previous text paragraph