Return "blank" if text empty due to processing by SamKacer · Pull Request #13483 · nvaccess/nvda

SamKacer · 2022-03-15T10:01:47Z

closes #13431 (yay! a palindrome)

Link to issue number: #13431

Summary of the issue:

After processing a text for dictionary substitutions and symbol level, the result might be an empty string, which leads to nothing being spoken at all (not even "blank"). Something should always be spoken when reading text, since nothing being spoken gives the appearance of NVDA freezing.

Description of how this pull request fixes the issue:

When a speech sequence is submitted to speech.speak, if all strings of the sequence become empty due to symbol level and dictionary substitution processing, then the sequence is considered blank and "blank" is appended to it.

The function now also takes an additional argument, suppressBlanks, which is false by default. This parameter allows callers to suppress the "blank" being appended to the sequence, even if the conditions are met. The main intention is to suppress this behavior during say all. Usages of speech.speak have been updated to specify this parameter when appropriate.

Testing strategy:

Manually tested on following examples using NVDA built by appveyor. Also updated relevant system tests.

Some lines to test changes on:

# 2 tabs:
		
# 4 spaces:
    
***
hello *** world

Before changes:

at symbol level < All and indentation reporting off, 2 tab line, nothing is spoken
- when indentation reporting is on, read as "2 tab blank"
- when indentation off and symbol level Allread as "tab tab"
in contrast, 4 space line is spoken as "blank" with indentation reporting off, and "4 space blank" when turned on
(Note: as you can read in Tab characters should be considered blank #13431, this is because by isBlank() it is considered blank, however tabs cant be considered blank because they are actually spoken when symbol level is All, whereas spaces and other `BLANK_CHUNK_CHARS are not spoken at any symbol level)
for 3 stars line, nothing is spoken at symbol level None
also when reading by word, the 3 stars between hello and world have nothing spoken

after changes:

now when indentation reporting off and symbol level less All, 2 tab line spoken as "blank", other cases same as before
4 space line spoken same as before in all scenarios
3 star line spoken as "blank" when symbol level is None
when moving by word, 3 stars between hello and world spoken as "blank"

Known issues with pull request:

I am thinking that "blank" has been presented to translators as meaning an empty line, so maybe a lot of languages present it as literally "empty line". In which case, the last example of "hello *** world", when moving by word and the *** is read as "empty line" it could be a little confusing.
Another possible issue is that addons that use the speech.speak function might get different output than they used to. Addons are able to set suppressBlanks=True to get the previous behavior. In many cases this updated behavior could be welcomed by addon authors. A way to ensure API compatibility is maintained would be to have the default value be True instead of False, but then the new behavior, which should be correct in majority of cases, isn't automatically used.
A final consideration is whether appending "blank" to the speech sequence is the right solution, because if the sequence has an empty string between two commands such as [BeepCommand, "", BeepCommand], then "blank" will be spoken after the two commands are executed whereas the text it is replacing would have been spoken before the last command was executed.
- A better solution could be to instead of appending "blank", to replace the last empty string with "blank" instead, which would speak "blank" at the expected position, but this wouldn't cover cases when multiple empty strings and command are interleaved such as ["", Command, "", Command].

Change log entries:

New features
Changes
Bug fixes

text sequences that are empty due to dictionary or symbol level processing are spoken as "blank" instead of not being spoken at all (Tab characters should be considered blank #13431)

For Developers

Code Review Checklist:

Pull Request description:
- description is up to date
- change log entries
Testing:
- Unit tests
- System (end to end) tests
- Manual testing
API is compatible with existing add-ons.
Documentation:
- User Documentation
- Developer / Technical Documentation
- Context sensitive help for GUI changes
UX of all users considered:
- Speech
- Braille
- Low Vision
- Different web browsers
- Localization in other languages / culture than English

Adriani90 · 2022-03-15T11:24:30Z

I would rather prefer a sound introduced via speech refactor feature in NVDA which indicates that this symbol or text is not spoken due to symbol level or missing dictionary entry. Maybe two sounds, one for symbols and one for empty text linked to dictionary rule. Speaking "blank" could introduce new confusing problems, especially during navigation or say all.

SamKacer · 2022-03-15T17:28:05Z

@Adriani90 I don't think just a sound would be enough. It could be an alternative to the text spoken like for indentation reporting there is the option of speech or beeps, but a speech option would have to be there, since that is what some users might prefer.

Thanks for bringing up say all, since I forgot to consider that. Testing it manually just now, when doing say all and there is a line that is blank due to processing, such as the line with just two tabs, then it is read as "blank" during say all. This is wrong since lines being blank shouldn't be read during say all.

I will have to rethink how to implement this, so that say all is taken into account, since the last thing I want to do is fix one bug only to introduce another.

I'm not closing this right away in case some other people provide feedback on things I might not have considered

feerrenrut · 2022-03-17T03:15:41Z

@SamKacer Could you take a look at the system tests: tests/system/robot/symbolPronunciationTests.py
These cover the behavior here. You'll need to update these to match your expectations, and please feel free to extend with new test cases to cover the outcomes you wish for.

You can run just these tests by doing the following:

in tests/system/robotArgs.robot remove --include NVDA
run the system tests with runsystemtests.bat --include symbols. The tag symbols and others can be found by browsing the *.robot files.

The system tests will launch NVDA and another application (eg chrome or notepad), interact collecting speech, then exit. Any mouse / keyboard interaction may interfere with the tests. A work around for this is running the tests in a virtual machine.

I'm going to convert this PR to a draft until it is ready again, please feel free to press the "ready for review" button when this is addressed.

feerrenrut · 2022-03-17T03:25:15Z

@SamKacer You might be interested in #12710. When looking at a this area (however with a focus on symbols), my conclusion was that there is a loss of information, the initial intent of the str is lost. It gets processed multiple times, once the initial value is lost, subsequent processing can no longer differentiate and provide the correct outcome. My solution was to encapsulate the str with type that carries the original (or modified) intent separate from the contents. This didn't go ahead, there were still many edge cases and the community wasn't convinced by the approach.

SamKacer · 2022-03-27T13:13:46Z

@feerrenrut thanks for the feedback! Sorry for the late response. My life situation suddenly became quite hectic, so I've had to put this on hold, but will return to it, although might be a couple of weeks.

In regards to running the system tests with the setup you outlined, does this still need me to build the NVDA project? I suspect one thing I will have to sort out before I return to this is setup the whole suite of tools for building the project including the 7 gigs or so of MS Visual Studio build tools, which is a blocker for me at the moment. But I am in the process of setting up a new laptop where I will hav enough space to install everything.

feerrenrut · 2022-03-28T04:09:46Z

@SamKacer Yes, you'll need to be able to build NVDA locally.

SamKacer · 2022-04-30T16:42:12Z

One thing that could be changed in the implementation is that the only place suppressBlanks needs to be set to True is when calling speak in sayall. Since when initializing sayall, the speak function that is passed to it has suppressBlanks=True set, all of the suppressBlanks=reason == OutputReason.SAYALL are redundant and could be safely removed, thus keeping the code simpler.

What do others think? Keep those in there just to be extra safe or remove them to make it simpler?

The one other place I would keep it is speakText, since that is passed a suppressBlanks parameter itself, so the inner speak needs to respect it as well.

SamKacer · 2022-05-16T08:53:46Z

@feerrenrut Hi, I just wanted to ask since this is my first PR on this repo and so am unfamiliar with the general timelines here. I marked this ready for review a little over 2 weeks ago and haven't gotten any response yet. Is this normal an to be expected? Just asking since not sure if this PR was accidently overlooked

feerrenrut · 2022-05-17T10:43:18Z

@SamKacer It is common but not intended, unfortunately we are quite time constrained. An exception to this is when we identify that the change should target a later release. But in those cases we try to communicate it on the issue / PR.

feerrenrut

I'm a bit worried about known issue 3. It would be good to know if this is a real world case that will be encountered.

feerrenrut · 2022-05-17T11:19:19Z

+		speak(
+			seq,
+			priority=priority,
+			suppressBlanks=suppressBlanks or reason == OutputReason.SAYALL


Why is there an exception for say all? Can you add a comment to explain.

Why is there an exception for say all? Can you add a comment to explain.

This is consistent with logic in speech.speech.getTextInfoSpeech, where "blank" is added to the speech sequence only if it isn't a sayall. As I understand, blanks are not meant to be spoken during a sayall and that this was something generally known, so not requiring a comment.

snippet from speech.speech.getTextInfoSpeech:

if not suppressBlanks and reason != OutputReason.SAYALL and shouldConsiderTextInfoBlank: # Translators: This is spoken when the line is considered blank. speechSequence.append(_("blank"))

Also, just before asking about timeline on review, I added a comment pointing out that comparing reason == OutputReason.SAYALL is redundant since in speech\__init__.py, when SayAll is initialized, it is passed a modified speech function where suppressBlanks is overridden to be true. So, I was thinking of removing all these comparisns like this, since sayall is already handled directly at the source. But, I wanted to get feedback on that first. What do you think?

As I understand, blanks are not meant to be spoken during a sayall and that this was something generally known, so not requiring a comment.

I'd prefer a comment. It's very easy for priors to change, reasoning helps promoting questioning the behavior. I.E. does this reason still exist / make sense in the current context. There is also the aspect of making it easy for developers with different histories to get up to speed on the reasoning.

So, I was thinking of removing all these comparisns like this, since sayall is already handled directly at the source. But, I wanted to get feedback on that first. What do you think?

Yes, I think that would be preferable.

feerrenrut · 2022-05-17T11:23:23Z

 		symbolLevel=SymLevel.NONE,
 		expectedSpeech=[
 			'Say', '(quietly)', 'Hello,', 'Jim .', "don't",  # Expect:
-			'',  # todo: Expect 'right-pointing arrow'


Don't remove these expected symbol todo comments. These tests were added when assessing the current behavior for symbols. The todo comments document what we intend the output to become.

So while "blank" is better than "", it still isn't the final goal.

Don't remove these expected symbol todo comments. These tests were added when assessing the current behavior for symbols. The todo comments document what we intend the output to become.

So while "blank" is better than "", it still isn't the final goal.

This was actually on purpose. I wanted to spark a discussion about this.

I would propose that this perhaps should be the final goal. If symbol level is none, then as a user I would expect no symbols would be spoken. If a speech sequence doesn't contain anything to be spoken, "blank" should be spoken instead.

Otherwise, I don't understand why "'right-pointing arrow" should be spoken when symbol level is none. Shouldn't it be considered a symbol and so not spoken? If this was discussed and this is indeed the intended final goal then at least for me it is confusing and I think the test case would benefit from having a comment with a small explanation about the justification

Blank is misleading. It suggests there are no visible characters in the range. We've had quite some discussions about this already. The should be given the most accurate picture as a priority. This might mean announcing the name of a single symbol, or "symbols" when there are multiple, perhaps like spaces / tabs the number should be spoken "5 symbols". A mix of whitespace should perhaps be announced as "mixed whitespace".

Our last work in this area was really focusing on symbols and the inconsistency of reporting.

hmm, okay. This reasoning was missing in those tests making it unclear why that was the expected output. I will add back those todos along with short comment on the why, as you just explained it, so others are not confused as I was.

However, what about the case of a sequence that only contains tabs? In that case isn't "blank" an appropriate final solution, since tabs are whitespace and so would generally be considered as blank?

SamKacer · 2022-05-18T05:57:56Z

I'm a bit worried about known issue 3. It would be good to know if this is a real world case that will be encountered.

I haven't spotted this in the wild somewhere or come across a place in the code where such a case is created. Know issue 3 is pure hypothesis on my part.

The logic behind considering a speech sequence blank and appending "blank" if it is is, adapted from similar logic in speech.speech.getTextInfoSpeech. So, if this issue exists, then it exists already and if the proposed amelioration should probaby be implemented there as well to keep the "blank" logic consistent. Although the logic there around blanks is quite a bit more complex, so it is also possible it does handle this possible issue somehow and I missed it.

feerrenrut · 2022-05-19T10:30:25Z

I haven't spotted this in the wild somewhere or come across a place in the code where such a case is created. Know issue 3 is pure hypothesis on my part.

One option if we are concerned about this, is to put the feature behind a feature flag. This makes it easy for us to change the default / and allow alpha users to experiment more easily. It lowers the risk of having to revert the change entirely.

lukaszgo1 · 2022-05-19T11:04:37Z

***
hello *** world

These two examples really worry me. In general NVDA should never lie to user about the text it presents and saying "blank" for a non blank line/word is going to be pretty confusing. In general this introduces a uncertainty about what "blank" means and overloading a very well known concept of it being reported for empty lines may upset users. Has it been considered not to report "blank" in cases of hunks where no text is spoken due to punctuation level being too low, rather try to increase it until the entire hunk produces speech?

feerrenrut · 2022-05-20T04:59:02Z

Yes, @lukaszgo1 this is a known issue (reporting "blank" or no speech at all are equally unhelpful). I was trying to address this with #12710. That PR didn't go ahead, instead tests were added to demonstrate this behavior and currently desired outcome. We have been discussing this already, see comment thread: #13483 (comment)

Qchristensen · 2022-05-27T04:46:49Z

This might also close #8057 (Moving by word, or greater, where the whole new block of text to be read is characters at a higher punctuation level than set, so nothing is read).

I can also see @lukaszgo1's point. I do think it is better to say something rather than remain silent, however, I can't think of a good, short, phrase to read in lieu of "This utterance would all be text at a higher puncutation level than would be read" (I don't think NVDA should say that).

SamKacer · 2022-05-27T06:24:14Z

I haven't spotted this in the wild somewhere or come across a place in the code where such a case is created. Know issue 3 is pure hypothesis on my part.

One option if we are concerned about this, is to put the feature behind a feature flag. This makes it easy for us to change the default / and allow alpha users to experiment more easily. It lowers the risk of having to revert the change entirely.

I think this might be overly careful, but it can be done if you think that would be best. Could you please advise how to implement the feature flag?

SamKacer · 2022-05-27T06:32:29Z

Yes, @lukaszgo1 this is a known issue (reporting "blank" or no speech at all are equally unhelpful). I was trying to address this with #12710. That PR didn't go ahead, instead tests were added to demonstrate this behavior and currently desired outcome. We have been discussing this already, see comment thread: #13483 (comment)

I'd just like to contest that "blank" or saying nothing at all are equally unhelpful. Since, if this is the case, let us close this PR, since all it does is in situations when nothing would be spoken at all due to symbol level or dictionary substitution, "blank" is spoken instead. So, if both states ar equally bad, then this adds no value at the cost of more lines of code.

For me as a user, I feel although perhaps not the ideal final solution, "blank" is much better than nothing being said, because in the scenario that nothing is said after navigating, it gives the impression that NVDA has stopped responding and the user is left momentarily unsure about what just happened. This has happened to me several times and if "blank" was spoken (or anything at all), then it would have been preferable to avoid confusion.

I think, aiming for a better final solution is a good idea, but I find this to be an incremental improvement.

SamKacer · 2022-05-27T06:36:09Z

Based on feedback so far, I have these todos:

update comments in tests
remove all the redundant reason == OutputReason.SAYALL checks
comment on why suppressBlanks is set to True for sayall

AppVeyorBot · 2023-06-10T17:44:50Z

PASS: Translation comments check.
PASS: Unit tests.
PASS: Lint check.
FAIL: System tests (tags: installer NVDA). See test results for more information.
Build (for testing PR): https://ci.appveyor.com/api/buildjobs/66lqtfbrosgi0366/artifacts/output/nvda_snapshot_pr13483-28410,a0d096ab.exe
CI timing (mins):
INIT 0.0,
INSTALL_START 0.8,
INSTALL_END 0.9,
BUILD_START 0.0,
BUILD_END 21.3,
TESTSETUP_START 0.0,
TESTSETUP_END 0.3,
TEST_START 0.0,
TEST_END 18.7,
FINISH_END 0.2

See test results for failed build of commit a0d096abd3

SamKacer · 2023-06-17T13:12:47Z

Hello, so I finally got around to make the cleanups I promised and update new tests etc. This is ready for review again.

I think an apt summary of the discussion is that there is no universally prefferable solution to the problem of what if all the text is removed due to either symbol level or dictionary substitution. Therefore, I think the final solution will need to make this configurable, so that a user can choose the one they prefer.

However, I think most of us can at least agree that the current behavior of NVDA speaking nothing, giving the impression that it froze, isn't a behavior that anyone finds the most preferable.

So, although this wouldn't be the final solution, I propose speaking "blank" as an straightforward, incremental improvement over the current behavior of speaking nothing.

seanbudd

Thanks @SamKacer

…rocessing_text

cary-rowen · 2023-06-19T03:12:02Z

Hello @SamKacer

Can you take a look at #15024?

Thanks!

…parameter in PR nvaccess/nvda#13483. Closes #5.

…parameter in PR nvaccess/nvda#13483. + This was causing many failures of navigation and other things, when used with NVDA alphas. + Closes #5.

This reverts commit cc222a0.

) This reverts commit cc222a0. Reverts #13483 due to #15024

SamKacer · 2023-06-22T18:12:28Z

I was just thinking, if the issue with this implementation is that it isn't possible to tell if "blank" means an actually empty line with no chars or if it is a line with chars but they were filtered out, then how about in this scenario "empty" is spoken instead?

That way if you hear "blank" then you know it is a line with no chars and if you hear "empty" you know it is a line with chars that were all filtered out

Someone might argue that then when you hear "empty" then you will be wondering whether it is a line that has the text empty or if it is a line with filtered out text. I will just preemptively remind that is already the case with "blank", "heading level 1", "link", "emphasis", and all the other features that speak something that isn't actually there.

ruifontes · 2023-06-22T18:19:50Z

Header level 1, bold, emphasis, link and so on are attributs and they are visually presents...
So, if you accept the visually present attributes are voiced, why not blank characters?
If you want something different from just "blank", why not "blank characters"?

ruifontes · 2023-06-22T18:51:29Z

Or better, "blank chars", for spaces and tabs, and "filtered chars", for filtered characters...

SamKacer · 2023-06-23T06:33:10Z

@ruifontes Ok, so if I understand correctly, you are suggesting there should be 3 different things spoken:

"blank" when the line just consists of a \n or \r
"blank chars" when a line only contains spaces or tabs
"filtered chars" when the line has all its text filtered out

I think I'd prefer to keep discussion in this thread to just be about the 3rd case since that is what this PR was about.

My concern with "filtered chars" is that it is quite verbose. Even "empty" isn't that ideal because it is two syllables compared to the one syllable of "blank". I am not sure what the right phrase is, but I think shorter is better

XLTechie · 2023-06-23T06:50:53Z

2. "blank chars" when a line only contains spaces or tabs

"chars" is not a concept that users hear much. You and I know what it means, but if you go with something along these lines, that doesn't seem like language that will be approved. "whitespace" is one possible option. Or the actual characters, like I suggested--"4 tab", "4 space", "6 tab 3 space".

3. "filtered chars" when the line has all its text filtered out

And now we have two words that users will need defined or explained for them. "filtered" in this context, is another term that is not used. Is silence suddenly not looking so bad? :) This is really an ideal case for an earcon, or some little blip or beep, if people don't want to hear the punct level raised to speak lines that would otherwise render as silent.

ruifontes · 2023-06-23T10:21:32Z

For me, silence is good, since I can check in Braille display.
But people without a Braille display can think: Did I really press the arrow?
So, to be quick and informative, maybe the idea of two hearcons is the best:
One for only white spaces and another for the presence of filtered chars...

SamKacer requested a review from a team as a code owner March 15, 2022 10:01

SamKacer requested a review from feerrenrut March 15, 2022 10:01

SamKacer changed the title ~~Return "blank" if text empty after processing~~ Return "blank" if text empty due to processing Mar 15, 2022

This comment was marked as outdated.

Sign in to view

feerrenrut marked this pull request as draft March 17, 2022 03:15

This comment was marked as outdated.

Sign in to view

SamKacer force-pushed the check_blank_after_processing_text branch from aff97ad to d14bc61 Compare April 30, 2022 12:07

This comment was marked as outdated.

Sign in to view

SamKacer marked this pull request as ready for review April 30, 2022 14:47

feerrenrut reviewed May 17, 2022

View reviewed changes

SamKacer added 5 commits June 10, 2023 13:56

Fix tests with double chunk seperator

f28a97e

Fix flacate whitespace issue

4e21c5d

Revert comments in tests

fd7b449

Comment on why suppressBlanks overridden in say all

acdb8e0

Remove redundant checking of output reason == say all

b8019cf

SamKacer force-pushed the check_blank_after_processing_text branch from fcdeb6e to b8019cf Compare June 10, 2023 17:02

Fix Robot.chromeTests.ARIA details role

4c7045f

SamKacer marked this pull request as ready for review June 17, 2023 13:13

seanbudd self-requested a review June 18, 2023 23:34

seanbudd approved these changes Jun 19, 2023

View reviewed changes

seanbudd added 2 commits June 19, 2023 10:05

Merge remote-tracking branch 'origin/master' into check_blank_after_p…

aebd210

…rocessing_text

update changes

8a1103f

seanbudd merged commit cc222a0 into nvaccess:master Jun 19, 2023

nvaccessAuto added this to the 2023.2 milestone Jun 19, 2023

This was referenced Jun 19, 2023

PR#13483 caused NVDA to report unexpected Blank #15024

Closed

Reprocess speech at symbol level all if symbol level processing causes an empty sequence #15027

Closed

XLTechie pushed a commit to opensourcesys/speechLogger that referenced this pull request Jun 19, 2023

Fixed the patching of NVDA's speech.speech.speak, as it gained a new …

727eeaa

…parameter in PR nvaccess/nvda#13483. Closes #5.

seanbudd added a commit that referenced this pull request Jun 19, 2023

Revert "Return "blank" if text empty due to processing (#13483)"

adb1de0

This reverts commit cc222a0.

seanbudd mentioned this pull request Jun 20, 2023

Revert "Return "blank" if text empty due to processing" #15032

Merged

seanbudd added a commit that referenced this pull request Jun 20, 2023

Revert "Return "blank" if text empty due to processing (#13483)" (#15032

57bd51d

) This reverts commit cc222a0. Reverts #13483 due to #15024

Uh oh!

Conversation

SamKacer commented Mar 15, 2022 • edited by feerrenrut Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Link to issue number: #13431

Summary of the issue:

Description of how this pull request fixes the issue:

Testing strategy:

Known issues with pull request:

Change log entries:

Code Review Checklist:

Uh oh!

This comment was marked as outdated.

Adriani90 commented Mar 15, 2022

Uh oh!

SamKacer commented Mar 15, 2022

Uh oh!

feerrenrut commented Mar 17, 2022

Uh oh!

feerrenrut commented Mar 17, 2022

Uh oh!

SamKacer commented Mar 27, 2022

Uh oh!

feerrenrut commented Mar 28, 2022

Uh oh!

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

This comment was marked as outdated.

SamKacer commented Apr 30, 2022

Uh oh!

SamKacer commented May 16, 2022

Uh oh!

feerrenrut commented May 17, 2022

Uh oh!

feerrenrut left a comment

Choose a reason for hiding this comment

Uh oh!

feerrenrut May 17, 2022

Choose a reason for hiding this comment

Uh oh!

SamKacer May 18, 2022

Choose a reason for hiding this comment

Uh oh!

feerrenrut May 19, 2022

Choose a reason for hiding this comment

Uh oh!

feerrenrut May 17, 2022

Choose a reason for hiding this comment

Uh oh!

SamKacer May 18, 2022

Choose a reason for hiding this comment

Uh oh!

feerrenrut May 19, 2022

Choose a reason for hiding this comment

Uh oh!

SamKacer May 27, 2022

Choose a reason for hiding this comment

Uh oh!

SamKacer commented May 18, 2022

Uh oh!

feerrenrut commented May 19, 2022

Uh oh!

lukaszgo1 commented May 19, 2022

Uh oh!

feerrenrut commented May 20, 2022

Uh oh!

Qchristensen commented May 27, 2022

Uh oh!

SamKacer commented May 27, 2022

Uh oh!

SamKacer commented May 27, 2022

Uh oh!

SamKacer commented May 27, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

AppVeyorBot commented Jun 10, 2023

Uh oh!

SamKacer commented Jun 17, 2023

SamKacer commented Mar 15, 2022 •

edited by feerrenrut

Loading

SamKacer commented May 27, 2022 •

edited

Loading