Fix bug where only the first math expression on a line was being spoken/brailled#18526
Merged
Conversation
…ce`. It isn't used in speech (although maybe it should trigger a pause if wide), but it is used in some braille notations as a signal that this is a "fill in the blank" space. I also added the elementary math attributes used in MathPlayer. Neither MathCAT nor Access8Math currently support the elementary math notations, but it is on the list of things to implement for MathCAT. Note: potentially this could go into the beta, but I can't get the beta to build on my machine so I can't test the fix there.
…n a line, everything after the first expression is not spoken/brailled. This includes text along with math. This has been reported in a few contexts: * MathCAT: daisy/MathCATForPython#44 * DAISY: daisy/math-a11y#24 The problem lies in code that tries to delete Word's speech. Both the MathML and Word's speech are exposed. The code to recognize this and delete Word's speech was buggy. In particular, it seemed to assume that there was a most one equation on a line. The deletion of Word's math was outside of the loop, so basically everything after the start was deleted. The `controlEnd` command actually pointed to the last MathML on the line, but the speech code only spoke the `controlStart` part, so that bug never showed up. The fix cleans up the logic and deletes Word's math when it comes to the matching end of the opening MathML.
…pe()`" Based this branch on the wrong branch This reverts commit 6f88d5e.
…on `mspace`. It isn't used in speech (although maybe it should trigger a pause if wide), but it is used in some braille notations as a signal that this is a "fill in the blank" space." This branch was based on the wrong branch This reverts commit 6a6a2a8.
seanbudd
reviewed
Jul 22, 2025
Co-authored-by: Sean Budd <seanbudd123@gmail.com>
Member
|
@NSoiffer the level checks are just to ensure that the control starts and ends stay completely balanced. We can never completely guarantee that MS word doesn't expose something completely silly like math in math, though I admit that is very unlikely. |
michaelDCurran
approved these changes
Jul 22, 2025
michaelDCurran
left a comment
Member
There was a problem hiding this comment.
Thanks for debugging this. Good fix.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Link to issue number:
This fixes #18386
Summary of the issue:
If there is more than one math expression on a line, only the first math expression is spoken/brailled. Any text or math after that is dropped.
Description of user facing changes:
Now the full line is read/brailled.
Description of developer facing changes:
None
Description of development approach:
The bug was in wordDocument.py. There was/is special code to avoid saying both the MathML and word's speech for the math. However that code seemed to assume that there was only one piece of math per line and did a single (large) deletion after looping through what was on the line.
The code now correctly deletes only Word's math speech string. It does this in the loop. Because deletion is done on the array as we are looping, the code has to be careful to make sure the index into the array tracks where we are after the deletion.
Testing strategy:
I created a sample Word doc. This doc has three expressions on the line to better test the problem. Before the change, everything after the first expression was dropped. Now the entire line is read, including all the math and the text around the math.
Known issues with pull request:
I accidentally created the branch from another bug fix. I did a
git revertand should have pulled out the other commits, but I still see the commits mentioned. On the other hand, only two changed files are listed, and those have the appropriate changes.Note: this is a pull request against master. However, this is a pretty bad bug that was reported elsewhere before finally being reported here. I strongly suggest this goes into 2025.2.beta. I have tried to build the beta, but it isn't working for me, so I can't test the fix there.
Update: I forgot to mention that I left in the original code that does a "level check": every
controlStartbumps the level and everycontrolEndend decreases the level. The code records the level of thecontrolStartof math and only does deletion of content if thecontrolEndof the math matches that level. I don't believe this is needed, but I don't know why it was part of the original code and perhaps there are cases where it is needed. @michaelDCurran wrote the original code: do you remember why you did that (four years ago)?Code Review Checklist:
@coderabbitai summary