Skip to content

Map PUA bullets to Unicode in Word#6778

Merged
michaelDCurran merged 6 commits into
nvaccess:masterfrom
dkager:i5267
Mar 28, 2018
Merged

Map PUA bullets to Unicode in Word#6778
michaelDCurran merged 6 commits into
nvaccess:masterfrom
dkager:i5267

Conversation

@dkager

@dkager dkager commented Jan 22, 2017

Copy link
Copy Markdown
Contributor

Fixes #5267, part of #2446. Supercedes #5508.

I improved on the PR from @nvda-india. Description of my changes is in dkager@4a717d6

vrdhn and others added 4 commits September 13, 2015 21:24
* Remove U+F0B7 PUA symbol from symbol dictionary.
* Reorder and clean up the mapPUAToUnicode dict.
* More consistent code style in _normalizeFormatField().
* Update U+F0E8 to use the same mapping as Word uses when saving to plain text.
* Update U+F0FC based on http://www.alanwood.net/demos/wingdings.html.
* Update description for U+21E8 to more closely match the Unicode name.
* Add U+F0A7, used for bullets on level 3,
  based on https://en.wikipedia.org/wiki/Symbol_(typeface)#Encoding.
@dkager

dkager commented Jan 22, 2017

Copy link
Copy Markdown
Contributor Author

CC @vrdhn

@jcsteh jcsteh requested a review from michaelDCurran January 23, 2017 00:10
Comment thread source/NVDAObjects/window/winword.py Outdated
field[x]=v
bullet=field.get('line-prefix')
if bullet and len(bullet)==1:
global mapPUAToUnicode

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this line?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because that variable lives outside the class.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, global is only necessary if you want to modify the value. Unnecessary use of this keyword was in the India code. Fixed.

• bullet some
… dot dot dot all always
... dot dot dot all always
 bullet some

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WHy this removal?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it is a PUA character and as such it will only be a bullet in certain applications. For Word this character is now mapped to a proper bullet.

▪ black square some
◾ black square some
◦ white bullet some
➔ right arrow some

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this removal?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because U+2794 is not the standard rightwards arrow. IMO the definition was too generic. But if nothing else, it's confusing compared to U+2192.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, that arrow (U+2794) is not used as a bullet shape whereas the other arrows are.

@dkager

dkager commented Jul 26, 2017

Copy link
Copy Markdown
Contributor Author

@LeonarddeR Do you actually have changes to request after my replies to your comments? Happy to work on them.

@dkager dkager mentioned this pull request Aug 6, 2017
@dkager

dkager commented Aug 6, 2017

Copy link
Copy Markdown
Contributor Author

Small update: the replacement bullets, e.g. those that are not in the PUA, are included in the new English (US) 8-dot computer braille table (en-us-comp8-ext.utb). This means that for that table, bullets will render correctly in braille. Results will probably vary with other tables. I'm also not sure about non-English speech dictionaries.

An alternative would have been to replace the PUA characters with ASCII. While this would have been quicker short term, it seemed better to replace the symbols with Unicode equivalents. This is a bit more involved to get working in speech and braille, but should also be more reliable once it's done.

@dkager

dkager commented Sep 13, 2017

Copy link
Copy Markdown
Contributor Author

@michaelDCurran Would value your input on this one. :)

@Adriani90

Copy link
Copy Markdown
Collaborator

This might solve #7742 and #7804

@Adriani90

Copy link
Copy Markdown
Collaborator

Please take note of discussions in those issues.

@michaelDCurran michaelDCurran merged commit 93082e3 into nvaccess:master Mar 28, 2018
@nvaccessAuto nvaccessAuto added this to the 2018.2 milestone Mar 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants