Map PUA bullets to Unicode in Word #6778
Adding a few bullet types that Word uses.
* Remove U+F0B7 PUA symbol from symbol dictionary.
* Reorder and clean up the mapPUAToUnicode dict.
* More consistent code style in _normalizeFormatField().
* Update U+F0E8 to use the same mapping as Word uses when saving to plain text.
* Update U+F0FC based on http://www.alanwood.net/demos/wingdings.html.
* Update description for U+21E8 to more closely match the Unicode name.
* Add U+F0A7, used for bullets on level 3, based on https://en.wikipedia.org/wiki/Symbol_(typeface)#Encoding.
CC @vrdhn
| field[x]=v | ||
| bullet=field.get('line-prefix') | ||
| if bullet and len(bullet)==1: | ||
| global mapPUAToUnicode |
Because that variable lives outside the class.
On second thought, global is only necessary if you want to assign to the variable, not to read it. The unnecessary use of this keyword came from the @nvda-india code. Fixed.
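For illustration, here is a minimal sketch of that distinction in Python 3 syntax. It is not the code from this PR: the dictionary name mirrors mapPUAToUnicode from the diff, but its single entry, the function names and the lookups counter are assumptions made up for the example.

```python
# Module-level table, loosely modelled on mapPUAToUnicode from the diff.
# The single entry is illustrative only; the table in the PR has more.
mapPUAToUnicode = {
    "\uf0b7": "\u2022",  # Symbol-font bullet (PUA) -> BULLET
}

# Hypothetical counter, present only to show when `global` is required.
lookups = 0


def normalizeBullet(field):
    """Read the module-level dict; no `global` statement is needed."""
    bullet = field.get("line-prefix")
    if bullet and len(bullet) == 1:
        # Fall back to the original character when there is no mapping.
        field["line-prefix"] = mapPUAToUnicode.get(bullet, bullet)
    return field


def countLookup():
    """Rebind a module-level name; this is the case that needs `global`."""
    global lookups
    lookups += 1
    return lookups
```

Without the global statement in countLookup(), the augmented assignment would make lookups a local name and raise UnboundLocalError.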
| • bullet some | ||
| … dot dot dot all always | ||
| ... dot dot dot all always | ||
| bullet some |
Because it is a PUA character and as such it will only be a bullet in certain applications. For Word this character is now mapped to a proper bullet.
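As a rough illustration of what "PUA character" means here: Python's unicodedata module reports the general category Co (private use) for such code points, so a check could look like the sketch below. The helper name isPrivateUse is hypothetical and not part of NVDA.

```python
import unicodedata


def isPrivateUse(ch):
    """Return True if ch is a private-use code point (general category Co)."""
    return unicodedata.category(ch) == "Co"


# U+F0B7 has no standard meaning; it only looks like a bullet in fonts
# that choose to draw one there.
print(isPrivateUse("\uf0b7"))
# U+2022 BULLET is a regular, standardized character.
print(isPrivateUse("\u2022"))
```

A character for which this returns True has no meaning defined by Unicode, which is why it belongs in an application-specific mapping rather than in the general symbol dictionary.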
| ▪ black square some | ||
| ◾ black square some | ||
| ◦ white bullet some | ||
| ➔ right arrow some |
Also, that arrow (U+2794) is not used as a bullet shape whereas the other arrows are.
@LeonarddeR Do you actually have changes to request after my replies to your comments? Happy to work on them.
Small update: the replacement bullets, i.e. those that are not in the PUA, are included in the new English (US) 8-dot computer braille table (en-us-comp8-ext.utb). This means that for that table, bullets will render correctly in braille. Results will probably vary with other tables. I'm also not sure about non-English speech dictionaries. An alternative would have been to replace the PUA characters with ASCII. While this would have been quicker in the short term, it seemed better to replace the symbols with Unicode equivalents. This is a bit more involved to get working in speech and braille, but should also be more reliable once it's done.
@michaelDCurran Would value your input on this one. :)
Please take note of discussions in those issues. |
Fixes #5267, part of #2446. Supersedes #5508.
I improved on the PR from @nvda-india. A description of my changes is in dkager@4a717d6.