In Microsoft Word color of certain characters is retrieved inconsistently. #16196

@mltony

Description

Steps to reproduce:

  1. Open a blank document in Microsoft Word.
  2. Type "john.smith@gmail.com" without quotes.
  3. Press Enter. Make sure that the first line is converted into a hyperlink.
  4. Press Control+Home to move the cursor to the beginning of the document.
  5. Press NVDA+Control+Z to open the NVDA Python console.
  6. Paste the following code:
    import textInfos, config, itertools, controlTypes
    
    STYLE_ATTRIBUTES = frozenset([
    	"background-color",
    	"color",
    	"font-family",
    	"font-size",
    	"bold",
    	"italic",
    	"marked",
    	"strikethrough",
    	"text-line-through-style",
    	"underline",
    	"text-underline-style",
    ])
    
    def _extractStyles(
    		info: textInfos.TextInfo,
    ) -> "textInfos.TextInfo.TextWithFieldsT":
    	"""
    		This function calls TextInfo.getTextWithFields(), and then processes fields in the following way:
    		1. Highlighted (marked) text is currently reported as Role.MARKED_CONTENT, and not formatChange.
    		For ease of further handling we create a new boolean format field "marked"
    		and set its value according to presence of Role.MARKED_CONTENT.
    		2. Then we drop all control fields, leaving only formatChange fields and text.
    		@raise RuntimeError: found unknown command in getTextWithFields()
    	"""
    	stack: list[textInfos.FormatField] = [{}]
    	result: "textInfos.TextInfo.TextWithFieldsT" = []
    	reportFormattingOptions = (
    		"reportFontName",
    		"reportFontSize",
    		"reportFontAttributes",
    		"reportSuperscriptsAndSubscripts",
    		"reportHighlight",
    		"reportColor",
    		"reportStyle",
    	)
    	formatConfig = dict()
    	for i in config.conf["documentFormatting"]:
    		formatConfig[i] = i in reportFormattingOptions
    	for field in info.getTextWithFields(formatConfig):
    		if isinstance(field, textInfos.FieldCommand):
    			if field.command == "controlStart":
    				style = {**stack[-1]}
    				if field.field.get("role") == controlTypes.Role.MARKED_CONTENT:
    					style["marked"] = True
    				stack.append(style)
    			elif field.command == "controlEnd":
    				del stack[-1]
    			elif field.command == "formatChange":
    				field.field = {
    					k: v
    					for k, v in {**field.field, **stack[-1]}.items()
    					if k in STYLE_ATTRIBUTES
    				}
    				result.append(field)
    			else:
    				raise RuntimeError("Unrecognized command in the field")
    		elif isinstance(field, str):
    			result.append(field)
    		else:
    			raise RuntimeError("Unrecognized field in TextInfo.getTextWithFields()")
    	return result
    
    
    def _mergeIdenticalStyles(
    		sequence: "textInfos.TextInfo.TextWithFieldsT",
    ) -> "textInfos.TextInfo.TextWithFieldsT":
    	"""
    		This function is used to postprocess styles output of _extractStyles function.
    		Raw output of _extractStyles function might contain identical styles,
    		since textInfos might contain formatChange fields for reasons
    		other than a style change.
    		This function removes redundant formatChange fields and merges str items as appropriate.
    	"""
    	currentStyle = None
    	redundantIndices = set()
    	for i, item in enumerate(sequence):
    		if i == 0:
    			currentStyle = item
    		elif isinstance(item, textInfos.FieldCommand):
    			if item.field == currentStyle.field:
    				redundantIndices.add(i)
    			currentStyle = item
    	sequence = [item for i, item in enumerate(sequence) if i not in redundantIndices]
    	# Merge adjacent strings.
    	result = []
    	for k, g in itertools.groupby(sequence, key=type):
    		if k is str:
    			result.append("".join(g))
    		else:
    			result.extend(g)
    	return result
    
    
    t = focus.treeInterceptor.makeTextInfo('caret')
    t.expand('paragraph')
    styles = _mergeIdenticalStyles(_extractStyles(t))
    colors = [f if isinstance(f, str) else f.field['color'] for f in styles]
    colors
    
  7. Observe that the output looks like this:
    ['aqua grey', 'john', 'dark pale aqua', '.smith@gmail.', 'aqua grey', 'com', 'automatic color', '\n']
    
    This means that "john" and "com" are reported as aqua grey, while ".smith@gmail." is reported as dark pale aqua. So far so good.
  8. Now close the NVDA Python console and go back to Microsoft Word.
  9. Examine the color of every character by pressing NVDA+F on each one. NVDA reports that:
    • The first character "j" and the last character "m" are aqua grey.
    • The text in between, "ohn.smith@gmail.co", is dark pale aqua.

Actual behavior:

The color of some characters is retrieved inconsistently. Specifically, the color reported for a character depends on the extent of the current textInfo: when the textInfo covers only a single character, one color is reported, but when it covers the entire paragraph, a different color is reported for the same character.
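
To make the discrepancy explicit, here is a small standalone sketch in plain Python (it does not call NVDA). It takes the list printed in step 7, pairs it up into (color, text) runs, and compares it character by character with the step 9 observations. The helper names `pair_runs`, `per_character` and `mismatches` are made up for this illustration, and `char_runs` is transcribed by hand from step 9 rather than produced by NVDA.

```python
# Output of step 7, copied from this report: alternating color, text.
paragraph_output = [
    'aqua grey', 'john',
    'dark pale aqua', '.smith@gmail.',
    'aqua grey', 'com',
    'automatic color', '\n',
]

# Step 9 observations (NVDA+F on each character), transcribed by hand.
char_runs = [
    ('aqua grey', 'j'),
    ('dark pale aqua', 'ohn.smith@gmail.co'),
    ('aqua grey', 'm'),
]


def pair_runs(flat):
    """Turn [color, text, color, text, ...] into [(color, text), ...]."""
    return list(zip(flat[0::2], flat[1::2]))


def per_character(runs):
    """Expand (color, text) runs into one (char, color) pair per character."""
    return [(ch, color) for color, text in runs for ch in text]


def mismatches(paragraph_runs, character_runs):
    """List (index, char, paragraph_color, character_color) where the colors differ."""
    para = per_character(paragraph_runs)
    char = per_character(character_runs)
    # zip stops at the shorter sequence, so the trailing '\n' is ignored.
    return [
        (i, p[0], p[1], c[1])
        for i, (p, c) in enumerate(zip(para, char))
        if p[1] != c[1]
    ]


diff = mismatches(pair_runs(paragraph_output), char_runs)
for index, ch, para_color, char_color in diff:
    print(f"{index}: {ch!r} paragraph={para_color!r} character={char_color!r}")
```

With the data above, the characters that disagree are "ohn" and "co": the paragraph-level textInfo reports them as aqua grey, while the single-character textInfo reports them as dark pale aqua.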

Expected behavior:

The color of a character must be reported consistently, regardless of the extent of the textInfo.

NVDA logs, crash dumps and other attachments:

N/A

System configuration

NVDA installed/portable/running from source:

Running from source

NVDA version:

master

Windows version:

Windows 11

Name and version of other software in use when reproducing the issue:

Microsoft Word '16.0.17231.20236'

Other information about your system:

Other questions

Does the issue still occur after restarting your computer?

Yes

Have you tried any other versions of NVDA? If so, please report their behaviors.

No

If NVDA add-ons are disabled, is your problem still occurring?

Yes

Does the issue still occur after you run the COM Registration Fixing Tool in NVDA's tools menu?

N/A
