
Enhance math support for PDF by supporting math in associated files #9288

@NSoiffer

Feature Request

Many years ago, NVDA added support for reading math in PDF documents. Unfortunately, the mechanism that PDF described for adding MathML to a PDF is difficult for software to generate, so other than test documents and sample hand tagging, there are not many PDFs around that tag the math in this manner.

In PDF v2 (ISO 32000-2), a much simpler method was added to tag math: associated files. This loses a little functionality (synchronized highlighting becomes much harder for AT that wants to do that), but it makes it much easier to add MathML. This request is for NVDA to add to its existing MathML functionality the ability to get the math from the associated file.

Because NVDA already has code to pass MathML to an application that can braille it/produce speech for it (e.g., MathPlayer), the only additional work required here is to look in the associated file for MathML. Sadly, Adobe has not updated its accessibility interface to v2, so getting that info requires diving into (I think) the PDSEdit layer. Doing so is not rocket science, but it is obviously more work than a few more PDomNode calls.
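To make the requested behavior concrete, here is a minimal sketch of the lookup order: try the existing tagged-structure mechanism first, then fall back to the associated file. This is not NVDA's actual code. `get_mathml`, `from_tagged_structure`, and `from_associated_file` are hypothetical names, and the two sources are passed in as plain callables so the sketch does not depend on any Acrobat interface.

```python
from typing import Callable, Optional

# Hypothetical type: a source returns MathML markup, or None if it has none.
MathMlSource = Callable[[], Optional[str]]


def get_mathml(from_tagged_structure: MathMlSource,
               from_associated_file: MathMlSource) -> Optional[str]:
    """Return MathML from the first source that provides non-empty markup."""
    for source in (from_tagged_structure, from_associated_file):
        mathml = source()
        # Treat None and whitespace-only strings as "no MathML here".
        if mathml and mathml.strip():
            return mathml
    return None
```

Whatever is returned would then be handed to the existing MathML presentation code unchanged, so only the retrieval step is new.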

PDF Details

Spec

Section 14.13 of the ISO 32000-2 spec discusses associated files. Here are some relevant quotes from the spec:

Associated files provide a means to associate content in other formats with objects of a PDF file and to identify the relationship between them. Such associated files are designated using file specification dictionaries (see 7.11.3, "File specification dictionaries"), and AF keys are used in object dictionaries to connect the associated file’s specification dictionaries with those objects.

For associated files, their associated file specification dictionaries should include the AFRelationship key indicating one of several possible relationships that the file has to the associated PDF object.

The file specification for an associated file represents either a file external to the PDF file or an embedded file stream (see 7.11.4, "Embedded file streams") within the PDF file.

It should always be the case that the MathML is an embedded file stream, not an external file.

...the resulting PDF document might contain the following embedded files: ...MathML version of the equation embedded with an AFRelationship value Supplement, and associated using a structure element or a form XObject depending on how the equation is rendered in the page’s content stream.

14.13.6 Associated files linked to structure elements
One or more files may be associated with structure elements (see 14.7.2, "Structure hierarchy") to accommodate content that spans pages such as in an article, section or table, in which cases logical structural elements should be used to make an association with files. This entry represents the associated files for the entire structure element. To associate files with structure elements, the structure element dictionary shall contain an AF entry which represents the associated files for that structure element. The relationship that the associated files have to the structure element is supplied by the AFRelationship key in each file specification dictionary.
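The lookup the spec describes can be sketched in a few lines. This is only a model, not Acrobat API code: the structure element, file specification, and embedded file stream are represented as plain Python dicts, the `data` key is a hypothetical stand-in for the decoded stream bytes, and filtering on `Supplement` and on the `application/mathml+xml` subtype is an assumption based on the example quoted above.

```python
from typing import Any, Dict, Iterator

MATHML_SUBTYPE = "application/mathml+xml"


def iter_mathml(struct_elem: Dict[str, Any]) -> Iterator[str]:
    """Yield MathML strings from a structure element's associated files."""
    # AF holds the file specification dictionaries for this element.
    for file_spec in struct_elem.get("AF", []):
        # Assumption: MathML is attached with the "Supplement" relationship.
        if file_spec.get("AFRelationship") != "Supplement":
            continue
        # EF/F points at the embedded file stream; external files have none.
        stream = file_spec.get("EF", {}).get("F")
        if stream is None:
            continue
        # The embedded file stream's Subtype carries the media type.
        if stream.get("Subtype") != MATHML_SUBTYPE:
            continue
        # "data" is a hypothetical key holding the decoded stream bytes.
        yield stream["data"].decode("utf-8")


formula = {
    "S": "Formula",
    "AF": [
        {"AFRelationship": "Source",
         "EF": {"F": {"Subtype": "application/x-tex", "data": b"x^2"}}},
        {"AFRelationship": "Supplement",
         "EF": {"F": {"Subtype": MATHML_SUBTYPE,
                      "data": b"<math><msup><mi>x</mi><mn>2</mn></msup></math>"}}},
    ],
}
mathml_found = list(iter_mathml(formula))
```

In a real implementation the dict lookups would be replaced by whatever calls the PDSEdit and Cos layers provide for reading a structure element's dictionary; the filtering logic would stay the same.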

Other potential places in the spec for info:

  • Table 43 discusses the "AFRelationship" key along with the potential values ("Supplement" being the important one).

Acrobat API

The overview of the Acrobat API is found here. I believe the relevant interface to access is PDSElement. This provides access to the structure tree. Potentially the COS layer is involved to access the dictionary structure.

Since I was looking, it might save someone a minute to know that the MathML code for Acrobat is in NVDAObjects/IAccessible/adobeAcrobat.py.
