System.IO.Packaging.InternalRelationshipCollection.GetRelationshipIndex doesn't scale #983

@mabrahamsen

Use case
The Microsoft OpenXml SDK relies on the System.IO.Packaging API to generate Microsoft Excel files. The SDK allows adding hyperlinks to Excel columns, but with the current resource consumption this is not feasible for an Excel sheet with about 100,000 records that each link to some additional record information, for instance a deep link to an HTTP resource. Adding such a hyperlink ends up calling PackagePart.CreateRelationship inside the System.IO.Packaging stack.

Background
When adding hyperlinks to Excel cells, you have to call AddHyperlinkRelationship on the OpenXmlContainer. That call then goes through the following call chain in System.IO.Packaging:
PackagePart.CreateRelationship -> InternalRelationshipCollection.Add -> InternalRelationshipCollection.ValidateUniqueRelationshipId -> InternalRelationshipCollection.GetRelationshipIndex.

Problem
ValidateUniqueRelationshipId calls GetRelationshipIndex to ensure that the relationship id is unique, and GetRelationshipIndex loops through all existing relationships each time it is invoked. Each lookup is therefore linear in the number of relationships, so adding n relationships costs on the order of n² comparisons in total (quadratic, not exponential), and every insert gets slower as the collection grows. Backing this identifier store with something similar to a HashSet would greatly improve the lookup time.
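
To make the cost difference concrete, here is a minimal sketch that compares the two lookup strategies. It does not use the real System.IO.Packaging types: `RelationshipIdLookupSketch`, `AddWithLinearScan`, and `AddWithHashSet` are hypothetical names, the `rId` + counter id format is an assumption, and the ordinal string comparison is assumed rather than taken from the library source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the id store; NOT the actual
// InternalRelationshipCollection implementation.
public static class RelationshipIdLookupSketch
{
    // Mimics the behavior described in the issue: every uniqueness
    // check scans all previously added ids. Returns the total number
    // of string comparisons performed.
    public static long AddWithLinearScan(int count)
    {
        var ids = new List<string>();
        long comparisons = 0;

        for (int i = 0; i < count; i++)
        {
            string id = "rId" + i; // assumed id format, for illustration only

            foreach (string existing in ids)
            {
                comparisons++;
                if (string.Equals(existing, id, StringComparison.Ordinal))
                {
                    throw new InvalidOperationException("Duplicate relationship id: " + id);
                }
            }

            ids.Add(id);
        }

        return comparisons;
    }

    // The direction suggested in the issue: a hash-based set makes each
    // uniqueness check O(1) on average. Returns the number of ids stored.
    public static int AddWithHashSet(int count)
    {
        var ids = new HashSet<string>(StringComparer.Ordinal);

        for (int i = 0; i < count; i++)
        {
            string id = "rId" + i;

            // HashSet<T>.Add returns false when the item is already present.
            if (!ids.Add(id))
            {
                throw new InvalidOperationException("Duplicate relationship id: " + id);
            }
        }

        return ids.Count;
    }
}
```

With the linear scan, adding n unique ids performs n(n-1)/2 comparisons, which is 4,999,950,000 for n = 100,000, while the set-based variant performs one hash lookup per id. A real fix would likely need to keep the existing ordered list for enumeration and add the set (or a dictionary keyed by id) alongside it; that part is not shown here.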
