PDF bookmarks in hyperlinks not UTF-8-encoded #14864

@mirabilos

Attach (recommended) or Link to PDF file here:

songbook.pdf

Configuration:

  • Web browser and its version: Firefox 91.8.0esr
  • Operating system and its version: Debian bullseye/amd64
  • PDF.js version: both the one built into Firefox and the precompiled Stable (v2.13.216)
  • Is a browser extension: ?

Steps to reproduce the problem:

  1. Navigate to the PDF
  2. Hover over either a link in the bookmarks or a link in the index tables

What is the expected behavior? (add screenshot)

The URL in the status bar (indicating the link target) shows the fragment as percent-encoded UTF-8, something like #H%c3%a4ndel%20--%20Halle%f0%9f%8e%86lujah. That way people can also just type #Händel -- Halle🎆lujah into the browser URL bar and have the link work; as you might know, URLs (including fragment identifiers) shall be UTF-8 encoded.

What went wrong? (add screenshot)

The URL in the status bar percent-encodes the raw UTF-16BE bytes of the string from the PDF, byte order mark included, i.e. #%FE%FF%00H%00%E4%00n….
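
To make the difference concrete, here is a minimal Python sketch of both behaviours. It is not PDF.js code; the function names are hypothetical, and the Latin-1 fallback is only a rough stand-in for PDFDocEncoding. It assumes the destination name arrives as the raw bytes of a PDF text string, which starts with the bytes FE FF when it is UTF-16BE.

```python
from urllib.parse import quote

UTF16BE_BOM = b"\xfe\xff"


def decode_pdf_text_string(raw: bytes) -> str:
    """Decode a PDF text string to Unicode (hypothetical helper)."""
    if raw.startswith(UTF16BE_BOM):
        # Strip the byte order mark, then decode the rest as UTF-16BE.
        return raw[len(UTF16BE_BOM):].decode("utf-16-be")
    # Assumption: Latin-1 approximates PDFDocEncoding for this sketch.
    return raw.decode("latin-1")


def naive_fragment(raw: bytes) -> str:
    """Reported behaviour: percent-encode the raw bytes as they are."""
    return "#" + quote(raw, safe="")


def utf8_fragment(raw: bytes) -> str:
    """Expected behaviour: decode first, then percent-encode as UTF-8."""
    return "#" + quote(decode_pdf_text_string(raw), safe="")


# Build the bytes as they would be stored in the PDF.
raw = UTF16BE_BOM + "Händel -- Halle🎆lujah".encode("utf-16-be")

print(naive_fragment(raw))  # begins with #%FE%FF%00H%00%E4%00n
print(utf8_fragment(raw))   # #H%C3%A4ndel%20--%20Halle%F0%9F%8E%86lujah
```

Python emits uppercase hex digits; percent-encoding is case-insensitive, so this is equivalent to the lowercase form quoted above.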

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
