[Bug]: Search does not work when the search string contains an escape character after a quote, e.g. client") (recent regression) #20516

@FabianFrank

Description

Attach (recommended) or Link to PDF file

pdfjs_test.pdf

Web browser and its version

Google Chrome 143.0.7499.148

Operating system and its version

macOS 26.1 (25B78)

PDF.js version

5.4.449

Is the bug present in the latest PDF.js version?

Yes

Is a browser extension

No

Steps to reproduce the problem

  1. Open the PDF I shared; the easiest way is the demo at https://mozilla.github.io/pdf.js/web/viewer.html
  2. Search for the string ("client" and the results are highlighted.
  3. Search for the string ("client") and the results are no longer highlighted. This appears to happen with any string that includes ")

What is the expected behavior?

Search should highlight ("client")

What went wrong?

The search finds no results due to incorrect escaping of the closing ) in ("client").

Link to a viewer

No response

Additional context

My best understanding is that this is a recent regression introduced in 039b9e4, which changed the regex and added the + in (\p{P}+):

/([*+^${}()|[\]\\])|(\p{P}+)|(\s+)|(\p{M})|(\p{L})/gu;
The + makes the punctuation group greedy, so the ) after the " is consumed by that group (p2) together with the quote instead of being matched on its own by the special-character group (p1), and it never gets escaped. CC @calixteman
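
To make the greedy match concrete, here is a minimal sketch that runs the regex quoted above over the failing query and records which capture group consumed each piece. The helper name traceGroups is hypothetical and is not part of PDF.js; only the regex itself comes from the issue.

```javascript
// Regex as quoted in this issue.
const SPECIAL_CHARS_REG_EXP =
  /([*+^${}()|[\]\\])|(\p{P}+)|(\s+)|(\p{M})|(\p{L})/gu;

// Hypothetical helper (not PDF.js code): list each match together with
// the capture group (p1..p5) that produced it.
function traceGroups(query) {
  const trace = [];
  for (const m of query.matchAll(SPECIAL_CHARS_REG_EXP)) {
    // m[1]..m[5] are the capture groups; exactly one is defined per match.
    const group = m.slice(1).findIndex(g => g !== undefined) + 1;
    trace.push({ text: m[0], group: `p${group}` });
  }
  return trace;
}

const trace = traceGroups('("client")');
console.log(trace[0]);     // { text: '(', group: 'p1' }
console.log(trace.at(-1)); // { text: '")', group: 'p2' }
```

The opening ( is matched alone by p1, but the closing ) follows a quote, so the quote starts a p2 match and the + extends it over the ) as well.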

I frankly don't understand these regexes or the context of this codebase enough to know if it is the right fix, but for example this patch seems to resolve the issue because it escapes the characters that now get greedily matched into p2:

diff --git a/web/pdf_viewer.mjs b/web/pdf_viewer.mjs
index 01379ad82373d47dbe999b64fd1a9bddc70d1764..d009724906a0d02d2b1c020d34fc4d771b2a2fc8 100644
--- a/web/pdf_viewer.mjs
+++ b/web/pdf_viewer.mjs
@@ -1112,7 +1112,7 @@ class PDFFindController {
         return addExtraWhitespaces(p1, `\\${p1}`);
       }
       if (p2) {
-        return addExtraWhitespaces(p2, p2.replaceAll(/[.?]/g, "\\$&"));
+        return addExtraWhitespaces(p2, p2.replaceAll(/[.?(){}\[\]]/g, "\\$&"));
       }
       if (p3) {
         return "[ ]+";
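
The effect of the patch can be sketched outside the viewer. The function below is a simplified stand-in for the conversion step, not the real PDFFindController code: it omits addExtraWhitespaces and the handling of the remaining groups, and the names toRegExpSource and compiles are hypothetical. It only compares the two p2 escape sets from the diff.

```javascript
// Regex as quoted in this issue.
const SPECIAL_CHARS_REG_EXP =
  /([*+^${}()|[\]\\])|(\p{P}+)|(\s+)|(\p{M})|(\p{L})/gu;

// Simplified stand-in (assumption): escape p1 matches, escape selected
// characters inside p2 matches, collapse whitespace, keep everything else.
function toRegExpSource(query, p2Escape) {
  return query.replaceAll(SPECIAL_CHARS_REG_EXP, (match, p1, p2, p3) => {
    if (p1) return `\\${p1}`;
    if (p2) return p2.replaceAll(p2Escape, "\\$&");
    if (p3) return "[ ]+";
    return match;
  });
}

// Hypothetical helper: does the generated source compile as a RegExp?
function compiles(source) {
  try {
    new RegExp(source, "gu");
    return true;
  } catch {
    return false;
  }
}

const before = toRegExpSource('("client")', /[.?]/g);           // current escape set
const after = toRegExpSource('("client")', /[.?(){}\[\]]/g);    // patched escape set

console.log(before, compiles(before)); // \("client") false
console.log(after, compiles(after));   // \("client"\) true
```

In this sketch the unpatched output leaves the closing ) unescaped, which is not a valid pattern when compiled on its own; how the viewer handles that case internally is not shown in this issue.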
