Closed
Description
Attach (recommended) or Link to PDF file
Web browser and its version
Google Chrome 143.0.7499.148
Operating system and its version
macOS 26.1 (25B78)
PDF.js version
5.4.449
Is the bug present in the latest PDF.js version?
Yes
Is a browser extension
No
Steps to reproduce the problem
- Open the PDF I shared; the easiest way is to use the demo at https://mozilla.github.io/pdf.js/web/viewer.html
- Search for the string ("client" and you see that the results are highlighted.
- Search for the string ("client") and the results are no longer highlighted. This appears to happen with any string that includes ").
What is the expected behavior?
Search should highlight ("client")
What went wrong?
The search finds no results due to incorrect escaping of the closing ) in ("client").
Link to a viewer
No response
Additional context
My best attempt at understanding the bug is that this appears to be a recent regression introduced in 039b9e4, which changed the regex and added the + in (\p{P}+) at pdf.js/web/pdf_find_controller.js, line 81 in 010e52e:
/([*+^${}()|[\]\\])|(\p{P}+)|(\s+)|(\p{M})|(\p{L})/gu;
With the +, the closing ") is matched as a single punctuation run by the second group, which prevents the ) after the " from getting escaped. CC @calixteman
I frankly don't understand these regexes or the context of this codebase enough to know if it is the right fix, but for example this patch seems to resolve the issue because it escapes the characters that now get greedily matched into p2:
diff --git a/web/pdf_viewer.mjs b/web/pdf_viewer.mjs
index 01379ad82373d47dbe999b64fd1a9bddc70d1764..d009724906a0d02d2b1c020d34fc4d771b2a2fc8 100644
--- a/web/pdf_viewer.mjs
+++ b/web/pdf_viewer.mjs
@@ -1112,7 +1112,7 @@ class PDFFindController {
return addExtraWhitespaces(p1, `\\${p1}`);
}
if (p2) {
- return addExtraWhitespaces(p2, p2.replaceAll(/[.?]/g, "\\$&"));
+ return addExtraWhitespaces(p2, p2.replaceAll(/[.?(){}\[\]]/g, "\\$&"));
}
if (p3) {
return "[ ]+";
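For illustration, here is a minimal sketch of the behavior described above, using the regex quoted in this issue. `tokenize`, `escapeQuery`, and `compiles` are hypothetical helpers, not PDF.js functions. `escapeQuery` models only the p1 and p2 branches of the callback in the patch and leaves out `addExtraWhitespaces` and the remaining branches, so it shows the escaping problem only, not the full query conversion.

```javascript
// Regex as quoted in the issue (web/pdf_find_controller.js).
const SPECIAL_CHARS_REG_EXP =
  /([*+^${}()|[\]\\])|(\p{P}+)|(\s+)|(\p{M})|(\p{L})/gu;

// Hypothetical helper: report which capture group (1-5) consumed each token.
function tokenize(query) {
  return Array.from(query.matchAll(SPECIAL_CHARS_REG_EXP), m => ({
    text: m[0],
    group: m.slice(1).findIndex(g => g !== undefined) + 1,
  }));
}

// Hypothetical, simplified stand-in for the replace callback in the patch:
// only the p1 and p2 branches are modeled; everything else passes through.
function escapeQuery(query, punctuationToEscape) {
  return query.replaceAll(SPECIAL_CHARS_REG_EXP, (match, p1, p2) => {
    if (p1) {
      return `\\${p1}`;
    }
    if (p2) {
      return p2.replaceAll(punctuationToEscape, "\\$&");
    }
    return match;
  });
}

// Returns true if `source` compiles as a Unicode-mode regex.
function compiles(source) {
  try {
    new RegExp(source, "u");
    return true;
  } catch {
    return false;
  }
}

const query = '("client")';
const before = escapeQuery(query, /[.?]/g); // character class before the patch
const after = escapeQuery(query, /[.?(){}\[\]]/g); // character class from the patch

console.log(tokenize(query).at(-1)); // last token is `")`, captured by group 2
console.log(before, compiles(before)); // \("client") false
console.log(after, compiles(after)); // \("client"\) true
```

The opening ( is still escaped through the first group because nothing precedes it in a punctuation run. Only the closing ) is swallowed into p2 together with the ". This sketch does not check whether the character class in the patch is complete: other regex metacharacters that are also in \p{P}, such as * and \, could presumably end up in p2 the same way when they follow another punctuation character.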