Remove unused OpenJPEG wasm fallback logic #19923

Merged
Snuffleupagus merged 1 commit into mozilla:master from nicolo-ribaudo:fix-new-url
May 14, 2025

Conversation

@nicolo-ribaudo
Contributor

@nicolo-ribaudo nicolo-ribaudo commented May 13, 2025

Emscripten generates code that allows the caller to provide the Wasm module (through Module.instantiateWasm), with a fallback in case .instantiateWasm is not provided. We always define instantiateWasm, so we can hard-code the check and let our dead code elimination logic remove the unused fallback.
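
For readers unfamiliar with the hook, here is a minimal sketch of what providing `instantiateWasm` looks like from the caller's side. It is not the pdf.js implementation: `makeModuleConfig` and `EMPTY_WASM` are hypothetical names, the 8-byte binary is simply the smallest valid Wasm module, and instantiation is synchronous only to keep the example short.

```javascript
// Smallest valid WebAssembly binary: magic number plus version, no sections.
// Used here as a stand-in for a real module such as openjpeg.wasm.
const EMPTY_WASM = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Hypothetical helper: builds the config object handed to the Emscripten
// factory. Because instantiateWasm is always defined, the generated
// "else fetch and instantiate the binary myself" path is never taken.
function makeModuleConfig(wasmBytes) {
  return {
    instantiateWasm(imports, successCallback) {
      const module = new WebAssembly.Module(wasmBytes);
      const instance = new WebAssembly.Instance(module, imports);
      // Hand the result back to the runtime: instance first, then module.
      successCallback(instance, module);
    },
  };
}

// Stand-in for the generated runtime calling the hook.
let received = null;
makeModuleConfig(EMPTY_WASM).instantiateWasm({}, (instance, module) => {
  received = { instance, module };
});
```

With the hook guaranteed to exist, a check like `if (Module["instantiateWasm"])` is always true, which is what makes the fallback branch removable.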

This commit also improves the dead code elimination logic so that if a function declaration becomes unused as a result of removing dead code, the function itself is removed.
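
To illustrate the cascading removal, here is a toy sketch, not the actual transform in this PR. It models each function declaration as a list of the functions it calls and repeatedly drops declarations that nothing references. The function names echo the diff, but the call edges are simplified assumptions and `removeUnusedFunctions` is a hypothetical helper.

```javascript
// Toy call graph: function name -> names of the functions it calls.
const callGraph = new Map([
  ["createWasm", ["getWasmImports", "receiveInstance"]],
  ["getWasmImports", []],
  ["receiveInstance", []],
  ["findWasmBinary", ["locateFile"]],
  ["locateFile", []],
  ["instantiateAsync", ["instantiateArrayBuffer"]],
  ["instantiateArrayBuffer", ["getWasmBinary"]],
  ["getWasmBinary", ["getBinarySync"]],
  ["getBinarySync", []],
]);

// Repeatedly delete declarations that are neither roots nor referenced by
// another surviving declaration, until a pass removes nothing.
function removeUnusedFunctions(graph, roots) {
  const live = new Map(graph);
  let changed = true;
  while (changed) {
    changed = false;
    const referenced = new Set(roots);
    for (const [name, callees] of live) {
      for (const callee of callees) {
        // A function calling itself does not keep itself alive.
        if (callee !== name) referenced.add(callee);
      }
    }
    for (const name of [...live.keys()]) {
      if (!referenced.has(name)) {
        live.delete(name);
        changed = true;
      }
    }
  }
  return [...live.keys()];
}

const kept = removeUnusedFunctions(callGraph, ["createWasm"]);
console.log(kept); // [ 'createWasm', 'getWasmImports', 'receiveInstance' ]
```

Each pass can expose new unused declarations (removing `findWasmBinary` orphans `locateFile`, and so on), which is why a single pass is not enough. This simple reference counting would not remove mutually recursive functions.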

Closes #19921.

The diff for pdf.worker.mjs in the generic build is now:

--- ./docs/pdf.worker.old.mjs	2025-05-13 14:04:39.175001871 +0200
+++ ./docs/pdf.worker.new.mjs	2025-05-13 14:09:39.280078580 +0200
@@ -5338,12 +5338,6 @@
     };
     var _scriptName = import.meta.url;
     var scriptDirectory = "";
-    function locateFile(path) {
-      if (Module["locateFile"]) {
-        return Module["locateFile"](path, scriptDirectory);
-      }
-      return scriptDirectory + path;
-    }
     var readAsync, readBinary;
     if (ENVIRONMENT_IS_WEB || ENVIRONMENT_IS_WORKER) {
       try {
@@ -5430,58 +5424,6 @@
       throw e;
     }
     var wasmBinaryFile;
-    function findWasmBinary() {
-      if (Module["locateFile"]) {
-        return locateFile("openjpeg.wasm");
-      }
-      return new URL(
-      /*webpackIgnore: true*/
-      /*@vite-ignore*/
-      "openjpeg.wasm", import.meta.url).href;
-    }
-    function getBinarySync(file) {
-      if (file == wasmBinaryFile && wasmBinary) {
-        return new Uint8Array(wasmBinary);
-      }
-      if (readBinary) {
-        return readBinary(file);
-      }
-      throw "both async and sync fetching of the wasm failed";
-    }
-    async function getWasmBinary(binaryFile) {
-      if (!wasmBinary) {
-        try {
-          var response = await readAsync(binaryFile);
-          return new Uint8Array(response);
-        } catch {}
-      }
-      return getBinarySync(binaryFile);
-    }
-    async function instantiateArrayBuffer(binaryFile, imports) {
-      try {
-        var binary = await getWasmBinary(binaryFile);
-        var instance = await WebAssembly.instantiate(binary, imports);
-        return instance;
-      } catch (reason) {
-        err(`failed to asynchronously prepare wasm: ${reason}`);
-        abort(reason);
-      }
-    }
-    async function instantiateAsync(binary, binaryFile, imports) {
-      if (!binary && typeof WebAssembly.instantiateStreaming == "function") {
-        try {
-          var response = fetch(binaryFile, {
-            credentials: "same-origin"
-          });
-          var instantiationResult = await WebAssembly.instantiateStreaming(response, imports);
-          return instantiationResult;
-        } catch (reason) {
-          err(`wasm streaming compile failed: ${reason}`);
-          err("falling back to ArrayBuffer instantiation");
-        }
-      }
-      return instantiateArrayBuffer(binaryFile, imports);
-    }
     function getWasmImports() {
       return {
         a: wasmImports
@@ -5496,26 +5438,12 @@
         return wasmExports;
       }
       addRunDependency("wasm-instantiate");
-      function receiveInstantiationResult(result) {
-        return receiveInstance(result["instance"]);
-      }
       var info = getWasmImports();
-      if (Module["instantiateWasm"]) {
-        return new Promise((resolve, reject) => {
-          Module["instantiateWasm"](info, (mod, inst) => {
-            resolve(receiveInstance(mod, inst));
-          });
+      return new Promise((resolve, reject) => {
+        Module["instantiateWasm"](info, (mod, inst) => {
+          resolve(receiveInstance(mod, inst));
         });
-      }
-      wasmBinaryFile ??= findWasmBinary();
-      try {
-        var result = await instantiateAsync(wasmBinary, wasmBinaryFile, info);
-        var exports = receiveInstantiationResult(result);
-        return exports;
-      } catch (e) {
-        readyPromiseReject(e);
-        return Promise.reject(e);
-      }
+      });
     }
     class ExitStatus {
       name = "ExitStatus";
@@ -9396,9 +9324,6 @@
   }
   return visitor.buffer;
 }
-function parseJbig2(data) {
-  throw new Error("Not implemented: parseJbig2");
-}
 class SimpleSegmentVisitor {
   onPageInformation(info) {
     this.currentPageInfo = info;
@@ -57200,7 +57125,7 @@
       docId,
       apiVersion
     } = docParams;
-    const workerVersion = "5.2.183";
+    const workerVersion = "5.2.184";
     if (apiVersion !== workerVersion) {
       throw new Error(`The API version "${apiVersion}" does not match ` + `the Worker version "${workerVersion}".`);
     }
@@ -57733,8 +57658,8 @@
 
 ;// ./src/pdf.worker.js
 
-const pdfjsVersion = "5.2.183";
-const pdfjsBuild = "3f1ecc1ba";
+const pdfjsVersion = "5.2.184";
+const pdfjsBuild = "8e82502d6";
 globalThis.pdfjsWorker = {
   WorkerMessageHandler: WorkerMessageHandler
 };

@nicolo-ribaudo
Contributor Author

Oh well, the unused code removal is working properly, so the tests for the transform need to be updated to actually use those functions :)

@Snuffleupagus Snuffleupagus changed the title from "Remove unused OpenJPG wasm fallback logic" to "Remove unused OpenJPEG wasm fallback logic" on May 13, 2025
Collaborator

@Snuffleupagus Snuffleupagus left a comment

Please also fix the typo in the commit message: OpenJPG -> OpenJPEG.

Emscripten generates code that allows the caller to provide the Wasm
module (through Module.instantiateWasm), with a fallback in case
.instantiateWasm is not provided. We always define instantiateWasm, so
we can hard-code the check and let our dead code elimination logic
remove the unused fallback.

This commit also improved the dead code elimination logic so that if
a function declaration becomes unused as a result of removing dead
code, the function itself is removed.
@nicolo-ribaudo
Contributor Author

Rebased to include the changes that removed import.meta.url from qcms.

I couldn't find a nice way to test for this. Instead, I made the webpack build fail (after emitting the files) if the generated bundle contains new URL(..., import.meta.url).
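
A check along those lines could look roughly like the sketch below. This is an assumption about the shape of such a guard, not the code added to the pdf.js build: `assertNoImportMetaUrl` and the regular expression are hypothetical, and the pattern is a heuristic that would miss calls whose first argument itself contains a closing parenthesis.

```javascript
// Heuristic pattern: "new URL(" followed, before the first ")", by
// "import.meta.url". The negated class also spans newlines and comments.
const NEW_URL_IMPORT_META = /new\s+URL\s*\([^)]*import\.meta\.url/;

// Hypothetical guard: throws if the emitted bundle text still contains the
// construct, so the build can be failed after the files are written.
function assertNoImportMetaUrl(bundleSource, fileName) {
  if (NEW_URL_IMPORT_META.test(bundleSource)) {
    throw new Error(`${fileName} contains new URL(..., import.meta.url)`);
  }
}
```

A plain `import.meta.url` reference, such as the `_scriptName` assignment that remains in the diff above, does not trip the check; only the `new URL(...)` form does.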

@Snuffleupagus
Collaborator

/botio test

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/8630af930651a96/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/f66c94119c82701/output.txt

@moz-tools-bot
Collaborator

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/f66c94119c82701/output.txt

Total script time: 30.56 mins

  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: Passed

@moz-tools-bot
Collaborator

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/8630af930651a96/output.txt

Total script time: 61.79 mins

  • Unit tests: Passed
  • Integration Tests: FAILED
  • Regression tests: Passed

Collaborator

@Snuffleupagus Snuffleupagus left a comment

r=me, thank you.

@Snuffleupagus Snuffleupagus merged commit d4d0081 into mozilla:master May 14, 2025
9 checks passed
@nicolo-ribaudo nicolo-ribaudo deleted the fix-new-url branch May 14, 2025 10:22
@johan44co

The removal of the fallback may be causing issues now:

Screenshot 2025-07-01 at 10 59 35

@nicolo-ribaudo
Contributor Author

This PR only removes unused code. It seems like you forgot to pass the wasmUrl parameter? (It's required since v5.0)

@cjlozanoDev

This PR only removes unused code. It seems like you forgot to pass the wasmUrl parameter? (It's required since v5.0)

Screenshot 2025-07-03 at 18 22 27

I'm running into exactly the same problem as the commenter above.

In my code I have this:

import * as pdfjsLib from 'pdfjs-dist'
import 'pdfjs-dist/build/pdf.worker.min.mjs'

const loadPdf = async () => {
  const canvas = pdfCanvas.value
  const context = canvas.getContext('2d')

  const loadingTask = pdfjsLib.getDocument({ data: base64ToUint8Array(pdfBase64.value) })

......

I tried copying the openjpeg.wasm file from the library's wasm folder in node_modules to my public folder. Then I did this:

const loadingTask = pdfjsLib.getDocument({
  data: base64ToUint8Array(pdfBase64.value),
  workerOptions: {
    wasmUrl: '/openjpeg.wasm', // copied from node_modules/pdfjs-dist into my public directory
    wasmFallbackUrl: '/openjpeg_nowasm_fallback.js', // copied from node_modules/pdfjs-dist into my public directory
    standardFontDataUrl: '/standard_fonts/' // copied from node_modules/pdfjs-dist into my public directory
  }
})

I've tried many ways and can't get it to work. With the previous version, I was able to load PDFs with JPX images without any problem.

Can you please tell me how to configure this now?

Thank you!

@johan44co

The issue was solved by passing the wasmUrl parameter. The wasm folder can be found in the pdfjs-dist package, or it can be loaded from a CDN:

pdfjs.getDocument({
  url: "...",
  wasmUrl: `https://unpkg.com/pdfjs-dist@${pdfjs.version}/wasm/`,
})

@cjlozanoDev

The issue was solved by passing the wasmUrl parameter, the wasm folder can be found in the pdfjs-dist package or using a CDN:

pdfjs.getDocument({
  url: "...",
  wasmUrl: `https://unpkg.com/pdfjs-dist@${pdfjs.version}/wasm/`,
})

Thanks a lot.

I was going to comment that I'd solved it. In my case, I copied the wasm folder from pdfjs-dist in node_modules to a directory in my project. In a Vite project, I copy it directly to public. With Webpack, if there's no public folder, I use a plugin to copy it to the desired output folder as part of the build. This way, it's available at runtime (so it has a URL). Finally, as you've mentioned, I pass the wasmUrl parameter pointing to the URL where this folder is served. Thanks again for your response.
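
For anyone landing here later, the setup described above can be sketched roughly as follows. This is a sketch under assumptions, not an official recipe: `buildDocumentParams` and `withTrailingSlash` are hypothetical helpers, and `/wasm` stands for wherever the copied folder ends up being served. The trailing slash is added because the value appears to be used as a prefix for the file name (compare the earlier example, which ends in `/wasm/`).

```javascript
// Make sure a folder URL ends with "/", since the file name gets appended.
function withTrailingSlash(url) {
  return url.endsWith("/") ? url : `${url}/`;
}

// Hypothetical helper: wasmUrl goes at the top level of the getDocument
// parameters (not under workerOptions) and points at the folder,
// not at openjpeg.wasm itself.
function buildDocumentParams(data, wasmDir) {
  return { data, wasmUrl: withTrailingSlash(wasmDir) };
}

// "/wasm" is an assumed path for the folder copied into public/.
const params = buildDocumentParams(new Uint8Array(0), "/wasm");
console.log(params.wasmUrl); // "/wasm/"
```

The resulting object would then be passed to `pdfjsLib.getDocument(params)` in place of the `workerOptions` attempt shown earlier.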

Juneezee added a commit to AlphaCloudTechnologies/pdfpea that referenced this pull request Dec 30, 2025
Opening a PDF with JPEG 2000 images causes these warnings:

  Warning: JpxImage#instantiateWasm: UnknownErrorException: Ensure that the `wasmUrl` API parameter is provided.
  Warning: JpxImage#getJsModule: TypeError: Failed to resolve module specifier 'nullopenjpeg_nowasm_fallback.js'
  Warning: Unable to decode image "img_p0_2": "JpxError: OpenJPEG failed to initialize".

In version 5.3.31 [1], the OpenJPEG wasm fallback logic was removed in
PR mozilla/pdf.js#19923. The wasmUrl option is required since v5 [2] to
decode JPEG 2000 images.

There are two solutions to this:

	1. Update Webpack config to copy the
	   node_modules/pdfjs-dist/wasm folder to static folder.
	2. Load the openjpeg.wasm file from a CDN like jsDelivr.

Method 2 is preferred because jsDelivr CDN sets the Cache-Control header
and these resources will be cached by the browser correctly.

This commit fixes the bug by providing the wasmUrl to jsDelivr CDN. We also
provide the cMapUrl and iccUrl to avoid similar warnings from happening
in the future.

[1]: https://github.com/mozilla/pdf.js/releases/tag/v5.3.31
[2]: https://github.com/mozilla/pdf.js/releases/tag/v5.0.375

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
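
As a rough illustration of the CDN approach described in that commit message, the URLs could be derived from the package version as below. This is not the code from the referenced commit: `jsDelivrAssetUrls` is a hypothetical helper, and the `cmaps/` and `iccs/` folder names are assumptions about the pdfjs-dist package layout (only the `wasm/` folder is confirmed in this thread).

```javascript
// Hypothetical helper: builds jsDelivr folder URLs for a given pdfjs-dist
// version. Each URL ends with "/" because file names are appended to it.
function jsDelivrAssetUrls(version) {
  const base = `https://cdn.jsdelivr.net/npm/pdfjs-dist@${version}`;
  return {
    wasmUrl: `${base}/wasm/`,
    cMapUrl: `${base}/cmaps/`, // assumed folder name
    iccUrl: `${base}/iccs/`, // assumed folder name
  };
}

// 5.3.31 is the release mentioned in the commit message above.
const assetUrls = jsDelivrAssetUrls("5.3.31");
console.log(assetUrls.wasmUrl);
// "https://cdn.jsdelivr.net/npm/pdfjs-dist@5.3.31/wasm/"
```

Pinning the URL to the exact installed version keeps the fetched wasm file in step with the worker code.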

Development

Successfully merging this pull request may close these issues.

[Bug]: Invalid new URL usage in build/pdf.worker.mjs

6 participants