Parsoid fails on PHP 8.4+ during MW installation with 'Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found'
Closed, Resolved (Public)

Description

See: https://integration.wikimedia.org/ci/job/quibble-for-mediawiki-core-vendor-mysql-php84/5/console

15:47:42 [250e8a55b4df40e8c3d5f0dd] [no req]   Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found
15:47:42 Backtrace:
15:47:42 from /workspace/src/vendor/wikimedia/parsoid/src/Utils/DOMCompat.php(112)
15:47:42 #0 /workspace/src/includes/Parser/ContentHolder.php(96): Wikimedia\Parsoid\Utils\DOMCompat::newDocument()
15:47:42 #1 /workspace/src/includes/Parser/ParserOutput.php(368): MediaWiki\Parser\ContentHolder::createEmpty()
15:47:42 #2 /workspace/src/includes/Output/OutputPage.php(439): MediaWiki\Parser\ParserOutput->__construct()
15:47:42 #3 /workspace/src/includes/Context/RequestContext.php(331): MediaWiki\Output\OutputPage->__construct()
15:47:42 #4 /workspace/src/includes/Setup.php(571): MediaWiki\Context\RequestContext->getOutput()
15:47:42 #5 /workspace/src/maintenance/doMaintenance.php(71): require_once(string)
15:47:42 #6 /workspace/src/maintenance/install.php(278): require_once(string)
15:47:42 #7 {main}
15:47:42 <<< Finish: Install MediaWiki, db=<MySQL /workspace/db/quibble-mysql-cd7j8qln/socket>, in 0.191 s

Event Timeline

Confirming that this is seen on e.g. https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/1198563 with https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-php84/26/console:

00:03:12.074 php maintenance/install.php --scriptpath= --server=http://127.0.0.1:9413 --dbtype=mysql --dbname=wikidb --dbuser=wikiuser --dbpass=secret --dbserver=localhost:/workspace/db/quibble-mysql-mt_y5fb4/socket --with-extensions --pass=testwikijenkinspass TestWiki WikiAdmin
00:03:12.074 [7dd8836f0ce44a852b302ce4] [no req]   Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found
00:03:12.074 Backtrace:
00:03:12.074 from /workspace/src/vendor/wikimedia/parsoid/src/Utils/DOMCompat.php(112)
00:03:12.074 #0 /workspace/src/includes/Parser/ContentHolder.php(96): Wikimedia\Parsoid\Utils\DOMCompat::newDocument()
00:03:12.074 #1 /workspace/src/includes/Parser/ParserOutput.php(368): MediaWiki\Parser\ContentHolder::createEmpty()
00:03:12.074 #2 /workspace/src/includes/Output/OutputPage.php(439): MediaWiki\Parser\ParserOutput->__construct()
00:03:12.074 #3 /workspace/src/includes/Context/RequestContext.php(331): MediaWiki\Output\OutputPage->__construct()
00:03:12.074 #4 /workspace/src/includes/Setup.php(571): MediaWiki\Context\RequestContext->getOutput()
00:03:12.074 #5 /workspace/src/maintenance/doMaintenance.php(71): require_once(string)
00:03:12.074 #6 /workspace/src/maintenance/install.php(278): require_once(string)
00:03:12.074 #7 {main}
Jdforrester-WMF renamed this task from "check experimental" fails with Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found (php8.4) to Parsoid fails PHP 8.4 during MW installation with 'Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found'. Dec 5 2025, 5:56 PM

This was reported on mediawiki.org support desk, when someone attempted to upgrade to 1.45. https://www.mediawiki.org/wiki/Project:Support_desk#1.45.0_update.php_can't_find_Parsoid

I'm guessing this is because PHP 8.4 introduces an HTMLDocument class of its own?

I had the same issue. Solved this by enabling "extension=sodium" in php.ini (in PHP folder, using PHP 8.4.15).

Enabling "extension=sodium" in php.ini makes absolutely no difference here, using PHP 8.4.7.

In fact, this is the first time this has happened to me, but I upgraded to PHP 8.5 and 1.45.1 at the same time. Making two changes in one step is never a good idea.

The problem is that Wikimedia\Parsoid\DOM\HTMLDocument isn't properly autoloaded. I solved it temporarily like this:

require_once('vendor/wikimedia/parsoid/src/DOM/HTMLDocument.php');

Insert this line right before $doc = HTMLDocument::createEmpty( "UTF-8" ); in vendor/wikimedia/parsoid/src/Utils/DOMCompat.php.

My installation failed completely.
I asked the upgrader to make a backup, but I can't find it.
https://empty3.one/wikilibre/index.php/Sp%C3%A9cial:Connexion

Reedy triaged this task as High priority. Sat, Dec 20, 1:35 PM

Change #1219970 had a related patch set uploaded (by Reedy; author: Reedy):

[mediawiki/services/parsoid@master] DOMCompat: Force late autoload on PHP84 for HTMLDocument

https://gerrit.wikimedia.org/r/1219970

Perhaps worth noting that I do get this error if I clone the mediawiki/vendor repo, and use that as MediaWiki's vendor folder; but I don't get the error after running composer update from the mediawiki/core repository itself.
I can reproduce the problem if I instead run composer update -a / composer dump-autoload -a, though. So this error occurs if authoritative class maps are enabled within Composer (which they are for MediaWiki-Vendor), and it doesn't if they're not.
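To illustrate the difference described above, here is a minimal single-file sketch. It is not Composer's actual autoloader: `makeLoader`, `Impl`, `AliasA` and `AliasB` are hypothetical names, and closures stand in for files on disk. The assumption is that in authoritative mode a class missing from the generated map is treated as nonexistent, while otherwise the loader falls back to locating the file by its path.

```php
<?php
namespace Demo;

// The real implementation; the alias names below play the role of HTMLDocument.
class Impl {
}

/**
 * Build a toy autoloader. $classMap mimics the scanned class map; $byPath mimics
 * a path-based (PSR-4 style) lookup that finds a file from the class name alone.
 */
function makeLoader( array $classMap, array $byPath, bool $authoritative ): \Closure {
	return static function ( string $class ) use ( $classMap, $byPath, $authoritative ): void {
		if ( isset( $classMap[$class] ) ) {
			$classMap[$class]();
			return;
		}
		if ( $authoritative ) {
			// Authoritative mode: anything missing from the map is "not found".
			return;
		}
		if ( isset( $byPath[$class] ) ) {
			$byPath[$class]();
		}
	};
}

// "Files" that only call class_alias(): a declaration scanner finds no class
// in them, so the class map passed to makeLoader() below stays empty.
$byPath = [
	'Demo\AliasA' => static fn () => class_alias( Impl::class, 'Demo\AliasA' ),
	'Demo\AliasB' => static fn () => class_alias( Impl::class, 'Demo\AliasB' ),
];

// Authoritative map: the alias is never created.
$strict = makeLoader( [], $byPath, true );
spl_autoload_register( $strict );
$foundStrict = class_exists( 'Demo\AliasA' );
spl_autoload_unregister( $strict );

// Non-authoritative: the path-based fallback runs the file and creates the alias.
$lenient = makeLoader( [], $byPath, false );
spl_autoload_register( $lenient );
$foundLenient = class_exists( 'Demo\AliasB' );
spl_autoload_unregister( $lenient );

var_dump( $foundStrict, $foundLenient );
```

Run as a standalone script, this should report the strict lookup as not found and the lenient one as found, which matches the behaviour seen with and without `-a`.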

This could probably be solved by adding

		"files": [
			"src/DOM/DOMException.php",
			"src/DOM/DOMImplementation.php",
			"src/DOM/DOMParser.php",
			"src/DOM/HTMLDocument.php"
		]

to the autoload property in Parsoid's composer.json (AFAICS, these are the files within src/DOM that don't declare a class/interface/etc. and therefore, IIUC, aren't picked up by Composer when it generates a class map), and I could get a patch up for that if desired. However, the question is probably whether this issue should be solved that way, given that (IIUC from reading https://getcomposer.org/doc/04-schema.md#files) this would result in these files being unconditionally loaded on every request.

That seems a reasonable way forward to me. I guess that's what it is there for.

We will be able to clean it up when the minimum supported PHP version is bumped further.

Please make a patch!

Change #1220366 had a related patch set uploaded (by A smart kitten; author: A smart kitten):

[mediawiki/services/parsoid@master] composer.json: Add some files in `src/DOM` to `autoload.files`

https://gerrit.wikimedia.org/r/1220366

Change #1219970 abandoned by Reedy:

[mediawiki/services/parsoid@master] DOMCompat: Force late autoload on PHP84 for HTMLDocument

https://gerrit.wikimedia.org/r/1219970

Novem_Linguae renamed this task from Parsoid fails PHP 8.4 during MW installation with 'Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found' to Parsoid fails PHP 8.4 CI during MW installation with 'Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found'. Mon, Jan 5, 11:02 AM
A_smart_kitten renamed this task from Parsoid fails PHP 8.4 CI during MW installation with 'Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found' to Parsoid fails on PHP 8.4+ during MW installation with 'Error: Class "Wikimedia\Parsoid\DOM\HTMLDocument" not found'. Mon, Jan 5, 12:35 PM

This could probably be solved by adding

		"files": [
			"src/DOM/DOMException.php",
			"src/DOM/DOMImplementation.php",
			"src/DOM/DOMParser.php",
			"src/DOM/HTMLDocument.php"
		]

to the autoload property in Parsoid's composer.json (AFAICS, these are the files within src/DOM that don't declare a class/interface/etc. and therefore, IIUC, aren't picked up by Composer when it generates a class map), and I could get a patch up for that if desired. However, the question is probably whether this issue should be solved that way, given that (IIUC from reading https://getcomposer.org/doc/04-schema.md#files) this would result in these files being unconditionally loaded on every request.

Update: After discussion on the patch, I think we're going instead with the option of adding class declarations wrapped in if ( false ) to these four files.
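For anyone unfamiliar with the `if ( false )` trick, a rough sketch of the idea follows. This is not the content of the merged patch: the `Demo\DOM` namespace and `LegacyDocument` are hypothetical, and the sketch assumes that the class-map scanner matches declarations textually rather than executing the file.

```php
<?php
namespace Demo\DOM;

// Hypothetical implementation class. In a real file the alias target could
// depend on the running PHP version.
class LegacyDocument {
}

// The name callers use is created at runtime, which a scanner that only
// looks for class declarations cannot see.
class_alias( LegacyDocument::class, 'Demo\DOM\HTMLDocument' );

if ( false ) {
	// Never executed, so it cannot clash with the alias above. Its only job
	// is to put a literal "class HTMLDocument" declaration into the file so
	// that the file gets listed in a generated class map.
	class HTMLDocument {
	}
}

// The alias resolves to the real implementation; the stub is never declared.
$doc = new HTMLDocument();
$isLegacy = $doc instanceof LegacyDocument;
var_dump( $isLegacy );
```

Unlike the `files` approach, nothing extra is loaded per request: the file is still only included when the autoloader is asked for the class.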

Change #1220366 merged by jenkins-bot:

[mediawiki/services/parsoid@master] Add (unreachable) class declarations to four files in `src/DOM`

https://gerrit.wikimedia.org/r/1220366

Change #1224160 had a related patch set uploaded (by Reedy; author: A smart kitten):

[mediawiki/services/parsoid@REL1_45] Add (unreachable) class declarations to four files in `src/DOM`

https://gerrit.wikimedia.org/r/1224160

Change #1224160 merged by jenkins-bot:

[mediawiki/services/parsoid@REL1_45] Add (unreachable) class declarations to four files in `src/DOM`

https://gerrit.wikimedia.org/r/1224160

The problem is that Wikimedia\Parsoid\DOM\HTMLDocument isn't properly autoloaded. I solved it temporarily like this:

require_once('vendor/wikimedia/parsoid/src/DOM/HTMLDocument.php');

Insert this line right before $doc = HTMLDocument::createEmpty( "UTF-8" ); in vendor/wikimedia/parsoid/src/Utils/DOMCompat.php.

It really helps, thank you. However, articles and their previews still don't load properly. You can work around that as follows: open DOMBuilder.php in vendor/wikimedia/remex-html/src/DOM, find this line:

$this->domImplementation = $options['domImplementation'] ??

and add this above it:

require_once('vendor/wikimedia/parsoid/src/DOM/DOMImplementation.php');

After that, your wiki will work fully on newer PHP.

Please note that we've fixed this properly in the release branches, so these hacks won't be necessary when those releases are made.

Change #1225504 had a related patch set uploaded (by Isabelle Hurbain-Palatin; author: Isabelle Hurbain-Palatin):

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.23.0-a11

https://gerrit.wikimedia.org/r/1225504

Change #1225504 merged by jenkins-bot:

[mediawiki/vendor@master] Bump wikimedia/parsoid to 0.23.0-a11

https://gerrit.wikimedia.org/r/1225504

OK, so AIUI remaining work here on this task:

  • Backport 704745 and 1220366 to REL1_44, REL1_43 (REL1_45 already done in the original patch and 1224160 (https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/1224160), respectively).
  • For each version of MW,
    • Make a release of Parsoid with the above patch
    • Pull the new release into MW core and MW vendor
    • Apply the patch for T413573 for core
    • Apply the corresponding patch(es) for extensions/skins, once they exist, that are in the tarball.
  • When the next Quarterly release of MW happens, announce PHP 8.4 being supported, and (only then) adjust the on-wiki claims about PHP 8.4 support.

Does that sound right?

  • Backport 1220366 to REL1_44, REL1_43 (REL1_45 already done in 1224160).

The Parsoid code in question was - AFAICS - introduced in a patch that was first released in MW-1.45-release; so I don't know if any more backports are actually needed for this task :)

Yes, sorry, the original PHP 8.4 support patches need to be backported as well as your fix to make it work. Edited!

Ah, okay :] In which case, I kinda defer to the Parsoid team on whether https://gerrit.wikimedia.org/r/c/mediawiki/services/parsoid/+/704745 should be backported. From reading the commit message, it's unclear to me whether that's necessary for Parsoid to work properly on >=PHP 8.4, or whether it (e.g.) modifies Parsoid to use some features that are newly available in PHP 8.4.
However... if it's not the former, I'm personally currently quite cautious about the idea of backporting it to 1.44/1.43; especially as (from testing locally just now) it doesn't seem to necessarily be backwards-compatible (and I'm not sure if we should be backporting code that doesn't have backwards-compatibility). Taking the PHPUnit tests I ran in T413573#11509656 as an example, they currently succeed for me locally when running PHP 8.4 on REL1_44; however, if I cherry-pick 704745 into REL1_44's vendor/wikimedia/parsoid (& fix some merge conflicts), 19 of the same tests then fail.