Address long-running queries and OOM contributor in PHP replacement by brandonpayton · Pull Request #180 · wp-cli/search-replace-command

brandonpayton · 2023-05-09T22:23:04Z

This change makes PHP replacement queries and memory usage more efficient in order to address issues we've encountered with long-running queries and OOM conditions with certain sites.

It improves the primary key SELECT queries by eliminating the use of OFFSET because OFFSET requires that the database consider all rows up to OFFSET before taking rows up to the LIMIT. This can make queries become slower as OFFSET increases. The new SELECT query relies on primary key conditions to more efficiently eliminate previous rows from consideration. This way, the database can use an index to identify rows with keys greater than those of the previous chunk and then take rows from that set up to the LIMIT.

It improves memory usage by doing updates along the way rather than storing all a column's updated values in memory until the end. At Automattic, when we limit search-replace to 4GB of memory, we sometimes exceed that limit for large sites. It's possible there are other things that contribute to high memory usage within the search-replace command, but as a first step, we can reduce memory requirements by no longer keeping all updated column values in memory simultaneously.

This is a resurrection of #176 which was closed so we could exercise these updates in production before sharing them with the community. The --regex path of this change has been used in production for the last 3-4 months, and we have encountered no issues so far.

Prior to this PR, there were tests for the chunked --precise search-replace but none for the chunked --regex path, so this PR adds tests for chunked --regex replacement based on the --precise tests.

danielbachhuber · 2023-05-09T22:28:02Z

Thanks, @brandonpayton !

I asked @schlessera for his thoughts about including this in the v2.8.0 release: https://wordpress.slack.com/archives/C02RP4T41/p1683671164356189

brandonpayton · 2023-05-10T00:57:22Z

Thanks, @brandonpayton !

I asked @schlessera for his thoughts about including this in the v2.8.0 release: https://wordpress.slack.com/archives/C02RP4T41/p1683671164356189

Thank you!

schlessera · 2023-05-10T15:27:41Z

GHA is acting up... :/

brandonpayton · 2023-05-10T15:51:08Z

Should I manually rebase and squash this one before merge? I guess it could be a way to try triggering GHA at least, but asking first so I don't get in your way.

This change makes PHP replacement queries and memory usage more efficient. It improves the primary key SELECT queries by eliminating the use of OFFSET because OFFSET requires that the database consider all rows up to OFFSET before taking rows up to the LIMIT. The new query relies on primary key conditions to more efficiently eliminate previous rows from consideration. This way, the database can use an index to identify rows with keys greater than those of the previous chunk. It improves memory usage by doing updates along the way rather than storing all a column's updates in memory until the end. At Automattic, when we limit search-replace to 4GB of memory, we sometimes exceed that limit for large sites. It's possible there are other things that contribute to high memory usage within the search-replace command, but as a first step, we can reduce memory requirements by no longer keeping all updated column values in memory simultaneously.

brandonpayton · 2023-05-10T16:45:27Z

Squashed and pushed. Planning to leave this be for now unless you ask for other changes :) Didn't seem to trigger any actions.

schlessera · 2023-05-11T08:09:54Z

Thanks for the PR and the extended testing, @brandonpayton !

brandonpayton requested a review from a team as a code owner May 9, 2023 22:23

brandonpayton mentioned this pull request May 10, 2023

Fix regex invalid delimiter test for PHP 8.2 #181

Merged

schlessera added the command:search-replace label May 10, 2023

schlessera added this to the 2.0.21 milestone May 10, 2023

schlessera approved these changes May 10, 2023

View reviewed changes

schlessera force-pushed the make-php-replacement-more-efficient branch from 1b70852 to 3872c8d Compare May 10, 2023 15:23

brandonpayton force-pushed the make-php-replacement-more-efficient branch from c7d639b to 56faa0c Compare May 10, 2023 16:43

Retrigger checks

089244d

schlessera merged commit c238204 into wp-cli:main May 11, 2023

brandonpayton deleted the make-php-replacement-more-efficient branch May 11, 2023 15:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Address long-running queries and OOM contributor in PHP replacement#180

Address long-running queries and OOM contributor in PHP replacement#180
schlessera merged 2 commits intowp-cli:mainfrom
brandonpayton:make-php-replacement-more-efficient

brandonpayton commented May 9, 2023

Uh oh!

danielbachhuber commented May 9, 2023

Uh oh!

brandonpayton commented May 10, 2023

Uh oh!

schlessera commented May 10, 2023

Uh oh!

brandonpayton commented May 10, 2023

Uh oh!

brandonpayton commented May 10, 2023

Uh oh!

schlessera commented May 11, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

brandonpayton commented May 9, 2023

Uh oh!

danielbachhuber commented May 9, 2023

Uh oh!

brandonpayton commented May 10, 2023

Uh oh!

schlessera commented May 10, 2023

Uh oh!

brandonpayton commented May 10, 2023

Uh oh!

brandonpayton commented May 10, 2023

Uh oh!

schlessera commented May 11, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants