Comments: Allow WP_Comment_Query fields to select arbitrary columns#11934
Comments: Allow WP_Comment_Query fields to select arbitrary columns#11934dd32 wants to merge 3 commits into
Conversation
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the Core Committers: Use this line as a base for the props when committing in SVN: To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
Test using WordPress PlaygroundThe changes in this pull request can previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser. Some things to be aware of
For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation. |
9115f83 to
5f33400
Compare
Extends the `fields` argument of `WP_Comment_Query` to accept:
- A column name from the `$wpdb->comments` table (e.g.
`'comment_post_ID'`). Returns a flat array of distinct values for
that column.
- A `'col_a=>col_b'` pair (e.g. `'comment_ID=>comment_post_ID'`).
Returns an associative array keyed by col_a's value, with col_b's
value as the entry — mirroring `WP_Query` / `WP_Term_Query`'s
`'id=>parent'` idiom.
- An array of column names. Returns an array of `stdClass` objects
with the requested columns as properties (one entry per matched
comment, no deduplication).
The motivation is to avoid hydrating full `WP_Comment` objects when the
caller only needs a subset of columns. In one reference case
(gp-translation-helpers), hydrating ~32K comments to extract distinct
post IDs took ~6.6s; the equivalent raw SQL took ~610ms (~11x), all of
it from skipping hydration.
Single-column results apply `DISTINCT` in SQL so callers can drop the
surrounding `array_unique( wp_list_pluck( ... ) )`. The array form does
not — it returns one row per matched comment, like a hand-written
projection.
Field-selection results are cached separately from the base
comment-ID query, sharing the comment `last_changed` salt so any
comment mutation invalidates both layers together. A repeat call with
identical args runs zero SQL queries.
Column names must be passed in their exact case; the only exception is
the `ID` suffix of `comment_ID` / `comment_post_ID`, which is accepted
in any case.
`'ids'` and `''` retain their existing meaning; the new behaviour is
purely additive.
Props dd32.
See #65313.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5f33400 to
908e631
Compare
westonruter
left a comment
There was a problem hiding this comment.
An initial review pass. I'll have some more suggestions, but I'm short on time for this week.
| * Return shapes: | ||
| * - `array( 'col', $column )` — single column, DISTINCT. | ||
| * - `array( 'map', $key_col, $val_col )` — `'col_a=>col_b'` associative map. | ||
| * - `array( 'list', $columns )` — array form, returns `stdClass[]`. |
There was a problem hiding this comment.
I was analyzing this method with PHPStan and this doesn't seem to be correct. It seems this is actually:
| * - `array( 'list', $columns )` — array form, returns `stdClass[]`. | |
| * - `array( 'list', $columns )` — array form, returns `string[]`. |
There was a problem hiding this comment.
I merged this, as it sounded right, and then Claude told me no-no.
Also worth re-checking: line 1048 was changed from returns stdClass[] to returns string[]. The user-facing return of the 'list' case is still stdClass[] (from $wpdb->get_results() default OBJECT mode). westonruter's PHPStan was describing the tuple payload (which holds string[] column names) — that's true but ambiguous in the prose docblock, which reads as "what the user gets back". Either keep stdClass[] and reword the line to make clear it describes the tuple's payload separately, or drop the "returns" suffix entirely.
There was a problem hiding this comment.
I don't understand. What I see is this:
$columns = array();
foreach ( $fields as $field ) {
$col = is_string( $field ) ? $this->parse_field_column( $field ) : null;
if ( null !== $col ) {
$columns[ $col ] = $col;
}
}
return $columns ? array( 'list', array_values( $columns ) ) : null;The $columns variable is only ever populated with the return value of $this->parse_field_column( $field ), in both its keys and values. That method returns a string (if not null).
How could it be stdClass?
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Drops the stray `*/` and duplicate `@param` on `parse_fields()` that made the file fail to parse. Promotes the list of `$wpdb->comments` columns to a `private const COLUMNS`, reused by both `parse_field_column()` and `parse_orderby()`. See #65313. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
c40fb88 to
6174765
Compare
Trac ticket: https://core.trac.wordpress.org/ticket/65313
Summary
Extends
WP_Comment_Query'sfieldsargument to accept:$wpdb->comments(e.g.'comment_post_ID') — returns a flat array of distinct values for that column.'col_a=>col_b'pair (e.g.'comment_ID=>comment_post_ID') — returns an associative array keyed by col_a's value with col_b's value as the entry, mirroring the'id=>parent'idiom inWP_QueryandWP_Term_Query.'ids'and''(default) are unchanged — the new behaviour is purely additive.Why
Plugins that want "the distinct
comment_post_IDs matching X" have three options today, none good:fields => ''+array_column( $q->comments, 'comment_post_ID' )— hydrates every matched comment just to discard all but one field.fields => 'ids'+get_comment( $id )->comment_post_IDin a loop — N+1 cache lookups.In the gp-translation-helpers reference case in the Trac ticket, hydrating ~32K comments to extract distinct post IDs took ~6.6s; the equivalent raw SQL returned the same 25,694 IDs in ~610ms (~11x speedup, all of it from skipping hydration).
API
Column names must be passed in their exact case. The only exception is the
IDsuffix ofcomment_ID/comment_post_ID, which is accepted in any case (comment_id,comment_Id,comment_ID). Unknown columns fall through to the default behaviour (WP_Commentobjects), keeping the path forward-compatible.Test plan
npm run test:php -- tests/phpunit/tests/comment/query.php— 166 tests / 489 assertions pass.npm run test:php -- --group comment— 540 tests / 1367 assertions pass (no regressions).'col_a=>col_b'map form returns associative array.IDsuffix on both sides of=>; strict case for non-IDcolumns.WP_Commentobjects.'ids'semantics unchanged.array()without a follow-up query.phpcs --standard=phpcs.xml.distclean on both changed files (only pre-existing warnings on untouched lines).🤖 Generated with Claude Code