Skip to content

[62953]: Adds post_ancestor query prop#8338

Open
PaulREnglish wants to merge 4 commits into
WordPress:trunkfrom
PaulREnglish:feature/62953-add-ancestor-query-prop
Open

[62953]: Adds post_ancestor query prop#8338
PaulREnglish wants to merge 4 commits into
WordPress:trunkfrom
PaulREnglish:feature/62953-add-ancestor-query-prop

Conversation

@PaulREnglish

@PaulREnglish PaulREnglish commented Feb 17, 2025

Copy link
Copy Markdown

Summary

Adds post_ancestor property to the query class and exposes it in the REST API. This property is used for fetching all descendants of a page. To see why this is needed, see the corresponding issue on Gutenberg:

Trac ticket: https://core.trac.wordpress.org/ticket/62953

Changelog

  • Adds post_ancestor property to the query class.

Testing Steps

To test that the property works in the REST API:

  • Create a page and ensure that page has children and grandchildren.
  • Using postman or otherwise, visit http://localhost:8889/wp-json/wp/v2/pages?parent=0&ancestor=<your page's id>
  • You should get an array of all the descendant pages.

To test that the property will work using the query block:

  • install the Gutenberg repo and switch to this branch: "PaulREnglish:feature/54461-query-block-parent-child-toggles"
  • Create a page and ensure that page has children and grandchildren.
  • Edit the ancestor page and insert a query block
  • Set the query block's post type to "page" and insert the "Parents" filter
  • Turn on the set parent as current page filter and the show all descendants filter
  • The query block should show all the descendant pages of the current page (it should not just be showing the child pages).
  • Save the page and go to frontend
  • All the descendant pages should be visible there too.

Image/Video

Screen.Capture.on.2025-02-17.at.15-59-57.mp4

This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

@github-actions

Copy link
Copy Markdown

Hi @PaulREnglish! 👋

Thank you for your contribution to WordPress! 💖

It looks like this is your first pull request to wordpress-develop. Here are a few things to be aware of that may help you out!

No one monitors this repository for new pull requests. Pull requests must be attached to a Trac ticket to be considered for inclusion in WordPress Core. To attach a pull request to a Trac ticket, please include the ticket's full URL in your pull request description.

Pull requests are never merged on GitHub. The WordPress codebase continues to be managed through the SVN repository that this GitHub repository mirrors. Please feel free to open pull requests to work on any contribution you are making.

More information about how GitHub pull requests can be used to contribute to WordPress can be found in the Core Handbook.

Please include automated tests. Including tests in your pull request is one way to help your patch be considered faster. To learn about WordPress' test suites, visit the Automated Testing page in the handbook.

If you have not had a chance, please review the Contribute with Code page in the WordPress Core Handbook.

The Developer Hub also documents the various coding standards that are followed:

Thank you,
The WordPress Project

@github-actions

Copy link
Copy Markdown

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@PaulREnglish PaulREnglish marked this pull request as ready for review February 20, 2025 12:34
@github-actions

github-actions Bot commented Feb 20, 2025

Copy link
Copy Markdown

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props bbpaule, peterwilsoncc.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@peterwilsoncc peterwilsoncc left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the way get_page_children() works, this will return incorrect data when querying posts with different dates set. When the query also includes the number of found_rows, that number will also be incorrect as it's not reduced by filtering.

This test demonstrates the issue:

/**
 * Ensure post ancestor queries return correct posts when dates are set.
 *
 * @ticket 62953
 */
public function test_query_post_ancestor_with_post_dates() {
	register_post_type( 'ancestor_test', array( 'hierarchical' => true ) );
	$date = strtotime( '2007-01-01 00:00:00' );

	$top_level_id = self::factory()->post->create(
		array(
			'post_title' => 'Top Level Post',
			'post_date'  => gmdate( 'Y-m-d H:i:s', $date ),
			'post_type'  => 'ancestor_test',
		)
	);

	// Create a mixed set of top level and descendant posts.
	$count            = 20;
	$parent           = $top_level_id;
	$descendant_posts = array();
	$top_level_posts  = array( $top_level_id );
	while ( --$count ) {
		$date = $date + DAY_IN_SECONDS;
		// Descendant post.
		$parent             = self::factory()->post->create(
			array(
				'post_title'  => "Child $count",
				'post_date'   => gmdate( 'Y-m-d H:i:s', $date ),
				'post_type'   => 'ancestor_test',
				'post_parent' => $parent,
			)
		);
		$descendant_posts[] = $parent;

		$date = $date + DAY_IN_SECONDS;
		// Top level post.
		$top_level_posts[] = self::factory()->post->create(
			array(
				'post_title' => "Top Level $count",
				'post_date'  => gmdate( 'Y-m-d H:i:s', $date ),
				'post_type'  => 'ancestor_test',
			)
		);
	}

	// Ensure the correct number of descendant and top level posts were created.
	$this->assertCount( 19, $descendant_posts, 'Test is expected to create 19 descendent posts.' );
	$this->assertCount( 20, $top_level_posts, 'Test is expected to create 20 top level posts.' );

	// Query all the descendant posts using post__in.
	$query1 = new WP_Query(
		array(
			'post__in'  => $descendant_posts,
			'post_type' => 'ancestor_test',
		)
	);

	$this->assertCount( 10, $query1->posts, 'Querying by post__in is expected to return 10 descendent posts.' );
	$this->assertSame( 19, $query1->found_posts, 'Querying by post__in is expected to find a total of 19 descendent posts.' );

	// Query all the descendant posts using post_ancestor.
	$query2 = new WP_Query(
		array(
			'post_ancestor' => $top_level_id,
			'post_type'     => 'ancestor_test',
		)
	);

	$this->assertCount( 10, $query2->posts, 'Querying by post_ancestor is expected to return 10 descendent posts.' );
	$this->assertSame( 19, $query2->found_posts, 'Querying by post_ancestor is expected to find a total of 19 descendent posts.' );
}

In order for this to have the desired effect, it would need to make a series of database queries to find the complete tree of descendants (regardless of the limit passed to WP_Query) and following that, make a post__in query for the entire tree.

This recursion would result in a significant performance impact on sites, especially those with deep trees of posts, such as some of the handbooks on WordPress.org.

As a hacky test to demonstrate the number of queries, the following fails as there are 20 database hits rather than the usual four in WP_Query:

public function get_descendants( $post_ids = array( 0 ), $post_type = 'page' ) {
	$descendants = array();

	$query = new WP_Query(
		array(
			'post_parent__in'        => $post_ids,
			'post_type'              => $post_type,
			'fields'                 => 'ids',
			'posts_per_page'         => -1,
			'no_found_rows'          => true,
			'update_post_term_cache' => false,
			'update_post_meta_cache' => false,
			'ignore_sticky_posts'    => true,
		)
	);

	$descendants = $query->posts;
	if ( ! empty( $query->posts ) ) {
		$descendants = array_merge( $descendants, $this->get_descendants( $query->posts, $post_type ) );
	}

	return $descendants;
}

/**
 * @ticket 62953
 */
public function test_demonstrating_number_of_queries() {
	register_post_type( 'ancestor_test', array( 'hierarchical' => true ) );
	$date = strtotime( '2007-01-01 00:00:00' );

	$top_level_id = self::factory()->post->create(
		array(
			'post_title' => 'Top Level Post',
			'post_date'  => gmdate( 'Y-m-d H:i:s', $date ),
			'post_type'  => 'ancestor_test',
		)
	);

	// Create a mixed set of top level and descendant posts.
	$count            = 20;
	$parent           = $top_level_id;
	$descendant_posts = array();
	$top_level_posts  = array( $top_level_id );
	while ( --$count ) {
		$date = $date + DAY_IN_SECONDS;
		// Descendant post.
		$parent             = self::factory()->post->create(
			array(
				'post_title'  => "Child $count",
				'post_date'   => gmdate( 'Y-m-d H:i:s', $date ),
				'post_type'   => 'ancestor_test',
				'post_parent' => $parent,
			)
		);
		$descendant_posts[] = $parent;

		$date = $date + DAY_IN_SECONDS;
		// Top level post.
		$top_level_posts[] = self::factory()->post->create(
			array(
				'post_title' => "Top Level $count",
				'post_date'  => gmdate( 'Y-m-d H:i:s', $date ),
				'post_type'  => 'ancestor_test',
			)
		);
	}

	$query_count_start = get_num_queries();
	$descendants = $this->get_descendants( array( $top_level_id ), 'ancestor_test' );
	$query_count_end = get_num_queries();

	$this->assertCount( 19, $descendants, 'Expected to find 19 descendent posts.' );
	$this->assertSame( 4, $query_count_end - $query_count_start, 'Expected to make four database requests.' );
}

The ancestor prop now gets all posts before filtering out the descendant posts.
Fixes incorrect found_posts values when using ancestor prop.
@PaulREnglish

Copy link
Copy Markdown
Author

Hi @peterwilsoncc! Thanks your insights! It is much appreciated :D

Due to the way get_page_children() works, this will return incorrect data when querying posts with different dates set. When the query also includes the number of found_rows, that number will also be incorrect as it's not reduced by filtering.

I did some digging and I don't think the problem is with dates. The query retrieves the expected number of posts from the database, but since there’s no filtering at that stage, it just grabs the first set of posts. But when the filtering is applied via get_page_children(), we may end up getting no posts returned. To fix this, I have applied a commit to gets all posts from the db before filtering it through get_page_children().

With this change the test you wrote should now pass. However, getting all posts from the db might have performance concerns. And if the db is large, PHP might run out of memory unless I split the results up.

In order for this to have the desired effect, it would need to make a series of database queries to find the complete tree of descendants (regardless of the limit passed to WP_Query) and following that, make a post__in query for the entire tree.

This recursion would result in a significant performance impact on sites, especially those with deep trees of posts, such as some of the handbooks on WordPress.org.

As an alternative, we could have a setting which limits the depth of which descendants to fetch. Hopefully. that would be enough to reduce the performance impact to suitable levels?

@peterwilsoncc

Copy link
Copy Markdown
Contributor

With this change the test you wrote should now pass. However, getting all posts from the db might have performance concerns. And if the db is large, PHP might run out of memory unless I split the results up.

Yeah, I am concerned about that.

As the code from the Gutenberg PR requires allowing the query to be added to the posts REST API endpoint, it also allows the general public to trigger an unlimited database query from an unauthenticated connection. Which, on large sites, could end up becoming a neat little way of DDOSing a server.

It would be possible to loop through pages and determine the tree with something along the lines of the following pseudo code but it would still generate a bunch of queries the maximum depth of the tree.

$descendants = array()

$top_level_post = 4;
$parent_ids = array( $top_level_post );

do {
  $parent_ids =  SELECT ID from wp_posts WHERE post_parent IN $parent_ids
  $descendants = array_merge( $descendants, $parent_ids );
} while ( ! empty( $parent_ids ) );

I'll continue to think about it but probably won't get much thinking done during the 6.8 release cycle.

@PaulREnglish

Copy link
Copy Markdown
Author

As the code from the Gutenberg PR requires allowing the query to be added to the posts REST API endpoint, it also allows the general public to trigger an unlimited database query from an unauthenticated connection. Which, on large sites, could end up becoming a neat little way of DDOSing a server.

Wow, I didn't even think of the DDOS possibilities. We definitely don't want that to happen 😅

It would be possible to loop through pages and determine the tree with something along the lines of the following pseudo code but it would still generate a bunch of queries the maximum depth of the tree.

Right now I'm leaning towards using this approach but imposing a soft limit on the depth it will search for to reduce the number of queries. I'll let you decide on what approach is best. Good luck with the 6.8 release cycle!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants