Skip to content

2.x post collections#2311

Merged
acobster merged 37 commits into2.x-factoriesfrom
2.x-post-collections
Aug 26, 2020
Merged

2.x post collections#2311
acobster merged 37 commits into2.x-factoriesfrom
2.x-post-collections

Conversation

@acobster
Copy link
Copy Markdown
Collaborator

@acobster acobster commented Aug 17, 2020

@jarednova @gchtr OK, let's try this again. This is the second attempt at addressing #2308. This is a continuation of #2310 which messed up the history of the branches with a bad merge.

Summary

Right now, the logic for building and running WP_Querys is spread around PostQuery, PostGetter, QueryIterator, and elsewhere. Part of the goal of Factories is to consolidate all of that logic in one place: PostFactory. To allow that, PostQuery needs to give up some power, because right now it queries for posts itself via PostGetter::query_posts(). But if the Factory already knows what it's dealing with, there's no need for this - the factory can simply instantiate a WP_Query directly and wrap it with a PostQuery. Then there's exactly one class responsible for querying for collections of posts.

What this means

  • As of this PR, PostQuery now extends ArrayObject directly and implements the PostCollectionInterface.
  • This means we can also get rid of the PostCollection class completely - just need to make sure the ::maybe_set_preview() logic makes it out of there.
  • I think we can get rid of QueryIterator. More on that below.
  • No more direct instantiation of Timber\PostQuery except when passing a WP_Query.
  • I think we can do away with the array wrapping the WP_Query in an array (i.e. [ 'query' => $wp_query ]) when passing to the constructor. The post_class option is a precursor to the new Class Maps design and the merge_default seems like something that would better be implemented as an option to Timber::get_posts().
  • As a nice bonus, we can save on performance and insantiate Posts inside of collection lazily, meaning while we already have the WP_Post objects they wrap, the Timber\Post instances themselves are not actually created until they're needed. See PostsIterator for how this works.
  • Finally, we can completely get rid of PostGetter. I want to be careful here as a lot of high-level functionality expected from the Timber class is tested at the level of PostGetter instead, so I just want to be extra mindful about keeping that test coverage without lifting redundant tests up to the Timber level.

A note about QueryIterator

As far as I can tell, this class has four responsibilities:

  1. Implementing logic around WP_Query that (mostly) lives in PostFactory now
  2. Of the special query logic mentioned above, the stuff that doesn't live in PostFactory are various query filter callbacks, namely fix_number_posts_wp_quirk, fix_cat_wp_quirk, and handle_maybe_custom_posts_page. This stuff should probably be moved
  3. Implementing a way to iterate over and count WP_Query results, which we already have in PostQuery.
  4. Creating Pagination objects, which again, we can do directly in PostQuery (which is added in this PR).

If I'm missing something here and we need this class after all, please let me know! But if we can move out the logic we want to keep I'm hoping we can just get rid of it.

The Plan

This PR isn't quite ready to merge but it seemed like a good stopping point to have this wider design discussion, since the high-level pieces are there. But quite a few tests are broken or skipped. I imagine I'll keep skipping many of them until I'm able to switch the Posts API altogether to using Factories, but some are probably easy fixes.

I'm leaning towards getting this to a good transitional point with Collections and merging this in then, before fully switching to Factories. Just don't want the PR to get too huge. Thoughts?

@acobster
Copy link
Copy Markdown
Collaborator Author

ORIGINAL COMMENT FROM @gchtr 2020-08-17 #2310 (comment)

Up until here, what you write makes sense to me. But I also have to say that the exact inner workings of all the classes that are in involved here were always a bit hard to keep track of for me. So if we can reduce the complexity here, that would be great 👍.

* I _think_ we can do away with the array wrapping the `WP_Query` in an array (i.e. `[ 'query' => $wp_query ]`) when passing to the constructor. The `post_class` option is a precursor to the new Class Maps design and the `merge_default` seems like something that would better be implemented as an option to `Timber::get_posts()`.

Yes! I think I already partially worked on this in #2145. Maybe you could build on that.

* As a nice bonus, we can save on performance and insantiate Posts inside of collection lazily, meaning while we already have the `WP_Post` objects they wrap, the `Timber\Post` instances themselves are not actually created until they're needed. See `PostsIterator` for how this works.

Doesn’t this have big implications on caching if the actual work is only done after fetching cached posts in PHP templates during Timber::render() or earlier?

And then this is probably also related to handling posts in multisite environments (see #1434 and #2207).

And from what I see in issues and StackOverflow questions, a lot of developers are relying on dumping post objects to see what’s in there. This of course wouldn’t be reliable anymore. I’m not saying I wouldn’t add lazy-loading, but we need to check what it means for other parts of Timber and probably add documentation about this in various places.

A related example: we already fetch meta values lazily. A lot of developers were used to work with {{ dump(post.custom) }} to see what’s there. Now, this won’t work anymore. See https://timber.github.io/docs/v2/guides/custom-fields/#accessing-all-meta-values.

If I'm missing something here and we need this class after all, please let me know! But if we can move out the logic we want to keep I'm hoping we can just get rid of it.

I’m all in favor of simplifying this. What I figured when doing more complex stuff with iterators is that as soon as we have factories ready, we can move some logic regarding globals into Post::setup() method.

One specific example is this class: https://github.com/mindkomm/timber-integration-woocommerce/blob/master/lib/ProductsIterator.php. We discussed this in mindkomm/timber-integration-woocommerce#6. With factories, we can get rid of it completley and a WooCommerce product object might solve WooCommerce globals like this:

class Product extends Timber\Post {
    public function setup( $loop_index = 0 ) {
        parent::setup( $loop_index );

        wc_setup_product_data( $this->ID );

        return $this;
    }
}		

</ORIGINAL COMMENT>

@acobster
Copy link
Copy Markdown
Collaborator Author

I think I already partially worked on this in #2145. Maybe you could build on that.

Awesome! I will take a look there, thanks!

Doesn’t this have big implications on caching if the actual work is only done after fetching cached posts in PHP templates during Timber::render() or earlier?

I don't think I understand what you mean. :) Which code are you referring to as "fetching cached posts," and how is that affected by this change?

I agree we'd want to document the effects of the change if it actually affects the behavior of the API. But as far as I can tell this will be completely transparent to the caller. That is, by the time you have a post var within a Twig template, you've already done all the work that's needed. And, to be clear, this only affects how Posts are instantiated in PostQuery for now; in every other context they're instantiated eagerly. So I'm not sure why {{ dump(post.custom) }} wouldn't work.

Again, maybe I'm missing something...wouldn't be the first time. :) I'm also cool with being cautious and reverting the lazy instantiation piece for now. If I'm right and it's truly transparent, that means we can roll it out after 2.0 lands.

I’m all in favor of simplifying this. What I figured when doing more complex stuff with iterators is that as soon as we have factories ready, we can move some logic regarding globals into Post::setup() method.

Makes sense to me! Good idea.

Coby Tamayo added 3 commits August 17, 2020 18:30
Also fix the seemingly unrelated testMenuItemMetaProperty(), which was failing
because $item->id (rather than $item->ID) was unset. Not sure why this was not
an issue before...
@gchtr
Copy link
Copy Markdown
Member

gchtr commented Aug 18, 2020

I don't think I understand what you mean. :) Which code are you referring to as "fetching cached posts," and how is that affected by this change?

I think the behavior is not really affected by this change. I realize it’s probably something related to Timber in general.

$posts = Timber::get_posts( [
    'post_type' => 'book',
] );

var_dump( $posts );

If we dump the collection, this might not contain all the data we need for posts, because posts didn’t run through a factory yet. For example, I couldn’t check here whether the correct Post class from the class maps was used.

And about the cache. Imagine a custom cache mechanism:

if ( false = get_posts_from_cache() ) {
    $posts = Timber::get_posts( [
        'post_type' => 'book',
    ] );

	set_cache_for_posts( $posts );
}

This will cache the results from the query, but not the posts themselves, because they didn’t run throught a factory yet. Caching only the query might not be want you want. Maybe you want to cache all the data. This apparently is no problem when using Timber::render(), because there, we cache the output and not the data.

Another example is maybe if you want to use post data in JavaScript:

$posts = Timber::get_posts( [
    'post_type' => 'book',
] );

// Don’t do this.
$posts_json = wp_json_encode( $posts );

// Do this.
$posts_json = [];

foreach ( $posts as $post ) {
    $posts_json[] = [
        'ID'   => $post->ID,
        'link' => $post->link(),
        // And other data you need…
    ];
}

$posts_json = wp_json_encode( $posts_json );

That is all related to issues like #2207.

Again, maybe I'm missing something...wouldn't be the first time. :) I'm also cool with being cautious and reverting the lazy instantiation piece for now. If I'm right and it's truly transparent, that means we can roll it out after 2.0 lands.

I also want to confirm I’m not missing something :D. I guess I’m merely thinking out loud and I should handle this in a different issue, which is about documentation, because it has effects which cause issues many developers have run into. Let’s keep it lazy!

I will have a look at the rest later today.

@coveralls
Copy link
Copy Markdown

coveralls commented Aug 18, 2020

Coverage Status

Coverage decreased (-0.6%) to 89.507% when pulling 0985f82 on 2.x-post-collections into b0d086a on 2.x-factories.

@acobster
Copy link
Copy Markdown
Collaborator Author

@gchtr gotcha. I looked into this and you're right about {{ dump(posts) }} (plural) - it will be the uninstantiated version. I assumed it would implicitly iterate over the internal array but I was wrong. We should be able to override this behavior by implementing a custom __debugInfo() method. Unfortunately, due to a bug in PHP <= 7.3, this is not possible when extending SPL classes like ArrayObject. So we should definitely document this somewhere.

The good news is that JSON serialization and on-demand eager instantiation are both easy to implement. See latest updates, particularly the realize() method. I'm borrowing terminology from Clojure, which has various flavors of laziness built in. Dumping individual posts when iterating or indexing directly (i.e. $coll[0]) will perform as expected because those instances are realized.

Quick aside about your multisite example - it got me thinking so I wrote up #2312 .

Feeling much better about the state of this PR now. I'd still like to get @jarednova's input on the general direction, but for now my only remaining question is: should I also implement laziness on PostArrayObject? It would be pretty simple, and the biggest hurdle is just implementing a custom ArrayIterator that does the same thing as PostsIterator minus the WP loop stuff. Unless you see a reason not to?

@acobster
Copy link
Copy Markdown
Collaborator Author

acobster commented Aug 20, 2020

@gchtr @jarednova I think I was mistaken and we actually do want to use exactly the same logic from PostsIterator for both kinds of Post Collections, to abstract over The Loop. Right?? This is how PostCollection currently works. What's different between the two is mainly pagination and origin (WP_Query vs a vanilla array).

If that's the case, it's good news because it means there's even less to implement, and the latest few commits should round this out (minus any outstanding details)!

@acobster acobster removed the wip Work in Progress label Aug 20, 2020
@gchtr
Copy link
Copy Markdown
Member

gchtr commented Aug 20, 2020

I really really like the way we’re going here 🎉

One path worth considering here is whether using the WP_DEBUG flag is helpful to assist with outputting as much useful data as possible in var_dump() while not compromising on performance when a site goes live.

IIRC the dump Twig function only works if WP_DEBUG is falsey? Or are you talking about explicitly realizing posts internally if WP_DEBUG is truthy? Definitely an avenue worth exploring, I'd just want to be careful because it could lead to an implicit dependency on environment-specific behavior, and suddenly it doesn't work in production and we've created more confusion than we've solved. 😝

I wouldn’t do that either, because I already see new issues popping up because of it.

Yes! I just pushed a big update to the Posts guide with details about collections and laziness. Would love support on filling in the gaps!

I looked at it, very nice! Thanks man!

Yeah, in addition to providing some sensible default behavior in Timber\Post (I just wanted to avoid touching Post too much in this PR, and just focus on collections). Child implementations can easily call parent::jsonSerialize() to get the data to use as a basis, so we should capture that in the docs.

Right, good idea! We should definitely also have an example that explains that removing data would make sense, too, because you don’t want to fill the HTML of a page with data that has to be downloaded but that you will never need.

@jarednova
Copy link
Copy Markdown
Member

@acobster I saw you removed the WIP label. Are we ready for a final review, or do you have more commits to come before me or @gchtr should do a deep dive?

@acobster
Copy link
Copy Markdown
Collaborator Author

@jarednova @gchtr yeah, consider this ready. I will follow up with a deprecation notice in PostQuery::get_posts(), but other than that I wasn't planning any more changes for this PR.

@gchtr gchtr mentioned this pull request Aug 25, 2020
24 tasks
Copy link
Copy Markdown
Member

@gchtr gchtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go from me 👍

I added a small task in #2073 that I should document the JSON serialization of posts. You didn’t do that yet, right, @acobster?

@acobster
Copy link
Copy Markdown
Collaborator Author

@gchtr correct, I did not.

@acobster
Copy link
Copy Markdown
Collaborator Author

@jarednova did you want to give this another look before I merge this?

@jarednova
Copy link
Copy Markdown
Member

@acobster almost! I wanted to investigate the remaining failing tests and caught an issue that I wanted to make sure you were aware of. In b0efe7d I made a few updates to fix some test deprecation notices (easy). However, the test for TestTimberContext::testPostsContextSimple() revealed that we need to handle for what's returned with the default Timber::context() call.

I didn't know if this was already tackled in another branch or PR so I created my own to demonstrate and (potentially/hopefully) resolve in #2315.

Can you review that and see if you agree with those changes? That seems to me to be the only last thing we need to tackle on this piece

@acobster
Copy link
Copy Markdown
Collaborator Author

@jarednova your changes absolutely make sense, and I'm actually already working on those in a separate branch built off of this one. I'm not going to merge it in to avoid merge conflicts, but I'll keep your PR open so you can verify against the new branch once I push it.

@jarednova
Copy link
Copy Markdown
Member

Great! In that case let's call this one 👍 — please do the honors of merging into the factories branch so we can follow along w that other stuff

@acobster acobster merged commit 1772b33 into 2.x-factories Aug 26, 2020
@acobster acobster deleted the 2.x-post-collections branch August 26, 2020 03:26
gchtr added a commit that referenced this pull request Nov 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants