Add EAGER_BATCHED fetchmode to OneToMany association#1569
Add EAGER_BATCHED fetchmode to OneToMany association#1569woutervanvliet wants to merge 4 commits intodoctrine:masterfrom
Conversation
In regular EAGER mode, doctrine adds a JOIN for a OneToMany association forcing the database to load more data in memory than essentially needed. In a classic Article => Tag concept, each article would be loaded one extra time for every addition tag. LAZY mode issues a query for every article to load all associated tags, also adding more overhead to the database than needed. This new fetchmode addreses this issue by loading all tags associated to an article into one additional query.
|
Hello, thank you for creating this pull request. I have automatically opened an issue http://www.doctrine-project.org/jira/browse/DDC-4017 We use Jira to track the state of pull requests and the versions they got |
|
Please follow the coding standards: indent with 4 spaces, replace public function scheduleCollectionForBatchLoading(PersistentCollection $collection) {
$mapping = $collection->getMapping();
$name = $mapping['sourceEntity'].$mapping['fieldName'];
if (!isset($this->batchedEagerLoadingEntities[$name])) {
$this->batchedEagerLoadingEntities[$name] = Array(
'items' => Array(),
'mapping' => $mapping,
);
}
$this->batchedEagerLoadingEntities[$name]['items'][] = $collection;
}Should be: //indentation changed to spaces
public function scheduleCollectionForBatchLoading(PersistentCollection $collection)
{//<-- new line
$mapping = $collection->getMapping();
//surround with spaces
$name = $mapping['sourceEntity'] . $mapping['fieldName'];
if (!isset($this->batchedEagerLoadingEntities[$name])) {
$this->batchedEagerLoadingEntities[$name] = array(//array or [
'items' => array(),//array or [],
'mapping' => $mapping,
);
}
$this->batchedEagerLoadingEntities[$name]['items'][] = $collection;
} |
lib/Doctrine/ORM/UnitOfWork.php
Outdated
And other responses to the PR
|
@EVODelavega Thanks! Now I also understand what you meant with your comment on Array, back on Stackoverflow. I've amended as good as possible. |
There was a problem hiding this comment.
$tags needs to be initialized in the constructor to a new instance of the Doctrine\Common\Collections\ArrayCollection class. See doctrine documentation for details, especially 5.13 as to why this is required
|
I must say that the test-case isn't really a good fit here. I'd expect Articles and Tags to be either a unidirectional one-to-many relation, or a bidirectional many-to-many relation (A collection of tags, zero or more of which can be linked to articles, tags not being linked to articles OR articles being linked to zero or more tags, each tag being linked to one or more articles). This in turn begs the question, if we apply this EAGER_BATCHED mode to many-to-many relation entities, what the possible headaches would be when cascading things like DETACH: $article1 = $em->find('Article', 1);
$article2 = $em->find('Article', 2);
//if $article2 shares tags with $article1, this might throw an exception:
$em->detach($article1);
//if this tag is shared, and detach cascades, wouldn't the tag instance be unmanaged here?
$article2->getTags()->get(1)->setName('new tag name');
$em->persist($article2);
$em->flush();//if the altered tag is indeed unmanaged, this throws an exception |
That depends on how one defines Tag. To me, a Tag is a word linked to an Article. The same word may be linked to more articles, in which case it is their own Tag instance. Hence the shared primary key, over $name and $article. And, as you can see, the test case actually creates 100 articles all with the exact same 7 words as tag. As per the doctrine documentation you linked, one-to-many has to be bidirectional. |
|
@interpotential One-to-many doesn't have to be bidirectional (5.6). But aside of that, the current way your entities are set up, you might end up with a series of tags like this:
These are the exact same words, but the're unique keys (because of the article id). That, to me, seems wasteful. I'd probably use a relational table just storing an article id + tag id like so:
To avoid tag duplication, but this is turning into a DB design discussion, instead of a review of a PR |
|
True - though a discussion about DB design is also interesting ;-). Btw, with all your very useful feedback I'm surprised you overlooked me having misnamed the unit test. Update coming up, once travis-ci has told me that it's ok. |
|
Any updates on this? |
|
ping @Ocramius |
|
@Koc @Laniax I didn't look into this at all, sorry. |
|
Is there a plan to review this? It would be very useful for us. |
|
Sorry, I really never got around checking this, but overall my focus is on removing features before we add anything new. |
|
This is must have feature. It will make everything so easier. |
|
The simplicity of this is beautiful, as it just needs one point in the "hotpath" to hook into to get it working. The additional code to In SQL Alchemy this strategy is called "SELECT IN loading": https://docs.sqlalchemy.org/en/13/orm/loading_relationships.html |
|
This would be a amazing functionallity, I used a lot of Eloquent from laravel, and this is a nice way to load relations. Im developing a new applications using doctrine and I stumbled in a problem. I have a one-to-many relation, but the one is a pagination, so the solution would be use a joined eager load, so it could run in 1 queries like:
The would be ok, but is hard to cache, if any of the product_image change, I need to invalidate all this collection, in the select in method this would be so much easier.
|
…ions. Co-authored-by: Wouter M. van Vliet <wouter@interpotential.com>
|
Superseded by #8391 |
…ions. Co-authored-by: Wouter M. van Vliet <wouter@interpotential.com>
…ions. Co-authored-by: Wouter M. van Vliet <wouter@interpotential.com>
…ions. Co-authored-by: Wouter M. van Vliet <wouter@interpotential.com>
…selectFetchMode
[GH-1569] Optimize eager fetch for collections to batch query
* 2.17.x: Deprecate annotation classes for named queries Fix typos Housekeeping: Revert change to AbstractExporter, not needed without subselect fetch. Address review comments. Explain internals of eager loading in a bit more detail and how its configured. 1:1 and M:1 associations also use fetch batch size configuration now. Add another testcase for DQL based fetch eager of collection. last violation hopefully Static analysis Housekeeping: phpcs Directly load many to many collections, batching not supported yet. fix tests. Avoid new fetch mode, use this strategy with fetch=EAGER for collections. Make sure to many assocatinos are also respecting AbstractQuery::setFetchMode Disallow use of fetch=SUBSELECT on to-one associations. Go through Persister API instead of indirectly through repository. Introduce configuration option for subselect batch size. Houskeeping: phpcs Disallow WITH keyword on fetch joined associatiosn via subselect. [doctrineGH-1569] Add new SUBSELECT fetch mode for OneToMany associations.
* 2.17.x: Deprecate annotation classes for named queries Fix typos Housekeeping: Revert change to AbstractExporter, not needed without subselect fetch. Address review comments. Explain internals of eager loading in a bit more detail and how its configured. 1:1 and M:1 associations also use fetch batch size configuration now. Add another testcase for DQL based fetch eager of collection. last violation hopefully Static analysis Housekeeping: phpcs Directly load many to many collections, batching not supported yet. fix tests. Avoid new fetch mode, use this strategy with fetch=EAGER for collections. Make sure to many assocatinos are also respecting AbstractQuery::setFetchMode Disallow use of fetch=SUBSELECT on to-one associations. Go through Persister API instead of indirectly through repository. Introduce configuration option for subselect batch size. Houskeeping: phpcs Disallow WITH keyword on fetch joined associatiosn via subselect. [doctrineGH-1569] Add new SUBSELECT fetch mode for OneToMany associations.
* derrabus/3.0.x: Deprecate annotation classes for named queries Fix typos Housekeeping: Revert change to AbstractExporter, not needed without subselect fetch. Address review comments. Explain internals of eager loading in a bit more detail and how its configured. 1:1 and M:1 associations also use fetch batch size configuration now. Add another testcase for DQL based fetch eager of collection. last violation hopefully Static analysis Housekeeping: phpcs Directly load many to many collections, batching not supported yet. fix tests. Avoid new fetch mode, use this strategy with fetch=EAGER for collections. Make sure to many assocatinos are also respecting AbstractQuery::setFetchMode Disallow use of fetch=SUBSELECT on to-one associations. Go through Persister API instead of indirectly through repository. Introduce configuration option for subselect batch size. Houskeeping: phpcs Disallow WITH keyword on fetch joined associatiosn via subselect. [doctrineGH-1569] Add new SUBSELECT fetch mode for OneToMany associations.
In regular EAGER mode, doctrine adds a JOIN for a OneToMany association
forcing the database to load more data in memory than essentially
needed. In a classic Article => Tag concept, each article would be
loaded one extra time for every addition tag.
LAZY mode issues a query for every article to load all associated tags,
also adding more overhead to the database than needed.
This new fetchmode addreses this issue by loading all tags associated to
an article into one additional query.