Batch Processing using toIterable() #8410

@stlrnz

Description

Hi all!

I updated my application to Doctrine ORM 2.8 a few days ago.

Before that, my repository code for iterating over a large result set looked like the following and worked perfectly, without leaking any memory, even on more than 500,000 rows.

    public function iterateAll(): Generator
    {
        $iterator = $this->createQueryBuilder('e')->getQuery()->iterate(null, Query::HYDRATE_SIMPLEOBJECT);

        foreach ($iterator as $equipment) {
            // index 0 is always the object
            yield $equipment[0];
        }
    }

Since Query::iterate() is now deprecated, I tried to use Query::toIterable() as suggested.

    public function iterateAll(): Generator
    {
        $iterator = $this->createQueryBuilder('e')->getQuery()->toIterable([], Query::HYDRATE_SIMPLEOBJECT);

        foreach ($iterator as $equipment) {
            yield $equipment;
        }
    }

With this implementation I ran into a massive memory leak in my application. I debugged it and found that the AbstractHydrator never releases the hydrated objects in AbstractHydrator::toIterable(): the called method AbstractHydrator::hydrateRowData() (in my case SimpleObjectHydrator::hydrateRowData()) just keeps appending new entries to $result.

$this->hydrateRowData($row, $result);

Is that the intended behaviour? Wouldn't it be better to clear $result on every iteration to free the memory?

If that's intended, is there a better way to loop through large result sets?
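For context, the pattern I have seen in the Doctrine batch-processing documentation is to periodically clear the EntityManager while iterating, so the UnitOfWork does not keep a reference to every hydrated entity. A sketch of what that could look like here (the $batchSize value is my own illustrative choice, and I am not sure this addresses the $result accumulation inside the hydrator itself):

    public function iterateAll(): Generator
    {
        $batchSize = 1000; // illustrative value, tune for your workload
        $i = 0;

        $query = $this->createQueryBuilder('e')->getQuery();

        foreach ($query->toIterable() as $equipment) {
            yield $equipment;

            // Periodically detach all managed entities so the
            // UnitOfWork's identity map does not grow unbounded.
            if ((++$i % $batchSize) === 0) {
                $this->getEntityManager()->clear();
            }
        }
    }

Note that clear() detaches every managed entity, so callers of this generator must not rely on previously yielded objects still being managed.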
