Skip to content

[Bug] Long update times due to cache counter #8206

@pe1uca

Description

@pe1uca

Describe the bug

My purge policy is set to never delete articles (from the defaults of only 200 or up to 3 months old)
When the actualize_script is run the instance comes to a halt, the UI can't receive response from the backend and the disk usage goes to 100%.

After looking at postgresql query logs I found out this is the query which takes a long time to complete.

UPDATE pe1uca_feed
SET "cache_nbEntries"=(SELECT COUNT(e1.id) FROM pe1uca_entry e1 WHERE e1.id_feed=pe1uca_feed.id),
"cache_nbUnreads"=(SELECT COUNT(e2.id) FROM pe1uca_entry e2 WHERE e2.id_feed=pe1uca_feed.id AND e2.is_read=0);

When the table _entry is small it works fine, but in my case it contains 1.2 million records which can take between 10 and 30 seconds to complete.
Not sure if the fact that I have almost 300 feeds also affects this query.

To Reproduce

  1. Set up an instance
  2. Have hundreds of feeds
  3. Have millions of entries
  4. See the disk usage of the server spike when the actualize_script is run.
    The UI also gets affected, and we can see the spinning icon of read and favorite icons hang in there for a while.
    The network tab of the browser console shows the request to mark as read/unread and favorite waiting for a response.

Expected behavior

The actualize script should not block the UI.

FreshRSS version

1.26.3

System information

  • Database version: PostgreSQL 17
  • PHP version: [e.g. PHP 8.2]
  • Installation type: Docker compose (image: freshrss/freshrss:1.26.3)
  • Browser:Firefox 144

Additional context

Maybe this query can be optimized to only update the cache for the feeds that have been recently updated? Ideally only those updated during the current actualize run, or on the last X hours/minutes.

I see the _feed table has a lastUpdate field, but IIRC this is the time at which it was last fetched from source, regardless of new entries. So, not sure if it's possible to only update the cache only for feeds that got new entries during the current actualize run to further optimize the cache update.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Bug (unconfirmed)issues that could not be reproduced yet

    Type

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions