Last-Modified, If-Modified-Since, and cache handling behaviors #6056

@cbiffle

Hi! I am not a FreshRSS user, but I operate a website and appear to have many readers using FreshRSS, at least judging by user-agent. (Good for you! People seem to like it.)

I've noticed a couple of properties of at least some of these users and wanted to ask some questions to help clarify my understanding of what's going on.

The requests tend to miss cache (they get a 200 rather than a 304 from my side). This is the only reason they came to my attention -- because some users are transferring 4 MiB of unmodified RSS XML per day. I enabled more logging for some of the heavier users, and this is what I'm seeing:

  • Their connections don't seem to be doing ETag at all (i.e. they are never sending the if-none-match header), and
  • While they are sending if-modified-since, the date they're sending in the header appears arbitrary.
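For comparison, here is a minimal sketch of what I'd expect a revalidating client to send. It is not FreshRSS or SimplePie code; the function name and the example ETag value are made up for illustration. The idea is simply to echo back whatever validators came with the cached response, unchanged.

```python
def conditional_headers(cached_etag=None, cached_last_modified=None):
    """Build revalidation headers from validators stored with a cached response.

    Both arguments are the raw header values the server sent earlier
    (hypothetical storage; how a real client keeps them is up to the client).
    """
    headers = {}
    if cached_etag is not None:
        # Send the ETag back exactly as received, quotes included.
        headers["If-None-Match"] = cached_etag
    if cached_last_modified is not None:
        # Echo the server's Last-Modified string byte-for-byte rather than
        # substituting a locally generated timestamp.
        headers["If-Modified-Since"] = cached_last_modified
    return headers


if __name__ == "__main__":
    # '"abc123"' is a placeholder ETag, not one my server actually sends.
    print(conditional_headers('"abc123"', "Thu, 18 Jan 2024 19:24:29 GMT"))
```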

For context, my webserver implements if-modified-since using exact match -- so if the date sent in if-modified-since is the one I last handed out in last-modified, we get a 304, otherwise it's a 200. (This behavior is explicitly permitted in the HTTP spec, and all browsers I've tested conform to it.)
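Roughly, the exact-match check looks like the sketch below. This is an illustration in Python, not my server's actual code, and the function name is hypothetical.

```python
def status_exact_match(if_modified_since, last_modified):
    """Return 304 only when the client echoes the current Last-Modified value.

    if_modified_since: raw If-Modified-Since header value, or None if absent.
    last_modified: the Last-Modified string the server currently hands out.
    """
    if if_modified_since is not None and if_modified_since.strip() == last_modified:
        return 304  # client's copy is current; send no body
    return 200      # anything else gets the full response


if __name__ == "__main__":
    current = "Thu, 18 Jan 2024 19:24:29 GMT"
    print(status_exact_match(current, current))                          # echoed value
    print(status_exact_match("Mon, 15 Jan 2024 18:31:01 GMT", current))  # unrelated date
    print(status_exact_match(None, current))                             # unconditional request
```

The string comparison avoids any date parsing, which is why a client has to send back the value unchanged to get a 304.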

The dates I'm getting from connections claiming to be FreshRSS appear arbitrary and aren't related to my last-modified dates, which is causing the 200. They're always within a few days of my date, but are never a date I would have sent.

Concrete example: my RSS feed is currently sending the last-modified date of Thu, 18 Jan 2024 19:24:29 GMT. Before that it was modified on 13 Jan. However, the date I'm getting from a user identified as FreshRSS/1.23.1 is Mon, 15 Jan 2024 18:31:01 GMT, which is a date I never would have sent -- and a date that is older than the current modification date. This user has been doing this hourly for the past day or so.

I read through your code, and it looks like you're trying to record modification dates per-resource and send them with requests (particularly SimplePie seems to go to some effort to do this). This is good! I wonder if some versions of this might have a bug, or if some users might have configuration issues that cause the wrong date to be sent?

I'm not entirely sure what to do about these users, since even if I patched the server to not require exact date match, they are still sending dates in the past as though their cache is pulling the wrong dates.
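To make that concrete, here is a sketch of the more lenient date-comparison check (again illustrative Python with a hypothetical function name, not my server's code), applied to the dates from the example above. Because the date the client sends is older than the current modification date, it still yields a 200.

```python
from email.utils import parsedate_to_datetime


def status_date_compare(if_modified_since, last_modified):
    """Return 304 if the resource has not changed since the client's date."""
    try:
        since = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(last_modified)
        not_modified = current <= since
    except (TypeError, ValueError):
        # Missing or unparseable header: treat the request as unconditional.
        return 200
    return 304 if not_modified else 200


if __name__ == "__main__":
    current = "Thu, 18 Jan 2024 19:24:29 GMT"
    stale = "Mon, 15 Jan 2024 18:31:01 GMT"
    print(status_date_compare(stale, current))    # client date predates the change
    print(status_date_compare(current, current))  # client echoes the current value
```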

Anyway, thanks in advance for any insights you can offer.
