[BUG] FreshRSS duplicating entries in very rare, mysterious circumstances #5410

@gbakeman

Describe the bug
I previously filed a bug about this problem (#4831), thinking it was related to my upgrade to PgSQL 15. Since then, though, I've experienced the same issue a couple more times without making any changes to my database server.

What happens is, at some point, one individual article in a feed becomes stuck as unread. An error is thrown when I attempt to mark it as read:

[error] --- SQL error markRead 1: ERROR:  duplicate key value violates unique constraint "frss_[my username]_entry_id_feed_guid_key"
DETAIL:  Key (id_feed, guid)=(12, http://www.dslreports.com/shownews/Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411) already exists.
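
To illustrate what the constraint named in the error is meant to enforce, here is a minimal sketch. It makes several assumptions: SQLite stands in for PostgreSQL, the table `entry` is a hypothetical cut-down version of `frss_[user]_entry`, and the `UNIQUE (id_feed, guid)` definition is inferred from the constraint name in the log. It does not explain why the error surfaces in markRead; it only shows that a second row with the same (id_feed, guid) pair would normally be rejected.

```python
import sqlite3

# SQLite stands in for PostgreSQL here; the table is a hypothetical,
# cut-down version of frss_[user]_entry with only the columns needed.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE entry (
        id      INTEGER PRIMARY KEY,
        guid    TEXT    NOT NULL,
        id_feed INTEGER NOT NULL,
        is_read INTEGER NOT NULL DEFAULT 0,
        UNIQUE (id_feed, guid)  -- assumed from the constraint name in the log
    )
    """
)

GUID = (
    "http://www.dslreports.com/shownews/"
    "Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411"
)


def try_insert(entry_id: int) -> bool:
    """Insert a row for feed 12; return False if the constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO entry (id, guid, id_feed) VALUES (?, ?, ?)",
                (entry_id, GUID, 12),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first_ok = try_insert(1683637463772928)   # first copy is accepted
second_ok = try_insert(1684155865369868)  # same (id_feed, guid): rejected
print(first_ok, second_ok)
```

With a healthy unique index, the second insert fails, so finding two such rows in the table at all is the surprising part.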

It occurs very infrequently, and honestly, with PgAdmin it only takes me about 10 minutes to remember how I solved the problem last time and manually fix it in the affected database. I'm just curious why this problem happens in the first place and whether there's a simple fix to prevent it in the future.

To Reproduce
I don't know of any way to reliably reproduce it. So far, I've only seen it occur in the DSLreports feed.

Expected behavior
Hopefully there are some simple steps that can be taken to prevent this error from happening.

Environment information (please complete the following information):

  • Device: Any (server-side error)
  • OS: Docker
  • FreshRSS version: 1.21.0
  • Database version: PostgreSQL 15.0 (Debian 15.0-1.pgdg110+1)

Additional info

When I run the following query, I see the results below.

SELECT * FROM public.frss_[user]_entry
WHERE guid LIKE '%143411'

Results

"id"	  "guid"	"title"	"author"	"content"	"link"	"date"	"lastSeen"	"hash"	"is_read"	"is_favorite"	"id_feed"	"tags"	"attributes"
1683637463772928	"http://www.dslreports.com/shownews/Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411"	"Dish Network Takes $30M Cyberattack Hit; + more notable news -"		"(truncated)"	"http://www.dslreports.com/shownews/Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411"	1683632220	1684242266	"binary data"	1	0	12		"{""enclosures"":[]}"
1684155865369868	"http://www.dslreports.com/shownews/Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411"	"Dish Network Takes $30M Cyberattack Hit; + more notable news -"		"(truncated)"	"http://www.dslreports.com/shownews/Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411"	1683632220	1684155862	"binary data"	0	0	12		"{""enclosures"":[]}"

So the two rows have the same guid (URL), date, and tags, but different id and lastSeen values, and only one has the is_read flag set. Perhaps the article is updated and their feed is republishing it? Perhaps there's a way to account for this.
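
As a rough sketch of how such duplicates could be located without knowing the guid in advance, the snippet below groups rows by (id_feed, guid) and reports any pair that appears more than once. It is an illustration only: SQLite stands in for PostgreSQL, the table `entry` is a hypothetical reduced copy of `frss_[user]_entry`, and the two rows are the ones shown in the results above. The helper `to_utc` is also hypothetical and assumes lastSeen holds Unix epoch seconds.

```python
import sqlite3
from datetime import datetime, timezone

GUID = (
    "http://www.dslreports.com/shownews/"
    "Dish-Network-Takes-30M-Cyberattack-Hit-more-notable-news-143411"
)

# Hypothetical reduced table; no unique constraint, so both rows can coexist
# just as they do in the affected database.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE entry (id INTEGER, guid TEXT, id_feed INTEGER, '
    '"lastSeen" INTEGER, is_read INTEGER)'
)
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?, ?, ?)",
    [
        (1683637463772928, GUID, 12, 1684242266, 1),
        (1684155865369868, GUID, 12, 1684155862, 0),
    ],
)

# Any (id_feed, guid) pair with more than one row breaks the intended uniqueness.
duplicates = conn.execute(
    """
    SELECT id_feed, guid, COUNT(*) AS copies
    FROM entry
    GROUP BY id_feed, guid
    HAVING COUNT(*) > 1
    """
).fetchall()


def to_utc(epoch_seconds: int) -> str:
    """Render epoch seconds as an ISO 8601 UTC timestamp (assumed unit)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


for id_feed, guid, copies in duplicates:
    print(f"feed {id_feed}: {copies} copies of {guid}")
    detail = conn.execute(
        'SELECT id, "lastSeen", is_read FROM entry '
        "WHERE id_feed = ? AND guid = ? ORDER BY id",
        (id_feed, guid),
    )
    for entry_id, last_seen, is_read in detail:
        print(f"  id={entry_id} lastSeen={to_utc(last_seen)} is_read={is_read}")
```

The same GROUP BY ... HAVING query is standard SQL, so it should also run against the real table in PostgreSQL once the table name is substituted.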

Labels: Bug (unconfirmed), issues that could not be reproduced yet