[BUG] XPath URL uses thumbnail's hostname #6152

@matejdro

Describe the bug

Recently (not sure whether due to an update of the target website or of FreshRSS's edge image), XPath URL resolution broke.

It seems that the hostname from the thumbnail URL is used for the article URL instead of the regular website hostname, which causes broken links (thumbnails are hosted on a separate CDN host).

To Reproduce
Steps to reproduce the behavior:

  1. Add HTML + XPath feed from https://maribor24.si/lokalno/maribor/ with
    • XPath for finding news items: //div[contains(@class,'grid-cols-1')]/div
    • XPath (relative to item) for item title: descendant::a/div
    • XPath (relative to item) for item link (URL): descendant::a/@href (note that these links are root-relative: absolute paths without a domain)
    • XPath (relative to item) for item thumbnail: descendant::div[contains(@class,'mb-2.5')]/a/picture/img/@src
  2. Load that feed

Expected behavior

Article URL should point to a link on the domain of the actual website, for example maribor24.si/lokalno/maribor/v-mariboru-bodo-ta-konec-tedna-gostili-najvecji-mednarodni-znanstveni-dogodek/

Actual behavior

Article URL points to a CDN domain, which makes the link invalid. For example https://mb24.cdn1.maribor24.si/lokalno/maribor/v-mariboru-bodo-ta-konec-tedna-gostili-najvecji-mednarodni-znanstveni-dogodek/
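
To illustrate the difference between the expected and actual behavior, here is a minimal Python sketch of resolving the same root-relative href against two different base URLs. This is not FreshRSS code, and the cause it suggests (the thumbnail URL being used as the base) is only my guess. The feed URL, CDN hostname and href come from this report; the thumbnail path and the `resolve` helper are hypothetical.

```python
from urllib.parse import urljoin

# Feed page URL, taken from the reproduction steps.
FEED_URL = "https://maribor24.si/lokalno/maribor/"

# Hypothetical thumbnail URL: the CDN hostname is from this report,
# but the path is invented for illustration.
THUMBNAIL_URL = "https://mb24.cdn1.maribor24.si/example/thumbnail.jpg"

# Root-relative href, as returned by descendant::a/@href.
HREF = (
    "/lokalno/maribor/"
    "v-mariboru-bodo-ta-konec-tedna-gostili-najvecji-mednarodni-znanstveni-dogodek/"
)


def resolve(base: str, href: str) -> str:
    """Resolve href against base using standard URL reference resolution."""
    return urljoin(base, href)


# Expected: the link is resolved against the feed's own URL.
expected = resolve(FEED_URL, HREF)

# Observed: the result looks as if the thumbnail URL had been the base.
actual = resolve(THUMBNAIL_URL, HREF)

print(expected)
print(actual)
```

Because the href starts with a slash, only the scheme and host of the base survive resolution, so whichever base is used decides the hostname of the article link.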

Environment information:

  • FreshRSS version: edge image, pulled on 2024-03-02T08:49:31Z
