[3.0.x.x] Use CDATA for image caption and title and add linebreaks #15069

Merged
mhcwebdesign merged 1 commit into opencart:3.0.x.x from MarvinKlein1508:fix-sitemap
Jul 25, 2025

@MarvinKlein1508

This PR changes two things for the Google Sitemap extension.

  1. It adds PHP_EOL to each line so the generated file is readable when a human opens it.
  2. It wraps the image caption and image title in CDATA. Without this, feed generation produces an invalid feed when the caption contains HTML-encoded special characters. For example: &auml;
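
As a rough illustration of the two changes described above, the snippet below builds one image entry. It is a minimal sketch, not the extension's actual code: the `$product` row, the example URL, and the exact tag layout are assumptions made for the example.

```php
<?php
// Hypothetical product row, not taken from the extension.
$product = [
    'name'  => 'K&auml;se',
    'image' => 'https://example.com/image/kaese.jpg',
];

// Change 1: end every line with PHP_EOL so the generated file is readable.
// Change 2: wrap caption and title in CDATA so "&auml;" is not parsed as an entity.
$output  = '<image:image>' . PHP_EOL;
$output .= '<image:loc>' . $product['image'] . '</image:loc>' . PHP_EOL;
$output .= '<image:caption><![CDATA[' . $product['name'] . ']]></image:caption>' . PHP_EOL;
$output .= '<image:title><![CDATA[' . $product['name'] . ']]></image:title>' . PHP_EOL;
$output .= '</image:image>' . PHP_EOL;

echo $output;
```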

@mhcwebdesign merged commit e04a750 into opencart:3.0.x.x on Jul 25, 2025
4 checks passed
@ADDCreative
Contributor

You shouldn't need CDATA around $product['name'] as that will already be escaped in the database. If it's not then you will have bigger problems.

@MarvinKlein1508 deleted the fix-sitemap branch on July 25, 2025, 11:35
@MarvinKlein1508
Author

> You shouldn't need CDATA around $product['name'] as that will already be escaped in the database. If it's not then you will have bigger problems.

Yes, I know it will be escaped. In our products it escapes special characters as well. For example, it turns ä into &auml;. The escaped character then crashes the sitemap.

@ADDCreative
Contributor

ADDCreative commented Jul 26, 2025

> Yes I know it will be escaped. In our products it escapes special characters as well. For example it makes ä to &auml; The escaped character then crashes the sitemap.

That's because XML only has 5 predefined character entities, unlike HTML.

When you wrap text in CDATA you are telling the parser to ignore any markup inside it. So once parsed the result is &auml; and not ä. The same goes for &amp;. An &amp; outside CDATA will correctly be parsed as &, but an &amp; inside CDATA will be parsed as &amp;, which would be incorrect. So your change breaks encoding for everyone else who is not writing data directly to the database.
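
The parsing difference described above can be checked with a short sketch using PHP's SimpleXML. The `parsed_title()` helper and the sample strings are made up for illustration and are not part of OpenCart.

```php
<?php
// Hypothetical helper: parse a tiny XML document and return the text of its
// <title> element, or null when the document is not well-formed.
function parsed_title(string $xml): ?string
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // LIBXML_NOCDATA merges CDATA sections into ordinary text nodes;
    // the characters inside them are still taken literally.
    $doc = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NOCDATA);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $doc === false ? null : (string) $doc->title;
}

// Outside CDATA the predefined entity is decoded: "Salt & Pepper".
$plain = parsed_title('<item><title>Salt &amp; Pepper</title></item>');

// Inside CDATA nothing is decoded, so the entity text survives: "Salt &amp; Pepper".
$cdata = parsed_title('<item><title><![CDATA[Salt &amp; Pepper]]></title></item>');

// &auml; is an HTML entity that XML does not predefine, so parsing fails: null.
$html = parsed_title('<item><title>K&auml;se</title></item>');

var_dump($plain, $cdata, $html);
```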

In your case you would need to decode the entities and then just encode the ones valid in XML. Or use a numeric character reference.
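
One way to sketch the decode-then-encode approach in PHP is shown below. The `xml_text()` helper is a hypothetical name, and the snippet assumes the stored values are UTF-8; it is an illustration, not code from the extension.

```php
<?php
// Hypothetical helper: make text that may contain HTML entities safe for an XML text node.
function xml_text(string $value): string
{
    // Step 1: turn HTML entities (named or numeric) back into UTF-8 characters.
    $decoded = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Step 2: escape only the characters that XML reserves.
    return htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// "&auml;" becomes a literal ä, while the ampersand stays escaped as "&amp;".
$caption = xml_text('K&auml;se &amp; Wein');

echo $caption . PHP_EOL;
```

The numeric character reference mentioned above would be `&#228;` for ä, which an XML parser accepts without any entity declaration.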

@mhcwebdesign
Contributor

> You shouldn't need CDATA around $product['name'] as that will already be escaped in the database. If it's not then you will have bigger problems.

Agreed. I just tested it with the ä character in a product name, and it ends up just fine as the UTF-8 a-umlaut character in the image caption and image title. So I removed the CDATA, see 209a574.

3 participants