Skip to content

Conversation

@mschoettle
Copy link

@mschoettle mschoettle commented Apr 15, 2025

When a Markdown document starts with a comment (for example, a copyright notice, such as one gets when adopting REUSE, the heading is not found.

Comments are converted to HTML placeholders by markdown.
Change the logic to ignore the first element if it is a placeholder (in case it comes before the first-level heading).

@oprypin
Copy link
Contributor

oprypin commented Apr 15, 2025

I can assure you that this break was not accidentally mis-indented. Accepting a heading only at the very beginning folows the same idea as the original behavior since the year 2017. I had to comment because this is written as if prior to #3191 this title in your case would have been found, but that's not the case.

Allowing a comment, if there's some problem with it, should probably be a much more targeted change.

@mschoettle mschoettle changed the title Break only when h1 tag is finding when extracting a page title Break only when h1 tag is found when extracting a page title Apr 15, 2025
@mschoettle
Copy link
Author

I can assure you that this break was not accidentally mis-indented. Accepting a heading only at the very beginning folows the same idea as the original behavior since the year 2017. I had to comment because this is written as if prior to #3191 this title in your case would have been found, but that's not the case.

Allowing a comment, if there's some problem with it, should probably be a much more targeted change.

It was not my intention to suggest that #3191 broke previous behaviour. Sorry if that came across. I was trying to figure out where the title is determined and found the most recent change, so I did not look at original behaviour at all. I updated the description.

The way I understood "report the first H1 tag" in the PR description was that it would look until it found it. I agree that the heading should come first. I'll try to figure something out that still expects the heading to come first but allows for comments.

@mschoettle mschoettle changed the title Break only when h1 tag is found when extracting a page title Ignore first element if it is a comment when determining page title Apr 15, 2025
@mschoettle
Copy link
Author

I reverted my change and added a more targeted change to skip the first element if it is an HTML placeholder.

@oprypin
Copy link
Contributor

oprypin commented Apr 15, 2025

Nice :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants