Fixed blog readtime calculation to ignore non-content text #7370
squidfunk merged 1 commit into squidfunk:master
There's no need to handle the void tags at all, since the data handler is never called for them. Void tags are tags without children, i.e., leaf elements. Please remove the void handling logic, as it does not influence the result.
Ah wait, no, we might need them to maintain the |
... you might check if the |
I've checked: The |
Okay, great, thanks for investigating. Note that this is an HTML4 parser, not an HTML5 one, which we chose for speed and package size (no further dependency is needed). As far as I remember, that is why we need to check for void tags ourselves. We'll check in the future whether we can swap this out for an approach that operates directly on the HTML AST, which would be much more efficient, but for now, I think we can keep it.
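For anyone following along, here is a minimal sketch of the behavior discussed above, assuming the parser in question behaves like Python's standard `html.parser.HTMLParser`. The `EventRecorder` class and `record_events` helper are hypothetical names used only for illustration; they are not part of the plugin. The sketch records parser callbacks to show that an unclosed void tag such as `<br>` produces a start event but no end event, so code that tracks open tags has to account for void tags itself.

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: records every parser callback in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def record_events(html):
    """Feed an HTML fragment to the recorder and return its events."""
    parser = EventRecorder()
    parser.feed(html)
    parser.close()
    return parser.events


# The void tag <br> is reported as opened but never as closed.
events = record_events("<p>one<br>two</p>")
```

If every start tag were pushed onto a stack of open tags, `br` would stay on that stack until some enclosing end tag cleaned it up, which is why a set of known void tags is useful here.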
From what I can see, this looks great, ready to merge! Thanks for your time!
Perfect, thanks for the helpful review and feedback! 🙏
I've fixed the blog readtime calculation to ignore non-content text, i.e., any text nodes inside the following tags are omitted when calculating the readtime:
<object>, <script>, <style>, <svg>
The implementation is inspired by the search plugin parser.
Note that I've duplicated the void set from the search plugin. It might be cleaner to deduplicate it, but I wasn't sure where best to move it so that it can be imported from both places. If you'd prefer deduplication, I'd appreciate your guidance on where to factor void out, @squidfunk.
Fixes #7367.
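To illustrate the idea for readers, here is a rough sketch of the approach, not the merged code. It assumes Python's standard `html.parser.HTMLParser`; the names `ContentTextParser`, `count_content_words`, `estimate_minutes`, the contents of `VOID`, and the 265 words-per-minute default are all assumptions made for this example.

```python
import math
from html.parser import HTMLParser

# Assumption: the usual HTML void elements; the plugin's own set may differ.
VOID = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
}

# Tags whose text should not count towards the readtime (from the PR text).
SKIP = {"object", "script", "style", "svg"}


class ContentTextParser(HTMLParser):
    """Hypothetical parser: collects text that is not inside a SKIP tag."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open, non-void tags
        self.text = []   # text chunks that count as content

    def handle_starttag(self, tag, attrs):
        # Void tags never get an end tag, so they must not be pushed.
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop up to and including the matching tag; ignore stray end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        # Keep the text only if no enclosing tag is in SKIP.
        if not SKIP.intersection(self.stack):
            self.text.append(data)


def count_content_words(html):
    """Count whitespace-separated words in content text only."""
    parser = ContentTextParser()
    parser.feed(html)
    parser.close()
    return sum(len(chunk.split()) for chunk in parser.text)


def estimate_minutes(words, words_per_minute=265):
    """Round up to whole minutes, with a minimum of one minute."""
    return max(1, math.ceil(words / words_per_minute))


sample = (
    "<p>Hello brave<br>new world</p>"
    "<style>p { color: red }</style>"
    "<script>var x = 1;</script>"
    "<svg><text>icon label</text></svg>"
)
words = count_content_words(sample)
```

Counting words per text chunk means that a word split across inline tags is counted as two words, which seemed an acceptable simplification for a sketch.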