Ensure that angle brackets in pyscript tag are escaped before parsing #684
Co-authored-by: James A. Bednar <jbednar@users.noreply.github.com>
function escape(str: string): string {
  return str.replace(/</g, "&lt;").replace(/>/g, "&gt;")
}
I think we should escape more?
I'm not an expert in the field, but a quick search turned up this:
https://stackoverflow.com/a/6234804
I guess we should probably escape ', " and & as well?
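To make the suggestion concrete, here is a minimal sketch of what escaping all five characters could look like. `escapeHtml` and `HTML_ESCAPES` are hypothetical names for illustration and are not part of this PR. Doing the replacement in a single pass avoids re-escaping the `&` that earlier substitutions introduce.

```typescript
// Hypothetical helper for illustration only; not the code in this PR.
// Maps each special character to its HTML entity.
const HTML_ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// A single regex pass visits each source character once, so the "&"
// inside an entity we just produced is never escaped a second time.
function escapeHtml(str: string): string {
  return str.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch);
}
```

Whether the extra characters actually need escaping here is a separate question that tests would have to settle.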
I was deliberately conservative here. I think < and > are special in that they break the parser outright, while the other characters are generally parsed correctly. It might be best to write some tests to confirm.
For example, imagine the following code:
<py-script>
js.console.info("a &amp; b");
</py-script>
I would expect it to print the literal a &amp; b, but what it actually prints is a & b.
Trying to print "a &quot; b" is even worse: the &quot; is parsed as a quote ", so Python reads "a " b", which results in a Python SyntaxError.
That's indeed bad. It sounds like we actually have to unescape those HTML entities.
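As a rough illustration of the unescaping direction, the sketch below decodes only the five common entities. `unescapeHtml` and `HTML_ENTITIES` are hypothetical names, and this is an assumption about what such a helper might look like, not the approach the PR settles on.

```typescript
// Hypothetical helper for illustration only; handles just five entities.
const HTML_ENTITIES: Record<string, string> = {
  "&lt;": "<",
  "&gt;": ">",
  "&quot;": '"',
  "&#39;": "'",
  "&amp;": "&",
};

// A single pass means "&amp;lt;" decodes to "&lt;" and is not decoded
// again into "<", which chained replace() calls could do by accident.
function unescapeHtml(str: string): string {
  return str.replace(
    /&(?:lt|gt|quot|#39|amp);/g,
    (entity) => HTML_ENTITIES[entity] ?? entity,
  );
}
```

Entities outside this small table, such as numeric references other than `&#39;`, are left untouched.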
Uhm, right, for those it's the opposite direction.
Btw, I just checked what JS does:
<script>
console.info("a &amp; b");
</script>
prints a &amp; b, so we should probably do the same.
Without escaping angle brackets (< and >) the DOMParser will strip out anything that looks like an HTML tag.