language-definitions AutoHotkey: Modified block comment matching#1703
language-definitions AutoHotkey: Modified block comment matching#1703nnnik wants to merge 5 commits intoPrismJS:masterfrom
Conversation
Modified comment matching
Block comments now need to start on their own line
RunDevelopment
left a comment
There was a problem hiding this comment.
Thank you for contributing!
However before we can merge this, we will have to make a few changes.
The comment pattern has two parts:
(^[^";\n]*("[^"\n]*?"[^"\n]*?)*)(?:;.*$)to match single-line comments and^\s*\/\*[^\n]*(?:\n(?:[^*]|\*[^\/])*(?:\*\/)?)?to match multi-line comments.
You only changed the second part but we might as well take this chance to improve the first one as well, so let's start with that:
The single-line comment pattern can be simplified to ;.*. The $ at the end is redundant (see .) and the lookbehind group (^[^";\n]*("[^"\n]*?"[^"\n]*?)*) is to prevent matching ;s inside strings.
I removed the lookbehind group because Prism has a better way to deal with that: greedy matching. So making string greedy will have the same effect and makes the pattern simpler.
Now to multi-line comments:
First of all, there are 3 possible end of line sequences: \n, \r, and \r\n. So the pattern doesn't support \r line ends.
Second, the requirement for */ to close the comment is to be at the start of a line (plus spaces). This means that the following is a comment:
/*
* I don't end here */ but there
*/Knowing that, the pattern should be (^\s*)\/\*[\s\S]*?^\s*\*\/.
(Leading and trailing spaces should not be part of any token, so I use the lookbehind group to remove them.)
This doesn't yet include comments which aren't closed by */ and end at the end of file, so let's add them:
(^\s*)\/\*(?:[\s\S]*?^\s*\*\|[\s\S]*)/.
Please verify that this works and please extend the comment test. I haven't thoroughly tested it.
Also, are non-space characters allowed on the same line after the closing */? E.g.
/*
*/ where do I end?
*/
I dont know about the single line comments for now. I haven't looked into them extensively.
I kind of assumed that the parser would handle the newlines accordingly. AutoHotkey can deal with all types of Unicode newlines - sadly thats not a feature javascript can handle apparently.
Thats incorrect - the latest AHK Version use exactly the behavior I implemented. So this would be correct:
Oh I will fix that I will also add more unit tests to verify the new features. |
Split comment-blocks from single-line comments Removed leading whitespace from match Added unit tests for new features
|
I added multiple regexes to the comment matching in order to split of multi-line comments from single-line comments. |
|
So I went ahead and looked into single line comments and it seems that the only requirements that AHK has regarding single line comments are:
Which actually means that strings will be torn apart by ; if it follows a whitespace Will result in the creation of test.txt with the content: It will be hard to detect whether "" is currently inside an expression or not so highlighting it all the time doesn't seem to be wrong. Since ; interupts it not highlighting it then seems to be the safest option. Single line comments can be reduced to: using this logic. |
|
I greatly simplified single-line comment matching and made the behavior equivalent to autohotkeys own behavior. Strings don't take precedence over comments in my browser but I will probably modify the regex for strings in order to cover that case. |
…ized as functions
|
This one is a minor fix that fixes if( and while( |
| }, | ||
| 'comment': [ | ||
| { | ||
| pattern: /(^|\s);.*/m, |
There was a problem hiding this comment.
You don't need the m flag.
\s will also match \n and \r. Do keep ^ though.
| lookbehind: true | ||
| }, | ||
| { | ||
| pattern: /(^\s*)\/\*[^\n\r]*(?:(\n|\r|\r\n)(?:[^*]|\*[^\/])*(?:\*\/)?)?/m, |
There was a problem hiding this comment.
How about this:
/(^\s*)\/\*.*$(?:[\s\S]*?\*\/|[\s\S]*)/mLike this, we can just let JS handle all different kinds of line breaks.
| lookbehind: true | ||
| }, | ||
| { | ||
| pattern: /(^\s*)\/\*[^\n\r]*(?:(\n|\r|\r\n)(?:[^*]|\*[^\/])*(?:\*\/)?)?/m, |
There was a problem hiding this comment.
This has to be greedy because of:
/* ; what now?
*/|
Has this been integrated into v1.16? |
|
There are some comments that haven't been addressed yet which is why this PR is still open. |
Okay cool, thanks! 👍 |
|
@nnnik Is this still being worked on or shall I take over? |
|
@RunDevelopment Gonna drop this from 1.17.0, if we want to pub today. |
|
@mAAdhaTTah Agreed. |
|
What's wrong with this PR? anything missing? |
The block comment matching has been fixed.
I also made the behavior as close as possible to, how AutoHotkey actually parses block comments as of 1.1.30.01.
The parser looks for a line starting with /* (excluding whitespace).
Then starting from the next line it will look for */ to end the comment.
The closing tag seems to be entirely optional and the entire remaining file will become a comment block if no ending tag is found.
Any code directly after a starting sequence is ignored until the parser encounters a new line.