Skip to content
This repository was archived by the owner on Mar 8, 2020. It is now read-only.

More robust comment string parsing#354

Merged
juanjux merged 4 commits intobblfsh:masterfrom
juanjux:fix_strfind
Jan 31, 2019
Merged

More robust comment string parsing#354
juanjux merged 4 commits intobblfsh:masterfrom
juanjux:fix_strfind

Conversation

@juanjux
Copy link
Copy Markdown
Contributor

@juanjux juanjux commented Jan 31, 2019

  • Checks values returned by Index functions for -1 (had a case where it was -1, check failing CI in until this is merged in Comment and other fixes python-driver#176).
  • Trims the text before checking if the prefix and suffix exist with HasPrefix and HasSuffix to avoid cases where the language can give a wrong position for the start of the comment (or only the line) like happens with Python with some comments where it gives all the whitespace before if they're on their own line and indented.

Signed-off-by: Juanjo Alvarez juanjo@sourced.tech


func (c *commentElems) Split(text string) bool {
if !strings.HasPrefix(text, c.StartToken) || !strings.HasSuffix(text, c.EndToken) {
trimmed := strings.TrimSpace(text)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't happen automatically. It belongs to a separate op, or should be specified explicitly as a flag.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I rebased and this got fakely outdated. You mean like for example adding a new "trimText" field to commentUAST and a new op like CommentTextTrimmed?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, something like this will be nice :)

But if positions are off the node will fail the token verification for sure. So fixing positions should fix both issues (validation and trim).

Copy link
Copy Markdown
Contributor Author

@juanjux juanjux Jan 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. In the case of Python it works because it only happens with comments on their own line that are (sometimes) indented. In the native AST those doesn't have position, unfortunately, trough in this specific case it should be probably easy to fix (and in some other cases with some heuristics).

c.Prefix = text[:i]
text = text[i:]

if i == -1 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be nicer:

c.Prefix = ""
if i >= 0 {
  c.Prefix = text[:i]
  // ...
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but is assigns twice in the most common case and I don't trust Golang optimizer much, anyway I also find it better and I don't think it'll hurt much.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Juanjo Alvarez added 2 commits January 31, 2019 19:08
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Copy link
Copy Markdown
Member

@dennwc dennwc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more nitpicks. Feel free to ignore if you have limited time.

}

func CommentTextTrimmed(tokens [2]string, vr string) Op {
return &commentUAST{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can probably do something like

op := CommentTextTrimmed(tokens, vr).(*commentUAST)
op.doTrim = true

or add an unexported newCommentText with the third argument.

dennwc and others added 2 commits January 31, 2019 23:43
Co-Authored-By: juanjux <juanjo@juanjoalvarez.net>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux juanjux merged commit d506b54 into bblfsh:master Jan 31, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants