parse.y: add heredoc <<~ syntax (Feature #9098) #878
bjmllr wants to merge 10 commits into ruby:trunk from
|
Yay! |
|
👍 |
|
What happens if there are blank lines in the here doc (not necessarily with leading whitespace) ? |
|
@Fryguy currently those would be considered lines with no indentation, so they would cause the entire heredoc to be flush left. That means that the documentation I just pushed is incorrect, but before I fix it, do you think it's better to ignore blank lines, or treat them as lines with no indentation? |
|
I was originally thinking ignore them for the purposes of figuring out the strip size. As a user of the method, my least surprise would be with this:
    class FancyHello
      def self.hello
        puts <<~README.inspect
          Hello

           World!
        README
      end
    end
    FancyHello.hello # => "Hello\n\n World!\n"
Not 100% sure though...what do others think? @avdi? |
|
With this last commit, lines which are blank (empty or consisting only of tabs and spaces) will not be used to find the base indentation level. On a blank line, any amount of indentation less than the heredoc's base indentation level will be ignored, while any additional indentation will be preserved. |
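The blank-line rule described above can be sketched in plain Ruby. This is only an illustration, not the parse.y implementation: `dedent` is a hypothetical helper name, and the sketch assumes each tab counts as a single indentation character.

```ruby
# Hypothetical helper: illustrates the blank-line rule, not the parser code.
def dedent(text)
  lines = text.lines

  # Blank lines (empty, or only spaces/tabs) do not take part in finding
  # the base indentation level.
  base = lines.reject { |line| line.strip.empty? }
              .map { |line| line[/\A[ \t]*/].size }
              .min || 0

  # Remove at most `base` leading spaces/tabs from every line. A blank line
  # with less indentation loses all of it; extra indentation is preserved.
  lines.map { |line| line.sub(/\A[ \t]{0,#{base}}/, "") }.join
end

sample = "    Hello\n\n      World!\n"
puts dedent(sample).inspect # prints "Hello\n\n  World!\n"
```

The empty line in `sample` has no indentation, yet the base level is still 4, so `Hello` ends up flush left and `World!` keeps its two extra spaces.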
|
I expect that literally written spaces/tabs would be stripped, but not escaped ones, such as |
|
@nobu |
|
I've thought about this a lot, and I've ended up with two options that I would find acceptable:
#1. Indent is based on shortest-indented non-whitespace line. So:
    class FancyHello
      def self.hello
        puts <<~README.inspect
          Hello
          World!
        README
      end
    end
outputs:
#2. Final indent is based on the indent level of the closing marker (
Of the two, I suspect #1 is less likely to surprise people. In both cases, blank lines are ignored for the purpose of indent. |
|
Escaped spaces seem fine. |
|
BTW, it's better to adopt the existing coding style (indent, braces, etc.) to send patches, even if it is far from your favorites. |
|
@nobu I definitely didn't intend to introduce style inconsistencies! I guess you specifically meant where I was using just spaces for indentation, instead of tabs and then spaces ... if so, I think I have fixed it with this last commit. I'll work on the other issues later this week. Thanks for all the feedback! |
|
@nobu With the changes you mentioned above, interpolation seems to be working now. I also updated ripper to provide the dedented string and added support for backticks. |
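As a small illustration of interpolation inside a squiggly heredoc, the sketch below assumes a Ruby that ships the feature (2.3 or later). The method name `report` is made up for the example. Only literal indentation in the source is removed; whitespace that arrives through `#{...}` is left alone.

```ruby
# Assumes Ruby 2.3+ (squiggly heredoc support); `report` is a made-up name.
def report(body)
  <<~TEXT
    Report:
    #{body}
  TEXT
end

# Leading spaces inside the interpolated value are not treated as indentation.
puts report("  kept as-is").inspect # prints "Report:\n  kept as-is\n"
```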
|
👍 This would solve a long-time annoyance I (and presumably many others) have had with the heredoc syntax. Rails has had a solution to this for a while, but for non-Rails code the need to remove indentation from heredoc strings has been rather irritating. |
|
Hi! I'm guessing this would need a rebase if it was to be merged, but... is it still being considered? I would personally find it very handy to have it in core. |
As suggested by nobu, this eliminates one of the passes through the string and also allows us to give different treatment to escape sequences.
doing this required a few other changes:
* parser_params now includes a parser_heredoc_indent element
* the amount of indentation to remove from a heredoc is now found in parser_tokadd_string rather than parser_heredoc_indent
* parser_heredoc_dedent is now called from parser_here_document rather than parser_str_new
some cleanup happened in this commit as well:
* removed unneeded parser_heredoc_dedent signature
* starting size for parser->parser_heredoc_indent is now INT_MAX
* parser_heredoc_dedent now calls dispose_string on the input string, unless parser->parser_heredoc_indent is 0, in which case it returns the input string

* added an interpolation test case, which still fails
* eliminated STR_FUNC_DEDENT, heredoc_dedent is sufficient
* changed heredoc_dedent() to accept and return NODE's
* call heredoc_dedent() and reset heredoc_dedent from parser rules
* save and restore heredoc_dedent around compstmt in string_content
* set heredoc_indent as soon as a squiggly heredoc starts parsing

* make heredoc_line_indent a member of parser_params (needed because a line can be broken into multiple nodes by an interpolated expression)
* drop reading_indentation from parser_tokadd_string in favor of heredoc_line_indent
* rewrite heredoc_dedent() to walk the AST and rewrite indentation across string fragments
* add failing case of interpolated string, fix it by tracking yet-to-be-removed indent for the count process and for the copy process separately
* remove carriage return handling

* extract actual dedenting activity from parser_heredoc_dedent() to parser_heredoc_dedent_string()
* add parser_heredoc_dedent_ripper() for use in ripper
* add squiggly heredoc tests
35dbcd5 to eb7f824
|
I rebased this branch and made a first attempt at @matz's request regarding the handling of hard tabs. It should now do something sensible for any indentation other than spaces followed by tabs on a single line. The build error seems to be unrelated, something in |
|
@bjmllr I re-ran Travis CI. |
|
I'm confused. Why are tabs being treated as equivalent to spaces at all? E.g. if I write:
    def hello
      puts <<~README.inspect
    <tab>Hello
    <space><space><space><space><space><space><space><space>World!
      README
    end
Are you saying that should be accepted by the compiler? Why? Why should that be any less invalid than:
    def hello
      puts <<~README.inspect
    <tab>Hello
    <space><space><space><space>World!
      README
    end
or
    def hello
      puts <<~README.inspect
    <space><space><space><space>Hello
    <space><space>World!
      README
    end
Shouldn't we just throw an error in all of those cases? Is there ever a legitimate reason why you'd want to allow inconsistent indentation in one of these blocks? What happens when someone has their editor set to display tabs as 4 spaces, and writes:
    def hello
      puts <<~README.inspect
    <space><space><space><space>Hello
    <tab>World!
      README
    end
Why should that result in:
    hello #=> " Hello\n\n\tWorld!"
I certainly wouldn't expect that result intuitively. In such a case, wouldn't a well-written error message explaining that I'm mixing tabs and spaces be much more helpful for me as a developer? |
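The kind of check asked for here could look roughly like the following sketch. It is not part of the patch: `mixed_indentation?` is a hypothetical helper, and it only covers the tabs-versus-spaces case, not lines that differ merely in depth.

```ruby
# Hypothetical helper, not part of the patch: reports whether the non-blank
# lines of a heredoc body use both tabs and spaces for indentation.
def mixed_indentation?(text)
  indents = text.lines
                .reject { |line| line.strip.empty? }
                .map { |line| line[/\A[ \t]*/] }

  indents.any? { |indent| indent.include?("\t") } &&
    indents.any? { |indent| indent.include?(" ") }
end

puts mixed_indentation?("\tHello\n        World!\n") # prints true
puts mixed_indentation?("    Hello\n      World!\n") # prints false
```

A parser that wanted to reject such input could raise an error when a check like this returns true, rather than picking a tab width.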
|
Closing this since the feature was added in 9a28a29 |
Allows for the use of heredocs which appear nicely indented in ruby source code, but the indentation is removed during parsing.
Original proposal: https://bugs.ruby-lang.org/issues/9098
Uses the syntax suggested by Avdi Grimm (<<~), and should have the same semantics as String#strip_heredoc from ActiveSupport, that is, the indentation of the least-indented line is removed from each line of the string.
No attempt was made to deal with inconsistent indentation (tabs are considered equal to spaces).
Please let me know if I can improve this patch. Thanks!
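For readers comparing the two heredoc forms, here is a brief sketch. It assumes a Ruby release that includes the squiggly heredoc (2.3 or later), and the method names are invented for the example. `<<-` only allows the closing marker to be indented, while `<<~` also removes the indentation of the least-indented line from every line.

```ruby
# Assumes Ruby 2.3+; method names are invented for this example.
def with_dash
  <<-TEXT
    Hello
      World!
  TEXT
end

def with_squiggly
  <<~TEXT
    Hello
      World!
  TEXT
end

puts with_dash.inspect     # prints "    Hello\n      World!\n"
puts with_squiggly.inspect # prints "Hello\n  World!\n"
```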