Skip to content

In Markdown table cells, apply HTML escaping only to code blocks, and apply it properly#167

Merged
tetromino merged 10 commits intobazelbuild:masterfrom
tetromino:markdown_cell
Aug 1, 2023
Merged

In Markdown table cells, apply HTML escaping only to code blocks, and apply it properly#167
tetromino merged 10 commits intobazelbuild:masterfrom
tetromino:markdown_cell

Conversation

@tetromino
Copy link
Copy Markdown
Collaborator

@tetromino tetromino commented Jul 25, 2023

In Markdown table cells, apply HTML escaping only to code blocks, and apply it properly

Since #161 removed HTML escaping for defaults and function docstrings,
we should do the same for attribute and param docs in table cells.

The only limitations Markdown places on table cells are:

  • no pipe characters (they must be escaped with a backslash)
  • no newlines (they must be transformed into <br> or an HTML entity)

The latter restriction makes it impossible to have a fenced code block inside a table cell.

Therefore:

  • we do not escape HTML or Markdown markup outside a fenced code block
  • we keep existing logic for escaping newlines outside a fenced code block
  • we fix fence detection (e.g. allowing more than 3 fence characters to support embedded code blocks in code blocks, allowing tildes as fence characters, properly handling language names, etc.);
  • in code block content, we escape HTML, and we escape newlines as HTML entities (since <br> does not work in a <pre><code> block) - finally fixing code block newlines in table cells.

This is a followup to #161.

Partially addresses #118

@tetromino tetromino changed the title Markdown cell In Markdown table cells, apply HTML escaping only to code blocks, and apply it properly Jul 25, 2023
@tetromino
Copy link
Copy Markdown
Collaborator Author

tetromino commented Jul 25, 2023

This is a draft PR to be merged after #166 lands

Relevant commits start at b198a24 (everything before that is from #166)

@fmeum fyi

… apply it properly

Since bazelbuild#161 removed HTML escaping for defaults and function docstrings,
we should do the same for attribute and param docs in table cells.

The only limitations Markdown places on table cells are:
* no pipe characters (they must be escaped with a backslash)
* no newlines (they must be transformed into <br> or an HTML entity)

The latter restriction makes it impossible to have a fenced code block
inside a table cell.

Therefore:
* we do not escape HTML or Markdown markup outside a fenced code block
* we keep existing logic for escaping newlines outside a fenced code
  block
* we fix fence detection (e.g. allowing more than 3 fence characters to
  support embedded code blocks in code blocks, allowing tildes as fence
  characters, properly handling language names, etc.);
* in code block content, we escape HTML, and we escape newlines as HTML
  entities (since <br> does not work in a <pre><code> block) - finally
  fixing code block newlines in table cells.

This is a followup to bazelbuild#161.
@tetromino tetromino marked this pull request as ready for review July 25, 2023 02:03
@tetromino tetromino requested a review from brandjon as a code owner July 25, 2023 02:03
@tetromino tetromino marked this pull request as ready for review July 25, 2023 18:30
@tetromino
Copy link
Copy Markdown
Collaborator Author

I have refactored markdownCellFormat to make it feasible to extend it later for formatting lists etc. inside markdown tables.

Copy link
Copy Markdown
Member

@brandjon brandjon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments on the parsing logic but didn't go through the whole algorithm with a fine-toothed comb.

Comment thread src/main/java/com/google/devtools/build/skydoc/rendering/MarkdownUtil.java Outdated
resultString = replaceWithTag(resultString, "`", "<code>", "</code>");
// See https://github.github.com/gfm
private static final class MarkdownCellFormatter {
private final ImmutableList<String> lines;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add one-line comments to doc the invariant for these fields. (Especially that currentLine is 0-indexed.)

Comment thread src/main/java/com/google/devtools/build/skydoc/rendering/MarkdownUtil.java Outdated
Comment thread src/main/java/com/google/devtools/build/skydoc/rendering/MarkdownUtil.java Outdated
\r\n will be normalized to \n by Bazel's Starlark interpreter, and if it somehow ends up
in the input proto, it will be stripped in our String.lines() call regardless.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants