Switch from ronn to AsciiDoc to generate manual pages #4897

@chrisd8088

The Git LFS project has used the ronn(1) utility to generate its roff and HTML format manual pages since at least 2014, which is a good run for any type of software tooling.

However, the ronn project itself (https://github.com/rtomayko/ronn) does not appear to have had any updates in that time, and with the recent migration of GitHub Actions workflows to Windows Server 2022 runners, the need to first install a Ruby build environment in order to run ronn has proven something of a challenge (see #4883 and ruby/setup-ruby#293 for details).

Rendering Issues

There are also a number of glitches in ronn's conversions of Markdown to roff and HTML; see #4851 as an example, but also the HTML output for the git-lfs-install(1) man page, which uses <dt class="flush"> for just two out of a number of bullet points and therefore renders them oddly:

[Screenshot: HTML rendering of the git-lfs-install(1) manual page]

A further complication with the use of ronn is that the ronn-format(7) specification calls for the use of angle brackets to delimit user-supplied arguments and variables. This is one of four different inline delimiters supported by ronn, the others being backticks (`), double-stars (**), and underscores (_), all of which we use more-or-less consistently in our existing manual source pages.

The use of angle brackets as delimiters, though, means that when our source manual pages are viewed in the GitHub UI, some of their contents are elided, leading to confusion. A recent issue pointed out the problem (see #4857 (comment)); at my suggestion the user opened a GitHub Community discussion on the topic.

To demonstrate the problem we can look at the Git LFS manual page for the git-lfs-track(1) command. The underlying source Markdown contains this line:

`git lfs track` [options] [<pattern>...]

which describes how the command should be used.

After the page is converted by ronn(1) into roff format, and then rendered from roff into HTML by Debian's online manual page viewer (https://manpages.debian.org/), this looks like:

[Screenshot: the usage line as rendered by Debian's online manual page viewer]

However, when viewed on GitHub without the ?plain=1 query argument, it looks like:

[Screenshot: the usage line as rendered in the GitHub UI]

The elision of the <pattern> term makes the page confusing and incorrect.

Moreover, not all angle bracket-delimited words are treated identically by ronn. Various HTML4 elements are specified and then exempted from conversion. We even rely on this in a few places where we intermingle custom angle bracket-delimited terms with HTML elements like <br>. And of course the list of HTML4 elements in ronn has not kept pace with the current HTML specifications.
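As a rough way to audit the source pages for terms at risk of being elided, one could scan for angle-bracket-delimited words that are not recognized HTML elements. The following is a minimal sketch only: the function name and the abbreviated element set are illustrative assumptions, and it reproduces neither ronn's actual list of HTML4 elements nor GitHub's rendering logic.

```python
import re

# Abbreviated, illustrative set of HTML element names; NOT ronn's real list.
HTML_ELEMENTS = {"a", "b", "br", "code", "dd", "dl", "dt", "em", "i", "p", "pre", "strong"}

# Matches <word> where word is a simple identifier-like term, e.g. <pattern>.
ANGLE_TERM = re.compile(r"<([A-Za-z][A-Za-z0-9_-]*)>")


def find_placeholder_terms(line):
    """Return angle-bracket terms in `line` that are not known HTML elements."""
    return [
        name
        for name in ANGLE_TERM.findall(line)
        if name.lower() not in HTML_ELEMENTS
    ]


print(find_placeholder_terms("`git lfs track` [options] [<pattern>...]"))
print(find_placeholder_terms("first line<br>second <path>"))
```

Run over each source page, a check like this would flag <pattern> in the usage line above while leaving <br> alone.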

Desired Attributes

We would like any potential replacement for ronn in our tooling to have at least the following characteristics:

  • Ability to generate high-quality roff and HTML versions of our manual pages.
    • Supports linking between manual pages and to external URLs in the HTML versions.
  • Easily installable on all GitHub Actions runners (Windows, macOS, and Linux).
  • Limited need to reformat our existing manual source pages.
  • Source format can be displayed without serious degradation in the GitHub UI.

On that last desired characteristic, when responding to user issues it is often convenient to be able to link to a recent version of a Git LFS manual page. Unlike the Git project, we don't host rendered HTML versions of our manual pages on, say, git-lfs.github.com, so it is highly convenient to be able to refer users to the source versions of the pages; see, for instance, #3805 (comment). We could try to remember to always append the ?plain=1 query argument to such links, but this adds friction when replying to users, and moreover, users themselves refer to these pages on the GitHub UI, both for reference and in their questions, such as in #4887 for instance.

Possible and Recommended Solutions

One potential option is to adopt the AsciiDoc source format and either AsciiDoc or Asciidoctor. This would align our source format with that of the Git project, for one thing, and if we named our source files with the .adoc extension, the GitHub UI should render them. However, we would then need to convert all our source files to a new format, and further tooling appears to be necessary to convert the DocBook output of these utilities into roff-format manual pages.

Another solution, and the one recommended in this proposal, is to replace ronn with Pandoc, which also offers direct conversion from Markdown to both HTML and roff formats.

The pandoc(1) utility should be available for installation via apt(8) on Debian, Homebrew on macOS, and Chocolatey on Windows. This would eliminate the need to install a Ruby build environment with a ruby/setup-ruby@v1 GitHub Actions workflow step, which in turn would allow the Git LFS test suite to pass on Windows Server 2022.

The Markdown source format would mean our manual source pages would continue to render on GitHub's UI, and inline and reference links are fully supported for the HTML output.
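To make the conversion step concrete, here is a minimal sketch of how a build script might invoke pandoc(1) for both output formats. The file names and helper names are hypothetical, and the sketch assumes pandoc is already installed and on the PATH; it uses only pandoc's --standalone, --from, --to, and --output options.

```python
import subprocess


def pandoc_command(source, output, to_format):
    """Build a pandoc argument list converting Markdown `source` to `output`."""
    return [
        "pandoc",
        "--standalone",      # emit a complete document, not a fragment
        "--from", "markdown",
        "--to", to_format,   # "man" for roff output, "html" for HTML
        "--output", output,
        source,
    ]


def build_page(source, stem):
    """Generate roff and HTML versions of one manual page (requires pandoc)."""
    for to_format, suffix in (("man", ".1"), ("html", ".1.html")):
        subprocess.run(pandoc_command(source, stem + suffix, to_format), check=True)


# Hypothetical file names, shown only to illustrate the argument list.
print(pandoc_command("git-lfs-track.1.md", "git-lfs-track.1", "man"))
```

Keeping the argument list in one helper would let the same invocation be reused across the Windows, macOS, and Linux runners.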

We would need to revise all of our manual source pages, though; the existing ronn-format(7) ones will not convert cleanly when using pandoc(1). This effort, while significant, should not be as demanding as converting to an entirely different format like AsciiDoc.

One particular required change will be to replace the existing setext-style title headings with Pandoc's pandoc_title_block Markdown extension format, because Pandoc expects this in order to generate the roff manual pages' headers and footers, as described in the documentation:

The man page writer extracts a title, man page section number, and other header and footer information from the title line. The title is assumed to be the first word on the title line, which may optionally end with a (single-digit) section number in parentheses. (There should be no space between the title and the parentheses.) Anything after this is assumed to be additional footer and header text. A single pipe character (|) should be used to separate the footer text from the header text.
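The rule quoted above can be sketched as a small parser. This is one reading of the quoted description, not Pandoc's own implementation: the function name is hypothetical, the example title line is invented for illustration, and the sketch assumes that the text before the pipe is the footer and the text after it is the header.

```python
import re

# Title, optional "(N)" section directly attached, then any remaining text.
TITLE_LINE = re.compile(r"^(?P<title>[^\s()]+)(?:\((?P<section>\d)\))?\s*(?P<rest>.*)$")


def parse_title_line(line):
    """Split a title line into title, section, footer, and header text."""
    match = TITLE_LINE.match(line.strip())
    if match is None:
        raise ValueError("unrecognized title line: %r" % line)
    # Assumption: footer text precedes the pipe, header text follows it.
    footer, _, header = match.group("rest").partition("|")
    return {
        "title": match.group("title"),
        "section": match.group("section"),  # None when no section is given
        "footer": footer.strip(),
        "header": header.strip(),
    }


# Hypothetical title line for illustration.
print(parse_title_line("git-lfs-track(1) Git LFS | Git LFS Manual"))
```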

We will also have to replace any angle bracket-delimited words (other than HTML elements) with some other delimited form, as Pandoc does not support ronn's use of that technique for demarcating user-supplied arguments and variables in manual pages.

And we will need to write custom HTML templates for Pandoc to use, as its default HTML templates use a proportionally-spaced font and do not match the layout of a manual page displayed by man(1). We would perhaps also want to override the default template/styles.html template so that <em> elements were rendered as underlined instead of italicized, to align with their rendering by man(1), although this is not of major importance.

Additional Recommendations

If this proposal is adopted, we will need to edit all the manual source pages, and so we would have an opportunity to make some of our formatting more consistent between various pages and sections.

As noted above, ronn supports four types of inline formatting delimiters, and yet traditional roff-format manual pages are only displayed using two types of text highlighting, underlined text and/or emphasized (bold) text. And of course man(1) typically displays such pages in a monospaced font, so the use of Markdown backticks to delimit code sections seems somewhat superfluous.

We therefore propose that we eliminate the use of backticks and angle brackets as delimiters in our Markdown source, and restrict our inline formatting to just double stars (**) for emphasized text and single stars (*) for underlined text. The latter will not render as underlined in the GitHub UI, of course, but will leave the delimited text intact, unlike the use of angle brackets. Both should be correctly rendered in the output roff and HTML pages by Pandoc.

In places where we want to enclose a term in angle brackets to indicate it is a user-supplied argument or variable, we would use &lt;...&gt;. Often this would be combined with double stars to refer to a user-supplied value within a body of text. For instance, if we wanted to replicate the formatting of this portion of the Git project's git-checkout(1) manual page:

[Screenshot: the --orphan option as formatted in the Git project's git-checkout(1) manual page]

we would write something like:

\--orphan &lt;new_branch&gt;
    Create a new *orphan* branch, named **&lt;new_branch&gt;**, started from
    **&lt;start_point&gt;** and switch to it. The first commit made on this new

which would appear in the GitHub UI as:

--orphan <new_branch>
Create a new orphan branch, named <new_branch>, started from
<start_point> and switch to it. The first commit made on this new
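The substitution of &lt;...&gt; for angle brackets could in principle be applied mechanically across the existing source pages. The sketch below is illustrative only: the function name and the abbreviated set of HTML elements are assumptions, and any real conversion would still need manual review of each page.

```python
import re

# Abbreviated, illustrative set of HTML element names to leave untouched.
HTML_ELEMENTS = {"br", "code", "dd", "dl", "dt", "em", "p", "pre", "strong"}

ANGLE_TERM = re.compile(r"<([A-Za-z][A-Za-z0-9_-]*)>")


def escape_placeholders(text):
    """Rewrite <term> as &lt;term&gt; unless term is a known HTML element."""
    def replace(match):
        name = match.group(1)
        if name.lower() in HTML_ELEMENTS:
            return match.group(0)  # keep real HTML elements as-is
        return "&lt;" + name + "&gt;"

    return ANGLE_TERM.sub(replace, text)


print(escape_placeholders("--orphan <new_branch><br>"))
# prints: --orphan &lt;new_branch&gt;<br>
```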
