Incorrect parsing of docbook link text with leading/trailing whitespace #11398

@dregad

Explain the problem

I believe pandoc does not parse the following docbook snippet correctly:

<para>Text
  <ulink url="http://example.com">
    link
  </ulink>.
</para>

Expected result

My docbook toolchain renders this as follows (note that all the whitespace within the ulink tag is trimmed):

<p>Text <a href="http://example.com">link</a>.</p>
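
To make the expected normalization concrete, here is a minimal Python sketch of it. It is not pandoc's implementation and not the docbook toolchain's either: `collapse` and `para_to_html` are hypothetical helpers written for this report, and they assume a `<para>` that contains only plain text and `<ulink>` elements with text-only content. Runs of whitespace are collapsed to a single space, and the link text is additionally trimmed at both ends.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

# The snippet from this report, with whitespace around the link text.
DOCBOOK = """<para>Text
  <ulink url="http://example.com">
    link
  </ulink>.
</para>"""


def collapse(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return re.sub(r"\s+", " ", text or "")


def para_to_html(xml_source):
    """Render a <para> holding text and <ulink> children as an HTML <p>.

    Hypothetical helper: it only handles the elements used in this report.
    """
    para = ET.fromstring(xml_source)
    parts = [collapse(para.text)]
    for child in para:
        if child.tag == "ulink":
            # Trim leading/trailing whitespace inside the link text.
            label = collapse(child.text).strip()
            url = escape(child.get("url", ""), quote=True)
            parts.append(f'<a href="{url}">{escape(label)}</a>')
        # Text that follows the child element, up to the next sibling.
        parts.append(collapse(child.tail))
    return "<p>" + "".join(parts).strip() + "</p>"


print(para_to_html(DOCBOOK))
# <p>Text <a href="http://example.com">link</a>.</p>
```

With this rule the trailing whitespace inside the link is dropped rather than moved after the closing tag, so the period stays attached to the link.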

Actual result

pandoc --from docbook --to html test.xml produces this instead (note the space between the closing a tag and the period):

<p>Text <a href="http://example.com">link</a> .</p>

Interestingly, when there is no whitespace around the link text, the output is correct:

<para>Text
  <ulink url="http://example.com">link</ulink>.
</para>

<p>Text <a href="http://example.com">link</a>.</p>

I guess it could (maybe) make sense if the space were kept inside the a tag, i.e. ...>link </a>., but it should definitely not be moved outside of it.

For the record, I'm trying to convert docbook to asciidoc, but the problem seems to be in parsing the docbook source, not in rendering the output:

  • Expected: Text http://example.com[link].
  • Actual: Text http://example.com[link] .

Pandoc version?
3.8.3 on Ubuntu 22.04.5 LTS
