Whitespace before EOL sometimes incorrectly preserved #715

@ollpu

In the CommonMark spec, 4.8 Paragraphs:

The paragraph’s raw content is formed by concatenating the lines and removing initial and final spaces or tabs.

and in 6.8 Soft line breaks:

Spaces at the end of the line and beginning of the next line are removed:

Some test cases for these (atop tests/html.rs):

// fails
#[test]
fn trim_space_and_tab_at_end_of_paragraph() {
    let original = "one\ntwo \t";
    let expected = "<p>one\ntwo</p>\n";

    let mut s = String::new();
    html::push_html(&mut s, Parser::new(&original));
    assert_eq!(expected, s);
}

// passes
#[test]
fn trim_space_tab_nl_at_end_of_paragraph() {
    let original = "one\ntwo \t\n";
    let expected = "<p>one\ntwo</p>\n";

    let mut s = String::new();
    html::push_html(&mut s, Parser::new(&original));
    assert_eq!(expected, s);
}

// fails
#[test]
fn trim_space_nl_at_end_of_paragraph() {
    let original = "one\ntwo \n";
    let expected = "<p>one\ntwo</p>\n";

    let mut s = String::new();
    html::push_html(&mut s, Parser::new(&original));
    assert_eq!(expected, s);
}

// fails
#[test]
fn trim_space_before_soft_break() {
    let original = "one \ntwo";
    let expected = "<p>one\ntwo</p>\n";

    let mut s = String::new();
    html::push_html(&mut s, Parser::new(&original));
    assert_eq!(expected, s);
}

I tried fixing this, but it turned out not to be so simple. trim_space_tab_nl_at_end_of_paragraph seems to pass because hard breaks are correctly trimmed out. However, no hard break is recognized when the input has no trailing newline, so that case will need at least some special handling or a modification to parse_line.
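
To make the expected output of the tests above concrete, here is a minimal std-only sketch of the trimming that the two quoted spec passages describe. It is not pulldown-cmark code and not a proposed patch: normalize_paragraph is a hypothetical helper, and it assumes the input is the raw text of a single paragraph with no hard line breaks (two or more trailing spaces, or a backslash, before a newline), which a real parser has to recognize before trimming.

```rust
/// Hypothetical helper (not part of pulldown-cmark): applies the spec's
/// trimming rules to the raw lines of one paragraph.
///
/// Assumption: the paragraph contains no hard line breaks, so every
/// newline is a soft break and all surrounding spaces/tabs can be dropped.
fn normalize_paragraph(raw: &str) -> String {
    raw.lines()
        // Remove spaces and tabs at the end of each line and at the
        // beginning of the next one; this also covers the paragraph's
        // initial and final whitespace.
        .map(|line| line.trim_matches(|c: char| c == ' ' || c == '\t'))
        .collect::<Vec<_>>()
        // Soft breaks are kept as plain newlines.
        .join("\n")
}
```

Under those assumptions, all four inputs from the tests ("one\ntwo \t", "one\ntwo \t\n", "one\ntwo \n" and "one \ntwo") normalize to "one\ntwo", which is the paragraph content the tests expect inside the p element.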
