Skip to content

Protect against quadratic generated table size explosion#146

Merged
jgm merged 1 commit intojgm:masterfrom
notriddle:notriddle/table-size-limit
Feb 3, 2024
Merged

Protect against quadratic generated table size explosion#146
jgm merged 1 commit intojgm:masterfrom
notriddle:notriddle/table-size-limit

Conversation

@notriddle
Copy link
Copy Markdown
Contributor

@notriddle notriddle commented Feb 1, 2024

Fixes #145

This commit adds a limit to the number of auto-completed cells around 200,000. The result is, in these original samples:

python -c 'N=100; print("x|" * N + "\n" + "-|" * N + "\n" + "x|\n" * N)' | commonmark --extension=pipe_tables | wc -c
102362

This is unchanged.

$ python -c 'N=1000; print("x|" * N + "\n" + "-|" * N + "\n" + "x|\n" * N)' | commonmark --extension=pipe_tables | wc -c
2025878

This, however, is much lower. The parser is refusing to parse the row where the limit is hit, and switches to parsing it as text instead.

$ python -c 'N=10000; print("x|" * N + "\n" + "-|" * N + "\n" + "x|\n" * N)' | commonmark --extension=pipe_tables | wc -c
2240258

It's not quadratic any more (without this commit, it was 1000230062).

At the limit, 200,000 autocompleted rows is approximately "</td><td>".len * 200000 generated data, or 1.8Mbyte of data. The above examples show about 2Mbytes, thanks to the row tags and text overhead, and when the parser breaks out of the table it's still including the remaining "rows" as text.

This commit adds a limit to the number of auto-completed cells
around 200,000. The result is, in these original samples:

    python -c 'N=100; print("x|" * N + "\n" + "-|" * N + "\n" + "x|\n" * N)' | commonmark --extension=pipe_tables | wc -c
    102362

This is unchanged.

    $ python -c 'N=1000; print("x|" * N + "\n" + "-|" * N + "\n" + "x|\n" * N)' | commonmark --extension=pipe_tables | wc -c
    2025878

This, however, is much lower. The parser is refusing to parse the row where the limit is hit,
and switches to parsing it as text instead.

    $ python -c 'N=10000; print("x|" * N + "\n" + "-|" * N + "\n" + "x|\n" * N)' | commonmark --extension=pipe_tables | wc -c
    2240258

It's not quadratic any more (without this commit, it was 1000230062).

At the limit, 200,000 autocompleted rows is approximately "</td><td>".len * 200,000
generated data, or 1.8Mbyte of data. The above examples show about 2Mbytes,
thanks to the row tags and text overhead, and when the parser breaks out of the table
it's still including the remaining "rows" as text.
@jgm jgm merged commit 6c07e48 into jgm:master Feb 3, 2024
@notriddle notriddle deleted the notriddle/table-size-limit branch February 3, 2024 18:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Quadratic output size explosion with tables extension

2 participants