Skip to content

Lazy-load syntax Contexts#393

Merged
trishume merged 1 commit into
trishume:masterfrom
Enselic:lazy-load-syntaxes
Nov 22, 2021
Merged

Lazy-load syntax Contexts#393
trishume merged 1 commit into
trishume:masterfrom
Enselic:lazy-load-syntaxes

Conversation

@Enselic

@Enselic Enselic commented Nov 9, 2021

Copy link
Copy Markdown
Collaborator

This enables making SyntaxSet::load_defaults_newlines() much faster. Intuitively it might seem like this change is enough, but it is also necessary to stop compressing the "outer" data in assets/default_newlines.packdump. Since the inner, lazy-loaded data is already compressed, compressing the data another time only makes the data CPU-intensive to decompress, without any gains in reduced size of the data.

The final step in lazy-loading will be done in a separate commit/PR after this has been merged. Here is an overview of what kind of improvements are at stake:

Version SyntaxSet::load_defaults_newlines() time
v4.6.0 35.3 ms
This commit 24.0 ms
No outer compression 1.3 ms (!)

Lazy-loading syntaxes is very fast (in the sub-ms range even on low-end desktops for most syntaxes), so time-to-parse is practically unchanged by this commit.

Naturally, the time to build a SyntaxSet (criterion test link_syntaxes) has increased significantly, since it now involves serialization and compression. That is a very good trade-off however, since it is better if the one-time link step is slow but loading assets is fast, versus fast one-time linking but slow loading of assets. Loading assets occurs over and over while linking only occurs once.

All syntect regression tests pass, and all bat regression tests pass. So risk of regressions is very low.

Overview of the syntax lazy-loading work:

Note: The target branch of this PR is temporarily store-contexts-in-syntax-references since #382 is not merged yet, but given the discussion in it it appears practically Approved to me, and I figured it wouldn't hurt to get the next PR in the series up for review.

This enables making `SyntaxSet::load_defaults_newlines()` *much* faster.
Intuitively it might seem like this change is enough, but it is also necessary
to stop compressing the "outer" data in `assets/default_newlines.packdump`.
Since the inner, lazy-loaded data is already compressed, compressing the data
another time only makes the data CPU-intensive to decompress, without any gains
in reduced size of the data.

The final step in lazy-loading will be done in a separate commit. Here are an
overview of what kind of improvements are at stake:

| Version              | `SyntaxSet::load_defaults_newlines()` time |
| -------------------- | -----------------------------------------  |
| v4.6.0               | 35.3 ms                                    |
| This commit          | 24.0 ms                                    |
| No outer compression |  1.3 ms (!)                                |

Lazy-loading syntaxes is very fast (in the sub-ms range), so time-to-parse is
practically unchanged by this commit.
Enselic added a commit to Enselic/bat that referenced this pull request Nov 13, 2021
To get fast startup, syntect will instead start to lazy-load syntaxes. See
trishume/syntect#393 and discussions in linked PRs.
@trishume trishume changed the base branch from store-contexts-in-syntax-references to master November 22, 2021 05:10
@trishume trishume merged commit aefae2f into trishume:master Nov 22, 2021
@Enselic Enselic deleted the lazy-load-syntaxes branch November 22, 2021 06:21
Enselic added a commit to Enselic/bat that referenced this pull request Nov 22, 2021
To get fast startup, syntect will instead start to lazy-load syntaxes. See
trishume/syntect#393 and discussions in linked PRs.
Enselic added a commit to sharkdp/bat that referenced this pull request Nov 22, 2021
To get fast startup, syntect will instead start to lazy-load syntaxes. See
trishume/syntect#393 and discussions in linked PRs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants