Skip to content

Conversation

@MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Apr 7, 2023

This PR introduces a new SourceFile type that replaces SourceCodeBuf. It stores the filename and, optionally, the content of the file. SourceFile uses an Arc internally to make it cheap to clone.

This change was proposed by @charliermarsh to avoid allocating a String with the filename for every Message.

Performance

This change significantly improves performance for projects with many diagnostics as we can see from the benchmarks that use --select ALL. For example, checking all rules with caching enabled improves from 470ms to 406ms

Command Mean [ms] Min [ms] Max [ms] Relative
./ruff-source-file <cpython_path> -e 40.1 ± 2.4 35.4 45.4 1.00
./ruff-main <cpython_path> -e 41.8 ± 2.6 37.4 47.6 1.04 ± 0.09
./ruff-source-file <cpython_path> -e --no-cache 221.9 ± 8.2 211.2 236.9 5.53 ± 0.39
./ruff-main <cpython_path> -e --no-cache 221.5 ± 4.3 211.7 226.3 5.52 ± 0.35
./ruff-source-file <cpython_path> -e --select=ALL 405.8 ± 10.1 391.0 427.4 10.11 ± 0.66
./ruff-main <cpython_path> -e --select=ALL 469.5 ± 8.6 451.0 482.2 11.69 ± 0.74
./ruff-source-file <cpython_path> -e --no-cache --select=ALL 739.1 ± 17.6 714.8 770.5 18.41 ± 1.20
./ruff-main <cpython_path> -e --no-cache --select=ALL 796.1 ± 13.2 780.0 820.7 19.83 ± 1.25
./ruff-source-file <cpython_path> -e --show-source 85.9 ± 3.7 80.0 93.1 2.14 ± 0.16
./ruff-main <cpython_path> -e --show-source 87.5 ± 2.9 82.5 93.2 2.18 ± 0.15
./ruff-source-file <cpython_path> -e --no-cache --show-source 262.4 ± 6.0 253.2 272.0 6.54 ± 0.42
./ruff-main <cpython_path> -e --no-cache --show-source 267.3 ± 7.4 255.7 279.7 6.66 ± 0.44
./ruff-source-file <cpython_path> -e --select=ALL --show-source 1480.3 ± 22.2 1453.7 1516.0 36.87 ± 2.30
./ruff-main <cpython_path> -e --select=ALL --show-source 1551.0 ± 24.1 1499.7 1590.7 38.63 ± 2.42
./ruff-source-file <cpython_path> -e --no-cache --select=ALL --show-source 1791.8 ± 25.6 1759.3 1845.3 44.63 ± 2.78
./ruff-main <cpython_path> -e --no-cache --select=ALL --show-source 1851.4 ± 23.0 1816.1 1887.9 46.11 ± 2.86

Memory Usage

Memory usage reduced from ~813MB to ~686MB when running all rules without showing the source code.

@MichaReiser
Copy link
Member Author

MichaReiser commented Apr 7, 2023

@MichaReiser MichaReiser marked this pull request as ready for review April 7, 2023 10:11
@github-actions
Copy link
Contributor

github-actions bot commented Apr 7, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
linter/all-rules/large/dataset.py          1.00     17.0±0.06ms     2.4 MB/sec    1.01     17.1±0.10ms     2.4 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      4.3±0.01ms     3.9 MB/sec    1.00      4.3±0.01ms     3.9 MB/sec
linter/all-rules/numpy/globals.py          1.00    548.9±0.85µs     5.4 MB/sec    1.00    549.9±1.21µs     5.4 MB/sec
linter/all-rules/pydantic/types.py         1.00      7.2±0.01ms     3.5 MB/sec    1.01      7.3±0.02ms     3.5 MB/sec
linter/default-rules/large/dataset.py      1.00      8.6±0.02ms     4.7 MB/sec    1.01      8.7±0.02ms     4.7 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00   1953.6±3.25µs     8.5 MB/sec    1.00   1960.6±2.92µs     8.5 MB/sec
linter/default-rules/numpy/globals.py      1.00    216.7±0.60µs    13.6 MB/sec    1.08   234.2±88.93µs    12.6 MB/sec
linter/default-rules/pydantic/types.py     1.00      4.0±0.01ms     6.4 MB/sec    1.00      4.0±0.01ms     6.4 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
linter/all-rules/large/dataset.py          1.00     16.0±0.10ms     2.5 MB/sec    1.00     16.0±0.09ms     2.5 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      4.2±0.02ms     4.0 MB/sec    1.00      4.2±0.04ms     4.0 MB/sec
linter/all-rules/numpy/globals.py          1.00    437.2±6.09µs     6.7 MB/sec    1.00    439.3±4.69µs     6.7 MB/sec
linter/all-rules/pydantic/types.py         1.00      6.9±0.07ms     3.7 MB/sec    1.00      6.9±0.03ms     3.7 MB/sec
linter/default-rules/large/dataset.py      1.00      8.2±0.02ms     5.0 MB/sec    1.00      8.2±0.03ms     5.0 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00  1776.8±30.37µs     9.4 MB/sec    1.01   1788.9±9.98µs     9.3 MB/sec
linter/default-rules/numpy/globals.py      1.00    186.1±2.91µs    15.9 MB/sec    1.01    187.4±4.94µs    15.7 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.8±0.02ms     6.8 MB/sec    1.00      3.8±0.03ms     6.8 MB/sec

Copy link
Member

@charliermarsh charliermarsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks for looking into this!

@MichaReiser MichaReiser force-pushed the source-frame-context branch from 177cb9b to cee4474 Compare April 11, 2023 07:26
@MichaReiser MichaReiser force-pushed the source-frame-context branch 2 times, most recently from a6b47b6 to f07154c Compare April 11, 2023 08:08
Base automatically changed from source-frame-context to main April 11, 2023 08:22
@MichaReiser MichaReiser added the internal An internal refactor or improvement label Apr 11, 2023
@MichaReiser MichaReiser enabled auto-merge (squash) April 11, 2023 08:24
@MichaReiser MichaReiser merged commit c33c9dc into main Apr 11, 2023
@MichaReiser MichaReiser deleted the code-diff branch April 11, 2023 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

internal An internal refactor or improvement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants