Use accumulator for diagnostics by MichaReiser · Pull Request #14116 · astral-sh/ruff

MichaReiser · 2024-11-05T19:31:27Z

Summary

Use a Salsa accumulator for diagnostics to avoid double-emitting diagnostics for unpack assignments.

We now use a single CompileDiagnostic accumulator that stores all diagnostics. That includes IO errors, parse errors, and type check errors. I decided that we're beyond where String annotations are fun. That's why I introduced a Diagnostic trait with 5 implementations:

ParseDiagnostic for parse errors
SourceTextDiagnostic for IO errors
TypeCheckDiagnostic for type inference errors
RevealedTypeDiagnostic for reveal_type (info severity)
CompileDiagnostic, a cheaply cloneable wrapper around any diagnostic; it implements Salsa's accumulator.

Using a structured diagnostic has the advantage of no longer needing manual parsing in the LSP.
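As a rough illustration of the trait-plus-wrapper design described above, here is a minimal sketch. It does not reproduce the PR's code: the method names (`rule`, `message`, `severity`), the `Severity` enum, the `revealed-type` rule name, and the use of `Arc` for cheap cloning are all assumptions, and the Salsa accumulator integration is left out entirely.

```rust
use std::fmt;
use std::sync::Arc;

/// Illustrative severity levels (assumed, not taken from the PR).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Info,
    Error,
}

/// Hypothetical stand-in for the `Diagnostic` trait; method names are assumptions.
pub trait Diagnostic: fmt::Debug + Send + Sync {
    fn rule(&self) -> &str;
    fn message(&self) -> String;
    fn severity(&self) -> Severity;
}

/// A parse error carried as structured data instead of a formatted string.
#[derive(Debug)]
pub struct ParseDiagnostic {
    pub message: String,
}

impl Diagnostic for ParseDiagnostic {
    fn rule(&self) -> &str {
        "invalid-syntax"
    }
    fn message(&self) -> String {
        self.message.clone()
    }
    fn severity(&self) -> Severity {
        Severity::Error
    }
}

/// The result of a `reveal_type` call, reported with info severity.
#[derive(Debug)]
pub struct RevealedTypeDiagnostic {
    pub ty: String,
}

impl Diagnostic for RevealedTypeDiagnostic {
    fn rule(&self) -> &str {
        "revealed-type" // assumed rule name
    }
    fn message(&self) -> String {
        format!("Revealed type is `{}`", self.ty)
    }
    fn severity(&self) -> Severity {
        Severity::Info
    }
}

/// Wrapper around any diagnostic. Cloning only bumps the `Arc` reference count.
#[derive(Debug, Clone)]
pub struct CompileDiagnostic(Arc<dyn Diagnostic>);

impl CompileDiagnostic {
    pub fn new(diagnostic: impl Diagnostic + 'static) -> Self {
        Self(Arc::new(diagnostic))
    }
}

/// The wrapper forwards every method to the diagnostic it wraps.
impl Diagnostic for CompileDiagnostic {
    fn rule(&self) -> &str {
        self.0.rule()
    }
    fn message(&self) -> String {
        self.0.message()
    }
    fn severity(&self) -> Severity {
        self.0.severity()
    }
}

fn main() {
    let diagnostics = vec![
        CompileDiagnostic::new(ParseDiagnostic { message: "Expected an expression".to_string() }),
        CompileDiagnostic::new(RevealedTypeDiagnostic { ty: "int".to_string() }),
    ];
    for diagnostic in &diagnostics {
        println!("{:?} [{}] {}", diagnostic.severity(), diagnostic.rule(), diagnostic.message());
    }
}
```

A consumer such as an LSP server can then read the rule and severity directly from the value rather than parsing them back out of a formatted string.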

A nice side effect of this is that mdtests now support tests with syntax errors because parse-errors are just another diagnostic :)

Performance regression

I expect this to hit performance pretty badly, especially the incremental benchmark. The problem is that there's currently no way to run an accumulator concurrently. That's why I had to resort to a hack where we run type checking in parallel and then collect the diagnostics. This has the downside that Salsa first runs all queries (in parallel) but then has to iterate over all of them again to collect the diagnostics (even if there are none!). The queries are all cached, but it is still expensive because the query dependency tree is somewhat large.

This overhead is especially noticeable in the cached case. We'll need salsa-rs/salsa#568 to do the accumulation and checking in one go.
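The two-phase workaround described above can be pictured with a loose analogy in plain Rust. This is not Salsa code: `check_file` and `check_all` are hypothetical helpers, and the per-file result vectors stand in for cached query results. The point is only that the second phase touches every result again, including the empty ones.

```rust
use std::thread;

/// Hypothetical per-file check: reports one diagnostic per `reveal_type` line.
fn check_file(source: &str) -> Vec<String> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains("reveal_type"))
        .map(|(index, _)| format!("line {}: revealed type", index + 1))
        .collect()
}

/// Checks all files in parallel, then collects the diagnostics sequentially.
fn check_all(files: &[&str]) -> Vec<String> {
    // Phase 1: run the checks in parallel, one scoped thread per file.
    let per_file: Vec<Vec<String>> = thread::scope(|scope| {
        let handles: Vec<_> = files
            .iter()
            .map(|source| scope.spawn(move || check_file(source)))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("check panicked"))
            .collect()
    });

    // Phase 2: walk every result again to gather diagnostics,
    // even for files that produced none.
    let mut all = Vec::new();
    for diagnostics in per_file {
        all.extend(diagnostics);
    }
    all
}

fn main() {
    let files = ["x = 1", "reveal_type(x)\ny = 2\nreveal_type(y)"];
    for diagnostic in check_all(&files) {
        println!("{diagnostic}");
    }
}
```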

Test Plan

cargo test

MichaReiser · 2024-11-05T19:32:24Z

crates/red_knot_python_semantic/src/types.rs

+#[salsa::tracked]
+pub fn check_types(db: &dyn Db, file: File) {


This has to be a Salsa query so that the mdtest framework can get the diagnostics (accumulators require a query).

MichaReiser · 2024-11-05T19:34:52Z

crates/red_knot_server/src/edit/range.rs

All the code here is copied over from ruff_server.

crates/ruff_db/src/parsed.rs

codspeed-hq · 2024-11-05T19:37:58Z

CodSpeed Performance Report

Merging #14116 will degrade performances by 32.2%

Comparing micha/accumulator-diagnostics (371e5fc) with main (4ece8e5)

Summary

❌ 1 regressions
✅ 31 untouched benchmarks

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `main` | `micha/accumulator-diagnostics` | Change |
|---|---|---|---|---|
| ❌ | `red_knot_check_file[incremental]` | 4.2 ms | 6.3 ms | -32.2% |

github-actions · 2024-11-05T20:10:42Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

AlexWaygood

Just a few comments from skimming. I'll try to look at this more in-depth tomorrow!

crates/red_knot_python_semantic/src/types/infer.rs

crates/red_knot_test/src/lib.rs

crates/ruff_db/src/diagnostic.rs

crates/red_knot_workspace/src/workspace.rs

crates/ruff_benchmark/benches/red_knot.rs

carljm

Looks great to me, thank you! I think this is definitely an improvement in terms of reliable handling of diagnostics (to avoid duplicating or accidentally missing them); hopefully parallel-db brings back most of the regression.

carljm · 2024-11-05T22:26:51Z

crates/ruff_db/src/diagnostic.rs

@@ -0,0 +1,175 @@
+use ruff_text_size::TextRange;
+use salsa::Accumulator as _;


What is the benefit of the as _ here? It seems to be just as happy for me locally without it.

Suggested change

-use salsa::Accumulator as _;
+use salsa::Accumulator;

Both work, as long as there's no other item named Accumulator in scope. Using as _ imports the trait anonymously, so you can't refer to it by name. That's usually sufficient for traits, because all that's needed is to bring the trait into scope so that its methods can be called.
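A small self-contained example of the difference, using made-up names (`shapes`, `Area`, `Square`) rather than anything from Salsa: the anonymous import makes the trait's methods callable while leaving the name `Area` free for a local item. Replacing `as _` with a plain `use shapes::Area;` here would be rejected as a name conflict with the local struct.

```rust
mod shapes {
    pub trait Area {
        fn area(&self) -> f64;
    }

    pub struct Square(pub f64);

    impl Area for Square {
        fn area(&self) -> f64 {
            self.0 * self.0
        }
    }
}

// A local item that happens to share the trait's name.
struct Area;

// Anonymous import: the trait's methods come into scope,
// but the name `Area` is not bound by this `use`.
use shapes::Area as _;
use shapes::Square;

fn square_area(side: f64) -> f64 {
    // Resolves to the trait method because the trait is in scope.
    Square(side).area()
}

fn main() {
    let _marker = Area; // refers to the local struct, not the trait
    println!("{}", square_area(3.0));
}
```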

crates/red_knot_test/src/lib.rs

crates/red_knot_test/src/diagnostic.rs

MichaReiser · 2024-11-06T07:21:26Z

I just noticed that we can now support invalid-syntax in mdtests. I ported the exception handler test case with invalid syntax to demonstrate it.

MichaReiser · 2024-11-06T09:04:32Z

I don't think this is the way to go. The problem is that Salsa's accumulators are O(tree) rather than O(changes), which introduces a significant cost. They are very convenient, but I'm concerned about the incremental performance in very large monorepos...

I have a fix specific to unpacking, and I'll look into extracting some of the improvements I made in this PR, but I'm now more convinced that accumulators, as they are today, are not a good fit for our very deeply nested queries.
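To make the O(tree) concern concrete, here is a toy model of accumulator collection. It is an assumption-laden sketch, not Salsa's implementation: `QueryGraph`, `collect`, and `chain` are invented for illustration. Collecting from a root query walks every query reachable from it, so the work grows with the depth of the dependency tree even when nothing was accumulated.

```rust
use std::collections::{HashMap, HashSet};

/// Toy dependency graph (hypothetical, not Salsa's data model).
struct QueryGraph {
    /// Query id -> ids of the queries it depends on.
    deps: HashMap<u32, Vec<u32>>,
    /// Query id -> values that query accumulated.
    accumulated: HashMap<u32, Vec<String>>,
}

impl QueryGraph {
    /// Gathers all values accumulated by queries reachable from `root`.
    /// Also returns how many queries had to be visited to do so.
    fn collect(&self, root: u32) -> (Vec<String>, usize) {
        let mut visited = HashSet::new();
        let mut stack = vec![root];
        let mut values = Vec::new();

        while let Some(query) = stack.pop() {
            if !visited.insert(query) {
                continue; // already visited through another path
            }
            if let Some(found) = self.accumulated.get(&query) {
                values.extend(found.iter().cloned());
            }
            if let Some(children) = self.deps.get(&query) {
                stack.extend(children.iter().copied());
            }
        }
        (values, visited.len())
    }
}

/// Builds a chain of queries 0 -> 1 -> ... -> `depth` with nothing accumulated.
fn chain(depth: u32) -> QueryGraph {
    let deps = (0..depth).map(|id| (id, vec![id + 1])).collect();
    QueryGraph { deps, accumulated: HashMap::new() }
}

fn main() {
    let (values, visited) = chain(1000).collect(0);
    println!("collected {} values after visiting {} queries", values.len(), visited);
}
```

In this model the traversal cost is the same whether one query changed or none did, which is the behaviour the comment above objects to.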

MichaReiser force-pushed the micha/accumulator-diagnostics branch from 3dd3ee3 to 24d6e87 Compare November 5, 2024 19:37

MichaReiser force-pushed the micha/accumulator-diagnostics branch 2 times, most recently from d9ec46b to ec9a12b Compare November 5, 2024 19:50

MichaReiser marked this pull request as ready for review November 5, 2024 19:51

MichaReiser requested review from AlexWaygood, carljm and sharkdp as code owners November 5, 2024 19:51

MichaReiser added the ty (Multi-file analysis & type inference) label Nov 5, 2024

MichaReiser force-pushed the micha/accumulator-diagnostics branch from ec9a12b to 522f3d4 Compare November 5, 2024 20:16

AlexWaygood reviewed Nov 5, 2024

carljm approved these changes Nov 5, 2024

MichaReiser force-pushed the micha/accumulator-diagnostics branch from 522f3d4 to 27d2a8e Compare November 6, 2024 07:45

Use accumulator for diagnostics

1cefe50

MichaReiser force-pushed the micha/accumulator-diagnostics branch from 27d2a8e to c9e4d72 Compare November 6, 2024 08:07

Remove mdtest Diagnostic trait, support tests with syntax errors

371e5fc

MichaReiser force-pushed the micha/accumulator-diagnostics branch from c9e4d72 to 371e5fc Compare November 6, 2024 08:09

MichaReiser closed this Nov 6, 2024

This was referenced Nov 6, 2024

micha/structured diagnostics #14129

Closed

Introduce Diagnostic trait #14130

Merged

MichaReiser mentioned this pull request Nov 17, 2024

Faster accumulators salsa-rs/salsa#615

Merged

MichaReiser mentioned this pull request Dec 3, 2024

Use salsa accumulators for diagnostics #14760

Closed
