Use linked compiler during Cargo exec() callback#424
Conversation
| if let Some(new_analysis) = new_analysis { | ||
| analysis.reload_from_analysis(new_analysis, &project_path_clone, &cwd, false).unwrap(); | ||
| } else { | ||
| if new_analysis.is_empty() { |
There was a problem hiding this comment.
After each crate compilation rustc can return an Option with analysis data and we collect unwrapped payloads into a vector, which we then load from file. For a single crate the behaviour is the same, since it's a change None => vec![] and Some(analysis) => vec![analysis]. However with this, now Cargo routine can collect and load the analysis data. There's one what-if: if a compiled indirect dependency will return None for analysis but there will be some collected into a vector, then a missing one won't be loaded from disk, since it'll only try to load the provided analysis in-memory, and possible cause subtle problems resolving the data? @nrc is such a case possible?
There was a problem hiding this comment.
It is rare for there to be no save-analysis info, even if we fail to compile, so I expect this will not be a problem. In any case, it should be resolved once we build successfully, so it should only be temporary.
| _ => {} | ||
| } | ||
| } else { | ||
| cmd.status().expect("Couldn't execute rustc"); |
There was a problem hiding this comment.
I think we must do this before storing the args/envs into the compialtion_cx
There was a problem hiding this comment.
I see that I made it so the lock on compilation_cx is unnecessarily held for the whole duration of compilation, I'll fix that.
Besides that, should we set it after in case cmd.status() or rustc::rustc() panics and we don't store bogus values from faulty compilation? If not, then I think this may not be a problem, since we feed the values directly to rustc::rustc and it doesn't touch it nor does cmd.status() require it.
src/build/cargo.rs
Outdated
|
|
||
| if self.workspace_mode { | ||
| let args = &compilation_cx.args; | ||
| let envs = &compilation_cx.envs; |
There was a problem hiding this comment.
I don't see how these ever get populated since we never do a Cargo run without running rustc directly in workspace mode? How does this work?
There was a problem hiding this comment.
Only cargo routine populates the args and envs, rustc consumes them externally. In workspace mode currently only ever cargo routine is called, which then prepares RLS-specific args/envs from those generated by Cargo itself as usual. Then it feeds them directly to, introduced in this change, rustc::rustc (unlike being passed via Command API and actually executing the command). In practice it doesn't use use compilation_cx.{args,envs} at all, however it doesn't prevent it from being set here
There was a problem hiding this comment.
ok, going to take another look, I think I misunderstood how it all works
| // however order of received messages is non-deterministic and this | ||
| // would require implementing something like `or_expect_contains` | ||
| ExpectedMessage::new(None).expect_contains("publishDiagnostics"), | ||
| ExpectedMessage::new(None).expect_contains("publishDiagnostics"), |
There was a problem hiding this comment.
Ideally, I think we should only send a single publish/end pair, rather than one per crate. But that can be fixed later. You should check for a second diagnosticsEnd though
There was a problem hiding this comment.
Are you sure? There is only one requested build here, so there's only one pair of diagnosticsBegin/diagnosticsEnd and multiple publishDiagnostics in between because there are multiple files (issued here while iterating every file => diagnostics results pair), confirmed with output now
There was a problem hiding this comment.
oh sorry, I was confused - I thought they were multiple diagnosticsBegin, not publishDiagnostics. Then this is doing what I expect.
| } | ||
|
|
||
| // Finally, store the modified cargo-generated args/envs for future rustc calls | ||
| args.insert(0, rustc_exe); |
There was a problem hiding this comment.
@nrc while we're at it, do you know why we need to pass a rustc exe as a first argument to the linked compiler?
There was a problem hiding this comment.
we don't need to pass it, but it is so that the args look like what would happen if you took the args from inside the program (there you get the executable name as the first arg). We could not insert it and strip that arg if we ever still take the args from inside the program (I'm not sure we do).
There was a problem hiding this comment.
Oh, okay, I thought run_compiler accepts args with the first program argument omitted, nevermind then 👍
|
As a note, |
|
OK, I think I understand better what you're doing and it looks good. Just need to fix up the locking on the env and it should be good to go. |
|
Force-pushed a version with two locks (one for Cargo, one for rustc scopes) that fixes the workspace deadlock. Unfortunately right now this still can fail (what's interesting that it only seems to affect Since more work in general and the more complete workspace support will probably mean more work involving env vars, I'll try to move relevant tests into separate processes so we don't have to work around env var as much and unnecessarily. |
|
I'll get working on more foolproof locking for the tests during the weekend and hopefully will get back whenever I manage to do something with it. |
|
Pushed a prototype version of Naive approach of doing two separate locks for each procedure won't work, because our builds has 2 modes: bare I'm currently using It's only a prototype and requires more work, like organizing code better in general, encapsulating the logic a bit more, so we can ensure to the best of our ablities that the locks will be acquired in correct order and it needs proper comments when the design will be set in stone. |
src/build/cargo.rs
Outdated
| let mut restore_env = Environment::push(&env); | ||
| // There can be at most one Cargo instance ran, so don't guard env vars with a lock in case | ||
| // unerlying rustc calls will require mutually exclusive access to this process' env vars | ||
| // FIXME: Don't lock me. This is only required for tests and shouldn't modify program's behaviour |
There was a problem hiding this comment.
To be removed, forgot about this comment
157ad60 to
5bd732f
Compare
|
Pushed a reorganized and documented version of the double mutex. For this, I separated this into @nrc does this version look better? I tried to encode the none -> outer -> inner lock access order via types, but locking order is still not guaranteed. However, I think it's a lot more clear and more ergonomical (?) now, for what it's worth 🙂 |
nrc
left a comment
There was a problem hiding this comment.
I need to come back and do a proper review, but I'm feeling fuzzy. Here are some initial comments
| @@ -0,0 +1,141 @@ | |||
| use std::env; | |||
There was a problem hiding this comment.
lets keep this in the build module, rather than util (and the file needs the copyright header)
There was a problem hiding this comment.
I wanted to separate it into a different module to hide the impl details and this seemed abstract enough to put under util module. Can I put it in the build module but under inline module to achieve the same visibility semantics (so the build/cargo modules can't access private fields)?
src/util/mod.rs
Outdated
| @@ -0,0 +1 @@ | |||
| pub mod environment; | |||
There was a problem hiding this comment.
Although we don't need this with the previous comment, for future reference, you don't need a separate file here, you could use an inline module in the parent.
src/build/cargo.rs
Outdated
| let outer_lock = env_lock.as_outer(); | ||
| let (outer, inner) = outer_lock.get_lock(); | ||
| // Lock early to guarantee synchronized access to read RUSTFLAGS env var | ||
| let env_lock_guard = outer.lock().unwrap(); |
There was a problem hiding this comment.
Could we encapsulate this stuff in the EnvironmentLock class and expose descriptively named API instead of low-level locking primitives?
There was a problem hiding this comment.
@nrc what should the API return? MutexGuard<'a, ()> (newtyped?) instead of &Mutex<()>? I just realized that returning a scoped value in this case does not require providing explicit lifetime to the lock structs, so this will be an improvement.
In general, though, I'd love to limit the lifetime of InnerLock to the one of the guard that's returned alongside of it, but since it's passed to Arc<Executor>, I'm not sure that's possible.
There was a problem hiding this comment.
I imagine you would have your own guard struct that you would return, rather than returning either the mutex itself or a guard.
5bd732f to
6d30069
Compare
|
Pushed updated and rebased version, hope I addressed some points you wanted. |
6d30069 to
7790a96
Compare
|
macOS build got stuck, retrying |
|
I've read through and I think this looks OK. I'd be interested in trying to iterate on the env lock a bit more, but I can't imagine how to do that without playing with the code myself. So, lets land this as is. Could you rebase please? |
This introduces a new `build::environment` module with moved `Environment` RAII guard and new global `EnvironmentLock` and related. The lock has to be a double one, because we might need doubly scoped env vars (outer one for Cargo routine and inner one for underlying rustc calls across different threads, contained within Cargo routine invocation scope). While the lock itself and provided types don't enforce a strict lock ordering, the order of retrieved locks and types is still encoded in the type system. This might not be a complete and fully abstracted away solution to N-valued mutex and guarded environment, but works good enough and seems to cover enough cases in both production and testing scenarios.
7790a96 to
a9f5f25
Compare
|
Rebased now |
|
Thanks for the rebase! |
This is a first step to not be as reliant on Cargo and makes it so RLS uses linked compiler instead of
rustcor$RUSTCprocess.Passing analysis data from the compiler should be faster in few cases since analysis data will be passed in-memory, rather than unconditionally via file on disk. Additionally
cargoroutine can now collect diagnostics messages and display them until subsequent barerustcbuild routine is performed.