Fix dora runtime spawn command for dora installed through pip#940
Fix dora runtime spawn command for dora installed through pip#940
dora runtime spawn command for dora installed through pip#940Conversation
Pip installations of dora use a small Python wrapper script, which then calls a functions of the `dora_cli` module. This means that the `current_exe()` will be a python binary instead of a standalone `dora` binary. This leads to errors when trying to start a `dora runtime` instance for shared library operators (see #900). This commit fixes this issue by introducing a new `DoraCommand` struct that is initialized in the respective main function. For normal Rust binaries, this is just initialized with `current_exe()` as before. When invoked from Python, we also include the first argument, which should be the path to the Python wrapper script invoking dora. We also use the 0th argument instead of `current_exe` for invoking the `python` binary because `current_exe` resolves symlinks on some platforms, which affects import statements (different base directory). Fixes #900
f7c76c1 to
ac23cb4
Compare
| .build() | ||
| .context("tokio runtime failed")?; | ||
| let result = rt.block_on(Daemon::run_dataflow(&dataflow_path, uv.unwrap_or_default()))?; | ||
| let dora_command = DoraCommand::from_python_env()?; |
There was a problem hiding this comment.
FYI, this is suppose to be run within a python script unrelated to the cli.
I'm slightly confused, why do we need dora command in this case. Is this to find the runtime python path?
There was a problem hiding this comment.
We need to be able to invoke the Dora CLI when there are any operators present in the dataflow. Operators are run by a dora runtime process, which we need to spawn.
This spawn command differs based on how Dora was installed. If it was installed as a normal binary, we can use current_exe to get the path to the binary. For Python-based installations, this points to a python executable instead.
Either way, you're totally right that this is the wrong approach. Dora might still be installed as a normal binary even if the run command is invoked from Python. Or even worse, Dora might not be installed at all....
|
FYI, I have fixed this issue for the daemon and the coordinator with: fn start_daemon() -> eyre::Result<()> {
let path = if cfg!(feature = "python") {
std::env::args_os()
.nth(1)
.context("Could not get first argument correspond to dora with python installation")?
} else {
std::env::args_os()
.next()
.context("Could not get dora path")?
};
let mut cmd = Command::new(path);
cmd.arg("daemon");
cmd.arg("--quiet");
cmd.spawn().wrap_err("failed to run `dora daemon`")?;
println!("started dora daemon");
Ok(())
}I think this changes creates a lot of additional argument changes that can be annoying down the line. |
Unfortunately, that's different because this is invoked from the CLI directly. The One simple approach could be to just invoke |
|
I have opened a simpler PR that I have tested on c++ dataflow example: #1011 |
|
Closed in favor of #1011 |
Pip installations of dora use a small Python wrapper script, which then calls a functions of the
dora_climodule. This means that thecurrent_exe()will be a python binary instead of a standalonedorabinary. This leads to errors when trying to start adora runtimeinstance for shared library operators (see #900).This commit fixes this issue by introducing a new
DoraCommandstruct that is initialized in the respective main function. For normal Rust binaries, this is just initialized withcurrent_exe()as before. When invoked from Python, we also include the first argument, which should be the path to the Python wrapper script invoking dora. We also use the 0th argument instead ofcurrent_exefor invoking thepythonbinary becausecurrent_exeresolves symlinks on some platforms, which affects import statements (different base directory).Fixes #900