Redirection: Avoid blocking when err+out> redirects too much stderr message#7240
Redirection: Avoid blocking when err+out> redirects too much stderr message#7240WindSoilder wants to merge 1 commit intonushell:mainfrom
Conversation
|
I only looked at the diff so far. How is this enabling interleaved reading? And how do we ensure that there is no blocking with potentially buffered reading? |
Actually the key lays in the
The key here is still multi-threading in consider the following python script: # test.py
import sys
import time
print('aa'*300, flush=True)
time.sleep(3)
print('bb'*999999, file=sys.stderr, flush=True)
time.sleep(3)
print('cc'*300, flush=True)And run the script with |
|
I tested this against: fn main() {
eprintln!("Hello1, world!");
println!("Hello2, world!");
eprintln!("Hello3, world!");
println!("Hello4, world!");
eprintln!("Hello5, world!");
println!("Hello6, world!");
}then did: The bash equivalent keeps them in the same order: |
|
Ok, I've checked bash, the bash way to implement redirect is using |
err+out> works like bash waybf8a7c8 to
644d07e
Compare
|
I spent some time looking into interleaving stderr and stdout. This blog post was quite helpful, as was section 5.5 in The Linux Programming Interface. How Bash Does ItAs WindSoilder says, Bash uses
How Nu Can Do ItI don't think using Another option would be to use a Rust channel and spin up threads that just pump data from the external's stderr+stdout into the channel as it arrives. Some hacky prototype code that seems to work: let mut process = Command::new("some-external")
.stderr(Stdio::piped())
.stdout(Stdio::piped())
.spawn()?;
let mut child_stderr = process.stderr.take().unwrap();
let mut child_stdout = process.stdout.take().unwrap();
let (tx, rx): (Sender<Vec<u8>>, Receiver<Vec<u8>>) = mpsc::channel();
// pump stderr into the channel
let thread_tx = tx.clone();
thread::spawn(move || {
let mut buf = [0; 10];
loop {
let bytes_read = child_stderr.read(&mut buf).unwrap();
if bytes_read == 0 {
break;
}
let v = Vec::from(&buf[..bytes_read]);
thread_tx.send(v).unwrap();
}
});
// pump stdout into the channel
let thread_tx = tx.clone();
thread::spawn(move || {
let mut buf = [0; 10];
loop {
let bytes_read = child_stdout.read(&mut buf).unwrap();
if bytes_read == 0 {
break;
}
let v = Vec::from(&buf[..bytes_read]);
thread_tx.send(v).unwrap();
}
});
// read from the channel
while let Ok(bytes) = rx.recv() {
print!("{}", String::from_utf8_lossy(&bytes));
} |
|
Running the first Python script like I'm a little surprised that the |
After checking my code a bit, I think this PR is not always work like it does, it uses multi-threading, so there is no guarantee about the order it printed out... It can only guarantee that all message to stdout will consume in order, and all message to stderr will consume in order. But message between stdout and stderr can have different order..... And it even can't guatantee that |
|
Have a look into this again, the external result handling should be done at the beginning of process creation: I don't think it's an easy task, and it requires a lot of change. Close the pr for now, it's a corner case of redirection, I think it's ok to leave current behavior for now |
|
That example is a great find, thanks. |
# Description Fixes: #8565 Here is another pr #7240 tried to address the issue, but it works in a wrong way. After this change `o+e>` won't redirect all stdout message then stderr message and it works more like how bash does. # User-Facing Changes For the given python code: ```python # test.py import sys print('aa'*300, flush=True) print('bb'*999999, file=sys.stderr, flush=True) print('cc'*300, flush=True) ``` Running `python test.py out+err> a.txt` shoudn't hang nushell, and `a.txt` keeps output in the same order ## About the change The core idea is that when doing lite-parsing, introduce a new variant `LiteElement::SameTargetRedirection` if we meet `out+err>` redirection token(which is generated by lex function), During converting from lite block to block, LiteElement::SameTargetRedirection will be converted to PipelineElement::SameTargetRedirection. Then in the block eval process, if we get PipelineElement::SameTargetRedirection, we'll invoke `run-external` with `--redirect-combine` flag, then pipe the result into save command ## What happened internally? Take the following command as example: `^ls o+e> log.txt` lex parsing result(`Tokens`) are not changed, but `LiteBlock` and `Block` is changed after this pr. ### LiteBlock before ```rust LiteBlock { block: [ LitePipeline { commands: [ Command(None, LiteCommand { comments: [], parts: [Span { start: 39041, end: 39044 }] }), // actually the span of first Redirection is wrong too.. Redirection(Span { start: 39058, end: 39062 }, StdoutAndStderr, LiteCommand { comments: [], parts: [Span { start: 39050, end: 39057 }] }), ] }] } ``` ### LiteBlock after ```rust LiteBlock { block: [ LitePipeline { commands: [ SameTargetRedirection { cmd: (None, LiteCommand { comments: [], parts: [Span { start: 147945, end: 147948}]}), redirection: (Span { start: 147949, end: 147957 }, LiteCommand { comments: [], parts: [Span { start: 147958, end: 147965 }]}) } ] } ] } ``` ### Block before ```rust Pipeline { elements: [ Expression(None, Expression { expr: ExternalCall(Expression { expr: String("ls"), span: Span { start: 39042, end: 39044 }, ty: String, custom_completion: None }, [], false), span: Span { start: 39041, end: 39044 }, ty: Any, custom_completion: None }), Redirection(Span { start: 39058, end: 39062 }, StdoutAndStderr, Expression { expr: String("out.txt"), span: Span { start: 39050, end: 39057 }, ty: String, custom_completion: None })] } ``` ### Block after ```rust Pipeline { elements: [ SameTargetRedirection { cmd: (None, Expression { expr: ExternalCall(Expression { expr: String("ls"), span: Span { start: 147946, end: 147948 }, ty: String, custom_completion: None}, [], false), span: Span { start: 147945, end: 147948}, ty: Any, custom_completion: None }), redirection: (Span { start: 147949, end: 147957}, Expression {expr: String("log.txt"), span: Span { start: 147958, end: 147965 },ty: String,custom_completion: None} } ] } ``` # Tests + Formatting Don't forget to add tests that cover your changes. Make sure you've run and fixed any issues with these commands: - `cargo fmt --all -- --check` to check standard code formatting (`cargo fmt --all` applies these changes) - `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A clippy::needless_collect` to check that you're using the standard code style - `cargo test --workspace` to check that all tests pass - `cargo run -- crates/nu-utils/standard_library/tests.nu` to run the tests for the standard library > **Note** > from `nushell` you can also use the `toolkit` as follows > ```bash > use toolkit.nu # or use an `env_change` hook to activate it automatically > toolkit check pr > ``` # After Submitting If your PR had any user-facing changes, update [the documentation](https://github.com/nushell/nushell.github.io) after the PR is merged, if necessary. This will help us keep the docs up to date.
Description
After reading into code, I found that
err+out>will always redirect stdout message first, then stderr message.Then when we have a script which want to redirect too much error message too make nushell hangs.
Here is a python script example:
And running script:
python test.py err+out> 1.txtI think it's better to make them interleaved like #6658 mentioned.And BTW, fix
--rawflag passing into external save command.User-Facing Changes
(List of all changes that impact the user experience here. This helps us keep track of breaking changes.)
Tests + Formatting
Don't forget to add tests that cover your changes.
Make sure you've run and fixed any issues with these commands:
cargo fmt --all -- --checkto check standard code formatting (cargo fmt --allapplies these changes)cargo clippy --workspace -- -D warnings -D clippy::unwrap_used -A clippy::needless_collectto check that you're using the standard code stylecargo test --workspaceto check that all tests passAfter Submitting
If your PR had any user-facing changes, update the documentation after the PR is merged, if necessary. This will help us keep the docs up to date.