Skip to content

Implementing ByteStream interuption on infinite stream#13552

Merged
IanManske merged 13 commits intonushell:mainfrom
simon-curtis:range-on-bytestream
Jan 11, 2025
Merged

Implementing ByteStream interuption on infinite stream#13552
IanManske merged 13 commits intonushell:mainfrom
simon-curtis:range-on-bytestream

Conversation

@simon-curtis
Copy link
Copy Markdown
Contributor

Description

This PR should address #13530 by explicitly handling ByteStreams.

The issue can be replicated easily on linux by running:

open /dev/urandom | into binary | bytes at ..10

Would leave the output hanging and with no way to cancel it, this was likely because it was trying to collect the input stream and would not complete.

I have also put in an error to say that using negative offsets for a bytestream without a length cannot be used.

~/git/nushell> open /dev/urandom | into binary | bytes at (-1)..
Error: nu::shell::incorrect_value

  × Incorrect value.
   ╭─[entry #3:1:35]
 1  open /dev/urandom | into binary | bytes at (-1)..
   ·                                   ────┬─── ───┬──
   ·                                              ╰── encountered here
   ·                                       ╰── Negative range values cannot be used with streams that don't specify a length
   ╰────

User-Facing Changes

No operation changes, only the warning you get back for negative offsets

Tests + Formatting

Ran toolkit check pr with no errors or warnings

Manual testing of the example commands above

Copy link
Copy Markdown
Contributor

@WindSoilder WindSoilder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! The idea of using ByteStream::from_fn and reader.bytes looks good to me!

Just some comments for a corner case and some test cases

Copy link
Copy Markdown
Contributor

@WindSoilder WindSoilder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the update! I left some comments about the position of tests. And CI is failed, you need to run cargo fmt to fix it :)

@simon-curtis simon-curtis force-pushed the range-on-bytestream branch 3 times, most recently from c1c96b4 to 3d7519e Compare October 23, 2024 20:11
@simon-curtis
Copy link
Copy Markdown
Contributor Author

All requested changes have been completed 😄

Copy link
Copy Markdown
Member

@IanManske IanManske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! The take and skip commands already support binary streams, and their code could be reused for this command. I.e., bytes at ..$n is essentially take $n, bytes at $n.. is skip $n, and bytes at $n..$m is skip $n | take ($m - $n). I think it would make more sense if the binary stream handling was extracted out from those commands into a separate file and then reused across all these commands (skip, take, and bytes at).

@simon-curtis simon-curtis force-pushed the range-on-bytestream branch 5 times, most recently from d4abbf3 to b7b29ac Compare October 23, 2024 20:58
@simon-curtis
Copy link
Copy Markdown
Contributor Author

Thanks for this PR! The take and skip commands already support binary streams, and their code could be reused for this command. I.e., bytes at ..$n is essentially take $n, bytes at $n.. is skip $n, and bytes at $n..$m is skip $n | take ($m - $n). I think it would make more sense if the binary stream handling was extracted out from those commands into a separate file and then reused across all these commands (skip, take, and bytes at).

I will take a look at this

@simon-curtis
Copy link
Copy Markdown
Contributor Author

I have just pushed a fix for a performance bug where in the "relative index" branch would collect the entire ByteStream before limiting.

One thing I have noticed is bytes from a File doesn't have a length?

image

@IanManske
Copy link
Copy Markdown
Member

IanManske commented Oct 24, 2024

Sorry, I should I have linked the code I was talking about. The take and skip commands I referred to before are other Nushell commands, not the take and skip defined on iterators in Rust.

https://github.com/nushell/nushell/blob/main/crates/nu-command/src/filters/skip/skip_.rs#L98C29-L114
https://github.com/nushell/nushell/blob/main/crates/nu-command/src/filters/take/take_.rs#L92-L105

You can reuse this code to implement bytes at (for byte streams).

@simon-curtis
Copy link
Copy Markdown
Contributor Author

Dw I got what you meant, I just wanted to tidy up the code first

Copy link
Copy Markdown
Contributor Author

@simon-curtis simon-curtis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changes have been made. I have merged the logic for filters::skip and filters::take into the ByteStream impl along with a ByteStream::slice function - which is now used in bytes::at. I've also added some more tests for regression

Copy link
Copy Markdown
Member

@IanManske IanManske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the tests and refactoring! A few comments:

@simon-curtis
Copy link
Copy Markdown
Contributor Author

Thanks for the feedback btw, I'm learning a lot

@sgvictorino
Copy link
Copy Markdown
Contributor

sgvictorino commented Nov 10, 2024

This also closes #13668

Copy link
Copy Markdown
Member

@IanManske IanManske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes! This is almost good to go, just a few last things.

@simon-curtis
Copy link
Copy Markdown
Contributor Author

I have moved logic into using the core IntRange instead of SubBytes as not to duplicate logic.

I have added a couple of functions on the Range to resolve the absolute range values where a length is known too, returning the original Bound value to allow the caller to apply the index as they wish.

In our case we both apply it by slicing Vec and using the new ByteStream::slice function.

I am still testing and need to format so hold out for a min

@fdncred
Copy link
Copy Markdown
Contributor

fdncred commented Dec 10, 2024

what's the status here?

@simon-curtis
Copy link
Copy Markdown
Contributor Author

I will fast forward and resolve merge conflicts tonight after work. AFAIK everything works and would be ready to merge in

@fdncred fdncred added status:wait-until-after-nushell-release A:streaming Issues related to streaming data (or collecting data when it should be streamed) deprecated:pr-commands (deprecated: too vague) This PR changes our commands in some way labels Dec 19, 2024
@WindSoilder
Copy link
Copy Markdown
Contributor

Hi, @IanManske . Can you take the last review when you get time?

@sholderbach sholderbach changed the title Implementing ByteStream interuption on infinate stream Implementing ByteStream interuption on infinite stream Jan 9, 2025
Copy link
Copy Markdown
Member

@IanManske IanManske left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks good at a glance, we'll land this to start testing in the wild.

@IanManske IanManske merged commit f051628 into nushell:main Jan 11, 2025
@github-actions github-actions bot added this to the v0.102.0 milestone Jan 11, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A:streaming Issues related to streaming data (or collecting data when it should be streamed) deprecated:pr-commands (deprecated: too vague) This PR changes our commands in some way status:wait-until-after-nushell-release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants