Create Record type to reduce code duplication#9648
Create Record type to reduce code duplication#9648IanManske wants to merge 31 commits intonushell:mainfrom
Record type to reduce code duplication#9648Conversation
|
I am a naughty contributor and did not talk with any core team members beforehand. (I will in the future though!) So, it's fine by me if nothing substantial comes from this PR -- I had fun writing it anyways! But yeah, this PR is pretty large, and I've broken it in into commits that can be viewed mostly in isolation.
|
| Pipeline | old | new | % change |
|---|---|---|---|
$data | get size |
174 | 149 | -14% |
$data | get name |
207 | 180 | -13% |
$data | get name size |
438 | 384 | -12% |
$data | reject size |
88 | 91 | +3% |
$data | reject name |
86 | 91 | +6% |
$data | reject name size |
93 | 96 | +3% |
$data | each { |file| { name: $file.name } } |
2216 | 2309 | +4% |
$data | each { |file| { name: $file.name, size: $file.size }} |
2314 | 2281 | -1% |
$data | each { |file| { name: $file.name, size: $file.size, type: $file.type }} |
2494 | 2415 | -3% |
$names | path parse |
162 | 145 | -10% |
scope commands |
5059 | 2981 | -41% |
get seems to be measuring the reduced time for copying memory (this command is clone heavy). Note that cloning Records can be slightly slower than before if the Record has zero or only a few columns.
This can be seen in the first each benchmark, which seems to be measuring the cost of the additional Box allocation. Note the decrease in % change as the number of fields in the record increases, since reading/writing non-record Values is faster.
reject seems to be measuring the cost of the additional pointer indirection (it simply accesses memory and mutates the Value in place).
scope commands I think is measuring the combination of performance increase from smaller Values as well as the benefit of using the record! macro (no Vec reallocations).
The most degenerate case I can think of is for nested records where each record has only one field with another record value (almost like a linked list). But overall, this looks like a net perf win. Also keep in mind that this PR should, at the very least, reduce the memory usage of nushell. Running only let data = (scope commands) shows 53MB of memory used in the old build and 43MB in the new build.
Of course, let me know what need changing and thanks for reading all of this!
|
I'm trying to think of the right words to say here. Wow comes to mind. Thanks for this. It'll take us a while to digest I'm sure, but clearly a lot of effort went into this PR. So, thanks for spending time on nushell. Stay tuned. :) |
|
Thank you very much for pushing forward with this! This takes tremendous dedication to see through. Let me summarize the observations of the @nushell/core-team chat, before we can discuss what steps we take for the detailed review or revisions we might want to do before landing it:
|
- A new one is the removal of unnecessary `#` in raw strings without `"` inside. - https://rust-lang.github.io/rust-clippy/master/index.html#/needless_raw_string_hashes - The automatically applied removal of `.into_iter()` touched several places where nushell#9648 will change to the use of the record API. If necessary I can remove them @IanManske to avoid churn with this PR. - Manually applied `.try_fold` in two places - Removed a dead `if`
|
@sholderbach (and core team), thanks for the response, glad to see the interest and appreciation for this PR! To comment and generally agree with the observations mentioned:
Ok, enough writing XD |
# Description - A new one is the removal of unnecessary `#` in raw strings without `"` inside. - https://rust-lang.github.io/rust-clippy/master/index.html#/needless_raw_string_hashes - The automatically applied removal of `.into_iter()` touched several places where #9648 will change to the use of the record API. If necessary I can remove them @IanManske to avoid churn with this PR. - Manually applied `.try_fold` in two places - Removed a dead `if` - Manual: Combat rightward-drift with early return
|
I've done some additional benchmarks boxing only the closure case, only the record case, and boxing neither. All branches are present on my fork if needed. Data table (click to expand)
My takeaways:
So, besides Regarding memory usage, given
Thus,
The only way Script (click to expand)
def no-boxing-uses-less-mem [num_fields, initial_fields?] {
let initial_fields = ($initial_fields | default $num_fields)
mut nest_lvl = 0
mut no_box = (80 + $initial_fields * 80)
mut box_both = (96 + $initial_fields * 48)
while $no_box >= $box_both and $no_box < 64000000000 {
# print $'($nest_lvl): ($no_box) >= ($box_both)'
$nest_lvl += 1
$no_box = ($no_box * $num_fields + 80)
$box_both = ($box_both * $num_fields + 96)
}
{
num_fields: $num_fields
initial_fields: $initial_fields
nest_lvl: $nest_lvl
no_box: ($no_box | into filesize)
box_both: ($box_both | into filesize)
}
}
0..=10 | each { |num_fields|
0..=10 | each { |initial_fields|
no-boxing-uses-less-mem $num_fields $initial_fields
}}
| flatten
| filter { |x|
($x.num_fields >= 2 and $x.initial_fields >= 1
and $x.no_box <= 64GB
and $x.no_box < $x.box_both)
}The code above gives the empty list. That is, So, I'm still in favor of So, I would ask the core team if I should go forward with |
|
A few thoughts (sorry for the delay here, was fighting a bout of covid). This PR feels like a few different PRs rolled up into a single one. For example, the closure changes should be spun off onto a separate PR. Likewise, you change the existing API in a few places, that should also be a separate PR. This helps with reviewing the code so we're not trying to understand unrelated changes, but instead can review them on their own. I'd also avoid macros. I get that some Rust folks like them, but I think they really hurt readability. They add a kind of "magic" syntax but you have to understand that magic to read it. You'll notice we only rarely use macros in the codebase and tend towards removing them and replacing them with non-macro Rust code. An alternate approach we use quite a bit is to use a builder pattern. Something like: record().
field("foo", Value::String("bar", ...).
field("baz", Value::String("qux", ...).And so on. The pattern is easier to maintain and folks will already be familiar with it. I like the gist of the PR - to simplify the code where we run a bit loose with the cols/vals idea. I think we could definitely land a version of that idea. |
|
Oh no, hope you are feeling better @jntrnr! Yeah, I got a little carried away with mixing different kinds of changes. Should I open a separate PR for each of the following?
(Thankfully, most of the commits are relatively atomic.) I agree that macros can hurt readability, especially ones that are too complex. But not everything's black and white. In this particular case, the only "magic" syntax introduced is I didn't use the builder pattern in this case, because the builder pattern will have to call But I should probably be talking with evidence. I can make another branch using the builder pattern instead and see if there is any noticeable difference. If not, great! Then we can just use that branch. |
|
@IanManske - that list looks like a great list of refactorings. They'd work as separate PRs, or grouped by theme. I'll be curious what you find out with the builder pattern. Some areas of the code can be sensitive to allocation (specifically cloning) but in theory the inliner should hopefully be able to simplify some of the work of the builder. We'll see if I'm right, though. |
|
Ok, I made two more branches where all instances of Like before, release builds were run in the repository root with the default config. A subset of the commands that used Script (click to expand)do {
let data = (ls crates/**) # 195 rows
let name = ($data | get name)
let size = ($data | get size)
let modified = ($data | get modified)
print (timeit { for x in 0..1000 { $name | enumerate }})
print (timeit { for x in 0..1000 { $name | uniq -c }})
print (timeit { for x in 0..1000 { $modified | date to-record }})
print (timeit { for x in 0..1000 { $size | fmt } })
print (timeit { for x in 0..1000 { char --list }})
print (timeit { for x in 0..1000 { scope commands }})
print (timeit { for x in 0..1000 { help commands }})
print (timeit { for x in 0..1000 { keybindings list }})
print (timeit { for x in 0..1000 { keybindings default }})
}The
I was expecting |
|
Hey @IanManske 👋🏻 Thanks again for putting up this PR. We appreciate you trying to make nushell better and I especially enjoyed the attention to detail that this PR took with all the back-and-forth with testing and discussing alternatives. Good job! We'd like to move forward with this PR by landing it with a few changes. Here's what we're thinking.
record! {
"char" => Value::string(format!("{c}"), Span::unknown()),
"code" => Value::string(format!("{:#08x}", u32::from(c)), Span::unknown()),
"modifier" => Value::string(format!("{modifiers:?}"), Span::unknown()),
"flags" => Value::string(format!("{modifiers:#08b}"), Span::unknown()),
"kind" => Value::string(format!("{kind:?}"), Span::unknown()),
"state" => Value::string(format!("{state:?}"), Span::unknown()),
};
Hopefully you're willing to continue working with us on this PR. Please let us know what you think. |
|
Hi @fdncred! Yes, I would love to continue working on this. I just got a little distracted with other PRs 😅. Also, I just wanted to thank everyone for their feedback -- I think the PR is in a much better spot now because of it! But yeah, the changes you mentioned were exactly what I was thinking as well. I'll go ahead and bring this PR up to date. Additionally, I'll fix a few spots to keep edits as straightforward as possible. So, all of this will probably entail a rebase. |
|
Awesome @IanManske! thanks for your work on this and the other PRs. As we don't have any line-by-line review in this PR feel free to rebase or open as a new PR, whatever makes it more convenient to turn out. |
|
Ok, I opted to open a new PR (correct branch name, better squash message, etc.). Closed in favor of #10103. |
# Description This PR creates a new `Record` type to reduce duplicate code and possibly bugs as well. (This is an edited version of #9648.) - `Record` implements `FromIterator` and `IntoIterator` and so can be iterated over or collected into. For example, this helps with conversions to and from (hash)maps. (Also, no more `cols.iter().zip(vals)`!) - `Record` has a `push(col, val)` function to help insure that the number of columns is equal to the number of values. I caught a few potential bugs thanks to this (e.g. in the `ls` command). - Finally, this PR also adds a `record!` macro that helps simplify record creation. It is used like so: ```rust record! { "key1" => some_value, "key2" => Value::string("text", span), "key3" => Value::int(optional_int.unwrap_or(0), span), "key4" => Value::bool(config.setting, span), } ``` Since macros hinder formatting, etc., the right hand side values should be relatively short and sweet like the examples above. Where possible, prefer `record!` or `.collect()` on an iterator instead of multiple `Record::push`s, since the first two automatically set the record capacity and do less work overall. # User-Facing Changes Besides the changes in `nu-protocol` the only other breaking changes are to `nu-table::{ExpandedTable::build_map, JustTable::kv_table}`.
Description
This PR creates a new
Recordtype to reduce duplicate code and possibly bugs as well. While the main purpose of this PR is to make the code cleaner and more robust, this change also makes a new optimization possible. That is, if we box both theValue::RecordandValue::Closurecases, it will decrease the byte size ofValuefrom 80 to 48. This seems to result in net performance gain, but this optimization is easy to change/undo if necessary.Code
Since there is now a separate
Recordtype, lots of duplicated code was able to be removed. E.g., no morecols.iter().zip(vals)! Similarly, an iterator can becollect-ed into aRecordand vice versa, which helps with conversions to and from (hash)maps, etc.Recordhas apush(col, val)function to help to insure that the number of columns is equal to the number of values. I caught one potential bug in thelscommand thanks to this. Finally, I added a bonusrecord!macro that should help simplify record creation. It supports{ ident => expr },{ expr => expr }, and{ ident }forms:Since
record!{}calls thevec![]macro which knows the length of theVecin advance, then theVecs in theRecordwill be created without reallocating.Performance
Nushell command code tends to be somewhat liberal with its usage of
clone(). SinceValueis now only 60% of its original size, sections of code whose running time is mostlyclones can, in the best case, take only 60% of the time they did before. For example, take this slightly unrealistic benchmark based off #5236:This now takes 68ms on my hardware, whereas it took 100ms before. Of course, the downside is that each
Recordincurs an extra memory allocation and each access to aRecordnow has an addition level of indirection. But this extra time and memory cost seems to be negliable compared to the other gains.User-Facing Changes
No big user-facing changes besides maybe a little better performance.
Valuetype and other public functions which is a breaking change for any dependents.