Create `Record` type to reduce code duplication by IanManske · Pull Request #9648 · nushell/nushell

IanManske · 2023-07-10T19:37:07Z

Description

This PR creates a new Record type to reduce duplicate code and possibly bugs as well. While the main purpose of this PR is to make the code cleaner and more robust, this change also makes a new optimization possible. That is, if we box both the Value::Record and Value::Closure cases, it will decrease the byte size of Value from 80 to 48. This seems to result in net performance gain, but this optimization is easy to change/undo if necessary.

Code

Since there is now a separate Record type, lots of duplicated code was able to be removed. E.g., no more cols.iter().zip(vals)! Similarly, an iterator can be collect-ed into a Record and vice versa, which helps with conversions to and from (hash)maps, etc.

Record has a push(col, val) function to help to insure that the number of columns is equal to the number of values. I caught one potential bug in the ls command thanks to this. Finally, I added a bonus record! macro that should help simplify record creation. It supports { ident => expr }, { expr => expr } , and { ident } forms:

record! {
  some_int => Value::int(1, span),
  some_str => Value::string("something", span),
}

record! {
  "some col" => Value::int(1, span),
  "other-col" => Value::string("other", span,
}

let field_name1 = Value::string("1", span);
let field_name2 = Value::string("2", span);
record! { field_name1, field_name2 }

Since record!{} calls the vec![] macro which knows the length of the Vec in advance, then the Vecs in the Record will be created without reallocating.

Performance

Nushell command code tends to be somewhat liberal with its usage of clone(). Since Value is now only 60% of its original size, sections of code whose running time is mostly clones can, in the best case, take only 60% of the time they did before. For example, take this slightly unrealistic benchmark based off #5236:

timeit { 0..100000 | each {|i| { foo: $i }} }

This now takes 68ms on my hardware, whereas it took 100ms before. Of course, the downside is that each Record incurs an extra memory allocation and each access to a Record now has an addition level of indirection. But this extra time and memory cost seems to be negliable compared to the other gains.

User-Facing Changes

No big user-facing changes besides maybe a little better performance.

⚠️ Of course, this PR changes the Value type and other public functions which is a breaking change for any dependents.

IanManske · 2023-07-10T19:37:23Z

I am a naughty contributor and did not talk with any core team members beforehand. (I will in the future though!) So, it's fine by me if nothing substantial comes from this PR -- I had fun writing it anyways!

But yeah, this PR is pretty large, and I've broken it in into commits that can be viewed mostly in isolation.
The most relevant commit is e103a0b, which is where the Record type was added.
Most changes are simple conversions following a few rules:

Rules

(click to expand)

cols.iter().zip(vals) => record or record.iter() or val or val.iter().
Similar transformation for into_iter().

from

Value::Record { cols, vals, .. } => { f1(cols); f2(vals) }

to

Value::Record { val, .. } => { f1(val.cols); f2(val.vals); }

from

let mut cols = vec![];
let mut vals = vec![];

cols.push("key");
vals.push(value);
//...

Value::Record { cols, vals, span }

to

Value::record(
    record! {
        key => value,
        //...
    },
    span,
)

from

let mut cols = vec![];
let mut vals = vec![];

if let Some(val) = value {
    cols.push("key");
    vals.push(val);
}

//...

Value::Record { cols, vals, span }

to

let mut record = Record::new();

if let Some(val) = value {
    record.push("key", val);
}

//...

Value::record(record, span)

from

let mut cols = vec![];
let mut vals = vec![];

for (key, val) in some_iter {
  cols.push(key);
  vals.push(val);
}

Value::Record { cols, vals, span }

to

Value::record_from_iter(some_iter, span) // or some_iter.map().collect(), etc., if necessary

&Span => Span since Span is Copy

I also understand performance isn't super high on the agenda right now (e.g., #9464), so I can unbox the Record and Closure cases if necessary (and revisit at a later date). Regardless, here are the results for some preliminary benchmarks.
The old (v0.82.0) and new (this PR) release builds of nushell were run with the default config and with the CWD at the repository root. Then, test data was loaded:

let data = (ls crates/**) # 195 rows
let names = ($data | get name)

Next, each benchmark pipeline was run like so:

timeit { for x in 0..1000 { $pipeline } }

Results (old and new times are in ms):

Pipeline	old	new	% change
`$data \| get size`	174	149	-14%
`$data \| get name`	207	180	-13%
`$data \| get name size`	438	384	-12%
`$data \| reject size`	88	91	+3%
`$data \| reject name`	86	91	+6%
`$data \| reject name size`	93	96	+3%
`$data \| each { \|file\| { name: $file.name } }`	2216	2309	+4%
`$data \| each { \|file\| { name: $file.name, size: $file.size }}`	2314	2281	-1%
`$data \| each { \|file\| { name: $file.name, size: $file.size, type: $file.type }}`	2494	2415	-3%
`$names \| path parse`	162	145	-10%
`scope commands`	5059	2981	-41%

get seems to be measuring the reduced time for copying memory (this command is clone heavy). Note that cloning Records can be slightly slower than before if the Record has zero or only a few columns.

This can be seen in the first each benchmark, which seems to be measuring the cost of the additional Box allocation. Note the decrease in % change as the number of fields in the record increases, since reading/writing non-record Values is faster.

reject seems to be measuring the cost of the additional pointer indirection (it simply accesses memory and mutates the Value in place).

scope commands I think is measuring the combination of performance increase from smaller Values as well as the benefit of using the record! macro (no Vec reallocations).

The most degenerate case I can think of is for nested records where each record has only one field with another record value (almost like a linked list). But overall, this looks like a net perf win. Also keep in mind that this PR should, at the very least, reduce the memory usage of nushell. Running only let data = (scope commands) shows 53MB of memory used in the old build and 43MB in the new build.

Of course, let me know what need changing and thanks for reading all of this!

fdncred · 2023-07-10T19:52:08Z

I'm trying to think of the right words to say here. Wow comes to mind. Thanks for this. It'll take us a while to digest I'm sure, but clearly a lot of effort went into this PR. So, thanks for spending time on nushell. Stay tuned. :)

sholderbach · 2023-07-11T08:28:38Z

Thank you very much for pushing forward with this! This takes tremendous dedication to see through.
I absolutely love that you encapsulate most of the behavior in the Record type. This will certainly make future improvements or refactorings a more reasonable undertaking.

Let me summarize the observations of the @nushell/core-team chat, before we can discuss what steps we take for the detailed review or revisions we might want to do before landing it:

Having the sensible interfaces like .push() or impl IntoIterator and impl FromIterator is a godsend for future readability.
Bundling in the Boxing of Closure might make the overall review a bit harder and obscures individual performance oriented choices for Record. But the general intuition is that this is probably a good move on its own to improve the size of Value.
The one thing where we all had some push back is around the macro record![]:
- General sentiment: more magic makes the code harder to read, rustfmt, refactor etc.
- Allowing both $($key:literal => $val:expr),* and $($key:ident => $val:expr),* is a crime ;) The former suggests it would be an expression but the latter form stringify!es when you try to pass a variable as a key. Saving two " for convenience is certainly not worth it. Either we only allow the literal form or have it as $($key:expr => $val:expr),*. This doesn't seem to be that needed based on your existing code and might allow some extra shenanigans, but probably matches the expectations from "sane" Rust.
- The $($varname:ident),* form is probably also a bit controversial based on personal taste. I can see the beauty of how compact you got the code in some places with that. Others might see it as a bit too magical leaving a very expression oriented design for "scope-magic". Maybe a compromise could be to have a separate macro with this written on the tin.
Lastly on performance:
- Improving the performance by removing explicit bottlenecks and studying existing tradeoffs is absolutely on the agenda. The pushback is mostly on "magic wand" solutions while we haven't profiled out all the worst offenders or have comprehensive benchmarks to observe the tradeoffs.
- To this end, we absolutely love your effort to have a broader list of benchmarks. We should really commit those to a benchmark suite. (The ones in the scripts repo are only covering a subset with a deep vertical but not much diversity. Having more data processing focussed ones would be great. As an alternative place we might consider the criterion benches/ for which @amtoine has tried to setup up some long term tracking)
- I am personally not yet fully convinced that the Record should be boxed. They are absolutely core to the current data model with tables being a list of records. So the "occupancy rate" of a Value being filed with a large Value::Record is probably relatively high. So the indirection won't save that much on total memory (if Value would be smaller than the separate Record allocation). And going with two indirections for a hot datatype will certainly introduce a tradeoff. I think this was more clear cut for the Value::Error where having to go through the indirection is much more rare in typical use (Box ShellError in Value::Error #8375). One alternative way to shrink Value::Record could be to change from two Vecs to a single Vec<(String, Value)>. But that is just me guessing and we should resolve that through benchmarks. Having things encapsulated in Record should make trying out things significantly easier after this PR.
- P.S.: Nice to see you polishing in some places by using the impl Copy for Span. Given available registers on the architecture or pressure from the signature, this should be a nice improvement.

@IanManske

- A new one is the removal of unnecessary `#` in raw strings without `"` inside. - https://rust-lang.github.io/rust-clippy/master/index.html#/needless_raw_string_hashes - The automatically applied removal of `.into_iter()` touched several places where nushell#9648 will change to the use of the record API. If necessary I can remove them @IanManske to avoid churn with this PR. - Manually applied `.try_fold` in two places - Removed a dead `if`

IanManske · 2023-07-11T20:19:18Z

@sholderbach (and core team), thanks for the response, glad to see the interest and appreciation for this PR!

To comment and generally agree with the observations mentioned:

I originally had only the { ident => expr } form, but then added { expr => expr } without thinking in order to handle keys that were not valid identifiers.
- But I agree that this is probably not a good idea! { literal => expr } or { expr => expr } is indeed the way to go. indexmap!{} uses { expr => expr }, but I think either form should suit our needs. I'll remove { ident => expr } and keep { expr => expr } for now, but we can easily change this to { literal => expr } later.
- This form of the macro does hinder rustfmt if the expression is inline. E.g.,
```
record! {
  "key" => { some really large expression },
}
```
  I intended this to be used for very simple expressions (i.e. { "key" => Value::case(value, span), }). That is, value is either declared somewhere above or it is also a very simple expression like x.field. Although, I might not have followed this rule in some places, which may say something about including this macro.
- Then again, this is the one point I will disagree with, I think the macro is much more readable. The macro allows all the record column and values to be easily been seen in a central location. I.e., it's much easier to see the shape of the record from a quick glance. Also, I think the declarative { key => value } syntax makes it much easier to see each key and its corresponding value. Compare this to two separate Vecs in which the values are raw expressions and/or the values are far apart from their corresponding key (keybindings_listen.rs):
```
// imagine if there were more columns...
Value::Record {
    cols: vec![
        "char".into(),
        "code".into(),
        "modifier".into(),
        "flags".into(),
        "kind".into(),
        "state".into(),
    ],
    vals: vec![
        Value::string(format!("{c}"), Span::unknown()),
        Value::string(format!("{:#08x}", u32::from(c)), Span::unknown()),
        Value::string(format!("{modifiers:?}"), Span::unknown()),
        Value::string(format!("{modifiers:#08b}"), Span::unknown()),
        Value::string(format!("{kind:?}"), Span::unknown()),
        Value::string(format!("{state:?}"), Span::unknown()),
    ],
    span: Span::unknown(),
};
```
  Compare this to (click to expand)
```
record! {
  "char" => Value::string(format!("{c}"), Span::unknown()),
  "code" => Value::string(format!("{:#08x}", u32::from(c)), Span::unknown()),
  "modifier" => Value::string(format!("{modifiers:?}"), Span::unknown()),
  "flags" => Value::string(format!("{modifiers:#08b}"), Span::unknown()),
  "kind" => Value::string(format!("{kind:?}"), Span::unknown()),
  "state" => Value::string(format!("{state:?}"), Span::unknown()),
};
```
- Additionally, the macro should? make it harder to make mistakes when changing record construction. E.g., inserting a new column currently requires two edits which better be in the same index of each Vec! For the macro, simply inserting a new line will suffice. Alternatively, I guess a sequence of record.push() will also work, but that could reallocate the Vecs unless Record::with_capacity is used. However, the number of columns can change over time and the capacity would need to be updated. So, similar to how we don't use a bunch of vec.push()s, but rather the vec![] macro, I think we should be using the record!{} macro where possible instead of a bunch of record.push()s.
- It should be noted that we were already using the indexmap!{} macro in certain places anyways and this isn't much different.
Regarding the { ident } form, I'm sort of indifferent to this (I just added it in case). But I realized this form helps if ident is already a Value and does not need to be wrapped like Value::case(ident, span). This way, redundant record!{ "key" => key } can be avoided. Of course, this simply saves some typing, so there's definitely a good argument for removing this form all together. I'd have to double check and see how applicable or commonly used it is/can be.
For performance, boxing only Closure and not Record decreases the size to 72 bytes, which is probably not worth it. So, I think the two should be boxed together or not at all. But you are right that this obscures the performance from only boxing Record. I assumed some things were blocks when they actually might be closures, which may be messing with the benchmarks above. Thankfully, the Box changes are easy to undo, so I'll experiment more by boxing none and then each in isolation and report back. I'll also check out the other benchmarks you mentioned. If nothing conclusive can be shown, I guess we should just unbox both? The change easy to undo and redo (I think I got rid of all raw constructions of Value::Record {} and Value::Closure {}), so we can revisit in the future when we have more benchmarks.
Aside:

Another alternative I considered was to use thin-vec which stores the length and capacity in the Vec's data buffer, so the size of ThinVec is just a pointer. But this would add another dependency and might be getting into the weeds too much.

Ok, enough writing XD

@IanManske

# Description - A new one is the removal of unnecessary `#` in raw strings without `"` inside. - https://rust-lang.github.io/rust-clippy/master/index.html#/needless_raw_string_hashes - The automatically applied removal of `.into_iter()` touched several places where #9648 will change to the use of the record API. If necessary I can remove them @IanManske to avoid churn with this PR. - Manually applied `.try_fold` in two places - Removed a dead `if` - Manual: Combat rightward-drift with early return

IanManske · 2023-07-13T16:01:48Z

I've done some additional benchmarks boxing only the closure case, only the record case, and boxing neither. All branches are present on my fork if needed.

Data table (click to expand)

Pipeline	v0.82	`no boxing`	`box closure`	`box record`	`box both`
`$data \| get size`	162	139	141	159	138
`$data \| get name`	204	184	183	191	181
`$data \| get name size`	419	380	390	400	380
`$data \| reject name`	81	77	86	97	81
`$data \| reject size`	90	87	87	85	92
`$data \| reject name size`	94	90	81	90	88
`$data \| each { \|file\| { name: $file.name } }`	2007	2070	2060	2110	2021
`$data \| each { \|file\| { name: $file.name, size: $file.size }}`	2100	2148	2151	2170	2105
`$data \| each { \|file\| { name: $file.name, size: $file.size, type: $file.type }}`	2212	2249	2235	2317	2175
`$names \| path parse`	163	156	155	159	127
`scope commands`	4458	2811	2791	3065	2743
`'turtle' \| wrap a \| wrap b \| wrap c \| wrap d \| wrap e \| get e.d.c.b.a` 10k iter	36	34	34	36	36
memory usage after `let data = (scope commands)`	53MB	45MB	44MB	45MB	43MB

My takeaways:

box closure: little to no difference in time compared to no boxing for these benchmarks.
box record: has slightly worse times compared to no boxing as a result of the extra record box/indirection.
box both: seems to be on par with no boxing even though box record was worse. It seems like gains from smaller Values are offsetting the cost of the record box?

So, besides v0.82 and box record, I think any of the other three should be fine? The times seem too close to call.

Regarding memory usage, given r record Values and v non-record Values, memory usage should be as follows:

memory(no boxing) = size(Value) * r + size(Value) * v = 80r + 80v
memory(box both) = (size(Value) + size(Record)) * r + size(Value) * v = (48 + 48)r + 48v = 96r + 48v

Thus, memory(no boxing) > memory(box both) if 2v > r. In other words:

no boxing uses more memory if the number of non-record values is more than half the number of records.
box both uses more memory if the number of records is more the twice the number of non-record values.

The only way 2. is possible is if most records have zero or one columns. To verify this, I made the following nushell script:

Script (click to expand)

num_fields is the branching factor that specifies the number of record children each outer record has.
initial_fields specifies the number of non-record values in the leaf records.

def no-boxing-uses-less-mem [num_fields, initial_fields?] {
  let initial_fields = ($initial_fields | default $num_fields)
  mut nest_lvl = 0
  mut no_box = (80 + $initial_fields * 80)
  mut box_both = (96 + $initial_fields * 48)

  while $no_box >= $box_both and $no_box < 64000000000 {
    # print $'($nest_lvl): ($no_box) >= ($box_both)'
    $nest_lvl += 1
    $no_box = ($no_box * $num_fields + 80)
    $box_both = ($box_both * $num_fields + 96)
  }

  {
    num_fields: $num_fields
    initial_fields: $initial_fields
    nest_lvl: $nest_lvl
    no_box: ($no_box | into filesize)
    box_both: ($box_both | into filesize)
  }
}

0..=10 | each { |num_fields|
0..=10 | each { |initial_fields|
  no-boxing-uses-less-mem $num_fields $initial_fields
}}
| flatten
| filter { |x|
  ($x.num_fields >= 2 and $x.initial_fields >= 1
  and $x.no_box <= 64GB
  and $x.no_box < $x.box_both)
}

The code above gives the empty list. That is, box both will use less memory if each record has at least two child records or at least one non-record value (assuming maximum of 64GB memory). You can experiment on your own and change some of the parameters and/or filters. For example, changing part of the filter to $x.num_fields >= 1 gives 10 entries, but each entry needs a level of record nesting equal to 2 * initial_fields before no boxing uses less memory.

So, I'm still in favor of box both on grounds of less memory usage. Then again, we may choose to prioritize running time, in which case no boxing is more understood and the effects of box both are not as clear. E.g., all the benchmarks were run on my hardware, which is obviously not a substitute for the range of hardware running nushell. (The decreased memory usage should be present on all platforms/hardware though?)

So, I would ask the core team if I should go forward with no boxing, box closure, or box both? Thanks!
(The macro decision can come later or at the same time, whatever works.)

sophiajt · 2023-07-14T16:19:03Z

A few thoughts (sorry for the delay here, was fighting a bout of covid).

This PR feels like a few different PRs rolled up into a single one. For example, the closure changes should be spun off onto a separate PR. Likewise, you change the existing API in a few places, that should also be a separate PR. This helps with reviewing the code so we're not trying to understand unrelated changes, but instead can review them on their own.

I'd also avoid macros. I get that some Rust folks like them, but I think they really hurt readability. They add a kind of "magic" syntax but you have to understand that magic to read it. You'll notice we only rarely use macros in the codebase and tend towards removing them and replacing them with non-macro Rust code.

An alternate approach we use quite a bit is to use a builder pattern. Something like:

record().
  field("foo", Value::String("bar", ...).
  field("baz", Value::String("qux", ...).

And so on. The pattern is easier to maintain and folks will already be familiar with it.

I like the gist of the PR - to simplify the code where we run a bit loose with the cols/vals idea. I think we could definitely land a version of that idea.

IanManske · 2023-07-14T20:47:15Z

Oh no, hope you are feeling better @jntrnr!

Yeah, I got a little carried away with mixing different kinds of changes. Should I open a separate PR for each of the following?

Reuse the existing Closure type (probably no boxing for now?)
Adding the functions for creating each Value case
Adding the functions for creating test data for each Value case
Adding the as functions for each Value case
(maybe the above three can be a single PR?)
&Span => Span
Making columns() return a slice intsead of Vec
impl Display for CellPath
Did I miss anything else?

(Thankfully, most of the commits are relatively atomic.)

I agree that macros can hurt readability, especially ones that are too complex. But not everything's black and white. In this particular case, the only "magic" syntax introduced is key => value. I would argue not that it's not that foreign or hard to understand. (Maybe I could change it to (key, value)?) By this logic we should not be using format!("{value}") either, since this also adds "magic" syntax.

I didn't use the builder pattern in this case, because the builder pattern will have to call vec.push() under the hood. This will do unnecessary capacity checking, etc., but more importantly it will reallocate the Vecs -- potentially multiple times! Again, specifying the Record capacity beforehand will help with this, but a hardcoded capacity would be brittle. record!, on the other hand, will set the capacity automatically and will always be accurate for future changes.

But I should probably be talking with evidence. I can make another branch using the builder pattern instead and see if there is any noticeable difference. If not, great! Then we can just use that branch.

sophiajt · 2023-07-15T03:11:47Z

@IanManske - that list looks like a great list of refactorings. They'd work as separate PRs, or grouped by theme.

I'll be curious what you find out with the builder pattern. Some areas of the code can be sensitive to allocation (specifically cloning) but in theory the inliner should hopefully be able to simplify some of the work of the builder. We'll see if I'm right, though.

IanManske · 2023-07-21T21:30:16Z

Ok, I made two more branches where all instances of record! were replaced with the builder pattern. One branch, record-builder, uses Record::new. Another branch, record-builder-capacity, uses Record::with_capacity instead. record-builder-capacity is one commit ahead of record-builder which is one commit ahead of no-box.

Like before, release builds were run in the repository root with the default config. A subset of the commands that used record! were benchmarked using the following script/command:

Script (click to expand)

do {
  let data = (ls crates/**) # 195 rows
  let name = ($data | get name)
  let size = ($data | get size)
  let modified = ($data | get modified)
  print (timeit { for x in 0..1000 { $name | enumerate }})
  print (timeit { for x in 0..1000 { $name | uniq -c }})
  print (timeit { for x in 0..1000 { $modified | date to-record }})
  print (timeit { for x in 0..1000 { $size | fmt } })
  print (timeit { for x in 0..1000 { char --list }})
  print (timeit { for x in 0..1000 { scope commands }})
  print (timeit { for x in 0..1000 { help commands }})
  print (timeit { for x in 0..1000 { keybindings list }})
  print (timeit { for x in 0..1000 { keybindings default }})
}

The no-box times were used as the baseline, and the percent change from this baseline is given for each of the record builder branches. All times are in milliseconds:

Command	`no-box`	`record-builder`	% change	`record-builder-capacity`	% change
`enumerate`	33	45	+36%	43	+30%
`uniq -c`	132	147	+11%	145	+10%
`date to-record`	83	146	+76%	135	+63%
`fmt`	129	189	+47%	174	+35%
`char --list`	23	35	+52%	35	+52%
`scope commands`	3110	3508	+13%	3392	+9%
`help commands`	956	1069	+12%	1030	+8%
`keybindings list`	52	65	+25%	62	+19%
`keybindings default`	85	96	+13%	95	+12%

I was expecting record-builder to be slower, but I'm surprised record-builder-capacity is slower as well. I guess the reallocation cost isn't as big as I thought, but rather the overhead of Vec::push is adding up?

fdncred · 2023-08-14T21:01:04Z

Hey @IanManske 👋🏻

Thanks again for putting up this PR. We appreciate you trying to make nushell better and I especially enjoyed the attention to detail that this PR took with all the back-and-forth with testing and discussing alternatives. Good job!

We'd like to move forward with this PR by landing it with a few changes. Here's what we're thinking.

We like the new Record type and would like to keep that.
We like the new record!() macro and want to keep the flavor that looks like this

record! {
  "char" => Value::string(format!("{c}"), Span::unknown()),
  "code" => Value::string(format!("{:#08x}", u32::from(c)), Span::unknown()),
  "modifier" => Value::string(format!("{modifiers:?}"), Span::unknown()),
  "flags" => Value::string(format!("{modifiers:#08b}"), Span::unknown()),
  "kind" => Value::string(format!("{kind:?}"), Span::unknown()),
  "state" => Value::string(format!("{state:?}"), Span::unknown()),
};

We'd like to leave the Boxing for another PR. Admittedly, there was a lot of stuff in this PR originally, LOL.

Hopefully you're willing to continue working with us on this PR. Please let us know what you think.

IanManske · 2023-08-15T18:38:43Z

Hi @fdncred!

Yes, I would love to continue working on this. I just got a little distracted with other PRs 😅. Also, I just wanted to thank everyone for their feedback -- I think the PR is in a much better spot now because of it!

But yeah, the changes you mentioned were exactly what I was thinking as well. I'll go ahead and bring this PR up to date. Additionally, I'll fix a few spots to keep edits as straightforward as possible. So, all of this will probably entail a rebase.

sholderbach · 2023-08-15T20:40:28Z

Awesome @IanManske! thanks for your work on this and the other PRs.

As we don't have any line-by-line review in this PR feel free to rebase or open as a new PR, whatever makes it more convenient to turn out.

IanManske · 2023-08-23T16:08:02Z

Ok, I opted to open a new PR (correct branch name, better squash message, etc.). Closed in favor of #10103.

# Description This PR creates a new `Record` type to reduce duplicate code and possibly bugs as well. (This is an edited version of #9648.) - `Record` implements `FromIterator` and `IntoIterator` and so can be iterated over or collected into. For example, this helps with conversions to and from (hash)maps. (Also, no more `cols.iter().zip(vals)`!) - `Record` has a `push(col, val)` function to help insure that the number of columns is equal to the number of values. I caught a few potential bugs thanks to this (e.g. in the `ls` command). - Finally, this PR also adds a `record!` macro that helps simplify record creation. It is used like so: ```rust record! { "key1" => some_value, "key2" => Value::string("text", span), "key3" => Value::int(optional_int.unwrap_or(0), span), "key4" => Value::bool(config.setting, span), } ``` Since macros hinder formatting, etc., the right hand side values should be relatively short and sweet like the examples above. Where possible, prefer `record!` or `.collect()` on an iterator instead of multiple `Record::push`s, since the first two automatically set the record capacity and do less work overall. # User-Facing Changes Besides the changes in `nu-protocol` the only other breaking changes are to `nu-table::{ExpandedTable::build_map, JustTable::kv_table}`.

IanManske added 29 commits July 10, 2023 14:15

Add functions to create each Value case

844c473

Add functions to create test values

7a8b318

Add "as" functions to convert Values

1612829

Box closure case in Value

f1c7fd8

Create Record type and box it in Value

12755e1

Apply Value changes for value mod.rs

1ef659e

Apply Value changes for nu-protocol

e3ffbdf

Make columns return slice instead of cloned Vec

9579829

Apply Value changes for nu-cli

75700fa

Apply Value changes for benchmarks

e05e704

Apply Value changes for nu-plugin-example

58769bf

Apply Value changes for nu-plugin-formats

1f934dd

Apply Value changes for nu-plugin-gstat

762a4be

Apply Value changes for nu-plugin-query

6fc0aef

Apply Value changes for nu-table

6d62983

Apply Value changes for nu-parser

1cfa376

Apply Value changes for nu-explore

9e5be0e

Apply Value changes for nu-engine

9c04a68

Apply Value changes for nu-color-config

1d03b39

Apply Value changes for nu-cmd-lang

55a686b

Apply Value changes for nu-cmd-extra

6137531

Apply Value changes for nu-cmd-dataframe

1822a31

Apply Value changes for math commands

a2922f3

Apply Value changes for string commands

d460241

Apply Value changes for from and to commands

3a76405

Apply Value changes for filter commands

809b9c1

Apply Value changes for remainder of nu-command

f5a91b6

Apply Value changes to missed spots

34d1f80

Format imports

5b9bb33

sholderbach mentioned this pull request Jul 11, 2023

Apply nightly clippy lints #9654

Merged

IanManske force-pushed the small-value branch from 63b5e84 to 78f33c7 Compare July 12, 2023 00:05

Remove { ident => expr } form of record! macro

f33aae7

IanManske force-pushed the small-value branch from 78f33c7 to f33aae7 Compare July 12, 2023 00:06

Remove { ident => expr } form of record! macro

005fce5

hyiltiz mentioned this pull request Jul 19, 2023

add missing $. in front of *cell paths* when displaying #9718

Closed

sholderbach mentioned this pull request Jul 22, 2023

Replace &Span with Span since Span is Copy #9770

Merged

IanManske mentioned this pull request Aug 23, 2023

Create Record type #10103

Merged

IanManske closed this Aug 23, 2023

dead10ck mentioned this pull request Sep 29, 2023

expand record! macro #10547

Closed

IanManske deleted the small-value branch October 30, 2023 23:43

Conversation

IanManske commented Jul 10, 2023

Description

Code

Performance

User-Facing Changes

Uh oh!

IanManske commented Jul 10, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rules

Uh oh!

fdncred commented Jul 10, 2023

Uh oh!

sholderbach commented Jul 11, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

IanManske commented Jul 11, 2023

Uh oh!

IanManske commented Jul 13, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sophiajt commented Jul 14, 2023

Uh oh!

IanManske commented Jul 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sophiajt commented Jul 15, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

IanManske commented Jul 21, 2023

Uh oh!

fdncred commented Aug 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

IanManske commented Aug 15, 2023

Uh oh!

sholderbach commented Aug 15, 2023

Uh oh!

IanManske commented Aug 23, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

IanManske commented Jul 10, 2023 •

edited

Loading

sholderbach commented Jul 11, 2023 •

edited

Loading

IanManske commented Jul 13, 2023 •

edited

Loading

IanManske commented Jul 14, 2023 •

edited

Loading

sophiajt commented Jul 15, 2023 •

edited

Loading

fdncred commented Aug 14, 2023 •

edited

Loading