Add common map-like API to nu_protocol::Record#10841
Add common map-like API to nu_protocol::Record#10841sholderbach merged 13 commits intonushell:mainfrom
nu_protocol::Record#10841Conversation
Building up the `Record` from its `Vec`s directly is dangerous if the parts don't have equal length. `Record::from_raw_cols_vals` will assert equal length. Using it may not be the best choice in the future if we choose to change the underlying representation. For now skip validating properties about key uniqueness
We have a few places where we read the `.cols` field. Let's turn them into a `Columns` iterator. End goal, make the internal representation of `Record` opaque.
Avoid exposing `record.vals` directly.
Compared to `Record.values()` gives you owned values. Primarily to support the `values` command.
The existing `Record.push()` doesn't take into account if a key already exists. The second insertion could become inaccessible and the data meaningless. For most purposes `.insert()` is the better choice. It returns `Some(Value)` with the old value if found. As we limit ourselves to `String` keys, we can use a `Cow<str>` to only clone the key if we need to.
These are the common accessors for maps. Note: as we can't guarantee yet that `cols.len() == vals.len()` the lookup in `vals` still needs a bounds check.
To enable some existing command implementations that perform indexing themselves expose a way to get values with an index. Short-circuits if (partially) out of bounds.
Warning: this `Extend` implementation is equivalent to `.push` and doesn't deduplicate incoming keys. In the future it should probably use `.insert` internally. If we want to provide an escape hatch for performance on the contiguous `Vec` backing-store we may want use a different API.
To be able to make the fields private we need to use the public checking `Record::from_raw_cols_vals()` constructor. This change will introduce a single length assert that can probably be elided during optimization.
| } | ||
|
|
||
| pub fn contains(&self, col: impl AsRef<str>) -> bool { | ||
| self.cols.iter().any(|k| k == col.as_ref()) |
There was a problem hiding this comment.
What about making a variable let col_ref = col.as_ref(), instead of calling col.as_ref() inside closure everytime?
let col_ref = col.as_ref();
self.cols.iter().any(|k| k == col_ref)It can be a performance improvement if we call the function many times.
There was a problem hiding this comment.
I assumed that .as_ref() is pretty transparent to the optimizer but we could lift that out. Then there might be some optimizer luck necessary so it can short circuit that if cols is empty but we shouldn't worry too much without profiling.
I think there is an aspect that makes me dislike AsRef slightly, so I might look into Borrow as a replacement.
|
Looks good! Glad to see more work is being on One suggestion that I have is regarding |
Should lead to more optimized code h/t @IanManske
Some usages in the public API envisioned in conjunction with `.get_index` Also should simplify some internal implementations
Note: It appears that `impl AsRef<str>` has the downside of taking a directly directly by value if not prefixed with `&`. We may want to look into `Borrow` for those.
Good call, done! Yeah that should monormophize to simpler code. (and maybe make it also more likely to inline) |
|
didn't go through the whole PR, but i support the overall idea of treating |
# Description While we have now a few ways to add items or iterate over the collection, we don't have a way to cleanly remove items from `Record`. This PR fixes that: - Add `Record.remove()` to remove by key - makes the assumption that keys are unique, so can not be used universally, yet (see #10875 for an important example) - Add naive `Record.retain()` for inplace removal - This follows the two separate `retain`/`retain_mut` in the Rust std library types, compared to the value-mutating `retain` in `indexmap` - Add `Record.retain_mut()` for one-pass pruning Continuation of #10841 # User-Facing Changes None yet. # Tests + Formatting Doctests for the `retain`ing fun
# Description Consequences of #10841 This does not yet make the assumption that columns are always duplicated. Follow the existing logic here - Use saner record API in `nu-engine/src/eval.rs` - Use checked record construction in `nu-engine/src/scope.rs` - Use `values` iterator in `nu-engine/src/scope.rs` - Use `columns` iterator in `nu_engine::get_columns()` - Start using record API in `value/mod.rs` - Use `.insert` in `eval_const.rs` Record code - Record API for `eval_const.rs` table code # User-Facing Changes None # Tests + Formatting None
# Description Since #10841 the goal is to remove the implementation details of `Record` outside of core operations. To this end use Record iterators and map-like accessors in a bunch of places. In this PR I try to collect the boring cases where I don't expect any dramatic performance impacts or don't have doubts about the correctness afterwards - Use checked record construction in `nu_plugin_example` - Use `Record::into_iter` in `columns` - Use `Record` iterators in `headers` cmd - Use explicit record iterators in `split-by` - Use `Record::into_iter` in variable completions - Use `Record::values` iterator in `into sqlite` - Use `Record::iter_mut` for-loop in `default` - Change `nu_engine::nonexistent_column` to use iterator - Use `Record::columns` iter in `nu-cmd-base` - Use `Record::get_index` in `nu-command/network/http` - Use `Record.insert()` in `merge` - Refactor `move` to use encapsulated record API - Use `Record.insert()` in `explore` - Use proper `Record` API in `explore` - Remove defensiveness around record in `explore` - Use encapsulated record API in more `nu-command`s # User-Facing Changes None intentional # Tests + Formatting (-)
# Description > Our `Record` looks like a map, quacks like a map, so let's treat it with the API for a map Implement common methods found on e.g. `std::collections::HashMap` or the insertion-ordered [indexmap](https://docs.rs/indexmap). This allows contributors to not have to worry about how to get to the relevant items and not mess up the assumptions of a Nushell record. ## Record assumptions - `cols` and `vals` are of equal length - for all practical purposes, keys/columns should be unique ## End goal The end goal of the upcoming series of PR's is to allow us to make `cols` and `vals` private. Then it would be possible to exchange the backing datastructure to best fit the expected workload. This could be statically (by finding the best balance) or dynamically by using an `enum` of potential representations. ## Parts - Add validating explicit part constructor `Record::from_raw_cols_vals()` - Add `Record.columns()` iterator - Add `Record.values()` iterator - Add consuming `Record.into_values()` iterator - Add `Record.contains()` helper - Add `Record.insert()` that respects existing keys - Add key-based `.get()`/`.get_mut()` to `Record` - Add `Record.get_index()` for index-based access - Implement `Extend` for `Record` naively - Use checked constructor in `record!` macro - Add `Record.index_of()` to get index by key # User-Facing Changes None directly # Developer facing changes You don't have to roll your own record handling and can use a familiar API # Tests + Formatting No explicit unit tests yet. Wouldn't be too tricky to validate core properties directly. Will be exercised by the following PRs using the new methods/traits/iterators.
# Description While we have now a few ways to add items or iterate over the collection, we don't have a way to cleanly remove items from `Record`. This PR fixes that: - Add `Record.remove()` to remove by key - makes the assumption that keys are unique, so can not be used universally, yet (see nushell#10875 for an important example) - Add naive `Record.retain()` for inplace removal - This follows the two separate `retain`/`retain_mut` in the Rust std library types, compared to the value-mutating `retain` in `indexmap` - Add `Record.retain_mut()` for one-pass pruning Continuation of nushell#10841 # User-Facing Changes None yet. # Tests + Formatting Doctests for the `retain`ing fun
# Description Consequences of nushell#10841 This does not yet make the assumption that columns are always duplicated. Follow the existing logic here - Use saner record API in `nu-engine/src/eval.rs` - Use checked record construction in `nu-engine/src/scope.rs` - Use `values` iterator in `nu-engine/src/scope.rs` - Use `columns` iterator in `nu_engine::get_columns()` - Start using record API in `value/mod.rs` - Use `.insert` in `eval_const.rs` Record code - Record API for `eval_const.rs` table code # User-Facing Changes None # Tests + Formatting None
# Description Since nushell#10841 the goal is to remove the implementation details of `Record` outside of core operations. To this end use Record iterators and map-like accessors in a bunch of places. In this PR I try to collect the boring cases where I don't expect any dramatic performance impacts or don't have doubts about the correctness afterwards - Use checked record construction in `nu_plugin_example` - Use `Record::into_iter` in `columns` - Use `Record` iterators in `headers` cmd - Use explicit record iterators in `split-by` - Use `Record::into_iter` in variable completions - Use `Record::values` iterator in `into sqlite` - Use `Record::iter_mut` for-loop in `default` - Change `nu_engine::nonexistent_column` to use iterator - Use `Record::columns` iter in `nu-cmd-base` - Use `Record::get_index` in `nu-command/network/http` - Use `Record.insert()` in `merge` - Refactor `move` to use encapsulated record API - Use `Record.insert()` in `explore` - Use proper `Record` API in `explore` - Remove defensiveness around record in `explore` - Use encapsulated record API in more `nu-command`s # User-Facing Changes None intentional # Tests + Formatting (-)
# Description > Our `Record` looks like a map, quacks like a map, so let's treat it with the API for a map Implement common methods found on e.g. `std::collections::HashMap` or the insertion-ordered [indexmap](https://docs.rs/indexmap). This allows contributors to not have to worry about how to get to the relevant items and not mess up the assumptions of a Nushell record. ## Record assumptions - `cols` and `vals` are of equal length - for all practical purposes, keys/columns should be unique ## End goal The end goal of the upcoming series of PR's is to allow us to make `cols` and `vals` private. Then it would be possible to exchange the backing datastructure to best fit the expected workload. This could be statically (by finding the best balance) or dynamically by using an `enum` of potential representations. ## Parts - Add validating explicit part constructor `Record::from_raw_cols_vals()` - Add `Record.columns()` iterator - Add `Record.values()` iterator - Add consuming `Record.into_values()` iterator - Add `Record.contains()` helper - Add `Record.insert()` that respects existing keys - Add key-based `.get()`/`.get_mut()` to `Record` - Add `Record.get_index()` for index-based access - Implement `Extend` for `Record` naively - Use checked constructor in `record!` macro - Add `Record.index_of()` to get index by key # User-Facing Changes None directly # Developer facing changes You don't have to roll your own record handling and can use a familiar API # Tests + Formatting No explicit unit tests yet. Wouldn't be too tricky to validate core properties directly. Will be exercised by the following PRs using the new methods/traits/iterators.
# Description While we have now a few ways to add items or iterate over the collection, we don't have a way to cleanly remove items from `Record`. This PR fixes that: - Add `Record.remove()` to remove by key - makes the assumption that keys are unique, so can not be used universally, yet (see nushell#10875 for an important example) - Add naive `Record.retain()` for inplace removal - This follows the two separate `retain`/`retain_mut` in the Rust std library types, compared to the value-mutating `retain` in `indexmap` - Add `Record.retain_mut()` for one-pass pruning Continuation of nushell#10841 # User-Facing Changes None yet. # Tests + Formatting Doctests for the `retain`ing fun
# Description Consequences of nushell#10841 This does not yet make the assumption that columns are always duplicated. Follow the existing logic here - Use saner record API in `nu-engine/src/eval.rs` - Use checked record construction in `nu-engine/src/scope.rs` - Use `values` iterator in `nu-engine/src/scope.rs` - Use `columns` iterator in `nu_engine::get_columns()` - Start using record API in `value/mod.rs` - Use `.insert` in `eval_const.rs` Record code - Record API for `eval_const.rs` table code # User-Facing Changes None # Tests + Formatting None
# Description Since nushell#10841 the goal is to remove the implementation details of `Record` outside of core operations. To this end use Record iterators and map-like accessors in a bunch of places. In this PR I try to collect the boring cases where I don't expect any dramatic performance impacts or don't have doubts about the correctness afterwards - Use checked record construction in `nu_plugin_example` - Use `Record::into_iter` in `columns` - Use `Record` iterators in `headers` cmd - Use explicit record iterators in `split-by` - Use `Record::into_iter` in variable completions - Use `Record::values` iterator in `into sqlite` - Use `Record::iter_mut` for-loop in `default` - Change `nu_engine::nonexistent_column` to use iterator - Use `Record::columns` iter in `nu-cmd-base` - Use `Record::get_index` in `nu-command/network/http` - Use `Record.insert()` in `merge` - Refactor `move` to use encapsulated record API - Use `Record.insert()` in `explore` - Use proper `Record` API in `explore` - Remove defensiveness around record in `explore` - Use encapsulated record API in more `nu-command`s # User-Facing Changes None intentional # Tests + Formatting (-)
Description
Implement common methods found on e.g.
std::collections::HashMapor the insertion-ordered indexmap.This allows contributors to not have to worry about how to get to the relevant items and not mess up the assumptions of a Nushell record.
Record assumptions
colsandvalsare of equal lengthEnd goal
The end goal of the upcoming series of PR's is to allow us to make
colsandvalsprivate.Then it would be possible to exchange the backing datastructure to best fit the expected workload.
This could be statically (by finding the best balance) or dynamically by using an
enumof potential representations.Parts
Record::from_raw_cols_vals()Record.columns()iteratorRecord.values()iteratorRecord.into_values()iteratorRecord.contains()helperRecord.insert()that respects existing keys.get()/.get_mut()toRecordRecord.get_index()for index-based accessExtendforRecordnaivelyrecord!macroUser-Facing Changes
None directly
Developer facing changes
You don't have to roll your own record handling and can use a familiar API
Tests + Formatting
No explicit unit tests yet. Wouldn't be too tricky to validate core properties directly.
Will be exercised by the following PRs using the new methods/traits/iterators.
Recordtype #10103).record!macro #10840