Don't copy in From<String> for InternedString.#14674
Don't copy in From<String> for InternedString.#14674jacobbramley wants to merge 1 commit intorust-lang:masterfrom
From<String> for InternedString.#14674Conversation
If we have an owned `String` already, then don't clone it just to internalise it.
|
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @epage (or someone else) some time within the next two weeks. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
| let mut cache = interned_storage(); | ||
| let s = cache.get(item.as_str()).copied().unwrap_or_else(|| { | ||
| let s = item.leak(); | ||
| cache.insert(s); | ||
| s | ||
| }); | ||
|
|
||
| InternedString { inner: s } |
There was a problem hiding this comment.
Without performance data, this falls to being a refactor. My main concern is that we're duplicating the logic for creating an InternedString.
Unsure whether thats sufficient to block this or not. Will think more on this
There was a problem hiding this comment.
Another way to handle this is having a Cow variant of new(), and then both new() and from() call it.
There was a problem hiding this comment.
I agree, it just looked like an attempted optimisation — the impl for an already-owned String — that wasn't quite complete. I couldn't work out a way to avoid the duplication without allocating on a common path.
Cow would work but turns a statically known property into a runtime check, which might be reasonable, but requires benchmarking to check.
There was a problem hiding this comment.
Uncertainty about the additional complexity is why this was not done originally.
If we want to make it part of the type system without duplicating the core logic we could do something like:
fn new_inner<S: AsRef<str>>(str: S, make_stat: impl Fn(S) -> &'static str) -> InternedString {
let mut cache = interned_storage();
let s = cache.get(str.as_ref()).copied().unwrap_or_else(|| {
let s = make_stat(str);
cache.insert(s);
s
});
InternedString { inner: s }
}I don't know if it's worth the additional complexity.
There was a problem hiding this comment.
imo cow offers us a fairly clean design and I don't expect the check will impact performance all that much
### What does this PR try to resolve? Use `Cow` instead of duplicating the interning logic. Closes #14674 ### How should we test and review this PR? ### Additional information
If we have an owned
Stringalready, then don't clone it just to internalise it.What does this PR try to resolve?
Before,
InternedString::from(String::new("blahblah"))would take ownership of theStringbut then make and leak a copy (str.to_string().leak()) for use in theinterned_storage(). This change just avoids that copy, which I think was the intention behind the interface and the interning mechanism.How should we test and review this PR?
Several existing tests hit this implementation, so I'd expect
cargo testto suffice for functional correctness. The implementation ofInternedString::new()(a few lines down) is useful as a reference.Additional information
Whilst it looks like an optimisation, I've no particular performance objective here. This was just something I noticed whilst looking around the code.