Skip to content

PoC: precompute embeded string literals hash code#553

Closed
casperisfine wants to merge 1 commit intomasterfrom
interned-string-hashcode
Closed

PoC: precompute embeded string literals hash code#553
casperisfine wants to merge 1 commit intomasterfrom
interned-string-hashcode

Conversation

@casperisfine
Copy link
Copy Markdown

Unclear how much worth it, it would be, but with embeded strings we often have some space left in the slot, which we can use to store the string Hash code.

It's probably only worth it for string literals, as they are the ones likely to be used as hash keys.

We could even ony do this for string literal actually used as hash keys, as seen by the compiler.

It is also unclear what the best layout would be. We currently store it right after the string terminator as to make it easy/fast to compute, but ideally it would be at a fixed position, however this means creating another RString union, which would complexify the code base significantly.

It's also unclear to us whether trying to respect alignment would lead to a performance gain or not.

hash = 10.times.to_h do |i|
  [i, i]
end

dyn_sym = "dynamic_symbol".to_sym
hash[:some_symbol] = 1
hash[dyn_sym] = 1
hash["small"] = 2
hash["frozen_string_literal"] = 2

Benchmark.ips do |x|
  x.report("symbol") { hash[:some_symbol] }
  x.report("dyn_symbol") { hash[:some_symbol] }
  x.report("small_lit") { hash["small"] }
  x.report("frozen_lit") { hash["frozen_string_literal"] }
  x.compare!(order: :baseline)
end

Before:

ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.392M i/100ms
          dyn_symbol     2.440M i/100ms
           small_lit     2.155M i/100ms
          frozen_lit     2.010M i/100ms
Calculating -------------------------------------
              symbol     24.175M (± 1.7%) i/s -    122.002M in   5.048306s
          dyn_symbol     24.345M (± 1.6%) i/s -    122.019M in   5.013400s
           small_lit     21.252M (± 2.1%) i/s -    107.744M in   5.072042s
          frozen_lit     20.095M (± 1.3%) i/s -    100.489M in   5.001681s

Comparison:
              symbol: 24174848.1 i/s
          dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error
           small_lit: 21252403.2 i/s - 1.14x  slower
          frozen_lit: 20094766.0 i/s - 1.20x  slower

After:

ruby 3.4.0dev (2024-04-08T07:20:15Z interned-string-ha.. 76efed65bd) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.400M i/100ms
          dyn_symbol     2.405M i/100ms
           small_lit     2.308M i/100ms
          frozen_lit     2.268M i/100ms
Calculating -------------------------------------
              symbol     23.528M (± 6.9%) i/s -    117.584M in   5.033231s
          dyn_symbol     23.777M (± 4.7%) i/s -    120.231M in   5.071734s
           small_lit     23.066M (± 2.9%) i/s -    115.376M in   5.006947s
          frozen_lit     22.729M (± 1.1%) i/s -    115.693M in   5.090700s

Comparison:
              symbol: 23527823.6 i/s
          dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error
           small_lit: 23065535.3 i/s - same-ish: difference falls within error
          frozen_lit: 22729351.6 i/s - same-ish: difference falls within error

cc @etiennebarrie

compile.c Outdated
}
else {
lit = rb_fstring(lit);
lit = rb_fstring(rb_str_precompute_hash(rb_str_freeze(lit)));
Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NB: the compiler should always feed pre-frozen strings to rb_fstring as to save a dup.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even better, it could use rb_enc_interned_str to save a bunch of allocs.

@casperisfine casperisfine force-pushed the interned-string-hashcode branch from e205c02 to f22b383 Compare April 8, 2024 13:04
@casperisfine
Copy link
Copy Markdown
Author

Note to self, we're precomputing in the compiler, we'd need to handle loading cached ISeq too.

@casperisfine
Copy link
Copy Markdown
Author

@XrXr: shared https://bugs.ruby-lang.org/issues/15331, but seems like that patch was lazily computing the hashcode, here we eagerly do it during compilation, so I think we benefit more from it.

Unclear how much worth it, it would be, but with embeded
strings we often have some space left in the slot, which
we can use to store the string Hash code.

It's probably only worth it for string literals, as they
are the ones likely to be used as hash keys.

We could even ony do this for string literal actually
used as hash keys, as seen by the compiler.

It is also unclear what the best layout would be. We
currently store it right after the string terminator
as to make it easy/fast to compute, but ideally it would
be at a fixed position, however this means creating another
`RString` union, which would complexify the code base significantly.

It's also unclear to us whether trying to respect alignment would
lead to a performance gain or not.

```ruby
hash = 10.times.to_h do |i|
  [i, i]
end

dyn_sym = "dynamic_symbol".to_sym
hash[:some_symbol] = 1
hash[dyn_sym] = 1
hash["small"] = 2
hash["frozen_string_literal"] = 2

Benchmark.ips do |x|
  x.report("symbol") { hash[:some_symbol] }
  x.report("dyn_symbol") { hash[:some_symbol] }
  x.report("small_lit") { hash["small"] }
  x.report("frozen_lit") { hash["frozen_string_literal"] }
  x.compare!(order: :baseline)
end
```

Before:

```
ruby 3.3.0 (2023-12-25 revision 5124f9a) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.392M i/100ms
          dyn_symbol     2.440M i/100ms
           small_lit     2.155M i/100ms
          frozen_lit     2.010M i/100ms
Calculating -------------------------------------
              symbol     24.175M (± 1.7%) i/s -    122.002M in   5.048306s
          dyn_symbol     24.345M (± 1.6%) i/s -    122.019M in   5.013400s
           small_lit     21.252M (± 2.1%) i/s -    107.744M in   5.072042s
          frozen_lit     20.095M (± 1.3%) i/s -    100.489M in   5.001681s

Comparison:
              symbol: 24174848.1 i/s
          dyn_symbol: 24345476.9 i/s - same-ish: difference falls within error
           small_lit: 21252403.2 i/s - 1.14x  slower
          frozen_lit: 20094766.0 i/s - 1.20x  slower
```

After:

```
ruby 3.4.0dev (2024-04-08T07:20:15Z interned-string-ha.. 76efed6) [arm64-darwin23]
Warming up --------------------------------------
              symbol     2.400M i/100ms
          dyn_symbol     2.405M i/100ms
           small_lit     2.308M i/100ms
          frozen_lit     2.268M i/100ms
Calculating -------------------------------------
              symbol     23.528M (± 6.9%) i/s -    117.584M in   5.033231s
          dyn_symbol     23.777M (± 4.7%) i/s -    120.231M in   5.071734s
           small_lit     23.066M (± 2.9%) i/s -    115.376M in   5.006947s
          frozen_lit     22.729M (± 1.1%) i/s -    115.693M in   5.090700s

Comparison:
              symbol: 23527823.6 i/s
          dyn_symbol: 23776757.8 i/s - same-ish: difference falls within error
           small_lit: 23065535.3 i/s - same-ish: difference falls within error
          frozen_lit: 22729351.6 i/s - same-ish: difference falls within error
```

Co-Authored-By: Étienne Barrié <etienne.barrie@gmail.com>
@casperisfine casperisfine force-pushed the interned-string-hashcode branch from 1a17cf3 to 49c75af Compare April 15, 2024 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants