Skip to content

Performance: remove useless header merge from Response#1325

Merged
tenderlove merged 3 commits into
rack:masterfrom
jodosha:performance/remove-useless-header-merge
Apr 2, 2019
Merged

Performance: remove useless header merge from Response#1325
tenderlove merged 3 commits into
rack:masterfrom
jodosha:performance/remove-useless-header-merge

Conversation

@jodosha

@jodosha jodosha commented Dec 19, 2018

Copy link
Copy Markdown
Contributor

Change

There is no need to instantiate and merge headers, as HeaderHash already sets the given values from header.

By looking at the code of HeaderHash#merge we see that it duplicates itself:

      def merge(other)
        hash = dup
        hash.merge! other
      end

Each time we instantiate a Rack::Response we instantiate a HeaderHash twice: the first time because of new, and the second time because of merge.

Req/s benchmark

By running this benchmark on my machine, I can see a performance gain of +350req/s.

Gemfile

# frozen_string_literal: true

source "https://rubygems.org"

gem "rack" #, git: "https://github.com/jodosha/rack.git", branch: "performance/remove-useless-header-merge"
gem "puma"

config.ru

# frozen_string_literal: true

require "bundler/setup"

class MyAction
  def call(*)
    res = Rack::Response.new
    res.write "OK"
    res
  end
end

run MyAction.new

Run the app with:

$ bundle exec puma -e production -t 16:16 config.ru

Before:

$ wrk -t 2 http://localhost:9292/
Running 10s test @ http://localhost:9292/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.96ms  520.27us  17.09ms   86.12%
    Req/Sec     5.33k   317.62     5.67k    91.50%
  106150 requests in 10.00s, 4.05MB read
Requests/sec:  10614.81
Transfer/sec:    414.65KB

After:

$ wrk -t 2 http://localhost:9292/
Running 10s test @ http://localhost:9292/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.93ms  525.01us  16.51ms   87.49%
    Req/Sec     5.53k   358.19     5.85k    92.00%
  109938 requests in 10.00s, 4.19MB read
Requests/sec:  10993.58
Transfer/sec:    429.45KB

Comment thread lib/rack/response.rb
def initialize(body = [], status = 200, header = {})
@status = status.to_i
@header = Utils::HeaderHash.new.merge(header)
@header = Utils::HeaderHash.new(header)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fail when header is a HeaderHash already. In that case new just return the same object what makes this object to mutate the original argument too. I do think it does make sense to avoid the useless instantiation when header is a regular hash but we need to take in consideration when it is a HeaderHash, probably changing new to dup the argument in that case.

@jodosha jodosha Jan 3, 2019

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rafaelfranca Correct me if I'm wrong. From this line of code, it seems safe to pass both Utils::HeaderHash and Hash: https://github.com/rack/rack/blob/master/lib/rack/utils.rb#L415

def self.new(hash = {})
  HeaderHash === hash ? hash : super(hash)
end

@rafaelfranca rafaelfranca Jan 3, 2019

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is safe, but the object passed will be mutated.

For example, before this change:

header = Rack::Utils::HeaderHash.new

response = Rack::Response.new([], 200, header)
response.header['a'] = 1
p header # => {}

With your change:

header = Rack::Utils::HeaderHash.new

response = Rack::Response.new([], 200, header)
response.header['a'] = 1
p header # => { "a" => 1 }

This only happens when it is a HeaderHash:

header = {}

response = Rack::Response.new([], 200, header)
response.header['a'] = 1
p header # => {}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rafaelfranca Thanks for the pointer.
It only happens with HeaderHash, because Hash is iterated during HeaderHash#initialize.

      def initialize(hash = {})
        super()
        @names = {}
        hash.each { |k, v| self[k] = v }
      end

If I try to dup HeaderHash in .new, there is a spec that fails:

      def self.new(hash = {})
        HeaderHash === hash ? hash.dup : super(hash)
      end
# test/spec_utils.rb:677

  it "avoid unnecessary object creation if possible" do
    a = Rack::Utils::HeaderHash.new("foo" => "bar")
    b = Rack::Utils::HeaderHash.new(a)
    b.object_id.must_equal a.object_id
    b.must_equal a
  end

This makes sense for performance POV, but doesn't guarantee immutability.

So I grep'd for the usage of HeaderHash and discovered it's used only with new (without merge) in the rest of the code base. These are middleware, so I'm not sure if we care about immutability over there.


A solution to preserve the current behavior of .new and to make Response#initialize faster for the most common case (header is a Hash), we can introduce HeaderHash.build, as a special case:

      def self.build(hash)
        new(HeaderHash === hash ? hash.dup : hash)
      end

This will make existing specs to pass, and also a new one that I wrote:

# test/spec_response.rb
  it "doesn't mutate given headers" do
    [{}, Rack::Utils::HeaderHash.new].each do |header|
      response = Rack::Response.new([], 200, header)
      response.header["Content-Type"] = "text/plain"
      response.header["Content-Type"].must_equal "text/plain"

      header.wont_include("Content-Type")
    end
  end

Here's the results before and after this implementation:

$ wrk -t 2 http://localhost:9292/
Running 10s test @ http://localhost:9292/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.00ms  480.28us  13.76ms   86.93%
    Req/Sec     5.13k   237.42     5.39k    91.09%
  103020 requests in 10.10s, 3.93MB read
Requests/sec:  10200.58
Transfer/sec:    398.47KB
$ wrk -t 2 http://localhost:9292/
Running 10s test @ http://localhost:9292/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.96ms  453.09us  12.87ms   87.46%
    Req/Sec     5.35k   269.14     5.69k    90.50%
  106459 requests in 10.00s, 4.06MB read
Requests/sec:  10644.92
Transfer/sec:    415.83KB

Please let me know what you think. Thanks! 👍

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tenderlove what do you think about the current behavior of new? Should we care about immutability of the HeaderHash in the Response#initialize if we don't care in other places?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should care about immutability.

@jeremy jeremy left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. It's surprising that .new doesn't dupe. Can we safely change that behavior rather than introducing new API?

"Coerce to HeaderHash unless already an instance" may be desirable in other cases, but I'd expect something other than .new to handle that coercion. Perhaps .build 😅

@tenderlove

Copy link
Copy Markdown
Member

I have a few thoughts about this:

  1. I think the subclass HeaderHash should behave as close to the super as possible, meaning .new should dup
  2. I really don't like that HeaderHash is a subclass of Hash. I'd prefer it delegates to Hash
  3. HeaderHash should be :nodoc: and private.

I can't find any usage of HeaderHash on GitHub code search that isn't internal to Rack, so I think we're pretty safe to change this class. I don't expect anyone to be calling HeaderHash.new.

@jodosha is it possible to just delete our current implementation of self.new on HeaderHash?

@jodosha

jodosha commented Jan 23, 2019

Copy link
Copy Markdown
Contributor Author

@tenderlove Please have a look at d671a0c, this commit:

  1. It removes Utils::HeaderHash.new and .build (previously introduced in this PR)
  2. It changes an existing test to assert that all the given headers (both Hash and Utils::HeaderHash) will be duped.
  3. It marks Utils::HeaderHash as private API and adds a :nodoc: directive for RDoc.

Regarding Utils::HeaderHash inheriting from Hash, I agree. Do you think there is room for this improvement in this PR, or it's better to wait for this change? Thanks.

@tenderlove

Copy link
Copy Markdown
Member

Regarding Utils::HeaderHash inheriting from Hash, I agree. Do you think there is room for this improvement in this PR, or it's better to wait for this change? Thanks.

I guess we could consider this PR a "breaking change" already, so I'm fine with removing the inheritance in this PR.

We haven't decided on the version for master, whether it's 2.1 or 3.0. Maybe we should merge this in and call it 3.0? @rafaelfranca @matthewd @eileencodes @jeremy any opinions? As far as performance goes, this HeaderHash thing has been a thorn in our side for a while. I'd be really happy to ship this.

@tenderlove

Copy link
Copy Markdown
Member

To be clear, I think we could merge this PR as-is and ship it as 2.1. If we change inheritance to delegation we'll have to go to 3.0 as the API surface area will change drastically (there's no way we can change to delegation and have even close to 100% api compatibility with Hash)

@eileencodes

Copy link
Copy Markdown
Member

I have no objections to merging this as is 👍

@jodosha

jodosha commented Feb 20, 2019

Copy link
Copy Markdown
Contributor Author

@tenderlove 2.1 vs 3.0: it really depends on the general plans for Rack.

If you're planning other major changes for 3.0, it's worth to modify Utils::HeaderHash and use delegation, otherwise it can wait for now. Thanks.

EDIT: as I'm rewriting parts of Hanami, I'll probably submit other PRs for perf enhancement, if needed. 🐈

@jodosha

jodosha commented Mar 6, 2019

Copy link
Copy Markdown
Contributor Author

@tenderlove I realized that asking for Rack plans is out of the scope for this PR. So here's my proposal: if the current implementation looks good, please merge it.

I'll take care after to create another PR that introduces the breaking changes. So it's ready if you folks decide one day to go for 3.0.

@tenderlove

Copy link
Copy Markdown
Member

@jodosha sorry to take so long to respond. I'll merge this.

I'll take care after to create another PR that introduces the breaking changes. So it's ready if you folks decide one day to go for 3.0.

This sounds great. If we decide master should be 3.0 we can merge that PR.

@tenderlove tenderlove merged commit 4c07085 into rack:master Apr 2, 2019
@jodosha jodosha deleted the performance/remove-useless-header-merge branch April 3, 2019 11:55
@schneems schneems mentioned this pull request Feb 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants