Skip to content

GH-39303: [Archery][Benchmarking] Allow setting C++ repetition min time#39324

Merged
kou merged 2 commits intoapache:mainfrom
pitrou:gh39303-benchmark-repetition-time
Jan 6, 2024
Merged

GH-39303: [Archery][Benchmarking] Allow setting C++ repetition min time#39324
kou merged 2 commits intoapache:mainfrom
pitrou:gh39303-benchmark-repetition-time

Conversation

@pitrou
Copy link
Copy Markdown
Member

@pitrou pitrou commented Dec 20, 2023

Rationale for this change

We want to be able to increase the number of repetitions for each C++ micro-benchmark without increasing the total runtime.

What changes are included in this PR?

  • Add a --repetition-min-time argument to set the repetition duration in seconds
  • Add a --cpp-benchmark-extras argument to pass arbitrary arguments to Google Benchmark executables
  • Add a couple tests with multiple benchmark repetitions

Are these changes tested?

Not entirely. Command-line argument passing is not unit-tested.

Are there any user-facing changes?

No.

…min time

We want to be able to increase the number of repetitions for each C++ micro-benchmark without increase the total runtime.
@github-actions github-actions bot added the awaiting review Awaiting review label Dec 20, 2023
@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Dec 20, 2023

@pitrou
Copy link
Copy Markdown
Member Author

pitrou commented Dec 20, 2023

With this PR, if I run:

$ archery benchmark  run /build/build-release/ --suite-filter arrow-value-parsing-benchmark --benchmark-filter "FloatParsing.DoubleType" --repetitions 20 --repetition-min-time=0.05 --output=t.json

... then the total runtime is less than 2 second and I get a results file like this:

{
  "suites": [
    {
      "name": "arrow-value-parsing-benchmark",
      "benchmarks": [
        {
          "name": "FloatParsing<DoubleType>",
          "unit": "items_per_second",
          "less_is_better": false,
          "values": [
            79950121.48431177,
            85113591.18022932,
            90551140.90867223,
            90609487.2652054,
            90628541.38461946,
            90649225.23022957,
            91166686.43369755,
            91220387.63998744,
            91538368.41832638,
            91640734.18134348,
            91846292.52809268,
            91929727.83451708,
            91948825.97105175,
            91975895.10168795,
            92096737.69094324,
            92171049.58800516,
            92262171.0999602,
            92316911.2854944,
            92499358.69477023,
            93462194.03456633
          ],
          "time_unit": "ns",
          "times": [
            10717.94139001506,
            10813.586956610783,
            10835.81713547928,
            10852.307757681898,
            10854.492114290344,
            10871.693946617952,
            10876.263213883914,
            10879.553708338452,
            10881.460784591169,
            10905.518542276943,
            10916.3064791786,
            10929.045609416553,
            10967.386402201393,
            10972.461636954646,
            11034.734867894433,
            11037.859122534785,
            11046.999999422633,
            11047.337169190347,
            11760.27770670678,
            12512.818840690854
          ],
          "counters": {
            "family_index": 0,
            "per_family_instance_index": 0,
            "run_name": "FloatParsing<DoubleType>",
            "repetitions": 20,
            "repetition_index": 9,
            "threads": 1,
            "iterations": 4692
          }
        }
      ]
    }
  ]
}

@pitrou pitrou force-pushed the gh39303-benchmark-repetition-time branch from 9f89e51 to 2cd0f0c Compare December 20, 2023 15:47
@jgehrcke
Copy link
Copy Markdown
Contributor

We want to be able to increase the number of repetitions for each C++ micro-benchmark without increase the total runtime.

That clarity of wording is much appreciated.

        "repetitions": 20,
        "iterations": 4692

Lovely. For people reading along, a related thread about that terminology: conbench/conbench#1398

@pitrou pitrou requested a review from kszucs December 20, 2023 16:51
Copy link
Copy Markdown
Member

@kou kou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@@ -0,0 +1,2 @@
pytest
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the license header?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, apparently requirements files are excluded from the RAT checks?
I presume comments are not supported in those files.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, apparently requirements files are excluded from the RAT checks?

In general, I think that we should use the RAT checks as much as possible for safety.

I presume comments are not supported in those files.

https://github.com/apache/arrow/blob/main/docs/requirements.txt uses comment. So we may be able to use comment here too. If we can't, I'm OK with excluding from the RAT checks.

We can defer this to another PR if needed.
I'll merge this.

@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting review Awaiting review labels Dec 21, 2023
@kou kou merged commit 1c1fa74 into apache:main Jan 6, 2024
@kou kou removed the awaiting merge Awaiting merge label Jan 6, 2024
@github-actions github-actions bot added the awaiting changes Awaiting changes label Jan 6, 2024
@pitrou pitrou deleted the gh39303-benchmark-repetition-time branch January 6, 2024 21:59
@conbench-apache-arrow
Copy link
Copy Markdown

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit 1c1fa74.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 8 possible false positives for unstable benchmarks that are known to sometimes produce them.

dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…min time (apache#39324)

### Rationale for this change

We want to be able to increase the number of repetitions for each C++ micro-benchmark without increasing the total runtime.

### What changes are included in this PR?

* Add a `--repetition-min-time` argument to set the repetition duration in seconds
* Add a `--cpp-benchmark-extras` argument to pass arbitrary arguments to Google Benchmark executables
* Add a couple tests with multiple benchmark repetitions

### Are these changes tested?

Not entirely. Command-line argument passing is not unit-tested.

### Are there any user-facing changes?

No.
* Closes: apache#39303

Authored-by: Antoine Pitrou <antoine@python.org>
Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting changes Awaiting changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Archery] Allow passing options to Google Benchmark

3 participants