Add basic benchmark test for Python#17793
Conversation
gnossen
left a comment
Just some nits. Looking really nice.
```shell
function join { local IFS="$1"; shift; echo "$*"; }

if [[ -e "${SCENARIOS_FILE}" ]]; then
  echo "Running against scenarios.json:"
```
Maybe this could be `echo "Running against ${SCENARIOS_FILE}:"`?
> ## Why keep the scenario file if it can be generated?
>
> Well... The `tools/run_tests/performance/scenario_config.py` is 1274 lines long. The intention of building these benchmark tools is reducing the complexity of existing infrastructure code. Depending on something that is
Thanks for pointing it out. The updated version:

> Well... The `tools/run_tests/performance/scenario_config.py` is 1274 lines long. The intention of building these benchmark tools is reducing the complexity of existing infrastructure code. So, instead of calling layers of abstraction to generate the scenario file, keeping a valid static copy is more preferable.
A few insights:
I scanned through the code quickly and it looks like this is only adding a bazel test that runs a few basic Python benchmarks locally. If that's the case and there are no changes to the run_performance_tests.py suite, I probably don't need to review. (But I still don't quite understand why running the run_performance_tests.py script is so slow for Python - we have accelerated the Python build recently, haven't we?)
There are two major performance issues with [...] The purpose of this PR is to add a handy tool for tuning performance locally with frequent code changes. However, we still need the [...]
ericgribkoff
left a comment
LGTM
Regarding build speed of python: the recent changes did make it faster, but it's still on the order of minutes when building core as well. It seems that setuptools has little (afaict, no) support for incremental compilation: we found https://stackoverflow.com/questions/47539538/setup-py-build-that-doesnt-reinvoke-compiler-on-every-c-source-file-increment, which suggests an approach like plugging in bazel to do the c/cython step within setuptools may be a viable solution, but that's a longer term approach.
```shell
bazel test --test_output=streamed src/python/grpcio_tests/tests/qps:basic_benchmark_test
```
> ## What does the output look like?
> ## Why keep the scenario file if it can be generated?
>
> Well... The `tools/run_tests/performance/scenario_config.py` is 1274 lines long. The intention of building these benchmark tools is reducing the complexity of existing infrastructure code. So, instead of calling layers of abstraction to generate the scenario file, keeping a valid static copy is more preferable.
s/more preferable/preferable/
Optional: Add a "future-proofing" sentence like - "If the use case for this tool grows beyond these two scenarios, we can incorporate automatic generation and selection of scenarios into the tool."
Added:

> Also, if the use case for this tool grows beyond simple static scenarios, we can incorporate automatic generation and selection of scenarios into the tool.
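As a hedged sketch of what that automatic selection might look like: the loader below assumes a file with a top-level `"scenarios"` list whose entries carry a `"name"` field, which matches the shape of the qps scenario files, but the helper itself and its signature are illustrative assumptions, not code from this PR.

```python
import json


def load_scenarios(path, names=None):
    """Load benchmark scenarios from a static JSON file.

    Assumes the file has a top-level "scenarios" list whose entries
    carry a "name" field (as in the qps scenario files). If `names`
    is given, keep only the scenarios whose names appear in it.
    """
    with open(path) as f:
        config = json.load(f)
    scenarios = config["scenarios"]
    if names is not None:
        scenarios = [s for s in scenarios if s["name"] in names]
    return scenarios
```

With a scheme like this, the static file stays the source of truth and the tool only grows a thin selection layer on top of it.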
> * python_protobuf_sync_streaming_qps_unconstrained
> * python_protobuf_sync_unary_ping_pong_1MB
> Here I picked the top 2 most representative scenarios of them, and reduced their benchmark duration from 30 seconds to 10 seconds:
Is "most representative" backed by an objective measurement? If not, it's fine; perhaps just reword to "I [we? I think we usually use we in the readme files] picked a small but representative subset".
Running a single performance benchmark scenario for Python takes 15 minutes. The scenario here is named `python_protobuf_sync_unary_ping_pong`.
To boost our productivity, we have to speed up the benchmark test. The whole benchmark mechanism is designed for CI, and it mixes control flow with data flow. The amount of related code is huge (thousands of lines across languages & folders). IMO it is better to utilize existing code as much as I can without changing its behavior.
The first change is building the Python C extension incrementally. To achieve that, this PR adds a Bazel BUILD file to instruct Bazel how to build the Python benchmark related code, and makes it depend on our core package `//src/python/grpcio/grpc:grpcio`. The second change is adding a `basic_benchmark_test`. It runs two scenarios that cover major use cases of gRPC Python. What it does is spin up two Python `qps_worker` processes to serve as client and server, and then spin up a `qps_json_driver` to control the two workers to run benchmarks. The result will be printed to stdout. This PR also includes a simple doc to make it easier to use. The time usage reduces dramatically (and it is running two scenarios):
PS: If I use a `py_test`, the sanity check will complain that I didn't add it to the `tests.json`. Then, I need to disable it at two or three places...
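To make the worker/driver choreography described above concrete, here is a minimal sketch of the spin-up/drive/tear-down flow. The `--driver_port` and `--qps_workers` flags mirror the `qps_worker`/`qps_json_driver` interface described in this PR, but the helper itself is hypothetical and not code from the change.

```python
import subprocess


def run_benchmark(worker_cmd, driver_cmd, ports=(10000, 10010)):
    """Spin up two workers (client and server), run the driver against
    them, and return the driver's exit code.

    worker_cmd and driver_cmd are argv lists; the port flags appended
    here follow the qps_worker / qps_json_driver convention described
    above (treat them as assumptions).
    """
    workers = [
        subprocess.Popen(worker_cmd + ["--driver_port=%d" % port])
        for port in ports
    ]
    try:
        addresses = ",".join("localhost:%d" % port for port in ports)
        driver = subprocess.run(driver_cmd + ["--qps_workers=%s" % addresses])
        return driver.returncode
    finally:
        # The driver normally winds the benchmark down; make sure the
        # worker processes are gone even if it crashed mid-run.
        for worker in workers:
            worker.terminate()
            worker.wait()
```

A real invocation would pass the built `qps_worker` and `qps_json_driver` binaries (plus a flag pointing at the scenario JSON) as the two command lists; the bazel test wraps exactly this kind of sequence.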