[REVIEW][Perf] Remove redundant string concatenations in dask code-base #6137
quasiben merged 33 commits into dask:master
Conversation
These look reasonable to me, but I am biased here. I think it would be good to have an additional maintainer review before merging. It is worth noting these changes are in the graph construction.
Do these changes affect performance in real-world workflows?
shuffle_join_token = "shuffle-join-" + token

start = dict(
    (("shuffle-join-" + token, 0, inp), (b2.name, i) if i < b.npartitions else [])
These changes look good @galipremsagar - although I do have a minor preference for names like shuffle_join_name rather than ..._token (this is probably just a personal preference).

This definitely reduces the time needed to build the graph for shuffle operations. I think we should definitely be sure to predefine (concatenate) all task labels outside Python loops in the future.
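The pattern described above can be sketched in isolation. The snippet below is a hypothetical illustration of hoisting the label concatenation out of the graph-construction loop; the function names and graph shape are invented, not dask's actual shuffle code:

```python
# Hypothetical sketch of the optimization: hoist task-label concatenation
# out of the graph-construction loop. Names and graph shape are illustrative,
# not dask's real shuffle implementation.

def build_graph_before(token, npartitions):
    # "shuffle-join-" + token is re-concatenated on every iteration
    return {("shuffle-join-" + token, i): ("shuffle-task", i)
            for i in range(npartitions)}

def build_graph_after(token, npartitions):
    # Concatenate once; reuse the resulting string for every key
    shuffle_join_name = "shuffle-join-" + token
    return {(shuffle_join_name, i): ("shuffle-task", i)
            for i in range(npartitions)}

# Both build identical graphs; the second avoids npartitions - 1 concatenations
assert build_graph_before("abc123", 1000) == build_graph_after("abc123", 1000)
```

A secondary benefit: CPython caches a string's hash on the object, so reusing one string object also means its hash is computed only once as the graph keys are inserted and looked up.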
Awesome, I'm glad to hear it. If you all happen to have a before-and-after comparison for a large graph, like a dataframe merge, I'd love to see the results. The microbenchmark in the original post shows that this is a great start, but the case could be made stronger by showing how that speedup affects performance in the broader context of Dask.

Yes, it does. While testing an internal workflow we noticed a 0.5 to 0.8 s drop in benchmark times.
Can you include a minimal example in the comments of this pull request?
Ideally something not using cudf so that others here can see the effects.
I recommend using dask.datasets.timeseries for this.
Here is a simple example that illustrates the performance bump for shuffle-graph creation:

from dask.datasets import timeseries
from dask.dataframe.shuffle import shuffle

ddf_d = timeseries(start='2000-01-01', end='2010-01-01', partition_freq='1d')
ddf_h = timeseries(start='2000-01-01', end='2010-01-01', partition_freq='1h')

%timeit ddf_d_2 = shuffle(ddf_d, "id", shuffle="tasks")
%timeit ddf_h_2 = shuffle(ddf_h, "id", shuffle="tasks")

Master:

This PR:

Note that the benefit is much more significant when the initial dataframe has a large number of partitions (common for ETL on large-scale datasets in the real world).
Any other concerns with this PR? If not, I'd like to merge tomorrow afternoon.

@quasiben Sure, there are a few more minor tweaks I have to test. If they work, I'll push those as well in this PR and this should be ready for merge.

@quasiben This is ready for review.
rjzamora
left a comment
Thanks again for making these optimizations @galipremsagar - my only suggestion is to make sure we get back the same DataFrame subclass if we are using something like dask_cudf. This suggestion is not related to a change from your PR, but I think it is a good opportunity to fix it.
dask/dataframe/shuffle.py
Outdated
graph2, "repartition-get-" + token, df2, [None] * (npartitions + 1)
repartition_get_name, dsk, dependencies=[df2]
)
df3 = DataFrame(graph2, repartition_get_name, df2, [None] * (npartitions + 1))
Just realizing: since we can/do use shuffle on a dask_cudf DataFrame, this should probably be something like df.__class__ rather than DataFrame.
See the new_dd_object function commonly used in dask/dataframe/core.py
Made the change 👍
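The subclass concern can be shown without dask at all. The sketch below uses stand-in classes (hypothetical, not dask's real `DataFrame`) to show why constructing via a hard-coded class drops the subclass while `df.__class__` preserves it:

```python
# Stand-in classes (hypothetical): they only mimic the shape of the problem.
class DataFrame:
    def __init__(self, data):
        self.data = data

class GPUDataFrame(DataFrame):
    """Plays the role of a dask_cudf DataFrame subclass."""

def repartition_hardcoded(df):
    # Always produces the base class, silently discarding the subclass
    return DataFrame(df.data)

def repartition_preserving(df):
    # df.__class__ (equivalently type(df)) keeps the caller's concrete type
    return df.__class__(df.data)

gdf = GPUDataFrame([1, 2, 3])
assert type(repartition_hardcoded(gdf)) is DataFrame      # subclass lost
assert type(repartition_preserving(gdf)) is GPUDataFrame  # subclass kept
```

As the reviewer notes, dask's `new_dd_object` helper serves this role in `dask/dataframe/core.py`, picking the right collection class from the metadata rather than hard-coding one.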
dask/dataframe/shuffle.py
Outdated
graph = HighLevelGraph.from_collections("shuffle-" + token, dsk, dependencies=[df])
df2 = DataFrame(graph, "shuffle-" + token, df, df.divisions)
graph = HighLevelGraph.from_collections(shuffle_token, dsk, dependencies=[df])
df2 = DataFrame(graph, shuffle_token, df, df.divisions)
See comment below - We might not want DataFrame here.
@galipremsagar can I ask you to post new numbers with and without this PR?
@quasiben Here are the numbers with respect to master and the current changes:

from dask.datasets import timeseries
from dask.dataframe.shuffle import shuffle

ddf_d = timeseries(start='2000-01-01', end='2010-01-01', partition_freq='1d')
ddf_h = timeseries(start='2000-01-01', end='2010-01-01', partition_freq='1h')

%timeit ddf_d_2 = shuffle(ddf_d, "id", shuffle="tasks")
%timeit ddf_h_2 = shuffle(ddf_h, "id", shuffle="tasks")

THIS PR:

MASTER:
@galipremsagar thanks for doing that. I think this is ready to merge, but I'd like to give other maintainers time to comment. If there are no more comments, I'll plan to merge tonight or tomorrow morning.

Thanks @quasiben! Sounds good to me.
@galipremsagar Thanks again for the work here. Do you think this is ready to go now?

I'm not sure we can remove the check. Do you have any suggestions on that part of the code, @rjzamora?

Update: Changes are ready for review.
Thanks for going through all this @galipremsagar and definitely appreciate the timings and snakeviz along the way. @TomAugspurger do you think this PR is worth running against the dask benchmarks to ensure a few things:

1. we haven't broken anything
2. we either have improved performance or, at worst, kept performance the same without degradation?

I am only asking for an opinion, not for you to do this work.
A new benchmark at dask-benchmarks demonstrating the performance gain here
would be great to have, to help ensure that we don't regress on this gain.
For testing, you can change the path to dask in `dask/asv.conf.json`. I
have my layout like
```
dask/
dask-benchmarks/
```
so
```diff
diff --git a/dask/asv.conf.json b/dask/asv.conf.json
index 147e999..2b3f673 100644
--- a/dask/asv.conf.json
+++ b/dask/asv.conf.json
@@ -11,7 +11,7 @@
// The URL or local path of the source code repository for the
// project being benchmarked
- "repo": "https://github.com/dask/dask.git",
+ "repo": "../../dask",
// List of branches to benchmark. If not provided, defaults to "master"
// (for git) or "default" (for mercurial).
```
then you can compare master against your HEAD to see the improvement.
Co-authored-by: Keith Kraus <keith.j.kraus@gmail.com>
Here are the ASV results.

Master:

[ 0.00%] · For dask commit edd92d58:
[ 0.00%] ·· Benchmarking conda-py3.7-cloudpickle-distributed-fsspec-numpy-pandas-partd-pip+cityhash-pip+xxhash-pyarrow-pytables-s3fs-scipy-toolz
[ 1.09%] ··· Running (array.BlockInfoBlockwise.time_compute--)..........
[ 11.96%] ··· Running (array.Slicing.time_slice_slice_tail--).......
[ 19.57%] ··· Running (dataframe.MemoryDataFrame.time_reduction--).........
[ 29.35%] ··· Running (order.OrderCholesky.time_order_cholesky_lower--)....
[ 33.70%] ··· Running (order.OrderLinalgSolves.time_order_linalg_solve--)...
[ 36.96%] ··· Running (order.OrderManySubgraphs.time_order_many_subgraphs--)..
[ 39.13%] ··· Running (order.OrderRechunkTranspose.time_order_rechunk_transpose--).....
[ 44.57%] ··· Setting up io.py:33 ok
[ 44.57%] ··· Running (io.CSV.time_read_csv--)..
[ 46.74%] ··· Setting up io.py:58 ok
[ 46.74%] ··· Running (io.HDF5.time_read_hdf5--)..
[ 48.91%] ··· Setting up io.py:85 ok
[ 48.91%] ··· Running (io.Parquet.time_optimize_getitem--)..
[ 51.09%] ··· array.BlockInfoBlockwise.time_compute 3.67±0.02s
[ 52.17%] ··· array.BlockInfoBlockwise.time_optimize 21.3±0.2ms
[ 53.26%] ··· array.BlockInfoSingleton.time_optimize_singleton 140±4μs
[ 54.35%] ··· array.Blockwise.time_make_blockwise_graph 2.70±0.02s
[ 55.43%] ··· array.FancyIndexing.time_fancy 28.8±0.2ms
[ 56.52%] ··· array.Rechunk.time_rechunk 25.0±0.6ms
[ 57.61%] ··· array.Rechunk.time_rechunk_meta 10.2±0.03ms
[ 58.70%] ··· array.Slicing.time_slice_int_head 8.85±0.4ms
[ 59.78%] ··· array.Slicing.time_slice_int_tail 10.6±0.6ms
[ 60.87%] ··· array.Slicing.time_slice_slice_head 20.7±1ms
[ 61.96%] ··· array.Slicing.time_slice_slice_tail 21.0±1ms
[ 63.04%] ··· array.Slicing.time_slices_from_chunks 1.19±0.01s
[ 64.13%] ··· array.TestSubs.time_subs 43.8±1ms
[ 65.22%] ··· array_overlap.MapOverlap.time_map_overlap ok
[ 65.22%] ··· ================= ========== ============
shape boundary
----------------- ---------- ------------
(100, 100, 100) reflect 62.1±0.7ms
(100, 100, 100) periodic 61.0±1ms
(100, 100, 100) nearest 62.4±0.8ms
(100, 100, 100) none 15.5±0.3ms
(50, 512, 512) reflect 69.8±2ms
(50, 512, 512) periodic 68.5±2ms
(50, 512, 512) nearest 71.2±2ms
(50, 512, 512) none 21.4±1ms
================= ========== ============
[ 66.30%] ··· dataframe.MemoryDataFrame.time_boolean_indexing 126±6ms
[ 67.39%] ··· dataframe.MemoryDataFrame.time_count_values 39.6±1ms
[ 68.48%] ··· dataframe.MemoryDataFrame.time_groupby 203±8ms
[ 69.57%] ··· dataframe.MemoryDataFrame.time_reduction 137±20ms
[ 70.65%] ··· dataframe.MemoryDataFrame.time_scalar_comparison 12.5±0.7ms
[ 71.74%] ··· dataframe.MemoryDataFrame.time_set_index 324±10ms
[ 72.83%] ··· optimization.Cull.time_cull 191±1ms
[ 73.91%] ··· optimization.Fuse.time_fuse ok
[ 73.91%] ··· ========= =========
param1
--------- ---------
diamond 381±6ms
linear 175±3ms
========= =========
[ 75.00%] ··· optimization.Inline.time_inline_constants 43.2±0.5ms
[ 76.09%] ··· optimization.Inline.time_inline_functions 259±4ms
[ 77.17%] ··· optimization.Inline.time_inline_keys 168±4ms
[ 78.26%] ··· order.OrderCholesky.time_order_cholesky 496±4ms
[ 79.35%] ··· order.OrderCholesky.time_order_cholesky_lower 474±6ms
[ 80.43%] ··· ...r.OrderCholeskyMixed.time_order_cholesky_mixed 384±7ms
[ 81.52%] ··· ...rCholeskyMixed.time_order_cholesky_mixed_lower 354±10ms
[ 82.61%] ··· order.OrderFullLayers.time_order_full_layers ok
[ 82.61%] ··· ============ =========
param1
------------ ---------
(1, 50000) 513±6ms
(2, 10000) 383±3ms
(10, 1000) 372±3ms
(100, 20) 416±3ms
(500, 2) 571±6ms
(9999, 1) 115±2ms
(50000, 1) 397±2ms
============ =========
[ 83.70%] ··· order.OrderLinalgSolves.time_order_linalg_solve 340±6ms
[ 84.78%] ··· order.OrderLinearFull.time_order_linear_full 1.48±0.01s
[ 85.87%] ··· ...rLinearWithDanglers.time_order_linear_danglers ok
[ 85.87%] ··· ============ =========
param1
------------ ---------
(2, 10000) 191±2ms
(5, 5000) 265±2ms
============ =========
[ 86.96%] ··· ...r.OrderManySubgraphs.time_order_many_subgraphs ok
[ 86.96%] ··· =========== =========
param1
----------- ---------
(1, 9999) 233±4ms
(3, 3333) 239±6ms
(10, 999) 258±4ms
(30, 303) 255±1ms
(100, 99) 277±3ms
(999, 10) 295±6ms
=========== =========
[ 88.04%] ··· order.OrderMapOverlap.time_order_mapoverlap ok
[ 88.04%] ··· =========================================== =========
param1
------------------------------------------- ---------
((10000.0, 10000.0), (200, 200), 1) 666±5ms
((1000, 1000, 1000), (100, 100, 100), 10) 707±5ms
=========================================== =========
[ 89.13%] ··· ...rRechunkTranspose.time_order_rechunk_transpose 797±10ms
[ 90.22%] ··· order.OrderSVD.time_order_svd 83.8±0.2ms
[ 91.30%] ··· tokenize.TokenizeBuiltins.time_tokenize 197±1ms
[ 92.39%] ··· tokenize.TokenizeNumpy.time_tokenize ok
[ 92.39%] ··· ======== =============
dtype
-------- -------------
int 606±3μs
float 602±10μs
str 2.43±0.05ms
bytes 447±9μs
object 91.8±2ms
======== =============
[ 93.48%] ··· tokenize.TokenizePandas.time_tokenize ok
[ 93.48%] ··· ===================== ============= =============
-- as_series
--------------------- ---------------------------
dtype True False
===================== ============= =============
period 62.7±1μs 29.8±0.7μs
datetime64[ns] 59.9±2μs 36.6±0.5μs
datetime64[ns, CET] 62.0±0.9μs 32.2±0.1μs
int 64.9±0.08μs 24.3±1μs
category 202±2μs 48.5±0.3μs
sparse 68.6±0.2μs 30.3±0.05μs
Int 1.23±0.01ms 1.19±0.01ms
string 465±4μs 411±6μs
boolean 871±10μs 820±1μs
===================== ============= =============
[ 94.57%] ··· io.CSV.time_read_csv ok
[ 94.57%] ··· ================= ============
param1
----------------- ------------
single-threaded 279±4ms
processes 1.96±0.02s
threads 306±2ms
================= ============
[ 95.65%] ··· io.CSV.time_read_csv_meta ok
[ 95.65%] ··· ================= ============
param1
----------------- ------------
single-threaded 28.3±0.1ms
processes 29.2±0.4ms
threads 28.7±0.3ms
================= ============
[ 96.74%] ··· io.HDF5.time_read_hdf5 ok
[ 96.74%] ··· ================= =========
param1
----------------- ---------
single-threaded 393±3ms
processes n/a
threads 409±9ms
================= =========
[ 97.83%] ··· io.HDF5.time_read_hdf5_meta ok
[ 97.83%] ··· ================= ===========
param1
----------------- -----------
single-threaded 258±0.9ms
processes n/a
threads 259±0.6ms
================= ===========
[ 98.91%] ··· io.Parquet.time_optimize_getitem 37.9±0.9ms
[100.00%] ··· io.Parquet.time_read_getitem_projection 693±20ms

This PR:

root@dt07:/cudf/dask-benchmarks/dask# asv run
· Creating environments
· Discovering benchmarks
·· Uninstalling from conda-py3.7-cloudpickle-distributed-fsspec-numpy-pandas-partd-pip+cityhash-pip+xxhash-pyarrow-pytables-s3fs-scipy-toolz.
·· Building 14708abf for conda-py3.7-cloudpickle-distributed-fsspec-numpy-pandas-partd-pip+cityhash-pip+xxhash-pyarrow-pytables-s3fs-scipy-toolz...
·· Installing 14708abf into conda-py3.7-cloudpickle-distributed-fsspec-numpy-pandas-partd-pip+cityhash-pip+xxhash-pyarrow-pytables-s3fs-scipy-toolz..
· Running 46 total benchmarks (1 commits * 1 environments * 46 benchmarks)
[ 0.00%] · For dask commit 14708abf:
[ 0.00%] ·· Benchmarking conda-py3.7-cloudpickle-distributed-fsspec-numpy-pandas-partd-pip+cityhash-pip+xxhash-pyarrow-pytables-s3fs-scipy-toolz
[ 1.09%] ··· Running (array.BlockInfoBlockwise.time_compute--)........
[ 9.78%] ··· Running (array.Slicing.time_slice_int_tail--)........
[ 18.48%] ··· Running (dataframe.MemoryDataFrame.time_groupby--)......
[ 25.00%] ··· Running (optimization.Inline.time_inline_constants--)......
[ 31.52%] ··· Running (order.OrderCholeskyMixed.time_order_cholesky_mixed_lower--)...
[ 34.78%] ··· Running (order.OrderLinearFull.time_order_linear_full--)....
[ 39.13%] ··· Running (order.OrderRechunkTranspose.time_order_rechunk_transpose--).....
[ 44.57%] ··· Setting up io.py:33 ok
[ 44.57%] ··· Running (io.CSV.time_read_csv--)..
[ 46.74%] ··· Setting up io.py:58 ok
[ 46.74%] ··· Running (io.HDF5.time_read_hdf5--)..
[ 48.91%] ··· Setting up io.py:85 ok
[ 48.91%] ··· Running (io.Parquet.time_optimize_getitem--)..
[ 51.09%] ··· array.BlockInfoBlockwise.time_compute 3.62±0.02s
[ 52.17%] ··· array.BlockInfoBlockwise.time_optimize 21.3±0.7ms
[ 53.26%] ··· array.BlockInfoSingleton.time_optimize_singleton 134±3μs
[ 54.35%] ··· array.Blockwise.time_make_blockwise_graph 2.63±0.01s
[ 55.43%] ··· array.FancyIndexing.time_fancy 27.7±0.8ms
[ 56.52%] ··· array.Rechunk.time_rechunk 26.4±0.9ms
[ 57.61%] ··· array.Rechunk.time_rechunk_meta 10.2±0.1ms
[ 58.70%] ··· array.Slicing.time_slice_int_head 10.6±0.5ms
[ 59.78%] ··· array.Slicing.time_slice_int_tail 10.5±0.6ms
[ 60.87%] ··· array.Slicing.time_slice_slice_head 20.3±1ms
[ 61.96%] ··· array.Slicing.time_slice_slice_tail 20.4±0.9ms
[ 63.04%] ··· array.Slicing.time_slices_from_chunks 1.16±0.01s
[ 64.13%] ··· array.TestSubs.time_subs 43.6±0.4ms
[ 65.22%] ··· array_overlap.MapOverlap.time_map_overlap ok
[ 65.22%] ··· ================= ========== ============
shape boundary
----------------- ---------- ------------
(100, 100, 100) reflect 61.5±0.7ms
(100, 100, 100) periodic 62.2±0.6ms
(100, 100, 100) nearest 62.7±0.6ms
(100, 100, 100) none 15.7±0.3ms
(50, 512, 512) reflect 69.3±2ms
(50, 512, 512) periodic 69.9±2ms
(50, 512, 512) nearest 70.6±1ms
(50, 512, 512) none 23.1±1ms
================= ========== ============
[ 66.30%] ··· dataframe.MemoryDataFrame.time_boolean_indexing 132±9ms
[ 67.39%] ··· dataframe.MemoryDataFrame.time_count_values 41.0±0.8ms
[ 68.48%] ··· dataframe.MemoryDataFrame.time_groupby 201±9ms
[ 69.57%] ··· dataframe.MemoryDataFrame.time_reduction 133±4ms
[ 70.65%] ··· dataframe.MemoryDataFrame.time_scalar_comparison 12.4±0.6ms
[ 71.74%] ··· dataframe.MemoryDataFrame.time_set_index 326±6ms
[ 72.83%] ··· optimization.Cull.time_cull 172±1ms
[ 73.91%] ··· optimization.Fuse.time_fuse ok
[ 73.91%] ··· ========= =========
param1
--------- ---------
diamond 372±2ms
linear 175±2ms
========= =========
[ 75.00%] ··· optimization.Inline.time_inline_constants 42.7±0.6ms
[ 76.09%] ··· optimization.Inline.time_inline_functions 265±5ms
[ 77.17%] ··· optimization.Inline.time_inline_keys 171±3ms
[ 78.26%] ··· order.OrderCholesky.time_order_cholesky 488±9ms
[ 79.35%] ··· order.OrderCholesky.time_order_cholesky_lower 466±3ms
[ 80.43%] ··· ...r.OrderCholeskyMixed.time_order_cholesky_mixed 376±7ms
[ 81.52%] ··· ...rCholeskyMixed.time_order_cholesky_mixed_lower 337±3ms
[ 82.61%] ··· order.OrderFullLayers.time_order_full_layers ok
[ 82.61%] ··· ============ =========
param1
------------ ---------
(1, 50000) 506±8ms
(2, 10000) 365±6ms
(10, 1000) 369±2ms
(100, 20) 419±3ms
(500, 2) 575±6ms
(9999, 1) 119±2ms
(50000, 1) 400±4ms
============ =========
[ 83.70%] ··· order.OrderLinalgSolves.time_order_linalg_solve 334±2ms
[ 84.78%] ··· order.OrderLinearFull.time_order_linear_full 1.44±0.01s
[ 85.87%] ··· ...rLinearWithDanglers.time_order_linear_danglers ok
[ 85.87%] ··· ============ ===========
param1
------------ -----------
(2, 10000) 189±2ms
(5, 5000) 268±0.7ms
============ ===========
[ 86.96%] ··· ...r.OrderManySubgraphs.time_order_many_subgraphs ok
[ 86.96%] ··· =========== =========
param1
----------- ---------
(1, 9999) 228±2ms
(3, 3333) 240±3ms
(10, 999) 257±7ms
(30, 303) 252±6ms
(100, 99) 275±3ms
(999, 10) 284±2ms
=========== =========
[ 88.04%] ··· order.OrderMapOverlap.time_order_mapoverlap ok
[ 88.04%] ··· =========================================== =========
param1
------------------------------------------- ---------
((10000.0, 10000.0), (200, 200), 1) 650±7ms
((1000, 1000, 1000), (100, 100, 100), 10) 696±5ms
=========================================== =========
[ 89.13%] ··· ...rRechunkTranspose.time_order_rechunk_transpose 801±10ms
[ 90.22%] ··· order.OrderSVD.time_order_svd 84.0±0.5ms
[ 91.30%] ··· tokenize.TokenizeBuiltins.time_tokenize 192±0.9ms
[ 92.39%] ··· tokenize.TokenizeNumpy.time_tokenize ok
[ 92.39%] ··· ======== =============
dtype
-------- -------------
int 589±7μs
float 580±6μs
str 2.39±0.05ms
bytes 429±8μs
object 89.8±2ms
======== =============
[ 93.48%] ··· tokenize.TokenizePandas.time_tokenize ok
[ 93.48%] ··· ===================== ============ =============
-- as_series
--------------------- --------------------------
dtype True False
===================== ============ =============
period 62.1±1μs 30.4±0.02μs
datetime64[ns] 59.6±2μs 36.8±0.9μs
datetime64[ns, CET] 64.4±2μs 32.8±2μs
int 63.2±0.1μs 24.5±0.03μs
category 209±7μs 48.6±2μs
sparse 69.0±0.2μs 30.3±0.1μs
Int 1.23±0ms 1.20±0.02ms
string 463±6μs 409±7μs
boolean 862±1μs 826±8μs
===================== ============ =============
[ 94.57%] ··· io.CSV.time_read_csv ok
[ 94.57%] ··· ================= ============
param1
----------------- ------------
single-threaded 280±1ms
processes 2.00±0.01s
threads 308±2ms
================= ============
[ 95.65%] ··· io.CSV.time_read_csv_meta ok
[ 95.65%] ··· ================= =============
param1
----------------- -------------
single-threaded 29.0±0.1ms
processes 29.1±0.8ms
threads 28.3±0.09ms
================= =============
[ 96.74%] ··· io.HDF5.time_read_hdf5 ok
[ 96.74%] ··· ================= =========
param1
----------------- ---------
single-threaded 396±2ms
processes n/a
threads 409±4ms
================= =========
[ 97.83%] ··· io.HDF5.time_read_hdf5_meta ok
[ 97.83%] ··· ================= =========
param1
----------------- ---------
single-threaded 263±4ms
processes n/a
threads 265±5ms
================= =========
[ 98.91%] ··· io.Parquet.time_optimize_getitem 37.7±0.3ms
[100.00%] ··· io.Parquet.time_read_getitem_projection 691±20ms

Except for
@galipremsagar is there any chance you could organize the above into a table? I'm struggling to read the ASV output, so I imagine others will as well. :)

Apologies for not organizing it in a table in the first place.
array_overlap.MapOverlap.time_map_overlap
optimization.Fuse.time_fuse
order.OrderFullLayers.time_order_full_layers
...rLinearWithDanglers.time_order_linear_danglers
...r.OrderManySubgraphs.time_order_many_subgraphs
order.OrderMapOverlap.time_order_mapoverlap
tokenize.TokenizeNumpy.time_tokenize
tokenize.TokenizePandas.time_tokenize
io.CSV.time_read_csv
io.CSV.time_read_csv_meta
io.HDF5.time_read_hdf5
io.HDF5.time_read_hdf5_meta
Thanks @galipremsagar!
When I scanned through this, only 3 items stood out, and even then they are fairly close:

If someone else has time, it would be good to recheck whether there are any other outliers in the ASV benchmarks.

Just FYI, our ASV benchmarks likely aren't sensitive enough to pick up the gains here. For context, in the pandas benchmarks, where I watch for regressions, I ignore any differences smaller than ~30-40%. We might be able to construct a micro-benchmark that targets just the graph-construction improvements here and have a reasonable chance of seeing them.
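A graph-construction micro-benchmark in ASV's format might look like the sketch below. The class is hypothetical (it is not part of dask-benchmarks), and it times a mock graph build using plain dicts rather than importing dask, so only the construction cost is isolated. ASV discovers `time_*` methods and calls `setup` with each parameter value before timing:

```python
# Hypothetical ASV micro-benchmark sketch -- not an actual dask-benchmarks
# class. ASV times each time_* method after calling setup() for each param.

class ShuffleGraphConstruction:
    # ASV runs the benchmark once per value in params
    params = [1_000, 10_000]
    param_names = ["npartitions"]

    def setup(self, npartitions):
        self.token = "deadbeef"  # stands in for dask's tokenize() output

    def time_build_graph(self, npartitions):
        # Mock of shuffle-style graph construction: the label is concatenated
        # once, then one task key is created per partition
        shuffle_name = "shuffle-join-" + self.token
        return {(shuffle_name, i): ("shuffle-task", i)
                for i in range(npartitions)}
```

With the `repo` path change suggested earlier in the thread, `asv continuous` between master and the PR branch would then report whether a benchmark like this regressed.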
Thanks @TomAugspurger! I think for this PR we are good to merge. However, we'd like to pick up adding a micro-benchmark for graph construction, if that's ok with you?
Yeah, I think so (haven't looked closely at the latest changes, but what I saw earlier looked fine).
Thanks. I'll plan to merge tomorrow morning unless anyone else has concerns.

Thanks again, all. Merging in.
While profiling an internal cudf code sample, we found that there are some redundant computations being performed in nested for loops. Here is a minimal profile:

`black dask` / `flake8 dask`