Commit c0ea2aa

gc debug no profiling
Killed this after ~5 min. The scheduler could barely open the dashboard, so turning on GC debug clearly affected something.
1 parent ac61e5f commit c0ea2aa

5 files changed, 117 additions and 9 deletions

dask_profiling_coiled/run_profile.py

Lines changed: 6 additions & 6 deletions
@@ -91,20 +91,20 @@ def main():
     # print("Disabling GC on scheduler")
     # client.run_on_scheduler(disable_gc)
 
-    # def enable_gc_debug():
-    #     import gc
+    def enable_gc_debug():
+        import gc
 
-    #     gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE)
+        gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_COLLECTABLE | gc.DEBUG_UNCOLLECTABLE)
 
-    # print("Enabling GC debug logging on scheduler")
-    # client.run_on_scheduler(enable_gc_debug)
+    print("Enabling GC debug logging on scheduler")
+    client.run_on_scheduler(enable_gc_debug)
 
     print("Here we go!")
 
     # This is key---otherwise we're uploading ~300MiB of graph to the scheduler
     dask.config.set({"optimization.fuse.active": False})
 
-    test_name = "cython-shuffle-gc-noprofiling-env"
+    test_name = "cython-shuffle-gc-debug-noprofiling"
     with (
         distributed.performance_report(f"results/{test_name}.html"),
         pyspy_on_scheduler(
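
The commit message reports that the scheduler became barely responsive with GC debug enabled. One plausible contributor, stated here as an assumption and not something the commit verifies, is that `gc.DEBUG_COLLECTABLE` prints a line to stderr for every collectable object found. A lighter-weight way to observe collections is the standard-library `gc.callbacks` hook. The sketch below is not the author's method; `GCTimer` and `install_gc_timer` are hypothetical names introduced for illustration.

```python
import gc
import time


class GCTimer:
    """Accumulate wall-clock time spent in garbage collection, per generation."""

    def __init__(self):
        self.total_seconds = {0: 0.0, 1: 0.0, 2: 0.0}
        self.counts = {0: 0, 1: 0, 2: 0}
        self._start = None

    def __call__(self, phase, info):
        # The interpreter calls this with phase "start" before a collection
        # and "stop" after it; info["generation"] is the generation collected.
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            generation = info["generation"]
            self.total_seconds[generation] += time.perf_counter() - self._start
            self.counts[generation] += 1
            self._start = None


def install_gc_timer():
    """Register a GCTimer with the interpreter and return it (hypothetical helper)."""
    timer = GCTimer()
    gc.callbacks.append(timer)
    return timer
```

A function like `install_gc_timer` could be passed to `client.run_on_scheduler` in the same way `enable_gc_debug` is above, although retrieving the accumulated totals afterwards would need extra plumbing that is not shown here.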

environment.yml

Lines changed: 1 addition & 2 deletions
@@ -28,5 +28,4 @@ dependencies:
     # - git+https://github.com/gjoseph92/scheduler-profilers.git # TODO this conflicts with --install-option for distributed, using postBuild instead
     # - git+https://github.com/gjoseph92/dask-noop.git
 variables:
-  DASK_DISTRIBUTED__WORKER__PROFILE__INTERVAL: 2h
-  DASK_DISTRIBUTED__WORKER__PROFILE__CYCLE: 10h
+  DASK_CONFIG: dask.yaml
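
The contents of `dask.yaml` are not shown in this commit. As a rough illustration of how the two removed variables correspond to nested config keys, the sketch below imitates Dask's documented naming convention (strip the `DASK_` prefix, lowercase, split on double underscores). It is a simplified stand-in, not Dask's own implementation, and `env_to_nested` is a hypothetical name.

```python
def env_to_nested(env):
    """Map DASK_-prefixed environment variables to a nested config dict.

    Simplified imitation of Dask's convention; values are kept as strings.
    """
    result = {}
    for name, value in env.items():
        if not name.startswith("DASK_"):
            continue
        keys = name[len("DASK_"):].lower().split("__")
        node = result
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return result


removed_variables = {
    "DASK_DISTRIBUTED__WORKER__PROFILE__INTERVAL": "2h",
    "DASK_DISTRIBUTED__WORKER__PROFILE__CYCLE": "10h",
}
nested = env_to_nested(removed_variables)
```

Under this convention, an equivalent YAML file would nest `interval` and `cycle` under `distributed`, `worker`, `profile`.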

make-coiled-env.sh

Lines changed: 21 additions & 1 deletion
@@ -1,10 +1,30 @@
 #!/bin/bash
 
-# Install py-spy separately so it doesn't conflict with Cythonized distributed
+# Install py-spy separately so it doesn't conflict with Cythonized distributed.
+# Also add dask config.
+
+# HACK: Coiled offers no easy way to add auxiliary data files---or a dask config---in software environments,
+# so we generate a post-build shell script that has the contents of `dask.yaml` within itself, and writes
+# those contents out when executed.
+OUT_CONFIG_PATH="~/.config/dask/dask.yaml"
+YAML_CONTENTS=$(<dask.yaml)
 cat > postbuild.sh <<EOF
 #!/bin/bash
 
 python3 -m pip install git+https://github.com/gjoseph92/scheduler-profilers.git@8d59e7f8b2ab59e22f0937557fefe388eac6ea61
+
+OUT_CONFIG_PATH=$OUT_CONFIG_PATH
+# ^ NOTE: no quotes, so ~ expands (https://stackoverflow.com/a/32277036)
+mkdir -p \$(dirname \$OUT_CONFIG_PATH)
+
+cat > \$OUT_CONFIG_PATH <<INNER_EOF
+$YAML_CONTENTS
+INNER_EOF
+
+echo "export DASK_CONFIG=\$OUT_CONFIG_PATH" >> ~/.bashrc
+
+echo "Wrote dask config to \$OUT_CONFIG_PATH:"
+cat \$OUT_CONFIG_PATH
 EOF
 coiled env create -n profiling --conda environment.yml --post-build postbuild.sh
 rm postbuild.sh
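
The nested heredocs above rely on careful escaping: unescaped `$` is expanded when `postbuild.sh` is generated, and escaped `\$` is expanded when it runs. The same "embed a config file in a generated script" idea can be sketched in Python, where the two stages are easier to keep apart. This is an alternative sketch, not the script from this commit. `render_postbuild` is a hypothetical name, and the use of `$HOME` and a quoted heredoc delimiter (which stops the shell from expanding anything inside the YAML) are choices made here.

```python
def render_postbuild(yaml_contents, out_path="$HOME/.config/dask/dask.yaml"):
    """Return the text of a post-build script that writes yaml_contents to out_path."""
    delimiter = "INNER_EOF"
    if delimiter in yaml_contents.splitlines():
        raise ValueError("config contains the heredoc delimiter on its own line")
    lines = [
        "#!/bin/bash",
        "",
        f'OUT_CONFIG_PATH="{out_path}"',
        'mkdir -p "$(dirname "$OUT_CONFIG_PATH")"',
        "",
        # Quoting the delimiter makes the shell copy the body verbatim.
        f"cat > \"$OUT_CONFIG_PATH\" <<'{delimiter}'",
        yaml_contents.rstrip("\n"),
        delimiter,
        "",
        'echo "export DASK_CONFIG=$OUT_CONFIG_PATH" >> "$HOME/.bashrc"',
    ]
    return "\n".join(lines) + "\n"


example_script = render_postbuild("distributed:\n  worker:\n    profile:\n      interval: 2h\n")
```

The returned string would be written to `postbuild.sh` before calling `coiled env create`, as the shell script above does. The pip install line is omitted to keep the sketch focused on the config file.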

results/cython-shuffle-gc-debug-noprofiling.html

Lines changed: 88 additions & 0 deletions
Large diffs are not rendered by default.

results/cython-shuffle-gc-debug-noprofiling.json

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.
