🐛 Bug
Enabling autograd.profiler makes the benchmark nearly 1.8x slower (4.6s vs. 8.17s), i.e. roughly 78% overhead.
To Reproduce
- From the benchmarks dir, run `numactl -C 3 python -m fastrnns.bench --fuser=te --executor=profiling --group=rnns --rnns=jit_premul_bias` to get a baseline.
- Apply the following patch to instrument the bench with autograd.profiler: https://github.com/pytorch/pytorch/commit/dd4b3268605c055a1a8653f8554ccffc9e874044.patch (or `git cherry-pick dd4b3268605c055a1a8653f8554ccffc9e874044`).
- Re-run the bench (no recompilation is necessary, as the patch contains only Python changes).
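For a quick check outside fastrnns, the same baseline-vs-profiled comparison can be sketched with a small timing harness. This is only an illustration, not what the patch does: `time_run` and `workload` are hypothetical names, and `contextlib.nullcontext` stands in for the profiler so the snippet runs without PyTorch.

```python
import time
from contextlib import nullcontext


def time_run(fn, make_ctx=nullcontext, iters=10):
    """Return wall-clock seconds for `iters` calls of `fn` inside one context."""
    start = time.perf_counter()
    with make_ctx():
        for _ in range(iters):
            fn()
    return time.perf_counter() - start


def workload():
    # Hypothetical stand-in for one benchmark iteration.
    sum(i * i for i in range(10_000))


# Baseline: no profiling context.
baseline = time_run(workload)

# "Profiled" run: nullcontext is a placeholder here. To measure the real
# overhead, pass make_ctx=torch.autograd.profiler.profile and a PyTorch workload.
profiled = time_run(workload, make_ctx=nullcontext)

print(f"baseline={baseline:.4f}s profiled={profiled:.4f}s")
```

Wrapping the whole loop in a single context, rather than one context per iteration, keeps profiler setup and teardown from being counted once per call.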
Expected behavior
An overhead that's within 5-10%.
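For reference, this is how the reported timings compare against that target, assuming overhead is measured relative to the baseline run. `relative_overhead` is a hypothetical helper, not part of the benchmark.

```python
def relative_overhead(baseline_s, profiled_s):
    """Extra time spent under the profiler, as a percentage of the baseline."""
    return (profiled_s - baseline_s) / baseline_s * 100.0


# Timings reported in this issue: 4.6s without the profiler, 8.17s with it.
pct = relative_overhead(4.6, 8.17)

# Upper end of the expected 5-10% range.
within_target = pct <= 10.0

print(f"{pct:.1f}% overhead, within target: {within_target}")
# prints "77.6% overhead, within target: False"
```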
@ngimel @ilia-cher
cc @VitalyFedyunin @ngimel