Fixes for profiling JIT code #38453

Closed
ilia-cher wants to merge 11 commits into gh/ilia-cher/69/base from gh/ilia-cher/69/head


ilia-cher (Contributor) commented May 14, 2020

Stack from ghstack:

Summary:

  • The RecordFunction in the JIT interpreter should exist for the whole
    execution of the frame, not just at the moment we enter the frame.
  • When creating a JIT continuation in the wait instruction, we want to
    preserve the original thread-local context. Right now, when we resume
    execution in the continuation, we keep the thread-local state of the
    thread that set the future value (i.e. the thread that executed a
    forked task).
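
To make the first point concrete, here is a minimal sketch of the lifetime difference. It does not use the real at::RecordFunction or the interpreter's frame type: ScopedRecord and Frame are hypothetical stand-ins, and the "profiler" is just a vector of strings. It only assumes that the record marks its end in its destructor.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for at::RecordFunction: logs a begin event on
// construction and an end event on destruction.
struct ScopedRecord {
  ScopedRecord(std::vector<std::string>& log, std::string name)
      : log_(log), name_(std::move(name)) {
    log_.push_back("begin " + name_);
  }
  ~ScopedRecord() { log_.push_back("end " + name_); }
  ScopedRecord(const ScopedRecord&) = delete;
  ScopedRecord& operator=(const ScopedRecord&) = delete;

 private:
  std::vector<std::string>& log_;
  std::string name_;
};

// Hypothetical interpreter frame that owns its record.
struct Frame {
  std::unique_ptr<ScopedRecord> record;
};

// The record is a local of the "enter frame" step, so it ends before
// any instruction of the frame body has run.
std::vector<std::string> record_only_on_entry() {
  std::vector<std::string> log;
  { ScopedRecord r(log, "fn"); }  // enter frame
  log.push_back("op");            // frame body executes
  return log;
}

// The record is owned by the frame, so it ends when the frame is left.
std::vector<std::string> record_for_whole_frame() {
  std::vector<std::string> log;
  {
    Frame frame;
    frame.record = std::make_unique<ScopedRecord>(log, "fn");
    log.push_back("op");  // frame body executes while the record is alive
  }  // frame destroyed here, which ends the record
  return log;
}
```

In the first variant the "op" event lands after "end fn", so a profiler would not attribute it to the function; in the second it lands between "begin fn" and "end fn".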

Test Plan:
python test/test_jit.py TestJit.test_profiler
CI

Differential Revision: D21565959

ilia-cher requested review from albanD and apaszke as code owners May 14, 2020 01:40
facebook-github-bot added the "oncall: jit" label May 14, 2020

dr-ci (Bot) commented May 14, 2020

💊 CI failures summary and remediations

As of commit afe7020 (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 2/2 non-CircleCI failure(s)

Extra GitHub checks: 1 failed


ci.pytorch.org: 1 failed


This comment has been revised 36 times.

ilia-cher added 4 commits May 13, 2020 19:31
ilia-cher changed the title from "[wip] Fixes for profiling JIT code" to "Fixes for profiling JIT code" May 14, 2020
ilia-cher added 2 commits May 14, 2020 13:53

dzhulgakov (Collaborator) left a comment

Looks good, minor comments

Comment thread aten/src/ATen/Parallel.h Outdated

 namespace at {
-namespace internal {
+namespace internal {//

remove?

Comment thread torch/csrc/jit/runtime/interpreter.cpp Outdated
static std::atomic<size_t> num_frames;

// RecordFunction object associated with this frame
std::shared_ptr<at::RecordFunction> record_function;

looks like it can be a unique_ptr (or even c10::optional)
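
A rough sketch of what the optional variant of this suggestion could look like. It uses std::optional as a stand-in for c10::optional, and Record and Frame are hypothetical types, not the interpreter's. The point is only that a single-owner member can hold a non-copyable, non-movable RAII object without a heap allocation or reference count.

```cpp
#include <cassert>
#include <optional>

// Hypothetical non-copyable, non-movable RAII record that tracks how
// many records are currently alive.
struct Record {
  explicit Record(int& active) : active_(active) { ++active_; }
  ~Record() { --active_; }
  Record(const Record&) = delete;
  Record& operator=(const Record&) = delete;

 private:
  int& active_;
};

// Hypothetical frame: the record slot is empty until it is needed.
struct Frame {
  std::optional<Record> record;
};

// Returns the number of live records while the frame is still alive.
int active_while_frame_alive(int& active) {
  Frame frame;
  frame.record.emplace(active);  // constructed in place, no move needed
  return active;                 // value is read before frame is destroyed
}
```

Since the frame is the only owner, shared ownership buys nothing here; emplace() builds the record inside the frame and the frame's destructor ends it.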

-int64_t dist_autograd_context_id = 0)
-: state(state_), stack(std::move(stack_)) {
+int64_t dist_autograd_context_id = 0,
+c10::optional<at::ThreadLocalState> tls_state = c10::nullopt)

just curious - can you remind me the reason dist_autograd_context is not part of TLS? (I think there was one but I don't recall)

Contributor Author (ilia-cher):

I'd better refer you to #38510; there has been some ongoing discussion there.
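
For readers following the tls_state parameter in this thread, the idea behind the second fix can be sketched as follows. This is an illustration under stated assumptions, not the interpreter's code: tls_tag is a hypothetical thread-local standing in for whatever at::ThreadLocalState snapshots, TagGuard is a hypothetical restore guard, and a plain std::thread plays the thread that sets the future value.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <thread>
#include <utility>

// Hypothetical thread-local standing in for the profiler/autograd state
// that a thread-local snapshot would capture.
thread_local std::string tls_tag = "unset";

// RAII guard: installs a snapshot, restores the previous value on exit.
struct TagGuard {
  explicit TagGuard(std::string snapshot)
      : prev_(std::exchange(tls_tag, std::move(snapshot))) {}
  ~TagGuard() { tls_tag = std::move(prev_); }

 private:
  std::string prev_;
};

// Builds a continuation on the waiting thread. With capture_tls set, the
// continuation runs under the state of the thread that created it.
std::function<std::string()> make_continuation(bool capture_tls) {
  std::string snapshot = tls_tag;  // captured at "wait" time
  return [capture_tls, snapshot]() {
    if (!capture_tls) {
      return tls_tag;  // sees the state of whichever thread runs it
    }
    TagGuard guard(snapshot);
    return tls_tag;  // sees the state captured at "wait" time
  };
}

// Runs the continuation on a worker thread, which stands in for the
// thread that set the future value.
std::string run_on_worker(const std::function<std::string()>& continuation) {
  std::string seen;
  std::thread worker([&] {
    tls_tag = "worker";
    seen = continuation();
  });
  worker.join();
  return seen;
}
```

Without the snapshot, the continuation observes the worker's state; with it, the continuation observes the state of the thread that originally issued the wait, and the worker's own state is restored once the continuation returns.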

ilia-cher added 3 commits May 18, 2020 21:04

facebook-github-bot (Contributor) commented:

@ilia-cher merged this pull request in 235f624.

facebook-github-bot deleted the gh/ilia-cher/69/head branch May 23, 2020 14:16
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Pull Request resolved: pytorch#38453

Two fixes:
 - RecordFunction in JIT interpreter should exist during the execution
   of the frame, and not just when we enter the frame
 - When creating a JIT continuation in wait instruction, we'd want to
   preserve the original thread local context, right now when we resume
   execution in continuation we preserve the thread local state of the
   thread that set future value (i.e. executed a forked task)

Test Plan: unittest, CI

Reviewed By: ngimel

Differential Revision: D21565959

Pulled By: ilia-cher

fbshipit-source-id: 206b98e3bfb0052fc8e4031da778e372cc71afc1

Labels: Merged, oncall: jit

4 participants