[autograd] lower MAX_DEPTH limit according to TSAN limit #36745

Closed
wanchaol wants to merge 2 commits into gh/wanchaol/100/base from gh/wanchaol/100/head

@wanchaol (Collaborator) commented Apr 16, 2020

Stack from ghstack:

Because we hold a mutex for each custom C++ Node, calling reentrant
backward from a custom C++ function means we concurrently hold many
mutexes, up to MAX_DEPTH. TSAN only allows 65 mutexes to be held at once
and complains otherwise. This PR lowers the limit to fit within TSAN's.

TSAN Reference: google/sanitizers#950

Differential Revision: D21072604
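
To make the description above concrete, here is a minimal sketch of how nested reentrant calls can end up holding one mutex per level. Everything in it is hypothetical and for illustration only: `ToyNode`, `kMaxDepth`, `reentrant_backward`, and `peak_locks_held` are not PyTorch names, and simply stopping the recursion at the cap is an assumption rather than what the engine does.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a custom C++ autograd Node that guards its
// state with a mutex; this is not PyTorch's actual Node class.
struct ToyNode {
  std::mutex mutex;
};

// Mirrors the MAX_DEPTH value proposed in this PR.
constexpr int kMaxDepth = 60;

// Simulates reentrant backward: each level locks its own node, then
// re-enters on the next node while still holding that lock.
// Returns the number of mutexes held at the deepest point reached.
int reentrant_backward(std::vector<std::unique_ptr<ToyNode>>& nodes,
                       std::size_t index, int depth) {
  if (index >= nodes.size() || depth >= kMaxDepth) {
    // `depth` locks are currently held by the enclosing frames.
    return depth;
  }
  std::lock_guard<std::mutex> guard(nodes[index]->mutex);
  return reentrant_backward(nodes, index + 1, depth + 1);
}

// Builds a chain of nodes and reports the peak number of held mutexes.
int peak_locks_held(int chain_length) {
  std::vector<std::unique_ptr<ToyNode>> nodes;
  for (int i = 0; i < chain_length; ++i) {
    nodes.push_back(std::make_unique<ToyNode>());
  }
  return reentrant_backward(nodes, 0, 0);
}
```

In this sketch, `peak_locks_held(100)` is bounded by `kMaxDepth`, so the number of simultaneously held mutexes stays below the TSAN bound quoted in the description.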

@albanD (Collaborator) left a comment


LGTM, thanks for the PR.
Let's make sure it fixes the tsan test before merging.

// As we hold mutex in every of our custom C++ autograd Node, we would
// like to avoid TSAN complains on this when doing reentrant backwards
// For reference, see https://github.com/google/sanitizers/issues/950
static constexpr int MAX_DEPTH = 60;
Contributor


Should this value be different based on whether we have TSAN enabled?

Collaborator


This should not have any meaningful impact on perf. So I would lean toward consistency between all the builds :D
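
For reference, the build-dependent alternative raised in this thread could look roughly like the sketch below. It is not what this PR does (the PR keeps one value for all builds). The macro `TOY_TSAN_ENABLED`, the function `max_depth`, and the constants are hypothetical names, and `100` is only an illustrative placeholder for a non-TSAN limit, not a value taken from PyTorch. The detection relies on Clang's `__has_feature(thread_sanitizer)` and GCC's `__SANITIZE_THREAD__`.

```cpp
#include <cassert>

// Detect ThreadSanitizer at compile time. GCC defines __SANITIZE_THREAD__;
// Clang exposes __has_feature(thread_sanitizer).
#if defined(__SANITIZE_THREAD__)
#define TOY_TSAN_ENABLED 1
#elif defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define TOY_TSAN_ENABLED 1
#endif
#endif

#ifndef TOY_TSAN_ENABLED
#define TOY_TSAN_ENABLED 0
#endif

// 60 is the value from this PR. 100 is an illustrative placeholder
// (assumption), not a value taken from the PyTorch source.
constexpr int kTsanSafeDepth = 60;
constexpr int kDefaultDepth = 100;

// Picks the depth limit for a given build flavor.
constexpr int max_depth(bool tsan_enabled) {
  return tsan_enabled ? kTsanSafeDepth : kDefaultDepth;
}

// The limit this translation unit would actually compile with.
constexpr int kMaxDepth = max_depth(TOY_TSAN_ENABLED != 0);
```

The trade-off discussed above is visible here: two limits mean TSAN builds and regular builds exercise different reentrant depths, which is the inconsistency the single constant avoids.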

wanchaol added a commit that referenced this pull request Apr 16, 2020
ghstack-source-id: de61c26
Pull Request resolved: #36745

dr-ci Bot commented Apr 16, 2020

💊 Build failures summary and remediations

As of commit 4505f04 (more details on the Dr. CI page):


  • 1/1 failures possibly* introduced in this PR
    • 1/1 non-CircleCI failure(s)


@facebook-github-bot (Contributor)

@wanchaol merged this pull request in 6d4c509.

@facebook-github-bot deleted the gh/wanchaol/100/head branch April 20, 2020 14:17
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Pull Request resolved: pytorch#36745

Because we hold a mutex for each custom C++ Node, calling reentrant
backward from a custom C++ function means we concurrently hold many
mutexes, up to MAX_DEPTH. TSAN only allows 65 mutexes to be held at once
and complains otherwise. This PR lowers the limit to fit within TSAN's.

TSAN Reference: google/sanitizers#950

Test Plan: Imported from OSS

Differential Revision: D21072604

Pulled By: wanchaol

fbshipit-source-id: 99cd1acab41a203d834fa4947f4e6f0ffd2e70f2