Skip to content

Simply exit DataLoader when Python is dying#12700

Closed
ssnl wants to merge 3 commits intopytorch:masterfrom
ssnl:atexit_dl
Closed

Simply exit DataLoader when Python is dying#12700
ssnl wants to merge 3 commits intopytorch:masterfrom
ssnl:atexit_dl

Conversation

@ssnl
Copy link
Copy Markdown
Collaborator

@ssnl ssnl commented Oct 16, 2018

I struggled with yet another DataLoader hang for the entire evening. After numerous experiments, I realized that it is unsafe to do anything when Python is shutting down. We also unfortunately implement our DataLaoder cleaning-up logic in __del__, a function that may or may not be called during shutdown, and if called, may or may not be called before core library resources are freed.

Fortunately, we are already setting all our workers and pin_memory_thread as daemonic. So in case of Python shutting down, we can just do a no-op in __del__ and rely on the automatic termination of daemonic children.

An atexit hook is used to detect Python exit.

@ssnl
Copy link
Copy Markdown
Collaborator Author

ssnl commented Oct 16, 2018

Also, the general recommendation is to never use __del__ since it can't be relied upon. But I've yet to find a satisfying replacement. Any suggestion/idea is welcomed.

#
# 2. The iterator exits the workers when the loader process and/or worker
# processes exits unexpectedly (e.g., SIGKILL-ed).
# processes exits normally or with error.

This comment was marked as off-topic.

Copy link
Copy Markdown
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, love the commentary.

@ssnl
Copy link
Copy Markdown
Collaborator Author

ssnl commented Oct 17, 2018

last commit is only comment change and build passed on previous commit so I'm landing.

Copy link
Copy Markdown
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SsnL is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ssnl ssnl deleted the atexit_dl branch October 17, 2018 05:47
laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
I struggled with yet another DataLoader hang for the entire evening. After numerous experiments, I realized that it is unsafe to do anything when Python is shutting down. We also unfortunately implement our DataLaoder cleaning-up logic in `__del__`, a function that may or may not be called during shutdown, and if called, may or may not be called before core library resources are freed.

Fortunately, we are already setting all our workers and pin_memory_thread as daemonic. So in case of Python shutting down, we can just do a no-op in `__del__` and rely on the automatic termination of daemonic children.

An `atexit` hook is used to detect Python exit.
Pull Request resolved: pytorch#12700

Differential Revision: D10419027

Pulled By: SsnL

fbshipit-source-id: 5753e70d03e69eb1c9ec4ae2154252d51e2f79b0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants