Skip to content

Fix the invariant that fd->dispatcher thread exists if fd->h is set#1154

Closed
jmarshall wants to merge 1 commit intosamtools:developfrom
jmarshall:SAM_state-dispatcher
Closed

Fix the invariant that fd->dispatcher thread exists if fd->h is set#1154
jmarshall wants to merge 1 commit intosamtools:developfrom
jmarshall:SAM_state-dispatcher

Conversation

@jmarshall
Copy link
Copy Markdown
Member

[An alternative take on the pthread fd->dispatcher problem identified in #1153.]

Remove fd->dispatcher test as it makes unwarranted assumptions about the pthread_t type. In fact this code block assumes that fd->dispatcher is set up whenever fd->h has been set (see the unguarded pthread_join() further down the function).

Fix pthread_create() error handling accordingly. (Also use the canonical name for sam_hdr_destroy().)

This may be incomplete: it may be worth making the same change in the other nearby error handling; more significantly, the relationship between the thread pool in fd->q and the dispatcher thread is unclear, so resetting fd->h may not be quite right in some cases… maybe some of the fd->q handling needs to change too.

Remove `fd->dispatcher` test as it makes unwarranted assumptions about
the pthread_t type. In fact this code assumes that fd->dispatcher is
set up whenever fd->h has been set (see the unguarded pthread_join()).
Fix pthread_create() error handling accordingly.

(Also use the canonical name for sam_hdr_destroy().)
@jmarshall
Copy link
Copy Markdown
Member Author

jmarshall commented Oct 7, 2020

(Note that in the current develop code, if a pthread_create(&fd->dispatcher, …) fails, then the subsequent sam_close() calls sam_state_destroy() which calls pthread_join() on an indeterminate pthread ⇒ 💥 on an unfriendly pthread implementation.)

@jkbonfield
Copy link
Copy Markdown
Contributor

It's been hammered on linux, but agreed there are probably poor bits of code that don't work well on other platforms.

Getting the SAM threading to work with valid data and no errors is fine. The problems all come when you have a failure - malformed data, running out of memory, and so on. If I recall the line was added after a huge round of fuzzing where we died after the Nth malloc call, to test as many possible failure code paths as possible. The number of ways it could die (obviously) was huge. Things like failing to launch the thread would have been one of those that lead to deadlocks.

I don't want to simply undo that check, but yes it needs to cope with the thread state being a struct. Given SAM_state is internal it could just have a status integer to indicate if it's been initialised and use that instead of a boolean comparison on a pthread_t.

jkbonfield added a commit to jkbonfield/htslib that referenced this pull request Oct 28, 2020
Replaces samtools#1154

It does this by the addition of a separate variable to do the boolean
check on instead of fd->dispatcher itself.  I deemed this easier to
understand than overloading the interpretation of fd->h being set plus
I'm unsure of the potential failure case in sam_hdr_fill_hrecs where
fd->h has been set but we errored before creating the threads.

Also improves error recovering in case of pthread creation failure,
avoiding a false pthread_join later.  Thanks to John Marshall for
identifying that issue.
jkbonfield added a commit to jkbonfield/htslib that referenced this pull request Oct 28, 2020
Replaces samtools#1154
See also samtools#1153

It does this by the addition of a separate variable to do the boolean
check on instead of fd->dispatcher itself.  I deemed this easier to
understand than overloading the interpretation of fd->h being set plus
I'm unsure of the potential failure case in sam_hdr_fill_hrecs where
fd->h has been set but we errored before creating the threads.

Also improves error recovering in case of pthread creation failure,
avoiding a false pthread_join later.  Thanks to John Marshall and
Anders Kaplan for identifying that issue.
daviesrob pushed a commit to jkbonfield/htslib that referenced this pull request Nov 5, 2020
Replaces samtools#1154
See also samtools#1153

It does this by the addition of a separate variable to do the boolean
check on instead of fd->dispatcher itself.  I deemed this easier to
understand than overloading the interpretation of fd->h being set plus
I'm unsure of the potential failure case in sam_hdr_fill_hrecs where
fd->h has been set but we errored before creating the threads.

Also improves error recovering in case of pthread creation failure,
avoiding a false pthread_join later.  Thanks to John Marshall and
Anders Kaplan for identifying that issue.
@jmarshall jmarshall closed this Nov 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants