Skip to content

Fix some leaks in systhreads#11406

Merged
gasche merged 5 commits intoocaml:trunkfrom
gasche:threads-leaks
Jul 19, 2022
Merged

Fix some leaks in systhreads#11406
gasche merged 5 commits intoocaml:trunkfrom
gasche:threads-leaks

Conversation

@gasche
Copy link
Copy Markdown
Member

@gasche gasche commented Jul 6, 2022

This PR, joint work with @Engil and @fabbing, fixes some leaks in threads that we hope are the leak issues reported in #11289.

@toots, could you try on trunk (or 5.0) and tell us if you still observe leaks with this PR?

We wonder who would be interested in reviewing this. Maybe @gadmm, @xavierleroy or @sadiqj?

@gadmm
Copy link
Copy Markdown
Contributor

gadmm commented Jul 6, 2022

I had a quick look, overall this looks good and makes sense to me. Not a formal review yet but let me know if you need one from me. I am also reassured that there is no overlap with the other PRs on systhreads.

The issue was originally for 4.13, is the fix substantially different there? Do we know what introduced the issue?

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 6, 2022

Not a formal review yet but let me know if you need one from me.

If you are available and interested to do a review, then yes a review would be welcome. (I think it could definitely help other people decide to approve the PR.)

We haven't investigated the 4.13 systhreads codebase, so I don't know where leaks are there and how 4.14 fixed them. My guess is that the two implementations are very different, and the sources of leak are probably different as well.

gasche and others added 4 commits July 7, 2022 10:19
Co-authored-by: Enguerrand Decorne <decorne.en@gmail.com>
Co-authored-by: Fabrice Buoro <fabrice@tarides.com>
Co-authored-by: Enguerrand Decorne <decorne.en@gmail.com>
Co-authored-by: Fabrice Buoro <fabrice@tarides.com>
Co-authored-by: Enguerrand Decorne <decorne.en@gmail.com>
Co-authored-by: Fabrice Buoro <fabrice@tarides.com>
(there are probably other resource leaks in domain_terminate,
but those are reused from the work on thread cleanup.)

Co-authored-by: Enguerrand Decorne <decorne.en@gmail.com>
Co-authored-by: Fabrice Buoro <fabrice@tarides.com>
@toots
Copy link
Copy Markdown
Contributor

toots commented Jul 10, 2022

Thanks for working on this y'all. I'm happy to report that the leak seems fixed on my end as well:

Memory

caml_free_stack(domain_state->current_stack);
}
caml_free_backtrace_buffer(domain_state->backtrace_buffer);
caml_free_gc_regs_buckets(domain_state->gc_regs_buckets);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part is cleaning up some resource on the main domain thread and is not strictly speaking part of the systhread leak, isn't it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed. (This is a separated fourth commit.) We thought it made more sense to employ the new resource-cleanup in both places where they initialized, for threads and for domains. (Otherwise the code ends up in an arguably inconsistent state.)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for cleaning up memory at domain exit too, not just thread exit. The title of this PR and the Changes entry could reflect this fact.

caml_stat_free(gc_regs_buckets);
gc_regs_buckets = next;
}
}
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: @Octachron and myself discussed this offline and we decided that it would be nice to have a documentation comment that explains what gc_regs and gc_regs_buckets are. I may submit this as a separate (complementary) PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, some documentation about gc_regs_buckets would be great. I know what gc_regs does in OCaml 4, but I'm a bit lost as to the OCaml 5 implementation of gc_regs, what with the allocation via malloc and so on.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record, I proposed some documentation of gc_regs and gc_regs_buckets in #11437.

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me overall, but I don't claim I understand the gc_regs_bucket business. Let's wait for @Octachron's approval before merging.

caml_free_stack(domain_state->current_stack);
}
caml_free_backtrace_buffer(domain_state->backtrace_buffer);
caml_free_gc_regs_buckets(domain_state->gc_regs_buckets);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for cleaning up memory at domain exit too, not just thread exit. The title of this PR and the Changes entry could reflect this fact.

caml_stat_free(gc_regs_buckets);
gc_regs_buckets = next;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, some documentation about gc_regs_buckets would be great. I know what gc_regs does in OCaml 4, but I'm a bit lost as to the OCaml 5 implementation of gc_regs, what with the allocation via malloc and so on.

@Octachron
Copy link
Copy Markdown
Member

As far as I understand, caml_thread_reinitialize should also free the same resource as caml_thread_remove_info when it frees the list of threads within the domain.

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 18, 2022

As far as I understand, caml_thread_reinitialize should also free the same resource as caml_thread_remove_info when it frees the list of threads within the domain.

My understanding is rather that only "pthread resources" (mutexes and what not) need to be reset after a fork, not memory which we assume was correctly duplicated. (A hint is that there is currently no code to cleanup the call stack in reinitialize, while there is one in remove_info.)

@Octachron
Copy link
Copy Markdown
Member

The reinitialize function is currently freeing the current stack too? Moreover, the function is also freeing the thread information structure, thus there is no pointer left pointing to the gc_reg_buffers or backtrace_buffer structures for secondary threads. What I am missing?

@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 18, 2022

Ah, right, the reinitialize function appears to be trying to cleanup all threads that are not the active thread. I don't know what the systhreads+fork combination is supposed to be doing, but clearly you are right and it should share the cleanup logic. Sorry for missing this earlier.

Suggested-by: Florian Angeletti <florian.angeletti@inria.fr>
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 18, 2022

@Octachron this should have been addressed in cacc1cd. Thanks!
( @Engil I wonder if you could confirm that this is the right approach. )

Copy link
Copy Markdown
Member

@Octachron Octachron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this PR fixes a leak.

@gasche gasche merged commit b8dbb53 into ocaml:trunk Jul 19, 2022
gasche added a commit that referenced this pull request Jul 19, 2022
Fix some leaks in systhreads

(cherry picked from commit b8dbb53)

The cherry-pick required solving a conflict coming from
  159f3f3
(PR #11390) which is not (yet?) merged in 5.0.
@gasche
Copy link
Copy Markdown
Member Author

gasche commented Jul 19, 2022

Merged, and cherry-picked in 5.0 as 29e912c. I had to resolve a conflict coming from #11390 (which is not (yet?) in 5.0).

@abbysmal
Copy link
Copy Markdown
Contributor

Sorry, I'm currently on leave (will be back on monday).
I think the change is correct, however I spotted one thing I think is strange?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants