switch to strerror_r for reentrant error string conversion#10969
switch to strerror_r for reentrant error string conversion#10969avsm wants to merge 3 commits intoocaml:trunkfrom
Conversation
|
It looks like |
dra27
left a comment
There was a problem hiding this comment.
That's quite a nettle to have grasped 🙂
Various thoughts:
- We pepper around
_GNU_SOURCEat various points, so it unnerves me that there's a reliance on the definition ofstrerror_rthat we get in a header. That said, it looks like the whole ofsync.hshould be guarded withCAML_INTERNALSanyway, which would lessen that worry. - Given that paranoia, can we do
#ifdef _GNU_SOURCEtests before string.h is included with an#errorif it is defined? - For platform.h, I think a small refactoring can sort that -
caml_fatal_error_arg2is defined solely for its use (I think that's a multicore addition, but I didn't check the git-blame) - we could instead definecaml_fatal_strerroror something which takes theintargument and remove thestrerrorpart fromplatform.h- the definition ofcaml_fatal_strerrorcan then usestrerror_ras normal. - Nice to see an unusual instance of the Windows CRT function being ahead of the game 🤪
|
Also linking in #10505, as the POSIX 2001 part of that might therefore be relevant for 5.00.0 |
|
Thanks for tackling strerror.
How about defining a Some advantages:
Disadvantage:
It is very unlikely that strerror_r itself would fail in practice, so maybe the fallback code could just print |
|
Many thanks for the rapid comments, @dra27 @edwintorok. I do prefer the idea of a caml_safe_strerror function rather than peppering the source with ERANGE checks. One thought I had to simplify the caml_safe_strerror function is that we could give it a __thread local buffer to copy the results into. This would allow us to use the GNU variant (which might return a pointer to some other buffer) and also the Windows strerror would just work. I think this is safe since the only thing the runtime does with the strerror result is to blit it into some OCaml value right after. |
|
I like the idea of a thread local buffer, but if you use the GNU variant we should have some fallback code for non-Linux and non-Windows OSes (I'm assuming the POSIX |
| Caml_inline void sync_check_error(int retcode, char * msg) | ||
| { |
There was a problem hiding this comment.
Note to self: why is this (big) function inlined and declared in a .h while it's obviously not performance critical and could be static to sync.c ? Something needs to be done.
There was a problem hiding this comment.
Looks like the check_sync_error is used in more places than just sync.c, so it looks better to sort this out in a followup PR.
otherlibs/systhreads/st_posix.h
388: sync_check_error(retcode, "Thread.sigmask");
408: sync_check_error(retcode, "Thread.wait_signal");
otherlibs/systhreads/st_stubs.c
587: sync_check_error(err, "Thread.create");
592: sync_check_error(err, "Thread.create");
638: sync_check_error(err, "caml_register_c_thread");
729: sync_check_error(rc, "Thread.join");
764: sync_check_error(st_event_create(&ts), "Thread.create");
runtime/sync.c
100: sync_check_error(sync_mutex_create(&mut), "Mutex.create");
120: sync_check_error(retcode, "Mutex.lock");
130: sync_check_error(retcode, "Mutex.unlock");
140: sync_check_error(retcode, "Mutex.try_lock");
199: sync_check_error(sync_condvar_create(&cond), "Condition.create");
218: sync_check_error(retcode, "Condition.wait");
226: sync_check_error(sync_condvar_signal(Condition_val(wrapper)),
233: sync_check_error(sync_condvar_broadcast(Condition_val(wrapper)),
runtime/caml/sync.h
77:Caml_inline void sync_check_error(int retcode, char * msg)
I'm pretty sure many C libraries implement |
|
Re |
8963b9b to
363648e
Compare
This uses the glibc or POSIX variants of strerror_r on *nix, and strerror on Win32 (which is thread-safe).
|
This is ready for re-review now. It implements a |
| /* platform dependent best-effort thread naming */ | ||
| extern void caml_thread_setname(const char* name); | ||
|
|
||
| extern char *caml_strerror(int errnum); |
There was a problem hiding this comment.
Small nitpick: would const char* be better? That'll make it obvious it shouldn't be free()-d.
There was a problem hiding this comment.
It don't think this can be const, since the pointer returned by the underlying strerror implementation is a char * (for historical reasons) and that might point to other storage than our TLS buffer.
There was a problem hiding this comment.
The prototype of strerror itself is a char*, so lets keep it that way then.
Although I thought you could always return a char* as const char*, i.e. advertise the intention to not modify it, but not the other way around (if you were given something with the promise not to modify you can't go ahead and modify it), but its been a while since I've dabbled in C code.
xavierleroy
left a comment
There was a problem hiding this comment.
I'm sorry but I don't like the use of thread-local variables here, esp. when declared using __thread, which is not standard C as far as I know. Can I please have some time to think about an alternate implementation? Thank you.
|
Yes, of course. It is unsatisfying to have to cart around 1K of extra TLS, especially when most of the time the underlying implementation returns a pointer to a statically declared string (but might not). Regarding |
|
I've been peering inside the implementation of strerror functions, and the only real case where corruption might result seems to be if another thread changes the locale while an strerror() is running and so the returned pointers to error strings will mutate. I guess we can't control the case where we're embedded in a C library and that changes the locale, so I'm discounting this line of thinking unless someone else can spot a way. |
|
Apologies for leaving you hanging. My trivial variation on this PR is here: #11010. |
We currently use
strerrorin the runtime, which isn't reentrant in the presence of multiple threads (noted by @edwintorok in #10861). This PR switches over tostrerror_r, which isn't quite as straightforward as I expected.In glibc, there are two entirely separate variations of the function: a GNU one and a POSIX one. The memory management of both are entirely distinct, as the GNU one can sometimes return a static string, and sometimes write into the supplied buffer. This PR adopts (I think) the POSIX one, but opts to have an empty error message string rather than figuring out how to deal with a failing
strerror_rcall (which itself varies across old glibc<2.13).Then there is the question of "how big should the strerror_r buffer be?". Old glibc used to use a 1024-byte buffer, but now it's all intertwined in locale-specific logic. I just opted for a simple 1024 byte stack buffer.
There are a few places left in the runtime that still use
strerror, notably inplatform.hwhich looks a bit performance-sensitive and inline:Is it ok to use dynamic stack allocation here in the
if (err)branch to have a scratch buffer without stack-allocating it for every call ofcheck_err?