Skip to content

Add missing local root in win32 Unix.select#9051

Merged
dra27 merged 3 commits intoocaml:trunkfrom
dra27:fix-9043
Oct 19, 2019
Merged

Add missing local root in win32 Unix.select#9051
dra27 merged 3 commits intoocaml:trunkfrom
dra27:fix-9043

Conversation

@dra27
Copy link
Copy Markdown
Member

@dra27 dra27 commented Oct 17, 2019

This fixes #9043.

The patch is organised in three commits. The first is the simplest fix:

-- a/otherlibs/win32unix/select.c
+++ b/otherlibs/win32unix/select.c
@@ -961,9 +961,10 @@ static int fdlist_to_fdset(value fdlist, fd_set *fdset)
 static value fdset_to_fdlist(value fdlist, fd_set *fdset)
 {
   value res = Val_int(0);
-  Begin_roots2(fdlist, res)
+  value s = Val_int(0);
+  Begin_roots3(fdlist, res, s)
     for (/*nothing*/; fdlist != Val_int(0); fdlist = Field(fdlist, 1)) {
-      value s = Field(fdlist, 0);
+      s = Field(fdlist, 0);
       if (FD_ISSET(Socket_val(s), fdset)) {
         value newres = caml_alloc_small(2, 0);
         Field(newres, 0) = s;

It appears that the planets aligned (consistently) in such a way that msvc64 AppVeyor was routinely triggering a GC at the caml_alloc_small and fdlist was moved. The planets had to be aligned to the point that the test was failing only as part of make all and not as make one.

I did a limited test to confirm the problem by manually inserting a call caml_gc_full_major immediately after the value s = Field(… line. This causes all the Windows ports to freeze at this stage, suggesting a problem. With the Begin_roots3 version, they continue to work.

The second commit converts the entire function to use CAMLparam instead (I think it only used Begin_roots2 because it was taken from an old function which predates those macros).

While going through select.c, I noticed uses of Store_field following caml_alloc_small which violates GC Rule 5 (although I'm not sure that it's a problematic violation of Rule 5, since the expressions in question don't allocate).

@gasche
Copy link
Copy Markdown
Member

gasche commented Oct 18, 2019

cc our local-roots experts, I guess @let-def, @stedolan and @jhjourdan.

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 18, 2019

Smoking gun on my fork's AppVeyor with v=31 and extra printfs around the initialisation of s you can clearly see the minor gc run and the value being promoted.

Copy link
Copy Markdown
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congratulations and thanks for the detective work!

The fixes are correct as far as I can say.

Concerning GC rule 5, it is a hard rule: caml_alloc_small does not initialize the contents of the new block, while Store_field, which is a wrapper around caml_modify, reads the old value of the modified field. So, the original code was reading uninitialized memory....

value newres = caml_alloc_small(2, 0);
Field(newres, 0) = s;
Field(newres, 1) = res;
res = newres;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the record: the original code (with Begin_roots2) was modeled after the fdset_to_fdlist function from otherlibs/unix/select.c, which still uses... old-style Begin_roots2/End_roots root management.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed - I think it's OK only to "upgrade" these macros when something's actually changed/corrected?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(the Unix version obviously has no problem, since the representation of fd's is simpler)

@xavierleroy
Copy link
Copy Markdown
Contributor

Could you please add a Changes entry in the 4.10 section? We'll cherry-pick to 4.10, of course.

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Oct 18, 2019

Thanks @xavierleroy - I was happy to keep it without a Changes entry, but it's now rebased with one (on each changing commit, @gasche 😉). I'll merge once CI re-passes and cherry-pick to 4.10

@dra27 dra27 merged commit dd492d8 into ocaml:trunk Oct 19, 2019
@dra27 dra27 deleted the fix-9043 branch October 19, 2019 07:55
dra27 added a commit that referenced this pull request Oct 19, 2019
Add missing local root in win32 `Unix.select`

(cherry picked from commit dd492d8)
hannesm referenced this pull request in mirage/mirage-solo5 Oct 19, 2019
While treating Field() as an lvalue and assigning directly to it is
possible, it appears that this is not always safe.

Tests have shown that when storing Int64 values into structured blocks
(tuples) by direct assignment to Field(), the resulting code can fault
under load attempting to access an Int64 at an invalid address (0x1).

Therefore, we will play it safe and use Store_field() throughout.

/cc @stedolan
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Appveyor CI msvc64 is broken

3 participants