Skip to content

Fix segfault in Unix.create_process on Windows 10#912

Merged
dra27 merged 1 commit intoocaml:4.04from
dra27:fix-create_process
Nov 13, 2016
Merged

Fix segfault in Unix.create_process on Windows 10#912
dra27 merged 1 commit intoocaml:4.04from
dra27:fix-create_process

Conversation

@dra27
Copy link
Copy Markdown
Member

@dra27 dra27 commented Nov 13, 2016

#656 didn't add #define CAML_INTERNALS to the top of otherlibs/win32unix/createprocess.c which means that its import of search_exe_in_path gets a default definition meaning it gets a default return type of int instead of char * (certainly MSVC emits a warning - I didn't look at gcc).

This appears to be innocuous on Windows 7, but on Windows 10 its causing segfaults (lib-scanf-2 and all the debugger tests failing).

This should probably be pushed to 4.04 as well? (@damiendoligez ?)

@dra27 dra27 changed the title Fix segfault in Unix.create_process on Windows Fix segfault in Unix.create_process on Windows 10 Nov 13, 2016
@shindere
Copy link
Copy Markdown
Contributor

shindere commented Nov 13, 2016 via email

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 13, 2016

Non-invasive patch fixing a segfault: of course this should go into 4.04.

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 13, 2016

Should this get a Changes entry? I think it should if users can observe the segfault (they may then grep for "segfault" in the changelog to see if the issue they notice has been fixed). Do I correctly understand that any user code using Unix.create_process under Windows will segfault under Windows 10? (Urgh...)

If this is correct, please add a Changes entry. You should put it directly in the 4.04-branch section ("Next minor version") for easier cherry-picking.

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

Actually, given that I'm working with 64-bit ports, I expect it's down to whether the pointer returned is actually 64-bits, so it may well not be Windows 10 specific. However, given the number of times I've run the testsuite recently (on VMs with both 2GB and 4GB RAM), it may have something to do with address layout on Windows 10.

@dra27 dra27 force-pushed the fix-create_process branch from d7b262a to 471e815 Compare November 13, 2016 19:36
@dra27 dra27 changed the base branch from trunk to 4.04 November 13, 2016 19:36
@dra27 dra27 force-pushed the fix-create_process branch 2 times, most recently from a03df67 to 471e815 Compare November 13, 2016 19:37
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

Does that look OK? Is it OK to cherry pick it and push directly to trunk (i.e. without a GPR)?

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

@shindere - this is probably worth putting in separately, given that it's part of a bug? I could have saved myself a few hours' debugging if I'd remembered seeing your change in #892!

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 13, 2016

Yes, I think that it's worth merging the present PR that minimally (and more explicitly) fixes the issue. Yes, you should just cherry-pick into 4.04 and push from the command-line -- a separate PR would add noise to all subscribers. Your Changelog entry looks fine.

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Nov 13, 2016 via email

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

No, no - it's not your fault! I was kicking myself, because I reviewed that PR!

@dra27 dra27 merged commit ee59d8f into ocaml:4.04 Nov 13, 2016
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

Cherry-picked onto trunk as well (1f9608a)

@shindere
Copy link
Copy Markdown
Contributor

David Allsopp (2016/11/13 12:07 -0800):

No, no - it's not your fault! I was kicking myself, because I reviewed
that PR!

I still think it owuld have deserved a separate PR and more emphasis put
on it. Lack of awareness from my side...

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 13, 2016

I wonder if there is something we can do to better catch those issues in the future. Can this kind of MSVC warnings be turned into errors? Should we try to add a Windows 10 machine to the CI arsenal?

Another question: is there a chance that 4.04.0 users with C code could suffer from the same issue? Maybe they use internal stuff which is now undefined and, like we did, did not notice any issue at compilation time. Is that a realistic scenario, should we then warn the users (beside releasing a 4.04.1 eventually; cc @damiendoligez), and how can they reliably check if they are affected?

@shindere
Copy link
Copy Markdown
Contributor

Gabriel Scherer (2016/11/13 12:22 -0800):

I wonder if there is something we can do to better catch those issues
in the future. Can this kind of MSVC warnings be turned into errors?

Yes. And MinGW warnigns can (and I think should) also be turned into
errors.

The problem encountered by David was actually reported under mingw.

The thing is, there are a few warnings that need to be fixed, currently.

And they are not obvious to fix, I am working precisely on that at the
moment and will talk with @damiendloigez and @xavierleroy to figure out
how to best fix them.

Once they are all fixed we can turn them into errors.

Should we try to add a Windows 10 machine to the CI arsenal?

It could of course not hurt but I think if we have warning-free code
things will already be much better.

Another question: is there a chance that user C code could suffer from
the same issue?

Could be.

Maybe they use internal stuff which is now undefined and, like we did,
did not notice any issue at compilation time. Is that a realistic
scenario, should we then warn the users, and how can they reliably
check if they are affected?

I the code is warning-free I think it's okay.

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 13, 2016

Oh, thanks for working on the issue.

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

Yes, I agree that we should head towards all C warnings as errors - but it can be very difficult to keep both sets of compilers in sync - especially as there are multiple supported versions of Microsoft's still out there (VC 19 - the current version - has modern standards support, but the older versions are all C89)

@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Nov 13, 2016

If it's easy to add another a machine doing Windows 10 testing, then yes - but otherwise, I shall remember to run the Windows tests on a Windows 10 box for 4.04.1/4.05.0. I hit this while doing some investigation into bleeding-edge support in Windows 10 for building native OCaml using Ubuntu-on-Windows (while it's still not as fast as Ubuntu itself, it's waaaaaay faster than Cygwin already!) which will hopefully eventually feed into CI running Azure (which would be Windows 10 or Server 2016)

@shindere
Copy link
Copy Markdown
Contributor

shindere commented Nov 13, 2016 via email

@damiendoligez
Copy link
Copy Markdown
Member

@shindere C89 is very very old. A few years ago we made an explicit decision to switch to C99 (which is only very old) after checking that the MS compiler would mostly support it. If we were modern, we would switch to C11...

@gasche
Copy link
Copy Markdown
Member

gasche commented Nov 27, 2016

This bug now has a Mantis issue: MPR#7422.

@damiendoligez
Copy link
Copy Markdown
Member

@dra27 I just tried adding a Win10 machine to CI, and failed. It doesn't seem to be supported by Inria's CI yet.

lpw25 pushed a commit to janestreet/ocaml that referenced this pull request Dec 14, 2016
camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017
@dra27 dra27 deleted the fix-create_process branch July 6, 2021 14:05
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Oct 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants