Initial implementation of actual dynamic FFI to DLLs #6008
|
Awesome! I saw you mentioned you could use help with manual testing, but can you discuss a bit what automated testing could look like? We have access to many platforms via GitHub runners, so that's where my head is at. |
|
CI will at least make sure that my library builds on Windows. Transcripts that import DLLs could also be written, of course. But that requires a DLL. I guess we could also build one of those in CI, but I'd have to do some reading on how to get that done. |
Nice! I like the API. One thing about it though is that the types don't prevent you from loading a value, getting back a pure function, closing over it, hopping to another node, and attempting to call it there. What happens when you attempt to serialize one of these FFI references?
One option is it bombs - basically, "don't do that". I think this could be fine.
Or could these function references serialize just fine and work on the other end as long as the other end has the "same" symbol loaded? If achievable, the sandbox-checking should definitely treat all foreign functions as tracked - you do not want to be able to load a library locally, then hop to the cloud cluster and call the same C functions if the same C libraries happen to be installed in the same place.
Definitely adding more types is good - I'd be interested in supporting arrays of various types, at least as an input type. Could be as a follow up to this PR though.
|
It's just going to bomb if you try to send one of these to another machine. I don't really see how you could support that with any kind of assurance. DLLs are just files, so even if you sent the information used to load from the DLL on one machine, there's no guarantee that the file with the same name is actually the same library, or the same version of it with identical calling conventions, etc. And if you send something from Windows to Linux, the file names for 'the same' library are probably different. I would say that if you want to send code around, you send code that uses some API that it gets passed in (possibly via abilities), and that API is implemented locally by foreign functions. The code you send around is portable, but the local implementation of the API isn't. Sending around DLL references that somehow adapt to the local system doesn't seem like a realistic expectation. |
|
Just brainstorming: we could have a GitHub job that installs the current build of ucm, installs whatever binary libs we feel will exercise the FFI well (zlib, libcurl, libmysqlclient, whatever), and then runs Unison programs against them. |
Could a library bundle some kind of library-specific test to determine if the compatible function exists on the other side? |
I'm pretty sold by this argument. Just pass around functions that use an ability, much more portable. So I think bombing is fine during serialization, though I'd like it to fail with a meaningful error that tells the user what they did wrong. |
|
Okay, decision here: we're not going to do anything fancy for transferring FFI pointers over the network, will just bomb with an error. In userland, you could build something which is more portable (using abilities, say). @dolio is going to get CI fixed up and at least doing smoke tests on Linux / Mac / Windows. @aryairani can advise on this. Other types like additional primitives and arrays will likely be a follow-up PR. |
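To make the userland pattern a bit more concrete, here is a rough sketch in C rather than Unison. C has no abilities, so a struct of function pointers stands in for "an API that gets passed in"; `ArithApi`, `sum_to`, `local_add`, and `local_arith_api` are all hypothetical names, and `local_add` merely stands in for a function that would really be imported from a DLL on the local machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical API the portable code is written against.
   In Unison this role would be played by an ability. */
typedef struct {
    int64_t (*add)(int64_t, int64_t);
} ArithApi;

/* Portable logic: it only touches the API it was handed, so nothing in it
   refers to a particular library file or symbol. */
int64_t sum_to(const ArithApi *api, int64_t n) {
    assert(api != NULL && api->add != NULL);
    int64_t total = 0;
    for (int64_t i = 1; i <= n; i++) {
        total = api->add(total, i);
    }
    return total;
}

/* Local, non-portable side: stands in for a foreign function that each
   machine would load from its own copy of a library. */
static int64_t local_add(int64_t a, int64_t b) {
    return a + b;
}

/* Each machine builds its own API value from whatever it has loaded. */
ArithApi local_arith_api(void) {
    ArithApi api = { local_add };
    return api;
}
```

Only `sum_to` would ever be shipped between nodes; the `ArithApi` value is constructed wherever the code ends up running, so no foreign pointer needs to be serialized.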
|
I wonder if there's any reasonable way we can use the type-system (or some other compiler mechanism) to tag a block as being unable to be serialized. There's already a good amount of frustration with things like We have the workaround that you can just express your FFI stuff as an ability and run that FFI outside of where that serialization happens, but I think it'd be a lot more user friendly if we could enforce or hint at that pattern so newer Unison users don't get surprised or bitten by this. I can imagine a system ability like Not sure exactly how that'd work, and it may not need to be in the first iteration, but I think if we're breaking Unison's promise that you can serialize anything, it's good to think about the UX. |
It's definitely possible but I do not want to do that in this PR. :) I really want this PR and the core FFI functionality to be low level and unsafe and as out of the way as possible. Adding fancy types could instead be done in various ways as separate userland layers. Like if you're planning to do distributed stuff, I think it's a reasonable thing to pass around functions that use an ability rather than direct pointers to the foreign functions. But again, I don't want to get fancy with this PR or the basic FFI primitives. |
|
Also, it's not clear to me that abilities are able to track this. The problem with serialization is that any reference to the value in the closure will fail, not just if you apply the function. I suppose you could arrange to fail remotely, by serializing something and only failing if you try to apply the function remotely. Then perhaps you could argue that an annotation on the imported function is showing you that its effects won't be supported remotely. But I'm not sure how good that is, either. |
- Added support for void types, both for unit results and nullary functions.
- Exposed doubles in the API
- Improved errors from FFI failures
|
BTW, it looks like Runar added pinned arrays as a separate type, so they should be relatively easy to add as FFI arguments without people having to worry about whether it's safe to pass. It requires a little tweaking of the implementation (not the interface), though. So I'll leave it to a future PR. |
|
I think this is ready to go, unless anyone has objections. |
|
Yeah looks good to me. |
This PR implements FFI to dynamically linked libraries. At the moment, it's a bit limited as far as types are concerned, but the basic functionality works, and types aren't difficult to add (I'll continue to do so after this writeup).
The functionality is provided by several new builtins.
- `DLL` type represents a handle to a dynamic library
- `openDLL : Text ->{IO, Exception} DLL` allows opening a dynamic library by file path
- `FFI.Type` and `FFI.Spec` types. These have a parameter giving the unison type they correspond to.
- `int64 : Type Int` and `uint64 : Type Nat` are base cases
- `base : Type a -> Type r -> Spec (a -> r)` is the base case of a specification of a pure function
- `baseIO : Type a -> Type r -> Spec (a ->{IO} r)` is the base case of an effectful function
- `arr : Type a -> Spec b -> Spec (a -> b)` allows adjoining more arguments onto a specification
- `getDLLSym : DLL -> Text -> Spec a ->{IO, Exception} a` imports the function with the given symbol name in the DLL, declaring that its signature matches the specification.

Using this, I have successfully called the following C function from a DLL:
The implementation uses libffi. This is a library for dynamically arranging for calls into libraries using C calling conventions. Apparently this is used by GHC in some capacity, and the Haskell library just uses whatever version of the library is bundled with GHC. I didn't use the high level API, which involves some redundancy and use of lists. Instead, I wrote my own calls directly into the low level stuff the library imports, with some lower level copying to/from our stacks and such.
As you can see, right now I only have support for 64-bit signed and unsigned integers. But adding more C types isn't difficult. One thing I'd like some thoughts on is how smaller values should be handled. E.g. should there be `uint32 : Type Nat` that just automatically chops/promotes between 32 and 64 bits? That's the best we can do right now, I think, because we only have 64-bit types in unison.

Another thing I'll need help with once this is cleaned up a bit more is testing on other platforms. I can only really test on Linux, so we need people to try Mac and Windows. I wrote a wrapper library to provide a common API on top of the DLL loading differences between Unix and Windows, but I haven't actually built the Windows version yet.
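On the `uint32` question, this is roughly what "chops/promotes" would mean at the C boundary. It is only a sketch of the conversion semantics, not the marshalling code in this PR, and the function names are made up for illustration. It assumes a `Nat` arrives as a `uint64_t` and an `Int` as an `int64_t`.

```c
#include <assert.h>
#include <stdint.h>

/* Chop: a 64-bit Nat passed to a C `uint32_t` parameter keeps only its low
   32 bits. Unsigned narrowing in C is defined as reduction modulo 2^32. */
uint32_t nat_to_uint32(uint64_t nat) {
    return (uint32_t)nat;
}

/* Promote: a C `uint32_t` result becomes a Nat by zero extension. */
uint64_t uint32_to_nat(uint32_t value) {
    return (uint64_t)value;
}

/* Signed counterpart: a C `int32_t` result becomes an Int by sign extension. */
int64_t int32_to_int(int32_t value) {
    return (int64_t)value;
}

/* Small self-check showing the behaviour at the edges. */
void check_uint32_marshalling(void) {
    assert(nat_to_uint32(UINT64_C(0x100000001)) == 1u); /* high bits dropped */
    assert(uint32_to_nat(UINT32_MAX) == UINT64_C(4294967295));
    assert(int32_to_int(-1) == -1);
}
```

The chop is silent: a `Nat` above 2^32 - 1 simply loses its high bits, which is the trade-off to weigh against rejecting out-of-range values with an error.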
The signature specifiers are obviously not great to use in their form here. But it should be pretty easy to write a parser that takes a nicer looking signature and produces the right terms.