[Verify/Document] Native (conditional) dependency support. #300
Description
Originally posted on Paket, but since this is something NuGet is considering as well, I'm adding it here so we can track and (hopefully) sync efforts. I hear that the ASP.NET team is doing something with native dependencies, but I can't find a spec for it. I suspect that a lockfile and transitive dependency support are prerequisites for this.
A lot has been written about this (challenging) topic; so I'll link to what I've found (and please add more links in the comments).
- https://stackoverflow.com/questions/19478775/add-native-files-from-nuget-package-to-project-output-directory
- Support consuming and producing nuget packages with native binaries aspnet/dnx#402
- https://nuget.codeplex.com/workitem/1221
- https://nuget.codeplex.com/workitem/679
- https://nuget.codeplex.com/discussions/412012
- http://ilnumerics.net/blog/anycpu-computing-limping-platform-specific-targets-and-a-happy-deployment-end/
- http://youku.io/questions/392214/nuget-powershell-how-to-add-native-dependencies-how-to-add-files-to-a-project
- https://nuget.codeplex.com/discussions/446656
- Sort out SQLite installer for winrt MvvmCross/MvvmCross#307
- https://nuget.codeplex.com/discussions/286521
- https://nuget.codeplex.com/discussions/230521
- https://nuget.codeplex.com/workitem/109
I've been using a https download-during-boot approach for ImageResizer, but that is slow, unreliable, and annoying. I tried to create an example of how to build the ideal native/managed hybrid project, failed, then started a project to try to hot-fix the problem at runtime, and hit another series of roadblocks.
I've identified a few invalid assumptions that seem responsible for the current state of things.
Some of these are somewhat comical considering how easy it is to parse binaries for the major platforms and determine runtime compatibility.
- It is bad to assume all managed dlls are AnyCPU. x86 and x64-only binaries continue to have an important place. a) C++/CLI is still important, but only targets Win32 and x64. b) Binding generation tools like CppSharp cannot produce AnyCPU C# - the structure layouts (and other details) are calculated based on a pointer width assumption. c) There's also manually written C# that uses unsafe code or performs P/Invokes, and doesn't use IntPtr in all the right places, and is therefore x86 or x64-only.
- It is bad to assume that precompiled native binaries are appropriate for every operating system distribution. Source compilation needs to be first-class if we want to target a large number of linux or bsd distributions; we can't precompile for everything.
- It is bad to assume that binaries are small enough that all the permutations can go in the same zipped nuget package. OpenCV base libraries could easily be > 2 gigabytes if you did this. You'd also exhaust your server's disk space. And likely your bandwidth allotment. And requests would time out.
- It is bad to assume that a windows machine is capable of compiling native code.
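To illustrate the claim that parsing binaries for compatibility is easy, here is a minimal Python sketch that reads the machine type from a Windows PE header. The function names and the `PE_MACHINES` table are my own for illustration; the offsets and machine constants come from the PE/COFF format. Note the machine field alone does not separate AnyCPU from x86-only managed assemblies (both report x86); that needs the CLR header flags, which this sketch does not read.

```python
import struct

# Machine values defined by the PE/COFF format (subset).
PE_MACHINES = {
    0x014C: "x86",
    0x8664: "x64",
    0x0200: "IA64",
    0x01C4: "ARM",
    0xAA64: "ARM64",
}

def pe_machine(data: bytes) -> str:
    """Return the architecture named in the COFF header of a PE image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ header")
    # The 4-byte field at 0x3C (e_lfanew) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6:
        raise ValueError("buffer too short to contain the COFF header")
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")
    # The 2-byte Machine field immediately follows the signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return PE_MACHINES.get(machine, f"unknown(0x{machine:04X})")

def pe_machine_of_file(path) -> str:
    """Convenience wrapper. Assumes the PE header sits within the first 4 KiB,
    which is typical but not guaranteed by the format."""
    with open(path, "rb") as f:
        return pe_machine(f.read(4096))
```

ELF and Mach-O binaries carry equivalent fields near the start of the file, so a similar check should be possible on other platforms.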
Removing these assumptions, what new requirements are we left with?
- Nuget packages need to be able to reference other nuget packages - conditionally - based on architecture and operating system. This likely means that we need conditions in the lockfile, too - use version X of package Y on platform Z, etc. At build time, the right versions are copied.
- Nuget packages need to be able to describe native binaries - or rather - arbitrary files, and provide multiple versions for each target platform. For small binaries, a common pattern is likely to combine these two approaches, and provide x86/x64 binaries for windows, and a conditional reference for other platforms. For larger binaries, it's likely that the 'main' package will be empty, and simply list conditional, plat-specific packages as dependencies.
- Build time. This is a bit harder. What needs to happen?
a) We need to gather the referenced files (nuget references, mind you), and verify that the output folder does not contain any files with conflicting names. If there are conflicting names, the output folder version MUST be deleted, so that we can AssemblyResolve or LoadLibrary the correct version. We then copy each of the files to an appropriate subfolder of the output folder (or, if AnyCPU, the output folder itself). Since Visual Studio is blind to compatibility (by choice, one must assume), we may end up fighting with the build process a bit. Perhaps disabling copylocal? Another nice sanity check would be to simply parse the binary headers of everything in the output folder and ensure they are all able to run on a common environment.
- Run time. This is where developers have to do a little magic, and handle AssemblyResolve for non-AnyCPU managed dlls, and call LoadLibrary (or a platform alternative) for the native dependencies. Given a standardized convention, we can make this co-operation possible without .NET runtime changes (although not optimal, since Assembly.Load won't use the default security context). It would be much better if .NET would modify the search path based on platform, as it does for culture. It would also be great if ASP.NET would apply some intelligence or header parsing to assemblies before globbing them all into memory. It's not hard, and only requires between 1 and 3 I/O reads of a few hundred bytes.
- Tooling. Tooling needs to understand that there are non-.NET dependencies involved. Unit test runners, in particular, are known for leaving behind native dependencies when they copy (or shadow copy) assemblies for testing. These will need to understand the (transitive) dependencies. Perhaps we should emit some kind of manifest? While we can piggy-back on .NET assemblies to document dependencies (via resource manifests or regular assembly attributes), those .NET assemblies would need to document the final transitive set of native dependencies, which might be difficult to achieve.
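The build-time step described above (delete conflicting files from the output folder, then copy each referenced file into its platform subfolder) might look roughly like the following Python sketch. The function name `stage_native_files` and the `(source_path, subfolder)` input shape are hypothetical, chosen only for illustration; a real implementation would live in the build tooling.

```python
from pathlib import Path
import shutil

def stage_native_files(references, output_dir):
    """Copy each referenced file into its subfolder of output_dir.

    references: iterable of (source_path, subfolder) pairs. An empty
    subfolder means the file is AnyCPU and belongs in the output folder itself.
    Returns the list of conflicting files that were deleted.
    """
    output_dir = Path(output_dir)
    removed = []
    for source, subfolder in references:
        source = Path(source)
        dest = output_dir / subfolder / source.name
        stale = output_dir / source.name
        # A same-named file in the output root would be found first and
        # shadow the platform-specific copy, so delete it before staging.
        if subfolder and stale.exists():
            stale.unlink()
            removed.append(stale)
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, dest)
    return removed
```

Returning the deleted paths lets the caller log what was removed, which seems useful given that this step may be fighting the IDE's own copy behavior.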
So, I guess
- Step 1: Establish (a) plat/arch conditions, and (b) target strings we can use as subfolders.
- Step 2: Describe required changes to the nuget package specification
- Step 3: Implement handling in paket for nuget conditions and native file manifests, through to lockfile.
- Step 4: Implement paket support for triggering bash/.bat build scripts (perhaps requiring user authorization on a per-commit-id basis, if applied to a git reference)? CMake is probably most likely to be used, but if we require Git, we can require Bash.
- Step 5: Implement build-time support for copying and manifest generation.
- Step 6: Submit PRs to major runners and web frameworks to handle manifests and assembly loading properly.
- Step 7: Make .NET into a real platform, where C interop is practical, so we can play with the big kids.
Conditions:
- pointerSize=32|64. Let's say that our MSIL makes an assumption about pointer size, but not platform.
- endianess="little|big" Sometimes this matters more than architecture, and ARM can switch between endianness modes.
- architecture=x86|x64|IA64|Alpha|MIPS|HPPA|PowerPC|SPARC32|s390|s390x - Should probably support everything Mono does
- os="posix|winish|linux|osx|win7|win8|win10" I'm not sure how to best divide windows (or linux) operating systems into groups, or to number them for easy inequality testing, but we probably want to establish sane identifiers that correspond to common build/api compat targets.
Given that a fallback mode (building from source) is likely popular, we want to make it easy to ensure that only 1 reference from a conditional set is chosen. We should probably group them within another element or provide an id that prevents duplicates.
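As a rough sketch of how such a conditional set could be evaluated so that exactly one reference is chosen, consider the Python below. Everything here is an assumption for illustration: the package ids, the dictionary shape of a condition, the idea that a platform carries a set of tags per key (so a linux machine matches both "linux" and "posix"), and the rule that candidates are tried in order with the unconditional source-build fallback listed last.

```python
def matches(condition, platform):
    """True if every key in the condition is satisfied by the platform.

    condition: dict mapping a key to '|'-separated alternatives,
               e.g. {"os": "linux|osx", "pointerSize": "64"}.
    platform:  dict mapping a key to the set of tags the machine satisfies.
    An empty condition matches every platform.
    """
    return all(
        set(alternatives.split("|")) & platform.get(key, set())
        for key, alternatives in condition.items()
    )

def select_reference(candidates, platform):
    """Return the id of the first candidate whose condition matches."""
    for package_id, condition in candidates:
        if matches(condition, platform):
            return package_id
    raise LookupError("no candidate in the set matches this platform")

# Hypothetical conditional set: prebuilt binaries first, source build last.
CANDIDATES = [
    ("Foo.Native.win-x64", {"os": "winish", "architecture": "x64"}),
    ("Foo.Native.win-x86", {"os": "winish", "architecture": "x86"}),
    ("Foo.Native.Source", {}),
]

linux_x64 = {
    "os": {"linux", "posix"},
    "architecture": {"x64"},
    "pointerSize": {"64"},
    "endianess": {"little"},
}
print(select_reference(CANDIDATES, linux_x64))
```

Ordering the candidates and stopping at the first match is one simple way to guarantee a single choice; an explicit group id, as suggested above, would additionally let tooling reject sets where two prebuilt candidates overlap.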
Target strings
Target strings need to be as generic as their restrictions permit.
/ - root is AnyCPU
/32b/ - managed, assumes 32-bit pointer, otherwise portable
/64b/ - managed, assumes 64-bit pointer, otherwise portable
/x86/winish/ - 32-bit, requires windows APIs.
Pointer size, architecture, and endianness are combined into the first string. Pointer size and endianness are only included if the architecture string doesn't make them redundant. I.e., we would see /ARM-little/ and /ARM-big/, but not /x86-little/.
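A small Python sketch of how these target strings could be composed follows. The `IMPLIED` table (which architectures fix pointer size or endianness) and the function name are my assumptions; in particular, treating ARM as implying 32-bit pointers is only a guess that happens to reproduce the /ARM-little/ example, and the order of parts for architectures outside the table is not specified above.

```python
# Assumed table: what each architecture string already implies.
# (pointer size in bits, endianness); None means "not implied".
IMPLIED = {
    "x86": (32, "little"),
    "x64": (64, "little"),
    "ARM": (32, None),  # assumption: ARM here means 32-bit, either endianness
}

def target_string(architecture=None, pointer_size=None, endianness=None, os_name=None):
    """Build a subfolder string such as "/", "/32b/", or "/x86/winish/"."""
    implied_size, implied_endian = IMPLIED.get(architecture, (None, None))
    parts = []
    if architecture:
        parts.append(architecture)
    # Only spell out what the architecture string does not already imply.
    if pointer_size is not None and pointer_size != implied_size:
        parts.append(f"{pointer_size}b")
    if endianness is not None and endianness != implied_endian:
        parts.append(endianness)
    segments = []
    if parts:
        segments.append("-".join(parts))
    if os_name:
        segments.append(os_name)
    # No segments at all means AnyCPU, which lives in the root.
    return "/" + "".join(segment + "/" for segment in segments)
```

For example, `target_string(architecture="x86", pointer_size=32, endianness="little", os_name="winish")` yields `/x86/winish/`, while `target_string(pointer_size=64)` yields `/64b/`.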