The question is pretty much in the title: in terms of OS-level implementation, how are shared objects and DLLs different?

I ask because I recently read this page on extending Python, which states:

Unix and Windows use completely different paradigms for run-time loading of code. Before you try to build a module that can be dynamically loaded, be aware of how your system works.

In Unix, a shared object (.so) file contains code to be used by the program, and also the names of functions and data that it expects to find in the program. When the file is joined to the program, all references to those functions and data in the file’s code are changed to point to the actual locations in the program where the functions and data are placed in memory. This is basically a link operation.

In Windows, a dynamic-link library (.dll) file has no dangling references. Instead, an access to functions or data goes through a lookup table. So the DLL code does not have to be fixed up at runtime to refer to the program’s memory; instead, the code already uses the DLL’s lookup table, and the lookup table is modified at runtime to point to the functions and data.

Could anyone elaborate on that? Specifically I'm not sure I understand the description of shared objects containing references to what they expect to find. Similarly, a DLL sounds like pretty much the same mechanism to me.

Is this a complete explanation of what is going on? Are there better ones? Is there in fact any difference?

I am aware of how to link to a DLL or shared object, and of a couple of mechanisms (.def listings, dllexport/dllimport) for writing DLLs, so I'm explicitly not looking for a how-to on those areas; I'm more intrigued as to what is going on in the background.

(Edit: another obvious point - I'm aware they work on different platforms, use different file types (ELF vs PE), are ABI-incompatible etc...)

  • DLL is closed, all its symbols are resolved (the linker knows where to find them). Commented Jun 6, 2011 at 11:35
  • That sounds very odd to me. A DLL has dangling references, of course it does. They need to be resolved by the loader. Commented Jun 6, 2011 at 11:36
  • Sorry, sent the comment by mistake and could not edit it. Of course a DLL can have dangling references, but DLL knows where (to which specific other DLL) each one points. Whereas with .so, a dangling reference is resolved by the first symbol with that name, wherever it's found. If it happens to be in the executable itself and not in any other .so, that's OK too. Commented Jun 6, 2011 at 11:47

1 Answer

A DLL is pretty much the same mechanism as a .so or .dylib (macOS) file, so it is hard to pin down exactly what the differences are.

The core difference is in what is visible by default from each type of file. .so files export the language-level (gcc) linkage, which means that, by default, all C and C++ symbols with external linkage are available for linking when the .so files are pulled in. It also means that, because resolving .so files is essentially a link step, the loader doesn't care which .so file a symbol comes from. It just searches the specified .so files in order and takes the first definition it finds, much as the linker does when it works through .a files.
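As a rough illustration of the loader not caring which file a symbol comes from, the sketch below asks for a symbol by name only, without naming any library. It assumes Linux or another Unix-like system, where `ctypes.CDLL(None)` maps to `dlopen(NULL)` and returns a handle to the process-wide symbol scope; `strlen` is just an arbitrary, commonly available symbol chosen for the demonstration.

```python
import ctypes

# Assumption: Unix-like system. CDLL(None) corresponds to dlopen(NULL),
# i.e. a handle to the symbols visible in the running process as a whole.
process = ctypes.CDLL(None)

# Look the symbol up by name alone; we never say which .so provides it.
strlen = process.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"shared object")
print(length)
```

The lookup succeeds even though no library was named, because the search runs over everything already loaded into the process.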

DLL files, on the other hand, are an operating system feature, completely separate from the language's link step. MSVC uses .lib files for linking both static and dynamic libraries (building a DLL also produces a paired .lib file that is used for linking), so the resulting program is fully "linked" (from a language-centric point of view) once it's built.

During the link stage, however, symbols were resolved against the .lib files that represent the DLLs, which lets the linker build the import table in the PE file: an explicit list of DLLs and the entry points referenced in each one. At load time, Windows does not have to perform a "link" to resolve symbols from shared libraries. That step was already done; the Windows loader just loads the DLLs and hooks up the functions directly.
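The contrast between the two lookup styles can be sketched with a toy model. This is only an illustration under simplifying assumptions: plain dictionaries stand in for the real ELF and PE structures, and the names `libraries`, `resolve_flat` and `resolve_imports` are hypothetical, not part of any loader.

```python
# Toy model only: dicts stand in for real ELF/PE structures.

# Each "library" maps exported symbol names to implementations.
libraries = {
    "a": {"draw": lambda: "a.draw"},
    "b": {"draw": lambda: "b.draw", "save": lambda: "b.save"},
}

def resolve_flat(symbol, search_order):
    """.so style: the first library in the search order that defines the name wins."""
    for lib in search_order:
        if symbol in libraries[lib]:
            return libraries[lib][symbol]
    raise LookupError(symbol)

def resolve_imports(import_table):
    """DLL style: every import already names the library it must come from."""
    return {
        (lib, symbol): libraries[lib][symbol]
        for lib, symbols in import_table.items()
        for symbol in symbols
    }

# Flat lookup: "draw" comes from whichever library is searched first.
flat_draw = resolve_flat("draw", ["a", "b"])

# Import-table lookup: "draw" is tied to library "b" regardless of load order.
table = resolve_imports({"b": ["draw", "save"]})

print(flat_draw())            # a.draw
print(table[("b", "draw")]()) # b.draw
```

In the flat model, changing the search order changes which `draw` is used; in the import-table model, the pairing of library and symbol was fixed when the table was written.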

3 Comments

I see, I think that makes sense. I'll leave the question open for a day or two to see if it attracts any more detailed answers, but I think I understand it from this. Thanks again.
But isn't it possible to have multiple interchangeable DLLs (i.e. same interface, different implementation) on Windows? If the location of the symbols in the DLL is baked into the executable, then how can two DLLs have the same interface without having those symbols in the same location? Or do they have them in the same location? Or is that kind of functionality only possible with LoadLibrary, etc.?
@cheshirekow: For load-time linked DLLs (those that are resolved through an import library), the executable's import table records the DLL name and the name or ordinal of each imported function, not its address, so a replacement DLL only has to export the same names (or ordinals). For dynamically loaded DLLs (e.g. LoadLibrary, COM, …), GetProcAddress() resolves the address at runtime, i.e. manually.
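The manual, run-time route mentioned in the comments can be sketched from Python with `ctypes`, which uses LoadLibrary and GetProcAddress on Windows and `dlopen` and `dlsym` on Unix-like systems. This is a minimal sketch, not a recommendation: the library names (`msvcrt`, and whatever `find_library("c")` returns) and the choice of `abs` are assumptions made for the example.

```python
import ctypes
import ctypes.util
import sys

# Pick a C runtime library to load explicitly (platform-specific assumption).
if sys.platform == "win32":
    library_name = "msvcrt"
else:
    # May be None on minimal systems; CDLL(None) then falls back to the
    # libraries already loaded into the process.
    library_name = ctypes.util.find_library("c")

# Load the library at run time (LoadLibrary on Windows, dlopen elsewhere).
runtime = ctypes.CDLL(library_name)

# Attribute access looks the symbol up by name (GetProcAddress / dlsym).
c_abs = runtime.abs
c_abs.argtypes = [ctypes.c_int]
c_abs.restype = ctypes.c_int

result = c_abs(-7)
print(result)
```

Because the lookup is by name at run time, any library that exports a compatible `abs` could be substituted here without rebuilding the caller.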
