The Executable and Linkable Format (ELF) has been the standard binary format for executable files, shared libraries, and other file types on Linux, Unix, and other operating systems for over 20 years. This comprehensive guide will demystify ELF by providing a detailed look at its internal structure and contents.
A Brief History of ELF
Let's start at the beginning: where did ELF come from, and why? During the late 1980s and early 1990s, there were several competing binary formats in use across various Unix and Unix-like systems…
Inside the ELF Header
Every ELF file starts with a fixed-size header that identifies key properties like the target architecture, endianness, and other details. Let's examine the header format in depth:
7f 45 4c 46 | Magic | "\x7fELF", identifies the file as ELF
02 | Class | 64-bit (x86-64, ARM AArch64, etc.)
01 | Data | Little-endian
01 | Version | Current version (1)
00 | OS/ABI | Unix System V
00 | ABI Version | 0
00 00 00 00 00 00 00 | Padding | Reserved, zero-filled
02 00 | Type | Executable
3e 00 | Machine | x86-64
As we can see, the header gives rapid insight into CPU type, endianness, ABI version compatibility and more…
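To make the layout concrete, here is a minimal Python sketch that decodes a 64-bit little-endian ELF header with the standard struct module. The function name parse_elf64_header and the sample bytes are illustrative assumptions, not part of any tool; the sample reproduces the values in the listing above and zeroes every other field.

```python
import struct

# Elf64_Ehdr, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
EHDR64 = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data: bytes) -> dict:
    """Decode the identification bytes and the first fixed-width fields."""
    if len(data) < EHDR64.size:
        raise ValueError("too short for a 64-bit ELF header")
    fields = EHDR64.unpack_from(data)
    ident, e_type, e_machine, e_version, e_entry = fields[:5]
    if ident[:4] != b"\x7fELF":
        raise ValueError("bad magic number")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian files")
    return {
        "class": ident[4],        # 2 = 64-bit
        "data": ident[5],         # 1 = little-endian
        "version": ident[6],      # 1 = current
        "osabi": ident[7],        # 0 = Unix System V
        "abiversion": ident[8],
        "type": e_type,           # 2 = executable
        "machine": e_machine,     # 0x3e = x86-64
        "entry": e_entry,
    }


# Hypothetical header: identification bytes, 7 padding bytes, then
# e_type, e_machine, e_version; the remaining 40 bytes are left as zero.
sample = (
    b"\x7fELF"
    + bytes([2, 1, 1, 0, 0])
    + bytes(7)
    + struct.pack("<HHI", 2, 0x3E, 1)
    + bytes(40)
)
header = parse_elf64_header(sample)
print(header)
```

For a real file, the first 64 bytes read in binary mode can be passed to the same function. A 32-bit or big-endian file needs a different struct format, which this sketch deliberately rejects.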
Sections and Segments
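ELF files contain two parallel tables: the section header table and the program header table. What's the difference?
Sections are the link-time view of the file: named chunks of content such as code, data, and debugging symbols, each described by an entry in the section header table. Segments are the run-time view: each program header tells the loader how to map a range of the file, usually covering one or more sections, into memory…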
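The relationship between the two views can be sketched by checking which sections fall inside each segment's file range. The layout below is entirely hypothetical (names, offsets, and sizes are made up for illustration), and the containment rule is a simplification of what tools such as readelf apply.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    offset: int   # file offset (sh_offset)
    size: int     # size in the file (sh_size)


class Segment(NamedTuple):
    label: str
    offset: int   # file offset (p_offset)
    filesz: int   # bytes taken from the file (p_filesz)


def sections_in_segment(segment: Segment, sections: list) -> list:
    """Return names of sections whose file range lies inside the segment."""
    end = segment.offset + segment.filesz
    return [
        s.name
        for s in sections
        if s.size > 0 and segment.offset <= s.offset and s.offset + s.size <= end
    ]


# Hypothetical layout: two code sections share one loadable segment,
# and .comment is not covered by any segment, so it is never loaded.
sections = [
    Section(".init", 0x1000, 0x20),
    Section(".text", 0x1020, 0x1E0),
    Section(".rodata", 0x2000, 0x80),
    Section(".data", 0x3000, 0x40),
    Section(".comment", 0x3040, 0x20),
]
segments = [
    Segment("LOAD (R+X)", 0x1000, 0x200),
    Segment("LOAD (R)", 0x2000, 0x80),
    Segment("LOAD (RW)", 0x3000, 0x40),
]

mapping = {seg.label: sections_in_segment(seg, sections) for seg in segments}
for label, names in mapping.items():
    print(label, names)
```

The point of the sketch is the many-to-one relationship: several sections can share one segment, and some sections belong to no segment at all.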
Shared Libraries
A key feature of ELF files on Linux/Unix systems is dynamic linking using shared libraries. How does this work under the hood?
Shared libraries and dynamically linked executables use the .dynamic section to describe dependencies and other information that must be resolved at load time. Entries have a simple structure:
d_tag | d_val
Each dynamic entry is a pair: d_tag identifies the kind of entry, such as a reference to the string table, and d_val holds the corresponding value.
Symbol versioning additionally lets a binary bind to a specific variant of a symbol when a library provides several…
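The d_tag / d_val layout described above can be walked with a few lines of Python. This is a minimal sketch for 64-bit little-endian entries; the helper names and the sample table are assumptions for illustration. In a real file the string table has to be located through the DT_STRTAB entry, whereas here its bytes are passed in directly.

```python
import struct

# Elf64_Dyn, little-endian: signed 64-bit d_tag, unsigned 64-bit d_val.
DYN64 = struct.Struct("<qQ")

DT_NULL = 0     # marks the end of the dynamic array
DT_NEEDED = 1   # d_val is a string-table offset of a required library
DT_STRTAB = 5   # d_val is the address of the dynamic string table


def parse_dynamic(data: bytes) -> list:
    """Return (d_tag, d_val) pairs up to, but not including, DT_NULL."""
    entries = []
    for offset in range(0, len(data) - DYN64.size + 1, DYN64.size):
        tag, value = DYN64.unpack_from(data, offset)
        if tag == DT_NULL:
            break
        entries.append((tag, value))
    return entries


def read_cstring(table: bytes, offset: int) -> str:
    """Read a NUL-terminated string starting at the given table offset."""
    end = table.index(b"\x00", offset)
    return table[offset:end].decode("ascii")


def needed_libraries(entries: list, strtab: bytes) -> list:
    """Resolve every DT_NEEDED entry to a library name."""
    return [read_cstring(strtab, val) for tag, val in entries if tag == DT_NEEDED]


# Hypothetical string table and dynamic array for illustration only.
strtab = b"\x00libc.so.6\x00libm.so.6\x00"
dynamic = b"".join(
    DYN64.pack(tag, val)
    for tag, val in [(DT_NEEDED, 1), (DT_NEEDED, 11), (DT_STRTAB, 0x400), (DT_NULL, 0)]
)

entries = parse_dynamic(dynamic)
needed = needed_libraries(entries, strtab)
print(needed)
```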
Debugging and Symbols
To enable debugging and examination of binaries, ELF contains dedicated sections holding symbolic information such as the symbol table (.symtab), the string table (.strtab) and more. For instance:
00000000000005a8 g DF .text 000000000000002e main
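Each entry records the symbol's location in the binary, its binding and type flags, the section it lives in, its size, and its name. Symbols correspond closely to the functions and global variables in the source code.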
Debugging formats like DWARF have their own ELF sections storing compiler-generated metadata…
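As a rough illustration of how a symbol entry is stored, the sketch below decodes one 64-bit little-endian symbol record. The helper name decode_symbol, the section index, and the tiny string table are assumptions; the address and size are taken from the main entry shown earlier in this section.

```python
import struct

# Elf64_Sym, little-endian: st_name, st_info, st_other, st_shndx,
# st_value, st_size (24 bytes in total).
SYM64 = struct.Struct("<IBBHQQ")

BINDINGS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}


def decode_symbol(raw: bytes, strtab: bytes) -> dict:
    """Decode one symbol record and look its name up in the string table."""
    st_name, st_info, st_other, st_shndx, st_value, st_size = SYM64.unpack(raw)
    end = strtab.index(b"\x00", st_name)
    return {
        "name": strtab[st_name:end].decode("ascii"),
        "bind": BINDINGS.get(st_info >> 4, "OTHER"),   # high nibble of st_info
        "type": TYPES.get(st_info & 0xF, "OTHER"),     # low nibble of st_info
        "section_index": st_shndx,
        "value": st_value,
        "size": st_size,
    }


# Hypothetical record: a global function named "main" at 0x5a8, 0x2e bytes
# long. The section index 14 is made up for illustration.
strtab = b"\x00main\x00"
raw_symbol = SYM64.pack(1, (1 << 4) | 2, 0, 14, 0x5A8, 0x2E)

symbol = decode_symbol(raw_symbol, strtab)
print(symbol)
```

Note that the name is not stored in the record itself: st_name is only an offset into the associated string table, which is why the two sections are always read together.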
Relocation
When the linker combines input object files into an executable, references between them need to be patched up. ELF holds this relocation information in sections like .rel.plt (or .rela.plt on x86-64, where each entry carries an explicit addend):
Offset Info Type Value Name
00020348 0005000600000007 R_X86_64_JUMP_SLO 00000000 __libc_start_main@GLIBC_2.2.5
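This entry tells the dynamic linker to store the resolved address of the __libc_start_main function in the 64-bit slot at offset 0x20348, through which calls to the function are routed.
Understanding relocations lets you see how an ELF image gets patched after static linking is complete.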
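The sketch below decodes a 64-bit relocation entry with an addend and mimics what a loader does for a jump-slot relocation. It assumes the x86-64 Elf64_Rela layout; the function names, the in-memory image, the base address, and the resolved symbol address are all hypothetical. Only the offset and info values come from the listing above.

```python
import struct

# Elf64_Rela, little-endian: r_offset, r_info, signed r_addend (24 bytes).
RELA64 = struct.Struct("<QQq")

R_X86_64_JUMP_SLOT = 7


def decode_rela(raw: bytes) -> dict:
    """Split r_info into its symbol index (high 32 bits) and type (low 32 bits)."""
    r_offset, r_info, r_addend = RELA64.unpack(raw)
    return {
        "offset": r_offset,
        "symbol_index": r_info >> 32,
        "type": r_info & 0xFFFFFFFF,
        "addend": r_addend,
    }


def apply_jump_slot(image: bytearray, image_base: int, reloc: dict, symbol_address: int) -> None:
    """Write the resolved 64-bit symbol address into the slot the entry names."""
    if reloc["type"] != R_X86_64_JUMP_SLOT:
        raise ValueError("this sketch only handles jump-slot relocations")
    struct.pack_into("<Q", image, reloc["offset"] - image_base, symbol_address)


raw_entry = RELA64.pack(0x20348, 0x0005000600000007, 0)
reloc = decode_rela(raw_entry)

# Hypothetical 16-byte memory window starting at 0x20340 and a made-up
# address for the resolved function.
image = bytearray(16)
resolved_address = 0x7F0000001234
apply_jump_slot(image, 0x20340, reloc, resolved_address)

patched = struct.unpack_from("<Q", image, 8)[0]
print(reloc, hex(patched))
```

A real dynamic linker finds the symbol address by looking up the name in the loaded libraries; here it is simply supplied as a parameter.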
Security Features
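In addition to structure, ELF carries the metadata that enables several security-oriented features: position-independent executables allow full ASLR, the PT_GNU_STACK program header marks the stack as non-executable (NX), and PT_GNU_RELRO makes relocated data read-only. Fortified source, by contrast, is applied by the compiler and C library rather than by the format itself. For example…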
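One way to see how such properties surface in the file is to inspect the file type and the program header types. The following is a minimal sketch that takes already-parsed values as input; the function name hardening_summary and the example program headers are assumptions, and real checkers apply more rules than this.

```python
ET_DYN = 3                  # shared object or position-independent executable
PT_LOAD = 1
PT_GNU_STACK = 0x6474E551   # program header describing stack permissions
PT_GNU_RELRO = 0x6474E552   # region made read-only after relocation
PF_X = 0x1                  # execute permission bit in p_flags


def hardening_summary(e_type: int, program_headers: list) -> dict:
    """Summarise hardening hints from e_type and (p_type, p_flags) pairs."""
    stack_flags = [flags for p_type, flags in program_headers if p_type == PT_GNU_STACK]
    return {
        # ET_DYN is also used by shared libraries, so this is only a hint.
        "position_independent": e_type == ET_DYN,
        # The stack counts as non-executable only if the header exists
        # and does not request execute permission.
        "nx_stack": bool(stack_flags) and not (stack_flags[0] & PF_X),
        "relro": any(p_type == PT_GNU_RELRO for p_type, _ in program_headers),
    }


# Hypothetical program headers: one R+X load segment, a read-write stack
# header, and a read-only RELRO region.
example = hardening_summary(
    ET_DYN,
    [(PT_LOAD, 0x5), (PT_GNU_STACK, 0x6), (PT_GNU_RELRO, 0x4)],
)
print(example)
```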
Comparison with PE and Mach-O
The equivalent formats on Windows and macOS – PE and Mach-O – share similarities with ELF. Key differences are…
Conclusion
In this guide, we've examined the ELF format from low-level details to high-level design. You should now be equipped to analyze and understand ELF files as you encounter them on Linux systems.
The key is connecting the behavior of tools like readelf and objdump that print ELF internals back to the format details. Debugging odd binaries, crafting custom linker scripts and writing exploits all build upon knowledge of what makes up ELF at a fundamental level.


