In this blogpost, I will be discussing a hole I have in my blog - the Mach-O file format.
I will also be revealing a silly DoS bug I found in one of Apple's utilities that parses Mach-O files.
In macOS, every executable, dynamic library, kernel extension, and bundle shares a common skeleton: the Mach-O file format.
If you've ever opened a binary with otool, debugged with lldb, or wondered what dyld is actually loading at process start, you've been brushing up against Mach-O.
Mach-O ("Mach Object") was inherited from the Mach kernel work at CMU and adopted by NeXTSTEP, the operating system built by NeXT, which Apple agreed to acquire in late 1996.
It replaced a.out and now serves macOS, iOS, watchOS, tvOS, and visionOS.
Just like Linux uses ELF and Windows uses PE files - Mach-O is Apple's equivalent, and it's designed around a few things ELF and PE handle differently:
- Multi-architecture binaries via the "fat" / universal wrapper
- Two-level namespaces so symbols carry which library they came from
- Code signing baked in as a first-class load command
- Dyld-driven dynamic linking - rather than relying on PLT/GOT conventions alone
A Mach-O file has some conceptual layers, from outside in:
- An optional Fat header - wraps multiple architecture slices into one file
- Mach header - identifies the architecture and file type
- Load commands - a directive list telling the loader what to do
- Segments and sections - the actual code and data
When Apple ships a binary that runs on both Apple Silicon and Intel, they wrap two Mach-O files in a single container.
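The magic number 0xCAFEBABE (FAT_MAGIC) at offset 0 signals this. The fat header is stored big-endian, so a little-endian host that reads the field natively sees the byte-swapped form 0xBEBAFECA (FAT_CIGAM).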
struct fat_header {
    uint32_t magic;      // FAT_MAGIC / FAT_MAGIC_64
    uint32_t nfat_arch;  // Number of slices
};

Followed by nfat_arch entries of:
struct fat_arch {
    cpu_type_t    cputype;     // e.g. CPU_TYPE_ARM64
    cpu_subtype_t cpusubtype;
    uint32_t      offset;      // Where this slice lives in the file
    uint32_t      size;
    uint32_t      align;
};

The lipo utility is quite useful for parsing and handling Fat binaries:
- lipo -info /bin/ls will tell you which slices are present.
- lipo -thin arm64 ... extracts slices.
Here's an example:
jbo@JBOs-MacBook-Pro ~ % lipo -info /bin/ls
Architectures in the fat file: /bin/ls are: x86_64 arm64e
As a side note, 0xCAFEBABE is also Java's class file magic, which is why file sometimes needs context to disambiguate.
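To make the layout above concrete, here is a minimal Python sketch that parses a 32-bit fat header. It is my own illustration rather than anything lipo does internally: the function name `parse_fat`, the small `CPU_NAMES` table and the synthetic test blob are all assumptions made for the example, and it deliberately does not check that each slice actually fits inside the file.

```python
import struct

FAT_MAGIC = 0xCAFEBABE      # header followed by 32-bit fat_arch entries
FAT_MAGIC_64 = 0xCAFEBABF   # 64-bit fat_arch_64 entries (not handled in this sketch)
CPU_SUBTYPE_MASK = 0xFF000000  # high byte of cpusubtype carries capability flags

# Small subset of cputype values, just enough for the example.
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def parse_fat(data: bytes):
    """Return a list of (cpu_name, cpusubtype, offset, size, align) tuples."""
    # The fat header and the fat_arch table are big-endian on disk.
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError(f"not a 32-bit fat header: {magic:#x}")
    slices = []
    for i in range(nfat_arch):
        cputype, cpusubtype, offset, size, align = struct.unpack_from(
            ">IIIII", data, 8 + i * 20)
        name = CPU_NAMES.get(cputype, hex(cputype))
        slices.append((name, cpusubtype & ~CPU_SUBTYPE_MASK, offset, size, align))
    return slices


# Synthetic two-slice header (no slice contents), built only for this example.
blob = (struct.pack(">II", FAT_MAGIC, 2)
        + struct.pack(">IIIII", 0x01000007, 3, 16384, 1000, 14)
        + struct.pack(">IIIII", 0x0100000C, 0x80000002, 32768, 2000, 14))
slices = parse_fat(blob)
for s in slices:
    print(s)
```

On a real file you would pass the bytes of the binary instead of `blob`, and a careful parser would also reject slices whose offset plus size runs past the end of the file.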
Each architecture slice begins with the Mach header. On 64-bit (the vast majority of files in this day and age):
struct mach_header_64 {
    uint32_t      magic;       // MH_MAGIC_64 = 0xFEEDFACF
    cpu_type_t    cputype;
    cpu_subtype_t cpusubtype;
    uint32_t      filetype;    // MH_EXECUTE, MH_DYLIB, MH_BUNDLE, ...
    uint32_t      ncmds;       // Number of load commands
    uint32_t      sizeofcmds;  // Total bytes of load commands
    uint32_t      flags;       // MH_PIE, MH_TWOLEVEL, ...
    uint32_t      reserved;
};

The filetype field is the first thing to look at. Here are the common ones:
| filetype | Meaning |
|---|---|
| MH_EXECUTE | A runnable program |
| MH_DYLIB | A dynamic library (.dylib) |
| MH_BUNDLE | A plugin (.bundle) |
| MH_DYLINKER | dyld itself |
| MH_KEXT_BUNDLE | A kernel extension |
| MH_OBJECT | A relocatable .o |
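As a quick illustration of the header and the filetype values above, the following Python sketch decodes a mach_header_64 from raw bytes. The names `MachHeader64` and `parse_mach_header_64` are hypothetical helpers of mine, the sample header is synthetic, and the sketch assumes a little-endian slice (which is what x86_64 and arm64 binaries use).

```python
import struct
from typing import NamedTuple

MH_MAGIC_64 = 0xFEEDFACF
MH_TWOLEVEL = 0x80
MH_PIE = 0x200000

# Numeric values of the file types listed in the table above.
FILETYPES = {
    0x1: "MH_OBJECT",
    0x2: "MH_EXECUTE",
    0x6: "MH_DYLIB",
    0x7: "MH_DYLINKER",
    0x8: "MH_BUNDLE",
    0xB: "MH_KEXT_BUNDLE",
}


class MachHeader64(NamedTuple):
    magic: int
    cputype: int
    cpusubtype: int
    filetype: int
    ncmds: int
    sizeofcmds: int
    flags: int
    reserved: int


def parse_mach_header_64(data: bytes, offset: int = 0) -> MachHeader64:
    """Decode the 32-byte header found at `offset` (0 for a thin file)."""
    hdr = MachHeader64(*struct.unpack_from("<8I", data, offset))
    if hdr.magic != MH_MAGIC_64:
        raise ValueError(f"unexpected magic {hdr.magic:#x}")
    return hdr


# Synthetic arm64 executable header with no load commands, for illustration.
sample = struct.pack("<8I", MH_MAGIC_64, 0x0100000C, 0, 0x2, 0, 0,
                     MH_PIE | MH_TWOLEVEL, 0)
hdr = parse_mach_header_64(sample)
print(FILETYPES.get(hdr.filetype, hex(hdr.filetype)), bool(hdr.flags & MH_PIE))
```

For a fat binary, the `offset` argument would be the slice offset taken from the corresponding fat_arch entry.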
Immediately after the header is a list of variable-sized load commands. Each one is an instruction to dyld or the kernel.
Every load command starts with:
struct load_command {
    uint32_t cmd;      // Command type
    uint32_t cmdsize;  // Total size (including the header)
};

There are dozens of load command types. Here are some of the more important ones:
- LC_SEGMENT_64 - Describes a segment of the file to map into memory.
- LC_LOAD_DYLIB - Dependencies (e.g. /usr/lib/libSystem.B.dylib means that this Mach-O depends on libSystem.B.dylib, so it should be loaded).
- LC_MAIN - Entry point for executables (replaced LC_UNIXTHREAD around macOS 10.8).
- LC_DYLD_INFO_ONLY / LC_DYLD_CHAINED_FIXUPS - Rebase and binding information.
- LC_SYMTAB / LC_DYSYMTAB - Symbol tables.
- LC_CODE_SIGNATURE - Embedded code signing blob.
- LC_BUILD_VERSION - Minimum OS version and SDK info.
- LC_RPATH - Runtime search path for dylibs (the @rpath token resolves against these).
- LC_UUID - A unique build identifier, used by dsymutil to pair binaries with their .dSYM debug bundles.
- LC_FUNCTION_STARTS - Compact-encoded list of function start addresses for the unwinder and debuggers.
We can run otool -l <binary> to dump the full list of the load commands from a given binary.
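Walking the load commands by hand is a short loop: read cmd and cmdsize, hand the command to whoever cares, and advance by cmdsize. The Python sketch below shows that loop under a few assumptions of mine: the helper name `iter_load_commands` is made up, the image is a synthetic little-endian 64-bit header followed by two commands, and the bounds checks are the minimum I would want rather than a description of what otool or dyld actually enforce.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF

# A few of the command types mentioned above.
LC_NAMES = {
    0x19: "LC_SEGMENT_64",
    0x0C: "LC_LOAD_DYLIB",
    0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE",
    0x80000028: "LC_MAIN",
    0x80000033: "LC_DYLD_EXPORTS_TRIE",
    0x80000034: "LC_DYLD_CHAINED_FIXUPS",
}


def iter_load_commands(data: bytes, header_offset: int = 0):
    """Yield (cmd, raw_command_bytes) for each load command in a 64-bit image."""
    magic, _, _, _, ncmds, sizeofcmds, _, _ = struct.unpack_from(
        "<8I", data, header_offset)
    if magic != MH_MAGIC_64:
        raise ValueError(f"unexpected magic {magic:#x}")
    pos = header_offset + 32          # load commands start right after the header
    end = pos + sizeofcmds
    if end > len(data):
        raise ValueError("sizeofcmds runs past the end of the file")
    for _ in range(ncmds):
        if pos + 8 > end:
            raise ValueError("load command header runs past sizeofcmds")
        cmd, cmdsize = struct.unpack_from("<II", data, pos)
        if cmdsize < 8 or pos + cmdsize > end:
            raise ValueError(f"bad cmdsize {cmdsize} at offset {pos}")
        yield cmd, data[pos:pos + cmdsize]
        pos += cmdsize


# Synthetic image: header + LC_UUID (24 bytes) + LC_MAIN (24 bytes).
header = struct.pack("<8I", MH_MAGIC_64, 0x0100000C, 0, 2, 2, 48, 0, 0)
uuid_cmd = struct.pack("<II16s", 0x1B, 24, bytes(range(16)))
main_cmd = struct.pack("<IIQQ", 0x80000028, 24, 0x3F50, 0)
image = header + uuid_cmd + main_cmd

names = [LC_NAMES.get(cmd, hex(cmd)) for cmd, _ in iter_load_commands(image)]
print(names)
```

Note that every field used here (ncmds, sizeofcmds, cmdsize) comes straight from the file, which is why each one is checked before it is trusted.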
The LC_SEGMENT_64 load command defines a segment.
Segments are contiguous ranges of the file that get mapped into memory with specific permissions.
Each segment is then divided into sections. The standard layout for an executable:
__PAGEZERO is the first segment, but it has no file content - it's a 4GB (on 64-bit) unmapped region at virtual address 0.
Its only job is to make null-pointer dereferences crash hard - it's not readable, writable or executable.
From my experiments, modifying this to be smaller than 4GB would make AMFI kill the process upon loading, at least on AArch64.
__TEXT is read-only and executable. It contains the following sections:
- __text - Actual machine instructions.
- __stubs / __stub_helper - Jump stubs for lazily-bound external functions.
- __cstring - NUL-terminated string literals.
- __const - Read-only constants.
- __unwind_info - Compact unwind tables for exception handling.
Being read-only and executable means __TEXT can be shared between processes - every running copy of /bin/ls shares the same physical pages for its code.
__DATA and __DATA_CONST are initially mapped as readable and writable. Apple split the data into two segments to harden the runtime:
- __DATA_CONST is mapped read-write briefly during linking, then flipped to read-only. It holds things like the GOT and function pointers that shouldn't change after process start.
- __DATA is read-write throughout the program's life: globals, BSS and so on.
Typical sections include __got, __la_symbol_ptr (lazy symbol pointers), __data, __bss, and __common.
__LINKEDIT is the catch-all segment at the end of the file.
Holds data that dyld needs but the running program doesn't address directly: the symbol table, string table, chained fixups, code signature, function starts table, and so on.
It is mapped as read-only.
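To tie the segment descriptions back to bytes, here is a small Python sketch that decodes an LC_SEGMENT_64 command together with the section headers that trail it. The helper names (`parse_segment_64`, `prot_str`) and the two sample commands are assumptions of mine, built by hand to mirror a zero-permission 4GB page-zero segment and a read/execute text segment; real binaries will have different addresses and sizes.

```python
import struct

SEGMENT_FMT = "<II16sQQQQiiII"      # segment_command_64, 72 bytes
SECTION_FMT = "<16s16sQQIIIIIIII"   # section_64, 80 bytes
SEGMENT_SIZE = struct.calcsize(SEGMENT_FMT)
SECTION_SIZE = struct.calcsize(SECTION_FMT)


def prot_str(prot: int) -> str:
    """Render VM_PROT_READ (1) / WRITE (2) / EXECUTE (4) bits as 'rwx'."""
    return "".join(c if prot & bit else "-" for c, bit in (("r", 1), ("w", 2), ("x", 4)))


def c_name(raw: bytes) -> str:
    """Names are 16-byte fields, NUL-padded (and not terminated when full)."""
    return raw.split(b"\0", 1)[0].decode()


def parse_segment_64(cmd_bytes: bytes) -> dict:
    (_cmd, _cmdsize, segname, vmaddr, vmsize, fileoff, filesize,
     _maxprot, initprot, nsects, _flags) = struct.unpack_from(SEGMENT_FMT, cmd_bytes, 0)
    sections = []
    for i in range(nsects):
        # Section headers follow the segment command back to back.
        fields = struct.unpack_from(SECTION_FMT, cmd_bytes, SEGMENT_SIZE + i * SECTION_SIZE)
        sections.append(c_name(fields[0]))
    return {"name": c_name(segname), "vmaddr": vmaddr, "vmsize": vmsize,
            "fileoff": fileoff, "filesize": filesize,
            "initprot": prot_str(initprot), "sections": sections}


# Hand-built sample commands, for illustration only.
pagezero = struct.pack(SEGMENT_FMT, 0x19, 72, b"__PAGEZERO",
                       0, 0x100000000, 0, 0, 0, 0, 0, 0)
text = (struct.pack(SEGMENT_FMT, 0x19, 72 + 80, b"__TEXT",
                    0x100000000, 0x4000, 0, 0x4000, 5, 5, 1, 0)
        + struct.pack(SECTION_FMT, b"__text", b"__TEXT",
                      0x100003F00, 0x100, 0x3F00, 2, 0, 0, 0x80000400, 0, 0, 0))

for seg in (parse_segment_64(pagezero), parse_segment_64(text)):
    print(seg["name"], seg["initprot"], hex(seg["vmsize"]), seg["sections"])
```

A production parser would also verify that cmdsize is large enough for nsects section headers before reading them; the sketch relies on struct raising an error if the buffer is too short.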
Run this on any macOS executable:
otool -hV /bin/ls # Mach header
otool -lV /bin/ls # Load commands (verbose)
otool -L /bin/ls # Just the LC_LOAD_DYLIB entries
nm /bin/ls # Symbol table
codesign -dv /bin/ls # Code signature info

For a deeper view, the dyld_info utility (added in macOS 12) gives a much nicer breakdown of chained fixups, exports, and bindings than the older otool -I output.
When you exec() a Mach-O executable, the kernel does the following:
- Maps the __TEXT, __DATA, __DATA_CONST, and __LINKEDIT segments according to the LC_SEGMENT_64 commands.
- Loads the dynamic linker named in LC_LOAD_DYLINKER (almost always points to /usr/lib/dyld).
- Hands control to dyld.
At that point dyld takes over and does the following:
- Walks the LC_LOAD_DYLIB list and recursively loads dependencies.
- Applies rebases (adjusting pointers for ASLR slide) and binds external symbols using the chained fixups or older bind opcodes.
- Runs initializers (+load methods, __attribute__((constructor)) functions, etc.).
- Jumps to the entry point given by LC_MAIN.
While Mach-O's structure is less complex (at least in my opinion) when compared to PE or ELF files, parsing binary files (in general) is difficult.
Here's a bug I recently reported to Apple Product Security against macOS 26.4 that's a nice illustration of why the format-level details matter.
Earlier I mentioned LC_DYLD_INFO_ONLY and LC_DYLD_CHAINED_FIXUPS.
There's a related load command, LC_DYLD_EXPORTS_TRIE, that points (via dataoff / datasize) at a blob inside __LINKEDIT containing the dylib's exported symbols, encoded as a Trie (prefix tree) rather than a flat list.
Each node in the Trie is a few bytes:
- A ULEB128 terminal size (0 if this node isn't itself a complete symbol, otherwise the size of the export info payload that follows).
- The export info payload (flags, address, etc.) if the terminal size is non-zero.
- A child count (in a single byte).
- For each child: a NUL-terminated edge label (string fragment) and a ULEB128 child node offset relative to the Trie start.
To resolve a symbol like _hello, you walk from the root, matching label fragments against the symbol name and following the corresponding child offset, until you land on a terminal node.
The format is compact and well-suited to symbol lookup. It is also entirely controlled by whoever produced the binary.
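The node layout and the lookup walk described above can be sketched in Python as follows. This is my own illustrative reimplementation, not dyld's code: `read_uleb128` and `lookup` are hypothetical helper names, the sample trie is assembled by hand, and the payload is assumed to be a regular export (a ULEB128 flags value followed by a ULEB128 address). Since the offsets come from the file, the sketch refuses to visit the same node twice.

```python
def read_uleb128(buf: bytes, pos: int):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def lookup(trie: bytes, symbol: str):
    """Return (flags, address) for symbol, or None if it is not exported."""
    name = symbol.encode()
    pos, visited = 0, set()
    while True:
        if pos in visited:
            raise ValueError("cycle in exports trie")
        visited.add(pos)
        terminal_size, p = read_uleb128(trie, pos)
        if not name:                       # whole symbol consumed
            if terminal_size == 0:
                return None                # interior node, not a symbol
            flags, q = read_uleb128(trie, p)
            address, _ = read_uleb128(trie, q)
            return flags, address
        p += terminal_size                 # skip this node's payload
        if p >= len(trie):
            raise ValueError("node runs past the end of the trie")
        child_count = trie[p]
        p += 1
        for _ in range(child_count):
            end = trie.index(b"\0", p)     # raises ValueError if unterminated
            label = trie[p:end]
            child_offset, p = read_uleb128(trie, end + 1)
            if name.startswith(label):
                name, pos = name[len(label):], child_offset
                break
        else:
            return None                    # no edge matches the remaining name


# Hand-assembled trie exporting _hello and _help (addresses are made up).
TRIE = (
    b"\x00\x01" b"_hel\x00" b"\x08"                   # root @0: "_hel" -> node @8
    b"\x00\x02" b"lo\x00" b"\x11" b"p\x00" b"\x16"    # @8: "lo" -> @17, "p" -> @22
    b"\x03\x00\xd0\x7e\x00"                           # @17: flags=0, address=0x3f50
    b"\x03\x00\xe0\x7e\x00"                           # @22: flags=0, address=0x3f60
)

print(lookup(TRIE, "_hello"), lookup(TRIE, "_help"), lookup(TRIE, "_nope"))
```

Sharing the `_hel` prefix between the two symbols is exactly where the space savings come from, and the child offsets are the part an attacker gets to choose freely.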
Inside dyld/mach_o/ExportsTrie.cpp, the function mach_o::GenericTrie::recurseTrie() walks the trie recursively, descending into each child via _trieStart + childNodeOffset.
The only validation on childNodeOffset is that it is non-zero.
More importantly, there is no depth bound, and no tracking of visited nodes.
So, if the Trie contains a cycle - node B's child offset points back to node B (or any ancestor) - recurseTrie() recurses until it overflows the stack and touches the stack guard page, at which point the process is killed with EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE.
There is a funny asymmetry, by the way - the iterative sibling function GenericTrie::hasEntry(), used at runtime symbol lookup, does maintain a visitedNodeOffsets array (cap 256) and rejects cycles cleanly. The recursive walker, used during validation, doesn't. So, there are really two implementations of the same traversal - only one is hardened.
The malicious Trie payload is tiny: two nodes, eight bytes, with node B's only child being itself:
# node A @ offset 0 -> child @ offset 4
0x00, # terminalSize = 0 (not a complete symbol)
0x01, # childCount = 1
0x00, # edge label "" (just the NUL terminator)
0x04, # childNodeOffset = 4 -> node B
# node B @ offset 4 -> child @ offset 4 (self-loop)
0x00, 0x01, 0x00, 0x04,
Drop that blob into the LC_DYLD_EXPORTS_TRIE region of any ~16 KB dylib (the script in the report uses otool -l to locate dataoff / datasize, then patches the bytes in place), and:
dyld_info -exports tiny_cycle.dylib # exit 139, SIGSEGV
clang -o loader loader.c tiny_cycle.dylib # ld: Bus error: 10

The crash report shows roughly 65,000 stack frames of mach_o::GenericTrie::recurseTrie before ___chkstk_darwin catches the overflow.
I put my POC under cycle_poc.sh:
jbo@JBOs-MacBook-Pro macho % ./cycle_poc.sh
./tiny_cycle.dylib [arm64]:
./cycle_poc.sh: line 16: 97961 Segmentation fault: 11 dyld_info ./tiny_cycle.dylib
Under ~/Library/Logs/DiagnosticReports/ you will later find a dyld-info-<timestamp>.ips file containing the relevant details:
{
...
"id": 13829582,
"recursionInfoArray": [
{
"hottestElided": 4,
"coldestElided": 65272,
"depth": 65275,
"keyFrame": {
"imageOffset": 97916,
"symbol": "mach_o::GenericTrie::recurseTrie(unsigned char const*, dyld3::OverflowSafeArray<char, 4294967295ull>&, int, bool&, void (char const*, std::__1::span<unsigned char const, 18446744073709551615ul>, bool&) block_pointer) const",
"symbolLocation": 520,
"imageIndex": 0
}
}
],
}

The runtime loader doesn't crash on dlopen() of the same file, because runtime symbol lookup goes through the iterative hasEntry().
So this isn't a runtime exploitation primitive - it's simply a denial-of-service against tooling that validates the Trie:
- /usr/bin/dyld_info, called by automated analysis pipelines.
- The Apple system linker ld, invoked transitively by clang <...> some.dylib. Build pipelines that link against attacker-supplied prebuilt frameworks crash with no actionable diagnostic.
- Any tool using mach_o::ExportsTrie::forEachExportedSymbol() from libdyld_introspection - malware analysis services, app review pre-scanners, EDR products that triage uploaded binaries.
Because the failure mode is a stack-guard fault rather than a memory write, there is no code-execution path.
It is a clean DoS against developer and analysis tooling.
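For comparison, here is a rough Python sketch of what a recursive walker with the missing checks could look like. It is not Apple's code and not a proposed patch: `walk_exports`, the `MAX_DEPTH` value and the tiny well-formed trie are assumptions of mine. The idea is simply to track the offsets of the nodes on the current path, reject a child offset that is out of range or already on that path, and cap the depth.

```python
MAX_DEPTH = 128  # arbitrary bound picked for this sketch


def read_uleb128(buf: bytes, pos: int):
    """Decode one ULEB128 value starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def walk_exports(trie: bytes):
    """Return all exported names; raise ValueError on a malformed trie."""
    names = []

    def recurse(offset: int, prefix: bytes, ancestors: frozenset):
        if offset >= len(trie):
            raise ValueError("child offset outside the trie")
        if offset in ancestors:
            raise ValueError(f"cycle: node {offset} is its own ancestor")
        if len(ancestors) >= MAX_DEPTH:
            raise ValueError("trie is deeper than MAX_DEPTH")
        terminal_size, pos = read_uleb128(trie, offset)
        if terminal_size:
            names.append(prefix.decode())
        pos += terminal_size               # skip the export info payload
        if pos >= len(trie):
            raise ValueError("node runs past the end of the trie")
        child_count = trie[pos]
        pos += 1
        for _ in range(child_count):
            end = trie.find(b"\0", pos)
            if end < 0:
                raise ValueError("unterminated edge label")
            label = trie[pos:end]
            child, pos = read_uleb128(trie, end + 1)
            recurse(child, prefix + label, ancestors | {offset})

    recurse(0, b"", frozenset())
    return names


# The eight-byte self-referencing payload from earlier in the post.
CYCLE = bytes([0x00, 0x01, 0x00, 0x04, 0x00, 0x01, 0x00, 0x04])
# A tiny well-formed trie exporting a single symbol, "_a" (made up).
GOOD = b"\x00\x01_a\x00\x06" b"\x02\x00\x10\x00"

print(walk_exports(GOOD))
try:
    walk_exports(CYCLE)
except ValueError as err:
    print("rejected:", err)
```

Tracking only the current path keeps legitimate tries working while turning the self-loop into a clean error. It does not bound the total work on a trie whose nodes are shared many times over, so a real fix would probably want a node budget as well.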
As a side note, the issue I found was tracked under OE1105704294054, and was not deemed important enough for a security fix, so I feel comfortable releasing that information.
In this blogpost, I finally got to talk about the Mach-O structure and even highlight an issue in Apple's own code.
In a future post, I intend to discuss how Objective-C code is compiled and how that affects the file structure.
Stay tuned!
Jonathan Bar Or