
In an assembly program, the .text section is loaded at 0x08048000; the .data and .bss sections come after that.

What would happen if I don't put an exit syscall in the .text section? Would it lead to the .data and .bss sections being interpreted as code, causing "unpredictable" behavior? When will the program terminate: probably after every "instruction" is executed?

I can easily write a program without the exit syscall, but I don't know how to test whether .data and .bss get executed, because I guess I would have to know the real machine code that is generated under the hood to understand that.

I think this question is more about "How would OS and CPU handle such a scenario?" than assembly language, but it is still interesting to know for assembly programmers etc.

  • Execution would continue into whatever is after your code, yes. That will probably hit an invalid instruction sooner or later, or you will run into unmapped memory. If you are extremely lucky, you might hit a harmless endless loop in which case your program wouldn't terminate. Commented Apr 5, 2018 at 13:52
  • @Jester I'd say more chances of winning the lottery than hitting an endless loop. Commented Apr 5, 2018 at 21:33
  • @TonyTannous I did say "extremely lucky" :D However, you can make an endless loop in x86 using 2 bytes and assuming random memory contents that's already way better chance than any lottery I know. Unfortunately you are likely to hit some zero bytes instead of random, and that is add al, [eax] on x86 which will probably fault. Commented Apr 5, 2018 at 22:10
  • For the record, 00 00 decodes as add [eax], al: memory destination, not memory source, so EAX (or RAX in 64-bit code) has to be pointing at writeable memory, but repeated execution doesn't change the low byte of the address. Commented Apr 6, 2018 at 2:58
  • Related: Nasm segmentation fault on RET in _start - _start isn't a function, there's nothing to return to. You need to make an exit system-call. That Q&A has actual code examples for x86 / x86-64. Commented Mar 16, 2022 at 17:23

2 Answers


The processor does not know where your code ends. It faithfully executes one instruction after another until execution is redirected elsewhere (e.g. by a jump, call, interrupt, system call, or similar).

If your code ends without jumping elsewhere, the processor continues executing whatever is in memory after your code. It is fairly unpredictable what exactly happens, but eventually, your code typically crashes because it tries to execute an invalid instruction or tries to access memory that it is not allowed to access.
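
To make the idea concrete without tying it to a real architecture, here is a minimal sketch of a toy machine. The two-opcode instruction set (`INC`, `HALT`) and the `run` helper are invented for this illustration and do not correspond to any real CPU; the point is only that the fetch loop has no notion of where the "code" stops and the "data" starts.

```python
# Toy machine: a hypothetical two-opcode ISA, invented for this sketch only.
INC, HALT = 0x01, 0xFF   # every other byte value is treated as an invalid opcode

def run(memory, max_steps=1000):
    """Execute bytes starting at address 0 until HALT, a fault, or the step limit."""
    pc = 0    # program counter
    acc = 0   # a single accumulator register
    for _ in range(max_steps):
        if pc >= len(memory):
            # Stand-in for fetching from an unmapped page.
            return ("fault: fetch from unmapped address", pc, acc)
        opcode = memory[pc]
        if opcode == HALT:
            return ("halted", pc, acc)
        if opcode == INC:
            acc += 1
            pc += 1
            continue
        return ("fault: invalid opcode", pc, acc)
    return ("still running", pc, acc)

code_with_exit = bytes([INC, INC, HALT])
code_without_exit = bytes([INC, INC])
data = b"Hi"   # 0x48 0x69, neither of which is a valid opcode in this toy ISA

print(run(code_with_exit + data))     # stops cleanly at the HALT
print(run(code_without_exit + data))  # runs off the end of the code into the data
```

In the second call the machine happily fetches the first data byte as if it were an instruction and faults there. If the data bytes had happened to be valid opcodes, they would have been executed instead.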

If neither happens and no jump occurs, eventually the processor tries to execute unmapped memory or memory that is marked as “not executable” as code, causing a segmentation violation. On Linux, this raises a SIGSEGV or SIGBUS. When unhandled, these terminate your process and optionally produce core dumps.

If you're curious, run under a debugger and look at disassembly of the faulting instruction.


7 Comments

At the CPU level, the exception you're talking about is a page-fault (#PF), not segment-related. A 32-bit or 64-bit process on x86 Linux runs with CS base=0 / limit = unlimited. The "segmentation" in SIGSEGV has basically nothing to do with x86 segments, because that's not what x86 Linux uses for memory protection.
@PeterCordes I would prefer if you removed all mention of CPU-specific details as otherwise readers will be confused if this applies to x86 only or also to other architectures. I had intentionally written the answer without such references to avoid this uncertainty.
Maybe I should post a separate answer with x86 details? This is a nice question for a canonical duplicate, but the answer doesn't show how to actually fix the problem for those who don't know. That was the compromise I was trying to strike. (I'd typed the 00 00 part before noticing that the question wasn't x86 specific; maybe I should have taken it out. My comment from 4 years ago is also x86-specific, for better or for worse :/) The 0x08048000 .text address in the question is what ld uses for 32-bit x86 Linux non-PIE, but it might also use the same address on others.
This is of course your answer, so it's your call what it says. Let me know if you decide to take out the x86 stuff; if I don't find a better canonical for x86 Q&As that fall off the end of _start, I might just add a separate answer here. Maybe also with some mention of the equivalents for ARM and AArch64 Linux if that isn't too cluttered.
Finally got around to posting those edits as a new answer to make this a better dup target, inspired by a couple recent duplicates.

To fix this problem, see Nasm segmentation fault on RET in _start for correct ways to make an exit system-call in 32 or 64-bit x86 code for Linux or Windows. (The entry point, _start, isn't a function in Linux; the stack pointer points at argc, not a return address, so ret doesn't work either.) For other ISAs or OSes, check their manuals or look at existing examples for how to exit.

See also What is the correct constant for the exit system call? re: #include <sys/syscall.h> in a .S file to get constants like SYS_exit (or SYS_exit_group) on Unix-like OSes.

Or in assemblers that can't use C headers directly, look for asm/unistd.h on Linux; for x86-64 vs. -32, see unistd_64.h vs. unistd_32.h. (And/or see this Q&A for tools to make .inc files from C headers with just the constants, also useful for getting stuff like O_RDWR or MAP_PRIVATE constants for syscall args.)
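
As a rough sketch of the "make an .inc file from a C header" idea, the Python snippet below turns `#define __NR_...` lines into NASM `%define` lines. The `header_to_nasm` function and the `SAMPLE_HEADER` excerpt are made up for this illustration (the excerpt uses the x86-64 Linux numbers); in practice you would read the real unistd_64.h, whose path varies between distributions, and real headers may need more careful parsing than this regex does.

```python
import re

# Hand-typed excerpt in the style of unistd_64.h (x86-64 Linux call numbers).
SAMPLE_HEADER = """\
#define __NR_read 0
#define __NR_write 1
#define __NR_exit 60
#define __NR_exit_group 231
"""

# Only matches simple "#define __NR_name <decimal number>" lines.
DEFINE_RE = re.compile(r"^#define\s+(__NR_\w+)\s+(\d+)\s*$", re.MULTILINE)

def header_to_nasm(text):
    """Return NASM %define lines for every simple __NR_* constant in text."""
    return [f"%define {name} {value}" for name, value in DEFINE_RE.findall(text)]

for line in header_to_nasm(SAMPLE_HEADER):
    print(line)
```

The output can be saved to a file and pulled into a NASM source with `%include`.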


If you're using any libc functions, especially printf, you should call exit to flush stdio buffers. Or ret from main, not _start, and let the CRT startup code call exit. See also


As @fuz explains, execution will continue into whatever bytes come next in memory where your executable is loaded / mapped. The CPU doesn't know where your source ended; it just fetches and decodes bytes from memory.

Often there are some zero bytes of padding at the end of the .text section. On 32-bit x86, 00 00 decodes as add [eax], al, a memory-destination add. It's add [rax], al in 64-bit mode. That will fault (see Footnote 1) if EAX / RAX doesn't point to a writeable page.
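
To see why 00 00 comes out as add [eax], al, here is a minimal sketch of the decoding in Python. It only handles opcode 0x00 (ADD r/m8, r8) in 32-bit mode with the simplest ModRM forms; the function name `decode_add_rm8_r8` is made up for this example, and a real disassembler also has to deal with prefixes, SIB bytes and displacements.

```python
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_add_rm8_r8(code):
    """Decode opcode 00 /r (ADD r/m8, r8) for the simplest ModRM forms only."""
    opcode, modrm = code[0], code[1]
    if opcode != 0x00:
        raise ValueError("this sketch only handles opcode 0x00")
    mod = modrm >> 6          # top 2 bits: addressing mode
    reg = (modrm >> 3) & 7    # middle 3 bits: register source operand
    rm = modrm & 7            # low 3 bits: register or memory destination
    if mod == 3:
        dest = REG8[rm]                 # register-direct destination
    elif mod == 0 and rm not in (4, 5):
        dest = f"[{REG32[rm]}]"         # plain [reg] memory destination
    else:
        raise ValueError("SIB/displacement forms are not handled in this sketch")
    return f"add {dest}, {REG8[reg]}"

print(decode_add_rm8_r8(bytes([0x00, 0x00])))
print(decode_add_rm8_r8(bytes([0x00, 0xC8])))
```

With a ModRM byte of 00, all three fields are zero: mode 0 with rm 0 selects [eax] as the destination, and reg 0 selects al as the source.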

RISC-V specifically chose its opcodes so that 00 00 00 00 (and 00 00 as a compressed instruction) is an invalid instruction that faults, definitely not a NOP. That way, regions of zero padding can't serve as NOP sleds for exploits that can only send execution near, not exactly to, the bytes they want to execute. Some older RISCs do run all-zero bytes as a NOP or a non-faulting ALU instruction, but AArch64 similarly avoids this problem: 0000xxxx is udf #imm16, an illegal instruction.
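
The bit patterns involved are simple enough to check by hand; a small Python sketch, with helper names invented for this example, might look like this. It only tests for the specific encodings mentioned above and is not a general decoder for either ISA.

```python
def riscv_is_all_zero_parcel(halfword):
    """True for the all-zero 16-bit parcel, which RISC-V defines as illegal."""
    return (halfword & 0xFFFF) == 0

def aarch64_is_udf(word):
    """True if a 32-bit AArch64 instruction word has the form 0x0000xxxx."""
    return (word & 0xFFFF0000) == 0

def aarch64_udf_imm16(word):
    """Return the imm16 field of a udf instruction word."""
    if not aarch64_is_udf(word):
        raise ValueError("not a udf encoding")
    return word & 0xFFFF

print(riscv_is_all_zero_parcel(0x0000))
print(aarch64_is_udf(0x00000000), aarch64_udf_imm16(0x0000002A))
```

In both cases a run of zero padding starts with an encoding that faults, so execution falling into it stops at the first instruction.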

If execution gets past whatever 00 or non-zero garbage bytes are in memory, eventually it'll come to an unmapped or non-executable page. This will also lead to an invalid page fault, just like a data access for a bad pointer, so you also get SIGSEGV on Unix-like systems.

(Or on primitive CPUs without memory protection, the instruction pointer could wrap. e.g. on 8086, code-fetch from CS:IP wraps IP without affecting CS, so execution implicitly loops over a 64KiB region if it's all straight-line code with no jumps.)
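
The 8086 wraparound is just 16-bit arithmetic on IP. A small sketch in Python, with function names chosen for this example:

```python
def next_ip(ip, insn_len):
    """Advance IP past an instruction; IP is 16 bits wide, so it wraps at 64 KiB."""
    return (ip + insn_len) & 0xFFFF

def physical_address(cs, ip):
    """8086 real-mode address: segment * 16 + offset, truncated to 20 bits."""
    return ((cs << 4) + ip) & 0xFFFFF

cs = 0x1000
ip = 0xFFFF                       # last byte of the 64 KiB window for this CS
print(hex(physical_address(cs, ip)))
ip = next_ip(ip, 1)               # a 1-byte instruction at the very end
print(hex(ip), hex(physical_address(cs, ip)))
```

After the last byte of the segment, IP goes back to 0 while CS stays the same, so the next fetch comes from the start of the same 64 KiB window rather than from the next one.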

If you're curious, run under a debugger and look at disassembly of the faulting instruction, and the hexdump of its machine code in case you recognize it as ASCII data or 00 padding. (Don't put data in the path of execution either.)


Footnote 1: hardware #PF exception -> software SIGSEGV

The x86 CPU exception is #PF, a page fault. The CPU will run the kernel's page-fault handler, which checks whether the process should be allowed to access that virtual address.

If so, it can allocate a new page for copy-on-write or just zeroed, or find an existing page of file data in the pagecache for regions mapped to files. Then set up the page-table entry to "wire" that page into the process's address-space to handle a "minor" or "soft" page fault. Or do I/O to get the page from disk, waking up to wire the page when the I/O is done (major or hard page fault).

But in this case, we're talking about a page the process doesn't have mapped, so it's an invalid page fault. The kernel's page-fault handler will deliver a SIGSEGV segmentation-fault signal if this is a Unix-like OS, or do something similar for other OSes like Windows.

The default action for the SIGSEGV signal is to kill your process if it hasn't set up a handler for that signal. Not that you should set one up; unless you know why your process should be raising SIGSEGV, it's usually unrecoverable.
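
You can watch the default action from the outside without writing any assembly. The sketch below assumes a Linux (or similar POSIX) system with CPython: it starts a child Python process that reads from address 0 through ctypes, which is an invalid data access rather than a bad code fetch, but it ends in the same signal. The variable names are just for this example.

```python
import signal
import subprocess
import sys

# Child process: dereference a null pointer via ctypes (an invalid page fault).
child_code = "import ctypes; ctypes.string_at(0)"

result = subprocess.run(
    [sys.executable, "-c", child_code],
    stderr=subprocess.DEVNULL,
)

# On POSIX, a negative return code means the child was killed by that signal.
if result.returncode < 0:
    print("child killed by", signal.Signals(-result.returncode).name)
else:
    print("child exited with status", result.returncode)
```

The parent sees the same kind of status a shell would report as "Segmentation fault" for a program that ran off the end of _start.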

Note that "segmentation fault" is unrelated to x86 segments like CS and DS, because modern OSes use paging for memory protection, not x86 segments. A 32-bit or 64-bit process on x86 Linux runs with CS base=0 / limit = unlimited. The name is historical.

1 Comment

Note that AArch64 too maps 0000xxxx to the undefined instruction udf #imm16.
