
docs: refactor hypercall_api #201

Merged
Wenzel merged 3 commits into master from docs/refactor_hypercall_api
Jul 18, 2023


@Wenzel
Contributor

@Wenzel Wenzel commented Jun 6, 2023

Refactor the hypercall API reference.

TODO:

  • double check page alignment / lock code for Linux / Windows
  • try to change pygments theme. function calls are not highlighted
  • REQ_STREAM_DATA_BULK and detail req_data_bulk_t
  • PERSIST_PAGE_PAST_SNAPSHOT
  • DUMP_FILE example, detail kafl_dump_file_t
  • LOCK: how to use generated snapshot and resume execution
  • detail host_config_t and agent_config_t structs
  • detail kAFL_ranges struct and add example in USER_RANGE_ADVISE
  • add examples
    • RANGE_SUBMIT
    • USER_RANGE_ADVISE
    • USER_FAST_ACQUIRE
  • kafl initialization protocol

@Wenzel Wenzel force-pushed the docs/refactor_hypercall_api branch from 257f146 to 72b384a Compare June 6, 2023 15:18
@Wenzel Wenzel force-pushed the docs/refactor_hypercall_api branch from 72b384a to 0950f1b Compare July 5, 2023 11:42
@Wenzel Wenzel marked this pull request as ready for review July 5, 2023 11:42
@Wenzel Wenzel force-pushed the docs/refactor_hypercall_api branch from 0950f1b to 4c27cf6 Compare July 5, 2023 11:43
@Wenzel Wenzel force-pushed the docs/refactor_hypercall_api branch from 4c27cf6 to dc4dc2b Compare July 5, 2023 12:05
@Wenzel Wenzel requested review from il-steffen and schumilo July 5, 2023 12:08
@Wenzel
Contributor Author

Wenzel commented Jul 5, 2023

Current build:
html.zip

@il-steffen, @schumilo would you like to review this PR?

I documented most of the Nyx API here, with examples, and I'd like to merge this already if there is no other feedback.

Thanks!

@schumilo
Contributor

Generally, the documentation already looks good! However, here are some nits that I found:

  • NEXT_PAYLOAD:

"The while() loop in our guest agent is not actually needed anymore."

-> That is not entirely correct. If the user wants to gain some extra performance, disabling snapshot mode might be beneficial, and that requires a loop. However, even with snapshot mode disabled, the snapshot is still taken, and it is still restored in case of timeouts or crashes.

-> This obviously depends on whether the target or agent supports fuzzing without restoring the snapshot after each and every execution. That's why there is the agent_non_reload_mode field in the agent_config to tell the fuzzer about this capability (and in case the user wants to run the target in non_snapshot mode and the agent does not support that, we can simply throw an error and we're good).
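To make the capability check concrete, here is a minimal sketch of the "throw an error" logic described above. Only the field name agent_non_reload_mode comes from the discussion; the trimmed struct, the enum and check_reload_mode() are hypothetical names for illustration and are not the actual frontend code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed view of the agent configuration: only the field
 * discussed above. The real agent_config has more members. */
struct agent_caps {
    uint8_t agent_non_reload_mode; /* 1: agent can run without a reload per execution */
};

enum mode_check { MODE_OK = 0, MODE_UNSUPPORTED = -1 };

/* Reject non-reload fuzzing when the agent did not announce support for it. */
enum mode_check check_reload_mode(const struct agent_caps *caps,
                                  bool user_wants_non_reload)
{
    assert(caps != NULL);
    if (user_wants_non_reload && !caps->agent_non_reload_mode)
        return MODE_UNSUPPORTED;
    return MODE_OK;
}
```

An agent that keeps its while() loop would set the flag; one that relies on the snapshot being restored after every execution would leave it at 0.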

  • SUBMIT_PANIC / SUBMIT_KASAN:

Maybe it's worth mentioning that with this approach, around 20-26 bytes are overwritten (depending on whether the target runs in protected or long mode). I'm not sure protected mode is still supported, but 32-bit userland applications running in long mode work just fine.

See https://github.com/nyx-fuzz/QEMU-Nyx/blob/60c216bc9e4c79834716d4099993d8397a3a8fd9/nyx/hypercall/hypercall.h#L53

  • PRINTF:

I would add a warning that hprintfs should be seen more as a debug mechanism for agent "debug" builds. Once you run the fuzzer, hprintfs called in the fuzzing loop will significantly impact performance.

And instead of using this hypercall directly, you should always use the hprintf wrapper instead (especially if you need format string capabilities).
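One way to keep hprintf calls out of the fuzzing loop in release builds is a logging macro that compiles to nothing unless a debug flag is set. This is only a sketch: AGENT_DEBUG, AGENT_LOG and fake_hprintf() are hypothetical names, and fake_hprintf() merely stands in for the real hprintf wrapper so the example builds outside a Nyx guest.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Counts how often the (expensive) print path was taken. */
unsigned log_calls;

/* Stand-in for hprintf(): formats the message length but does not trap
 * into the hypervisor. */
int fake_hprintf(const char *fmt, ...)
{
    assert(fmt != NULL);
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    log_calls++;
    return n;
}

#ifdef AGENT_DEBUG
#define AGENT_LOG(...) fake_hprintf(__VA_ARGS__)
#else
#define AGENT_LOG(...) ((void)0) /* release build: no call, no cost */
#endif

/* One pass of the fuzzing loop; the log line vanishes in release builds. */
void fuzz_iteration(unsigned i)
{
    (void)i; /* unused when AGENT_LOG expands to nothing */
    AGENT_LOG("iteration %u\n", i);
}
```

Building with -DAGENT_DEBUG re-enables the messages for agent debugging.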

  • SUBMIT_CR3:

Running the fuzzer without an enabled CR3 filter is not supported when Intel PT mode is enabled, nor by the decoding library libxdc.

This hypercall must be called at least once before HYPERCALL_KAFL_ACQUIRE. If snapshots are enabled, it is, in most cases, sufficient to call SUBMIT_CR3 once before HYPERCALL_KAFL_NEXT_PAYLOAD. If snapshots are disabled, but the agent keeps running in the same process, it is also sufficient to call this hypercall once. For userland fuzzing in non-snapshot mode, however, it might be necessary to call SUBMIT_CR3 with each execution after HYPERCALL_KAFL_NEXT_PAYLOAD but before HYPERCALL_KAFL_ACQUIRE to ensure that the current CR3 value is passed to the hypervisor. This is especially true if, in non-snapshot mode, a fork server is being used.

If Intel PT mode is disabled, this hypercall is not required, but it is still good practice to include it in case the user wants to use Intel PT mode.
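The two orderings described above can be sketched with a mock that records the call sequence instead of issuing real hypercalls. The MOCK_* names and helper functions are hypothetical; in a real agent each mock_hypercall() would be the corresponding kAFL hypercall.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the hypercall numbers from nyx_api.h. */
enum mock_hypercall {
    MOCK_SUBMIT_CR3, MOCK_NEXT_PAYLOAD, MOCK_ACQUIRE, MOCK_RELEASE
};

#define MOCK_LOG_MAX 64
enum mock_hypercall mock_log[MOCK_LOG_MAX];
size_t mock_log_len;

/* Records the call instead of trapping into the hypervisor. */
void mock_hypercall(enum mock_hypercall nr)
{
    assert(mock_log_len < MOCK_LOG_MAX);
    mock_log[mock_log_len++] = nr;
}

/* Snapshot mode: CR3 is submitted once, before NEXT_PAYLOAD. */
void run_snapshot_mode(void)
{
    mock_hypercall(MOCK_SUBMIT_CR3);
    mock_hypercall(MOCK_NEXT_PAYLOAD);
    mock_hypercall(MOCK_ACQUIRE);
    /* ... run the target ... */
    mock_hypercall(MOCK_RELEASE);
}

/* Non-snapshot userland mode (e.g. fork server): CR3 may change per
 * execution, so it is submitted after NEXT_PAYLOAD and before ACQUIRE. */
void run_non_snapshot_mode(unsigned executions)
{
    for (unsigned i = 0; i < executions; i++) {
        mock_hypercall(MOCK_NEXT_PAYLOAD);
        mock_hypercall(MOCK_SUBMIT_CR3);
        mock_hypercall(MOCK_ACQUIRE);
        /* ... run the target ... */
        mock_hypercall(MOCK_RELEASE);
    }
}
```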

  • USER_RANGE_ADVISE:

The reason for this hypercall is that in the case of userland fuzzing, the agent is supposed to make the corresponding code ranges persistent (and prefetched) in the guest's memory by calling mlock() so that the hypervisor has a chance to dump the required pages for the PT decoder.

However, this step is most likely no longer required since code pages can now be dumped even if they are not yet present in the guest's memory at the time of creating the snapshot (for that, we are using hardware breakpoints and some other hacks). Nevertheless, it might be reasonable to prefetch the code pages for better fuzzing performance.
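For the prefetch step, a userland agent on a POSIX system could pin a code range roughly as follows. This is a sketch under assumptions: page_floor() and lock_code_range() are hypothetical helper names, and mlock() can fail when RLIMIT_MEMLOCK is too low, so the return value should be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round an address down to the start of its page (page must be a power of two). */
uintptr_t page_floor(uintptr_t addr, uintptr_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return addr & ~(page - 1);
}

/*
 * Fault in and pin [addr, addr + len) so the pages are resident.
 * Returns 0 on success, -1 on failure (errno is set by mlock).
 */
int lock_code_range(const void *addr, size_t len)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t start = page_floor((uintptr_t)addr, page);
    uintptr_t end = (uintptr_t)addr + len;
    return mlock((const void *)start, (size_t)(end - start));
}
```

On Windows the equivalent would be VirtualLock(), which is not shown here.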

  • DUMP_FILE:

For files larger than 4KB, this hypercall needs to be called multiple times, once per page. The first call needs to clear the append field, while the following ones need to set it (otherwise, the resulting file will be <= 4KB in size and will only contain the content of the last call).
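The chunking rule can be sketched like this. The struct is modelled on kafl_dump_file_t, but its layout here is an assumption (check nyx_api.h for the real definition); dump_buffer() and the recording callback are hypothetical, with the callback standing in for the actual DUMP_FILE hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DUMP_CHUNK_SIZE 4096u

/* Assumed layout, modelled on kafl_dump_file_t. */
struct dump_request {
    uint64_t file_name_str_ptr;
    uint64_t data_ptr;
    uint64_t bytes;
    uint8_t  append;
};

/* Hypothetical callback standing in for the DUMP_FILE hypercall. */
typedef void (*dump_fn)(const struct dump_request *req);

/* Test double: records the requests that would have been sent. */
struct dump_request recorded[8];
size_t recorded_len;

void record_dump(const struct dump_request *req)
{
    assert(recorded_len < 8);
    recorded[recorded_len++] = *req;
}

/* Send a buffer in 4KB chunks: append=0 on the first call, 1 afterwards.
 * Returns the number of calls made. */
size_t dump_buffer(const char *name, const uint8_t *data, size_t len, dump_fn emit)
{
    assert(name != NULL && data != NULL && emit != NULL);
    size_t calls = 0;
    size_t offset = 0;
    do {
        size_t chunk = len - offset;
        if (chunk > DUMP_CHUNK_SIZE)
            chunk = DUMP_CHUNK_SIZE;
        struct dump_request req = {
            .file_name_str_ptr = (uint64_t)(uintptr_t)name,
            .data_ptr = (uint64_t)(uintptr_t)(data + offset),
            .bytes = chunk,
            .append = (offset == 0) ? 0 : 1,
        };
        emit(&req);
        offset += chunk;
        calls++;
    } while (offset < len);
    return calls;
}
```

A 10000-byte buffer therefore results in three calls: 4096 and 4096 bytes, then the remaining 1808 bytes, with only the first call truncating the file.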

  • USER_FAST_ACQUIRE:

In addition to the two hypercalls mentioned, the hypercall SUBMIT_CR3 is also called automatically with it (the hypercall routine for SUBMIT_CR3 will be called after NEXT_PAYLOAD).

https://github.com/nyx-fuzz/QEMU-Nyx/blob/qemu-nyx-4.2.0/nyx/hypercall/hypercall.c#L912

This hypercall basically solves the issue of changing CR3 values in case of disabled snapshots and an in-guest fork server, without having to call 3 different hypercalls in a row.

  • LOCK:

This hypercall will create a Nyx pre-snapshot if QEMU-Nyx is launched in a specific mode. You can find some documentation regarding the pre-snapshot feature here:

https://github.com/nyx-fuzz/Nyx/blob/main/docs/01-Nyx-VMs.md

In case QEMU-Nyx is started without enabling the pre-snapshot capability, this hypercall will effectively do nothing.

  • REQ_STREAM_DATA_BULK:

This hypercall serves basically the same purpose as REQ_STREAM_DATA, but can achieve much better transfer speeds for larger files because it uses bulk operations instead of fetching only 4KB per executed hypercall. It's worth mentioning that for smaller files (<= 1MB) this hypercall might only be as fast as REQ_STREAM_DATA, or even slightly slower.

  • PERSIST_PAGE_PAST_SNAPSHOT:

This hypercall excludes a single page frame from being reset by the snapshot restore mechanism. This hypercall expects a page-aligned virtual address of a single page at a time (but can be called multiple times to exclude a number of page frames from being reset).
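Since the hypercall takes one page-aligned address per call, persisting a buffer means aligning its start down and issuing one call per page it touches. A sketch, assuming 4KB pages; persist_range() and the callback are hypothetical names, with the callback standing in for the PERSIST_PAGE_PAST_SNAPSHOT hypercall.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PERSIST_PAGE_SIZE 4096u

/* Hypothetical callback standing in for the hypercall. */
typedef void (*persist_fn)(uint64_t page_addr);

/* Test double: records the page addresses that would have been submitted. */
uint64_t persisted[16];
size_t persisted_len;

void record_page(uint64_t page_addr)
{
    assert(persisted_len < 16);
    persisted[persisted_len++] = page_addr;
}

/* Submit every page overlapping [start, start + len). Returns the call count. */
size_t persist_range(uintptr_t start, size_t len, persist_fn persist)
{
    assert(persist != NULL && len > 0);
    uintptr_t mask = ~(uintptr_t)(PERSIST_PAGE_SIZE - 1);
    uintptr_t first = start & mask;
    uintptr_t last = (start + len - 1) & mask;
    size_t calls = 0;
    for (uintptr_t page = first; ; page += PERSIST_PAGE_SIZE) {
        persist((uint64_t)page);
        calls++;
        if (page == last)
            break;
    }
    return calls;
}
```

Note that a small buffer straddling a page boundary still needs two calls.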

@il-steffen
Collaborator

This obviously depends on whether the target or agent supports fuzzing without restoring the snapshot after each and every execution. That's why there is the agent_non_reload_mode field in the agent_config to tell the fuzzer about this capability (and in case the user wants to run the target in non_snapshot mode and the agent does not support that, we can simply throw an error and we're good).

This relationship is good to point out. It would also help to document the elements of the buffer structures (agent_config, host_config, IP filter list, ...?). On the host side, kafl fuzz has an option -R that determines the number of persistent executions between full snapshot restores. The default is to always reload, but this was tested to work with 10, 100, or 0 (infinite) executions between reloads. It should work on any of the toy examples that have no side effects in their execution.

Note that the ijon_bitmap is not supported in the frontend - I think we allocate the default size buffer to make qemu happy but don't parse it as part of overall bitmap (qemu.py).

A set of additional utility functions have been built on top of kAFL hypercalls and made available in the nyx_api.h for convenience.

Note that this only has the most basic functions and even that has to be changed for anything not Linux. I was thinking to remove the variable args hprintf() there as well since it's been incompatible even with Linux kernel and Zephyr, who have all the types but slightly different function/macro names.

@Wenzel Wenzel force-pushed the docs/refactor_hypercall_api branch from dc4dc2b to c6a77d7 Compare July 12, 2023 14:16
@Wenzel
Contributor Author

Wenzel commented Jul 12, 2023

Thanks @schumilo and @il-steffen for your comments!
I integrated your insights as best I could.

Latest build:
html.zip

@il-steffen

Note that this only has the most basic functions and even that has to be changed for anything not Linux. I was thinking to remove the variable args hprintf() there as well since it's been incompatible even with Linux kernel and Zephyr, who have all the types but slightly different function/macro names.

I managed to call the hprintf() functions on Windows without any trouble so far.
However, the agent was compiled with gcc-mingw-w64-x86-64.

What about introducing conditional compilation with #ifdef to check and adapt for the target platform on which we are building?
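A rough sketch of what such platform detection could look like, using predefined compiler and build-system macros. AGENT_PLATFORM and agent_platform() are hypothetical names, and the exact set of macros to test is an assumption that would need checking against each toolchain.

```c
#include <assert.h>

/* Pick a platform tag from predefined macros. Order matters: Zephyr and
 * Linux kernel builds are checked before the generic Linux userland case. */
#if defined(_WIN32)
#define AGENT_PLATFORM "windows"
#elif defined(__ZEPHYR__)
#define AGENT_PLATFORM "zephyr"
#elif defined(__KERNEL__)
#define AGENT_PLATFORM "linux-kernel"
#elif defined(__linux__)
#define AGENT_PLATFORM "linux"
#else
#define AGENT_PLATFORM "unknown"
#endif

/* Compile-time sanity check (C11): the tag must not be empty. */
static_assert(sizeof(AGENT_PLATFORM) > 1, "platform tag must not be empty");

/* Returns the platform tag selected at compile time. */
const char *agent_platform(void)
{
    return AGENT_PLATFORM;
}
```

The same #if ladder could then select the matching headers and the hprintf() variant for each platform.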

@Wenzel Wenzel merged commit dad5ef5 into master Jul 18, 2023
@Wenzel Wenzel deleted the docs/refactor_hypercall_api branch July 18, 2023 15:13


3 participants