TaintEMU: Decoupling Tracking from Functional Domains for Architecture-Agnostic and Efficient Whole-System Taint Tracking
First, create a directory named xtaint, and place TaintEMU inside this directory. Next, prepare a guest image (for example, Debian on ARM64, which can be downloaded from DQIB; the image used by this wiki can be downloaded from this LINK) and place it in the xtaint directory.

As shown in Figure 1-1, after this step, the xtaint directory contains two folders: dqib_arm64-virt and TaintEMU-VMI. The dqib_arm64-virt folder contains the ARM64 Debian system and its configuration files, while the TaintEMU-VMI folder contains all the source code downloaded from this Git repository.
On Ubuntu 22.04, run the following commands:
sudo apt update
sudo apt install build-essential ninja-build make git bison flex gawk libpixman-1-dev libsdl2-dev libslirp-dev python3 python3-pip rlwrap socat
pip3 install meson
Note: Make sure to add ~/.local/bin to your PATH in order to use meson.
Navigate to the xtaint/TaintEMU-VMI folder and run the following commands:
mkdir build
cd build
../configure --target-list=aarch64-softmmu --enable-taint-engine
make -j8

As shown in Figure 1-2, after compilation, the build directory contains the executable file qemu-system-aarch64.
Open a new terminal (Terminal 1) and navigate to the xtaint/dqib_arm64-virt folder, then run:
./start_x_taint.sh
As shown in Figure 1-3, TaintEMU has started successfully.
Open another terminal (Terminal 2), navigate to the xtaint/dqib_arm64-virt folder, and run:
./serial.sh

After the serial connection is established, TaintEMU runs the Debian operating system.
Open another terminal (Terminal 3), navigate to the xtaint/dqib_arm64-virt folder, and run:
./qmp.sh
After successfully connecting via QMP, Terminal 3 will receive a response from QEMU.
Wait until the Debian operating system has booted completely and the login prompt appears.

Open another terminal (Terminal 4), navigate to the xtaint/dqib_arm64-virt folder, and run:
python3 copy_out_sysmap.py
Once you see the message "done.", the guest kernel symbol table has been copied to the host machine.
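The script name suggests that the copied symbol table is a Linux System.map file, in which each line holds a hexadecimal address, a one-letter symbol type, and a symbol name. Assuming that format (this wiki does not state it), a minimal sketch for looking up a kernel symbol on the host might look like the following. The function parse_system_map and the sample addresses are illustrative only and are not part of TaintEMU.

```python
def parse_system_map(text):
    """Map symbol name -> address for System.map-style text.

    Each non-empty line is expected to look like:
        <hex address> <type letter> <symbol name>
    Lines that do not match this shape are skipped.
    """
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue
        addr_text, _sym_type, name = parts[0], parts[1], parts[2]
        try:
            addr = int(addr_text, 16)
        except ValueError:
            continue
        # Keep the first address seen for a name; later duplicates
        # (for example, static functions sharing a name) are ignored.
        symbols.setdefault(name, addr)
    return symbols


# Made-up sample data for illustration only.
SAMPLE = """\
ffff800008000000 T _text
ffff8000080a1b2c T do_sys_open
ffff8000080c3d4e t helper
"""

if __name__ == "__main__":
    table = parse_system_map(SAMPLE)
    print(hex(table["do_sys_open"]))
```

A lookup table of this kind is one plausible way to check, before registering a hook by name, that the function actually exists in the guest kernel.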

In Terminal 3, run the following commands:
{ "execute": "qmp_capabilities" }
{ "execute": "setup-vmi", "arguments":{"path":"config.json"} }
After executing these commands, the virtual machine introspection (VMI) function is configured.

In Terminal 3, run the command:
{ "execute": "x-ray-ps" }
The process list will be returned, indicating that the VMI function is working properly.
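QMP exchanges one JSON object per message, so the commands entered in Terminal 3 can also be generated by a script instead of being typed by hand. The sketch below only builds the message text; it does not open a connection, because the socket that qmp.sh attaches to is not documented here. The helper names qmp_command and vmi_session are illustrative.

```python
import json


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a single line of JSON."""
    message = {"execute": name}
    if arguments is not None:
        message["arguments"] = arguments
    return json.dumps(message)


def vmi_session(config_path="config.json"):
    """Return the commands from this walkthrough, in the order they are sent."""
    return [
        # Must come first: it ends QMP capabilities negotiation.
        qmp_command("qmp_capabilities"),
        qmp_command("setup-vmi", {"path": config_path}),
        qmp_command("x-ray-ps"),
    ]


if __name__ == "__main__":
    for line in vmi_session():
        print(line)
```

Each printed line can be pasted into the QMP terminal as is. Sending them programmatically would additionally require reading the reply to each command before issuing the next one.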

Log in as the root user:
Username: root
Password: root
Run the following command:
cat /dev/ttyUSB0 | grep hello
In Terminal 2, type:
hello,world
Terminal 1 will echo the tracking information.

This concludes the example.
TaintEMU is a QEMU-based dynamic information flow tracking tool that provides high-performance and high-compatibility tracking capabilities across various instruction sets. The functionality consists of two main modules: virtual machine introspection and dynamic information flow tracking.
The virtual machine introspection feature follows an event-driven programming model based on the publish-subscribe pattern. The event source is a guest function call, and users register an event handler for each call they want to observe. The dynamic information flow tracking feature provides a set of interfaces for reading and writing data labels. With these interfaces, users can implement whole-system dynamic information flow tracking tailored to their needs.
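As a conceptual illustration of the publish-subscribe model described above, the following sketch keeps a table of handlers keyed by guest function name and invokes them when a matching call event is published. It is a toy model in Python, not TaintEMU code: HookRegistry, subscribe, and publish are hypothetical names, and the function names used in the demonstration are arbitrary examples.

```python
from collections import defaultdict


class HookRegistry:
    """Toy event bus: guest function calls are events, handlers subscribe by name."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, function_name, handler):
        """Register handler to run whenever function_name is called."""
        self._handlers[function_name].append(handler)

    def publish(self, function_name, context):
        """Deliver a call event; return how many handlers ran."""
        handlers = self._handlers.get(function_name, [])
        for handler in handlers:
            handler(context)
        return len(handlers)


if __name__ == "__main__":
    seen = []
    registry = HookRegistry()
    registry.subscribe("do_sys_open", lambda ctx: seen.append(ctx["pid"]))
    registry.publish("do_sys_open", {"pid": 42})  # handler runs
    registry.publish("vfs_read", {"pid": 42})     # no subscriber, nothing runs
    print(seen)
```

The point of the pattern is that the event source does not need to know who is listening: a call with no subscribers is simply ignored, and several handlers can observe the same function independently.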
Include the header file "sysemu/x-ray.h"
int x_ray_add_kernel_hook (const char *name, xray_callback_t cb);
Description: Set a hook function based on the function name; when the guest executes the kernel function specified by name, the callback function cb is called.
Parameters:
- @name: The name of the kernel function.
- @cb: The callback function.
int x_ray_add_process_hook (TVM_task_struct *task, uint64_t ptr, xray_callback_t cb);
Description: Set a hook function based on the process descriptor and memory address; when the guest executes the process described by task and reaches the address ptr, the callback function cb is called.
Parameters:
- @task: The process descriptor.
- @ptr: The memory address.
- @cb: The callback function.
TVM_task_struct* x_ray_get_current_task (CPUState *cpu);
Description: Return the descriptor of the process currently running on the CPU identified by the given CPUState.
Parameters:
- @cpu: The specified CPUState.
- Data Label Read/Write Interfaces
Include the header file "exec/cpu_ldst.h"
uint32_t cpu_ldub_taint(CPUArchState *env, abi_ptr ptr);
int cpu_ldsb_taint(CPUArchState *env, abi_ptr ptr);
uint32_t cpu_lduw_be_taint(CPUArchState *env, abi_ptr ptr);
int cpu_ldsw_be_taint(CPUArchState *env, abi_ptr ptr);
uint32_t cpu_ldl_be_taint(CPUArchState *env, abi_ptr ptr);
uint64_t cpu_ldq_be_taint(CPUArchState *env, abi_ptr ptr);
uint32_t cpu_lduw_le_taint(CPUArchState *env, abi_ptr ptr);
int cpu_ldsw_le_taint(CPUArchState *env, abi_ptr ptr);
uint32_t cpu_ldl_le_taint(CPUArchState *env, abi_ptr ptr);
uint64_t cpu_ldq_le_taint(CPUArchState *env, abi_ptr ptr);
void cpu_stb_taint(CPUArchState *env, abi_ptr ptr, uint32_t val);
void cpu_stw_be_taint(CPUArchState *env, abi_ptr ptr, uint32_t val);
void cpu_stl_be_taint(CPUArchState *env, abi_ptr ptr, uint32_t val);
void cpu_stq_be_taint(CPUArchState *env, abi_ptr ptr, uint64_t val);
void cpu_stw_le_taint(CPUArchState *env, abi_ptr ptr, uint32_t val);
void cpu_stl_le_taint(CPUArchState *env, abi_ptr ptr, uint32_t val);
void cpu_stq_le_taint(CPUArchState *env, abi_ptr ptr, uint64_t val);
The interfaces are similar to QEMU native CPU interfaces. For usage details, refer to the QEMU API Documentation.
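By analogy with the QEMU data accessors they mirror, the _be and _le suffixes presumably select the byte order used when a multi-byte label value is split into, or assembled from, consecutive bytes starting at ptr. The sketch below models that behavior with a dictionary standing in for label storage. ShadowMemory and its methods are hypothetical, and the sketch makes no claim about how TaintEMU stores labels internally.

```python
class ShadowMemory:
    """Toy label store: one label byte per guest address, default 0 (unlabeled)."""

    def __init__(self):
        self._labels = {}

    def store(self, addr, value, size, byteorder):
        """Write a size-byte label value at addr using 'little' or 'big' order."""
        data = value.to_bytes(size, byteorder)
        for offset, byte in enumerate(data):
            self._labels[addr + offset] = byte

    def load(self, addr, size, byteorder, signed=False):
        """Read a size-byte label value from addr in the given byte order."""
        data = bytes(self._labels.get(addr + i, 0) for i in range(size))
        return int.from_bytes(data, byteorder, signed=signed)


if __name__ == "__main__":
    shadow = ShadowMemory()
    # Analogous to a 32-bit little-endian label store.
    shadow.store(0x1000, 0x11223344, 4, "little")
    # Label of the lowest-addressed byte only.
    print(hex(shadow.load(0x1000, 1, "little")))
    # The same four label bytes reassembled in big-endian order.
    print(hex(shadow.load(0x1000, 4, "big")))
```

In this model, the signed flag plays the role of the ldsb and ldsw variants, which sign-extend the loaded value, while addresses that were never written read back as zero, meaning unlabeled.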
- Callback Interfaces
Include the header file "tcg/tcg-taint.h"
void taint_write_notify (uint64_t addr, uint64_t taint, uint64_t val, CPUArchState *env);
Description: Callback function invoked when labeled data is written to memory.
Parameters:
- @addr: The address of the memory being written to.
- @taint: The label value.
- @val: The value being written.
- @env: The guest CPU environment (CPUArchState).
void taint_read_notify (uint64_t addr, uint64_t taint, uint64_t val, CPUArchState *env);
Description: Callback function invoked when labeled data is read from memory.
Parameters:
- @addr: The address of the labeled data in memory.
- @taint: The label value.
- @val: The value being read.
- @env: The guest CPU environment (CPUArchState).
void taint_exec_notify (uint64_t addr, uint64_t taint);
Description: Callback function invoked when labeled data is executed.
Parameters:
- @addr: The address of the labeled data in memory.
- @taint: The label value.
- @val: The instruction value.
- @env: The guest CPU environment (CPUArchState).