yo-yo-yo-jbo/python_for_researchers

Python for security researchers

This blogpost will show some techniques I personally like to use, that involve Python.
Unlike other publications I had, this is an "opinion" or "usage" blogpost, so take what I write here as my favorite techniques rather than things set in stone.
Of course, I could be talking about pwntools (which I slightly covered in my introduction to pwn blogpost) or any other module, but I thought I'd show things that are more unique.
In the first part of this blogpost, we'll show how to "move from C to Python" - assuming we already have native code execution, we'll show how to execute Python code from it.
In the second part - we'll do the opposite - we'll use ctypes to interact with low-level API using Python.

Why Python?

Python is awesome. These days there's a lot of debate on how much we should be using Python due to its performance compared to C, but in my opinion that's a fallacy.
I use Python when I need to use Python, and I use C when I need to use C. Also, last time I checked, scripting languages such as PowerShell on Windows or AppleScript on macOS are used by real malware authors all the time...
In any case, Python's advantages are clear:

  1. It's popular (has tons of libraries to use, has a rich community of developers).
  2. It's open-source.
  3. It has amazing integration with C (more on that later).
  4. It's very easy to express ideas quickly into code without worrying about strong typing or freeing memory (in most cases).

Now that we got that out of the way, I'd like to show a quick introduction on how to move from C to Python execution and vice-versa.

From C to Python

One thing I sometimes do is compile CPython myself, add functionality etc. and ship it, if I need a quick implant.
This is quite easy to do but might end up with a large binary, which might not be advantageous, although it is a bit harder for defenders to write signatures for.
One other trick I use and haven't seen anyone else use is loading Python dynamically on systems that already have Python (commonly Linux, some macOS).
You can easily find your dynamic library path like this:

# distutils was removed in Python 3.12; the standard sysconfig module exposes the same variables
from sysconfig import get_config_var
import os
print(os.path.join(get_config_var('LIBDIR'), get_config_var('LDLIBRARY')))

For example, on my Linux box, I got /usr/lib/x86_64-linux-gnu/libpython3.10.so.1.0.
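The lookup can be made a little more defensive. The following is a minimal sketch, not a definitive implementation: the helper name find_libpython is hypothetical, and it assumes that a build may report a static archive (ending in .a) or a file that does not exist on disk, in which case it also tries the INSTSONAME configuration variable before giving up.

```python
import os
import sysconfig


def find_libpython():
    """Return the path to this interpreter's shared libpython, or None (hypothetical helper)."""
    libdir = sysconfig.get_config_var('LIBDIR')
    if not libdir:
        # Some platforms (e.g. Windows) do not define these build variables
        return None
    for var in ('LDLIBRARY', 'INSTSONAME'):
        name = sysconfig.get_config_var(var)
        # Skip missing values and static archives, which dlopen cannot load
        if not name or name.endswith('.a'):
            continue
        candidate = os.path.join(libdir, name)
        if os.path.isfile(candidate):
            return candidate
    return None


libpython_path = find_libpython()
print(libpython_path)
```

A result of None simply means this interpreter does not seem to ship a loadable shared library in its configured library directory.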
We could use this library to execute arbitrary Python code straight from C! Here's why:

jbo@hax:~$ readelf -sW /usr/lib/x86_64-linux-gnu/libpython3.10.so | grep Run_SimpleString
  1118: 000000000020d660   130 FUNC    GLOBAL DEFAULT   14 PyRun_SimpleStringFlags
  1214: 000000000020d6f0    11 FUNC    GLOBAL DEFAULT   14 PyRun_SimpleString

The documentation for it is clear:

int PyRun_SimpleString(const char *command);

If we try to execute it as-is we'll get a segmentation fault because we first need to "initialize an environment", which is done by calling the Py_Initialize function.
We will also call the corresponding Py_Finalize function to finalize the environment (cleanup), and then we can run any code we want!

#include <stdio.h>
#include <dlfcn.h>
#include <stdbool.h>

typedef void (*py_init_t)();
typedef void (*py_fini_t)();

typedef int (*py_exec_t)(const char*);

int
main(
    int argc,
    char** argv
)
{
    int result = -1;
    char* libpython_path = NULL;
    char* python_code = NULL;
    void* handle = NULL;
    py_init_t py_init = NULL;
    py_fini_t py_fini = NULL;
    py_exec_t py_exec = NULL;
    bool should_finalize = false;

    // Validate argument(s)
    if (3 > argc)
    {
        fprintf(stderr, "Missing arguments.\nUsage: %s [LIBPYTHON_PATH] [PYTHON_CODE]\n", argv[0]);
        goto cleanup;
    }

    // Get arguments
    libpython_path = argv[1];
    python_code = argv[2];

    // Load Python
    handle = dlopen(libpython_path, RTLD_LAZY);
    if (NULL == handle)
    {
        fprintf(stderr, "Error loading Python library.\n");
        goto cleanup;
    }

    // Resolve symbols
    py_init = dlsym(handle, "Py_Initialize");
    if (NULL == py_init)
    {
        fprintf(stderr, "Error resolving symbol \"Py_Initialize\".\n");
        goto cleanup;
    }
    py_fini = dlsym(handle, "Py_Finalize");
    if (NULL == py_fini)
    {
        fprintf(stderr, "Error resolving symbol \"Py_Finalize\".\n");
        goto cleanup;
    }
    py_exec = dlsym(handle, "PyRun_SimpleString");
    if (NULL == py_exec)
    {
        fprintf(stderr, "Error resolving symbol \"PyRun_SimpleString\".\n");
        goto cleanup;
    }

    // Initialize environment
    py_init();
    should_finalize = true;

    // Execute command and propagate result
    result = py_exec(python_code);

cleanup:

    // Free resources
    if (should_finalize)
    {
        py_fini();
    }
    if (NULL != handle)
    {
        dlclose(handle);
        handle = NULL;
    }

    // Return result
    return result;
}

Note how simple that is:

jbo@hax:~$ gcc -oc2py ./c2py.c
jbo@hax:~$ ./c2py /usr/lib/x86_64-linux-gnu/libpython3.10.so "import os;os.system('id');"
uid=1000(jbo) gid=1000(jbo) groups=1000(jbo),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),122(lpadmin),135(lxd),136(sambashare),142(libvirt)

Of course, you can make this code stealthier - but my point is using the Python library as a living-off-the-land technique.
On macOS, it's even worse for defenders, since EDRs might rely on seeing a Python command line or a script on disk - and neither exists here.
I've included the source of c2py in this repository for you to download and experiment with.
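The return value that c2py propagates can be explored without compiling anything, because ctypes exposes the already-loaded interpreter library as ctypes.pythonapi. This is only a sketch of the API's behavior, run from inside a live interpreter (so Py_Initialize has already happened); the variable names are illustrative.

```python
import ctypes

# PyRun_SimpleString, resolved from the interpreter that is already running
run_simple_string = ctypes.pythonapi.PyRun_SimpleString
run_simple_string.argtypes = [ctypes.c_char_p]
run_simple_string.restype = ctypes.c_int

# Successful code returns 0
rc_ok = run_simple_string(b"_c2py_demo = 6 * 7")

# An uncaught exception prints a traceback to stderr and returns -1
rc_err = run_simple_string(b"raise ValueError('demo failure')")

print(rc_ok, rc_err)
```

So a non-zero exit code from c2py means the supplied Python code raised an exception (or that setup failed before it ran).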

From Python to C

This transition might seem odd - we already have the power of Python, why would we need to work in native code?
Well, sometimes you might want to do low-level work, interacting with memory, working with shellcodes and so on.
The advantage of working with Python here is clear - we benefit from running in a process that can run anything, with its logic not present on disk.
So, for this blogpost, we'll do an interesting exercise - write a DLL injector for Windows!
As a reminder, here's what a C-coded injector would do (I've shown how to implement that exactly in a past blogpost):

  1. Given a process name, map it to a process ID by calling the CreateToolhelp32Snapshot API and calling Process32First and Process32Next APIs.
  2. Once a PID is found - call OpenProcess on it to get a handle.
  3. Now we'd like to write the DLL name in the foreign process address space - we first allocate memory via the VirtualAllocEx API and then write the DLL path there with WriteProcessMemory.
  4. Finally, we call CreateRemoteThread API on the address of kernel32!LoadLibraryW and the argument which is the foreign process memory address we allocated and wrote on.

So, we will do all of that one step at a time, with the best Python module - ctypes!

Introduction to ctypes

The ctypes module comes pre-shipped with Python and is capable of interacting with shared libraries, raw memory and so on.
It's quite a heavy module in terms of the functionality it exposes, so we'll learn by example.

From process name to PID

The first part we'd want to do is convert the process name into an ID, and use the CreateToolhelp32Snapshot, Process32First and Process32Next APIs. For that, we'll need a handle to kernel32.dll, which is the DLL that implements all of those APIs.
In ctypes, we have the capability to load libraries, so we can get a kernel32.dll handle easily:

import ctypes
kernel32 = ctypes.windll.LoadLibrary('kernel32.dll')

We can already refer to functions exported by kernel32 but we'd need to help Python understand the structures, argument types and return type of functions.
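To get a feel for what such declarations buy you, here is a minimal sketch on a POSIX system (an assumption - it uses the C library's strlen rather than kernel32, and ctypes.CDLL(None) to reach symbols already loaded in the process). Once argtypes is set, ctypes converts arguments for you and rejects ones of the wrong type instead of passing garbage to native code.

```python
import ctypes

# Assumption: POSIX system; CDLL(None) exposes symbols already loaded in the process
libc = ctypes.CDLL(None)

# Prototype - size_t strlen(const char*)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b'kernel32.dll')
print(length)

# A str is not a char*, so ctypes refuses the call before reaching native code
rejected = False
try:
    libc.strlen('kernel32.dll')
except ctypes.ArgumentError:
    rejected = True
print(rejected)
```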
Let's start with CreateToolhelp32Snapshot:

HANDLE CreateToolhelp32Snapshot(
  [in] DWORD dwFlags,
  [in] DWORD th32ProcessID
);

We need to declare that the function takes two DWORDs and returns a HANDLE, and we can do it easily:

import ctypes
from ctypes import wintypes

# Get kernel32
kernel32 = ctypes.windll.LoadLibrary('kernel32.dll')

# Prototype - kernel32!CreateToolhelp32Snapshot
kernel32.CreateToolhelp32Snapshot.argtypes = [ wintypes.DWORD, wintypes.DWORD ]
kernel32.CreateToolhelp32Snapshot.restype = wintypes.HANDLE

Note I use wintypes which is imported from ctypes - if you're dealing with raw C you can use ctypes types:

C type           ctypes type
bool             ctypes.c_bool
unsigned char    ctypes.c_ubyte
char             ctypes.c_char
char*            ctypes.c_char_p
double           ctypes.c_double
int              ctypes.c_int
int64_t          ctypes.c_int64
uint64_t         ctypes.c_uint64
long             ctypes.c_long
size_t           ctypes.c_size_t
void*            ctypes.c_void_p

The wintypes types are just nice definitions of existing ctypes types - for example:

HANDLE = ctypes.c_void_p
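A few of these mappings can be checked interactively. The sketch below relies only on the standard ctypes module; note that ctypes.c_byte is a signed char while ctypes.c_ubyte is an unsigned char, and that ctypes integer types do no overflow checking.

```python
import ctypes
from ctypes import wintypes

# wintypes names are plain aliases of ctypes types
handle_is_void_p = wintypes.HANDLE is ctypes.c_void_p

# The same bit pattern read as signed char vs. unsigned char
signed_value = ctypes.c_byte(200).value
unsigned_value = ctypes.c_ubyte(200).value

# Fixed-width types have the size their name promises
int64_size = ctypes.sizeof(ctypes.c_int64)

print(handle_is_void_p, signed_value, unsigned_value, int64_size)
```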

Let's continue to the next API - Process32First:

BOOL Process32First(
  [in]      HANDLE           hSnapshot,
  [in, out] LPPROCESSENTRY32 lppe
);

First note that MSDN "lies" - Process32First is really either an ANSI or a wide version. We'd like to work with the wide version, since this is how Windows works internally and Python strings are Unicode by default. So, we'll refer to the API as kernel32.Process32FirstW.
The other problem is the data structure PROCESSENTRY32W (note that I took the wide version of the structure!), which is luckily documented:

typedef struct tagPROCESSENTRY32W {
  DWORD     dwSize;
  DWORD     cntUsage;
  DWORD     th32ProcessID;
  ULONG_PTR th32DefaultHeapID;
  DWORD     th32ModuleID;
  DWORD     cntThreads;
  DWORD     th32ParentProcessID;
  LONG      pcPriClassBase;
  DWORD     dwFlags;
  WCHAR     szExeFile[MAX_PATH];
} PROCESSENTRY32W;

Luckily, we can easily define that structure in ctypes!

MAX_PATH = 260
ULONG_PTR = wintypes.LPVOID

# Define the PROCESSENTRY32W structure
class PROCESSENTRY32W(ctypes.Structure):
    _fields_ = [
        ('dwSize',              wintypes.DWORD),
        ('cntUsage',            wintypes.DWORD),
        ('th32ProcessID',       wintypes.DWORD),
        ('th32DefaultHeapID',   ULONG_PTR),
        ('th32ModuleID',        wintypes.DWORD),
        ('cntThreads',          wintypes.DWORD),
        ('th32ParentProcessID', wintypes.DWORD),
        ('pcPriClassBase',      wintypes.LONG),
        ('dwFlags',             wintypes.DWORD),
        ('szExeFile',           wintypes.WCHAR * MAX_PATH)
    ]

There are a few interesting takeaways here:

  1. Note each structure like that is a Python class. Inheriting from ctypes.Structure means Python will be looking for a member _fields_ which is a list of 2-tuples (name, type).
  2. Note how the last member has a type of wintypes.WCHAR * MAX_PATH - yes, ctypes allows defining an array type like that (equivalent to WCHAR[260]).
  3. In C, structures by default might have padding (unless defined packed) - ctypes takes care of that too and thus the PROCESSENTRY32W class is binary-compatible with the Windows PROCESSENTRY32W structure!
  4. I defined ULONG_PTR manually since that doesn't exist in wintypes.
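The padding claim can be checked with ctypes.sizeof and the per-field offset attribute. This is a layout-only sketch: the class name ProcessEntryLayout is hypothetical, and it swaps the wintypes names for fixed-width stand-ins (c_uint32 for DWORD, c_int32 for LONG, c_uint16 for WCHAR) so that it reproduces the Windows layout even when run on another OS.

```python
import ctypes

MAX_PATH = 260


class ProcessEntryLayout(ctypes.Structure):
    """Layout-only stand-in for PROCESSENTRY32W using fixed-width types (hypothetical name)."""
    _fields_ = [
        ('dwSize',              ctypes.c_uint32),
        ('cntUsage',            ctypes.c_uint32),
        ('th32ProcessID',       ctypes.c_uint32),
        ('th32DefaultHeapID',   ctypes.c_void_p),   # pointer-sized, so it needs pointer alignment
        ('th32ModuleID',        ctypes.c_uint32),
        ('cntThreads',          ctypes.c_uint32),
        ('th32ParentProcessID', ctypes.c_uint32),
        ('pcPriClassBase',      ctypes.c_int32),
        ('dwFlags',             ctypes.c_uint32),
        ('szExeFile',           ctypes.c_uint16 * MAX_PATH),
    ]


pointer_size = ctypes.sizeof(ctypes.c_void_p)
heap_id_offset = ProcessEntryLayout.th32DefaultHeapID.offset
total_size = ctypes.sizeof(ProcessEntryLayout)
print(pointer_size, heap_id_offset, total_size)
```

On a 64-bit interpreter the three DWORDs before th32DefaultHeapID occupy 12 bytes, yet the field lands at offset 16 and the total size is 568 bytes; on a 32-bit interpreter no padding is needed and the size is 556.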

Now we can easily declare the prototype for the functions:

kernel32.Process32FirstW.argtypes = [ wintypes.HANDLE, ctypes.POINTER(PROCESSENTRY32W) ]
kernel32.Process32FirstW.restype = wintypes.BOOL
kernel32.Process32NextW.argtypes = [ wintypes.HANDLE, ctypes.POINTER(PROCESSENTRY32W) ]
kernel32.Process32NextW.restype = wintypes.BOOL

Note how you can use ctypes.POINTER to define a pointer to an existing type - you can also use the builtin types (e.g. ctypes.POINTER(ctypes.c_int) is equivalent to C's int*).
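As a small illustration of pointer types, here is a sketch that stays entirely inside ctypes (no native library involved); the variable names are illustrative.

```python
import ctypes

# The type "int*"
IntPtr = ctypes.POINTER(ctypes.c_int)

number = ctypes.c_int(41)

# ctypes.pointer creates a pointer instance that refers to an existing object
ptr = ctypes.pointer(number)

# Writing through the pointer modifies the original object, like (*ptr)++ in C
ptr.contents.value += 1

print(number.value, ptr[0], isinstance(ptr, IntPtr))
```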
Now we can write our first function that gets a process name and finds its PID:

import ctypes
from ctypes import wintypes

# Windows definitions
MAX_PATH = 260
TH32CS_SNAPPROCESS = 0x00000002
INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value  # a plain int, so it compares correctly with returned handles
SIZE_T = ctypes.c_size_t
ULONG_PTR = wintypes.LPVOID

# Get kernel32
kernel32 = ctypes.windll.LoadLibrary('kernel32.dll')

# Define the PROCESSENTRY32W structure
class PROCESSENTRY32W(ctypes.Structure):
    _fields_ = [
        ('dwSize',              wintypes.DWORD),
        ('cntUsage',            wintypes.DWORD),
        ('th32ProcessID',       wintypes.DWORD),
        ('th32DefaultHeapID',   wintypes.LPVOID),
        ('th32ModuleID',        wintypes.DWORD),
        ('cntThreads',          wintypes.DWORD),
        ('th32ParentProcessID', wintypes.DWORD),
        ('pcPriClassBase',      wintypes.LONG),
        ('dwFlags',             wintypes.DWORD),
        ('szExeFile',           wintypes.WCHAR * MAX_PATH)
    ]

# Prototype - kernel32!CreateToolhelp32Snapshot
kernel32.CreateToolhelp32Snapshot.argtypes = [ wintypes.DWORD, wintypes.DWORD ]
kernel32.CreateToolhelp32Snapshot.restype = wintypes.HANDLE

# Prototype - kernel32!CloseHandle
kernel32.CloseHandle.argtypes = [ wintypes.HANDLE ]
kernel32.CloseHandle.restype = wintypes.BOOL

# Prototype - kernel32!Process32FirstW
kernel32.Process32FirstW.argtypes = [ wintypes.HANDLE, ctypes.POINTER(PROCESSENTRY32W) ]
kernel32.Process32FirstW.restype = wintypes.BOOL

# Prototype - kernel32!Process32NextW
kernel32.Process32NextW.argtypes = [ wintypes.HANDLE, ctypes.POINTER(PROCESSENTRY32W) ]
kernel32.Process32NextW.restype = wintypes.BOOL

def find_pid(process_name:str) -> int:
    """
        Finds a process by its name and returns its PID.
    """

    # Save the lowercase process name
    process_name_lower = process_name.lower()

    # Handle exceptions
    snapshot = INVALID_HANDLE_VALUE
    try:

        # Get a snapshot
        snapshot = kernel32.CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0)
        assert snapshot != INVALID_HANDLE_VALUE, Exception('kernel32!CreateToolhelp32Snapshot failed')

        # Get the first process
        entry = PROCESSENTRY32W()
        entry.dwSize = ctypes.sizeof(entry)
        assert kernel32.Process32FirstW(snapshot, ctypes.byref(entry)), Exception('kernel32!Process32FirstW failed')

        # Iterate the snapshot
        while True:

            # Compare entry name (case insensitive) and return the PID if found
            if entry.szExeFile.lower() == process_name_lower:
                return entry.th32ProcessID

            # Continue to the next entry
            assert kernel32.Process32NextW(snapshot, ctypes.byref(entry)), Exception(f'Process name "{process_name}" not found')

    # Cleanup
    finally:

        # Free resources
        if snapshot != INVALID_HANDLE_VALUE:
            kernel32.CloseHandle(snapshot)

Note some interesting insights:

  1. I defined kernel32!CloseHandle prototype as well because we need to clean up Windows handles when we're done (the snapshot handle).
  2. Note how we use try..finally to cleanup the snapshot - this is a common Python pattern for cleaning up resources.
  3. Note how we access PROCESSENTRY32W members using ordinary Python attribute syntax, including treating szExeFile as a Python string!
  4. Note the use of ctypes.byref to pass a variable by its reference (just like the & operator you'd use in C).
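The byref pattern is the same for any API with an output parameter. Here is a minimal sketch on a POSIX system (an assumption - it uses the C library's strtol instead of a Windows API), where the second argument is a char** that receives a pointer to the first unparsed character.

```python
import ctypes

# Assumption: POSIX system; CDLL(None) exposes symbols already loaded in the process
libc = ctypes.CDLL(None)

# Prototype - long strtol(const char* str, char** endptr, int base)
libc.strtol.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_char_p), ctypes.c_int]
libc.strtol.restype = ctypes.c_long

# Keep a reference to the input, since the output pointer points inside it
text = b'1234 explorer.exe'
end = ctypes.c_char_p()

# byref(end) plays the role of &end in C
number = libc.strtol(text, ctypes.byref(end), 10)
rest = end.value

print(number, rest)
```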

Injecting to a PID

Similarly, we'll define the main prototypes we require for injecting the DLL:

# Type - SIZE_T
SIZE_T = ctypes.c_size_t

# Prototype - kernel32!OpenProcess
kernel32.OpenProcess.argtypes = [ wintypes.DWORD, wintypes.BOOL, wintypes.DWORD ]
kernel32.OpenProcess.restype = wintypes.HANDLE

# Prototype - kernel32!VirtualAllocEx
kernel32.VirtualAllocEx.argtypes = [ wintypes.HANDLE, wintypes.LPVOID, SIZE_T, wintypes.DWORD, wintypes.DWORD ]
kernel32.VirtualAllocEx.restype = wintypes.LPVOID

# Prototype - kernel32!WriteProcessMemory
kernel32.WriteProcessMemory.argtypes = [ wintypes.HANDLE, wintypes.LPVOID, wintypes.LPVOID, SIZE_T, ctypes.POINTER(SIZE_T) ]
kernel32.WriteProcessMemory.restype = wintypes.BOOL

# Prototype - kernel32!GetProcAddress
kernel32.GetProcAddress.argtypes = [ wintypes.HMODULE, wintypes.LPCSTR ]
kernel32.GetProcAddress.restype = wintypes.LPVOID

# Prototype - kernel32!CreateRemoteThread
kernel32.CreateRemoteThread.argtypes = [ wintypes.HANDLE, wintypes.LPVOID, SIZE_T, wintypes.LPVOID, wintypes.LPVOID, wintypes.DWORD, ctypes.POINTER(wintypes.DWORD) ]
kernel32.CreateRemoteThread.restype = wintypes.HANDLE

A couple of notes on these prototypes:

  1. Note how in CreateRemoteThread I skipped defining the structure for SECURITY_ATTRIBUTES and used wintypes.LPVOID instead - since I know I will be supplying None (the Python way of supplying a C NULL) there's no need to really define it as a structure.
  2. I also skipped defining the THREAD_START_ROUTINE. In ctypes, you can easily do that with ctypes.WINFUNCTYPE, but since I will simply be supplying the address of kernel32!LoadLibraryW, I will not be needing it.
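For completeness, this is roughly what a function-pointer type looks like. The sketch uses ctypes.CFUNCTYPE (the cdecl counterpart of ctypes.WINFUNCTYPE) with the C library's qsort on a POSIX system; both the platform and the names CompareFunc and compare_ints are assumptions made for illustration.

```python
import ctypes

# Assumption: POSIX system; CDLL(None) exposes symbols already loaded in the process
libc = ctypes.CDLL(None)

# Function pointer type - int (*)(const int*, const int*)
CompareFunc = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(ctypes.c_int))

# Prototype - void qsort(void* base, size_t nmemb, size_t size, compar)
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_size_t, CompareFunc]
libc.qsort.restype = None


def compare_ints(a, b):
    # Return negative, zero or positive, as qsort expects
    return (a[0] > b[0]) - (a[0] < b[0])


# Keep a reference to the callback object for as long as native code may call it
callback = CompareFunc(compare_ints)

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), callback)
sorted_values = list(values)
print(sorted_values)
```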

Now let's get to the injection itself:

PROCESS_CREATE_THREAD = 0x0002
PROCESS_VM_OPERATION = 0x0008
PROCESS_VM_READ = 0x0010
PROCESS_VM_WRITE = 0x0020
MEM_COMMIT = 0x00001000
MEM_RESERVE = 0x00002000
PAGE_READWRITE = 0x04

def inject_to_pid(pid:int, dll_path:str):
    """
        Injects a DLL to the given PID.
    """

    # Handle exceptions
    proc = None
    remote_thread = None
    try:
        
        # Get the address of kernel32!LoadLibraryW
        load_library_w_pfn = kernel32.GetProcAddress(kernel32._handle, b'LoadLibraryW\0')
        assert load_library_w_pfn is not None, Exception('kernel32!GetProcAddress failed')

        # Open the process
        proc = kernel32.OpenProcess(PROCESS_CREATE_THREAD | PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_VM_WRITE, False, pid)
        assert proc is not None, Exception('kernel32!OpenProcess failed')

        # Create a wide buffer containing the DLL path including the NUL terminator
        dll_buffer = ctypes.create_unicode_buffer(dll_path)

        # Allocate memory in foreign process
        remote_addr = kernel32.VirtualAllocEx(proc, None, ctypes.sizeof(dll_buffer), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE)
        assert remote_addr is not None, Exception('kernel32!VirtualAllocEx failed')

        # Copy the wide buffer
        written = SIZE_T()
        total = 0        
        while total < ctypes.sizeof(dll_buffer):
            assert kernel32.WriteProcessMemory(proc, remote_addr + total, ctypes.addressof(dll_buffer) + total, ctypes.sizeof(dll_buffer) - total, ctypes.byref(written)), Exception('kernel32!WriteProcessMemory failed')
            total += written.value

        # Create the remote thread
        remote_thread = kernel32.CreateRemoteThread(proc, None, 0, load_library_w_pfn, remote_addr, 0, None)
        assert remote_thread is not None, Exception('kernel32!CreateRemoteThread failed')

    # Cleanup
    finally:

        # Free resources
        if remote_thread is not None:
            kernel32.CloseHandle(remote_thread)
        if proc is not None:
            kernel32.CloseHandle(proc)

Some notes on this function:

  • Note how I use ctypes.addressof to get the address of a variable - I use that in case WriteProcessMemory only wrote a partial buffer.
  • Note ctypes.create_unicode_buffer, which also takes care of the NUL terminator for me (in C you'd have to remember to copy the NUL terminator, in principle).
  • Note I used kernel32._handle to get the base address of kernel32 (I could have used kernel32!GetModuleHandleW but I already have a handle to kernel32 anyway).
  • Again note the nice cleanup in a try..finally pattern.
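The partial-write loop can be exercised locally without touching another process. In this sketch, fake_write is a hypothetical stand-in for WriteProcessMemory that deliberately copies at most a few bytes per call within the current process, so the loop has to advance both addresses using ctypes.addressof exactly as the injector does.

```python
import ctypes


def fake_write(dst_addr, src_addr, size, max_chunk=7):
    """Hypothetical stand-in for WriteProcessMemory: copies at most max_chunk bytes, returns the count."""
    count = min(size, max_chunk)
    ctypes.memmove(dst_addr, src_addr, count)
    return count


dll_path = 'C:\\temp\\example.dll'

# The buffer has one extra element for the NUL terminator
src = ctypes.create_unicode_buffer(dll_path)
terminator_included = len(src) == len(dll_path) + 1

# Zero-filled destination with the same number of wide characters
dst = ctypes.create_unicode_buffer(len(src))

# Keep writing until the whole buffer, terminator included, has been copied
total = 0
size = ctypes.sizeof(src)
while total < size:
    total += fake_write(ctypes.addressof(dst) + total, ctypes.addressof(src) + total, size - total)

print(terminator_included, dst.value == dll_path)
```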

I have uploaded the source code of the complete Python Windows DLL injector to injector.py.

Summary

The C-to-Python angle might be a bit easier to digest, but some of you might still be asking - why go through all of this trouble when you can just compile your code (for going from Python to C)?
Well, from my perspective, after you've done that enough times, you'll have accumulated enough code that wraps useful APIs (or you can use an LLM, if you trust them).
I've done similar things for COM, WinAPI, much of the Linux glibc and some macOS API from both libSystem.dylib as well as some private frameworks.
In any case, I hope this blogpost has been useful.

Stay tuned!

Jonathan Bar Or
