<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://nevesnunes.github.io/blog/feed.xml" rel="self" type="application/atom+xml" /><link href="https://nevesnunes.github.io/blog/" rel="alternate" type="text/html" /><updated>2025-02-02T09:08:29+00:00</updated><id>https://nevesnunes.github.io/blog/feed.xml</id><title type="html">onKeyPress</title><subtitle>A showcase of interesting debugging sessions and other technical writeups related to software development.
</subtitle><entry><title type="html">No thumbnails for you</title><link href="https://nevesnunes.github.io/blog/2022/11/12/No-thumbnails-for-you.html" rel="alternate" type="text/html" title="No thumbnails for you" /><published>2022-11-12T00:00:00+00:00</published><updated>2022-11-12T00:00:00+00:00</updated><id>https://nevesnunes.github.io/blog/2022/11/12/No-thumbnails-for-you</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2022/11/12/No-thumbnails-for-you.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<p>When faced with a systems issue, tracing syscalls usually sheds light on possible causes. But what if the syscall result itself is elusive?</p>

<p>This was the case with evince-thumbnailer on a Debian system, which was failing to create any thumbnail files:</p>

<pre><code class="language-strace">openat(AT_FDCWD, "/tmp/o.png", O_WRONLY|O_CREAT|O_TRUNC, 0666) = -1 EACCES (Permission denied)
</code></pre>

<p>But we could replicate this exact syscall with a Python script:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">ctypes</span> <span class="k">as</span> <span class="n">ct</span>
<span class="kn">import</span> <span class="nn">os</span>

<span class="n">libc</span> <span class="o">=</span> <span class="n">ct</span><span class="p">.</span><span class="n">CDLL</span><span class="p">(</span><span class="bp">None</span><span class="p">)</span>
<span class="n">syscall</span> <span class="o">=</span> <span class="n">libc</span><span class="p">.</span><span class="n">syscall</span>
<span class="n">path</span> <span class="o">=</span> <span class="n">ct</span><span class="p">.</span><span class="n">c_char_p</span><span class="p">(</span><span class="s">"/tmp/o.png"</span><span class="p">.</span><span class="n">encode</span><span class="p">(</span><span class="s">'latin-1'</span><span class="p">))</span>
<span class="n">openat</span> <span class="o">=</span> <span class="mi">257</span>  <span class="c1"># syscall number for openat on x86_64</span>
<span class="n">AT_FDCWD</span> <span class="o">=</span> <span class="mh">0xffffff9c</span>  <span class="c1"># (unsigned int32) -100
</span><span class="n">syscall</span><span class="p">(</span><span class="n">openat</span><span class="p">,</span> <span class="n">AT_FDCWD</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="n">os</span><span class="p">.</span><span class="n">O_WRONLY</span><span class="o">|</span><span class="n">os</span><span class="p">.</span><span class="n">O_CREAT</span><span class="o">|</span><span class="n">os</span><span class="p">.</span><span class="n">O_TRUNC</span><span class="p">,</span> <span class="mo">0o666</span><span class="p">)</span>
</code></pre></div></div>

<p>This works fine, even when running as the same user:</p>

<pre><code class="language-strace">openat(AT_FDCWD, "/tmp/o.png", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
</code></pre>
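
<p>As an aside, the <code class="language-plaintext highlighter-rouge">AT_FDCWD</code> value used in the script is just -100 reinterpreted as an unsigned 32-bit integer. A minimal sketch to check that, where the helper name <code class="language-plaintext highlighter-rouge">as_signed32</code> is made up for illustration:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import ctypes

AT_FDCWD_UNSIGNED = 0xffffff9c  # value used in the script above


def as_signed32(value):
    """Reinterpret the low 32 bits of value as a signed integer."""
    # ctypes integer types do no overflow checking, they simply wrap.
    return ctypes.c_int32(value).value


print(as_signed32(AT_FDCWD_UNSIGNED))
</code></pre></div></div>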

<p>It’s odd that writing to a public directory would give a “Permission denied” error…</p>

<p>Now, any seasoned sysadmin will likely have a hunch about what is going on, but what if we are bumping into a component we weren’t aware of? Let’s figure it out with a modest debugging session, where we trace the kernel code related to that syscall.</p>

<h2 id="navigating-the-kernel-source">Navigating the kernel source</h2>

<p>We can start by <a href="https://unix.stackexchange.com/a/397643/318118">getting the source for our distro</a>. We avoid the vanilla source because distro-specific patches might affect the code path we are hitting.</p>

<p>Afterwards, we <a href="https://shellbombs.github.io/vscode-for-linux-kernel/">configure VSCode to read kernel compile commands</a>, so that following function references takes into account include paths defined by our kernel configuration, which is just copied over from <code class="language-plaintext highlighter-rouge">/boot/config-$(uname -r)</code>. In our case, we see several “/arch/x86/include/” paths in the generated “compile_commands.json”.</p>
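
<p>To double check which include paths ended up in the compilation database, something like the following sketch could be used. It assumes the usual JSON compilation database layout (a list of entries with a “command” string or an “arguments” list); the sample entry and the helper name <code class="language-plaintext highlighter-rouge">include_paths</code> are hypothetical, not taken from a real build:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
import shlex

# Hypothetical, shortened entry in the shape of a JSON compilation database.
SAMPLE = """
[
  {
    "directory": "/usr/src/linux",
    "command": "gcc -I./arch/x86/include -I./include -D__KERNEL__ -c -o fs/open.o fs/open.c",
    "file": "fs/open.c"
  }
]
"""


def include_paths(database_text):
    """Return the sorted set of -I paths found across all entries."""
    found = set()
    for entry in json.loads(database_text):
        # Entries carry either a "command" string or an "arguments" list.
        args = entry.get("arguments") or shlex.split(entry.get("command", ""))
        for arg in args:
            # Only the attached form (-Ipath) is handled in this sketch.
            if arg.startswith("-I") and arg != "-I":
                found.add(arg[2:])
    return sorted(found)


print(include_paths(SAMPLE))
</code></pre></div></div>

<p>For a real tree, the text would be read from the generated “compile_commands.json” instead of the inline sample.</p>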

<h2 id="identifying-the-syscall-handler">Identifying the syscall handler</h2>

<p>If we check the <a href="https://www.kernel.org/doc/html/latest/process/adding-syscalls.html">documentation for adding a new syscall</a>, we see a reference to the syscall table file “arch/x86/entry/syscalls/syscall_64.tbl”. If we look up the syscall number shown by strace, we get the handler name <code class="language-plaintext highlighter-rouge">sys_openat</code> from entry <code class="language-plaintext highlighter-rouge">257 common openat sys_openat</code>. The docs also mention these points of interest:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">SYSCALL_DEFINEn(xyzzy, ...)</code> for the entry point;</li>
  <li>corresponding prototype in “include/linux/syscalls.h”;</li>
</ul>
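
<p>The table lookup can also be scripted. This is a rough sketch that assumes whitespace-separated columns (number, abi, name, entry point); the sample only contains the two rows quoted in this post, and <code class="language-plaintext highlighter-rouge">find_entry_point</code> is an illustrative name:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Two rows as quoted in this post; the real file has many more.
SAMPLE_TABLE = """\
# number abi name entry-point
257 common openat sys_openat
437 common openat2 sys_openat2
"""


def find_entry_point(table_text, number):
    """Return the entry point for a syscall number, or None if absent."""
    for line in table_text.splitlines():
        fields = line.split()
        # Skip blank lines and comments.
        if not fields or fields[0].startswith("#"):
            continue
        if fields[0] == str(number):
            # Rows without an entry point only have three columns.
            entry = fields[3:4]
            return entry[0] if entry else None
    return None


print(find_entry_point(SAMPLE_TABLE, 257))
</code></pre></div></div>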

<p>The macro expansion is explained in this snippet from “include/linux/syscalls.h”:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*
 * The asmlinkage stub is aliased to a function named __se_sys_*() which
 * sign-extends 32-bit ints to longs whenever needed. The actual work is
 * done within __do_sys_*().
 */</span>
<span class="cp">#ifndef __SYSCALL_DEFINEx
#define __SYSCALL_DEFINEx(x, name, ...)					\
	__diag_push();							\
	__diag_ignore(GCC, 8, "-Wattribute-alias",			\
		      "Type aliasing is used to sanitize syscall arguments");\
	asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))	\
		__attribute__((alias(__stringify(__se_sys##name))));	\
	ALLOW_ERROR_INJECTION(sys##name, ERRNO);			\
	static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
	asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__));	\
	asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__))	\
	{								\
		long ret = __do_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__));\
		__MAP(x,__SC_TEST,__VA_ARGS__);				\
		__PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__));	\
		return ret;						\
	}								\
	__diag_pop();							\
	static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
#endif </span><span class="cm">/* __SYSCALL_DEFINEx */</span><span class="cp">
</span></code></pre></div></div>
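
<p>Since the symbol names are built by token pasting, the full handler name never appears literally at the definition site. Here is a small sketch of the names this generic macro would produce, assuming that <code class="language-plaintext highlighter-rouge">SYSCALL_DEFINEn(openat, ...)</code> passes <code class="language-plaintext highlighter-rouge">_openat</code> as <code class="language-plaintext highlighter-rouge">name</code>. The helper is purely illustrative, and the <code class="language-plaintext highlighter-rouge">#ifndef</code> guard means an architecture can provide its own variant with different names:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def generated_symbols(syscall):
    """Names produced by pasting sys, __se_sys and __do_sys onto the name."""
    # Assumption: SYSCALL_DEFINEn(openat, ...) passes "_openat" as name.
    name = "_" + syscall
    return {
        "stub": "sys" + name,                 # asmlinkage alias
        "sign_extending": "__se_sys" + name,  # widens 32-bit ints to longs
        "worker": "__do_sys" + name,          # body written by the developer
    }


for role, symbol in generated_symbols("openat").items():
    print(role, symbol)
</code></pre></div></div>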

<p>Grepping for <code class="language-plaintext highlighter-rouge">sys_openat</code> leads us to the corresponding entries:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>include/linux/syscalls.h:465:asmlinkage long sys_openat(
fs/open.c:1164:static long do_sys_openat2(
</code></pre></div></div>

<p>Wait, how come there’s only <code class="language-plaintext highlighter-rouge">do_sys_openat2</code>? That seems like another syscall:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>entry/syscalls/syscall_64.tbl:361:437 common openat2 sys_openat2
</code></pre></div></div>

<p>It turns out that they share an implementation. We can see the definitions in “fs/open.c”:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">SYSCALL_DEFINE4</span><span class="p">(</span><span class="n">openat</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">dfd</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="n">__user</span> <span class="o">*</span><span class="p">,</span> <span class="n">filename</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span>
		<span class="n">umode_t</span><span class="p">,</span> <span class="n">mode</span><span class="p">)</span>
<span class="p">{</span>
	<span class="c1">// ...</span>
	<span class="k">return</span> <span class="n">do_sys_open</span><span class="p">(</span><span class="n">dfd</span><span class="p">,</span> <span class="n">filename</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span> <span class="n">mode</span><span class="p">);</span>
<span class="p">}</span>

<span class="n">SYSCALL_DEFINE4</span><span class="p">(</span><span class="n">openat2</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">dfd</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="n">__user</span> <span class="o">*</span><span class="p">,</span> <span class="n">filename</span><span class="p">,</span>
		<span class="k">struct</span> <span class="n">open_how</span> <span class="n">__user</span> <span class="o">*</span><span class="p">,</span> <span class="n">how</span><span class="p">,</span> <span class="kt">size_t</span><span class="p">,</span> <span class="n">usize</span><span class="p">)</span>
<span class="p">{</span>
	<span class="c1">// ...</span>
	<span class="k">return</span> <span class="n">do_sys_openat2</span><span class="p">(</span><span class="n">dfd</span><span class="p">,</span> <span class="n">filename</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">tmp</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>But the implementation for <code class="language-plaintext highlighter-rouge">do_sys_open</code> calls <code class="language-plaintext highlighter-rouge">do_sys_openat2</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">long</span> <span class="nf">do_sys_open</span><span class="p">(</span><span class="kt">int</span> <span class="n">dfd</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="n">__user</span> <span class="o">*</span><span class="n">filename</span><span class="p">,</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">,</span> <span class="n">umode_t</span> <span class="n">mode</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">open_how</span> <span class="n">how</span> <span class="o">=</span> <span class="n">build_open_how</span><span class="p">(</span><span class="n">flags</span><span class="p">,</span> <span class="n">mode</span><span class="p">);</span>
	<span class="k">return</span> <span class="n">do_sys_openat2</span><span class="p">(</span><span class="n">dfd</span><span class="p">,</span> <span class="n">filename</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">how</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<h2 id="analysis-attempt-1-using-kernel-probes">Analysis attempt 1: Using kernel probes</h2>

<p>Kernel probes are a convenient option, since they don’t require us to run a second kernel instance to debug or to add any drivers; we only need tools like bpftrace to show us which kernel functions are hit.</p>

<p>From the previous section, if we run <code class="language-plaintext highlighter-rouge">bpftrace -lv '*openat*'</code>, we get the expected <code class="language-plaintext highlighter-rouge">kprobe:do_sys_openat2</code>. But there are different kinds of probes, including these ones specific to entry and exit from syscalls:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tracepoint:syscalls:sys_enter_openat
tracepoint:syscalls:sys_exit_openat
</code></pre></div></div>

<p>How to instrument these probes is described in the <a href="https://github.com/iovisor/bpftrace/blob/master/docs/reference_guide.md">reference guide</a>.</p>

<p>Since we are interested in observing the call stack when we exit the syscall, you would hope that a <code class="language-plaintext highlighter-rouge">sys_exit_openat</code> would be enough… but how do you filter it to just match when the filename is “/tmp/o.png”? You don’t, because <code class="language-plaintext highlighter-rouge">args-&gt;filename</code> is only available in <code class="language-plaintext highlighter-rouge">sys_enter_openat</code>, so we have to use a global variable <code class="language-plaintext highlighter-rouge">@match</code> like so:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bpftrace <span class="nt">-e</span> <span class="s1">'
BEGIN { @match = 0; }
tracepoint:syscalls:sys_enter_openat {
    if (strncmp("/tmp/", str(args-&gt;filename), 5) == 0) {
        printf("  path=%s ", str(args-&gt;filename));
        @match = 1;
    }
}
tracepoint:syscalls:sys_exit_openat /@match == 1/ {
    printf("retval=%d\n", args-&gt;ret);
    @[kstack()] = count();
    @match = 0;
}'</span> <span class="nt">-c</span> <span class="s1">'/usr/bin/evince-thumbnailer foo.pdf /tmp/o.png'</span>

<span class="c"># Attaching 3 probes...</span>
<span class="c">#   path=/tmp/o.png retval=-13</span>
<span class="c"># @[]: 1</span>
<span class="c"># @match: 0</span>
</code></pre></div></div>

<p>This isn’t entirely free of issues: what if there’s more than one call to the function we are tracing? The global variable would be set by one of them, but we could end up seeing the return value of another call made with different arguments. Luckily, there was only a single call returning this error value, which was confirmed by removing the conditional logic.</p>
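
<p>One way to reduce that ambiguity is to remember the match per thread instead of in a single flag. In bpftrace this would roughly correspond to keying the map by the <code class="language-plaintext highlighter-rouge">tid</code> builtin. The idea is sketched below as a plain Python simulation, where the event list is made up for illustration:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def match_exits(events, prefix):
    """Pair each exit with the enter seen on the same thread.

    events is a list of (kind, tid, value) tuples, where value is the
    path for "enter" and the return value for "exit".
    """
    pending = {}  # tid to path, set on enter and consumed on exit
    matched = []
    for kind, tid, value in events:
        if kind == "enter":
            if value.startswith(prefix):
                pending[tid] = value
        elif kind == "exit" and tid in pending:
            matched.append((pending.pop(tid), value))
    return matched


# Made-up interleaving: thread 2 opens an unrelated file in between.
EVENTS = [
    ("enter", 1, "/tmp/o.png"),
    ("enter", 2, "/etc/passwd"),
    ("exit", 2, 4),
    ("exit", 1, -13),
]
print(match_exits(EVENTS, "/tmp/"))
</code></pre></div></div>

<p>With a single global flag, the exit on thread 2 would have been reported for “/tmp/o.png” instead.</p>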

<p>We can double check that the return value matches the errno symbol <code class="language-plaintext highlighter-rouge">EACCES</code> in “include/uapi/asm-generic/errno-base.h”:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define	EACCES		13	</span><span class="cm">/* Permission denied */</span><span class="cp">
</span></code></pre></div></div>
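
<p>The same check can be done from userspace without opening the header. This is a small sketch, where treating values from -4095 to -1 as errors is an assumption based on the kernel’s <code class="language-plaintext highlighter-rouge">MAX_ERRNO</code>, and <code class="language-plaintext highlighter-rouge">describe_retval</code> is an illustrative name:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import errno
import os


def describe_retval(retval):
    """Map a negative syscall return value to its errno name and message."""
    # Assumption: error values fall in the range -4095..-1.
    if retval in range(-4095, 0):
        code = -retval
        return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)
    return None, None  # not an error value


print(describe_retval(-13))
</code></pre></div></div>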

<p>Here’s the corresponding call with the Python script, which gives a non-error return value:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bpftrace <span class="nt">-e</span> <span class="s1">'
BEGIN { @match = 0; }
tracepoint:syscalls:sys_enter_openat {
    if (strncmp("/tmp/", str(args-&gt;filename), 5) == 0) {
        printf("  path=%s ", str(args-&gt;filename));
        @match = 1;
    }
}
tracepoint:syscalls:sys_exit_openat /@match == 1/ {
    printf("retval=%d\n", args-&gt;ret);
    @[kstack()] = count();
    @match = 0;
}'</span> <span class="nt">-c</span> <span class="s1">'/usr/bin/python3.9 /home/fn/syscall.py'</span>

<span class="c"># Attaching 3 probes...</span>
<span class="c">#   path=/tmp/o.png retval=3</span>
<span class="c"># @[]: 1</span>
<span class="c"># @match: 0</span>
</code></pre></div></div>

<p>Oh, another caveat… the call stack is empty on these probes, so we actually have to use <code class="language-plaintext highlighter-rouge">kprobe</code>, along with <code class="language-plaintext highlighter-rouge">kretprobe</code> to get the return value:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bpftrace <span class="nt">-e</span> <span class="s1">'
BEGIN { @match = 0; }
kprobe:do_sys_openat2 {
    if (strncmp("/tmp/", str(arg1), 5) == 0) {
        printf("  path=%s ", str(arg1));
        @[kstack()] = count();
        @match = 1;
    }
}
kretprobe:do_sys_openat2 /@match == 1/ {
    printf("retval=%d\n", retval);
    @match = 0;
}
'</span> <span class="nt">-c</span> <span class="s1">'/usr/bin/evince-thumbnailer foo.pdf /tmp/o.png'</span>
<span class="c"># Attaching 3 probes...</span>
<span class="c">#   path=/tmp/o.png retval=-13</span>
<span class="c"># @[</span>
<span class="c">#     do_sys_openat2+1</span>
<span class="c">#     __x64_sys_openat+84</span>
<span class="c">#     do_syscall_64+51</span>
<span class="c">#     entry_SYSCALL_64_after_hwframe+97</span>
<span class="c"># ]: 1</span>
<span class="c"># @match: 0</span>
</code></pre></div></div>

<p>But this call stack isn’t very interesting, since it stops at <code class="language-plaintext highlighter-rouge">do_sys_openat2</code>. We actually want to see what happens inside that call. If we look at the source, a few other open functions are called as well:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">long</span> <span class="nf">do_sys_openat2</span><span class="p">(</span><span class="kt">int</span> <span class="n">dfd</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="n">__user</span> <span class="o">*</span><span class="n">filename</span><span class="p">,</span>
			   <span class="k">struct</span> <span class="n">open_how</span> <span class="o">*</span><span class="n">how</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">open_flags</span> <span class="n">op</span><span class="p">;</span>
	<span class="kt">int</span> <span class="n">fd</span> <span class="o">=</span> <span class="n">build_open_flags</span><span class="p">(</span><span class="n">how</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">op</span><span class="p">);</span>
	<span class="k">struct</span> <span class="n">filename</span> <span class="o">*</span><span class="n">tmp</span><span class="p">;</span>

	<span class="k">if</span> <span class="p">(</span><span class="n">fd</span><span class="p">)</span>
		<span class="k">return</span> <span class="n">fd</span><span class="p">;</span>

	<span class="n">tmp</span> <span class="o">=</span> <span class="n">getname</span><span class="p">(</span><span class="n">filename</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">IS_ERR</span><span class="p">(</span><span class="n">tmp</span><span class="p">))</span>
		<span class="k">return</span> <span class="n">PTR_ERR</span><span class="p">(</span><span class="n">tmp</span><span class="p">);</span>

	<span class="n">fd</span> <span class="o">=</span> <span class="n">get_unused_fd_flags</span><span class="p">(</span><span class="n">how</span><span class="o">-&gt;</span><span class="n">flags</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">fd</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
		<span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="n">f</span> <span class="o">=</span> <span class="n">do_filp_open</span><span class="p">(</span><span class="n">dfd</span><span class="p">,</span> <span class="n">tmp</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">op</span><span class="p">);</span>
		<span class="k">if</span> <span class="p">(</span><span class="n">IS_ERR</span><span class="p">(</span><span class="n">f</span><span class="p">))</span> <span class="p">{</span>
			<span class="n">put_unused_fd</span><span class="p">(</span><span class="n">fd</span><span class="p">);</span>
			<span class="n">fd</span> <span class="o">=</span> <span class="n">PTR_ERR</span><span class="p">(</span><span class="n">f</span><span class="p">);</span>
		<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
			<span class="n">fsnotify_open</span><span class="p">(</span><span class="n">f</span><span class="p">);</span>
			<span class="n">fd_install</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n">f</span><span class="p">);</span>
		<span class="p">}</span>
	<span class="p">}</span>
	<span class="n">putname</span><span class="p">(</span><span class="n">tmp</span><span class="p">);</span>
	<span class="k">return</span> <span class="n">fd</span><span class="p">;</span>
<span class="p">}</span>

<span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="nf">do_filp_open</span><span class="p">(</span><span class="kt">int</span> <span class="n">dfd</span><span class="p">,</span> <span class="k">struct</span> <span class="n">filename</span> <span class="o">*</span><span class="n">pathname</span><span class="p">,</span>
		<span class="k">const</span> <span class="k">struct</span> <span class="n">open_flags</span> <span class="o">*</span><span class="n">op</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">nameidata</span> <span class="n">nd</span><span class="p">;</span>
	<span class="kt">int</span> <span class="n">flags</span> <span class="o">=</span> <span class="n">op</span><span class="o">-&gt;</span><span class="n">lookup_flags</span><span class="p">;</span>
	<span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="n">filp</span><span class="p">;</span>

	<span class="n">set_nameidata</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nd</span><span class="p">,</span> <span class="n">dfd</span><span class="p">,</span> <span class="n">pathname</span><span class="p">);</span>
	<span class="n">filp</span> <span class="o">=</span> <span class="n">path_openat</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nd</span><span class="p">,</span> <span class="n">op</span><span class="p">,</span> <span class="n">flags</span> <span class="o">|</span> <span class="n">LOOKUP_RCU</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">unlikely</span><span class="p">(</span><span class="n">filp</span> <span class="o">==</span> <span class="n">ERR_PTR</span><span class="p">(</span><span class="o">-</span><span class="n">ECHILD</span><span class="p">)))</span>
		<span class="n">filp</span> <span class="o">=</span> <span class="n">path_openat</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nd</span><span class="p">,</span> <span class="n">op</span><span class="p">,</span> <span class="n">flags</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">unlikely</span><span class="p">(</span><span class="n">filp</span> <span class="o">==</span> <span class="n">ERR_PTR</span><span class="p">(</span><span class="o">-</span><span class="n">ESTALE</span><span class="p">)))</span>
		<span class="n">filp</span> <span class="o">=</span> <span class="n">path_openat</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nd</span><span class="p">,</span> <span class="n">op</span><span class="p">,</span> <span class="n">flags</span> <span class="o">|</span> <span class="n">LOOKUP_REVAL</span><span class="p">);</span>
	<span class="n">restore_nameidata</span><span class="p">();</span>
	<span class="k">return</span> <span class="n">filp</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>So let’s trace one of those instead:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bpftrace <span class="nt">-e</span> <span class="s1">'
BEGIN { @match = 0; }
kprobe:path_openat {
    $path = str(((struct filename *)((struct nameidata *)arg0)-&gt;name)-&gt;name);
    if (strncmp("/tmp/", $path, 5) == 0) {
        printf("  path=%s ", $path);
        @[kstack()] = count();
        @match = 1;
    }
}
kretprobe:path_openat /@match == 1/ {
    printf("retval=%d\n", retval);
    @match = 0;
}
'</span> <span class="nt">-c</span> <span class="s1">'/usr/bin/evince-thumbnailer foo.pdf /tmp/o.png'</span>
<span class="c"># Attaching 3 probes...</span>
<span class="c">#   path=/tmp/o.png retval=-13</span>
<span class="c"># @[</span>
<span class="c">#     path_openat+1</span>
<span class="c">#     do_filp_open+136</span>
<span class="c">#     do_sys_openat2+155</span>
<span class="c">#     __x64_sys_openat+84</span>
<span class="c">#     do_syscall_64+51</span>
<span class="c">#     entry_SYSCALL_64_after_hwframe+97</span>
<span class="c"># ]: 1</span>
<span class="c"># @match: 0</span>
</code></pre></div></div>

<p>Now we know <code class="language-plaintext highlighter-rouge">path_openat</code> is reached, which has all these calls:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="nf">path_openat</span><span class="p">(</span><span class="k">struct</span> <span class="n">nameidata</span> <span class="o">*</span><span class="n">nd</span><span class="p">,</span>
			<span class="k">const</span> <span class="k">struct</span> <span class="n">open_flags</span> <span class="o">*</span><span class="n">op</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="n">flags</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="n">file</span><span class="p">;</span>
	<span class="kt">int</span> <span class="n">error</span><span class="p">;</span>

	<span class="n">file</span> <span class="o">=</span> <span class="n">alloc_empty_file</span><span class="p">(</span><span class="n">op</span><span class="o">-&gt;</span><span class="n">open_flag</span><span class="p">,</span> <span class="n">current_cred</span><span class="p">());</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">IS_ERR</span><span class="p">(</span><span class="n">file</span><span class="p">))</span>
		<span class="k">return</span> <span class="n">file</span><span class="p">;</span>

	<span class="k">if</span> <span class="p">(</span><span class="n">unlikely</span><span class="p">(</span><span class="n">file</span><span class="o">-&gt;</span><span class="n">f_flags</span> <span class="o">&amp;</span> <span class="n">__O_TMPFILE</span><span class="p">))</span> <span class="p">{</span>
		<span class="n">error</span> <span class="o">=</span> <span class="n">do_tmpfile</span><span class="p">(</span><span class="n">nd</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span> <span class="n">op</span><span class="p">,</span> <span class="n">file</span><span class="p">);</span>
	<span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">unlikely</span><span class="p">(</span><span class="n">file</span><span class="o">-&gt;</span><span class="n">f_flags</span> <span class="o">&amp;</span> <span class="n">O_PATH</span><span class="p">))</span> <span class="p">{</span>
		<span class="n">error</span> <span class="o">=</span> <span class="n">do_o_path</span><span class="p">(</span><span class="n">nd</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span> <span class="n">file</span><span class="p">);</span>
	<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
		<span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">s</span> <span class="o">=</span> <span class="n">path_init</span><span class="p">(</span><span class="n">nd</span><span class="p">,</span> <span class="n">flags</span><span class="p">);</span>
		<span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="n">error</span> <span class="o">=</span> <span class="n">link_path_walk</span><span class="p">(</span><span class="n">s</span><span class="p">,</span> <span class="n">nd</span><span class="p">))</span> <span class="o">&amp;&amp;</span>
		       <span class="p">(</span><span class="n">s</span> <span class="o">=</span> <span class="n">open_last_lookups</span><span class="p">(</span><span class="n">nd</span><span class="p">,</span> <span class="n">file</span><span class="p">,</span> <span class="n">op</span><span class="p">))</span> <span class="o">!=</span> <span class="nb">NULL</span><span class="p">)</span>
			<span class="p">;</span>
		<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">error</span><span class="p">)</span>
			<span class="n">error</span> <span class="o">=</span> <span class="n">do_open</span><span class="p">(</span><span class="n">nd</span><span class="p">,</span> <span class="n">file</span><span class="p">,</span> <span class="n">op</span><span class="p">);</span>
		<span class="n">terminate_walk</span><span class="p">(</span><span class="n">nd</span><span class="p">);</span>
	<span class="p">}</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">likely</span><span class="p">(</span><span class="o">!</span><span class="n">error</span><span class="p">))</span> <span class="p">{</span>
		<span class="k">if</span> <span class="p">(</span><span class="n">likely</span><span class="p">(</span><span class="n">file</span><span class="o">-&gt;</span><span class="n">f_mode</span> <span class="o">&amp;</span> <span class="n">FMODE_OPENED</span><span class="p">))</span>
			<span class="k">return</span> <span class="n">file</span><span class="p">;</span>
		<span class="n">WARN_ON</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
		<span class="n">error</span> <span class="o">=</span> <span class="o">-</span><span class="n">EINVAL</span><span class="p">;</span>
	<span class="p">}</span>
	<span class="n">fput</span><span class="p">(</span><span class="n">file</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">error</span> <span class="o">==</span> <span class="o">-</span><span class="n">EOPENSTALE</span><span class="p">)</span> <span class="p">{</span>
		<span class="k">if</span> <span class="p">(</span><span class="n">flags</span> <span class="o">&amp;</span> <span class="n">LOOKUP_RCU</span><span class="p">)</span>
			<span class="n">error</span> <span class="o">=</span> <span class="o">-</span><span class="n">ECHILD</span><span class="p">;</span>
		<span class="k">else</span>
			<span class="n">error</span> <span class="o">=</span> <span class="o">-</span><span class="n">ESTALE</span><span class="p">;</span>
	<span class="p">}</span>
	<span class="k">return</span> <span class="n">ERR_PTR</span><span class="p">(</span><span class="n">error</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It still isn’t very clear where <code class="language-plaintext highlighter-rouge">-EACCES</code> is being returned… After adapting the above snippet to trace <code class="language-plaintext highlighter-rouge">link_path_walk</code> and then <code class="language-plaintext highlighter-rouge">terminate_walk</code>, we confirm both calls are reached, so we know we enter that else-block, but <code class="language-plaintext highlighter-rouge">link_path_walk</code> doesn’t set the expected error, so it must be set by <code class="language-plaintext highlighter-rouge">do_open</code>.</p>

<hr />

<p>Now we reach the fun part: there are no probes for <code class="language-plaintext highlighter-rouge">do_open</code>. Nothing about it looked special enough to explain the missing probes, and the same happened with a few other functions. Perhaps it’s inlined? Let’s confirm with a disassembly.</p>

<p>This was still done with a live image, for which the Linux binary was at <code class="language-plaintext highlighter-rouge">/boot/vmlinuz-$(uname -r)</code>, which is stripped of symbols. There are some <a href="https://github.com/therealdreg/linux_kernel_debug_disassemble_ida_vmware">scripts to retrieve symbols and add them to a disassembly</a>, but for now we just want to extract vmlinuz and get the symbols as root with <code class="language-plaintext highlighter-rouge">cat /proc/kallsyms</code>. Since the live image is running with kernel address space layout randomization (KASLR), we need to translate the address of <code class="language-plaintext highlighter-rouge">path_openat</code> with these steps:</p>

<ol>
  <li>Add <code class="language-plaintext highlighter-rouge">kaddr("path_openat")</code> to our bpftrace snippet, which outputs this address:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 9EEE5390
</code></pre></div>    </div>
  </li>
  <li>Find out the virtual base address of the live image by grepping for <code class="language-plaintext highlighter-rouge">_text</code> (a label set at the start of the <code class="language-plaintext highlighter-rouge">.text</code> section) in the dumped <code class="language-plaintext highlighter-rouge">/proc/kallsyms</code>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ffffffff9ec00000 T _text
</code></pre></div>    </div>
  </li>
  <li>Find out the base address of the binary with <code class="language-plaintext highlighter-rouge">objdump -tC vmlinux | grep _text</code>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ffffffff81000000 g .text 0000000000000000 _text
</code></pre></div>    </div>
  </li>
  <li>Translate the live address of <code class="language-plaintext highlighter-rouge">path_openat</code>:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 0xffffffff9eee5390 - 0xffffffff9ec00000 + 0xffffffff81000000 = 0xffffffff812e5390
</code></pre></div>    </div>
  </li>
</ol>
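<p>The arithmetic in the steps above can be sketched in Python. This is only an illustration: the function names are made up, and it assumes the truncated <code class="language-plaintext highlighter-rouge">kaddr</code> output lies in the upper kernel half (upper 32 bits all set), as the translation in step 4 already does.</p>

```python
KERNEL_HIGH_BITS = 0xFFFFFFFF00000000  # assumption: address lies in the top 4 GiB


def widen(truncated_addr):
    """Restore the upper 32 bits dropped from the printed address."""
    return KERNEL_HIGH_BITS | truncated_addr


def rebase(live_addr, live_text, file_text):
    """Map a KASLR-randomized address to its address in the vmlinux file."""
    return live_addr - live_text + file_text


live_path_openat = widen(0x9EEE5390)
static_path_openat = rebase(
    live_path_openat,
    live_text=0xFFFFFFFF9EC00000,  # _text from /proc/kallsyms
    file_text=0xFFFFFFFF81000000,  # _text from objdump
)
print(hex(static_path_openat))  # 0xffffffff812e5390
```

<p>The same <code class="language-plaintext highlighter-rouge">rebase</code> call works for any other symbol taken from the live image, as long as both <code class="language-plaintext highlighter-rouge">_text</code> values come from the same boot.</p>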

<p>Which gives us this decompilation:</p>

<p><img src="https://nevesnunes.github.io/blog/assets/img/kernel-dis1.png" alt="" /></p>

<p>Even without adding symbols, we already see a much larger number of calls in this function than the ones in the source, and we can look up some of the constants being compared to confirm that <code class="language-plaintext highlighter-rouge">do_open</code> is inlined.</p>

<div class="c-aside">
  <p>EDIT: Another less stubborn approach would be to <a href="https://blog.viraptor.info/post/a-process-murder-mystery-a-debugging-story">disable ASLR and use debugging symbols</a> with gdb’s <code class="language-plaintext highlighter-rouge">disassemble/m</code>.</p>
</div>

<p>So, how do we trace an inlined function? We could add probes to its callees, but some of them were also inlined. Maybe bisect using offsets like <code class="language-plaintext highlighter-rouge">kprobe:path_openat+7</code>, but no luck there:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Can't check if kprobe is in proper place (compiled without (k|u)probe offset support):
    /usr/lib/debug/boot/vmlinux-5.10.0-18-amd64:path_openat+7
</code></pre></div></div>

<p>Alright, let’s compile a kernel then.</p>

<p>At this point, I decided to switch approaches and instead prepare it for remote debugging, since we would have more context in a gdb session anyway, and hopefully bump into fewer caveats…</p>

<h2 id="analysis-attempt-2-debugging-the-kernel">Analysis attempt 2: Debugging the kernel</h2>

<p>There are several ways to set this up. Since I already had a VirtualBox VM with Debian installed, I went with <a href="https://www.adityabasu.me/blog/2020/03/kgdboc-setup/">kgdb over a serial port</a>, which involved the following steps:</p>

<ol>
  <li>On VirtualBox, configure serial port via host pipe “/tmp/vboxS0”;</li>
  <li>On the guest, bootstrap from currently loaded modules:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> make localmodconfig
</code></pre></div>    </div>
  </li>
  <li>Prepare the generated “.config” not only for kgdb, but also to generate “vmlinux-gdb.py”, which allows gdb to correctly resolve kernel addresses and symbols:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> CONFIG_FRAME_POINTER=y
 CONFIG_GDB_SCRIPTS=y
 CONFIG_KGDB=y
 CONFIG_KGDB_SERIAL_CONSOLE=y
 CONFIG_STRICT_KERNEL_RWX=n
</code></pre></div>    </div>
  </li>
  <li>Build kernel:
    <div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code> make bzImage
 make modules
 make modules_install
 make <span class="nb">install</span>
</code></pre></div>    </div>
  </li>
  <li>If you didn’t build in a shared folder, then copy over build artifacts to host;</li>
  <li>Reboot guest, then edit GRUB entry to add boot parameters (on the line with “linux /boot/vmlinuz-…”):
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> kgdboc=ttyS0,115200 nokaslr
</code></pre></div>    </div>
  </li>
</ol>
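<p>Before kicking off a long build, it can help to double-check that the options from step 3 actually ended up in “.config”. Here is a minimal sketch of such a check; the helper names are hypothetical, and it only relies on the usual “.config” convention where disabled options appear as <code class="language-plaintext highlighter-rouge"># CONFIG_X is not set</code>.</p>

```python
# Options from step 3 and the values we expect for them.
REQUIRED = {
    "CONFIG_FRAME_POINTER": "y",
    "CONFIG_GDB_SCRIPTS": "y",
    "CONFIG_KGDB": "y",
    "CONFIG_KGDB_SERIAL_CONSOLE": "y",
    "CONFIG_STRICT_KERNEL_RWX": "n",
}

NOT_SET = " is not set"


def parse_config(text):
    """Collect option values from the text of a kernel .config file."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET):
            # Disabled options are written as comments.
            options[line[2:-len(NOT_SET)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            key, _, value = line.partition("=")
            options[key] = value
    return options


def unmet_options(text):
    """Return the required options whose value differs from what we need."""
    options = parse_config(text)
    # An option that is absent from the file is treated as disabled.
    return {k: v for k, v in REQUIRED.items() if options.get(k, "n") != v}


sample = "CONFIG_KGDB=y\n# CONFIG_GDB_SCRIPTS is not set\n"
print(unmet_options(sample))
```

<p>On a real tree, the text would come from something like <code class="language-plaintext highlighter-rouge">open(".config").read()</code>, and an empty result means the build is ready for kgdb.</p>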

<p>After starting the graphical session, we make the guest wait for a remote gdb connection:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo </span>g <span class="o">&gt;</span> /proc/sysrq-trigger
</code></pre></div></div>

<p>Now we connect from the host with our gdb client:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gdb <span class="se">\</span>
    <span class="nt">-ex</span> <span class="s1">'set serial baud 115200'</span> <span class="se">\</span>
    <span class="nt">-ex</span> <span class="s1">'target remote /tmp/vboxS0'</span> <span class="se">\</span>
    <span class="nt">-x</span> /usr/src/linux-source-5.10/vmlinux-gdb.py <span class="se">\</span>
    /usr/src/linux-source-5.10/vmlinux
</code></pre></div></div>

<p>For the gdb frontend, I use <a href="https://github.com/cyrus-and/gdb-dashboard">gdb-dashboard</a>, since it makes fewer assumptions about the gdb server and is therefore less likely to cause issues than fancier frontends like <a href="https://github.com/pwndbg/pwndbg">pwndbg</a>. It shows the following panes:</p>

<pre><code class="language-gdb">dashboard -layout assembly !breakpoints !expressions !history memory registers source stack !threads variables
</code></pre>

<p>In our session, we set a conditional breakpoint inside the function of interest, so that it only stops execution when the filename to open is our output file:</p>

<pre><code class="language-gdb">break *path_openat
condition 1 $_streq((char *)nd-&gt;name-&gt;name, "/tmp/o.png")
</code></pre>

<p>After running the evince-thumbnailer command on the guest, the breakpoint is hit, and we can step through the function. However, there are also caveats here…</p>

<p>Sometimes, when stepping over a call instruction, we get a context switch instead:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0xffffffff812d4366 path_openat+70 call   0xffffffff812c5c50 &lt;alloc_empty_file&gt;

&gt;&gt;&gt; ni
Warning:
Cannot insert breakpoint 0.
Cannot access memory at address 0x0
</code></pre></div></div>

<p>After another <code class="language-plaintext highlighter-rouge">ni</code>, we are now… back to the function start in another thread?</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Thread 145 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 896]
0xffffffff812d4327 in path_openat (nd=nd@entry=0xffffc900023e3dd0, op=op@entry=0xffffc900023e3ee4, flags=flags@entry=65) at fs/namei.c:3346
3346    fs/namei.c: No such file or directory.
   0xffffffff812d4325 &lt;path_openat+5&gt;:  41 57   push   %r15
=&gt; 0xffffffff812d4327 &lt;path_openat+7&gt;:  41 56   push   %r14
</code></pre></div></div>

<p>Yet the filename doesn’t match our condition:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; p nd-&gt;name-&gt;name
$5 = 0xffff88803504c020 "/proc/1095/cmdline"
</code></pre></div></div>

<p>Whatever, let’s switch back to our thread:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; thread 162
[Switching to thread 162 (Thread 1095)]
#0  0x0000000000000000 in fixed_percpu_data ()
=&gt; 0x0000000000000000 &lt;fixed_percpu_data+0&gt;:    Cannot access memory at address 0x0

&gt;&gt;&gt; ni
Thread 145 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 896]
</code></pre></div></div>

<p>Yet again in the wrong thread… a workaround is to just set another conditional breakpoint somewhere after the place the context switch occurred:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; b *(path_openat+75)
Breakpoint 2 at 0xffffffff812d436b: file fs/namei.c, line 3350.
&gt;&gt;&gt; condition 2 $_streq((char *)nd-&gt;name-&gt;name, "/tmp/o.png")
&gt;&gt;&gt; c
</code></pre></div></div>

<div class="c-aside">
  <p>EDIT: These context switches are likely due to <a href="https://twitter.com/h0mbre_/status/1591253731898961921">interrupts</a>, which can be avoided with <code class="language-plaintext highlighter-rouge">handle SIGINT nostop pass</code>.</p>
</div>

<p>Definitely do not <code class="language-plaintext highlighter-rouge">set scheduler-locking step</code>, unless you like core dumps…</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>/build/gdb-Nav6Es/gdb-10.1/gdb/infrun.c:7249: internal-error: 
    int switch_back_to_stepped_thread(execution_control_state*): 
    Assertion `!schedlock_applies (tp)' failed.
...
Aborted (core dumped)
</code></pre></div></div>

<p>Even a simple sanity check isn’t devoid of issues:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; lx-symbols
loading vmlinux
/build/gdb-Nav6Es/gdb-10.1/gdb/dwarf2/frame.c:1085: internal-error: Unknown CFA rule.
...
Aborted (core dumped)
</code></pre></div></div>

<p>Luckily, this doesn’t affect the disassembly itself, so we can tell which functions are being called with <code class="language-plaintext highlighter-rouge">disass /r path_openat</code>. For example, here’s a snippet surrounding the else-block where we saw <code class="language-plaintext highlighter-rouge">do_open</code> inlined:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0xffffffff812d4492 &lt;+370&gt;:   a9 00 00 18 00  test   $0x180000,%eax
0xffffffff812d4497 &lt;+375&gt;:   0f 84 07 07 00 00       je     0xffffffff812d4ba4 &lt;path_openat+2180&gt;
0xffffffff812d449d &lt;+381&gt;:   a9 00 00 10 00  test   $0x100000,%eax
0xffffffff812d44a2 &lt;+386&gt;:   0f 84 15 07 00 00       je     0xffffffff812d4bbd &lt;path_openat+2205&gt;
0xffffffff812d44a8 &lt;+392&gt;:   40 f6 c5 40     test   $0x40,%bpl
0xffffffff812d44ac &lt;+396&gt;:   0f 85 99 06 00 00       jne    0xffffffff812d4b4b &lt;path_openat+2091&gt;
0xffffffff812d44b2 &lt;+402&gt;:   41 f6 46 38 02  testb  $0x2,0x38(%r14)
0xffffffff812d44b7 &lt;+407&gt;:   0f 84 3c 08 00 00       je     0xffffffff812d4cf9 &lt;path_openat+2521&gt;
0xffffffff812d44bd &lt;+413&gt;:   49 8b 46 08     mov    0x8(%r14),%rax
0xffffffff812d44c1 &lt;+417&gt;:   8b 00   mov    (%rax),%eax
0xffffffff812d44c3 &lt;+419&gt;:   25 00 00 70 00  and    $0x700000,%eax
0xffffffff812d44c8 &lt;+424&gt;:   3d 00 00 20 00  cmp    $0x200000,%eax
0xffffffff812d44cd &lt;+429&gt;:   0f 85 0c 0e 00 00       jne    0xffffffff812d52df &lt;path_openat+4031&gt;
0xffffffff812d44d3 &lt;+435&gt;:   41 f6 47 46 10  testb  $0x10,0x46(%r15)
0xffffffff812d44d8 &lt;+440&gt;:   0f 84 b3 0a 00 00       je     0xffffffff812d4f91 &lt;path_openat+3185&gt;
0xffffffff812d44de &lt;+446&gt;:   81 e5 ff fd ff ff       and    $0xfffffdff,%ebp
0xffffffff812d44e4 &lt;+452&gt;:   31 db   xor    %ebx,%ebx
0xffffffff812d44e6 &lt;+454&gt;:   44 89 ee        mov    %r13d,%esi
0xffffffff812d44e9 &lt;+457&gt;:   89 ea   mov    %ebp,%edx
0xffffffff812d44eb &lt;+459&gt;:   4c 89 f7        mov    %r14,%rdi
0xffffffff812d44ee &lt;+462&gt;:   e8 8d e1 ff ff  call   0xffffffff812d2680 &lt;may_open&gt;
</code></pre></div></div>

<p>Which maps to this snippet in <code class="language-plaintext highlighter-rouge">do_open</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="n">file</span><span class="o">-&gt;</span><span class="n">f_mode</span> <span class="o">&amp;</span> <span class="p">(</span><span class="n">FMODE_OPENED</span> <span class="o">|</span> <span class="n">FMODE_CREATED</span><span class="p">)))</span> <span class="p">{</span> <span class="c1">// test $0x180000,%eax</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="n">file</span><span class="o">-&gt;</span><span class="n">f_mode</span> <span class="o">&amp;</span> <span class="n">FMODE_CREATED</span><span class="p">))</span> <span class="c1">// test $0x100000,%eax</span>
    <span class="c1">// ...</span>
<span class="k">if</span> <span class="p">(</span><span class="n">open_flag</span> <span class="o">&amp;</span> <span class="n">O_CREAT</span><span class="p">)</span> <span class="p">{</span> <span class="c1">// test $0x40,%bpl</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">((</span><span class="n">nd</span><span class="o">-&gt;</span><span class="n">flags</span> <span class="o">&amp;</span> <span class="n">LOOKUP_DIRECTORY</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">d_can_lookup</span><span class="p">(</span><span class="n">nd</span><span class="o">-&gt;</span><span class="n">path</span><span class="p">.</span><span class="n">dentry</span><span class="p">))</span> <span class="c1">// testb $0x2,0x38(%r14) ; left side of expression</span>
    <span class="c1">// ...</span>

<span class="c1">// ...</span>
<span class="n">error</span> <span class="o">=</span> <span class="n">may_open</span><span class="p">(</span><span class="o">&amp;</span><span class="n">nd</span><span class="o">-&gt;</span><span class="n">path</span><span class="p">,</span> <span class="n">acc_mode</span><span class="p">,</span> <span class="n">open_flag</span><span class="p">);</span>
</code></pre></div></div>

<p>As we step through the function, inlined parts are marked in the disassembly, such as this part after calling <code class="language-plaintext highlighter-rouge">link_path_walk</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0xffffffff812d43d1 path_openat+177 call   0xffffffff812d2db0 &lt;link_path_walk&gt;
0xffffffff812d43d6 path_openat+182 mov    %eax,%r13d
0xffffffff812d43d9 path_openat+185 test   %r13d,%r13d
0xffffffff812d43dc open_last_lookups-6  jne    0xffffffff812d4723 &lt;path_openat+1027&gt;
0xffffffff812d43e2 open_last_lookups+0  mov    0x48(%r15),%edx
0xffffffff812d43e6 open_last_lookups+4  mov    0x38(%r15),%eax
</code></pre></div></div>

<p>Recall that we are looking for whatever sets <code class="language-plaintext highlighter-rouge">error = -13</code> (which has unsigned value <code class="language-plaintext highlighter-rouge">0xfffffff3</code>). Eventually we step up to that point, as we can see from the return value stored in <code class="language-plaintext highlighter-rouge">rax</code> after calling <code class="language-plaintext highlighter-rouge">security_path_mknod</code>:</p>

<p><img src="https://nevesnunes.github.io/blog/assets/img/kernel-dis2.png" alt="" /></p>
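<p>As a quick sanity check on the value we are hunting for, the conversion between the errno and the 32-bit pattern seen in the register can be sketched as follows. The helper names are made up for illustration.</p>

```python
import errno


def to_u32(value):
    """View a signed 32-bit integer as the unsigned pattern held in eax."""
    return value & 0xFFFFFFFF


def to_s32(value):
    """View the low 32 bits of a register as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value & 0x80000000 else value


print(hex(to_u32(-errno.EACCES)))  # 0xfffffff3

code = -to_s32(0xFFFFFFF3)
print(code, errno.errorcode[code])  # 13 EACCES
```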

<p>Having to step until we find the expected return value isn’t very efficient. In theory, we could set a conditional watchpoint in gdb like <code class="language-plaintext highlighter-rouge">watch $rax == 0xfffffff3</code>. In practice, this just hangs both the gdb client and the guest VM, to the point where manually stepping is faster. It might be related to the implementation itself, since we do hit more common values, such as <code class="language-plaintext highlighter-rouge">watch $rax == 0</code>.</p>

<hr />

<p>After running evince-thumbnailer again, we now step inside <code class="language-plaintext highlighter-rouge">security_path_mknod</code>, where we see an indirect call:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0xffffffff813cd94b security_path_mknod+27 mov    0xdf6ff6(%rip),%rbx        # 0xffffffff821c4948 &lt;security_hook_heads+328&gt;
0xffffffff813cd96e security_path_mknod+62 call   *0x18(%rbx)
=&gt;
0xffffffff8141a220 apparmor_path_mknod+0 nopl   0x0(%rax,%rax,1)
</code></pre></div></div>

<p>At this point, we have some funny names to search for, leading us to conclude that we are dealing with mandatory access control. But let’s go a bit further, to see where the difference appears between commands.</p>

<p>Here’s the source for some hit functions in “security/apparmor/lsm.c”:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="nf">apparmor_path_mknod</span><span class="p">(</span><span class="k">const</span> <span class="k">struct</span> <span class="n">path</span> <span class="o">*</span><span class="n">dir</span><span class="p">,</span> <span class="k">struct</span> <span class="n">dentry</span> <span class="o">*</span><span class="n">dentry</span><span class="p">,</span>
			       <span class="n">umode_t</span> <span class="n">mode</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">dev</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">return</span> <span class="n">common_perm_create</span><span class="p">(</span><span class="n">OP_MKNOD</span><span class="p">,</span> <span class="n">dir</span><span class="p">,</span> <span class="n">dentry</span><span class="p">,</span> <span class="n">AA_MAY_CREATE</span><span class="p">,</span> <span class="n">mode</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">int</span> <span class="nf">common_perm_create</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">op</span><span class="p">,</span> <span class="k">const</span> <span class="k">struct</span> <span class="n">path</span> <span class="o">*</span><span class="n">dir</span><span class="p">,</span>
			      <span class="k">struct</span> <span class="n">dentry</span> <span class="o">*</span><span class="n">dentry</span><span class="p">,</span> <span class="n">u32</span> <span class="n">mask</span><span class="p">,</span> <span class="n">umode_t</span> <span class="n">mode</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">path_cond</span> <span class="n">cond</span> <span class="o">=</span> <span class="p">{</span> <span class="n">current_fsuid</span><span class="p">(),</span> <span class="n">mode</span> <span class="p">};</span>

	<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">path_mediated_fs</span><span class="p">(</span><span class="n">dir</span><span class="o">-&gt;</span><span class="n">dentry</span><span class="p">))</span>
		<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>

	<span class="k">return</span> <span class="n">common_perm_dir_dentry</span><span class="p">(</span><span class="n">op</span><span class="p">,</span> <span class="n">dir</span><span class="p">,</span> <span class="n">dentry</span><span class="p">,</span> <span class="n">mask</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">cond</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">int</span> <span class="nf">common_perm_dir_dentry</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">op</span><span class="p">,</span> <span class="k">const</span> <span class="k">struct</span> <span class="n">path</span> <span class="o">*</span><span class="n">dir</span><span class="p">,</span>
				  <span class="k">struct</span> <span class="n">dentry</span> <span class="o">*</span><span class="n">dentry</span><span class="p">,</span> <span class="n">u32</span> <span class="n">mask</span><span class="p">,</span>
				  <span class="k">struct</span> <span class="n">path_cond</span> <span class="o">*</span><span class="n">cond</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">path</span> <span class="n">path</span> <span class="o">=</span> <span class="p">{</span> <span class="p">.</span><span class="n">mnt</span> <span class="o">=</span> <span class="n">dir</span><span class="o">-&gt;</span><span class="n">mnt</span><span class="p">,</span> <span class="p">.</span><span class="n">dentry</span> <span class="o">=</span> <span class="n">dentry</span> <span class="p">};</span>

	<span class="k">return</span> <span class="n">common_perm</span><span class="p">(</span><span class="n">op</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">path</span><span class="p">,</span> <span class="n">mask</span><span class="p">,</span> <span class="n">cond</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">int</span> <span class="nf">common_perm</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">op</span><span class="p">,</span> <span class="k">const</span> <span class="k">struct</span> <span class="n">path</span> <span class="o">*</span><span class="n">path</span><span class="p">,</span> <span class="n">u32</span> <span class="n">mask</span><span class="p">,</span>
		       <span class="k">struct</span> <span class="n">path_cond</span> <span class="o">*</span><span class="n">cond</span><span class="p">)</span>
<span class="p">{</span>
	<span class="k">struct</span> <span class="n">aa_label</span> <span class="o">*</span><span class="n">label</span><span class="p">;</span>
	<span class="kt">int</span> <span class="n">error</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

	<span class="n">label</span> <span class="o">=</span> <span class="n">__begin_current_label_crit_section</span><span class="p">();</span>
	<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">unconfined</span><span class="p">(</span><span class="n">label</span><span class="p">))</span>
		<span class="n">error</span> <span class="o">=</span> <span class="n">aa_path_perm</span><span class="p">(</span><span class="n">op</span><span class="p">,</span> <span class="n">label</span><span class="p">,</span> <span class="n">path</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">mask</span><span class="p">,</span> <span class="n">cond</span><span class="p">);</span>
	<span class="n">__end_current_label_crit_section</span><span class="p">(</span><span class="n">label</span><span class="p">);</span>

	<span class="k">return</span> <span class="n">error</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The interesting part is in <code class="language-plaintext highlighter-rouge">common_perm</code>, where we see that if there’s an “unconfined” label, no further validations are done, otherwise the label is validated by <code class="language-plaintext highlighter-rouge">aa_path_perm</code>.</p>

<p>In our case, we can step up to that call:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0xffffffff8141a341 common_perm+180 call   0xffffffff8141d5c0 &lt;aa_path_perm&gt;
</code></pre></div></div>

<p>There are some details about the operation, such as the requested mask and the path to validate:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; p op
$23 = 0xffffffff8212ff06 "mknod"
&gt;&gt;&gt; p *path-&gt;dentry
$24 = {
  d_flags = 524352,
  d_parent = 0xffff8880345a59c0,
  d_name = {
    name = 0xffff8880085270f8 "o.png"
  },
  d_inode = 0x0 &lt;fixed_percpu_data&gt;,
  d_iname = "o.png", '\000' &lt;repeats 26 times&gt;,
&gt;&gt;&gt; p mask
$25 = 16
&gt;&gt;&gt; p *cond
$27 = {
  uid = {
    val = 1000
  },
  mode = 33206
}
</code></pre></div></div>
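<p>The <code class="language-plaintext highlighter-rouge">mode</code> value is easier to read in octal. A small sketch with Python’s <code class="language-plaintext highlighter-rouge">stat</code> module shows that it is the <code class="language-plaintext highlighter-rouge">0666</code> from our <code class="language-plaintext highlighter-rouge">openat</code> call, combined with the regular file type bit:</p>

```python
import stat

mode = 33206  # cond->mode from the gdb session

print(oct(mode))                # 0o100666
print(stat.S_ISREG(mode))       # True
print(oct(stat.S_IMODE(mode)))  # 0o666
print(stat.filemode(mode))      # -rw-rw-rw-
```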

<p>In particular, the apparmor profile hierarchical name (hname) is specific to evince-thumbnailer:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; p *label
$21 = {
  count = {
    refcount = {
      refs = {
        counter = 114
      }
    }
  },
  node = {
    __rb_parent_color = 18446612682279220920,
    rb_right = 0x0 &lt;fixed_percpu_data&gt;,
    rb_left = 0x0 &lt;fixed_percpu_data&gt;
  },
  rcu = {
    next = 0x0 &lt;fixed_percpu_data&gt;,
    func = 0x0 &lt;fixed_percpu_data&gt;
  },
  proxy = 0xffff88800e47ced0,
  hname = 0xffff888009fe8304 "/usr/bin/evince-thumbnailer",
  flags = 768,
  secid = 15,
  size = 1,
  vec = 0xffff88800c77eb00
}
</code></pre></div></div>

<p>If we compare with the python script, we see it runs under the “unconfined” profile, so no restrictions are applied and the syscall succeeds:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; p *label
$32 = {
  count = {
    refcount = {
      refs = {
        counter = 1943
      }
    }
  },
  node = {
    __rb_parent_color = 18446612682126247608,
    rb_right = 0x0 &lt;fixed_percpu_data&gt;,
    rb_left = 0x0 &lt;fixed_percpu_data&gt;
  },
  rcu = {
    next = 0x0 &lt;fixed_percpu_data&gt;,
    func = 0x0 &lt;fixed_percpu_data&gt;
  },
  proxy = 0xffff8880034510b0,
  hname = 0xffff8880034510a4 "unconfined",
  flags = 666,
  secid = 2,
  size = 1,
  vec = 0xffff88800359c700
}
</code></pre></div></div>

<p>Finally, we can see the actual place where the error is set:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">__aa_path_perm</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">op</span><span class="p">,</span> <span class="k">struct</span> <span class="n">aa_profile</span> <span class="o">*</span><span class="n">profile</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">name</span><span class="p">,</span>
		   <span class="n">u32</span> <span class="n">request</span><span class="p">,</span> <span class="k">struct</span> <span class="n">path_cond</span> <span class="o">*</span><span class="n">cond</span><span class="p">,</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">,</span>
		   <span class="k">struct</span> <span class="n">aa_perms</span> <span class="o">*</span><span class="n">perms</span><span class="p">)</span>
<span class="p">{</span>
	<span class="kt">int</span> <span class="n">e</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>

	<span class="k">if</span> <span class="p">(</span><span class="n">profile_unconfined</span><span class="p">(</span><span class="n">profile</span><span class="p">))</span>
		<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
	<span class="n">aa_str_perms</span><span class="p">(</span><span class="n">profile</span><span class="o">-&gt;</span><span class="n">file</span><span class="p">.</span><span class="n">dfa</span><span class="p">,</span> <span class="n">profile</span><span class="o">-&gt;</span><span class="n">file</span><span class="p">.</span><span class="n">start</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="n">cond</span><span class="p">,</span> <span class="n">perms</span><span class="p">);</span>
	<span class="k">if</span> <span class="p">(</span><span class="n">request</span> <span class="o">&amp;</span> <span class="o">~</span><span class="n">perms</span><span class="o">-&gt;</span><span class="n">allow</span><span class="p">)</span>
		<span class="n">e</span> <span class="o">=</span> <span class="o">-</span><span class="n">EACCES</span><span class="p">;</span>
	<span class="k">return</span> <span class="n">aa_audit_file</span><span class="p">(</span><span class="n">profile</span><span class="p">,</span> <span class="n">perms</span><span class="p">,</span> <span class="n">op</span><span class="p">,</span> <span class="n">request</span><span class="p">,</span> <span class="n">name</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span>
			     <span class="n">cond</span><span class="o">-&gt;</span><span class="n">uid</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">e</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Seems like the rule matching is done with some deterministic finite automaton, but we can leave those internals for another time. The relevant part is in <code class="language-plaintext highlighter-rouge">aa_str_perms</code>: callee <code class="language-plaintext highlighter-rouge">aa_dfa_match</code> computes the dfa state for the given filename, then <code class="language-plaintext highlighter-rouge">aa_compute_fperms</code> looks up the given state in the accept table of the dfa, and returns a permission set of <code class="language-plaintext highlighter-rouge">0x244 == AA_MAY_GETATTR | AA_MAY_OPEN | AA_MAY_READ</code>, which lacks the <code class="language-plaintext highlighter-rouge">AA_MAY_CREATE</code> needed for <code class="language-plaintext highlighter-rouge">mknod</code>.</p>
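<p>To make the mask comparison concrete, here is a rough Python sketch of the check in <code class="language-plaintext highlighter-rouge">__aa_path_perm</code>. The bit values are the ones I believe are defined in “security/apparmor/include/perms.h” for this kernel, so treat them as an assumption. They do agree with the two values seen in the session: the requested mask of 16 and the allowed set of <code class="language-plaintext highlighter-rouge">0x244</code>. The function names are hypothetical.</p>

```python
import errno

# Assumed AppArmor permission bits (subset), lowest bit first.
AA_PERMS = {
    "AA_MAY_EXEC": 0x001,
    "AA_MAY_WRITE": 0x002,
    "AA_MAY_READ": 0x004,
    "AA_MAY_APPEND": 0x008,
    "AA_MAY_CREATE": 0x010,
    "AA_MAY_DELETE": 0x020,
    "AA_MAY_OPEN": 0x040,
    "AA_MAY_RENAME": 0x080,
    "AA_MAY_SETATTR": 0x100,
    "AA_MAY_GETATTR": 0x200,
}


def decode(mask):
    """List the names of the permission bits set in a mask."""
    return [name for name, bit in AA_PERMS.items() if mask & bit]


def path_perm(request, allow):
    """Mirror `if (request & ~perms->allow) e = -EACCES;`."""
    return -errno.EACCES if request & ~allow else 0


allow = 0x244  # permission set computed for the thumbnailer profile
request = 16   # `p mask` in the gdb session

print(decode(allow))              # ['AA_MAY_READ', 'AA_MAY_OPEN', 'AA_MAY_GETATTR']
print(decode(request))            # ['AA_MAY_CREATE']
print(path_perm(request, allow))  # -13
```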

<h2 id="updating-the-profile">Updating the profile</h2>

<p>Our mandatory access control is probably being a bit too restrictive in what the thumbnailer should be allowed to write.</p>

<p>We can find previous attempts at generating thumbnails with <code class="language-plaintext highlighter-rouge">grep audit /var/log/kern.log</code>, which were being done under path “$HOME/.cache/thumbnails/”:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Nov 12 18:45:39 mnutop kernel: [594720.105420] audit: type=1400 audit(1668278739.933:2214):
    apparmor="DENIED"
    operation="mknod"
    profile="/usr/bin/evince-thumbnailer"
    name="/home/fn/.cache/thumbnails/normal/7e1c6e23bd26e7cd94b85849dd6d1b19.png"
    pid=1685096
    comm="evince-thumbnai"
    requested_mask="c"
    denied_mask="c"
    fsuid=1000
    ouid=1000
</code></pre></div></div>
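<p>These records are plain <code class="language-plaintext highlighter-rouge">key=value</code> pairs, so they are easy to pick apart when there are many denials to go through. A minimal sketch, assuming each record sits on a single line in the log (it was wrapped above for readability) and using a made-up helper name:</p>

```python
import re

# Either a quoted value or a run of non-space characters.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_audit(record):
    """Split an audit record into a dict of its key=value fields."""
    return {key: value.strip('"') for key, value in FIELD.findall(record)}


record = (
    'audit: type=1400 audit(1668278739.933:2214): apparmor="DENIED" '
    'operation="mknod" profile="/usr/bin/evince-thumbnailer" '
    'name="/home/fn/.cache/thumbnails/normal/7e1c6e23bd26e7cd94b85849dd6d1b19.png" '
    'pid=1685096 comm="evince-thumbnai" requested_mask="c" denied_mask="c" '
    'fsuid=1000 ouid=1000'
)

fields = parse_audit(record)
print(fields["profile"], fields["operation"], fields["denied_mask"])
```

<p>The <code class="language-plaintext highlighter-rouge">denied_mask="c"</code> field is the create permission that the profile was missing.</p>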

<p>Our profile of interest is stored under “/etc/apparmor.d/usr.bin.evince”. Let’s allow it to write under the user’s thumbnails directory, by adding this rule:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>owner @{HOME}/.cache/thumbnails/** rw,
</code></pre></div></div>

<p>After reloading with <code class="language-plaintext highlighter-rouge">apparmor_parser -r /etc/apparmor.d/usr.bin.evince</code>, we now have thumbnails appearing in our file manager.</p>

<h2 id="ta-da">Ta-da</h2>

<p>It might feel silly to go a long way to identify an issue that was evident in audit logs, but for all the times where logs can’t save you, it’s nice to know where to look under the cover.</p>]]></content><author><name></name></author><category term="bugfix" /><category term="kernel" /><category term="tracing" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">CTF Writeup - 0CTF 2022 - vintage - part1+2</title><link href="https://nevesnunes.github.io/blog/2022/09/19/CTF-Writeup-0CTF-2022-vintage-part1+2.html" rel="alternate" type="text/html" title="CTF Writeup - 0CTF 2022 - vintage - part1+2" /><published>2022-09-19T22:13:37+01:00</published><updated>2022-09-19T22:13:37+01:00</updated><id>https://nevesnunes.github.io/blog/2022/09/19/CTF-Writeup---0CTF-2022---vintage---part1+2</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2022/09/19/CTF-Writeup-0CTF-2022-vintage-part1+2.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<h1 id="introduction">Introduction</h1>

<p>For these 2 tasks, we are given a binary targeting a console system that uses an 8-bit processor, although it also supports 16-bit addressing. Despite this lesser known target, we can still apply general approaches to understand its internals.</p>

<h1 id="part-1">Part 1</h1>

<blockquote>
  <p>Back to 1980s</p>
</blockquote>

<p>Download: <a href="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/game.bin">game.bin</a></p>

<p>Let’s look at the first bytes with <code class="language-plaintext highlighter-rouge">xxd -l $((0x20)) game.bin</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00000000: 6720 4743 4520 3230 3232 80fd 0df8 50f0  g GCE 2022....P.
00000010: b856 4543 544f 525f 4741 4d45 8000 ce00  .VECTOR_GAME....
</code></pre></div></div>

<p>Searching for “g GCE” shows us that this is a Vectrex game.</p>

<h2 id="tooling">Tooling</h2>

<p>After getting an <a href="http://vectrexmuseum.com/share/coder/">overview of this system</a>, we can pick some appropriate tools:</p>

<ul>
  <li><a href="https://github.com/NationalSecurityAgency/ghidra/pull/1201">Ghidra supports the CPU</a>, which is a <a href="https://ia902906.us.archive.org/18/items/bitsavers_motorola68_13419254/M6809PM.rev0_May83_text.pdf">Motorola 6809</a>;</li>
  <li>I didn’t find any Vectrex loader, but we can look up a <a href="https://wrongbaud.github.io/posts/writing-a-ghidra-loader/">tutorial</a> and <a href="https://github.com/nevesnunes/ghidra-vectrex-loader">write our own loader</a>;</li>
  <li>MAME emulates this console and features <a href="https://docs.mamedev.org/debugger/index.html">extensive debugging functions</a>, along with save states, allowing us to freely edit memory and easily rollback changes;</li>
</ul>

<p>Our loader doesn’t need to do anything fancy, just lay out the <a href="http://vectrexmuseum.com/share/coder/html/appendixa.htm#Reference">memory map</a>
 to allow us to follow cross-references, and also apply labels for <a href="http://vectrexmuseum.com/share/coder/html/appendixb.htm">I/O ports</a>, so that we can easily tell when e.g. the input controller is being read.</p>

<p>The entry point of the game ROM seems to depend on the length of the strings that precede it. Looking at some <a href="http://vectrexmuseum.com/share/coder/DIS/TAFT/NEW/ART.ASM">example disassembly</a>, these strings are delimited by byte <code class="language-plaintext highlighter-rouge">\x80</code>, and the last string ends with <code class="language-plaintext highlighter-rouge">\x80\x00</code>. This was also the case with our game, so I used that pattern to start disassembling.</p>
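<p>A minimal sketch of that heuristic in Python, using the 0x20 header bytes from the <code class="language-plaintext highlighter-rouge">xxd</code> output above. The function name <code class="language-plaintext highlighter-rouge">guess_entry_point</code> is hypothetical, and the search is deliberately naive: it could misfire if the binary fields between the strings happened to contain the same two bytes.</p>

```python
# First 0x20 bytes of game.bin, copied from the xxd output above.
header = bytes.fromhex(
    "6720 4743 4520 3230 3232 80fd 0df8 50f0"
    "b856 4543 544f 525f 4741 4d45 8000 ce00"
)


def guess_entry_point(rom):
    """Offset of the first byte after the 0x80 0x00 header terminator."""
    end = rom.find(b"\x80\x00")
    if end == -1:
        raise ValueError("no 0x80 0x00 terminator found")
    return end + 2


print(hex(guess_entry_point(header)))
```

<p>For this header the terminator sits right after “VECTOR_GAME”, so disassembly would start at the byte that follows it.</p>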

<h2 id="finding-flag-checks">Finding flag checks</h2>

<p>My first step was to collect an instruction trace, so that we can <a href="https://github.com/0ffffffffh/dragondance/issues/23#issuecomment-826111520">highlight in Ghidra</a> reachable boilerplate, but more importantly, <strong>not</strong> highlight some conditions that may be related to flag checks.</p>

<p>Let’s load the game ROM in MAME, while also launching the built-in debugger:</p>

<p><code class="language-plaintext highlighter-rouge">mame vectrex -cart ./game.bin -debug</code></p>

<p>Let’s start tracing with <code class="language-plaintext highlighter-rouge">trace mame.tr,,noloop</code>. We get a typical password prompt, 8 characters selected with arrow buttons, then submitted by pressing “stick 1 button 1” (mapped to LeftCtrl):</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/pass.png" alt="" />
</div>

<p>I stopped the trace after submitting and getting a “NO” displayed. We can remove duplicate entries with <code class="language-plaintext highlighter-rouge">&lt;mame.tr sed 's/:.*//g' | sort -u &gt; mame.sort.tr</code> then load the <a href="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/mame.sort.tr">resulting trace log</a> in the Ghidra script mentioned before.</p>
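<p>For reference, roughly the same reduction can be sketched in Python. The sample lines are invented, in the “address, colon, instruction” shape that the <code class="language-plaintext highlighter-rouge">sed</code> expression implies; unlike the shell pipeline, this version drops blank lines and sorts by code point rather than by locale.</p>

```python
def unique_addresses(trace_lines):
    """Cut each line at the first ':' and return the sorted unique prefixes."""
    prefixes = set()
    for line in trace_lines:
        line = line.rstrip("\n")
        if line.strip():
            prefixes.add(line.split(":", 1)[0])
    return sorted(prefixes)


# Hypothetical trace lines in "ADDRESS: instruction" form.
sample = [
    "1BEF: ldb   $0215,s",
    "1BF3: cmpb  $0115,s",
    "1BEF: ldb   $0215,s",
]
print("\n".join(unique_addresses(sample)))
```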

<p>The entry point calls <code class="language-plaintext highlighter-rouge">FUN_2f3c()</code>, where we see the joystick being read:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/dis1.png" alt="" />
</div>

<p>We also see this unreached block:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/dis2.png" alt="" />
</div>

<p>If we look inside the called <code class="language-plaintext highlighter-rouge">FUN_1a63()</code>, there are some interesting checks, where the first one was reached, but the other 7 were not:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/dis3.png" alt="" />
</div>

<p>We also see some xor operations being done before those checks, so likely the password has some simple obfuscation.</p>

<p>Let’s place a breakpoint at the start of these checks, to see which address is loaded in <code class="language-plaintext highlighter-rouge">puVar6</code>. The first check’s disassembly shows that we can break at <code class="language-plaintext highlighter-rouge">bpset 1bf7</code> and inspect the stack register <code class="language-plaintext highlighter-rouge">S</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1bef e6 e9 02 15     LDB        0x215,S
1bf3 e1 e9 01 15     CMPB       0x115,S
1bf7 10 26 00 4e     LBNE       LAB_1c49
</code></pre></div></div>

<p>After submitting password “AAAAAAAA”, we hit the breakpoint and get <code class="language-plaintext highlighter-rouge">S = C995</code>, therefore <code class="language-plaintext highlighter-rouge">puVar6 = C995</code>. Let’s check the memory contents being compared under MAME (Debug &gt; New Memory Window):</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/mem1.png" alt="" />
</div>

<p>Is the obfuscation done on a character-by-character basis? Let’s confirm with “BAAAAAAA”:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/mem2.png" alt="" />
</div>

<p>Indeed, only the first byte changed, and it matches on both addresses. Recall in the decompilation that 4 shorts are being compared, so 8 bytes at these addresses. Since the xor operation is commutative, associative, and its own inverse, we can directly extract the expected password:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="c1"># c995+215
</span><span class="o">&gt;&gt;&gt;</span> <span class="n">cbaa</span><span class="o">=</span><span class="sa">b</span><span class="s">"</span><span class="se">\x77\xE5\xAF\x8B\xCD\x04\xD0\xA5</span><span class="s">"</span>
<span class="o">&gt;&gt;&gt;</span> <span class="c1"># c995+115
</span><span class="o">&gt;&gt;&gt;</span> <span class="n">caaa</span><span class="o">=</span><span class="sa">b</span><span class="s">"</span><span class="se">\x74\xE8\xAF\x89\xC7\x0E\xD4\xBD</span><span class="s">"</span>
<span class="o">&gt;&gt;&gt;</span> <span class="s">""</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="nb">chr</span><span class="p">(</span><span class="n">cbaa</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">^</span> <span class="n">caaa</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">^</span> <span class="nb">ord</span><span class="p">(</span><span class="s">'A'</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">))</span>
<span class="s">'BLACKKEY'</span>
</code></pre></div></div>
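<p>To spell out why the one-liner works, here is the same computation in two explicit steps. It assumes each position is xor’ed with a fixed keystream byte, which the “BAAAAAAA” experiment suggests but does not prove. Which of the two buffers holds the transformed input and which holds the expected bytes is also an assumption; since xor is symmetric, the result is the same either way.</p>

```python
# Bytes read from the MAME memory window after submitting "AAAAAAAA".
transformed_input = bytes.fromhex("77E5AF8BCD04D0A5")  # at c995+215 (cbaa)
expected = bytes.fromhex("74E8AF89C70ED4BD")           # at c995+115 (caaa)
known_input = b"AAAAAAAA"

# Assumption: each position is xor'ed with a fixed keystream byte.
keystream = bytes(t ^ k for t, k in zip(transformed_input, known_input))

# Undo the keystream on the expected bytes to get the password.
password = bytes(e ^ s for e, s in zip(expected, keystream))

# Sanity check: transforming the recovered password reproduces the expected bytes.
assert bytes(p ^ s for p, s in zip(password, keystream)) == expected
print(password.decode("ascii"))
```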

<p>We got the first flag!</p>

<p>Turns out that the decompilation was misleading about those addresses <code class="language-plaintext highlighter-rouge">c995+10b</code> and <code class="language-plaintext highlighter-rouge">c995+8b</code>. If we look at the disassembly, we confirm that only 8 bytes are being compared, one at a time:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1bef e6 e9 02 15     LDB        0x215,S
1bf3 e1 e9 01 15     CMPB       0x115,S
1bf7 10 26 00 4e     LBNE       LAB_1c49
1bfb e6 e9 02 16     LDB        0x216,S
1bff e1 e9 01 16     CMPB       0x116,S
1c03 26 44           BNE        LAB_1c49
1c05 e6 e9 02 17     LDB        0x217,S
1c09 e1 e9 01 17     CMPB       0x117,S
1c0d 26 3a           BNE        LAB_1c49
1c0f e6 e9 02 18     LDB        0x218,S
1c13 e1 e9 01 18     CMPB       0x118,S
1c17 26 30           BNE        LAB_1c49
1c19 e6 e9 02 19     LDB        0x219,S
1c1d e1 e9 01 19     CMPB       0x119,S
1c21 26 26           BNE        LAB_1c49
1c23 e6 e9 02 1a     LDB        0x21a,S
1c27 e1 e9 01 1a     CMPB       0x11a,S
1c2b 26 1c           BNE        LAB_1c49
1c2d e6 e9 02 1b     LDB        0x21b,S
1c31 e1 e9 01 1b     CMPB       0x11b,S
1c35 26 12           BNE        LAB_1c49
1c37 e6 e9 02 1c     LDB        0x21c,S
1c3b e1 e9 01 1c     CMPB       0x11c,S
1c3f 26 08           BNE        LAB_1c49
1c41 c6 01           LDB        #0x1
1c43 32 e9 02 2c     LEAS       0x22c,S
1c47 35 e0           PULS        Y U PC
</code></pre></div></div>
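<p>Read as a loop, the unrolled chain above amounts to the following Python sketch. The function name and the flat 64 KiB <code class="language-plaintext highlighter-rouge">memory</code> array are illustrative only, and treating <code class="language-plaintext highlighter-rouge">LAB_1c49</code> as the failure path is an assumption based on the listing.</p>

```python
def password_matches(memory, s, length=8):
    """Mirror the LDB/CMPB/BNE chain: compare byte i at S+0x215 with S+0x115."""
    for i in range(length):
        if memory[s + 0x215 + i] != memory[s + 0x115 + i]:
            return False  # presumably the branch to LAB_1c49
    return True  # corresponds to LDB #0x1 before returning


memory = bytearray(0x10000)
S = 0xC995  # stack register value observed at the breakpoint
memory[S + 0x215 : S + 0x215 + 8] = bytes.fromhex("77E5AF8BCD04D0A5")
memory[S + 0x115 : S + 0x115 + 8] = bytes.fromhex("74E8AF89C70ED4BD")
print(hex(S + 0x215), hex(S + 0x115), password_matches(memory, S))
```

<p>The two printed addresses are the ones inspected in the memory window earlier, which is why those were the right places to look.</p>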

<h1 id="part-2">Part 2</h1>

<blockquote>
  <p>Find all easter eggs and not to get caught cheating! Wrap what you get with “flag{}” as flag. Attachment is the same as part1.</p>
</blockquote>

<p>After submitting the correct password, we are now in the actual game:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/start.png" alt="" />
</div>

<p>It’s a platformer where we move to the right until we reach the flag (shown at the left, since the screen wraps around). We can collect up to 5 balls along the way by jumping around, but the last ball appears to be unreachable, since we can only fall to a bottomless pit:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/4.png" alt="" />
</div>

<p>Let’s start by taking <a href="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/mame2b.sort.tr">another trace log</a>, where we collect one ball, then get a “GAME OVER” after falling off a platform. If we run into the flag without collecting the easter eggs, we get a “TRY HARDER”. If the actual flag also uses the same message display function, we should have some non-highlighted branches in the caller function.</p>

<h2 id="easter-egg-1-collecting-all-5-balls">Easter egg 1: Collecting all 5 balls</h2>

<p>There are lots of options here, such as finding how the map is represented in data and placing a platform under the ball, or even changing the ball position. I chose to just <strong>edit the player’s position</strong> to overlap the ball, then edit it again to return to a platform.</p>

<ul>
  <li>We can run a fixed number of instruction steps at a time until the game registers the ball as captured, but before the player falls into the bottomless pit. I chose MAME’s “Run to next VBlank” (F8), which gave a similar result, with the advantage of running up to a rendered screen frame.</li>
  <li>To locate the player’s position in memory, there are also various options, but with the same underlying concept: we know that a variable increases if e.g. we move to the right, and decreases if we move to the left, so we just need to observe memory addresses that have such changes. We can monitor memory in real time with some tool like Cheat Engine, or take various memory dumps and write a script to compare values at each address across dumps. I decided to be lazy and just eyeball the memory, since the work RAM address range was fairly small (<code class="language-plaintext highlighter-rouge">0xc800..0xcbff</code>).</li>
</ul>
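<p>The dump-comparison option mentioned above could look roughly like this. It is a sketch under simple assumptions: the snapshots are raw byte strings of the work RAM taken while moving in one direction, values are unsigned, and wrap-around is ignored. The function name and the toy snapshots are invented for the example.</p>

```python
def find_tracking_addresses(snapshots, base=0xC800, width=1):
    """Addresses whose value strictly increases across the ordered snapshots."""
    hits = []
    size = min(len(s) for s in snapshots)
    for offset in range(size - width + 1):
        values = [int.from_bytes(s[offset:offset + width], "big") for s in snapshots]
        if all(a < b for a, b in zip(values, values[1:])):
            hits.append(base + offset)
    return hits


# Toy snapshots: only the byte at offset 1 grows on every step.
snapshots = [
    bytes([0, 10, 7, 3]),
    bytes([0, 12, 7, 9]),
    bytes([0, 15, 7, 4]),
]
print([hex(a) for a in find_tracking_addresses(snapshots)])
```

<p>Repeating the search with snapshots taken while moving the other way (and the list reversed) would narrow the candidates further.</p>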

<p>So if we jump a few times, we see that the acceleration is stored at <code class="language-plaintext highlighter-rouge">c891</code> (positive when going up, negative when going down), and the y-position is 2 bytes stored at <code class="language-plaintext highlighter-rouge">c896</code>. To collect the ball:</p>

<ol>
  <li>Move the character to the platform right above the 5th ball (y-position = <code class="language-plaintext highlighter-rouge">000e</code>);</li>
  <li>Break into the debugger by pressing “backtick”;</li>
  <li>Edit the position in the memory view to <code class="language-plaintext highlighter-rouge">fffe</code>;</li>
  <li>Run a few frames with “F8”;</li>
  <li>Once the ball is collected (the easter egg counter now reads “1/3”), edit the position back to <code class="language-plaintext highlighter-rouge">000e</code>;</li>
  <li>Resume with “F5”;</li>
</ol>
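<p>The 6809 stores 16-bit values big-endian. Whether the game treats the y-position as signed is an assumption, but under that reading the two values used in the steps above decode as follows:</p>

```python
def as_signed16(raw):
    """Interpret two bytes as a big-endian signed 16-bit integer."""
    return int.from_bytes(raw, "big", signed=True)


for text in ("000e", "fffe"):
    print(text, as_signed16(bytes.fromhex(text)))
```

<p>In other words, the edit moves the player from a small positive height to just below zero, and then back.</p>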

<p>To figure out the other eggs, we can <strong>look up variables updated after activating the 1st egg</strong>. Let’s go back and take a few memory dumps with <code class="language-plaintext highlighter-rouge">dump mame.dmp,0xc800,0x400</code>: before collecting the last ball, we take a baseline (./egg0.dmp), then another one after moving around and jumping (./egg0b.dmp), and a final one after collecting the last ball (./egg1.dmp). The differences between ./egg0.dmp and ./egg0b.dmp can be disregarded, since we are only interested in differences exclusive to ./egg1.dmp.</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- egg0.dmp
</span><span class="gi">+++ egg0b.dmp
</span><span class="p">@@ -1,23 +1,23 @@</span>
 C800:  00 00 00 00 00 00 00 3F 00 00 00 00 00 00 00 00  .......?........
 C810:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
<span class="gd">-C820:  03 00 00 00 00 05 31 3F 05 CC F8 50 CA 50 00 00  ......1?...P.P..
</span><span class="gi">+C820:  03 00 00 00 00 08 D6 3F 05 33 F8 50 CA 50 00 00  .......?.3.P.P..
</span> C830:  00 00 00 00 00 00 00 00 0B 00 00 01 00 30 75 00  .............0u.
 C840:  00 00 00 00 00 3F 00 00 00 00 00 00 00 FC 8D FE  .....?..........
 C850:  E8 FE B6 FD 1C 02 00 00 04 00 03 00 06 00 1F 1F  ................
 C860:  00 00 78 00 A0 01 66 00 00 00 00 00 00 00 00 00  ..x...f.........
 C870:  00 00 00 00 00 00 00 00 00 00 00 C8 7D 01 F3 BF  ............}...
<span class="gd">-C880:  00 2E 01 00 00 00 03 03 00 00 07 06 05 04 29 00  ..............).
-C890:  00 00 00 00 00 00 FF EE 00 00 00 EE 01 00 02 00  ................
-C8A0:  00 00 00 00 2E 01 01 01 00 00 00 00 00 00 C2 0A  ................
-C8B0:  04 2D 33 FB 61 47 00 00 2E 02 00 32 02 08 00 02  .-3.aG.....2....
</span><span class="gi">+C880:  00 2E 00 00 40 00 03 03 00 00 07 06 05 04 29 00  ....@.........).
+C890:  0E 00 00 00 00 00 FF EE 00 00 00 EE 01 00 02 03  ................
+C8A0:  00 00 00 00 2E 01 01 01 00 00 00 00 00 00 C5 B5  ................
+C8B0:  8A 5C 33 FB 61 47 00 00 2E 02 00 32 02 08 00 02  .\3.aG.....2....
</span> C8C0:  00 CE 00 00 D2 00 F8 00 01 00 00 00 00 00 00 00  ................
<span class="gd">-C8D0:  00 00 00 00 02 00 2E 00 00 32 00 08 00 00 00 CE  .........2......
-C8E0:  02 00 D2 02 F8 00 01 00 00 00 00 00 00 00 00 00  ................
-C8F0:  00 00 00 00 40 00 00 1E 02 00 02 02 08 00 02 00  ....@...........
-C900:  FE 00 00 E2 00 F8 00 01 00 00 00 00 00 00 00 00  ................
</span><span class="gi">+C8D0:  00 00 00 00 00 00 2E 00 00 12 00 00 20 00 08 00  ............ ...
+C8E0:  00 00 E0 00 F8 00 01 00 00 00 00 00 00 00 00 00  ................
+C8F0:  00 00 00 00 5E 02 00 02 02 08 00 02 00 FE 00 00  ....^...........
+C900:  A2 00 F8 00 01 F8 00 01 00 00 00 00 00 00 00 00  ................
</span> C910:  02 00 5E 00 00 02 00 08 00 00 00 FE 02 00 A2 02  ..^.............
 C920:  F8 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
<span class="gd">-C930:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
</span><span class="gi">+C930:  00 00 00 00 00 00 00 01 01 01 00 00 00 00 00 00  ................
</span> C940:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 C950:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
</code></pre></div></div>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- egg0.dmp
</span><span class="gi">+++ egg1.dmp
</span><span class="p">@@ -1,21 +1,21 @@</span>
 C800:  00 00 00 00 00 00 00 3F 00 00 00 00 00 00 00 00  .......?........
 C810:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01  ................
<span class="gd">-C820:  03 00 00 00 00 59 54 3F 05 CC F8 50 CA 50 00 00  .....YT?...P.P..
</span><span class="gi">+C820:  03 00 00 00 00 5F 7D 3F 05 33 F8 50 CA 50 00 00  ....._}?.3.P.P..
</span> C830:  00 00 00 00 00 00 00 00 0B 00 00 01 00 30 75 00  .............0u.
 C840:  00 00 00 00 00 3B 1E 01 99 00 00 00 00 FC 8D FE  .....;..........
 C850:  E8 FE B6 FD 1C 02 00 00 04 00 03 00 06 00 1F 1F  ................
 C860:  00 00 78 00 A0 01 66 00 00 00 00 00 00 00 00 00  ..x...f.........
 C870:  00 00 00 00 00 00 00 00 00 00 00 C8 7D 01 F3 BF  ............}...
<span class="gd">-C880:  00 1F 01 00 00 00 03 01 00 00 07 06 05 04 29 00  ..............).
-C890:  18 00 00 00 00 04 00 2E 00 00 00 2E 01 00 02 FE  ................
-C8A0:  0C 00 60 00 1F 01 01 01 00 00 00 00 00 00 C0 C4  ..`.............
-C8B0:  BC BB 94 92 23 35 00 00 1F 02 00 41 02 08 00 02  ....#5.....A....
-C8C0:  00 BF 00 00 E1 00 F8 00 01 00 00 00 00 00 00 00  ................
-C8D0:  00 00 00 00 02 00 1F 00 00 41 00 08 00 00 00 BF  .........A......
-C8E0:  02 00 E1 02 F8 00 01 00 00 00 00 00 00 00 00 00  ................
-C8F0:  00 00 00 00 40 00 00 0F 02 00 11 02 08 00 02 00  ....@...........
-C900:  EF 00 00 F1 00 F8 00 01 00 00 00 00 00 00 00 00  ................
-C910:  02 00 4F 00 00 11 00 08 00 00 00 EF 02 00 B1 02  ..O.............
</span><span class="gi">+C880:  00 1D 01 00 00 00 03 02 00 00 07 06 05 04 29 00  ..............).
+C890:  26 00 00 01 01 05 00 0E 00 00 00 0E 01 00 02 FC  &amp;...............
+C8A0:  0E 00 62 00 1D 01 01 01 00 00 00 00 00 00 B9 03  ..b.............
+C8B0:  42 3E 96 C9 1E B5 00 00 1D 02 00 43 02 08 00 02  B&gt;.........C....
+C8C0:  00 BD 00 00 E3 00 F8 00 01 00 00 00 00 00 00 00  ................
+C8D0:  00 00 00 00 02 00 1D 00 00 43 00 08 00 00 00 BD  .........C......
+C8E0:  02 00 E3 02 F8 00 01 00 00 00 00 00 00 00 00 00  ................
+C8F0:  00 00 00 00 4D 02 00 13 02 08 00 02 00 ED 00 00  ....M...........
+C900:  B3 00 F8 00 01 F8 00 01 00 00 00 00 00 00 00 00  ................
+C910:  02 00 4D 00 00 13 00 08 00 00 00 ED 02 00 B3 02  ..M.............
</span> C920:  F8 00 01 F8 00 01 00 00 00 00 00 00 00 00 00 00  ................
 C930:  00 00 00 00 00 00 00 01 01 01 01 01 01 01 01 01  ................
 C940:  01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01  ................
<span class="p">@@ -33,11 +33,11 @@</span>
 CA00:  01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01  ................
 CA10:  01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01  ................
 CA20:  01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01  ................
<span class="gd">-CA30:  01 01 01 01 01 01 01 07 01 07 07 07 07 10 00 00  ................
-CA40:  D0 20 D0 E0 40 28 A2 C2 F0 0A 4D 01 01 01 01 00  . ..@(....M.....
-CA50:  30 2F 33 80 7F 91 FC E4 B0 87 64 64 64 64 20 80  0/3.......dddd .
-CA60:  8C AA 51 49 3F 66 0D F4 DB 42 33 D0 DE 3E E5 00  ..QI?f...B3..&gt;..
-CA70:  01 02 03 04 00 00 00 00 00 00 00 13 62 2D 6F 6B  ............b-ok
</span><span class="gi">+CA30:  01 01 01 01 01 01 07 07 01 07 01 07 07 F0 10 10  ................
+CA40:  00 00 D0 20 40 28 A2 C2 F0 0A 4D 01 01 01 01 01  ... @(....M.....
+CA50:  31 2F 33 80 7F 91 FC E4 B0 87 64 64 64 64 64 80  1/3.......ddddd.
+CA60:  8C AA 51 49 3F 9A 0D F4 DB 42 33 D0 DE 3E E5 00  ..QI?....B3..&gt;..
+CA70:  01 00 03 00 00 00 00 00 00 00 00 13 62 2D 6F 6B  ............b-ok
</span> CA80:  AC 33 F8 DD 61 F2 DC DA 30 BA 3D 2F 7B 9F 77 44  .3..a...0.=/{.wD
 CA90:  67 C3 5E 7D CC 2B 20 5B 5F FA EF C9 5A FE 54 D9  g.^}.+ [_...Z.T.
 CAA0:  9C 0C 39 BF 4B 59 49 21 1A FF 74 E8 AF 89 C7 0E  ..9.KYI!..t.....
<span class="p">@@ -54,11 +54,11 @@</span>
 CB50:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 CB60:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
 CB70:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
<span class="gd">-CB80:  00 00 6C 9D 31 12 20 AB C1 4C BD 6C 73 10 E0 11  ..l.1. ..L.ls...
-CB90:  45 A2 B8 0A 1C 3F A5 BF 03 8D 25 F8 B9 21 C4 10  E....?....%..!..
-CBA0:  BA 00 0F 48 F3 00 BB 79 83 08 CD 00 BC C5 A5 75  ...H...y.......u
-CBB0:  0D D0 BD C7 9E 60 19 00 4B 00 CB D1 2D 4E 00 A0  .....`..K...-N..
-CBC0:  0C 29 00 F2 F0 32 5D 00 00 00 00 00 61 07 00 07  .)...2].....a...
</span><span class="gi">+CB80:  00 00 71 F1 B3 86 1E D7 14 38 B8 71 75 10 20 13  ..q......8.qu. .
+CB90:  46 A3 B6 07 59 33 9F 00 7E 8B 60 00 B7 E4 C8 F0  F...Y3..~.`.....
+CBA0:  B8 A2 45 11 0A 10 B9 A5 3F 1C 03 10 0B 00 00 15  ..E.....?.......
+CBB0:  58 0E 9C 00 00 26 04 00 00 20 00 12 CB DB 10 2D  X....&amp;... .....-
+CBC0:  3D 29 00 F2 F0 32 5D 00 00 00 00 00 61 07 00 07  =)...2].....a...
</span> CBD0:  07 20 20 20 20 20 20 20 63 80 80 42 4C 41 43 4B  .       c..BLACK
 CBE0:  4B 45 59 80 CA 7B 41 5C 00 50 00 20 20 20 20 20  KEY..{A\.P.
 CBF0:  30 80 00 00 00 00 00 00 00 00 00 00 00 00 73 21  0.............s!
</code></pre></div></div>
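<p>Instead of reading two diffs side by side, the “exclusive to ./egg1.dmp” filter can be sketched in Python. The parser assumes the row layout shown above (address, colon, 16 hex bytes, ASCII column); the function names are made up. The sample uses only the <code class="language-plaintext highlighter-rouge">CA50</code> row, and assumes that row is identical in ./egg0.dmp and ./egg0b.dmp because it does not show up in the first diff.</p>

```python
def parse_dump(text):
    """Parse rows like 'CA50:  30 2F ... 80  ascii' into {address: byte}."""
    memory = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        address_text, rest = line.split(":", 1)
        base = int(address_text, 16)
        # Assumes 16 bytes per row; the trailing ASCII column is ignored.
        for i, token in enumerate(rest.split()[:16]):
            memory[base + i] = int(token, 16)
    return memory


def changed(a, b):
    """Addresses present in both dumps whose values differ."""
    return {addr for addr in set(a).intersection(b) if a[addr] != b[addr]}


def exclusive_changes(baseline, noise, target):
    """Changes seen in target but not between the two uneventful dumps."""
    return sorted(changed(baseline, target) - changed(baseline, noise))


egg0 = parse_dump("CA50:  30 2F 33 80 7F 91 FC E4 B0 87 64 64 64 64 20 80  0/3.......dddd .")
egg0b = dict(egg0)  # assumed unchanged for this row
egg1 = parse_dump("CA50:  31 2F 33 80 7F 91 FC E4 B0 87 64 64 64 64 64 80  1/3.......ddddd.")
print([hex(a) for a in exclusive_changes(egg0, egg0b, egg1)])
```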

<p>Something interesting appears at address <code class="language-plaintext highlighter-rouge">ca50</code>: the characters used for the easter egg display. Let’s go through each one of the cross-references for this address:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/eggdis1.png" alt="" />
</div>

<ul>
  <li><code class="language-plaintext highlighter-rouge">FUN_2f3c()</code> is the main function we saw before; it just initializes the display with “0/3”:
    <div class="c-container-center">
      <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/eggdis2.png" alt="" />
  </div>
  </li>
  <li><code class="language-plaintext highlighter-rouge">FUN_2851()</code> has 5 references, and guess what, it increments <code class="language-plaintext highlighter-rouge">ca50</code> when the ball counter at <code class="language-plaintext highlighter-rouge">c895</code> reaches value 5. Since the trace was taken when only one ball was collected, the highlights in one of the references go up to the branch instruction:
    <div class="c-container-center">
      <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/eggdis3.png" alt="" />
  </div>
  </li>
  <li><code class="language-plaintext highlighter-rouge">FUN_14d7()</code> has 2 references, and each one leads to the remaining eggs.</li>
</ul>

<h2 id="easter-egg-2-input-button-sequence">Easter egg 2: Input button sequence</h2>

<p>One of the references ends up on an unreached block that compares <code class="language-plaintext highlighter-rouge">c894</code> with value <code class="language-plaintext highlighter-rouge">0x8</code> before incrementing the easter egg counter:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/eggdis5.png" alt="" />
</div>

<p>By following the branch cross-references, there’s a block that is checking for player input:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/eggdis6.png" alt="" />
</div>

<p>With some experimentation, we see that <code class="language-plaintext highlighter-rouge">c894</code> <strong>increases with a certain sequence of button presses</strong>, but goes back to value 0 otherwise. So we just need to figure out the right sequence, which goes up to 8 buttons. We can use save states by pressing “F7”, going back to the last known good state whenever the counter is reset. We get a “2/3” after pressing “Left Right Up Down LCtrl LCtrl LAlt LAlt”!</p>
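<p>The save-state search can be written down as a small algorithm: keep a known-good prefix and try each button as the next press, keeping the one that advances the counter. The sketch below replaces the game with a stand-in function, so everything except the button sequence itself is hypothetical; the list of candidate buttons is also an assumption.</p>

```python
def recover_sequence(counter_after, buttons, length):
    """Extend a known-good prefix one button at a time.

    counter_after(presses) must replay the presses from a clean state and
    return the resulting counter value (the role played by save states).
    """
    prefix = []
    for _ in range(length):
        for button in buttons:
            if counter_after(prefix + [button]) == len(prefix) + 1:
                prefix.append(button)
                break
        else:
            raise RuntimeError("no button advanced the counter")
    return prefix


# Stand-in for the game: the secret is the sequence found in the article.
SECRET = ["Left", "Right", "Up", "Down", "LCtrl", "LCtrl", "LAlt", "LAlt"]
BUTTONS = ["Left", "Right", "Up", "Down", "LCtrl", "LAlt"]


def simulated_counter(presses):
    """Counter goes up on the expected button and resets to 0 otherwise."""
    count = 0
    for button in presses:
        count = count + 1 if button == SECRET[count] else 0
    return count


print(recover_sequence(simulated_counter, BUTTONS, len(SECRET)))
```

<p>With 6 candidate buttons and 8 positions this needs at most 48 replays, compared with 6<sup>8</sup> blind guesses.</p>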

<h2 id="easter-egg-3-jump-based-prng">Easter egg 3: “Jump-based PRNG”</h2>

<p>The last reference points to a partially reached block, where variables <code class="language-plaintext highlighter-rouge">c8b3, c8b4, c8b5</code> are compared against fixed values. These variables are set in the called <code class="language-plaintext highlighter-rouge">FUN_3577()</code> and appear to be part of some pseudorandom number generator:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/eggdis4.png" alt="" />
</div>

<p>These values are updated whenever we jump. Also, <code class="language-plaintext highlighter-rouge">c8b2</code> appears to be a counter incremented on each jump, but ignored in these checks.</p>

<h3 id="solution-attempt-find-prng-win-state-via-constraint-solving">Solution attempt: Find PRNG win state via constraint solving</h3>

<p>So, we can just <strong>patch the PRNG state that comes before the state resulting in the checked values</strong>. With a <a href="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/prng.py">Z3 script</a> to generate it based on <code class="language-plaintext highlighter-rouge">FUN_3577()</code>, we get the solution <code class="language-plaintext highlighter-rouge">ff 8b fb 08</code>. After patching these values and jumping, the state becomes <code class="language-plaintext highlighter-rouge">00 70 78 b7</code> and we pass these checks.</p>

<p>The counter now reads “3/3”, so we are done, right? Not quite… If we run to the flag, we get the message “DONT CHEAT”. Most of the time I wasted with these challenges starts right about now!</p>

<h3 id="oh-patching-the-win-state-doesnt-give-a-unique-solution">Oh, patching the win state doesn’t give a unique solution</h3>

<p>At first, it wasn’t clear what was triggering this anti-cheat. I was fairly confident in the 2nd egg, since it was activated with just input keys. But what if the 1st egg required you to jump before patching the position, in case the game checked if you were in the middle of a jump? What if the 3rd egg required you to patch the state that comes before the win state, in case some intermediate variable was being updated and checked? But none of these variants gave a different result, except one.</p>

<p>Eventually, I tried jumping a different number of times before patching the PRNG state, and <strong>one of the variables checked by the anti-cheat was ending up with different values</strong> on each try.</p>

<h3 id="disassembling-the-anti-cheat">Disassembling the anti-cheat</h3>

<p>Rather than guessing, let’s confirm which variables are being checked and if they are being updated as expected.</p>

<p>If we search for the string “DONT CHEAT”, we see it’s stored at <code class="language-plaintext highlighter-rouge">253d</code>. We can search for bytes <code class="language-plaintext highlighter-rouge">25 3d</code> and find a reference in <code class="language-plaintext highlighter-rouge">FUN_2549()</code> (it didn’t have a cross-reference since it wasn’t recognized as a data address):</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/cheatdis1.png" alt="" />
</div>

<p><code class="language-plaintext highlighter-rouge">FUN_1470()</code> is likely the message display function. We see that a stack relative address is used instead of “DONT CHEAT” depending on the value of <code class="language-plaintext highlighter-rouge">c936</code>. It’s a pointer to a table that gets filled at the start of the function with values <code class="language-plaintext highlighter-rouge">0x00..0xff</code>:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/cheatdis2.png" alt="" />
</div>

<p>These values are then scrambled using the value of address <code class="language-plaintext highlighter-rouge">ca60</code>:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/cheatdis3.png" alt="" />
</div>

<p>If we go back to the disassembly of the 1st easter egg activation, we see this same address being xor’d with value 3 at <code class="language-plaintext highlighter-rouge">2b37</code>. In fact, all checks manipulate this value, including the PRNG state checks:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/cheatdis4.png" alt="" />
</div>

<p>So what’s going on? Well, there’s this variable that is also set:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1825 f8 c8 a1        EORB       DAT_c8a1
1828 f7 ca 66        STB        DAT_ca66
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">c8a1</code> is a counter also updated by the PRNG. <code class="language-plaintext highlighter-rouge">ca66</code> is read with a relative offset during the anti-cheat scrambling. We can confirm this by setting a memory read breakpoint with <code class="language-plaintext highlighter-rouge">wp ca66,1,r</code>, which is hit inside the anti-cheat function:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>25bd e6 89 ca 60     LDB        DAT_ca60,X
</code></pre></div></div>

<p>This is why our previous patch didn’t work: we weren’t considering the value of <code class="language-plaintext highlighter-rouge">c8a1</code>.</p>
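<p>To see why a single wrong byte matters, consider what a fill-then-scramble routine usually looks like. The pattern described above resembles RC4’s key-scheduling step; that resemblance is only a guess, and the actual routine in <code class="language-plaintext highlighter-rouge">FUN_2549()</code> may differ. The key bytes below are hypothetical, with only the byte at index 6 (the <code class="language-plaintext highlighter-rouge">ca66</code> slot) changed between the two keys.</p>

```python
def scramble(key):
    """RC4-style key scheduling: permute 0x00..0xff under control of the key."""
    table = list(range(256))
    j = 0
    for i in range(256):
        j = (j + table[i] + key[i % len(key)]) % 256
        table[i], table[j] = table[j], table[i]
    return table


# Hypothetical key material; only the byte at index 6 differs.
key_a = bytes([0x8C, 0xAA, 0x51, 0x49, 0x3F, 0x9A, 0x00, 0xF4])
key_b = bytes([0x8C, 0xAA, 0x51, 0x49, 0x3F, 0x9A, 0x01, 0xF4])
print(scramble(key_a) == scramble(key_b))
```

<p>In a scheme like this every key byte feeds into the swaps, so anything later derived from the table (such as the text passed to the display function) depends on all of them, including the one mixed with <code class="language-plaintext highlighter-rouge">c8a1</code>.</p>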

<h3 id="solution-keep-generating-prng-values-via-debugger-script">Solution: Keep generating PRNG values via debugger script</h3>

<p>What if we just need to, you know, <strong>generate values to win</strong>? That’s exactly what we do with the following debugger script. The idea is to break at each of the 3 PRNG state checks, then if the check against register <code class="language-plaintext highlighter-rouge">B</code> isn’t satisfied, we set the program counter back to the PRNG generator call. We also set a breakpoint at <code class="language-plaintext highlighter-rouge">1809</code> after these checks, just to make sure we eventually stop.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;bpset 17f1, (B != 70) ,{pc=178a ; g}
Breakpoint 37 set
&gt;bpset 17fa, (B != 78) ,{pc=178a ; g}
Breakpoint 38 set
&gt;bpset 1803, (B != b7) ,{pc=178a ; g}
Breakpoint 39 set
&gt;bpset 1809
Breakpoint 3A set
</code></pre></div></div>
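<p>The logic of these conditional breakpoints is a plain retry loop. The sketch below uses a stand-in generator (a small full-period LCG), <strong>not</strong> the routine in <code class="language-plaintext highlighter-rouge">FUN_3577()</code>, and checks a single byte instead of three. It only illustrates that every retry is a real generator call, so any call counter stays consistent with the final state.</p>

```python
def step(state):
    """Hypothetical stand-in generator; NOT the routine in FUN_3577()."""
    return (5 * state + 1) % 256


def run_until(target, state=0, limit=1000):
    """Keep calling the generator until it yields target; return (state, calls)."""
    calls = 0
    while state != target:
        if calls == limit:
            raise RuntimeError("target not reached")
        state = step(state)
        calls += 1
    return state, calls


final_state, calls = run_until(0x70)
print(hex(final_state), calls)
```

<p>Patching the state directly would skip the loop and leave <code class="language-plaintext highlighter-rouge">calls</code> at whatever it was before, which is the kind of mismatch the anti-cheat picks up.</p>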

<p>Eventually…</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Stopped at breakpoint 3A
&gt;bpclear
Cleared all breakpoints
&gt;go
</code></pre></div></div>

<p>And the PRNG state:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>C8A0  1F 05 39 00 07 01 01 01 00 00 00 00 01 00 C1 5A   ..9............Z
C8B0  60 53 6D 70 78 B7 00 00 07 00 00 39 02 00 20 02   `Smpx......9.. .
</code></pre></div></div>
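<p>The two bytes starting at <code class="language-plaintext highlighter-rouge">c8a1</code> in this row can be read as one 16-bit big-endian value (the 6809’s byte order). Treating them as a single counter is an interpretation of the dump; a quick way to decode it:</p>

```python
# Row C8A0 of the memory view above; c8a1 and c8a2 are the bytes at index 1 and 2.
row_c8a0 = bytes.fromhex("1F 05 39 00 07 01 01 01 00 00 00 00 01 00 C1 5A")
jump_count = int.from_bytes(row_c8a0[1:3], "big")
print(jump_count)
```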

<p>Look at <code class="language-plaintext highlighter-rouge">c8a1</code>, we had to jump 1337 times… Is this it? Yes, if we run to the flag, we now get the actual second flag!</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/0CTF2022/flag2.png" alt="" />
</div>]]></content><author><name></name></author><category term="ctf" /><category term="emulation" /><category term="reversing" /><category term="tracing" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">CTF Writeup - PlaidCTF 2022 - coregasm</title><link href="https://nevesnunes.github.io/blog/2022/04/13/CTF-Writeup-PlaidCTF-2022-coregasm.html" rel="alternate" type="text/html" title="CTF Writeup - PlaidCTF 2022 - coregasm" /><published>2022-04-13T01:00:00+01:00</published><updated>2022-04-13T01:00:00+01:00</updated><id>https://nevesnunes.github.io/blog/2022/04/13/CTF-Writeup---PlaidCTF-2022---coregasm</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2022/04/13/CTF-Writeup-PlaidCTF-2022-coregasm.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<h1 id="introduction">Introduction</h1>

<p>We are given an executable and a core dump generated near the end of its execution. If we run the executable multiple times, we see that it prints 4 flags that change on every run, so we need to figure out the random bytes that were used to build the flags in the given core dump.</p>

<h1 id="description">Description</h1>

<blockquote>
  <p>When you get a core file, you’re usually pretty sad. Hopefully this one makes you happy.</p>
</blockquote>

<p>Download: <a href="https://nevesnunes.github.io/blog/assets/writeups/PlaidCTF2022/coregasm/coregasm">bin</a>, <a href="https://nevesnunes.github.io/blog/assets/writeups/PlaidCTF2022/coregasm/core">core</a></p>

<h1 id="analysis">Analysis</h1>

<p>Symbols were not stripped, so we can open Ghidra and jump right into <code class="language-plaintext highlighter-rouge">main()</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">undefined</span><span class="p">[</span><span class="mi">16</span><span class="p">]</span> <span class="n">main</span><span class="p">(</span><span class="n">undefined8</span> <span class="n">param_1</span><span class="p">,</span><span class="kt">char</span> <span class="o">**</span><span class="n">param_2</span><span class="p">,</span><span class="n">undefined8</span> <span class="n">param_3</span><span class="p">,</span><span class="n">ulong</span> <span class="n">param_4</span><span class="p">)</span> <span class="p">{</span>
  <span class="kt">int</span> <span class="n">iVar1</span><span class="p">;</span>
  <span class="kt">ssize_t</span> <span class="n">sVar2</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">__line</span><span class="p">;</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">__assertion</span><span class="p">;</span>

  <span class="n">puts</span><span class="p">(</span><span class="s">"Would you like to see a magic trick?"</span><span class="p">);</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"Printing all the flags..."</span><span class="p">);</span>
  <span class="n">fflush</span><span class="p">((</span><span class="kt">FILE</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">);</span>
  <span class="n">iVar1</span> <span class="o">=</span> <span class="n">open</span><span class="p">(</span><span class="s">"/dev/urandom"</span><span class="p">,</span><span class="mi">0</span><span class="p">);</span>
  <span class="n">sVar2</span> <span class="o">=</span> <span class="n">read</span><span class="p">(</span><span class="n">iVar1</span><span class="p">,</span><span class="n">globalbuf</span><span class="p">,</span><span class="mh">0x40</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">sVar2</span> <span class="o">==</span> <span class="mh">0x40</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">close</span><span class="p">(</span><span class="n">iVar1</span><span class="p">);</span>
    <span class="n">flag4</span><span class="p">(</span><span class="n">globalbuf</span><span class="p">);</span>
    <span class="n">flag3</span><span class="p">(</span><span class="n">globalbuf</span><span class="p">);</span>
    <span class="n">flag2</span><span class="p">(</span><span class="n">globalbuf</span><span class="p">);</span>
    <span class="n">flag1</span><span class="p">(</span><span class="n">globalbuf</span><span class="p">);</span>
    <span class="n">puts</span><span class="p">(</span><span class="s">"///time for core///"</span><span class="p">);</span>
    <span class="n">fflush</span><span class="p">((</span><span class="kt">FILE</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">);</span>
    <span class="n">iVar1</span> <span class="o">=</span> <span class="n">strcmp</span><span class="p">(</span><span class="s">"///time for core///"</span><span class="p">,</span><span class="o">*</span><span class="n">param_2</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">iVar1</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
      <span class="k">return</span> <span class="n">ZEXT816</span><span class="p">(</span><span class="n">param_4</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x40</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="n">__line</span> <span class="o">=</span> <span class="mh">0xc5</span><span class="p">;</span>
    <span class="n">__assertion</span> <span class="o">=</span> <span class="s">"strcmp(</span><span class="se">\"</span><span class="s">///time for core///</span><span class="se">\"</span><span class="s">, argv[0]) == 0"</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="k">else</span> <span class="p">{</span>
    <span class="n">__line</span> <span class="o">=</span> <span class="mh">0xbb</span><span class="p">;</span>
    <span class="n">__assertion</span> <span class="o">=</span> <span class="s">"x == 64"</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="n">__assert_fail</span><span class="p">(</span><span class="n">__assertion</span><span class="p">,</span><span class="s">"./main.c"</span><span class="p">,</span><span class="n">__line</span><span class="p">,(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">__PRETTY_FUNCTION__</span><span class="p">.</span><span class="mi">3855</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We see that 0x40 random bytes are stored in <code class="language-plaintext highlighter-rouge">globalbuf</code>, located at address <code class="language-plaintext highlighter-rouge">0x001040a0</code> (base address <code class="language-plaintext highlighter-rouge">0x00100000</code> + offset <code class="language-plaintext highlighter-rouge">0x40a0</code>) in section <code class="language-plaintext highlighter-rouge">.bss</code>, which confirms it is a static / global variable. It is then passed as an argument to each flag function. Each of these functions can be seen as a self-contained task.</p>
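
<p>Since the binary is position independent, the Ghidra address has to be rebased before it can be used in a debugger. A minimal sketch of that arithmetic follows; the helper <code class="language-plaintext highlighter-rouge">runtime_address()</code> is hypothetical, and the two load bases are the ones that show up later in this writeup (our own debugging session and the provided core dump).</p>

```python
GHIDRA_BASE = 0x00100000       # image base Ghidra assigned to the binary
GLOBALBUF_GHIDRA = 0x001040A0  # address of globalbuf as shown in Ghidra

def runtime_address(load_base, ghidra_address, ghidra_base=GHIDRA_BASE):
    """Rebase a Ghidra address onto the base the image was actually loaded at."""
    return load_base + (ghidra_address - ghidra_base)

# Load base seen in our own gdb session.
print(hex(runtime_address(0x555555554000, GLOBALBUF_GHIDRA)))  # 0x5555555580a0
# Load base found in the provided core dump.
print(hex(runtime_address(0x55FA6CF06000, GLOBALBUF_GHIDRA)))  # 0x55fa6cf0a0a0
```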

<p>We start with <code class="language-plaintext highlighter-rouge">flag1()</code>, since the global buffer state present in the core dump should reflect the operations done in that last function call.</p>

<h2 id="task-1-co">Task 1: co</h2>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">flag1</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="n">param_1</span><span class="p">)</span> <span class="p">{</span>
  <span class="kt">long</span> <span class="n">i</span><span class="p">;</span>

  <span class="o">*</span><span class="n">param_1</span> <span class="o">=</span> <span class="o">*</span><span class="n">param_1</span> <span class="o">^</span> <span class="mh">0x80083ed7e794313b</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x75136ebbbf60734f</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x6c46a704af4d8380</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xc1991ab8c1674bbf</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xdc0b819132401105</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xaf4464465d7d4dc0</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x9ead54bd51956632</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xc4d2c981312f974</span><span class="p">;</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"Flag 1:"</span><span class="p">);</span>
  <span class="n">puts</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span><span class="p">);</span>
  <span class="n">fflush</span><span class="p">((</span><span class="kt">FILE</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">);</span>
  <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">do</span> <span class="p">{</span>
    <span class="o">*</span><span class="p">(</span><span class="n">byte</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">byte</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="o">^</span> <span class="mh">0xa5</span><span class="p">;</span>
    <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="mh">0x40</span><span class="p">);</span>
  <span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We see that whatever state <code class="language-plaintext highlighter-rouge">flag2()</code> left in the global buffer, it is xor’d with some constants, a flag is printed, then the buffer is again xor’d with a single byte. Doing the reverse operation should be straightforward, but first we need to find the buffer’s bytes in the core dump.</p>
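
<p>Because xor is its own inverse, each step can be undone by applying it again in the opposite order. The following is a rough Python model of <code class="language-plaintext highlighter-rouge">flag1()</code> and its inverse, not the author's solver: the function names are made up for illustration, and the 64-bit words are assumed to be little-endian, as on x86-64.</p>

```python
import os

# Constants taken from the decompiled flag1().
FLAG1_KEYS = [
    0x80083ED7E794313B, 0x75136EBBBF60734F, 0x6C46A704AF4D8380, 0xC1991AB8C1674BBF,
    0xDC0B819132401105, 0xAF4464465D7D4DC0, 0x9EAD54BD51956632, 0x0C4D2C981312F974,
]

def xor_words(buf, keys):
    """Xor each little-endian 64-bit word of buf with the matching key."""
    out = bytearray()
    for n, key in enumerate(keys):
        word = int.from_bytes(buf[8 * n : 8 * n + 8], "little")
        out += (word ^ key).to_bytes(8, "little")
    return bytes(out)

def xor_byte(buf, k):
    """Xor every byte of buf with the single byte k."""
    return bytes(b ^ k for b in buf)

def flag1_model(state):
    """Return (buffer printed as the flag, state left behind in globalbuf)."""
    printed = xor_words(state, FLAG1_KEYS)
    return printed, xor_byte(printed, 0xA5)

def flag1_invert(leftover):
    """From the leftover state, recover (printed buffer, state on entry)."""
    printed = xor_byte(leftover, 0xA5)
    return printed, xor_words(printed, FLAG1_KEYS)

# Round trip on random input: inverting the leftover state gives back both values.
state = os.urandom(0x40)
printed, leftover = flag1_model(state)
assert flag1_invert(leftover) == (printed, state)
```

<p>The same two helpers apply to the real data once the leftover 0x40 bytes have been located in the core dump.</p>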

<h3 id="solution-1-finding-bytes-using-surrounding-addresses">Solution 1: Finding bytes using surrounding addresses</h3>

<p>Let’s debug the executable. We place a breakpoint at the end of <code class="language-plaintext highlighter-rouge">flag1()</code>, take note of the random bytes we obtain, then take our own core dump. We should be able to find these bytes in it, then extrapolate their location in the provided core dump:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; b *(0x555555554000 + 0x1342)
pwndbg&gt; r
...
pwndbg&gt; x/20wx 0x5555555580a0
0x5555555580a0: 0xa27718b2      0xa583e99c      0xb3fd791d      0x9db0e59c
0x5555555580b0: 0x53ebe960      0xf1d42dd2      0x335f11f8      0xe1bdc001
0x5555555580c0: 0x47b6b7a7      0x21da16a8      0x40a05792      0x15f31778
0x5555555580d0: 0x6a53548b      0xa196f2c5      0x1d003010      0x4c6df0b1
0x5555555580e0: 0x00000000      0x00000000      0x00000000      0x00000000
pwndbg&gt; generate-core-file
</code></pre></div></div>

<p>To find the offset of the first 64-bit word in our core dump:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>binwalk <span class="nt">-R</span> <span class="s1">'\xb2\x18\x77\xa2\x9c\xe9\x83\xa5'</span> core.1
<span class="c"># 0x3508</span>
</code></pre></div></div>
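
<p>The search pattern is just the first two 32-bit words printed by gdb, serialized in memory order (little-endian). The same lookup can be sketched in plain Python; <code class="language-plaintext highlighter-rouge">words_to_bytes()</code> and <code class="language-plaintext highlighter-rouge">find_all()</code> are hypothetical helpers, and the blob below is a synthetic stand-in for the core file.</p>

```python
def words_to_bytes(words, size=4):
    """Serialize integers the way x86-64 stores them in memory (little-endian)."""
    return b"".join(w.to_bytes(size, "little") for w in words)

def find_all(blob, pattern):
    """Return every offset at which pattern occurs in blob."""
    offsets = []
    start = blob.find(pattern)
    while start != -1:
        offsets.append(start)
        start = blob.find(pattern, start + 1)
    return offsets

# First two words from the x/20wx output above.
pattern = words_to_bytes([0xA27718B2, 0xA583E99C])
print(pattern.hex())  # b21877a29ce983a5

# Synthetic stand-in for a core file; with the real one, read "core.1" in binary mode.
blob = bytes(0x20) + pattern + bytes(0x10)
print([hex(o) for o in find_all(blob, pattern)])  # ['0x20']
```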

<p>Let’s look at a hex dump near this offset:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>000034b0: 5038 e3f7 ff7f 0000 70ef eaf7 ff7f 0000  P8......p.......
000034c0: b650 5555 5555 0000 0000 0000 0000 0000  .PUUUU..........
000034d0: 6880 5555 5555 0000 0000 0000 0000 0000  h.UUUU..........
000034e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000034f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003500: 0000 0000 0000 0000 b218 77a2 9ce9 83a5  ..........w.....
00003510: 1d79 fdb3 9ce5 b09d 60e9 eb53 d22d d4f1  .y......`..S.-..
00003520: f811 5f33 01c0 bde1 a7b7 b647 a816 da21  .._3.......G...!
00003530: 9257 a040 7817 f315 8b54 536a c5f2 96a1  .W.@x....TSj....
00003540: 1030 001d b1f0 6d4c 0000 0000 0000 0000  .0....mL........
</code></pre></div></div>

<p>Before the 0x40 buffer bytes, there seem to be some addresses from our process image mapping (base address <code class="language-plaintext highlighter-rouge">0x555555554000</code>) and also some libc addresses (base address <code class="language-plaintext highlighter-rouge">0x7ffff7dbe000</code>). The process image base address seems to be present at offset <code class="language-plaintext highlighter-rouge">0x88</code> in our core dump, and we do find a similar base address at the same offset in the provided core dump (<code class="language-plaintext highlighter-rouge">0x55fa6cf06000</code>).</p>

<p>Although the buffer is at offset <code class="language-plaintext highlighter-rouge">0x3508</code> in our core dump, that offset does not hold the buffer in the provided core dump. However, we can expect the buffer to be placed after a set of addresses similar to the ones we identified earlier. Let’s take the high bytes of the process image base address and search for their offsets:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>binwalk <span class="nt">-R</span> <span class="s1">'\xf0\x6c\xfa\x55'</span> core
</code></pre></div></div>
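
<p>To make the choice of bytes explicit: a pointer into the image shares its upper bytes with the base address, while the two lowest bytes vary with the offset inside the image. The sketch below derives the search pattern under that assumption (it only holds while the image is small enough that offsets do not carry into the third byte); <code class="language-plaintext highlighter-rouge">pointer_search_pattern()</code> is a hypothetical helper.</p>

```python
def pointer_search_pattern(base, lo=2, hi=6):
    """Bytes of a little-endian 64-bit pointer that stay constant across the image.

    Bytes 0..1 change with the offset inside the image and bytes 6..7 are zero
    for user-space addresses, so only the middle four bytes are kept.
    """
    return base.to_bytes(8, "little")[lo:hi]

pattern = pointer_search_pattern(0x55FA6CF06000)
print(pattern.hex())  # f06cfa55

# A pointer from the hex dump of the provided core (0x55fa6cf0a068) matches too.
assert pattern in (0x55FA6CF0A068).to_bytes(8, "little")
```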

<p>Eventually we bump into the buffer bytes at <code class="language-plaintext highlighter-rouge">0x30a0</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00003040: 0011 d82e 567f 0000 909c c92e 567f 0000  ....V.......V...
00003050: d03e d12e 567f 0000 e0a1 c92e 567f 0000  .&gt;..V.......V...
00003060: 0000 0000 0000 0000 68a0 f06c fa55 0000  ........h..l.U..
00003070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000030a0: f5e6 f1e3 dec7 c4cb c4cb c4fa c7c4 cbc4  ................
000030b0: cbc4 d8a5 8585 8585 8585 8585 8585 8585  ................
000030c0: 8585 8585 8585 8585 8585 8585 8585 8585  ................
000030d0: 8585 8585 8585 8585 8585 8585 8585 8585  ................
</code></pre></div></div>

<p>We can now extract and apply the last xor operation to these bytes:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="n">v</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">'core'</span><span class="p">,</span><span class="s">'rb'</span><span class="p">).</span><span class="n">read</span><span class="p">()</span>
<span class="o">&gt;&gt;&gt;</span> <span class="s">''</span><span class="p">.</span><span class="n">join</span><span class="p">([</span><span class="nb">chr</span><span class="p">(</span><span class="n">x</span> <span class="o">^</span> <span class="mh">0xa5</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">v</span><span class="p">[</span><span class="mh">0x30a0</span> <span class="p">:</span> <span class="mh">0x30a0</span> <span class="o">+</span> <span class="mh">0x40</span><span class="p">]])</span>
<span class="s">'PCTF{banana_banana}</span><span class="se">\x00</span><span class="s">                                            '</span>
</code></pre></div></div>

<p>Before moving on to the next task, let’s calculate the expected buffer bytes when <code class="language-plaintext highlighter-rouge">flag1()</code> gets called:</p>

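<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">struct</span> <span class="k">as</span> <span class="n">s</span>

<span class="c1"># From here on, v holds only the 0x40 decoded buffer bytes (the printed flag), not the whole core file</span>
<span class="n">v</span> <span class="o">=</span> <span class="nb">bytes</span><span class="p">(</span><span class="n">x</span> <span class="o">^</span> <span class="mh">0xa5</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">v</span><span class="p">[</span><span class="mh">0x30a0</span> <span class="p">:</span> <span class="mh">0x30a0</span> <span class="o">+</span> <span class="mh">0x40</span><span class="p">])</span>
<span class="n">v2</span><span class="o">=</span><span class="sa">b</span><span class="s">''</span>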
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">0</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">1</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x80083ed7e794313b</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">1</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">2</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x75136ebbbf60734f</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">2</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">3</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x6c46a704af4d8380</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">3</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">4</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xc1991ab8c1674bbf</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">4</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">5</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xdc0b819132401105</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">5</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">6</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xaf4464465d7d4dc0</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">6</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">7</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x9ead54bd51956632</span><span class="p">)</span>
<span class="n">v2</span><span class="o">+=</span><span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span><span class="n">v</span><span class="p">[</span><span class="mh">0x8</span> <span class="o">*</span> <span class="mi">7</span> <span class="p">:</span> <span class="mh">0x8</span> <span class="o">*</span> <span class="mi">8</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xc4d2c981312f974</span><span class="p">)</span>
</code></pre></div></div>

<p>A quick and dirty way to check these are the correct bytes is to rerun the executable, breaking before the call at <code class="language-plaintext highlighter-rouge">b *(0x555555554000 + 0x116e)</code>, then setting the buffer with these bytes, confirming that the flag is printed as expected. To generate gdb commands that set the buffer:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">chunks</span><span class="p">(</span><span class="n">lst</span><span class="p">,</span> <span class="n">n</span><span class="p">):</span>
    <span class="k">return</span> <span class="p">[</span><span class="n">lst</span><span class="p">[</span><span class="n">i</span> <span class="o">-</span> <span class="n">n</span> <span class="p">:</span> <span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">lst</span><span class="p">)</span> <span class="o">+</span> <span class="n">n</span><span class="p">,</span> <span class="n">n</span><span class="p">)]</span>

<span class="k">def</span> <span class="nf">mapx</span><span class="p">(</span><span class="n">x</span><span class="p">):</span>
    <span class="k">global</span> <span class="n">i</span>
    <span class="n">s</span><span class="o">=</span><span class="sa">f</span><span class="s">'set *(char**)($rdi+</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s">) = 0x'</span><span class="o">+</span><span class="n">x</span>
    <span class="n">i</span><span class="o">+=</span><span class="mi">8</span>
    <span class="k">return</span> <span class="n">s</span>

<span class="n">i</span><span class="o">=</span><span class="mi">0</span>
<span class="k">print</span><span class="p">(</span><span class="o">*</span><span class="nb">map</span><span class="p">(</span><span class="n">mapx</span><span class="p">,</span> <span class="n">chunks</span><span class="p">(</span><span class="n">v2</span><span class="p">.</span><span class="nb">hex</span><span class="p">(),</span> <span class="mi">16</span><span class="p">)),</span> <span class="n">sep</span><span class="o">=</span><span class="s">'</span><span class="se">\n</span><span class="s">'</span><span class="p">)</span>
</code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set *(char**)($rdi+0) = 0xee695caca1c0726b
set *(char**)($rdi+8) = 0x147d0fd9e0011d2e
set *(char**)($rdi+16) = 0x4c668724af30e2ee
set *(char**)($rdi+24) = 0xe1b93a98e1476b9f
set *(char**)($rdi+32) = 0xfc2ba1b112603125
set *(char**)($rdi+40) = 0x8f6444667d5d6de0
set *(char**)($rdi+48) = 0xbe8d749d71b54612
set *(char**)($rdi+56) = 0x2c6d0cb83332d954
</code></pre></div></div>

<h3 id="solution-2-reading-buffer-from-core-dump-in-debugger">Solution 2: Reading buffer from core dump in debugger</h3>

<p>There’s a more direct way to extract the buffer bytes, without having to locate their offset in the file: after finding the process image base address, we can load the core dump in the debugger and just read the bytes at the image base plus the offset of the global buffer:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gdb coregasm core
...
pwndbg&gt; x/20wx (0x55fa6cf06000 + 0x40a0)
0x55fa6cf0a0a0: 0xe3f1e6f5      0xcbc4c7de      0xfac4cbc4      0xc4cbc4c7
0x55fa6cf0a0b0: 0xa5d8c4cb      0x85858585      0x85858585      0x85858585
0x55fa6cf0a0c0: 0x85858585      0x85858585      0x85858585      0x85858585
0x55fa6cf0a0d0: 0x85858585      0x85858585      0x85858585      0x85858585
0x55fa6cf0a0e0: 0x00000000      0x00000000      0x00000000      0x00000000
</code></pre></div></div>

<h2 id="task-2-re">Task 2: re</h2>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">flag2</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="n">param_1</span><span class="p">)</span> <span class="p">{</span>
  <span class="kt">FILE</span> <span class="o">*</span><span class="n">__stream</span><span class="p">;</span>
  <span class="kt">size_t</span> <span class="n">sVar1</span><span class="p">;</span>
  <span class="kt">long</span> <span class="n">i</span><span class="p">;</span>
  <span class="n">byte</span> <span class="n">otpbuf</span> <span class="p">[</span><span class="mi">64</span><span class="p">];</span>
  <span class="n">byte</span> <span class="n">otpbuf2</span> <span class="p">[</span><span class="mi">72</span><span class="p">];</span>

  <span class="n">__stream</span> <span class="o">=</span> <span class="n">fopen</span><span class="p">(</span><span class="s">"./otp"</span><span class="p">,</span><span class="s">"r"</span><span class="p">);</span>
  <span class="n">sVar1</span> <span class="o">=</span> <span class="n">fread</span><span class="p">(</span><span class="n">otpbuf</span><span class="p">,</span><span class="mh">0x80</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="n">__stream</span><span class="p">);</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">sVar1</span> <span class="o">!=</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">__assert_fail</span><span class="p">(</span><span class="s">"items == 1"</span><span class="p">,</span><span class="s">"./main.c"</span><span class="p">,</span><span class="mh">0x2a</span><span class="p">,</span><span class="s">"flag2"</span><span class="p">);</span>
  <span class="p">}</span>
  <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">do</span> <span class="p">{</span>
    <span class="o">*</span><span class="p">(</span><span class="n">byte</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">byte</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="o">^</span> <span class="n">otpbuf</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="mh">0x40</span><span class="p">);</span>
  <span class="o">*</span><span class="n">param_1</span> <span class="o">=</span> <span class="o">*</span><span class="n">param_1</span> <span class="o">^</span> <span class="mh">0x6301641f2866c34b</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x1eb4def5ac740dcf</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x4f490b1c93df4671</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x9f82c6ec691ca0b0</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xc2d142fcaf5dca6b</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xfa68305eb42fcb00</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x62212646a9e04b61</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xbb73ad9a9992c6b</span><span class="p">;</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"Flag 2:"</span><span class="p">);</span>
  <span class="n">puts</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span><span class="p">);</span>
  <span class="n">fflush</span><span class="p">((</span><span class="kt">FILE</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">);</span>
  <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">do</span> <span class="p">{</span>
    <span class="o">*</span><span class="p">(</span><span class="n">byte</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">byte</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="n">i</span><span class="p">)</span> <span class="o">^</span> <span class="n">otpbuf2</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
    <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="mh">0x40</span><span class="p">);</span>
  <span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This time, the xor operations use 0x80 bytes read from a file. However, note that the buffers holding the contents of <code class="language-plaintext highlighter-rouge">./otp</code> aren’t zeroed, so their contents should still be resident in the core dump.</p>

<h3 id="solution-1-bruteforcing-for-candidates">Solution 1: Bruteforcing for candidates</h3>

<p>What do we know about the input? Its first 8 bytes are <code class="language-plaintext highlighter-rouge">0xee695caca1c0726b</code> (calculated in the previous task), and after they are xor’d with <code class="language-plaintext highlighter-rouge">0x6301641f2866c34b</code> and the first 8 bytes of <code class="language-plaintext highlighter-rouge">./otp</code>, the result should start with <code class="language-plaintext highlighter-rouge">PCTF{</code>, which is <code class="language-plaintext highlighter-rouge">0x7b46544350</code> when read as a little-endian integer. We can simply scan the whole core dump for byte patterns that satisfy these conditions:</p>

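<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">struct</span> <span class="k">as</span> <span class="n">s</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="c1"># v holds the raw bytes of the core dump
</span><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>

<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">v</span><span class="p">)</span> <span class="o">-</span> <span class="mh">0x8</span><span class="p">):</span>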
    <span class="n">chunk1</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&lt;Q"</span><span class="p">,</span> <span class="n">v</span><span class="p">[</span><span class="n">i</span> <span class="p">:</span> <span class="n">i</span> <span class="o">+</span> <span class="mh">0x8</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">candidate</span> <span class="o">=</span> <span class="n">chunk1</span> <span class="o">^</span> <span class="mh">0xEE695CACA1C0726B</span>
    <span class="n">candidate_5c</span> <span class="o">=</span> <span class="n">candidate</span> <span class="o">&amp;</span> <span class="mh">0xFFFFFFFFFF</span>
    <span class="k">if</span> <span class="n">candidate_5c</span> <span class="o">==</span> <span class="mh">0x7B46544350</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="nb">hex</span><span class="p">(</span><span class="n">i</span><span class="p">))</span>
        <span class="c1"># 0xde4
</span>        <span class="c1"># 0xff8
</span>        <span class="c1"># 0x54e0
</span>        <span class="c1"># 0x53d70
</span></code></pre></div></div>

<p>Out of these results, offset <code class="language-plaintext highlighter-rouge">0x54e0</code> seems to be the only one sitting in the middle of a run of 0x80 initialized bytes (0x40 before it and 0x40 from it onwards):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>000054a0: 1b80 32da 788c 0df2 65c6 a032 97bf da7f  ..2.x...e..2....
000054b0: 1f27 fbf1 7d65 28de d181 7e08 82a7 ec01  .'..}e(...~.....
000054c0: 0a40 10f5 3817 6367 ea4e ba20 7f10 48da  .@..8.cg.N. ..H.
000054d0: 406b c089 6e86 2472 4b0c b989 814c a339  @k..n.$rK....L.9
000054e0: 3b31 94e7 d73e 0880 4f73 60ca bb6e 1375  ;1...&gt;..Os`..n.u
000054f0: 8083 14cd 45e9 0722 fe4a 2580 f65b d780  ....E..".J%..[..
00005500: 5831 4032 9181 0bdc c04d 7d5d 4664 44af  X1@2.....M}]FdD.
00005510: 3266 9551 bd54 ad9e 74f9 1213 982c 4d0c  2f.Q.T..t....,M.
</code></pre></div></div>
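<p>As a quick sanity check before processing the whole buffer, we can xor the first two quadwords at <code class="language-plaintext highlighter-rouge">0x54e0</code> (bytes copied by hand from the hex dump above) with the first two known input quadwords. This is only a minimal sketch: <code class="language-plaintext highlighter-rouge">xor_quadword</code> is a helper name made up for illustration.</p>

```python
# Known input quadwords from the previous task (first two only).
KNOWN_INPUT = [0xEE695CACA1C0726B, 0x147D0FD9E0011D2E]

# Bytes at offsets 0x54e0 and 0x54e8, copied from the hex dump.
CHUNKS = [
    bytes.fromhex("3b3194e7d73e0880"),
    bytes.fromhex("4f7360cabb6e1375"),
]


def xor_quadword(raw: bytes, key: int) -> bytes:
    """XOR 8 raw bytes, read as a little-endian integer, with a 64-bit key."""
    value = int.from_bytes(raw, "little") ^ key
    return value.to_bytes(8, "little")


plain = b"".join(xor_quadword(c, k) for c, k in zip(CHUNKS, KNOWN_INPUT))
print(plain)  # b'PCTF{banana*bana'
```

<p>The result already reads like a flag prefix, which makes this offset a much stronger candidate than the other three.</p>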

<p>Now we extract these bytes and apply the xor operations:</p>

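<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">struct</span> <span class="k">as</span> <span class="n">s</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="n">flag1_input_Qs</span> <span class="o">=</span> <span class="p">[</span>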
    <span class="mh">0xEE695CACA1C0726B</span><span class="p">,</span>
    <span class="mh">0x147D0FD9E0011D2E</span><span class="p">,</span>
    <span class="mh">0x4C668724AF30E2EE</span><span class="p">,</span>
    <span class="mh">0xE1B93A98E1476B9F</span><span class="p">,</span>
    <span class="mh">0xFC2BA1B112603125</span><span class="p">,</span>
    <span class="mh">0x8F6444667D5D6DE0</span><span class="p">,</span>
    <span class="mh">0xBE8D749D71B54612</span><span class="p">,</span>
    <span class="mh">0x2C6D0CB83332D954</span><span class="p">,</span>
<span class="p">]</span>
<span class="n">flag2_xor_Qs</span> <span class="o">=</span> <span class="p">[</span>
    <span class="mh">0x6301641F2866C34B</span><span class="p">,</span>
    <span class="mh">0x1EB4DEF5AC740DCF</span><span class="p">,</span>
    <span class="mh">0x4F490B1C93DF4671</span><span class="p">,</span>
    <span class="mh">0x9F82C6EC691CA0B0</span><span class="p">,</span>
    <span class="mh">0xC2D142FCAF5DCA6B</span><span class="p">,</span>
    <span class="mh">0xFA68305EB42FCB00</span><span class="p">,</span>
    <span class="mh">0x62212646A9E04B61</span><span class="p">,</span>
    <span class="mh">0xBB73AD9A9992C6B</span><span class="p">,</span>
<span class="p">]</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>

<span class="n">pbuf</span> <span class="o">=</span> <span class="mh">0x54E0</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="s">'./otp'</span><span class="p">,</span> <span class="s">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">v</span><span class="p">[</span><span class="n">pbuf</span> <span class="o">-</span> <span class="mh">0x40</span> <span class="p">:</span> <span class="n">pbuf</span> <span class="o">+</span> <span class="mh">0x40</span><span class="p">])</span>

<span class="n">v2</span> <span class="o">=</span> <span class="sa">b</span><span class="s">""</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mh">0x40</span><span class="p">,</span> <span class="mi">8</span><span class="p">):</span>
    <span class="n">otp1</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&lt;Q"</span><span class="p">,</span> <span class="n">v</span><span class="p">[</span><span class="n">pbuf</span> <span class="o">-</span> <span class="mh">0x40</span> <span class="o">+</span> <span class="n">i</span> <span class="p">:</span> <span class="n">pbuf</span> <span class="o">-</span> <span class="mh">0x40</span> <span class="o">+</span> <span class="n">i</span> <span class="o">+</span> <span class="mh">0x8</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">otp2</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">"&lt;Q"</span><span class="p">,</span> <span class="n">v</span><span class="p">[</span><span class="n">pbuf</span> <span class="o">+</span> <span class="n">i</span> <span class="p">:</span> <span class="n">pbuf</span> <span class="o">+</span> <span class="n">i</span> <span class="o">+</span> <span class="mh">0x8</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span>

    <span class="n">flag2_Q</span> <span class="o">=</span> <span class="n">flag1_input_Qs</span><span class="p">[</span><span class="n">i</span> <span class="o">//</span> <span class="mi">8</span><span class="p">]</span>
    <span class="n">candidate</span> <span class="o">=</span> <span class="n">otp2</span> <span class="o">^</span> <span class="n">flag2_Q</span>
    <span class="n">v2</span> <span class="o">+=</span> <span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&lt;Q"</span><span class="p">,</span> <span class="n">candidate</span><span class="p">)</span>
    <span class="k">print</span><span class="p">(</span>
        <span class="sa">f</span><span class="s">'set *(char**)($rbx+</span><span class="si">{</span><span class="n">i</span><span class="si">}</span><span class="s">) = 0x</span><span class="si">{</span> <span class="n">s</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&gt;Q"</span><span class="p">,</span> <span class="n">candidate</span> <span class="o">^</span> <span class="n">flag2_xor_Qs</span><span class="p">[</span><span class="n">i</span> <span class="o">//</span> <span class="mi">8</span><span class="p">]</span> <span class="o">^</span> <span class="n">otp1</span><span class="p">).</span><span class="nb">hex</span><span class="p">()</span> <span class="si">}</span><span class="s">'</span>
    <span class="p">)</span>
<span class="k">print</span><span class="p">(</span><span class="n">v2</span><span class="p">)</span>
</code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set *(char**)($rbx+0) = 0xff6d8a1cb4000000
set *(char**)($rbx+8) = 0x00000000b4b5a5cb
set *(char**)($rbx+16) = 0xff00000000000000
set *(char**)($rbx+24) = 0xff00000000000000
set *(char**)($rbx+32) = 0x859275e47a6d8a1c
set *(char**)($rbx+40) = 0x00000001b4b5a5ca
set *(char**)($rbx+48) = 0x3025800800000001
set *(char**)($rbx+56) = 0x1234567800000000
b'PCTF{banana*banana$banana!banana}\x00                              '
</code></pre></div></div>

<h3 id="solution-2-parsing-the-otp-file-structure">Solution 2: Parsing the <code class="language-plaintext highlighter-rouge">./otp</code> <code class="language-plaintext highlighter-rouge">FILE</code> structure</h3>

<p>Since we are dealing with an open file, the handle should be present in the core dump. It is represented by the <a href="https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/bits/types/struct_FILE.h;h=1eb429888c459fcd443d78fdea4f3c95a026e269;hb=45a8e05785a617683bbaf83f756cada7a4a425b9"><code class="language-plaintext highlighter-rouge">_IO_FILE</code> structure</a>, which contains the following pointers:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">_IO_FILE</span>
<span class="p">{</span>
  <span class="kt">int</span> <span class="n">_flags</span><span class="p">;</span>           <span class="cm">/* High-order word is _IO_MAGIC; rest is flags. */</span>

  <span class="cm">/* The following pointers correspond to the C++ streambuf protocol. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_read_ptr</span><span class="p">;</span>   <span class="cm">/* Current read pointer */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_read_end</span><span class="p">;</span>   <span class="cm">/* End of get area. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_read_base</span><span class="p">;</span>  <span class="cm">/* Start of putback+get area. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_write_base</span><span class="p">;</span> <span class="cm">/* Start of put area. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_write_ptr</span><span class="p">;</span>  <span class="cm">/* Current put pointer. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_write_end</span><span class="p">;</span>  <span class="cm">/* End of put area. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_buf_base</span><span class="p">;</span>   <span class="cm">/* Start of reserve area. */</span>
  <span class="kt">char</span> <span class="o">*</span><span class="n">_IO_buf_end</span><span class="p">;</span>    <span class="cm">/* End of reserve area. */</span>
  <span class="cm">/* ... */</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We can use the <a href="https://sourceware.org/git/?p=glibc.git;a=blob;f=libio/libio.h;h=d0184df878422d7495367007dcbce85d309e2a81;hb=HEAD">magic bytes of field <code class="language-plaintext highlighter-rouge">_flags</code></a> as a signature:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* Magic number and bits for the _flags field.  The magic number is
   mostly vestigial, but preserved for compatibility.  It occupies the
   high 16 bits of _flags; the low 16 bits are actual flag bits.  */</span>

<span class="cp">#define _IO_MAGIC         0xFBAD0000 </span><span class="cm">/* Magic number */</span><span class="cp">
</span></code></pre></div></div>
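<p>On a little-endian machine the magic ends up in the <em>last</em> two bytes of the 4-byte field, so a byte search has to look for <code class="language-plaintext highlighter-rouge">ad fb</code> and then step back two bytes to reach the start of <code class="language-plaintext highlighter-rouge">_flags</code>. A small sketch of that layout, using <code class="language-plaintext highlighter-rouge">0xfbad2488</code> purely as an example value and <code class="language-plaintext highlighter-rouge">MAGIC_MASK</code> as a local name for "the high 16 bits":</p>

```python
IO_MAGIC = 0xFBAD0000    # from the glibc header quoted above
MAGIC_MASK = 0xFFFF0000  # local helper: selects the high 16 bits

flags = 0xFBAD2488       # example _flags value, chosen for illustration

# Split the field into its two halves.
magic = flags & MAGIC_MASK
flag_bits = flags & 0xFFFF

# In memory the low 16 bits come first, the magic comes last.
raw = flags.to_bytes(4, "little")
magic_pos = raw.find(b"\xad\xfb")

print(hex(magic), hex(flag_bits), raw.hex(), magic_pos)
# 0xfbad0000 0x2488 8824adfb 2
```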

<p>Let’s find these handles and print out the corresponding pointers:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
</span>
<span class="kn">import</span> <span class="nn">re</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">struct</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>

<span class="n">matches</span> <span class="o">=</span> <span class="p">[</span><span class="n">x</span><span class="p">.</span><span class="n">start</span><span class="p">()</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">re</span><span class="p">.</span><span class="n">finditer</span><span class="p">(</span><span class="sa">b</span><span class="s">'</span><span class="se">\xad\xfb</span><span class="s">'</span><span class="p">,</span> <span class="n">data</span><span class="p">)]</span>
<span class="k">for</span> <span class="n">match</span> <span class="ow">in</span> <span class="n">matches</span><span class="p">:</span>
    <span class="n">offset</span> <span class="o">=</span> <span class="n">match</span> <span class="o">-</span> <span class="mi">2</span> <span class="c1"># step back 2 bytes to the start of _flags: the low 16 bits (actual flag bits) precede the magic in memory.
</span>    <span class="k">print</span><span class="p">(</span><span class="nb">hex</span><span class="p">(</span><span class="n">offset</span><span class="p">),</span> <span class="p">[</span><span class="nb">hex</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;QQQQQQQQQQQQ'</span><span class="p">,</span> <span class="n">data</span><span class="p">[</span><span class="n">offset</span> <span class="p">:</span> <span class="n">offset</span> <span class="o">+</span> <span class="mi">8</span> <span class="o">*</span> <span class="mi">12</span><span class="p">])])</span>
</code></pre></div></div>

<p>The first result ends up being the one we are interested in:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x5270 ['0xfbad2488', '0x55fa6d305520', '0x55fa6d305520', '0x55fa6d3054a0', '0x55fa6d3054a0', '0x55fa6d3054a0', '0x55fa6d3054a0', '0x55fa6d3054a0', '0x55fa6d3064a0', '0x0', '0x0', '0x0']
</code></pre></div></div>

<p>Just for fun, we see that <code class="language-plaintext highlighter-rouge">_IO_read_ptr</code> is at <code class="language-plaintext highlighter-rouge">_IO_read_end</code>, since we read the full contents, starting at <code class="language-plaintext highlighter-rouge">_IO_read_base</code> (<code class="language-plaintext highlighter-rouge">0x55fa6d305520 = 0x55fa6d3054a0 + 0x80</code>). In gdb, we can read the bytes at <code class="language-plaintext highlighter-rouge">_IO_read_base</code> and confirm they match the ones we saw earlier in the hex dump:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; x/20gx 0x55fa6d3054a0
0x55fa6d3054a0: 0xf20d8c78da32801b      0x7fdabf9732a0c665
0x55fa6d3054b0: 0xde28657df1fb271f      0x01eca782087e81d1
0x55fa6d3054c0: 0x67631738f510400a      0xda48107f20ba4eea
0x55fa6d3054d0: 0x7224866e89c06b40      0x39a34c8189b90c4b
0x55fa6d3054e0: 0x80083ed7e794313b      0x75136ebbca60734f
0x55fa6d3054f0: 0x2207e945cd148380      0x80d75bf680254afe
0x55fa6d305500: 0xdc0b819132403158      0xaf4464465d7d4dc0
0x55fa6d305510: 0x9ead54bd51956632      0x0c4d2c981312f974
0x55fa6d305520: 0x0000000000000000      0x0000000000000000
0x55fa6d305530: 0x0000000000000000      0x0000000000000000
</code></pre></div></div>
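<p>The pointer arithmetic is easier to follow once the quadwords printed for offset <code class="language-plaintext highlighter-rouge">0x5270</code> are labelled with the field names from the struct. The sketch below hardcodes those values and assumes the x86-64 layout, where the 4-byte <code class="language-plaintext highlighter-rouge">_flags</code> is followed by 4 bytes of padding, so reading it as one quadword happens to work here:</p>

```python
FIELDS = [
    "_flags",
    "_IO_read_ptr", "_IO_read_end", "_IO_read_base",
    "_IO_write_base", "_IO_write_ptr", "_IO_write_end",
    "_IO_buf_base", "_IO_buf_end",
]

# First nine quadwords printed for the match at offset 0x5270.
VALUES = [
    0xFBAD2488,
    0x55FA6D305520, 0x55FA6D305520, 0x55FA6D3054A0,
    0x55FA6D3054A0, 0x55FA6D3054A0, 0x55FA6D3054A0,
    0x55FA6D3054A0, 0x55FA6D3064A0,
]

fp = dict(zip(FIELDS, VALUES))

# Bytes consumed so far, and total size of the stdio buffer.
bytes_read = fp["_IO_read_ptr"] - fp["_IO_read_base"]
buffer_size = fp["_IO_buf_end"] - fp["_IO_buf_base"]
fully_consumed = fp["_IO_read_ptr"] == fp["_IO_read_end"]

print(hex(bytes_read), hex(buffer_size), fully_consumed)
# 0x80 0x1000 True
```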

<h2 id="task-3-ga">Task 3: ga</h2>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ulong</span> <span class="nf">flag3</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="n">param_1</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">uint</span> <span class="n">uVar0</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">uVar1</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">uVar2</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">uVar3</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">uVar4</span><span class="p">;</span>
  <span class="n">ulong</span> <span class="n">ulVar1</span><span class="p">;</span>

  <span class="o">*</span><span class="n">param_1</span> <span class="o">=</span> <span class="o">*</span><span class="n">param_1</span> <span class="o">^</span> <span class="mh">0x2f01d6f7c8701da9</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x230ed5e2ec453098</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x2f01dae2ef4a3f97</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x2301dae2ec45309b</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x230ed5e2ec4a3f97</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x2002d5e2ec4a3f97</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x200ed5e2ef4a3f97</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x6140948cf3453c97</span><span class="p">;</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"Flag 3:"</span><span class="p">);</span>
  <span class="n">puts</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span><span class="p">);</span>
  <span class="n">fflush</span><span class="p">((</span><span class="kt">FILE</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">);</span>
  <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x3c</span><span class="p">)</span> <span class="o">=</span> <span class="mh">0x12345678</span><span class="p">;</span>
  <span class="n">uVar1</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">4</span><span class="p">);</span>
  <span class="n">uVar4</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0xc</span><span class="p">);</span>
  <span class="n">ulVar1</span> <span class="o">=</span> <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">5</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x24</span><span class="p">);</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar1</span><span class="p">;</span>
  <span class="n">uVar3</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1c</span><span class="p">)</span> <span class="o">*</span> <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">3</span><span class="p">);</span>
  <span class="n">uVar2</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x34</span><span class="p">)</span> <span class="o">^</span> <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">6</span><span class="p">);</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x14</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar4</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">4</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar3</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">7</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar2</span><span class="p">;</span>
  <span class="n">uVar0</span> <span class="o">=</span> <span class="p">(</span><span class="n">uint</span><span class="p">)</span><span class="n">ulVar1</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x2c</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar0</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span> <span class="o">=</span> <span class="n">uVar1</span> <span class="o">&amp;</span> <span class="n">uVar4</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">4</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar4</span> <span class="o">|</span> <span class="n">uVar3</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar0</span> <span class="o">*</span> <span class="n">uVar2</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0xc</span><span class="p">)</span> <span class="o">=</span> <span class="p">(</span><span class="kt">int</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="n">uVar3</span> <span class="o">%</span> <span class="n">ulVar1</span><span class="p">);</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">3</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar2</span> <span class="o">/</span> <span class="n">uVar1</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1c</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar4</span> <span class="o">+</span> <span class="n">uVar2</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x24</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar2</span> <span class="o">-</span> <span class="n">uVar3</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">5</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar1</span> <span class="o">^</span> <span class="n">uVar0</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x34</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar1</span> <span class="o">&amp;</span> <span class="n">uVar3</span><span class="p">;</span>
  <span class="o">*</span><span class="p">(</span><span class="kt">int</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">6</span><span class="p">)</span> <span class="o">=</span> <span class="p">(</span><span class="kt">int</span><span class="p">)(</span><span class="n">ulVar1</span> <span class="o">%</span> <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="n">uVar4</span><span class="p">);</span>
  <span class="k">return</span> <span class="n">ulVar1</span> <span class="o">/</span> <span class="n">uVar4</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If these operations make you think about symbolic execution… well, let’s see how that goes.</p>

<h3 id="solution-attempt-solving-with-symbolic-execution">Solution attempt: Solving with symbolic execution?</h3>

<p>In a <a href="http://nevesnunes.github.io/blog/2021/10/03/CTF-Writeup-TSG-CTF-2021-2-Reversing-Tasks.html">previous writeup</a>, I described how to set up an angr script that runs over a set of instructions without having to start at the binary’s entrypoint. The approach here is the same; the main difference is in the constraints: the printed flag should start with <code class="language-plaintext highlighter-rouge">PCTF{</code>, and the <code class="language-plaintext highlighter-rouge">flag2()</code> input bytes should be in the global buffer when we arrive at the end of this function.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
</span>
<span class="kn">from</span> <span class="nn">pwn</span> <span class="kn">import</span> <span class="o">*</span>
<span class="kn">import</span> <span class="nn">angr</span>
<span class="kn">import</span> <span class="nn">claripy</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">struct</span>

<span class="n">BE</span> <span class="o">=</span> <span class="n">angr</span><span class="p">.</span><span class="n">archinfo</span><span class="p">.</span><span class="n">Endness</span><span class="p">.</span><span class="n">BE</span>
<span class="n">LE</span> <span class="o">=</span> <span class="n">angr</span><span class="p">.</span><span class="n">archinfo</span><span class="p">.</span><span class="n">Endness</span><span class="p">.</span><span class="n">LE</span>

<span class="n">start</span> <span class="o">=</span> <span class="mh">0x16CC</span>
<span class="n">end</span> <span class="o">=</span> <span class="mh">0x1757</span>
<span class="n">base</span> <span class="o">=</span> <span class="mh">0x555555554000</span>
<span class="n">addr_flag</span> <span class="o">=</span> <span class="mh">0x5555555580A0</span>


<span class="k">def</span> <span class="nf">char</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">c</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">Or</span><span class="p">(</span><span class="n">c</span> <span class="o">==</span> <span class="mi">0</span><span class="p">,</span> <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">And</span><span class="p">(</span><span class="n">c</span> <span class="o">&lt;=</span> <span class="s">"~"</span><span class="p">,</span> <span class="n">c</span> <span class="o">&gt;=</span> <span class="s">" "</span><span class="p">))</span>

<span class="k">def</span> <span class="nf">apply_constraints</span><span class="p">(</span><span class="n">state</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">state</span><span class="p">.</span><span class="n">addr</span> <span class="o">&lt;</span> <span class="n">base</span> <span class="o">+</span> <span class="n">end</span> <span class="o">-</span> <span class="mi">1</span><span class="p">:</span>
        <span class="k">return</span> <span class="bp">False</span>

    <span class="n">flag2_input_Qs</span> <span class="o">=</span> <span class="p">[</span>
        <span class="mh">0xFF6D8A1CB4000000</span><span class="p">,</span>
        <span class="mh">0x00000000B4B5A5CB</span><span class="p">,</span>
        <span class="mh">0xFF00000000000000</span><span class="p">,</span>
        <span class="mh">0xFF00000000000000</span><span class="p">,</span>
        <span class="mh">0x859275E47A6D8A1C</span><span class="p">,</span>
        <span class="mh">0x00000001B4B5A5CA</span><span class="p">,</span>
        <span class="mh">0x3025800800000001</span><span class="p">,</span>
        <span class="mh">0x1234567800000000</span><span class="p">,</span>
    <span class="p">]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">q</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">flag2_input_Qs</span><span class="p">):</span>
        <span class="n">expr</span> <span class="o">=</span> <span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">load</span><span class="p">(</span><span class="n">addr_flag</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="n">endness</span><span class="o">=</span><span class="n">BE</span><span class="p">)</span>
        <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">expr</span> <span class="o">==</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="n">q</span><span class="p">))</span>

    <span class="k">return</span> <span class="n">state</span><span class="p">.</span><span class="n">satisfiable</span><span class="p">()</span>


<span class="k">def</span> <span class="nf">main</span><span class="p">():</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="n">asm</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()[</span><span class="n">start</span><span class="p">:</span><span class="n">end</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span>

    <span class="n">project</span> <span class="o">=</span> <span class="n">angr</span><span class="p">.</span><span class="n">load_shellcode</span><span class="p">(</span>
        <span class="n">asm</span><span class="p">,</span>
        <span class="s">"x86_64"</span><span class="p">,</span>
        <span class="n">start_offset</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
        <span class="n">load_address</span><span class="o">=</span><span class="n">base</span> <span class="o">+</span> <span class="n">start</span><span class="p">,</span>
        <span class="n">support_selfmodifying_code</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span>
    <span class="p">)</span>
    <span class="n">state</span> <span class="o">=</span> <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">entry_state</span><span class="p">()</span>

    <span class="c1"># Taken at `b *(0x555555554000 + 0x16cc)`
</span>    <span class="n">mems</span> <span class="o">=</span> <span class="p">[</span>
        <span class="p">[</span><span class="mh">0x555555554000</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span>
        <span class="p">[</span><span class="mh">0x555555555000</span><span class="p">,</span> <span class="mi">5</span><span class="p">],</span>
        <span class="p">[</span><span class="mh">0x555555556000</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span>
        <span class="p">[</span><span class="mh">0x555555557000</span><span class="p">,</span> <span class="mi">4</span><span class="p">],</span>
        <span class="p">[</span><span class="mh">0x555555558000</span><span class="p">,</span> <span class="mi">6</span><span class="p">],</span>
        <span class="p">[</span><span class="mh">0x7FFFFFFDC000</span><span class="p">,</span> <span class="mi">6</span><span class="p">],</span>
    <span class="p">]</span>
    <span class="k">for</span> <span class="n">mem_pair</span> <span class="ow">in</span> <span class="n">mems</span><span class="p">:</span>
        <span class="n">addr</span> <span class="o">=</span> <span class="n">mem_pair</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
        <span class="n">perm</span> <span class="o">=</span> <span class="n">mem_pair</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
        <span class="n">memory</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="sa">f</span><span class="s">"out.</span><span class="si">{</span><span class="nb">hex</span><span class="p">(</span><span class="n">addr</span><span class="p">)</span><span class="si">}</span><span class="s">.mem"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">).</span><span class="n">read</span><span class="p">()</span>
        <span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="n">memory</span><span class="p">,</span> <span class="n">disable_actions</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">inspect</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
        <span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">permissions</span><span class="p">(</span><span class="n">addr</span><span class="p">,</span> <span class="n">perm</span><span class="p">)</span>

    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rax</span> <span class="o">=</span> <span class="mh">0x0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rbx</span> <span class="o">=</span> <span class="n">addr_flag</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rcx</span> <span class="o">=</span> <span class="mh">0x7FFFF7F8E580</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rdx</span> <span class="o">=</span> <span class="mh">0x0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rdi</span> <span class="o">=</span> <span class="mh">0x7FFFF7F844E0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rsi</span> <span class="o">=</span> <span class="mh">0x1</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r8</span> <span class="o">=</span> <span class="mh">0x41</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r9</span> <span class="o">=</span> <span class="mh">0x7FFFF7F81A60</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r10</span> <span class="o">=</span> <span class="mh">0x7FFFF7DD0178</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r11</span> <span class="o">=</span> <span class="mh">0x246</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r12</span> <span class="o">=</span> <span class="mh">0x5555555551C0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r13</span> <span class="o">=</span> <span class="mh">0x0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r14</span> <span class="o">=</span> <span class="mh">0x0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r15</span> <span class="o">=</span> <span class="mh">0x0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rbp</span> <span class="o">=</span> <span class="mh">0x7FFFFFFFC9B8</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rsp</span> <span class="o">=</span> <span class="mh">0x7FFFFFFFC8A0</span>
    <span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rip</span> <span class="o">=</span> <span class="mh">0x5555555556CC</span>

    <span class="n">sym_data</span> <span class="o">=</span> <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">BVS</span><span class="p">(</span><span class="s">"v1"</span><span class="p">,</span> <span class="mh">0x40</span><span class="o">*</span><span class="mi">8</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">):</span>
        <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">char</span><span class="p">(</span><span class="n">state</span><span class="p">,</span> <span class="n">c</span><span class="p">))</span>
    <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="nb">ord</span><span class="p">(</span><span class="s">"P"</span><span class="p">))</span>
    <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">)[</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="nb">ord</span><span class="p">(</span><span class="s">"C"</span><span class="p">))</span>
    <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">)[</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="nb">ord</span><span class="p">(</span><span class="s">"T"</span><span class="p">))</span>
    <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">)[</span><span class="mi">3</span><span class="p">]</span> <span class="o">==</span> <span class="nb">ord</span><span class="p">(</span><span class="s">"F"</span><span class="p">))</span>
    <span class="n">state</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">)[</span><span class="mi">4</span><span class="p">]</span> <span class="o">==</span> <span class="nb">ord</span><span class="p">(</span><span class="s">"{"</span><span class="p">))</span>

    <span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="n">addr_flag</span><span class="p">,</span> <span class="n">sym_data</span><span class="p">,</span> <span class="n">disable_actions</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">inspect</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>

    <span class="c1"># Sanity checking the start address instruction
</span>    <span class="k">assert</span> <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">block</span><span class="p">(</span><span class="n">base</span> <span class="o">+</span> <span class="n">start</span><span class="p">).</span><span class="nb">bytes</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="mh">0x31</span>
    <span class="k">assert</span> <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">block</span><span class="p">(</span><span class="n">base</span> <span class="o">+</span> <span class="n">start</span><span class="p">).</span><span class="nb">bytes</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">==</span> <span class="mh">0xd2</span>

    <span class="n">queue</span> <span class="o">=</span> <span class="p">[</span><span class="n">state</span><span class="p">,</span> <span class="p">]</span>
    <span class="k">while</span> <span class="nb">len</span><span class="p">(</span><span class="n">queue</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">state</span> <span class="o">=</span> <span class="n">queue</span><span class="p">.</span><span class="n">pop</span><span class="p">()</span>
        <span class="n">state2</span> <span class="o">=</span> <span class="n">state</span><span class="p">.</span><span class="n">copy</span><span class="p">()</span>

        <span class="n">sm</span> <span class="o">=</span> <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">simgr</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>
        <span class="n">sm</span><span class="p">.</span><span class="n">explore</span><span class="p">(</span><span class="n">find</span><span class="o">=</span><span class="n">apply_constraints</span><span class="p">,</span> <span class="n">avoid</span><span class="o">=</span><span class="k">lambda</span> <span class="n">s</span><span class="p">:</span> <span class="n">s</span><span class="p">.</span><span class="n">addr</span> <span class="o">&gt;</span> <span class="n">base</span><span class="o">+</span><span class="n">end</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span>

        <span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="n">sm</span><span class="p">.</span><span class="n">active</span><span class="p">:</span>
            <span class="n">queue</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>

        <span class="k">if</span> <span class="n">sm</span><span class="p">.</span><span class="n">found</span><span class="p">:</span>
            <span class="k">for</span> <span class="n">found</span> <span class="ow">in</span> <span class="n">sm</span><span class="p">.</span><span class="n">found</span><span class="p">:</span>
                <span class="n">found_flag</span> <span class="o">=</span> <span class="n">found</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="nb">eval</span><span class="p">(</span><span class="n">sym_data</span><span class="p">,</span> <span class="n">cast_to</span><span class="o">=</span><span class="nb">bytes</span><span class="p">)</span>
                <span class="k">print</span><span class="p">(</span><span class="n">found_flag</span><span class="p">)</span>
                <span class="k">print</span><span class="p">(</span><span class="nb">hex</span><span class="p">(</span><span class="n">found</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="nb">eval</span><span class="p">(</span><span class="n">sym_data</span><span class="p">)))</span>

                <span class="c1"># Add found solutions as constraints to the solver, so that
</span>                <span class="c1"># we start exploring again but arrive at different solutions.
</span>                <span class="n">more_constraints</span> <span class="o">=</span> <span class="p">[]</span>
                <span class="k">for</span> <span class="n">ic</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">64</span><span class="p">):</span>
                    <span class="n">more_constraints</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">sym_data</span><span class="p">.</span><span class="n">chop</span><span class="p">(</span><span class="mi">8</span><span class="p">)[</span><span class="n">ic</span><span class="p">]</span> <span class="o">==</span> <span class="n">found_flag</span><span class="p">[</span><span class="n">ic</span><span class="p">])</span>
                <span class="n">state2</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">add</span><span class="p">(</span>
                    <span class="n">state2</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">Not</span><span class="p">(</span>
                        <span class="n">state2</span><span class="p">.</span><span class="n">solver</span><span class="p">.</span><span class="n">And</span><span class="p">(</span>
                            <span class="o">*</span><span class="n">more_constraints</span>
                        <span class="p">)</span>
                    <span class="p">)</span>
                <span class="p">)</span>
            <span class="n">queue</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">state2</span><span class="p">)</span>

    <span class="k">print</span><span class="p">(</span><span class="n">sm</span><span class="p">)</span>


<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="n">main</span><span class="p">()</span>
</code></pre></div></div>

<p>While we get the expected <code class="language-plaintext highlighter-rouge">PCTF{ban</code>, we also end up with multiple junk solutions, since our constraints are too loose. At this point, it’s better to step back and look for other clues…</p>

<h3 id="solution-1-checking-input-bytes">Solution 1: Checking input bytes</h3>

<p>If we break right before calling <code class="language-plaintext highlighter-rouge">flag3()</code>, we notice something odd in the global buffer: the previous function call filled it with the same repeated 64-bit pattern:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>► 0x55555555514f &lt;main+127&gt;    lea    rdi, [rip + 0x2f4a]           &lt;0x5555555580a0&gt;
  0x555555555156 &lt;main+134&gt;    call   flag3                &lt;flag3&gt;

pwndbg&gt; x/10gx 0x5555555580a0
0x5555555580a0 &lt;globalbuf&gt;:     0x41619aa7a689d4f3      0x41619aa7a689d4f3
0x5555555580b0 &lt;globalbuf+16&gt;:  0x41619aa7a689d4f3      0x41619aa7a689d4f3
0x5555555580c0 &lt;globalbuf+32&gt;:  0x41619aa7a689d4f3      0x41619aa7a689d4f3
0x5555555580d0 &lt;globalbuf+48&gt;:  0x41619aa7a689d4f3      0x41619aa7a689d4f3
0x5555555580e0 &lt;flag1ptr&gt;:      0x0000000000000000      0x0000000000000000
</code></pre></div></div>

<p>Since we already know that the flag starts with <code class="language-plaintext highlighter-rouge">PCTF{ban</code>, we can just XOR it with the first constant (<code class="language-plaintext highlighter-rouge">0x2f01d6f7c8701da9</code>) to get the expected repeated byte pattern. Then we can take that pattern and apply it to the other constants:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="nb">hex</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="sa">b</span><span class="s">'PCTF{ban'</span><span class="p">)[</span><span class="mi">0</span><span class="p">])</span>
<span class="s">'0x6e61627b46544350'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&gt;Q'</span><span class="p">,</span> <span class="mh">0x6e61627b46544350</span> <span class="o">^</span> <span class="mh">0x2f01d6f7c8701da9</span><span class="p">).</span><span class="nb">hex</span><span class="p">()</span>
<span class="s">'4160b48c8e245ef9'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x2f01d6f7c8701da9</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'PCTF{ban'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x230ed5e2ec453098</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'anabnanb'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x2f01dae2ef4a3f97</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'nanannan'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x2301dae2ec45309b</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'bnabnnab'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x230ed5e2ec4a3f97</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'nanbnanb'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x2002d5e2ec4a3f97</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'nanbnaba'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x200ed5e2ef4a3f97</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'nananana'</span>
<span class="o">&gt;&gt;&gt;</span> <span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="mh">0x4160b48c8e245ef9</span> <span class="o">^</span> <span class="mh">0x6140948cf3453c97</span><span class="p">)</span>
<span class="sa">b</span><span class="s">'nba}</span><span class="se">\x00</span><span class="s">   '</span>
</code></pre></div></div>
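
<p>The same steps can be folded into a short script. This is only a sketch: it assumes the constants appear in flag order, exactly as in the session above, and the names <code class="language-plaintext highlighter-rouge">CONSTANTS</code>, <code class="language-plaintext highlighter-rouge">KNOWN_PREFIX</code> and <code class="language-plaintext highlighter-rouge">recover_flag()</code> are made up for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Constants copied from the session above, assumed to be in flag order.
CONSTANTS = [
    0x2f01d6f7c8701da9,
    0x230ed5e2ec453098,
    0x2f01dae2ef4a3f97,
    0x2301dae2ec45309b,
    0x230ed5e2ec4a3f97,
    0x2002d5e2ec4a3f97,
    0x200ed5e2ef4a3f97,
    0x6140948cf3453c97,
]
KNOWN_PREFIX = b"PCTF{ban"


def recover_flag(constants, known_prefix):
    # Known plaintext XOR first constant yields the repeated 64-bit pattern.
    pattern = int.from_bytes(known_prefix, "little") ^ constants[0]
    # XOR is its own inverse, so the same pattern decodes every constant.
    return b"".join((pattern ^ c).to_bytes(8, "little") for c in constants)


flag = recover_flag(CONSTANTS, KNOWN_PREFIX)
# Drop the NUL terminator and the padding that follows it.
print(flag.split(b"\x00")[0].decode("ascii"))
</code></pre></div></div>

<p>Using <code class="language-plaintext highlighter-rouge">int.from_bytes()</code> and <code class="language-plaintext highlighter-rouge">int.to_bytes()</code> with <code class="language-plaintext highlighter-rouge">"little"</code> mirrors the little-endian <code class="language-plaintext highlighter-rouge">struct</code> format used above, without needing an extra import.</p>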

<h3 id="solution-2-checking-leftover-strings">Solution 2: Checking leftover strings</h3>

<p>When calling these flag functions, allocated strings also aren’t zeroed, so we can actually find the last part of these flags in the core dump. Running <code class="language-plaintext highlighter-rouge">strings</code> will bring up this pile of bananas (shown here as a hex dump):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00004260: 2f2f 2f74 696d 6520 666f 7220 636f 7265  ///time for core
00004270: 2f2f 2f0a 6261 6e61 6e61 7d0a 616e 616e  ///.banana}.anan
00004280: 6121 6261 6e61 6e61 7d0a 6e62 6e61 6e62  a!banana}.nbnanb
00004290: 6e61 6e62 6e61 6261 6e61 6e61 6e61 6e61  nanbnabanananana
000042a0: 6e62 617d 0a00 0000 0000 0000 0000 0000  nba}............
</code></pre></div></div>

<p>If we break <code class="language-plaintext highlighter-rouge">nbnanbnanbnabanananana}</code> into 64bit chunks, at least two of them could be xor’d with the previously seen constants, resulting in the same input byte sequence, thus hinting at the rest of the flag.</p>
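
<p>As a quick sanity check, here is a minimal sketch that reuses the operands from the interpreter session above. Calling the shared first operand <code class="language-plaintext highlighter-rouge">key</code> is my own label, and the 16 bytes are copied from offset <code class="language-plaintext highlighter-rouge">0x4290</code> of the dump:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># First operand of every XOR in the session above ("key" is a hypothetical name)
key = 0x4160B48C8E245EF9
# Second operands of the 2nd and 3rd XORs in the same session
constants = [0x2002D5E2EC4A3F97, 0x200ED5E2EF4A3F97]

# 16 leftover bytes copied from offset 0x4290 of the hex dump
leftover = b"nanbnabanananana"

matches = []
for offset, constant in zip(range(0, len(leftover), 8), constants):
    chunk = leftover[offset : offset + 8]
    # Same interpretation as a little-endian 64-bit unsigned integer
    as_int = int.from_bytes(chunk, "little")
    matches.append(as_int ^ key == constant)
    print(chunk, hex(as_int ^ key), matches[-1])
</code></pre></div></div>

<p>If both comparisons hold, the leftover bytes line up with the second and third 64-bit chunks of the flag.</p>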

<h2 id="task-4-sm">Task 4: sm</h2>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">flag4</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="n">param_1</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">ushort</span> <span class="n">uVar1</span><span class="p">;</span>
  <span class="n">uint</span> <span class="n">uVar2</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar3</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar4</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar5</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar6</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar7</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar8</span><span class="p">;</span>
  <span class="kt">double</span> <span class="n">dVar9</span><span class="p">;</span>
  <span class="kt">long</span> <span class="n">i</span><span class="p">;</span>

  <span class="o">*</span><span class="n">param_1</span> <span class="o">=</span> <span class="o">*</span><span class="n">param_1</span> <span class="o">^</span> <span class="mh">0xbc019ee23a6bf6bf</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xe9483020414b589c</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x217b7d11e6c9a8a3</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x3b3924ce775a8541</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">4</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x6bbdb2171bad0ec8</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">5</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0xb0b0429f1f0242e9</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x5de514ab5abe8132</span><span class="p">;</span>
  <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">=</span> <span class="n">param_1</span><span class="p">[</span><span class="mi">7</span><span class="p">]</span> <span class="o">^</span> <span class="mh">0x50789e90a63c152e</span><span class="p">;</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"Flag 4:"</span><span class="p">);</span>
  <span class="n">puts</span><span class="p">((</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span><span class="p">);</span>
  <span class="n">fflush</span><span class="p">((</span><span class="kt">FILE</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">);</span>
  <span class="n">dVar9</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)</span><span class="n">param_1</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">4</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">);</span>
  <span class="n">dVar3</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">6</span><span class="p">)</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">10</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar9</span><span class="p">;</span>
  <span class="n">dVar4</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0xc</span><span class="p">)</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">2</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar3</span><span class="p">;</span>
  <span class="n">dVar5</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x12</span><span class="p">)</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x16</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar4</span><span class="p">;</span>
  <span class="n">dVar6</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">3</span><span class="p">)</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1c</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar5</span><span class="p">;</span>
  <span class="n">dVar7</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1e</span><span class="p">)</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x22</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar6</span><span class="p">;</span>
  <span class="n">uVar2</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x2a</span><span class="p">);</span>
  <span class="n">dVar8</span> <span class="o">=</span> <span class="p">(</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">uint</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x24</span><span class="p">)</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span>
                  <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">5</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar7</span><span class="p">;</span>
  <span class="n">uVar1</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ushort</span> <span class="o">*</span><span class="p">)((</span><span class="kt">long</span><span class="p">)</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x2e</span><span class="p">);</span>
  <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="k">do</span> <span class="p">{</span>
    <span class="n">param_1</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="p">(</span><span class="n">ulong</span><span class="p">)(((</span><span class="kt">double</span><span class="p">)((</span><span class="n">ulong</span><span class="p">)</span><span class="n">uVar2</span> <span class="o">|</span> <span class="mh">0x3fff000000000000</span> <span class="o">|</span> <span class="p">(</span><span class="n">ulong</span><span class="p">)</span><span class="n">uVar1</span> <span class="o">&lt;&lt;</span> <span class="mh">0x20</span><span class="p">)</span> <span class="o">+</span> <span class="n">dVar8</span>
                         <span class="p">)</span> <span class="o">*</span> <span class="n">dVar8</span> <span class="o">*</span> <span class="n">dVar7</span> <span class="o">*</span> <span class="n">dVar6</span> <span class="o">*</span> <span class="n">dVar5</span> <span class="o">*</span> <span class="n">dVar4</span> <span class="o">*</span> <span class="n">dVar3</span> <span class="o">*</span> <span class="n">dVar9</span><span class="p">);</span>
    <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="mi">8</span><span class="p">);</span>
  <span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>The decompilation only tells part of the story. There are some floating-point variables being juggled until a final result is stored 8 times in the global buffer. But what’s special about these variables?</p>
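
<p>One thing the decompilation does show is how each variable is built: 6 bytes of the buffer land in the low 48 bits, and <code class="language-plaintext highlighter-rouge">0x3fff</code> fills the top 16 bits. Despite the <code class="language-plaintext highlighter-rouge">(double)</code> cast, I’m reading this as a reinterpretation of the bits rather than a numeric conversion. A minimal sketch of that packing, with a made-up 6-byte input and plain integer arithmetic standing in for the bit cast:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>chunk = b"banana"  # made-up 6-byte input, not taken from the challenge

low = int.from_bytes(chunk[:4], "little")    # *(uint *)p
high = int.from_bytes(chunk[4:6], "little")  # *(ushort *)(p + 4)
# The three OR'd fields do not overlap, so adding them gives the same bits
bits = low + high * 2**32 + 0x3FFF * 2**48

# Read the bits as an IEEE 754 double: sign 0, exponent field 0x3ff,
# and a 52-bit fraction whose top nibble is 0xf
fraction = bits % 2**52
value = 1 + fraction / 2**52

# Going back: the low 48 bits of the fraction are the original bytes
recovered = (int((value - 1) * 2**52) % 2**48).to_bytes(6, "little")
print(value, recovered)
</code></pre></div></div>

<p>Every packed value ends up between 1.9375 and 2, which is why the exponent bytes can simply be thrown away when unpacking.</p>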

<p>Let’s look in the disassembly for instructions specific to floating-point calculation:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00101544 dd 44 24 08     FLD        qword ptr [RSP + local_10]
...
00101566 dd 44 24 08     FLD        qword ptr [RSP + local_10]
0010156a d8 c1           FADD       ST0,ST1
</code></pre></div></div>

<p>Wait, <code class="language-plaintext highlighter-rouge">ST0</code>? Aren’t registers like <code class="language-plaintext highlighter-rouge">XMM0</code> normally used for this?</p>

<p>Usually yes, but there is more than one instruction subset dedicated to floating-point calculations. XMM registers are used by Streaming SIMD Extensions (SSE) instructions. But before those came along, calculations were done with the floating-point unit (FPU) instructions, which use ST registers.</p>

<p>Is there any reason why these FPU instructions are being used here instead?</p>

<p>Turns out that these registers map to a <a href="https://www.csee.umbc.edu/courses/undergraduate/313/fall04/burt_katz/lectures/Lect12/stack.html">stack of 8 extended floating point numbers</a>. See where this is going? Most likely the stack is also present in the core dump.</p>

<p>Let’s see how we can recover the individual values of each float variable:</p>

<ul>
  <li>We start with a push of the first value (<code class="language-plaintext highlighter-rouge">FLD</code>);</li>
  <li>Then we push a second value, <a href="https://c9x.me/x86/html/file_module_x86_id_81.html">add them and store the result in ST0</a> (<code class="language-plaintext highlighter-rouge">FLD</code> + <code class="language-plaintext highlighter-rouge">FADD ST0,ST1</code>);
    <ul>
      <li>Repeated 7 times;</li>
    </ul>
  </li>
  <li>Then we multiply ST1 by ST0, store result in ST1, and pop the register stack (<code class="language-plaintext highlighter-rouge">FMULP</code>);
    <ul>
      <li>Repeated 7 times;</li>
    </ul>
  </li>
  <li>Finally, this result gets copied to all 8 64bit offsets in the global buffer (<code class="language-plaintext highlighter-rouge">FST qword ptr [RBX + RAX*0x8]</code>);</li>
</ul>

<p>Intermediate results are persisted in the stack, so we should be able to apply these operations in reverse.</p>
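
<p>To convince ourselves, here is a toy model of the register stack, assuming 8 physical slots plus a TOP index that wraps around. The class and method names are made up for illustration, the inputs are arbitrary, and Python doubles stand in for 80-bit values:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Toy model of the x87 register stack (hypothetical helper, not a real API)
class X87Stack:
    def __init__(self):
        self.regs = [0.0] * 8  # physical registers R0..R7
        self.top = 0           # physical index of ST0

    def st(self, i):
        return self.regs[(self.top + i) % 8]

    def fld(self, value):
        # Push: TOP moves down by one (mod 8), then ST0 receives the value
        self.top = (self.top - 1) % 8
        self.regs[self.top] = value

    def fadd_st0_st1(self):
        # ST0 = ST0 + ST1, nothing is popped
        self.regs[self.top] = self.st(0) + self.st(1)

    def fmulp(self):
        # ST1 = ST1 * ST0, then pop; the popped slot keeps its old contents
        self.regs[(self.top + 1) % 8] = self.st(1) * self.st(0)
        self.top = (self.top + 1) % 8


values = [1.5, 1.25, 1.75, 1.5, 1.25, 1.75, 1.5, 1.25]  # made-up inputs

fpu = X87Stack()
fpu.fld(values[0])
for v in values[1:]:
    fpu.fld(v)
    fpu.fadd_st0_st1()
for _ in range(7):
    fpu.fmulp()

# Leftover slots: R0 = last sum, R1..R6 = partial products, R7 = full product
regs = fpu.regs
# Undo the multiplications to get the running sums, then undo the sums
sums = [regs[i] / regs[i - 1] for i in range(7, 0, -1)] + [regs[0]]
recovered = [sums[0]] + [sums[i] - sums[i - 1] for i in range(1, 8)]
print(recovered)
</code></pre></div></div>

<p>The popped slots are never cleared in this model, which is the property we are betting on for the real core dump.</p>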

<p>Now, regarding how these floats are packed. Since we are dealing with extended precision floats, recall the <a href="https://home.deec.uc.pt/~jlobo/tc/artofasm/ch14/ch141.htm">difference in precision formats</a>:</p>

<ul>
  <li><strong>single precision</strong>: 32-bits = 24-bit mantissa (with an implied H.O. bit of one), 8-bit excess-127 exponent, 1-bit sign</li>
  <li><strong>double precision</strong>: 64-bits = 53-bit mantissa (with an implied H.O. bit of one), 11-bit excess-1023 exponent, 1-bit sign</li>
  <li><strong>extended precision</strong>: 80-bits = 64-bit mantissa, 15 bit excess-16383 exponent, 1-bit sign</li>
</ul>

<p>Therefore, we should expect 10 bytes for each packed float, possibly padded with null bytes to 16 bytes.</p>

<p>Again, we can take our own core dumps before and after some of these operations, take note of the values, and find out how the stack gets changed.</p>

<p>We start with a <code class="language-plaintext highlighter-rouge">diff -u &lt;(xxd core.4a) &lt;(xxd core.4b)</code> before and after the first store, and we see a value pop up containing the or’d <code class="language-plaintext highlighter-rouge">0x3fff000000000000</code>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 000ae630: 0500 0000 0002 0000 0200 0000 434f 5245  ............CORE
<span class="gd">-000ae640: 0000 0000 7f03 0000 0000 0000 0000 0000  ................
-000ae650: 0000 0000 0000 0000 0000 0000 801f 0000  ................
-000ae660: 0000 0000 0000 0000 0000 0000 0000 0000  ................
</span><span class="gi">+000ae640: 0000 0000 7f03 0038 8000 0000 c251 5555  .......8.....QUU
+000ae650: 5555 0000 a8c8 ffff ff7f 0000 801f 0000  UU..............
+000ae660: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
</span></code></pre></div></div>
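
<p>The 10 bytes that showed up at <code class="language-plaintext highlighter-rouge">0xae664</code> can be decoded by hand with the extended precision layout listed earlier. This is only a sketch: it assumes a normal number (integer bit set) whose mantissa still fits in a double, which holds for a value that was just loaded from a 64-bit double:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Bytes of the first pushed value, copied from the diff above
raw = bytes.fromhex("00b0197eb19396fdff3f")

mantissa = int.from_bytes(raw[:8], "little")    # 64 bits, explicit integer bit
sign_exp = int.from_bytes(raw[8:10], "little")  # 1 sign bit + 15 exponent bits
sign = sign_exp // 2**15
exponent = sign_exp % 2**15

# value = (-1)^sign * (mantissa / 2^63) * 2^(exponent - 16383)
value = (-1) ** sign * (mantissa / 2**63) * 2.0 ** (exponent - 16383)

# Drop the integer bit and the 11 bits a double cannot hold to get the
# 52-bit fraction, then keep its low 48 bits (the part that was not OR'd in)
payload = (mantissa - 2**63) // 2**11 % 2**48
print(hex(exponent), value, payload.to_bytes(6, "little").hex())
</code></pre></div></div>

<p>The exponent field comes out as <code class="language-plaintext highlighter-rouge">0x3fff</code>, the same bytes as the or’d constant, because a double in the range [1, 2) has an unbiased exponent of zero.</p>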

<p>The second value is also stored:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 000ae630: 0500 0000 0002 0000 0200 0000 434f 5245  ............CORE
<span class="gd">-000ae640: 0000 0000 7f03 0038 8000 0000 c251 5555  .......8.....QUU
</span><span class="gi">+000ae640: 0000 0000 7f03 0030 c000 0000 c251 5555  .......0.....QUU
</span> 000ae650: 5555 0000 a8c8 ffff ff7f 0000 801f 0000  UU..............
<span class="gd">-000ae660: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
-000ae670: 0000 0000 0000 0000 0000 0000 0000 0000  ................
</span><span class="gi">+000ae660: 0000 0000 0028 55c8 a1ce 8ffd ff3f 0000  .....(U......?..
+000ae670: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
</span></code></pre></div></div>

<p>And the sum is computed:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 000ae630: 0500 0000 0002 0000 0200 0000 434f 5245  ............CORE
<span class="gd">-000ae640: 0000 0000 7f03 0030 c000 0000 c251 5555  .......0.....QUU
</span><span class="gi">+000ae640: 0000 0000 7f03 0030 c000 0000 4855 5555  .......0....HUUU
</span> 000ae650: 5555 0000 a8c8 ffff ff7f 0000 801f 0000  UU..............
<span class="gd">-000ae660: 0000 0000 0028 55c8 a1ce 8ffd ff3f 0000  .....(U......?..
</span><span class="gi">+000ae660: 0000 0000 006c 37a3 2931 93fd 0040 0000  .....l7.)1...@..
</span> 000ae670: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
</code></pre></div></div>

<p>After all the sums:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 000ae630: 0500 0000 0002 0000 0200 0000 434f 5245  ............CORE
<span class="gd">-000ae640: 0000 0000 7f03 0030 c000 0000 4855 5555  .......0....HUUU
</span><span class="gi">+000ae640: 0000 0000 7f03 0000 ff00 0000 c251 5555  .............QUU
</span> 000ae650: 5555 0000 a8c8 ffff ff7f 0000 801f 0000  UU..............
<span class="gd">-000ae660: 0000 0000 006c 37a3 2931 93fd 0040 0000  .....l7.)1...@..
-000ae670: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
-000ae680: 0000 0000 0000 0000 0000 0000 0000 0000  ................
-000ae690: 0000 0000 0000 0000 0000 0000 0000 0000  ................
-000ae6a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
-000ae6b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
-000ae6c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
-000ae6d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
</span><span class="gi">+000ae660: 0000 0000 004d 6ea9 9e7a 0afc 0240 0000  .....Mn..z...@..
+000ae670: 0000 0000 001e 69b6 7bb9 3adc 0240 0000  ......i.{.:..@..
+000ae680: 0000 0000 0044 d1be 2061 d3bc 0240 0000  .....D.. a...@..
+000ae690: 0000 0000 0091 4e42 d8ca 819d 0240 0000  ......NB.....@..
+000ae6a0: 0000 0000 0012 4ee0 c281 4cfc 0140 0000  ......N...L..@..
+000ae6b0: 0000 0000 0044 f05d a3f0 dfbd 0140 0000  .....D.].....@..
+000ae6c0: 0000 0000 006c 37a3 2931 93fd 0040 0000  .....l7.)1...@..
+000ae6d0: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
</span></code></pre></div></div>

<p>After all the multiplications:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code> 000ae630: 0500 0000 0002 0000 0200 0000 434f 5245  ............CORE
<span class="gd">-000ae640: 0000 0000 7f03 0000 ff00 0000 c251 5555  .............QUU
</span><span class="gi">+000ae640: 0000 0000 7f03 203a 8000 0000 2856 5555  ...... :....(VUU
</span> 000ae650: 5555 0000 a8c8 ffff ff7f 0000 801f 0000  UU..............
<span class="gd">-000ae660: 0000 0000 004d 6ea9 9e7a 0afc 0240 0000  .....Mn..z...@..
-000ae670: 0000 0000 001e 69b6 7bb9 3adc 0240 0000  ......i.{.:..@..
-000ae680: 0000 0000 0044 d1be 2061 d3bc 0240 0000  .....D.. a...@..
-000ae690: 0000 0000 0091 4e42 d8ca 819d 0240 0000  ......NB.....@..
-000ae6a0: 0000 0000 0012 4ee0 c281 4cfc 0140 0000  ......N...L..@..
-000ae6b0: 0000 0000 0044 f05d a3f0 dfbd 0140 0000  .....D.].....@..
-000ae6c0: 0000 0000 006c 37a3 2931 93fd 0040 0000  .....l7.)1...@..
-000ae6d0: 0000 0000 00b0 197e b193 96fd ff3f 0000  .......~.....?..
</span><span class="gi">+000ae660: 0000 0000 6985 0216 ddf3 258d 1640 0000  ....i.....%..@..
+000ae670: 0000 0000 004d 6ea9 9e7a 0afc 0240 0000  .....Mn..z...@..
+000ae680: 0000 0000 d8b0 c980 5dd2 d2d8 0640 0000  ........]....@..
+000ae690: 0000 0000 6fb3 52ab 83da ed9f 0a40 0000  ....o.R......@..
+000ae6a0: 0000 0000 c770 e749 2de9 cbc4 0d40 0000  .....p.I-....@..
+000ae6b0: 0000 0000 48ba a65d d289 f3c1 1040 0000  ....H..].....@..
+000ae6c0: 0000 0000 6925 d573 3576 da8f 1340 0000  ....i%.s5v...@..
+000ae6d0: 0000 0000 fc90 7fea e49c 7d8e 1540 0000  ..........}..@..
</span></code></pre></div></div>

<p>Ok, now we have an idea of how this structure is represented. To find it in the provided core dump, we can use the same trick of searching for high-valued image base bytes (<code class="language-plaintext highlighter-rouge">\xf0\x6c\xfa\x55</code>), or even that <code class="language-plaintext highlighter-rouge">CORE</code> string we see before the stack. Eventually, we arrive at offset <code class="language-plaintext highlighter-rouge">0xd14</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00000ce0: 0500 0000 0002 0000 0200 0000 434f 5245  ............CORE
00000cf0: 0000 0000 7f03 2000 0000 0000 3676 f06c  ...... .....6v.l
00000d00: fa55 0000 0000 0000 0000 0000 801f 0000  .U..............
00000d10: ffff 0000 004d 011c 58e7 86fa 0240 0000  .....M..X....@..
00000d20: 0000 0000 f5bf 8d80 2bf2 d4d6 0640 0000  ........+....@..
00000d30: 0000 0000 ce05 b0b4 efa3 e39d 0a40 0000  .............@..
00000d40: 0000 0000 639c 8340 af09 5fc1 0d40 0000  ....c..@.._..@..
00000d50: 0000 0000 be52 3872 6f03 70bd 1040 0000  .....R8ro.p..@..
00000d60: 0000 0000 78b5 2c15 b427 208b 1340 0000  ....x.,..' ..@..
00000d70: 0000 0000 8f84 f589 3cc9 0a88 1540 0000  ........&lt;....@..
00000d80: 0000 0000 b3c5 f722 7164 a485 1640 0000  ......."qd...@..
00000d90: 0000 0000 0000 0000 0000 0000 0000 0000  ................
</code></pre></div></div>
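
<p>The search can also be scripted. The sketch below assumes that the bytes before <code class="language-plaintext highlighter-rouge">CORE</code> are an ELF note header (name size 5, descriptor size 0x200, type 2) and that the register stack starts 32 bytes into the descriptor, after the control, status and pointer fields. That is my reading of the dump above, and <code class="language-plaintext highlighter-rouge">find_st_space</code> is a made-up helper name:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Note header as seen at 0xce0: namesz=5, descsz=0x200, type=2, name padded to 8
NOTE_HEADER = bytes.fromhex("050000000002000002000000") + b"CORE\x00\x00\x00\x00"
# Assumed layout before the registers: four 2-byte fields (control word first),
# two 8-byte pointers, then two 4-byte MXCSR fields
ST_SPACE_OFFSET = 4 * 2 + 2 * 8 + 2 * 4


def find_st_space(data):
    # Offset of the first register slot, or None if the header is not found
    pos = data.find(NOTE_HEADER)
    if pos == -1:
        return None
    return pos + len(NOTE_HEADER) + ST_SPACE_OFFSET


# The four dump lines starting at 0xce0, so offsets here are relative to 0xce0
fragment = bytes.fromhex(
    "050000000002000002000000434f5245"
    "000000007f032000000000003676f06c"
    "fa5500000000000000000000801f0000"
    "ffff0000004d011c58e786fa02400000"
)

start = find_st_space(fragment)
print(hex(0xCE0 + start), fragment[start : start + 10].hex())
</code></pre></div></div>

<p>On the real file we would pass the whole core dump instead of this fragment and use the returned offset directly.</p>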

<h3 id="solution">Solution</h3>

<p>You can find alternative solutions written in C, and for a good reason: unpacking floats is as easy as making some casts out of an unsigned char buffer, so to unpack a float80 (padded to 16 bytes) at index i you just do <code class="language-plaintext highlighter-rouge">*(long double *)&amp;buffer[i * 16]</code>.</p>

<p>In Python, the struct module only supports unpacking floats up to 8 bytes. Then there’s the decimal module, which could be suitable, but I ended up using numpy since it had more direct functions:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">struct</span>
<span class="kn">import</span> <span class="nn">sys</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>

<span class="n">stack_begin</span> <span class="o">=</span> <span class="mh">0xD14</span>
<span class="n">stack</span> <span class="o">=</span> <span class="n">v</span><span class="p">[</span><span class="n">stack_begin</span> <span class="p">:</span> <span class="n">stack_begin</span> <span class="o">+</span> <span class="mi">16</span> <span class="o">*</span> <span class="mi">8</span><span class="p">]</span>

<span class="c1"># Read float128 entries a.k.a. float80 padded to 16 bytes
</span><span class="n">floats_raw</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">8</span><span class="p">):</span>
    <span class="n">floats_raw</span> <span class="o">+=</span> <span class="p">[</span><span class="n">np</span><span class="p">.</span><span class="n">frombuffer</span><span class="p">(</span><span class="n">stack</span><span class="p">[</span><span class="n">i</span> <span class="o">*</span> <span class="mi">16</span> <span class="p">:</span> <span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">16</span><span class="p">],</span> <span class="n">np</span><span class="p">.</span><span class="n">float128</span><span class="p">)[</span><span class="mi">0</span><span class="p">]]</span>

<span class="c1"># Reverse multiplications
</span><span class="n">floats1</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">):</span>
    <span class="n">floats1</span> <span class="o">+=</span> <span class="p">[</span><span class="n">floats_raw</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">/</span> <span class="n">floats_raw</span><span class="p">[</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]]</span>
<span class="n">floats1</span> <span class="o">+=</span> <span class="p">[</span><span class="n">floats_raw</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span>

<span class="c1"># Reverse sums
</span><span class="n">floats2</span> <span class="o">=</span> <span class="p">[]</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">):</span>
    <span class="n">floats2</span> <span class="o">+=</span> <span class="p">[</span><span class="n">floats1</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">-</span> <span class="n">floats1</span><span class="p">[</span><span class="n">i</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]]</span>
<span class="n">floats2</span> <span class="o">+=</span> <span class="p">[</span><span class="n">floats1</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span>

<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">):</span>
    <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="nb">buffer</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">struct</span><span class="p">.</span><span class="n">pack</span><span class="p">(</span><span class="s">"&lt;d"</span><span class="p">,</span> <span class="n">floats2</span><span class="p">[</span><span class="n">i</span><span class="p">])[:</span><span class="o">-</span><span class="mi">2</span><span class="p">])</span>  <span class="c1"># remove 2 bytes of exponent added in OR operations (0x3fff)
</span></code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PCTF{orange%you&amp;glad$i!didn't^say#banana*again}
</code></pre></div></div>]]></content><author><name></name></author><category term="ctf" /><category term="reversing" /><category term="file formats" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Empty Handshakes</title><link href="https://nevesnunes.github.io/blog/2021/11/13/Empty-Handshakes.html" rel="alternate" type="text/html" title="Empty Handshakes" /><published>2021-11-13T09:07:23+00:00</published><updated>2021-11-13T09:07:23+00:00</updated><id>https://nevesnunes.github.io/blog/2021/11/13/Empty-Handshakes</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/11/13/Empty-Handshakes.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<p>When attempting to make an HTTPS request from a Qt app, a terse error was returned:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/zeal-error.png" alt="" />
</div>

<p>This seemed odd, given that curl had no issue making the same request, without the user specifying any additional certificates. So, what was different?</p>

<h1 id="analysis">Analysis</h1>

<p>With <code class="language-plaintext highlighter-rouge">strace -f -k</code>, we don’t find the message text verbatim, but we can search for the last instance of “handshake”, then look up the call stack for application-specific functions:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1984033 write(5, "\1\0\0\0\0\0\0\0", 8) = 8
[...]
 &gt; /usr/lib64/libQt5Widgets.so.5.15.2(QPushButton::QPushButton(QString const&amp;, QWidget*)+0x18) [0x34c448]
[...]
 &gt; /usr/lib64/libQt5Widgets.so.5.15.2(QMessageBox::warning(QWidget*, QString const&amp;, QString const&amp;, int, int, int)+0x5f) [0x3f933f]
 &gt; /home/foo/opt/zeal-dev/build/bin/zeal(Zeal::WidgetUi::DocsetsDialog::downloadCompleted()+0x273) [0x4c7e81]
[...]
1984262 write(163, "\1\0\0\0\0\0\0\0", 8) = 8
[...]
 &gt; /usr/lib64/libQt5Network.so.5.15.2(QAbstractSocket::disconnectFromHost()+0xc8) [0xfe128]
 &gt; /usr/lib64/libQt5Network.so.5.15.2(QSslSocketBackendPrivate::checkSslErrors()+0x11b) [0x13f99b]
 &gt; /usr/lib64/libQt5Network.so.5.15.2(QSslSocketBackendPrivate::startHandshake()+0x3ea) [0x143eaa]
</code></pre></div></div>

<p>On an earlier call stack that includes the “handshake” function, we see OpenSSL-specific functions from libssl.so:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> &gt; /usr/lib64/libssl.so.1.1.1l(state_machine.part.0+0x43a) [0x53f3a]
 &gt; /usr/lib64/libQt5Network.so.5.15.2(QSslSocketBackendPrivate::startHandshake()+0x4e4) [0x143fa4]
</code></pre></div></div>

<p>This shows us what Qt ends up using for the SSL connection. We can use the <code class="language-plaintext highlighter-rouge">openssl s_client</code> tool to compare validation results, since it will use the same library (checked with ldd).</p>

<div class="c-indirectly-related">
  <p>We want to know the full URL to also test with curl. While it’s given by another dialog message, we could also figure it out with gdb.</p>

  <p>Let’s look at the actual download function, which appears earlier in the trace:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> &gt; /home/foo/opt/zeal-dev/build/bin/zeal(Zeal::WidgetUi::DocsetsDialog::download(QUrl const&amp;)+0x10d) [0x4cb409]
</code></pre></div>  </div>

  <p>This function is also called as part of the retry logic inside <code class="language-plaintext highlighter-rouge">downloadCompleted()</code>:</p>

  <div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">ret</span> <span class="o">==</span> <span class="n">QMessageBox</span><span class="o">::</span><span class="n">Retry</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">QNetworkReply</span> <span class="o">*</span><span class="n">newReply</span> <span class="o">=</span> <span class="n">download</span><span class="p">(</span><span class="n">reply</span><span class="o">-&gt;</span><span class="n">request</span><span class="p">().</span><span class="n">url</span><span class="p">());</span>
    <span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div>  </div>

  <p>Debugger setup:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>set follow-fork-mode parent
b Zeal::WidgetUi::DocsetsDialog::download
r
</code></pre></div>  </div>

  <p>Let’s inspect <code class="language-plaintext highlighter-rouge">url</code>:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; p url
$8 = (const QUrl &amp;) @0x7fffffffb070: {
  d = 0xd569b0
}
pwndbg&gt; telescope url.d 0x20
00:0000│  0xd569b0 ◂— 0xffffffff00000001
01:0008│  0xd569b8 —▸ 0x7fffdc00d110 ◂— 0x500000001
02:0010│  0xd569c0 —▸ 0x7ffff766f260 (QArrayData::shared_null) ◂— 0xffffffff
03:0018│  0xd569c8 —▸ 0x7ffff766f260 (QArrayData::shared_null) ◂— 0xffffffff
04:0020│  0xd569d0 —▸ 0x10c1b70 ◂— 0x1000000001
05:0028│  0xd569d8 —▸ 0xd79740 ◂— 0xb00000001
06:0030│  0xd569e0 —▸ 0x7ffff766f260 (QArrayData::shared_null) ◂— 0xffffffff
07:0038│  0xd569e8 —▸ 0x7ffff766f260 (QArrayData::shared_null) ◂— 0xffffffff
08:0040│  0xd569f0 ◂— 0x0
09:0048│  0xd569f8 ◂— 0x7400720009 /* '\t' */
0a:0050│  0xd56a00 ◂— 0x2400000065 /* 'e' */
[...]
</code></pre></div>  </div>

  <p>A fine structure… Given that we compiled with debug symbols (<code class="language-plaintext highlighter-rouge">cmake -DCMAKE_BUILD_TYPE=Debug</code>), we can check these fields:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; p url.d-&gt;host.d
$16 = (QString::Data *) 0x10c1b70
pwndbg&gt; p &amp;url.d-&gt;host.d-&gt;offset
$17 = (qptrdiff *) 0x10c1b80
pwndbg&gt; p url.d-&gt;host.d-&gt;offset
$18 = 24
pwndbg&gt; x/20wx 0x10c1b70+24
0x10c1b88:      0x00700061      0x002e0069      0x0065007a      0x006c0061
0x10c1b98:      0x006f0064      0x00730063      0x006f002e      0x00670072
0x10c1ba8:      0x00000000      0x40000000      0x00000050      0x00000000
0x10c1bb8:      0x00000061      0x00000000      0xc426e000      0x00007fff
0x10c1bc8:      0x00065848      0x00000000      0x000070f0      0x00000000
</code></pre></div>  </div>

  <p>Seems like our characters are encoded in UTF-16 little endian. These same values can be iterated with <code class="language-plaintext highlighter-rouge">data()</code>:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; p url.d-&gt;host.d-&gt;data()
$39 = (unsigned short *) 0x10c1b88
pwndbg&gt; p &amp;url.d-&gt;host.d-&gt;data()[1]
$41 = (unsigned short *) 0x10c1b8a
</code></pre></div>  </div>

  <p>After joining them:</p>

  <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">&gt;&gt;&gt;</span> <span class="s">''</span><span class="p">.</span><span class="n">join</span><span class="p">([</span><span class="nb">chr</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="p">[</span><span class="mh">0x61</span><span class="p">,</span><span class="mh">0x70</span><span class="p">,</span><span class="mh">0x69</span><span class="p">,</span><span class="mh">0x2e</span><span class="p">,</span><span class="mh">0x7a</span><span class="p">,</span><span class="mh">0x65</span><span class="p">,</span><span class="mh">0x61</span><span class="p">,</span><span class="mh">0x6c</span><span class="p">,</span><span class="mh">0x64</span><span class="p">,</span><span class="mh">0x6f</span><span class="p">,</span><span class="mh">0x63</span><span class="p">,</span><span class="mh">0x73</span><span class="p">,</span><span class="mh">0x2e</span><span class="p">,</span><span class="mh">0x6f</span><span class="p">,</span><span class="mh">0x72</span><span class="p">,</span><span class="mh">0x67</span><span class="p">]])</span>
<span class="s">'api.zealdocs.org'</span>
</code></pre></div>  </div>
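
  <p>As a small sketch, the words from the <code class="language-plaintext highlighter-rouge">x/20wx</code> dump can also be decoded directly instead of picking out code units by hand. This assumes a little-endian host (as on x86-64) and that only the first 8 words belong to the string; <code class="language-plaintext highlighter-rouge">decode_words</code> is a hypothetical helper name, not something provided by gdb or Qt:</p>

```python
import struct

# 32-bit words copied from the gdb dump above (the first 8 hold the host name)
words = [
    0x00700061, 0x002e0069, 0x0065007a, 0x006c0061,
    0x006f0064, 0x00730063, 0x006f002e, 0x00670072,
]


def decode_words(words):
    # gdb shows each word as a number; packing it back as little-endian
    # restores the byte order the process has in memory
    raw = struct.pack("<%dI" % len(words), *words)
    # QString stores UTF-16 code units, here in little-endian order
    return raw.decode("utf-16-le")


host = decode_words(words)
print(host)
```

  <p>Each word holds two UTF-16 code units, which is why the characters appear swapped in pairs when reading the dump from left to right.</p>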

  <p>This only gave us the host, so we would need to repeat this process for subpaths, but keywords <code class="language-plaintext highlighter-rouge">QString::Data gdb</code> point us to some <a href="https://invent.kde.org/sdk/kde-dev-scripts">helper scripts</a> that pretty print these Qt types.</p>

  <p>Funnily enough, casually invoking telescope on the registers gives us the full URL:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; telescope
10:0080│         0x7fffffffb060 —▸ 0x7fffffffb110 —▸ 0x7fffffffb140 —▸ 0x7fffffffb190 —▸ 0x7fffffffb270 ◂— ...
11:0088│         0x7fffffffb068 —▸ 0x7fffdc009690 —▸ 0x5fccb8 —▸ 0x4c022e ◂— push   rbp
12:0090│ rdx rsi 0x7fffffffb070 —▸ 0xd569b0 ◂— 0xffffffff00000001
13:0098│         0x7fffffffb078 —▸ 0xd77af0 ◂— 0x2300000001
14:00a0│         0x7fffffffb080 —▸ 0x580090 ◂— 'https://api.zealdocs.org/v1'
15:00a8│         0x7fffffffb088 —▸ 0x7fffffffb090 ◂— 0x7fff00000008
16:00b0│         0x7fffffffb090 ◂— 0x7fff00000008
17:00b8│         0x7fffffffb098 —▸ 0x58042c ◂— '/docsets'
</code></pre></div>  </div>
</div>

<p>Before testing with <code class="language-plaintext highlighter-rouge">openssl s_client</code>, we should clarify who is actually reporting a handshake error: the client (our app) or the server (the remote host)?</p>

<p>Let’s sniff the traffic from curl, filtering by the IP given by <code class="language-plaintext highlighter-rouge">nslookup api.zealdocs.org</code>:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/zeal-curl.png" alt="" />
</div>

<p>Compare it with the traffic from our client app:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/zeal-before-log.png" alt="" />
</div>

<p>We get to see that it’s the client app that decides to terminate the connection with a TCP FIN packet.</p>

<p>There are some “Encrypted Handshake Message” packets, which can be decrypted by <a href="https://github.com/saleemrashid/frida-sslkeylog">instrumenting OpenSSL functions</a><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, so that the session’s master key is logged in this format:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RSA Session-ID:f310e2aefbb422b9e7ab02afa2fbc3bdfd00107d6d5dd8653340c98b3ee9db36 Master-Key:1f52347840128456110317cb312a38985b2fd212afa6da2793caaa30b374790eb8043ec665cce0599159b4575a6b0415
</code></pre></div></div>
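
<p>To sanity check a logged line before handing it to another tool, a rough sketch like the following splits it into its two fields. The regular expression only covers the <code class="language-plaintext highlighter-rouge">RSA Session-ID:… Master-Key:…</code> shape shown above, and <code class="language-plaintext highlighter-rouge">parse_keylog_line</code> is a hypothetical helper, not part of the instrumentation script:</p>

```python
import re

# the line logged above, split in two only for readability
KEYLOG_LINE = (
    "RSA Session-ID:f310e2aefbb422b9e7ab02afa2fbc3bdfd00107d6d5dd8653340c98b3ee9db36 "
    "Master-Key:1f52347840128456110317cb312a38985b2fd212afa6da2793caaa30b374790eb8043ec665cce0599159b4575a6b0415"
)

KEYLOG_RE = re.compile(r"RSA Session-ID:([0-9a-fA-F]+) Master-Key:([0-9a-fA-F]+)")


def parse_keylog_line(line):
    match = KEYLOG_RE.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a Session-ID/Master-Key line")
    # both fields are hex encoded
    session_id, master_key = (bytes.fromhex(field) for field in match.groups())
    return session_id, master_key


session_id, master_key = parse_keylog_line(KEYLOG_LINE)
print(len(session_id), len(master_key))
```

<p>A TLS 1.2 master secret is always 48 bytes, so a different length hints at a truncated or badly copied line.</p>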

<p>This key log is then loaded in Wireshark: <code class="language-plaintext highlighter-rouge">Right click on a TLS packet &gt; Protocol Preferences &gt; Transport Layer Security &gt; Pre-Master-Secret Log</code></p>

<p>Output:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/zeal-after-log.png" alt="" />
</div>

<p>Oh, it was just a “Finished” message…</p>

<h2 id="certificate-verification">Certificate Verification</h2>

<p>Now that we suspect that our client is the one originating the error, let’s check if we can actually verify the certificate chain successfully with other tools.</p>

<p>Let’s start with a minimal Qt app that downloads a file from a user provided URL. Keywords <code class="language-plaintext highlighter-rouge">qt example http</code> direct us to such an <a href="https://doc.qt.io/qt-5/qtnetwork-http-example.html">example</a>, built with:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd</span> <span class="nv">$path</span>/qtbase
cmake <span class="nb">.</span>
cmake <span class="nt">--build</span> <span class="nb">.</span> <span class="nt">--parallel</span>
<span class="nb">cd</span> <span class="nv">$path</span>/qtbase/examples/network/http
<span class="nv">LD_LIBRARY_PATH</span><span class="o">=</span><span class="nv">$path</span>/qtbase/lib <span class="nv">$path</span>/qtbase/bin/qmake <span class="nt">-o</span> Makefile <span class="k">*</span>.pro
</code></pre></div></div>

<div class="c-indirectly-related">
  <p>Apparently the <code class="language-plaintext highlighter-rouge">cmake</code> commands do not compile any examples. No README in the repository had any further instructions. Keywords <code class="language-plaintext highlighter-rouge">gist qt makefile</code> led me to <a href="https://gist.github.com/mishurov/8134532">how to compile without something called qmake</a>. Well, that’s one of the binaries the initial cmake commands built, so now we knew the magic word to search for and land on the intended <a href="https://doc.qt.io/qt-5/qmake-tutorial.html">tutorial</a>…</p>
</div>

<p>Running the example app with our URL gives us the same error<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup>, but with a clearer description:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>One or more SSL errors has occurred:
The issuer certificate of a locally looked up certificate could not be found
</code></pre></div></div>

<hr />

<p>Moving on to curl, we trace our request:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">--head</span> https://api.zealdocs.org/v1/docsets <span class="nt">--trace</span> /dev/stderr <span class="o">&gt;</span>/dev/null
</code></pre></div></div>

<p>Which shows us the system certificates that are loaded:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>* successfully set certificate verify locations:
*  CAfile: /etc/pki/tls/certs/ca-bundle.crt
</code></pre></div></div>

<p>curl does not send any (client) certificate:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>== Info: TLSv1.3 (OUT), TLS handshake, Client hello (1):
=&gt; Send SSL data, 512 bytes (0x200)
[...]
&lt;= Recv SSL data, 5 bytes (0x5)
0000: 16 03 03 00 6c                                  ....l
== Info: TLSv1.3 (IN), TLS handshake, Server hello (2):
&lt;= Recv SSL data, 108 bytes (0x6c)
[...]
&lt;= Recv SSL data, 5 bytes (0x5)
0000: 16 03 03 10 da                                  .....
== Info: TLSv1.2 (IN), TLS handshake, Certificate (11):
&lt;= Recv SSL data, 4314 bytes (0x10da)
[...]
&lt;= Recv SSL data, 5 bytes (0x5)
0000: 16 03 03 00 04                                  .....
== Info: TLSv1.2 (IN), TLS handshake, Server finished (14):
&lt;= Recv SSL data, 4 bytes (0x4)
0000: 0e 00 00 00                                     ....
=&gt; Send SSL data, 5 bytes (0x5)
0000: 16 03 03 00 46                                  ....F
== Info: TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
=&gt; Send SSL data, 70 bytes (0x46)
</code></pre></div></div>

<p>The request then proceeds without issues.</p>

<hr />

<p>Now, let’s check the server’s certificate chain:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openssl s_client <span class="nt">-showcerts</span> <span class="nt">-connect</span> api.zealdocs.org:443 &lt;/dev/null
</code></pre></div></div>

<p>The root CA certificate should be the last issuer:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Certificate chain
 0 s:CN = api.ams2-01.zealdocs.org
   i:C = US, O = Let's Encrypt, CN = R3
[...]
 1 s:C = US, O = Let's Encrypt, CN = R3
   i:C = US, O = Internet Security Research Group, CN = ISRG Root X1
[...]
 2 s:C = US, O = Internet Security Research Group, CN = ISRG Root X1
   i:O = Digital Signature Trust Co., CN = DST Root CA X3
</code></pre></div></div>

<p>And we know they are valid:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SSL handshake has read 5071 bytes and written 436 bytes
Verification: OK
</code></pre></div></div>
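
<p>The chain listing above can be read mechanically: each issuer should be the subject of the next entry, and the last issuer is the one that has to be found locally. A minimal sketch, comparing names only (it does not verify any signature, which is what OpenSSL actually does); the helper names are hypothetical:</p>

```python
# (subject, issuer) pairs copied from the `openssl s_client -showcerts` output
chain = [
    ("CN = api.ams2-01.zealdocs.org",
     "C = US, O = Let's Encrypt, CN = R3"),
    ("C = US, O = Let's Encrypt, CN = R3",
     "C = US, O = Internet Security Research Group, CN = ISRG Root X1"),
    ("C = US, O = Internet Security Research Group, CN = ISRG Root X1",
     "O = Digital Signature Trust Co., CN = DST Root CA X3"),
]


def chain_is_linked(chain):
    # every certificate must name the subject of the next one as its issuer
    pairs = zip(chain, chain[1:])
    return all(issuer == next_subject for (_, issuer), (next_subject, _) in pairs)


def trust_anchor(chain):
    # issuer of the last certificate sent by the server:
    # this is the name that must exist in a local CA store
    return chain[-1][1]


print(chain_is_linked(chain))
print(trust_anchor(chain))
```

<p>Here the name to look for locally is the DST Root CA X3 issuer, which is why the next step inspects the system bundle.</p>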

<p>Let’s check if our system certificate bundle contains these CA certificates:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>openssl crl2pkcs7 <span class="nt">-nocrl</span> <span class="nt">-certfile</span> /etc/pki/tls/certs/ca-bundle.crt <span class="se">\</span>
    | openssl pkcs7 <span class="nt">-print_certs</span> <span class="nt">-text</span> <span class="nt">-noout</span>
</code></pre></div></div>

<p>Seems like they are present:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Subject: O=Digital Signature Trust Co., CN=DST Root CA X3
[...]
Subject: C=US, O=Internet Security Research Group, CN=ISRG Root X1
</code></pre></div></div>

<p>Therefore, loading this bundle should be enough to verify these CA certificates.</p>

<p>Is our Qt client doing it?</p>

<p>Filtering with <code class="language-plaintext highlighter-rouge">strace -e file -f -k</code>, we don’t find any read operations of that certificate bundle! Instead, it tries to load some other files that don’t exist:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>1984262 newfstatat(AT_FDCWD, "/etc/openssl/certs//8d33f237.0",  &lt;unfinished ...&gt;
[...]
1984262 &lt;... newfstatat resumed&gt;0x7f0d9bffd080, 0) = -1 ENOENT (No such file or directory)
[...]
1984262 newfstatat(AT_FDCWD, "/etc/ssl/certs//4042bcee.0", 0x7f0d9bffd080, 0) = -1 ENOENT (No such file or directory)
</code></pre></div></div>

<p>Turns out that it’s a <a href="https://github.com/owncloud/client/issues/1540">known issue</a>. To summarize, some environments generate <code class="language-plaintext highlighter-rouge">c_rehash</code> symlinks (which should reference certificate files). When these are present, Qt will parse them, skipping any existing bundles, even if the symlinks don’t reference any valid files.</p>
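
<p>To check whether a system is in that state, a rough diagnostic is to list the hash-named symlinks in a certificate directory that point nowhere. This sketch assumes the usual <code class="language-plaintext highlighter-rouge">c_rehash</code> naming (8 hex digits, a dot, a counter); <code class="language-plaintext highlighter-rouge">dangling_hash_links</code> is a hypothetical helper and does not reproduce Qt’s own lookup logic:</p>

```python
import os
import re

# c_rehash-style names, e.g. 8d33f237.0
HASH_LINK = re.compile(r"^[0-9a-f]{8}\.\d+$")


def dangling_hash_links(cert_dir):
    dangling = []
    for name in sorted(os.listdir(cert_dir)):
        path = os.path.join(cert_dir, name)
        if not HASH_LINK.match(name) or not os.path.islink(path):
            continue
        # os.path.exists() follows symlinks, so it is False for a broken link
        if not os.path.exists(path):
            dangling.append(name)
    return dangling
```

<p>Calling it on the directories seen in the trace, such as <code class="language-plaintext highlighter-rouge">dangling_hash_links("/etc/ssl/certs")</code>, returns the names of any broken links found there.</p>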

<h1 id="solution">Solution</h1>

<p>Keywords <code class="language-plaintext highlighter-rouge">qnetworkaccessmanager add ssl ca cert</code> eventually led to a <a href="https://qgis.org/api/2.18/qgsnetworkaccessmanager_8cpp_source.html">snippet</a> that hinted at how to set the certificates for requests:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">QSslConfiguration</span> <span class="nf">sslconfig</span><span class="p">(</span> <span class="n">pReq</span><span class="o">-&gt;</span><span class="n">sslConfiguration</span><span class="p">()</span> <span class="p">);</span>
<span class="n">sslconfig</span><span class="p">.</span><span class="n">setCaCertificates</span><span class="p">(</span> <span class="n">QgsAuthManager</span><span class="o">::</span><span class="n">instance</span><span class="p">()</span><span class="o">-&gt;</span><span class="n">getTrustedCaCertsCache</span><span class="p">()</span> <span class="p">);</span>
<span class="c1">// [...]</span>
<span class="n">pReq</span><span class="o">-&gt;</span><span class="n">setSslConfiguration</span><span class="p">(</span> <span class="n">sslconfig</span> <span class="p">);</span>
</code></pre></div></div>

<p>Now, how to get the system certificates? One way to find out is to download the qtbase sources, which contain the QSslConfiguration class (adapt to your favourite distro):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">sudo </span>dnf debuginfo-install qt5-qtbase
</code></pre></div></div>

<p>Then, <code class="language-plaintext highlighter-rouge">grep -rin cacertificate</code> matches this function definition:</p>

<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*!
    \since 5.5

    This function provides the CA certificate database
    provided by the operating system. The CA certificate database
    returned by this function is used to initialize the database
    returned by caCertificates() on the default QSslConfiguration.

    \sa caCertificates(), setCaCertificates(), defaultConfiguration(),
    addCaCertificate(), addCaCertificates()
*/</span>
<span class="n">QList</span><span class="o">&lt;</span><span class="n">QSslCertificate</span><span class="o">&gt;</span> <span class="n">QSslConfiguration</span><span class="o">::</span><span class="n">systemCaCertificates</span><span class="p">()</span>
<span class="p">{</span>
    <span class="c1">// we are calling ensureInitialized() in the method below</span>
    <span class="k">return</span> <span class="n">QSslSocketPrivate</span><span class="o">::</span><span class="n">systemCaCertificates</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Finally, we have all the parts to go to our client app’s request building function and add these system certificates:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gh">diff --git a/src/libs/core/networkaccessmanager.cpp b/src/libs/core/networkaccessmanager.cpp
index 95200f9..26b88ed 100644
</span><span class="gd">--- a/src/libs/core/networkaccessmanager.cpp
</span><span class="gi">+++ b/src/libs/core/networkaccessmanager.cpp
</span><span class="p">@@ -71,5 +71,9 @@</span> QNetworkReply *NetworkAccessManager::createRequest(QNetworkAccessManager::Operat
         op = QNetworkAccessManager::GetOperation;
     }
<span class="gi">+
+    QSslConfiguration sslConfig = overrideRequest.sslConfiguration();
+    sslConfig.setCaCertificates(QSslConfiguration::systemCaCertificates());
+    overrideRequest.setSslConfiguration(sslConfig);
</span>
     return QNetworkAccessManager::createRequest(op, overrideRequest, outgoingData);
 }
</code></pre></div></div>

<p>With these changes, requests are now successful.</p>
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Alternatively, we could compile Qt sources with some <a href="https://gist.github.com/jeckhart/b2d4af50e371ed4c20c13cabb11e0f15">debug macro definitions</a>, maybe <a href="https://www.qt.io/blog/2014/12/04/how-to-help-qt-support-help-you-more-efficiently">a lot of them</a> to understand where exactly the validation logic fails. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">[return]</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>SSL errors seem to be frequent enough in Qt apps that someone bothered to write <a href="https://www.kdab.com/wp-content/uploads/stories/slides/DD12/Using-SSL-the-right-way-with-qt.odp">slides on this topic</a>. More <a href="https://maulwuff.de/research/ssl-debugging.html">generic guides</a> can also be found. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">[return]</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><category term="bugfix" /><category term="networking" /><category term="protocols" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">TCP By Disk</title><link href="https://nevesnunes.github.io/blog/2021/10/12/TCP-By-Disk.html" rel="alternate" type="text/html" title="TCP By Disk" /><published>2021-10-12T18:10:46+01:00</published><updated>2021-10-12T18:10:46+01:00</updated><id>https://nevesnunes.github.io/blog/2021/10/12/TCP-By-Disk</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/10/12/TCP-By-Disk.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<p>Ever wanted TCP, but instead of directly connecting a client to a server with sockets, you process requests and responses by writing and reading files? What do you mean “no”? Let me give you one contrived use case.</p>

<p>Suppose you are connecting to a Windows host via RDP. Only the RDP port is open to the internet, and the host can only reach other hosts on a private network. Apparently, there’s no way to tunnel connections like you would with e.g. SSH port forwarding.</p>

<p>However, it’s possible to have a filesystem share under <code class="language-plaintext highlighter-rouge">\\tsclient</code>, which is reachable from the guest host. A cursed hypothesis comes to mind: <strong>can we reliably redirect TCP connections through files?</strong></p>

<p>In the following sections, we will put together some implementations of this filesystem based relay.</p>

<h1 id="setup">Setup</h1>

<p>We will start with a simple localhost scenario. Then, we will bring up remote hosts on virtual machines, configured with a bridged network, and a mounted share that both local and remote hosts can write to.</p>

<p>It’s possible to relay TCP connections to files using socat. What is missing is some coordination in how to parse requests and responses, which will vary across implementations.</p>

<h1 id="scenarios">Scenarios</h1>

<h2 id="one-shot-get-localhost-client-and-server">One-shot GET, localhost client and server</h2>

<p><strong>Server script</strong> (echo.py):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">flask</span> <span class="kn">import</span> <span class="n">Flask</span>

<span class="n">app</span> <span class="o">=</span> <span class="n">Flask</span><span class="p">(</span><span class="n">__name__</span><span class="p">)</span>

<span class="o">@</span><span class="n">app</span><span class="p">.</span><span class="n">route</span><span class="p">(</span><span class="s">"/&lt;text&gt;"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s">"GET"</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">echo</span><span class="p">(</span><span class="n">text</span><span class="p">):</span>
    <span class="k">return</span> <span class="sa">f</span><span class="s">"You said (len = </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">text</span><span class="p">)</span><span class="si">}</span><span class="s">): </span><span class="si">{</span><span class="n">text</span><span class="si">}</span><span class="s">"</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="n">app</span><span class="p">.</span><span class="n">run</span><span class="p">()</span>
</code></pre></div></div>

<p><strong>Server session</strong>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Terminal 1</span>
./echo.py
<span class="c"># Terminal 2</span>
<span class="nb">rm</span> <span class="nt">-f</span> request response hello bye
<span class="k">while </span><span class="nb">true</span><span class="p">;</span> <span class="k">do
    </span>socat <span class="nt">-v</span> <span class="nt">-d</span> <span class="nt">-d</span> <span class="se">\</span>
        FILE:request,creat,ignoreeof,trunc <span class="se">\</span>
        TCP:localhost:5000,retry<span class="o">=</span>10,reuseaddr
    : <span class="o">&gt;</span> hello
    <span class="nb">echo</span> <span class="s2">"bye"</span> <span class="o">&gt;</span> bye
    <span class="nb">tail</span> <span class="nt">-F</span> hello | <span class="nb">grep</span> <span class="nt">-qm1</span> <span class="nb">.</span>
<span class="k">done</span>
</code></pre></div></div>

<p><strong>Client session</strong>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Terminal 1</span>
<span class="k">while </span><span class="nb">true</span><span class="p">;</span> <span class="k">do 
    </span><span class="nb">tail</span> <span class="nt">-F</span> bye | <span class="nb">grep</span> <span class="nt">-qm1</span> <span class="nb">.</span>
    <span class="nb">dd </span><span class="k">if</span><span class="o">=</span>request <span class="nv">of</span><span class="o">=</span>response <span class="nv">bs</span><span class="o">=</span>1 <span class="nv">skip</span><span class="o">=</span><span class="s2">"</span><span class="si">$(</span><span class="nb">cat</span> ./request_len<span class="si">)</span><span class="s2">"</span> <span class="o">&gt;</span>/dev/null 2&gt;&amp;1
    <span class="nb">cat </span>response
    : <span class="o">&gt;</span> bye
<span class="k">done</span>
<span class="c"># Terminal 2</span>
<span class="nb">echo</span> <span class="s2">"hello"</span> <span class="o">&gt;</span> hello
<span class="nb">sleep </span>1  <span class="c"># wait for socat to open and truncate ./request</span>
<span class="nv">data</span><span class="o">=</span><span class="s1">'GET / HTTP/1.1\r\n\r\n'</span>
<span class="c"># printf '%b' expands the \r\n escapes in any POSIX shell and adds no trailing newline,</span>
<span class="c"># so request_len matches the bytes actually written to ./request</span>
<span class="nb">printf</span> <span class="s1">'%b'</span> <span class="s2">"</span><span class="nv">$data</span><span class="s2">"</span> | <span class="nb">wc</span> <span class="nt">-c</span> | <span class="nb">tr</span> <span class="nt">-d</span> <span class="s1">' '</span> <span class="o">&gt;</span> request_len
<span class="nb">printf</span> <span class="s1">'%b'</span> <span class="s2">"</span><span class="nv">$data</span><span class="s2">"</span> <span class="o">&gt;</span> request
</code></pre></div></div>

<p>Files used for coordination:</p>

<ul>
  <li>./request: payload of client HTTP request;</li>
  <li>./request_len: payload length of client HTTP request;</li>
  <li>./response: payload of server HTTP response;</li>
  <li>./hello: written when a new request has been written, signaling the server to read the request file;</li>
  <li>./bye: written when a new response has been written, signaling the client to read the response file.</li>
</ul>

<p>Conversation flow:</p>

<ol>
  <li>Client (Terminal 2) writes to ./hello, stores payload length in ./request_len, writes payload in ./request;</li>
  <li>Server reads request from ./request, writes response to ./request, writes to ./bye, waits for next write to ./hello;</li>
  <li>Client (Terminal 1) reads response from ./request, starting at ./request_len bytes offset.</li>
</ol>

<p>In this scenario, we avoid depending on socat for the client, but due to the server reading and writing to the same request file, we have to identify the response bytes offset in that file to extract just the response. We will see in the next scenarios that also using socat in the client avoids this manual processing.</p>
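
<p>The offset handling can be sketched in isolation. Below, an append to the request file stands in for socat writing the server’s response, and the client reads everything past the request length, like the <code class="language-plaintext highlighter-rouge">dd</code> call with <code class="language-plaintext highlighter-rouge">skip</code> does. The function names and the canned response are made up for illustration:</p>

```python
import os
import tempfile


def extract_response(request_path, request_len):
    # the relay appends the response to the request file,
    # so everything past the request payload is the response
    with open(request_path, "rb") as f:
        f.seek(request_len)
        return f.read()


def simulate(workdir):
    request = b"GET / HTTP/1.1\r\n\r\n"
    request_path = os.path.join(workdir, "request")
    # client: write the payload and remember its length (./request_len)
    with open(request_path, "wb") as f:
        f.write(request)
    request_len = len(request)
    # stand-in for socat + server: append a response to the same file
    with open(request_path, "ab") as f:
        f.write(b"HTTP/1.1 200 OK\r\n\r\nhi")
    return extract_response(request_path, request_len)


with tempfile.TemporaryDirectory() as workdir:
    response = simulate(workdir)
print(response)
```

<p>If the stored length does not match the bytes actually written to the request file, the extracted response starts at the wrong place, so both values need to come from the same payload.</p>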

<h2 id="many-posts-localhost-client-and-server">Many POSTs, localhost client and server</h2>

<p><strong>Server script</strong> (echo.py):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">flask</span> <span class="kn">import</span> <span class="n">Flask</span><span class="p">,</span> <span class="n">request</span>

<span class="n">app</span> <span class="o">=</span> <span class="n">Flask</span><span class="p">(</span><span class="n">__name__</span><span class="p">)</span>

<span class="o">@</span><span class="n">app</span><span class="p">.</span><span class="n">route</span><span class="p">(</span><span class="s">"/raw"</span><span class="p">,</span> <span class="n">methods</span><span class="o">=</span><span class="p">[</span><span class="s">"POST"</span><span class="p">])</span>
<span class="k">def</span> <span class="nf">echo_raw</span><span class="p">():</span>
    <span class="n">text</span> <span class="o">=</span> <span class="n">request</span><span class="p">.</span><span class="n">get_data</span><span class="p">()</span>
    <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"You said (len = </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">text</span><span class="p">)</span><span class="si">}</span><span class="s">): </span><span class="si">{</span><span class="n">text</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">text</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="n">app</span><span class="p">.</span><span class="n">run</span><span class="p">()</span>
</code></pre></div></div>

<p><strong>Client script</strong> (echo_client.py):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">time</span>
<span class="kn">import</span> <span class="nn">urllib</span>
<span class="kn">from</span> <span class="nn">requests.adapters</span> <span class="kn">import</span> <span class="n">HTTPAdapter</span>
<span class="kn">from</span> <span class="nn">urllib3.util.retry</span> <span class="kn">import</span> <span class="n">Retry</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">"__main__"</span><span class="p">:</span>
    <span class="n">host</span> <span class="o">=</span> <span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    <span class="n">port</span> <span class="o">=</span> <span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>

    <span class="n">retries</span> <span class="o">=</span> <span class="n">Retry</span><span class="p">(</span><span class="n">total</span><span class="o">=</span><span class="mi">20</span><span class="p">,</span> <span class="n">backoff_factor</span><span class="o">=</span><span class="mf">0.5</span><span class="p">)</span>
    <span class="n">s</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="n">Session</span><span class="p">()</span>
    <span class="n">s</span><span class="p">.</span><span class="n">mount</span><span class="p">(</span><span class="s">"http://"</span><span class="p">,</span> <span class="n">HTTPAdapter</span><span class="p">(</span><span class="n">max_retries</span><span class="o">=</span><span class="n">retries</span><span class="p">))</span>
    <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
        <span class="n">text</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">urandom</span><span class="p">(</span><span class="mi">32</span><span class="p">)</span>
        <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Gonna say (len = </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">text</span><span class="p">)</span><span class="si">}</span><span class="s">): </span><span class="si">{</span><span class="n">text</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
        <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
            <span class="k">try</span><span class="p">:</span>
                <span class="n">r</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">post</span><span class="p">(</span><span class="sa">f</span><span class="s">"http://</span><span class="si">{</span><span class="n">host</span><span class="si">}</span><span class="s">:</span><span class="si">{</span><span class="n">port</span><span class="si">}</span><span class="s">/raw"</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="n">text</span><span class="p">)</span>
                <span class="k">break</span>
            <span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
                <span class="k">pass</span>
        <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">"Got (len = </span><span class="si">{</span><span class="nb">len</span><span class="p">(</span><span class="n">r</span><span class="p">.</span><span class="n">content</span><span class="p">)</span><span class="si">}</span><span class="s">): </span><span class="si">{</span><span class="n">r</span><span class="p">.</span><span class="n">content</span><span class="si">}</span><span class="s">"</span><span class="p">)</span>
        <span class="k">assert</span> <span class="nb">len</span><span class="p">(</span><span class="n">r</span><span class="p">.</span><span class="n">content</span><span class="p">)</span> <span class="o">==</span> <span class="nb">len</span><span class="p">(</span><span class="n">text</span><span class="p">)</span>
</code></pre></div></div>

<p><strong>Client session</strong>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Terminal 1</span>
./echo_client.py
<span class="c"># Terminal 2</span>
<span class="k">while </span><span class="nb">true</span><span class="p">;</span> <span class="k">do
    </span>socat <span class="nt">-v</span> <span class="nt">-d</span> <span class="nt">-d</span> <span class="se">\</span>
        TCP-LISTEN:5001,retry<span class="o">=</span>10,reuseaddr <span class="se">\</span>
        FILE:request,creat,ignoreeof,trunc
    <span class="nb">echo</span> <span class="s2">"hello"</span> <span class="o">&gt;</span> hello
    <span class="nb">tail</span> <span class="nt">-F</span> bye | <span class="nb">grep</span> <span class="nt">-qm1</span> <span class="nb">.</span>
    : <span class="o">&gt;</span> bye
<span class="k">done</span>
</code></pre></div></div>

<p>To ensure we can send and receive arbitrary bytes, we switch to continuously sent POST requests. Sending the payload in GET requests would require URL-encoding it, which <a href="https://stackoverflow.com/questions/417142/what-is-the-maximum-length-of-a-url-in-different-browsers">isn’t reliable</a>, since URL length limits vary between implementations. Because socat truncates the file on both the server and the client, request and response payloads are never present in the file at the same time, so we no longer have to manually extract responses by offset, which simplifies the client loop.</p>
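<p>To make the URL-encoding overhead concrete, here is a minimal sketch using only the Python standard library. The helper name <code class="language-plaintext highlighter-rouge">url_encode_payload</code> is hypothetical and not part of the scripts above; it only illustrates that each byte outside the unreserved set grows to three characters, so a binary payload can take up to three times its size in a URL.</p>

```python
from urllib.parse import quote, unquote_to_bytes


def url_encode_payload(payload):
    """Percent-encode raw bytes so they could be carried in a URL (hypothetical helper)."""
    # safe="" also encodes "/", so every byte outside the unreserved set becomes %XX.
    return quote(payload, safe="")


def worst_case_length(payload):
    """Upper bound on the encoded size: three characters per input byte."""
    return 3 * len(payload)


if __name__ == "__main__":
    sample = bytes([0]) * 32  # same size as the 32 random bytes the client sends
    encoded = url_encode_payload(sample)
    print(len(sample), "bytes become", len(encoded), "characters")
    # The encoding is reversible, so the concern is size, not data loss.
    assert unquote_to_bytes(encoded) == sample
```

<p>A POST body carries the same 32 bytes unchanged, which is why the client script passes them through <code class="language-plaintext highlighter-rouge">data=text</code>.</p>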

<h2 id="many-posts-remote-linux-server">Many POSTs, remote Linux server</h2>

<p><strong>Client session</strong>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Terminal 1</span>
./echo_client.py
<span class="c"># Terminal 2</span>
<span class="nv">share</span><span class="o">=</span><span class="nv">$HOME</span>/share
<span class="k">while </span><span class="nb">true</span><span class="p">;</span> <span class="k">do
    </span>socat <span class="nt">-v</span> <span class="nt">-d</span> <span class="nt">-d</span> <span class="se">\</span>
        TCP-LISTEN:5001,retry<span class="o">=</span>10,reuseaddr <span class="se">\</span>
        FILE:<span class="s2">"</span><span class="nv">$share</span><span class="s2">"</span>/request,creat,ignoreeof,trunc
    <span class="nb">echo</span> <span class="s2">"hello"</span> <span class="o">&gt;</span> <span class="s2">"</span><span class="nv">$share</span><span class="s2">"</span>/hello
    <span class="nb">tail</span> <span class="nt">-F</span> <span class="s2">"</span><span class="nv">$share</span><span class="s2">"</span>/bye | <span class="nb">grep</span> <span class="nt">-qm1</span> <span class="nb">.</span>
    : <span class="o">&gt;</span> <span class="s2">"</span><span class="nv">$share</span><span class="s2">"</span>/bye
<span class="k">done</span>
</code></pre></div></div>

<p>Instead of writing to files in a local directory, we write to files in a shared directory, which the remote server reads from.</p>

<h2 id="many-posts-remote-windows-server">Many POSTs, remote Windows server</h2>

<p><strong>Server session</strong>:</p>

<div class="language-powershell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">New-Item</span><span class="w"> </span><span class="nt">-ItemType</span><span class="w"> </span><span class="nx">file</span><span class="w"> </span><span class="nt">-ErrorAction</span><span class="w"> </span><span class="nx">SilentlyContinue</span><span class="w"> </span><span class="nx">hello</span><span class="p">,</span><span class="nx">bye</span><span class="p">,</span><span class="nx">request</span><span class="p">,</span><span class="nx">response</span><span class="w">
</span><span class="kr">do</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="c"># Workaround for "Invalid function" thrown by: `get-content hello -totalcount 1 -wait`</span><span class="w">
    </span><span class="kr">while</span><span class="w"> </span><span class="p">((</span><span class="n">gci</span><span class="w"> </span><span class="nx">hello</span><span class="p">)</span><span class="o">.</span><span class="nf">length</span><span class="w"> </span><span class="o">-eq</span><span class="w"> </span><span class="mi">0</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="n">Start-Sleep</span><span class="w"> </span><span class="nt">-Milliseconds</span><span class="w"> </span><span class="nx">100</span><span class="w">
    </span><span class="p">}</span><span class="w">
    </span><span class="o">&amp;</span><span class="w"> </span><span class="nv">$</span><span class="nn">env</span><span class="p">:</span><span class="nv">USERPROFILE</span><span class="n">\Downloads\socat\socat.exe</span><span class="w"> </span><span class="nt">-v</span><span class="w"> </span><span class="nt">-d</span><span class="w"> </span><span class="nt">-d</span><span class="w"> </span><span class="nx">FILE:request</span><span class="p">,</span><span class="nx">creat</span><span class="p">,</span><span class="nx">ignoreeof</span><span class="p">,</span><span class="nx">trunc</span><span class="w"> </span><span class="nx">TCP:localhost:5000</span><span class="p">,</span><span class="nx">retry</span><span class="o">=</span><span class="mi">10</span><span class="w">
    </span><span class="c"># Workaround for UTF BOM added by: `echo $null &gt; hello`</span><span class="w">
    </span><span class="n">New-Item</span><span class="w"> </span><span class="nt">-ItemType</span><span class="w"> </span><span class="nx">file</span><span class="w"> </span><span class="nt">-Force</span><span class="w"> </span><span class="nx">hello</span><span class="w">
    </span><span class="n">echo</span><span class="w"> </span><span class="s2">"bye"</span><span class="w"> </span><span class="err">&gt;</span><span class="w"> </span><span class="nx">bye</span><span class="w">
</span><span class="p">}</span><span class="w"> </span><span class="kr">while</span><span class="w"> </span><span class="p">(</span><span class="bp">$true</span><span class="p">)</span><span class="w">
</span></code></pre></div></div>

<p>Since the remote host is Windows, our shell commands are now written in PowerShell, but the coordination logic is equivalent.</p>
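<p>The hello/bye coordination shared by the bash and PowerShell loops can be sketched in Python. This is a simplified model under stated assumptions, not a replacement for the socat relay: the function names are hypothetical, the server is a thread instead of a remote host, and the response simply replaces the request in the same file. Only the signalling order (write request, raise <code class="language-plaintext highlighter-rouge">hello</code>, wait for <code class="language-plaintext highlighter-rouge">bye</code>, reset the signal) mirrors the sessions above.</p>

```python
import os
import tempfile
import threading
import time


def wait_nonempty(path, attempts=500, delay=0.01):
    """Poll until the file has content, like `tail -F bye | grep -qm1 .`."""
    for _ in range(attempts):
        if os.path.exists(path) and os.path.getsize(path) != 0:
            return
        time.sleep(delay)
    raise TimeoutError(f"no signal on {path}")


def write_file(path, data):
    """Replace the file contents (mode "wb" truncates before writing)."""
    with open(path, "wb") as f:
        f.write(data)


def read_file(path):
    with open(path, "rb") as f:
        return f.read()


def paths(share):
    return [os.path.join(share, name) for name in ("hello", "bye", "request")]


def serve_once(share, handler):
    """Server side: wait for hello, answer in place, reset hello, raise bye."""
    hello, bye, request = paths(share)
    wait_nonempty(hello)
    write_file(request, handler(read_file(request)))  # assumption: response replaces request
    write_file(hello, b"")  # like `New-Item -ItemType file -Force hello`
    write_file(bye, b"bye\n")


def request_once(share, payload):
    """Client side: publish the request, raise hello, wait for bye, reset bye."""
    hello, bye, request = paths(share)
    write_file(request, payload)
    write_file(hello, b"hello\n")
    wait_nonempty(bye)
    response = read_file(request)
    write_file(bye, b"")  # like `: > bye`
    return response


def demo():
    with tempfile.TemporaryDirectory() as share:
        for path in paths(share):
            write_file(path, b"")
        server = threading.Thread(target=serve_once, args=(share, bytes.upper))
        server.start()
        response = request_once(share, b"ping")
        server.join()
        return response


if __name__ == "__main__":
    print(demo())
```

<p>Because each side resets the signal it consumed, the same three files can be reused for the next request, which is what lets both shell loops run indefinitely.</p>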

<h1 id="further-work">Further Work</h1>

<ul>
  <li>Spinning up socat instances for every request is slow, but we can’t just have a socat listening with option <code class="language-plaintext highlighter-rouge">fork</code>, since we require consistent truncation of the same request file. An alternative would probably involve replacing socat with some script that does this relay, but manages files in a more flexible way;</li>
  <li>Add <a href="https://stackoverflow.com/questions/17480967/using-socat-to-multiplex-incoming-tcp-connection">request multiplexing</a>. Currently, only one client at a time is supported, otherwise the request file would end up mixing payloads from distinct clients.</li>
</ul>]]></content><author><name></name></author><category term="networking" /><category term="relays" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">CTF Writeup - TSG CTF 2021 - 2 Reversing Tasks</title><link href="https://nevesnunes.github.io/blog/2021/10/03/CTF-Writeup-TSG-CTF-2021-2-Reversing-Tasks.html" rel="alternate" type="text/html" title="CTF Writeup - TSG CTF 2021 - 2 Reversing Tasks" /><published>2021-10-03T11:26:35+01:00</published><updated>2021-10-03T11:26:35+01:00</updated><id>https://nevesnunes.github.io/blog/2021/10/03/CTF-Writeup---TSG-CTF-2021---2-Reversing-Tasks</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/10/03/CTF-Writeup-TSG-CTF-2021-2-Reversing-Tasks.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<h1 id="beginners-rev-2021">Beginner’s Rev 2021</h1>

<blockquote>
  <p>Don’t spend too much on reading the code. Once you get an idea of the behavior, I recommend you to try some dynamic analysis with various tools.</p>
</blockquote>

<p><a href="https://hackmd.io/@mikit/rkej4TLVK">Author’s Writeup</a>, <a href="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/beginners_rev">Download</a></p>

<h2 id="analysis">Analysis</h2>

<p>With <code class="language-plaintext highlighter-rouge">strace</code> we spot several calls to <code class="language-plaintext highlighter-rouge">fork()</code> and <code class="language-plaintext highlighter-rouge">wait()</code>, suggesting that some computation is carried out in child processes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f3109bfa850) = 1481328
 &gt; /usr/lib64/libc-2.33.so(__libc_fork+0x69) [0xccde9]
 &gt; beginners_rev(check+0x34) [0x31a4]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f3109bfa850) = 1481343
 &gt; /usr/lib64/libc-2.33.so(__libc_fork+0x69) [0xccde9]
 &gt; beginners_rev(check+0x34) [0x31a4]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f3109bfa850) = 1481344
 &gt; /usr/lib64/libc-2.33.so(__libc_fork+0x69) [0xccde9]
 &gt; beginners_rev(check+0x34) [0x31a4]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
[...]
wait4(-1, [{WIFEXITED(s) &amp;&amp; WEXITSTATUS(s) == 1}], 0, NULL) = 1481328
 &gt; /usr/lib64/libc-2.33.so(wait4+0x1a) [0xccaca]
 &gt; beginners_rev(check+0x7a) [0x31ea]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1481328, si_uid=1000, si_status=1, si_utime=0, si_stime=0} ---
 &gt; /usr/lib64/libc-2.33.so(wait4+0x1a) [0xccaca]
 &gt; beginners_rev(check+0x7a) [0x31ea]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
wait4(-1, [{WIFEXITED(s) &amp;&amp; WEXITSTATUS(s) == 1}], 0, NULL) = 1481343
 &gt; /usr/lib64/libc-2.33.so(wait4+0x1a) [0xccaca]
 &gt; beginners_rev(check+0x7a) [0x31ea]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1481343, si_uid=1000, si_status=1, si_utime=0, si_stime=0} ---
 &gt; /usr/lib64/libc-2.33.so(wait4+0x1a) [0xccaca]
 &gt; beginners_rev(check+0x7a) [0x31ea]
 &gt; /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
 &gt; beginners_rev(_start+0x2d) [0x11bd]
</code></pre></div></div>

<p>After providing some input via stdin, we verify from the output messages that it expects an input length of 32 characters.</p>

<p>From the decompilation, we see that <code class="language-plaintext highlighter-rouge">main()</code> calls <code class="language-plaintext highlighter-rouge">check()</code>, and each child process runs <code class="language-plaintext highlighter-rouge">is_correct()</code> over a given character of the passed input (at index <code class="language-plaintext highlighter-rouge">ki_pi</code>):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">do</span> <span class="p">{</span>
  <span class="n">_Var1</span> <span class="o">=</span> <span class="n">fork</span><span class="p">();</span>
  <span class="n">iVar2</span> <span class="o">=</span> <span class="n">iVar2</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">_Var1</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">iVar2</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="n">ki_pi</span> <span class="o">=</span> <span class="n">ki_pi</span> <span class="o">|</span> <span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="p">((</span><span class="n">byte</span><span class="p">)</span><span class="n">i</span> <span class="o">&amp;</span> <span class="mh">0x1f</span><span class="p">);</span>
    <span class="n">fd</span> <span class="o">=</span> <span class="n">open</span><span class="p">(</span><span class="s">"/dev/null"</span><span class="p">,</span><span class="mi">1</span><span class="p">);</span>
    <span class="n">dup2</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span><span class="mi">1</span><span class="p">);</span>
  <span class="p">}</span>
  <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="mi">5</span><span class="p">);</span>
<span class="n">i</span> <span class="o">=</span> <span class="n">iVar2</span> <span class="o">+</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="n">fd</span> <span class="o">=</span> <span class="n">is_correct</span><span class="p">((</span><span class="kt">int</span><span class="p">)</span><span class="o">*</span><span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)(</span><span class="n">input</span> <span class="o">+</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">ki_pi</span><span class="p">),</span><span class="n">ki_pi</span><span class="p">);</span>
<span class="n">flag</span> <span class="o">=</span> <span class="n">fd</span> <span class="o">==</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">iVar2</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
  <span class="k">do</span> <span class="p">{</span>
    <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
    <span class="n">wait</span><span class="p">(</span><span class="o">&amp;</span><span class="n">wstatus</span><span class="p">);</span>
    <span class="n">flag</span> <span class="o">=</span> <span class="n">flag</span> <span class="o">|</span> <span class="n">local_33</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">flag</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"correct"</span><span class="p">);</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
  <span class="n">puts</span><span class="p">(</span><span class="s">"wrong"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
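<p>To see why five iterations of this loop are enough to cover 32 characters, the fork tree can be simulated without forking. This is a minimal sketch that assumes the decompiled loop above is accurate; <code class="language-plaintext highlighter-rouge">simulate_check_forks</code> is a hypothetical name. Each simulated process carries its <code class="language-plaintext highlighter-rouge">ki_pi</code> index and the number of children it has to wait for (<code class="language-plaintext highlighter-rouge">iVar2</code>).</p>

```python
def simulate_check_forks(rounds=5):
    """Return (ki_pi, children_to_wait_for) for every process after the loop."""
    processes = [(0, 0)]  # the original process: index 0, no children yet
    for i in range(rounds):
        next_processes = []
        for ki_pi, forked in processes:
            # Parent branch of fork(): keeps its index, counts one more child.
            next_processes.append((ki_pi, forked + 1))
            # Child branch of fork(): sets bit i of its index, resets its counter.
            next_processes.append((ki_pi | 2 ** i, 0))
        processes = next_processes
    return processes


if __name__ == "__main__":
    result = simulate_check_forks()
    indices = sorted(ki_pi for ki_pi, _ in result)
    print("processes:", len(result))
    print("indices cover 0..31:", indices == list(range(32)))
    print("total wait() calls:", sum(forked for _, forked in result))
```

<p>Every process ends up with a distinct 5-bit index, so each one validates exactly one character of the 32-character input.</p>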

<h2 id="solution">Solution</h2>

<p>Since each character is being independently processed, it becomes feasible to just bruteforce the expected characters one-by-one. To verify if we got the right character, we need to trace the result of <code class="language-plaintext highlighter-rouge">is_correct()</code> for all processes. It’s possible to follow child processes in gdb. However, there’s an anti-debugging check in <code class="language-plaintext highlighter-rouge">is_correct()</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">in_stack_00000000</span> <span class="o">!=</span> <span class="mh">0x1031cf</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">fwrite</span><span class="p">(</span><span class="s">"This function may not work properly with a debugger."</span><span class="p">,</span><span class="mi">1</span><span class="p">,</span><span class="mh">0x34</span><span class="p">,</span><span class="n">stderr</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In assembly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0010128d 48 8b 74        MOV        RSI,qword ptr [RSP + 0x18] ; load return address
         24 18
00101292 48 8d 1d        LEA        RBX,[check] ; load start address of check()
         d7 1e 00 00
00101299 48 89 f0        MOV        RAX,RSI
0010129c 48 29 d8        SUB        RAX,RBX
0010129f 48 83 f8 5f     CMP        RAX,0x5f
001012a3 74 22           JZ         char_check
001012a5 48 8b 0d        MOV        RCX,qword ptr [stderr]
         74 4d 00 00
001012ac be 01 00        MOV        RSI,0x1
         00 00
001012b1 ba 34 00        MOV        EDX,0x34
         00 00
001012b6 48 8d 3d        LEA        input,[s_This_function_may_not_work_prope_0010   = "This function may not work pr
         4b 2d 00 00
001012bd e8 4e fe        CALL       &lt;EXTERNAL&gt;::fwrite                               size_t fwrite(void * __ptr, size
         ff ff
</code></pre></div></div>

<p>This checks that the difference between <code class="language-plaintext highlighter-rouge">0x1031cf</code> (the return address pushed onto the stack, which is the address of the first instruction after the call instruction at <code class="language-plaintext highlighter-rouge">0x1031ca</code>) and the start of <code class="language-plaintext highlighter-rouge">check()</code> is <code class="language-plaintext highlighter-rouge">0x5f</code> (<code class="language-plaintext highlighter-rouge">0x1031cf - 0x103170 = 0x5f</code>). However, if either the stack or the surrounding instructions are changed, this difference might also change.</p>
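<p>The arithmetic can be reproduced from the listing alone. The sketch below uses hypothetical helper names and takes every constant from the disassembly above: the <code class="language-plaintext highlighter-rouge">LEA</code> at <code class="language-plaintext highlighter-rouge">0x101292</code> is 7 bytes long with displacement <code class="language-plaintext highlighter-rouge">0x1ed7</code> (bytes <code class="language-plaintext highlighter-rouge">d7 1e 00 00</code>), and a near <code class="language-plaintext highlighter-rouge">CALL</code> is 5 bytes long.</p>

```python
LEA_ADDRESS = 0x101292   # LEA RBX,[check]
LEA_LENGTH = 7           # 48 8d 1d + 4-byte displacement
LEA_DISPLACEMENT = 0x1ed7
CALL_SITE = 0x1031ca     # call to is_correct() inside check()
CALL_LENGTH = 5          # opcode + 4-byte relative offset
EXPECTED_DELTA = 0x5f    # constant compared by CMP RAX,0x5f


def rip_relative_target(address, length, displacement):
    """RIP-relative operands are resolved from the address of the next instruction."""
    return address + length + displacement


def return_address(call_site, call_length=CALL_LENGTH):
    """The address pushed by a call is that of the instruction right after it."""
    return call_site + call_length


def passes_check(ret_addr, check_start):
    """Mirror of SUB RAX,RBX followed by CMP RAX,0x5f."""
    return ret_addr - check_start == EXPECTED_DELTA


if __name__ == "__main__":
    check_start = rip_relative_target(LEA_ADDRESS, LEA_LENGTH, LEA_DISPLACEMENT)
    ret = return_address(CALL_SITE)
    print(hex(check_start), hex(ret), passes_check(ret, check_start))
    # Any other return address on the stack fails the comparison.
    print(passes_check(ret + 1, check_start))
```

<p>Since both operands move together when the binary is relocated, the check is independent of the load address and only depends on which return address sits on the stack.</p>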

<p>The expected return address is also used multiple times during character validation:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0010147f 49 01 f0        ADD        R8,RSI
[...]
0010196e 49 01 f0        ADD        R8,RSI
[...]
00102faa 48 89 f0        MOV        RAX,RSI
</code></pre></div></div>

<p>Therefore, running in a debugger could cause even the expected characters to fail validation.</p>

<p>As an alternative to running in a debugger, we can dynamically instrument the process using frida. We’ll adapt an existing <a href="https://github.com/frida/frida-python/blob/master/examples/child_gating.py">full example on instrumenting child processes</a>, which is accompanied by a <a href="https://frida.re/news/2018/04/28/frida-10-8-released/">high-level description</a>. Let’s focus on the key parts of both our <a href="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/frida_session.py">client script</a> and <a href="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/frida_trace.js">instrumentation script</a>.</p>

<p>Our first attempt was to hook the entry of the input check function <code class="language-plaintext highlighter-rouge">is_correct()</code>:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">m</span> <span class="o">=</span> <span class="nx">Process</span><span class="p">.</span><span class="nx">enumerateModules</span><span class="p">()[</span><span class="mi">0</span><span class="p">];</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">Base address: </span><span class="dl">'</span> <span class="o">+</span> <span class="nx">m</span><span class="p">.</span><span class="nx">base</span><span class="p">);</span>

<span class="kd">var</span> <span class="nx">char</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
<span class="kd">var</span> <span class="nx">char_i</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
<span class="kd">var</span> <span class="nx">is_correct</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span>
<span class="nx">Interceptor</span><span class="p">.</span><span class="nx">attach</span><span class="p">(</span><span class="nx">ptr</span><span class="p">(</span><span class="nx">m</span><span class="p">.</span><span class="nx">base</span><span class="p">.</span><span class="nx">add</span><span class="p">(</span><span class="mh">0x1280</span><span class="p">)),</span> <span class="p">{</span>
    <span class="na">onEnter</span><span class="p">:</span> <span class="kd">function</span><span class="p">(</span><span class="nx">args</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">char</span> <span class="o">=</span> <span class="nx">args</span><span class="p">[</span><span class="mi">0</span><span class="p">].</span><span class="nx">toInt32</span><span class="p">()</span>
        <span class="nx">char_i</span> <span class="o">=</span> <span class="nx">args</span><span class="p">[</span><span class="mi">1</span><span class="p">].</span><span class="nx">toInt32</span><span class="p">()</span>
        <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="s2">`is_correct(</span><span class="p">${</span><span class="nx">char</span><span class="p">}</span><span class="s2">, </span><span class="p">${</span><span class="nx">char_i</span><span class="p">}</span><span class="s2">)`</span><span class="p">);</span>
    <span class="p">},</span>
    <span class="na">onLeave</span><span class="p">:</span> <span class="kd">function</span><span class="p">(</span><span class="nx">retval</span><span class="p">)</span> <span class="p">{</span>
        <span class="kd">const</span> <span class="nx">v</span> <span class="o">=</span> <span class="nx">retval</span><span class="p">.</span><span class="nx">toInt32</span><span class="p">()</span>
        <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">-&gt; </span><span class="dl">"</span> <span class="o">+</span> <span class="nx">v</span><span class="p">);</span>
        <span class="nx">send</span><span class="p">([</span><span class="nx">char</span><span class="p">,</span> <span class="nx">char_i</span><span class="p">,</span> <span class="nx">v</span><span class="p">]);</span>
    <span class="p">}</span>
<span class="p">});</span>
</code></pre></div></div>

<p>But this trips the anti-debugging check. Instead, we have to instrument the instructions before and after the function call, reading the corresponding input and output registers:</p>

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">Interceptor</span><span class="p">.</span><span class="nx">attach</span><span class="p">(</span><span class="nx">ptr</span><span class="p">(</span><span class="nx">m</span><span class="p">.</span><span class="nx">base</span><span class="p">.</span><span class="nx">add</span><span class="p">(</span><span class="mh">0x31c4</span><span class="p">)),</span>
    <span class="kd">function</span><span class="p">(</span><span class="nx">args</span><span class="p">)</span> <span class="p">{</span>
        <span class="nx">char</span> <span class="o">=</span> <span class="nx">Memory</span><span class="p">.</span><span class="nx">readU8</span><span class="p">(</span><span class="k">this</span><span class="p">.</span><span class="nx">context</span><span class="p">.</span><span class="nx">r13</span><span class="p">.</span><span class="nx">add</span><span class="p">(</span><span class="k">this</span><span class="p">.</span><span class="nx">context</span><span class="p">.</span><span class="nx">rax</span><span class="p">))</span>
        <span class="kd">var</span> <span class="nx">rsi</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">context</span><span class="p">.</span><span class="nx">rsi</span>
        <span class="nx">char_i</span> <span class="o">=</span> <span class="nb">parseInt</span><span class="p">(</span><span class="nx">rsi</span><span class="p">)</span>
    <span class="p">}</span>
<span class="p">);</span>
<span class="nx">Interceptor</span><span class="p">.</span><span class="nx">attach</span><span class="p">(</span><span class="nx">ptr</span><span class="p">(</span><span class="nx">m</span><span class="p">.</span><span class="nx">base</span><span class="p">.</span><span class="nx">add</span><span class="p">(</span><span class="mh">0x31cf</span><span class="p">)),</span>
    <span class="kd">function</span><span class="p">(</span><span class="nx">args</span><span class="p">)</span> <span class="p">{</span>
        <span class="kd">var</span> <span class="nx">rax</span> <span class="o">=</span> <span class="k">this</span><span class="p">.</span><span class="nx">context</span><span class="p">.</span><span class="nx">rax</span>
        <span class="nx">is_correct</span> <span class="o">=</span> <span class="nb">parseInt</span><span class="p">(</span><span class="nx">rax</span><span class="p">)</span>
        <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">([</span><span class="nx">char</span><span class="p">,</span> <span class="nx">char_i</span><span class="p">,</span> <span class="nx">is_correct</span><span class="p">])</span>
        <span class="nx">send</span><span class="p">([</span><span class="nx">char</span><span class="p">,</span> <span class="nx">char_i</span><span class="p">,</span> <span class="nx">is_correct</span><span class="p">]);</span>
    <span class="p">}</span>
<span class="p">);</span>
</code></pre></div></div>

<div class="c-indirectly-related">
  <p>This alternative happens to work because the instrumentation patches don’t introduce instruction misalignments. To verify this:</p>

  <ol>
    <li>Run without ASLR (e.g. under a shell started with <code class="language-plaintext highlighter-rouge">setarch "$(uname -m)" -R /bin/bash</code>);</li>
    <li>Wait for the debugger at the end of our instrumentation script:
      <div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">while</span> <span class="p">(</span><span class="o">!</span><span class="nx">Process</span><span class="p">.</span><span class="nx">isDebuggerAttached</span><span class="p">())</span> <span class="p">{</span>
  <span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="dl">'</span><span class="s1">Waiting for debugger in PID:</span><span class="dl">'</span><span class="p">,</span> <span class="nx">Process</span><span class="p">.</span><span class="nx">id</span><span class="p">);</span>
  <span class="nx">Thread</span><span class="p">.</span><span class="nx">sleep</span><span class="p">(</span><span class="mi">5</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div>      </div>
    </li>
    <li>Attach gdb and dump around the call with <code class="language-plaintext highlighter-rouge">disas (0x555555554000+0x31c1),+30</code>;</li>
    <li>Attach another gdb to the executable without instrumentation and dump around the call;</li>
    <li>Compare these two listings, noticing that the call instruction and one of the instructions after the patched region (e.g. <code class="language-plaintext highlighter-rouge">lea  rbp,[rsp+0x4]</code>) keep the same addresses in both listings:
      <div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">--- gdb beginners_rev
</span><span class="gi">+++ gdb -p 1689554  # PID taken from script log
</span><span class="p">@@ -1,8 +1,9 @@</span>
 0x00005555555571c1 &lt;check+81&gt;:       xor    r12d,r12d
<span class="gd">-0x00005555555571c4 &lt;check+84&gt;:       movsx  edi,BYTE PTR [r13+rax*1+0x0]
</span><span class="gi">+0x00005555555571c4 &lt;check+84&gt;:       jmp    0x55555557d008
+0x00005555555571c9 &lt;check+89&gt;:       nop
</span> 0x00005555555571ca &lt;check+90&gt;:       call   0x555555555280 &lt;is_correct&gt;
<span class="gd">-0x00005555555571cf &lt;check+95&gt;:       test   eax,eax
-0x00005555555571d1 &lt;check+97&gt;:       sete   r12b
</span><span class="gi">+0x00005555555571cf &lt;check+95&gt;:       jmp    0x55555557d108
+0x00005555555571d4 &lt;check+100&gt;:      nop
</span> 0x00005555555571d5 &lt;check+101&gt;:      test   ebp,ebp
 0x00005555555571d7 &lt;check+103&gt;:      je     0x5555555571f8 &lt;check+136&gt;
 0x00005555555571d9 &lt;check+105&gt;:      lea    rbp,[rsp+0x4]
</code></pre></div>      </div>
    </li>
  </ol>
</div>

<p>The message sent by our instrumentation script is then parsed:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">_on_message</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">pid</span><span class="p">,</span> <span class="n">message</span><span class="p">):</span>
    <span class="n">char_i</span> <span class="o">=</span> <span class="n">message</span><span class="p">[</span><span class="s">"payload"</span><span class="p">][</span><span class="mi">1</span><span class="p">]</span>
    <span class="n">results</span><span class="p">[</span><span class="n">char_i</span><span class="p">]</span> <span class="o">=</span> <span class="n">message</span><span class="p">[</span><span class="s">"payload"</span><span class="p">]</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"* message: pid={}, payload={}"</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">pid</span><span class="p">,</span> <span class="n">message</span><span class="p">[</span><span class="s">"payload"</span><span class="p">]))</span>
</code></pre></div></div>

<p>And we track each character that was valid (i.e. <code class="language-plaintext highlighter-rouge">rax = 1</code>):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flag</span> <span class="o">=</span> <span class="p">[</span><span class="s">"?"</span><span class="p">]</span> <span class="o">*</span> <span class="mi">32</span>
<span class="k">for</span> <span class="n">c</span> <span class="ow">in</span> <span class="n">string</span><span class="p">.</span><span class="n">printable</span><span class="p">:</span>
    <span class="n">input_data</span> <span class="o">=</span> <span class="s">""</span><span class="p">.</span><span class="n">join</span><span class="p">([</span><span class="n">c</span><span class="p">]</span> <span class="o">*</span> <span class="mi">32</span><span class="p">)</span>
    <span class="n">results</span> <span class="o">=</span> <span class="p">[</span><span class="bp">None</span><span class="p">]</span> <span class="o">*</span> <span class="mi">32</span>

    <span class="n">app</span> <span class="o">=</span> <span class="n">Application</span><span class="p">()</span>
    <span class="n">app</span><span class="p">.</span><span class="n">run</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">result</span> <span class="ow">in</span> <span class="n">results</span><span class="p">:</span>
        <span class="k">if</span> <span class="n">result</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
            <span class="n">flag</span><span class="p">[</span><span class="n">result</span><span class="p">[</span><span class="mi">1</span><span class="p">]]</span> <span class="o">=</span> <span class="nb">chr</span><span class="p">(</span><span class="n">result</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
</code></pre></div></div>

<p>Joining all the found characters gives us the flag:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>TSGCTF{y0u_kN0w_m@ny_g0od_t0015}
</code></pre></div></div>

<hr />

<h1 id="optimized">optimized</h1>

<blockquote>
  <p>Decompilers hate this state-of-the-art math trick…</p>
</blockquote>

<p><a href="https://hackmd.io/@ishitatsuyuki/B1MDOgw4Y">Author’s Writeup</a>, <a href="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/optimized">Download</a></p>

<h2 id="analysis-1">Analysis</h2>

<p>Starting with <code class="language-plaintext highlighter-rouge">strace -k</code>, several operations modifying module maps can be spotted, and later on there’s the prompt for input:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mmap(0x800000, 2177040, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, 0, 0) = 0x800000
 &gt; optimized() [0x400b3f]
readlink("/proc/self/exe", "opti"..., 4096) = 37
mmap(0x400000, 2117632, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400000
mmap(0x400000, 13704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400000
mprotect(0x400000, 13704, PROT_READ|PROT_EXEC) = 0
mmap(0x603000, 4232, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0x3000) = 0x603000
mprotect(0x603000, 4232, PROT_READ|PROT_WRITE) = 0
open("/lib64/ld-linux-x86-64.so.2", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0&gt;\0\1\0\0\0\220\20\0\0\0\0\0\0"..., 1024) = 1024
mmap(NULL, 212992, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe434b50000
mmap(0x7fe434b50000, 3112, PROT_READ, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x7fe434b50000
mmap(0x7fe434b51000, 149910, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0x1000) = 0x7fe434b51000
mmap(0x7fe434b76000, 37436, PROT_READ, MAP_PRIVATE|MAP_FIXED, 3, 0x26000) = 0x7fe434b76000
mmap(0x7fe434b80000, 12344, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x2f000) = 0x7fe434b80000
close(3)                                = 0
munmap(0x800000, 2177040)               = 0
 &gt; optimized() [0x40000e]
[...]
write(1, "Enter password: ", 16)        = 16
 &gt; /usr/lib64/libc-2.33.so(write+0x17) [0xf1387]
 &gt; /usr/lib64/libc-2.33.so(_IO_file_write@@GLIBC_2.2.5+0x2c) [0x8178c]
 &gt; /usr/lib64/libc-2.33.so(new_do_write+0x65) [0x80b05]
 &gt; /usr/lib64/libc-2.33.so(_IO_do_write@@GLIBC_2.2.5+0x18) [0x82828]
 &gt; /usr/lib64/libc-2.33.so(_IO_file_sync@@GLIBC_2.2.5+0xa7) [0x80927]
 &gt; /usr/lib64/libc-2.33.so(_IO_fflush+0x85) [0x758a5]
 &gt; optimized() [0x40093f]
</code></pre></div></div>

<p>However, if we check the executable’s disassembly, we see that there’s no <code class="language-plaintext highlighter-rouge">write()</code> call at <code class="language-plaintext highlighter-rouge">0x40093f</code>, much less any input parsing logic:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                     entry
00400928 e8 53 02        CALL       FUN_00400b80
         00 00
0040092d 55              PUSH       RBP
0040092e 53              PUSH       RBX
0040092f 51              PUSH       RCX
00400930 52              PUSH       RDX
00400931 48 01 fe        ADD        RSI,RDI
00400934 56              PUSH       RSI
00400935 48 89 fe        MOV        RSI,RDI
00400938 48 89 d7        MOV        RDI,RDX
0040093b 31 db           XOR        EBX,EBX
0040093d 31 c9           XOR        ECX,ECX
0040093f 48 83 cd ff     OR         RBP,-0x1
00400943 e8 50 00        CALL       FUN_00400998
         00 00
00400948 01 db           ADD        EBX,EBX
0040094a 74 02           JZ         LAB_0040094e
0040094c f3 c3           RET
</code></pre></div></div>

<p>Seems like we have self-modifying code: a new executable map at <code class="language-plaintext highlighter-rouge">0x800000</code> is created (<code class="language-plaintext highlighter-rouge">mmap(0x800000, 2177040, PROT_READ|PROT_WRITE|PROT_EXEC, ...)</code>), the original executable map at <code class="language-plaintext highlighter-rouge">0x400000</code> is recreated as writable (<code class="language-plaintext highlighter-rouge">mmap(0x400000, 13704, PROT_READ|PROT_WRITE|PROT_EXEC, ...)</code>), and we can assume that new code is written there before the map is made read-only and executable again (<code class="language-plaintext highlighter-rouge">mprotect(0x400000, 13704, PROT_READ|PROT_EXEC)</code>). This unpacker also cleans up after itself, removing from memory the map that contains its code (<code class="language-plaintext highlighter-rouge">munmap(0x800000, 2177040)</code>).</p>
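
<p>As a rough aid for spotting this pattern in longer traces, the map changes can be filtered out of the <code class="language-plaintext highlighter-rouge">strace</code> output. This is only a sketch: <code class="language-plaintext highlighter-rouge">exec_map_events</code> is a hypothetical helper, and it assumes each syscall sits on a single line in the format shown above:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# Matches the start of an mmap()/mprotect() line as printed by strace:
# call name, address, size, and the PROT_* flags joined by "|".
MAP_CALL = re.compile(r"^(mmap|mprotect)\((0x[0-9a-f]+|NULL), (\d+), ([A-Z_|]+)")


def exec_map_events(strace_lines):
    """List (call, address, size, writable) for calls that request PROT_EXEC."""
    events = []
    for line in strace_lines:
        match = MAP_CALL.match(line)
        if match is None:
            # Not a map-related syscall (or a stack frame line from -k).
            continue
        call, address, size, prot = match.groups()
        flags = prot.split("|")
        if "PROT_EXEC" in flags:
            events.append((call, address, int(size), "PROT_WRITE" in flags))
    return events
</code></pre></div></div>

<p>Entries with the last field set to <code class="language-plaintext highlighter-rouge">True</code> are maps that are writable and executable at the same time, which is where an unpacker would be expected to place code.</p>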

<p>To locate the original entry point (i.e. the entry address of the original executable, before it was packed), we could <code class="language-plaintext highlighter-rouge">catch syscall munmap</code>, and follow manually from there. In an attempt to get closer than that, I ran the following gdb script, so that it would stop at the address leading up to the <code class="language-plaintext highlighter-rouge">write()</code> call after unpacking:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">gdb</span>
<span class="kn">import</span> <span class="nn">struct</span>

<span class="c1"># Before jumping to unpacker
</span><span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *0x400b7c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"r"</span><span class="p">)</span>

<span class="c1"># Unpacker has been written at this point, now we can break on it
</span><span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *0x800a3b"</span><span class="p">)</span>

<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
    <span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"si"</span><span class="p">)</span>
    <span class="n">rip</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">gdb</span><span class="p">.</span><span class="n">parse_and_eval</span><span class="p">(</span><span class="s">"$rip"</span><span class="p">)).</span><span class="n">split</span><span class="p">()[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">16</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">rip</span> <span class="o">==</span> <span class="mh">0x40093f</span><span class="p">:</span>
        <span class="c1"># Stepped up to write() call
</span>        <span class="k">break</span>
</code></pre></div></div>

<p>Besides taking several minutes, it seems that gdb just ends up going from map <code class="language-plaintext highlighter-rouge">0x800000</code> right into libc addresses, without following addresses in map <code class="language-plaintext highlighter-rouge">0x400000</code>. Furthermore, the changes in map protections also caused issues with placed breakpoints, which needed to be deleted after being hit. After these fixes, and finding address <code class="language-plaintext highlighter-rouge">0x800c8b</code> which is closer to the unpacking end, we could reliably stop inside map <code class="language-plaintext highlighter-rouge">0x400000</code>:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">gdb</span>
<span class="kn">import</span> <span class="nn">struct</span>

<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *0x400b7c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"r"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *0x800a3b"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"del 1"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"del 2"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *0x800c8b"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"del 3"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *0x40093f"</span><span class="p">)</span>
<span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
</code></pre></div></div>

<p>Alternatively, since we know that the process reads from input, we can just let it run until it waits at the prompt and attach to it afterwards, which would also work to dump the unpacked executable.</p>

<p>Let’s turn the <a href="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/optimized.dump">process memory into an ELF executable</a> using <a href="https://github.com/whatsbcn/skpd">skpd</a>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./skpd -p $(pgrep optimized) -o optimized.dump
</code></pre></div></div>

<p>And disassemble that:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Invalid file offset 15888 while reading optimized.dump
java.io.EOFException: Invalid file offset 15888 while reading optimized.dump
	at ghidra.app.util.bin.RandomAccessByteProvider.readBytes(RandomAccessByteProvider.java:140)
	at ghidra.app.util.bin.BinaryReader.readLong(BinaryReader.java:703)
	at ghidra.app.util.bin.BinaryReader.readNextLong(BinaryReader.java:338)
	at ghidra.app.util.bin.format.elf.ElfDynamic.initElfDynamic(ElfDynamic.java:83)
	at ghidra.app.util.bin.format.elf.ElfDynamic.createElfDynamic(ElfDynamic.java:66)
	at ghidra.app.util.bin.format.elf.ElfDynamicTable.initDynamicTable(ElfDynamicTable.java:71)
	at ghidra.app.util.bin.format.elf.ElfDynamicTable.createDynamicTable(ElfDynamicTable.java:48)
	at ghidra.app.util.bin.format.elf.ElfHeader.parseDynamicTable(ElfHeader.java:626)
	at ghidra.app.util.bin.format.elf.ElfHeader.parse(ElfHeader.java:221)
	at ghidra.app.util.opinion.ElfProgramBuilder.load(ElfProgramBuilder.java:110)
	at ghidra.app.util.opinion.ElfProgramBuilder.loadElf(ElfProgramBuilder.java:103)
	at ghidra.app.util.opinion.ElfLoader.load(ElfLoader.java:153)
</code></pre></div></div>

<p>Ok, some headers probably need fixing… Alternatively, we can just dump the executable map:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; dump memory out.0x400000.mem 0x400000 0x404000
</code></pre></div></div>
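
<p>For the attach-afterwards route, the same region can also be copied out of <code class="language-plaintext highlighter-rouge">/proc/PID/mem</code> without a debugger. A minimal sketch, assuming Linux and permission to read the target’s memory (same user with ptrace allowed, or root); both function names are hypothetical:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def parse_maps_line(line):
    """Return (start, end, perms) for one line of /proc/PID/maps."""
    addr_range, perms = line.split()[:2]
    start, end = (int(value, 16) for value in addr_range.split("-"))
    return start, end, perms


def dump_region(pid, start, end, out_path):
    """Copy the bytes in [start, end) of a process's memory into out_path."""
    # Unbuffered, so that we never read past the end of the mapped region.
    with open("/proc/{}/mem".format(pid), "rb", buffering=0) as mem:
        mem.seek(start)
        data = mem.read(end - start)
    with open(out_path, "wb") as out:
        out.write(data)
    # May be shorter than requested if the read was cut off.
    return len(data)
</code></pre></div></div>

<p>Calling <code class="language-plaintext highlighter-rouge">dump_region(pid, 0x400000, 0x404000, "out.0x400000.mem")</code> while the process waits at the prompt should give a file comparable to the one written by the pwndbg command.</p>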

<p>And load it as an overlay of the original disassembly (in Ghidra: <code class="language-plaintext highlighter-rouge">File &gt; Add To Program...</code> + <code class="language-plaintext highlighter-rouge">Options... &gt; Check: Overlay, Base Address = 0x400000</code>), resulting in a new map:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/map.png" alt="" />
</div>

<p>Now, we manually decompile around address <code class="language-plaintext highlighter-rouge">0x40093f</code>, revealing the flag checks:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">printf</span><span class="p">(</span><span class="s">"Enter password: "</span><span class="p">);</span>
<span class="n">FUN_segment_0b__00400820</span><span class="p">(</span><span class="n">_DAT_006040a8</span><span class="p">);</span>
<span class="n">iVar1</span> <span class="o">=</span> <span class="n">scanf</span><span class="p">(</span><span class="s">"%u %u %u %u"</span><span class="p">,</span><span class="o">&amp;</span><span class="n">v1</span><span class="p">,</span><span class="o">&amp;</span><span class="n">v2</span><span class="p">,</span><span class="o">&amp;</span><span class="n">v3</span><span class="p">,</span><span class="o">&amp;</span><span class="n">v4</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">iVar1</span> <span class="o">==</span> <span class="mi">4</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">uVar4</span> <span class="o">=</span> <span class="n">SUB164</span><span class="p">(</span><span class="n">ZEXT816</span><span class="p">((</span><span class="n">ulong</span><span class="p">)</span><span class="n">v1</span> <span class="o">*</span> <span class="mh">0x5f50ddca7b17</span><span class="p">)</span> <span class="o">*</span> <span class="n">ZEXT816</span><span class="p">(</span><span class="mh">0x2af91</span><span class="p">)</span> <span class="o">&gt;&gt;</span> <span class="mh">0x40</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0x3ffff</span><span class="p">;</span>
  <span class="k">if</span> <span class="p">(</span><span class="nb">false</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">uVar4</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="k">if</span> <span class="p">(</span><span class="nb">false</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">uVar7</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
  <span class="p">}</span>
  <span class="k">else</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="nb">false</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">uVar7</span> <span class="o">=</span> <span class="mh">0x9569</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">else</span> <span class="p">{</span>
      <span class="n">uVar7</span> <span class="o">=</span> <span class="mh">0x9569</span><span class="p">;</span>
    <span class="p">}</span>
  <span class="p">}</span>
  <span class="n">auVar6</span> <span class="o">=</span> <span class="n">CONCAT115</span><span class="p">(</span><span class="mh">0xff</span><span class="p">,</span><span class="n">CONCAT114</span><span class="p">(</span><span class="mh">0xff</span><span class="p">,</span><span class="n">CONCAT113</span><span class="p">(</span><span class="mh">0xff</span><span class="p">,</span><span class="n">CONCAT112</span><span class="p">(</span><span class="mh">0xff</span><span class="p">,</span><span class="n">CONCAT111</span><span class="p">(</span><span class="mh">0xff</span><span class="p">,</span><span class="n">CONCAT110</span><span class="p">(</span><span class="o">-</span><span class="p">(</span>
                                                <span class="p">(</span><span class="kt">char</span><span class="p">)(</span><span class="n">uVar4</span> <span class="o">&gt;&gt;</span> <span class="mh">0x10</span><span class="p">)</span> <span class="o">==</span> <span class="sc">'\0'</span><span class="p">),</span>
                                                <span class="n">CONCAT19</span><span class="p">(</span><span class="o">-</span><span class="p">((</span><span class="kt">char</span><span class="p">)((</span><span class="n">ushort</span><span class="p">)</span><span class="n">uVar7</span> <span class="o">&gt;&gt;</span> <span class="mi">8</span><span class="p">)</span> <span class="o">==</span>
                                                          <span class="p">(</span><span class="kt">char</span><span class="p">)(</span><span class="n">uVar4</span> <span class="o">&gt;&gt;</span> <span class="mi">8</span><span class="p">)),</span>
                                                         <span class="n">CONCAT18</span><span class="p">(</span><span class="o">-</span><span class="p">((</span><span class="kt">char</span><span class="p">)</span><span class="n">uVar7</span> <span class="o">==</span> <span class="p">(</span><span class="kt">char</span><span class="p">)</span><span class="n">uVar4</span><span class="p">),</span>
                                                                  <span class="mh">0xffffffffffffffff</span><span class="p">))))))));</span>
  <span class="k">if</span> <span class="p">((</span><span class="n">ushort</span><span class="p">)((</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mi">7</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">|</span> <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0xf</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">1</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x17</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">2</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x1f</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">3</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x27</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">4</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x2f</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">5</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x37</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">6</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x3f</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">7</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x47</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">8</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x4f</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">9</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x57</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mi">10</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x5f</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0xb</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x67</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0xc</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x6f</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0xd</span> <span class="o">|</span>
               <span class="p">(</span><span class="n">ushort</span><span class="p">)(</span><span class="n">SUB161</span><span class="p">(</span><span class="n">auVar6</span> <span class="o">&gt;&gt;</span> <span class="mh">0x77</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mi">1</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="mh">0xe</span> <span class="o">|</span> <span class="mh">0x8000</span><span class="p">)</span> <span class="o">==</span> <span class="mh">0xffff</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">uVar4</span> <span class="o">=</span> <span class="n">SUB164</span><span class="p">(</span><span class="n">ZEXT816</span><span class="p">((</span><span class="n">ulong</span><span class="p">)</span><span class="n">v1</span> <span class="o">*</span> <span class="mh">0x4dc4591dac8f</span><span class="p">)</span> <span class="o">*</span> <span class="n">ZEXT816</span><span class="p">(</span><span class="mh">0x34ab9</span><span class="p">)</span> <span class="o">&gt;&gt;</span> <span class="mh">0x40</span><span class="p">,</span><span class="mi">0</span><span class="p">)</span> <span class="o">&amp;</span> <span class="mh">0x3ffff</span><span class="p">;</span>

    <span class="c1">// [...]</span>
  <span class="p">}</span>
<span class="c1">// [...]</span>
<span class="p">}</span>
</code></pre></div></div>

<p>There we go, there’s the input parsing of 4 integers, followed by a lot of poorly decompiled checks. Guess we’ve arrived at the “math trick”…</p>

<p>Now, before trying to <a href="https://lemire.me/blog/2019/02/08/faster-remainders-when-the-divisor-is-a-constant-beating-compilers-and-libdivide/">understand these checks</a>, let’s go through one of them in assembly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00400969 8b 4c 24 10     MOV        ECX,dword ptr [RSP + v1] ; v1 = one of the parsed integers
0040096d 48 b8 17        MOV        RAX,0x5f50ddca7b17
         7b ca dd
         50 5f 00 00
00400977 48 0f af c1     IMUL       RAX,RCX
0040097b ba 91 af        MOV        EDX,0x2af91
         02 00
00400980 48 f7 e2        MUL        RDX
00400983 81 e2 ff        AND        EDX,0x3ffff
         ff 03 00
00400989 66 48 0f        MOVQ       XMM0,RDX
         6e c2
0040098e 66 0f 73        PSLLDQ     XMM0,0x8
         f8 08
00400993 b8 69 95        MOV        EAX,0x9569
         00 00
00400998 66 48 0f        MOVQ       XMM1,RAX
         6e c8
0040099d 66 0f 73        PSLLDQ     XMM1,0x8
         f9 08
004009a2 66 0f 74 c8     PCMPEQB    XMM1,XMM0
004009a6 66 0f d7 c1     PMOVMSKB   EAX,XMM1
004009aa 3d ff ff        CMP        EAX,0xffff
         00 00
</code></pre></div></div>
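<p>To make the sequence easier to follow, here is a minimal Python sketch of this first check. The constants are copied from the disassembly; the names <code class="language-plaintext highlighter-rouge">MAGIC</code>, <code class="language-plaintext highlighter-rouge">DIVISOR</code> and <code class="language-plaintext highlighter-rouge">EXPECTED</code> are made up for illustration, and reading the sequence as a plain remainder is an assumption that the final loop tests on a few sample values:</p>

```python
# Constants copied from the first check in the disassembly above.
MAGIC = 0x5F50DDCA7B17   # loaded into RAX before the IMUL
DIVISOR = 0x2AF91        # loaded into EDX before the MUL
EXPECTED = 0x9569        # value the SIMD sequence compares against


def emulate_check(v):
    """Mimic the IMUL / MUL / AND sequence for a 32-bit input v."""
    low = (v * MAGIC) % 2**64        # IMUL RAX,RCX keeps the low 64 bits
    high = (low * DIVISOR) // 2**64  # MUL RDX leaves the high 64 bits in RDX
    return high % 2**18              # AND EDX,0x3ffff


def passes(v):
    # PSLLDQ / PCMPEQB / PMOVMSKB compare all 16 bytes of both vectors, so
    # EAX == 0xffff only when the two 64-bit values are identical.
    return emulate_check(v) == EXPECTED


# Assumption being tested: the whole sequence behaves like "v % DIVISOR".
for v in (0, 1, DIVISOR - 1, DIVISOR, 123456789, 2**32 - 1):
    assert emulate_check(v) == v % DIVISOR
```

<p>If that assumption holds for every 32-bit input, each of these checks is a remainder test in disguise, which would fit the fast remainder technique described in the article linked above.</p>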

<p>Consider what these checks access:</p>

<ul>
  <li>For each integer, several SIMD instructions are applied, and the result is compared with 0xffff;</li>
  <li>There are no calls to other functions during these checks;</li>
  <li>After all checks, the input is passed to a function, then to some libc functions.</li>
</ul>

<p>Assuming no other processing happens in the last function call, these seem simple enough to solve with symbolic execution… Except we don’t have a valid executable (running it causes a segfault). It turns out that this isn’t a blocker.</p>

<h2 id="solution-1">Solution</h2>

<p>Conveniently, angr supports executing from straight assembly, so we can skip all the executable setup:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">"rb"</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="c1"># Skip ELF headers and code up to flag check start address
</span>    <span class="n">asm</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()[</span><span class="mh">0x960</span><span class="p">:]</span>

<span class="n">project</span> <span class="o">=</span> <span class="n">angr</span><span class="p">.</span><span class="n">load_shellcode</span><span class="p">(</span>
    <span class="n">asm</span><span class="p">,</span>
    <span class="s">"x86_64"</span><span class="p">,</span>
    <span class="n">start_offset</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
    <span class="n">load_address</span><span class="o">=</span><span class="mh">0x400960</span><span class="p">,</span>
    <span class="n">support_selfmodifying_code</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span>
<span class="p">)</span>
<span class="n">state</span> <span class="o">=</span> <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">entry_state</span><span class="p">()</span>
</code></pre></div></div>
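<p>The <code class="language-plaintext highlighter-rouge">0x960</code> slice and the <code class="language-plaintext highlighter-rouge">0x400960</code> load address describe the same location. A small sketch of that relation, assuming the first segment is mapped at <code class="language-plaintext highlighter-rouge">0x400000</code> starting from file offset 0 (the helper name and the <code class="language-plaintext highlighter-rouge">SEGMENT_BASE</code> constant are illustrative, not part of the solver):</p>

```python
SEGMENT_BASE = 0x400000   # assumed: address where file offset 0 is mapped
CHECK_START = 0x400960    # first instruction of the flag check


def file_offset(address, base=SEGMENT_BASE):
    """Translate a virtual address into an offset within the file."""
    if base > address:
        raise ValueError("address lies before the mapped segment")
    return address - base


offset = file_offset(CHECK_START)
print(hex(offset))  # 0x960
```

<p>Keeping <code class="language-plaintext highlighter-rouge">load_address</code> equal to the original address means the jump targets and breakpoints seen in the disassembler stay valid inside angr.</p>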

<p>Before executing these instructions, we need to have the actual <strong>program state at this point in execution</strong>, since we are e.g. reading values from registers and stack. Similar to how in software development we make a minimal working test case when we want to isolate logic that causes some bug, here we want to prepare a minimal state so that we can execute the instructions of the flag check like the executable normally would<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.</p>

<p>Also, do we have any state resulting from side-effects (e.g. certain bytes read/written from files)? These wouldn’t be captured from a debugger. In this case, we don’t depend on such side-effects.</p>

<p>With gdb, it’s possible to dump all the state we need. We could adapt an existing <a href="https://github.com/Battelle/afl-unicorn/blob/master/unicorn_mode/helper_scripts/unicorn_dumper_gdb.py">dump script</a>, but let’s just go through it manually, starting with the memory mappings that are accessed:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; vmmap
0x400000           0x404000 r-xp     4000 0      anon_00400
0x404000           0x603000 ---p   1ff000 0      anon_00404
0x603000           0x604000 r--p     1000 0      anon_00603
0x604000           0x626000 rw-p    22000 0      [heap]
[...]
0x7ffffffdd000     0x7ffffffff000 rw-p    22000 0      [stack]

pwndbg&gt; dump memory out.0x400000.mem 0x400000 0x404000
pwndbg&gt; dump memory out.0x603000.mem 0x603000 0x604000
pwndbg&gt; dump memory out.0x604000.mem 0x604000 0x626000
pwndbg&gt; dump memory out.0x7ffffffdd000.mem 0x7ffffffdd000 0x7ffffffff000
</code></pre></div></div>

<p>Importing them in our script:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">memory</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">"out.0x400000.mem"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">).</span><span class="n">read</span><span class="p">()</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="mh">0x400000</span><span class="p">,</span> <span class="n">memory</span><span class="p">,</span> <span class="n">disable_actions</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">inspect</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">permissions</span><span class="p">(</span><span class="mh">0x400000</span><span class="p">,</span> <span class="mi">5</span><span class="p">)</span>  <span class="c1"># 0b101 = r-x
</span>
<span class="n">memory</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">"out.0x603000.mem"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">).</span><span class="n">read</span><span class="p">()</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="mh">0x603000</span><span class="p">,</span> <span class="n">memory</span><span class="p">,</span> <span class="n">disable_actions</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">inspect</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">permissions</span><span class="p">(</span><span class="mh">0x603000</span><span class="p">,</span> <span class="mi">4</span><span class="p">)</span>  <span class="c1"># 0b100 = r--
</span>
<span class="n">memory</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">"out.0x604000.mem"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">).</span><span class="n">read</span><span class="p">()</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="mh">0x604000</span><span class="p">,</span> <span class="n">memory</span><span class="p">,</span> <span class="n">disable_actions</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">inspect</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">permissions</span><span class="p">(</span><span class="mh">0x604000</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span>  <span class="c1"># 0b110 = rw-
</span>
<span class="n">memory</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="s">"out.0x7ffffffdd000.mem"</span><span class="p">,</span> <span class="s">"rb"</span><span class="p">).</span><span class="n">read</span><span class="p">()</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">store</span><span class="p">(</span><span class="mh">0x7FFFFFFDD000</span><span class="p">,</span> <span class="n">memory</span><span class="p">,</span> <span class="n">disable_actions</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span> <span class="n">inspect</span><span class="o">=</span><span class="bp">False</span><span class="p">)</span>
<span class="n">state</span><span class="p">.</span><span class="n">memory</span><span class="p">.</span><span class="n">permissions</span><span class="p">(</span><span class="mh">0x7FFFFFFDD000</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span>  <span class="c1"># 0b110 = rw-
</span></code></pre></div></div>
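<p>Since the four imports only differ in file name and permissions, they could also be driven from a table. A possible sketch, where <code class="language-plaintext highlighter-rouge">DUMPS</code>, <code class="language-plaintext highlighter-rouge">base_address</code> and <code class="language-plaintext highlighter-rouge">load_dumps</code> are hypothetical names, and the only angr calls used are the two already shown above:</p>

```python
import re

# Dump file name mapped to permission bits (r=4, w=2, x=1), as in vmmap.
DUMPS = {
    "out.0x400000.mem": 5,
    "out.0x603000.mem": 4,
    "out.0x604000.mem": 6,
    "out.0x7ffffffdd000.mem": 6,
}


def base_address(filename):
    """Extract the load address encoded in a name like out.0x400000.mem."""
    match = re.fullmatch(r"out\.(0x[0-9a-fA-F]+)\.mem", filename)
    if match is None:
        raise ValueError(f"unexpected dump name: {filename}")
    return int(match.group(1), 16)


def load_dumps(state, dumps=DUMPS):
    """Copy each dump into an angr state and set its permissions."""
    for filename, permissions in dumps.items():
        with open(filename, "rb") as f:
            data = f.read()
        address = base_address(filename)
        state.memory.store(address, data, disable_actions=True, inspect=False)
        state.memory.permissions(address, permissions)
```

<p>Deriving the address from the file name keeps the gdb commands and the script in sync, at the cost of relying on that naming scheme.</p>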

<p>Followed by registers (taken from gdb via <code class="language-plaintext highlighter-rouge">context</code>; we don’t need state from flags or SIMD registers):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rax</span> <span class="o">=</span> <span class="mh">0x4</span>  <span class="c1"># used for scanf parsed count check
</span><span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rbx</span> <span class="o">=</span> <span class="mh">0x403350</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rcx</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rdx</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rdi</span> <span class="o">=</span> <span class="mh">0x7FFFFFFFB930</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rsi</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r8</span> <span class="o">=</span> <span class="mh">0x4</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r9</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r10</span> <span class="o">=</span> <span class="mh">0x7FFFF7C48AC0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r11</span> <span class="o">=</span> <span class="mh">0x7FFFF7C493C0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r12</span> <span class="o">=</span> <span class="mh">0x400830</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r13</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r14</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">r15</span> <span class="o">=</span> <span class="mh">0x7FFFFFFFC5D8</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rbp</span> <span class="o">=</span> <span class="mh">0x0</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rsp</span> <span class="o">=</span> <span class="mh">0x7FFFFFFFBE70</span>
<span class="n">state</span><span class="p">.</span><span class="n">regs</span><span class="p">.</span><span class="n">rip</span> <span class="o">=</span> <span class="mh">0x400960</span>
</code></pre></div></div>
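<p>Typing these by hand is error-prone. As a rough alternative, the values could be parsed from a saved register listing. The sketch below assumes the layout printed by gdb’s <code class="language-plaintext highlighter-rouge">info registers</code> rather than pwndbg’s <code class="language-plaintext highlighter-rouge">context</code>; the <code class="language-plaintext highlighter-rouge">SAMPLE</code> text is an illustrative excerpt (the flags line in particular is invented), and both function names are hypothetical:</p>

```python
# Illustrative excerpt in the layout printed by gdb's "info registers";
# the register values match the assignments above, the eflags line is made up.
SAMPLE = """\
rax            0x4                 4
rbx            0x403350            4207440
rsp            0x7fffffffbe70      0x7fffffffbe70
rip            0x400960            0x400960
eflags         0x246               [ PF ZF IF ]
"""

SKIPPED = {"eflags"}  # flags are not needed for the checks


def parse_registers(text):
    """Map register names to integers, one "name hex-value ..." line each."""
    registers = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] not in SKIPPED:
            registers[fields[0]] = int(fields[1], 16)
    return registers


def apply_registers(state, registers):
    """Assign every parsed value to the matching angr register."""
    for name, value in registers.items():
        setattr(state.regs, name, value)


print(parse_registers(SAMPLE))
```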

<p>Since we will be running without loading libc, we need to explicitly skip any calls to stubs present in the procedure linkage table (a.k.a. <code class="language-plaintext highlighter-rouge">.plt</code>):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">pass_hook</span><span class="p">(</span><span class="n">angr</span><span class="p">.</span><span class="n">SimProcedure</span><span class="p">):</span>
    <span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"! pass_hook"</span><span class="p">)</span>
        <span class="k">return</span>

<span class="c1"># [...]
</span>
<span class="c1"># Skip libc handlers
</span><span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x400790</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x4007A0</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x4007B0</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x4007C0</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x4007D0</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x4007E0</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x4007F0</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x400800</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x400810</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
<span class="n">project</span><span class="p">.</span><span class="n">hook</span><span class="p">(</span><span class="mh">0x400C20</span><span class="p">,</span> <span class="n">pass_hook</span><span class="p">())</span>
</code></pre></div></div>
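<p>The first nine addresses are evenly spaced, so the list can be generated instead of spelled out. A small sketch, assuming each stub occupies 16 bytes (inferred from the spacing of the hooked addresses, not checked against the section headers); the constant and function names are illustrative:</p>

```python
PLT_FIRST = 0x400790      # first hooked stub in the listing above
PLT_LAST = 0x400810       # last stub of the contiguous run
PLT_STRIDE = 0x10         # assumed stub size, inferred from the spacing
EXTRA_HOOKS = [0x400C20]  # hooked as well, but outside the contiguous run


def hook_addresses():
    """Return every address that gets a pass_hook in the listing above."""
    stubs = list(range(PLT_FIRST, PLT_LAST + PLT_STRIDE, PLT_STRIDE))
    return stubs + EXTRA_HOOKS


# With the project and pass_hook defined earlier, the hooks become:
# for address in hook_addresses():
#     project.hook(address, pass_hook())
print([hex(address) for address in hook_addresses()])
```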

<p>While angr has the “explorer” technique, where it tries to reach target addresses while avoiding others, we also want to stop execution at addresses that can’t be handled with the state we set up, since angr would end up accessing unmapped memory or executing bad instructions. We explicitly move states at such addresses to the <code class="language-plaintext highlighter-rouge">deadend</code> stash. If needed, we could manually inspect these states later on:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># After "CALL scanf"
</span><span class="n">START</span> <span class="o">=</span> <span class="mh">0x400960</span>
<span class="c1"># Flag
</span><span class="n">FIND</span> <span class="o">=</span> <span class="mh">0x400B5D</span>
<span class="c1"># "Bad format!" and "Wrong!"
</span><span class="n">AVOID</span> <span class="o">=</span> <span class="p">[</span><span class="mh">0x400BF7</span><span class="p">,</span> <span class="mh">0x400BF0</span><span class="p">]</span>
<span class="c1"># Not fail cases, but don't continue execution
</span><span class="n">DEADEND</span> <span class="o">=</span> <span class="p">[</span><span class="mh">0x400BFC</span><span class="p">,</span> <span class="mh">0x400C01</span><span class="p">,</span> <span class="mh">0x400C03</span><span class="p">,</span> <span class="mh">0x400C0A</span><span class="p">,</span> <span class="mh">0x400C0B</span><span class="p">,</span> <span class="mh">0x400C20</span><span class="p">]</span>

<span class="c1"># [...]
</span>
<span class="n">sm</span> <span class="o">=</span> <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">simgr</span><span class="p">(</span><span class="n">state</span><span class="p">)</span>
<span class="k">while</span> <span class="n">sm</span><span class="p">.</span><span class="n">active</span><span class="p">:</span>
    <span class="k">print</span><span class="p">(</span><span class="n">sm</span><span class="p">,</span> <span class="n">sm</span><span class="p">.</span><span class="n">active</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">active</span> <span class="ow">in</span> <span class="n">sm</span><span class="p">.</span><span class="n">active</span><span class="p">:</span>
        <span class="n">project</span><span class="p">.</span><span class="n">factory</span><span class="p">.</span><span class="n">block</span><span class="p">(</span><span class="n">active</span><span class="p">.</span><span class="n">addr</span><span class="p">,</span> <span class="n">backup_state</span><span class="o">=</span><span class="n">active</span><span class="p">).</span><span class="n">pp</span><span class="p">()</span>
        <span class="k">if</span> <span class="n">active</span><span class="p">.</span><span class="n">addr</span> <span class="ow">in</span> <span class="p">[</span><span class="n">FIND</span><span class="p">]:</span>
            <span class="n">ipdb</span><span class="p">.</span><span class="n">set_trace</span><span class="p">()</span>
    <span class="n">sm</span><span class="p">.</span><span class="n">step</span><span class="p">()</span>

    <span class="c1"># Don't run fail cases, libc, stack, etc...
</span>    <span class="n">sm</span><span class="p">.</span><span class="n">stash</span><span class="p">(</span>
        <span class="n">from_stash</span><span class="o">=</span><span class="s">"active"</span><span class="p">,</span>
        <span class="n">to_stash</span><span class="o">=</span><span class="s">"avoid"</span><span class="p">,</span>
        <span class="n">filter_func</span><span class="o">=</span><span class="k">lambda</span> <span class="n">s</span><span class="p">:</span> <span class="n">s</span><span class="p">.</span><span class="n">addr</span> <span class="ow">in</span> <span class="n">AVOID</span> <span class="ow">or</span> <span class="n">s</span><span class="p">.</span><span class="n">addr</span> <span class="o">&gt;</span> <span class="mh">0x7FFFF7AF0000</span><span class="p">,</span>
    <span class="p">)</span>
    <span class="c1"># Don't run code after the flag check end
</span>    <span class="n">sm</span><span class="p">.</span><span class="n">stash</span><span class="p">(</span>
        <span class="n">from_stash</span><span class="o">=</span><span class="s">"active"</span><span class="p">,</span>
        <span class="n">to_stash</span><span class="o">=</span><span class="s">"deadend"</span><span class="p">,</span>
        <span class="n">filter_func</span><span class="o">=</span><span class="k">lambda</span> <span class="n">s</span><span class="p">:</span> <span class="n">s</span><span class="p">.</span><span class="n">addr</span> <span class="ow">in</span> <span class="n">DEADEND</span> <span class="ow">or</span> <span class="n">s</span><span class="p">.</span><span class="n">addr</span> <span class="o">&gt;</span> <span class="mh">0x400C28</span><span class="p">,</span>
    <span class="p">)</span>
</code></pre></div></div>
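<p>The two filters are applied in order, so a state matching both ends up in <code class="language-plaintext highlighter-rouge">avoid</code>. The decision can be modelled without angr as a plain function of the address; in this sketch, <code class="language-plaintext highlighter-rouge">classify</code>, <code class="language-plaintext highlighter-rouge">LIBC_FLOOR</code> and <code class="language-plaintext highlighter-rouge">CODE_CEILING</code> are hypothetical names for the thresholds used in the loop:</p>

```python
FIND = 0x400B5D
AVOID = [0x400BF7, 0x400BF0]
DEADEND = [0x400BFC, 0x400C01, 0x400C03, 0x400C0A, 0x400C0B, 0x400C20]
LIBC_FLOOR = 0x7FFFF7AF0000  # addresses above this are treated as libc or stack
CODE_CEILING = 0x400C28      # addresses above this are past the flag check


def classify(addr):
    """Return the stash a state at addr is in after one loop iteration."""
    if addr in AVOID or addr > LIBC_FLOOR:
        return "avoid"
    if addr in DEADEND or addr > CODE_CEILING:
        return "deadend"
    return "active"


for addr in (0x4009B5, 0x400BF7, 0x400C20, 0x603000, 0x7FFFF7C48AC0):
    print(hex(addr), classify(addr))
```

<p>Having this as a pure function makes it cheap to check that no address inside the flag check is stashed by accident.</p>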

<p>After running this <a href="https://nevesnunes.github.io/blog/assets/writeups/TSGCTF2021/optimized.solver.py">script</a> (around 5 minutes), we get a nice trace of angr progressively passing each check:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;SimulationManager with 1 active&gt; [&lt;SimState @ 0x400960&gt;]
0x400960:    cmp    eax, 4
0x400963:    jne    0x400bf0
&lt;SimulationManager with 1 active&gt; [&lt;SimState @ 0x400969&gt;]
0x400969:    mov    ecx, dword ptr [rsp + 0x10]
0x40096d:    movabs    rax, 0x5f50ddca7b17
0x400977:    imul    rax, rcx
0x40097b:    mov    edx, 0x2af91
0x400980:    mul    rdx
0x400983:    and    edx, 0x3ffff
0x400989:    movq    xmm0, rdx
0x40098e:    pslldq    xmm0, 8
0x400993:    mov    eax, 0x9569
0x400998:    movq    xmm1, rax
0x40099d:    pslldq    xmm1, 8
0x4009a2:    pcmpeqb    xmm1, xmm0
0x4009a6:    pmovmskb    eax, xmm1
0x4009aa:    cmp    eax, 0xffff
0x4009af:    jne    0x400bf7
&lt;SimulationManager with 1 active, 1 avoid&gt; [&lt;SimState @ 0x4009b5&gt;]
0x4009b5:    movabs    rax, 0x4dc4591dac8f
0x4009bf:    imul    rax, rcx
0x4009c3:    mov    edx, 0x34ab9
0x4009c8:    mul    rdx
0x4009cb:    and    edx, 0x3ffff
0x4009d1:    movq    xmm0, rdx
0x4009d6:    pslldq    xmm0, 8
0x4009db:    mov    eax, 0x26cf2
0x4009e0:    movq    xmm1, rax
0x4009e5:    pslldq    xmm1, 8
0x4009ea:    pcmpeqb    xmm1, xmm0
&lt;SimulationManager with 1 active, 2 avoid&gt; [&lt;SimState @ 0x4009fd&gt;]
0x4009fd:    mov    esi, dword ptr [rsp + 0x14]
0x400a01:    movabs    rax, 0x4ae11552df1a
0x400a0b:    imul    rax, rsi
0x400a0f:    mov    edx, 0x36b39
0x400a14:    mul    rdx
0x400a17:    and    edx, 0x3ffff
0x400a1d:    movq    xmm0, rdx
0x400a22:    pslldq    xmm0, 8
0x400a27:    mov    eax, 0x20468
0x400a2c:    movq    xmm1, rax
0x400a31:    pslldq    xmm1, 8
0x400a36:    pcmpeqb    xmm1, xmm0
0x400a3a:    pmovmskb    eax, xmm1
0x400a3e:    cmp    eax, 0xffff
0x400a43:    jne    0x400bf7
&lt;SimulationManager with 1 active, 3 avoid&gt; [&lt;SimState @ 0x400a49&gt;]
0x400a49:    movabs    rax, 0x46680b140eff
0x400a53:    imul    rax, rsi
0x400a57:    mov    edx, 0x3a2d3
0x400a5c:    mul    rdx
0x400a5f:    and    edx, 0x3ffff
0x400a65:    movq    xmm0, rdx
0x400a6a:    pslldq    xmm0, 8
0x400a6f:    mov    eax, 0x3787a
0x400a74:    movq    xmm1, rax
0x400a79:    pslldq    xmm1, 8
0x400a7e:    pcmpeqb    xmm1, xmm0
0x400a82:    pmovmskb    eax, xmm1
0x400a86:    cmp    eax, 0xffff
0x400a8b:    jne    0x400bf7
&lt;SimulationManager with 1 active, 4 avoid&gt; [&lt;SimState @ 0x400a91&gt;]
0x400a91:    mov    edi, dword ptr [rsp + 0x18]
0x400a95:    movabs    rax, 0x4d935bbd3e0
0x400a9f:    mov    rdx, rdi
0x400aa2:    imul    rdx, rax
0x400aa6:    cmp    rdx, rax
0x400aa9:    jae    0x400bf7
&lt;SimulationManager with 1 active, 5 avoid&gt; [&lt;SimState @ 0x400aaf&gt;]
0x400aaf:    movabs    rax, 0x66b9b431b9ed
0x400ab9:    imul    rax, rdi
0x400abd:    mov    edx, 0x27df9
0x400ac2:    mul    rdx
0x400ac5:    and    edx, 0x3ffff
0x400acb:    movq    xmm0, rdx
0x400ad0:    pslldq    xmm0, 8
0x400ad5:    mov    eax, 0x5563
0x400ada:    movq    xmm1, rax
0x400adf:    pslldq    xmm1, 8
0x400ae4:    pcmpeqb    xmm1, xmm0
0x400ae8:    pmovmskb    eax, xmm1
0x400aec:    cmp    eax, 0xffff
0x400af1:    jne    0x400bf7
&lt;SimulationManager with 1 active, 6 avoid&gt; [&lt;SimState @ 0x400af7&gt;]
0x400af7:    mov    ebx, dword ptr [rsp + 0x1c]
0x400afb:    movabs    rax, 0x1e5d2be81c5
0x400b05:    mov    rdx, rbx
0x400b08:    imul    rdx, rax
0x400b0c:    cmp    rdx, rax
0x400b0f:    jae    0x400bf7
&lt;SimulationManager with 1 active, 7 avoid&gt; [&lt;SimState @ 0x400b15&gt;]
0x400b15:    movabs    rax, 0x448626500938
0x400b1f:    imul    rax, rbx
0x400b23:    mov    edx, 0x3bc65
0x400b28:    mul    rdx
0x400b2b:    and    edx, 0x3ffff
0x400b31:    movq    xmm0, rdx
0x400b36:    pslldq    xmm0, 8
0x400b3b:    mov    eax, 0x133e7
0x400b40:    movq    xmm1, rax
0x400b45:    pslldq    xmm1, 8
0x400b4a:    pcmpeqb    xmm1, xmm0
0x400b4e:    pmovmskb    eax, xmm1
0x400b52:    cmp    eax, 0xffff
0x400b57:    jne    0x400bf7
&lt;SimulationManager with 1 active, 8 avoid&gt; [&lt;SimState @ 0x400b5d&gt;]
0x400b5d:    mov    dword ptr [rsp + 0x20], ecx
0x400b61:    mov    dword ptr [rsp + 0x24], esi
0x400b65:    mov    dword ptr [rsp + 0x28], edi
0x400b69:    mov    dword ptr [rsp + 0x2c], ebx
0x400b6d:    call    0x4007d0
</code></pre></div></div>

<p>Taking the concrete input register values:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ipdb&gt; active.solver.eval(active.regs.rbx)
1334930147
ipdb&gt; active.solver.eval(active.regs.rdi)
4273479145
ipdb&gt; active.solver.eval(active.regs.rsi)
2204180909
ipdb&gt; active.solver.eval(active.regs.rcx)
772928896
</code></pre></div></div>

<p>We now get the flag:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Enter password: 772928896 2204180909 4273479145 1334930147
TSGCTF{F457_m0dul0!}
</code></pre></div></div>
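<p>As a sanity check that doesn’t need the binary, the accepted values can be tested against the constants visible in the trace. This sketch rests on two assumptions: that each SIMD check is a remainder test of an input against the <code class="language-plaintext highlighter-rouge">mov edx</code> immediate, with the <code class="language-plaintext highlighter-rouge">mov eax</code> immediate as the expected remainder, and that the two <code class="language-plaintext highlighter-rouge">imul</code> / <code class="language-plaintext highlighter-rouge">cmp</code> / <code class="language-plaintext highlighter-rouge">jae</code> checks can be emulated literally. The labels <code class="language-plaintext highlighter-rouge">v2</code> to <code class="language-plaintext highlighter-rouge">v4</code> are made up, following the <code class="language-plaintext highlighter-rouge">v1</code> name from the decompilation:</p>

```python
# (divisor, expected remainder) pairs read from the "mov edx" / "mov eax"
# immediates in the trace, grouped by the input they test.
REMAINDER_CHECKS = {
    "v1": [(0x2AF91, 0x9569), (0x34AB9, 0x26CF2)],
    "v2": [(0x36B39, 0x20468), (0x3A2D3, 0x3787A)],
    "v3": [(0x27DF9, 0x5563)],
    "v4": [(0x3BC65, 0x133E7)],
}
# Magic constants of the two imul / cmp / jae checks.
MAGIC_CHECKS = {"v3": 0x4D935BBD3E0, "v4": 0x1E5D2BE81C5}

# Values evaluated by the solver for rcx, rsi, rdi and rbx.
INPUTS = {"v1": 772928896, "v2": 2204180909, "v3": 4273479145, "v4": 1334930147}


def passes_all(inputs):
    """Return True when every modelled check accepts the given inputs."""
    for name, pairs in REMAINDER_CHECKS.items():
        for divisor, remainder in pairs:
            if inputs[name] % divisor != remainder:
                return False
    for name, magic in MAGIC_CHECKS.items():
        # jae goes to "Wrong!" unless the wrapped product is below the magic.
        if (inputs[name] * magic) % 2**64 >= magic:
            return False
    return True


print(passes_all(INPUTS))  # True
```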
<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>This would be the same case if we instead wanted to do emulation (e.g. with unicorn). Alternatively, a more interactive approach should be possible with <a href="https://github.com/andreafioraldi/angrgdb">angrgdb</a>. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">[return]</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><category term="ctf" /><category term="reversing" /><category term="bruteforce" /><category term="dynamic instrumentation" /><category term="symbolic execution" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Decompression Meddlings</title><link href="https://nevesnunes.github.io/blog/2021/09/29/Decompression-Meddlings.html" rel="alternate" type="text/html" title="Decompression Meddlings" /><published>2021-09-29T00:00:00+01:00</published><updated>2021-09-29T00:00:00+01:00</updated><id>https://nevesnunes.github.io/blog/2021/09/29/Decompression-Meddlings</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/09/29/Decompression-Meddlings.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<p>Lack of familiarity with a binary format leads us to handle it with conservative expectations. Today, let’s subvert two of these expectations with varying degrees of usefulness, each explored in a dedicated part.</p>

<p>Previously, I dissected a <a href="https://nevesnunes.github.io/blog/2019/09/29/Deceitful-Zip.html">zip file that had its body AES encrypted</a>, despite the compression method being set to the typical DEFLATE. This time, we will dig into the DEFLATE format’s implementation details.</p>

<div class="c-aside">
  <h1 id="toc">TOC</h1>

  <!-- toc -->

  <ul>
    <li><a href="#part-i-alleviating-streams">Part I: Alleviating streams</a>
      <ul>
        <li><a href="#preparing-a-stream">Preparing a stream</a></li>
        <li><a href="#peeking-at-the-naughty-bits">Peeking at the naughty bits</a></li>
        <li><a href="#meet-the-symbols">Meet the symbols</a></li>
        <li><a href="#how-bad-can-a-bit-flip-be">How bad can a bit flip be?</a>
          <ul>
            <li><a href="#bfinal0-but-no-more-blocks-in-stream">BFINAL=0 but no more blocks in stream</a></li>
            <li><a href="#btype3-which-is-reserved">BTYPE=3 which is reserved</a></li>
            <li><a href="#hdist31-which-is-more-than-allowed">HDIST=31 which is more than allowed</a></li>
            <li><a href="#lengths-repeated-are-greater-than-hdist-count">Lengths repeated are greater than HDIST count</a></li>
            <li><a href="#trailing-bytes-that-should-be-ignored">Trailing bytes that should be ignored</a></li>
            <li><a href="#distance-too-far-back">Distance too far back</a></li>
            <li><a href="#distance-valid-but-bad">Distance valid but bad</a></li>
          </ul>
        </li>
        <li><a href="#guidance-from-structured-data">Guidance from structured data</a></li>
        <li><a href="#isnt-this-solution-just-glorified-bruteforcing">Isn’t this solution just glorified bruteforcing?</a></li>
      </ul>
    </li>
    <li><a href="#part-ii-embellishing-streams">Part II: Embellishing streams</a>
      <ul>
        <li><a href="#playing-well-with-parsers">Playing well with parsers</a></li>
        <li><a href="#applying-the-message">Applying the message</a></li>
      </ul>
    </li>
    <li><a href="#further-work">Further work</a></li>
    <li><a href="#references">References</a></li>
  </ul>

  <!-- tocstop -->

</div>

<h1 id="part-i-alleviating-streams">Part I: Alleviating streams</h1>

<p><strong>Suppose a zip file gets some bytes corrupted. Can we decompress it?</strong></p>

<p>Some metadata cases would be simple to deal with:</p>

<ul>
  <li>Filenames with unexpected characters: if they are reserved characters in the target filesystem, just patch in some valid characters;</li>
  <li>Compressed size larger than file size: can be ignored by decompressors.</li>
</ul>

<p>Most likely, the corruption falls not in the metadata header, but in the compressed stream: a zip file containing a single compressed file has header and footer sizes that are both less than 100 bytes, with the rest of the file taken up by the stream:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zip - <span class="nt">-X</span> <span class="nt">-FI</span> <span class="nt">-r</span> &lt;<span class="o">(</span><span class="nb">printf</span> <span class="s1">'%s'</span> <span class="s1">''</span><span class="o">)</span> 2&gt;/dev/null | <span class="nb">wc</span> <span class="nt">-c</span>
<span class="c"># 146 total, 0 from input</span>

zip - <span class="nt">-X</span> <span class="nt">-FI</span> <span class="nt">-r</span> &lt;<span class="o">(</span><span class="nb">dd </span><span class="k">if</span><span class="o">=</span>/dev/urandom <span class="nv">of</span><span class="o">=</span>/dev/stdout <span class="nv">bs</span><span class="o">=</span>1M <span class="nv">count</span><span class="o">=</span>1<span class="o">)</span> 2&gt;/dev/null | <span class="nb">wc</span> <span class="nt">-c</span>
<span class="c"># 1048880 total, 1024*1024=1048576 from input</span>
</code></pre></div></div>
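<p>A similar measurement can be sketched with Python’s <code class="language-plaintext highlighter-rouge">zipfile</code> module. Note the differences from the commands above: the entry is written in memory, it is stored instead of deflated so that the stream size equals the input size, and the one-letter entry name is arbitrary, so the byte counts will not match the output of <code class="language-plaintext highlighter-rouge">zip</code>. The helper name <code class="language-plaintext highlighter-rouge">zip_overhead</code> is made up:</p>

```python
import io
import os
import zipfile


def zip_overhead(payload, name="a"):
    """Return (total size, bytes that do not belong to the stream)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
        archive.writestr(name, payload)
        stream_size = archive.getinfo(name).compress_size
    total = len(buffer.getvalue())
    return total, total - stream_size


for payload in (b"", os.urandom(1024 * 1024)):
    total, overhead = zip_overhead(payload)
    print(f"{total} total, {overhead} bytes of metadata")
```

<p>The point is the same as with the shell commands: the metadata size stays roughly constant while the stream grows with the input, so a random corrupted byte most likely lands in the stream.</p>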

<p>But could we still recover the original bytes from a corrupted stream?</p>

<h2 id="preparing-a-stream">Preparing a stream</h2>

<p>In the following sections, we’ll use as an example a <a href="https://raw.githubusercontent.com/aquasecurity/vuln-list/0756b586549026400f91221eb748a0df4251a17b/nvd/2011/CVE-2011-4925.json">JSON file</a> that was zipped (<a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.zip">output file</a>):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zip <span class="nt">-X</span> CVE-2011-4925.zip CVE-2011-4925.json
</code></pre></div></div>

<p>Let’s inspect the data structure of this zip file:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ksv CVE-2011-4925.zip ~/opt/kaitai_struct_formats_HEAD/archive/zip.ksy
</code></pre></div></div>

<p>We can see that the stream starts where the filename entry ends (offset 0x30):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[-] [root]                                               00000000: 50 4b 03 04 14 00 00 00 08 00 9b a9 34 53 bb cc | PK..........4S..
  [-] sections (3 = 0x3 entries)                         00000010: 75 39 44 05 00 00 fd 33 00 00 12 00 00 00 43 56 | u9D....3......CV
    [-] 0                                                00000020: 45 2d 32 30 31 31 2d 34 39 32 35 2e 6a 73 6f 6e | E-2011-4925.json
      [.] magic = 50 4b                                  00000030: cd 5b 5d 6f da 48 14 7d cf af 18 79 5f 76 ab 80 | .[]o.H.}...y_v..
      [.] section_type = 1027                            00000040: b1 b1 d3 c4 4f 4b 09 5b 59 1b 20 05 92 4a ad 50 | ....OK.[Y. ..J.P
      [-] body                                           00000050: 34 d8 13 32 5a e3 71 67 c6 a4 51 95 ff be 77 6c | 4..2Z.qg..Q...wl
        [-] header                                       00000060: 70 ec 28 0d b4 ea 2e 57 21 12 9e 7b 7d e7 dc 0f | p.(....W!..{}...
          [.] version = 20                               00000070: 9f 33 91 9c 6f 47 84 58 91 48 6f f9 32 97 54 73 | .3..oG.X.Ho.2.Ts
          [.] flags = 0                                  00000080: 91 2a 2b 20 df 60 15 d6 fb d7 83 9b 98 6a 7a b3 | .*+ .`.......jz.
          [.] compression_method = compression_deflated  00000090: 66 52 81 09 2c 96 d7 ee 58 c7 a5 39 15 31 33 de | fR..,...X..9.13.
          [.] file_mod_time = 43419                      000000a0: 9f 8b 4b b2 b9 ab 30 45 19 bb 59 51 1d dd d5 cc | ..K...0E..YQ....
          [.] file_mod_date = 21300                      000000b0: 4d 97 ad 9b db bd 92 dc 04 86 ef 81 db ee 06 34 | M..............4
          [.] crc32 = 964021435                          000000c0: 88 92 5c 69 26 6f 24 53 22 97 11 53 81 16 f2 4b | ..\i&amp;o$S"..S...K
          [.] compressed_size = 1348                     000000d0: ce aa 05 08 9d d2 25 93 81 d3 ee b4 9d ac 13 bc | ......%.........
          [.] uncompressed_size = 13309                  000000e0: a9 ff 6c e0 55 9b ac f3 24 65 92 2e 12 06 db 68 | ..l.U...$e.....h
          [.] file_name_len = 18                         000000f0: 99 b3 9a f9 f1 f8 bf 06 e7 60 06 e7 62 06 d7 c5 | .........`..b...
          [.] extra_len = 0                              00000100: 0c ce c3 0c ce c7 0c ee 04 2b 38 80 87 97 4a 0c | .........+8...J.
          [.] file_name = "CVE-2011-4925.json"           00000110: 38 b4 54 62 c0 a1 a5 12 03 0e 2d 95 18 70 68 a9 | 8.Tb......-..ph.
          [-] extra                                      00000120: c4 80 43 4b 25 06 1c 5a 2a 71 31 53 89 8b 99 4a | ..CK%..Z*q1S...J
            [-] entries (0 = 0x0 entries)                00000130: 5c cc 54 e2 62 a6 12 17 33 95 b8 98 a9 c4 c5 4b | \.T.b...3......K
        [.] body = cd 5b 5d 6f da 48 14 7d cf af 18 79 5…00000140: 25 00 0d 2f 95 94 e0 90 52 49 09 0e 29 95 94 e0 | %../....RI..)...
    [-] 1                                                00000150: 90 52 49 09 0e 29 95 94 e0 90 52 49 09 ee 10 54 | .RI..)....RI...T
      [.] magic = 50 4b                                  00000160: b2 0f b6 43 fd 7d b3 1f b6 43 d0 c8 7e c8 0e c1 | ...C.}...C..~...
      [.] section_type = 513                             00000170: 21 fb 20 83 69 43 8a ac 8b 18 19 d6 39 eb a2 9d | !. .iC......9...
      [+] body                                           00000180: b3 6e fb 10 42 b5 1f b2 43 a8 d4 7e c8 0e 21 51 | .n..B...C..~..!Q
    [+] 2                                                00000190: fb 21 c3 aa 4f dd f6 5b b4 c8 4e d1 22 3b 43 8b | .!..O..[..N.";C.
</code></pre></div></div>

<p>And ends before the next zip file section, which starts with bytes <code class="language-plaintext highlighter-rouge">PK</code> (offset 0x574):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[-] [root]                        00000360: e9 f8 04 46 17 fa 05 5c a0 74 b0 9d 84 61 d1 e4 | ...F...\.t...a..
  [-] sections (3 = 0x3 entries)  00000370: 5e b3 c9 d3 cd bd e4 dd 43 46 95 6a 4e 20 a0 29 | ^.......CF.jN .)
    [+] 0                         00000380: 87 a7 78 4e 2e c2 e9 ac 69 d7 74 59 bc e6 3c 6f | ..xN....i.tY..&lt;o
    [-] 1                         00000390: ac e6 32 31 fe 77 5a 67 81 6d c3 d3 9f de c3 04 | ..21.wZg.m......
      [.] magic = 50 4b           000003a0: b5 23 b1 b2 13 ae b4 b2 eb 98 6d 03 d9 ee 38 76 | .#........m...8v
      [.] section_type = 513      000003b0: c7 b7 9d da 1c 1c ff 5c de 13 16 fc cf b9 37 d9 | .......\......7.
      [+] body                    000003c0: f1 b2 78 b5 bb 3e 06 bf a6 34 67 fb 97 c6 7b db | ..x..&gt;...4g...{.
    [+] 2                         000003d0: 3d 75 5e c9 65 3a e8 5f 8d c2 de 3e d9 5c b3 34 | =u^.e:._...&gt;.\.4
                                  [...]
                                  00000560: b2 7c 01 54 7b f7 dc cf 69 39 5d e3 e7 78 9f ac | .|.T{...i9]..x..
                                  00000570: a3 c7 7f 01 50 4b 01 02 1e 03 14 00 00 00 08 00 | ....PK..........
</code></pre></div></div>

<p>If you don’t want to get your hands dirty, the stream can also be <a href="https://github.com/nevesnunes/zip-frolicking/blob/master/kaitai_struct/dump_first_stream.py">extracted programmatically using kaitai_struct</a>.</p>
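
<p>The two offsets can also be derived from the header fields alone. Below is a minimal stdlib-only sketch (not the linked kaitai_struct script); <code class="language-plaintext highlighter-rouge">first_stream_bounds</code> and its helpers are hypothetical names, and it assumes a plain local file header without zip64 or a trailing data descriptor:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def u16(data, offset):
    return int.from_bytes(data[offset:offset + 2], "little")


def u32(data, offset):
    return int.from_bytes(data[offset:offset + 4], "little")


def first_stream_bounds(zip_bytes):
    """Return (start, end) offsets of the first entry's compressed stream."""
    if zip_bytes[:4] != b"PK\x03\x04":
        raise ValueError("not a local file header")
    flags = u16(zip_bytes, 6)
    if (flags // 8) % 2:  # bit 3: sizes live in a trailing data descriptor
        raise ValueError("sizes are not stored in the local header")
    compressed_size = u32(zip_bytes, 18)
    file_name_len = u16(zip_bytes, 26)
    extra_len = u16(zip_bytes, 28)
    # The fixed part of the header is 30 bytes long.
    start = 30 + file_name_len + extra_len
    return start, start + compressed_size


# First 30 bytes of CVE-2011-4925.zip, copied from the hex dump above.
header = bytes.fromhex(
    "504b0304 1400 0000 0800 9ba9 3453 bbcc7539 44050000 fd330000 1200 0000"
)
print([hex(offset) for offset in first_stream_bounds(header)])
# ['0x30', '0x574']
</code></pre></div></div>

<p>With <code class="language-plaintext highlighter-rouge">file_name_len = 18</code>, <code class="language-plaintext highlighter-rouge">extra_len = 0</code> and <code class="language-plaintext highlighter-rouge">compressed_size = 1348</code>, this gives 30 + 18 = 0x30 for the start and 0x30 + 0x544 = 0x574 for the end, matching the dumps.</p>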

<p>Comparing the DEFLATE stream with the original zip file, with <a href="https://github.com/nevesnunes/aggregables/blob/master/aggregables/differences/hexdiff.py">summarized distinct bytes</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 CVE-2011-4925.zip CVE-2011-4925.deflate.out
</span><span class="gd">--- CVE-2011-4925.zip
</span><span class="gi">+++ CVE-2011-4925.deflate
</span><span class="gd">-        0x0: 504b0304140000000800 [...] 2d343932352e6a736f6e -&gt; b'PK\x03\x04\x14\x00\x00\x00\x08\x00' [...] b'-4925.json' [+ 28 byte(s)]
</span>  0x30,  0x0: cd5b5d6fda48147dcfaf [...] 5de3e7789faca3c77f01 -&gt; b'\xcd[]o\xdaH\x14}\xcf\xaf' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1328 byte(s)]
<span class="gd">-0x574,0x544: 504b01021e0314000000 [...] 40000000740500000000 -&gt; b'PK\x01\x02\x1e\x03\x14\x00\x00\x00' [...] b'@\x00\x00\x00t\x05\x00\x00\x00\x00' [+ 66 byte(s)]
</span></code></pre></div></div>

<h2 id="peeking-at-the-naughty-bits">Peeking at the naughty bits</h2>

<p>DEFLATE streams are encoded in <strong>bit-aligned values</strong>. Given that a lot of common editing tools only go down to the granularity of bytes (e.g. hex editors can list bits, but you can’t directly patch or truncate bits), we will need something more suited for our experiments.</p>

<p><a href="https://github.com/madler/infgen">infgen</a> is a DEFLATE disassembler, apparently based on the <a href="https://github.com/madler/zlib/blob/master/contrib/puff/puff.c">puff.c inflater</a>, which applies the same validations as <a href="https://github.com/madler/zlib/blob/master/inftrees.c">zlib’s inftrees.c</a>.</p>

<p>Still, it lacked a lot of needed verbosity, in particular which bits parsed at which offset matched a given token, along with traces of dynamic huffman table construction. Therefore, I extended it in my <a href="https://github.com/nevesnunes/infgen">fork</a> to include such details.</p>

<p>Below is an illustrated breakdown of what is represented in one of the fork’s log entries:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                               .------&gt; 1st..2nd bits parsed
                               |
                               |       ,--&gt; 6 bits skipped (already parsed)
                               +      +
       DEBUG 00000087 6: 0x2e  01......
                    + +     +  ____0111 (need 6, decode bitbuf (RTL))
                    | |     |  +      +
next byte index &lt;---' |     |  |       `----&gt; 3rd..6th bits parsed
(used in next parsing)|     |  |
                      |     |  '------&gt; 4 bits remaining from byte 0x86
next bit index &lt;------'     |
(used in current parsing)   |
                            |
hex value parsed &lt;----------'
(= 0b101110 from bits read in right-to-left order)
</code></pre></div></div>
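
<p>The bit order in this entry can be reproduced with a few lines of Python. This is only a sketch of the reading order, not code from infgen: <code class="language-plaintext highlighter-rouge">read_huffman_bits</code> is a hypothetical helper, and the two input bytes are made up so that only the bits shown in the entry are set:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def read_huffman_bits(data, bit_pos, count):
    """Read `count` bits starting at absolute bit position `bit_pos`.

    Within each byte, bits are consumed from the least significant end
    (right to left). Each bit read is appended to the code, which is how
    Huffman codes are packed in DEFLATE.
    """
    code = 0
    for _ in range(count):
        byte = data[bit_pos // 8]
        bit = (byte // 2 ** (bit_pos % 8)) % 2
        code = code * 2 + bit
        bit_pos += 1
    return code, bit_pos


# Hypothetical bytes: 01...... followed by ....0111
data = bytes([0b01000000, 0b00000111])
code, next_pos = read_huffman_bits(data, 6, 6)
print(hex(code), divmod(next_pos, 8))
# 0x2e (1, 4)
</code></pre></div></div>

<p>Note that this applies to Huffman codes only: per the specification, other fields (such as header counts and extra bits) are stored starting with their least significant bit.</p>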

<p>Now we should be able to cover many possible errors, since we can look up exactly which bits to replace.</p>

<h2 id="meet-the-symbols">Meet the symbols</h2>

<p>With DEFLATE, we don’t need the full payload to start producing decompressed output: compressed bytes are read as a stream and can be decompressed on the fly, one byte at a time. Bits are parsed from one or more bytes until a symbol is decoded.</p>
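
<p>As a small illustration of this streaming behaviour, the sketch below feeds a raw DEFLATE stream to Python’s <code class="language-plaintext highlighter-rouge">zlib.decompressobj()</code> one byte at a time. The sample text is made up for the example:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import zlib

original = b"A stream can be inflated before all of it has arrived. " * 4

# Raw DEFLATE (negative wbits: no zlib/gzip header or trailer).
compressor = zlib.compressobj(wbits=-15)
stream = compressor.compress(original) + compressor.flush()

inflater = zlib.decompressobj(-15)
output = bytearray()
for i in range(len(stream)):
    # Feed a single compressed byte; zero or more output bytes come back.
    output += inflater.decompress(stream[i:i + 1])

print(output == original, inflater.eof)
# True True
</code></pre></div></div>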

<p>To understand how a wrong symbol can affect the output, we can check the possible symbol values in the specification (RFC 1951):</p>

<ul>
  <li>0..255: <strong>literal bytes</strong>, from the alphabet of byte values (e.g. <code class="language-plaintext highlighter-rouge">symbol 65</code> = byte 0x41 = “A”);</li>
  <li>256: <strong>end-of-block</strong>;</li>
  <li>257..285: <strong>lengths for &lt;length, backward distance&gt; pairs</strong> (e.g. <code class="language-plaintext highlighter-rouge">&lt;2, 4&gt;</code> = copy 2 bytes starting 4 bytes back in the output, so if the output so far is “12345678”, we would append “56”, and any subsequent distance would be relative to the new output “1234567856”).
    <ul>
      <li>Always followed by: 0..29: <strong>distances for &lt;length, backward distance&gt; pairs</strong></li>
    </ul>
  </li>
</ul>
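
<p>The copy semantics of a &lt;length, backward distance&gt; pair can be sketched as follows. <code class="language-plaintext highlighter-rouge">copy_match</code> is a hypothetical helper, not part of any library; it copies byte by byte, which is what allows a match to overlap the bytes it is producing:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def copy_match(output, length, distance):
    """Append `length` bytes, copied from `distance` bytes back, to `output`."""
    if distance not in range(1, len(output) + 1):
        raise ValueError("distance too far back")
    for _ in range(length):
        # output[-distance] moves forward as bytes are appended.
        output.append(output[-distance])
    return output


print(copy_match(bytearray(b"12345678"), 2, 4))
# bytearray(b'1234567856')

print(copy_match(bytearray(b"A"), 4, 1))
# bytearray(b'AAAAA')
</code></pre></div></div>

<p>The second call shows the overlapping case: with a distance of 1, a single byte is repeated for the whole length.</p>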

<p><a href="https://github.com/XlogicX/YouFLATE">YouFLATE</a> allows us to interactively craft streams. Combined with infgen, the relations between these symbols become more evident:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ./youflate.pl

Current Tokens: A4,1B
ASCIIHex Data: 7304012700
Uncompressed data: AAAAAB

# echo "7304012700" | xxd -r -p | infgen -dd

                    +---&gt; bits parsed for each token
              .-----'-----.
last        ! 1              +---&gt; BFINAL=1
fixed       ! 01             +---&gt; BTYPE=1: static huffman tables, so no
literal 'A  ! 01110001      -.              table entries included in stream
match 4 1   ! 00000 0000010  +---&gt; symbols
literal 'B  ! 01110010       |
end         ! 0000000       -'
            ! 00 +---&gt; unused bits (byte padding)
</code></pre></div></div>
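
<p>To make the mapping from bits to tokens concrete, here is a minimal sketch of a token decoder for a single block encoded with the static (fixed) huffman tables of the specification. It is not how infgen or zlib are implemented; all names are hypothetical, and it deliberately ignores stored and dynamic blocks as well as invalid symbols:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>LENGTH_BASE = [3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 15, 17, 19, 23, 27, 31,
               35, 43, 51, 59, 67, 83, 99, 115, 131, 163, 195, 227, 258]
LENGTH_EXTRA = [0] * 8 + [1] * 4 + [2] * 4 + [3] * 4 + [4] * 4 + [5] * 4 + [0]
DIST_BASE = [1, 2, 3, 4, 5, 7, 9, 13, 17, 25, 33, 49, 65, 97, 129, 193,
             257, 385, 513, 769, 1025, 1537, 2049, 3073, 4097, 6145,
             8193, 12289, 16385, 24577]
DIST_EXTRA = [max(0, n // 2 - 1) for n in range(30)]


class Bits:
    def __init__(self, data):
        self.data, self.pos = data, 0

    def bit(self):
        value = (self.data[self.pos // 8] // 2 ** (self.pos % 8)) % 2
        self.pos += 1
        return value

    def field(self, count):
        # Header fields and extra bits: least significant bit first.
        return sum(self.bit() * 2 ** i for i in range(count))

    def code(self, count, prefix=0):
        # Huffman codes: most significant bit first.
        for _ in range(count):
            prefix = prefix * 2 + self.bit()
        return prefix


def fixed_symbol(bits):
    code = bits.code(7)
    if code in range(24):             # 7 bits: symbols 256..279
        return 256 + code
    code = bits.code(1, code)
    if code in range(48, 192):        # 8 bits: literals 0..143
        return code - 48
    if code in range(192, 200):       # 8 bits: symbols 280..287
        return 280 + code - 192
    return 144 + bits.code(1, code) - 400  # 9 bits: literals 144..255


def tokens(stream):
    bits = Bits(stream)
    bfinal, btype = bits.field(1), bits.field(2)
    if btype != 1:
        raise ValueError("this sketch only handles static huffman blocks")
    result = []
    while True:
        symbol = fixed_symbol(bits)
        if symbol == 256:
            return result + [("end",)]
        if symbol in range(256):
            result.append(("literal", chr(symbol)))
            continue
        i = symbol - 257
        length = LENGTH_BASE[i] + bits.field(LENGTH_EXTRA[i])
        d = bits.code(5)  # static distance codes are 5 bits each
        result.append(("match", length, DIST_BASE[d] + bits.field(DIST_EXTRA[d])))


print(tokens(bytes.fromhex("7304012700")))
# [('literal', 'A'), ('match', 4, 1), ('literal', 'B'), ('end',)]
</code></pre></div></div>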

<h2 id="how-bad-can-a-bit-flip-be">How bad can a bit flip be?</h2>

<p>Before coming up with a solution, let’s investigate how decompressors deal with corrupted streams.</p>

<p>After an attempt at decompression, we can have two types of end result:</p>

<ul>
  <li>None / partial output, due to <strong>metadata errors</strong> which stop further parsing;</li>
  <li>Full but bad output, due to <strong>length/distance errors</strong> that result in decoding the wrong symbols.</li>
</ul>

<p>If the compressed stream was parsed until the end, then we got the “full” output. For the parser, bad output is still valid output. The corrupted bytes can go undetected until a metadata error occurs or the end of the stream is reached. When does that error happen? Maybe close to the corrupted byte, maybe many bytes later. It depends on how the next symbols are decoded.</p>

<p>How about some examples? As usual when parsing file formats, different tools have different behaviours…</p>

<h3 id="bfinal0-but-no-more-blocks-in-stream">BFINAL=0 but no more blocks in stream</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/100_BFINAL0/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 100_BFINAL0/CVE-2011-4925.deflate
</span><span class="gd">-        0x0: cd -&gt; b'\xcd'
</span><span class="gi">+        0x0: cc -&gt; b'\xcc'
</span>         0x1: 5b5d6fda48147dcfaf18 [...] 5de3e7789faca3c77f01 -&gt; b'[]o\xdaH\x14}\xcf\xaf\x18' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1327 byte(s)]

infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 100_BFINAL0/CVE-2011-4925.deflate
</span><span class="p">@@ -1,12 +1,12 @@</span>
<span class="gd">-DEBUG 00000001 0: 0x1   _______1 (need 1, BFINAL)
-INFO  00000001 1: BFINAL 1 (last block)
</span><span class="gi">+DEBUG 00000001 0: 0x0   _______0 (need 1, BFINAL)
+INFO  00000001 1: BFINAL 0 (not last block)
</span></code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
zlib.error: Error -5 while decompressing data: incomplete or truncated stream

# zlib.decompressobj(-15).decompress(...) | tail
      "obtainAllPrivilege": false,
      "obtainOtherPrivilege": false,
      "obtainUserPrivilege": false,
      "severity": "MEDIUM",
      "userInteractionRequired": false
    }
  },
  "lastModifiedDate": "2012-02-02T04:09Z",
  "publishedDate": "2012-01-13T04:14Z"
}

# infgen -d ... | grep WARN
WARN  00000548 4: incomplete deflate data

# unzip -p -- ...
  error:  invalid compressed data to inflate CVE-2011-4925.json

# jar xf ...
java.util.zip.ZipException: invalid stored block lengths
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:165)
</code></pre></div></div>

<p>Observations:</p>

<ul>
  <li>unzip just gives a generic error message about the data stream (it always gives this message, so it will be omitted from the other examples);</li>
  <li>zlib.decompressobj() did not report any error, since additional input could still be expected (it will be omitted when it reports the same error as zlib.decompress() and has no output);</li>
  <li>Java’s decompressor references block lengths, which is misleading: those come after BFINAL and BTYPE, but such fields weren’t present in the stream.</li>
</ul>
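
<p>The difference between the two zlib entry points can be reproduced on a small made-up stream. The sketch below assumes that such a short input is emitted as a single block, so that bit 0 of the first byte is its BFINAL flag:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import zlib

original = b'{"severity": "MEDIUM", "userInteractionRequired": false}'
compressor = zlib.compressobj(wbits=-15)
stream = bytearray(compressor.compress(original) + compressor.flush())

# Assumption: a single block, so bit 0 of the first byte is BFINAL.
assert stream[0] % 2 == 1
stream[0] -= 1  # BFINAL=0: claim that another block follows

try:
    zlib.decompress(bytes(stream), -15)
    one_shot = "no error"
except zlib.error as error:
    one_shot = str(error)

inflater = zlib.decompressobj(-15)
streamed = inflater.decompress(bytes(stream))

print(one_shot)
print(streamed == original, inflater.eof)
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">zlib.decompress()</code> is expected to fail like in the listing above, while the decompression object returns the complete output. Its <code class="language-plaintext highlighter-rouge">eof</code> attribute staying false is the only hint that the end of the stream was never reached.</p>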

<h3 id="btype3-which-is-reserved">BTYPE=3 which is reserved</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/101_BTYPE3/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 101_BTYPE3/CVE-2011-4925.deflate
</span><span class="gd">-        0x0: cd -&gt; b'\xcd'
</span><span class="gi">+        0x0: cf -&gt; b'\xcf'
</span>         0x1: 5b5d6fda48147dcfaf18 [...] 5de3e7789faca3c77f01 -&gt; b'[]o\xdaH\x14}\xcf\xaf\x18' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1327 byte(s)]

infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 101_BTYPE3/CVE-2011-4925.deflate
</span><span class="p">@@ -1,8704 +1,14 @@</span>
<span class="gd">-DEBUG 00000001 1: 0x2   _____10. (need 2, BTYPE)
-INFO  00000001 3: BTYPE 10 (compressed, dynamic)
</span><span class="gi">+DEBUG 00000001 1: 0x3   _____11. (need 2, BTYPE)
+INFO  00000001 3: BTYPE 11 (reserved)
</span></code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
zlib.error: Error -3 while decompressing data: invalid block type

# zlib.decompressobj(-15).decompress(...) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
  File "&lt;string&gt;", line 1, in &lt;listcomp&gt;
zlib.error: Error -3 while decompressing data: invalid block type

# infgen -d ... | grep WARN
WARN  00000001 3: invalid deflate data -- invalid block type (3)

# jar xf ...
java.util.zip.ZipException: invalid block type
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:165)
</code></pre></div></div>
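
<p>Both this modification and the previous one touch the first three bits of the block. A short sketch of how those bits are laid out in the first byte, followed by the same corruption applied to a made-up stream (<code class="language-plaintext highlighter-rouge">block_header</code> is a hypothetical helper):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import zlib


def block_header(first_byte):
    """Split the first three bits of a block into (BFINAL, BTYPE)."""
    return first_byte % 2, (first_byte // 2) % 4


for value in (0xcd, 0xcc, 0xcf):
    print(hex(value), block_header(value))
# 0xcd (1, 2)
# 0xcc (0, 2)
# 0xcf (1, 3)

compressor = zlib.compressobj(wbits=-15)
stream = bytearray(compressor.compress(b"hello hello hello") + compressor.flush())
stream[0] |= 0b110  # force BTYPE=3 (reserved)

try:
    zlib.decompress(bytes(stream), -15)
    message = "no error"
except zlib.error as error:
    message = str(error)
print(message)
# Error -3 while decompressing data: invalid block type
</code></pre></div></div>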

<h3 id="hdist31-which-is-more-than-allowed">HDIST=31 which is more than allowed</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/102_HDIST31/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 102_HDIST31/CVE-2011-4925.deflate
</span>         0x0: cd -&gt; b'\xcd'
<span class="gd">-        0x1: 5b -&gt; b'['
</span><span class="gi">+        0x1: 5f -&gt; b'_'
</span>         0x2: 5d6fda48147dcfaf1879 [...] 5de3e7789faca3c77f01 -&gt; b']o\xdaH\x14}\xcf\xaf\x18y' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1326 byte(s)]

infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 102_HDIST31/CVE-2011-4925.deflate
</span><span class="p">@@ -1,6 +1,6 @@</span>
<span class="gd">-DEBUG 00000001 0: 0x5b  01011011 (parse)
-DEBUG 00000002 0: 0x1b  ___11011 (need 5, dyn HDIST)
</span><span class="gi">+DEBUG 00000001 0: 0x5f  01011111 (parse)
+DEBUG 00000002 0: 0x1f  ___11111 (need 5, dyn HDIST)
</span> DEBUG 00000002 5: 0x5d  01011101 (parse)
 DEBUG 00000003 5: 0xa   010.....
                         _______1 (need 4, dyn HCLEN)
<span class="gd">-INFO  00000003 1: ! dyn count (HLIT 282, HDIST 28, HCLEN 14)
</span></code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
zlib.error: Error -3 while decompressing data: too many length or distance symbols

# zlib.decompressobj(-15).decompress(...) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
  File "&lt;string&gt;", line 1, in &lt;listcomp&gt;
zlib.error: Error -3 while decompressing data: too many length or distance symbols

# infgen -d ... | grep WARN
WARN  00000003 1: invalid deflate data -- too many length or distance codes

# jar xf ...
java.util.zip.ZipException: too many length or distance symbols
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:165)
</code></pre></div></div>
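
<p>The counts in the infgen log can be recomputed from the first three bytes of the stream. In this sketch, <code class="language-plaintext highlighter-rouge">dynamic_counts</code> and <code class="language-plaintext highlighter-rouge">counts_allowed</code> are hypothetical helpers; the limits of 286 literal/length codes and 30 distance codes are the ones enforced by zlib’s inflate:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def dynamic_counts(stream):
    """Parse HLIT, HDIST and HCLEN from the start of a dynamic block."""
    # Little-endian, so bit 0 of the integer is the first bit of the stream.
    header = int.from_bytes(stream[:3], "little")
    if (header // 2) % 4 != 2:
        raise ValueError("not a dynamic huffman block")
    hlit = (header // 2 ** 3) % 32 + 257   # 5 bits, number of literal/length codes
    hdist = (header // 2 ** 8) % 32 + 1    # 5 bits, number of distance codes
    hclen = (header // 2 ** 13) % 16 + 4   # 4 bits, number of code length codes
    return hlit, hdist, hclen


def counts_allowed(hlit, hdist):
    return hlit in range(257, 287) and hdist in range(1, 31)


for first_bytes in ("cd5b5d", "cd5f5d"):
    hlit, hdist, hclen = dynamic_counts(bytes.fromhex(first_bytes))
    print((hlit, hdist, hclen), counts_allowed(hlit, hdist))
# (282, 28, 14) True
# (282, 32, 14) False
</code></pre></div></div>

<p>The stored field value of 31 maps to 32 distance codes, since the count is the field value plus one.</p>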

<h3 id="lengths-repeated-are-greater-than-hdist-count">Lengths repeated are greater than HDIST count</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/103_HDIST_repeat_more/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 103_HDIST_repeat_more/CVE-2011-4925.deflate
</span>         0x0: cd5b5d6fda48147dcfaf -&gt; b'\xcd[]o\xdaH\x14}\xcf\xaf'
<span class="gd">-        0xa: 18 -&gt; b'\x18'
</span><span class="gi">+        0xa: 1f -&gt; b'\x1f'
</span>         0xb: 795f76ab80b1b1d3c44f [...] 5de3e7789faca3c77f01 -&gt; b'y_v\xab\x80\xb1\xb1\xd3\xc4O' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1317 byte(s)]

infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 103_HDIST_repeat_more/CVE-2011-4925.deflate
</span><span class="p">@@ -95,8610 +95,334 @@</span>
<span class="gd">-DEBUG 0000000a 4: 0x18  00011000 (parse)
-DEBUG 0000000b 4: 0xa   1010....
-                        _____000 (need 7, repeat=18 (0 x 11..138))
-INFO  0000000b 3: zeros 21
</span><span class="gi">+DEBUG 0000000a 4: 0x1f  00011111 (parse)
+DEBUG 0000000b 4: 0x7a  1010....
+                        _____111 (need 7, repeat=18 (0 x 11..138))
+INFO  0000000b 3: zeros 133
</span> DEBUG 0000000b 3: 0xc   _0011... (need 4, decode bitbuf (RTL))
<span class="gd">-INFO  0000000b 7: ! decoded len 4 bits 0011 sym_i 32 [index + (code - first)] = [5 + (12 - 12)] = [5] = 5
</span><span class="gi">+INFO  0000000b 7: ! decoded len 4 bits 0011 sym_i 144 [index + (code - first)] = [5 + (12 - 12)] = [5] = 5
</span> INFO  0000000b 7: lens 5 (0x5)
 [...]
 DEBUG 0000002d 5: 0xbe  10111110 (parse)
 DEBUG 0000002e 5: 0x77  111.....
                         ____1110 (need 7, repeat=18 (0 x 11..138))
<span class="gd">-INFO  0000002e 4: zeros 130
-DEBUG 0000002e 4: 0xd   1011.... (need 4, decode bitbuf (RTL))
-INFO  0000002e 0: ! decoded len 4 bits 1011 sym_i 256 [index + (code - first)] = [5 + (13 - 12)] = [6] = 10
</span> [...]
<span class="gd">-INFO  00000544 1: ! decoded len 10 bits 1011111111 sym_i 1087 [index + (code - first)] = [88 + (1021 - 1014)] = [95] = 256
-INFO  00000544 1: decode, symbol=256
-INFO  00000544 1: end
</span></code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
zlib.error: Error -3 while decompressing data: invalid bit length repeat

# infgen -d ... | grep WARN
WARN  0000002e 4: invalid deflate data -- repeat more lengths than available

# jar xf ...
java.util.zip.ZipException: invalid bit length repeat
	at java.base/java.util.zip.InflaterInputStream.read(InflaterInputStream.java:165)
</code></pre></div></div>
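
<p>The arithmetic behind this error can be worked through in a few lines. <code class="language-plaintext highlighter-rouge">run_length</code> is a hypothetical helper using the base values and extra bit counts that the specification assigns to the repeat symbols 16, 17 and 18; the index 256 is read off the original log above, on the assumption that it is the index reached right after the run of 130 zeros:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def run_length(symbol, extra):
    """Number of code lengths produced by a repeat symbol and its extra bits."""
    base, extra_bits = {16: (3, 2), 17: (3, 3), 18: (11, 7)}[symbol]
    if extra not in range(2 ** extra_bits):
        raise ValueError("extra value does not fit in the extra bits")
    return base + extra


print(run_length(18, 0x0a), run_length(18, 0x7a))
# 21 133

available = 282 + 28  # HLIT + HDIST of the sample stream

# The corrupted run is longer, so every later code length lands at a
# higher index than in the original stream.
shift = run_length(18, 0x7a) - run_length(18, 0x0a)
end_original = 256
end_modified = end_original + shift
print(shift, end_modified, end_modified in range(available + 1))
# 112 368 False
</code></pre></div></div>

<p>Under that assumption, the later run of 130 zeros would end at index 368, past the 310 code lengths announced by the header, which is consistent with the error being reported at offset 0x2e rather than at the modified byte.</p>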

<h3 id="trailing-bytes-that-should-be-ignored">Trailing bytes that should be ignored</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/104_trail/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 104_trail/CVE-2011-4925.deflate
</span>         0x0: cd5b5d6fda48147dcfaf [...] 5de3e7789faca3c77f01 -&gt; b'\xcd[]o\xdaH\x14}\xcf\xaf' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1328 byte(s)]
<span class="gi">+      0x544: 41424344 -&gt; b'ABCD'
</span>
infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 104_trail/CVE-2011-4925.deflate
</span><span class="p">@@ -8700,5 +8700,18 @@</span>
 INFO  00000544 1: ! decoded len 10 bits 1011111111 sym_i 1087 [index + (code - first)] = [88 + (1021 - 1014)] = [95] = 256
 INFO  00000544 1: decode, symbol=256
 INFO  00000544 1: end
<span class="gd">-DEBUG 00000544 1: 0xffffffff 11111111 (parse)
-DEBUG 00000545 1: 0xffffffff 11111111 (parse)
</span><span class="gi">+DEBUG 00000544 1: 0x41  01000001 (parse)
+DEBUG 00000545 1: 0x42  01000010 (parse)
+DEBUG 00000546 1: &lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt; (unparse)
+DEBUG 00000545 1: &lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt;&lt; (unparse)
+INFO  00000544 1: ! raw
+DEBUG 00000544 1: 0x41  01000001 (parse)
+DEBUG 00000545 1: 0x1   ______1. (need 1, BFINAL)
+INFO  00000545 2: BFINAL 1 (last block)
+DEBUG 00000545 2: 0x0   ____00.. (need 2, BTYPE)
+INFO  00000545 4: BTYPE 00 (no compression) 8
+DEBUG 00000545 4: 0x42  01000010 (parse)
+DEBUG 00000546 4: 0x43  01000011 (parse)
+DEBUG 00000547 4: 0x44  01000100 (parse)
+DEBUG 00000548 4: 0xffffffff 11111111 (parse)
</span></code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
      "obtainAllPrivilege": false,
      "obtainOtherPrivilege": false,
      "obtainUserPrivilege": false,
      "severity": "MEDIUM",
      "userInteractionRequired": false
    }
  },
  "lastModifiedDate": "2012-02-02T04:09Z",
  "publishedDate": "2012-01-13T04:14Z"
}

# zlib.decompressobj(-15).decompress(...) | tail
      "obtainAllPrivilege": false,
      "obtainOtherPrivilege": false,
      "obtainUserPrivilege": false,
      "severity": "MEDIUM",
      "userInteractionRequired": false
    }
  },
  "lastModifiedDate": "2012-02-02T04:09Z",
  "publishedDate": "2012-01-13T04:14Z"
}

# infgen -d ... | grep WARN
WARN  00000549 4: incomplete deflate data
</code></pre></div></div>

<p>Observations:</p>

<ul>
  <li>infgen tries to read beyond a block with BFINAL=1, perhaps unintentionally.</li>
</ul>
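
<p>With Python’s zlib, the ignored trailing bytes are not lost: a decompression object exposes them through its <code class="language-plaintext highlighter-rouge">unused_data</code> attribute. A short sketch on a made-up stream:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import zlib

original = b"trailing bytes are not part of the stream"
compressor = zlib.compressobj(wbits=-15)
stream = compressor.compress(original) + compressor.flush()

inflater = zlib.decompressobj(-15)
output = inflater.decompress(stream + b"ABCD")

print(output == original, inflater.eof, inflater.unused_data)
# True True b'ABCD'
</code></pre></div></div>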

<h3 id="distance-too-far-back">Distance too far back</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/110_dist_too_far_back/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 110_dist_too_far_back/CVE-2011-4925.deflate
</span>         0x0: cd5b5d6fda48147dcfaf [...] f932975473912a2b20df -&gt; b'\xcd[]o\xdaH\x14}\xcf\xaf' [...] b'\xf92\x97Ts\x91*+ \xdf' [+ 65 byte(s)]
<span class="gd">-       0x55: 60 -&gt; b'`'
</span><span class="gi">+       0x55: 40 -&gt; b'@'
</span>        0x56: 15d6fbd7839b986a7ab3 [...] 5de3e7789faca3c77f01 -&gt; b'\x15\xd6\xfb\xd7\x83\x9b\x98jz\xb3' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1242 byte(s)]

infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 110_dist_too_far_back/CVE-2011-4925.deflate
</span><span class="p">@@ -919,7 +919,7 @@</span>
 DEBUG 00000054 0: 0xdf  11011111 (parse)
<span class="gd">-DEBUG 00000055 0: 0x60  01100000 (parse)
</span><span class="gi">+DEBUG 00000055 0: 0x40  01000000 (parse)
</span> DEBUG 00000056 0: 0x1f6 11011111
                         _______0 (need 9, decode bitbuf (RTL))
 INFO  00000056 1: ! decoded len 9 bits 011011111 sym_i 23 [index + (code - first)] = [73 + (502 - 492)] = [83] = 123
<span class="p">@@ -929,545 +929,537 @@</span>
 INFO  00000056 5: ! decoded len 4 bits 0000 sym_i 24 [index + (code - first)] = [0 + (0 - 0)] = [0] = 257
 INFO  00000056 5: decode, symbol=257
 DEBUG 00000056 5: 0x0   ___..... (need 0, length code (257..285))
<span class="gd">-DEBUG 00000056 5: 0x15  00010101 (parse)
-DEBUG 00000057 5: 0x1a  011.....
-                        ______01 (need 5, decode bitbuf (RTL))
-INFO  00000057 2: ! decoded len 5 bits 01011 sym_i 24 [index + (code - first)] = [10 + (26 - 26)] = [10] = 8
-DEBUG 00000057 2: 0x5   ___101.. (need 3, distance code)
-INFO  00000057 5: match (len 3, dist 22)
-DEBUG 00000057 5: 0xd6  11010110 (parse)
-DEBUG 00000058 5: 0x0   000.....
-                        _______0 (need 4, decode bitbuf (RTL))
-INFO  00000058 1: ! decoded len 4 bits 0000 sym_i 25 [index + (code - first)] = [0 + (0 - 0)] = [0] = 257
-INFO  00000058 1: decode, symbol=257
-DEBUG 00000058 1: 0x0   _______. (need 0, length code (257..285))
-DEBUG 00000058 1: 0x1a  __01011. (need 5, decode bitbuf (RTL))
-INFO  00000058 6: ! decoded len 5 bits 01011 sym_i 25 [index + (code - first)] = [10 + (26 - 26)] = [10] = 8
-DEBUG 00000058 6: 0xfb  11111011 (parse)
-DEBUG 00000059 6: 0x7   11......
-                        _______1 (need 3, distance code)
-INFO  00000059 1: match (len 3, dist 24)
-DEBUG 00000059 1: 0x5f  1111101. (need 7, decode bitbuf (RTL))
-INFO  00000059 0: ! decoded len 7 bits 1111101 sym_i 26 [index + (code - first)] = [30 + (95 - 90)] = [35] = 67
-INFO  00000059 0: decode, symbol=67
-INFO  00000059 0: literal C (0x43)
-DEBUG 00000059 0: 0xd7  11010111 (parse)
</span> [...]
<span class="gi">+DEBUG 00000056 5: 0x2   010..... (need 3, decode bitbuf (RTL))
+INFO  00000056 0: ! decoded len 3 bits 010 sym_i 24 [index + (code - first)] = [0 + (2 - 0)] = [2] = 21
+DEBUG 00000056 0: 0x15  00010101 (parse)
+DEBUG 00000057 0: 0xd6  11010110 (parse)
+DEBUG 00000058 0: 0x15  00010101
+                        _______0 (need 9, distance code)
+INFO  00000058 1: match (len 3, dist 1558)
</span> [...]
<span class="gd">-INFO  00000544 1: ! decoded len 10 bits 1011111111 sym_i 1087 [index + (code - first)] = [88 + (1021 - 1014)] = [95] = 256
</span><span class="gi">+INFO  00000544 1: ! decoded len 10 bits 1011111111 sym_i 1084 [index + (code - first)] = [88 + (1021 - 1014)] = [95] = 256
</span> INFO  00000544 1: decode, symbol=256
 INFO  00000544 1: end
 DEBUG 00000544 1: 0xffffffff 11111111 (parse)
</code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
zlib.error: Error -3 while decompressing data: invalid distance too far back

# zlib.decompressobj(-15).decompress(...) | tail
Traceback (most recent call last):
  File "&lt;string&gt;", line 1, in &lt;module&gt;
  File "&lt;string&gt;", line 1, in &lt;listcomp&gt;
zlib.error: Error -3 while decompressing data: invalid distance too far back
{
  "configurations": {

# infgen -d ... | grep WARN
WARN  00000058 1: distance too far back (dist:1558, max:23)

# unzip -p -- ...
{
  "configurations": {
</code></pre></div></div>

<p>Observations:</p>

<ul>
  <li>Both zlib.decompressobj() and unzip manage to decompress symbols up to the invalid distance, but stop decompression at that point;</li>
  <li>In contrast, infgen continues parsing beyond the invalid distance until the end of the stream, detecting several unexpected symbols.</li>
</ul>
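
<p>The same class of error can be triggered on a much smaller stream. The sketch below reuses the 5-byte YouFLATE stream <code class="language-plaintext highlighter-rouge">7304012700</code> from earlier and flips one bit so that the distance code 0 (distance 1) becomes 1 (distance 2), at a point where only one byte of output exists. The corrupted byte value is my own construction:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import zlib

valid = bytes.fromhex("7304012700")    # literal A, match (len 4, dist 1), literal B
corrupt = bytes.fromhex("7304412700")  # same tokens, but match (len 4, dist 2)

print(zlib.decompress(valid, -15))
# b'AAAAAB'

try:
    zlib.decompress(corrupt, -15)
    message = "no error"
except zlib.error as error:
    message = str(error)
print(message)
# Error -3 while decompressing data: invalid distance too far back

# Feeding one byte at a time keeps whatever was decoded before the error.
inflater = zlib.decompressobj(-15)
partial = bytearray()
try:
    for i in range(len(corrupt)):
        partial += inflater.decompress(corrupt[i:i + 1])
except zlib.error:
    pass
print(bytes(partial))
</code></pre></div></div>

<p>Feeding the input in small chunks is one way to obtain the partial output that a single <code class="language-plaintext highlighter-rouge">decompress()</code> call would discard when it raises.</p>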

<h3 id="distance-valid-but-bad">Distance valid but bad</h3>

<p>Comparing <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">original</a> with <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/111_full_but_bad/CVE-2011-4925.deflate">modified</a>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">hexdiff.py -c -l 40 ...
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 111_full_but_bad/CVE-2011-4925.deflate
</span>         0x0: cd5b5d6fda48147dcfaf [...] d5cc4d97ad9bdbbd92dc -&gt; b'\xcd[]o\xdaH\x14}\xcf\xaf' [...] b'\xd5\xccM\x97\xad\x9b\xdb\xbd\x92\xdc' [+ 116 byte(s)]
<span class="gd">-       0x88: 04 -&gt; b'\x04'
</span><span class="gi">+       0x88: c4 -&gt; b'\xc4'
</span>        0x89: 86ef81dbee063488925c [...] 5de3e7789faca3c77f01 -&gt; b'\x86\xef\x81\xdb\xee\x064\x88\x92\\' [...] b']\xe3\xe7x\x9f\xac\xa3\xc7\x7f\x01' [+ 1191 byte(s)]

infgen -d ...
<span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 111_full_but_bad/CVE-2011-4925.deflate
</span><span class="p">@@ -1264,7 +1264,7 @@</span>
 INFO  00000088 5: ! decoded len 5 bits 11100 sym_i 74 [index + (code - first)] = [2 + (7 - 4)] = [5] = 105
 INFO  00000088 5: decode, symbol=105
 INFO  00000088 5: literal i (0x69)
<span class="gd">-DEBUG 00000088 5: 0x4   00000100 (parse)
</span><span class="gi">+DEBUG 00000088 5: 0xc4  11000100 (parse)
</span> DEBUG 00000089 5: 0xc   110.....
                         ______00 (need 5, decode bitbuf (RTL))
 INFO  00000089 2: ! decoded len 5 bits 00110 sym_i 75 [index + (code - first)] = [2 + (12 - 4)] = [10] = 258
<span class="p">@@ -1273,9 +1273,9 @@</span>
 DEBUG 00000089 2: 0x8   __0001.. (need 4, decode bitbuf (RTL))
 INFO  00000089 6: ! decoded len 4 bits 0001 sym_i 75 [index + (code - first)] = [3 + (8 - 6)] = [5] = 12
 DEBUG 00000089 6: 0x86  10000110 (parse)
<span class="gd">-DEBUG 0000008a 6: 0x18  00......
</span><span class="gi">+DEBUG 0000008a 6: 0x1b  11......
</span>                         _____110 (need 5, distance code)
<span class="gd">-INFO  0000008a 3: match (len 4, dist 89)
</span><span class="gi">+INFO  0000008a 3: match (len 4, dist 92)
</span> DEBUG 0000008a 3: 0x0   _0000... (need 4, decode bitbuf (RTL))
 INFO  0000008a 7: ! decoded len 4 bits 0000 sym_i 76 [index + (code - first)] = [0 + (0 - 0)] = [0] = 257
 INFO  0000008a 7: decode, symbol=257
</code></pre></div></div>

<p>Errors:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># zlib.decompress(..., -15) | tail
      "obtainAllPrivilegeionfalse,
      "obtainOtherPrivilegeionfalse,
      "obtainUserPrivilegeionfalse,
      "severityion"MEDIUM",
      "userInteractionRequiredionfalse
    }
  },
  "lastModifiedDate": "2012-02-02T04:09Z",
  "publishedDate": "2012-01-13T04:14Z"
}

# unzip -p -- ... | tail
      "obtainAllPrivilegeionfalse,
      "obtainOtherPrivilegeionfalse,
      "obtainUserPrivilegeionfalse,
      "severityion"MEDIUM",
      "userInteractionRequiredionfalse
    }
  },
  "lastModifiedDate": "2012-02-02T04:09Z",
  "publishedDate": "2012-01-13T04:14Z"
}CVE-2011-4925.json      bad CRC 674112a4  (should be 3975ccbb)

# jar xf ...
java.util.zip.ZipException: invalid entry CRC (expected 0x3975ccbb but got 0x674112a4)
	at java.base/java.util.zip.ZipInputStream.readEnd(ZipInputStream.java:410)
</code></pre></div></div>

<p>Observations:</p>

<ul>
  <li>The output is unexpected (e.g. <code class="language-plaintext highlighter-rouge">"obtainAllPrivilegeionfalse,</code> instead of a key value pair), but the stream itself is valid, so infgen doesn’t report any warning;</li>
  <li>Corruption is only detected when parsed from a zip, since a checksum is also included (assuming the checksum itself wasn’t corrupted as well).</li>
</ul>
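
<p>To make the second observation concrete, here is a minimal sketch of the check that a zip reader performs on top of inflating. It only relies on the standard <code class="language-plaintext highlighter-rouge">zlib</code> module; the helper name <code class="language-plaintext highlighter-rouge">inflate_and_check</code> and the sample payload are made up for illustration:</p>

```python
import zlib


def inflate_and_check(raw_deflate: bytes, expected_crc: int):
    """Inflate a raw DEFLATE stream and compare the CRC-32 of the output
    with a checksum stored elsewhere (e.g. in a zip entry header)."""
    output = zlib.decompress(raw_deflate, -15)  # -15 = raw stream, no header
    return output, zlib.crc32(output) == expected_crc


if __name__ == "__main__":
    original = b'{"severity": "MEDIUM"}'
    compressor = zlib.compressobj(wbits=-15)
    stream = compressor.compress(original) + compressor.flush()

    _, ok = inflate_and_check(stream, zlib.crc32(original))
    print("matches stored CRC:", ok)
```

<p>A bare DEFLATE stream carries no such checksum, which is why the modified stream above inflates without complaints.</p>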

<h2 id="guidance-from-structured-data">Guidance from structured data</h2>

<p>How would we deal with the previous examples?</p>

<ul>
  <li>With some context, we could correct simple ones manually (e.g. BTYPE or HDIST values);</li>
  <li>Corrupted dynamic huffman tables are the hardest to handle, since the corruption can happen at a certain offset, but only after parsing the whole table will an error be reported. Our solution will not handle these cases, but some hints on how to manipulate these tables are given in the <a href="#part-ii-embellishing-streams">next part</a>;</li>
  <li>This leaves us with errors in literals and distance codes, which seem to be reported close to the offset where the corruption resides. Our solution will cover these cases.</li>
</ul>

<p>Let’s go back to our zip file. The metadata contains a file entry where we can see the extension of the compressed file (i.e. json). We know that json files are structured data. Therefore, if there are decompression errors that lead to bad output, a json parser would pick up some syntax errors. So, if we manage to <strong>parse a given part of the json without errors</strong>, we can assume <strong>it wasn’t hit by corruption</strong>. Sure, we can have e.g. <code class="language-plaintext highlighter-rouge">"a":1</code> instead of <code class="language-plaintext highlighter-rouge">"a":0</code>, but those would be very specific edge cases.</p>

<p>What can we use as a parser? <a href="https://tree-sitter.github.io/tree-sitter/">Tree-sitter</a>!</p>

<blockquote>
  <p>[…] a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited.</p>
</blockquote>

<p>We can make use of this syntax tree, which includes useful information, such as error nodes for tokens that contain syntax errors. By taking the corresponding text content of each node, we can reconstruct the json up to the first error node, and measure its length.</p>

<p>The idea is to <strong>generate candidate bytes to replace at the error offset</strong> reported by infgen, take the json output, pass it to tree-sitter, then <strong>check if we got a larger valid syntax tree</strong>: if we did, then probably we were able to fix the corruption!</p>
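
<p>To get a feel for the scoring step without installing tree-sitter, here is a rough stand-in that uses the standard <code class="language-plaintext highlighter-rouge">json</code> module instead. It is not what the script does: <code class="language-plaintext highlighter-rouge">json</code> stops at the first syntax error instead of building a tree with error nodes, but the reported error position still tells us how far the output stayed valid. Both function names are hypothetical:</p>

```python
import json


def valid_prefix_len(text: str) -> float:
    """Return the offset of the first JSON syntax error, or infinity if
    the whole text parses. Larger values mean more of the output is valid."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        return e.pos
    return float("inf")


def best_candidate(candidates: dict) -> str:
    """Pick the label of the candidate output that parses the furthest."""
    return max(candidates, key=lambda label: valid_prefix_len(candidates[label]))


if __name__ == "__main__":
    outputs = {
        "early error": '{"x"ion}',
        "later error": '{"x": 1, "y"ion}',
    }
    print(best_candidate(outputs))
```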

<p>The following <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/fix_deflate_w_sitter.py">script</a> automates this process:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./fix_deflate_w_sitter.py 110_dist_too_far_back/CVE-2011-4925.deflate
</code></pre></div></div>

<p>For each candidate modification, we keep track of the (partially) successfully parsed output:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mh">0x100</span><span class="p">):</span>  <span class="c1"># Try candidate byte at error index
</span>    <span class="n">data</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="n">wi</span><span class="p">]</span> <span class="o">=</span> <span class="n">k</span>
    <span class="k">for</span> <span class="n">k2</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mh">0x100</span><span class="p">):</span>  <span class="c1"># Try candidate byte at error index + 1
</span>        <span class="n">data</span><span class="p">[</span><span class="n">i</span> <span class="o">+</span> <span class="n">wi</span> <span class="o">+</span> <span class="mi">1</span><span class="p">]</span> <span class="o">=</span> <span class="n">k2</span>

        <span class="c1"># Buffered decompression using zlib.decompressobj(),
</span>        <span class="c1"># up to an error or end of stream
</span>        <span class="n">o</span><span class="p">,</span> <span class="n">i_</span><span class="p">,</span> <span class="n">has_errors_</span> <span class="o">=</span> <span class="n">decompress</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>

        <span class="c1"># Extract AST using tree-sitter
</span>        <span class="n">tree</span> <span class="o">=</span> <span class="n">PARSER</span><span class="p">.</span><span class="n">parse</span><span class="p">(</span><span class="n">o</span><span class="p">)</span>
        <span class="n">count_valid_tokens</span><span class="p">,</span> <span class="n">error_node</span> <span class="o">=</span> <span class="n">json_sitter</span><span class="p">.</span><span class="n">bfs</span><span class="p">(</span><span class="n">tree</span><span class="p">,</span> <span class="n">o</span><span class="p">)</span>
        <span class="n">error_byte_i</span> <span class="o">=</span> <span class="n">error_node</span><span class="p">.</span><span class="n">start_byte</span> <span class="k">if</span> <span class="n">error_node</span> <span class="k">else</span> <span class="nb">float</span><span class="p">(</span><span class="s">"inf"</span><span class="p">)</span>
</code></pre></div></div>
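
<p>The <code class="language-plaintext highlighter-rouge">decompress()</code> helper isn’t shown in the excerpt above. A minimal sketch of what such a buffered helper could look like follows; the chunking and the meaning of the returned offset are assumptions, so check the script for the real behaviour:</p>

```python
import zlib


def decompress(data: bytes, chunk_size: int = 1):
    """Inflate a raw DEFLATE stream chunk by chunk.

    Returns (output, offset, has_errors), where offset is the start of the
    chunk being fed when zlib raised an error, or len(data) otherwise.
    Note that a truncated stream is not reported as an error here.
    """
    d = zlib.decompressobj(-15)  # raw DEFLATE, no zlib/gzip header
    output = bytearray()
    for offset in range(0, len(data), chunk_size):
        try:
            output += d.decompress(data[offset:offset + chunk_size])
        except zlib.error:
            # Keep everything that was inflated before the error
            return bytes(output), offset, True
        if d.eof:
            break
    return bytes(output), len(data), False


if __name__ == "__main__":
    # A stored block holding b"hello", then a block header with reserved BTYPE=3
    stream = b"\x00\x05\x00\xfa\xff" + b"hello" + b"\x07"
    print(decompress(stream))
```

<p>Feeding small chunks is what lets us keep the output produced before the error, instead of losing everything as with a single <code class="language-plaintext highlighter-rouge">zlib.decompress()</code> call.</p>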

<p>After aggregating these outputs, the user is presented with a writable file, containing entries with candidate byte modifications. This file works similarly to the todo list of “git rebase -i”, where each line starts with a command that controls further processing. For our example deflate stream, an error was detected by zlib.decompressobj(), and these candidates were generated:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Commands:
# p, pick &lt;line&gt; = use line for patch
# d, drop &lt;line&gt; = ignore line
# w, write &lt;line&gt; = write state for debugging
# x, expand &lt;line&gt; = show full line contents
#
# Lines starting with '#' will be ignored.
# This file will be restored if no line is picked.

d 0 (@0x54_0xfe_0x63, cvt=11, len=inf) b'{\n  "configurations": 5}'
[...]
d 6 (@0x54_0x27_0x8f, cvt=11, len=inf) b'{\n  "configurations": 6}'
d 7 (@0x54_0xbf_0x66, cvt=35, len=988) b'{\n  "configurations": {\n  "configurations": {\n  "configurations": {\n  "configurations": {\n    "CVE_data_version": "4.0",\n    "nodes": [\n      {\n        "cpe_match": [\n          {\n            "cpe23Uri": "cpe:2.3:a:cluster_resources:torque_resource_manager:1.0.1p0:*:*:*:*:*:*:*",\n [...]
d 8 (@0x54_...._0x60, cvt=26, len=922) b'{\n  "configurations": {\n    "CVE_data_version": "4.0",\n    "nodes": [\n      {\n        "cpe_match": [\n          {\n            "cpe23Uri": "cpe:2.3:a:cluster_resources:torque_resource_manager:1.0.1p0:*:*:*:*:*:*:*",\n            "vulnerable": true\n          },\n          {\n            "cpe23Uri": "cpe:2.3:a:cluster_resources:torque_resource_manager:1.0.1p1:*:*:*:*:*:*:*",\n [...]
[...]
</code></pre></div></div>

<p>Entries are sorted by descending length of tree-sitter’s successfully parsed output. Indices 0 to 6 are edge cases we can ignore. Index 7 has the largest valid length, but we see some fields being repeated… It’s possible that this is an artifact of incorrect length/distance codes, so let’s move on. Index 8 only has one byte change (<code class="language-plaintext highlighter-rouge">@0x54_...._0x60</code> = changed byte at offset 0x55 to 0x60), and it seems closer to what we expect. If we pick this entry (replace command <code class="language-plaintext highlighter-rouge">d</code> with <code class="language-plaintext highlighter-rouge">p</code>, save and quit), we end up having the full decompressed output, since there were no more errors detected. Of course, if there were more errors in the stream, a new file would be opened, and this process would be repeated for the next offset.</p>

<p>Some approaches that I found reduce the time spent evaluating entries:</p>

<ol>
  <li>If you know a pattern that should be present at a given point in the output, just delete entries that don’t contain it (in vim: <code class="language-plaintext highlighter-rouge">:g!/good_pattern/d</code>);</li>
  <li>It’s preferable to first go through smaller modifications, which are entries that change a single byte (in vim: <code class="language-plaintext highlighter-rouge">/_\.\.\.</code>), then go through those that change two bytes.</li>
</ol>

<p>Turns out that we fixed the exact byte that was needed! To ensure the output we got (stored in a <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/110_dist_too_far_back/%400x54_0xdf_0x60.fix">file</a> with name pattern “@offset_0x??_0x??.fix”) matches the original bytes:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>diff <span class="nt">-au</span> <span class="se">\</span>
    000_samples/CVE-2011-4925.deflate <span class="se">\</span>
    @0x54_0xdf_0x60.fix <span class="se">\</span>
    | <span class="nb">wc</span> <span class="nt">-c</span>
<span class="c"># 0 (no bytes are different)</span>
</code></pre></div></div>

<h2 id="isnt-this-solution-just-glorified-bruteforcing">Isn’t this solution just glorified bruteforcing?</h2>

<p>True! It’s still worth challenging the assumption that this sort of recovery is simply “<a href="https://stackoverflow.com/questions/15694270/how-to-force-zlib-to-decompress-more-than-x-bytes">not</a> <a href="https://stackoverflow.com/questions/26794514/how-to-extract-data-from-corrupted-gzip-files-in-python">possible</a>”. With some context on the underlying file formats, we can make educated guesses and be closer to the original data.</p>

<h1 id="part-ii-embellishing-streams">Part II: Embellishing streams</h1>

<p><strong>Can we take a compressed stream, modify it, and still get the same decompressed output?</strong></p>

<p>Let’s make the following observations:</p>

<ul>
  <li>It’s possible to guess if the payload is plaintext or not without decompressing. For each dynamic huffman table, check the literal code lengths: if literals in the ascii range have smaller lengths than other literals, then it’s likely to be plaintext, since a compressor assigns smaller lengths to literals that appear more frequently.
    <ul>
      <li>This implies a relationship between the dynamic huffman table entries and the symbols that appear later on in the block.</li>
    </ul>
  </li>
  <li><strong>What if we don’t have any literals to compress?</strong> What would be the minimal fields that such a block would need to have to be successfully parsed?
    <ul>
      <li><strong>At least symbol 256 is needed</strong> to encode the end of block, so we also need table entries for it.</li>
    </ul>
  </li>
</ul>
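
<p>The first observation can be turned into a small heuristic. This is only a sketch under assumptions of my own: the code lengths were already extracted into a list indexed by symbol, unused symbols (length 0) are scored as if they had a length of 16, and “ascii range” means printable ascii:</p>

```python
def looks_like_plaintext(litlen_lengths, unused=16):
    """Guess whether a block holds text from its literal code lengths alone.

    litlen_lengths[i] is the code length of literal/length symbol i, with 0
    meaning the symbol is unused. Unused literals are scored as `unused`
    (one more than the longest possible code), so they count as "rare".
    """
    def score(symbols):
        lengths = [litlen_lengths[s] or unused for s in symbols
                   if s < len(litlen_lengths)]
        return sum(lengths) / len(lengths) if lengths else unused

    printable = range(0x20, 0x7F)
    other = [s for s in range(256) if s not in printable]
    return score(printable) < score(other)


if __name__ == "__main__":
    # Only printable ascii literals have codes: likely text
    text_like = [6 if 0x20 <= s <= 0x7E else 0 for s in range(256)]
    print(looks_like_plaintext(text_like))
```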

<p>Great, so unused table entries (i.e. for other symbols besides symbol 256) are still parsed as lengths, but since the corresponding symbols can just not be included in the block, it doesn’t matter which lengths end up being defined: those <strong>unused entries can be overwritten by arbitrary bytes</strong>.</p>

<p>However, huffman codes used in DEFLATE are prefix-free (e.g. since <code class="language-plaintext highlighter-rouge">00</code> is a prefix of <code class="language-plaintext highlighter-rouge">001</code>, those two bit sequences cannot correspond to two distinct codes), so there must be some validation being applied to these entries, which we still need to somehow pass…</p>

<h2 id="playing-well-with-parsers">Playing well with parsers</h2>

<p>In theory, a minimal block would contain:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>- BFINAL = 1 bit
- BTYPE = 2 bits (dynamic huffman tables = 0b10)
- HLIT count = 5 bits (need 1)
- HDIST count = 5 bits (don't need any)
- HCLEN count = 4 bits (need 1)
- HCLEN table (need 1 length for 1 HLIT entry)
- HLIT table (need 1 length for 1 symbol)
- HDIST table (empty)
- symbol 256
</code></pre></div></div>
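
<p>The fixed-size fields at the top of this list can be read with a few shifts and masks. The sketch below follows the field widths listed above and the bit order defined by the DEFLATE specification (fields start at the least significant bit of each byte); the function name is hypothetical:</p>

```python
def parse_dynamic_header(stream: bytes) -> dict:
    """Read the fixed-size fields that open a dynamic huffman block.

    The 17 bits we need fit in the first 3 bytes, which are read as a
    single little-endian integer and consumed from the low end.
    """
    bits = int.from_bytes(stream[:3], "little")
    fields = {}
    for name, width, base in (
        ("BFINAL", 1, 0),
        ("BTYPE", 2, 0),
        ("HLIT", 5, 257),   # stored as (number of litlen codes - 257)
        ("HDIST", 5, 1),    # stored as (number of distance codes - 1)
        ("HCLEN", 4, 4),    # stored as (number of code length codes - 4)
    ):
        fields[name] = (bits & ((1 << width) - 1)) + base
        bits >>= width
    return fields


if __name__ == "__main__":
    # First 3 bytes of the sample stream seen in the earlier hexdumps
    print(parse_dynamic_header(bytes.fromhex("cd5b5d")))
```

<p>For our sample stream, this reports a final block (<code class="language-plaintext highlighter-rouge">BFINAL=1</code>) with dynamic huffman tables (<code class="language-plaintext highlighter-rouge">BTYPE=2</code>), 282 literal/length codes, 28 distance codes, and 14 code length codes.</p>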

<div class="c-indirectly-related">
  <p>We need to know how those tables are written before we can fix them.</p>

  <p>Let’s just focus on the part that matters to our solution, which is how a HCLEN code matches a HLIT code, which in turn matches a symbol to decode. Do check the <a href="#references">references</a> for the full context.</p>

  <p>To decode symbol 256 (the end-of-block marker, which shares the literal/length alphabet):</p>

  <ul>
    <li>We need the HLIT code for a certain length. Symbols can match HLIT codes of the same length, as long as the code values themselves are different. Which n-th code is matched depends on the order they were read in the block (defined by the encoder). But which length exactly?
      <ul>
        <li>We need the HCLEN code for a certain HLIT code. Which n-th HCLEN code depends on the order they were read in the block (pre-defined).</li>
      </ul>
    </li>
  </ul>

  <p>To be clear, we have a <strong>double huffman decoding</strong> going on here. In the illustration below, you can see how all the decodings map to the log output of our infgen fork, in this case to decode symbol 10:</p>

  <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[begin parsing HCLEN codes]
[...]
    static const short order[19] = /* permutation of code length codes */
        {16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15};
                     ^
                     +`--------------------------------------------------.
                      `-----------------------------------------.        |
                                                                V        +
DEBUG 00000004 2: 0x3   ___011.. (need 3, HCLEN code len, order 0, index 3)
                    +                          ,----------------+
                    |                         V
                     `---&gt; symbol with length 0 has 1st HLIT code with length 3
DEBUG 00000004 5: 0x3   011..... (need 3, HCLEN code len, order 8, index 4)   +
                    +                          ,----------------+             |
                    |                         V                               '-.
                     `---&gt; symbol with length 8 has 2nd HLIT code with length 3 |
[...]                                                                         + |
[end parsing HCLEN codes]                                                     | |
[begin parsing HLIT+HDIST codes]                                              | |
[...]                                                                         | |
[on 10th HLIT bit sequence to decode]                                         | |
                     ,--------------------------------------------------------' |
                    V                                                           |
DEBUG 00000009 3: 0x4   __001... (need 3, decode bitbuf (RTL))                  |
                    +---------------------------------------------------------. |
INFO  00000009 6: ! decoded len 3 bits 001 sym_i 10                           | |
                                                  +---&gt; parsing for symbol 10 | |
                                                    ,-------------------------' |
                                                   |    ,-----------------------'
                             (parsed as 0b100 = 4) |   | (parsed as 0b010 = 2)
                                                   V   V
                  [index + (code - first)] = [1 + (4 - 2)] = [3] = 8
                  +                                                +--------.
                   `---&gt; infgen's decode() ordered symbol table lookup      |
                         - int index; /* index of first code of length len  |
                                         in symbol table */                 |
                         - int code;  /* len bits being decoded */          |
                         - int first; /* first code of length len */        |
                        ,---------------------------------------------------'
                       V
INFO  00000009 6: lens 8 (0x8) +---&gt; symbol 10 has 1st code with length 8 +----.
                       +---------------.                                       |
[...]                                  |                                       |
[end parsing HLIT+HDIST codes]         |                                       |
[begin parsing symbols]                |                                       |
[...]                                  |                                       |
[on symbol 10 parsing]                 |                                       |
                                       |                                       |
DEBUG 0000007e 3: 0xe2  00111...       V                                       |
                        _____010 (need 8, decode bitbuf (RTL))                 |
                                 ,-----+                 ,--&gt; 2nd literal that |
                                V                       |     is decoded and   |
INFO  0000007e 3: ! decoded len 8 bits 01000111 sym_i 2 +     sent to output   |
                                              +                                |
                 (parsed as 0b11100010 = 226) |                                |
                                              '------.       ,-----------------'
                                                      V     V
                  [index + (code - first)] = [53 + (226 - 226)] = [53] = 10
                  +                                                       +
                   `---&gt; infgen's decode() ordered symbol table lookup    |
                            ,---------------------------------------------'
                           V
INFO  0000007e 3: literal 10 (0xa)

[...]
[on symbol 256 parsing]
[end parsing symbols]
</code></pre></div>  </div>
</div>
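
<p>The <code class="language-plaintext highlighter-rouge">[index + (code - first)]</code> lookup that keeps showing up in the logs can be reproduced in a few lines. This sketch follows the canonical decoding approach of zlib’s puff.c, which I’m assuming infgen’s <code class="language-plaintext highlighter-rouge">decode()</code> resembles; bits are passed as a plain iterator to keep it short:</p>

```python
MAXBITS = 15


def construct(lengths):
    """Build (count, symbol) tables: count[n] is the number of codes of
    length n, symbol lists the coded symbols ordered by length, then value."""
    count = [0] * (MAXBITS + 1)
    for n in lengths:
        count[n] += 1
    symbol = [s for n in range(1, MAXBITS + 1)
              for s, length in enumerate(lengths) if length == n]
    return count, symbol


def decode(bits, count, symbol):
    """Decode one symbol from an iterator of bits, one code bit at a time."""
    code = first = index = 0
    for length in range(1, MAXBITS + 1):
        code |= next(bits)
        if code - count[length] < first:
            # The code has this length: look it up by its rank
            return symbol[index + (code - first)]
        index += count[length]
        first = (first + count[length]) * 2
        code *= 2
    raise ValueError("ran out of codes")


if __name__ == "__main__":
    # Lengths for symbols A..H from the example in RFC 1951, section 3.2.2
    count, symbol = construct([3, 3, 3, 3, 3, 2, 4, 4])
    print(decode(iter([1, 1, 1, 0]), count, symbol))  # code 1110 is symbol G
```

<p>The same routine serves both decodings: first with the HCLEN table to read code lengths, then with the resulting HLIT table to read symbols.</p>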

<p>Back to how these tables are validated. Unfortunately, decompressors (e.g. zlib) may error on incomplete huffman tables. They expect these tables to contain <strong>enough entries to decode any possible bit sequence</strong> used for symbols, even if such symbols don’t end up being present in the block!</p>

<p>Let’s look closely at the validation done by infgen when building Huffman tables:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="n">symbol</span><span class="p">;</span>         <span class="cm">/* current symbol when stepping through length[] */</span>
<span class="kt">int</span> <span class="n">len</span><span class="p">;</span>            <span class="cm">/* current length when stepping through h-&gt;count[] */</span>
<span class="kt">int</span> <span class="n">left</span><span class="p">;</span>           <span class="cm">/* number of possible codes left of current length */</span>
<span class="kt">short</span> <span class="n">offs</span><span class="p">[</span><span class="n">MAXBITS</span><span class="o">+</span><span class="mi">1</span><span class="p">];</span>      <span class="cm">/* offsets in symbol table for each length */</span>

<span class="cm">/* count number of codes of each length */</span>
<span class="k">for</span> <span class="p">(</span><span class="n">len</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">len</span> <span class="o">&lt;=</span> <span class="n">MAXBITS</span><span class="p">;</span> <span class="n">len</span><span class="o">++</span><span class="p">)</span>
    <span class="n">h</span><span class="o">-&gt;</span><span class="n">count</span><span class="p">[</span><span class="n">len</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="n">symbol</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">symbol</span> <span class="o">&lt;</span> <span class="n">n</span><span class="p">;</span> <span class="n">symbol</span><span class="o">++</span><span class="p">)</span>
    <span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">count</span><span class="p">[</span><span class="n">length</span><span class="p">[</span><span class="n">symbol</span><span class="p">]])</span><span class="o">++</span><span class="p">;</span>   <span class="cm">/* assumes lengths are within bounds */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">h</span><span class="o">-&gt;</span><span class="n">count</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="n">n</span><span class="p">)</span>               <span class="cm">/* no codes! */</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>                       <span class="cm">/* complete, but decode() will fail */</span>

<span class="cm">/* check for an over-subscribed or incomplete set of lengths */</span>
<span class="n">left</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>                           <span class="cm">/* one possible code of zero length */</span>
<span class="k">for</span> <span class="p">(</span><span class="n">len</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span> <span class="n">len</span> <span class="o">&lt;=</span> <span class="n">MAXBITS</span><span class="p">;</span> <span class="n">len</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">left</span> <span class="o">&lt;&lt;=</span> <span class="mi">1</span><span class="p">;</span>                     <span class="cm">/* one more bit, double codes left */</span>
    <span class="n">left</span> <span class="o">-=</span> <span class="n">h</span><span class="o">-&gt;</span><span class="n">count</span><span class="p">[</span><span class="n">len</span><span class="p">];</span>          <span class="cm">/* deduct count from possible codes */</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">left</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">left</span><span class="p">;</span>                <span class="cm">/* over-subscribed--return negative */</span>
<span class="p">}</span>                                   <span class="cm">/* left &gt; 0 means incomplete */</span>

<span class="cm">/* [...] */</span>

<span class="cm">/* return zero for complete set, positive for incomplete set */</span>
<span class="k">return</span> <span class="n">left</span><span class="p">;</span>
</code></pre></div></div>

<p>Basically, for each code length up to MAXBITS, variable <code class="language-plaintext highlighter-rouge">left</code> is doubled, then reduced by the number of codes that have the length being iterated. Sure, we can have different counts across lengths, but in the end, <code class="language-plaintext highlighter-rouge">left</code> needs to be zero, not more, not less.</p>
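
<p>Here is the same check as a small Python port, handy for trying out sets of lengths by hand. The function name is mine, and <code class="language-plaintext highlighter-rouge">MAXBITS = 15</code> is the longest code length that DEFLATE allows:</p>

```python
MAXBITS = 15


def remaining_codes(lengths):
    """Return 0 for a complete set of code lengths, a positive number for
    an incomplete (under-subscribed) set, negative for an over-subscribed one."""
    count = [0] * (MAXBITS + 1)
    for n in lengths:
        count[n] += 1
    if count[0] == len(lengths):  # no codes at all, same shortcut as the C code
        return 0
    left = 1  # one possible code of zero length
    for n in range(1, MAXBITS + 1):
        left = left * 2 - count[n]
        if left < 0:
            return left
    return left


if __name__ == "__main__":
    print(remaining_codes([1, 1]))     # two 1-bit codes use up every bit sequence
    print(remaining_codes([1, 1, 1]))  # a third 1-bit code cannot exist
    print(remaining_codes([2, 2, 2]))  # one 2-bit sequence is left unassigned
```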

<p>Luckily, all these constraints can be expressed as linear integer arithmetic, which can be fed to our favourite SMT solver (i.e. z3).</p>

<h2 id="applying-the-message">Applying the message</h2>

<p>With this knowledge on how to make a “dummy” block, the plan is to take some existing stream, use it to produce a block that contains a human readable message (read with e.g. strings), overwriting some code length table entries. More entries will be added to fix the total. We can then concatenate it with a copy of the original stream.</p>

<p>Let’s use the <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/000_samples/CVE-2011-4925.deflate">same example file from the last part</a>.</p>

<p>The following steps can be reproduced with a <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/embellish.py">script</a> (some limitations will be <span class="c-badge c-badge-nok">highlighted</span>):</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./embellish.py ~/CVE-2011-4925.deflate <span class="s1">'hello world!'</span>
</code></pre></div></div>

<p>For starters:</p>

<ul>
  <li>Apply our arbitrary bytes to the DEFLATE stream (<a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/200_hello/CVE-2011-4925.deflate.add_message.out">output file</a>);
    <ul>
      <li>Check with infgen that the maximum count for a given length wasn’t exceeded (a.k.a. over-subscribed), otherwise we need a smaller / different message;</li>
    </ul>
  </li>
</ul>

<p>To retrieve the minimum code lengths needed:</p>

<ul>
  <li>Reduce the HLIT and HDIST counts, so that parsing of these codes stops near the end of the last injected byte (at least up to the length for symbol 256, everything else that follows will be replaced);
    <ul>
      <li>Since we don’t need distances, we can set their count to zero (will be parsed as <code class="language-plaintext highlighter-rouge">HDIST=1</code>, the specification’s minimal number of distance codes), then adjust later on after we have the final literal/length counts;</li>
    </ul>
  </li>
  <li>At this point, infgen is able to construct huffman tables, and our expectation is for them to be under-subscribed:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  DEBUG 00000031 3: ! iterating litlen len  1 (left  1, left&lt;&lt;1  2, count[len]  0, left-count  2)
  DEBUG 00000031 3: ! iterating litlen len  2 (left  2, left&lt;&lt;1  4, count[len]  0, left-count  4)
  DEBUG 00000031 3: ! iterating litlen len  3 (left  4, left&lt;&lt;1  8, count[len]  1, left-count  7)
  [...]
  DEBUG 00000031 3: ! iterating litlen len 13 (left  8, left&lt;&lt;1 16, count[len]  0, left-count 16)
  DEBUG 00000031 3: ! iterating litlen len 14 (left 16, left&lt;&lt;1 32, count[len]  0, left-count 32)
  DEBUG 00000031 3: ! iterating litlen len 15 (left 32, left&lt;&lt;1 64, count[len]  0, left-count 64)
  WARN  00000031 3: ! under-subscribed litlen (left 64)
  INFO  00000031 3: ! construct litlen: err 64, nlen 261, code.count[0] 181
</code></pre></div>    </div>
  </li>
  <li>Retrieve the reported code lengths that were used so far:
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  INFO  00000031 3: ! construct litlen len 0 count 181 (n 261)
  INFO  00000031 3: ! construct litlen len 1 count 0 (n 261)
  INFO  00000031 3: ! construct litlen len 2 count 0 (n 261)
  INFO  00000031 3: ! construct litlen len 3 count 1 (n 261)
  [...]
  INFO  00000031 3: ! construct litlen len 15 count 0 (n 261)
</code></pre></div>    </div>
  </li>
  <li>Compute the additional code lengths constrained to the retrieved lengths.</li>
</ul>
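
<p>For intuition on that last step, a solver isn’t strictly needed to satisfy the <code class="language-plaintext highlighter-rouge">left == 0</code> check alone. The greedy sketch below is not the approach taken by the script: it ignores limits on how many table entries are available and which lengths are already taken, which is where a solver earns its keep. The function name is hypothetical:</p>

```python
MAXBITS = 15


def lengths_to_add(left: int):
    """Given the positive `left` reported for an under-subscribed table,
    return the fewest extra code lengths that bring it down to zero.

    One code of length n uses up 2 ** (MAXBITS - n) of the units counted by
    `left`, so each set bit of `left` maps to exactly one extra code.
    """
    extra = []
    for bit in range(MAXBITS):
        if left & (1 << bit):
            extra.append(MAXBITS - bit)
    return sorted(extra)


if __name__ == "__main__":
    print(lengths_to_add(64))
```

<p>For the <code class="language-plaintext highlighter-rouge">left 64</code> reported in the log above, a single extra code of length 9 would be enough to pass this particular check.</p>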

<p>The following <a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/huffman_solver.py">z3 script</a> computes solutions for additional code counts:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">z3</span> <span class="kn">import</span> <span class="o">*</span>

<span class="n">MAXBITS</span> <span class="o">=</span> <span class="mi">16</span>
<span class="n">MAXHDIST</span> <span class="o">=</span> <span class="mi">29</span>
<span class="n">MAXHLIT</span> <span class="o">=</span> <span class="mi">285</span>


<span class="k">def</span> <span class="nf">solve</span><span class="p">(</span><span class="n">code_counts</span><span class="p">,</span> <span class="n">exclusions</span><span class="p">,</span> <span class="n">max_codes</span><span class="p">):</span>
    <span class="n">z3</span><span class="p">.</span><span class="n">set_param</span><span class="p">(</span><span class="n">proof</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">s</span> <span class="o">=</span> <span class="n">Optimize</span><span class="p">()</span>

    <span class="c1"># Known input
</span>    <span class="n">f_len</span> <span class="o">=</span> <span class="n">MAXBITS</span>
    <span class="n">f</span> <span class="o">=</span> <span class="p">[</span><span class="n">Int</span><span class="p">(</span><span class="s">"{:04d}"</span><span class="p">.</span><span class="nb">format</span><span class="p">(</span><span class="n">i</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">f_len</span><span class="p">)]</span>
    <span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">exclusions</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">==</span> <span class="n">v</span><span class="p">)</span>
    <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">)</span>
    <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">)</span>

    <span class="c1"># Huffman table validation
</span>    <span class="n">left</span> <span class="o">=</span> <span class="mi">2</span> <span class="o">-</span> <span class="n">f</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="n">f_len</span><span class="p">,</span> <span class="mi">1</span><span class="p">):</span>
        <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">And</span><span class="p">(</span><span class="n">f</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">&gt;=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">f</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">&lt;</span> <span class="p">(</span><span class="mi">1</span> <span class="o">&lt;&lt;</span> <span class="n">i</span><span class="p">)))</span>
        <span class="n">left</span> <span class="o">=</span> <span class="p">(</span><span class="n">left</span> <span class="o">*</span> <span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="n">f</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
    <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">left</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>

    <span class="c1"># Avoid solutions requiring more codes than the maximum allowed
</span>    <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">Sum</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> <span class="o">&lt;=</span> <span class="n">max_codes</span><span class="p">)</span>

    <span class="c1"># We prefer solutions with the minimum number of additional lengths necessary,
</span>    <span class="c1"># so that we can use larger payloads
</span>    <span class="n">s</span><span class="p">.</span><span class="n">minimize</span><span class="p">(</span><span class="n">Sum</span><span class="p">(</span><span class="n">f</span><span class="p">))</span>

    <span class="c1"># Used code lengths so far
</span>    <span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">code_counts</span><span class="p">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">s</span><span class="p">.</span><span class="n">add</span><span class="p">(</span><span class="n">f</span><span class="p">[</span><span class="n">k</span><span class="p">]</span> <span class="o">&gt;=</span> <span class="n">v</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">s</span><span class="p">.</span><span class="n">check</span><span class="p">()</span> <span class="o">==</span> <span class="n">sat</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="s">"Found solution:"</span><span class="p">)</span>
        <span class="n">model</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">model</span><span class="p">()</span>
        <span class="n">vs</span> <span class="o">=</span> <span class="p">[(</span><span class="n">v</span><span class="p">,</span> <span class="n">model</span><span class="p">[</span><span class="n">v</span><span class="p">])</span> <span class="k">for</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">model</span><span class="p">]</span>
        <span class="n">vs</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="n">vs</span><span class="p">,</span> <span class="n">key</span><span class="o">=</span><span class="k">lambda</span> <span class="n">a</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">a</span><span class="p">))</span>
        <span class="n">new_code_counts</span> <span class="o">=</span> <span class="p">{}</span>
        <span class="k">for</span> <span class="n">k</span><span class="p">,</span> <span class="n">v</span> <span class="ow">in</span> <span class="n">vs</span><span class="p">:</span>
            <span class="k">print</span><span class="p">(</span><span class="n">k</span><span class="p">,</span> <span class="n">v</span><span class="p">)</span>
            <span class="n">ik</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">k</span><span class="p">),</span> <span class="mi">10</span><span class="p">)</span>
            <span class="n">new_code_counts</span><span class="p">[</span><span class="n">ik</span><span class="p">]</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">v</span><span class="p">),</span> <span class="mi">10</span><span class="p">)</span>
        <span class="k">return</span> <span class="n">new_code_counts</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">print</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">unsat_core</span><span class="p">())</span>
        <span class="k">print</span><span class="p">(</span><span class="n">s</span><span class="p">.</span><span class="n">__repr__</span><span class="p">())</span>
        <span class="k">raise</span> <span class="nb">RuntimeError</span><span class="p">(</span><span class="s">"No solution."</span><span class="p">)</span>
</code></pre></div></div>

<p>As an example, with these literal/length codes and distance codes:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lit_counts</span> <span class="o">=</span> <span class="p">{</span><span class="mi">3</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">4</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">:</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">7</span><span class="p">:</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">8</span><span class="p">:</span> <span class="mi">33</span><span class="p">,</span> <span class="mi">9</span><span class="p">:</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">10</span><span class="p">:</span> <span class="mi">13</span><span class="p">}</span>
<span class="n">solve</span><span class="p">(</span><span class="n">lit_counts</span><span class="p">,</span> <span class="p">{},</span> <span class="n">MAXHLIT</span><span class="p">)</span>

<span class="n">dist_counts</span> <span class="o">=</span> <span class="p">{</span><span class="mi">10</span><span class="p">:</span> <span class="mi">1</span><span class="p">}</span>
<span class="n">solve</span><span class="p">(</span><span class="n">dist_counts</span><span class="p">,</span> <span class="p">{},</span> <span class="n">MAXHDIST</span><span class="p">)</span>
</code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Found solution:
0000 0
0001 0
0002 0
0003 2
0004 3
0005 9
0006 0
0007 13
0008 33
0009 19
0010 14
0011 0
0012 0
0013 0
0014 0
0015 0
Found solution:
0000 0
0001 1
0002 1
0003 1
0004 1
0005 1
0006 1
0007 1
0008 1
0009 1
0010 2
0011 0
0012 0
0013 0
0014 0
0015 0
</code></pre></div></div>
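
<p>These solutions can be double checked without a solver. The following is a minimal sketch that repeats the <code class="language-plaintext highlighter-rouge">left</code> computation from the constraints on plain dictionaries. The function name <code class="language-plaintext highlighter-rouge">remaining_codes</code> and the default of 15 bits as the largest code length are assumptions made for illustration, not part of the original script:</p>

```python
def remaining_codes(counts, max_bits=15):
    """Return the code space left unused by a set of code length counts.

    Same arithmetic as `left` in the solver constraints:
    0 means the set is complete, a positive value means it is incomplete,
    and a negative value means it is over-subscribed.
    """
    left = 1
    for length in range(1, max_bits + 1):
        # Each extra bit doubles the available codes, then the codes
        # assigned to this length are taken out.
        left = left * 2 - counts.get(length, 0)
        if left < 0:
            return left
    return left


# Counts decoded so far (solver input) and the solutions printed above.
lit_before = {3: 2, 4: 3, 5: 8, 7: 13, 8: 33, 9: 19, 10: 13}
lit_after = {3: 2, 4: 3, 5: 9, 7: 13, 8: 33, 9: 19, 10: 14}
dist_before = {10: 1}
dist_after = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 2}

print(remaining_codes(lit_before), remaining_codes(lit_after))
print(remaining_codes(dist_before), remaining_codes(dist_after))
```

<p>Both inputs leave code space unused, while both solutions bring it down to exactly zero, which is what the <code class="language-plaintext highlighter-rouge">left == 0</code> constraint asks for.</p>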

<p>Afterwards, fix the stream:</p>

<ul>
  <li>Subtract the solution’s counts from the existing counts, producing the additional code lengths;</li>
  <li>Add the additional lengths;
    <ul>
      <li><span class="c-badge c-badge-nok">No computed huffman tables</span>: can only add known code lengths (e.g. if we never decoded code length 4, our solution cannot contain it, since we don’t know which bits to add to the stream);</li>
    </ul>
  </li>
  <li>Increase the HLIT and HDIST counts, to cover the previously added lengths (<a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/200_hello/CVE-2011-4925.deflate.add_all_codes.out">output file</a>);</li>
  <li>Add symbol 256 (<a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/200_hello/CVE-2011-4925.deflate.add_sym_256.out">output file</a>);
    <ul>
      <li><span class="c-badge c-badge-nok">No computed huffman tables</span>: have to bruteforce the bits corresponding to this symbol’s code length.</li>
    </ul>
  </li>
</ul>
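
<p>The first step of the list above can be sketched as a dictionary difference. The helper name <code class="language-plaintext highlighter-rouge">additional_lengths</code> is hypothetical, and the inputs are the counts from the example above:</p>

```python
def additional_lengths(solution, existing):
    """Return how many codes of each length must be added to the stream."""
    extra = {}
    for length, count in sorted(solution.items()):
        delta = count - existing.get(length, 0)
        if delta > 0:
            extra[length] = delta
    return extra


lit_existing = {3: 2, 4: 3, 5: 8, 7: 13, 8: 33, 9: 19, 10: 13}
lit_solution = {3: 2, 4: 3, 5: 9, 7: 13, 8: 33, 9: 19, 10: 14}
print(additional_lengths(lit_solution, lit_existing))
# {5: 1, 10: 1}

dist_existing = {10: 1}
dist_solution = {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 2}
print(additional_lengths(dist_solution, dist_existing))
# {1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1, 10: 1}
```

<p>For the literal/length table only two lengths need to be added, which leaves more room for the payload than the distance table, where one code is added for each length from 1 to 10.</p>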

<p>Finally, concatenate this new block with a copy of the original stream (<a href="https://github.com/nevesnunes/deflate-frolicking/blob/master/200_hello/CVE-2011-4925.deflate.embellished">output file</a>):</p>

<ul>
  <li>Set <code class="language-plaintext highlighter-rouge">BFINAL=0</code> in the new block, since it’s followed by one or more blocks.</li>
</ul>

<p>To ensure the added block doesn’t affect decompression output:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>diff <span class="nt">-au</span> <span class="se">\</span>
    &lt;<span class="o">(</span>python3 <span class="nt">-c</span> <span class="s1">'import sys,zlib;print(zlib.decompress(open(sys.argv[1], "rb").read(), -15))'</span> CVE-2011-4925.deflate<span class="o">)</span> <span class="se">\</span>
    &lt;<span class="o">(</span>python3 <span class="nt">-c</span> <span class="s1">'import sys,zlib;print(zlib.decompress(open(sys.argv[1], "rb").read(), -15))'</span> CVE-2011-4925.deflate.embellished<span class="o">)</span> <span class="se">\</span>
    | <span class="nb">wc</span> <span class="nt">-c</span>
<span class="c"># 0 (no bytes are different)</span>
</code></pre></div></div>

<p>And just to double check that our message is indeed present in the new stream:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gh">diff -au &lt;(xxd 000_samples/CVE-2011-4925.deflate) &lt;(xxd 200_hello/CVE-2011-4925.deflate.embellished)
</span><span class="gd">--- 000_samples/CVE-2011-4925.deflate
</span><span class="gi">+++ 200_hello/CVE-2011-4925.deflate.embellished
</span><span class="p">@@ -1,85 +1,88 @@</span>
<span class="gd">-00000000: cd5b 5d6f da48 147d cfaf 1879 5f76 ab80  .[]o.H.}...y_v..
-00000010: b1b1 d3c4 4f4b 095b 591b 2005 924a ad50  ....OK.[Y. ..J.P
-00000020: 34d8 1332 5ae3 7167 c6a4 5195 ffbe 776c  4..2Z.qg..Q...wl
-00000030: 70ec 280d b4ea 2e57 2112 9e7b 7de7 dc0f  p.(....W!..{}...
-00000040: 9f33 919c 6f47 8458 9148 6ff9 3297 5473  .3..oG.X.Ho.2.Ts
</span> [...]
<span class="gi">+00000000: 2c4b 5d6f da48 147d cfaf 1879 5f76 ab80  ,K]o.H.}...y_v..
+00000010: b1b1 d3c4 4f4b 095b 591b 2068 656c 6c6f  ....OK.[Y. hello
+00000020: 2077 6f72 6c64 2167 c6a4 5195 ffbe 776c   world!g..Q...wl
+00000030: 7ace 39e7 9c3b 0669 deea 7ad3 46a2 e87b  z.9..;.i..z.F..{
+00000040: 7ec5 c8fb b25b 058c 8d9d 267e 5a4a d8ca  ~....[....&amp;~ZJ..
</span> [...]
</code></pre></div></div>
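
<p>The first byte of each dump also reflects the <code class="language-plaintext highlighter-rouge">BFINAL</code> change. As a small sketch (the function name <code class="language-plaintext highlighter-rouge">block_header</code> is made up for this example), the three header bits of a block that starts on a byte boundary can be read from the least significant bits of that byte, following the bit packing described in rfc1951:</p>

```python
def block_header(first_byte):
    """Split the 3 header bits of a DEFLATE block that starts at bit 0."""
    bfinal = first_byte & 1        # 1 if this is the last block
    btype = (first_byte >> 1) & 3  # 0: stored, 1: fixed, 2: dynamic
    return bfinal, btype


print(block_header(0xCD))  # original stream: (1, 2)
print(block_header(0x2C))  # embellished stream: (0, 2)
```

<p>Both streams start with a dynamic huffman block, but only the original one marks it as final.</p>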

<h1 id="source-code">Source code</h1>

<p>Available in a <a href="https://github.com/nevesnunes/deflate-frolicking">git repository</a>.</p>

<h1 id="further-work">Further work</h1>

<ul>
  <li>Improving the accuracy of reported syntax errors in tree-sitter grammars would lead to better sorting of candidates in bruteforced error repairing. In some cases, the error node can be marked too early in the syntax tree, causing the calculated valid output length to be smaller than expected;</li>
  <li>Generating huffman tables directly when preparing arbitrary payloads for a DEFLATE stream. The presented proof-of-concept relies on parsing the infgen output, so it can miss cases where the message would fit at some offset, or where the dynamic huffman table entries could be fixed with less constrained solutions.</li>
</ul>

<p>How about other possibilities?</p>

<ul>
  <li>Ever wanted to have a zip file that is ridiculously larger than the compressed payload? Concatenate a series of blocks that only contain symbol 256, append the original stream to them, and replace the zip’s stream with the result. You also need to adjust the metadata offsets and sizes;</li>
  <li>Maybe steganography with unused huffman table entries? Well, be aware that even if there isn’t a huge size difference between a zip with hidden messages and the corresponding recompressed zip, it’s still suspicious to disassemble blocks with just symbol 256…</li>
</ul>
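
<p>A rough sketch of the padding idea follows, under one loudly stated substitution: instead of huffman blocks that only contain symbol 256, it uses empty stored blocks, since those are byte-aligned and can be concatenated without shifting bits. The names <code class="language-plaintext highlighter-rouge">raw_deflate</code> and <code class="language-plaintext highlighter-rouge">pad_stream</code> are hypothetical, and no zip metadata is handled here:</p>

```python
import zlib

# Empty stored block: BFINAL=0, BTYPE=00, padding up to the byte boundary,
# then LEN=0x0000 and NLEN=0xffff.
EMPTY_STORED_BLOCK = bytes.fromhex("000000ffff")


def raw_deflate(data):
    """Compress data as a raw DEFLATE stream (no zlib or gzip wrapper)."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def pad_stream(stream, n_blocks):
    """Prepend n_blocks empty non-final blocks to a raw DEFLATE stream."""
    return EMPTY_STORED_BLOCK * n_blocks + stream


original = raw_deflate(b"hello world! " * 100)
padded = pad_stream(original, 1000)

# Decompression output is unchanged, only the stream grew.
assert zlib.decompress(padded, -15) == zlib.decompress(original, -15)
print(len(original), len(padded))
```

<p>Each added block costs 5 bytes and produces no output, so the stream size can be inflated arbitrarily.</p>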

<h1 id="references">References</h1>

<ul>
  <li>Specification and documentation are a must, bonus points if they compare implementations:
    <ul>
      <li><a href="https://datatracker.ietf.org/doc/html/rfc1951">rfc1951 - DEFLATE Compressed Data Format Specification version 1.3</a></li>
      <li><a href="http://zlib.net/feldspar.html">An Explanation of the `Deflate’ Algorithm</a></li>
      <li><a href="https://www.euccas.me/zlib/">Understanding zlib</a></li>
    </ul>
  </li>
  <li>Theory needs to be put into practice, which can be more digestible with smaller scoped tools:
    <ul>
      <li><a href="https://github.com/madler/infgen">GitHub - madler/infgen: Deflate disassember to convert a deflate, zlib, or gzip stream into a readable form.</a></li>
      <li><a href="https://github.com/XlogicX/YouFLATE">GitHub - XlogicX/YouFLATE: An interactive tool that allows you to DEFLATE (compress) data using your own length-distance pairs, not merely the most efficient ones as is default with DEFLATE.</a></li>
    </ul>
  </li>
  <li>Those who rolled their own implementations can have unique insights on design decisions:
    <ul>
      <li><a href="https://www.nayuki.io/page/unspecified-edge-cases-in-the-deflate-standard">Unspecified edge cases in the DEFLATE standard</a></li>
      <li><a href="https://jnior.com/deflate-compression-algorithm/">DEFLATE Compression Algorithm | INTEG Process Group</a></li>
    </ul>
  </li>
</ul>]]></content><author><name></name></author><category term="compression" /><category term="file formats" /><category term="bruteforce" /><category term="constraint solving" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">CTF Writeup - InCTF 2021 - Miz</title><link href="https://nevesnunes.github.io/blog/2021/08/15/CTF-Writeup-InCTF2021-Miz.html" rel="alternate" type="text/html" title="CTF Writeup - InCTF 2021 - Miz" /><published>2021-08-15T16:51:02+01:00</published><updated>2021-08-15T16:51:02+01:00</updated><id>https://nevesnunes.github.io/blog/2021/08/15/CTF-Writeup---InCTF2021---Miz</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/08/15/CTF-Writeup-InCTF2021-Miz.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<h1 id="introduction">Introduction</h1>

<p>We are given a stripped Rust binary. Functions in Rust seem to feature convoluted stack setups that don’t play well with Ghidra’s decompiler. However, we can mostly avoid them in this binary, since the relevant logic is contained in a single function manipulating few data structures.</p>

<h1 id="description">Description</h1>

<blockquote>
  <p>Senpai plis find me a way.</p>

  <p>Author: Freakston, silverf3lix</p>

  <p>nc 34.94.181.140 4200</p>
</blockquote>

<p><a href="https://nevesnunes.github.io/blog/assets/writeups/InCTF2021/miz">Download</a></p>

<h1 id="analysis">Analysis</h1>

<p><code class="language-plaintext highlighter-rouge">strace</code> doesn’t report much: input is read with a <code class="language-plaintext highlighter-rouge">read()</code>, and the process exits with <code class="language-plaintext highlighter-rouge">exit_group(256)</code>.</p>

<p>Let’s start with low-hanging fruit: can we get any insights from instruction counting?</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> <span class="s1">'AAAAAA'</span>| qemu-x86_64 <span class="nt">-d</span> in_asm ./miz 2&gt;&amp;1 | <span class="nb">wc</span> <span class="nt">-l</span>
<span class="c"># 26110</span>
<span class="nb">echo</span> <span class="s1">'iAAAAA'</span>| qemu-x86_64 <span class="nt">-d</span> in_asm ./miz 2&gt;&amp;1 | <span class="nb">wc</span> <span class="nt">-l</span>
<span class="c"># 26117</span>
<span class="nb">echo</span> <span class="s1">'inctf{'</span>| qemu-x86_64 <span class="nt">-d</span> in_asm ./miz 2&gt;&amp;1 | <span class="nb">wc</span> <span class="nt">-l</span>
<span class="c"># 26117</span>
</code></pre></div></div>
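
<p>When there are more candidates to try, the same measurement can be scripted. This is only a sketch: the helper names, the padding with <code class="language-plaintext highlighter-rouge">A</code>, and the idea of ranking by line count are assumptions for illustration, while the qemu command line is the one used above:</p>

```python
import subprocess


def count_trace_lines(argv, data):
    """Run argv with data on stdin and count output lines (stdout + stderr)."""
    proc = subprocess.run(
        argv,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        check=False,
    )
    # Same measure as `wc -l`: number of newline characters.
    return proc.stdout.count(b"\n")


def rank_candidates(argv, prefix, alphabet, total_len=6, filler=b"A"):
    """Try each byte of alphabet after prefix, largest trace first."""
    counts = {}
    for c in alphabet:
        guess = prefix + bytes([c])
        guess += filler * (total_len - len(guess)) + b"\n"
        counts[chr(c)] = count_trace_lines(argv, guess)
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)


# Example invocation (requires qemu-user and the challenge binary):
# qemu = ["qemu-x86_64", "-d", "in_asm", "./miz"]
# print(rank_candidates(qemu, b"", b"abcdefghijklmnopqrstuvwxyz")[:5])
```

<p>A candidate that stands out from the rest hints at a different code path being taken for that character.</p>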

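<p>The counts for <code class="language-plaintext highlighter-rouge">iAAAAA</code> and <code class="language-plaintext highlighter-rouge">inctf{</code> are identical, so it seems that the flag prefix doesn’t go through different code branches. Let’s backtrack from the end, address <code class="language-plaintext highlighter-rouge">0x1d386</code>, reported by <code class="language-plaintext highlighter-rouge">strace -k</code>:</p>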

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt; /usr/lib64/libc-2.33.so(_exit+0x31) [0xcd021]
&gt; /usr/lib64/libc-2.33.so(__run_exit_handlers+0x201) [0x3fc01]
&gt; /usr/lib64/libc-2.33.so(exit+0x1f) [0x3fc9f]
&gt; miz() [0x1d386]
&gt; miz() [0x18bce]
</code></pre></div></div>

<p>Disassembly:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>                    FUN_0011d380                               XREF[3]: exit256:00118bca(c), 0013c9a0,
                                                                         0013ef30(*)
0011d380  50        PUSH       RAX
0011d381  ff 15 19  CALL       qword ptr [-&gt;&lt;EXTERNAL&gt;::exit]  void exit(int __status)
          97 02 00
</code></pre></div></div>

<p>I’ve named the function that calls it <code class="language-plaintext highlighter-rouge">exit256()</code>, which in turn has two callers. One was named <code class="language-plaintext highlighter-rouge">flag()</code>, since it has the only reference to a string that contains the word “flag”:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">FUN_001154f0</span><span class="p">(</span><span class="o">&amp;</span><span class="n">local_48</span><span class="p">,</span><span class="s">"flagYametesrc/bacharu.rshehe </span><span class="se">\n</span><span class="s">"</span><span class="p">,</span><span class="mi">4</span><span class="p">);</span>
</code></pre></div></div>

<p>There’s a call to it that we can force in gdb, by breaking at <code class="language-plaintext highlighter-rouge">CMP RAX,0x2</code> and skipping <code class="language-plaintext highlighter-rouge">JNZ exit</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>b *(0x555555554000 + 0x97f5)
r &lt;&lt;&lt; $(printf '%s' llllllllllllllll)
set $rip = (0x555555554000 + 0x97ff)
</code></pre></div></div>
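
<p>The addresses in these gdb commands are file offsets added to the load base. A small sketch of the conversion from Ghidra addresses, assuming Ghidra loaded the binary at image base <code class="language-plaintext highlighter-rouge">0x100000</code> (consistent with <code class="language-plaintext highlighter-rouge">0011d380</code> in the listing versus <code class="language-plaintext highlighter-rouge">0x1d386</code> from strace) and that gdb maps it at <code class="language-plaintext highlighter-rouge">0x555555554000</code>. The function name is hypothetical:</p>

```python
GHIDRA_IMAGE_BASE = 0x100000   # assumption: image base shown by Ghidra
GDB_PIE_BASE = 0x555555554000  # load base used in the gdb commands


def ghidra_to_runtime(addr, ghidra_base=GHIDRA_IMAGE_BASE, runtime_base=GDB_PIE_BASE):
    """Translate an address from the Ghidra listing to a gdb address."""
    return runtime_base + (addr - ghidra_base)


# Offsets 0x97f5 and 0x97ff would be 0x1097f5 and 0x1097ff in Ghidra.
print(hex(ghidra_to_runtime(0x1097F5)))  # 0x55555555d7f5
print(hex(ghidra_to_runtime(0x1097FF)))  # 0x55555555d7ff
```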

<p><code class="language-plaintext highlighter-rouge">flag()</code> will try to open a file that doesn’t exist locally. So once we know the correct input, we need to supply it to the host in the task description, where that file is present.</p>

<p>The other caller of <code class="language-plaintext highlighter-rouge">exit256()</code> has more logic. There’s a switch case for 5 values, all in the ascii range, so I commented them with the corresponding char:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">FUN_00109590</span><span class="p">(</span><span class="kt">long</span> <span class="n">param_1</span><span class="p">,</span><span class="kt">long</span> <span class="o">*</span><span class="n">param_2</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// ...</span>

  <span class="n">i</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">8</span><span class="p">);</span>
  <span class="n">len_in</span> <span class="o">=</span> <span class="n">param_2</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
  <span class="k">if</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="n">len_in</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">lVar1</span> <span class="o">=</span> <span class="o">*</span><span class="n">param_2</span><span class="p">;</span>
    <span class="k">do</span> <span class="p">{</span>
      <span class="k">if</span> <span class="p">(</span><span class="n">len_in</span> <span class="o">&lt;=</span> <span class="n">i</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">panic</span><span class="p">(</span><span class="n">i</span><span class="p">,</span><span class="n">len_in</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144320</span><span class="p">);</span>
        <span class="k">do</span> <span class="p">{</span>
          <span class="n">invalidInstructionException</span><span class="p">();</span>
        <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
      <span class="p">}</span>
      <span class="k">if</span> <span class="p">(</span><span class="nb">false</span><span class="p">)</span> <span class="p">{</span>
                <span class="cm">/* i */</span>
<span class="nl">switchD_001095e8_caseD_69:</span>
        <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">8</span><span class="p">)</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
      <span class="p">}</span>
      <span class="k">else</span> <span class="p">{</span>
        <span class="k">switch</span><span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="n">undefined</span> <span class="o">*</span><span class="p">)(</span><span class="n">lVar1</span> <span class="o">+</span> <span class="n">i</span><span class="p">))</span> <span class="p">{</span>
        <span class="k">case</span> <span class="mh">0x68</span><span class="p">:</span>
                    <span class="cm">/* h */</span>
          <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">8</span><span class="p">)</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
          <span class="k">if</span> <span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">uVar4</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">);</span>
            <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar4</span><span class="p">)</span> <span class="p">{</span>
              <span class="n">panic</span><span class="p">(</span><span class="n">uVar4</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144350</span><span class="p">);</span>
              <span class="k">do</span> <span class="p">{</span>
                <span class="n">invalidInstructionException</span><span class="p">();</span>
              <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
            <span class="p">}</span>
            <span class="n">uVar3</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
            <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar3</span><span class="p">)</span> <span class="p">{</span>
              <span class="n">panic</span><span class="p">(</span><span class="n">uVar3</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144350</span><span class="p">);</span>
              <span class="k">do</span> <span class="p">{</span>
                <span class="n">invalidInstructionException</span><span class="p">();</span>
              <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
            <span class="p">}</span>
            <span class="n">lVar2</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">uVar4</span> <span class="o">*</span> <span class="mi">200</span> <span class="o">+</span> <span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x10</span> <span class="o">+</span> <span class="n">uVar3</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">lVar2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
              <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar3</span><span class="p">;</span>
              <span class="k">goto</span> <span class="n">i</span><span class="o">++</span><span class="p">;</span>
            <span class="p">}</span>
            <span class="k">goto</span> <span class="n">if2_flag</span><span class="p">;</span>
          <span class="p">}</span>
          <span class="k">goto</span> <span class="n">exit</span><span class="p">;</span>
        <span class="nl">default:</span>
          <span class="k">goto</span> <span class="n">switchD_001095e8_caseD_69</span><span class="p">;</span>
        <span class="k">case</span> <span class="mh">0x6a</span><span class="p">:</span>
                    <span class="cm">/* j */</span>
          <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">8</span><span class="p">)</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
          <span class="k">if</span> <span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="k">goto</span> <span class="n">exit</span><span class="p">;</span>
          <span class="n">uVar4</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span><span class="p">;</span>
          <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar4</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">panic</span><span class="p">(</span><span class="n">uVar4</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144380</span><span class="p">);</span>
            <span class="k">do</span> <span class="p">{</span>
              <span class="n">invalidInstructionException</span><span class="p">();</span>
            <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
          <span class="p">}</span>
          <span class="n">uVar3</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">);</span>
          <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar3</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">panic</span><span class="p">(</span><span class="n">uVar3</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144380</span><span class="p">);</span>
            <span class="k">do</span> <span class="p">{</span>
              <span class="n">invalidInstructionException</span><span class="p">();</span>
            <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
          <span class="p">}</span>
          <span class="k">break</span><span class="p">;</span>
        <span class="k">case</span> <span class="mh">0x6b</span><span class="p">:</span>
                    <span class="cm">/* k */</span>
          <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">8</span><span class="p">)</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
          <span class="k">if</span> <span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">)</span> <span class="o">==</span> <span class="mh">0x18</span><span class="p">)</span> <span class="k">goto</span> <span class="n">exit</span><span class="p">;</span>
          <span class="n">uVar4</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">)</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
          <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar4</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">panic</span><span class="p">(</span><span class="n">uVar4</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144368</span><span class="p">);</span>
            <span class="k">do</span> <span class="p">{</span>
              <span class="n">invalidInstructionException</span><span class="p">();</span>
            <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
          <span class="p">}</span>
          <span class="n">uVar3</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">);</span>
          <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar3</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">panic</span><span class="p">(</span><span class="n">uVar3</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144368</span><span class="p">);</span>
            <span class="k">do</span> <span class="p">{</span>
              <span class="n">invalidInstructionException</span><span class="p">();</span>
            <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
          <span class="p">}</span>
          <span class="k">break</span><span class="p">;</span>
        <span class="k">case</span> <span class="mh">0x6c</span><span class="p">:</span>
                    <span class="cm">/* l */</span>
          <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mi">8</span><span class="p">)</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
          <span class="n">uVar4</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">);</span>
          <span class="k">if</span> <span class="p">(</span><span class="n">uVar4</span> <span class="o">!=</span> <span class="mh">0x18</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">uVar3</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">);</span>
            <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar3</span><span class="p">)</span> <span class="p">{</span>
              <span class="n">panic</span><span class="p">(</span><span class="n">uVar3</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144338</span><span class="p">);</span>
              <span class="k">do</span> <span class="p">{</span>
                <span class="n">invalidInstructionException</span><span class="p">();</span>
              <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
            <span class="p">}</span>
            <span class="k">if</span> <span class="p">(</span><span class="mh">0x18</span> <span class="o">&lt;</span> <span class="n">uVar4</span><span class="p">)</span> <span class="p">{</span>
              <span class="n">panic</span><span class="p">(</span><span class="n">uVar4</span><span class="p">,</span><span class="mh">0x19</span><span class="p">,</span><span class="o">&amp;</span><span class="n">PTR_s_src</span><span class="o">/</span><span class="n">bacharu</span><span class="p">.</span><span class="n">rshehe_00144338</span><span class="p">);</span>
              <span class="k">do</span> <span class="p">{</span>
                <span class="n">invalidInstructionException</span><span class="p">();</span>
              <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
            <span class="p">}</span>
            <span class="n">lVar2</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">uVar3</span> <span class="o">*</span> <span class="mi">200</span> <span class="o">+</span> <span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x10</span> <span class="o">+</span> <span class="n">uVar4</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
            <span class="k">if</span> <span class="p">(</span><span class="n">lVar2</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="k">goto</span> <span class="n">if2_flag</span><span class="p">;</span>
            <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar4</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
            <span class="k">goto</span> <span class="n">i</span><span class="o">++</span><span class="p">;</span>
          <span class="p">}</span>
          <span class="k">goto</span> <span class="n">exit</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">lVar2</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">uVar4</span> <span class="o">*</span> <span class="mi">200</span> <span class="o">+</span> <span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x10</span> <span class="o">+</span> <span class="n">uVar3</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">lVar2</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
<span class="nl">if2_flag:</span>
          <span class="k">if</span> <span class="p">(</span><span class="n">lVar2</span> <span class="o">==</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">flag</span><span class="p">();</span>
            <span class="k">do</span> <span class="p">{</span>
              <span class="n">invalidInstructionException</span><span class="p">();</span>
            <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
          <span class="p">}</span>
          <span class="k">break</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">)</span> <span class="o">=</span> <span class="n">uVar4</span><span class="p">;</span>
      <span class="p">}</span>
<span class="n">i</span><span class="o">++:</span>
      <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">i</span> <span class="o">!=</span> <span class="n">len_in</span><span class="p">);</span>
  <span class="p">}</span>
<span class="nl">exit:</span>
  <span class="n">exit256</span><span class="p">(</span><span class="mh">0x100</span><span class="p">);</span>
  <span class="k">do</span> <span class="p">{</span>
    <span class="n">invalidInstructionException</span><span class="p">();</span>
  <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Some points of interest:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">panic()</code> is a Rust runtime function that terminates the process with an error message. There are two checks for <code class="language-plaintext highlighter-rouge">var &gt; 0x18</code> in each case of the switch. If we <code class="language-plaintext highlighter-rouge">set $rip = (0x555555554000 + 0x973d)</code> to force one of the calls to <code class="language-plaintext highlighter-rouge">panic()</code>, we see that those are bounds checks:
    <blockquote>
      <p>thread ‘main’ panicked at ‘index out of bounds: the len is 25 but the index is 1’, src/bacharu.rs:123:20
note: run with <code class="language-plaintext highlighter-rouge">RUST_BACKTRACE=1</code> environment variable to display a backtrace
[Inferior 1 (process 3673543) exited with code 0145]</p>
    </blockquote>
  </li>
  <li><code class="language-plaintext highlighter-rouge">vi</code> users will recognize <code class="language-plaintext highlighter-rouge">h j k l</code> as movement keys. There’s also an <code class="language-plaintext highlighter-rouge">i</code> here, but it seems to only increment a counter stored at <code class="language-plaintext highlighter-rouge">param_1 + 8</code>, so it’s probably not relevant to the remaining logic.</li>
</ul>

<p>Ok, so the task’s theme is a maze, and the input we have to supply is probably the sequence of steps that traverses it. Let’s revisit instruction counting, this time trying some valid inputs:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">a</span><span class="o">=(</span>h j k l<span class="o">)</span>
<span class="k">for </span>i <span class="k">in</span> <span class="s2">"</span><span class="k">${</span><span class="nv">a</span><span class="p">[@]</span><span class="k">}</span><span class="s2">"</span><span class="p">;</span> <span class="k">do
  </span><span class="nv">walk</span><span class="o">=</span><span class="s2">"</span><span class="nv">$i</span><span class="s2">"</span>
  <span class="nv">inscount</span><span class="o">=</span><span class="si">$(</span><span class="nb">echo</span> <span class="s2">"</span><span class="nv">$walk</span><span class="s2">"</span> | qemu-x86_64 <span class="nt">-d</span> in_asm ./miz 2&gt;&amp;1 | <span class="nb">wc</span> <span class="nt">-l</span><span class="si">)</span>
  <span class="nb">echo</span> <span class="s2">"</span><span class="nv">$inscount</span><span class="s2"> </span><span class="nv">$walk</span><span class="s2">"</span>
  <span class="k">for </span>j <span class="k">in</span> <span class="s2">"</span><span class="k">${</span><span class="nv">a</span><span class="p">[@]</span><span class="k">}</span><span class="s2">"</span><span class="p">;</span> <span class="k">do
    </span><span class="nv">walk</span><span class="o">=</span><span class="s2">"</span><span class="nv">$i$j</span><span class="s2">"</span>
    <span class="nv">inscount</span><span class="o">=</span><span class="si">$(</span><span class="nb">echo</span> <span class="s2">"</span><span class="nv">$walk</span><span class="s2">"</span> | qemu-x86_64 <span class="nt">-d</span> in_asm ./miz 2&gt;&amp;1 | <span class="nb">wc</span> <span class="nt">-l</span><span class="si">)</span>
    <span class="nb">echo</span> <span class="s2">"</span><span class="nv">$inscount</span><span class="s2"> </span><span class="nv">$walk</span><span class="s2">"</span>
    <span class="k">for </span>k <span class="k">in</span> <span class="s2">"</span><span class="k">${</span><span class="nv">a</span><span class="p">[@]</span><span class="k">}</span><span class="s2">"</span><span class="p">;</span> <span class="k">do
      </span><span class="nv">walk</span><span class="o">=</span><span class="s2">"</span><span class="nv">$i$j$k</span><span class="s2">"</span>
      <span class="nv">inscount</span><span class="o">=</span><span class="si">$(</span><span class="nb">echo</span> <span class="s2">"</span><span class="nv">$walk</span><span class="s2">"</span> | qemu-x86_64 <span class="nt">-d</span> in_asm ./miz 2&gt;&amp;1 | <span class="nb">wc</span> <span class="nt">-l</span><span class="si">)</span>
      <span class="nb">echo</span> <span class="s2">"</span><span class="nv">$inscount</span><span class="s2"> </span><span class="nv">$walk</span><span class="s2">"</span>
    <span class="k">done
  done
done</span> | <span class="nb">sort</span>
</code></pre></div></div>
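
<p>The three nested loops can be generalized to any walk length. The following is only a sketch in Python: <code class="language-plaintext highlighter-rouge">candidate_walks</code> and <code class="language-plaintext highlighter-rouge">count_trace_lines</code> are hypothetical helper names, and the second one assumes that <code class="language-plaintext highlighter-rouge">qemu-x86_64</code> and <code class="language-plaintext highlighter-rouge">./miz</code> are available, as in the shell loop above.</p>

```python
import itertools
import subprocess

MOVES = "hjkl"


def candidate_walks(max_len):
    """Yield every move sequence of length 1..max_len, shortest first."""
    for length in range(1, max_len + 1):
        for combo in itertools.product(MOVES, repeat=length):
            yield "".join(combo)


def count_trace_lines(walk, binary="./miz"):
    """Count the lines qemu logs with `-d in_asm` for one input.

    Roughly the same metric as piping the combined output to `wc -l`.
    """
    proc = subprocess.run(
        ["qemu-x86_64", "-d", "in_asm", binary],
        input=(walk + "\n").encode(),  # `echo` also appends a newline
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,      # the trace is written to stderr
        check=False,                   # the binary may exit with an error code
    )
    return len(proc.stdout.splitlines())
```

<p>Sorting the pairs <code class="language-plaintext highlighter-rouge">(count_trace_lines(w), w)</code> for every <code class="language-plaintext highlighter-rouge">w</code> in <code class="language-plaintext highlighter-rouge">candidate_walks(3)</code> should give a listing comparable to the one produced by the shell loop.</p>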

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>26143 jhh
26143 jhj
26143 jhk
26143 jhl
26143 jjh
...
26186 hj
26186 hk
26186 lj
26186 lk
26188 hhl
26188 hlh
26188 hll
26188 lhh
26188 lhl
26188 llh
26188 llk
26193 hl
26193 lh
26214 hlj
26214 hlk
26214 lhj
26214 lhk
</code></pre></div></div>

<p>Again, it doesn’t tell us much. Some inputs exit early (e.g. <code class="language-plaintext highlighter-rouge">jhh</code>), possibly because we bumped into a wall on the first <code class="language-plaintext highlighter-rouge">j</code>. But moving back and forth (e.g. <code class="language-plaintext highlighter-rouge">hl</code>) naturally runs more instructions, since we never bump into walls. So, looking blindly at these counts won’t guide us to the solution.</p>

<p>These walls have a distinct representation in memory, which must be compared against our position. If we look at this conditional:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">)</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">//...</span>
  <span class="k">goto</span> <span class="n">if2_flag</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">goto</span> <span class="n">exit</span><span class="p">;</span>
</code></pre></div></div>

<p>We see that we exit when that variable is zero. Either this variable or <code class="language-plaintext highlighter-rouge">param_1 + 0x1398</code> is updated on each case, so could they be our position? Would zero be out-of-bounds?</p>

<p>There’s also some addressing that falls in the range <code class="language-plaintext highlighter-rouge">[0x0..0x13a0]</code>:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">uVar4</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x13a0</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">uVar4</span> <span class="o">!=</span> <span class="mh">0x18</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">uVar3</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="n">ulong</span> <span class="o">*</span><span class="p">)(</span><span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x1398</span><span class="p">);</span>
  <span class="c1">// ...</span>
  <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">uVar3</span> <span class="o">*</span> <span class="mi">200</span> <span class="o">+</span> <span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x10</span> <span class="o">+</span> <span class="n">uVar4</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
  <span class="c1">// ...</span>
<span class="p">}</span>
</code></pre></div></div>
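
<p>A quick sanity check of where that expression can point, as a sketch: if both indices are bounded by <code class="language-plaintext highlighter-rouge">0x19</code>, as the panic checks suggest, then a grid of 8-byte cells starting at <code class="language-plaintext highlighter-rouge">0x10</code> ends exactly where the first of the two index fields begins. The constant and function names below are hypothetical, chosen only for illustration.</p>

```python
# Layout constants read off the decompiled addressing; names are hypothetical.
CELL_SIZE = 8                  # each cell is read as a 64-bit value
SIDE = 0x19                    # bound used by the panic checks (25)
ROW_STRIDE = SIDE * CELL_SIZE  # 200, the multiplier in the expression
GRID_BASE = 0x10               # constant added to param_1
FIELD_1398 = 0x1398            # first index field
FIELD_13A0 = 0x13A0            # second index field


def cell_offset(row, col):
    """Offset from param_1 of the cell addressed by (row, col)."""
    if row not in range(SIDE) or col not in range(SIDE):
        raise IndexError("index out of bounds")
    return GRID_BASE + row * ROW_STRIDE + col * CELL_SIZE


# End of the last cell: this is where the grid region stops.
grid_end = cell_offset(SIDE - 1, SIDE - 1) + CELL_SIZE
print(hex(ROW_STRIDE), hex(grid_end))  # 0xc8 0x1398
```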

<p>Probably the maze is stored in this data structure. Let’s dump it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dump binary memory /tmp/1 $r8 $r8+0x13a0
</code></pre></div></div>

<p>Output:</p>

<pre><code>
00000000: 0a00 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000010: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000020: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000030: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000040: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000050: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000060: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000070: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000080: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00000090: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
000000a0: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
000000b0: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
000000c0: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
000000d0: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
000000e0: 0000 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
000000f0: 0000 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000100: 0000 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000110: 0000 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000120: 0000 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000130: 0000 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000140: 0000 <mark>0000 0000 0000</mark> 0000 <mark>0000 0000 0000</mark>
00000150: 0000 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
...
00001340: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00001350: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00001360: 0100 <mark>0000 0000 0000</mark> 0200 <mark>0000 0000 0000</mark>
00001370: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00001380: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
00001390: 0100 <mark>0000 0000 0000</mark> 0100 <mark>0000 0000 0000</mark>
</code></pre>

<p>These are little-endian uint64_t values. I’ve highlighted the bytes of each value that never change (they are always zero). So, the values are mostly <code class="language-plaintext highlighter-rouge">0</code> or <code class="language-plaintext highlighter-rouge">1</code>. One of them is <code class="language-plaintext highlighter-rouge">2</code>, which seems to be the destination we are trying to reach:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">lVar2</span> <span class="o">=</span> <span class="o">*</span><span class="p">(</span><span class="kt">long</span> <span class="o">*</span><span class="p">)(</span><span class="n">uVar4</span> <span class="o">*</span> <span class="mi">200</span> <span class="o">+</span> <span class="n">param_1</span> <span class="o">+</span> <span class="mh">0x10</span> <span class="o">+</span> <span class="n">uVar3</span> <span class="o">*</span> <span class="mi">8</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">lVar2</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="nl">if2_flag:</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">lVar2</span> <span class="o">==</span> <span class="mi">2</span><span class="p">)</span> <span class="p">{</span>
      <span class="n">flag</span><span class="p">();</span>
      <span class="k">do</span> <span class="p">{</span>
        <span class="n">invalidInstructionException</span><span class="p">();</span>
      <span class="p">}</span> <span class="k">while</span><span class="p">(</span> <span class="nb">true</span> <span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
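
<p>To see which cell holds that <code class="language-plaintext highlighter-rouge">2</code>, we can invert the addressing expression. This is a minimal sketch: the names are hypothetical, “row” simply means the index multiplied by 200 and “col” the index multiplied by 8, and the bound of 25 is taken from the panic message.</p>

```python
GRID_BASE = 0x10   # constant added to param_1 in the addressing expression
ROW_STRIDE = 200   # multiplier of the first index
CELL_SIZE = 8      # multiplier of the second index


def offset_to_cell(offset):
    """Invert the addressing expression: dump offset to (row, col)."""
    row, rest = divmod(offset - GRID_BASE, ROW_STRIDE)
    col, misaligned = divmod(rest, CELL_SIZE)
    if misaligned or row not in range(25):
        raise ValueError(f"{offset:#x} is not the start of a grid cell")
    return row, col


# 0x1368 is the offset where the hexdump shows the value 2.
print(offset_to_cell(0x1368))  # (24, 19)
```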

<p>Recall “the len is 25” in the panic message: this must be a 25x25 maze, so a 2D visualization would help. GIMP allows us to import a file as “Raw image data” and then play around with the file offset for alignment:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/InCTF2021/import1.png" alt="" />
</div>

<p>Each value is stored in 8 bytes (8 * 8 bits), so a row of 25 values takes <code class="language-plaintext highlighter-rouge">25 * 8 = 200</code> bytes (the stride seen in the addressing expression), which gives the width for a 1-bit representation. Sure, if we squint there’s a pattern in the preview, but we can avoid the gaps between dots by converting our values from uint64_t to a single byte:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">struct</span>

<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="s">'rb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">data</span> <span class="o">=</span> <span class="n">f</span><span class="p">.</span><span class="n">read</span><span class="p">()</span>
<span class="n">data2</span> <span class="o">=</span> <span class="sa">b</span><span class="s">""</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">),</span> <span class="mi">8</span><span class="p">):</span>
    <span class="n">v</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="n">data</span><span class="p">[</span><span class="n">i</span><span class="p">:</span><span class="n">i</span><span class="o">+</span><span class="mi">8</span><span class="p">])[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">v</span> <span class="o">==</span> <span class="mi">1</span><span class="p">:</span>
        <span class="n">v</span> <span class="o">=</span> <span class="sa">b</span><span class="s">"</span><span class="se">\xff</span><span class="s">"</span> <span class="c1"># white
</span>    <span class="k">elif</span> <span class="n">v</span> <span class="o">==</span> <span class="mi">2</span><span class="p">:</span>
        <span class="n">v</span> <span class="o">=</span> <span class="sa">b</span><span class="s">"</span><span class="se">\x7f</span><span class="s">"</span> <span class="c1"># grey
</span>    <span class="k">else</span><span class="p">:</span>
        <span class="n">v</span> <span class="o">=</span> <span class="sa">b</span><span class="s">"</span><span class="se">\x00</span><span class="s">"</span> <span class="c1"># black
</span>    <span class="n">data2</span> <span class="o">+=</span> <span class="n">v</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="s">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
    <span class="n">f</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="n">data2</span><span class="p">)</span>
</code></pre></div></div>

<p>Now we can use an 8-bit representation, which is clearer:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/InCTF2021/import2.png" alt="" />
</div>

<p>Zoomed-in:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/InCTF2021/zoom1.png" alt="" />
</div>
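
<p>As an alternative to an image editor, the grid can also be printed as text straight from the raw dump. The sketch below assumes the layout inferred so far (a 16-byte header followed by 25x25 little-endian 64-bit cells); <code class="language-plaintext highlighter-rouge">render</code> and the glyph choices are hypothetical, and the meaning given to each value is only our reading of the decompiled code.</p>

```python
HEADER_SIZE = 0x10   # bytes before the first cell
CELL_SIZE = 8        # one 64-bit value per cell
SIDE = 25            # from the "len is 25" panic message
# Assumed meaning of each value: 0 lets us move, 1 stops us, 2 is the destination.
GLYPHS = {0: " ", 1: "#", 2: "X"}


def render(dump):
    """Return the grid stored in a raw dump as SIDE lines of SIDE characters."""
    lines = []
    for row in range(SIDE):
        chars = []
        for col in range(SIDE):
            start = HEADER_SIZE + (row * SIDE + col) * CELL_SIZE
            value = int.from_bytes(dump[start:start + CELL_SIZE], "little")
            chars.append(GLYPHS.get(value, "?"))  # "?" marks unexpected values
        lines.append("".join(chars))
    return "\n".join(lines)
```

<p>For the dump taken earlier, this would be called as <code class="language-plaintext highlighter-rouge">print(render(open("/tmp/1", "rb").read()))</code>.</p>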

<p>Ok, but what’s the player position? We can confirm in the debugger the initial values stored in <code class="language-plaintext highlighter-rouge">r8+0x1398</code> and <code class="language-plaintext highlighter-rouge">r8+0x13a0</code>, and check how they are updated when moving around. The following gdb script allows us to supply inputs in a loop, and check if we bumped into a wall at a certain point and stopped moving. By checking the counter that is incremented on each switch case, we know how many steps of the supplied input were actually taken (if we bumped into a wall, then <code class="language-plaintext highlighter-rouge">counter &lt; len(input)</code>):</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">gdb</span>
<span class="kn">import</span> <span class="nn">struct</span>

<span class="c1"># start of FUN_00109590
</span><span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *(0x555555554000 + 0x9597)"</span><span class="p">)</span>
<span class="c1"># start of switch case (jmp rax)
</span><span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *(0x555555554000 + 0x95e8)"</span><span class="p">)</span>
<span class="c1"># exit
</span><span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"b *(0x555555554000 + 0x95a4)"</span><span class="p">)</span>

<span class="n">walk</span> <span class="o">=</span> <span class="s">''</span>
<span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
    <span class="c1"># Expecting sequence of [hjkl]*
</span>    <span class="n">walk</span> <span class="o">=</span> <span class="nb">input</span><span class="p">(</span><span class="s">"&gt; "</span><span class="p">)</span>
    <span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="sa">f</span><span class="s">"r &lt;&lt;&lt; $(printf '%s' </span><span class="si">{</span><span class="n">walk</span><span class="si">}</span><span class="s">)"</span><span class="p">)</span>

    <span class="c1"># Break at start of FUN_00109590, $r8 has the pointer to the maze structure
</span>    <span class="n">inferior</span> <span class="o">=</span> <span class="n">gdb</span><span class="p">.</span><span class="n">selected_inferior</span><span class="p">()</span>
    <span class="n">r8</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">gdb</span><span class="p">.</span><span class="n">parse_and_eval</span><span class="p">(</span><span class="s">"$r8"</span><span class="p">)).</span><span class="n">split</span><span class="p">()[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">10</span><span class="p">)</span>
    <span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>

    <span class="k">while</span> <span class="bp">True</span><span class="p">:</span>
        <span class="n">rip</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">gdb</span><span class="p">.</span><span class="n">parse_and_eval</span><span class="p">(</span><span class="s">"$rip"</span><span class="p">)).</span><span class="n">split</span><span class="p">()[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">16</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">rip</span> <span class="o">==</span> <span class="mh">0x555555554000</span> <span class="o">+</span> <span class="mh">0x95e8</span><span class="p">:</span>
            <span class="c1"># Break at start of switch case
</span>            <span class="n">pos_base</span> <span class="o">=</span> <span class="nb">int</span><span class="p">(</span><span class="nb">str</span><span class="p">(</span><span class="n">gdb</span><span class="p">.</span><span class="n">parse_and_eval</span><span class="p">(</span><span class="s">"$r8"</span><span class="p">)).</span><span class="n">split</span><span class="p">()[</span><span class="mi">0</span><span class="p">],</span> <span class="mi">10</span><span class="p">)</span>
            <span class="n">x</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">inferior</span><span class="p">.</span><span class="n">read_memory</span><span class="p">(</span><span class="n">r8</span><span class="o">+</span><span class="mh">0x13a0</span><span class="p">,</span> <span class="mi">8</span><span class="p">)))[</span><span class="mi">0</span><span class="p">]</span>
            <span class="n">y</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">inferior</span><span class="p">.</span><span class="n">read_memory</span><span class="p">(</span><span class="n">r8</span><span class="o">+</span><span class="mh">0x1398</span><span class="p">,</span> <span class="mi">8</span><span class="p">)))[</span><span class="mi">0</span><span class="p">]</span>
            <span class="k">print</span><span class="p">(</span><span class="sa">f</span><span class="s">'</span><span class="si">{</span><span class="nb">hex</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="si">}</span><span class="s"> </span><span class="si">{</span><span class="nb">hex</span><span class="p">(</span><span class="n">y</span><span class="p">)</span><span class="si">}</span><span class="s">'</span><span class="p">)</span>
            <span class="n">gdb</span><span class="p">.</span><span class="n">execute</span><span class="p">(</span><span class="s">"c"</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="k">break</span>

    <span class="c1"># Break at exit
</span>    <span class="n">walk_counter</span> <span class="o">=</span> <span class="n">struct</span><span class="p">.</span><span class="n">unpack</span><span class="p">(</span><span class="s">'&lt;Q'</span><span class="p">,</span> <span class="nb">bytearray</span><span class="p">(</span><span class="n">inferior</span><span class="p">.</span><span class="n">read_memory</span><span class="p">(</span><span class="n">r8</span><span class="o">+</span><span class="mi">8</span><span class="p">,</span> <span class="mi">8</span><span class="p">)))[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">print</span><span class="p">(</span><span class="n">walk_counter</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span>
</code></pre></div></div>

<p>Output:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># no movement, but advances counter
&gt; iii
0xd 0x1
0xd 0x1
0xd 0x1
3

# no movement, bumping into wall after first step
&gt; j 
0xd 0x1
0
&gt; jj
0xd 0x1
0

# moving left
&gt; h
0xd 0x1
1
&gt; hh
0xd 0x1
0xc 0x1
2

# moving right
&gt; l
0xd 0x1
1
&gt; ll
0xd 0x1
0xe 0x1
2
</code></pre></div></div>

<p>There’s an off-by-one when reporting the position, but all the information is there to traverse the maze. First, let’s mark the player position on our visualization, by computing the offset in the data structure corresponding to that maze cell, using the addressing expression found earlier:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>0x1 * 200 + 0x10 + 0xd * 8 = 0x140
</code></pre></div></div>
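<p>As a sanity check, the same arithmetic can be written as a small Python sketch. The constant names, and the reading of 200 as a row stride, 0x10 as a header size and 8 as a cell size, are assumptions made for illustration; only the numbers come from the addressing expression.</p>

```python
# Hypothetical names for the constants in the addressing expression.
ROW_STRIDE = 200    # assumed: bytes between two consecutive rows
HEADER_SIZE = 0x10  # assumed: bytes preceding the first cell
CELL_SIZE = 8       # assumed: bytes per maze cell


def cell_offset(x, y):
    """Offset in the dump of the cell at column x, row y."""
    return y * ROW_STRIDE + HEADER_SIZE + x * CELL_SIZE


# Player position reported by the gdb script: x = 0xd, y = 0x1.
print(hex(cell_offset(0xd, 0x1)))
```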

<p>Patched <code class="language-plaintext highlighter-rouge">0x2</code> in our dump:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>00000140: 0200 0000 0000 0000 0000 0000 0000 0000
</code></pre></div></div>

<p>Updated visualization:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/writeups/InCTF2021/zoom2.png" alt="" />
</div>

<p>Now we can just traverse manually. Turns out that <code class="language-plaintext highlighter-rouge">j</code> and <code class="language-plaintext highlighter-rouge">k</code> have switched directions, so going down uses <code class="language-plaintext highlighter-rouge">k</code>…</p>

<p>Here’s the complete input:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>llkkhhhhkkkkhhhhjjhhhhhhkkllkkkkkkhhkkllkklljjlllllljjhhjjllllllkklljjllkklljjllkkkkhhhhkkkkllkkkkhhk
</code></pre></div></div>

<p>When submitted to the remote host, we get the flag:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>hehe inctf{mizes_are_fun_or_get}
</code></pre></div></div>]]></content><author><name></name></author><category term="ctf" /><category term="reversing" /><category term="tracing" /><category term="visualization" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Filename Hook</title><link href="https://nevesnunes.github.io/blog/2021/07/22/Filename-Hook.html" rel="alternate" type="text/html" title="Filename Hook" /><published>2021-07-22T18:46:03+01:00</published><updated>2021-07-22T18:46:03+01:00</updated><id>https://nevesnunes.github.io/blog/2021/07/22/Filename-Hook</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/07/22/Filename-Hook.html"><![CDATA[<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />

<p>To work around a filesystem feature, I decided to try dynamic preloading, bumping into a bunch of libc corners…</p>

<h2 id="analysis">Analysis</h2>

<p>In this case, a git repository was failing to check out:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
You can inspect what was checked out with 'git status'
and retry with 'git restore --source=HEAD :/'
</code></pre></div></div>

<p>If we run with <code class="language-plaintext highlighter-rouge">strace -e file</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir("foo.", 0777) = -1 EINVAL (Invalid argument)
</code></pre></div></div>

<p>The same error could be reproduced with a simple <code class="language-plaintext highlighter-rouge">mkdir -p foo.</code>.</p>

<p>The filesystem was NTFS, where implementations may <a href="https://superuser.com/questions/585097/why-does-ntfs-disallow-the-use-of-trailing-periods-in-directory-names">disallow creating files with a dot at the end of the filename</a>. While Windows APIs support prefixing a path with <code class="language-plaintext highlighter-rouge">\\?\</code> to <a href="https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#win32-file-namespaces">disable all string parsing and passthrough to the filesystem</a>, Linux has NTFS-3G, which <a href="https://sourceforge.net/p/ntfs-3g/ntfs-3g/ci/17b56ccfa2334ec905b80b81b151c54a263a6d61/">honors the disallow behaviour</a> when mount option <code class="language-plaintext highlighter-rouge">windows_names</code> is set, so the boring solution is to mount without it.</p>

<p>Is there another way around this? Well, NTFS-3G uses the libc to interface with the filesystem, so we should be able to hook the relevant functions using LD_PRELOAD. The idea is to clean the filename, so that it no longer ends with a dot. I chose suffixing an underscore to it, since it’s relatively uncommon for names to end with <code class="language-plaintext highlighter-rouge">._</code>.</p>

<h2 id="covering-relevant-functions">Covering relevant functions</h2>

<p>We want all functions that expect a filename as argument. In particular, the signature is needed to know which arguments to pass when calling the original function with the cleaned filename, via <code class="language-plaintext highlighter-rouge">dlsym(RTLD_NEXT, "foo")</code>. The laziest approach I could think of was to grab the <a href="https://www.gnu.org/software/libc/manual/html_mono/libc.html">single page glibc documentation</a>, which conveniently describes functions in a greppable manner, and filter it by parameter names:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">grep</span> <span class="s1">'Function:'</span> libc.html | <span class="nb">grep</span> <span class="nt">-i</span> <span class="s1">'(.*\(.*filename\|path\).*)'</span>
</code></pre></div></div>

<p>Then we massage these signatures into hook functions (an example for <code class="language-plaintext highlighter-rouge">mkdir()</code>):</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">mkdir</span><span class="p">(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">filename</span><span class="p">,</span> <span class="n">mode_t</span> <span class="n">mode</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">filename</span> <span class="o">=</span> <span class="n">clean</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s">"mkdir"</span><span class="p">);</span>

    <span class="kt">int</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">filename</span><span class="p">,</span> <span class="n">mode_t</span> <span class="n">mode</span><span class="p">);</span>
    <span class="n">original</span> <span class="o">=</span> <span class="n">dlsym</span><span class="p">(</span><span class="n">RTLD_NEXT</span><span class="p">,</span> <span class="s">"mkdir"</span><span class="p">);</span>
    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">mode</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Except when it’s not that direct.</p>

<h3 id="variadic-arguments">Variadic arguments</h3>

<p>While C allows defining varargs, there’s no way to delegate them to another call without explicitly passing the arguments. Ok, so we parse them. But how many? It depends on each function’s convention… the list can end with a null pointer, or follow any other arbitrary criteria.</p>

<p>One case is <code class="language-plaintext highlighter-rouge">open()</code>, which can have an optional argument:</p>

<blockquote>
  <p>The argument mode is used only when a file is created.</p>
  <ul>
    <li>https://www.gnu.org/software/libc/manual/html_node/Opening-and-Closing-Files.html</li>
  </ul>
</blockquote>

<p>A better clarification of how that file creation check is done:</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">mode</code> specifies the permissions to use in case a new file is created. This argument must be supplied when O_CREAT is specified in flags; if O_CREAT is not specified, then mode is ignored.</p>
  <ul>
    <li>https://linux.die.net/man/2/open</li>
  </ul>
</blockquote>

<p>Alternatively, with a simple stat check, that one ends up as:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">stat</span> <span class="n">stat_buf</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">stat</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">stat_buf</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// File exists, ignore mode.</span>
    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">flags</span><span class="p">);</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="kt">va_list</span> <span class="n">argp</span><span class="p">;</span>
    <span class="n">va_start</span><span class="p">(</span><span class="n">argp</span><span class="p">,</span> <span class="n">flags</span><span class="p">);</span>
    <span class="n">mode_t</span> <span class="n">mode</span> <span class="o">=</span> <span class="n">va_arg</span><span class="p">(</span><span class="n">argp</span><span class="p">,</span> <span class="n">mode_t</span><span class="p">);</span>
    <span class="n">va_end</span><span class="p">(</span><span class="n">argp</span><span class="p">);</span>

    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">flags</span><span class="p">,</span> <span class="n">mode</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>But there are trickier cases, such as <code class="language-plaintext highlighter-rouge">execl()</code>, where we have to deal with zero or more arguments:</p>

<blockquote>
  <p>This is similar to execv, but the argv strings are specified
individually instead of as an array. A null pointer must be passed
as the last such argument.</p>
  <ul>
    <li>https://www.gnu.org/software/libc/manual/html_node/Executing-a-File.html</li>
  </ul>
</blockquote>

<p>In order to pass them explicitly, we have to compromise with a fixed number of handled cases:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">va_list</span> <span class="n">argp</span><span class="p">;</span>
<span class="n">va_start</span><span class="p">(</span><span class="n">argp</span><span class="p">,</span> <span class="n">arg0</span><span class="p">);</span>
<span class="kt">char</span> <span class="o">*</span><span class="n">argX</span> <span class="o">=</span> <span class="n">va_arg</span><span class="p">(</span><span class="n">argp</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="p">);</span>
<span class="kt">char</span> <span class="o">*</span><span class="n">args</span><span class="p">[</span><span class="mi">20</span><span class="p">];</span>
<span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="c1">// The list ends with a null pointer, so compare the pointer itself.</span>
<span class="k">while</span> <span class="p">(</span><span class="n">argX</span> <span class="o">!=</span> <span class="nb">NULL</span> <span class="o">&amp;&amp;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="mi">20</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// Store the pointer value, not the address of the loop variable.</span>
    <span class="n">args</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">argX</span><span class="p">;</span>
    <span class="n">argX</span> <span class="o">=</span> <span class="n">va_arg</span><span class="p">(</span><span class="n">argp</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="p">);</span>
    <span class="n">i</span><span class="o">++</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">va_end</span><span class="p">(</span><span class="n">argp</span><span class="p">);</span>

<span class="k">switch</span> <span class="p">(</span><span class="n">i</span><span class="p">)</span> <span class="p">{</span>
<span class="k">case</span> <span class="mi">1</span><span class="p">:</span>
    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">arg0</span><span class="p">,</span> <span class="n">args</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="nb">NULL</span><span class="p">);</span>
<span class="k">case</span> <span class="mi">2</span><span class="p">:</span>
    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">arg0</span><span class="p">,</span> <span class="n">args</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">args</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="nb">NULL</span><span class="p">);</span>
<span class="k">case</span> <span class="mi">3</span><span class="p">:</span>
    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">arg0</span><span class="p">,</span> <span class="n">args</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">args</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">args</span><span class="p">[</span><span class="mi">2</span><span class="p">],</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="nb">NULL</span><span class="p">);</span>
<span class="c1">// [...]</span>
<span class="nl">default:</span>
    <span class="k">return</span> <span class="p">(</span><span class="o">*</span><span class="n">original</span><span class="p">)(</span><span class="n">filename</span><span class="p">,</span> <span class="n">arg0</span><span class="p">,</span> <span class="p">(</span><span class="kt">char</span> <span class="o">*</span><span class="p">)</span><span class="nb">NULL</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Unfortunately the <a href="https://gcc.gnu.org/onlinedocs/cpp/Variadic-Macros.html">__VA_ARGS__ variadic macro</a> is of no use here, since we would still need to explicitly pass the arguments to it.</p>

<p>An alternative would be to <a href="https://stackoverflow.com/a/61474680/8020917">setup the call in assembly</a>, with all its portability caveats.</p>

<h3 id="wrappers-for-wrappers">Wrappers for wrappers</h3>

<p>Until now, we were assuming that the syscall names match the function symbols exposed by glibc, but we can find many exceptions.</p>

<p>As an example, compare these syscalls:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir("foo._", 0777) = 0
openat(AT_FDCWD, "foo.", O_RDONLY|O_NOCTTY|O_NONBLOCK|O_NOFOLLOW|O_DIRECTORY) = -1 ENOENT (No such file or directory)
</code></pre></div></div>

<p>Against the library calls output by <code class="language-plaintext highlighter-rouge">ltrace mkdir -p foo/bar</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir("foo", 0777) = 0
__open_2(0x7fffded3e73e, 0x30900, 1, 0) = 3
</code></pre></div></div>

<p>Why does ltrace report such a specific symbol for opening a file? Is it directly called by mkdir? Let’s follow in the debugger. For convenience, I’ve installed the glibc debuginfo for my Linux distro.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; catch syscall openat
Catchpoint 1 (syscall 'openat' [257])
pwndbg&gt; run -p adsf/asdf
...
 ► f 0     7ffff7fec278 __open_nocancel+56
   f 1     7ffff7fdc3da _dl_sysdep_read_whole_file+42
   f 2     7ffff7fe39a4 _dl_load_cache_lookup+372
   f 3     7ffff7fd5338 _dl_map_object+1656
   f 4     7ffff7fd9a05 openaux+53
   f 5     7ffff7fe903e _dl_catch_exception+110
   f 6     7ffff7fd9e2e _dl_map_object_deps+1054
   f 7     7ffff7fcf1f3 dl_main+7283
   f 8     7ffff7fe7fe7 _dl_sysdep_start+935
   f 9     7ffff7fcd0ef _dl_start+655
   f 10     7ffff7fcd0ef _dl_start+655
</code></pre></div></div>

<p>We’re still in libc startup, let’s move forward:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>   In file: /usr/include/bits/fcntl2.h
   52   }
   53       return __open_alias (__path, __oflag, __va_arg_pack ());
   54     }
   55
   56   if (__va_arg_pack_len () &lt; 1)
 ► 57     return __open_2 (__path, __oflag);
...
 ► f 0     7ffff7e8703b open64+91
   f 1     555555559e77 savewd_chdir+503
   f 2     55555555ffe9 make_dir_parents.constprop+745
   f 3     5555555602dd process_dir+77
   f 4     55555555711e main+1294
   f 5     7ffff7dbdb75 __libc_start_main+213
</code></pre></div></div>

<p>If we disassemble <code class="language-plaintext highlighter-rouge">savewd_chdir()</code> and check the instruction before the address in frame 1:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; disass savewd_chdir
...
0x0000555555559e72 &lt;+498&gt;:   call   0x555555556710 &lt;__open_2@plt&gt;
</code></pre></div></div>

<p>The corresponding symbol table contains the source filename:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; python print(gdb.lookup_symbol("__open_2")[0].symtab.fullname())
/usr/src/debug/glibc-2.33-18.fc34.x86_64/io/open_2.c
</code></pre></div></div>

<p>Where we can find our signature:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int __open_2 (const char *file, int oflag)
</code></pre></div></div>

<p>And a brief comment with its purpose:</p>

<blockquote>
  <p>_FORTIFY_SOURCE wrapper for open.</p>
</blockquote>

<p>There are plenty of other hardening and compatibility wrappers to be found, as we can glean from an <code class="language-plaintext highlighter-rouge">objdump -T /lib/libc.so.6</code> and cross-validate against <code class="language-plaintext highlighter-rouge">extern</code> signatures or <code class="language-plaintext highlighter-rouge">strong_alias</code>/<code class="language-plaintext highlighter-rouge">weak_alias</code> macro expansions.</p>

<h2 id="cleanup">Cleanup</h2>

<p>Appending an underscore should be pretty simple…</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mkdir("foo./bar", 0777) = -1 ENOENT (No such file or directory)
</code></pre></div></div>

<p>Of course, we need to handle each subpath, so let’s use <code class="language-plaintext highlighter-rouge">strtok()</code> to split by <code class="language-plaintext highlighter-rouge">/</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>segmentation fault (core dumped)
</code></pre></div></div>

<p>Hmm, let’s check the core dump with <code class="language-plaintext highlighter-rouge">coredumpctl gdb "$(coredumpctl list | tail -n1 | awk '{print $5}')"</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>RBX  0x7ff6314002e3 ◂— '/selinux/config'
RDI  0x7ff6314002e0 ◂— 'etc/selinux/config'
...
 ► 0x7ff6312a1ddb &lt;strtok_r+75&gt;    mov    byte ptr [rbx], 0
</code></pre></div></div>

<p>We see an attempt at writing to the filename passed as the first argument via RDI to <code class="language-plaintext highlighter-rouge">strtok()</code>. If we look up the section containing <code class="language-plaintext highlighter-rouge">0x7ff6314002e0</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg&gt; maintenance info sections
...
[44]     0x7ff631400000-&gt;0x7ff631407000 at 0x0003f000: load28 ALLOC READONLY
</code></pre></div></div>

<p>Oh right, <code class="language-plaintext highlighter-rouge">strtok()</code> mutates the string passed to it, so we need a mutable copy.</p>

<p>What else? Let’s try <code class="language-plaintext highlighter-rouge">rm -f foo</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>rm: failed to get attributes of '/': No such file or directory
</code></pre></div></div>

<p>Here’s ltrace with vs. without our hooks:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">-lstat(0x5641e64a824b, 0x7fff9f9f4150, 0x7fca8ee6d380, 1) = 0xffffffff
</span><span class="gi">+lstat(0x556c3bee324b, 0x7fffd7b839c0, 0x7f7fcdd41380, 1) = 0
</span></code></pre></div></div>

<p>Turns out that sometimes we want to fall back to a more informative <code class="language-plaintext highlighter-rouge">strace -k</code>:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">-lstat("", 0x7fffd8961c10)               = -1 ENOENT (No such file or directory)
-  &gt; /usr/lib64/libc-2.33.so() [0x100dba]
-  &gt; /home/fn/code/snippets/preload/ntfs_clean_name.so(lstat+0x59) [0x2e20]
-  &gt; /usr/bin/rm(main+0x88c) [0x319c]
-  &gt; /usr/lib64/libc-2.33.so() [0x27b74]
-  &gt; /usr/bin/rm(_start+0x2d) [0x420d]
</span><span class="gi">+newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
+  &gt; /usr/lib64/libc-2.33.so() [0xf080e]
+  &gt; /usr/bin/rm(main+0x88c) [0x319c]
+  &gt; /usr/lib64/libc-2.33.so() [0x27b74]
+  &gt; /usr/bin/rm(_start+0x2d) [0x420d]
</span></code></pre></div></div>

<p>We are calling another <code class="language-plaintext highlighter-rouge">lstat()</code> wrapper in libc, as seen in the different reported addresses (<code class="language-plaintext highlighter-rouge">0xf080e vs. 0x100dba</code>). Let’s inspect <code class="language-plaintext highlighter-rouge">main+0x88c</code>, but to break in the debugger, we want to adjust to the address at the beginning of the call instruction bytes:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Although 0x88c = 2188, the call is at 2184
0x0000555555557198 &lt;+2184&gt;:  call   0x555555556650 &lt;lstat@plt&gt;

# Let's break on that address then
gdb -ex 'b main' -ex 'run -r foo' -ex 'b *(0x555555554000 + 0x319c - 4)' -ex 'c' rm
...
   0x555555556650 &lt;lstat@plt&gt;       endbr64
   0x555555556654 &lt;lstat@plt+4&gt;     bnd jmp qword ptr [rip + 0xc7ed] &lt;lstat64&gt;
    ↓
 ► 0x7ffff7eb27e0 &lt;lstat64&gt;         endbr64
    ↓
   0x7ffff7eb27e7 &lt;lstat64+7&gt;       mov    ecx, 0x100
   0x7ffff7eb27ec &lt;lstat64+12&gt;      mov    rsi, rdi
   0x7ffff7eb27ef &lt;lstat64+15&gt;      mov    edi, 0xffffff9c
   0x7ffff7eb27f4 &lt;lstat64+20&gt;      jmp    fstatat64 &lt;fstatat64&gt;
    ↓
   0x7ffff7eb2800 &lt;fstatat64&gt;       endbr64
</code></pre></div></div>

<p>Turns out I was delegating to <code class="language-plaintext highlighter-rouge">__lxstat()</code>, as I misinterpreted this comment:</p>

<blockquote>
  <p>The ‘stat’, ‘fstat’, ‘lstat’ functions have to be handled special since
even while not compiling the library with optimization calls to these
functions in the shared library must reference the ‘xstat’ etc
functions. We have to use macros but we cannot define them in the
normal headers since on user level we must use real functions.</p>
  <ul>
    <li>https://code.woboq.org/userspace/glibc/include/sys/stat.h.html</li>
  </ul>
</blockquote>

<p>The correct behaviour is to just delegate to <code class="language-plaintext highlighter-rouge">lstat()</code>.</p>

<p>Almost there…</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>readlink("/usr/bin/rm", 0x7ffdf3a96500, 1023) = -1 EINVAL (Invalid argument)
</code></pre></div></div>

<p>We just need to compare traces with vs. without our hooks:</p>

<div class="language-diff highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="gd">-newfstatat(AT_FDCWD, "", 0x7ffe91509720, AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
</span><span class="gi">+newfstatat(AT_FDCWD, "/", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_SYMLINK_NOFOLLOW) = 0
</span></code></pre></div></div>

<p>Since <code class="language-plaintext highlighter-rouge">strtok()</code> consumes delimiters, a path consisting only of <code class="language-plaintext highlighter-rouge">/</code> would resolve to an empty string, so we need to handle that case separately.</p>

<hr />

<p>After these fixes, we arrive at the overall logic to implement:</p>

<ul>
  <li>Allocate memory for containing the characters of the original filename plus an extra character for each subpath name;</li>
  <li>Make a mutable copy for <code class="language-plaintext highlighter-rouge">strtok()</code>;</li>
  <li>Don’t clean special name-inode maps (i.e. <code class="language-plaintext highlighter-rouge">.</code> and <code class="language-plaintext highlighter-rouge">..</code>);</li>
  <li>If the pathname only contains delimiters, then return <code class="language-plaintext highlighter-rouge">/</code>;</li>
  <li>We don’t care about additional trailing delimiters, so we can just add a single <code class="language-plaintext highlighter-rouge">/</code> between subpaths;</li>
</ul>
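<p>The steps above can be modelled in a few lines of Python. This is only a sketch of the logic, not the C implementation: the function name, the <code class="language-plaintext highlighter-rouge">suffix</code> parameter and the handling of a leading <code class="language-plaintext highlighter-rouge">/</code> are assumptions, and memory allocation is left to the language.</p>

```python
def clean(path, suffix="_"):
    """Model of the cleanup logic; names and edge cases are assumptions."""
    # Splitting and dropping empty segments mimics strtok() consuming
    # runs of delimiters.
    parts = [part for part in path.split("/") if part]
    is_absolute = path.startswith("/")
    if not parts:
        # Only delimiters: resolve to the root instead of an empty string.
        return "/" if is_absolute else path
    cleaned = []
    for part in parts:
        # Leave the special names "." and ".." untouched.
        if part not in (".", "..") and part.endswith("."):
            part += suffix
        cleaned.append(part)
    # A single "/" between subpaths; trailing delimiters are dropped.
    return ("/" if is_absolute else "") + "/".join(cleaned)


print(clean("foo./bar"))
print(clean("/"))
```

<p>Keeping this model next to the C code makes it cheap to try out corner cases, such as <code class="language-plaintext highlighter-rouge">clean("a//b./")</code>, before touching pointers.</p>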

<h2 id="source-code">Source code</h2>

<p>Available in a <a href="https://github.com/nevesnunes/env/blob/master/common/code/snippets/preload/ntfs_clean_name.c">git repository</a>.</p>

<p>Try it out:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>gcc ntfs_clean_name.c <span class="nt">-fPIC</span> <span class="nt">-shared</span> <span class="nt">-ldl</span> <span class="nt">-o</span> ntfs_clean_name.so
<span class="nv">LD_PRELOAD</span><span class="o">=</span>./ntfs_clean_name.so <span class="nb">touch </span>foo.
</code></pre></div></div>

<h2 id="further-work">Further work</h2>

<ul>
  <li>Some hooks may be missing, but simple use cases should be already covered: creating, accessing, and removing files. I’ve considered parsing the remaining signatures with tree-sitter, but it turns out that some tokens aren’t recognized by its C grammar, such as variadic arguments;</li>
  <li>For statically built apps, instead of LD_PRELOAD, we could use <a href="https://blog.nelhage.com/2010/08/write-yourself-an-strace-in-70-lines-of-code/">ptrace</a>;</li>
</ul>]]></content><author><name></name></author><category term="filesystems" /><category term="linkers" /><category term="dynamic instrumentation" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Side-Channel Statistical Analysis</title><link href="https://nevesnunes.github.io/blog/2021/01/31/Side-Channel-Statistical-Analysis.html" rel="alternate" type="text/html" title="Side-Channel Statistical Analysis" /><published>2021-01-31T21:00:00+00:00</published><updated>2021-01-31T21:00:00+00:00</updated><id>https://nevesnunes.github.io/blog/2021/01/31/Side-Channel-Statistical-Analysis</id><content type="html" xml:base="https://nevesnunes.github.io/blog/2021/01/31/Side-Channel-Statistical-Analysis.html"><![CDATA[<p>Without a good intuition of what packet fields to consider, finding side-channel data in packet captures becomes a bit harder. While <code class="language-plaintext highlighter-rouge">wireshark</code> provides some statistics views to summarize conversations, we may desire to look into other packet details as well.</p>

<p>Our main focus is to find data points that differ significantly from others, i.e. <strong>outliers</strong>.</p>

<p>I’ll describe some approaches for these types of datasets, considering the trade-offs of each approach.</p>

<h1 id="eyeballing">Eyeballing</h1>

<p>Let’s start with an example, the CTF task <a href="https://ctftime.org/writeup/24019">Patience</a> from BalCCon2k20. <a href="https://ajdin.io/posts/ctf-balccon-2020/#forensicspatience">One of its writeups</a> alludes to eyeballing tcp duration differences (on <code class="language-plaintext highlighter-rouge">wireshark</code>, under <code class="language-plaintext highlighter-rouge">Statistics &gt; Conversations</code>):</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/side-channels/conversations.png" alt="" />
</div>

<p>Which are more explicit in an <code class="language-plaintext highlighter-rouge">I/O Graph</code> for the field <code class="language-plaintext highlighter-rouge">tcp.time_delta</code>:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/side-channels/io_graph.png" alt="" />
</div>

<p>From the observed 3 ranges of values (around 0, 0.5, and 1), we can map those to Morse code characters (where one of them is used as delimiter).</p>
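<p>A minimal sketch of that mapping, assuming the deltas were already exported as a list of floats. Which range stands for a dot, a dash or the delimiter is a guess here, and the sample values are made up to exercise the function; both need to be checked against the actual capture.</p>

```python
# Hypothetical assignment of ranges to symbols (an assumption, see above).
SYMBOLS = {0: ".", 1: "-", 2: " "}


def deltas_to_morse(deltas, step=0.5):
    """Snap each delta to the nearest multiple of `step` and map it to a symbol."""
    return "".join(SYMBOLS[round(delta / step)] for delta in deltas)


# Made-up deltas around 0, 0.5 and 1.
print(deltas_to_morse([0.01, 0.49, 0.02, 1.02, 0.51]))
```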

<p>These views are ideal if eyeballing does lead you to them. Previously, I’ve written about the CTF task <a href="https://nevesnunes.github.io/blog/2020/07/20/CTF-Writeup-UIUCTF-2020-RFCland.html">RFCLand</a>, where the needle was in field <code class="language-plaintext highlighter-rouge">ip.flags.rb</code>, an uncommon field which easily goes unnoticed in this approach.</p>

<h1 id="visualizing-packet-fields">Visualizing packet fields</h1>

<p>Some <strong>data cleaning</strong> is required before attempting to visualize these values.</p>

<p>Given that some of them are numerical, others categorical or string data, we can map all non-numerical variables to numerical, by assigning each distinct value its own number. Missing values are assigned an infinite value. Finally, <a href="https://github.com/nevesnunes/env/blob/master/common/code/snippets/pcap/packets_to_csv.py">output as csv</a>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tshark <span class="nt">-r</span> patience.pcap <span class="nt">-T</span> json <span class="o">&gt;</span> patience.json
./packets_to_csv.py <span class="nt">--type</span> float patience.json
</code></pre></div></div>

<p>Given multiple variables, all countable and ordered, we can plot them with <a href="https://github.com/nevesnunes/aggregables/blob/master/aggregables/captures/matplotlib/multiple_bar.py">small multiple bar charts</a>, sorting figures based on the approaches described below.</p>

<h2 id="tukeys-fences">Tukey’s Fences</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./multiple_bar.py <span class="nt">-f</span> patience.json.float.csv <span class="nt">--strategy</span> tukey_fences
</code></pre></div></div>

<p>By calculating <strong>quartiles</strong> from <strong>standard deviation</strong>, a range can be defined to identify outliers: points which aren’t contained in that range.</p>

<p>However, this doesn’t give a good idea of other groups of values in our dataset. Depending on how points are distributed, most of them could end up being classified as outliers.</p>
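<p>For reference, the textbook formulation of Tukey’s fences derives the range from the first and third quartiles and the interquartile range between them. The sketch below follows that formulation using only the standard library; the linked <code class="language-plaintext highlighter-rouge">multiple_bar.py</code> may compute its range differently, and the <code class="language-plaintext highlighter-rouge">tukey_fences</code> and <code class="language-plaintext highlighter-rouge">outliers</code> helpers are hypothetical names.</p>

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences computed from the quartiles of values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def outliers(values, k=1.5):
    """Return the values that fall outside of the fences."""
    lower, upper = tukey_fences(values, k)
    return [v for v in values if v > upper or lower > v]
```

<p>With a mostly constant variable such as <code class="language-plaintext highlighter-rouge">[0] * 10 + [1]</code>, both quartiles are 0, so the fences collapse to a single value and the lone <code class="language-plaintext highlighter-rouge">1</code> is reported as an outlier.</p>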

<h2 id="clustering">Clustering</h2>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./multiple_bar.py <span class="nt">-f</span> patience.json.float.csv <span class="nt">--strategy</span> clustering
</code></pre></div></div>

<p>To get a better quantification of “density” between points, we can use <strong>clustering algorithms</strong>, such as DBSCAN.</p>

<p>Based on its <a href="http://www2.cs.uh.edu/~ceick/7363/Papers/dbscan.pdf">paper</a>, we define the following parameters:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">eps</code>: maximum distance between 2 points for them to be considered neighbors;</li>
  <li><code class="language-plaintext highlighter-rouge">min_samples</code>: minimum number of neighbors to consider a point to be inside the cluster (in our case 4, since we will apply it per variable, therefore &lt;= 2 dimensions);</li>
  <li><code class="language-plaintext highlighter-rouge">metric</code>: distance function, in our case euclidean distance.</li>
</ul>
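<p>To show how these parameters interact, here is a simplified, pure Python DBSCAN for a single variable. It is a sketch for illustration rather than the implementation used by the script: the <code class="language-plaintext highlighter-rouge">dbscan_1d</code> name is hypothetical, the distance is hardcoded to the absolute difference (euclidean distance in 1 dimension), and the quadratic neighbor search only suits small inputs.</p>

```python
def dbscan_1d(values, eps, min_samples=4):
    """Label each value with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    # Neighborhoods include the point itself, as in the original paper.
    neighbors = [
        [j for j in range(n) if eps >= abs(values[i] - values[j])]
        for i in range(n)
    ]
    labels = [None] * n  # None means "not visited yet"
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if min_samples > len(neighbors[i]):
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # core point: keep expanding
    return labels
```

<p>Applied to values grouped around 0, 0.5, and 1 with <code class="language-plaintext highlighter-rouge">eps=0.05</code>, each group of at least 4 points becomes its own cluster, while an isolated value is labeled <code class="language-plaintext highlighter-rouge">-1</code>.</p>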

<p>Epsilon (<code class="language-plaintext highlighter-rouge">eps</code>) can be estimated by taking the derivative closest to 1 at the “elbow” of a curve where sorted distances between data points are plotted. In some cases the derivative isn’t valid, so we fall back to using half of the smallest distance between the closest 2 points.</p>
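<p>One possible reading of this heuristic is sketched below. Several details are assumptions: the curve is built from each point’s distance to its <code class="language-plaintext highlighter-rouge">k</code>-th nearest neighbor, both axes are normalized to [0, 1] so that a slope of 1 is meaningful, and the hypothetical <code class="language-plaintext highlighter-rouge">default</code> parameter covers the case where all values are identical.</p>

```python
def estimate_eps(values, k=3, default=1.0):
    """Estimate eps from the elbow of the sorted k-th neighbor distances."""
    points = sorted(values)
    n = len(points)
    # Fallback: half of the smallest gap between 2 distinct neighboring values.
    gaps = [b - a for a, b in zip(points, points[1:]) if b > a]
    fallback = min(gaps) / 2 if gaps else default
    if k >= n:
        return fallback
    # Distance from each point to its k-th nearest neighbor, sorted ascending.
    kdist = sorted(
        sorted(abs(p - q) for q in points)[k]  # index 0 is the point itself
        for p in points
    )
    span = kdist[-1] - kdist[0]
    if span == 0:
        return fallback  # flat curve: no elbow, so no usable derivative
    # Normalize both axes, then take finite differences as the derivative.
    xs = [i / (n - 1) for i in range(n)]
    ys = [(d - kdist[0]) / span for d in kdist]
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(n - 1)]
    elbow = min(range(n - 1), key=lambda i: abs(slopes[i] - 1))
    return kdist[elbow] if kdist[elbow] > 0 else fallback
```

<p>When a variable only takes a couple of distinct values, the curve is flat and the fallback is used, which is the situation described above where the derivative isn’t valid.</p>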

<h2 id="comparisons">Comparisons</h2>

<p>Ideally, variables containing side-channels should appear before other variables, so that the reader has to go through fewer irrelevant figures.</p>

<h3 id="task-patience">Task: Patience</h3>

<p><a href="https://nevesnunes.github.io/blog/assets/writeups/BalCCon2k20/patience.pcap">Download pcap</a></p>

<p>Here’s how Tukey’s Fences handles these packet fields (<code class="language-plaintext highlighter-rouge">#</code> denotes number of clusters, outliers are encoded in grey, non-deviating values are encoded in blue):</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/side-channels/patience-tukey.png" alt="" />
</div>

<blockquote>
  <p>Position: 15th out of 193 figures</p>
</blockquote>

<p>Most points were misclassified as outliers.</p>

<p>To understand why, let’s calculate expected <a href="https://github.com/nevesnunes/env/blob/master/common/code/snippets/encodings/morse_letter_frequency.py">Morse character frequencies</a>, based on English letter frequencies:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{".": 59.30666666666667, "-": 40.67333333333334}
</code></pre></div></div>

<p>On our payload, we have the following frequencies:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>{".": 47, "-": 40, " ": 42}
</code></pre></div></div>

<p>They approximately match the expected frequencies, so the <code class="language-plaintext highlighter-rouge">.</code> character skews the standard deviation towards it. There are 93 outliers, which is close to the sum of the other 2 characters (40 + 42).</p>

<p>However, with DBSCAN (outliers are encoded in grey, clusters up to a defined limit are encoded with distinct colors):</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/side-channels/patience-dbscan.png" alt="" />
</div>

<blockquote>
  <p>Position: 11th out of 193 figures</p>
</blockquote>

<p>Note how we get 3 clusters, and all points fall into one of them, so we don’t get outliers. A much better match with the actual side-channel values.</p>

<h3 id="task-rfcland">Task: RFCLand</h3>

<p><a href="https://nevesnunes.github.io/blog/assets/writeups/UIUCTF2020/challenge.pcap">Download pcap</a></p>

<p>Some cases don’t work so well with DBSCAN:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/side-channels/rfcland-dbscan.png" alt="" />
</div>

<blockquote>
  <p>Position: 45th out of 112 figures</p>
</blockquote>

<p>The actual side-channel values are identified as 1 of 2 clusters, but ideally, we should have one single cluster, with all side-channel values represented as outliers. Furthermore, since we are sorting by decreasing number of clusters and variance (<code class="language-plaintext highlighter-rouge">stdev</code> and <code class="language-plaintext highlighter-rouge">eps</code>), the figure ends up ranking in a lower than desirable position.</p>

<p>Compare with Tukey’s Fences:</p>

<div class="c-container-center">
    <img src="https://nevesnunes.github.io/blog/assets/img/side-channels/rfcland-tukey.png" alt="" />
</div>

<blockquote>
  <p>Position: 19th out of 112 figures</p>
</blockquote>

<p>Outliers are correctly identified, since the standard deviation is skewed towards non-side-channel values in our side-channel variable. Since our sorting criteria consider outliers first, this gives a better position when compared to the previous approach.</p>

<hr />

<p>While there is no clear winner here, it is worth going through both approaches, as they highlight distinct features of these datasets. Given this conclusion, it’s hard to say how the sorting could be tweaked to further improve positions.</p>

<h1 id="further-work">Further work</h1>

<ul>
  <li>DBSCAN could be applied across the whole dataset instead of for each packet field. Given the high-dimensionality (there would be as many dimensions as packet fields), some feature selection would be needed;</li>
  <li>Matplotlib figure drawing is slow (System: CPU: Intel i5-4200U, RAM: 12GiB DDR3 1600 MT/s), as it takes around 8 minutes to render a pdf with 193 figures, each containing 1287 data points. I’m considering moving the rendering to D3.js.</li>
</ul>

<link rel="stylesheet" href="https://nevesnunes.github.io/blog/assets/css/custom.css" />]]></content><author><name></name></author><category term="ctf" /><category term="protocol analysis" /><category term="visualization" /><summary type="html"><![CDATA[Without a good intuition of what packet fields to consider, finding side-channel data in packet captures becomes a bit harder. While wireshark provides some statistics views to summarize conversations, we may desire to look into other packet details as well.]]></summary></entry></feed>