- Replace vimrc with your own.
- Copy your ~/.vim (extensions, etc.) inside vimfiles.
- Run make.
- vim.com is created.
Actually Portable Executable is a binary format by Justine Tunney that makes it possible to run a single executable on a variety of operating systems and architectures, including Windows, macOS and Linux on both arm64 and x86-64.
The APE format is used by
Cosmopolitan, an implementation
of the C standard library that reconfigures gcc and clang to output
APE binaries and packs in an astonishing bag of tricks.
I am often connected to remote machines with different operating systems, and as an avid Vim user I'd like to bring my configuration with me. This project is part personal curiosity and part scratching my own itch.
The purpose of this README is to serve as a "post-hoc lab notebook" of sorts. I have no previous experience with the Cosmopolitan ecosystem, so most of this story, dear reader, might seem rather basic to you. I am writing this in the hope that it can be somehow useful, perhaps to one of my future selves. At the very least I'm a noob you can laugh at; that should have some entertainment value.
I'd like to acknowledge the work done by the developers at superconfigure, a repository of software built with Cosmopolitan. This work was developed independently, but I did occasionally "compare notes" with their work.
As suggested by the Cosmopolitan documentation, the easiest way
to start is to download a binary build of the toolchain, available
here. The compiler and associated
tools are ridiculously easy to use, especially considering the black
magic we're about to summon. If you have ever used gcc or clang
you'll be right at home.
wget https://cosmo.zip/pub/cosmocc/cosmocc.zip
unzip cosmocc.zip -d cosmocc
At the Savior's command and formed by divine teaching, we dare write inside hello.c:
#include <stdio.h>
int main(void)
{
printf("hello, world\n");
}
Issuing the obvious command just works:
./cosmocc/bin/cosmocc hello.c -o hello
Unlike a normal compiler, cosmocc produced three output executables:
$ ls -l hello*
-rwxr-xr-x. 1 edoardo edoardo 409294 Apr 4 15:30 hello
-rwx--x--x. 1 edoardo edoardo 689878 Apr 4 15:30 hello.aarch64.elf
-rw-r--r--. 1 edoardo edoardo 69 Apr 4 15:03 hello.c
-rwx--x--x. 1 edoardo edoardo 1056430 Apr 4 15:30 hello.com.dbg
This is because cosmocc is actually running a bunch of stuff under
the hood. We can get more information about that by setting the BUILDLOG
environment variable:
BUILDLOG=log ./cosmocc/bin/cosmocc hello.c -o hello
cat log
(cd /home/edoardo/tutorial; cosmocc/bin/x86_64-linux-cosmo-gcc -o/tmp/fatcosmocc.ekwu71tq5t2z2.o -D__COSMOPOLITAN__ -D__COSMOCC__ -D__FATCOSMOCC__ -include libc/integral/normalize.inc -fportcosmo -fno-semantic-interposition -fno-optimize-sibling-calls -mno-omit-leaf-frame-pointer -fno-schedule-insns2 -Wno-implicit-int -mno-tls-direct-seg-refs -fpatchable-function-entry=18,16 -fno-inline-functions-called-once -DFTRACE -DSYSDEBUG -fno-pie -nostdinc -isystem cosmocc/bin/../include -c hello.c -fno-omit-frame-pointer)
(cd /home/edoardo/tutorial; cosmocc/bin/aarch64-linux-cosmo-gcc -o/tmp/fatcosmocc.5zju32xc4e3l1.o -D__COSMOPOLITAN__ -D__COSMOCC__ -D__FATCOSMOCC__ -include libc/integral/normalize.inc -fportcosmo -fno-semantic-interposition -fno-optimize-sibling-calls -mno-omit-leaf-frame-pointer -fno-schedule-insns2 -Wno-implicit-int -ffixed-x18 -ffixed-x28 -fpatchable-function-entry=7,6 -fno-inline-functions-called-once -DFTRACE -DSYSDEBUG -fno-pie -nostdinc -isystem cosmocc/bin/../include -fsigned-char -c hello.c -fno-omit-frame-pointer)
(cd /home/edoardo/tutorial; cosmocc/bin/fixupobj /tmp/fatcosmocc.ekwu71tq5t2z2.o)
(cd /home/edoardo/tutorial; cosmocc/bin/fixupobj /tmp/fatcosmocc.5zju32xc4e3l1.o)
(cd /home/edoardo/tutorial; cosmocc/bin/x86_64-linux-cosmo-gcc -o/tmp/fatcosmocc.vokhorjqgmm52.com.dbg cosmocc/bin/../x86_64-linux-cosmo/lib/ape.o cosmocc/bin/../x86_64-linux-cosmo/lib/crt.o -static -nostdlib -no-pie -fuse-ld=bfd -Wl,-z,noexecstack -Wl,-z,norelro -Wl,--gc-sections -Lcosmocc/bin/../x86_64-linux-cosmo/lib -Lcosmocc/bin/../x86_64-linux-cosmo/lib -Wl,-T,cosmocc/bin/../x86_64-linux-cosmo/lib/ape.lds -Wl,-z,common-page-size=4096 -Wl,-z,max-page-size=16384 /tmp/fatcosmocc.ekwu71tq5t2z2.o -lcosmo)
(cd /home/edoardo/tutorial; cosmocc/bin/aarch64-linux-cosmo-gcc -o/tmp/fatcosmocc.7mbrfufn709n3.aarch64.elf cosmocc/bin/../aarch64-linux-cosmo/lib/crt.o -static -nostdlib -no-pie -fuse-ld=bfd -Wl,-z,noexecstack -Wl,-z,norelro -Wl,--gc-sections -Lcosmocc/bin/../aarch64-linux-cosmo/lib -Lcosmocc/bin/../aarch64-linux-cosmo/lib -Wl,-T,cosmocc/bin/../aarch64-linux-cosmo/lib/aarch64.lds -Wl,-z,common-page-size=16384 -Wl,-z,max-page-size=16384 /tmp/fatcosmocc.5zju32xc4e3l1.o -lcosmo)
(cd /home/edoardo/tutorial; cosmocc/bin/fixupobj /tmp/fatcosmocc.7mbrfufn709n3.aarch64.elf)
(cd /home/edoardo/tutorial; cosmocc/bin/fixupobj /tmp/fatcosmocc.vokhorjqgmm52.com.dbg)
(cd /home/edoardo/tutorial; cosmocc/bin/apelink -V -1 -l cosmocc/bin/ape-x86_64.elf -l cosmocc/bin/ape-aarch64.elf -M cosmocc/bin/ape-m1.c -o hello /tmp/fatcosmocc.vokhorjqgmm52.com.dbg /tmp/fatcosmocc.7mbrfufn709n3.aarch64.elf)
(cd /home/edoardo/tutorial; cosmocc/bin/pecheck hello)
Well, that's a mouthful. But reading it carefully illustrates what's happening. The first two lines are the compilation of the translation unit. The compiler is invoked twice, the first time to generate x86 object code, the second time to generate ARM object code. The two object files are given random names and put into a temporary directory.
With each of those files as input, fixupobj is invoked. It optimizes
the code and modifies the layout so that the final compiled object can
double as a zip archive to put assets inside. This is yet another trick up
Cosmopolitan's sleeve, as we'll see later.
Then, the object files are linked with the libc and the C runtime. Nothing
unusual, but Cosmopolitan has to tell the underlying compiler to use
static linking (-static) and not to pull in its own standard library and
startup files (-nostdlib). This results in ELF binaries. The two binaries
are then merged into their final APE form with apelink, and as a last
step pecheck validates the PE structure of the result.
Black magic indeed.
Vi is a venerable text editor. It has been implemented many times in many
different variants, and ported to a staggering number of operating systems
and architectures. The most common variation you'll find in the wild is
vim, which is the Vi clone of choice in most Linux distributions and
macOS. Other popular clones are neovim, a new project that bundles
the Lua interpreter, a new terminal emulator and an LSP client among
other things, and the traditional vi clone that comes with OpenBSD.
There are many more: openvi, elvis, vile, nvi, and even Busybox
embeds a tiny vi. The last one holds a special place in my heart:
it was the first vi I ever used, and the editor I wrote my first
"hello world" on. The focus of this project is Vim, not any of the
other clones.
For a program I have used so much, I am ashamed to admit I had never
built it myself, let alone taken a look at the source code. Getting the
vim source code is easier than you might expect: it's on GitHub,
a mere git clone away. The core project is written in C, and the
code is rather readable and not too hard to hack on (foreshadowing is
a powerful narrative technique). The only downside is that the years
and the many platforms supported by the project have taken their toll,
and as a result some sections have become a jungle of #ifdefs.
Vim is a GNU Autotools project, which should be easy to build with the
Cosmopolitan machinery. The official documentation suggests setting the
CC environment variable and a few related ones (CXX for the C++
compiler if needed, which Cosmopolitan also provides; INSTALL and
AR), and then running the configure script with --prefix.
The last part is critical. Cosmopolitan uses an ABI that is not compatible with the rest of the libraries on your operating system (or indeed any operating system!) Looking back at the log output of our smoke test, we already see a clear indication of this in the name of the compiler itself:
x86_64-linux-cosmo
For all intents and purposes, we should treat that as a target triplet
different from whatever our machine is using. An immediate consequence is
that we can't rely on our OS package manager (dnf, apt, pacman,
etc.) for libraries. If we wanted to write a program against, say,
the Postgres library, we would ordinarily install it with a command like:
sudo dnf install libpq-devel
After our package manager is done, the net result will be that libpq.so
is put in a designated library directory, the required headers in the
include directory, and we can go our merry way linking our program with
the shared object.
But in our case this will never work. Our package manager will give us correctly compiled object files for our machine's target triplet, which doesn't mix and match with Cosmopolitan. Whatever we need, we will have to build from scratch.
So, after cloning the source, we try to use autotools:
mkdir -p opt/include
mkdir -p opt/lib
export CC="$HOME/tutorial/cosmocc/bin/cosmocc -I$HOME/tutorial/opt/include -L$HOME/tutorial/opt/lib"
export CXX="$HOME/tutorial/cosmocc/bin/cosmoc++ -I$HOME/tutorial/opt/include -L$HOME/tutorial/opt/lib"
export INSTALL="$HOME/tutorial/cosmocc/bin/cosmoinstall"
export AR="$HOME/tutorial/cosmocc/bin/cosmoar"
cd vim/src
./configure --prefix=$HOME/tutorial/opt
A flurry of activity fills the terminal, as the configure script makes its sanity checks. A familiar sight appears: an error message.
...
...
...
no terminal library found
checking for tgetent()... configure: error: NOT FOUND!
You need to install a terminal library; for example ncurses.
On Linux that would be the libncurses-dev package.
Or specify the name of the library with --with-tlib.
Vim depends on a "terminal library", a piece of software that knows
the capabilities of the terminal it's running on (colors, ANSI support,
etc.) and can correctly render the screen. The configure script suggests
that ncurses is one such library, and it makes the normally sensible
suggestion of installing it from the package manager. As we discussed
before, this simple solution is unfortunately unavailable to us.
Ncurses, unlike Vim itself, doesn't seem to be developed on a "forge" platform like GitHub or GitLab, but since it is a very popular piece of software its source code is mirrored everywhere, and it's easy enough to obtain a copy of the most recent stable release, for example from the GNU FTP server.
It is an Autotools project as well, which means we can recycle the
same strategy as above to try and configure it. The only caveat is
that we need to specify --disable-lib-suffixes as an option to
the configure script, or otherwise the library will be compiled as
ncursesw. Other than that, there are no problems, and the configure
script terminates successfully:
...
...
...
** Configuration summary for NCURSES 6.5 20240427:
extended funcs: yes
xterm terminfo: xterm-new
bin directory: /home/edoardo/tutorial/opt/bin
lib directory: /home/edoardo/tutorial/opt/lib
include directory: /home/edoardo/tutorial/opt/include/ncursesw
man directory: /home/edoardo/tutorial/opt/share/man
terminfo directory: /home/edoardo/tutorial/opt/share/terminfo
** Include-directory is not in a standard location
Great! We can go on to compile and install the library with the usual
make -j && make install, there won't be any problems whatsoev...
../ncurses/curses.priv.h:1941:28: error: implicit declaration of function 'wcwidth' [-Wimplicit-function-declaration]
1941 | #define _nc_wacs_width(ch) wcwidth(ch)
|
For all those not conversant in ancient bullshit, that error message is a
fascinating piece of history. Before ISO C99, undeclared functions could
be assumed by the compiler to have return type int and unspecified
parameters. We did not specify a standard in our compilation options,
therefore that program could technically compile. Fortunately, the
compiler declines to maliciously comply and gracefully informs us of
the problem: wcwidth is undeclared.
A quick glimpse at the manual reveals the problem: wcwidth is POSIX but
not ISO C. We have two options: compile without wide character support,
or provide an implementation of the missing function. Since Cosmopolitan
does include wcwidth, it's easy enough to patch the code to make it
compile: we just add the appropriate header at the top of curses.priv.h:
#ifdef __COSMOPOLITAN__
#include "libc/str/unicode.h"
#endif
We run it back, and apart from a few reminders that const in C is
actually a lie:
../ncurses/./tinfo/lib_baudrate.c:105:27: warning: not sure if modding const structs is good
105 | static struct speed const speeds[] =
|
the process completes without too many issues. The library is now compiled and installed, and we can have another go at trying to build Vim.
Running back the configure script now works, and we can finally attempt to build our program again. The fan spins up as electricity fills the room. Eventually we are returned to the prompt. It's moving! It's alive! Now I know what it feels like to be God!
Inside opt/bin the lifeless mass of code has been imbued with the
spark of life and transmuted into a vim executable, the +x bit
already set. But trying to run it has a quite disappointing effect:
bash: ./vim: cannot execute binary file: Exec format error
Looking inside the source directory we see the raw ELF files which
seem to run just fine. What happened? The mystery lies in the
install target of the generated Makefile:
installvimbin: $(VIMTARGET) $(DESTDIR)$(exec_prefix) $(DEST_BIN)
-if test -f $(DEST_BIN)/$(VIMTARGET); then \
mv -f $(DEST_BIN)/$(VIMTARGET) $(DEST_BIN)/$(VIMNAME).rm; \
rm -f $(DEST_BIN)/$(VIMNAME).rm; \
fi
$(INSTALL_PROG) $(VIMTARGET) $(DEST_BIN)
$(STRIP) $(DEST_BIN)/$(VIMTARGET)
chmod $(BINMOD) $(DEST_BIN)/$(VIMTARGET)
The penultimate step of the build process, just before making the binary
executable, invokes strip. Cosmopolitan builds an Actually Portable
Executable, a format that strip doesn't know; running it regardless
will break the executable. We can disable this by replacing strip
with "do nothing":
export STRIP=/bin/true
It works! We finally have our very own Vim!
The next step is what we're all here for: we move the executable to other machines, running different operating systems, and see what happens. It runs. The very same executable, compiled on an x86 Linux machine, can successfully run on x86 Windows, aarch64 Linux, even on an ARM Mac under macOS!
As technically impressive as this is, the end product still seems to
have a few problems. For one, colors are missing, and attempting to set
them with :color default results in an ominous error message. Many
features are broken: for example, trying to open a directory with :e .
simply doesn't work, while normally a nice TUI menu shows up.
This suggests there's a problem with the Vim runtime. Vim isn't
just the executable, but also a set of runtime files that normally
live under /usr/share/vim in a Unix system. When you install vim
through a package manager, those files are magically dropped in place
and seamlessly referenced by the main program. But on a different machine
they might not exist! Let's investigate by typing :version:
...
fall-back for $VIM: "/home/edoardo/tutorial/opt/share/vim"
...
Indeed, a hardcoded string. The problem here is twofold: we need to
find a way to ship the runtime with the executable, and have the $VIM
variable point to the right place.
Cosmopolitan gives us a massive hand. Believe it or not, APE binaries are also zip files, their contents accessible under the /zip path from the executable. This means that simply "updating" the executable as though it were a regular zip file:
cd opt
cp bin/vim vim.com
zip -qur vim.com share/vim share/terminfo
has the effect of bundling those directories with the executable! Running the program again and trying to open some of those files from within Vim confirms that they are indeed accessible.
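The same check can be made from C rather than from inside Vim. What follows is a minimal sketch, not code from this project: bundled_asset_exists is a made-up helper name, and the only thing it relies on is the /zip prefix described above.

```c
#include <stdio.h>

/*
 * Return 1 if `rel` (for example "share/vim/vimrc") can be opened
 * through the /zip/ prefix, 0 otherwise.
 * In a Cosmopolitan build, /zip/ maps to the archive embedded in the
 * running executable. With any other toolchain it is just a path that
 * normally doesn't exist, so the function returns 0.
 * bundled_asset_exists is a hypothetical helper used for illustration.
 */
int bundled_asset_exists(const char *rel)
{
    char path[512];
    int n = snprintf(path, sizeof path, "/zip/%s", rel);

    if (n < 0 || (size_t)n >= sizeof path)
        return 0;               /* path did not fit in the buffer */

    FILE *f = fopen(path, "rb");
    if (!f)
        return 0;
    fclose(f);
    return 1;
}
```

Built with cosmocc and zipped as above, a call such as bundled_asset_exists("share/vim/vimrc") should report whether the file made it into the archive.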
We still need to find a way to replace the hardcoded string. The solution is a bit janky but hey, it works. Back when I was a kid, in a misguided attempt to "understand programs", I frequently tried to open binary executables with a plain text editor. Obviously most of the file was line noise, but I noticed some parts were plain text strings that I could modify. A bit of trial and error revealed that this worked as long as the replacement string did not exceed the previous one in length; violating this "rule" would cause the program to act weirdly and crash at random. At the time I had no way of knowing, but there is a pretty straightforward reason for this, which involves even more ancient bullshit.
Once upon a time, there was a computer called PDP-11. It was originally released in 1970, 55 years ago. The way this computer represented strings was as a series of ASCII-encoded characters terminated by a byte set to zero. This is a dreadful way to represent a string: it's inefficient and rife with security problems. So, why do we care about factoids on old pieces of junk? Because the PDP line is where C was born. So obviously, to this day, that's how strings are represented inside an ELF file.
When you define a string like this:
#include <stdio.h>
int main(void)
{
const char* s = "hello, world";
printf("%s\n", s);
}
The string is put into a section of the ELF file called .rodata.
Compiling the above program and extracting the section with
objdump -s -j .rodata a.out shows the bare bytes:
a.out: file format elf64-x86-64
Contents of section .rodata:
402000 01000200 00000000 00000000 00000000 ................
402010 68656c6c 6f2c2077 6f726c64 00 hello, world.
If someone were to replace those bytes at that exact location with something else, as long as the content is still a null-terminated string, the program wouldn't even notice.
So, let's do just that. It's easy enough to hack this together in
bash with regular expressions (see replace.sh for the actual script).
We end up with a patched executable where the $VIM variable finally points to the right path:
fall-back for $VIM: "/zip/share/vim"
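The actual patching in this project is done by replace.sh. Purely to illustrate the rule discussed above (the replacement must not be longer than the original, and the leftover bytes get zeroed), here is a rough C sketch. The helper name patch_cstring is made up, and it works on an in-memory buffer rather than on a file.

```c
#include <stddef.h>
#include <string.h>

/*
 * Overwrite the first occurrence of the NUL-terminated string `old_s`
 * inside `buf` (of length `len`) with `new_s`, zero-filling the
 * leftover bytes so the buffer size and all other offsets stay the same.
 * Returns 0 on success, -1 if `new_s` is longer than `old_s` or if
 * `old_s` is not found.
 * patch_cstring is a hypothetical helper, not the project's replace.sh.
 */
int patch_cstring(unsigned char *buf, size_t len,
                  const char *old_s, const char *new_s)
{
    size_t old_n = strlen(old_s) + 1;   /* count the terminating NUL */
    size_t new_n = strlen(new_s) + 1;

    if (new_n > old_n || old_n > len)
        return -1;      /* a longer string would clobber what follows */

    for (size_t i = 0; i + old_n <= len; i++) {
        if (memcmp(buf + i, old_s, old_n) == 0) {
            memcpy(buf + i, new_s, new_n);
            memset(buf + i + new_n, 0, old_n - new_n);
            return 0;
        }
    }
    return -1;
}
```

Refusing the longer replacement is the whole point: the bytes right after the original string belong to something else, and the program has no idea they moved.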
A reason why Vim has such a large following is its ability to be easily
customized via plugins. Normally, you would put your plugins inside
~/.vim and your configuration in a file named ~/.vimrc.
Vim will also automatically load anything under $VIM/vimfiles,
and will use $VIM/vimrc (note the missing dot) as initial configuration.
You normally don't want to do this, since using your own home directory is safer and easier, but for the sake of practicality the zip trick we saw above can be used to bundle entire plugins!
It's sufficient to dump your ~/.vim folder under opt/share/vim/vimfiles,
copy your ~/.vimrc as opt/share/vim/vimrc and update the zip again with
the same command as above:
cd opt
cp bin/vim vim.com
zip -qur vim.com share/vim share/terminfo
This repository has a fresh copy of the popular ALE plugin as a git submodule, the color scheme and the vimrc I normally use. That's obviously just an example of what can be done. Replace as desired.
On most operating systems, the above procedure results in a perfectly working and perfectly portable Vim distribution, contained in a single file. On Windows, there are a few annoyances that unfortunately require manually patching the code. They are mostly related to the way Vim handles the external shell.
A handy feature of modern versions of Vim is quickly spawning a terminal:
it can be easily done with :term, :vert term or :tab term. This
results in a new buffer that contains the shell and can be interacted
with by means of the usual ctrl-W commands.
By default, on Windows this is broken. If Vim is launched as a console
application from the Windows GUI, :term fails outright, offering no
explanation other than error code 0xc0000142. Both DuckDuckGo and
Google print pages of smoking crap if we try to search it, but Claude
Sonnet suggests it refers to STATUS_DLL_INIT_FAILED, which could
be caused by a variety of conditions, including "corrupted DLL files,
incompatible applications, insufficient permissions, antivirus software
interference, outdated or incompatible drivers".
If on the other hand Vim is launched from within a cmd.exe or PowerShell
session, the terminal buffer is created correctly, but won't echo typed
characters and will ignore pressing return, only registering it when
emulated with ctrl-J.
Reading the code, it turns out that the terminal handling code for
Unix-like systems and Windows is effectively entirely separate, with the
Windows code further split between the older "winpty" terminal interface
and the newer "conpty". Cosmopolitan compiles the code as though the
platform were a normal Unix; the Windows codepath would require compiling
against the Win32 API, which Cosmopolitan does not implement. As a result
there is no easy fix for this problem. With a bit of trial and error it
turns out that running the command shell (cmd.exe, powershell.exe,
etc.) inside conhost.exe solves the problem. The downside of this
hack is that this increases the likelihood that our executable is flagged
as malware. Obviously, conhost.exe is a core component of Windows and
not, contrary to the folk legend, in and of itself malware, but liberally
spawning processes under conhost.exe is exactly what a virus would do!
The code that sets the default shell is inside option.c, under the
function set_init_default_shell(). The code already has a compile-time
ifdef depending on the platform. A simple pattern to introduce our fix
is this:
- Compiling with Cosmopolitan sets the __COSMOPOLITAN__ preprocessor variable. We introduce a new compile-time choice using #ifdef __COSMOPOLITAN__.
- We then detect at runtime whether we are running under Windows. Cosmopolitan provides a very handy IsWindows() macro to figure that out.
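Put together, the two steps look roughly like the sketch below. This is not the actual change to option.c: default_shell and the strings it returns are invented for illustration, and the header that provides IsWindows() is assumed to be libc/dce.h.

```c
#ifdef __COSMOPOLITAN__
#include "libc/dce.h"   /* assumed location of IsWindows() in cosmocc */
#endif

/*
 * Pick a default shell.
 * default_shell() and both return values are illustrative only; they
 * are not the values Vim's set_init_default_shell() uses.
 */
const char *default_shell(void)
{
#ifdef __COSMOPOLITAN__
    /* Compile time: this branch exists only in Cosmopolitan builds.
       Run time: the very same binary asks which OS it landed on. */
    if (IsWindows())
        return "conhost.exe cmd.exe";
#endif
    return "sh";
}
```

With any other compiler the guarded branch simply disappears, which is what keeps the patch unobtrusive for regular builds.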
An additional problem emerges with filters. An important feature of vi
clones is the ability to "filter" text through an arbitrary shell command,
which is done by applying :!command to the current selection. A popular
application of this feature is having Vim double as a hex editor with
:!xxd.
This is broken under Windows, with Vim complaining about a "file not
found" type error. Investigating this problem is a good way to showcase
another incredible feature of Cosmopolitan: every compiled program
includes strace! Simply running the executable with --strace causes
the program to print out every system call it performs via kprintf.
set KPRINTF_LOG=log
C:\vim.com --strace
There is, predictably, a lot of output. Searching for an execve should
help us figure out something:
...
"/c/windows/system32/cmd.exe", {"/c/windows/system32/cmd.exe", "-c", "(xxd) < /tmp/vnvcw59/0 > /tmp/vnvcw59/1", NULL}, {"=C:=C:\\", "ALLUSERSPROFILE=/C/ProgramData", "APPDATA=/C/Users/edo/AppData/Roaming", "COMMONPROGRAMFILES=/C/Program Files/Common Files", "COMMONPROGRAMFILES(X86)=/C/Program Files (x86)/Common Files", "COMMONPROGRAMW=/C/Program Files/Common Files", "COMPUTERNAME=DESKTOP-1A9HCN4", "COMSPEC=/...)
...
There's the problem. Vim implements its filter feature with shell redirecting, in this way:
- First, Vim determines the temporary directory, and opens a file with a random name inside it.
- Second, the input for the filter command is written inside that file.
- Third, Vim constructs a shell command that runs the program the user specified and redirects the output to another temporary file.
- Finally, Vim reads from the output file.
This won't work for two reasons. First of all, the command uses
Unix-like arguments: cmd.exe should be invoked with /c rather than
-c. Additionally, /tmp doesn't exist under Windows; we need a way
to generate a temporary path that works portably.
Indeed, looking around a bit we see the failing system calls that cause our observed error:
fstatat(AT_FDCWD, "/tmp/vnvcw59/0", [n/a], 0) → -1 ENOENT
...
fstatat(AT_FDCWD, "/tmp/vnvcw59/1", [n/a], 0) → -1 ENOENT
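To make the mechanism concrete, the four filter steps described above can be sketched on a POSIX system roughly like this. This is not Vim's code: run_filter is a made-up name, the file name templates are arbitrary, and /tmp is hardcoded on purpose, which is exactly the assumption that falls apart on Windows.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Pipe `input` through the shell command `cmd` using two temporary
 * files and shell redirection, then store up to outsz-1 bytes of the
 * result in `out`. Returns 0 on success, -1 on failure.
 * run_filter is a hypothetical helper; POSIX only.
 */
int run_filter(const char *cmd, const char *input, char *out, size_t outsz)
{
    char in_path[] = "/tmp/filter-in-XXXXXX";
    char out_path[] = "/tmp/filter-out-XXXXXX";
    char shell_cmd[1024];
    int rc = -1;

    if (outsz == 0)
        return -1;

    /* Step 1: create randomly named files in the temporary directory. */
    int in_fd = mkstemp(in_path);
    if (in_fd < 0)
        return -1;
    int out_fd = mkstemp(out_path);
    if (out_fd < 0) {
        close(in_fd);
        unlink(in_path);
        return -1;
    }
    close(out_fd);

    /* Step 2: write the text to be filtered into the input file. */
    size_t len = strlen(input);
    ssize_t written = write(in_fd, input, len);
    close(in_fd);

    if (written == (ssize_t)len) {
        /* Step 3: same shape as the command seen in the strace output. */
        int n = snprintf(shell_cmd, sizeof shell_cmd, "(%s) < %s > %s",
                         cmd, in_path, out_path);
        if (n > 0 && (size_t)n < sizeof shell_cmd && system(shell_cmd) == 0) {
            /* Step 4: read the filtered text back. */
            FILE *f = fopen(out_path, "r");
            if (f) {
                size_t got = fread(out, 1, outsz - 1, f);
                out[got] = '\0';
                fclose(f);
                rc = 0;
            }
        }
    }

    unlink(in_path);
    unlink(out_path);
    return rc;
}
```

On Linux, run_filter("tr a-z A-Z", "hello\n", buf, sizeof buf) leaves the uppercased text in buf. Both the shell invocation and the temporary paths are baked into this flow, which is why both had to be touched.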
The first issue (improper flag) is very easily fixed. Grep reveals that
the flag to add is set in optiondefs.h. We can recycle the "ifdef"
pattern we used before to decide at runtime whether to add -c or /c.
Finding a temporary path that works portably is a bit more difficult.
Luckily Cosmopolitan comes to our help once again: it bundles a handy
__get_tmpdir() that does just that. There are still a few details that
need to be fixed, mainly the fact that Cosmopolitan internally represents
Windows paths with a Unix-like notation:
/c/windows/system32
But Windows requires paths to start with the drive letter like this:
C:/windows/system32
so we'll have to convert between the two. Still, this is resolved with a fairly compact set of code changes, impacting very few files.
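The conversion itself is small. Here is a sketch of one direction, under the assumption that only the leading "/c/" part needs rewriting; cosmo_to_win_path is a made-up name and not the function used in the fork.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert "/c/windows/system32" into "C:/windows/system32".
 * Paths that don't start with "/<letter>/" (and aren't exactly
 * "/<letter>") are copied unchanged. Both forms have the same length.
 * Returns 0 on success, -1 if `dst` is too small.
 * cosmo_to_win_path is a hypothetical helper used for illustration.
 */
int cosmo_to_win_path(const char *src, char *dst, size_t dstsz)
{
    size_t n = strlen(src);
    int has_drive = n >= 2 && src[0] == '/' &&
                    isalpha((unsigned char)src[1]) &&
                    (src[2] == '/' || src[2] == '\0');

    if (n + 1 > dstsz)
        return -1;

    memcpy(dst, src, n + 1);            /* copy including the NUL */
    if (has_drive) {
        dst[0] = (char)toupper((unsigned char)src[1]);
        dst[1] = ':';
    }
    return 0;
}
```

A helper like this should only run when IsWindows() is true: on a Unix system a path like /a/b is perfectly legitimate and must be left alone.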
Since there are changes to the upstream code, the easiest way to keep
track of them and future-proof the project is to fork the official Vim
repository. As mentioned before, the changes are relatively localized,
always guarded by #ifdef __COSMOPOLITAN__ and not very obtrusive.
They live on the vimape_master branch.
If for whatever reason you'd rather use the original unmodified source,
that's an option too: simply use the master branch instead.


