This post is part of a series. For an introduction to the file command, and the other posts in the series, see the first post. When writing a "magic pattern" file for the file command, you can include some special lines that start with "!:". The documentation doesn't name this feature, but I'll call it … Continue reading The “file” command: annotations
Category: Programming
The “file” command: binary and text files
This post attempts to explain a few things about how the file command deals with binary vs. text files, primarily from the standpoint of someone writing "magic patterns" for it. For an introduction to the file command, and my other posts on the topic, see the first post. Note that by "ruleset", I mean a … Continue reading The “file” command: binary and text files
MHD: A multi-file hex dump utility
MHD is a little Python script I wrote to do one-line partial hex dumps of multiple files. I made it public in the middle of last year, but didn't really announce or introduce it, so I guess this will be that introduction. It's at https://github.com/jsummers/miscjs, "mhd" directory. It's not much, and I'm sure its audience … Continue reading MHD: A multi-file hex dump utility
The file command and black magic (Part 2)
This is a continuation of Part 1. I'm discussing random trivia about the file command. For a list of other posts about the file command, see this post. Format specifiers: Consider the format specifiers that can appear in the "message" field: %s, %d, %u, etc. In a previous post, I indicated that I didn't know … Continue reading The file command and black magic (Part 2)
The file command and black magic (Part 1)
For an introduction to the file command, and my other posts on the topic, see the first post. I've come across some undocumented things, and various weirdness, in the behavior of the file command. In this post, I'll go over a few of them. The intended audience is someone writing "magic pattern" rules for file. … Continue reading The file command and black magic (Part 1)
A script to analyze magic patterns for the “file” command
See my previous post for an introduction to the file command, and the "magic patterns" used to configure it. My previous post also introduces some of the terminology used in this post. I've been working on a little Python script named Mgchkj (all the good names were taken) that helps to identify patterns that might … Continue reading A script to analyze magic patterns for the “file” command
Compiling lzhuf.c on a modern computer
There's an old data compression computer program named lzhuf.c ("Lzhuf"). It was written in the late 1980s by Haruyasu Yoshizaki. Even today, it is potentially useful, if you want to support LHarc format, or certain other compression formats. But it doesn't work correctly when compiled by a modern C compiler. In this post, I'll investigate … Continue reading Compiling lzhuf.c on a modern computer
Win32 I/O character encoding supplement 3: UTF-8 manifest
For a list of other posts in this series, refer to the first post. A relatively recent Windows software development feature, affecting character encoding, is the ability to request a specific "ANSI" character encoding (or "code page"), presumably UTF-8, using a manifest. I decided to investigate what this really does. This "manifest method" is independent … Continue reading Win32 I/O character encoding supplement 3: UTF-8 manifest
Testing some LZ77 compression limits
This post is about data compression algorithms that involve LZ77, or a similar kind of compression. It's mainly about old-school compression algorithms and software. There is some information about LZ77 in my post about LZ77 prehistory. I won't explain it in detail here, but here are some things to know about it. Both the compressor … Continue reading Testing some LZ77 compression limits
Win32 I/O character encoding supplement 2 – setlocale enhancement
This is part of a series of posts on using Unicode in Windows command-line applications. Here's the first post. Sometime in 2018, some functions in the Windows 10 C runtime system, and related development SDKs, were enhanced to support UTF-8. This feature is enabled by calling the setlocale function. For reference, Microsoft's current documentation of … Continue reading Win32 I/O character encoding supplement 2 – setlocale enhancement