AWK interpreter written in Go with coregex regex engine.
- POSIX AWK compliant with GNU AWK extensions
- Parallel file processing (`-j N`)
- Embeddable Go API
- Zero CGO dependencies
```bash
go install github.com/kolkov/uawk/cmd/uawk@latest
```

```bash
# Basic usage
uawk '{ print $1 }' file.txt

# Field separator
uawk -F: '{ print $1 }' /etc/passwd

# Variables
uawk -v name=World 'BEGIN { print "Hello, " name }'

# Program from file
uawk -f script.awk input.txt

# Parallel processing
uawk -j 4 '{ sum += $1 } END { print sum }' *.log

# Non-POSIX regex mode
uawk --no-posix '/pattern/ { print }' file.txt
```

```go
package main

import (
	"fmt"
	"io"
	"strings"

	"github.com/kolkov/uawk"
)

func main() {
	// Simplest form: run a program against a reader with default settings.
	output, err := uawk.Run(`{ print $1 }`, strings.NewReader("hello world"), nil)
	if err != nil {
		panic(err)
	}
	fmt.Print(output)

	// With configuration
	config := &uawk.Config{
		FS:        ":",
		Variables: map[string]string{"threshold": "100"},
	}
	// Sample input, added for illustration.
	input := strings.NewReader("alice:150\nbob:50\n")
	output, err = uawk.Run(`$2 > threshold { print $1 }`, input, config)
	if err != nil {
		panic(err)
	}
	fmt.Print(output)

	// Compile once, run multiple times
	prog, err := uawk.Compile(`{ sum += $1 } END { print sum }`)
	if err != nil {
		panic(err)
	}
	// Sample inputs, assumed to be readers like the one passed to uawk.Run above.
	files := []io.Reader{strings.NewReader("1\n2\n"), strings.NewReader("3\n4\n")}
	for _, file := range files {
		result, err := prog.Run(file, nil)
		if err != nil {
			panic(err)
		}
		fmt.Println(result)
	}
}
```

See uawk-bench for benchmark suite and methodology.
Results vary by workload. Regex-heavy patterns benefit from coregex optimizations. I/O-bound workloads show smaller differences between implementations.
```bash
git clone https://github.com/kolkov/uawk
cd uawk
go build -o uawk ./cmd/uawk
```

Requires Go 1.25+.
Source → Lexer → Parser → AST → Semantic Analysis → Compiler → Optimizer → VM
| Component | Description |
|---|---|
| Lexer | Context-sensitive tokenizer, UTF-8 |
| Parser | Recursive descent |
| Compiler | Bytecode generation (~110 opcodes) |
| Optimizer | Peephole optimization |
| VM | Stack-based execution |
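
To make the last three rows of the table concrete, here is a minimal sketch of bytecode, a peephole pass, and a stack-based interpreter loop. It is a toy for illustration only: the `Op`, `Instr`, `Optimize`, and `Exec` names are hypothetical and are not taken from uawk's source, and the real opcode set is far larger.

```go
package main

import "fmt"

// Op is a toy opcode. These names are hypothetical, not uawk's opcodes.
type Op int

const (
	OpPush Op = iota // push Arg onto the stack
	OpAdd            // pop two values, push their sum
	OpMul            // pop two values, push their product
)

// Instr is one bytecode instruction.
type Instr struct {
	Op  Op
	Arg float64 // used only by OpPush
}

// Optimize is a peephole pass: whenever an add or multiply directly follows
// two constant pushes, the three instructions are folded into a single push.
func Optimize(code []Instr) []Instr {
	out := make([]Instr, 0, len(code))
	for _, in := range code {
		n := len(out)
		if (in.Op == OpAdd || in.Op == OpMul) && n >= 2 &&
			out[n-1].Op == OpPush && out[n-2].Op == OpPush {
			a, b := out[n-2].Arg, out[n-1].Arg
			v := a + b
			if in.Op == OpMul {
				v = a * b
			}
			out = append(out[:n-2], Instr{OpPush, v})
			continue
		}
		out = append(out, in)
	}
	return out
}

// Exec runs well-formed bytecode on an operand stack and returns the top value.
func Exec(code []Instr) float64 {
	var stack []float64
	for _, in := range code {
		switch in.Op {
		case OpPush:
			stack = append(stack, in.Arg)
		case OpAdd, OpMul:
			b, a := stack[len(stack)-1], stack[len(stack)-2]
			stack = stack[:len(stack)-2]
			if in.Op == OpAdd {
				stack = append(stack, a+b)
			} else {
				stack = append(stack, a*b)
			}
		}
	}
	return stack[len(stack)-1]
}

func main() {
	// Bytecode for the expression 2 + 3 * 4.
	code := []Instr{{OpPush, 2}, {OpPush, 3}, {OpPush, 4}, {OpMul, 0}, {OpAdd, 0}}
	opt := Optimize(code)
	fmt.Println(len(code), len(opt), Exec(code), Exec(opt)) // prints "5 1 14 14"
}
```

The optimizer only inspects a small window at the tail of the output buffer, which is what makes it a peephole pass rather than a whole-program analysis.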
- Pattern-action rules, BEGIN/END blocks
- Field splitting and assignment
- Built-in variables (NR, NF, FS, RS, OFS, ORS, FILENAME, etc.)
- Arithmetic, string, and regex operators
- Control flow (if/else, while, for, do-while)
- Associative arrays
- Built-in functions (print, printf, sprintf, length, substr, split, sub, gsub, match, tolower, toupper, sin, cos, exp, log, sqrt, int, rand, srand, system, etc.)
- User-defined functions
- I/O redirection (>, >>, |, getline)
- `-j N` parallel execution
- `-c` Unicode character operations
- `--posix` / `--no-posix` regex mode
- Debug flags (`-d`, `-da`, `-dt`)
MIT