Documentation ¶
Overview ¶
Package uawk provides a high-performance AWK interpreter.
uawk is a modern AWK implementation written in Go, featuring:
- Full POSIX AWK compatibility
- High-performance regex engine (coregex)
- Zero external dependencies for core functionality
- Embeddable library for Go applications
Quick Start ¶
For simple one-off execution:
output, err := uawk.Run(`{ print $1 }`, strings.NewReader("hello world"), nil)
With configuration:
output, err := uawk.Run(program, input, &uawk.Config{
	FS:        ":",
	Variables: map[string]string{"threshold": "100"},
})
Compiled Programs ¶
For repeated execution of the same program:
prog, err := uawk.Compile(`$1 > threshold { print $2 }`)
if err != nil {
	log.Fatal(err)
}
for _, file := range files {
	output, err := prog.Run(file, &uawk.Config{
		Variables: map[string]string{"threshold": "100"},
	})
	// ...
}
Configuration ¶
The Config type allows customization of AWK execution:
- Field and record separators (FS, RS, OFS, ORS)
- Pre-defined variables
- Custom I/O writers
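The two literal field-separator modes can be illustrated without uawk at all. The sketch below uses only the Go standard library; `splitFields` is a hypothetical helper written for this illustration, not a uawk function, and it deliberately ignores the case where FS is a regular expression.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFields is a hypothetical helper (not part of uawk) that mimics the two
// literal-separator modes described for FS. Regex separators are not handled.
func splitFields(record, fs string) []string {
	if fs == " " {
		// Single space: runs of whitespace separate fields, and leading or
		// trailing whitespace does not produce empty fields.
		return strings.Fields(record)
	}
	// Any other string: every occurrence is a separator, so two adjacent
	// separators yield an empty field between them.
	return strings.Split(record, fs)
}

func main() {
	fmt.Printf("%q\n", splitFields("  alice   42\tok ", " ")) // ["alice" "42" "ok"]
	fmt.Printf("%q\n", splitFields("root:x::0", ":"))         // ["root" "x" "" "0"]
}
```

The practical consequence is that switching FS from the default to ":" changes how empty columns are counted, which matters for programs that index fields by position.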
Error Handling ¶
Errors are returned as specific types for detailed handling:
- ParseError: syntax errors in AWK source
- CompileError: semantic errors during compilation
- RuntimeError: errors during execution
Thread Safety ¶
Compiled Program objects are safe for concurrent use. Each call to Program.Run creates an independent execution context.
Index ¶
- Constants
- func Exec(program string, input io.Reader, output io.Writer, config *Config) error
- func IsExitError(err error) (int, bool)
- func Run(program string, input io.Reader, config *Config) (string, error)
- type CompileError
- type Config
- type ExitError
- type ParallelAnalysis
- type ParallelSafety
- type ParseError
- type Program
- type RuntimeError
Constants ¶
const Version = "0.1.0"
Version is the uawk version string.
Functions ¶
func Exec ¶
func Exec(program string, input io.Reader, output io.Writer, config *Config) error
Exec is a simplified interface for running an AWK program. It reads from input, writes to output, and returns any error.
This function is useful for integration with I/O pipelines where you need control over the output writer.
Example:
err := uawk.Exec(`{ print toupper($0) }`, os.Stdin, os.Stdout, nil)
func IsExitError ¶
func IsExitError(err error) (int, bool)
IsExitError reports whether err is an ExitError and returns the exit code. Returns (code, true) if err is an ExitError, or (0, false) otherwise.
func Run ¶
func Run(program string, input io.Reader, config *Config) (string, error)
Run executes an AWK program with the given input. This is a convenience function for one-off execution. For repeated execution of the same program, use Compile followed by Program.Run.
Parameters:
- program: AWK source code
- input: input data reader (can be nil for programs without input)
- config: execution configuration (can be nil for defaults)
Returns the program output as a string, or an error if parsing, compilation, or execution fails.
Example:
output, err := uawk.Run(`{ print $1 }`, strings.NewReader("hello world"), nil)
// output: "hello\n"
Example ¶
Example functions for documentation
package main

import (
	"fmt"
	"strings"

	"github.com/kolkov/uawk"
)

func main() {
	output, _ := uawk.Run(`{ print $1 }`, strings.NewReader("hello world\n"), nil)
	fmt.Print(output)
}

Output: hello
Types ¶
type CompileError ¶
type CompileError struct {
	Message string // Error description
}
CompileError represents a semantic error during compilation.
func (*CompileError) Error ¶
func (e *CompileError) Error() string
type Config ¶
type Config struct {
	// FS is the input field separator (default: " ").
	// When set to a single space, runs of whitespace are treated as separators.
	// Otherwise, each occurrence of the string is a separator.
	// Can also be a regular expression pattern.
	FS string

	// RS is the input record separator (default: "\n").
	// When set to empty string, records are separated by blank lines.
	RS string

	// OFS is the output field separator (default: " ").
	// Used when printing multiple values with print statement.
	OFS string

	// ORS is the output record separator (default: "\n").
	// Appended after each print statement.
	ORS string

	// Variables contains pre-defined variables.
	// These are set before BEGIN block execution.
	// Example: map[string]string{"threshold": "100", "prefix": "LOG:"}
	Variables map[string]string

	// Output is the writer for print/printf statements.
	// If nil, output is captured and returned from Run.
	Output io.Writer

	// Stderr is the writer for error output.
	// If nil, errors are discarded.
	Stderr io.Writer

	// Args contains command-line arguments (ARGV).
	// Args[0] is typically the program name.
	Args []string

	// POSIXRegex enables POSIX leftmost-longest regex matching.
	// When true (default), uses AWK/POSIX ERE semantics (slower but compliant).
	// When false, uses leftmost-first matching (faster, Perl-like).
	// Set to false for better performance when POSIX compliance is not required.
	POSIXRegex *bool

	// Parallel enables parallel execution with the specified number of workers.
	// When > 1, the program is executed in parallel if it is safe to do so.
	// When 0 or 1, sequential execution is used (default).
	// Note: Parallel execution has limitations - see CanParallelize().
	Parallel int

	// ChunkSize is the approximate size in bytes of each input chunk
	// when parallel execution is enabled. Default: 4MB (4 * 1024 * 1024).
	ChunkSize int
}
Config holds configuration options for AWK execution.
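The difference between the two matching modes named in the POSIXRegex comment can be observed with Go's standard `regexp` package, which offers both. This is only an illustration of the semantics: uawk is described as using its own engine (coregex), and `firstMatches` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstMatches returns the first match of pattern in text under both
// semantics. regexp.MustCompile uses leftmost-first (Perl-like) matching;
// regexp.MustCompilePOSIX uses leftmost-longest (POSIX ERE) matching.
func firstMatches(pattern, text string) (leftmostFirst, leftmostLongest string) {
	leftmostFirst = regexp.MustCompile(pattern).FindString(text)
	leftmostLongest = regexp.MustCompilePOSIX(pattern).FindString(text)
	return leftmostFirst, leftmostLongest
}

func main() {
	first, longest := firstMatches(`a|ab`, "abc")
	// Both matches start at index 0, but the alternation is resolved differently.
	fmt.Printf("leftmost-first: %q, leftmost-longest: %q\n", first, longest)
	// leftmost-first: "a", leftmost-longest: "ab"
}
```

For an AWK program this means a call such as `sub(/a|ab/, "X")` on the record `abc` would yield `Xc` under POSIX semantics but `Xbc` under leftmost-first matching, so disabling POSIXRegex is only safe when the patterns in use do not depend on alternation order.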
type ExitError ¶
type ExitError struct {
	Code int // Exit status code (0 = success)
}
ExitError represents a normal exit with a status code. This is not an error condition; it indicates the AWK program called exit with the given status.
type ParallelAnalysis ¶ added in v0.2.0
type ParallelAnalysis struct {
	Safety           ParallelSafety
	CanParallelize   bool
	HasAggregation   bool
	AggregatedVars   []int
	AggregatedArrays []int
}
ParallelAnalysis contains the results of parallel safety analysis.
type ParallelSafety ¶ added in v0.2.0
type ParallelSafety int
ParallelSafety represents the parallelization safety level.
const (
	// ParallelUnsafe indicates the program cannot be parallelized.
	ParallelUnsafe ParallelSafety = iota
	// ParallelStateless indicates the program is embarrassingly parallel.
	ParallelStateless
	// ParallelAggregatable indicates the program can be parallelized with aggregation.
	ParallelAggregatable
)
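To make the "aggregatable" category concrete, consider a program like `{ sum += $1 } END { print sum }`: each worker can accumulate a partial sum over its own chunk, and the partial results are combined at the end. The sketch below shows that general idea in plain Go. It is not uawk's implementation; `parallelSum` and its line-based chunking are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"sync"
)

// parallelSum is a hypothetical illustration of aggregation: it sums the
// first whitespace-separated field of every line, splitting the lines into
// up to `workers` contiguous chunks and merging the per-chunk partial sums.
func parallelSum(lines []string, workers int) float64 {
	if workers < 1 {
		workers = 1
	}
	partial := make([]float64, workers)
	chunk := (len(lines) + workers - 1) / workers // ceiling division
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		start := w * chunk
		if start >= len(lines) {
			break
		}
		end := start + chunk
		if end > len(lines) {
			end = len(lines)
		}
		wg.Add(1)
		go func(w int, part []string) {
			defer wg.Done()
			for _, line := range part {
				fields := strings.Fields(line)
				if len(fields) == 0 {
					continue
				}
				// Fields that fail to parse contribute 0 in this sketch.
				v, _ := strconv.ParseFloat(fields[0], 64)
				partial[w] += v // each worker writes only its own slot
			}
		}(w, lines[start:end])
	}
	wg.Wait()

	// Aggregation step: combine the per-worker results.
	total := 0.0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	lines := []string{"1", "2", "3", "4 extra"}
	fmt.Println(parallelSum(lines, 2)) // 10
}
```

A stateless program (for example one that only prints a field from each record) needs no merge step at all, while a program whose output depends on record order or on state carried between records cannot be split this way.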
type ParseError ¶
type ParseError struct {
	Line    int    // 1-based line number
	Column  int    // 1-based column number
	Message string // Error description
}
ParseError represents a syntax error in AWK source code.
func (*ParseError) Error ¶
func (e *ParseError) Error() string
type Program ¶
type Program struct {
	// contains filtered or unexported fields
}
Program represents a compiled AWK program ready for execution. It is safe for concurrent use; each call to Run creates an independent execution context.
func Compile ¶
Compile parses and compiles an AWK program for execution. The returned Program can be executed multiple times with different inputs.
Example:
prog, err := uawk.Compile(`{ sum += $1 } END { print sum }`)
if err != nil {
	log.Fatal(err)
}
output1, _ := prog.Run(file1, nil)
output2, _ := prog.Run(file2, nil)
Example ¶
package main

import (
	"fmt"
	"strings"

	"github.com/kolkov/uawk"
)

func main() {
	prog, _ := uawk.Compile(`{ sum += $1 } END { print sum }`)
	output, _ := prog.Run(strings.NewReader("1\n2\n3\n"), nil)
	fmt.Print(output)
}

Output: 6
func MustCompile ¶
MustCompile is like Compile but panics if the program cannot be compiled. It simplifies initialization of global program variables.
Example:
var sumProgram = uawk.MustCompile(`{ sum += $1 } END { print sum }`)
func (*Program) CanParallelize ¶ added in v0.2.0
func (p *Program) CanParallelize(rs string) *ParallelAnalysis
CanParallelize checks if this program can be safely parallelized. Returns a ParallelAnalysis struct with detailed information about why the program can or cannot be parallelized.
func (*Program) Disassemble ¶
Disassemble returns a human-readable representation of the compiled bytecode. Useful for debugging and understanding program structure.
func (*Program) Run ¶
Run executes the compiled program with the given input and configuration. Returns the output as a string, or an error if execution fails.
If config is nil, default configuration is used. If config.Output is set, output is written there and the returned string will be empty. If config.Parallel > 1 and the program is parallelizable, it will be executed using multiple worker goroutines.
type RuntimeError ¶
type RuntimeError struct {
	Message string // Error description
}
RuntimeError represents an error during AWK execution.
func (*RuntimeError) Error ¶
func (e *RuntimeError) Error() string
Directories ¶
| Path | Synopsis |
|---|---|
| cmd/uawk (command) | uawk - Ultra AWK interpreter |
| internal/ast | Package ast defines the abstract syntax tree for AWK programs. |
| internal/compiler | Package compiler compiles an AST into bytecode for the VM. |
| internal/lexer | Package lexer provides AWK source code tokenization. |
| internal/parser | Package parser provides an AWK recursive descent parser. |
| internal/runtime | Package runtime provides AWK runtime support including regex operations. |
| internal/semantic | Package semantic provides semantic analysis for AWK programs. |
| internal/token | Package token defines lexical tokens for AWK. |
| internal/types | Package types defines runtime value types for uawk. |
| internal/vm | Package vm provides the AWK virtual machine implementation. |