Caissa Chess Engine

(image generated with DALL·E 2)

Overview

Caissa is a strong, UCI-compatible chess engine written from scratch in C++ and developed since early 2021. It uses a custom neural network evaluation trained on over 17 billion self-play positions and is rated 3600+ Elo on major chess engine rating lists, which places it around the top 10.

The engine is optimized for:

Regular Chess - Standard chess rules
FRC (Fischer Random Chess) - Chess960 variant
DFRC (Double Fischer Random Chess) - Extended FRC variant

Playing Strength

Caissa consistently ranks among the top chess engines on major rating lists:

CCRL (Computer Chess Rating Lists)

List	Rating	Rank	Version	Notes
CCRL 40/2 FRC	4022	#6	1.23	Fischer Random Chess
CCRL Chess324	3770	#6	1.23	Chess324 variant
CCRL 40/15	3622	#9	1.23	4 CPU
CCRL Blitz	3755	#10	1.22	8 CPU

SPCC (Schachprogramm-Computer-Chess)

List	Rating	Rank	Version
SPCC UHO-Top15	3697	#10	1.24 (AVX-512)

IpMan Chess

List	Rating	Rank	Version	Architecture
10+1 (R9-7945HX)	3542	#16	1.24	AVX-512
10+1 (i9-7980XE)	3526	#14	1.21	AVX-512
10+1 (i9-13700H)	3544	#17	1.22	AVX2-BMI2

CEGT (Chess Engine Grand Tournament)

List	Rating	Rank	Version
CEGT 40/20	3576	#8	1.24
CEGT 40/4	3614	#8	1.22
CEGT 5+3	3618	#5	1.22

Note: The rankings above may be outdated.

Features

General

✅ UCI Protocol - Full Universal Chess Interface support
✅ Neural Network Evaluation - Custom NNUE-style evaluation
✅ Endgame Tablebases - Syzygy and Gaviota support
✅ Chess960 Support - Fischer Random Chess (FRC) and Double FRC

Search Algorithm

✅ Negamax with alpha-beta pruning
✅ Iterative Deepening with aspiration windows
✅ Principal Variation Search (PVS)
✅ Quiescence Search for tactical positions
✅ Transposition Table with large pages support
✅ Multi-PV Search - Analyze multiple lines simultaneously
✅ Multithreaded Search - Parallel search with shared TT
✅ Late Move Reductions (LMR)
✅ Null-Move Pruning
✅ Singular Extensions
✅ Correction History - Pawn and non-pawn correction tables improve static eval accuracy
✅ Cuckoo Hashing for fast repetition detection
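
The core of the list above, negamax with alpha-beta pruning, can be illustrated on a toy game tree. This is a minimal sketch and not Caissa's actual search: the `Node` type, `kInfinity`, and the convention that leaf values are given from the point of view of the side to move at that leaf are assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical toy tree: a node is either a leaf with a static value
// (from the point of view of the side to move there) or has children.
struct Node {
    int value;
    std::vector<Node> children;
};

constexpr int kInfinity = 1000000;  // large enough for the toy values used here

// Negamax with alpha-beta pruning: a score that is good for the opponent
// is equally bad for us, so each recursive call negates the result and
// swaps (and negates) the search window.
int negamax(const Node& node, int alpha, int beta) {
    assert(alpha < beta);
    if (node.children.empty()) {
        return node.value;
    }
    int best = -kInfinity;
    for (const Node& child : node.children) {
        const int score = -negamax(child, -beta, -alpha);
        best = std::max(best, score);
        alpha = std::max(alpha, score);
        if (alpha >= beta) {
            break;  // beta cutoff: the opponent will avoid this line anyway
        }
    }
    return best;
}
```

Calling `negamax(root, -kInfinity, kInfinity)` returns the minimax value of the root. A real engine replaces the toy tree with move generation, a depth limit, and a quiescence search at the leaves.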

Neural Network Evaluation

Architecture: (32×768→1024)×2→1 — dual-perspective (one accumulator per king), 32 king buckets, 768 features per perspective (12 piece types × 64 squares)
Incremental Updates - Efficiently updated first layer
Vectorized Code - Manual SIMD optimization for:
- AVX-512 (fastest)
- AVX2
- SSE2
- ARM NEON
Activation: Clipped-ReLU
Variants: 8 variants of last layer weights (piece count dependent)
Features: Absolute piece coordinates with horizontal symmetry, 32 king buckets
Special Endgame Routines - Enhanced endgame evaluation
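
To make the input layout above more concrete, the sketch below shows one way a (king bucket, piece, square) triple could map to a first-layer feature index, together with a clipped ReLU and an incremental accumulator update. The exact bucket layout, piece numbering, clipping range, and weight ordering are assumptions for illustration; only the sizes (32 buckets, 768 features per bucket) come from the description above, and the black perspective is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kKingBuckets = 32;         // from the architecture description
constexpr int kFeaturesPerBucket = 768;  // 12 piece types x 64 squares

// Assumed conventions: square = rank * 8 + file, piece = color * 6 + type (0..11).
// With horizontal symmetry the king is always mapped onto files a-d,
// which leaves 8 ranks x 4 files = 32 possible king buckets.
int featureIndex(int kingSquare, int piece, int square) {
    assert(kingSquare >= 0 && kingSquare < 64);
    assert(piece >= 0 && piece < 12);
    assert(square >= 0 && square < 64);
    if ((kingSquare & 7) >= 4) {  // king on files e-h: mirror the whole board
        kingSquare ^= 7;
        square ^= 7;
    }
    const int bucket = (kingSquare >> 3) * 4 + (kingSquare & 7);
    return bucket * kFeaturesPerBucket + piece * 64 + square;
}

// Clipped ReLU; the [0, 1] range is an assumption (quantized nets use integers).
float clippedRelu(float x) {
    return std::clamp(x, 0.0f, 1.0f);
}

// Incremental update: when one feature turns off and another turns on,
// only two weight rows are touched instead of recomputing the whole layer.
// weights is stored row-major as [feature][hidden unit].
void updateAccumulator(std::vector<float>& accumulator,
                       const std::vector<float>& weights,
                       int removedFeature, int addedFeature) {
    const std::size_t hidden = accumulator.size();
    for (std::size_t i = 0; i < hidden; ++i) {
        accumulator[i] += weights[addedFeature * hidden + i]
                        - weights[removedFeature * hidden + i];
    }
}
```

The largest index this mapping can produce is 32 × 768 − 1, which matches the 32×768 input size quoted in the architecture line.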

Neural Network Trainer

Custom CPU-based Trainer using Adam algorithm
Highly Optimized - Exploits AVX instructions, multithreading, and network sparsity
Self-Play Training - Trained on 17+ billion positions from self-generated games
Progressive Training - Older games purged, networks trained on latest engine versions
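
For readers unfamiliar with the Adam algorithm mentioned above, here is a minimal sketch of a single update step over a flat weight array. It follows the standard published formulation with commonly used default hyperparameters; the `AdamState` and `AdamParams` types are hypothetical, and Caissa's trainer may differ in details such as weight decay, quantization, or sparse updates.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Per-weight optimizer state (names are illustrative).
struct AdamState {
    std::vector<float> m;  // running mean of gradients
    std::vector<float> v;  // running mean of squared gradients
    int step = 0;
    explicit AdamState(std::size_t n) : m(n, 0.0f), v(n, 0.0f) {}
};

// Commonly used defaults; the values used by the real trainer are not stated here.
struct AdamParams {
    float learningRate = 1e-3f;
    float beta1 = 0.9f;
    float beta2 = 0.999f;
    float epsilon = 1e-8f;
};

void adamStep(std::vector<float>& weights, const std::vector<float>& gradients,
              AdamState& state, const AdamParams& params = AdamParams{}) {
    assert(weights.size() == gradients.size());
    assert(weights.size() == state.m.size());
    ++state.step;
    // Bias correction compensates for m and v starting at zero.
    const float correction1 = 1.0f - std::pow(params.beta1, static_cast<float>(state.step));
    const float correction2 = 1.0f - std::pow(params.beta2, static_cast<float>(state.step));
    for (std::size_t i = 0; i < weights.size(); ++i) {
        const float g = gradients[i];
        state.m[i] = params.beta1 * state.m[i] + (1.0f - params.beta1) * g;
        state.v[i] = params.beta2 * state.v[i] + (1.0f - params.beta2) * g * g;
        const float mHat = state.m[i] / correction1;
        const float vHat = state.v[i] / correction2;
        weights[i] -= params.learningRate * mHat / (std::sqrt(vHat) + params.epsilon);
    }
}
```

On the first step the bias-corrected moments equal the gradient and its square, so each weight moves by roughly one learning rate in the direction opposite to its gradient.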

Performance Optimizations

Magic Bitboards - Efficient move generation
Large Pages - Transposition table uses large pages for better performance
Node Caching - Evaluation result caching
Accumulator Caching - Neural network accumulator caching
NUMA Support - Memory allocation and thread pinning respect NUMA topology on multi-socket systems (Linux, requires libnuma)
Ultra-Fast - Outstanding performance at ultra-short time controls (sub-second games)
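
As background for the magic bitboards item above: a magic bitboard table replaces a per-ray loop with a single lookup indexed by a hash of the relevant blockers. The sketch below shows only the slow reference computation for a rook, the kind of function typically used to fill such a table. The function names and the square numbering (a1 = 0, h8 = 63) are assumptions, and the magic constants themselves are deliberately not shown.

```cpp
#include <cassert>
#include <cstdint>

using Bitboard = std::uint64_t;

// Walk one ray from (rank, file) in direction (dr, df) until the board
// edge or the first occupied square; the blocker square is included.
Bitboard rayAttacks(int rank, int file, int dr, int df, Bitboard occupied) {
    Bitboard attacks = 0;
    for (int r = rank + dr, f = file + df;
         r >= 0 && r < 8 && f >= 0 && f < 8; r += dr, f += df) {
        const Bitboard bit = Bitboard{1} << (r * 8 + f);
        attacks |= bit;
        if (occupied & bit) {
            break;
        }
    }
    return attacks;
}

// Slow reference rook attack generation (square = rank * 8 + file).
Bitboard rookAttacksSlow(int square, Bitboard occupied) {
    assert(square >= 0 && square < 64);
    const int rank = square / 8;
    const int file = square % 8;
    return rayAttacks(rank, file, 1, 0, occupied) | rayAttacks(rank, file, -1, 0, occupied)
         | rayAttacks(rank, file, 0, 1, occupied) | rayAttacks(rank, file, 0, -1, occupied);
}

// Portable population count (Kernighan's method).
int popCount(Bitboard b) {
    int count = 0;
    while (b) {
        b &= b - 1;
        ++count;
    }
    return count;
}
```

A rook on a1 with an empty board attacks 14 squares; placing a blocker on a4 cuts the file ray short so that only a2, a3, and a4 remain on it.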

Quick Start

Using Pre-built Binaries

Download the appropriate executable from the Releases page
Choose the version matching your CPU:
- AVX-512: Latest Intel Xeon/AMD EPYC (fastest)
- BMI2: Most modern CPUs (recommended)
- AVX2: Older CPUs with AVX2 support
- POPCNT: Older CPUs with SSE4.2
- Legacy: Very old x64 CPUs
Run the engine with any UCI-compatible chess GUI

Running from Source

See the Compilation section below for detailed build instructions.

Compilation

Prerequisites

C++ Compiler with C++20 support:
- GCC 10+ or Clang 12+ (Linux)
- Visual Studio 2022 (Windows)
CMake 3.15 or later
Make (Linux) or Visual Studio (Windows)

Linux

Using Makefile (Quick Build)

cd src
make -j$(nproc)

Note: This compiles the default AVX2/BMI2 version.

Using CMake (Recommended)

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Final ..
make -j$(nproc)

Build Configurations:

Final - Production build, no asserts, maximum optimizations
Release - Development build with asserts, optimizations enabled
Debug - Development build with asserts, optimizations disabled

Architecture Selection:

To build for a specific architecture, set the TARGET_ARCH variable:

# AVX-512 (requires AVX-512 support)
cmake -DTARGET_ARCH=x64-avx512 -DCMAKE_BUILD_TYPE=Final ..
# BMI2 (recommended for modern CPUs)
cmake -DTARGET_ARCH=x64-bmi2 -DCMAKE_BUILD_TYPE=Final ..
# AVX2
cmake -DTARGET_ARCH=x64-avx2 -DCMAKE_BUILD_TYPE=Final ..
# SSE4-POPCNT
cmake -DTARGET_ARCH=x64-sse4-popcnt -DCMAKE_BUILD_TYPE=Final ..
# Legacy (fallback)
cmake -DTARGET_ARCH=x64-legacy -DCMAKE_BUILD_TYPE=Final ..

Windows

Run GenerateVisualStudioSolution.bat to generate the Visual Studio solution
Open build_<arch>/caissa.sln in Visual Studio 2022
Select the desired configuration (Debug/Release/Final)
Build the solution (Ctrl+Shift+B)

Note: Visual Studio 2022 is the only tested version. CMake directly in Visual Studio has not been tested.

ARM / AArch64

CMake supports two ARM targets via TARGET_ARCH:

mkdir build && cd build

# Generic AArch64 (no NEON intrinsics)
cmake -DTARGET_ARCH=aarch64 -DCMAKE_BUILD_TYPE=Final ..

# AArch64 with NEON SIMD (recommended on modern ARM hardware)
cmake -DTARGET_ARCH=aarch64-neon -DCMAKE_BUILD_TYPE=Final ..

make -j$(nproc)

Post-Compilation

After compilation, copy the appropriate neural network file from data/neuralNets/ to:

Linux: build/bin/
Windows: build\bin\x64\<Configuration>\

Architecture Variants

Variant	CPU Requirements	Performance	Recommended For
AVX-512	AVX-512 instruction set	Fastest	Latest Intel Xeon, AMD EPYC
BMI2	AVX2 + BMI2	Fast	Most modern CPUs (2015+)
AVX2	AVX2 instruction set	Fast	Intel Haswell, AMD Ryzen
POPCNT	SSE4.2 + POPCNT	Moderate	Older CPUs (2008-2014)
Legacy	x64 only	Slowest	Very old x64 CPUs

Tip: If unsure, try BMI2 first. It's supported by most modern CPUs and offers excellent performance.
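
If you prefer to check your CPU programmatically, the table above can be turned into a small decision function. This is a sketch under assumptions: the `CpuFeatures` struct and both function names are hypothetical, the returned strings reuse the `TARGET_ARCH` values from the Compilation section, and detection relies on the GCC/Clang builtin `__builtin_cpu_supports`, which is only available on x86 targets. The real AVX-512 build may require more AVX-512 subsets than the `avx512f` check performed here.

```cpp
#include <cassert>
#include <string>

// Feature flags relevant to the variants table (illustrative type).
struct CpuFeatures {
    bool avx512 = false;
    bool avx2 = false;
    bool bmi2 = false;
    bool sse42 = false;
    bool popcnt = false;
};

// Pick the fastest variant whose requirements are met, following the table.
std::string pickTargetArch(const CpuFeatures& f) {
    if (f.avx512) return "x64-avx512";
    if (f.avx2 && f.bmi2) return "x64-bmi2";
    if (f.avx2) return "x64-avx2";
    if (f.sse42 && f.popcnt) return "x64-sse4-popcnt";
    return "x64-legacy";
}

// Runtime detection; on non-x86 or non-GCC/Clang builds all flags stay false.
CpuFeatures detectCpuFeatures() {
    CpuFeatures f;
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    f.avx512 = __builtin_cpu_supports("avx512f") != 0;
    f.avx2 = __builtin_cpu_supports("avx2") != 0;
    f.bmi2 = __builtin_cpu_supports("bmi2") != 0;
    f.sse42 = __builtin_cpu_supports("sse4.2") != 0;
    f.popcnt = __builtin_cpu_supports("popcnt") != 0;
#endif
    return f;
}
```

`pickTargetArch(detectCpuFeatures())` then yields a value that can be passed to `-DTARGET_ARCH=...` or used to choose a pre-built binary.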

Custom Commands

In addition to the standard UCI protocol, the engine supports these non-standard commands useful for development and debugging:

Command	Description
`bench [depth]`	Run a benchmark / smoke test
`perft [depth]`	Count leaf nodes of the legal move tree to a given depth (move generation test)
`eval`	Display evaluation of the current position
`print`	Pretty-print the current board
`scoremoves`	Show move ordering scores for the current position
`threats`	Show threat information for the current position
`ttinfo`	Print transposition table statistics
`ttprobe`	Probe the transposition table for the current position
`tbprobe`	Probe tablebases for the current position
`cacheprobe`	Probe the node cache for the current position
`printparams`	Print all tunable search/eval parameters (only with `ENABLE_TUNING` build flag)
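
The idea behind `perft` is simple enough to sketch: make every legal move, recurse, and count the positions reached at the requested depth. The template below is an illustration only. It expects a hypothetical position interface (`legalMoves`, `makeMove`, `unmakeMove`) and is exercised with a toy pile game rather than real chess, since a full move generator would not fit here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for a chess position (hypothetical): a pile of stones,
// where each move removes one or two stones.
struct PilePosition {
    int stones;

    std::vector<int> legalMoves() const {
        std::vector<int> moves;
        for (int take = 1; take <= 2 && take <= stones; ++take) {
            moves.push_back(take);
        }
        return moves;
    }
    void makeMove(int take) { stones -= take; }
    void unmakeMove(int take) { stones += take; }
};

// Counts the leaf nodes of the legal move tree at exactly the given depth.
template <typename Position>
std::uint64_t perft(Position& position, int depth) {
    assert(depth >= 0);
    if (depth == 0) {
        return 1;
    }
    std::uint64_t nodes = 0;
    for (const auto move : position.legalMoves()) {
        position.makeMove(move);
        nodes += perft(position, depth - 1);
        position.unmakeMove(move);
    }
    return nodes;
}
```

For the standard chess starting position the well-known reference values are 20 at depth 1 and 400 at depth 2, which is why perft is a convenient correctness test for a move generator.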

UCI Options

The engine supports the following UCI options:

Search Options

Hash (int) - Transposition table size in megabytes
Threads (int) - Number of search threads
MultiPV (int) - Number of principal variation lines to search
Ponder (bool) - Enable pondering mode
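
To show what a `Hash` value in megabytes means in practice, here is a minimal sketch of a transposition table sized from that number. Everything specific is an assumption for illustration: the 16-byte `TTEntry` layout, rounding the entry count down to a power of two, and the always-replace policy. Caissa's real table is more elaborate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative 16-byte entry; depth -1 marks an empty slot.
struct TTEntry {
    std::uint64_t key = 0;
    std::int32_t score = 0;
    std::int32_t depth = -1;
};

class SimpleTranspositionTable {
public:
    explicit SimpleTranspositionTable(std::size_t megabytes)
        : entries_(entryCountFor(megabytes)) {}

    std::size_t size() const { return entries_.size(); }

    // Always-replace policy: the newest result overwrites the slot.
    void store(std::uint64_t key, std::int32_t score, std::int32_t depth) {
        entries_[indexOf(key)] = TTEntry{key, score, depth};
    }

    // Returns the entry only if the slot holds exactly this key.
    const TTEntry* probe(std::uint64_t key) const {
        const TTEntry& slot = entries_[indexOf(key)];
        return (slot.depth >= 0 && slot.key == key) ? &slot : nullptr;
    }

private:
    // Largest power-of-two entry count that fits in the requested size,
    // so the slot index can be computed with a bit mask.
    static std::size_t entryCountFor(std::size_t megabytes) {
        assert(megabytes > 0);
        const std::size_t maxEntries = megabytes * 1024 * 1024 / sizeof(TTEntry);
        std::size_t count = 1;
        while (count * 2 <= maxEntries) {
            count *= 2;
        }
        return count;
    }

    std::size_t indexOf(std::uint64_t key) const {
        return static_cast<std::size_t>(key & (entries_.size() - 1));
    }

    std::vector<TTEntry> entries_;
};
```

With this layout a 1 MB table holds 65,536 entries. Two different keys that map to the same slot simply evict each other, which is why the stored key is compared on every probe.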

Time Management

MoveOverhead (int) - Move overhead in milliseconds (increase if engine loses time)

Evaluation

EvalFile (string) - Path to neural network evaluation file (.pnn)
EvalRandomization (int) - Evaluation randomization range (weakens engine, introduces non-determinism)

Tablebases

SyzygyPath (string) - Semicolon-separated paths to Syzygy tablebases
SyzygyProbeLimit (int) - Maximum number of pieces for tablebase probing

Display Options

UCI_AnalyseMode (bool) - Analysis mode (full PV lines, no depth constraints)
UCI_Chess960 (bool) - Enable Chess960 mode (castling as "king captures rook")
UCI_ShowWDL (bool) - Show win/draw/loss probabilities with evaluation
UseSAN (bool) - Use Standard Algebraic Notation (FIDE standard)
ColorConsoleOutput (bool) - Enable colored console output
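
GUIs set these options with the standard UCI `setoption name <id> [value <x>]` command. The sketch below parses such a line and splits a semicolon-separated `SyzygyPath` value. It is a simplified illustration, not the engine's parser: the `SetOption` struct and both function names are hypothetical, and repeated whitespace inside names or values is collapsed to single spaces.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Parsed form of "setoption name <id> [value <x>]" (illustrative type).
struct SetOption {
    std::string name;
    std::string value;  // empty for options without a value
};

std::optional<SetOption> parseSetOption(const std::string& line) {
    std::istringstream input(line);
    std::string token;
    if (!(input >> token) || token != "setoption") return std::nullopt;
    if (!(input >> token) || token != "name") return std::nullopt;

    SetOption result;
    bool inValue = false;
    while (input >> token) {
        if (!inValue && token == "value") {
            inValue = true;  // everything after this token belongs to the value
            continue;
        }
        std::string& target = inValue ? result.value : result.name;
        if (!target.empty()) target += ' ';
        target += token;
    }
    if (result.name.empty()) return std::nullopt;
    return result;
}

// Splits a semicolon-separated path list, skipping empty segments.
std::vector<std::string> splitPaths(const std::string& value) {
    std::vector<std::string> paths;
    std::istringstream input(value);
    std::string item;
    while (std::getline(input, item, ';')) {
        if (!item.empty()) paths.push_back(item);
    }
    return paths;
}
```

For example, `setoption name Hash value 256` parses to the name `Hash` and the value `256`, and a `SyzygyPath` value such as `/tb/wdl;/tb/dtz` (example paths) splits into two directories.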

History & Originality

Caissa has been developed from the ground up since early 2021. Its evaluation went through three stages:

Early Versions - Used simple PeSTO evaluation
Version 0.6 - Temporarily used Stockfish NNUE
Version 0.7+ - Custom neural network evaluation system

Neural Network Development

The engine's neural network has evolved significantly:

Initial Network: Based on Stockfish's architecture, trained on a few million positions
Current Network (v1.24+): Trained on 17+ billion positions from self-play
Progressive Training: Older games are purged, ensuring networks are trained only on the latest engine versions

Key Components

Runtime Evaluation: PackedNeuralNetwork.cpp
- Inspired by nnue.md
- Highly optimized with manual SIMD vectorization
Network Trainer: NetworkTrainer.cpp, NeuralNetwork.cpp
- Written completely from scratch
- CPU-based, heavily optimized with AVX and multithreading
- Exploits network sparsity for performance
Self-Play Generator: SelfPlay.cpp
- Generates games with fixed nodes/depth
- Custom binary format for efficient storage
- Uses Stefan Pohl's UHO books or DFRC openings

Project Structure

The project is organized into three main modules:

src/
├── backend/     # Core engine library
│   ├── Search.*            # Search algorithms
│   ├── Position.*          # Position representation
│   ├── MoveGen.*           # Move generation
│   ├── PackedNeuralNetwork.*   # Neural network evaluation
│   ├── TranspositionTable.*    # Position caching
│   └── ...
│
├── frontend/    # UCI interface executable
│   ├── Main.cpp            # Entry point
│   └── UCI.*               # UCI protocol implementation
│
└── utils/       # Development and training tools
    ├── NetworkTrainer.*    # Neural network training
    ├── SelfPlay.*          # Self-play game generation
    ├── Tests.*             # Unit tests
    └── ...

Module Descriptions

backend (library) - Engine core: search, evaluation, move generation, position management
frontend (executable) - UCI wrapper providing command-line interface
utils (executable) - Utilities: network trainer, self-play generator, unit tests, performance tests

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author: Michał Witanowski
Started: Early 2021
Language: C++20
License: MIT
