Open Source Python MIT
ModelAudit

ModelAudit

Model Audit

★ 1 ⑂ 0 Updated 2026-03-17
Statistical forensics-based multi-method distillation detection framework — extracts model fingerprints through behavioral probing, determines distillation relationships via hypothesis testing, combining Pearson correlation, Jensen-Shannon divergence, and CKA similarity.
Statistical Forensics Behavioral Probing Hypothesis Testing

Quick Start

Install
pip install knowlyr-modelaudit
Usage
from modelaudit import AuditEngine

engine = AuditEngine()
results = engine.detect(["Hello! I'd be happy to help..."])
detect_text_source Detect text data source — determine which LLM likely generated the text
verify_model Verify model identity — check if an API serves the claimed model
compare_models Compare fingerprint similarity of two models to detect distillation/derivation
compare_models_whitebox Whitebox comparison of two local models — uses REEF CKA method to compare hidden state similarity (requires model weights)
audit_memorization Detect if a model has memorized training data — evaluate via prefix completion and token-level checking
audit_report Generate a complete model audit report — aggregate results from all audit tools
audit_watermark Detect AI watermarks in text (statistical features and pattern matching)
audit_distillation Full distillation audit — comprehensive fingerprint comparison + style analysis, generates detailed audit report

Documentation

ModelAudit

LLM Distillation Detection and Model Fingerprinting
via Statistical Forensics

Detect unauthorized model distillation through behavioral probing,
stylistic fingerprinting, and representation similarity analysis.

Statistical Forensics · Behavioral Signatures · Cross-Model Lineage Inference

The Problem

Large language model distillation has become a core threat to model IP protection. Student models can replicate teacher model capabilities by mimicking output distributions -- without authorization. Existing detection methods either require white-box weight access (often unavailable) or only analyze surface-level text features (easily evaded).

The Solution

ModelAudit is a multi-method distillation detection framework based on statistical forensics. It extracts model fingerprints through behavioral probing, applies hypothesis testing to determine distillation relationships, and combines four complementary methods to form a complete black-box to white-box audit chain.

Four Complementary Detection Methods

Method Type Mechanism
LLMmap Black-box 20 behavioral probes, Pearson correlation on response patterns
DLI Black-box Behavioral signatures + Jensen-Shannon divergence lineage inference
REEF White-box CKA layer-wise hidden state similarity
StyleAnalysis Stylistic 12 model family style signatures + language detection

10-Dimensional Behavioral Probing

Go beyond simple text statistics. ModelAudit probes 10 cognitive dimensions -- self-awareness, safety boundaries, injection testing, reasoning, creative writing, multilingual, format control, role-playing, code generation, and summarization -- capturing deep behavioral differences that persist even after RLHF alignment.

Cross-Provider Audit Chain

Audit across providers seamlessly. Teacher and student models can come from different APIs:

knowlyr-modelaudit audit \
  --teacher claude-opus --teacher-provider anthropic \
  --student kimi-k2.5 --student-provider openai \
  --student-api-base https://api.moonshot.cn/v1 \
  -o report.md

Get Started

pip install knowlyr-modelaudit

# Detect text source
knowlyr-modelaudit detect texts.jsonl

# Verify model identity
knowlyr-modelaudit verify gpt-4o --provider openai

# Full distillation audit
knowlyr-modelaudit audit --teacher gpt-4o --student my-model -o report.md
from modelaudit import AuditEngine

engine = AuditEngine()
audit = engine.audit("claude-opus", "suspect-model")
print(f"{audit.verdict} (confidence: {audit.confidence:.3f})")

MCP Integration

ModelAudit ships with 8 MCP tools for seamless integration into AI workflows:

detect_text_source · verify_model · compare_models · compare_models_whitebox · audit_distillation · audit_memorization · audit_report · audit_watermark

Built-in Benchmark

100% detection accuracy across 6 model families (14 samples). Supports 12 model families: GPT-4 · GPT-3.5 · Claude · LLaMA · Gemini · Qwen · DeepSeek · Mistral · Yi · Phi · Cohere · ChatGLM.


GitHub · PyPI

knowlyr — LLM distillation detection and model fingerprinting via statistical forensics

Want to discuss this project? Reach out to

Kai
Kai Founder & CEO
林锐
林锐 AI Code Review & Refactoring Consultant