BFFM-XGB: Big Five From 20 Questions

Open-source pipeline for training XGBoost quantile regression models that predict Big Five personality scores from partial questionnaire responses. The models are trained with sparsity augmentation on ~603k respondents from the IPIP-BFFM dataset. The 15 exported ONNX models (5 domains × 3 quantiles) produce percentile scores with calibrated 90% prediction intervals from as few as 20 items.
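For readers new to quantile regression, the objective behind each per-quantile model can be illustrated with the pinball loss. This is a minimal sketch, not the pipeline's training code: the function name is hypothetical, and the quantile levels 0.05, 0.50, and 0.95 are an assumption inferred from the 90% interval (the README only says there are 3 quantiles per domain).

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must be strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff >= 0) costs q per unit of error;
        # over-prediction costs (1 - q) per unit of error.
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


# Assumed quantile levels for a 90% interval plus a median point estimate.
ASSUMED_QUANTILES = (0.05, 0.50, 0.95)

if __name__ == "__main__":
    for q in ASSUMED_QUANTILES:
        print(q, pinball_loss([50.0, 60.0], [55.0, 55.0], q))
```

Minimizing this loss at q = 0.95 pushes predictions toward a value that the true score falls below about 95% of the time, which is why the lower and upper models together can bracket a 90% interval.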

What's here:

Models — Pre-trained ONNX models, public domain, on HuggingFace
Pipeline — Reproducible end-to-end training: data download through ONNX export and publication figures
Inference packages — Python and TypeScript libraries for running predictions
Live demo — big5.shawnprice.com

Comparison with the Mini-IPIP

The Mini-IPIP is the standard short personality test in psychology research (Donnellan et al., 2006). Both approaches use 20 items (4 per domain) to recover the full 50-item IPIP-BFFM scale scores. All r values are Pearson correlations with the full 50-item scale on a held-out test set (N = 90,499).

	Mini-IPIP	BFFM-XGB-20
Items	20 (4 per domain)	20 (4 per domain)
Item selection	Expert-curated brevity	Top-4 by within-domain r
Scoring	Simple scale averaging	XGBoost quantile regression
Overall r	.906	.927
MAE (percentile pts)	9.2	8.2
90% prediction intervals	—	✓ (89.5% coverage)
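The three metrics in the table above (Pearson r, MAE in percentile points, and interval coverage) can be computed as in the sketch below. The function names are hypothetical and this is not the project's evaluation code; it only shows the standard definitions, assuming plain Python lists of true and predicted percentiles.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference, here in percentile points."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def interval_coverage(y_true, lower, upper):
    """Fraction of true values that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)
```

A well-calibrated 90% interval should give `interval_coverage` close to 0.90 on held-out data; the 89.5% reported in the table is that quantity.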

Per-domain accuracy at K = 20:

Domain	Mini-IPIP α	Mini-IPIP r	BFFM-XGB-20 r
Extraversion	.77	.939	.947
Agreeableness	.70	.911	.920
Conscientiousness	.69	.909	.919
Emotional Stability	.68	.929	.937
Intellect/Imagination	.65	.842	.910

At 15 items, BFFM-XGB already matches the Mini-IPIP's 20-item accuracy (r = .908 vs .906).
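The "Top-4 by within-domain r" item selection listed in the comparison table can be sketched as follows. This is an illustration under assumptions, not the pipeline's implementation: it takes "within-domain r" to mean each item's correlation with the mean of all items in its domain (which includes the item itself), and the function and variable names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def top_k_items(domain_responses, k=4):
    """Rank one domain's items by correlation with the domain score.

    domain_responses maps item_id -> list of already-scored answers,
    one entry per respondent, with every list the same length.
    """
    columns = list(domain_responses.values())
    n_respondents = len(columns[0])
    # Assumed domain score: per-respondent mean over all items in the domain.
    domain_score = [
        sum(col[i] for col in columns) / len(columns)
        for i in range(n_respondents)
    ]
    ranked = sorted(
        domain_responses,
        key=lambda item_id: pearson_r(domain_responses[item_id], domain_score),
        reverse=True,
    )
    return ranked[:k]
```

Running this once per domain with k = 4 would yield a 20-item set; a smaller k per domain gives shorter forms such as the 15-item case.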

Quick Start

Language	Directory	Install	Docs
Python	`python/`	`pip install onnxruntime numpy scipy pytest`	Inference guide
TypeScript	`typescript/`	`npm ci`	Inference guide

Give it answers (1–5 scale, reverse-scored), get percentiles with 90% prediction intervals. See docs/inference.md for full code examples.
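As a reminder of what reverse-scoring means on a 1–5 scale, here is a minimal sketch of the arithmetic. It does not use the inference packages' API: the function name, the item ids, and the choice of which item is reverse-keyed are all hypothetical, and docs/inference.md remains the reference for the input format the packages actually expect.

```python
def reverse_score(answers, reverse_keyed, scale_min=1, scale_max=5):
    """Flip reverse-keyed items so that a higher value always means more of the trait.

    answers maps item_id -> raw answer; reverse_keyed is a set of item ids.
    """
    scored = {}
    for item_id, value in answers.items():
        if not scale_min <= value <= scale_max:
            raise ValueError(f"{item_id}: answer {value} is outside the scale")
        if item_id in reverse_keyed:
            # On a 1-5 scale this is 6 - value: 1 <-> 5, 2 <-> 4, 3 stays 3.
            scored[item_id] = scale_min + scale_max - value
        else:
            scored[item_id] = value
    return scored


if __name__ == "__main__":
    # Hypothetical item ids; "ext2" is assumed to be reverse-keyed.
    print(reverse_score({"ext1": 4, "ext2": 2}, reverse_keyed={"ext2"}))
```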

Reproduce

make setup    # Python, TypeScript, and web dependencies
make all      # Full pipeline: download through figures
make test     # All tests: lib, inference, and web

See docs/pipeline.md for pipeline stages, training variants, hyperparameter tuning, and research evaluation.

Directory Structure

bffm-xgb/
├── artifacts/          Pipeline artifacts and static reference data
├── configs/            YAML training configurations (reference + ablation variants)
├── data/               Downloaded and processed data (gitignored)
├── docs/               Documentation (inference, pipeline, research, infrastructure)
├── figures/            Generated publication figures (gitignored, regenerable)
├── infra/              Terraform configs for AWS CPU/GPU instances
├── lib/                Shared Python library
├── models/             Trained model checkpoints (gitignored)
├── notes/              Research notes (auto-generated data sections)
├── output/             Exported ONNX models by variant (reference/, ablation_*/)
├── pipeline/           Numbered pipeline scripts (01 through 13)
├── python/             Python inference package with tests
├── scripts/            Pipeline utilities (research summary, notes, provenance, backup, deployment)
├── templates/          Jinja2 templates for generated outputs
├── tests/              Unit tests for lib/ modules (pytest)
├── typescript/         TypeScript inference package with tests (vitest)
├── web/                React + Hono web assessment app (deployed to HuggingFace Spaces)
├── .env.example        Template for HF_TOKEN (required for `make upload-hf`)
├── .gitignore
├── LICENSE.md          MIT License
├── Makefile            Orchestrates the full pipeline
├── NOTICES.md          Third-party attributions (CC0, IPIP, OSPP)
├── pyproject.toml      pytest configuration (pythonpath, testpaths)
├── requirements.txt    Python dependencies
└── README.md           This file

Documentation

Inference Guide — Python/TypeScript usage, code examples, reverse-scoring
Pipeline Guide — Full reproduction, pipeline stages, training, research evaluation
Research Notes — Model architecture, sparsity augmentation, norms, data, limitations
Infrastructure Guide — AWS remote training (CPU/GPU, spot/on-demand)
Web App — React + Hono assessment app
Model Cards — ONNX model details and provenance
NOTES.md — Auto-generated research notes with cross-variant evaluation data

Limitations

Norms are derived from self-selected online respondents (OSPP); they may not represent the general population
Models are trained on English-language IPIP items only
Accuracy degrades with fewer items; 20 items is the recommended minimum for reliable scoring
Intended for educational use only

License

MIT. See NOTICES.md for third-party attributions.
