Ivan Svetunkov edited this page Feb 24, 2026 · 7 revisions

greybox — Toolbox for Model Building and Forecasting

Overview

greybox is a toolbox for regression model building, selection, and forecasting evaluation. It provides the Augmented Linear Model (ALM) — a flexible regression framework supporting 26 distributions and 7 loss functions — along with stepwise selection, model combination, rolling origin cross-validation, and a comprehensive suite of forecast accuracy measures.

The package exists in two implementations:

  • R package (CRAN) — the mature original, actively maintained
  • Python package (PyPI) — a port covering core functionality with a scikit-learn-compatible API

Installation

See the Installation page for details.

R, From CRAN (Recommended)

The easiest way to install greybox in R is from CRAN:

install.packages("greybox")

Python, From PyPI

The easiest way to install greybox in Python is from PyPI:

pip install greybox
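
To confirm that the installation succeeded, one option is to query the installed version through the standard library. This is a minimal sketch: `installed_version` is a hypothetical helper, not part of greybox, and it only assumes that the distribution is published under the name `greybox`, as the `pip` command above suggests.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical check: report whether the greybox distribution is visible to this interpreter
found = installed_version("greybox")
if found is None:
    print("greybox is not installed in this environment")
else:
    print(f"greybox {found} is installed")
```

If the message reports that the package is missing, the usual cause is that `pip` installed it into a different interpreter or virtual environment than the one running the check.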

Quick Start

R

install.packages("greybox")
library(greybox)

# Fit a model
model <- alm(y ~ x1 + x2, data=mydata, distribution="dnorm")
summary(model)

# Stepwise selection
best <- stepwise(mydata, ic="AICc", distribution="dnorm")

# Forecast evaluation
ro_result <- ro(y, h=5, origins=10,
                call="predict(alm(y~1, data=data), h=5)")

Python

# Install first from a shell: pip install greybox
import numpy as np
from greybox import ALM, formula, stepwise

# Fit a model (data is assumed to hold the columns y, x1 and x2)
y, X = formula("y ~ x1 + x2", data)
model = ALM(distribution="dnorm")
model.fit(X, y)
print(model.summary())

# Stepwise selection
best = stepwise(data, ic="AICc", distribution="dnorm")

# Forecast evaluation (actual, forecast and insample are user-supplied arrays)
from greybox.measures import measures
result = measures(actual, forecast, insample)
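
To make the evaluation step more concrete, the sketch below computes three familiar point measures and a pinball loss in plain Python. The function names are hypothetical stand-ins, not the greybox API, and the conventions used here (errors taken as actual minus forecast, pinball loss summed rather than averaged) are assumptions for illustration.

```python
from typing import Sequence


def mean_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Error: average of (actual - forecast); the sign indicates bias."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Absolute Error: average magnitude of the errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_squared_error(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Squared Error: average of the squared errors."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def pinball_loss(actual: Sequence[float], quantile_forecast: Sequence[float],
                 level: float) -> float:
    """Pinball (quantile) loss for a forecast of the `level` quantile, summed over observations."""
    total = 0.0
    for a, q in zip(actual, quantile_forecast):
        if a >= q:
            total += level * (a - q)          # actual above the quantile forecast
        else:
            total += (1.0 - level) * (q - a)  # actual below the quantile forecast
    return total


# Illustrative numbers only
actual = [10.0, 12.0, 11.0, 13.0]
forecast = [9.0, 12.5, 11.5, 12.0]

print(mean_error(actual, forecast))           # 0.25
print(mean_absolute_error(actual, forecast))  # 0.75
print(mean_squared_error(actual, forecast))   # 0.625
print(pinball_loss(actual, [12.0] * 4, level=0.75))  # 1.5
```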

Documentation

| Page | Description |
| --- | --- |
| ALM | Augmented Linear Model: core estimator, 26 distributions, 7 loss functions |
| stepwise | Forward stepwise variable selection |
| CALM | Combination of ALM (model averaging) |
| distributions | Distribution families: d/p/q/r functions |
| measures | Forecast accuracy metrics (point, interval, quantile, half-moment) |
| manipulations | Variable transformations: xreg functions |
| association | Measures of association: correlation, partial correlation, determination |
| rolling_origin | Rolling origin cross-validation for time series |
| diagnostics | Outlier detection and model diagnostics |

The Python version also ships mtcars, the standard dataset from the R datasets package, as a pandas data frame. Import it with:

from greybox import mtcars

R vs Python Implementation Comparison

Model Fitting

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Augmented Linear Model | alm() | ALM().fit() | Implemented |
| Scale Model | sm() | - | R only |
| Bootstrap coefficients | coefbootstrap() | - | R only |

Model Selection

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Stepwise selection | stepwise() | stepwise() | Implemented |
| Model combination | calm() | CALM() | Implemented |

Prediction

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Predict / Forecast | predict() / forecast() | ALM.predict() | Implemented |

Forecast Evaluation

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Rolling origin | ro() | rolling_origin() | Implemented |
| RMCB test | rmcb() | - | R only |
| Point measures (16) | ME(), MAE(), MSE(), etc. | me(), mae(), mse(), etc. | Implemented |
| Interval measures | MIS(), sMIS() | mis(), smis() | Implemented |
| Half-moment measures | hm(), ham(), asymmetry(), etc. | hm(), ham(), asymmetry(), etc. | Implemented |
| Pinball loss | pinball() | pinball() | Implemented |
| measures() | measures() | measures() | Implemented |
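
The rolling origin entry above refers to evaluating forecasts from several successive origins. The sketch below illustrates the general idea in plain Python; it is not the `rolling_origin()` implementation. `rolling_origin_sketch` and `naive_forecaster` are hypothetical names, and the design choices (an expanding training window and a holdout of constant length `h`) are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_forecaster(train: Sequence[float], h: int) -> List[float]:
    """Naive benchmark: repeat the last observed value h times."""
    return [train[-1]] * h


def rolling_origin_sketch(series: Sequence[float], h: int, origins: int,
                          forecaster: Forecaster) -> List[Tuple[List[float], List[float]]]:
    """Return (holdout, forecast) pairs, one per origin, using an expanding window."""
    n = len(series)
    first_train_end = n - h - (origins - 1)
    if first_train_end < 1:
        raise ValueError("series is too short for the requested h and origins")
    results = []
    for i in range(origins):
        train_end = first_train_end + i                   # origin moves forward by one step
        train = list(series[:train_end])                  # everything up to the origin
        holdout = list(series[train_end:train_end + h])   # next h observations
        results.append((holdout, forecaster(train, h)))
    return results


# Illustrative data: 12 observations, horizon 3, 4 origins
series = [float(v) for v in range(1, 13)]
for holdout, forecast in rolling_origin_sketch(series, h=3, origins=4,
                                               forecaster=naive_forecaster):
    errors = [a - f for a, f in zip(holdout, forecast)]
    print(holdout, forecast, errors)
```

Each pair can then be passed to any accuracy measure, and the results averaged across origins to get a more stable picture than a single train/test split gives.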

Measures of Association

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Association | association() | association() | Implemented |
| Cramer's V | cramer() | - | R only |
| Partial correlation | pcor() | pcor() | Implemented |
| Multiple correlation | mcor() | mcor() | Implemented |
| Determination | determination() | determination() | Implemented |
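
As background for the partial correlation entry, the sketch below computes the textbook first-order partial correlation of two variables while controlling for a third. It is an illustration in plain Python, not the `pcor()` implementation; the helper names and the data are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(x: Sequence[float], y: Sequence[float],
                        z: Sequence[float]) -> float:
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Illustrative data: x and y both follow z, plus unrelated noise
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [2.0, 3.0, 2.0, 3.0, 6.0, 7.0]

print(pearson(x, y))                 # about 0.73: x and y look related
print(partial_correlation(x, y, z))  # essentially zero once z is controlled for
```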

Distributions

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| 26 distributions | d/p/q/r functions | d/p/q/r functions | Implemented |
| Three-param lognormal | dtplnorm() etc. | - | R only |
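
For readers unfamiliar with the d/p/q/r convention, the prefixes stand for density, cumulative probability, quantile, and random generation. The sketch below shows the four roles for the normal distribution using only the Python standard library. The wrappers are named after their R counterparts purely for illustration; the names and signatures of the greybox Python functions are not shown on this page and may differ.

```python
from statistics import NormalDist
from typing import List, Optional


def dnorm(x: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """d: probability density at x."""
    return NormalDist(mean, sd).pdf(x)


def pnorm(q: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """p: cumulative probability P(X <= q)."""
    return NormalDist(mean, sd).cdf(q)


def qnorm(p: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """q: quantile, the inverse of the cumulative distribution function."""
    return NormalDist(mean, sd).inv_cdf(p)


def rnorm(n: int, mean: float = 0.0, sd: float = 1.0,
          seed: Optional[int] = None) -> List[float]:
    """r: n random draws; pass a seed for reproducibility."""
    return NormalDist(mean, sd).samples(n, seed=seed)


print(dnorm(0.0))         # density at the mean of a standard normal
print(pnorm(1.96))        # probability mass below 1.96
print(qnorm(0.975))       # quantile that leaves 2.5% in the upper tail
print(rnorm(3, seed=42))  # three reproducible draws
```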

Feature Engineering

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Variable expansion | xregExpander() | xreg_expander() | Implemented |
| Transformations | xregTransformer() | xreg_transformer() | Implemented |
| Cross-products | xregMultiplier() | xreg_multiplier() | Implemented |
| Temporal dummies | temporalDummy() | temporal_dummy() | Implemented |
| Outlier dummies | outlierdummy() | outlier_dummy() | Implemented |
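
Variable expansion is usually understood as creating lagged and leading copies of a regressor. The sketch below shows that idea in plain Python under stated assumptions: `expand_shifts` and its column names are hypothetical, negative shifts are treated as lags and positive shifts as leads, and values that fall outside the sample are padded with `None`. How `xreg_expander()` names columns or fills those gaps is not described on this page.

```python
from typing import Dict, List, Optional, Sequence


def expand_shifts(values: Sequence[float],
                  shifts: Sequence[int]) -> Dict[str, List[Optional[float]]]:
    """Build shifted copies of `values`: negative shift = lag, positive shift = lead."""
    n = len(values)
    columns: Dict[str, List[Optional[float]]] = {}
    for k in shifts:
        if k == 0:
            name = "original"
        elif k < 0:
            name = f"lag{-k}"
        else:
            name = f"lead{k}"
        # Row t takes the value observed at time t + k, if that time is inside the sample
        columns[name] = [values[t + k] if 0 <= t + k < n else None for t in range(n)]
    return columns


x = [1.0, 2.0, 3.0, 4.0]
for name, column in expand_shifts(x, shifts=[-1, 0, 1]).items():
    print(name, column)
# lag1 [None, 1.0, 2.0, 3.0]
# original [1.0, 2.0, 3.0, 4.0]
# lead1 [2.0, 3.0, 4.0, None]
```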

Information Criteria

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| AIC / AICc / BIC / BICc | AIC(), AICc(), BIC(), BICc() | ALM.aic, .aicc, .bic, .bicc | Implemented (as properties) |
| Point IC | pointLik(), pAIC(), pAICc(), pBIC() | point_lik() | Partial |
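
For reference, the four criteria can be written as functions of the log-likelihood, the number of estimated parameters `k`, and the sample size `n`. The sketch below uses the standard AIC, AICc, and BIC formulas; the BICc line follows one commonly cited small-sample correction and is an assumption here, since this page does not state the exact formula the package uses. The function names are illustrative and are not the `ALM` properties themselves.

```python
import math


def aic(loglik: float, k: int) -> float:
    """Akaike Information Criterion."""
    return 2 * k - 2 * loglik


def aicc(loglik: float, k: int, n: int) -> float:
    """AIC with the small-sample correction term."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic(loglik, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(loglik: float, k: int, n: int) -> float:
    """Bayesian (Schwarz) Information Criterion."""
    return k * math.log(n) - 2 * loglik


def bicc(loglik: float, k: int, n: int) -> float:
    """BIC with a small-sample correction (assumed form, see the note above)."""
    if n - k - 1 <= 0:
        raise ValueError("BICc needs n > k + 1")
    return -2 * loglik + (k * math.log(n) * n) / (n - k - 1)


# Illustrative values: log-likelihood of -100 with 3 parameters on 50 observations
loglik, k, n = -100.0, 3, 50
print(aic(loglik, k))      # 206.0
print(aicc(loglik, k, n))
print(bic(loglik, k, n))
print(bicc(loglik, k, n))
```

In all four cases lower values indicate a better trade-off between fit and complexity, and the corrected versions penalise extra parameters more heavily when the sample is small.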

Visualization

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Graph maker | graphmaker() | - | R only |
| Spread plot | spread() | - | R only |
| Table plot | tableplot() | - | R only |

Demand Analysis

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| Demand identification | aid() | - | R only |

Utilities

| Feature | R Function | Python Function | Status |
| --- | --- | --- | --- |
| DST detection | detectdst() | - | R only |
| Leap year detection | detectleap() | - | R only |
| Polynomial products | polyprod() | - | R only |
| DSR bootstrap | dsrboot() | - | R only |

Naming Conventions

R functions mostly use camelCase. The Python port uses snake_case for functions and uppercase names for the ALM and CALM classes:

| R | Python |
| --- | --- |
| alm() | ALM().fit() |
| calm() | CALM() |
| xregExpander() | xreg_expander() |
| xregTransformer() | xreg_transformer() |
| xregMultiplier() | xreg_multiplier() |
| temporalDummy() | temporal_dummy() |
| outlierdummy() | outlier_dummy() |
| pointLik() | point_lik() |
| ro() | rolling_origin() |
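
When porting R scripts, a small helper can suggest the likely Python name for a camelCase R function. The sketch below is a hypothetical convenience, not part of greybox. It handles the regular cases only: names such as `outlierdummy()` (all lowercase in R), `ro()` (renamed to `rolling_origin()`), and the `alm()`/`calm()` classes do not follow the mechanical rule and need the table above.

```python
import re


def camel_to_snake(name: str) -> str:
    """Insert an underscore before each capital that follows a lowercase letter or digit."""
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


for r_name in ["xregExpander", "xregTransformer", "xregMultiplier",
               "temporalDummy", "pointLik"]:
    print(r_name, "->", camel_to_snake(r_name))

# Irregular case: the R name has no capital letter, so the rule leaves it unchanged
print(camel_to_snake("outlierdummy"))  # outlierdummy, whereas the Python name is outlier_dummy
```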

