greybox


Python port of the R greybox package — a toolbox for regression model building and forecasting.


Installation

pip install greybox

For more installation options, see the Installation wiki page.

Quick Example

import numpy as np
import pandas as pd
from greybox import ALM, formula

# Generate sample data
np.random.seed(42)
n = 200
data = pd.DataFrame({
    "y": np.random.normal(10, 2, n),
    "x1": np.random.normal(5, 1, n),
    "x2": np.random.normal(3, 1, n),
})
data["y"] = 2 + 0.5 * data["x1"] - 0.3 * data["x2"] + np.random.normal(0, 1, n)

# Parse formula and fit model
y, X = formula("y ~ x1 + x2", data=data)
model = ALM(distribution="dnorm")
model.fit(X, y)

# Summary
print(model.summary())

# Predict with 95% prediction intervals
# (reuses the design matrix X from the fit; pass new data to forecast)
pred = model.predict(X, interval="prediction", level=0.95)
print(pred.mean[:5])

# Include AR terms (ARIMA-like models)
# For example, an ARIMA(1,1,0) model with the Log-Normal distribution
# (the Log-Normal distribution requires strictly positive y):
model = ALM(distribution="dlnorm", orders=(1, 1, 0))
model.fit(X, y)
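
To make the `orders=(1, 1, 0)` argument more concrete: in standard ARIMA(p, d, q) notation it means one autoregressive lag, one round of differencing, and no moving-average terms. The sketch below builds that structure by hand in plain Python. It does not call greybox, the helper names `difference` and `ar1_slope` are hypothetical, and it is not meant to describe how greybox estimates such models internally.

```python
def difference(series):
    """First difference z[t] = y[t] - y[t-1] (the d=1 part)."""
    return [b - a for a, b in zip(series, series[1:])]


def ar1_slope(z):
    """Least-squares slope of z[t] on z[t-1], no intercept (the p=1 part)."""
    lagged, current = z[:-1], z[1:]
    denom = sum(v * v for v in lagged)
    if denom == 0:
        raise ValueError("lagged series is all zeros; slope is undefined")
    return sum(a * b for a, b in zip(lagged, current)) / denom


# Toy series whose increments halve at every step (8, 4, 2, 1, 0.5).
y_toy = [0.0, 8.0, 12.0, 14.0, 15.0, 15.5]
z = difference(y_toy)   # differenced series
phi = ar1_slope(z)      # AR(1) coefficient on the differenced series
print(z, phi)
```

For this toy series the differenced values are 8, 4, 2, 1, 0.5, so the estimated AR(1) coefficient is 0.5. A likelihood-based fit with a non-normal distribution would generally give different estimates on real data.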

Supported Distributions

| Category   | Distributions                                                  |
|------------|----------------------------------------------------------------|
| Continuous | dnorm, dlaplace, ds, dgnorm, dlgnorm, dfnorm, drectnorm, dt    |
| Positive   | dlnorm, dinvgauss, dgamma, dexp, dchisq                        |
| Count      | dpois, dnbinom, dgeom                                          |
| Bounded    | dbeta, dlogitnorm, dbcnorm                                     |
| CDF-based  | pnorm, plogis                                                  |
| Other      | dalaplace, dbinom                                              |
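
The names above follow the R convention of prefixing density functions with `d` (as in R's `dnorm`). As a rough illustration of what the choice of distribution means for likelihood-based estimation, the sketch below fits an intercept-only model under normal and Laplace assumptions, using their closed-form maximum likelihood estimates, and compares the two by AIC. It uses only the standard library, the function names are hypothetical, and it assumes the data are not all identical. It does not call greybox.

```python
import math
import statistics


def normal_loglik(y):
    """Log-likelihood at the normal MLE (mean, variance with divisor n)."""
    n = len(y)
    mu = statistics.fmean(y)
    var = sum((v - mu) ** 2 for v in y) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def laplace_loglik(y):
    """Log-likelihood at the Laplace MLE (median, mean absolute deviation)."""
    n = len(y)
    mu = statistics.median(y)
    scale = sum(abs(v - mu) for v in y) / n
    return -n * (math.log(2 * scale) + 1)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


# Small sample with one outlier; both models have two parameters
# (location and scale).
sample = [1.0, 2.0, 2.1, 1.9, 2.0, 9.0]
aic_normal = aic(normal_loglik(sample), 2)
aic_laplace = aic(laplace_loglik(sample), 2)
print(f"AIC normal:  {aic_normal:.2f}")
print(f"AIC Laplace: {aic_laplace:.2f}")
```

On this sample the Laplace model has the lower AIC, because its heavier tails accommodate the outlier better. The same trade-off motivates trying several distributions in a regression setting.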

Features

  • ALM (Augmented Linear Model): Likelihood-based regression with 26 distributions
  • Formula parser: R-style formulas (y ~ x1 + x2, log(y) ~ ., y ~ 0 + x1) with support for the backshift operator
  • stepwise(): IC-based variable selection with partial correlations
  • CALM(): Combine ALM models based on IC weights
  • Forecast error measures: MAE, MSE, RMSE, MAPE, MASE, MPE, sMAPE, and more
  • Variable processing: xreg_expander (lags/leads), xreg_multiplier (interactions), temporal_dummy
  • Distributions: 27 distribution families with density, CDF, quantile, and random generation
  • Association: Partial correlations and measures of association
  • Diagnostics: Model diagnostics and validation
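
For readers unfamiliar with the forecast error measures listed above, here is a minimal standard-library sketch of four of them using their common textbook definitions. The function signatures are hypothetical and are not taken from greybox, whose own functions may use different argument names or scaling conventions. MAPE here is returned as a fraction rather than a percentage, and MASE is scaled by the in-sample mean absolute error of the one-step naive forecast.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse)


def mape(actual, forecast):
    """Mean absolute percentage error as a fraction; actuals must be non-zero."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, insample):
    """MAE divided by the in-sample MAE of the naive forecast y[t-1]."""
    naive_errors = [abs(b - a) for a, b in zip(insample, insample[1:])]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale


insample = [4.0, 6.0, 8.0, 10.0]   # training part of the series
actual = [10.0, 12.0, 14.0]        # holdout values
forecast = [11.0, 12.0, 12.0]      # forecasts for the holdout

print(mae(actual, forecast))             # 1.0
print(mase(actual, forecast, insample))  # 0.5
print(rmse(actual, forecast), mape(actual, forecast))
```

A MASE below 1 indicates that the forecast beats the in-sample naive benchmark on average, which makes it comparable across series with different scales.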

Links

License

LGPL-2.1