distributions
Greybox supports 26 distributions for use with ALM. Each distribution is identified by a short code (e.g. "dnorm") used in the distribution parameter. The d/p/q/r convention follows R: density, CDF, quantile, and random generation.
These distributions model continuous data with identity link (mu = X @ beta).
| Code | Full Name | Extra Parameter | Use Case |
|---|---|---|---|
| `dnorm` | Normal | — | General continuous data, default |
| `dlaplace` | Laplace | — | Heavy tails, robust to outliers |
| `ds` | S (half-Laplace) | — | Light-tailed data |
| `dgnorm` | Generalized Normal | `shape` (default 2.0) | Flexible tail weight; shape=2 is Normal, shape=1 is Laplace |
| `dlogis` | Logistic | — | Heavy tails, symmetric, longer tails than Normal |
| `dt` | Student's t | `nu` (default 2) | Heavy tails, small samples; nu→∞ approaches Normal |
| `dalaplace` | Asymmetric Laplace | `alpha` (default 0.5, range 0–1) | Quantile regression; alpha=0.5 is symmetric Laplace |
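The special cases listed for the Generalized Normal can be checked numerically. The sketch below uses only the Python standard library and the textbook parameterization of the density, `shape / (2 * scale * Gamma(1/shape)) * exp(-(|x - loc| / scale) ** shape)`; this parameterization is an assumption and may differ from the one greybox uses internally. The helper names `gnorm_pdf`, `laplace_pdf`, and `normal_pdf` are hypothetical and not part of greybox.

```python
import math


def gnorm_pdf(x, loc=0.0, scale=1.0, shape=2.0):
    """Generalized Normal density (textbook parameterization, assumed here)."""
    z = abs(x - loc) / scale
    return shape / (2.0 * scale * math.gamma(1.0 / shape)) * math.exp(-(z ** shape))


def laplace_pdf(x, loc=0.0, scale=1.0):
    """Laplace density with the given location and scale."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


def normal_pdf(x, loc=0.0, sd=1.0):
    """Normal density with the given mean and standard deviation."""
    z = (x - loc) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


x = 0.7
# shape=1 reproduces the Laplace density with the same scale
laplace_gap = abs(gnorm_pdf(x, scale=1.5, shape=1.0) - laplace_pdf(x, scale=1.5))
# shape=2 reproduces a Normal density whose standard deviation is scale / sqrt(2)
normal_gap = abs(
    gnorm_pdf(x, scale=1.5, shape=2.0) - normal_pdf(x, sd=1.5 / math.sqrt(2.0))
)
print(laplace_gap, normal_gap)
```

Note that under this parameterization the Normal case has standard deviation `scale / sqrt(2)`, so the scale of a fitted Generalized Normal is not directly comparable to a Normal standard deviation.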
These model positive continuous data. The location parameter operates on the log scale.
| Code | Full Name | Extra Parameter | Use Case |
|---|---|---|---|
| `dlnorm` | Log-Normal | — | Positive, right-skewed (e.g. prices, durations) |
| `dllaplace` | Log-Laplace | — | Positive, heavy-tailed |
| `dls` | Log-S | — | Positive, light-tailed |
| `dlgnorm` | Log-Generalized Normal | `shape` (default 2.0) | Positive data, flexible tails |
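What "location on the log scale" means for the Log-Normal case can be sketched with the standard library alone. The data below are made up for illustration, and the code does not call greybox: it estimates the location and scale from `log(y)` and then maps them back to the original scale.

```python
import math
import statistics

# Made-up positive, right-skewed observations (illustrative only)
y = [12.0, 15.5, 9.8, 30.2, 18.1, 22.4]

log_y = [math.log(v) for v in y]
mu = statistics.fmean(log_y)     # location, estimated on the log scale
sigma = statistics.stdev(log_y)  # scale, estimated on the log scale

# Back on the original scale, exp(mu) is the median of the Log-Normal,
# while the mean also depends on sigma.
median_y = math.exp(mu)
mean_y = math.exp(mu + 0.5 * sigma ** 2)
print(median_y, mean_y)
```

Because of the right skew, the implied mean always exceeds the implied median, which is worth keeping in mind when interpreting fitted values from a log-scale model.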
| Code | Full Name | Extra Parameter | Use Case |
|---|---|---|---|
| `dfnorm` | Folded Normal | — | Absolute values, non-negative data |
| `drectnorm` | Rectified Normal | — | Zero-inflated non-negative (zeros are structural) |
| `dbcnorm` | Box-Cox Normal | `lambda_bc` (default 0.1, range 0–1) | Non-normal data, power transformation |
| `dlogitnorm` | Logit-Normal | — | Proportions in (0, 1) |
| `dbeta` | Beta | — | Proportions in (0, 1), two-part model (shape1 + shape2) |
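The power transformation behind the Box-Cox Normal can be illustrated with a short standard-library sketch. It uses the usual Box-Cox definition `(y ** lam - 1) / lam`, with the log transform as the limit when `lam` is 0. The functions `boxcox` and `inv_boxcox` are hypothetical helpers written for this example, not greybox functions.

```python
import math


def boxcox(y, lam):
    """Box-Cox transform of a positive value; log(y) in the limit lam -> 0."""
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inv_boxcox(z, lam):
    """Inverse Box-Cox transform, mapping back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)


lam = 0.1  # the default lambda_bc listed in the table
y = 25.0
z = boxcox(y, lam)          # value on the transformed scale
y_back = inv_boxcox(z, lam)  # round trip back to the original scale
print(z, y_back)
```

With a small `lam` such as 0.1 the transform is already close to a logarithm, which is why the Box-Cox Normal behaves similarly to the Log-Normal near that default.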
These use log-link: mu = exp(X @ beta), so coefficients are initialized from lstsq(X, log(y)).
| Code | Full Name | Extra Parameter | Use Case |
|---|---|---|---|
| `dinvgauss` | Inverse Gaussian | — | Positive, right-skewed (e.g. waiting times) |
| `dgamma` | Gamma | — | Positive, right-skewed (e.g. insurance claims) |
| `dexp` | Exponential | — | Time between events, memoryless |
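The log-link initialization described above amounts to an ordinary least-squares fit of `log(y)` on the regressors. The sketch below shows the idea for a single regressor with an intercept, using closed-form OLS in plain Python rather than a `lstsq` call. The data are synthetic and noise-free, and the code illustrates the principle only; it is not greybox's actual initialization routine.

```python
import math

# Synthetic, noise-free data generated as y = exp(1.0 + 0.5 * x)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [math.exp(1.0 + 0.5 * xi) for xi in x]

# Regress log(y) on x with an intercept (closed-form simple OLS)
log_y = [math.log(v) for v in y]
n = len(x)
x_bar = sum(x) / n
ly_bar = sum(log_y) / n
slope = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = ly_bar - slope * x_bar

# Under the log-link, fitted means are mu = exp(X @ beta)
mu = [math.exp(intercept + slope * xi) for xi in x]
print(intercept, slope)
```

On real data these values would only serve as starting points for the likelihood optimizer, since least squares on `log(y)` is not the maximum-likelihood estimate for these distributions.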
Count data distributions using log-link: mu = exp(X @ beta).
| Code | Full Name | Extra Parameter | Use Case |
|---|---|---|---|
| `dpois` | Poisson | — | Count data where mean ≈ variance |
| `dnbinom` | Negative Binomial | `size` (default `var(y)`) | Overdispersed count data (variance > mean) |
| `dbinom` | Binomial | — | Binary/count with known number of trials |
| `dgeom` | Geometric | — | Number of trials until first success |
| `dchisq` | Chi-squared | `nu` (default 1) | Sum of squared normal variables |
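A quick way to choose between the Poisson and the Negative Binomial is to compare the sample variance with the sample mean. The sketch below does this with the standard library on two made-up count series. `dispersion_ratio` is a hypothetical helper, and the ratio is only a rough diagnostic because it ignores any regressors.

```python
import statistics


def dispersion_ratio(counts):
    """Sample variance divided by sample mean; values near 1 suit Poisson."""
    return statistics.variance(counts) / statistics.fmean(counts)


# Made-up count series (illustrative only)
counts_a = [1, 2, 3, 3, 1, 5, 6, 3]     # variance close to the mean
counts_b = [0, 0, 1, 12, 0, 7, 0, 20]   # variance far above the mean

ratio_a = dispersion_ratio(counts_a)
ratio_b = dispersion_ratio(counts_b)
print(ratio_a, ratio_b)
```

A ratio well above 1, as in the second series, points to overdispersion and therefore towards `dnbinom` rather than `dpois`.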
These model binary (0/1) outcomes.
| Code | Full Name | Link | Use Case |
|---|---|---|---|
| `plogis` | Logistic CDF | logistic | Binary classification (logistic regression) |
| `pnorm` | Normal CDF (probit) | probit | Binary classification (probit regression) |
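The two binary models differ only in the CDF that maps the linear predictor to a probability. The following standard-library sketch compares them; `logistic_cdf` and `normal_cdf` are hypothetical helper names chosen for this example, and `eta` stands for a value of `X @ beta`.

```python
import math
from statistics import NormalDist


def logistic_cdf(eta):
    """Logistic CDF: maps a linear predictor to a probability (logit model)."""
    return 1.0 / (1.0 + math.exp(-eta))


def normal_cdf(eta):
    """Standard Normal CDF: maps a linear predictor to a probability (probit)."""
    return NormalDist().cdf(eta)


# Both CDFs give 0.5 at eta = 0; the Normal CDF approaches 0 and 1 faster
for eta in (-2.0, 0.0, 2.0):
    print(eta, logistic_cdf(eta), normal_cdf(eta))
```

Because the logistic curve has heavier tails, coefficients from the two models are on different scales even when the fitted probabilities are similar.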
Some distributions require an additional parameter beyond the standard location and scale. If not provided by the user, ALM estimates it automatically.
| Distribution | Parameter | R argument | Python argument | Default (if estimated) |
|---|---|---|---|---|
| `dalaplace` | Quantile level | `alpha` | `alpha` | `0.5` |
| `dgnorm` | Shape | `shape` | `shape` | `2.0` |
| `dlgnorm` | Shape | `shape` | `shape` | `2.0` |
| `dbcnorm` | Box-Cox lambda | `lambdaBC` | `lambda_bc` | `0.1` |
| `dt` | Degrees of freedom | `nu` | `nu` | `2` |
| `dchisq` | Degrees of freedom | `nu` | `nu` | `1` |
| `dnbinom` | Size | `size` | `size` | `var(y)` |
| `dfnorm` | — | — | — | Estimated as `sd(y)` |
| `drectnorm` | — | — | — | Estimated as `sd(y)` |
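To illustrate what the `alpha` parameter of `dalaplace` controls: with `alpha` held fixed, maximizing the Asymmetric Laplace likelihood over the location is equivalent to minimizing the pinball (check) loss, whose minimizer is the `alpha`-quantile. The sketch below demonstrates this on a toy sample with an intercept-only "model". `pinball_loss` is a hypothetical helper, and the brute-force search over candidate values is for illustration only, not how ALM optimizes.

```python
def pinball_loss(y, q, alpha):
    """Average pinball (check) loss of a candidate quantile q at level alpha."""
    total = 0.0
    for v in y:
        if v >= q:
            total += alpha * (v - q)          # under-prediction penalty
        else:
            total += (1.0 - alpha) * (q - v)  # over-prediction penalty
    return total / len(y)


# Toy sample: the values 1, 2, ..., 11
y = [float(v) for v in range(1, 12)]

# Search the observed values for the location that minimizes the loss
best_median = min(y, key=lambda q: pinball_loss(y, q, 0.5))
best_q90 = min(y, key=lambda q: pinball_loss(y, q, 0.9))
print(best_median, best_q90)
```

Setting `alpha=0.5` weights both sides equally and recovers the median, which matches the note that `alpha=0.5` corresponds to the symmetric Laplace.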
The greybox.distributions module provides d/p/q/r functions for each distribution:
- d: density (PDF/PMF), `dnorm(x, loc=0, scale=1)`
- p: cumulative distribution (CDF), `pnorm(q, loc=0, scale=1)`
- q: quantile (inverse CDF), `qnorm(p, loc=0, scale=1)`
- r: random generation, `rnorm(n, loc=0, scale=1)`
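How the four function types relate to each other can be sketched without greybox, using `statistics.NormalDist` from the Python standard library as a stand-in for the Normal family. This is only an analogy for the d/p/q/r convention; the method names below belong to the standard library, not to `greybox.distributions`.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

density = std_normal.pdf(0.0)             # "d": density at x = 0
prob = std_normal.cdf(1.96)               # "p": P(X <= 1.96)
quantile = std_normal.inv_cdf(0.975)      # "q": value with 97.5% of mass below
draws = std_normal.samples(100, seed=42)  # "r": 100 random draws

# The quantile function inverts the CDF, so q(p(x)) returns x
round_trip = std_normal.inv_cdf(prob)
print(density, prob, quantile, round_trip)
```

The round trip at the end is the key relationship: the q function undoes the p function, while d is the derivative of p for continuous distributions.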
```python
from greybox import distributions as dist

# Normal distribution
dist.dnorm(0, loc=0, scale=1)      # density at x=0
dist.pnorm(1.96, loc=0, scale=1)   # CDF at q=1.96
dist.qnorm(0.975, loc=0, scale=1)  # quantile at p=0.975
dist.rnorm(100, loc=0, scale=1)    # 100 random draws

# Laplace distribution
dist.dlaplace(0, loc=0, scale=1)
dist.plaplace(0, loc=0, scale=1)

# Generalized Normal
dist.dgnorm(0, loc=0, scale=1, shape=2)
```

```r
# R — Laplace regression
model <- alm(y ~ x1 + x2, data, distribution="dlaplace")

# R — Quantile regression (median)
model <- alm(y ~ x1 + x2, data, distribution="dalaplace", alpha=0.5)

# R — Poisson count model
model <- alm(count ~ x1 + x2, data, distribution="dpois")
```

```python
# Python — Laplace regression
from greybox import ALM, formula

y, X = formula("y ~ x1 + x2", data)
model = ALM(distribution="dlaplace")
model.fit(X, y)

# Python — Quantile regression (90th percentile)
model = ALM(distribution="dalaplace", alpha=0.9)
model.fit(X, y)

# Python — Poisson count model
model = ALM(distribution="dpois")
model.fit(X, y)
```

- Svetunkov, I. (2023). Statistics for Business Analytics. https://openforecast.org/sba/