How to Perform Quantile Regression in Python

I run into the same tension every time I analyze real-world data: the average tells one story, but the edges tell the story I actually need. If you’re modeling delivery times, patient wait times, ad latency, or emissions per mile, the mean can look fine while the slowest 10% or the cleanest 20% behave very differently. That’s where quantile regression earns its place. It lets you model the 10th, 50th, 70th, or 95th percentile directly instead of guessing them from a mean-based model.

In this post I’ll show you how to perform quantile regression in Python using a clean, runnable workflow. I’ll start with an intuitive view of what quantile regression does, then walk through a concrete dataset, fit multiple quantiles, and visualize results. I’ll also cover when you should use it, when you shouldn’t, common mistakes I see in production code reviews, and performance considerations for bigger datasets in 2026-style workflows. By the end, you’ll have a reusable template you can drop into your own analysis.

Why quantiles beat averages for messy reality

When I explain quantile regression to newer engineers, I use a simple analogy: imagine the average height of waves in the ocean. If you build a boat based on the average wave height, you’ll be unprepared for the biggest waves that actually swamp you. A mean-based regression gives you the “average wave,” but quantile regression lets you model the wave you actually care about—maybe the 90th percentile if safety matters.

Linear regression estimates the conditional mean of a response variable given predictors. Quantile regression estimates a conditional quantile, such as the 50th percentile (median) or the 90th percentile (tail risk). This difference matters anytime your data has skew, heavy tails, or unequal variance (heteroscedasticity). In practice, those are the default conditions in engineering datasets, not the exception.

Here’s the practical payoff: you can answer questions like “What emission level should I expect for the worst 30% of trips at this distance?” or “What is the typical response time for the fastest 20% of requests?” And you can do it with a model that is still linear and interpretable.

Building a small dataset you can run today

I like to teach with a small, clear dataset first. Let’s create a synthetic dataset that mimics total distance traveled and total emissions generated by 20 cars. I’ll add noise that scales with distance to mimic the real world where variability increases as you go farther.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Reproducibility
np.random.seed(0)

# Sample size
rows = 20

# Predictors
distance = np.random.uniform(1, 10, rows)

# Response with distance-dependent noise
emission = 40 + distance + np.random.normal(loc=0, scale=0.25 * distance, size=rows)

# DataFrame
df = pd.DataFrame({"Distance": distance, "Emission": emission})
print(df.head())

You should see a DataFrame with two columns. This data is small on purpose: it’s easy to visualize and keeps the mental model clean. The next step is to fit a quantile regression model.

How quantile regression actually fits the line

Under the hood, quantile regression solves a different objective than ordinary least squares (OLS). OLS minimizes the sum of squared residuals, which gives you the conditional mean. Quantile regression minimizes an asymmetric loss called the “check” function. For the 70th percentile, it weights under-predictions more heavily than over-predictions so that 70% of points end up below the fitted line.

You don’t need to hand-derive it to use it, but you should understand the intuition: the model is trying to place the line so that a chosen fraction of the data falls below it, while still keeping the line as “close” as possible under that asymmetric loss.
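If you want to see the asymmetry in code, here's a minimal sketch of the check (pinball) loss. The helper name `pinball_loss` is mine, not a statsmodels API; it just shows how under- and over-predictions get different weights:

```python
import numpy as np

def pinball_loss(residuals, q):
    # Under-predictions (positive residuals) get weight q;
    # over-predictions (negative residuals) get weight 1 - q.
    residuals = np.asarray(residuals, dtype=float)
    return np.mean(np.where(residuals >= 0, q * residuals, (q - 1) * residuals))

# For q = 0.7, an under-prediction of 2 costs 0.7 * 2,
# while an over-prediction of 2 costs only 0.3 * 2.
print(pinball_loss([2.0, -2.0], 0.7))
```

Minimizing this average over all points is what pushes the fitted line up until roughly 70% of observations sit below it.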

In statsmodels, quantile regression is accessible through quantreg in the formula API. You set the quantile you want with the q parameter. Here’s a runnable example for the 70th percentile.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(0)
rows = 20
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(loc=0, scale=0.25 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

# Fit 70th percentile regression
model = smf.quantreg("Emission ~ Distance", df).fit(q=0.7)
print(model.summary())

The fitted equation has the form:

Emission = intercept + slope * Distance

That equation is not “the average emission.” It’s the 70th percentile estimate. If the slope is 1.3, that means when distance rises by 1 unit, the 70th percentile emission rises by about 1.3 units. That’s a very different statement than a mean-based claim, especially when your data has wide tails.
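You don't have to read the coefficients off the summary table either; `model.predict` gives fitted 70th-percentile values directly. A small sketch against the same synthetic data (the exact numbers depend on the seed, so I don't hard-code them):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(0)
rows = 20
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(loc=0, scale=0.25 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

model = smf.quantreg("Emission ~ Distance", df).fit(q=0.7)

# Predict the 70th-percentile emission for a few new distances.
new = pd.DataFrame({"Distance": [2.0, 5.0, 8.0]})
pred = model.predict(new)
print(pred)

# Same numbers, reconstructed by hand from the fitted coefficients.
manual = model.params["Intercept"] + model.params["Distance"] * new["Distance"]
print(np.allclose(pred, manual))
```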

Plotting the fitted quantile against the data

Numbers are useful, but a plot makes the idea stick. When I review quantile regression outputs with a team, I always plot at least one quantile line with the raw data points. That makes it obvious which percentile you’re modeling and helps catch mistakes fast.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

np.random.seed(0)
rows = 20
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(loc=0, scale=0.25 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

model = smf.quantreg("Emission ~ Distance", df).fit(q=0.7)

# Prepare line for plot
x = np.linspace(df["Distance"].min(), df["Distance"].max(), 100)
y = model.params["Intercept"] + model.params["Distance"] * x

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(df["Distance"], df["Emission"], alpha=0.4, label="Data")
ax.plot(x, y, color="black", label="70th percentile")
ax.set_xlabel("Distance Traveled")
ax.set_ylabel("Emission Generated")
ax.legend()
plt.tight_layout()
plt.show()

If the line tracks the upper third of the points, your quantile is doing what you want. If it looks off, your issue is usually data quality, scaling, or a mismatch between model form and reality.
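Beyond eyeballing the plot, a quick numeric sanity check: the share of observations at or below the fitted line should sit near the target quantile. This sketch bumps the sample size to 200 so the coverage estimate is stable:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(0)
rows = 200
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(loc=0, scale=0.25 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

model = smf.quantreg("Emission ~ Distance", df).fit(q=0.7)

# Empirical coverage: fraction of points on or below the fitted quantile line.
fitted = model.predict(df)
coverage = np.mean(df["Emission"] <= fitted)
print(f"Target: 0.70, empirical: {coverage:.2f}")
```

If the empirical coverage is far from the target, suspect a data problem or a badly mis-specified model before blaming the solver.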

Modeling multiple quantiles to see the full shape

I rarely stop at a single quantile in practice. The 10th, 50th, and 90th percentiles together show the spread of the outcome across your predictor range. This is especially useful if variability grows with distance or time.

Here’s a clean pattern I use to fit several quantiles and plot them together.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

np.random.seed(42)
rows = 120
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(loc=0, scale=0.35 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

# Fit one model per quantile
quantiles = [0.1, 0.5, 0.9]
models = {}
for q in quantiles:
    models[q] = smf.quantreg("Emission ~ Distance", df).fit(q=q)

x = np.linspace(df["Distance"].min(), df["Distance"].max(), 200)

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(df["Distance"], df["Emission"], alpha=0.25, label="Data")
for q, model in models.items():
    y = model.params["Intercept"] + model.params["Distance"] * x
    ax.plot(x, y, label=f"{int(q * 100)}th percentile")
ax.set_xlabel("Distance Traveled")
ax.set_ylabel("Emission Generated")
ax.legend()
plt.tight_layout()
plt.show()

When those quantile lines spread out as distance increases, that’s a visual sign of heteroscedasticity. Mean-based regression hides this, while quantiles expose it. In a code review, I treat this as evidence that a mean-only model will mislead users about risk or performance extremes.

Interpreting coefficients without fooling yourself

This is the section I wish more people read carefully. Quantile regression coefficients are conditional quantile effects, not average effects. A slope of 2.0 at the 90th percentile means: “when the predictor increases by one unit, the 90th percentile of the response increases by about two units, holding other predictors constant.” It does not mean the average response increases by two units. That distinction is easy to forget when you’re scanning a table of coefficients.

Two practical habits help avoid confusion:

  • Always label the quantile in your output. I literally prefix coefficient tables with “Q90” or “Q50.”
  • When communicating results, include a phrase like “at the 90th percentile” in the sentence. It feels repetitive, but it eliminates misunderstandings.

If you’re reporting multiple quantiles, it’s useful to align them in a table so readers can compare slopes across quantiles and see how effects change across the distribution.
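One way I build that comparison table, refitting the same three quantiles from the earlier example and putting one column per quantile:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(42)
rows = 120
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(loc=0, scale=0.35 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

quantiles = [0.1, 0.5, 0.9]
models = {q: smf.quantreg("Emission ~ Distance", df).fit(q=q) for q in quantiles}

# Align coefficients side by side: rows are terms, columns are quantiles.
table = pd.DataFrame({f"Q{int(q * 100)}": m.params for q, m in models.items()})
print(table.round(3))
```

Reading across the `Distance` row tells you immediately whether the slope grows toward the upper tail, which is the heteroscedasticity story in one line.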

A quick comparison with OLS (side-by-side)

I often show both OLS and quantile regression in the same notebook to make the differences obvious. Here’s a compact way to do that with the same dataset.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(12)
rows = 200
distance = np.random.uniform(1, 12, rows)
emission = 35 + 1.1 * distance + np.random.normal(loc=0, scale=0.35 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

ols = smf.ols("Emission ~ Distance", df).fit()
q50 = smf.quantreg("Emission ~ Distance", df).fit(q=0.5)
q90 = smf.quantreg("Emission ~ Distance", df).fit(q=0.9)

print("OLS:", ols.params)
print("Q50:", q50.params)
print("Q90:", q90.params)

The three fits won't be identical even though the underlying relationship is linear. As you move to higher quantiles, the slope often increases because the noise distribution widens with distance. This is exactly the story you want to tell if the tails are operationally important.

A deeper, more realistic example (with multiple predictors)

Single-variable examples are great for intuition, but most real workflows use multiple predictors. Here’s a more realistic setup: we model delivery time as a function of distance, number of stops, and traffic index. Notice how we construct the synthetic data so variability increases with traffic and stops—this is the kind of heteroscedasticity that breaks mean-only thinking.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(21)
rows = 400

# Predictors
distance = np.random.uniform(1, 30, rows)
stops = np.random.randint(1, 8, rows)
traffic = np.random.uniform(0, 1, rows)

# Response: base + linear effects + noise that grows with traffic and stops
base = 12
noise = np.random.normal(loc=0, scale=1.5 + 2.5 * traffic + 0.3 * stops, size=rows)
travel_time = base + 0.8 * distance + 1.2 * stops + 4.0 * traffic + noise

# Build dataset
ship = pd.DataFrame({
    "Distance": distance,
    "Stops": stops,
    "Traffic": traffic,
    "Time": travel_time
})

# Fit multiple quantiles
quantiles = [0.1, 0.5, 0.9]
models = {q: smf.quantreg("Time ~ Distance + Stops + Traffic", ship).fit(q=q) for q in quantiles}

for q, m in models.items():
    print(f"Q{int(q * 100)}:")
    print(m.params)
    print("---")

This example shows you how quantile regression behaves when multiple features interact. Watch for coefficients that change across quantiles. In real datasets, traffic might have a small effect on the median but a huge effect on the 90th percentile, because congestion affects late deliveries disproportionately. Quantile regression makes that visible.

Visualizing multiple predictors (practical plotting trick)

Plotting with multiple predictors can get messy. One trick I use is to fix all but one predictor to their median values and then plot the fitted lines for different quantiles. It’s not perfect, but it gives stakeholders a clean “slice” of the model.

import numpy as np
import matplotlib.pyplot as plt

# Fix values for Stops and Traffic
stops_fixed = ship["Stops"].median()
traffic_fixed = ship["Traffic"].median()

x = np.linspace(ship["Distance"].min(), ship["Distance"].max(), 200)

fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(ship["Distance"], ship["Time"], alpha=0.15, label="Data")
for q, m in models.items():
    y = (m.params["Intercept"] +
         m.params["Distance"] * x +
         m.params["Stops"] * stops_fixed +
         m.params["Traffic"] * traffic_fixed)
    ax.plot(x, y, label=f"{int(q * 100)}th percentile")
ax.set_xlabel("Distance")
ax.set_ylabel("Delivery Time")
ax.legend()
plt.tight_layout()
plt.show()

This approach keeps the story simple: “Here’s how distance affects delivery time at different percentiles when stops and traffic are typical.” It’s a good compromise when you want interpretability without building 3D plots.

Choosing the right quantiles (not just 10/50/90)

The right quantiles are about the business question, not statistical tradition. I often use:

  • 0.5 when stakeholders want “typical” behavior.
  • 0.75 or 0.9 when they care about bad cases but not catastrophic outliers.
  • 0.95 or 0.99 when strict SLAs or safety constraints exist.
  • 0.1 or 0.2 when you want to understand best-case behavior or top performers.

In reliability engineering or customer support, 0.9 and 0.95 are common. In financial risk, 0.95 or 0.99 might be necessary. In education or health datasets, 0.1 and 0.9 help reveal the distribution of outcomes across students or patients.

My rule of thumb: pick quantiles that map to decisions. If you’re not sure what decision a quantile supports, you probably don’t need it.

When quantile regression is the right tool

In my experience, quantile regression is the right call when you need insight into tails, not just averages. A few concrete cases:

  • Service-level analysis: If your SLA is 95th percentile response time, you should model the 95th percentile, not the mean.
  • Pricing and risk: If you need to plan for worst-case cost, you want upper quantiles rather than mean projections.
  • Human behavior: In education, health, or engagement metrics, the low and high performers carry the real story.
  • Heteroscedastic data: When variability increases with your predictor, quantile regression gives a fuller picture.

I also like it for explainability. A quantile line is still linear, still interpretable, and still communicates a simple slope, just anchored to a percentile instead of the mean.

When you should not use it

Quantile regression is not a general replacement for everything. I avoid it when:

  • The dataset is tiny and I need stable estimates. Quantile regression can be noisy with very small samples.
  • The relationship is strongly non-linear and I’m unwilling to transform or expand features.
  • I’m working with heavy multicollinearity and need robust inference rather than percentile-specific behavior.
  • The team needs a single, simple forecast and the tails are not relevant.

If your goal is only the expected average outcome, OLS is faster and usually more stable. That’s a reasonable trade-off in some production systems, especially when you’re building a first pass.

Edge cases that break naïve implementations

Here are the edge cases that commonly cause confusing results or failed convergence, and how I handle them:

  • Tiny datasets: With fewer than ~30 observations, quantiles can be unstable because you’re effectively slicing very few points. Solution: aggregate more data, reduce the number of quantiles, or switch to robust OLS as a stopgap.
  • High-leverage outliers: Extreme points can warp upper quantiles. Solution: inspect leverage, consider trimming or winsorizing, and always plot your data.
  • Collinearity: Strong correlation between predictors makes any regression unstable, but it’s worse when you’re looking at tails. Solution: reduce redundant features or use domain-driven feature selection.
  • Domain constraints: If the response can’t be negative, a linear quantile model can still predict negatives at low quantiles. Solution: transform the response (e.g., log) or add a floor in post-processing.
  • Sparse regions: If there are gaps in predictor space, quantile lines can swing wildly. Solution: use splines or segment the data by meaningful ranges.

The common theme: quantile regression exposes data problems that mean-based models hide. That’s a feature, not a bug, but you still need to handle those issues thoughtfully.
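For the domain-constraint case above, here's a sketch of the post-processing floor. (The log-transform route works too, because quantiles are preserved under monotone transforms, but the floor is the simplest fix to show.) The synthetic data here is mine, chosen so a low quantile can dip negative near zero distance:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(3)
rows = 200
distance = np.random.uniform(0, 10, rows)
# A response that can never be negative in reality.
emission = np.maximum(0.0, distance + np.random.normal(0, 1.5, rows))
df = pd.DataFrame({"Distance": distance, "Emission": emission})

# A linear 10th-percentile model can still predict negatives near Distance = 0.
m = smf.quantreg("Emission ~ Distance", df).fit(q=0.1)
raw_pred = m.predict(df)

# Post-processing floor: clamp predictions into the valid domain.
pred = np.clip(raw_pred, 0, None)
print((pred >= 0).all())
```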

Common mistakes I see in real projects

Here are the pitfalls that show up most often when I review quantile regression code in production repositories:

  • Forgetting to scale or normalize inputs: Large feature scales can slow down or destabilize the solver. For very large or very small values, standardizing helps.
  • Interpreting coefficients as mean effects: A 90th percentile slope is not an average slope. It’s the slope at the 90th percentile of the response distribution.
  • Mixing training and evaluation data: I’ve seen people plot quantile lines against a filtered dataset and misread the fit quality. Always check against the full data.
  • Using a single quantile to claim full distribution behavior: One quantile is a single slice. Use at least two or three if you care about distribution shape.
  • Ignoring constraints of the domain: If emission can’t be negative, a linear model might still predict negatives for low values. Consider transformations or constraints when needed.

Fixing these issues usually takes minutes, and it makes your results far easier to trust.
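For the scaling point in particular, a minimal sketch: standardize the predictor before fitting, so the slope reads as "change per standard deviation" and the solver works with well-conditioned numbers. The extreme feature scale here is deliberately exaggerated for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(5)
rows = 200
distance = np.random.uniform(1000, 50000, rows)  # deliberately large scale
emission = 40 + 0.001 * distance + np.random.normal(0, 5, rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

# Standardize: zero mean, unit standard deviation.
df["DistanceZ"] = (df["Distance"] - df["Distance"].mean()) / df["Distance"].std()

# The slope on DistanceZ is "emission change per one SD of distance".
m = smf.quantreg("Emission ~ DistanceZ", df).fit(q=0.5)
print(m.params)
```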

A practical workflow: data cleaning and feature checks

Quantile regression is sensitive to data quality. Here’s a minimal checklist I follow before fitting models:

  • Verify there are no impossible values (negative time, negative distance, etc.).
  • Check missing values and decide whether to impute or drop.
  • Inspect distributions of each predictor and the response.
  • Plot response vs. key predictors to visually confirm the relationship shape.
  • Confirm that outliers are real phenomena, not logging errors.

A little pre-flight work saves hours of debugging and prevents the “quantile line looks weird” surprise.
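The first items on that checklist translate directly into a few lines of pandas. This is a sketch against a stand-in DataFrame; swap in your own columns and domain rules:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
rows = 50
df = pd.DataFrame({
    "Distance": np.random.uniform(1, 10, rows),
    "Emission": np.random.uniform(40, 60, rows),
})

# 1) No impossible values (domain rules: both must be non-negative).
assert (df["Distance"] > 0).all(), "non-positive distance found"
assert (df["Emission"] >= 0).all(), "negative emission found"

# 2) Missing values: report counts, then decide to impute or drop.
missing = df.isna().sum()
print(missing)

# 3) Quick look at distributions before any fitting.
print(df.describe())
```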

Bootstrap confidence intervals (when you need uncertainty)

Quantile regression doesn’t always give simple closed-form inference like OLS. When I need uncertainty bounds, I often use bootstrap resampling. It’s slower, but it’s robust and easy to explain.

Here’s a compact example that bootstraps the slope for the 90th percentile:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(101)
rows = 300
distance = np.random.uniform(1, 15, rows)
emission = 45 + 0.9 * distance + np.random.normal(loc=0, scale=0.4 * distance, size=rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

q = 0.9
boot = 300
slopes = []
for _ in range(boot):
    # Resample rows with replacement and refit the quantile model
    sample = df.sample(n=len(df), replace=True)
    m = smf.quantreg("Emission ~ Distance", sample).fit(q=q)
    slopes.append(m.params["Distance"])

slopes = np.array(slopes)
ci_low, ci_high = np.percentile(slopes, [2.5, 97.5])
print(f"Q{int(q * 100)} slope CI: [{ci_low:.3f}, {ci_high:.3f}]")

Use this when you’re presenting results to stakeholders who want “how sure are we?” or when you’re comparing quantile slopes across groups.

Grouped quantile regression (a common real-world need)

A typical production scenario: you have multiple segments (regions, device types, user cohorts) and you want quantiles for each group. You can do this with a simple loop. Here’s a pattern I actually use.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(22)
rows = 600
region = np.random.choice(["North", "South", "West"], size=rows)
distance = np.random.uniform(1, 20, rows)
traffic = np.random.uniform(0, 1, rows)

# Region-specific baseline and noise
baseline = {"North": 10, "South": 12, "West": 9}
noise_scale = {"North": 1.5, "South": 2.0, "West": 1.0}
travel_time = (
    0.7 * distance + 3.0 * traffic +
    np.array([baseline[r] for r in region]) +
    np.random.normal(0, [noise_scale[r] for r in region], size=rows)
)

ship = pd.DataFrame({
    "Region": region,
    "Distance": distance,
    "Traffic": traffic,
    "Time": travel_time
})

# One 90th-percentile model per region
q = 0.9
models = {}
for r in ship["Region"].unique():
    sub = ship[ship["Region"] == r]
    models[r] = smf.quantreg("Time ~ Distance + Traffic", sub).fit(q=q)

for r, m in models.items():
    print(r, m.params)

This workflow makes it easy to compare tail behavior across regions. It’s also a great way to spot operational differences that are invisible to average-based models.

Alternative approaches and why quantile regression still matters

It’s worth acknowledging other methods that sometimes get used for similar goals:

  • Gradient boosting with quantile loss: Great for non-linear effects and complex interactions. It can outperform linear quantile regression on prediction accuracy but is less interpretable.
  • Quantile random forests: Useful for non-linear quantiles without heavy tuning, but can be computationally expensive.
  • Distributional regression (e.g., modeling mean and variance): Provides a full distribution if you can assume a parametric form. This can be elegant but fragile when assumptions break.

I still default to linear quantile regression when interpretability and speed matter. It’s the tool I use to quickly answer “what’s happening at the tails?” without building a full ML pipeline.
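For reference, here's what the gradient-boosting alternative looks like in scikit-learn. `loss="quantile"` with `alpha` set to the target quantile is the real API; the other hyperparameters here are placeholder choices, not tuned recommendations:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

np.random.seed(0)
rows = 300
X = np.random.uniform(1, 10, (rows, 1))
y = 40 + X[:, 0] + np.random.normal(0, 0.25 * X[:, 0], rows)

# alpha is the target quantile; loss="quantile" switches to the pinball objective.
gbr = GradientBoostingRegressor(loss="quantile", alpha=0.9,
                                n_estimators=200, max_depth=2)
gbr.fit(X, y)

# Same sanity check as with the linear model: in-sample coverage near alpha.
pred = gbr.predict(X)
coverage = np.mean(y <= pred)
print(f"Empirical coverage at alpha=0.9: {coverage:.2f}")
```

You lose the clean "slope per unit" interpretation, but you gain non-linear quantile curves with almost no feature engineering.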

Performance considerations for larger datasets (2026 context)

Quantile regression is heavier than OLS. With 100,000 rows, you’ll feel the difference. In my experience, you can expect a simple linear quantile regression to run in the tens to hundreds of milliseconds on a modern laptop for a small dataset, but it can stretch into seconds for big datasets and many quantiles. The cost grows with the number of quantiles and predictors.

Here’s how I keep it practical in modern pipelines:

  • Batch your quantiles: Fit a small set of key quantiles (for example 0.1, 0.5, 0.9) rather than every 5%. You can expand later.
  • Reduce features early: Use domain knowledge to drop irrelevant columns. Feature pruning matters more here than with OLS.
  • Work with samples for exploration: Fit on a 10-20% sample during prototyping, then scale to full data for final results.
  • Automate plots in notebooks: In 2026 workflows, I typically run this in JupyterLab or VS Code notebooks with a small plotting helper to avoid repeating setup code.

If you need high-throughput quantile estimation, you might pair a faster approximate method with full quantile regression for final validation. But for most engineering datasets, statsmodels still holds up well for linear quantile regression.

Traditional vs modern workflow choices

When teams ask me how to structure their workflow, I present a simple comparison. It helps keep decisions clear without overthinking.

| Approach | Best for | Tools | Trade-off |
| --- | --- | --- | --- |
| Traditional OLS | Mean behavior | statsmodels OLS, scikit-learn LinearRegression | Faster, but misses tails |
| Quantile Regression | Tail or median modeling | statsmodels quantreg | Slower, but tail-aware |
| Quantile + Visualization | Communication and QA | statsmodels + matplotlib | More setup, but clearer results |

If you’re just starting, go straight to quantile regression when the problem is tail-focused. If you’re unsure, fit both OLS and a couple of quantiles and compare the lines. The mismatch tells you what the mean is hiding.

A production-ready template I actually use

Below is a single script you can drop into a project. It creates data, fits a few quantiles, prints a compact summary, and plots the results. I keep it in a scratch file and reuse it whenever I need a quick quantile scan.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

# 1) Data
np.random.seed(7)
rows = 200
distance = np.random.uniform(1, 12, rows)
noise = np.random.normal(loc=0, scale=0.3 * distance, size=rows)
emission = 35 + 1.1 * distance + noise

# Create dataset
cars = pd.DataFrame({"Distance": distance, "Emission": emission})

# 2) Fit models
quantiles = [0.1, 0.5, 0.9]
models = {q: smf.quantreg("Emission ~ Distance", cars).fit(q=q) for q in quantiles}

# 3) Print compact summary
for q, m in models.items():
    intercept = m.params["Intercept"]
    slope = m.params["Distance"]
    print(f"Q{int(q * 100)}: Emission = {intercept:.3f} + {slope:.3f} * Distance")

# 4) Plot
x = np.linspace(cars["Distance"].min(), cars["Distance"].max(), 250)
fig, ax = plt.subplots(figsize=(10, 6))
ax.scatter(cars["Distance"], cars["Emission"], alpha=0.3, label="Data")
for q, m in models.items():
    y = m.params["Intercept"] + m.params["Distance"] * x
    ax.plot(x, y, label=f"{int(q * 100)}th percentile")
ax.set_xlabel("Distance Traveled")
ax.set_ylabel("Emission Generated")
ax.legend()
plt.tight_layout()
plt.show()

This script does what I need 90% of the time: it gives me the key quantile lines and a fast visual check. From there, I can decide whether to add feature interactions, transform the response, or split the data by group.

Monitoring and model validation in production

If you deploy quantile regression in a pipeline, you should monitor it like any other model. I keep a few basic checks:

  • Track residuals by quantile over time. If the 90th percentile model starts underpredicting systematically, it’s drifting.
  • Compare predicted quantiles to actual quantiles in recent windows. That’s the best sanity check.
  • Watch for changes in predictor distributions. Quantile regression is more sensitive to shifts in feature ranges.

These checks are simple to automate and prevent quietly degrading performance.
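The second check (predicted vs. actual quantiles in a recent window) is a few lines once you have predictions. Here's a sketch where the last 100 rows of a synthetic dataset stand in for the "recent window"; the split point and window size are arbitrary choices for illustration:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

np.random.seed(9)
rows = 500
distance = np.random.uniform(1, 10, rows)
emission = 40 + distance + np.random.normal(0, 0.3 * distance, rows)
df = pd.DataFrame({"Distance": distance, "Emission": emission})

# "Production" model fit on the first 400 rows; last 100 act as the recent window.
train, recent = df.iloc[:400], df.iloc[400:]
m = smf.quantreg("Emission ~ Distance", train).fit(q=0.9)

# Coverage in the recent window should stay near 0.9; sustained drops signal drift.
coverage = np.mean(recent["Emission"] <= m.predict(recent))
print(f"Recent-window coverage (target 0.90): {coverage:.2f}")
```

In a real pipeline you'd compute this per time window and alert when coverage leaves a tolerance band.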

Key takeaways and next steps

If you care about the average, linear regression is fine. If you care about the distribution, especially the tails, quantile regression is the tool I trust. It gives you interpretable lines for the percentiles that matter and reveals patterns that mean-only models hide. I’ve used it for latency budgets, emissions analysis, and cost forecasting; the pattern is the same every time: the tails move differently than the mean, and quantile regression makes that visible.

Your next step should be to pick a real dataset where you already know the mean is misleading. Fit the 10th, 50th, and 90th percentiles and plot them. If the lines diverge, you have evidence that tail behavior deserves its own model. If they stay tightly aligned, you can justify keeping a simpler approach.

From there, decide what your stakeholders actually care about: typical performance, best-case performance, or worst-case risk. Match your quantiles to that decision, communicate your results clearly, and you’ll have a model that tells the story the mean never could.
