
Implements "preprocessing frisch newton" algorithm#941

Merged
s3alfisc merged 36 commits into master from qreg-pfn
Jul 1, 2025


@s3alfisc
Member

@s3alfisc s3alfisc commented Jun 15, 2025

Implements the "preprocessing Frisch-Newton" algorithm as in Portnoy & Koenker (see Chernozhukov et al., Algorithm 1, for reference).

Questions:

  • Should "pfn" also be used when computing standard errors? Currently, this always defaults to "fn".
  • Should we block CRV inference with the pfn algo? If yes, should we use the same set of starting observations? Likely yes?
  • Since the method assumes independent observations (because of the independent sampling), should we block CRV inference? This could perhaps be generalized by using a block bootstrap in the preprocessing step. FYI @apoorvalal
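
For readers unfamiliar with the preprocessing idea, here is a minimal sketch of it. It is not the code in this PR. The function names (`check_loss`, `solve_qr`, `pfn_sketch`) are hypothetical, SciPy's `linprog` with the HiGHS solver stands in for the Frisch-Newton interior point step, and the residual band is taken from raw residual quantiles rather than the standardized residuals used by Portnoy & Koenker. The restart with a larger subsample when too many signs are mispredicted is omitted.

```python
import numpy as np
from scipy.optimize import linprog


def check_loss(resid, tau):
    """Quantile regression check loss, summed over observations."""
    return float(np.sum(resid * (tau - (resid < 0).astype(float))))


def solve_qr(X, y, tau):
    """Solve the quantile regression LP (stand-in for the Frisch-Newton solver)."""
    n, p = X.shape
    # Variables: beta (free), u_plus >= 0, u_minus >= 0,
    # subject to X @ beta + u_plus - u_minus = y.
    c = np.concatenate([np.zeros(p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])  # dense: only for small n
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]


def pfn_sketch(X, y, tau, seed=0):
    """Simplified preprocessing: subsample, glob the tails, fix wrong signs."""
    n, p = X.shape
    rng = np.random.default_rng(seed)

    # Step 1: fit on a random subsample of size about (n * p)^(2/3).
    m = min(n, int(np.ceil((n * p) ** (2 / 3))))
    idx = rng.choice(n, size=m, replace=False)
    beta = solve_qr(X[idx], y[idx], tau)

    # Step 2: guess which observations lie clearly below or above the fit.
    resid = y - X @ beta
    half_band = m / (2 * n)  # assumption: band of roughly m observations
    lo, hi = np.quantile(
        resid, [max(tau - half_band, 0.0), min(tau + half_band, 1.0)]
    )
    below = resid < lo
    above = resid > hi

    while True:
        # Step 3: keep the middle band, collapse each tail into one pseudo-row.
        mid = ~(below | above)
        X_parts, y_parts = [X[mid]], [y[mid]]
        for mask in (below, above):
            if mask.any():
                X_parts.append(X[mask].sum(axis=0, keepdims=True))
                y_parts.append(np.array([y[mask].sum()]))
        beta = solve_qr(np.vstack(X_parts), np.concatenate(y_parts), tau)

        # Step 4: move observations with mispredicted signs back and re-solve.
        resid = y - X @ beta
        bad_below = below & (resid > 0)
        bad_above = above & (resid < 0)
        if not (bad_below.any() or bad_above.any()):
            return beta
        below &= ~bad_below
        above &= ~bad_above
```

Once no sign is mispredicted, the reduced problem and the full problem have the same objective value at the returned coefficients, so the result should match a full solve up to solver tolerance. Note that the subsample in step 1 is drawn independently across observations, which is what the CRV question above is about.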

Usage:

import pyfixest as pf
import pandas as pd
import numpy as np

N = 1_000_000
data = pd.DataFrame({
    "Y": np.random.randn(N),
    "X1": np.random.randn(N),
    "X2": np.random.randn(N),
    "X3": np.random.randn(N),
})

fml = "Y ~ X1 + X2 + X3"

# ~ 18s
pf.quantreg(fml, data, method="fn")

# ~ 12s
pf.quantreg(fml, data, method="pfn")

@s3alfisc
Member Author

Btw, applying the preprocessing also in the CRV estimation, we can cut down run time to 1.5 seconds:

# 18 s
fit_fn = pf.quantreg(fml, data, method = "fn")
# 1.5 s
fit_pfn = pf.quantreg(fml, data, method = "pfn")

@s3alfisc s3alfisc linked an issue Jun 15, 2025 that may be closed by this pull request
@codecov

codecov bot commented Jun 15, 2025

Codecov Report

Attention: Patch coverage is 22.80000% with 193 lines in your changes missing coverage. Please review.

Files with missing lines                         Patch %   Lines
pyfixest/estimation/quantreg/quantreg_.py        8.65%     95 Missing ⚠️
pyfixest/estimation/quantreg/QuantregMulti.py    20.86%    91 Missing ⚠️
pyfixest/estimation/estimation.py                66.66%    4 Missing ⚠️
pyfixest/estimation/FixestMulti_.py              83.33%    3 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (4879904) and HEAD (83144c7).

HEAD has 1 upload less than BASE

Flag              BASE (4879904)   HEAD (83144c7)
tests-extended    1                0

Flag              Coverage Δ
core-tests        76.03% <22.80%> (-2.42%) ⬇️
tests-extended    ?
tests-vs-r        16.27% <12.00%> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines                         Coverage Δ
pyfixest/estimation/feols_.py                    84.64% <ø> (-6.83%) ⬇️
pyfixest/estimation/literals.py                  87.50% <100.00%> (+0.83%) ⬆️
pyfixest/estimation/FixestMulti_.py              80.47% <83.33%> (-0.43%) ⬇️
pyfixest/estimation/estimation.py                89.93% <66.66%> (-1.52%) ⬇️
pyfixest/estimation/quantreg/QuantregMulti.py    20.86% <20.86%> (ø)
pyfixest/estimation/quantreg/quantreg_.py        42.98% <8.65%> (-30.55%) ⬇️

... and 5 files with indirect coverage changes


@apoorvalal
Member

AFK for most of today but re: quantregmulti - might be worth looking into VQR and its repo https://github.com/vistalab-technion/vqr


@s3alfisc
Member Author

Some updates: by default, I will implement the Powell estimator (which uses a non-parametric KDE to estimate the sparsity) for computing iid, hetero, and CRV errors. This allows us to use one class of estimators systematically throughout (the CRV estimator is a Powell-style estimator). "nid" as in the R package will also be supported (it is based on linear interpolation and is significantly slower than Powell). I will not port the R-quantreg defaults, so iid errors in pf and R-quantreg will not match. The main motivation is consistency; in addition, I don't know how to implement the R-quantreg SEs without breaching the GPL license, as the actual computations are not explained anywhere (except for nid, which is explained in the Koenker QR book).

I've also added some benchmarks on the QR process (the Chernozhukov method 2 is very nice and fast); the performance of the pfn algo in the quantile regression process still seems to be lacking.
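
To illustrate what a Powell-style kernel sandwich looks like, here is a minimal sketch under stated assumptions. It is not the implementation in this PR: the function name `powell_vcov` is hypothetical, it uses a uniform kernel, it covers only the simplest (non-clustered) case, and the bandwidth `h` is left to the caller rather than chosen by a rule.

```python
import numpy as np


def powell_vcov(X, resid, tau, h):
    """Kernel sandwich covariance for quantile regression coefficients.

    X: (n, p) design matrix, resid: (n,) residuals at the fitted quantile,
    tau: quantile level, h: kernel bandwidth (assumption: chosen by caller).
    """
    n = X.shape[0]
    # Uniform kernel: observations whose residual falls within [-h, h].
    in_band = (np.abs(resid) <= h).astype(float)
    # "Bread": estimates E[f(0 | x) x x'], the density-weighted second moment.
    J = (X * in_band[:, None]).T @ X / (2 * n * h)
    # "Meat": tau * (1 - tau) * E[x x'].
    S = tau * (1 - tau) * (X.T @ X) / n
    J_inv = np.linalg.inv(J)
    return J_inv @ S @ J_inv / n
```

Standard errors would then be the square roots of the diagonal of the returned matrix. A cluster-robust variant would replace the "meat" with a sum over cluster-level score contributions.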

@s3alfisc
Member Author

s3alfisc commented Jul 1, 2025

I hope it's done now (minus some smaller cleanups) ...

@s3alfisc
Member Author

s3alfisc commented Jul 1, 2025

pre-commit.ci autofix

@s3alfisc s3alfisc merged commit e299b03 into master Jul 1, 2025
8 of 9 checks passed
@s3alfisc s3alfisc deleted the qreg-pfn branch January 11, 2026 12:38

Development

Successfully merging this pull request may close these issues.

Implement "Algorithm 2 (Preprocessing for the quantile regression process)" from Chernozhukov et al
Implement the "pfn" preprocessing algorithm
