Skip to content

Differences in IV estimation relative to linearmodels #197

@aeturrell

Description

@aeturrell

I'm working on an IV example that was previously working in linearmodels but doesn't solve in pyfixest. It's possible that I've just misunderstood the syntax though!

reprex:

In this case, the model will be

$$ \text{Price}_i = \hat{\pi_0} + \hat{\pi_1} \text{SalesTax}_i + v_i $$

in the first stage regression and

$$ \text{Packs}_i = \hat{\beta_0} + \hat{\beta_2}\widehat{\text{Price}_i} + \hat{\beta_1} \text{RealIncome}_i + u_i $$

in the second stage.

Data:

import pandas as pd
from linearmodels.iv import IV2SLS

dfiv = pd.read_csv(
    "https://vincentarelbundock.github.io/Rdatasets/csv/AER/CigarettesSW.csv",
    dtype={"state": "category", "year": "category"},
).assign(
    rprice=lambda x: x["price"] / x["cpi"],
    rincome=lambda x: x["income"] / x["population"] / x["cpi"],
)
dfiv.head()

linearmodels runs okay:

results_iv2sls = IV2SLS.from_formula(
    "np.log(packs) ~ 1 + np.log(rincome) + C(year) + C(state) + [np.log(rprice) ~ taxs]",
    df,
).fit(cov_type="clustered", clusters=df["year"])
print(results_iv2sls.summary)
                          IV-2SLS Estimation Summary                          
==============================================================================
Dep. Variable:          np.log(packs)   R-squared:                      0.9659
Estimator:                    IV-2SLS   Adj. R-squared:                 0.9279
No. Observations:                  96   F-statistic:                -1.296e+17
Date:                Thu, Oct 26 2023   P-value (F-stat)                1.0000
Time:                        09:31:50   Distribution:                 chi2(50)
Cov. Estimator:             clustered                                         
                                                                              
                                Parameter Estimates                                
===================================================================================
                 Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
-----------------------------------------------------------------------------------
Intercept           9.4924     0.0263     360.24     0.0000      9.4407      9.5440
np.log(rincome)     0.4434                                                         
C(year)[T.1995]    -0.0328                                                         
C(state)[T.AR]      0.1770     0.0531     3.3338     0.0009      0.0729      0.2810
C(state)[T.AZ]     -0.0899     0.0132    -6.8132     0.0000     -0.1158     -0.0640
C(state)[T.CA]     -0.2781     0.0214    -12.996     0.0000     -0.3200     -0.2361
C(state)[T.CO]     -0.2479     0.0090    -27.625     0.0000     -0.2655     -0.2303
C(state)[T.CT]     -0.0171     0.0196    -0.8720     0.3832     -0.0556      0.0213
C(state)[T.DE]      0.1110     0.0291     3.8105     0.0001      0.0539      0.1682
C(state)[T.FL]      0.0762     0.0142     5.3596     0.0000      0.0483      0.1041
C(state)[T.GA]     -0.0695     0.0251    -2.7706     0.0056     -0.1186     -0.0203
C(state)[T.IA]      0.0120     0.0739     0.1629     0.8706     -0.1328      0.1569
C(state)[T.ID]     -0.1272     0.0077    -16.597     0.0000     -0.1423     -0.1122
C(state)[T.IL]     -0.0339     0.0081    -4.1912     0.0000     -0.0497     -0.0180
C(state)[T.IN]      0.1198     0.0611     1.9609     0.0499   5.573e-05      0.2395
C(state)[T.KS]     -0.0910     0.0305    -2.9884     0.0028     -0.1507     -0.0313
C(state)[T.KY]      0.3525     0.0631     5.5906     0.0000      0.2289      0.4761
C(state)[T.LA]      0.1315     0.0104     12.664     0.0000      0.1112      0.1519
C(state)[T.MA]     -0.0403     0.0069    -5.8826     0.0000     -0.0538     -0.0269
C(state)[T.MD]     -0.2322     0.0239    -9.7376     0.0000     -0.2790     -0.1855
C(state)[T.ME]      0.2008     0.0574     3.5011     0.0005      0.0884      0.3133
C(state)[T.MI]      0.1268     0.0745     1.7009     0.0890     -0.0193      0.2728
C(state)[T.MN]      0.0568     0.0490     1.1595     0.2463     -0.0392      0.1529
C(state)[T.MO]      0.0640     0.0476     1.3454     0.1785     -0.0292      0.1572
C(state)[T.MS]      0.1501     0.0272     5.5267     0.0000      0.0969      0.2034
C(state)[T.MT]     -0.1522     0.0054    -28.250     0.0000     -0.1627     -0.1416
C(state)[T.NC]      0.0396     0.0191     2.0655     0.0389      0.0020      0.0771
C(state)[T.ND]     -0.0311     0.0399    -0.7787     0.4361     -0.1092      0.0471
C(state)[T.NE]     -0.0741     0.0375    -1.9765     0.0481     -0.1476     -0.0006
C(state)[T.NH]      0.3504     0.0315     11.114     0.0000      0.2886      0.4122
C(state)[T.NJ]     -0.0873  6.107e-05    -1429.3     0.0000     -0.0874     -0.0872
C(state)[T.NM]     -0.2858     0.0040    -71.049     0.0000     -0.2937     -0.2779
C(state)[T.NV]      0.1789     0.0259     6.9075     0.0000      0.1281      0.2296
C(state)[T.NY]     -0.0719     0.0032    -22.256     0.0000     -0.0782     -0.0655
C(state)[T.OH]      0.0325     0.0402     0.8088     0.4186     -0.0463      0.1114
C(state)[T.OK]      0.0946     0.0538     1.7572     0.0789     -0.0109      0.2000
C(state)[T.OR]     -0.0153     0.0673    -0.2269     0.8205     -0.1471      0.1166
C(state)[T.PA]     -0.0031     0.0006    -4.8401     0.0000     -0.0044     -0.0019
C(state)[T.RI]      0.1394     0.0921     1.5136     0.1301     -0.0411      0.3200
C(state)[T.SC]     -0.0212     0.0334    -0.6345     0.5257     -0.0866      0.0442
C(state)[T.SD]     -0.0675     0.0711    -0.9488     0.3427     -0.2069      0.0719
C(state)[T.TN]      0.1473     0.0470     3.1340     0.0017      0.0552      0.2394
C(state)[T.TX]     -0.0579     0.0136    -4.2560     0.0000     -0.0845     -0.0312
C(state)[T.UT]     -0.4899     0.0276    -17.776     0.0000     -0.5440     -0.4359
C(state)[T.VA]     -0.0559     0.0471    -1.1875     0.2350     -0.1482      0.0364
C(state)[T.VT]      0.2209     0.0467     4.7267     0.0000      0.1293      0.3125
C(state)[T.WA]      0.0064     0.0011     6.0151     0.0000      0.0043      0.0085
C(state)[T.WI]      0.0741     0.0590     1.2569     0.2088     -0.0415      0.1897
C(state)[T.WV]      0.1576     0.0582     2.7097     0.0067      0.0436      0.2716
C(state)[T.WY]     -0.0169     0.0590    -0.2858     0.7750     -0.1325      0.0988
np.log(rprice)     -1.2793                                                         
===================================================================================

Endogenous: np.log(rprice)
Instruments: taxs
Clustered Covariance (One-Way)
Debiased: False
Num Clusters: 2

pyfixest produces an "UnderDeterminedIVError". Code:

results_iv = feols("np.log(packs) ~ 1 + np.log(rincome) | C(year) + C(state) | np.log(rprice) ~ taxs ", data=dfiv, vcov={"CRV1": "year"})
results_iv.summary()
---------------------------------------------------------------------------
UnderDeterminedIVError                    Traceback (most recent call last)
[/Users/aet/Documents/git_projects/coding-for-economists/econmt-regression.ipynb](https://file+.vscode-resource.vscode-cdn.net/Users/aet/Documents/git_projects/coding-for-economists/econmt-regression.ipynb) Cell 67 line 1
----> [1](vscode-notebook-cell:/Users/aet/Documents/git_projects/coding-for-economists/econmt-regression.ipynb#Y340sZmlsZQ%3D%3D?line=0) results_iv = feols("np.log(packs) ~ 1 + np.log(rincome) + C(year) + C(state) | np.log(rprice) ~ taxs ", data=dfiv, vcov={"CRV1": "year"})
      [2](vscode-notebook-cell:/Users/aet/Documents/git_projects/coding-for-economists/econmt-regression.ipynb#Y340sZmlsZQ%3D%3D?line=1) results_iv.summary()

File [~/mambaforge/envs/codeforecon/lib/python3.10/site-packages/pyfixest/estimation.py:129](https://file+.vscode-resource.vscode-cdn.net/Users/aet/Documents/git_projects/coding-for-economists/~/mambaforge/envs/codeforecon/lib/python3.10/site-packages/pyfixest/estimation.py:129), in feols(fml, data, vcov, ssc, fixef_rm, collin_tol)
    126 _estimation_input_checks(fml, data, vcov, ssc, fixef_rm, collin_tol)
    128 fixest = FixestMulti(data=data)
--> 129 fixest._prepare_estimation("feols", fml, vcov, ssc, fixef_rm)
    131 # demean all models: based on fixed effects x split x missing value combinations
    132 fixest._estimate_all_models(vcov, fixest._fixef_keys, collin_tol=collin_tol)

File [~/mambaforge/envs/codeforecon/lib/python3.10/site-packages/pyfixest/FixestMulti.py:85](https://file+.vscode-resource.vscode-cdn.net/Users/aet/Documents/git_projects/coding-for-economists/~/mambaforge/envs/codeforecon/lib/python3.10/site-packages/pyfixest/FixestMulti.py:85), in FixestMulti._prepare_estimation(self, estimation, fml, vcov, ssc, fixef_rm)
     82 self._fixef_keys = None
     83 self._is_multiple_estimation = None
---> 85 fxst_fml = FixestFormulaParser(fml)
     86 fxst_fml.get_fml_dict()  # fxst_fml._fml_dict might look like this: {'0': {'Y': ['Y~X1'], 'Y2': ['Y2~X1']}}. Hence {FE: {DEPVAR: [FMLS]}}
     87 if fxst_fml._is_iv:

File [~/mambaforge/envs/codeforecon/lib/python3.10/site-packages/pyfixest/FormulaParser.py:100](https://file+.vscode-resource.vscode-cdn.net/Users/aet/Documents/git_projects/coding-for-economists/~/mambaforge/envs/codeforecon/lib/python3.10/site-packages/pyfixest/FormulaParser.py:100), in FixestFormulaParser.__init__(self, fml)
     98 if endogvars is not None:
     99     if len(endogvars) > len(instruments):
--> 100         raise UnderDeterminedIVError(
    101             "The IV system is underdetermined. Only fully determined systems are allowed. Please provide as many instruments as endogenous variables."
    102         )
    103     else:
    104         pass

UnderDeterminedIVError: The IV system is underdetermined. Only fully determined systems are allowed.

Grateful for any pointers!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions