A high-performance Python toolkit for portfolio sorting and empirical asset pricing, with a focus on corporate bonds. Part of the Open Source Bond Asset Pricing project.
Paper: Dickerson, Robotti, and Rossetti (2025). The Corporate Bond Factor Replication Crisis: A New Protocol. SSRN
pip install PyBondLab
numba>=0.57 is part of the base install because the maintained package surface depends on it.
For WRDS data download support:
pip install PyBondLab[wrds]
For all optional dependencies:
pip install PyBondLab[all]
Install from source:
git clone https://github.com/GiulioRossetti94/PyBondLab.git
cd PyBondLab
pip install -e ".[performance]"
import PyBondLab as pbl
# Sort bonds into quintile portfolios by credit spread
strategy = pbl.SingleSort(holding_period=1, sort_var='cs', num_portfolios=5)
results = pbl.StrategyFormation(data, strategy=strategy, turnover=True).fit()
# Long-short factor returns (equal- and value-weighted)
ew_ls, vw_ls = results.get_long_short()
# Turnover
ew_turn, vw_turn = results.get_turnover()
Your data currently needs these columns: date, ID (bond identifier), ret (returns), VW (value weight), and RATING_NUM (numeric credit rating; 1-10 = IG, 11-22 = NIG). PRICE is optional and only needed for price filters.
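As a rough pre-flight check, the required columns and the IG/NIG rating split listed above can be verified before calling the library. This is a minimal sketch in plain Python; the helper names `missing_columns` and `rating_bucket` are hypothetical and not part of PyBondLab.

```python
# Column names and the 1-10 / 11-22 rating split come from the text above.
# The helper names are hypothetical and are not PyBondLab functions.
REQUIRED_COLUMNS = ("date", "ID", "ret", "VW", "RATING_NUM")


def missing_columns(columns):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def rating_bucket(rating_num):
    """Map a numeric rating to 'IG' (1-10) or 'NIG' (11-22)."""
    if 1 <= rating_num <= 10:
        return "IG"
    if 11 <= rating_num <= 22:
        return "NIG"
    raise ValueError(f"rating outside 1-22: {rating_num!r}")


# A frame that still uses 'cusip' instead of 'ID' fails the check.
print(missing_columns(["date", "cusip", "ret", "VW", "RATING_NUM"]))  # ['ID']
print(rating_bucket(7), rating_bucket(15))  # IG NIG
```

With a pandas DataFrame, the same check would be `missing_columns(df.columns)`.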
With monthly rebalancing, holding_period > 1 produces staggered, overlapping cohorts. Quarterly, semi-annual, and annual rebalancing are controlled by rebalance_frequency; in that mode holding_period must be 1.
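To make "staggered overlapping cohorts" concrete, here is a minimal sketch of one common convention: with a holding period of K months, the strategy return in month t is the simple average of the month-t returns of the K cohorts formed in the previous K months. Both the averaging rule and the choice to drop months without a full set of K cohorts are assumptions for illustration, not a description of PyBondLab internals.

```python
def overlapping_cohort_returns(cohort_returns, holding_period):
    """Average the returns of all live cohorts in each month.

    cohort_returns[f][t] is the month-t return of the portfolio formed at
    the end of month f (integer month indices, t > f). Months without a
    full set of `holding_period` live cohorts are dropped (an assumption).
    """
    months = sorted({t for per_month in cohort_returns.values() for t in per_month})
    averaged = {}
    for t in months:
        live = [
            cohort_returns[f][t]
            for f in range(t - holding_period, t)
            if f in cohort_returns and t in cohort_returns[f]
        ]
        if len(live) == holding_period:
            averaged[t] = sum(live) / holding_period
    return averaged


# Three cohorts formed at the end of months 0, 1, 2, each held for 2 months.
cohorts = {
    0: {1: 0.01, 2: 0.03},
    1: {2: 0.05, 3: 0.02},
    2: {3: 0.04, 4: 0.00},
}
result = overlapping_cohort_returns(cohorts, holding_period=2)
print({t: round(r, 4) for t, r in result.items()})  # {2: 0.04, 3: 0.03}
```

Months 1 and 4 are dropped because only one cohort is live in each of them.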
Use column mapping if your names differ:
results = pbl.StrategyFormation(data, strategy=strategy).fit(
    IDvar='cusip', RETvar='ret_vw', VWvar='mcap_e', RATINGvar='spc_rat'
)
Start with docs/CoreWorkflow_README.md. It defines the canonical first workflow, required schema, result tiers, and the main semantic traps.
Use StrategyFormation first, then move to BatchStrategyFormation when the single-run workflow is clear. Treat WithinFirmSort, RollingBeta, DataUncertaintyAnalysis, and anomaly assaying as advanced workflows built on top of that core.
- dynamic_weights has no effect when holding_period == 1.
- Non-monthly rebalancing uses rebalance_frequency; in that mode holding_period must be 1.
- WithinFirmSort currently supports holding_period=1 only.
- Fast batch results contain long-short returns only; use turnover=True or chars=[...] when you need full legs, bond counts, turnover, characteristics, or extract_panel().
Sort the cross-section into portfolios by one or two characteristics.
# Single sort: quintile portfolios
strategy = pbl.SingleSort(holding_period=1, sort_var='cs', num_portfolios=5)
# Double sort: conditional (dependent) 3x3
strategy = pbl.DoubleSort(
    holding_period=1,
    sort_var='cs', num_portfolios=3,
    sort_var2='duration', num_portfolios2=3,
    how='conditional'
)
Supports banding, custom breakpoints, characteristics tracking, and portfolio turnover. See docs/SingleSort_DoubleSort_README.md for full API.
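The difference between a conditional (dependent) and an unconditional (independent) double sort can be sketched in a few lines. The example below uses simple rank-based equal-count buckets rather than percentile breakpoints, and the function names are hypothetical; it illustrates the general technique, not PyBondLab's implementation.

```python
def rank_buckets(values, n):
    """Assign each value to one of n equal-count buckets (0 = lowest) by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    for rank, i in enumerate(order):
        buckets[i] = rank * n // len(values)
    return buckets


def unconditional_double_sort(x1, x2, n1, n2):
    """Sort on x1 and x2 independently, then intersect the two assignments."""
    return list(zip(rank_buckets(x1, n1), rank_buckets(x2, n2)))


def conditional_double_sort(x1, x2, n1, n2):
    """Sort on x1 first, then sort on x2 separately within each x1 bucket."""
    first = rank_buckets(x1, n1)
    second = [None] * len(x1)
    for bucket in range(n1):
        members = [i for i, b in enumerate(first) if b == bucket]
        inner = rank_buckets([x2[i] for i in members], n2)
        for i, b in zip(members, inner):
            second[i] = b
    return list(zip(first, second))


# Two perfectly correlated characteristics for four bonds.
cs = [1.0, 2.0, 3.0, 4.0]
duration = [1.0, 2.0, 3.0, 4.0]
print(unconditional_double_sort(cs, duration, 2, 2))  # [(0, 0), (0, 0), (1, 1), (1, 1)]
print(conditional_double_sort(cs, duration, 2, 2))    # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

With correlated characteristics, the unconditional sort leaves the off-diagonal cells empty, while the conditional sort populates every cell.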
Isolate within-firm bond dispersion from cross-firm differences. Bonds are sorted into HIGH/LOW portfolios within each firm, then aggregated across firms using market-cap weighting within rating terciles.
strategy = pbl.WithinFirmSort(
    holding_period=1,
    sort_var='cs',
    firm_id_col='PERMNO',
)
results = pbl.StrategyFormation(data, strategy=strategy).fit()
See docs/WithinFirmSort_README.md for methodology details.
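The within-firm idea can be illustrated with a simplified sketch. It assumes that HIGH and LOW are the single bonds with the highest and lowest signal within each firm, and it aggregates across firms with a plain weighted average; the rating-tercile step described above is omitted. The function names and the firm weights are hypothetical, so treat this as an illustration of the concept rather than the library's method.

```python
from collections import defaultdict


def within_firm_spreads(bonds):
    """Return {firm: return of highest-signal bond minus lowest-signal bond}.

    Firms with fewer than two bonds are skipped because no within-firm
    comparison is possible.
    """
    by_firm = defaultdict(list)
    for bond in bonds:
        by_firm[bond["firm"]].append(bond)
    spreads = {}
    for firm, group in by_firm.items():
        if len(group) < 2:
            continue
        high = max(group, key=lambda b: b["signal"])
        low = min(group, key=lambda b: b["signal"])
        spreads[firm] = high["ret"] - low["ret"]
    return spreads


def weighted_average(values, weights):
    """Weighted average of values[k] using weights[k] for every key in values."""
    total = sum(weights[k] for k in values)
    return sum(values[k] * weights[k] for k in values) / total


bonds = [
    {"firm": "X", "signal": 1.0, "ret": 0.010},
    {"firm": "X", "signal": 3.0, "ret": 0.030},
    {"firm": "Y", "signal": 2.0, "ret": 0.020},
    {"firm": "Y", "signal": 5.0, "ret": 0.000},
    {"firm": "Z", "signal": 4.0, "ret": 0.050},  # single-bond firm, skipped
]
spreads = within_firm_spreads(bonds)
firm_weights = {"X": 300.0, "Y": 100.0, "Z": 50.0}  # hypothetical firm market caps
print(round(weighted_average(spreads, firm_weights), 4))  # 0.01
```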
Process many signals at once. Batch formation can return either full formation results or a reduced fast-path result depending on your settings.
from PyBondLab import BatchStrategyFormation
batch = BatchStrategyFormation(
    data=data,
    signals=['cs', 'ytm', 'tmat', 'mom6_1', 'val_hz'],
    holding_period=1,
    num_portfolios=5,
    turnover=False,
)
results = batch.fit()
ew_ls, vw_ls = results['cs'].get_long_short()
Use turnover=True or chars=[...] when you need full portfolio legs, bond counts, turnover, characteristics, or extract_panel().
Within-firm batch:
from PyBondLab import BatchWithinFirmSortFormation
batch = BatchWithinFirmSortFormation(
    data=data,
    signals=['cs', 'ytm', 'tmat'],
    firm_id_col='PERMNO',
    turnover=False,
)
results = batch.fit()
See docs/BatchStrategyFormation_README.md and docs/BatchWithinFirmSortFormation_README.md.
Monthly (default), quarterly, semi-annual, or annual. Non-monthly rebalancing computes returns every month while holding portfolio composition fixed between rebalancing dates.
# Quarterly rebalancing
strategy = pbl.SingleSort(
    sort_var='cs', num_portfolios=5,
    rebalance_frequency='quarterly',
)
# Annual rebalancing in June (Fama-French style)
strategy = pbl.SingleSort(
    sort_var='BtM', num_portfolios=5,
    rebalance_frequency='annual',
    rebalance_month=7,  # Formation in July, returns start August
)
See docs/NonStaggeredRebalancing_README.md.
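The idea of holding composition fixed between rebalancing dates can be sketched as follows. The helper carries the most recent rebalance-date assignment forward month by month. The function name is hypothetical, and the choice of January, April, July, and October as quarterly rebalancing months is an assumption for illustration only.

```python
def hold_between_rebalances(monthly_assignments, rebalance_months):
    """Carry the latest rebalance-date assignment forward to later months.

    monthly_assignments: chronological list of ((year, month), {bond: portfolio}).
    rebalance_months: set of calendar months (1-12) in which portfolios re-form.
    Months before the first rebalance date are omitted from the output.
    """
    held = None
    composition = {}
    for (year, month), assignment in monthly_assignments:
        if month in rebalance_months:
            held = assignment  # re-form the portfolios this month
        if held is not None:
            composition[(year, month)] = held
    return composition


# What a fresh monthly sort would assign in each month.
monthly = [
    ((2020, 1), {"A": 1, "B": 5}),
    ((2020, 2), {"A": 5, "B": 1}),
    ((2020, 3), {"A": 3, "B": 3}),
    ((2020, 4), {"A": 2, "B": 4}),
]
held = hold_between_rebalances(monthly, rebalance_months={1, 4, 7, 10})
for key in sorted(held):
    print(key, held[key])
```

February and March keep the January assignment, and April picks up the new one. Returns are still computed every month; only the membership is frozen.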
Compute breakpoints on a subset (e.g., NYSE stocks) and apply them to the full cross-section.
def nyse_filter(df):
    # Boolean mask: NYSE-listed (EXCHCD == 1) common shares (SHRCD 10 or 11)
    return (df['EXCHCD'] == 1) & (df['SHRCD'].isin([10, 11]))
strategy = pbl.DoubleSort(
    holding_period=1,
    sort_var='ME', sort_var2='BtM',
    num_portfolios=2, num_portfolios2=3,
    breakpoints=[50], breakpoints2=[30, 70],
    how='unconditional',
    rebalance_frequency='annual', rebalance_month=7,
    breakpoint_universe_func=nyse_filter,
    breakpoint_universe_func2=nyse_filter,
)
See examples/FF3/ for a complete Fama-French replication.
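The effect of a breakpoint universe can be seen in a small, library-independent sketch: breakpoints are computed on the subset only, then used to bucket every observation. The helper names, the linear-interpolation percentile, and the convention that a value equal to a breakpoint falls in the lower bucket are all assumptions made for illustration.

```python
from bisect import bisect_left


def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def assign_with_universe_breakpoints(values, in_universe, pcts):
    """Compute breakpoints on the flagged subset, then bucket all values.

    Returns (buckets, cuts). A value equal to a cut goes to the lower bucket.
    """
    universe = sorted(v for v, keep in zip(values, in_universe) if keep)
    cuts = [percentile(universe, p) for p in pcts]
    return [bisect_left(cuts, v) for v in values], cuts


market_cap = [1.0, 2.0, 3.0, 100.0, 200.0, 300.0]
is_nyse = [False, False, False, True, True, True]

buckets, cuts = assign_with_universe_breakpoints(market_cap, is_nyse, [50])
print(cuts, buckets)  # [200.0] [0, 0, 0, 0, 0, 1]

# For comparison: breakpoints from the full cross-section.
all_buckets, all_cuts = assign_with_universe_breakpoints(market_cap, [True] * 6, [50])
print(all_cuts, all_buckets)  # [51.5] [0, 0, 0, 1, 1, 1]
```

Using the subset median pushes the many small names into the bottom bucket, which is the usual motivation for NYSE breakpoints.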
Test factor robustness across data filtering configurations. Computes ex-ante and ex-post returns for each filter, with Newey-West t-statistics.
from PyBondLab import DataUncertaintyAnalysis
results = DataUncertaintyAnalysis(
    data=data,
    signals=['cs', 'ytm'],
    holding_periods=[1, 3, 6],
    filters={
        'trim': [0.2, 0.5],
        'price': [[1, 5], [150, 200]],
        'bounce': [0.05, -0.05],
        'wins': [(99, 'both'), (95, 'both')],
    },
    ratings=['IG', 'NIG', None],
    num_portfolios=5,
).fit()
results.summary()  # Summary stats with NW t-statistics
results.to_excel('out.xlsx')
See docs/DataUncertaintyAnalysis_README.md.
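For readers unfamiliar with Newey-West t-statistics, the following is a minimal sketch of the textbook estimator for the mean of a return series, using Bartlett kernel weights and no small-sample correction. The function name and the lag choice are assumptions; the library's own lag selection and any corrections it applies may differ, so the numbers need not match its output.

```python
import math


def newey_west_tstat(returns, lags):
    """t-statistic of the sample mean with Newey-West (Bartlett) standard errors."""
    n = len(returns)
    mean = sum(returns) / n
    dev = [r - mean for r in returns]
    # Lag-0 autocovariance (population variance of the series).
    long_run_var = sum(d * d for d in dev) / n
    for j in range(1, lags + 1):
        gamma_j = sum(dev[t] * dev[t - j] for t in range(j, n)) / n
        # Bartlett weight declines linearly with the lag.
        long_run_var += 2 * (1 - j / (lags + 1)) * gamma_j
    return mean / math.sqrt(long_run_var / n)


# A positively autocorrelated series: runs of gains followed by runs of losses.
series = [0.02, 0.02, 0.02, -0.01, -0.01, -0.01] * 2
naive_t = newey_west_tstat(series, lags=0)  # ignores autocorrelation
nw_t = newey_west_tstat(series, lags=1)
print(round(naive_t, 3), round(nw_t, 3))
```

With positive autocorrelation the long-run variance exceeds the plain variance, so the Newey-West t-statistic is smaller than the naive one.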
Test factor significance across specification choices (weighting, number of portfolios, rating subsets, breakpoint universes) following Novy-Marx and Velikov (2023).
from PyBondLab import AssayAnomaly
report = AssayAnomaly(data=data, sort_var='cs', holding_periods=[1])
_, recap = report.summary_results()
print(recap)
Which anomaly tool to use:
- assay_anomaly_fast: single signal, speed-first
- BatchAssayAnomaly: multiple signals, speed-first
- AssayAnomaly: richer slow-path workflow
- AssayAnomalyRunner: advanced/internal control, not the default entry point for new users
See docs/AnomalyAssay_README.md and docs/BatchAssayAnomaly_README.md.
Consistent, readable factor names with optional sign correction:
from PyBondLab import NamingConfig, extract_panel
# Extract all batch results into a single panel
panel = extract_panel(batch_results, naming=NamingConfig(sign_correct=True))
# Columns: date | factor | freq | leg | weighting | return | turnover | chars...
See docs/NamingConfig_README.md.
Four filtering procedures for corporate bond research, each free of look-ahead bias:
| Filter | Description | Example |
|---|---|---|
| Trim | Exclude extreme returns | {'adj': 'trim', 'level': 0.2} |
| Price | Exclude extreme prices | {'adj': 'price', 'level': [20, 150]} |
| Bounce | Exclude return reversals | {'adj': 'bounce', 'level': 0.01} |
| Winsorize | Cap tails at percentiles | {'adj': 'wins', 'level': 98, 'location': 'both'} |
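As a generic illustration of the Trim and Winsorize rows, the sketch below drops returns whose absolute value exceeds a bound and caps the remaining tails at chosen percentiles. How the library maps its level and location settings onto bounds and percentiles is not assumed here; the function names and explicit arguments are hypothetical.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * p / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def trim(values, max_abs):
    """Drop observations whose absolute value exceeds max_abs."""
    return [v for v in values if abs(v) <= max_abs]


def winsorize(values, lower_pct, upper_pct):
    """Cap values below/above the given percentiles at those percentiles."""
    ordered = sorted(values)
    lo = percentile(ordered, lower_pct)
    hi = percentile(ordered, upper_pct)
    return [min(max(v, lo), hi) for v in values]


returns = [-0.50, -0.02, 0.00, 0.01, 0.03, 0.60]
print(trim(returns, 0.2))          # [-0.02, 0.0, 0.01, 0.03]
print(winsorize(returns, 20, 80))  # [-0.02, -0.02, 0.0, 0.01, 0.03, 0.03]
```

Trimming changes the sample size, while winsorizing keeps every observation and only pulls in the tails.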
results = pbl.StrategyFormation(
    data, strategy=strategy,
    filters={'adj': 'trim', 'level': 0.2}
).fit()
ew_ea, vw_ea = results.get_long_short()  # Ex-ante returns
ew_ep, vw_ep = results.get_long_short_ex_post()  # Ex-post returns

| Tool | Description | Docs |
|---|---|---|
| pbl.Momentum(lookback_period, skip) | Momentum strategy from past returns | |
| pbl.LTreversal(lookback_period, skip) | Long-term reversal strategy | |
| pbl.RollingBeta(factors, window) | Rolling beta estimation (~30x faster with numba) | docs |
| pbl.PreAnalysisStats(data, variables) | Summary statistics before sorting | docs |
- Python >= 3.11
- numpy < 2, pandas >= 1.5, statsmodels >= 0.14, scipy >= 1.10, pyarrow
Optional: numba >= 0.57 (performance), wrds (data access)
Dickerson, A., Robotti, C., and Rossetti, G. (2025). The Corporate Bond Factor Replication Crisis: A New Protocol. Working Paper.
Novy-Marx, R. and Velikov, M. (2023). Assaying Anomalies. Working Paper.
Data: openbondassetpricing.com
- Giulio Rossetti -- giulio.rossetti.1@wbs.ac.uk
- Alex Dickerson -- alexander.dickerson1@unsw.edu.au
| Abbreviation | Meaning |
|---|---|
| EW | Equal-weighted |
| VW | Value-weighted |
| LS (L-S) | Long-short (long top portfolio, short bottom portfolio) |
| HP | Holding period (number of overlapping monthly cohorts) |
| IG | Investment grade (rating 1-10) |
| NIG | Non-investment grade / high yield (rating 11-22) |
| EA | Ex-ante (before applying data filters) |
| EP | Ex-post (after applying data filters) |
| NW | Newey-West (heteroskedasticity and autocorrelation consistent standard errors) |
| TRACE | Trade Reporting and Compliance Engine (FINRA corporate bond transaction data) |
| DUA | Data Uncertainty Analysis |
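The EW, VW, and LS entries above can be made concrete with a short worked example. It computes equal- and value-weighted returns for a top and a bottom portfolio and takes the difference. The numbers and function names are made up for illustration.

```python
def ew_return(rets):
    """Equal-weighted portfolio return: the simple average."""
    return sum(rets) / len(rets)


def vw_return(rets, weights):
    """Value-weighted portfolio return: average weighted by `weights`."""
    total = sum(weights)
    return sum(r * w for r, w in zip(rets, weights)) / total


# Top and bottom portfolios with two bonds each (returns, value weights).
top_rets, top_vw = [0.02, 0.04], [300.0, 100.0]
bottom_rets, bottom_vw = [0.01, 0.00], [100.0, 100.0]

ew_ls = ew_return(top_rets) - ew_return(bottom_rets)
vw_ls = vw_return(top_rets, top_vw) - vw_return(bottom_rets, bottom_vw)
print(round(ew_ls, 4), round(vw_ls, 4))  # 0.025 0.02
```

The value-weighted spread is smaller here because the larger bond in the top portfolio has the lower return.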
MIT. See LICENSE.