yasumorishima/savant-extras

savant-extras

Baseball Savant leaderboard data — complements pybaseball.

pybaseball is great, but many Baseball Savant leaderboards are missing from it or only partly supported. savant-extras fills that gap with 17 leaderboards covering batting, pitching, catching, baserunning, fielding, and park factors, each available as a one-line function call that returns a DataFrame.

Installation

pip install savant-extras

Quick Start

from savant_extras import (
    bat_tracking, pitch_tempo, arm_strength,
    pitch_movement, swing_take, catcher_throwing,
)

# Bat tracking with custom date range (2024+)
df = bat_tracking("2024-04-01", "2024-04-30")

# Pitcher pitch tempo
df = pitch_tempo(2024)

# Outfielder arm strength
df = arm_strength(2024, position="Outfielder")

# Slider movement
df = pitch_movement(2024, pitch_type="SL")

# Batter plate discipline
df = swing_take(2024)

# Catcher pop time & CS rate
df = catcher_throwing(2024)

All Functions

Every leaderboard function returns a pd.DataFrame. Most have a _range() variant for multi-season queries (adds a year column).

Batting

| Function | Data from | Description |
| --- | --- | --- |
| bat_tracking(start_date, end_date) | 2024+ | Bat speed, attack angle, swing tilt (custom date range) |
| bat_tracking_monthly(year) | 2024+ | Monthly bat tracking (Apr–Oct) |
| bat_tracking_splits(year) | 2024+ | First-half / second-half splits |
| batted_ball(year) | - | GB/FB/LD rates, pull/oppo splits |
| home_runs(year) | - | HR distance, exit velocity, xHR, no-doubters |
| swing_take(year) | - | Run values by zone (heart/shadow/chase/waste) |
| year_to_year(year) | - | xwOBA changes across seasons |
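
Because bat_tracking() takes an arbitrary start and end date, it can be useful to split a longer period into shorter windows and fetch each one separately. The helper below is a minimal sketch of that idea using only the standard library. The name date_windows and the 7-day default are illustrative choices, not part of savant-extras; the only assumption about the library is that dates are passed as "YYYY-MM-DD" strings, as in the Quick Start.

```python
from datetime import date, timedelta


def date_windows(start, end, days=7):
    """Split [start, end] into consecutive windows of at most `days` days.

    Returns a list of (start_date, end_date) ISO strings, both inclusive.
    Hypothetical helper: not part of savant-extras.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    windows = []
    cursor = date.fromisoformat(start)
    last = date.fromisoformat(end)
    while cursor <= last:
        # The final window is shortened so it never runs past `end`.
        window_end = min(cursor + timedelta(days=days - 1), last)
        windows.append((cursor.isoformat(), window_end.isoformat()))
        cursor = window_end + timedelta(days=1)
    return windows


windows = date_windows("2024-04-01", "2024-04-30", days=7)
for start_date, end_date in windows:
    print(start_date, end_date)
    # With the package installed, each window could be fetched as:
    # df = bat_tracking(start_date, end_date)
```

Each tuple can be passed straight to bat_tracking(). Keeping the windows non-overlapping means the per-window DataFrames can be compared without double counting any game dates.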

Pitching

| Function | Data from | Description |
| --- | --- | --- |
| pitch_tempo(year) | 2010+ | Pace metrics (median seconds, hot/warm/cold) |
| pitch_movement(year) | - | Horizontal/vertical break by pitch type |
| pitcher_arm_angle(year) | - | Release point angles and positions |
| running_game(year) | - | Pitcher running game (pickoffs, CS above avg) |
| timer_infractions(year) | 2023+ | Pitch clock violations |

Catching

| Function | Data from | Description |
| --- | --- | --- |
| catcher_blocking(year) | - | Blocks above average, PB/WP prevention |
| catcher_throwing(year) | - | Pop time, exchange time, CS rate, arm strength |
| catcher_stance(year) | - | One-knee vs traditional: framing, blocking, throwing |

Baserunning & Fielding

| Function | Data from | Description |
| --- | --- | --- |
| arm_strength(year) | 2020+ | Fielder throw speed by position |
| baserunning(year) | - | Total baserunning run value (XB + SB) |
| basestealing(year) | - | Stolen base run value, lead distances |

Park Factors

| Function | Data from | Description |
| --- | --- | --- |
| park_factors(season) | 2015+ | Per-season ballpark run factors for all 30 MLB teams (FanGraphs) |
| park_factors_range(start, end) | 2015+ | Multi-season park factors concatenated |

Columns returned: season, team, pf_5yr, pf_3yr, pf_1yr, pf_hr, pf_1b, pf_2b, pf_3b, pf_so, pf_bb, pf_fip. All factors: 100 = neutral, >100 = hitter-friendly, <100 = pitcher-friendly.

from savant_extras import park_factors, park_factors_range

# Single season
df = park_factors(2024)
print(df[df["team"] == "COL"][["team", "pf_5yr", "pf_hr"]])
#    team  pf_5yr  pf_hr
# 5   COL     116    131

# Multi-season (e.g. for model training)
df = park_factors_range(2020, 2025)
print(df.shape)  # 180 rows expected (6 seasons × 30 teams)
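
To make the "100 = neutral" scale concrete, here is a small worked example of one common way to use such a factor: dividing a raw number by factor/100 to estimate what it would be in a neutral park. This is only a sketch of a simple convention, not something savant-extras does for you, and the function name park_adjust and the raw inputs are hypothetical. The two factors are the COL values printed above.

```python
def park_adjust(raw_value, park_factor):
    """Scale a raw rate or count to a neutral park (factor 100).

    Hypothetical helper: not part of savant-extras.
    """
    if park_factor <= 0:
        raise ValueError("park_factor must be positive")
    return raw_value * 100.0 / park_factor


# Factors from the COL row shown above: pf_5yr=116, pf_hr=131.
adjusted_runs = park_adjust(5.0, 116)  # hypothetical 5.0 runs per game
adjusted_hr = park_adjust(30, 131)     # hypothetical 30 home runs
print(round(adjusted_runs, 2), round(adjusted_hr, 1))
```

A factor above 100 shrinks the raw number and a factor below 100 inflates it. Whether the full factor should be applied to a player's season total depends on how the factor was built and how many games were played at home, so treat the result as a rough illustration.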

Common Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| player_type | str | "batter" or "pitcher" (where applicable) |
| min_* | int or str | Minimum qualifier. Pass an int (e.g. min_pa=100) or "q" to apply the MLB standard qualifier automatically. |
| position | str | Position filter (arm_strength): "", "RF", "SS", etc. |
| pitch_type | str | Pitch type filter (pitch_movement): "FF", "SL", etc. |

Multi-Season Queries

Most functions have a _range(start_year, end_year) variant:

from savant_extras import pitch_tempo_range, arm_strength_range

# Compare pitch tempo pre/post pitch clock
df = pitch_tempo_range(2022, 2024)

# 5 years of arm strength
df = arm_strength_range(2020, 2024)
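
Several leaderboards only have data from a certain season onward, so a multi-season query may need its start year raised first. The sketch below shows one way to do that before calling a _range() function. The first-season values are copied from the "Data from" column in the tables above; the helper clamp_seasons and the FIRST_SEASON mapping are hypothetical and not part of savant-extras. How the library itself reacts to an out-of-range season is not documented here, which is why the check is done up front.

```python
# First season with data, copied from the "Data from" column above.
# Functions without a listed first season are deliberately left out.
FIRST_SEASON = {
    "bat_tracking": 2024,
    "pitch_tempo": 2010,
    "timer_infractions": 2023,
    "arm_strength": 2020,
    "park_factors": 2015,
}


def clamp_seasons(name, start_year, end_year):
    """Return (start, end) with start raised to the first season that has data.

    Hypothetical helper: not part of savant-extras.
    """
    if start_year > end_year:
        raise ValueError("start_year must not exceed end_year")
    first = FIRST_SEASON.get(name)
    if first is None:
        return start_year, end_year  # no documented floor; pass through
    if end_year < first:
        raise ValueError(f"{name} has no data before {first}")
    return max(start_year, first), end_year


print(clamp_seasons("arm_strength", 2018, 2024))
print(clamp_seasons("pitch_tempo", 2022, 2024))
```

The returned pair can be unpacked into the matching call, for example arm_strength_range(*clamp_seasons("arm_strength", 2018, 2024)). Both bounds are treated as inclusive, matching the park_factors_range(2020, 2025) example that yields 6 seasons.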

Demo App

MLB Bat Tracking Dashboard — built with savant-extras (source)

Why savant-extras?

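| Leaderboard | pybaseball | savant-extras |
| --- | --- | --- |
| Bat tracking (date range) | Full season only | Custom date ranges |
| Pitch tempo | Not supported | Supported |
| Arm strength | Not supported | Supported |
| Batted ball profile | Not supported | Supported |
| Home runs | Not supported | Supported |
| Pitch movement | Not supported | Supported |
| Swing & take | Not supported | Supported |
| Year-to-year changes | Not supported | Supported |
| Pitcher arm angle | Not supported | Supported |
| Running game (pitcher) | Not supported | Supported |
| Catcher blocking | Not supported | Supported |
| Catcher throwing | Not supported | Supported |
| Catcher stance | Not supported | Supported |
| Baserunning run value | Not supported | Supported |
| Basestealing run value | Not supported | Supported |
| Timer infractions | Not supported | Supported |
| Park factors (FanGraphs) | Not supported | Supported |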

Known Issues

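  • swing_take(): The CSV endpoint for Baseball Savant's Swing & Take leaderboard is currently broken (it returns a header row only, with no data rows), so the function returns an empty DataFrame for now. Once the upstream API recovers, it will work again without any code changes. In the meantime, use batted_ball() or year_to_year() as alternatives.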

Cloud Environment Notes

park_factors() and park_factors_range() fetch data from FanGraphs. In cloud environments (Kaggle, Google Colab, GitHub Actions), FanGraphs may return 403 errors. In that case, park_factors_range() returns an empty DataFrame with a warning instead of raising an exception.

Recommended workaround: pre-download locally and upload as a dataset file.

# Run locally and save
from savant_extras import park_factors_range
df = park_factors_range(2024, 2025)
df.to_csv("park_factors.csv", index=False)
# Upload park_factors.csv to your Kaggle dataset, then read it in the notebook
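
In a notebook, the live fetch and the uploaded CSV can be combined so that the file is only used when the fetch comes back empty. The helper below is a minimal sketch of that pattern. The name fetch_or_fallback is hypothetical, and it assumes only that the returned object has a pandas-style .empty attribute, which is how the empty-DataFrame behavior described above can be detected. The CSV path in the usage comment is the file name from the step above; the actual location in your environment may differ.

```python
def fetch_or_fallback(fetch, fallback):
    """Call fetch(); if it returns an empty frame, call fallback() instead.

    Both arguments are zero-argument callables returning an object with a
    pandas-style `.empty` attribute. Hypothetical helper: not part of
    savant-extras.
    """
    df = fetch()
    # A result without an `.empty` attribute is treated as unusable.
    if getattr(df, "empty", True):
        return fallback()
    return df


# Intended wiring in a notebook (requires savant-extras and pandas):
#   import pandas as pd
#   from savant_extras import park_factors_range
#   df = fetch_or_fallback(
#       lambda: park_factors_range(2024, 2025),
#       lambda: pd.read_csv("park_factors.csv"),
#   )
```

Passing callables rather than DataFrames means the CSV is only read when it is actually needed, and the same helper works for any other leaderboard function that can return an empty result.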

In Kaggle notebooks, pybaseball is not pre-installed. Install both explicitly:

!pip install savant-extras pybaseball

License

MIT
