chore: bump pyarrow to unlock python 3.14 support #707
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Greptile Summary
This PR bumps
| Filename | Overview |
|---|---|
| packages/data-designer-config/pyproject.toml | pyarrow bumped from >=19.0.1,<20 to >=22,<23; Python 3.14 classifier added. Lock file matches. |
| .github/workflows/ci.yml | Python 3.14 added to all five job matrices; no other logic changed. |
| uv.lock | pyarrow entry updated from 19.0.1 to 22.0.0 with full wheel list including cp314 wheels; specifier matches pyproject.toml. |
| Makefile | Comment for DOCS_PYTHON_VERSION updated to remove stale pyarrow 3.14 rationale; default stays at 3.13. |
| .agents/skills/datadesigner-docs/SKILL.md | Two doc lines updated: notebook-deps comment and troubleshooting table row both de-reference the now-resolved 3.14/pyarrow constraint. |
| README.md | Python version badge updated from 3.10–3.13 to 3.10–3.14. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Install DataDesigner] --> B{Python version?}
B -->|3.10 to 3.14| C[Resolve pyarrow 22.x]
C --> D{Platform?}
D -->|Linux glibc 2.28+ / macOS / Windows| E[Pre-built wheel available]
D -->|Linux glibc less than 2.28 - EOL distros| F[No wheel - install fails]
E --> G[CI: ubuntu-latest + macos-latest, Python 3.10 3.11 3.12 3.13 3.14]
G --> H[3766 unit tests pass]
Reviews (1): Last reviewed commit: "Simplify pyarrow dep"
Review: PR #707 - chore: bump pyarrow to unlock python 3.14 support
Summary
Bumps
Findings
Correctness
Style / conventions
Tests
Security / supply chain
Suggestions
Verdict
Approve with minor suggestions. Clean, well-scoped dependency bump. The diff stays in dep/CI/docs files only, the import-direction and structural invariants from
Nice work on this one, @mikeknep: this is a tidy dependency unlock with the right CI surface expanded.
Summary
This PR bumps DataDesigner's parquet dependency to
I also ran an extra isolated validation pass for the pyarrow 22 risk: Python 3.14.4 and 3.10.12 e2e envs both installed
Findings
No findings.
What Looks Good
Verdict
Ship it. This review was generated by an AI assistant.
johnnygreco left a comment
Approved. The pyarrow 22 + Python 3.14 changes look good, and the extra isolated e2e/parquet smoke coverage passed.
📋 Summary
Bumps the pyarrow dep to `>=22,<23` (previously pinned to `19.x`). Pyarrow 22 is the first release with Python 3.14 wheels.
🔗 Related Issue
Closes #673
🔄 Changes
🧪 Testing
`make test` passes
✅ Checklist
Additional note
The original issue suggested a conservative approach where Python <3.14 would continue using pyarrow 19.x while only Python 3.14 would use pyarrow 22. That split seemed annoying to maintain, so I had another agent assess whether it was necessary, and it concluded it was not; the full agent output is below.
Verdict: pyarrow 22 across the board is plausible
Test results:
pyarrow 22.0.0.
What this confirms: The runtime/behavioral risk in pyarrow 20-22 doesn't manifest in DataDesigner's parquet I/O paths. The codebase uses stable APIs that didn't change.
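For readers who want to repeat this kind of check locally, here is a minimal sketch of a parquet round-trip smoke test. It is an assumption about what such a check could look like, not DataDesigner's actual test code: the function name `parquet_round_trip_ok`, the sample columns, and the file name are all hypothetical. It uses only `pyarrow.table`, `pyarrow.parquet.write_table`, `pyarrow.parquet.read_table`, and `Table.equals`.

```python
import os
import tempfile


def parquet_round_trip_ok():
    """Write a small table to parquet, read it back, and compare.

    Returns True/False for the comparison, or None when pyarrow is not installed.
    """
    try:
        import pyarrow as pa
        import pyarrow.parquet as pq
    except ImportError:
        return None  # pyarrow missing: nothing to check in this environment

    # Illustrative data only; not taken from DataDesigner's test suite.
    table = pa.table({"id": [1, 2, 3], "label": ["a", "b", None]})

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "smoke.parquet")
        pq.write_table(table, path)      # serialize to a parquet file
        restored = pq.read_table(path)   # load it back fully into memory

    # Table.equals compares schema and values (metadata is ignored by default).
    return restored.equals(table)


if __name__ == "__main__":
    print(parquet_round_trip_ok())
```

Running the same script under each pyarrow version of interest gives a quick signal on whether basic read/write behavior differs; it does not replace the project's own test suite.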
What this does NOT eliminate: The packaging risk remains. Anyone on a pre-glibc-2.28 Linux distro (CentOS 7, RHEL 7, Ubuntu 18.04 and older — all EOL'd) who could install the previous version will now hit a wheel-not-found situation and either need to upgrade their distro or (more likely) fail to install. That's not something CI catches; it would surface as user reports.
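A user who is unsure whether their host falls into this category could check the glibc version before installing. The sketch below uses the standard library's `platform.libc_ver()`; the helper names `parse_version` and `glibc_supports_wheel` are hypothetical, and the 2.28 threshold is simply the figure quoted in the note above.

```python
import platform

MIN_GLIBC = (2, 28)  # threshold quoted in the note above


def parse_version(text):
    """Turn '2.35' into (2, 35); return None if the text is not dotted digits."""
    parts = text.split(".")
    if not all(p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def glibc_supports_wheel(lib, version, minimum=MIN_GLIBC):
    """True/False for glibc hosts, None when this check does not apply."""
    if lib != "glibc":
        return None  # e.g. macOS, Windows, or a non-glibc Linux
    parsed = parse_version(version)
    if parsed is None:
        return None  # version string could not be parsed
    return parsed >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    print(lib or "(no glibc detected)", version, glibc_supports_wheel(lib, version))
```

A `False` result suggests the host is in the affected group; `None` means the glibc check is not relevant on that platform.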
My recommendation: Go with the simpler single-pin approach. The complexity savings are real (one dependency, one CI matrix exercising the same pyarrow everywhere, no maintenance of the version-marker line as we revisit pyarrow upper bounds), and the user-impact risk is bounded: anyone affected is on EOL'd Linux. If the project later gets a bug report from such a user, reverting to a split pin in a patch release is straightforward.
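To make the trade-off concrete, the sketch below compares the two pinning strategies across the CI Python versions. The requirement strings are written in PEP 508 form as an assumption about how a split pin might have looked; only the two version ranges come from this PR. The function `pyarrow_range` is a hand-rolled stand-in for marker evaluation, not how a real resolver works.

```python
# Hypothetical requirement strings. The split pair is the alternative
# discussed above, not what this PR adopted.
SINGLE_PIN = ["pyarrow>=22,<23"]
SPLIT_PIN = [
    'pyarrow>=19.0.1,<20; python_version < "3.14"',
    'pyarrow>=22,<23; python_version >= "3.14"',
]

# Python versions exercised in CI, as (major, minor) tuples.
CI_PYTHONS = [(3, 10), (3, 11), (3, 12), (3, 13), (3, 14)]


def pyarrow_range(python_version, split):
    """Return the pyarrow range selected for a (major, minor) interpreter."""
    if not split or python_version >= (3, 14):
        return ">=22,<23"
    return ">=19.0.1,<20"


def distinct_ranges(split):
    """Collect the distinct pyarrow ranges the CI matrix would exercise."""
    return sorted({pyarrow_range(v, split) for v in CI_PYTHONS})


if __name__ == "__main__":
    print("single pin:", distinct_ranges(split=False))
    print("split pin: ", distinct_ranges(split=True))
```

With the single pin every CI job exercises the same pyarrow range, while the split pin leaves the matrix covering two ranges that must be kept in sync whenever the upper bounds are revisited.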