TST: assert reading of legacy pickles against current data #61792

Merged
jorisvandenbossche merged 9 commits into pandas-dev:main from jorisvandenbossche:tests-pickle-legacy
Jan 21, 2026
Member

@jorisvandenbossche jorisvandenbossche commented Jul 7, 2025

While reviewing #61770, I noticed that we didn't actually compare the read pickle data to a ground-truth expected value, but just to itself (we were essentially doing assert_equal(result, result)), due to an accidental change in a clean-up many years ago in f2246cf.

Fixing that here by again creating the expected unpickled data with create_pickle_data() during the test run, to compare with the data from the older pickled files.
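To illustrate why the old check could never fail, here is a minimal sketch using only the standard library. It is not the pandas test code: `create_data()` is a hypothetical stand-in for `create_pickle_data()`, and the "legacy" bytes are fabricated in memory rather than read from a stored file.

```python
import pickle


def create_data():
    # Hypothetical stand-in for create_pickle_data(): rebuilds the
    # reference objects with the currently installed code.
    return {"ints": [1, 2, 3], "label": "a"}


# Simulate a legacy file whose contents no longer match what the
# current code would produce (the last integer differs).
legacy_bytes = pickle.dumps({"ints": [1, 2, 4], "label": "a"})
result = pickle.loads(legacy_bytes)

# Old behaviour: comparing the unpickled data with itself always passes.
self_check_passes = result == result

# Fixed behaviour: comparing against freshly generated data can fail.
ground_truth_check_passes = result == create_data()

print(self_check_passes, ground_truth_check_passes)
```

The self-comparison succeeds regardless of what was unpickled, while the comparison against regenerated data reports the discrepancy.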

@jorisvandenbossche jorisvandenbossche added the Testing (pandas testing functions or related to the test suite) and IO Pickle (read_pickle, to_pickle) labels Jul 7, 2025
Comment thread pandas/tests/io/generate_legacy_storage_files.py Outdated
        and legacy_version < Version("1.3.0")
    ):
        # convert to wall time
        # (bug since pandas 2.0 that tz gets dropped for older pickle files)
Member

Is there an issue ref for this?

Member Author

We had one, #54659, but it was reported for an even older pickle file and was therefore closed.

We can reopen that issue, but on the other hand I'm not sure it is still worth the time to fix, since this has already been broken for the full 2.x cycle, and at some point in the future we will probably drop compat for 1.x pickles altogether. But of course, if someone wants to do a PR, I suppose that is welcome.
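For readers unfamiliar with the term, "convert to wall time" in the code comment under discussion means keeping the local clock reading while discarding the timezone. The sketch below shows the idea with the standard library only; the fixed UTC-5 offset and the sample timestamp are illustrative assumptions, not values from the PR. In pandas, the analogous operation on tz-aware data is `tz_localize(None)`.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset, chosen purely for illustration.
eastern = timezone(timedelta(hours=-5))
aware = datetime(2020, 1, 1, 9, 30, tzinfo=eastern)

# Wall time: keep the local clock reading, drop the timezone.
wall_time = aware.replace(tzinfo=None)

# For contrast: convert to UTC first, then drop the timezone.
utc_naive = aware.astimezone(timezone.utc).replace(tzinfo=None)

print(wall_time)  # 2020-01-01 09:30:00
print(utc_naive)  # 2020-01-01 14:30:00
```

The two naive values differ by the UTC offset, which is why the expected data has to be adjusted to match what the affected legacy pickles produce once the timezone is dropped.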

@mroeschke mroeschke added this to the 3.0 milestone Jul 7, 2025
@jbrockmendel
Member

can you merge main and see if the pyarrow decimal issue resolves itself?

@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions Bot added the Stale label Aug 10, 2025
@jorisvandenbossche jorisvandenbossche removed this from the 3.0 milestone Jan 14, 2026
 for typ, dv in data.items():
     for dt, result in dv.items():
-        expected = result
+        expected = current_data[typ][dt]
Member Author

This is the actual fix to ensure we are testing things correctly (and not just comparing the result with itself)
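The shape of the corrected loop can be sketched as follows. This is a simplified illustration rather than the actual test: `find_mismatches` is a hypothetical helper, the nested dictionaries hold plain lists instead of pandas objects, and `!=` stands in for the pandas testing comparison helpers.

```python
def find_mismatches(data, current_data):
    """Return (typ, dt) keys whose legacy value differs from current data."""
    mismatches = []
    for typ, dv in data.items():
        for dt, result in dv.items():
            # Look up the independently generated expected value
            # instead of reusing `result`.
            expected = current_data[typ][dt]
            if result != expected:
                mismatches.append((typ, dt))
    return mismatches


# Toy stand-ins for the unpickled legacy data and the regenerated data.
legacy = {"series": {"int": [1, 2, 3], "float": [0.5, 1.5]}}
current = {"series": {"int": [1, 2, 3], "float": [0.5, 2.5]}}

print(find_mismatches(legacy, current))  # [('series', 'float')]
print(find_mismatches(legacy, legacy))   # []
```

Passing the same mapping twice reproduces the old self-comparison and reports nothing, whereas the lookup into separately generated data surfaces the differing entry.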

@jorisvandenbossche jorisvandenbossche added this to the 3.0 milestone Jan 21, 2026
@jorisvandenbossche
Member Author

Going to merge this to ensure we are actually testing the pickle compat in the 3.0.x branch.

@jorisvandenbossche jorisvandenbossche merged commit fd2a4f4 into pandas-dev:main Jan 21, 2026
42 checks passed
@jorisvandenbossche jorisvandenbossche deleted the tests-pickle-legacy branch January 21, 2026 13:06
vkverma9534 pushed a commit to vkverma9534/pandas that referenced this pull request Jan 30, 2026
Labels

IO Pickle (read_pickle, to_pickle), Testing (pandas testing functions or related to the test suite)
