Skip to content

ENH: __repr__ for NpzFile object#23357

Merged
seberg merged 8 commits intonumpy:mainfrom
ganesh-k13:NpzFile_str
Apr 6, 2023
Merged

ENH: __repr__ for NpzFile object#23357
seberg merged 8 commits intonumpy:mainfrom
ganesh-k13:NpzFile_str

Conversation

@ganesh-k13
Copy link
Copy Markdown
Member

@ganesh-k13 ganesh-k13 commented Mar 7, 2023

Changes

Code:

import numpy as np

from tempfile import TemporaryFile
outfile = TemporaryFile()
x = np.arange(10)

np.savez(outfile, *[x]*10)
_ = outfile.seek(0) # Only needed here to simulate closing & reopening file
npzfile = np.load(outfile)
print(npzfile)

np.lib.npyio.NpzFile.MAX_REPR_ARRAY_COUNT=10
print(npzfile)

Before:

<numpy.lib.npyio.NpzFile object at 0x7f3a202f8d90>
<numpy.lib.npyio.NpzFile object at 0x7f3a202f8d90>

Now:

<NpzFile object containing arr_0, arr_1, arr_2, arr_3, arr_4...>
<NpzFile object containing arr_0, arr_1, arr_2, arr_3, arr_4, arr_5, arr_6, arr_7, arr_8, arr_9>

Todo

  • Small release notes
  • Typing changes
  • Some doc changes (?)

resolves #23319

@charris charris changed the title ENH: __str__ for NpzFile object ENH: __str__ for NpzFile object Mar 7, 2023
@ganesh-k13 ganesh-k13 force-pushed the NpzFile_str branch 2 times, most recently from e0eed41 to a6f123d Compare March 8, 2023 07:03
@ganesh-k13
Copy link
Copy Markdown
Member Author

@seberg is this what you had in mind?

@ganesh-k13 ganesh-k13 added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Mar 8, 2023
@seberg
Copy link
Copy Markdown
Member

seberg commented Mar 8, 2023

Something in this direction, maybe @rossbar has more opinions suggestions though. I really just thought that including the filename may be nice and that we should use __repr__ since that is what users probably end up staring at most.

@ganesh-k13 ganesh-k13 changed the title ENH: __str__ for NpzFile object ENH: __repr__ for NpzFile object Mar 8, 2023
ganesh-k13 added a commit to ganesh-k13/numpy that referenced this pull request Mar 9, 2023
@ganesh-k13 ganesh-k13 marked this pull request as ready for review March 9, 2023 14:40
@ganesh-k13 ganesh-k13 removed the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Mar 9, 2023
@seberg seberg requested a review from rossbar March 9, 2023 14:41
Copy link
Copy Markdown
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small comments, otherwise I would be happy with this. To be clear, I am also happy to use a very different __repr__. For example, I am having second thoughts if there really is a reason for including the <> just because exec(repr) doesn't make sense.

Copy link
Copy Markdown
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ganesh-k13 this is a very nice usability improvement. I've added my own little nits but like @seberg no major blockers, though I do lean pretty strongly against adding another publicly-exposed parameter.

It'd also be nice to fix the keys() repr here, but that can also be saved for a separate PR!

@ganesh-k13 ganesh-k13 force-pushed the NpzFile_str branch 2 times, most recently from 19145be to 4b7ba5f Compare March 18, 2023 05:48
* Made ``MAX_REPR_ARRAY_COUNT`` private
* Change repr string to make it more readable
@seberg
Copy link
Copy Markdown
Member

seberg commented Mar 27, 2023

@rossbar should we just give this a shot, or do you have any tweaking ideas for how to format the repr?

Copy link
Copy Markdown
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ganesh-k13 I took the liberty of making two minor adjustments - I hope you don't mind!

  1. There was one remaining instance of the old-style %s formatting - I managed to get rid of it by slightly modifying the logic in d8b97f5.
  2. I also took the liberty of simplifying the test. My main goal was to remove the parameterization of the _MAX_REPR_ARRAY_COUNT attribute, as this struck me as potentially problematic for niche testing setups, such as working with extensions like pytest-randomly and/or pytest-xdist. Also - I don't know that we'd want to encourage users to modify that value, so leaving it out of the test parametrization seems desirable.

Hopefully neither of these changes is too disruptive, but please feel free to git reset if you want to discuss further! Other than that, LGTM thanks @ganesh-k13 . IMO this is a nice usability improvement 🚀

@rossbar
Copy link
Copy Markdown
Contributor

rossbar commented Apr 5, 2023

FWIW my suggestions broke the linter, but IMO the line that is 2 characters over the limit looks a lot better than splitting it up into multiple lines. Please feel free to reformat if you disagree!

@seberg
Copy link
Copy Markdown
Member

seberg commented Apr 6, 2023

Thanks Ganesh and Ross for the input. I like the new repr, and if anyone has a good idea for improving it, I think improving the repr is fine to just follow up.

@seberg seberg merged commit be0e502 into numpy:main Apr 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ENH: Explicitly show keys of .npz file in repr

3 participants