-
-
Notifications
You must be signed in to change notification settings - Fork 12k
ENH: implement voidtype_repr and voidtype_str #8981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This needs some kind of prefix/suffix to make non-alphabetic hex-strings distinguishable from ints. But hex seems like the right output to me |
|
Maybe like |
|
It would also be nice if the or just taking advantage of the existing bytes->void conversion, at the cost of verbosity (using uppercase for readability) Of course, |
|
Hmm that's pretty good. Let me try that. |
|
While we're at it, |
|
if its hex you should prefix it with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as this is private api, does pypy have it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not.. I was going to update it once we dicided what format we want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could fall back on the "invoke something in _internal.py" strategy again - repr doesn't need to be incredibly fast
|
@juliantaylor: I'm not a fan of So then we're left comparing:
The first option is surprising, because passing bytes to void (py2) should give a buffer. So we're faced with the last two, and the first is just shorter |
|
hm yes as void can represent arbitrary bytes I like the |
|
I am actually trying out your idea of I like that because it gives back the original array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can just call PyBytes_FromStringAndSize on obval here (it is defined in our compat headers to work in python2 too)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably the same in this case, but shouldn't this be PyObject_Repr here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, already changed on my side
03408ec to
fa50ebd
Compare
|
I still need to figure out what is going on with 0-d arrays. Apparently they are printed totally differently here which I want to try to fix. >>> a = np.zeros(4, dtype='V4')
>>> a
array([b'\x00\x00\x00\x00', b'\x00\x00\x00\x00', b'\x00\x00\x00\x00',
b'\x00\x00\x00\x00'],
dtype='|V4')
>>> np.array(a[0])
array(array([0, 0, 0, 0], dtype=int8),
dtype='|V4') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't enough - I think that this should be our output:
>>> np.array("Hello world").view(np.void)
array(b'\x68\x65\x6C\x6C\x6F\x20\x77\x6F\x72\x6C\x64', dtype=void)
Showing ascii characters as text isn't useful - we already have the bytes_ dtype for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which you can produce with:
"b'{}'".format(''.join(map(r'\x{:02X}'.format, bytearray( voiditem.tobytes() ))))
I think that's a result of the |
Yes, that's right, since That code block (for |
|
Missed your link there |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe just do this always - 2.7 supports the syntax, and that saves confusion when from __future__ import unicode_literals is in place. extrachars would go too, as a result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point, done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is lowercase - which makes separating the hex chars from the x a little harder on the eye
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lowercase seems to be the python standard though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If nothing else, I think this function should have a comment pointing out that the string it produces is lowercase.
Uppercase is pretty typical of hex memory viewers, isn't it? If anything, being inconsistent with bytes.__repr__ makes it more obvious to the user that they're looking at a void
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how I feel about void.__str__ looking like byte.__repr__. Seems that here printing out raw hex might be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the form void(hex='1b5b324b07410a08'), are you suggesting we modify the void constructor to add a hex argument? If not, I don't like the fact that the repr couldn''t actually be converted to an instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I was indeed suggesting we allow it to take a kwarg. But that was an alternative to the bytes solution you settled for, and not my point here.
Here I'm suggesting something like:
>>> print(repr(v))
'\x12\x34\xAB`
>>> print(str(v))
12 34 AB
Which is sort of consistent with
>>> s = 'hello world'
>>> print(repr(v))
'hello world'
>>> print(str(v))
hello world
|
After looking at fixing the 0d array problem, it looks like a more general and more difficult problem better fixed in a future other PR. We seem to have double implementations of many of the dtype reprs: once in |
|
Wonderful. I agree, that can go in another PR |
25438a2 to
7e39112
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mismatching quotes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh oops
|
Saw this as it had drifted up in the sort-by-update list. Nice! I briefly wondered about documentation beyond the release notes, but it seems there are no examples where void arrays are shown. |
466d620 to
3248ed9
Compare
|
Release notes updated |
doc/release/1.14.0-notes.rst
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the void type was always customizable, but before it was done through the (still) mis-documented 'numpystr' key
numpy/core/tests/test_arrayprint.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add an eval(repr(x), vars(np)) == x test, like we have elsewhere, for both scalars and arrays? That's one of the features of this patch too
|
Good idea, done. |
|
Needs rebase. |
|
Sounds like this is ready. Needs a rebase. |
|
Rebased |
|
Merged, thanks Allan. |
|
|
||
|
|
||
| def _block_check_depths_match(arrays, parent_index=[]): | ||
| class _Recurser(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh oh - bad merge / rebase - this just reverted #9667
This restores the changes in numpygh-9667 that were overwritten.
This PR implements
voidtype_reprandvoidtype_strto output something more sensible.Currently, unstructured void types print their raw values directly to output, allowing you to do some funny things:
(if you paste this into an ANSI terminal the output willl be red colored and the terminal will beep)
In this PR I've implemented that the hex representation of the byte-string will be printed:
I also toyed with printing something like
<memory>,<V8>,<memory at 123456>for the elements. Better suggestions are welcome.