Fix print precision and match numpy behavior #12746
ailzhang wants to merge 16 commits into pytorch:master
Conversation
soumith
left a comment
The test cases in the issue description, where you match numpy behavior: make them test cases.
```python
self.max_width = 1

# use tensor_view for 0-dim tensor iteration
tensor_view = tensor.view(tensor.nelement())
```
Force-pushed from abc306f to e92f38c
@soumith I added a few tests. There is one thing left to do: I actually think we should port the dragon4_scientific code from Numpy here to make the print prettier. But it should be in a separate PR and depends on the priority. I opened #12797 for it.
soumith
left a comment
lgtm. Instead of separate expect files for each string, it feels like it's better to inline the expected strings in the tests. A separate file doesn't add much value here, and one has to jump to an extra file to find out what the expected value is supposed to be.
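As a rough sketch of the inline-expect style being suggested (a hypothetical test case using plain `unittest`; the real PR uses PyTorch's own test harness, and `repr(0.0001)` merely stands in for the tensor string under test):

```python
import unittest

class TestTensorPrint(unittest.TestCase):
    def test_small_float_repr(self):
        # The expected string sits inline next to the assertion,
        # so no separate .expect file is needed to see it.
        self.assertEqual(repr(0.0001), "0.0001")

# Run the case programmatically so the sketch is self-contained.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestTensorPrint)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```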
facebook-github-bot
left a comment
ailzhang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
I implemented inline expect tests in my spare time. Let me put it in PyTorch.
Summary: Fixes pytorch#12578, pytorch#9395.

* Fix and simplify print logic
* Follow numpy print rule https://github.com/numpy/numpy/blob/eb2bd11870731ea19a0eee72e616c7deb00f6c54/numpy/core/arrayprint.py#L859

> scientific notation is used when absolute value of the smallest number is < 1e-4 or maximum > 1e8 or the ratio of the maximum absolute value to the minimum is > 1e3

I hope I didn't break anything since there seem to be a lot of edge cases here... Here are some easy sanity checks.

```
In [5]: torch.tensor(1)
Out[5]: tensor(1)
Out[2]: array(1)  # numpy

In [6]: torch.tensor(10)
Out[6]: tensor(10)
Out[3]: array(10)  # numpy

In [8]: torch.tensor(99000000)
Out[8]: tensor(99000000)
Out[5]: array(99000000)  # numpy

In [9]: torch.tensor(100000000)
Out[9]: tensor(100000000)
Out[6]: array(100000000)  # numpy

In [10]: torch.tensor(100000001)
Out[10]: tensor(100000001)
Out[7]: array(100000001)  # numpy

In [11]: torch.tensor(1000000000)
Out[11]: tensor(1000000000)
Out[8]: array(1000000000)  # numpy

In [12]: torch.tensor([1, 1000])
Out[12]: tensor([ 1, 1000])
Out[9]: array([ 1, 1000])  # numpy

In [13]: torch.tensor([1, 1010])
Out[13]: tensor([ 1, 1010])
Out[10]: array([ 1, 1010])  # numpy
```

For floating points, we use scientific notation when `max/min > 1000 || max > 1e8 || min < 1e-4`. Lines marked "old" are old behaviors that either have precision issues or are not aligned with numpy.

```
In [14]: torch.tensor(0.01)
Out[14]: tensor(0.0100)
Out[11]: array(0.01)  # numpy

In [15]: torch.tensor(0.1)
Out[15]: tensor(0.1000)
Out[12]: array(0.1)  # numpy

In [16]: torch.tensor(0.0001)
Out[16]: tensor(0.0001)
Out[14]: array(0.0001)  # numpy

In [17]: torch.tensor(0.00002)
Out[17]: tensor(2.0000e-05)
Out[15]: array(2e-05)  # numpy
Out[5]: tensor(0.0000)  # old

In [18]: torch.tensor(1e8)
Out[18]: tensor(100000000.)
Out[16]: array(100000000.0)  # numpy

In [19]: torch.tensor(1.1e8)
Out[19]: tensor(1.1000e+08)
Out[17]: array(1.1e8)  # numpy 1.14.5; in <= 1.13 this was not using scientific print
Out[10]: tensor(110000000.)  # old

In [20]: torch.tensor([0.01, 10.])
Out[20]: tensor([ 0.0100, 10.0000])
Out[18]: array([ 0.01, 10. ])  # numpy

In [21]: torch.tensor([0.01, 11.])
Out[21]: tensor([1.0000e-02, 1.1000e+01])
Out[19]: array([ 1.00000000e-02, 1.10000000e+01])  # numpy
Out[7]: tensor([ 0.0100, 11.0000])  # old
```

When printing floating-point numbers in int mode, we still need to respect the rules and use scientific mode first.

```
In [22]: torch.tensor([1., 1000.])
Out[22]: tensor([ 1., 1000.])
Out[20]: array([ 1., 1000.])  # numpy

In [23]: torch.tensor([1., 1010.])
Out[23]: tensor([1.0000e+00, 1.0100e+03])
Out[21]: array([ 1.00000000e+00, 1.01000000e+03])  # numpy
Out[9]: tensor([ 1., 1010.])  # old
```

Pull Request resolved: pytorch#12746
Differential Revision: D10443800
Pulled By: ailzhang
fbshipit-source-id: f5e4e3fe9bf0b44af2c64c93a9ed42b73fa613f5
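The selection rule used throughout the sanity checks above can be sketched as a small standalone predicate (a simplified illustration only; `use_scientific` is a hypothetical helper name, not the actual function in PyTorch's print code):

```python
def use_scientific(values):
    """Decide whether a set of floats should be printed in scientific
    notation, following the numpy-style rule from this PR:
    scientific if max/min > 1000, max > 1e8, or min < 1e-4,
    where max/min are the largest/smallest nonzero magnitudes."""
    mags = [abs(v) for v in values if v != 0]
    if not mags:
        return False
    lo, hi = min(mags), max(mags)
    return hi / lo > 1000.0 or hi > 1e8 or lo < 1e-4

# Examples mirroring the sanity checks above:
print(use_scientific([0.01, 10.0]))   # ratio exactly 1000, not > 1000: fixed-point
print(use_scientific([0.01, 11.0]))   # ratio 1100 > 1000: scientific
print(use_scientific([2e-05]))        # min < 1e-4: scientific
print(use_scientific([1.1e8]))        # max > 1e8: scientific
```

Note that the ratio check is strictly greater-than, which is why `[1., 1000.]` stays in fixed-point while `[1., 1010.]` switches to scientific.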