Skip to content

Conversation

@eteq
Copy link
Member

@eteq eteq commented Jan 10, 2018

@taldcroft or @mhvk can correct me (and immediately close this PR) if I'm crazy here (or if Travis reveals it's failing), but it looks to me like group_by now works on all the relevant mixin classes I see. I wonder if this got fixed implicitly somewhere and the NotImplementedError is just now wrong?

@eteq eteq added the table label Jan 10, 2018
@astropy-bot
Copy link

astropy-bot bot commented Jan 10, 2018

Hi there @eteq 👋 - thanks for the pull request! I'm just a friendly 🤖 that checks for issues related to the changelog and making sure that this pull request is milestoned and labeled correctly. This is mainly intended for the maintainers, so if you are not a maintainer you can ignore this, and a maintainer will let you know if any action is required on your part 😃.

I noticed the following issues with this pull request:

  • The milestone has not been set (this can only be set by a maintainer)
  • Changelog entry not present (or pull request number missing) and neither the Affects-dev nor the no-changelog-entry-needed label are set

Would it be possible to fix these? Thanks!

If there are any issues with this message, please report them here.

grped = t.group_by('index')
assert np.all(grped['index'] == ['a', 'b', 'b', 'c'])
assert np.all(grped['quantity'] == [0, 1, 3, 2]*u.m)
assert np.all(grped['skycoord'] == [0, 1, 3, 2]*u.deg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to compare grped['skycoord'].ra here.

@mhvk
Copy link
Contributor

mhvk commented Jan 10, 2018

I'll let @taldcroft look into this in more detail -- I've never really looked into grouping -- but did see an obvious problem with a test.

@taldcroft
Copy link
Member

taldcroft commented Jan 10, 2018

@eteq - in principle there should not be any problem with mixin columns as long as they are not used as a key column. That is the simple answer.

The more nuanced answer is that you can use mixins for key column(s) as long as the output of tbl.as_array() is reasonable and sortable. Obviously SkyCoord is a fail there, but Quantity is OK, and Time is kind of OK. This is due to the current implementation of Table.argsort, which converts to a plain structured ndarray and uses the argsort method there.

By "kind of OK" I mean this:

In [15]: t = Table([Time([2,1,1], format='cxcsec'), [3,4,2]*u.m])

In [16]: t.as_array()
Out[16]: 
array([(<Time object: scale='tt' format='cxcsec' value=2.0>, 3.0),
       (<Time object: scale='tt' format='cxcsec' value=1.0>, 4.0),
       (<Time object: scale='tt' format='cxcsec' value=1.0>, 2.0)], 
      dtype=[('col0', 'O'), ('col1', '<f8')])

In [17]: t.argsort()
Out[17]: array([2, 1, 0])

So it actually works, but at the expense of creating individual Time objects for each element.

Interestingly, SkyCoord does not fail here and t.argsort() gives an answer (but I don't know what it means). I was surprised to see that SkyCoord does not fail for < and > operations. (Though I was using astropy 2.0 on Python 2.7).

[EDIT] - Python 3 is just better and gives the expected exception TypeError: '<' not supported between instances of 'SkyCoord' and 'SkyCoord'.

@taldcroft
Copy link
Member

Takeaway: you can open up functionality with a simple PR that just checks that no specified key columns are non-Quantity mixins. The relevant check that a col is not OK is:

bad_key = has_info_class(col, MixinInfo) and not has_info_class(col, QuantityInfoBase)

This would allow e.g. EarthLocation and I honestly don't know if that is meaningful either.
You could also allow Time:

bad_key = has_info_class(col, MixinInfo) and not has_info_class(col, (QuantityInfoBase, TimeInfo))

OR, for the simplest of all (requiring minimal additional testing)

bad_key = has_info_class(col, MixinInfo)

@astropy-bot astropy-bot bot closed this Jul 18, 2018
@astrofrog
Copy link
Member

This was closed without warning by the bot so re-opening

@astrofrog astrofrog reopened this Jul 18, 2018
@astropy astropy deleted a comment from astropy-bot bot Jul 19, 2018
@astropy-bot
Copy link

astropy-bot bot commented Jul 19, 2018

Hi humans 👋 - this pull request hasn't had any new commits for approximately 6 months. I plan to close this in a month if the pull request doesn't have any new commits by then.

In lieu of a stalled pull request, please consider closing this and open an issue instead if a reminder is needed to revisit in the future. Maintainers may also choose to add keep-open label to keep this PR open but it is discouraged unless absolutely necessary.

If this PR still needs to be reviewed, as an author, you can rebase it to reset the clock. You may also consider sending a reminder e-mail about it to the astropy-dev mailing list.

If you believe I commented on this pull request incorrectly, please report this here.

@bsipocz
Copy link
Member

bsipocz commented Aug 6, 2018

@eteq - I would really love to use this right now :)

Any plan to come back to this PR?

@taldcroft
Copy link
Member

I just started taking a look at this. Maybe I can knock off a quick PR.

@bsipocz
Copy link
Member

bsipocz commented Aug 6, 2018

(Actually the my immediate need for this is worked around, but nevertheless, your suggestion above worked perfectly well for the use case I had with mixin columns (that are not being used as keys).)

@eteq
Copy link
Member Author

eteq commented Aug 7, 2018

Aha, I was about to look prompted by @bsipocz, but I guess #7712 supercedes it. As a reply, I will try to review #7712 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants