Skip to content

Documented an issue in the cartesian product function in utils/extmat…#16406

Merged
rth merged 4 commits intoscikit-learn:masterfrom
hcars:master
Aug 7, 2020
Merged

Documented an issue in the cartesian product function in utils/extmat…#16406
rth merged 4 commits intoscikit-learn:masterfrom
hcars:master

Conversation

@hcars
Copy link
Copy Markdown
Contributor

@hcars hcars commented Feb 7, 2020

…h.py

Reference Issues/PRs

See issue
Unable to Take Cartesian Products of Many Arrays
#16350 .

What does this implement/fix? Explain your changes.

This fixes the problem of only accepting 32 arrays into the cartesian function.

Any other comments?

@adrinjalali
Copy link
Copy Markdown
Member

Thanks for the PR @hcars
I think the idea was to document the limitation of the method, and not fixing the issue.

If we'd like to support larger input, we may need to implement that in an efficient way (in python or cython), and I'm not sure of the performance of your fix.

If we go for the fix, it also needs to add tests to test the implementation.

Another point to consider is how severe the issue is, and when exactly from the user's perspective the issue happens, which I don't see in your reported issue.

@hcars
Copy link
Copy Markdown
Contributor Author

hcars commented Feb 17, 2020

Thank you for the feedback, @adrinjalali . I can add some unit tests for the method if need be. I'm just not sure how to do that with this repo.

The problem arose for me while using the method to take the Cartesian product as a step in a reduction of the SIR forecasting problem to DNF counting. This reduction routinely requires obtaining the product of more than 32 sets. I can just add documentation, however, if you don't think adding a fix is worth it.

@adrinjalali
Copy link
Copy Markdown
Member

I'm not sure what SIR or DNF are, but sounds pretty specific to me. I think at this stage just stating in the docs that there's a limitation would be idea. Since it's a numpy limitation (if I understand correctly), you could also open an issue on numpy upstrea, which would make sense I guess.

@jnothman
Copy link
Copy Markdown
Member

jnothman commented Feb 17, 2020 via email

@hcars
Copy link
Copy Markdown
Contributor Author

hcars commented Feb 18, 2020

@jnothman it's for an epidemiological forecasting toolkit that will be released soon. It's more of an agent-based model. I don't want to provide excessive detail before it's released. I will just change my fork of this to document the issue and remove my code.

Copy link
Copy Markdown
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I like the warning now @hcars, and good luck with your library :)

Copy link
Copy Markdown
Member

@rth rth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@rth
Copy link
Copy Markdown
Member

rth commented Feb 25, 2020

Could you please merge upstream/master into this PR, to see if that would fix the CI?

@rth rth self-assigned this Aug 7, 2020
@rth rth merged commit eff1bdf into scikit-learn:master Aug 7, 2020
jayzed82 pushed a commit to jayzed82/scikit-learn that referenced this pull request Oct 22, 2020
…-learn#16406)

Co-authored-by: Henry <hlc5v@virginia.edu>
Co-authored-by: Roman Yurchak <rth.yurchak@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants