More performance improvements for generating neighbors#129
More performance improvements for generating neighbors#129mfeurer merged 10 commits intoautoml:masterfrom
Conversation
|
Thanks a lot for the patch. Unfortunately, I cannot confirm any speed improvements on my machine. I tried the unit tests in My outputs with this PR: without: I have some further ideas on how to improve the performance of the transform functions that I'd be happy to share if you're interested, but they might take a bit more time to implement. |
|
Thanks, I will look into it soon! And I would be happy to hear your ideas. |
|
I think a lot of slowdown here comes from having a single method to handle both the array and the non-array case, which results in quite some overhead. Instead, if one would have a function |
|
Hi @mfeurer, I investigated the
However, looking at the I could split all the |
That sounds great!
Keep in mind that this is also important for generating nearest neighbor (for SMAC), and I assume it has a larger effect there.
I think this would be most important.
From a maintenance point I would definitely be in favor of this refactoring as it will make the code clearer an replace a few if/else statements. One can then also start typing the transform functions. |
|
So I'd definitely appreciate an updated PR, but I think you need to decide if it's worth for you splitting up the other transform functions. |
|
Hi @mfeurer, I have split everything into scalar and vector transforms and debugged most problems I encountered, but I'm still failing one check: in For example, the second Do you have an idea of what could be wrong here? |
|
Do the unit tests work locally for you? Currently, they are killed on travis-ci which makes it really hard to see what's going on. Could you please check? |
|
Hi @mfeurer, it turned out to be a problem with precision. Cython uses 32 bit floats & Python uses 64 bit floats, and this caused the Cython floats to be rounded down to the wrong bucket. So the solution was to use double's instead of float's. And I fixed some other problems as well. |
mfeurer
left a comment
There was a problem hiding this comment.
Great work, I only have a few questions about this.
…ixed ordinal/categorical transforms.
mfeurer
left a comment
There was a problem hiding this comment.
Great work! There's just one final request for changes.
mfeurer
left a comment
There was a problem hiding this comment.
Thanks a lot for the quick response. Amazing work!
|
Ah, I just realized what "requested changes" actually means. Didn't see the changed code before. Are your changes also included now? |
Sorry, I don't get this. Did I change some part of the code? |
|
Oh nevermind, I thought you did. I got confused because there was a button that said "View Changes". |
I have added more performance improvements and cleaned up the code for the
_transform()functions so that it is more consistent and has less code repetition.I ran the unit tests locally on my computer and found that running all cases took ~36 seconds with the code in my previous pull request (#127). This is now further reduced to ~16 seconds.