[SPARK-13185][SQL] Reuse Calendar object in DateTimeUtils.StringToDate method to improve performance #11090
|
Test build #50810 has finished for PR 11090 at commit
|
|
This kind of optimization always feels icky to me, but it is pretty self-contained. Are there more instances of this across the code that we can at least gather and handle in one place? |
|
Reusing a
|
|
I'm not arguing with the optimization, just moaning at the use of However now that I am back in front of the computer I don't see any. Most are in test code. There are some silly calls to Yeah, Is your 20-second improvement the total wall-clock time difference over all records, as in saving about 6 nanoseconds per record? Although I like optimization, that sounds very small. #11071 doesn't seem to be correct; is that intended to be part of this change? |
|
@srowen The 20-second improvement is the difference in stage time, i.e. before the patch the stage runs in 1.6 min; with this patch it runs in 1.2 min. It takes about 1 second to create 1 million Right, #11071 also wants to optimize the |
|
OK, I think this is OK to merge |
|
Jenkins, retest this please |
|
Test build #51269 has finished for PR 11090 at commit
|
|
Thanks - I've merged this in master. |
The Java Calendar object is expensive to create. I have a sub query like this:
SELECT a, b, c FROM table UV WHERE (datediff(UV.visitDate, '1997-01-01')>=0 AND datediff(UV.visitDate, '2015-01-01')<=0))
The table stores visitDate as String type and has 3 billion records. A Calendar object is created every time DateTimeUtils.stringToDate is called. By reusing the Calendar object, I saw about 20 seconds performance improvement for this stage.
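For readers who want to see the general idea, here is a minimal sketch in Java of reusing one Calendar per thread instead of allocating a new one on every call. It is an illustration under assumptions, not the code from this patch: the class name CalendarReuse, the method daysSinceEpoch, and the choice of a ThreadLocal holding a GMT GregorianCalendar are all hypothetical. A per-thread instance is used because Calendar is mutable and not thread-safe.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical illustration; names are not taken from Spark's DateTimeUtils.
public class CalendarReuse {
    private static final long MILLIS_PER_DAY = 24L * 60 * 60 * 1000;

    // One Calendar per thread, created lazily on first use and then reused.
    private static final ThreadLocal<Calendar> GMT_CALENDAR =
        ThreadLocal.withInitial(
            () -> new GregorianCalendar(TimeZone.getTimeZone("GMT")));

    // Returns the number of days between 1970-01-01 and the given date (GMT),
    // for dates on or after the epoch. The month argument is 1-based.
    public static int daysSinceEpoch(int year, int month, int day) {
        Calendar c = GMT_CALENDAR.get();
        c.clear();                             // drop fields left over from the previous call
        c.set(year, month - 1, day, 0, 0, 0);  // Calendar months are zero-based
        return (int) (c.getTimeInMillis() / MILLIS_PER_DAY);
    }

    public static void main(String[] args) {
        // The two boundary dates from the query in the description.
        System.out.println(daysSinceEpoch(1997, 1, 1));
        System.out.println(daysSinceEpoch(2015, 1, 1));
    }
}
```

The call to clear() matters: without it, fields set by an earlier call (such as milliseconds) could leak into the next result. The trade-off is a small amount of per-thread state in exchange for avoiding one Calendar allocation per parsed string.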