feat: Add Spark date_trunc function#11340
feat: Add Spark date_trunc function#11340zml1206 wants to merge 8 commits intofacebookincubator:mainfrom
Conversation
✅ Deploy Preview for meta-velox canceled.
|
rui-mo
left a comment
There was a problem hiding this comment.
Thanks. The key idea is that by moving some of the methods to the lib for reuse, we can reduce the code duplication.
| } | ||
| }; | ||
|
|
||
| FOLLY_ALWAYS_INLINE std::optional<DateTimeUnit> fromDateTimeUnitString( |
There was a problem hiding this comment.
Would you please add some comments for the new API? If there are unit tests for them, please move them into the lib; if not, we will need to add them.
There was a problem hiding this comment.
And maybe mention it in the currently empty PR description/commit message.
jinchengchenghh
left a comment
There was a problem hiding this comment.
Thanks! Some minors. Could you also clarify what you have changed in the PR description?
There was a problem hiding this comment.
Returns timestamp ts -> Returns timestamp that ts, by the format model fmt
There was a problem hiding this comment.
that the ts falls, please quote all the ts, thanks!
There was a problem hiding this comment.
All this fields are used only once, do we need to set them? Or we could use the string directly in if condition? CC @rui-mo
For example, https://github.com/facebookincubator/velox/blob/main/velox/common/base/StatsReporter.h#L70
There was a problem hiding this comment.
This is the original presto function code. Now it is just extracted to lib for reuse. Do we need to change it? @jinchengchenghh @rui-mo
There was a problem hiding this comment.
Can we remove those defined in DateTimeFormatter then? And +1 for using string directly because these variables are used only once.
There was a problem hiding this comment.
Don't add the const to enum class, you could consider it as integer like.
https://stackoverflow.com/questions/44749658/c-is-it-better-to-pass-an-enum-class-as-value-or-const-reference
There was a problem hiding this comment.
@jinchengchenghh Thanks for review. It seems to be used this way in velox,
There was a problem hiding this comment.
No need to add const when passing by value.
There was a problem hiding this comment.
Please move the class member to the last of the struct as FromUnixtimeFunction.
And move it to private scope
There was a problem hiding this comment.
const DateTimeUnit unit. I would prefer to drop this initialization since it is used only once.
There was a problem hiding this comment.
Can we remove those defined in DateTimeFormatter then? And +1 for using string directly because these variables are used only once.
There was a problem hiding this comment.
No need to add const when passing by value.
525fdaa to
8ab10ec
Compare
There was a problem hiding this comment.
Please move the function registration after the date time functions https://github.com/facebookincubator/velox/pull/11340/files#diff-19962809ab2250cbf3c99e92b7ba14ae89e2f26ae5e646db3c2b31e67af16e8dL450
There was a problem hiding this comment.
What if ts is null?
There was a problem hiding this comment.
There was a problem hiding this comment.
Please move the function description before argument description L56.
There was a problem hiding this comment.
This is valid too.
ASSERT_EQ(
DateTimeUnit::kMillisecond, fromDateTimeUnitString("millisecond", false));
And other public functions also need tests. I'm open for this point since the function test should have cover the function you moved, but it is better to add it.
There was a problem hiding this comment.
Test invalid fmt by VELOX_ASSERT_THROW
rui-mo
left a comment
There was a problem hiding this comment.
Thanks. Added some comments.
There was a problem hiding this comment.
The example section does not render correctly. You could add the '::' as below to fix it.
* "MICROSECOND" - everything remains.
::
SELECT date_trunc('YEAR', '2015-03-05T09:32:05.359'); -- 2015-01-01 00:00:00
There was a problem hiding this comment.
The microseconds are not zeroed out. Is this example correct?
There was a problem hiding this comment.
Please avoid repeating the code in comments. Comment suggestion:
/// Returns timestamp with seconds adjusted to the nearest lower multiple of the
/// specified interval. If the given seconds is negative and not an exact
/// multiple of the interval, it adjusts further down.
There was a problem hiding this comment.
nit: format is not used either.
|
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the PR, make sure you've addressed reviewer comments, and rebase on the latest main. Thank you for your contributions! |
|
@zml1206 Would you like to rebase this PR? Thanks. |
d544dde to
67b53d7
Compare
67b53d7 to
712b231
Compare
rui-mo
left a comment
There was a problem hiding this comment.
Thanks! Looks good overall.
Extend funciton fromDateTimeUnitString
Would be elaborate how the function is extended in the PR description? Thanks.
|
@rui-mo all comments resolved, can you help take a look again? thanks. |
rui-mo
left a comment
There was a problem hiding this comment.
Thanks. Just several nits.
There was a problem hiding this comment.
nit: perhaps add this case in the test.
rui-mo
left a comment
There was a problem hiding this comment.
Can you rebase this PR? Thanks.
rui-mo
left a comment
There was a problem hiding this comment.
Can you run the fuzzer test for 2 minutes to make sure it works? You could use --only to specify this function.
Command example: https://github.com/facebookincubator/velox/blob/main/.github/workflows/scheduled.yml#L587-L596
fix style fix fix fix fix fix fix update for reuse fix style fix fix fix fix fix style update update fix fix update update fix fix update update update update add ut add ut
|
|
@bikramSingh91 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
|
@bikramSingh91 merged this pull request in f14efd0. |

Adds Spark date_trunc function. Presto date_trunc function was refactored for reuse:
DateTimeUnittofunctions/lib/DateTimeFormatter.h.fromDateTimeUnitStringby adding paramsallowMicroandallowAbbreviatedfor Spark, and moves tofunctions/lib/TimeUtils.h.adjustDateTimeandadjustEpochfunctions tofunctions/lib/TimeUtils.h.truncateTimestamptofunctions/lib/TimeUtils.h.