[C++] Temporal cast from timestamp to date rounds instead of extracting date component #26216

@asfimport

I'd expect this code to give 1950-01-01 twice (i.e. a timestamp -> date cast extracts the date component, ignoring the time component):

import datetime
import pyarrow as pa
arr = pa.array([
    datetime.datetime(1950, 1, 1, 0, 0, 0),
    datetime.datetime(1950, 1, 1, 12, 0, 0),
], type=pa.timestamp("ns"))
print(arr)
print(arr.cast(pa.date32(), safe=False)) 

However, the second value comes out as 1950-01-02:


[
  1950-01-01 00:00:00.000000000,
  1950-01-01 12:00:00.000000000
]
[
  1950-01-01,
  1950-01-02
]

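The reason is that the temporal cast simply divides the timestamp by the number of nanoseconds per day, and C/C++ integer division truncates towards 0. For a timestamp before the epoch the quotient is negative (here about -7304.5 days), so truncation yields -7304 days (1950-01-02) instead of -7305 days (1950-01-01). Note that Python's // operator rounds towards -Infinity, so the same division in Python would give the right answer in this case.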
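
To make the arithmetic concrete, here is a minimal pure-Python sketch of the two rounding modes. It does not use Arrow at all; the helper names (to_nanos, trunc_div, days_to_date) are made up for illustration, and trunc_div only emulates C-style division rather than reproducing the actual kernel code.

```python
import datetime

NANOS_PER_DAY = 86_400 * 10**9
EPOCH = datetime.datetime(1970, 1, 1)


def to_nanos(dt):
    """Nanoseconds since the Unix epoch for a naive datetime (illustrative helper)."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 10**9 + delta.microseconds * 1000


def trunc_div(a, b):
    """Emulate C-style integer division, which rounds toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def days_to_date(days):
    """Convert a day count relative to the epoch into a date."""
    return EPOCH.date() + datetime.timedelta(days=days)


noon = to_nanos(datetime.datetime(1950, 1, 1, 12, 0, 0))

truncated_days = trunc_div(noon, NANOS_PER_DAY)  # rounds toward zero, as C does
floored_days = noon // NANOS_PER_DAY             # Python floor division

print(truncated_days, days_to_date(truncated_days))  # -7304 1950-01-02
print(floored_days, days_to_date(floored_days))      # -7305 1950-01-01
```

For timestamps on or after the epoch the two modes agree, which is why the discrepancy only shows up for dates before 1970 that carry a non-zero time component.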

Depending on the intended semantics of a temporal cast, either it should be fixed to extract the date component, or the rounding behavior should be noted and a separate kernel should be implemented for extracting the date component.

Reporter: David Li / @lidavidm
Assignee: David Li / @lidavidm
Watchers: Rok Mihevc / @rok

Related issues:

Note: This issue was originally created as ARROW-10213. Please see the migration documentation for further details.
