[C++] A "replace" or "map" kernel to replace values in array based on mapping #26597

@asfimport

Add a "replace" or "map" kernel that replaces values in an array based on a mapping. This would be similar to the pandas Series.replace (or Series.map) method. A small illustration of what is meant:

In [41]: s = pd.Series(["Yes", "Y", "No", "N"])

In [42]: s
Out[42]: 
0    Yes
1      Y
2     No
3      N
dtype: object

In [43]: s.replace({"Y": "Yes", "N": "No"})
Out[43]: 
0    Yes
1    Yes
2     No
3     No
dtype: object

Note: in pandas, the difference between "replace" and "map" is that replace only replaces a value if it is present in the mapping and leaves all other values unchanged, while map replaces every value in the input array with the corresponding value in the mapping and returns null for values not present in the mapping. This difference in behaviour could be controlled with a keyword.
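
To make the two behaviours concrete, here is a minimal pure-Python sketch of the semantics. The function name `replace_values` and the keyword `keep_unmatched` are hypothetical, chosen only for illustration. They are not an existing Arrow or pandas API, and an actual kernel would operate on Arrow arrays rather than Python lists.

```python
def replace_values(values, mapping, keep_unmatched=True):
    """Illustrative semantics only (hypothetical name and keyword).

    keep_unmatched=True  -> "replace": values missing from the mapping pass through.
    keep_unmatched=False -> "map": values missing from the mapping become None (null).
    """
    if keep_unmatched:
        # "replace" behaviour: fall back to the original value
        return [mapping.get(v, v) for v in values]
    # "map" behaviour: fall back to None
    return [mapping.get(v) for v in values]


values = ["Yes", "Y", "No", "N"]
mapping = {"Y": "Yes", "N": "No"}

replaced = replace_values(values, mapping)                      # ["Yes", "Yes", "No", "No"]
mapped = replace_values(values, mapping, keep_unmatched=False)  # [None, "Yes", None, "No"]
```

With the "map" behaviour, "Yes" and "No" become null because they are not keys of the mapping, which is the main practical difference a keyword would need to expose.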

Note: this is different from ARROW-10306, which is about string replacement within array elements (replacing a substring in each string element of the array), while this issue is about replacing full elements of the array.
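
The distinction can be seen in a small plain-Python sketch (again illustrative only, using Python's `str.replace` and a dict lookup as stand-ins for the two kinds of kernel):

```python
values = ["Yes", "Y", "No", "N"]

# Substring replacement (the ARROW-10306 kind of operation):
# every occurrence of "Y" inside each string is rewritten.
substring_replaced = [v.replace("Y", "Yes") for v in values]
# "Yes" becomes "Yeses" because it contains the substring "Y"

# Whole-element replacement (this issue):
# an element changes only if it equals a key of the mapping.
element_replaced = [{"Y": "Yes"}.get(v, v) for v in values]
# "Yes" is left untouched because it is not equal to "Y"
```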

cc @maartenbreddels

Reporter: Joris Van den Bossche / @jorisvandenbossche

Related issues:

Note: This issue was originally created as ARROW-10641. Please see the migration documentation for further details.
