.map is acting on the partition index, rather than the values #960

@jreback

In [27]: s = Series([-100,-50,0,50,100],index=list('ABCDE'))

In [28]: sm = Series(list('ABCAB'),list('ABCDE'))

In [29]: sm.map(s)
Out[29]: 
A   -100
B    -50
C      0
D   -100
E    -50
dtype: int64

In [30]: ddf.from_pandas(sm,npartitions=5).compute()                                         
Out[30]: 
A    A
B    B
C    C
D    A
E    B
dtype: object

In [31]: ddf.from_pandas(sm,npartitions=5).map(s)   
Out[31]: dd.Series<elemwise-b533b55a33e686222d4aafe30c2f6c9e, divisions=('A', 'B', 'C', 'D', 'E')>

In [32]: ddf.from_pandas(sm,npartitions=5).map(s).compute()
Out[32]: 
A   -100
B    -50
C      0
D    NaN
E    NaN
dtype: float64

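So Out[32] should be equivalent to Out[29]: .map should look up each value of sm in s, but rows D and E come back as NaN instead of -100 and -50.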
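For reference, here is a pandas-only sketch of what I would expect, emulating the five single-row partitions by hand. The last step is only a guess at what might be happening (s being sliced to each partition's index before the lookup), not a confirmed diagnosis; the names `partitions`, `per_partition` and `misaligned` are made up for the illustration.

```python
import pandas as pd

s = pd.Series([-100, -50, 0, 50, 100], index=list("ABCDE"))
sm = pd.Series(list("ABCAB"), index=list("ABCDE"))

# Plain pandas: each value of sm is looked up in the index of s.
expected = sm.map(s)

# Emulate npartitions=5 by hand: one row per partition, each mapped
# against the FULL lookup series s. This is the behaviour I expect.
partitions = [sm.iloc[i:i + 1] for i in range(len(sm))]
per_partition = pd.concat([part.map(s) for part in partitions])

# ASSUMPTION (hypothetical reconstruction of the bug): s is restricted
# to the partition's own index labels before the lookup, so a value
# whose label lives in another partition cannot be found.
misaligned = pd.concat([part.map(s.loc[part.index]) for part in partitions])

print(expected)
print(per_partition)
print(misaligned)
```

If the guess is right, `misaligned` reproduces the NaN pattern from Out[32], while `per_partition` matches Out[29].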
