Diff between PR #9216 and #8652 by ruxandraburtica · Pull Request #1 · arjunjauhari/scikit-learn

ruxandraburtica · 2017-06-25T13:03:07Z

Reference Issue

What does this implement/fix? Explain your changes.

This implements UnaryEncoder, an encoding for ordinal features that is more informative than one-hot. The new UnaryEncoder class has the same interface as the OneHotEncoder class.

Any other comments?

Logic: For k values 0, ..., k - 1 of the ordinal feature x, this creates k - 1 binary features such that the jth is active if x > j (for j = 0, ..., k - 2).
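The rule above can be sketched for a single column with plain NumPy. This is only an illustration of the encoding logic, not the code in this PR; the helper name `unary_encode_column` and its parameter `k` are hypothetical.

```python
import numpy as np


def unary_encode_column(x, k):
    """Hypothetical helper: encode integers in 0..k-1 as k-1 binary features.

    Output column j is 1.0 where x > j, for j = 0, ..., k - 2.
    """
    x = np.asarray(x)
    thresholds = np.arange(k - 1)  # j = 0, ..., k - 2
    # Broadcasting compares every value against every threshold at once.
    return (x[:, None] > thresholds).astype(np.float64)


encoded = unary_encode_column([0, 1, 2], k=3)
print(encoded)
```

Larger values switch on a superset of the features switched on by smaller values, which is how the encoding preserves order.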

Working Example

>>> from sklearn.preprocessing import UnaryEncoder
>>> enc = UnaryEncoder()
>>> enc.fit([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])  
UnaryEncoder(dtype=<type 'numpy.float64'>, handle_unknown='error',
        n_values='auto', ordinal_features='all', sparse=True)
>>> 
>>> enc.n_values_
array([2, 3, 4])
>>> 
>>> enc.feature_indices_
array([0, 1, 3, 6])
>>> 
>>> enc.active_features_
array([0, 1, 2, 3, 4, 5])
>>>
>>> enc.transform([[0, 1, 1]]).toarray()
array([[ 0.,  1.,  0.,  1.,  0.,  0.]])
>>> UnaryEncoder(3).fit_transform([[0], [1], [2]]).toarray()
array([[ 0.,  0.],
       [ 1.,  0.],
       [ 1.,  1.]])
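To make the numbers in the example easier to follow, here is a rough dense NumPy sketch of how `n_values_`, `feature_indices_`, and the transformed row relate. It assumes that `n_values='auto'` means "maximum value seen per column plus one", and the helper names `unary_layout` and `unary_transform` are hypothetical; the actual class returns a sparse matrix and may be implemented differently.

```python
import numpy as np


def unary_layout(X):
    """Hypothetical helper: per-column value counts and output column offsets."""
    X = np.asarray(X)
    n_values = X.max(axis=0) + 1  # assumption: 'auto' means max + 1
    # Each input column with k values contributes k - 1 output columns.
    feature_indices = np.concatenate(([0], np.cumsum(n_values - 1)))
    return n_values, feature_indices


def unary_transform(X, n_values):
    """Hypothetical helper: dense unary encoding of every column, side by side."""
    X = np.asarray(X)
    blocks = [
        (X[:, [i]] > np.arange(k - 1)).astype(np.float64)
        for i, k in enumerate(n_values)
    ]
    return np.hstack(blocks)


X_fit = [[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]]
n_values, feature_indices = unary_layout(X_fit)
row = unary_transform([[0, 1, 1]], n_values)
print(n_values, feature_indices, row)
```

Under these assumptions the three input columns occupy output columns 0, 1-2, and 3-5, which matches the `feature_indices_` shown in the example.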

Merged changes from scikit-learn#9216

b47f86e

ruxandraburtica merged commit 19d33fb into ordinal-encoder Jun 25, 2017

ruxandraburtica merged 1 commit into ordinal-encoder from ordinal-encoder-updates
