[MRG+2] add MaxAbsScaler #4828
|
@amueller : I hope this is what you had in mind |
doc/modules/preprocessing.rst
Outdated
|
Apart from nitpicks and sparse matrix testing LGTM |
676d201 to 22074da
doc/modules/preprocessing.rst
Outdated
|
Looks good :) |
|
Since it is not in RobustScaler anymore, do we want to use |
|
Thanks for that 2nd review. I've implemented the changes you've suggested and squashed the commits. |
doc/modules/preprocessing.rst
Outdated
Yet this is not true of the following example. Either qualify the statement or add a trim option to the scaler.
I've changed the text a bit, please have a look to see if you like the new wording better.
|
Remove backticks, add what's new entry, squash commits, and you can haz merge. LGTM! |
|
Thanks @untom, for your contribution and your perseverance! |
|
Thanks for your review and in general for helping out with this! |
|
I found a small problem (?) with MaxAbsScaler and am not sure whether I should file a bug report for it (because 0.17 is not officially released). I have my collection of data scaled with MinMaxScaler. Then I need to transform a new sparse matrix with one row (a sparse vector?). However, if I attempt to transform that 1-row sparse matrix (so I can compare it with the collection), I get an assertion error. The workaround is to use a matrix that has more than 1 row, due to this part of the code (if I am not mistaken). |
|
Could you please file this as a separate issue, and also report: X.shape
|
|
done, thanks (: |
|
Just wanted to say thanks for this feature! I've already tested it out on several datasets and have found it super useful with sparse arrays. Thanks for all the hard work you've put into this, everyone! |
This PR adds the MaxAbsScaler and maxabs_scale to sklearn.preprocessing. This scaler scales its inputs by the maximum absolute value of each feature. This scaler is especially useful for sparse data, but is probably also always a better alternative to MinMaxScaler when the data is already centered.

The scaler itself was previously discussed in #1799 and #2514.
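
For readers skimming the thread, here is a minimal pure-Python sketch of the idea in the description: dividing each feature by its maximum absolute value, compared with min-max scaling. It is not the code from this PR. The helper names `max_abs_scale_columns` and `min_max_scale_columns` are hypothetical, and leaving an all-zero column untouched is an assumption made here to avoid dividing by zero.

```python
def max_abs_scale_columns(rows):
    """Divide every column by its maximum absolute value (hypothetical helper)."""
    n_features = len(rows[0])
    scales = []
    for j in range(n_features):
        max_abs = max(abs(row[j]) for row in rows)
        # Assumption for this sketch: an all-zero column is divided by 1.0.
        scales.append(max_abs if max_abs != 0 else 1.0)
    return [[row[j] / scales[j] for j in range(n_features)] for row in rows]


def min_max_scale_columns(rows):
    """Map every column to [0, 1] via (x - min) / (max - min) (hypothetical helper)."""
    n_features = len(rows[0])
    out = [[0.0] * n_features for _ in rows]
    for j in range(n_features):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        # Assumption for this sketch: a constant column is divided by 1.0.
        span = (hi - lo) if hi != lo else 1.0
        for i, value in enumerate(col):
            out[i][j] = (value - lo) / span
    return out


# A small, mostly-zero matrix: 3 samples, 2 features.
X = [
    [0.0, -4.0],
    [2.0, 0.0],
    [0.0, 2.0],
]

X_maxabs = max_abs_scale_columns(X)
X_minmax = min_max_scale_columns(X)

# Max-abs scaling only divides, so zero entries stay zero.
print(X_maxabs)
# Min-max scaling subtracts the column minimum, so zeros in a column
# with a negative minimum become non-zero.
print(X_minmax)
```

Because max-abs scaling never shifts the data, every zero entry stays zero and a sparse matrix keeps its sparsity pattern. Min-max scaling subtracts the column minimum first, which turns the zero in the second column into a non-zero value.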