SelectFromModel.transform is very slow: > 0.25 seconds to run on a single row #7478

### Description

While profiling unexpectedly slow production code, I found that `SelectFromModel.transform()` took over a quarter of a second to run on a single row, while every other step in the Pipeline took on the order of a microsecond per row.

We're using this in production with private data, so I can't share the exact example. The dataset we feed into `SelectFromModel.fit` has many hundreds of features per row, while the pruned dataset after feature selection has only dozens.

Running `SelectFromModel.transform()` over tens of thousands of rows takes only a fraction of a second longer than running it on a single row, so most of the cost looks like a fixed per-call overhead rather than per-row work. We can probably pre-calculate much of what `SelectFromModel.transform` currently computes on the fly on every call.
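To illustrate the pre-calculation idea, here is a minimal sketch on synthetic data (not the private dataset). It assumes the per-call cost comes from recomputing the selected-feature mask on every `transform` call, which I have not verified against the source. The estimator, the dataset shape, and the `fast_transform` helper are all hypothetical choices made for illustration; the mask is fetched once with `get_support()` and then reused with plain NumPy indexing.

```python
import timeit

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in for the private data: many features, few informative ones.
X, y = make_classification(
    n_samples=500, n_features=300, n_informative=20, random_state=0
)

# Assumed estimator; the original report does not say which model was used.
selector = SelectFromModel(RandomForestClassifier(n_estimators=50, random_state=0))
selector.fit(X, y)

# Compute the boolean support mask once, right after fitting.
mask = selector.get_support()


def fast_transform(rows):
    """Hypothetical helper: keep the selected columns using the cached mask."""
    return np.asarray(rows)[:, mask]


single_row = X[:1]

# The cached-mask path should return exactly what transform() returns.
same = np.array_equal(fast_transform(single_row), selector.transform(single_row))
print("identical output:", same)

# Rough timing comparison on a single row; absolute numbers vary by machine.
n_calls = 20
t_transform = timeit.timeit(lambda: selector.transform(single_row), number=n_calls)
t_cached = timeit.timeit(lambda: fast_transform(single_row), number=n_calls)
print("transform():   %.6f s per call" % (t_transform / n_calls))
print("cached mask:   %.6f s per call" % (t_cached / n_calls))
```

The cached mask is only valid as long as the fitted estimator and the threshold do not change, so it would need to be recomputed after any refit.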

At some point, I'll dive into the code a little more and submit a PR to speed this up. This slowness would have prevented us from using a scikit-learn pipeline in production. Everything else runs very quickly (that's one of the reasons I like scikit-learn so much: a very active community of developers optimizing performance all the time), but when getting a prediction on a single row, this one bottleneck took orders of magnitude longer than all the other calculations combined.

#### Versions

```
>>> import platform; print(platform.platform())
Darwin-15.6.0-x86_64-i386-64bit
>>> import sys; print("Python", sys.version)
('Python', '2.7.12 (default, Jun 29 2016, 14:05:02) \n[GCC 4.2.1 Compatible Apple LLVM 7.3.0 (clang-703.0.31)]')
>>> import numpy; print("NumPy", numpy.__version__)
('NumPy', '1.11.1')
>>> import scipy; print("SciPy", scipy.__version__)
('SciPy', '0.18.0')
>>> import sklearn; print("Scikit-Learn", sklearn.__version__)
('Scikit-Learn', '0.17.1')
```

Thanks, as always, for running such an awesome project! Hopefully this will help speed up other people's production code as well, and encourage even wider adoption of scikit-learn. It seems a shame to have such a highly optimized library bottlenecked by such a basic piece of functionality.

