[WIP] Add verbose option for Pipeline and Feature Union [rebased 8568] #9668
petrushev wants to merge 12 commits into scikit-learn:master
Conversation
- Each line printed by Pipeline and FeatureUnion, when their verbosity mode is on, will be 70 characters long.
sklearn/pipeline.py (outdated)

    print('%s%s%s' % (name, '.' * (70 - len(name + elapsed)), elapsed))

    class _BasePipeline(six.with_metaclass(ABCMeta, _BaseComposition)):

It seems this has been badly merged. This class has been removed in favor of _BaseComposition.
sklearn/pipeline.py (outdated)

    # License: BSD

    from collections import defaultdict
    from abc import ABCMeta, abstractmethod

Not needed any more with _BaseComposition.
sklearn/pipeline.py (outdated)

    class FeatureUnion(_BaseComposition, TransformerMixin):
    class FeatureUnion(_BasePipeline, TransformerMixin):

It should also be subclassed from _BaseComposition.
sklearn/pipeline.py (outdated)

    def __init__(self, transformer_list, n_jobs=1, transformer_weights=None,
                 verbose=False):
        self.transformer_list = tosequence(transformer_list)

It should be a list, as previously stated.
sklearn/pipeline.py (outdated)

    def _update_transformer_list(self, transformers):
        transformers = iter(transformers)
        self.transformer_list[:] = [
        self.transformer_list = tosequence([
Fix error message bug

Force-pushed 916663e to 3938adc
@petrushev Could you solve the PEP8 issue:
glemaitre left a comment

@jnothman this PR has been resurrected and it seems to have been extensively reviewed previously.
Having a quick look, I was thinking that a decorator could be used in multiple places instead of repeating the same line around the processing. WDYT?
doc/whats_new.rst (outdated)

    .. _Neeraj Gangwar: http://neerajgangwar.in
    .. _Arthur Mensch: https://amensch.fr
doc/whats_new.rst (outdated)

    - Added optional parameter ``verbose`` in functions `pipeline.make_pipeline`
      and `pipeline.make_union` to extend the same functionality as the
      corresponding classes. :issue:`9668` by
      :user:`Baze Petrushev <petrushev>`.

If you could also add @karandesai-96 in the authorship, since he did a good part of the job as well :)
Thanks @glemaitre and @petrushev

The current What's New file on this branch has these two entries under Miscellaneous:

    - Added optional parameter ``verbose`` in :class:`pipeline.Pipeline` and
      :class:`pipeline.FeatureUnion` for showing progress and timing of each
      step. :issue:`8568` by :user:`Karan Desai <karandesai-96>`.
    - Added optional parameter ``verbose`` in functions `pipeline.make_pipeline`
      and `pipeline.make_union` to extend the same functionality as the
      corresponding classes. :issue:`9668` by
      :user:`Baze Petrushev <petrushev>`.

I think they should be unified, outlining the feature, issue, PR and contributors.
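A unified entry might read as follows. This is only a sketch combining both issues and both contributors; the final wording is up to the reviewers.

```rst
- Added optional parameter ``verbose`` in :class:`pipeline.Pipeline`,
  :class:`pipeline.FeatureUnion` and the functions `pipeline.make_pipeline`
  and `pipeline.make_union` for showing progress and timing of each step.
  :issue:`8568` and :issue:`9668` by :user:`Karan Desai <karandesai-96>`
  and :user:`Baze Petrushev <petrushev>`.
```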
sklearn/pipeline.py (outdated)

    from .externals.joblib import Parallel, delayed
    from .externals import six
    from .utils.metaestimators import if_delegate_has_method
    from .utils.metaestimators import if_delegate_has_method, _BaseComposition

Revert to the version on master with line 20 and line 24; this part is equivalent to master but brings in two diffs.
sklearn/pipeline.py (outdated)

    def _fit_one_transformer(transformer, X, y):
        return transformer.fit(X, y)
    def _fit_one_transformer(trans, X, y, verbose=False, idx=None,

Replace trans by transformer in all of this function.
sklearn/pipeline.py (outdated)

                             total_steps=None, name=None):
        # idx, total_steps and name are not required when verbosity is disabled
        step_start_time = time.time()
        trans = trans.fit(X, y)
sklearn/pipeline.py (outdated)

        return res, transformer
        return res * weight, transformer
        return res, trans
        return res * weight, trans
sklearn/pipeline.py (outdated)

        This estimator
        """
        self._validate_transformers()
        all_transformers = [(name, trans, weight) for name, trans, weight in

This diff seems weird to me. It is not in master. I would check that this is not from the previous version, before _BaseComposition.

It seems this was done, along with the change in _fit_one_transformer, for the benefit of the verbosity, so it is not in master.
Here's the commit in the original PR.
sklearn/pipeline.py (outdated)

        sum of n_components (output dimension) over transformers.
        """
        self._validate_transformers()
        all_transformers = [(name, trans, weight) for name, trans, weight in

The same as the previous statement.
sklearn/tests/test_pipeline.py (outdated)

    class NoInvTransf(NoTrans):
        def transform(self, X):
        def transform(self, X, y=None):

Remove y=None. This is from the previous transformer version.
sklearn/tests/test_pipeline.py (outdated)

    class Transf(NoInvTransf):
        def transform(self, X):
        def transform(self, X, y=None):
Hi @petrushev, thanks for taking my PR forward.
Force-pushed 28740b6 to efb4aac
Thanks for taking this on, @petrushev. As stated in the previous PR, I had hoped we could share the logging logic with
@jnothman Sure, that seems like the way to go. I'll see what I can do about it.
Force-pushed 39a0736 to 2e5b6cc
@jnothman I wonder if this is something like what you had in mind? The
It is, at a glance, thanks! I'll give it a more thorough review soon.
Could you please add a unit test for the helper, making sure it doesn't explode, for instance, if the text is too long. The standard library textwrap may prove useful for such things.
Of course, I just wanted to make sure I'm on the right track.

Are you still working on this?

@amueller Unfortunately I can't really promise I'll commit myself to it any time soon :/

Thanks for your work so far.

@petrushev no worries, and thanks for an honest assessment :)
Reference Issue
Rebases #8568
Fixes #5298, #5321
What does this implement/fix? Explain your changes.
Rebases #8568, which has the following description:

Also, it extends the `make_pipeline` helper to support the `verbose` kwarg. The snippets in the original PR produce the expected output.

Also checked with the example `feature_selection/plot_feature_selection_pipeline.py`, which produces a nice output: