[MRG+1] add a transform_max_iter to SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning#12682

Merged
thomasjpfan merged 29 commits into scikit-learn:master from adrinjalali:bug/sparsecoder/max_iter
Jul 8, 2019

Conversation

@adrinjalali
Member

@adrinjalali adrinjalali commented Nov 27, 2018

Fixes #12650

SparseCoder now passes max_iter to the underlying LassoLars when algorithm='lasso_lars'.
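As merged, this surfaces on SparseCoder as the transform_max_iter parameter (earlier revisions of this PR called it max_iter). A minimal sketch with made-up random data, assuming scikit-learn 0.22 or later:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# hypothetical random dictionary and data; shapes are purely illustrative
rng = np.random.RandomState(0)
dictionary = rng.randn(10, 15)                       # (n_components, n_features)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)
X = rng.randn(5, 15)                                 # (n_samples, n_features)

# transform_max_iter is forwarded to LassoLars(max_iter=...) when
# transform_algorithm='lasso_lars'
coder = SparseCoder(dictionary, transform_algorithm='lasso_lars',
                    transform_max_iter=2000)
code = coder.transform(X)
print(code.shape)  # (5, 10): one sparse code per sample
```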

@adrinjalali adrinjalali changed the title SparseCoder passes max_iter to LassoLarse [MRG+1] SparseCoder passes max_iter to LassoLarse Nov 28, 2018
Member

@jnothman jnothman left a comment

Can/should we test this?

@adrinjalali adrinjalali changed the title [MRG+1] SparseCoder passes max_iter to LassoLarse [WIP] SparseCoder passes max_iter to LassoLarse Nov 28, 2018
@adrinjalali
Member Author

The test is basically a copy/paste of the example which was giving the warning.

@adrinjalali adrinjalali changed the title [WIP] SparseCoder passes max_iter to LassoLarse [MRG] SparseCoder passes max_iter to LassoLarse Nov 28, 2018
@adrinjalali
Member Author

I'd also need to check the docs for this, but I'd appreciate some feedback before I do that.

@adrinjalali adrinjalali changed the title [MRG] SparseCoder passes max_iter to LassoLarse [MRG] ENH SparseCoder passes max_iter to LassoLarse Nov 29, 2018
@adrinjalali adrinjalali changed the title [MRG] ENH SparseCoder passes max_iter to LassoLarse ENH SparseCoder passes max_iter to LassoLarse Nov 29, 2018
@jnothman jnothman changed the title ENH SparseCoder passes max_iter to LassoLarse ENH SparseCoder passes max_iter to LassoLars Dec 3, 2018
:mod:`sklearn.decomposition`
............................

- |Fix| :class:`decomposition.SparseCoder` now passes `max_iter` to the
Member

This is now out of date. Please describe here what this change does, or at least update the PR description; with the addition of method_max_iter and transform_max_iter it now seems unclear.

@adrinjalali
Member Author

This PR now touches quite a few public functions/classes:

  • sparse_encode() now passes max_iter to the underlying LassoLars when algorithm='lasso_lars' (BUG)
  • dict_learning() and dict_learning_online() now accept method_max_iter and pass it to sparse_encode (ENH)
  • SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning now take a transform_max_iter parameter and pass it to either dict_learning() or sparse_encode() (ENH)

I've updated the whats_new entries to reflect these changes, but I have two questions:

  1. Those are three separate entries in the changelog; I'm not sure how to merge them, or whether I should. Should I also include the models in the Changed Models section?
  2. I've added a test which checks for the warning I was encountering in the example (a ConvergenceWarning), but there's no new explicit test for the five other public functions/classes changed in this PR. I'm not sure how best to test them, though!
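For illustration, the layers listed above might be exercised like this (shapes and hyperparameters are arbitrary; sparse_encode takes max_iter directly, while dict_learning forwards method_max_iter to it):

```python
import numpy as np
from sklearn.decomposition import sparse_encode, dict_learning

rng = np.random.RandomState(0)
X = rng.randn(8, 12)           # (n_samples, n_features)
dictionary = rng.randn(4, 12)  # (n_components, n_features)

# sparse_encode: max_iter goes to the underlying Lars/Lasso solver
code = sparse_encode(X, dictionary, algorithm='lasso_lars', max_iter=500)

# dict_learning: method_max_iter is forwarded to sparse_encode as max_iter
code2, dico, errors = dict_learning(X, n_components=4, alpha=1,
                                    method='lars', method_max_iter=500,
                                    random_state=0)
print(code.shape, code2.shape, dico.shape)  # (8, 4) (8, 4) (4, 12)
```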

@adrinjalali adrinjalali changed the title ENH SparseCoder passes max_iter to LassoLars ENH add a transform_max_iter to SparseCoder, DictionaryLearning, and MiniBatchDictionaryLearning Dec 3, 2018
@jnothman
Member

jnothman commented Dec 6, 2018 via email

@jnothman
Member

I've not yet understood this fully, but I am not very familiar with the algorithms. Why is method_max_iter different from transform_max_iter?

@adrinjalali
Member Author

I mostly followed whatever naming convention was already there in each method/class. Some call the algorithm a method and prefix its parameters with method_, and some call it transform_algorithm and use the transform_ prefix for the algorithm's parameters.

Member

@jnothman jnothman left a comment

As regards the other tests, perhaps checking that changing a parameter's value can affect the result is an okay level of testing.
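A sketch of that kind of test (data and iteration budgets are made up; lasso_cd is used here because a starved coordinate-descent solver visibly stops early):

```python
import warnings
import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.decomposition import SparseCoder

rng = np.random.RandomState(0)
dictionary = rng.randn(8, 20)
X = rng.randn(4, 20)

def encode(n_iter):
    coder = SparseCoder(dictionary, transform_algorithm='lasso_cd',
                        transform_max_iter=n_iter)
    with warnings.catch_warnings():
        # a tiny budget may trigger ConvergenceWarning; silence it here
        warnings.simplefilter('ignore', ConvergenceWarning)
        return coder.transform(X)

# a starved solver can stop at a different solution than a converged one
few, many = encode(1), encode(5000)
print(few.shape, np.allclose(few, many))
```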

def ricker_function(resolution, center, width):
"""Discrete sub-sampled Ricker (Mexican hat) wavelet"""
x = np.linspace(0, resolution - 1, resolution)
x = ((2 / ((np.sqrt(3 * width) * np.pi ** 1 / 4)))
Member

I think you need parentheses around the 1/4; ** takes precedence. Perhaps use 0.25 instead of 1/4. You can remove some parentheses by changing * to /.

Member Author

I didn't write these; they're copy-pasted from the failing example (plot_sparse_coding.py).

But I'll try to simplify both then.
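A precedence-safe version of the wavelet might look like this (it matches the standard Ricker formula; the 0.25 exponent replaces the ambiguous ** 1 / 4):

```python
import numpy as np

def ricker_function(resolution, center, width):
    """Discrete sub-sampled Ricker (Mexican hat) wavelet."""
    x = np.linspace(0, resolution - 1, resolution)
    # note: np.pi ** 0.25, not np.pi ** 1 / 4 -- ** binds tighter than /
    return (2 / (np.sqrt(3 * width) * np.pi ** 0.25)
            * (1 - (x - center) ** 2 / width ** 2)
            * np.exp(-(x - center) ** 2 / (2 * width ** 2)))

wavelet = ricker_function(resolution=64, center=32, width=10)
print(wavelet.shape)  # (64,), peaked at the center sample
```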

x = np.linspace(0, resolution - 1, resolution)
x = ((2 / ((np.sqrt(3 * width) * np.pi ** 1 / 4)))
* (1 - ((x - center) ** 2 / width ** 2))
* np.exp((-(x - center) ** 2) / (2 * width ** 2)))
Member

You have extraneous brackets here making things hard to read. ** takes precedence over *

@adrinjalali
Member Author

Tests are green again :)

@adrinjalali
Member Author

@jnothman happy with this now maybe?

@adrinjalali
Member Author

@jnothman should we put this in for 0.21?

Member

@thomasjpfan thomasjpfan left a comment

The simplest test for MiniBatchDictionaryLearning and DictionaryLearning would be to mock out sparse_encode, but I think it is not needed.
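One way to sketch that mocking idea (the private module path sklearn.decomposition._dict_learning and the keyword forwarding of max_iter are assumptions about the current internal layout, so this is illustrative only):

```python
from unittest import mock
import numpy as np
from sklearn.decomposition import DictionaryLearning
from sklearn.decomposition import _dict_learning  # private module; path may vary by version

rng = np.random.RandomState(0)
X = rng.randn(6, 8)
dl = DictionaryLearning(n_components=3, transform_algorithm='lasso_lars',
                        transform_max_iter=123, max_iter=5,
                        random_state=0).fit(X)

# wrap the real sparse_encode so we can inspect how transform() calls it
with mock.patch.object(_dict_learning, 'sparse_encode',
                       wraps=_dict_learning.sparse_encode) as se:
    dl.transform(X)

# if max_iter is passed positionally instead, the kwarg lookup prints None
print(se.called, se.call_args.kwargs.get('max_iter'))
```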

assert dico.components_.shape == (n_components, n_features)


def test_max_iter():
Member

Testing for convergence this way is great!

n_jobs=n_jobs, check_input=False,
positive=positive_code)
positive=positive_code, max_iter=method_max_iter,
verbose=verbose)
Member

We may have avoided this because sparse_encode accepts an int for verbose, while verbose in this context is a bool (although this will still work).

Member Author

We could pass verbose > 0, but I guess this way it's fine.

code = sparse_encode(X, dictionary, algorithm=method, alpha=alpha,
init=code, n_jobs=n_jobs, positive=positive_code)
init=code, n_jobs=n_jobs, positive=positive_code,
max_iter=method_max_iter, verbose=verbose)
Member

We may have avoided this because sparse_encode accepts an int for verbose, while verbose in this context is a bool (although this will still work).

@adrinjalali
Member Author

@thomasjpfan I'm not sure if in your reviews you want me to change anything :P

.. versionadded:: 0.20

method_max_iter : int, optional (default=1000)
It is passed to the underlying ``method`` as their ``max_iter``
Member

"Maximum number of iterations to perform it..."

Member Author

You mean to add this at the beginning of the sentence? It would become "Maximum number of iterations to perform it is passed to the underlying ``method`` as their ``max_iter`` parameter." I can't parse that sentence; I'm confused!

Member

Since this is passed to sparse_encode I was thinking of using the docstring there here:

method_max_iter : int, 1000 by default
    Maximum number of iterations to perform if `algorithm='lasso_cd'`.

This way a user would not need to parse the code down to sparse_encode to figure out how max_iter is used.

Member Author

I think this looks better now.

method_max_iter : int, optional (default=1000)
It is passed to the underlying ``method`` as their ``max_iter``
parameter.
Maximum number of iterations to perform in each ``sparse_encode`` step.
Member

To be consistent with dict_learning:

Maximum number of iterations to perform.

Member Author

The reason for the extra part at the end is that dict_learning_online has an n_iter parameter as well, whose description is also "Maximum number of iterations to perform." I needed to distinguish the two somehow.

Member

With two iters, we may need to describe what they actually do. Since dict_learning_online only supports lassos, maybe:

n_iter:
    Number of mini-batch iterations to perform.

method_max_iter:
    Maximum number of iterations to perform when solving the lasso problem.

@adrinjalali
Member Author

Oops, I suppose I also need to change the versions in the docstrings and move the whats_new entry.

@adrinjalali
Member Author

ping @thomasjpfan :)

@thomasjpfan thomasjpfan merged commit 7b8cbc8 into scikit-learn:master Jul 8, 2019
@thomasjpfan
Member

Thank you @adrinjalali !

koenvandevelde pushed a commit to koenvandevelde/scikit-learn that referenced this pull request Jul 12, 2019
@adrinjalali adrinjalali deleted the bug/sparsecoder/max_iter branch November 11, 2025 16:30

Development

Successfully merging this pull request may close these issues.

SparseCoder doesn't expose max_iter for Lasso

5 participants