Skip to content

New Feature: Added NCA as first algorithm in metric_learning module#5276

Closed
terrytangyuan wants to merge 5 commits intoscikit-learn:masterfrom
terrytangyuan:metriclearn
Closed

New Feature: Added NCA as first algorithm in metric_learning module#5276
terrytangyuan wants to merge 5 commits intoscikit-learn:masterfrom
terrytangyuan:metriclearn

Conversation

@terrytangyuan
Copy link
Copy Markdown

See #3213. I've integrated NCA with scikit-learn in this PR.

@terrytangyuan
Copy link
Copy Markdown
Author

Anyone knows what I have missed? I got this from Travis:

FAIL: sklearn.tests.test_common.test_root_import_all_completeness
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/scikit-learn/scikit-learn/testvenv/local/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/tests/test_common.py", line 144, in test_root_import_all_completeness
    assert_in(modname, sklearn.__all__)
AssertionError: 'metric_learning' not found in ['calibration', 'cluster', 'covariance', 'cross_decomposition', 'cross_validation', 'datasets', 'decomposition', 'dummy', 'ensemble', 'externals', 'feature_extraction', 'feature_selection', 'gaussian_process', 'grid_search', 'isotonic', 'kernel_approximation', 'kernel_ridge', 'lda', 'learning_curve', 'linear_model', 'manifold', 'metrics', 'mixture', 'multiclass', 'naive_bayes', 'neighbors', 'neural_network', 'pipeline', 'preprocessing', 'qda', 'random_projection', 'semi_supervised', 'svm', 'tree', 'discriminant_analysis', 'clone']

It looks like I need to add this metric_learning into somewhere? Any help would be appreciated.

@jnothman
Copy link
Copy Markdown
Member

It looks like I need to add this metric_learning into somewhere?

Modify __all__ in sklearn/__init__.py.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please inherit from sklearn.base.BaseEstimator

@jnothman
Copy link
Copy Markdown
Member

Huh. Wait, this is being worked on in #4789, and was the subject of GSoC funding... If you can see benefits of your implementation over that one, please comment there. Otherwise I think you should expect that implementation to take precedence.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RCA?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo! I was reading something else at the same time...thanks!
On Sep 16, 2015 4:19 AM, "B@rmaley.exe" notifications@github.com wrote:

In sklearn/metric_learning/init.py
#5276 (comment)
:

@@ -0,0 +1,8 @@
+"""
+The :mod:sklearn.metric_learning module implements metric learning models.
+
+The algorithms that have been implemented are:
+Relevant Components Analysis (RCA)

RCA?


Reply to this email directly or view it on GitHub
https://github.com/scikit-learn/scikit-learn/pull/5276/files#r39603872.

@terrytangyuan
Copy link
Copy Markdown
Author

@Barmaley-exe @jnothman Still failing after the changes. Any ideas?

@artsobolev
Copy link
Copy Markdown
Contributor

@terrytangyuan you can take a look at Travis' output to find failed tests. You can also see the failing test's code to understand what's going on.

@terrytangyuan
Copy link
Copy Markdown
Author

@Barmaley-exe I got this

  File "/home/travis/build/scikit-learn/scikit-learn/sklearn/metric_learning/NCA.py", line 219, in fit
    res = opt.minimize(fun=self.objective,
AttributeError: 'module' object has no attribute 'minimize'

which is not expected since it's imported from here import scipy.optimize as opt and it passed on my local pc.

BTW, do you suggest that I keep working on this or help you finish yours (I'd love to help)? Yours seems to have a problem of not improving the classification accuracy? The main goal is to get metric learning on this package since it's extremely useful.

@artsobolev
Copy link
Copy Markdown
Contributor

@terrytangyuan I'm not sure why this exception arises. One possibility is that sklearn supports SciPy >= 0.9, and minimize was added in 0.11

My personal opinion is that my PR is much more developed, so we'd be better off finishing that. Main problem is that it's not well-tested, and it'd benefit from more documentation and examples.

@terrytangyuan
Copy link
Copy Markdown
Author

Okay. Let's finish that instead of making a new one then. Can you update your existing code? How should I make changes on your PR? I assume there's a better way than fork your forked repo.

@terrytangyuan terrytangyuan deleted the metriclearn branch September 16, 2015 21:57
@mblondel
Copy link
Copy Markdown
Member

In my opinion, the point of metric learning is that it can leverage supervised information in the form of pairwise similarity or dissimilarity labels. If you are going to generate those labels from just (X, y), one is probably better off just using a classifier. So before we add NCA to scikit-learn, I really want to see empirical evidence that it is worth it both in terms of accuracy and training time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants