[MRG] DOC: shift projects missing in related_projects from the wiki page #8297
dalmia wants to merge 4 commits into scikit-learn:master
doc/related_projects.rst
| wrapper around scikit-learn that makes it easy to run machine learning
| experiments with multiple learners and large feature sets.
|
| - `sklearn-deap <https://github.com/rsteca/sklearn-deap>`_ Use evolutionary
I don't think this belongs in this subsection, which is about projects providing API wrappers. PyMC also appears to have been placed in this section by mistake.
The other possible choices seem to be Auto-ML and Model Export for production. Do you have a preference?
| - `mlxtend <https://github.com/rasbt/mlxtend>`_ Includes model visualization
| utilities.
|
| - `Fast svmlight / libsvm file loader <https://github.com/mblondel/svmlight-loader>`_
I'm not certain whether this is current, i.e. still much faster than what's in sklearn.datasets. @mblondel?
Yep, it is still significantly faster.
It should stay in the list, then.
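For readers unfamiliar with the format under discussion, here is a minimal pure-Python sketch of parsing svmlight / libsvm lines (`label index:value ...`, with an optional `#` comment). It is neither the svmlight-loader project's code nor scikit-learn's `sklearn.datasets.load_svmlight_file`; the helper name `parse_svmlight_line` and the sample text are hypothetical, and `qid` tokens are simply skipped.

```python
def parse_svmlight_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Returns None for blank or comment-only lines.
    """
    body = line.split("#", 1)[0].strip()  # drop a trailing '# comment'
    if not body:
        return None
    tokens = body.split()
    label = float(tokens[0])
    features = {}
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        if index == "qid":  # query ids are ignored in this sketch
            continue
        features[int(index)] = float(value)
    return label, features


# Hypothetical two-row sample in svmlight format.
sample = """\
1 1:0.5 3:2.0 # first row
-1 2:1.5
"""

rows = [r for r in map(parse_svmlight_line, sample.splitlines()) if r is not None]
```

A per-line Python parser like this is simple but slow on large files, which is why the speed of a dedicated loader is relevant to whether the project should be listed.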
doc/related_projects.rst
| Caruana et al's Ensemble Selection algorithm in Python, based on scikit-learn
|
| - `random-output-trees <https://github.com/arjoly/random-output-trees>`_
| Multi-output random forest on randomised output space
I think this needs to be clearer about what it's useful for
doc/related_projects.rst
|
| - `libOPF <https://github.com/LibOPF/LibOPF>`_ Optimal path forest classifier
|
| - `pyensemble <https://github.com/dclambert/pyensemble>`_ An implementation of
I suspect this is a better fit for the Auto-ML section, but it would be worth checking.
I felt this was a toss-up between the two, but Auto-ML is probably the more natural place for it. I'll make the change.
| K-means and mixture of von Mises Fisher clustering routines for data on the
| unit hypersphere.
|
| - `pyIPCA <https://github.com/pickle27/pyIPCA>`_ Incremental Principal
How, if at all, does this differ from our IncrementalPCA?
It seems to be a modification of our IncrementalPCA, for example:
class CCIPCA(BaseEstimator, TransformerMixin):
    """Candid covariance-free incremental principal component analysis (CCIPCA)

    Linear dimensionality reduction using an online incremental PCA algorithm.
    CCIPCA computes the principal components incrementally without
    estimating the covariance matrix. This algorithm was designed for high
    dimensional data and converges quickly.

    This implementation only works for dense arrays. However it should scale
    well to large data.
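To make "without estimating the covariance matrix" concrete, here is a rough sketch of the covariance-free running update for the first principal direction only, with no amnesic weighting. This is an illustration written for this thread, not code from pyIPCA or scikit-learn; the function name `ccipca_first_component` and the toy data are assumptions, and the samples are assumed to be (approximately) zero-mean.

```python
import math


def ccipca_first_component(samples):
    """Estimate the dominant principal direction of zero-mean samples.

    Running update (no covariance matrix is ever formed):
        v_n = ((n - 1) / n) * v_{n-1}
              + (1 / n) * (u_n . v_{n-1} / ||v_{n-1}||) * u_n
    """
    v = None
    for n, u in enumerate(samples, start=1):
        if v is None:
            v = list(u)  # initialise the estimate with the first sample
            continue
        norm = math.sqrt(sum(x * x for x in v))
        # Projection of the new sample onto the current unit direction.
        proj = sum(a * b for a, b in zip(u, v)) / norm
        v = [((n - 1) / n) * vi + (proj / n) * ui for vi, ui in zip(v, u)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# Toy 2-D data: large variation along (1, 1), small variation along (1, -1).
data = []
for k in range(1, 201):
    t = math.sin(k)
    s = 0.1 * math.cos(3 * k)
    data.append((t + s, t - s))

direction = ccipca_first_component(data)
```

Each step touches only the current sample and the current estimate, so memory use is linear in the number of features. scikit-learn's IncrementalPCA instead maintains an incremental SVD over mini-batches, which is the practical difference the question above is asking about.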
doc/related_projects.rst
| - `Deep Learning <http://deeplearning.net/software_links/>`_ A curated list of deep learning
| software libraries.
|
| - `glm-sklearn <https://github.com/jcrudy/glm-sklearn>`_ scikit-learn
This fits in regression and classification above
doc/related_projects.rst
| - Generating data with `non-parametric Gaussian mixture models <https://gist.github.com/2011426>`_
| Useful if you need "random" data that should have non-trivial structure.
|
| - `scikit-protopy <https://github.com/dvro/scikit-protopy>`_ scikit-learn
This best lives with decomposition and clustering.
doc/related_projects.rst
| ---------------------
|
| The `wiki <https://github.com/scikit-learn/scikit-learn/wiki/Third-party-projects-and-code-snippets>`_ has more!
| **Gists**
I think these gists can't really be considered projects, and a Wiki page might best be retained for them.
Yes, I was skeptical about including this here too. Should we then rename the original wiki page to just "Code Snippets"?
But since we are going to include this heading anyway, it might be better to keep the gists on this page and save the trouble of maintaining a separate wiki page.
I have removed the gists, since they don't really belong here; we should keep the original wiki page for them. If this seems fine, I'll go ahead and make the change there.
Codecov Report
@@ Coverage Diff @@
## master #8297 +/- ##
==========================================
- Coverage 96.19% 94.75% -1.45%
==========================================
Files 348 342 -6
Lines 64645 60809 -3836
==========================================
- Hits 62187 57617 -4570
- Misses 2458 3192 +734
| - `random-output-trees <https://github.com/arjoly/random-output-trees>`_
| Randomized output tree for multilabel / multi-output regression tasks
|
| - `fastFM <https://github.com/ibayer/fastFM>`_ Fast factorization machine
"Factorization" makes me pretty sure this belongs with decomposition, not classification/regression
| - `fastFM <https://github.com/ibayer/fastFM>`_ Fast factorization machine
| implementation compatible with scikit-learn
|
| - `glm-sklearn <https://github.com/jcrudy/glm-sklearn>`_ scikit-learn
I'd put this higher up in the list.
| software libraries.
|
| - `sklearn-deap <https://github.com/rsteca/sklearn-deap>`_ Use evolutionary
| algorithms instead of gridsearch in scikit-learn.
gridsearch -> grid search.
This does not belong here. Perhaps it should go in a "model selection and evaluation" section under "other estimators and tasks".
Reference Issue
Fixes #8266
What does this implement/fix? Explain your changes.
Links the related projects from the wiki page to related_projects.rst.
Any other comments
I am linking everything that is currently missing. However, the issue thread mentioned that we need to include only the important projects. Please provide feedback on which ones should be included.