[MRG+1] Bayesian Gaussian Mixture (Integration of GSoC2015 -- second step)#6651
ogrisel merged 33 commits into scikit-learn:master
# XXX @xuewei4d I think you forgot n_component in your code ?
temp1 = (.5 * np.sum(temp1) +
         self.n_components * self._log_gaussian_norm_prior)
@xuewei4d I think you forgot to multiply the log_gaussian_norm by n_components. Could you confirm it for the 4 functions, please?

I checked it. I didn't forget it; see line 791 in my PR.
|
@tguillemot could you please rebase/squash on top of the current master to take the recent changes from #6666 into account in this PR? |
Force-pushed from 352b51a to 427650b.
Force-pushed from 427650b to 8710b60.
|
@tguillemot This PR needs to be updated to take the precision-based parametrization into account: |
|
@ogrisel I pushed the last commit I've done, but I'm working on another PR for the moment. |
Force-pushed from 2d93f16 to 8b205df.
|
I've solved the problem with VBGMM. I still have some cleanup to do, but I think it will be ready to merge next week. |
|
@tguillemot Sure. Can I have your email address? |
|
Is it expected that when increasing |
|
@xuewei4d Thanks for the formula. @ngoix The current version of this PR is not working well and has a lot of problems, so I suspect that is the cause. |
Force-pushed from d486d12 to 65e3400.
|
@ngoix The code of the BayesianGaussianMixture is corrected now. |
|
@tguillemot Can I have the updated formula pdf? |
|
It can be due to my data, but now the number of components found is always maximal (even with |
|
whoops, it does not always find the maximal number of components sorry. |
|
@xuewei4d I haven't corrected the LaTeX formula yet. I will put everything on scikit once it is done. @ngoix This method is an EM algorithm and converges to a local minimum. If the initialization is not good, you will never reach the global minimum. |
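As a rough illustration of the initialization sensitivity mentioned above, here is a minimal sketch that fits the model once and then with several restarts, and compares the resulting lower bounds. The toy data, the number of components, and the seeds are assumptions made for this example only; they do not come from this PR.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Toy data (an assumption for this sketch): two well-separated 2D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=-5.0, scale=1.0, size=(200, 2)),
    rng.normal(loc=5.0, scale=1.0, size=(200, 2)),
])

# A single initialization: the result depends entirely on that one start.
single = BayesianGaussianMixture(n_components=5, n_init=1,
                                 random_state=0).fit(X)

# Several initializations: the run with the best lower bound is kept.
multi = BayesianGaussianMixture(n_components=5, n_init=5,
                                random_state=0).fit(X)

print("lower bound, n_init=1:", single.lower_bound_)
print("lower bound, n_init=5:", multi.lower_bound_)
print("fitted weights, n_init=5:", np.round(multi.weights_, 3))
```

Restarts do not guarantee the global optimum; they only make a poor local solution less likely. Looking at `weights_` gives a quick idea of how many components actually carry mass.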
|
@agramfort @amueller @ogrisel BayesianGaussianMixture is mergeable. |
sklearn/mixture/base.py
Outdated
resp = np.zeros((n_samples, self.n_components))
label = cluster.KMeans(n_clusters=self.n_components, n_init=1,
                       random_state=random_state).fit(X).labels_
                       random_state=0).fit(X).labels_
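For readers following the snippet above, here is a standalone sketch of how hard responsibilities can be built from KMeans labels. The data, `n_components` value, and variable names are assumptions for illustration; only the `np.zeros` allocation and the `KMeans(...).fit(X).labels_` call mirror the diff.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data and component count (assumptions for this sketch).
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 2))
n_components = 3
n_samples = X.shape[0]

# One KMeans run gives a hard cluster label for every sample.
labels = KMeans(n_clusters=n_components, n_init=1,
                random_state=rng).fit(X).labels_

# One-hot encode the labels: each row has a single 1 in the column
# of the cluster the sample was assigned to.
resp = np.zeros((n_samples, n_components))
resp[np.arange(n_samples), labels] = 1
```

Passing a `RandomState` instance rather than a fixed integer lets the caller's seed control the initialization, so different restarts can start from different clusterings.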
sklearn/mixture/base.py
Outdated
-------
log_prob_norm : array, shape (n_samples,)
log p(X)
Logarithm of the probability of X.
Logarithm of the probability of each sample in X.
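To make the "each sample" wording concrete, here is a small sketch using the public `score_samples` method, which returns one log-likelihood value per row of `X`. This is an assumption about the closest public analogue; the docstring under review belongs to an internal method, and the data below is made up for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Toy data (an assumption for this sketch).
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 2))

gm = GaussianMixture(n_components=2, random_state=0).fit(X)

# One log-likelihood per sample, hence shape (n_samples,).
log_prob = gm.score_samples(X)
print(log_prob.shape)

# score() is the average of the per-sample values.
print(gm.score(X), log_prob.mean())
```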
Force-pushed from 4d5de40 to 9c7ca50.
|
Let's merge now. Thanks for all your efforts @tguillemot! |
|
And also thank you again @xuewei4d for the initial code refactoring and maths derivations. |
|
Thanks @tguillemot ! |
|
Hurrah!! On 31 August 2016 at 08:28, Wei Xue notifications@github.com wrote:
|
|
Hurrah !!!!!!!! Thanks everyone !!! |
|
yay! 🍻 |
|
awesome :) Thanks everyone! |
:class:`BayesianGaussianMixture`. The new class solves the computational
problems of the old class and computes the Variational Bayesian Gaussian
mixture faster than before.
Ref :ref:`b` for more information.
@tguillemot what's b supposed to reference? It's a dead link.
…step) (scikit-learn#6651)
* Add the new BayesianGaussianMixture class. Add the test file for the BayesianGaussianMixture.
* Add the use of the cholesky decomposition of the precision matrix.
* Fix some bugs.
* Modification of GaussianMixture class. The purpose here is to prepare the integration of BayesianGaussianMixture.
* Fix comments.
* Modification of the Docstring.
* Add license and author.
* Fix pb typo of eq 10.64 and 10.62.
* Correct VBGMM bugs.
* Fix full version.
* Fix the precision normalisation pb.
* Fix all cov_type algo for BayesianGaussianMixture.
* Optimisation of spherical and diag computation.
* Code simplification.
* Check the Gaussian Mixture tests are ok.
* Add test.
* Add new tests for BayesianGaussianMixture and GaussianMixture.
* Add the bayesian_gaussian_example and the doc.
* Fix comments.
* Fix review comments and add license and author.
* Fix test compare covar type.
* Fix reviews.
* Fix tests.
* Fix review comments.
* Correct reviews.
* Fix travis pb.
* Fix circleci pb.
* Fix review comments.
* Fix typo.
* Fix comments. Add reg_covar and what's new.
* Fix comments.
* Fix comments.
* [ci skip] Correct legend.
* Fix Rouseeuw1984 broken link
* Change label vbgmm to bgmm
  Previously modified with PR #6651
* Change tag name
  Old refers to new tag added with PR #7388
* Remove prefix underscore to match tag
* Realign to fit 80 chars
* Link to metrics.rst. pairwise metrics yet to be documented
* Remove tag as LSHForest is deprecated
* Remove all references to randomized_l1 and sphx_glr_auto_examples_linear_model_plot_sparse_recovery.py. It is deprecated.
* Fix few Sphinx warnings
* Realign to 80 chars
* Changes based on PR review
* Remove unused ref in calibration
* Fix link ref in covariance.rst
* Fix linking issues
* Differentiate Rouseeuw1999 tag within file.
* Change all duplicate Rouseeuw1999 tags
* Remove numbers from tag Rousseeuw

This PR is the second part of the GSoC integration. It builds directly on the work in #6407.
Here I propose to integrate the Bayesian Gaussian Mixture :
Since this PR is based on #6407, it is better to review only the files that relate to the
BayesianGaussianMixture class.