Skip to content

Add option to change covariance matrix type for GMM class#50

Merged
francois-rozet merged 15 commits intoprobabilists:masterfrom
dominik-strutz:master
Mar 3, 2025
Merged

Add option to change covariance matrix type for GMM class#50
francois-rozet merged 15 commits intoprobabilists:masterfrom
dominik-strutz:master

Conversation

@dominik-strutz
Copy link
Copy Markdown
Contributor

This PR adds changes to the zuko.flows.mixture.GMM class, which allow the user to change the type of the covariance matrix used for each of the Gaussian components of the mixture.

The options added are

  • covariance_type, which allows to change the type of the covariance matrices
  • tied a switch which allows to control if covariance matrices are tied between components
  • cov_rank the rank of the low-rank covariance matrix when covariance_type is 'lowrank'

Since the construction of the shapes got quite long I moved this part in its own function.

Below is an illustration of the effect these different choices have for a mixture of 3 two-dimensional Gaussians.

image

@francois-rozet
Copy link
Copy Markdown
Member

Hello @dominik-strutz, I quickly went over the code, and it looks nice! I think we should add some tests however, maybe in a new tests/test_flows_gmm.py file.

Are you also planning to improve the initialization as well?

@dominik-strutz
Copy link
Copy Markdown
Contributor Author

Hi @francois-rozet, Yes, I am happy to write some tests.

I'm also happy to try to improve the initialization. Following sklearn again, the initialization methods 'random' and 'random_from_data' (the latter one working quite well in my limited experience) should be easy enough to implement. Also, implementing 'k-mean' or 'k-means++' should be manageable. I don't know how to handle the conditional case yet, but having only the unconditional one would be a good first start.

What is your opinion on how to structure the initialization? I think it would be beneficial to keep the __init__ method quite simple and have a separate method (e.g., GMM.initialization) that takes user-provided data samples and sets the phi variable or the last layer of the network to reasonable values given an initialization method.

I will give the initialisation a try and let you know how it goes.

P.S: I have no idea why the pre-commit hook fails. I used ruff --fix locally to reformat, and from what I can see, it fulfils the requirements.

@francois-rozet
Copy link
Copy Markdown
Member

francois-rozet commented Apr 4, 2024

I don't know how to handle the conditional case yet, but having only the unconditional one would be a good first start.

I think a good way to handle the conditional case would be to make the weight $W$ of the last layer small (e.g. standard initialization * 1e-2) and set the bias $b$ to the unconditional initialization.

What is your opinion on how to structure the initialization?

I agree that a separate method could be appropriate, similar to the reset_parameters of nn.Linear.

P.S: I have no idea why the pre-commit hook fails. I used ruff --fix locally to reformat, and from what I can see, it fulfils the requirements.

I pulled your branch and ruff check . at the root of the repo returns

zuko/flows/mixture.py:7:1: I001 [*] Import block is un-sorted or un-formatted
Found 1 error.
[*] 1 fixable with the `--fix` option.

Maybe you were not at the root? My version of ruff is 0.1.14 by the way.

@francois-rozet
Copy link
Copy Markdown
Member

@dominik-strutz Do you still plan on contributing this PR?

@dominik-strutz
Copy link
Copy Markdown
Contributor Author

Yes, I still like to contribute but haven't found much free time to do it recently. I have implemented most of the initialization methods for the unconditional case, but it still needs to be polished up and tested. The extension for the conditional case shouldn't take too long afterwards. If you or someone else wants to continue this sooner, I'm happy to push an intermediary commit of everything I have so far.

@francois-rozet
Copy link
Copy Markdown
Member

No problem, take your time! I am currently updating a few things and wanted to know if I should wait for this PR for the next minor release.

@dominik-strutz
Copy link
Copy Markdown
Contributor Author

It seems like I closed this automatically when I rebased my fork. Is there a way to reopen it with the same master branch of my fork or do I need to open a new request?

In short, I have added an initialisation method that works with and without context variables and added some tests.

@dominik-strutz
Copy link
Copy Markdown
Contributor Author

dominik-strutz commented Jan 14, 2025

The pull request should be ready to merge now.

It adds the following new functionalities:

  • covariance_type: the covariance type can now be chosen between full, diagonal, and spherical. Low-rank was in a previous version of this request but I have not yet made it work fully (see commit history for implementation). I might implement this in the future if I find the time.

  • tied covariance matrices for all of the above types. Tied covariance matrices share their covariance across components.

  • an initialize method that works with and without context variables. The user can choose between the following strategies: random, kmeans, kmeans++

  • a test file that mostly makes sure that all the shapes are right, but does not test the training process itself

Let me know if this is up to your standard or if there are any changes necessary :)

PS.: I uploaded two jupyter notebooks which I used to test the functionality to a separate branch in my fork of this repository: https://github.com/dominik-strutz/zuko/tree/test_gmm.

@dominik-strutz dominik-strutz marked this pull request as ready for review January 14, 2025 09:42
@francois-rozet
Copy link
Copy Markdown
Member

francois-rozet commented Mar 3, 2025

Hi @dominik-strutz, sorry for the delay. I finally had the time to go over the PR. Overall, it is very good! The initialization in the conditional case was very smart 🧠

I modified a few things:

  1. optimized the clustering methods
  2. regrouped the initialization process into initialize
  3. renamed a few functions
  4. moved the tests to tests/test_mixtures.py

I tested the code, and it works really well. I will merge the PR soon. Thanks again!

@francois-rozet francois-rozet merged commit 3d47337 into probabilists:master Mar 3, 2025
6 checks passed
@francois-rozet francois-rozet linked an issue Mar 3, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add covariance_type option to GMM class

2 participants