We have found several problems in the implementation of the method to automatically tune the number of components of the PCA algorithms:
- The algorithm never tests full rank: this is most probably because the loops over the rank always end at rank-1 (for i in range(rank)).
- If two eigenvalues are equal, there is a log(0) issue.
- Zero eigenvalues are not treated explicitly.
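
To illustrate the second point, here is a minimal sketch. It assumes the criterion contains a sum of log(l_i - l_j) over pairs of eigenvalues; log_gap_sum is a hypothetical helper written for this report, not the library's code.

```python
import math


def log_gap_sum(eigenvalues):
    """Sum log(l_i - l_j) over all pairs i < j (hypothetical helper)."""
    total = 0.0
    for i in range(len(eigenvalues)):
        for j in range(i + 1, len(eigenvalues)):
            # A zero gap makes math.log raise ValueError.
            total += math.log(eigenvalues[i] - eigenvalues[j])
    return total


# Distinct eigenvalues: every gap is positive, so the sum is finite.
distinct = log_gap_sum([3.0, 2.0, 1.0])

# Two exactly identical eigenvalues in the numerical noise: the gap is 0.0.
try:
    log_gap_sum([3.0, 1e-17, 1e-17])
    failed = False
except ValueError:
    failed = True
```

With the standard math module the zero gap raises an error; with NumPy the same expression would instead evaluate to -inf and emit a warning, which then propagates silently into the score.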
Possible solutions:
- For (1): Check the loop ranges.
- For (3): Detect in advance the eigenvalues that are smaller than the numerical noise and exclude them from the rank scan.
I have no idea for (2). We hit the problem here with very small eigenvalues (within numerical noise) that were exactly identical. I never managed to create a synthetic dataset that reproduces the problem, since even with symmetric datasets there is always a small difference (on the order of numerical precision) between theoretically identical eigenvalues.
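
The ideas for (1) and (3) could be sketched roughly as follows. This is only an illustration under assumptions: the function names and the noise threshold (largest eigenvalue times the number of eigenvalues times machine epsilon) are my own choices, not something taken from the current implementation.

```python
import sys


def significant_eigenvalues(eigenvalues, rel_tol=None):
    """Keep the eigenvalues above a noise floor relative to the largest one.

    The default tolerance is an assumption, not the library's rule.
    """
    if not eigenvalues:
        return []
    if rel_tol is None:
        rel_tol = len(eigenvalues) * sys.float_info.epsilon
    floor = max(eigenvalues) * rel_tol
    return [value for value in eigenvalues if value > floor]


def candidate_ranks(eigenvalues):
    """Ranks to scan: 1 up to and including the number of kept eigenvalues."""
    kept = significant_eigenvalues(eigenvalues)
    # range(1, n + 1) includes n, so the last rank is not skipped.
    return list(range(1, len(kept) + 1))


kept = significant_eigenvalues([3.0, 1.0, 1e-17, 1e-17])
ranks = candidate_ranks([3.0, 1.0, 1e-17, 1e-17])
```

Here the two eigenvalues in the numerical noise are dropped before the scan, so they can no longer produce a zero gap, and the scan covers every remaining rank. Whether the criterion itself is well defined at full rank is a separate question that this sketch does not address.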