TODO
Non-Method Features to add:
- [x] Rotation Matrix for PCA and ICA
- [ ] mean and std vectors from PCA, ICA and PCA L1
- [ ] Eigenvalues
- [ ] kernel Matrices for kPCA
- [ ] update documentation for .mute in embed
- [x] add TravisCI script
- [ ] the possibility to use distance or kernel matrices as input.
- [ ] the possibility to use (semi) supervised methods
- [ ] a simple way to register new methods
- [ ] ...
Methods to add:
- [x] L1-PCA
- [ ] elastic map (a C++ implementation exists here, need to ask for the source): http://www.ihes.fr/~zinovyev/vida/elmap/index.htm
- [ ] unsupervised regression
- [x] autoencoder
- [x] NNMF
- [ ] SNE variants
- [ ] SOM
- [x] umap
- [ ] largeVis
- [ ] KECA
- [ ] OKECA
(Semi) supervised methods:
- [ ] Tukey data depth
- [ ] cca
- [ ] opls
- [ ] pls
- [ ] kcca
- [ ] kopls
- [ ] kpls
Please propose more
Those are the main ones that I use (plus MDS) and I'm not familiar with some of the others.
You might want to add (or link to existing) functions to visualize these results.
Any plans on adding non-negative matrix factorizations? The equivalent of their rotations would be nice.
This may be a bit out of scope, but adding some sort of aggregate measures of predictor contribution would be good. For PCA, I've weighted the absolute values of the loadings by their variance contribution (but there are likely to be more theoretically justified methods out there).
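The weighting described above can be sketched in a few lines. This is only an illustration of that one heuristic, not a dimRed function: the name `predictor_contribution` and the example numbers are made up, and the loadings and per-component variances are assumed to come from an already fitted PCA.

```python
def predictor_contribution(loadings, explained_variance):
    """Aggregate contribution of each variable across all components.

    loadings[i][j]        : loading of variable i on component j
    explained_variance[j] : variance (eigenvalue) of component j
    """
    total = sum(explained_variance)
    # Share of the total variance carried by each component.
    weights = [v / total for v in explained_variance]
    # Weighted sum of absolute loadings, one score per variable.
    return [
        sum(abs(value) * w for value, w in zip(row, weights))
        for row in loadings
    ]


# Hypothetical example: two variables, two components.
loadings = [[0.8, 0.6],
            [0.6, -0.8]]
explained_variance = [3.0, 1.0]
scores = predictor_contribution(loadings, explained_variance)
```

A larger score means the variable loads more heavily on the components that carry most of the variance. As noted above, this is a pragmatic measure rather than a theoretically derived one.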
I missed some techniques, e.g. Autoencoders, Non-Negative Matrix Factorization, Local Tangent Space Alignment, SNE and derivatives, ... I will implement them when I find some time.
Spontaneous idea: for the quality measures, keep only the axis of interest in the original matrix and all axes in the reduced dimensions, and compare the outcome between axes of interest. No idea if this works or if it has a sound theoretical basis.
For linear techniques, just look at the values in the rotation matrix: a larger absolute value in rotation[Var1, PCA1] means a higher contribution of variable Var1 to axis PCA1.
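As a small illustration of reading a rotation matrix this way, the sketch below ranks variables by the magnitude of their loading on one axis. The matrix values and the helper name `rank_variables` are invented for the example (the numbers are not a real, orthonormal rotation), and the ranking uses absolute values because the sign of a loading only indicates direction.

```python
# Made-up loadings, indexed as rotation[variable][axis].
rotation = {
    "Var1": {"PCA1": 0.10, "PCA2": 0.90},
    "Var2": {"PCA1": -0.95, "PCA2": 0.20},
    "Var3": {"PCA1": 0.30, "PCA2": -0.40},
}


def rank_variables(rotation, axis):
    """Return variable names ordered by decreasing |loading| on the given axis."""
    return sorted(rotation, key=lambda var: abs(rotation[var][axis]), reverse=True)


ranking_pca1 = rank_variables(rotation, "PCA1")
ranking_pca2 = rank_variables(rotation, "PCA2")
```

Here `Var2` ranks first on `PCA1` even though its loading is negative, which is why the comparison should be on magnitude rather than on the raw value.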
First commit for the autoencoder: https://github.com/gdkrmr/dimRed/commit/ed647560db2b3dca7e9a89a875ac5a80d3daa5ed
There is a really cool new one, UMAP: https://github.com/ropenscilabs/umapr (paper: https://arxiv.org/abs/1802.03426). It is currently only a wrapper, via reticulate, around a Python implementation. There is also another package, https://github.com/tkonopka/umap, and neither is on CRAN yet.
There is also largeVis, which is already on CRAN. EDIT: largeVis got archived.
Add methods KECA and OKECA.
Some other possible methods to consider:
* Sparse PCA
  * [sparsepca](https://cran.r-project.org/web/packages/sparsepca/index.html)
  * [nsprcomp](https://cran.r-project.org/web/packages/nsprcomp/nsprcomp.pdf)
* Robust PCA
  * [FastHCS](https://cran.r-project.org/web/packages/FastHCS/FastHCS.pdf)
* Robust Sparse PCA
  * [rospca](https://cran.r-project.org/web/packages/rospca/index.html)
  * [rpca](https://cran.r-project.org/web/packages/rpca/index.html)
* Other
  * [whitening](https://cran.r-project.org/web/packages/whitening/index.html)
Sounds great! To save copying code, there should probably be a single pca class with different backends and the PCAL1 class should be part of it.
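One way the "single class, several backends" idea could look is sketched below. It is a language-neutral illustration written in plain Python, not dimRed code: the class name, the `BACKENDS` table, and both update rules are assumptions made for the example. It only finds the first component, using power iteration for the classical (L2) case and a sign-flipping fixed-point update as a stand-in for an L1-norm variant; whether that matches the existing PCA L1 implementation is not something this thread establishes.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _center(rows):
    """Subtract the column means; this step is shared by every backend."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows], means


def _normalize(v):
    norm = math.sqrt(_dot(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def _l2_step(centered, w):
    """One power-iteration step on X^T X (classical PCA, first axis)."""
    proj = [_dot(row, w) for row in centered]
    return [sum(p * row[j] for p, row in zip(proj, centered)) for j in range(len(w))]


def _l1_step(centered, w):
    """One sign-flipping step: sum the rows after flipping those with negative projection."""
    signs = [1.0 if _dot(row, w) >= 0 else -1.0 for row in centered]
    return [sum(s * row[j] for s, row in zip(signs, centered)) for j in range(len(w))]


# Hypothetical backend table; a new variant only needs a new entry here.
BACKENDS = {"l2": _l2_step, "l1": _l1_step}


class PCA:
    def __init__(self, backend="l2", n_iter=100):
        if backend not in BACKENDS:
            raise ValueError(f"unknown backend: {backend!r}")
        self.step = BACKENDS[backend]
        self.n_iter = n_iter

    def fit(self, rows):
        centered, self.means = _center(rows)
        # Naive start vector; it fails if the data are orthogonal to it.
        w = _normalize([1.0] * len(rows[0]))
        for _ in range(self.n_iter):
            w = _normalize(self.step(centered, w))
        self.component = w
        return self


data = [[1.0, 1.0], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]]
l2_axis = PCA(backend="l2").fit(data).component
l1_axis = PCA(backend="l1").fit(data).component
```

Centering, normalization, and the iteration loop live in one place, so the backends differ only in their update rule. The same table could also serve as a simple registration point for new methods.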
Incidentally, there is another UMAP implementation that is pretty well-documented and appears to do a good job mirroring the original Python API: https://github.com/jlmelville/uwot
This one sounds really promising. Python dependencies always cause trouble, so swapping them for a native R package is always welcome! The package still has to be released on CRAN.