
SMIT: A Simple Modality Integration Tool

6 SMIT issues

The current workflow leads to a certain amount of catastrophic forgetting: the base model, [abacaj/phi-2-super](https://huggingface.co/abacaj/phi-2-super), reaches an average of $62.13$ on the [open_llm_leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), while the resulting model, [Thytu/phi-2-audio-super](https://huggingface.co/Thytu/phi-2-audio-super), falls...

To allow users to better find and filter their runs, each run should be named after the models used. One option could be `{decoder_name}_{speech_encoder_name}` (e.g. `abacaj/phi-2-super_facebook/hubert-large-ls960-ft`). However...

good first issue
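One wrinkle with the proposed scheme is that Hugging Face model IDs contain `/`, which many experiment trackers and filesystems treat as a path separator. A minimal sketch of how the name could be built (the function name `build_run_name` and the `-` replacement are illustrative assumptions, not SMIT's actual API):

```python
def build_run_name(decoder_name: str, speech_encoder_name: str) -> str:
    """Build a run name from the two model identifiers.

    Hypothetical helper: replaces "/" in Hugging Face model IDs so the
    resulting run name is safe to use as a directory or tracker key.
    """
    def sanitize(model_id: str) -> str:
        return model_id.replace("/", "-")

    return f"{sanitize(decoder_name)}_{sanitize(speech_encoder_name)}"


# e.g. build_run_name("abacaj/phi-2-super", "facebook/hubert-large-ls960-ft")
# yields "abacaj-phi-2-super_facebook-hubert-large-ls960-ft"
```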

Currently the only metric available during evaluation is the model loss; however, this does not provide enough granularity about the model's performance on each modality it is training on....

priority: high
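One way to get that granularity is to aggregate the evaluation loss per modality instead of globally. A sketch under the assumption that the evaluation loop can tag each batch loss with its modality (the `per_modality_loss` helper and the `(modality, loss)` pair format are illustrative, not SMIT's actual interfaces):

```python
from collections import defaultdict


def per_modality_loss(batch_losses):
    """Average evaluation losses separately for each modality.

    `batch_losses` is an iterable of (modality, loss) pairs collected
    during evaluation, e.g. [("audio", 2.0), ("text", 1.0), ...].
    Returns a dict mapping modality name to its mean loss.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for modality, loss in batch_losses:
        totals[modality] += loss
        counts[modality] += 1
    return {m: totals[m] / counts[m] for m in totals}
```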

In the current situation there is no way to detect a bug introduced by a PR, as showcased by #8. SMIT should integrate a test suite.
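Such a suite could start small, e.g. with pytest-style unit tests over individual components. A toy sketch (the `project` function is a stand-in for whichever SMIT component the suite would actually cover, not real SMIT code):

```python
def project(features, weight):
    """Toy linear projection standing in for a real SMIT component."""
    return [sum(f * w for f, w in zip(features, row)) for row in weight]


def test_project_output_dimension():
    # With pytest, any `test_*` function is collected and run automatically.
    out = project([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    assert len(out) == 3
```

Running `pytest` in CI on every PR would then catch regressions like #8 before merge.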

Currently SMIT will only use the `cpu` when no GPU is available, or `cuda:0` when at least one GPU is found. This considerably limits the size and performance of the models SMIT can train when...
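A first step would be to enumerate every visible GPU rather than stopping at `cuda:0`. A minimal sketch of the selection logic (pure Python for illustration; in practice `gpu_count` would come from `torch.cuda.device_count()`, and the `select_devices` name is an assumption):

```python
def select_devices(gpu_count: int) -> list:
    """Return the list of device strings to train on.

    Hypothetical helper: with zero GPUs fall back to the CPU,
    otherwise use every GPU instead of only `cuda:0`.
    """
    if gpu_count == 0:
        return ["cpu"]
    return [f"cuda:{i}" for i in range(gpu_count)]
```

The returned list could then feed a multi-GPU strategy such as `DataParallel` or a sharded setup.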

In the current state, SMIT saves the whole model (encoder + projector + decoder) during pretraining. As only the `linear_projector` is trained during pretraining, this unnecessarily consumes disk...