hparams : add n_embd_inp() to support extended embed #16928
Conversation
Tested this. It works correctly with a 1D cvector of size 5120, and for basic MTMD use cases. Thanks!
Hmm, please ignore what I said earlier. Indeed, I think there is currently a misunderstanding here. I would suggest calling it `n_embd_inp`.
```diff
-const int n_embd = hparams.n_embd;
-ggml_tensor * b = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, n_embd, w->ne[1], 1, 1);
+const int n_embd_inp = hparams.n_embd_inp();
+ggml_tensor * b = ggml_new_tensor_4d(ctx, GGML_TYPE_F32, n_embd_inp, w->ne[1], 1, 1);
```
I'm unsure if this is correct as well...
I think the choice between `n_embd_inp` and `n_embd` should not matter much here; at least for now, models that use this op will have `n_embd_inp == n_embd`.
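For reference, a minimal sketch of how such an accessor could be structured (the `n_deepstack_layers` field is an assumption for illustration, not necessarily the exact hparam in this PR); it shows why the two values coincide whenever there are no extra DeepStack sections:

```cpp
#include <cstdint>

// Sketch only: assumes an n_deepstack_layers hyperparameter tracks the
// number of extra stacked embedding sections (0 for models without DeepStack).
struct llama_hparams {
    uint32_t n_embd             = 0;
    uint32_t n_deepstack_layers = 0;

    // effective width of the input embedding: the base n_embd slice
    // plus one n_embd-wide section per DeepStack layer; degenerates
    // to plain n_embd when n_deepstack_layers == 0
    uint32_t n_embd_inp() const {
        return n_embd + n_embd * n_deepstack_layers;
    }
};
```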
I tested the latest commit in this series, and it works successfully for text and image processing with a cvector applied.
I will test again later today, but feel free to do so too before that.
Tested again on the latest commit. Seems to work great: MTMD and text both work with a cvector applied, and the cvector still behaves as expected.
* add n_embd_full to support extended embed
* don't change output
* rename to n_embd_inp
* restore n_embd where applicable
Required for proper handling of Qwen3-VL DeepStack embeds.

May change more than currently necessary for future use, e.g. in `llama-context` (or maybe even not enough), please review carefully!

Fixes #16908
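To make the intended split concrete for reviewers, here is a hedged, self-contained illustration (names and sizes are assumptions based on this thread, not code from the PR): input-side buffers would be sized with the extended width, while a per-layer tensor such as a cvector keeps the base `n_embd`, consistent with the 5120-wide cvector tests above.

```cpp
#include "ggml.h"

int main() {
    ggml_init_params params = {
        /*.mem_size   =*/ 64u*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    ggml_context * ctx = ggml_init(params);

    const int64_t n_embd             = 5120; // base model width (matches the 5120 cvector above)
    const int64_t n_deepstack_layers = 3;    // hypothetical DeepStack section count
    const int64_t n_embd_inp         = n_embd * (1 + n_deepstack_layers); // extended input width
    const int64_t n_tokens           = 8;    // example batch size

    // input embedding: one row per token, sized with the extended width
    ggml_tensor * inp  = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, n_embd_inp, n_tokens);

    // control vector applied per layer: base width only
    ggml_tensor * cvec = ggml_new_tensor_1d(ctx, GGML_TYPE_F32, n_embd);

    (void) inp;
    (void) cvec;

    ggml_free(ctx);
    return 0;
}
```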