Skip to content

Conversation

@ngxson
Copy link
Collaborator

@ngxson ngxson commented Dec 30, 2025

Ref: #18469 (comment)

I'm extracting this change to a dedicated PR for visibility

Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we delegate it to the llama_model to free the adapters, then we should remove the llama_adapter_lora_free() from the public API.

For example, with the current change I believe we will be double freeing the adapters - once on model destruction and once more here for all llama_adapter_lora_ptr instances:

struct llama_adapter_lora_deleter {
void operator()(llama_adapter_lora * adapter) { llama_adapter_lora_free(adapter); }
};

@ngxson
Copy link
Collaborator Author

ngxson commented Dec 31, 2025

Yes that makes sense. llama_adapter_lora_free exists because we wanted to allow hot-loading / hot-removing of lora adapter. But I don't think it's actually used that way in reality.

By removing llama_adapter_lora_free, we will now assume that lora adapters are an "extension" to a model, and must be loaded before the model can be used (i.e. context creation from a model)

@ngxson ngxson requested a review from ggerganov December 31, 2025 11:07
@ggerganov
Copy link
Member

@ngxson This is probably good to merge?

@ngxson
Copy link
Collaborator Author

ngxson commented Jan 15, 2026

Ah yes, thanks. Merging it now

@ngxson ngxson merged commit a7e6ddb into master Jan 15, 2026
79 of 80 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants