Feature Description
Allow the llama_progress_callback to return a value that will stop the model being loaded, and free all resources.
Motivation
LLMs can push some computers to their limits, and sometimes you just need an emergency stop button. llama.cpp can already catch a std::exception thrown during model loading and clean up the half-loaded model. Unfortunately, non-C++ languages (such as Rust) can't throw a std::exception, so even if a callback written in one of them does unwind, the unwind won't be caught by llama.cpp's try-catch and the resources used by the model won't be properly cleaned up.
Possible Implementation
Allow the llama_progress_callback to return a value that aborts model loading early. Maybe have it return a bool, where true means continue and false means abort? This could bite existing codebases, though, since a change to the callback's return type is easy to miss.
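To make the idea concrete, here is a rough, self-contained sketch of what a bool-returning progress callback could look like. It does not use llama.cpp's real types or loader: progress_callback_sketch, model_sketch, load_model_sketch, cancel_state and cancel_after_half are all made-up names for illustration, and the "loading" is just a loop. The only point is the control flow: the loader checks the callback's return value after each step and frees the partial model itself when asked to stop, so no exception has to cross the language boundary.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the proposed callback type (not llama.cpp's real typedef):
// return true to continue loading, false to abort.
typedef bool (*progress_callback_sketch)(float progress, void * user_data);

// Hypothetical stand-in for a model; a real loader owns far more state.
struct model_sketch {
    std::vector<float> tensors;
};

// Simulated loader: "loads" n_tensors one at a time and reports progress after each.
// Returns nullptr if the callback asks to stop. The unique_ptr frees the
// half-loaded model on the early return, so the caller never sees it.
model_sketch * load_model_sketch(size_t n_tensors, progress_callback_sketch cb, void * user_data) {
    std::unique_ptr<model_sketch> model(new model_sketch());
    for (size_t i = 0; i < n_tensors; ++i) {
        model->tensors.push_back(0.0f);
        const float progress = float(i + 1) / float(n_tensors);
        if (cb != nullptr && !cb(progress, user_data)) {
            return nullptr; // abort requested: partial model is released here
        }
    }
    return model.release(); // fully loaded: ownership passes to the caller
}

// Example caller state: a cancel flag plus a call counter for inspection.
struct cancel_state {
    bool cancel_requested;
    int  calls;
};

// Example callback: requests an abort once loading reaches 50%.
bool cancel_after_half(float progress, void * user_data) {
    cancel_state * state = static_cast<cancel_state *>(user_data);
    state->calls++;
    if (progress >= 0.5f) {
        state->cancel_requested = true;
    }
    return !state->cancel_requested;
}

// Small self-check of the two paths: aborted load and completed load.
void run_sketch_checks() {
    cancel_state state = { false, 0 };
    assert(load_model_sketch(10, cancel_after_half, &state) == nullptr);

    model_sketch * full = load_model_sketch(4, nullptr, nullptr);
    assert(full != nullptr);
    delete full;
}
```

With a shape like this, a Rust (or any other FFI) caller would only need to return false from its callback; all cleanup stays on the C++ side. Passing a null callback in the sketch simply loads to completion.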