I am launching the server with this command: "python3 -m sglang.launch_server --model-path liuhaotian/llava-v1.5-7b --tokenizer-path llava-hf/llava-1.5-7b-hf --chat-template vicuna_v1.1 --port 30000", but I can't find any option for 4-bit quantization.
Can you tell me how to run LLaVA in 4-bit on SGLang?