- add a flag to differentiate between the supported APIs
- add parsing of llama.cpp server responses (docs: https://github.com/ggerganov/llama.cpp/tree/master/examples/server)
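
One possible shape for these two items, as a minimal sketch rather than the planned implementation. The `--api` flag name, the `build_parser` and `parse_response` helpers, and the choice of `openai` as the second backend are all assumptions made for illustration; the note does not say which APIs are involved. The llama.cpp branch assumes the server's `/completion` endpoint returns the generated text in a `content` field, which should be checked against the linked docs.

```python
import argparse
import json

# Hypothetical backend names: the note only says "different APIs".
API_CHOICES = ("llama.cpp", "openai")


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with a flag that selects the response format."""
    parser = argparse.ArgumentParser(description="Query a completion backend")
    parser.add_argument(
        "--api",
        choices=API_CHOICES,
        default="llama.cpp",
        help="which API the response comes from (default: %(default)s)",
    )
    return parser


def parse_response(api: str, body: str) -> str:
    """Extract the generated text from a raw JSON response body."""
    data = json.loads(body)
    if api == "llama.cpp":
        # Assumption: non-streaming /completion responses carry the text
        # under "content"; verify against the server docs.
        return data["content"]
    if api == "openai":
        # Assumption: an OpenAI-style completions payload.
        return data["choices"][0]["text"]
    raise ValueError(f"unsupported API: {api}")
```

With this layout, `build_parser().parse_args(["--api", "llama.cpp"]).api` selects the branch and `parse_response` stays the only place that knows about each payload shape, so adding another backend means one more entry in `API_CHOICES` and one more branch.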