Use conversation template for api proxy, fix eventsource format #2383
zeyugao wants to merge 6 commits into ggml-org:master from
Conversation
Fix eventsource format
Thank you! The PHPStorm plugin I was using, codeGPT, didn't work with the main branch api_like_OAI.py, but with yours it works smoothly! With the new llama-2-based wizard-13b I finally have a usable local-only assistant that integrates seamlessly into my existing workflows. :D
The PR has been merged upstream. Due to a GitHub limitation (https://github.com/orgs/community/discussions/5634), it seems that I cannot allow editing by maintainers.
Thank you! The result generated by llama-cpp-python is missing some key words.
I am observing this error with a 70B Llama 2 model when attempting to run the guidance tutorial notebook and dropping in
This PR adds a `--chat-prompt-model` parameter that enables the use of a model's conversation template registered in fastchat/conversation.py. As model prompt templates, such as Llama 2's, become more intricate, handling them exclusively with options like `--chat-prompt` and `--user-name` becomes less manageable. A community-maintained conversation template is therefore a more user-friendly solution. Currently, customized system messages are pending the merge of lm-sys/FastChat#2069, but the current fschat version should operate without exceptions.
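To illustrate why a dedicated template helps, here is a rough, self-contained sketch of what a Llama 2 conversation template produces. This is not fschat's actual API; the function name and turn representation are hypothetical, and only the `[INST]`/`<<SYS>>` framing reflects the real format:

```python
# Hypothetical sketch of a Llama 2 conversation template, in the spirit of
# fastchat/conversation.py. Illustrative only, not fschat's implementation.

def build_llama2_prompt(system, turns):
    """Render (user, assistant) turns into Llama 2's [INST] format.
    An assistant value of None marks the turn the model should complete."""
    prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
    for i, (user, assistant) in enumerate(turns):
        if i > 0:
            prompt += "[INST] "
        prompt += f"{user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant} "
    return prompt

prompt = build_llama2_prompt(
    "You are a helpful assistant.",
    [("Hello!", "Hi, how can I help?"), ("Tell me a joke.", None)],
)
```

Reproducing this nesting of system and instruction markers by hand with flat `--chat-prompt` and `--user-name` strings is exactly the difficulty the PR description mentions.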
Furthermore, there was an issue with presenting data in the event-source format. Each message must end with two `\n` characters rather than just one `\n`, i.e. the `data:` line must be followed by a blank line, which is what OpenAI does.
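The required framing can be sketched as a tiny helper; the payload shape below is illustrative, not the exact schema the proxy emits:

```python
import json

def sse_chunk(payload):
    # A server-sent event is terminated by a blank line: the "data:" line
    # ends with "\n", followed by one more "\n" (two in total).
    return f"data: {json.dumps(payload)}\n\n"

event = sse_chunk({"choices": [{"delta": {"content": "Hello"}}]})
```

With only a single trailing `\n`, many SSE clients buffer the event indefinitely instead of dispatching it, which is why the fix matters.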