This is an OpenAI-compatible API server for Anemll models. It provides a `/v1/chat/completions` endpoint that follows the OpenAI API format, as well as the `/v1/models` endpoint, which allows it to work with Open WebUI.
- OpenAI-compatible API
- Streaming responses
- System prompt, conversation history supported
- Works with Open WebUI (other frontends not tested, but should work as well)
A version of `chat_full.py` (from the `swift-inference` branch, which runs fastest for me) is included for convenience.
- Install the required dependencies, preferably in a conda or venv environment:

```shell
pip install -r requirements.txt
```

- You will also need to download an Anemll model. I have used this one from the official Anemll Hugging Face; 0.1.1 should also work fine.
Modify the `MODEL_DIR` variable in `server.py` to your Anemll model path:

```python
# Hardcoded model directory path
MODEL_DIR = "/example-path/anemll-Meta-Llama-3.2-1B-ctx2048_0.1.2"
```

Run the server with:

```shell
python server.py
```

The server will start on 0.0.0.0:8000 by default.
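Before launching, it can help to sanity-check the path you set. The sketch below only verifies that the directory exists and is non-empty; it does not validate the model files themselves, and the path shown is the placeholder from `server.py`:

```python
import os

# Placeholder path -- substitute the MODEL_DIR you set in server.py.
MODEL_DIR = "/example-path/anemll-Meta-Llama-3.2-1B-ctx2048_0.1.2"

def check_model_dir(path: str) -> bool:
    """True if the path is an existing, non-empty directory."""
    return os.path.isdir(path) and len(os.listdir(path)) > 0
```

If `check_model_dir(MODEL_DIR)` returns `False`, the server will not find the model.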
To connect Open WebUI, go to "Connections" in the settings and enter `http://0.0.0.0:8000/v1` as the base URL.
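Since the server speaks the OpenAI format, any OpenAI-style client can talk to it. Here is a minimal stdlib-only sketch of a non-streaming request, assuming the server is running on the default host and port (the helper names are mine, not part of this repo):

```python
import json
import urllib.request

# Assumes the server is running locally on the default port.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(user_message, system_prompt="You are a helpful assistant.",
                       temperature=0.7, stream=False):
    """Build an OpenAI-style chat completion payload."""
    return {
        "model": "anemll-model",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
        "stream": stream,
    }

def chat(user_message):
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(chat("Hello, how are you?"))` should print the model's reply.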
Rarely, the server will hit a GIL error when you try to generate a response after startup. Just restart the server; it will most likely work on the next run, and keep working from then on.
Anemll is still in its early stages, with a limited amount of models on Hugging Face and development of the core library still ongoing. This presents a unique opportunity to become an early contributor to this emerging technology. Whether you're interested in experimenting with the library, converting models, contributing code, or simply raising awareness - your involvement can help shape the future of on-device AI acceleration. The ANE represents a significant advancement in efficient ML inference, and community participation is vital to realizing its full potential.
The `/v1/chat/completions` endpoint follows the OpenAI API format for chat completions.

Example request:

```json
{
  "model": "anemll-model",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"}
  ],
  "temperature": 0.7,
  "stream": true
}
```

The `/v1/models` endpoint lists available models; it is needed to work with Open WebUI.
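When `"stream": true` is set, as in the example request above, the response arrives as server-sent events, each a `data: {json}` line ending with `data: [DONE]`. A minimal sketch of consuming such a stream (the chunk shape follows OpenAI's streaming format, which this server is assumed to match):

```python
import json

def iter_stream_content(lines):
    """Yield content deltas from OpenAI-style SSE lines.

    Each event is a 'data: {json}' line; the stream ends with
    'data: [DONE]'. Chunk shape is assumed to follow the OpenAI
    streaming format.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]
```

You can feed it the decoded lines of a streaming HTTP response and print each delta as it arrives.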
Non-streaming request:

```shell
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anemll-model",
    "messages": [
      {"role": "system", "content": "Whatever you do, always reply in ALL CAPS!"},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "stream": false
  }'
```

Streaming request:

```shell
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anemll-model",
    "messages": [
      {"role": "system", "content": "Whatever you do, always reply in ALL CAPS!"},
      {"role": "user", "content": "Hello, how are you?"}
    ],
    "temperature": 0.7,
    "stream": true
  }'
```

List available models:

```shell
curl http://localhost:8000/v1/models
```

Unofficial Discord server for Anemll:
If you're interested in a general AI system performing any kind of digital labour for you, visit:
MIT - Do whatever you want with this.