Summary
This RFC proposes removing the existing /v1/batches and /v1/files endpoints from the main OpenAI-compatible server and replacing them with a standalone offline batch processing service.
Note: As part of the ongoing OpenAI API refactor, the batch support has already been removed from the main server. This RFC serves to document the rationale and formalize the replacement plan.
Problem
7.1 Fundamental Issues with the Current Batch API (#7068)
The current design for online batch processing is flawed and not production-safe. Key issues include:
- Server Stability Risk: Uploading and processing thousands of requests at once can overwhelm online API servers.
- Timing Constraints: Difficult to enforce `completion_window` in a real-time environment.
- Resource Contention: Batch jobs run alongside latency-sensitive requests without proper isolation.
- Architecture Mismatch: Batch workloads are inherently asynchronous/offline, conflicting with the synchronous nature of standard OpenAI endpoints.
Proposed Solution
1. Simplify Online Endpoints
- Remove logic for handling list-wrapped input in `/v1/chat/completions`, `/v1/embeddings`, etc.
- Accept only a single request per HTTP call (OpenAI spec-compliant).
- Cleaner code and better performance for common-case usage.
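The simplified endpoints can reject list-wrapped bodies up front. A minimal sketch of that validation step, assuming the server has already parsed the HTTP body into a Python object (the function name is hypothetical, not an existing handler):

```python
def validate_single_request(body):
    """Reject list-wrapped bodies; online endpoints accept exactly one request.

    Hypothetical helper illustrating the simplified validation path:
    list-wrapped input is an error, directing callers to the batch runner.
    """
    if isinstance(body, list):
        raise ValueError(
            "List-wrapped input is not supported on online endpoints; "
            "submit a .jsonl job to the offline batch runner instead."
        )
    if not isinstance(body, dict):
        raise ValueError("Request body must be a single JSON object.")
    return body
```

Failing fast here keeps the common-case request path a single branch instead of a batch/non-batch dispatch.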
2. Split Out Batch Service
Implement batch processing as a separate offline job runner, modeled after how vLLM does it.
This batch runner will:
- Accept batch jobs in the OpenAI-compatible `.jsonl` format
- Spawn a new process/container to handle the job
- Stream output to a results file (local or presigned S3 URLs)
- Optionally enforce `completion_window` guarantees in the background
3. Remove from Main Server
- Remove `/v1/batches` and `/v1/files` routes from the main OpenAI-compatible HTTP server.
- These should live in a separate service (`batch-runner`) to enforce separation of concerns.
📌 Action Items