Cypher Batch creation is slow with vector indexes #3864

@vitoprr

Hello @lvca,

cc @ExtReMLapin

At the office, we are seeing a performance issue when executing OpenCypher batches via the HTTP API. The operation is fast on a fresh database, but latency increases significantly as the database is populated.

Query (in a .json file): result_payload.zip
(to be used with the below script, important!)

Database Backup: LATENCYTEST-backup-20260414-161848746.zip

Reproduction script:

import requests
import json
import time
import sys

# Configuration
SERVER_URL = "http://localhost:2480"
DATABASE_NAME = "LATENCYTEST" 
USERNAME = "root"
PASSWORD = "rootroot"
PAYLOAD_FILE = "result_payload.json"
THRESHOLD_SEC = 4.0

def run_reproduction():
    url = f"{SERVER_URL}/api/v1/command/{DATABASE_NAME}"
    auth = (USERNAME, PASSWORD)
    
    try:
        with open(PAYLOAD_FILE, 'r', encoding='utf-8') as f:
            payload = json.load(f)
    except FileNotFoundError:
        print(f"Error: {PAYLOAD_FILE} not found.")
        sys.exit(1)

    print(f"Target Database: {DATABASE_NAME}")
    print(f"Command Language: {payload.get('language')}")
    print(f"Batch Size: {len(payload.get('params', {}).get('batch', [])):,} entries")
    print("-" * 30)

    # perf_counter() is a monotonic clock, better suited to measuring elapsed time
    start_time = time.perf_counter()

    try:
        response = requests.post(
            url,
            json=payload,
            auth=auth,
            timeout=60
        )

        elapsed = time.perf_counter() - start_time

        if response.status_code == 200:
            print("Execution successful")
            if elapsed > THRESHOLD_SEC:
                print(f"⚠️  WARNING: Execution time is abnormally high: {elapsed:.4f}s (Threshold: {THRESHOLD_SEC}s)")
            else:
                print(f"Execution Time: {elapsed:.4f}s")
        else:
            print(f"HTTP Error {response.status_code}")
            print(f"Response: {response.text}")

    except requests.exceptions.RequestException as e:
        print(f"Connection error: {e}")

if __name__ == "__main__":
    run_reproduction()
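To show how latency evolves as the database fills up, the same request can be repeated and each run timed separately. This is only a minimal sketch: `measure_latencies`, `summarize`, and the `send` callable are hypothetical names that are not part of the script above, and `send` is assumed to wrap the `requests.post` call from `run_reproduction()`.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latencies(send: Callable[[], object], runs: int = 5) -> List[float]:
    """Call send() `runs` times and return the wall-clock duration of each call.

    `send` is a hypothetical caller-supplied function, e.g. a lambda wrapping
    the requests.post(...) call from the reproduction script.
    """
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        send()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Summarize a series of latencies; `growth` is last run divided by first run."""
    first, last = latencies[0], latencies[-1]
    return {
        "first": first,
        "last": last,
        "median": statistics.median(latencies),
        "growth": last / first if first > 0 else float("inf"),
    }
```

For example, `summarize(measure_latencies(lambda: requests.post(url, json=payload, auth=auth, timeout=60)))` would report whether the last run is noticeably slower than the first. Note that each run inserts the batch again, so the database grows between runs.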

Observed Result:

Target Database: LATENCYTEST
Command Language: opencypher
Batch Size: 3,423 entries
------------------------------
Execution successful
⚠️  WARNING: Execution time is abnormally high: 7.5667s (Threshold: 4.0s)

We measure 7.56s of latency on a database containing only 3 small documents (a few MBs total). For a Cypher query processing a batch of 3,423 vectors (dimension 1024), is this expected behavior and normal latency?
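To put these numbers in perspective, here is a small sketch of the per-entry arithmetic, plus a helper to split the batch into smaller chunks so the cost can be measured at different batch sizes. The function names are hypothetical, and the 4 bytes per float is an assumption (float32 storage); the JSON payload itself is text and will be larger than the raw estimate.

```python
from typing import List, Sequence


def per_entry_ms(elapsed_sec: float, entries: int) -> float:
    """Average time spent per batch entry, in milliseconds."""
    return elapsed_sec / entries * 1000.0


def raw_vector_bytes(entries: int, dim: int, bytes_per_float: int = 4) -> int:
    """Raw size of the vectors alone, ASSUMING float32 (4 bytes per component)."""
    return entries * dim * bytes_per_float


def chunked(batch: Sequence, size: int) -> List[list]:
    """Split a batch into consecutive chunks of at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [list(batch[i:i + size]) for i in range(0, len(batch), size)]


if __name__ == "__main__":
    # Numbers taken from the observed result above.
    print(f"{per_entry_ms(7.5667, 3423):.2f} ms per entry")
    print(f"{raw_vector_bytes(3423, 1024):,} bytes of raw vector data")
```

With the observed run, this works out to roughly 2.21 ms per entry for about 14 MB of raw vector data. Sending each chunk from `chunked(payload["params"]["batch"], 500)` as its own request (assuming the query only depends on the `batch` parameter) would show whether the time per entry stays flat or grows with the batch size.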

Have a wonderful day ~~
