feat: vector integrationTest;feat: ob quantization #6366

Merged
c121914yu merged 4 commits into main from ob-test on Feb 2, 2026

@c121914yu

Collaborator

No description provided.

yixin-zh and others added 4 commits January 31, 2026 14:58
)

Support OceanBase vector index quantization via VECTOR_VQ_LEVEL:
- 32 (default): hnsw + inner_product
- 8: hnsw_sq + inner_product (2-3x memory savings)
- 1: hnsw_bq + cosine (~15x memory savings)

HNSW_BQ requires cosine distance per OceanBase docs.
Tested on OceanBase 4.3.5.5 (BP5).

Closes #6202
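
The level-to-index mapping described in this commit message can be sketched as a small lookup. This is a minimal illustration, not the code merged in the PR: the names `ObIndexConfig`, `OB_INDEX_BY_LEVEL`, and `resolveObIndexConfig`, and the fallback to level 32 for unset or unrecognized values, are assumptions made for the example. Only the three level/index/distance pairings come from the commit message.

```typescript
// Hypothetical types and names; only the level -> index/distance pairs
// are taken from the commit message above.
type ObIndexType = 'hnsw' | 'hnsw_sq' | 'hnsw_bq';
type ObDistance = 'inner_product' | 'cosine';

interface ObIndexConfig {
  indexType: ObIndexType;
  distance: ObDistance;
}

// Level 32 is the documented default.
const DEFAULT_OB_INDEX: ObIndexConfig = { indexType: 'hnsw', distance: 'inner_product' };

const OB_INDEX_BY_LEVEL: Record<string, ObIndexConfig> = {
  '32': DEFAULT_OB_INDEX,
  '8': { indexType: 'hnsw_sq', distance: 'inner_product' },
  // HNSW_BQ is paired with cosine distance, as the commit message notes.
  '1': { indexType: 'hnsw_bq', distance: 'cosine' }
};

function resolveObIndexConfig(level: string | undefined): ObIndexConfig {
  // Assumption: unset or unrecognized levels fall back to the default.
  return OB_INDEX_BY_LEVEL[level ?? '32'] ?? DEFAULT_OB_INDEX;
}

console.log(resolveObIndexConfig('1'));
```

In this sketch the value would come from the `VECTOR_VQ_LEVEL` environment variable; keeping the pairs in one table makes it harder to combine `hnsw_bq` with a distance it does not support.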
…6358)

* feat: add test inclusion for vectorDB tests in vitest configuration

* refactor: update vectorDB README and setup for environment configuration

- Enhanced README to clarify the use of factory pattern for vectorDB integration tests.
- Updated instructions for setting up environment variables from a local file.
- Removed obsolete PG integration test file and adjusted test execution instructions.
- Improved structure explanation for shared test data and factory functions.
Copilot AI review requested due to automatic review settings February 2, 2026 10:38
@gru-agent

gru-agent Bot commented Feb 2, 2026

Contributor

There is too much information in the pull request to test.

@cla-assistant

cla-assistant Bot commented Feb 2, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
2 out of 3 committers have signed the CLA.

✅ YixinZ-NUS
✅ c121914yu
❌ alswl
You have signed the CLA already but the status is still pending? Let us recheck it.

@github-actions

github-actions Bot commented Feb 2, 2026

Preview mcp_server Image:

registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-pr:fatsgpt_mcp_server_45780f1850666b9c3a034184777535458009c57a

@github-actions

github-actions Bot commented Feb 2, 2026

Preview sandbox Image:

registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-pr:fatsgpt_sandbox_45780f1850666b9c3a034184777535458009c57a

@github-actions

github-actions Bot commented Feb 2, 2026

Docs Preview:


🚀 FastGPT Document Preview Ready!

🔗 👀 Click here to visit preview

@github-actions

github-actions Bot commented Feb 2, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 25.74% | 18812 / 73079 |
| 🔵 | Statements | 25.74% | 18812 / 73079 |
| 🔵 | Functions | 38.31% | 587 / 1532 |
| 🔵 | Branches | 71.07% | 2008 / 2825 |

File Coverage (Changed Files)

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
| --- | --- | --- | --- | --- | --- |
| packages/service/common/vectorDB/constants.ts | 0% | 0% | 0% | 0% | 1-82 |
| packages/service/common/vectorDB/controller.ts | 96.73% | 86.95% | 100% | 96.73% | 25-28 |
| packages/service/common/vectorDB/milvus/index.ts | 0% | 0% | 0% | 0% | 1-314 |
| packages/service/common/vectorDB/oceanbase/controller.ts | 0% | 0% | 0% | 0% | 1-242 |
| packages/service/common/vectorDB/oceanbase/index.ts | 0% | 100% | 100% | 0% | 2-215 |
| packages/service/common/vectorDB/seekdb/index.ts | 100% | 100% | 100% | 100% | |

Generated in workflow #3683 for commit 45780f1 by the Vitest Coverage Report Action

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces comprehensive integration tests for vector databases (PostgreSQL, OceanBase, SeekDB, Milvus) and adds support for OceanBase quantization with three levels (32, 8, 1) using different HNSW index types. The changes include refactoring the OceanBase/SeekDB controllers to support multiple database types, updating deployment configurations, and fixing Milvus API compatibility issues.

Changes:

  • Added integration test suite for vector databases with factory pattern for shared test cases
  • Implemented OceanBase vector quantization support with configurable HNSW index types (hnsw, hnsw_sq, hnsw_bq)
  • Updated SeekDB and OceanBase docker-compose configurations with corrected connection strings and environment variables

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 20 comments.

| File | Description |
| --- | --- |
| vitest.config.mts | Excludes vector DB integration tests from main test suite |
| test/setup.ts | Minor whitespace formatting change |
| test/integrationTest/vectorDB/yml/docker-compose.yml | Docker compose configuration for test databases (PG, Milvus, OceanBase, SeekDB) |
| test/integrationTest/vectorDB/vitest.config.mts | Vitest configuration for vector DB integration tests |
| test/integrationTest/vectorDB/utils.ts | Utility for loading environment variables from .env.test.local |
| test/integrationTest/vectorDB/testSuites.ts | Reusable test suite factory for all vector databases |
| test/integrationTest/vectorDB/testData.ts | Test fixtures with vector data and ID generators |
| test/integrationTest/vectorDB/setup.ts | Test setup file that loads environment variables |
| test/integrationTest/vectorDB/seekdb/index.integration.test.ts | SeekDB integration tests |
| test/integrationTest/vectorDB/pg/index.integration.test.ts | PostgreSQL integration tests |
| test/integrationTest/vectorDB/oceanbase/index.integration.test.ts | OceanBase integration tests |
| test/integrationTest/vectorDB/milvus/index.integration.test.ts | Milvus integration tests |
| test/integrationTest/vectorDB/globalSetup.ts | Global setup for vector DB tests with environment logging |
| test/integrationTest/vectorDB/README.md | Documentation for vector DB integration tests |
| test/integrationTest/vectorDB/.env.test.tempalte | Environment template for test configuration (contains typo in filename) |
| test/integrationTest/READMD.md | Integration test directory documentation (contains typo in filename) |
| projects/app/.env.template | Updated vector quantization documentation and configuration |
| packages/service/common/vectorDB/seekdb/index.ts | Removed duplicate export |
| packages/service/common/vectorDB/oceanbase/index.ts | Refactored to support both OceanBase and SeekDB with quantization |
| packages/service/common/vectorDB/oceanbase/controller.ts | Refactored ObClass to support multiple database types |
| packages/service/common/vectorDB/milvus/index.ts | Fixed Milvus search API parameters and type handling |
| packages/service/common/vectorDB/controller.ts | Updated to pass type parameter to OceanBase/SeekDB constructors |
| packages/service/common/vectorDB/constants.ts | Added OceanBaseIndexConfig for quantization support |
| package.json | Added test:vector npm script |
| document/public/deploy/docker/* | Updated docker-compose files with corrected database URLs and ports |
| document/data/doc-last-modified.json | Updated documentation modification timestamp |
| document/content/docs/upgrading/4-14/4147.mdx | Added changelog entry for vector DB integration tests |
| document/content/docs/toc.mdx | Added link to version 4.14.7 documentation |
| deploy/templates/vector/* | Updated deployment templates with corrected configurations |
| deploy/init.mjs | Updated initial configuration with correct database URLs |
| deploy/docker/* | Updated docker configurations with corrected database URLs |

Comment on lines +1 to +5
# FastGPT Integration Tests

## Contents

- vectorDB: vector databases
No newline at end of file

Copilot AI Feb 2, 2026

The file name has a typo: 'READMD.md' should be 'README.md'. This should follow the standard README naming convention.

Copilot uses AI. Check for mistakes.
Comment on lines +50 to +51
- ./seekdb/data:/var/lib/mysql
- ./seekdb/config:/etc/mysql/conf.d

Copilot AI Feb 2, 2026

The volume path change from '../seekdb/data' to './seekdb/data' could break existing deployments if users have data in the old path. This is a potentially breaking change that should be documented in the upgrade guide or migration notes. Consider keeping backward compatibility or providing clear migration instructions.

Comment on lines +38 to +42
- **fixtures.ts**: Shared test data (`TEST_TEAM_ID`, `TEST_DATASET_ID`, `TEST_COLLECTION_ID`, 1536-dimensional `TEST_VECTORS`), used by all vector databases.
- **factory.ts**: Factory function `runVectorDBTests(driver)`; the same set of cases (init, insert, getVectorCount, embRecall, getVectorDataByTime, delete) is reused by every driver.
- **integration.test.ts**: Registers each driver (PG, later Oceanbase/Milvus) and decides whether to skip based on `driver.envKey`; every driver runs the same `runVectorDBTests(driver)`.

To add a vector database: add one entry (`name`, `envKey`, `createCtrl`) to the `drivers` array in `integration.test.ts`; fixtures and factory need no changes.

Copilot AI Feb 2, 2026

The documentation refers to 'fixtures.ts', 'factory.ts', and 'integration.test.ts' files, but the actual implementation uses different file names ('testData.ts', 'testSuites.ts', and individual test files per database). The documentation should be updated to reflect the actual file structure.

Suggested change
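-- **fixtures.ts**: Shared test data (`TEST_TEAM_ID`, `TEST_DATASET_ID`, `TEST_COLLECTION_ID`, 1536-dimensional `TEST_VECTORS`), used by all vector databases.
-- **factory.ts**: Factory function `runVectorDBTests(driver)`; the same set of cases (init, insert, getVectorCount, embRecall, getVectorDataByTime, delete) is reused by every driver.
-- **integration.test.ts**: Registers each driver (PG, later Oceanbase/Milvus) and decides whether to skip based on `driver.envKey`; every driver runs the same `runVectorDBTests(driver)`.
-To add a vector database: add one entry (`name`, `envKey`, `createCtrl`) to the `drivers` array in `integration.test.ts`; fixtures and factory need no changes.
+- **testData.ts**: Shared test data (`TEST_TEAM_ID`, `TEST_DATASET_ID`, `TEST_COLLECTION_ID`, 1536-dimensional `TEST_VECTORS`), used by all vector databases.
+- **testSuites.ts**: Factory function `runVectorDBTests(driver)`; the same set of cases (init, insert, getVectorCount, embRecall, getVectorDataByTime, delete) is reused by every driver.
+- **Per-driver integration test files**: Provide a separate test entry point for each vector database (PG, later Oceanbase/Milvus, etc.) and decide whether to skip based on the corresponding environment variable; every driver runs the same `runVectorDBTests(driver)`.
+To add a vector database: add an integration test file for that driver and reuse `testData.ts` and `testSuites.ts` in it; these two shared files need no changes.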
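
The factory pattern this README describes can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `testSuites.ts`: the `VectorCtrl` interface, `buildVectorDBCases`, `expectEqual`, and the in-memory `createMemoryCtrl` stand-in are all hypothetical, and the real suite runs under Vitest against live databases with more cases than shown here.

```typescript
// Hypothetical shapes for illustration; FastGPT's real controller interface may differ.
interface VectorCtrl {
  insert(datasetId: string, vector: number[]): Promise<void>;
  count(datasetId: string): Promise<number>;
  remove(datasetId: string): Promise<void>;
}

interface TestCase {
  name: string;
  run: () => Promise<void>;
}

function expectEqual(actual: number, expected: number, label: string): void {
  if (actual !== expected) throw new Error(`${label}: expected ${expected}, got ${actual}`);
}

// Factory: every driver gets the same cases, bound to its own controller.
function buildVectorDBCases(driverName: string, ctrl: VectorCtrl): TestCase[] {
  const datasetId = `test_dataset_${driverName}`;
  return [
    {
      name: `${driverName}: insert then count`,
      run: async () => {
        await ctrl.insert(datasetId, [0.1, 0.2, 0.3]);
        expectEqual(await ctrl.count(datasetId), 1, 'count after insert');
      }
    },
    {
      name: `${driverName}: delete then count`,
      run: async () => {
        await ctrl.remove(datasetId);
        expectEqual(await ctrl.count(datasetId), 0, 'count after delete');
      }
    }
  ];
}

// In-memory stand-in so the sketch runs without a database.
function createMemoryCtrl(): VectorCtrl {
  const rows = new Map<string, number[][]>();
  return {
    insert: async (datasetId, vector) => {
      rows.set(datasetId, [...(rows.get(datasetId) ?? []), vector]);
    },
    count: async (datasetId) => rows.get(datasetId)?.length ?? 0,
    remove: async (datasetId) => {
      rows.delete(datasetId);
    }
  };
}

// A per-database test file then only supplies its own controller.
const pgCases = buildVectorDBCases('pg', createMemoryCtrl());
const seekdbCases = buildVectorDBCases('seekdb', createMemoryCtrl());
```

With this shape, adding a database means writing one new entry file that passes its controller to the factory; the shared cases and data stay untouched.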

Comment on lines +6 to +8
const describePg = isEnabled ? describe : describe.skip;

describePg('Seekdb Vector Integration', () => {

Copilot AI Feb 2, 2026

The variable name 'describePg' is used for all vector database tests (SeekDB, PG, OceanBase, Milvus), not just PostgreSQL. This naming is misleading and should be renamed to something more generic like 'describeDB' or 'conditionalDescribe'.

Suggested change
-const describePg = isEnabled ? describe : describe.skip;
-describePg('Seekdb Vector Integration', () => {
+const describeDB = isEnabled ? describe : describe.skip;
+describeDB('Seekdb Vector Integration', () => {
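
Since the same `isEnabled ? describe : describe.skip` line appears in several test files, the selection could also live in one shared helper. The sketch below is an assumption, not part of the PR: `describeIf`, `DescribeApi`, and the `fakeDescribe` stand-in are hypothetical names, and the fake is there only so the example runs without a test runner. In a real test file the `describe` imported from Vitest would be passed instead.

```typescript
type DescribeFn = (name: string, body: () => void) => void;

// Structural type: anything callable that also has a `.skip` variant,
// which is the shape of a test runner's `describe`.
type DescribeApi = DescribeFn & { skip: DescribeFn };

// Hypothetical helper: pick the running or skipping variant once.
function describeIf(enabled: boolean, api: DescribeApi): DescribeFn {
  return enabled ? api : api.skip;
}

// Fake `describe` that records what happened instead of running tests.
const log: string[] = [];
const fakeDescribe: DescribeApi = Object.assign(
  (name: string, body: () => void) => {
    log.push(`run ${name}`);
    body();
  },
  {
    skip: (name: string, _body: () => void) => {
      log.push(`skip ${name}`);
    }
  }
);

// With no connection string configured, the suite is skipped.
const env: Record<string, string | undefined> = { SEEKDB_URL: undefined };
const describeDB = describeIf(Boolean(env.SEEKDB_URL), fakeDescribe);
describeDB('Seekdb Vector Integration', () => {});

console.log(log); // [ 'skip Seekdb Vector Integration' ]
```

Vitest also provides `describe.runIf(condition)` and `describe.skipIf(condition)`, which may make a custom helper unnecessary.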

Comment on lines +6 to +8
const describePg = isEnabled ? describe : describe.skip;

describePg('PG Vector Integration', () => {

Copilot AI Feb 2, 2026

The variable name 'describePg' is used for all vector database tests (SeekDB, PG, OceanBase, Milvus), not just PostgreSQL. This naming is misleading and should be renamed to something more generic like 'describeDB' or 'conditionalDescribe'.

Suggested change
-const describePg = isEnabled ? describe : describe.skip;
-describePg('PG Vector Integration', () => {
+const describeDB = isEnabled ? describe : describe.skip;
+describeDB('PG Vector Integration', () => {

Comment on lines +6 to +8
const describePg = isEnabled ? describe : describe.skip;

describePg('Oceanbase Vector Integration', () => {

Copilot AI Feb 2, 2026

The variable name 'describePg' is used for all vector database tests (SeekDB, PG, OceanBase, Milvus), not just PostgreSQL. This naming is misleading and should be renamed to something more generic like 'describeDB' or 'conditionalDescribe'.

Suggested change
-const describePg = isEnabled ? describe : describe.skip;
-describePg('Oceanbase Vector Integration', () => {
+const describeDB = isEnabled ? describe : describe.skip;
+describeDB('Oceanbase Vector Integration', () => {

Comment on lines +1 to +10
VECTOR_VQ_LEVEL=32
# PG
PG_URL=postgresql://username:password@localhost:6001/postgres
# OceanBase can be tested against a cloud service
# OCEANBASE_URL=mysql://root%40tenantname:tenantpassword@localhost:6005/mysql
# SeekDB vector database connection
SEEKDB_URL=mysql://root:seekdbpassword@127.0.0.1:6003/mysql
# Milvus vector database connection
MILVUS_ADDRESS=http://localhost:6002
MILVUS_TOKEN=
No newline at end of file

Copilot AI Feb 2, 2026

The file name has a typo: 'tempalte' should be 'template'. This should be '.env.test.template' to match the naming convention used in the codebase and referenced in the README.
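
One detail of the template worth spelling out: the commented `OCEANBASE_URL` writes the user as `root%40tenantname`, where `%40` is a percent-encoded `@`, so that the `@` inside the user name is not mistaken for the separator before the host. The sketch below shows how such a URL decomposes using the standard `URL` class. `parseDbUrl` and `DbConnection` are hypothetical names for this example, and how FastGPT itself parses these variables is not shown in this thread.

```typescript
interface DbConnection {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Hypothetical helper built on the standard URL class.
function parseDbUrl(raw: string): DbConnection {
  const url = new URL(raw);
  return {
    // Credentials stay percent-encoded in the URL object, so decode them.
    user: decodeURIComponent(url.username),
    password: decodeURIComponent(url.password),
    host: url.hostname,
    port: Number(url.port),
    // Strip the leading slash from the path to get the database name.
    database: url.pathname.replace(/^\//, '')
  };
}

const ob = parseDbUrl('mysql://root%40tenantname:tenantpassword@localhost:6005/mysql');
console.log(ob.user); // root@tenantname
console.log(ob.port); // 6005
```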

Comment on lines +55 to +56
- ./ob/data:/root/ob
- ./ob/config:/root/.obd/cluster

Copilot AI Feb 2, 2026

The volume path change from '../ob/data' to './ob/data' could break existing deployments if users have data in the old path. This is a potentially breaking change that should be documented in the upgrade guide or migration notes. Consider keeping backward compatibility or providing clear migration instructions.

Comment on lines +21 to +22
teamId: `test_team`,
datasetId: `test_dataset_${suffix}`

Copilot AI Feb 2, 2026

The teamId is hardcoded as 'test_team' for all tests, which could cause test isolation issues if multiple test suites run concurrently. While the datasetId has a unique suffix, sharing the same teamId across concurrent tests could lead to race conditions or data conflicts, especially when tests clean up by teamId. Consider adding the same unique suffix to teamId to ensure complete test isolation.
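
A minimal sketch of the isolation this comment asks for, assuming a helper that derives both IDs from one suffix. `createTestIds` and `TestIds` are hypothetical names; the `test_team` and `test_dataset_` prefixes are the ones from the reviewed snippet, and the use of `randomUUID` from Node's `crypto` module is one possible choice of suffix, not what the PR does.

```typescript
import { randomUUID } from 'node:crypto';

interface TestIds {
  teamId: string;
  datasetId: string;
}

// Hypothetical helper: the same suffix goes on both IDs, so cleanup by
// teamId in one suite cannot touch rows created by another suite.
function createTestIds(suffix: string = randomUUID().slice(0, 8)): TestIds {
  return {
    teamId: `test_team_${suffix}`,
    datasetId: `test_dataset_${suffix}`
  };
}

const first = createTestIds();
const second = createTestIds();
console.log(first.teamId, second.teamId);
```

Passing an explicit suffix (for example the driver name) keeps the IDs readable in database logs, while the random default covers suites that do not care.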

exclude: ['node_modules', 'dist'],
testTimeout: 60000,
hookTimeout: 60000,
fileParallelism: false,

Copilot AI Feb 2, 2026

The test configuration sets 'fileParallelism: false', which means tests will run sequentially. However, the test suite structure with conditional describe blocks and unique dataset IDs suggests tests could potentially run in parallel. The comment or documentation should clarify whether this is a temporary limitation or a design decision, especially since parallel execution could significantly speed up test runs.
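
One way to address this comment is to record the reason next to the setting. The sketch below repeats the four options from the reviewed snippet as a plain object; `vectorDBTestConfig` is a hypothetical name, and the stated rationale (test files sharing the same database instances and a fixed `teamId`) is an assumption drawn from the other review comments, not something the PR author confirmed.

```typescript
// Plain object for illustration; in a real vitest.config.mts this would be
// passed to defineConfig from 'vitest/config'.
const vectorDBTestConfig = {
  test: {
    exclude: ['node_modules', 'dist'],
    // Timeouts in milliseconds; database startup and index builds can be slow.
    testTimeout: 60000,
    hookTimeout: 60000,
    // Assumed rationale: test files talk to shared database instances and
    // clean up by a shared teamId, so files run one at a time to avoid
    // interfering with each other.
    fileParallelism: false
  }
};

console.log(vectorDBTestConfig.test.fileParallelism); // false
```

If IDs were made unique per suite, this flag could likely be revisited to shorten test runs.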

@c121914yu c121914yu merged commit 64f70a4 into main Feb 2, 2026
15 of 16 checks passed
@github-actions

github-actions Bot commented Feb 2, 2026

Preview fastgpt Image:

registry.cn-hangzhou.aliyuncs.com/fastgpt/fastgpt-pr:fatsgpt_45780f1850666b9c3a034184777535458009c57a

@c121914yu c121914yu deleted the ob-test branch February 2, 2026 11:40
4 participants