refactor: test suite and fix test infrastructure by longbingljw · Pull Request #91 · oceanbase/pyseekdb

longbingljw · 2025-12-22T11:56:02Z

Summary

This PR refactors the entire test suite to use pytest fixtures, significantly reducing code duplication and improving maintainability. It also separates unit tests from integration tests and updates the CI configuration accordingly.
57% less code in test files
close #90

Solution Description

Separated unit tests from integration tests
Refactored files (all using new fixture pattern)

Summary by CodeRabbit

Tests
- Reorganized and expanded test coverage with clear unit vs integration separation.
- Refactored integration suites to run across embedded, server, and OceanBase modes via parameterized fixtures.
- Added many new end-to-end integration scenarios for client, collection, query, hybrid search, embedding, admin, security, and migration-like workflows.
Chores
- CI updated with dedicated unit-test and integration-test jobs and a test matrix for multi-mode integration runs.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

coderabbitai · 2025-12-22T11:56:12Z

Walkthrough

Consolidates and refactors tests: adds a pytest conftest with parameterized fixtures for embedded/server/oceanbase modes, moves many per-mode tests into integration_tests using those fixtures, removes legacy per-mode test files, adds a unit test for Version, and updates CI to separate unit and integration jobs with a test-mode matrix.

Changes

Cohort / File(s)	Summary
CI/CD Workflow `\.github/workflows/ci.yml`	Split single "test" job into `unit-test` and `integration-test`; integration job uses a matrix over `test_mode` (`embedded`, `server`, `oceanbase`), installs Python/Poetry, and runs pytest per-mode; conditional container start limited to server mode; added env vars for integration tests.
Test fixtures & infrastructure `tests/integration_tests/conftest.py`	New pytest conftest: environment-driven constants, factory helpers and client/admin-client creators for embedded/server/oceanbase, parameterized `db_client` and `admin_client` fixtures, connection validation, and teardown logic.
Refactored integration tests — admin & management `tests/integration_tests/test_admin_database_management.py`	New admin-client-driven integration test for database CRUD operations using `admin_client` (parameterized across modes).
Refactored integration tests — client & collection lifecycle `tests/integration_tests/test_client_creation.py`, `tests/integration_tests/test_collection_dml.py`, `tests/integration_tests/test_collection_embedding_function.py`	Reworked collection/client tests to use `db_client` fixture; consolidated per-mode variants into single parameterized tests covering creation, collection management, DML, and embedding-function scenarios.
Refactored integration tests — search & hybrid scenarios `tests/integration_tests/test_collection_hybrid_search.py`, `tests/integration_tests/test_collection_hybrid_search_builder_integration.py`, `tests/integration_tests/test_collection_hybrid_search_builder_integration.py`, `tests/integration_tests/test_collection_hybrid_search_builder_integration.py`, `tests/integration_tests/test_collection_hybrid_search_builder_integration.py`	New/refactored hybrid-search and builder-based tests converted to use `db_client`; cover full-text, vector, combined ranking, metadata filters, logical ops, and ID/scalar operators (parameterized across modes).
Refactored integration tests — queries, get, and utilities `tests/integration_tests/test_collection_query.py`, `tests/integration_tests/test_collection_get.py`, `tests/integration_tests/test_default_embedding_function.py`, `tests/integration_tests/test_detect_db_type_and_version.py`	Added/updated integration tests for query/get APIs, default embedding behavior, and detect_db_type_and_version — all adapted to fixture-driven execution.
Refactored integration tests — misc `tests/integration_tests/test_empty_value_handling.py`, `tests/integration_tests/test_fulltext_parser_config.py`, `tests/integration_tests/test_offical_case.py`, `tests/integration_tests/test_security_sql_injection.py`	Consolidated empty-value, fulltext parser, official example, and security SQL-injection tests to use `db_client` (single parameterized entry points replacing per-mode methods).
Removed legacy per-mode tests `tests/test_admin_database_management.py`, `tests/test_client_creation.py`, `tests/test_collection_dml.py`, `tests/test_collection_hybrid_search.py`, `tests/test_collection_hybrid_search_builder_integration.py`, `tests/test_collection_query.py`, `tests/test_default_embedding_function.py`, `tests/test_detect_db_type_and_version.py`, `tests/test_offical_case.py`	Removed old test files that implemented separate embedded/server/oceanbase variants in favor of the refactored integration_tests versions.
New unit tests `tests/unit_tests/test_version.py`	Added unit tests for the Version class: parsing/normalization to four parts and rich comparison operator checks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas to focus on:
- tests/integration_tests/conftest.py — fixture parameterization, env-var handling, client factories, and teardown semantics.
- CI workflow (.github/workflows/ci.yml) — ensure matrix logic, container startup conditions, and env propagation are correct.
- Representative refactored integration tests (e.g., collection DML, hybrid search, admin tests) — confirm they correctly use db_client/admin_client and that assertions/cleanup remain valid across backends.

Possibly related PRs

fix: metadata $in/$nin operators not matching JSON array fields #87 — Fixes $in/$nin SQL generation for JSON arrays; directly relevant to new tests covering metadata array $in/$nin cases.

Suggested reviewers

hnwyllmm

Poem

🐰 I hopped through fixtures, tidy and spry,
One test per flow beneath the test sky.
No duplicates left in my burrowed den,
Matrixed CI hums — run tests three times then.
Happy hops for embedded, server, and sea!

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'refactor: test suite and fix test infrastructure' clearly and concisely describes the primary changes: refactoring the test suite and fixing test infrastructure via pytest fixtures.
Linked Issues check	✅ Passed	The PR successfully meets the requirements of issue #90 by consolidating duplicated test logic across embedded, server, and OceanBase modes via parameterized db_client fixtures, reducing test code duplication by approximately 57%.
Out of Scope Changes check	✅ Passed	All changes align with the stated objectives: CI workflow updates, conftest.py fixture additions, refactored integration tests using db_client fixtures, and unit test additions are all within scope for test infrastructure improvements and code duplication reduction.
Docstring Coverage	✅ Passed	Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

longbingljw · 2025-12-22T11:57:13Z

test_collection_get.py was not refactored because #87 was not merged.I am afraid it will be a conflict.

longbingljw · 2025-12-22T11:57:58Z

test_collection_get.py was not refactored because #87 was not merged.I am afraid it will be a conflict.

After #87 being merged,it will also refactored.

coderabbitai

Actionable comments posted: 9

🧹 Nitpick comments (24)

.github/workflows/ci.yml (2)
8-8: Remove temporary branch trigger before merging.

The refactor-test branch trigger appears to be temporary for testing this PR. Remove it before merging to avoid triggering CI on a feature branch indefinitely.
🔎 Proposed fix
       - main
       - develop
-      - refactor-test 
24-27: Update actions/setup-python to v5.

The static analysis tool correctly flags that actions/setup-python@v3 is outdated. Version 5 is the current stable release and includes important bug fixes and performance improvements.
🔎 Proposed fix
       - name: Set up Python
-        uses: actions/setup-python@v3
+        uses: actions/setup-python@v5
         with:
           python-version: '3.11'
Apply the same change at line 63 for the integration-test job.
tests/unit_tests/test_version.py (2)
13-41: Consider splitting into focused test methods for better test isolation.

While the single test covers all functionality, splitting into smaller methods (e.g., test_comparison_operators, test_equality, test_normalization, test_string_representation) would improve test isolation and make failures easier to diagnose.

39-39: Remove extraneous f-prefix from string literal.

This string has no placeholders, so the f prefix is unnecessary.
🔎 Proposed fix
-        print(f"\n✅ Version comparison tests passed")
+        print("\n✅ Version comparison tests passed")
tests/integration_tests/test_collection_dml.py (2)
16-191: Test is comprehensive but lacks explicit cleanup.

The test relies on fixture cleanup, but doesn't explicitly delete the created collection. Consider adding cleanup in a finally block or using pytest.fixture with cleanup to ensure test isolation, especially if the fixture doesn't drop collections.
🔎 Suggested cleanup pattern
def test_collection_dml(self, db_client):
    collection_name = f"test_dml_{int(time.time() * 1000)}"
    try:
        # ... existing test code ...
    finally:
        # Cleanup: drop the test collection
        try:
            db_client.delete_collection(name=collection_name)
        except Exception:
            pass
42-42: Multiple f-strings without placeholders throughout the file.

Lines 42, 60, 79, 95, 111, 127, 144, 156, 159, 166, 169, 185, and 188 use f-strings without any placeholders. Remove the f prefix from these string literals.
🔎 Example fixes
-        print(f"\n✅ Testing collection.add() - single item")
+        print("\n✅ Testing collection.add() - single item")
Apply similar changes to all flagged lines.
tests/integration_tests/test_detect_db_type_and_version.py (1)
32-32: Remove extraneous f-prefix from string literals.

Lines 32, 53, 72, and 93 use f-strings without placeholders.
🔎 Proposed fix for line 32
-        print(f"\n✅ Successfully detected seekdb Server")
+        print("\n✅ Successfully detected seekdb Server")
Apply similar changes to lines 53, 72, and 93.
tests/integration_tests/test_collection_query.py (1)
121-121: Multiple f-strings without placeholders throughout the file.

Lines 121, 136, 147, 158, 169, 185, 201, 214, 225, and 236 use f-strings without placeholders.
🔎 Example fix
-        print(f"\n✅ Testing basic query")
+        print("\n✅ Testing basic query")
tests/integration_tests/test_offical_case.py (2)
1-4: Filename typo: "offical" should be "official"

The filename test_offical_case.py contains a typo. Consider renaming to test_official_case.py for consistency and discoverability.

81-87: Collection cleanup may be missing

The comment states cleanup is handled by the db_client fixture, but the fixture only closes the client connection—it doesn't delete created collections. This could leave test artifacts in the database across runs.

Consider explicitly deleting the collection in a finally block:
🔎 Proposed fix
     def test_official_example(self, db_client):
         """
         Official example using client (automatic mode selection).
         
         Automatically runs for: embedded, server, oceanbase
         """
         collection_name = f"official_example_{int(time.time() * 1000)}"
         collection = db_client.get_or_create_collection(name=collection_name)
         
-        # Run the official example workflow
-        _run_official_example(collection)
-        
-        # Note: cleanup is handled automatically by the db_client fixture
+        try:
+            # Run the official example workflow
+            _run_official_example(collection)
+        finally:
+            # Cleanup: delete the test collection
+            try:
+                db_client.delete_collection(name=collection_name)
+            except Exception:
+                pass
tests/integration_tests/test_security_sql_injection.py (1)
98-98: Remove extraneous f-string prefixes

Multiple print statements use f-strings without any placeholders. While functionally harmless, removing the f prefix improves consistency.

Affected lines: 98, 120, 141, 156, 160, 192, 196, 231, 235, 271-272, 276, 308-309, 313, 364.
🔎 Example fix
-        print(f"\n🔒 Running security SQL injection tests")
+        print("\n🔒 Running security SQL injection tests")
Also applies to: 120-120, 141-141, 156-156
tests/integration_tests/test_client_creation.py (2)

17-246: Consider splitting this monolithic test

This single test method spans ~230 lines and covers 16 distinct test scenarios. While comprehensive, this makes failures harder to diagnose and increases test coupling.

Consider splitting into focused test methods (e.g., test_create_collection, test_get_collection, test_count_and_peek) that can fail independently while sharing setup via class-level fixtures.

71-118: Direct SQL execution couples test to internal implementation

Using db_client._server._execute() to verify table structure bypasses the public API and creates tight coupling to internal implementation details. If the internal structure changes, this test will break.

Consider using public collection APIs for verification where possible, or documenting this as an intentional implementation detail check.

tests/integration_tests/test_default_embedding_function.py (2)

83-84: Consider using a more robust wait mechanism than time.sleep(1).

A fixed 1-second sleep may be insufficient in some environments or excessive in others. Consider implementing a polling-based approach or making the wait time configurable.

158-170: Unit test test_dimension_of should be in unit tests directory.

This test doesn't require the db_client fixture and is purely a unit test for the dimension_of function. Consider moving it to a unit test module for clearer separation.
tests/integration_tests/test_fulltext_parser_config.py (1)
178-196: Single test method iterating through parsers reduces parameterization benefits.

While this approach works, running all parser tests within a single test_fulltext_parser method means a failure in one parser test will prevent subsequent parsers from being tested. Consider using @pytest.mark.parametrize for the parsers list to get independent test runs per parser.
🔎 Example of parametrized approach
@pytest.mark.parametrize("parser_name", ['ik', 'space', 'ngram', 'ngram2', 'beng'])
def test_fulltext_parser_config(self, db_client, parser_name):
    """Test each fulltext parser configuration."""
    self._test_fulltext_parser_config(db_client, parser_name)

def test_fulltext_parser_with_params(self, db_client):
    """Test parser with parameters."""
    self._test_fulltext_parser_config(db_client, 'ngram', params={'ngram_token_size': 3})

def test_default_parser_fallback(self, db_client):
    """Test default parser."""
    self._test_default_parser(db_client)

def test_backward_compatibility_hnsw_only(self, db_client):
    """Test backward compatibility."""
    self._test_backward_compatibility(db_client)
tests/integration_tests/test_collection_hybrid_search.py (1)
17-29: Use explicit Optional type annotation for nullable parameters.

The dimension parameter can be None but lacks explicit type annotation.
🔎 Proposed fix
+from typing import List, Optional
+
 class TestCollectionHybridSearchRefactored:
     """Test collection.hybrid_search() interface using parameterized db_client fixture"""
     
-    def _create_test_collection(self, client, collection_name: str, dimension: int = None):
+    def _create_test_collection(self, client, collection_name: str, dimension: Optional[int] = None):
         """Helper method to create a test collection"""
tests/integration_tests/conftest.py (2)
10-12: Consider using relative imports or proper package installation instead of sys.path manipulation.

Directly modifying sys.path can lead to import issues. If the tests are run via pytest from the project root with the package installed in development mode (pip install -e .), this shouldn't be necessary.

177-181: Bare except in cleanup could mask important errors during teardown.

While silently ignoring cleanup failures is often acceptable, consider catching Exception explicitly and optionally logging for debugging.
🔎 Proposed fix
     try:
         if hasattr(client, 'close'):
             client.close()
-    except:
+    except Exception:
         pass
tests/integration_tests/test_collection_embedding_function.py (2)
14-37: Duplicate Simple3DEmbeddingFunction class across test files.

This class is duplicated in tests/integration_tests/test_collection_get.py (per relevant code snippets). Consider extracting it to a shared test utilities module or the conftest.py.
🔎 Example: Move to conftest.py
# In tests/integration_tests/conftest.py

class Simple3DEmbeddingFunction:
    """Simple embedding function that returns 3-dimensional vectors for testing"""
    
    def __init__(self):
        self.dimension = 3
    
    def __call__(self, input: Union[str, List[str]]) -> List[List[float]]:
        """Convert documents to 3D embeddings (simple hash-based)"""
        if isinstance(input, str):
            input = [input]
        
        embeddings = []
        for doc in input:
            hash_val = hash(doc) % 1000
            embedding = [
                float((hash_val % 10) / 10.0),
                float(((hash_val // 10) % 10) / 10.0),
                float(((hash_val // 100) % 10) / 10.0)
            ]
            embeddings.append(embedding)
        
        return embeddings
218-218: Unused variable created_collection is intentional.

The variable is assigned for side effects (creating the collection) and the actual test retrieves it separately. Consider using _ prefix or removing the assignment entirely.
🔎 Proposed fix
-        created_collection = db_client.create_collection(
+        db_client.create_collection(
             name=collection_name,
             configuration=config,
             embedding_function=None
         )
tests/integration_tests/test_collection_hybrid_search_builder_integration.py (3)
34-46: Fix implicit Optional type hint.

Per PEP 484, the type annotation should explicitly use Optional or | None when defaulting to None. As flagged by static analysis.
🔎 Proposed fix
-    def _create_test_collection(self, client, collection_name: str, dimension: int = None):
+    def _create_test_collection(self, client, collection_name: str, dimension: int | None = None):
48-52: Avoid mutable default argument.

Using a mutable list as a default argument is a Python anti-pattern. While safe here since it's only read, it's better to use None and initialize inside the function. As flagged by static analysis.
🔎 Proposed fix
-    def _generate_query_vector(self, dimension: int, base_vector: List[float] = [1.0, 2.0, 3.0]) -> List[float]:
+    def _generate_query_vector(self, dimension: int, base_vector: List[float] | None = None) -> List[float]:
+        if base_vector is None:
+            base_vector = [1.0, 2.0, 3.0]
         if dimension <= len(base_vector):
             return base_vector[:dimension]
146-146: Remove extraneous f prefix from strings without placeholders.

Multiple print statements use f-strings without any placeholders. This applies to lines 146, 164, 192, 226, 255, 296, 335, 349, 363, 374. As flagged by static analysis.
🔎 Example fix
-        print(f"\n✅ Testing hybrid_search with full-text search only")
+        print("\n✅ Testing hybrid_search with full-text search only")

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 638ddbf and 61074bf.

📒 Files selected for processing (29)

.github/workflows/ci.yml
tests/integration_tests/__init__.py
tests/integration_tests/conftest.py
tests/integration_tests/test_admin_database_management.py
tests/integration_tests/test_client_creation.py
tests/integration_tests/test_collection_dml.py
tests/integration_tests/test_collection_embedding_function.py
tests/integration_tests/test_collection_get.py
tests/integration_tests/test_collection_hybrid_search.py
tests/integration_tests/test_collection_hybrid_search_builder_integration.py
tests/integration_tests/test_collection_query.py
tests/integration_tests/test_default_embedding_function.py
tests/integration_tests/test_detect_db_type_and_version.py
tests/integration_tests/test_empty_value_handling.py
tests/integration_tests/test_fulltext_parser_config.py
tests/integration_tests/test_offical_case.py
tests/integration_tests/test_security_sql_injection.py
tests/test_admin_database_management.py
tests/test_client_creation.py
tests/test_collection_dml.py
tests/test_collection_hybrid_search.py
tests/test_collection_hybrid_search_builder_integration.py
tests/test_collection_query.py
tests/test_default_embedding_function.py
tests/test_detect_db_type_and_version.py
tests/test_offical_case.py
tests/unit_tests/__init__.py
tests/unit_tests/test_configuration.py
tests/unit_tests/test_version.py

💤 Files with no reviewable changes (9)

tests/test_default_embedding_function.py
tests/test_collection_hybrid_search_builder_integration.py
tests/test_client_creation.py
tests/test_detect_db_type_and_version.py
tests/test_admin_database_management.py
tests/test_collection_hybrid_search.py
tests/test_collection_query.py
tests/test_collection_dml.py
tests/test_offical_case.py

🧰 Additional context used

🧬 Code graph analysis (12)

tests/integration_tests/test_admin_database_management.py (2)

tests/integration_tests/conftest.py (1)

admin_client (222-256)

src/pyseekdb/client/client_seekdb_embedded.py (1)

SeekdbEmbeddedClient (26-357)

tests/integration_tests/test_security_sql_injection.py (2)

tests/integration_tests/conftest.py (1)

db_client (147-181)

src/pyseekdb/client/collection.py (2)

client (78-80)

name (63-65)

tests/integration_tests/test_default_embedding_function.py (3)

src/pyseekdb/client/embedding_function.py (1)

DefaultEmbeddingFunction (84-469)

src/pyseekdb/client/hybrid_search.py (1)

knn (305-363)

demo/rag/seekdb_insert.py (1)

main (135-156)

tests/integration_tests/test_collection_dml.py (3)

src/pyseekdb/client/collection.py (8)

client (78-80)

name (63-65)

embedding_function (88-90)

add (103-149)

get (364-428)

update (151-191)

upsert (194-234)

delete (236-272)

src/pyseekdb/client/meta_info.py (3)

CollectionNames (12-15)

CollectionFieldNames (4-10)

table_name (14-15)

tests/integration_tests/conftest.py (1)

db_client (147-181)

tests/unit_tests/test_version.py (1)

src/pyseekdb/client/version.py (1)

Version (7-121)

tests/integration_tests/test_detect_db_type_and_version.py (3)

src/pyseekdb/client/version.py (1)

Version (7-121)

src/pyseekdb/client/client_seekdb_server.py (1)

RemoteServerClient (19-267)

src/pyseekdb/client/client_base.py (1)

detect_db_type_and_version (219-311)

tests/integration_tests/test_empty_value_handling.py (2)

tests/integration_tests/conftest.py (1)

db_client (147-181)

src/pyseekdb/client/collection.py (2)

name (63-65)

embedding_function (88-90)

tests/integration_tests/conftest.py (2)

src/pyseekdb/client/collection.py (2)

get (364-428)

client (78-80)

src/pyseekdb/client/__init__.py (2)

Client (86-203)

AdminClient (206-314)

tests/integration_tests/test_fulltext_parser_config.py (1)

tests/integration_tests/conftest.py (1)

db_client (147-181)

tests/integration_tests/test_collection_hybrid_search_builder_integration.py (2)

src/pyseekdb/client/hybrid_search.py (6)

HybridSearch (220-481)

contains (64-69)

limit (409-411)

select (413-428)

knn (305-363)

rank (365-407)

src/pyseekdb/client/collection.py (5)

hybrid_search (430-523)

client (78-80)

distance (93-95)

get (364-428)

metadata (83-85)

tests/integration_tests/test_collection_embedding_function.py (3)

tests/integration_tests/conftest.py (1)

db_client (147-181)

src/pyseekdb/client/configuration.py (1)

HNSWConfiguration (37-53)

tests/integration_tests/test_collection_get.py (1)

Simple3DEmbeddingFunction (22-44)

tests/integration_tests/test_offical_case.py (2)

src/pyseekdb/client/collection.py (4)

upsert (194-234)

metadata (83-85)

get (364-428)

name (63-65)

tests/integration_tests/conftest.py (1)

db_client (147-181)

🪛 actionlint (1.7.9)

.github/workflows/ci.yml

25-25: the runner of "actions/setup-python@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 Ruff (0.14.8)

tests/integration_tests/test_admin_database_management.py

43-43: f-string without any placeholders

Remove extraneous f prefix

(F541)

80-80: f-string without any placeholders

Remove extraneous f prefix

(F541)

89-89: f-string without any placeholders

Remove extraneous f prefix

(F541)

91-91: Do not catch blind exception: Exception

(BLE001)

95-95: Do not use bare except

(E722)

95-96: try-except-pass detected, consider logging the exception

(S110)

tests/integration_tests/test_security_sql_injection.py

98-98: f-string without any placeholders

Remove extraneous f prefix

(F541)

120-120: f-string without any placeholders

Remove extraneous f prefix

(F541)

136-136: Do not catch blind exception: Exception

(BLE001)

141-141: f-string without any placeholders

Remove extraneous f prefix

(F541)

160-160: f-string without any placeholders

Remove extraneous f prefix

(F541)

196-196: f-string without any placeholders

Remove extraneous f prefix

(F541)

235-235: f-string without any placeholders

Remove extraneous f prefix

(F541)

276-276: f-string without any placeholders

Remove extraneous f prefix

(F541)

313-313: f-string without any placeholders

Remove extraneous f prefix

(F541)

364-364: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_default_embedding_function.py

38-38: f-string without any placeholders

Remove extraneous f prefix

(F541)

74-74: f-string without any placeholders

Remove extraneous f prefix

(F541)

77-77: f-string without any placeholders

Remove extraneous f prefix

(F541)

87-87: f-string without any placeholders

Remove extraneous f prefix

(F541)

110-110: Do not catch blind exception: Exception

(BLE001)

115-115: f-string without any placeholders

Remove extraneous f prefix

(F541)

130-130: Do not catch blind exception: Exception

(BLE001)

135-135: f-string without any placeholders

Remove extraneous f prefix

(F541)

154-154: Do not catch blind exception: Exception

(BLE001)

165-165: f-string without any placeholders

Remove extraneous f prefix

(F541)

173-173: f-string without any placeholders

Remove extraneous f prefix

(F541)

190-190: f-string without any placeholders

Remove extraneous f prefix

(F541)

207-207: f-string without any placeholders

Remove extraneous f prefix

(F541)

221-221: f-string without any placeholders

Remove extraneous f prefix

(F541)

223-223: Unused method argument: input

(ARG002)

229-229: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_client_creation.py

61-62: try-except-pass detected, consider logging the exception

(S110)

61-61: Do not catch blind exception: Exception

(BLE001)

112-112: Do not catch blind exception: Exception

(BLE001)

116-117: try-except-pass detected, consider logging the exception

(S110)

116-116: Do not catch blind exception: Exception

(BLE001)

132-132: f-string without any placeholders

Remove extraneous f prefix

(F541)

136-136: f-string without any placeholders

Remove extraneous f prefix

(F541)

147-147: f-string without any placeholders

Remove extraneous f prefix

(F541)

181-181: f-string without any placeholders

Remove extraneous f prefix

(F541)

190-190: f-string without any placeholders

Remove extraneous f prefix

(F541)

215-215: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

251-251: Do not catch blind exception: Exception

(BLE001)

tests/integration_tests/test_collection_dml.py

42-42: f-string without any placeholders

Remove extraneous f prefix

(F541)

60-60: f-string without any placeholders

Remove extraneous f prefix

(F541)

79-79: f-string without any placeholders

Remove extraneous f prefix

(F541)

95-95: f-string without any placeholders

Remove extraneous f prefix

(F541)

111-111: f-string without any placeholders

Remove extraneous f prefix

(F541)

127-127: f-string without any placeholders

Remove extraneous f prefix

(F541)

144-144: f-string without any placeholders

Remove extraneous f prefix

(F541)

156-156: f-string without any placeholders

Remove extraneous f prefix

(F541)

159-159: f-string without any placeholders

Remove extraneous f prefix

(F541)

166-166: f-string without any placeholders

Remove extraneous f prefix

(F541)

169-169: f-string without any placeholders

Remove extraneous f prefix

(F541)

185-185: f-string without any placeholders

Remove extraneous f prefix

(F541)

188-188: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_collection_query.py

88-89: Possible SQL injection vector through string-based query construction

(S608)

121-121: f-string without any placeholders

Remove extraneous f prefix

(F541)

136-136: f-string without any placeholders

Remove extraneous f prefix

(F541)

147-147: f-string without any placeholders

Remove extraneous f prefix

(F541)

158-158: f-string without any placeholders

Remove extraneous f prefix

(F541)

169-169: f-string without any placeholders

Remove extraneous f prefix

(F541)

185-185: f-string without any placeholders

Remove extraneous f prefix

(F541)

201-201: f-string without any placeholders

Remove extraneous f prefix

(F541)

214-214: f-string without any placeholders

Remove extraneous f prefix

(F541)

225-225: f-string without any placeholders

Remove extraneous f prefix

(F541)

236-236: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/unit_tests/test_version.py

39-39: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_detect_db_type_and_version.py

30-30: Possible binding to all interfaces

(S104)

32-32: f-string without any placeholders

Remove extraneous f prefix

(F541)

51-51: Possible binding to all interfaces

(S104)

53-53: f-string without any placeholders

Remove extraneous f prefix

(F541)

72-72: f-string without any placeholders

Remove extraneous f prefix

(F541)

91-91: Possible binding to all interfaces

(S104)

93-93: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_empty_value_handling.py

50-50: f-string without any placeholders

Remove extraneous f prefix

(F541)

55-55: Do not catch blind exception: Exception

(BLE001)

60-60: f-string without any placeholders

Remove extraneous f prefix

(F541)

102-102: f-string without any placeholders

Remove extraneous f prefix

(F541)

137-137: f-string without any placeholders

Remove extraneous f prefix

(F541)

179-179: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_collection_hybrid_search.py

17-17: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

31-31: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

114-115: Possible SQL injection vector through string-based query construction

(S608)

134-134: f-string without any placeholders

Remove extraneous f prefix

(F541)

159-159: f-string without any placeholders

Remove extraneous f prefix

(F541)

191-191: f-string without any placeholders

Remove extraneous f prefix

(F541)

229-229: f-string without any placeholders

Remove extraneous f prefix

(F541)

272-272: f-string without any placeholders

Remove extraneous f prefix

(F541)

324-324: f-string without any placeholders

Remove extraneous f prefix

(F541)

376-376: f-string without any placeholders

Remove extraneous f prefix

(F541)

394-394: f-string without any placeholders

Remove extraneous f prefix

(F541)

412-412: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/conftest.py

42-42: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)

67-67: Do not catch blind exception: Exception

(BLE001)

88-88: Do not catch blind exception: Exception

(BLE001)

98-98: Unused noqa directive (non-enabled: F401)

Remove unused noqa directive

(RUF100)

119-119: Do not catch blind exception: Exception

(BLE001)

139-139: Do not catch blind exception: Exception

(BLE001)

173-173: Avoid specifying long messages outside the exception class

(TRY003)

180-180: Do not use bare except

(E722)

180-181: try-except-pass detected, consider logging the exception

(S110)

192-192: Do not use bare except

(E722)

192-193: try-except-pass detected, consider logging the exception

(S110)

204-204: Do not use bare except

(E722)

204-205: try-except-pass detected, consider logging the exception

(S110)

216-216: Do not use bare except

(E722)

216-217: try-except-pass detected, consider logging the exception

(S110)

248-248: Avoid specifying long messages outside the exception class

(TRY003)

255-255: Do not use bare except

(E722)

255-256: try-except-pass detected, consider logging the exception

(S110)

267-267: Do not use bare except

(E722)

267-268: try-except-pass detected, consider logging the exception

(S110)

279-279: Do not use bare except

(E722)

279-280: try-except-pass detected, consider logging the exception

(S110)

291-291: Do not use bare except

(E722)

291-292: try-except-pass detected, consider logging the exception

(S110)

tests/integration_tests/test_collection_hybrid_search_builder_integration.py

34-34: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

48-48: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

127-128: Possible SQL injection vector through string-based query construction

(S608)

146-146: f-string without any placeholders

Remove extraneous f prefix

(F541)

164-164: f-string without any placeholders

Remove extraneous f prefix

(F541)

192-192: f-string without any placeholders

Remove extraneous f prefix

(F541)

226-226: f-string without any placeholders

Remove extraneous f prefix

(F541)

255-255: f-string without any placeholders

Remove extraneous f prefix

(F541)

296-296: f-string without any placeholders

Remove extraneous f prefix

(F541)

335-335: f-string without any placeholders

Remove extraneous f prefix

(F541)

349-349: f-string without any placeholders

Remove extraneous f prefix

(F541)

363-363: f-string without any placeholders

Remove extraneous f prefix

(F541)

374-374: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_collection_embedding_function.py

50-50: f-string without any placeholders

Remove extraneous f prefix

(F541)

67-68: try-except-pass detected, consider logging the exception

(S110)

67-67: Do not catch blind exception: Exception

(BLE001)

77-77: f-string without any placeholders

Remove extraneous f prefix

(F541)

97-98: try-except-pass detected, consider logging the exception

(S110)

97-97: Do not catch blind exception: Exception

(BLE001)

107-107: f-string without any placeholders

Remove extraneous f prefix

(F541)

130-131: try-except-pass detected, consider logging the exception

(S110)

130-130: Do not catch blind exception: Exception

(BLE001)

140-140: f-string without any placeholders

Remove extraneous f prefix

(F541)

163-163: f-string without any placeholders

Remove extraneous f prefix

(F541)

184-185: try-except-pass detected, consider logging the exception

(S110)

184-184: Do not catch blind exception: Exception

(BLE001)

194-194: f-string without any placeholders

Remove extraneous f prefix

(F541)

214-214: f-string without any placeholders

Remove extraneous f prefix

(F541)

218-218: Local variable created_collection is assigned to but never used

Remove assignment to unused variable created_collection

(F841)

239-240: try-except-pass detected, consider logging the exception

(S110)

239-239: Do not catch blind exception: Exception

(BLE001)

249-249: f-string without any placeholders

Remove extraneous f prefix

(F541)

253-253: Local variable created_collection is assigned to but never used

Remove assignment to unused variable created_collection

(F841)

275-276: try-except-pass detected, consider logging the exception

(S110)

275-275: Do not catch blind exception: Exception

(BLE001)

285-285: f-string without any placeholders

Remove extraneous f prefix

(F541)

301-302: try-except-pass detected, consider logging the exception

(S110)

301-301: Do not catch blind exception: Exception

(BLE001)

311-311: f-string without any placeholders

Remove extraneous f prefix

(F541)

315-315: Local variable created_collection is assigned to but never used

Remove assignment to unused variable created_collection

(F841)

336-337: try-except-pass detected, consider logging the exception

(S110)

336-336: Do not catch blind exception: Exception

(BLE001)

346-346: f-string without any placeholders

Remove extraneous f prefix

(F541)

369-370: try-except-pass detected, consider logging the exception

(S110)

369-369: Do not catch blind exception: Exception

(BLE001)

379-379: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: integration-test (oceanbase)
GitHub Check: integration-test (embedded)
GitHub Check: integration-test (server)

🔇 Additional comments (38)

.github/workflows/ci.yml (2)

18-39: LGTM: Unit test job is well-structured.

The unit-test job follows a clean pattern with proper checkout, Python setup, Poetry installation, and focused test execution against tests/unit_tests/.

41-103: LGTM: Integration test matrix approach is clean.

The matrix-based approach with test_mode: [embedded, server, oceanbase] effectively reduces duplication while maintaining clear separation. The -k "${{ matrix.test_mode }}" filter pattern is a pragmatic way to run mode-specific tests.

tests/unit_tests/test_version.py (1)

1-46: LGTM: Good unit test coverage for Version class.

The test validates core Version functionality including parsing, normalization (3-part to 4-part), comparison operators, equality, and string representation. This aligns well with the Version class implementation in src/pyseekdb/client/version.py.

tests/integration_tests/test_collection_dml.py (1)

1-196: LGTM: Comprehensive DML test coverage using fixture pattern.

The test effectively exercises all collection DML operations (add, update, upsert, delete) with proper assertions. The use of db_client fixture aligns with the PR's goal of reducing code duplication across client modes.

tests/integration_tests/test_detect_db_type_and_version.py (2)

15-55: LGTM: Good use of mode-specific fixtures.

Using server_client and oceanbase_client fixtures separately is appropriate here since the tests need to verify mode-specific behavior (db_type == "seekdb" vs "oceanbase").

57-97: LGTM: Tests cover return format and connection behavior.

The test_connection_establishment and test_return_format tests provide good coverage of edge cases and API contract verification.

tests/integration_tests/test_collection_query.py (2)

16-92: Helper method is well-structured with proper dimension handling.

The _insert_test_data helper correctly handles dimension adaptation (truncation/extension) to match the actual collection dimension. This makes the test robust across different configurations.

94-236: LGTM: Comprehensive query test coverage.

The test method thoroughly exercises collection.query() with various scenarios including:

Basic similarity search

Metadata filters ($eq, $in, $gte)

Document filters ($contains, $regex)

Include parameters

Single and multi-vector queries

This provides excellent coverage of the query interface.

tests/integration_tests/test_offical_case.py (1)

34-69: LGTM: Well-structured helper function

The _run_official_example helper cleanly encapsulates the official workflow test logic with appropriate assertions for result validation and metadata verification.

tests/integration_tests/test_security_sql_injection.py (2)

22-84: LGTM: Comprehensive security test vectors

The test cases cover a good range of attack vectors including quotes, backslashes, SQL injection attempts, special characters, and Unicode. The verify_data_integrity helper properly validates round-trip data storage.

134-137: Cleanup exception handling is acceptable

The blind Exception catch in cleanup code (line 136) is appropriate here since cleanup failures shouldn't mask test failures. The warning log provides visibility.

tests/integration_tests/test_admin_database_management.py (1)

91-97: Cleanup exception handling is appropriate

The bare except at line 95 is acceptable for cleanup code where we want to ensure the test fails with the original error, not a cleanup error.

tests/integration_tests/test_empty_value_handling.py (3)

15-56: LGTM: Well-structured empty value handling tests

The test properly uses the db_client fixture and includes comprehensive coverage for edge cases including empty strings, whitespace, "0", "false", and None values. The cleanup in the finally block ensures test isolation.

58-99: LGTM: Thorough update path testing

The _test_upsert_update_empty_document helper properly validates that the Line 1072 fix works correctly by testing the upsert update path with various falsy document values and verifying they're not converted to "NULL" strings.

208-244: LGTM: Good batch operation coverage

The mixed empty/non-empty values batch test validates that batch operations correctly handle heterogeneous data, which is an important integration scenario.

tests/integration_tests/test_default_embedding_function.py (4)

1-11: LGTM! Well-structured test module with proper imports.

The module docstring clearly describes the test purpose, and the imports are appropriate for the testing scenarios.

14-36: Test setup and collection creation looks correct.

The test properly verifies that a collection created without an explicit embedding function uses DefaultEmbeddingFunction by default, and validates key properties (name, dimension, embedding_function type).

88-112: Hybrid search exception handling is intentionally lenient for cross-mode compatibility.

The broad except Exception catches are acceptable here since hybrid search may not be fully supported in all modes, and the test continues with other scenarios. However, consider logging the exception type for debugging.

220-229: Edge case test for empty result is well-designed.

The EmptyResultEF class intentionally ignores the input parameter to test the error handling path. The unused input argument flagged by static analysis is acceptable in this context.

tests/integration_tests/test_fulltext_parser_config.py (2)

1-16: Good refactoring to fixture-based approach.

The class has been properly updated to use the db_client fixture pattern, aligning with the PR's goal of reducing code duplication across test modes.

50-67: Accessing internal API _server._execute is acceptable for integration tests.

While using internal APIs is generally discouraged, it's appropriate here for verifying the underlying SQL schema. This is a valid integration testing pattern.

tests/integration_tests/test_collection_hybrid_search.py (3)

1-15: Good test class structure following the fixture-driven pattern.

The docstring accurately describes the purpose and the code reduction achieved through parameterized fixtures.

121-176: Full-text search tests are comprehensive.

Good coverage of $contains and $not_contains operators with proper assertions on the result content.

363-426: Good coverage of scalar filters including $in, $nin, and #id support.

The test properly validates that inserted IDs can be queried back using the #id filter, which is a useful integration check.

tests/integration_tests/conftest.py (5)

1-16: Well-organized conftest with clear documentation.

The module structure with distinct sections (Environment Variables, Factory Functions, Fixtures) is clear and maintainable.

39-49: Good pattern: Skip test when embedded package is unavailable.

Using pytest.skip() when pylibseekdb is not installed allows the test suite to gracefully degrade rather than fail entirely.

63-69: Connection validation is a good defensive pattern.

Testing the connection with SELECT 1 before returning the client ensures tests fail fast with a clear message if the server is unreachable.

145-181: The db_client fixture is well-designed for multi-mode testing.

The parameterized fixture with clear documentation and automatic cleanup is the core of this refactoring. It effectively reduces test duplication by automatically running each test across all three modes.

220-256: Admin client fixture follows the same good pattern as db_client.

Consistent design between db_client and admin_client fixtures makes the codebase predictable.

tests/integration_tests/test_collection_embedding_function.py (5)

1-11: Good refactoring to fixture-based approach with clear docstring.

The module properly describes the test coverage and follows the established pattern.

40-68: Test for default embedding function is correct.

The test verifies that omitting embedding_function results in DefaultEmbeddingFunction being used with dimension 384.

133-154: Good test for dimension mismatch error handling.

The test correctly verifies that providing a configuration with a dimension that doesn't match the embedding function's dimension raises a ValueError.

187-205: Good negative test for both None scenario.

This test ensures the API correctly rejects ambiguous configurations where dimension cannot be determined.

372-390: Comprehensive test coverage for get_or_create_collection scenarios.

The test suite covers creating new collections, getting existing ones, custom embedding functions, and error cases. This provides good confidence in the API's behavior.

tests/integration_tests/test_collection_hybrid_search_builder_integration.py (4)

1-24: LGTM!

The imports are appropriate for the test suite, and the module docstring clearly describes the purpose of the file.

26-33: LGTM!

The _hs helper provides a clean abstraction for the builder pattern, reducing boilerplate in test methods.

134-374: Well-structured test coverage for HybridSearch builder API.

The test methods comprehensively cover full-text search, vector search, combined queries, metadata filters, logical operators, and scalar operations. The assertions appropriately verify both result structure and content correctness. This aligns well with the PR objective of consolidating test logic with the db_client fixture pattern.

377-379: LGTM!

Standard pytest main block for running tests directly.

…-test

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

.github/workflows/ci.yml (1)
61-64: Upgrade deprecated Python setup action.

Same deprecation issue as in the unit-test job. The actions/setup-python@v3 action should be upgraded.
🔎 Proposed fix
       - name: Set up Python
-        uses: actions/setup-python@v3
+        uses: actions/setup-python@v5
         with:
           python-version: '3.11'
Based on static analysis hints.

🧹 Nitpick comments (8)

tests/integration_tests/test_admin_database_management.py (1)
43-43: Remove unnecessary f-string prefixes.

Lines 43, 74, and 83 use f-strings without any placeholders. Removing the f prefix improves consistency and avoids unnecessary string processing.
🔎 Proposed fix
-            print(f"\n📋 Step 1: List all databases")
+            print("\n📋 Step 1: List all databases")
-            print(f"\n📋 Step 5: List all databases to verify deletion")
+            print("\n📋 Step 5: List all databases to verify deletion")
-            print(f"\n🎉 All database management operations completed successfully!")
+            print("\n🎉 All database management operations completed successfully!")
Also applies to: 74-74, 83-83
tests/integration_tests/test_collection_hybrid_search.py (2)
17-17: Consider explicit Optional type hints per PEP 484.

The dimension and base_vector parameters use implicit Optional (defaulting to None). While functionally correct, PEP 484 recommends explicit Optional or union with None.
🔎 Proposed fix
+from typing import List, Optional
+
 class TestCollectionHybridSearchRefactored:
     """Test collection.hybrid_search() interface using parameterized db_client fixture"""
     
-    def _create_test_collection(self, client, collection_name: str, dimension: int = None):
+    def _create_test_collection(self, client, collection_name: str, dimension: int | None = None):
         """Helper method to create a test collection"""
-    def _generate_query_vector(self, dimension: int, base_vector: List[float] = None) -> List[float]:
+    def _generate_query_vector(self, dimension: int, base_vector: List[float] | None = None) -> List[float]:
         """Generate a query vector with the correct dimension"""
Also applies to: 31-31

137-137: Optional: Remove unnecessary f-string prefixes in print statements.

Several print statements use f-string syntax without any placeholders. While harmless, removing the f prefix would be slightly more accurate.
Example fix
-        print(f"\n✅ Testing hybrid_search with full-text search only")
+        print("\n✅ Testing hybrid_search with full-text search only")
Apply similar changes to lines 162, 194, 232, 275, 327, 379, 397, 415.
Also applies to: 162-162, 194-194, 232-232, 275-275, 327-327, 379-379, 397-397, 415-415
tests/integration_tests/test_collection_hybrid_search_builder_integration.py (2)
34-34: Consider explicit Optional type hints per PEP 484.

The dimension and base_vector parameters use implicit Optional (defaulting to None). While functionally correct, PEP 484 recommends explicit Optional or union with None.
🔎 Proposed fix
+from typing import List, Optional
+
 class TestCollectionHybridSearchWithBuilderRefactored:
     """Test collection.hybrid_search() using HybridSearch builder with parameterized db_client fixture"""
 
-    def _create_test_collection(self, client, collection_name: str, dimension: int = None):
+    def _create_test_collection(self, client, collection_name: str, dimension: int | None = None):
         """Helper method to create a test collection"""
-    def _generate_query_vector(self, dimension: int, base_vector: List[float] = None) -> List[float]:
+    def _generate_query_vector(self, dimension: int, base_vector: List[float] | None = None) -> List[float]:
+        if base_vector is None:
+            base_vector = [1.0, 2.0, 3.0]
Also applies to: 48-48

149-149: Optional: Remove unnecessary f-string prefixes in print statements.

Several print statements use f-string syntax without any placeholders. While harmless, removing the f prefix would be slightly more accurate.
Example fix
-        print(f"\n✅ Testing hybrid_search with full-text search only")
+        print("\n✅ Testing hybrid_search with full-text search only")
Apply similar changes to lines 167, 195, 229, 258, 299, 338, 352, 366, 377.
Also applies to: 167-167, 195-195, 229-229, 258-258, 299-299, 338-338, 352-352, 366-366, 377-377
tests/integration_tests/test_client_creation.py (2)
59-62: Consider logging cleanup exceptions during test execution.

The silent exception handling here occurs mid-test rather than in final cleanup. If delete_collection fails for this test artifact, it could indicate an issue worth investigating.
🔎 Suggested improvement
         # Clean up
         try:
             db_client.delete_collection(test_collection_name_config)
-        except Exception:
-            pass
+        except Exception as e:
+            print(f"   Warning: Failed to cleanup {test_collection_name_config}: {e}")
132-132: Remove unnecessary f prefix from strings without placeholders.

Several print statements use f-strings but don't contain any placeholders.
🔎 Suggested fix
-        print(f"\n✅ has_collection correctly returns False for non-existent collection")
+        print("\n✅ has_collection correctly returns False for non-existent collection")
         
         # Test 4: has_collection - should return True for existing collection
         assert db_client.has_collection(test_collection_name)
-        print(f"\n✅ has_collection correctly returns True for existing collection")
+        print("\n✅ has_collection correctly returns True for existing collection")
         
         # ... later in the file ...
         
-        print(f"\n✅ get_or_create_collection successfully retrieved existing collection")
+        print("\n✅ get_or_create_collection successfully retrieved existing collection")
         
         # ... later in the file ...
         
-            print(f"\n✅ delete_collection correctly raises ValueError for non-existent collection")
+            print("\n✅ delete_collection correctly raises ValueError for non-existent collection")
         
         # ... later in the file ...
         
-        print(f"\n✅ get_or_create_collection successfully created collection with default configuration")
+        print("\n✅ get_or_create_collection successfully created collection with default configuration")
Also applies to: 136-136, 147-147, 181-181, 190-190
tests/integration_tests/test_collection_get.py (1)
200-204: Consider logging cleanup exceptions.

While except Exception: pass ensures robust cleanup, consider logging exceptions for debugging (similar to line 210). This helps diagnose cleanup failures during test troubleshooting.
🔎 Suggested improvement
             finally:
                 try:
                     db_client.delete_collection(name=collection_name_2)
-                except Exception:
-                    pass
+                except Exception as e:
+                    print(f"Warning: cleanup failed for {collection_name_2}: {e}")

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 61074bf and 7f1ad29.

📒 Files selected for processing (6)

.github/workflows/ci.yml
tests/integration_tests/test_admin_database_management.py
tests/integration_tests/test_client_creation.py
tests/integration_tests/test_collection_get.py
tests/integration_tests/test_collection_hybrid_search.py
tests/integration_tests/test_collection_hybrid_search_builder_integration.py

🧰 Additional context used

🧠 Learnings (4)

📚 Learning: 2025-12-22T12:44:52.456Z

Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_collection_dml.py:36-36
Timestamp: 2025-12-22T12:44:52.456Z
Learning: In integration tests under tests/integration_tests, it is acceptable to use the internal db_client._server._execute() to run raw SQL (e.g., CREATE TABLE) when needing custom table setup with specific vector index configurations. Prefer using public/test harness APIs for common setup, and clearly document and limit the use of internal calls to ensure test reliability. Ensure proper cleanup and isolation for each test, and avoid leaking internal APIs into production or non-test code.

Applied to files:

tests/integration_tests/test_collection_get.py
tests/integration_tests/test_admin_database_management.py
tests/integration_tests/test_client_creation.py
tests/integration_tests/test_collection_hybrid_search.py
tests/integration_tests/test_collection_hybrid_search_builder_integration.py

📚 Learning: 2025-12-22T12:45:51.412Z

Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_collection_hybrid_search_builder_integration.py:127-132
Timestamp: 2025-12-22T12:45:51.412Z
Learning: In integration test Python files under tests/integration_tests, it is acceptable to use string interpolation to construct SQL INSERT statements for test data when the data is controlled and internal to the test. This pattern should only be used for trusted, test-only data and not with untrusted input; ensure the injection risk is mitigated by keeping inputs deterministic, non-user-supplied, and isolated to the test environment. If possible, prefer parameterized queries in non-test code, but in these tests, interpolate values selectively when you can guarantee safety.

Applied to files:

tests/integration_tests/test_collection_get.py
tests/integration_tests/test_admin_database_management.py
tests/integration_tests/test_client_creation.py
tests/integration_tests/test_collection_hybrid_search.py
tests/integration_tests/test_collection_hybrid_search_builder_integration.py

📚 Learning: 2025-12-22T12:33:44.844Z

Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_admin_database_management.py:26-39
Timestamp: 2025-12-22T12:33:44.844Z
Learning: In tests/integration_tests/test_admin_database_management.py, it is acceptable to detect the client type with isinstance(admin_client._server, pyseekdb.SeekdbEmbeddedClient). Do not expose a public AdminClient 'mode' property; keep it private (e.g., _mode) or implement it without a public accessor. If needed, document the private attribute and ensure tests rely only on internal type checks rather than public configuration.

Applied to files:

tests/integration_tests/test_admin_database_management.py

📚 Learning: 2025-12-22T12:33:44.844Z

Learnt from: longbingljw
Repo: oceanbase/pyseekdb PR: 91
File: tests/integration_tests/test_admin_database_management.py:26-39
Timestamp: 2025-12-22T12:33:44.844Z
Learning: In tests/integration_tests/test_admin_database_management.py, using `isinstance(admin_client._server, pyseekdb.SeekdbEmbeddedClient)` to detect client type is acceptable. AdminClient should not expose a public `mode` property.

Applied to files:

tests/integration_tests/test_client_creation.py
tests/integration_tests/test_collection_hybrid_search_builder_integration.py

🧬 Code graph analysis (2)

tests/integration_tests/test_collection_get.py (3)

tests/integration_tests/test_collection_embedding_function.py (1)

Simple3DEmbeddingFunction (15-37)

tests/integration_tests/conftest.py (1)

db_client (147-181)

src/pyseekdb/client/configuration.py (1)

HNSWConfiguration (37-53)

tests/integration_tests/test_client_creation.py (3)

src/pyseekdb/client/configuration.py (3)

HNSWConfiguration (37-53)

Configuration (56-69)

FulltextParserConfig (25-34)

src/pyseekdb/client/collection.py (7)

distance (93-95)

name (63-65)

embedding_function (88-90)

get (364-428)

count (527-541)

peek (543-570)

add (103-149)

src/pyseekdb/client/meta_info.py (1)

table_name (14-15)

🪛 actionlint (1.7.9)

.github/workflows/ci.yml

24-24: the runner of "actions/setup-python@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue

(action)

🪛 Ruff (0.14.8)

tests/integration_tests/test_collection_get.py

101-102: Possible SQL injection vector through string-based query construction

(S608)

203-204: try-except-pass detected, consider logging the exception

(S110)

203-203: Do not catch blind exception: Exception

(BLE001)

209-209: Do not catch blind exception: Exception

(BLE001)

374-374: Do not catch blind exception: Exception

(BLE001)

tests/integration_tests/test_admin_database_management.py

43-43: f-string without any placeholders

Remove extraneous f prefix

(F541)

74-74: f-string without any placeholders

Remove extraneous f prefix

(F541)

83-83: f-string without any placeholders

Remove extraneous f prefix

(F541)

85-85: Do not catch blind exception: Exception

(BLE001)

89-89: Do not use bare except

(E722)

89-90: try-except-pass detected, consider logging the exception

(S110)

tests/integration_tests/test_client_creation.py

61-62: try-except-pass detected, consider logging the exception

(S110)

61-61: Do not catch blind exception: Exception

(BLE001)

112-112: Do not catch blind exception: Exception

(BLE001)

116-117: try-except-pass detected, consider logging the exception

(S110)

116-116: Do not catch blind exception: Exception

(BLE001)

132-132: f-string without any placeholders

Remove extraneous f prefix

(F541)

136-136: f-string without any placeholders

Remove extraneous f prefix

(F541)

147-147: f-string without any placeholders

Remove extraneous f prefix

(F541)

181-181: f-string without any placeholders

Remove extraneous f prefix

(F541)

190-190: f-string without any placeholders

Remove extraneous f prefix

(F541)

215-215: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)

251-251: Do not catch blind exception: Exception

(BLE001)

257-257: Do not catch blind exception: Exception

(BLE001)

tests/integration_tests/test_collection_hybrid_search.py

17-17: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

31-31: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

117-118: Possible SQL injection vector through string-based query construction

(S608)

137-137: f-string without any placeholders

Remove extraneous f prefix

(F541)

162-162: f-string without any placeholders

Remove extraneous f prefix

(F541)

194-194: f-string without any placeholders

Remove extraneous f prefix

(F541)

232-232: f-string without any placeholders

Remove extraneous f prefix

(F541)

275-275: f-string without any placeholders

Remove extraneous f prefix

(F541)

327-327: f-string without any placeholders

Remove extraneous f prefix

(F541)

379-379: f-string without any placeholders

Remove extraneous f prefix

(F541)

397-397: f-string without any placeholders

Remove extraneous f prefix

(F541)

415-415: f-string without any placeholders

Remove extraneous f prefix

(F541)

tests/integration_tests/test_collection_hybrid_search_builder_integration.py

34-34: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

48-48: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)

130-131: Possible SQL injection vector through string-based query construction

(S608)

149-149: f-string without any placeholders

Remove extraneous f prefix

(F541)

167-167: f-string without any placeholders

Remove extraneous f prefix

(F541)

195-195: f-string without any placeholders

Remove extraneous f prefix

(F541)

229-229: f-string without any placeholders

Remove extraneous f prefix

(F541)

258-258: f-string without any placeholders

Remove extraneous f prefix

(F541)

299-299: f-string without any placeholders

Remove extraneous f prefix

(F541)

338-338: f-string without any placeholders

Remove extraneous f prefix

(F541)

352-352: f-string without any placeholders

Remove extraneous f prefix

(F541)

366-366: f-string without any placeholders

Remove extraneous f prefix

(F541)

377-377: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: integration-test (oceanbase)

🔇 Additional comments (10)

tests/integration_tests/test_admin_database_management.py (1)

10-91: Well-structured integration test with comprehensive CRUD coverage.

The test provides excellent coverage of AdminClient database management operations across all three client modes (embedded, server, OceanBase). The step-by-step flow with assertions, debug output, and proper cleanup on failure makes this test maintainable and easy to debug. The tenant detection logic using isinstance(admin_client._server, ...) is appropriate for this context.

Based on learnings, the isinstance check on admin_client._server is acceptable for detecting client type in tests.

tests/integration_tests/test_client_creation.py (1)

17-258: Comprehensive integration test with excellent fixture usage.

This test effectively validates all collection management operations across multiple client modes using the parameterized db_client fixture. The test demonstrates good practices:

Clear test structure with numbered test cases and descriptive assertions

Reproducible test data generation using seeded randomness (line 212)

Comprehensive verification of collection properties and table structure

Proper cleanup of test artifacts

The use of db_client._server._execute() for table verification is acceptable in integration tests. Based on learnings, this pattern is appropriate for custom setup/verification in this test context.

.github/workflows/ci.yml (3)

42-45: LGTM!

The matrix strategy with fail-fast: false is appropriate for running tests across multiple backend modes. This ensures all test modes complete even if one fails, providing comprehensive feedback.

87-92: LGTM!

The renamed step "Start seekdb server container" improves clarity, and the conditional execution based on test_mode is correct.

94-102: Test filtering approach is valid and correctly implemented.

The -k "${{ matrix.test_mode }}" filtering works as intended because the integration tests use properly parameterized fixtures (@pytest.fixture(params=['embedded', 'server', 'oceanbase'])) that generate test IDs containing the mode strings. When pytest runs with -k "embedded", it matches all tests with [embedded] in their IDs, ensuring each mode runs the correct subset of tests. The fixture parameterization and test discovery are correctly configured.

tests/integration_tests/test_collection_get.py (5)

1-12: LGTM: Clear test scope and standard imports.

The module docstring clearly describes the refactored fixture-based approach, and all imports are appropriate for integration testing.

43-105: Acceptable test data setup pattern.

The use of client._server._execute() with string-interpolated SQL for controlled test data is acceptable in this integration test context. The data is deterministic and internal to the test (generated UUIDs, fixed test strings), with appropriate quote escaping applied.

Based on learnings, internal API usage and string interpolation for test-only controlled data are permitted in integration tests when proper safeguards are in place.

212-375: Excellent comprehensive test coverage.

This test effectively validates 14 different collection.get() scenarios including IDs, filters, pagination, logical operators, and include parameters. The parameterized fixture ensures coverage across all three client modes (embedded, server, oceanbase).

378-379: LGTM: Standard pytest entry point.

163-168: $nin operator correctly excludes records with missing fields.

Line 168 expects {"id_disjoint", "id_null", "id_empty"} and correctly excludes "id_missing". When the tags field is missing, JSON_EXTRACT returns NULL, and NOT JSON_OVERLAPS(NULL, ...) evaluates to NULL/FALSE per SQL three-valued logic, excluding the record from results. This behavior matches the implementation and is the intended semantics.

longbing added 17 commits December 22, 2025 16:16

refactor: conftest

bce73f9

conftest

b685f0a

official_case

7feffe6

admin_database_management

013a78a

client_creat_creation

bc091b0

collection_dml

f3bc3cd

collection_embedding_function

72ce13d

collection_hybrid_search_builder_integration

23ce9fe

collection_hybrid_search

8689a13

collection_query

8aff4d1

default_embedding_function

fbbb698

fulltext_parser_config

13f26f3

test

0572317

test again

1a05e92

replace

ec3eaae

fix

ead6499

fix db_version name

61074bf

longbingljw marked this pull request as draft December 22, 2025 12:01

coderabbitai Bot reviewed Dec 22, 2025

View reviewed changes

longbing added 4 commits December 22, 2025 20:25

Merge branch 'develop' of github.com:oceanbase/pyseekdb into refactor…

ac231d4

…-test

collection_get

cbef7cb

fix

4ed8250

changable default paras

7f1ad29

longbingljw marked this pull request as ready for review December 22, 2025 12:50

coderabbitai Bot reviewed Dec 22, 2025

View reviewed changes

Comment thread .github/workflows/ci.yml

Comment thread tests/integration_tests/test_collection_get.py

hnwyllmm approved these changes Dec 23, 2025

View reviewed changes

hnwyllmm merged commit 2d07a5e into oceanbase:develop Dec 23, 2025
6 checks passed

coderabbitai Bot mentioned this pull request Dec 29, 2025

feat: readme and comments about tests #97

Merged

This was referenced Jan 12, 2026

Support more embedding function platform #110

Merged

chore: add pre commit #112

Merged

support embedding function persistence #114

Merged

This was referenced Jan 20, 2026

support embedding function metadata persistence #119

Merged

change mock embeddings to generated by default embedding function #124

Merged

Support more embedding function platforms #132

Merged

coderabbitai Bot mentioned this pull request Jan 28, 2026

rollback enhanced hybrid search #151

Merged

coderabbitai Bot mentioned this pull request Feb 26, 2026

Refine index configuration validation and typed helpers #173

Merged

2 tasks

coderabbitai Bot mentioned this pull request May 13, 2026

Feat/sdk namespaces stats #218

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: test suite and fix test infrastructure#91

refactor: test suite and fix test infrastructure#91
hnwyllmm merged 21 commits into
oceanbase:developfrom
longbingljw:refactor-test

longbingljw commented Dec 22, 2025 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Dec 22, 2025 •

edited

Loading

Uh oh!

longbingljw commented Dec 22, 2025

Uh oh!

longbingljw commented Dec 22, 2025

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

longbingljw commented Dec 22, 2025 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Solution Description

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Pre-merge checks and finishing touches

Uh oh!

longbingljw commented Dec 22, 2025

Uh oh!

longbingljw commented Dec 22, 2025

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

longbingljw commented Dec 22, 2025 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Dec 22, 2025 •

edited

Loading