Skip to content

Test collection name more than 64 characters#141

Closed
NTLx wants to merge 4 commits into
oceanbase:developfrom
NTLx:feature/test-long-collection-name
Closed

Test collection name more than 64 characters#141
NTLx wants to merge 4 commits into
oceanbase:developfrom
NTLx:feature/test-long-collection-name

Conversation

@NTLx

@NTLx NTLx commented Jan 26, 2026

Copy link
Copy Markdown
Contributor

Issue: #121

  • Add tests for collection names with 65, 128, and 512 characters
  • Cover create, get_or_create, and delete operations
  • Test boundary cases (64, 65, 512 chars)
  • Mark get_or_create test as xfail due to underlying table identifier limit bug

Test collection name more than 64 characters

Issue

#121

Track

Track 8 — seekdb, pyseekdb (Mentor: @hnwyllmm)

Changes

  • Added tests/test_long_collection_name.py with comprehensive tests for collection names exceeding 64 characters.
  • Tested collection names with 65, 128, and 512 characters.
  • Covered create, get_or_create, and delete operations.
  • Verified boundary cases (64, 65, 512 chars).

Findings & Bug Report 🐛

While writing these tests, I discovered that get_or_create_collection fails with long names.

  • create_collection: ✅ Works perfectly with names > 64 chars.
  • get_or_create_collection: ❌ Fails with OperationalError: (1059, "Identifier name ... is too long").
  • Reason: It seems get_or_create still uses the raw collection name as the table identifier, hitting the DB's 64-char limit, whereas create correctly uses the new metadata mapping mechanism.
  • Action: I have marked this specific test case as @pytest.mark.xfail to allow the test suite to pass while highlighting the issue.

Testing

  • All tests passed (3 passed, 1 xfailed).
  • Verified against a remote seekdb server.

AI Contribution Details 🤖

This PR was entirely implemented by an AI Agent (Claude Code) under human supervision.

  1. Analysis: Parsed the task guide and identified requirements.
  2. Environment: Setup uv virtual environment and installed dependencies via proxy.
  3. Implementation: Wrote tests/test_long_collection_name.py covering all edge cases.
  4. Validation: Ran tests against a live seekdb server.
  5. Bug Discovery: The AI agent independently identified the get_or_create bug via test failures, analyzed the pymysql stack trace, and applied @pytest.mark.xfail to handle it gracefully without blocking CI.

Summary

Solution Description

Summary by CodeRabbit

  • Tests
    • Added comprehensive test coverage for handling collection names exceeding 64 characters, including boundary case validation.
    • Documented a known issue with the get_or_create method for long collection names.

✏️ Tip: You can customize this high-level summary in your review settings.

Issue: oceanbase#121

- Add tests for collection names with 65, 128, and 512 characters
- Cover create, get_or_create, and delete operations
- Test boundary cases (64, 65, 512 chars)
- Mark get_or_create test as xfail due to underlying table identifier limit bug
@CLAassistant

CLAassistant commented Jan 26, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@coderabbitai

coderabbitai Bot commented Jan 26, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@NTLx has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 8 minutes and 3 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📝 Walkthrough

Walkthrough

Introduces a new test module tests/test_long_collection_name.py with a deterministic embedding function and comprehensive end-to-end tests for handling collection names exceeding 64 characters, including boundary cases and a marked test for a known limitation.

Changes

Cohort / File(s) Summary
Test Suite for Long Collection Names
tests/test_long_collection_name.py
New test module with Simple3DEmbeddingFunction (3D deterministic embeddings), TestLongCollectionName class containing fixtures for server connection, and four test methods covering create, get_or_create (xfail), delete, and boundary case scenarios for collections with names of 65, 128, and 512 characters.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Possibly related PRs

  • fix: Validate collection names #106: The new tests exercise the create_collection, get_or_create_collection, and delete_collection operations that interact with collection name validation logic introduced in that PR.

Suggested reviewers

  • hnwyllmm

Poem

🐰 A test for long names, oh what a sight!
Collection names stretching to 512, quite the height!
With embeddings dancing in 3D space,
Each boundary checked at a steady pace.

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding tests for collection names exceeding 64 characters, which is the primary focus of the new test module.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@tests/test_long_collection_name.py`:
- Around line 41-46: Replace the hardcoded server credentials (SERVER_HOST,
SERVER_PORT, SERVER_DATABASE, SERVER_USER, SERVER_PASSWORD) with
environment-driven values using os.environ.get (e.g., SEEKDB_HOST, SEEKDB_PORT,
SEEKDB_DATABASE, SEEKDB_USER, SEEKDB_PASSWORD) and sensible defaults, and update
any test code that references these constants to use the new variables; also
modify the server_client pytest fixture to detect a missing SEEKDB_HOST (or
equivalent) and call pytest.skip("SEEKDB_HOST not configured") so tests are
skipped when the external server is not available. Ensure you update references
to the constants in this file (and any imports) and keep the internal hostname
removed.
🧹 Nitpick comments (6)
tests/test_long_collection_name.py (6)

16-28: Parameter input shadows Python built-in.

The parameter name input shadows the built-in function. Consider renaming to texts or documents for clarity.

Also note that hash() is not deterministic across Python runs unless PYTHONHASHSEED is set. This is acceptable here since embedding values aren't asserted, but worth documenting if reused elsewhere.


48-60: Missing client cleanup after yield.

The fixture yields the client but doesn't perform cleanup. If pyseekdb.Client holds connections or resources, they may leak.

♻️ Add cleanup after yield
     `@pytest.fixture`
     def server_client(self):
         """Create server mode client"""
         import pyseekdb
         client = pyseekdb.Client(
             host=self.SERVER_HOST,
             port=self.SERVER_PORT,
             tenant="sys",
             database=self.SERVER_DATABASE,
             user=self.SERVER_USER,
             password=self.SERVER_PASSWORD
         )
         yield client
+        # Cleanup: close client connection if method exists
+        if hasattr(client, 'close'):
+            client.close()

62-79: Cleanup may be skipped if assertion fails.

If the assertion on line 74 fails, the collection won't be deleted, potentially leaving test artifacts. Consider using try/finally or addCleanup.

Also, line 79 has an f-string without placeholders (per static analysis).

♻️ Proposed fix with guaranteed cleanup
     def test_create_collection_with_long_name(self, server_client):
         """Test create collection with name > 64 chars"""
         print("\n" + "="*60)
         print("Testing create collection with long names (>64 chars)")
         print("="*60)
 
         for name in self.LONG_NAMES:
             print(f"\n✅ Creating collection with name length: {len(name)}")
-            collection = server_client.create_collection(
-                name=name,
-                embedding_function=Simple3DEmbeddingFunction()
-            )
-            assert collection.name == name, f"Collection name mismatch: {collection.name} != {name}"
-            print(f"   Created successfully: {name[:50]}...")
-
-            # Cleanup
-            server_client.delete_collection(name=name)
-            print(f"   Deleted successfully")
+            try:
+                collection = server_client.create_collection(
+                    name=name,
+                    embedding_function=Simple3DEmbeddingFunction()
+                )
+                assert collection.name == name, f"Collection name mismatch: {collection.name} != {name}"
+                print(f"   Created successfully: {name[:50]}...")
+            finally:
+                # Cleanup regardless of assertion result
+                if server_client.has_collection(name=name):
+                    server_client.delete_collection(name=name)
+                    print("   Deleted successfully")

111-137: Same cleanup concern; multiple f-strings without placeholders.

If the assertion on line 128 fails, the collection remains. Consider wrapping in try/finally.

Static analysis flags f-strings without placeholders on lines 125, 129, 133, 137. These should be regular strings.

♻️ Fix f-strings
-        print(f"   Created collection")
+        print("   Created collection")
 
         # Verify exists
         assert server_client.has_collection(name=name)
-        print(f"   Verified collection exists")
+        print("   Verified collection exists")
 
         # Delete
         server_client.delete_collection(name=name)
-        print(f"   Deleted successfully")
+        print("   Deleted successfully")
 
         # Verify deleted
         assert not server_client.has_collection(name=name)
-        print(f"   Verified collection deleted")
+        print("   Verified collection deleted")

163-171: Minor: f-string without placeholder on line 167.

Line 167 uses f-string but has no placeholders. Use a regular string.

-    print(f"\nEnvironment Configuration:")
+    print("\nEnvironment Configuration:")

35-39: Optional: Annotate mutable class attribute with ClassVar.

Per static analysis (RUF012), mutable class attributes should be annotated. The fullwidth parentheses in Chinese comments (RUF003) are acceptable for readability.

+from typing import ClassVar, List, Union
+
 class TestLongCollectionName:
-    LONG_NAMES = [
+    LONG_NAMES: ClassVar[list[str]] = [
         "a" * 65,
         "b" * 128,
         "c" * 512,
     ]

Comment thread tests/test_long_collection_name.py Outdated
Comment thread tests/test_long_collection_name.py Outdated
@@ -0,0 +1,171 @@
"""
Test collection name length > 64 characters
测试 collection name 长度超过 64 字符的情况

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line.

@hnwyllmm hnwyllmm left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution. But you should put the file into tests/integration_tests directory. And you can refer to other test file, for example, we test different seekdb mode by pass through a db_client parameter, not creating the seekdb end by yourself.

@frostming frostming left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow the instructions of the linked issue, you can just add the cases to tests/unit_tests/test_collection_name_validation.py

Comment thread tests/test_long_collection_name.py Outdated
Comment on lines +163 to +171
if __name__ == "__main__":
print("\n" + "="*60)
print("pyseekdb - Long Collection Name Tests")
print("="*60)
print(f"\nEnvironment Configuration:")
print(f" Server: {TestLongCollectionName.SERVER_HOST}:{TestLongCollectionName.SERVER_PORT}")
print(f" Database: {TestLongCollectionName.SERVER_DATABASE}")
print("="*60 + "\n")
pytest.main([__file__, "-v", "-s"])

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This main block is unnecessary, please remove it

NTLx added a commit to NTLx/pyseekdb that referenced this pull request Jan 31, 2026
…e#155

- Move test_long_collection_name.py to tests/integration_tests
- Use environment variables for database connection
- Add proper cleanup in fixtures
- Change pytest.raises(Exception) to pytest.raises(TypeError)
- Add assertions to verify special character escaping
@NTLx

NTLx commented Feb 10, 2026

Copy link
Copy Markdown
Contributor Author

Hi @hnwyllmm! This PR is now MERGEABLE with all CI checks passing.

Summary:

  • Added validation test for collection names exceeding 64 characters
  • Ensures proper error handling for long collection names

Ready for merge!

@hnwyllmm

Copy link
Copy Markdown
Member

The issue that this pull request try to fix is fixed by #142.
You committed the test file in another pull request. So I'll close pull request directly.
Thanks for your contribution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants