feat: add pyseekdb CLI entry point in pyproject.toml #161
## Summary
Introduce a new script entry point for the pyseekdb command-line interface.
### Changes
- Added `[project.scripts]` section in `pyproject.toml` to define the `pyseekdb` command, linking it to the `main` function in the `pyseekdb.cli` module.
## Test plan
- Verified the new CLI entry point by running the command and ensuring it executes without errors.
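As a rough illustration of what a `[project.scripts]` target such as `pyseekdb.cli:main` is expected to look like, the sketch below defines a `main(argv)` function that parses arguments and returns an exit code. The `hello` subcommand and its message are hypothetical placeholders and are not taken from the PR.

```python
import argparse
from typing import Optional, Sequence


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Parse arguments and return a process exit code (0 means success)."""
    parser = argparse.ArgumentParser(prog="pyseekdb")
    sub = parser.add_subparsers(dest="command", required=True)
    # Hypothetical placeholder subcommand, used only for this sketch.
    sub.add_parser("hello", help="Print a greeting")
    # When argv is None, argparse falls back to sys.argv[1:].
    args = parser.parse_args(argv)
    if args.command == "hello":
        print("hello from the CLI sketch")
        return 0
    return 1
```

Accepting `argv` as a parameter keeps the function easy to call from tests, while the generated console script can still invoke `main()` with no arguments and pass the return value to `sys.exit`.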
📝 Walkthrough
Adds a CLI package and console script entrypoint for pyseekdb; implements a full command-line tool (argument parsing, connection factories, admin/collection/query/sql/get commands, output formatting, and a main(argv) entrypoint).
Sequence Diagram(s)
sequenceDiagram
autonumber
participant User as "User"
participant CLI as "CLI Parser\nrgba(52,152,219,0.5)"
participant Factory as "Connection Factory\nrgba(46,204,113,0.5)"
participant Client as "Client/Admin Proxy\nrgba(155,89,182,0.5)"
participant DB as "Database/Server\nrgba(231,76,60,0.5)"
participant Formatter as "Output Formatter\nrgba(241,196,15,0.5)"
User->>CLI: run `pyseekdb` with args
CLI->>CLI: parse global + subcommand args
CLI->>Factory: request client/admin (embedded | server)
Factory->>DB: open connection
DB-->>Factory: connection established
Factory-->>Client: return proxy
CLI->>Client: execute subcommand (db/collections/sql/query/get)
Client->>DB: perform operation
DB-->>Client: return results
Client-->>CLI: results returned
CLI->>Formatter: format results (table|json)
Formatter-->>CLI: formatted output
CLI-->>User: display output
CLI->>DB: close server connection (if applicable)
DB-->>CLI: closed
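The flow in the diagram (execute a subcommand, format the result as table or JSON, then close the connection) can be sketched roughly as follows. `FakeClient`, `format_rows`, and `run_list` are hypothetical names invented for this illustration; the real client API and formatter in the PR may differ.

```python
import json
from typing import Any, Dict, List


class FakeClient:
    """Hypothetical stand-in for the client/admin proxy in the diagram."""

    def __init__(self) -> None:
        self.closed = False

    def list_collections(self) -> List[Dict[str, Any]]:
        return [{"name": "docs", "count": 3}, {"name": "notes", "count": 0}]

    def close(self) -> None:
        self.closed = True


def format_rows(rows: List[Dict[str, Any]], output: str = "table") -> str:
    """Render rows as JSON or as a plain column-aligned table."""
    if output == "json":
        return json.dumps(rows, indent=2)
    if not rows:
        return "(no rows)"
    headers = list(rows[0].keys())
    # Each column is as wide as its header or its longest cell.
    widths = [max(len(h), *(len(str(r[h])) for r in rows)) for h in headers]
    lines = ["  ".join(h.ljust(w) for h, w in zip(headers, widths))]
    for r in rows:
        lines.append("  ".join(str(r[h]).ljust(w) for h, w in zip(headers, widths)))
    return "\n".join(lines)


def run_list(client: FakeClient, output: str) -> str:
    """Execute one subcommand, format the result, and always close the client."""
    try:
        return format_rows(client.list_collections(), output)
    finally:
        client.close()
```

Putting `close()` in a `finally` block means the connection is released even when the operation or the formatter raises.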
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/pyseekdb/cli/main.py`:
- Around line 200-229: cmd_collections_info currently emits two separate JSON
documents when args.output == "json" (one for info via _print_json(info) and
another for sample), breaking machine parsing; change the logic so when
args.output == "json" you build a single payload (for example {"info": info,
"sample": sample_or_empty}) and call _print_json exactly once after retrieving
sample (use coll.peek when args.sample and count>0), otherwise keep the existing
human-readable prints; update the branches around _print_json and sample
handling in cmd_collections_info to assemble and print one JSON object.
- Around line 16-35: In _add_connection_args, make embedded vs server options
mutually exclusive by replacing the argument_group with
parser.add_mutually_exclusive_group(required=False) (or a non-required mutual
group) and add the conflicting flags (--path and --host) into that group so the
parser enforces that users cannot supply both; keep other server flags (--port,
--tenant, --database, --user, --password) as normal arguments but ensure only
--path or --host can be chosen at parse time.
🧹 Nitpick comments (2)
src/pyseekdb/cli/main.py (2)
72-75: Remove unused `# noqa: SLF001` directives.
Ruff reports these as unused because `SLF001` isn’t enabled. Removing them here (and the other cleanup lines) will keep lint clean.
♻️ Suggested cleanup
- server = client._server # noqa: SLF001
- return server._execute(sql) # noqa: SLF001
+ server = client._server
+ return server._execute(sql)
338-394: Add connection args to db and collections subparsers for consistent option placement.
Currently, the `sql`, `query`, and `get` subcommands accept connection flags after the subcommand (via `_add_connection_args()` on their parsers), while `db` and `collections` only accept them before the subcommand. This inconsistency makes the UX unpredictable. Adding `_add_connection_args()` to all subparsers would allow consistent flag placement across all commands.
🔧 Suggested adjustment
  db = sub.add_parser("db", help="Database (admin) operations")
+ _add_connection_args(db)
  db_sub = db.add_subparsers(dest="db_command", required=True)
- _add_collection_subparsers(sub.add_parser("collections", help="Collection operations"))
+ collections_p = sub.add_parser("collections", help="Collection operations")
+ _add_connection_args(collections_p)
+ _add_collection_subparsers(collections_p)
- _add_collection_subparsers(sub.add_parser("coll", help="Alias for collections"))
+ coll_p = sub.add_parser("coll", help="Alias for collections")
+ _add_connection_args(coll_p)
+ _add_collection_subparsers(coll_p)
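One thing to watch when the same flags are registered on both the top-level parser and a subparser: in current CPython, the subparser's parsed values (including its defaults) are copied over the parent namespace, so a subparser default of `None` can mask a flag given before the subcommand. The sketch below is one possible way to accept flags in either position, assuming `argparse.SUPPRESS` defaults on the subparsers. The `add_connection_args` helper and its `top_level` parameter are hypothetical and not part of the PR.

```python
import argparse


def add_connection_args(parser: argparse.ArgumentParser, *, top_level: bool) -> None:
    """Add --host/--path to a parser.

    Subparsers use SUPPRESS as the default so they only set the attribute
    when the flag is actually given after the subcommand.
    """
    default = None if top_level else argparse.SUPPRESS
    parser.add_argument("--host", default=default)
    parser.add_argument("--path", default=default)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pyseekdb")
    add_connection_args(parser, top_level=True)
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("db", "collections"):
        add_connection_args(sub.add_parser(name), top_level=False)
    return parser
```

With this arrangement, `pyseekdb --host a db` and `pyseekdb db --host a` both end up with `args.host == "a"`.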
def _add_connection_args(parser: argparse.ArgumentParser) -> None:
    """Add global connection options (embedded vs server)."""
    g = parser.add_argument_group("connection (choose one)")
    g.add_argument(
        "--path",
        metavar="DIR",
        default=None,
        help="Embedded mode: path to seekdb data directory (default: seekdb.db in cwd)",
    )
    g.add_argument("--host", default=None, help="Server mode: host (e.g. localhost)")
    g.add_argument("--port", type=int, default=2881, help="Server mode: port (default: 2881)")
    g.add_argument("--tenant", default="sys", help="Server mode: tenant (default: sys)")
    g.add_argument("--database", "-d", default="test", help="Database name (default: test)")
    g.add_argument("--user", default="root", help="Server mode: user (default: root)")
    g.add_argument(
        "--password",
        "-p",
        default=None,
        help="Server mode: password (or set SEEKDB_PASSWORD)",
    )
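The help text for `--password` mentions `SEEKDB_PASSWORD` as a fallback. How the PR actually resolves it is not shown in this excerpt, so the following is only a sketch of one plausible precedence rule (explicit flag first, then the environment variable). `resolve_password` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_password(
    cli_value: Optional[str], env: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Prefer an explicit --password value, then fall back to SEEKDB_PASSWORD."""
    # "is not None" so that an explicitly passed empty password is still honored.
    if cli_value is not None:
        return cli_value
    return env.get("SEEKDB_PASSWORD")
```

Passing the environment as a parameter keeps the helper testable without mutating `os.environ`.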
🧩 Analysis chain
🏁 Script executed:
find . -name "main.py" -path "*/cli/*" | head -5
Repository: oceanbase/pyseekdb
Length of output: 88
🏁 Script executed:
cat -n src/pyseekdb/cli/main.py | head -60
Repository: oceanbase/pyseekdb
Length of output: 2470
🏁 Script executed:
rg "def _add_connection_args" -A 30 src/pyseekdb/cli/main.py
Repository: oceanbase/pyseekdb
Length of output: 1329
🏁 Script executed:
rg "args\.path\|args\.host" src/pyseekdb/cli/main.py -A 2 -B 2
Repository: oceanbase/pyseekdb
Length of output: 44
🌐 Web query:
argparse mutually exclusive group best practice Python
💡 Result:
argparse mutually exclusive group: best practices
- Use it for “choose one of these options” flags
  - Create a group with `parser.add_mutually_exclusive_group()` and add the options to that group; `argparse` will reject cases where the user supplies more than one. [1]
- If the user must pick one, set `required=True` on the group (not the individual args)
  - `required=True` means “at least one of these options must be provided”. [1]
  - Don’t try to set `required=True` on each argument inside the group (that’s not how exclusivity is intended to be expressed). [1]
- Prefer subcommands (`add_subparsers`) when you’re modeling verbs/modes
  - For CLI patterns like `tool start|stop|restart`, subparsers usually produce clearer help/usage and extensibility than a mutually exclusive set of boolean flags. [2]
- Help formatting: put the exclusive group inside an argument group if you need a title/description
  - Mutually exclusive groups don’t take `title`/`description` directly, but you can create an `add_argument_group(...)` and then call `group.add_mutually_exclusive_group(...)` inside it. [1]
- Avoid unsupported/removed nesting
  - As of Python 3.14, calling `add_argument_group()` or `add_mutually_exclusive_group()` on a mutually exclusive group raises an exception (it was deprecated in 3.11 and removed in 3.14). [1]
Minimal example
import argparse
p = argparse.ArgumentParser()
mode = p.add_mutually_exclusive_group(required=True)
mode.add_argument("--json", action="store_true")
mode.add_argument("--yaml", action="store_true")
args = p.parse_args()
Sources: Python argparse docs (Mutual exclusion, grouping, deprecations) [1]; Stack Overflow discussion showing subparsers as the better fit for verb-style commands [2].
Citations:
- 1: https://docs.python.org/3/library/argparse.html?utm_source=openai
- 2: https://stackoverflow.com/questions/7869345/how-to-make-python-argparse-mutually-exclusive-group-arguments-without-prefix
🏁 Script executed:
find . -name "*.py" -path "*/client*" | grep -v __pycache__ | head -10
Repository: oceanbase/pyseekdb
Length of output: 458
🏁 Script executed:
rg "class Client" -A 15 src/pyseekdb/
Repository: oceanbase/pyseekdb
Length of output: 1070
🏁 Script executed:
rg "def __init__" src/pyseekdb/client/ -A 20 | head -80
Repository: oceanbase/pyseekdb
Length of output: 6236
🏁 Script executed:
cat -n src/pyseekdb/client/__init__.py | head -100
Repository: oceanbase/pyseekdb
Length of output: 4013
🏁 Script executed:
rg "class Client" -B 2 -A 20 src/pyseekdb/client/__init__.py
Repository: oceanbase/pyseekdb
Length of output: 44
🏁 Script executed:
rg "def __init__.*path.*host" -A 30 src/pyseekdb/client/
Repository: oceanbase/pyseekdb
Length of output: 44
🏁 Script executed:
cat -n src/pyseekdb/client/__init__.py | sed -n '100,150p'
Repository: oceanbase/pyseekdb
Length of output: 2123
🏁 Script executed:
rg "^def Client" -A 50 src/pyseekdb/client/__init__.py
Repository: oceanbase/pyseekdb
Length of output: 2254
Make embedded vs server mode mutually exclusive.
--path and --host can both be provided, which silently favors embedded mode. Using add_mutually_exclusive_group() enforces the choice at parse time and prevents ambiguous configuration.
🔧 Suggested fix
- g = parser.add_argument_group("connection (choose one)")
- g.add_argument(
- "--path",
- metavar="DIR",
- default=None,
- help="Embedded mode: path to seekdb data directory (default: seekdb.db in cwd)",
- )
- g.add_argument("--host", default=None, help="Server mode: host (e.g. localhost)")
+ g = parser.add_argument_group("connection")
+ mode = g.add_mutually_exclusive_group()
+ mode.add_argument(
+ "--path",
+ metavar="DIR",
+ default=None,
+ help="Embedded mode: path to seekdb data directory (default: seekdb.db in cwd)",
+ )
+ mode.add_argument("--host", default=None, help="Server mode: host (e.g. localhost)")
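To see the effect of the suggestion in isolation, here is a small self-contained sketch that nests a mutually exclusive group inside a titled argument group. It is a reduced stand-in for `_add_connection_args`, not the PR's code, and `connection_mode` is a hypothetical helper added only to show how the chosen mode could be read back.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pyseekdb")
    group = parser.add_argument_group("connection")
    # Only --path and --host conflict; other server flags stay ordinary options.
    mode = group.add_mutually_exclusive_group()
    mode.add_argument("--path", metavar="DIR", default=None)
    mode.add_argument("--host", default=None)
    group.add_argument("--port", type=int, default=2881)
    return parser


def connection_mode(args: argparse.Namespace) -> str:
    """Hypothetical helper: server mode when --host is given, else embedded."""
    return "server" if args.host is not None else "embedded"
```

Supplying both `--path` and `--host` makes argparse print a usage error and exit with status 2 before any connection is attempted.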
… output
## Summary
Updated the `cmd_collections_info` function to include a sample of collection data in the JSON output format. This change allows users to see a preview of the collection's contents alongside its metadata.
### Changes
- Modified the JSON output to include both collection info and a sample of documents when the `--sample` argument is provided.
- Removed redundant sample printing logic from the non-JSON output section.
## Test plan
- Verified the updated command by running it with various output formats and sample sizes to ensure correct functionality.
Updated the CLI documentation in `__init__.py` and `main.py` to use lowercase "seekdb" for consistency across the project. This change enhances the clarity and uniformity of the command-line interface documentation.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@src/pyseekdb/cli/main.py`:
- Around line 392-397: The get subparser currently allows both --ids and --limit
which is ambiguous; modify the parser setup for the "get" command (the get_p
parser) to enforce mutual exclusivity by creating a mutually exclusive argument
group (e.g., group = get_p.add_mutually_exclusive_group()) and move the --ids
and --limit definitions into that group (keep their types/defaults and help
text), or alternatively add a runtime check in the get command handler to raise
an error if both args.ids and args.limit are provided; update code referencing
get_p, --ids, and --limit accordingly.
- Around line 177-183: The dimension check in cmd_collections_create currently
treats 0 as falsy and drops the configuration; change the logic to test for None
explicitly so a provided value of 0 is honored: build config =
HNSWConfiguration(dimension=args.dimension) if args.dimension is not None else
None and then pass that config into client.create_collection (references:
cmd_collections_create, HNSWConfiguration, client.create_collection).
def cmd_collections_create(args: argparse.Namespace) -> int:
    """Create collection."""
    client = _make_client(args)
    try:
        config = HNSWConfiguration(dimension=args.dimension) if args.dimension else None
        client.create_collection(args.name, configuration=config)
        print(f"Created collection: {args.name}")
Validate --dimension explicitly; 0 is currently ignored.
The truthy check skips config when --dimension 0 is passed, silently creating a collection without the intended dimension.
🛠️ Suggested fix
- config = HNSWConfiguration(dimension=args.dimension) if args.dimension else None
+ if args.dimension is not None and args.dimension <= 0:
+ print("dimension must be positive", file=sys.stderr)
+ return 1
+ config = HNSWConfiguration(dimension=args.dimension) if args.dimension is not None else None
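The difference between a truthiness check and an explicit `None` check can be shown without the real `HNSWConfiguration` class. The sketch below follows the suggested fix (reject non-positive values, treat an omitted flag as "no configuration"). `validate_dimension` is a hypothetical helper, not code from the PR.

```python
from typing import Optional


def validate_dimension(dimension: Optional[int]) -> Optional[int]:
    """Return the dimension to configure, or None when the flag was omitted.

    Raises ValueError for zero or negative values instead of silently
    ignoring them, which is what a plain truthiness check would do for 0.
    """
    if dimension is None:
        return None
    if dimension <= 0:
        raise ValueError("dimension must be positive")
    return dimension
```

In the CLI handler, the `ValueError` would presumably be turned into an error message on stderr and a non-zero return code, as in the suggested diff.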
get_p = sub.add_parser("get", help="Get documents from collection")
_add_connection_args(get_p)
get_p.add_argument("collection", help="Collection name")
get_p.add_argument("--ids", default=None, help="Comma-separated IDs")
get_p.add_argument("--limit", type=int, default=10, help="Max rows (default: 10)")
get_p.add_argument("--include", nargs="+", default=None, help="Include fields")
Prevent --ids and --limit from being passed together.
Spec says “by id or by limit”; allowing both is ambiguous.
🛠️ Suggested fix
- get_p.add_argument("--ids", default=None, help="Comma-separated IDs")
- get_p.add_argument("--limit", type=int, default=10, help="Max rows (default: 10)")
+ get_grp = get_p.add_mutually_exclusive_group()
+ get_grp.add_argument("--ids", default=None, help="Comma-separated IDs")
+ get_grp.add_argument("--limit", type=int, default=10, help="Max rows (default: 10)")
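A standalone sketch of the parse-time variant is shown below, using a plain parser in place of the `get` subparser. `build_get_parser` and `split_ids` are hypothetical names; how the PR splits the comma-separated IDs is not shown in this excerpt, so the splitting rule here is an assumption.

```python
import argparse
from typing import List, Optional


def build_get_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="pyseekdb get")
    parser.add_argument("collection", help="Collection name")
    # --ids and --limit live in one exclusive group, so passing both is rejected.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--ids", default=None, help="Comma-separated IDs")
    group.add_argument("--limit", type=int, default=10, help="Max rows (default: 10)")
    return parser


def split_ids(raw: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated ID string, dropping blanks (assumed behavior)."""
    if raw is None:
        return None
    return [part.strip() for part in raw.split(",") if part.strip()]
```

The default of 10 for `--limit` still applies when only `--ids` is given, so the handler would need to decide whether to ignore the limit in that case.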
All issues are resolved, except for several minor ones (improvements and style).
@hnwyllmm Please review this pull request.
Introduce a CLI for pyseekdb to support debugging and managing SeekDB databases/collections (close #55).
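Changes
- Added `src/pyseekdb/cli/__init__.py` and `src/pyseekdb/cli/main.py`, which implement the command-line entry point and subcommand dispatch.
- Embedded mode: `--path` (default: `seekdb.db` in the current directory).
- Server mode: `--host`, `--port` (default 2881), `--tenant`, `--database`, `--user`, `--password` (or the `SEEKDB_PASSWORD` environment variable).
- `db`: `list` / `create <name>` / `delete <name>` (database management).
- `collections`: `list` / `create <name> [--dimension N]` / `delete <name>` / `info <name> [--sample N]` (collection management and preview).
- `query` (`--text`, `--n`).
- `get` (`--ids`, `--limit`).
- Output format: `-o table` (default) or `-o json`.
- Added `[project.scripts]` to `pyproject.toml`, pointing `pyseekdb` to `pyseekdb.cli:main`.
Solution Description
To address issue #55, this PR builds on the existing `Client`/`AdminClient` and the connection and capability descriptions in the spec to implement a unified CLI for debugging and managing databases and collections.
The CLI reuses the factory logic of `Client()` and `AdminClient()` and supports two modes, embedded (`--path`) and server (`--host`/`--port`, etc.). Subcommands are organized with argparse, with no additional dependencies.
Test plan
- Ran `pyseekdb --help`, `pyseekdb db --help`, `pyseekdb collections --help`, etc., and confirmed that subcommands and arguments are displayed correctly.
- Ran `db list`, `collections list`, `collections info <name> --sample N`, `sql "SHOW TABLES"`, `query <collection> --text "..."`, `get <collection> --limit N`, and the `-o json` output, and confirmed that there were no errors and the results matched expectations.
related to #55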