feat(indexer): Rename CLI arg `table-name` to `dataset-name` for consistency; Update the name of the column metadata table name to put the dataset name first. #846
Walkthrough
This change updates naming conventions and parameter handling related to datasets and table prefixes across several components. In the Python utilities, the function for creating metadata tables now takes a table prefix and constructs the full table name internally. In the C++ core components, all references to "table" are renamed to "dataset" in variable names, method parameters, and documentation, including command-line argument handling. Table name construction is centralized and standardized, with suffixes and prefixes applied consistently. The job orchestration layer now uses a configuration constant for the default dataset name instead of a hardcoded string.
Sequence Diagram(s)
sequenceDiagram
participant User
participant CommandLineArguments
participant IndexManager
participant MySQLIndexStorage
User->>CommandLineArguments: Provide dataset-name as argument
CommandLineArguments->>IndexManager: get_dataset_name()
IndexManager->>MySQLIndexStorage: init(dataset_name, should_create_table)
MySQLIndexStorage->>MySQLIndexStorage: Construct table_name from prefix + dataset_name + suffix
MySQLIndexStorage->>MySQLIndexStorage: Create or update metadata table
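The flow in the diagram can be sketched in a few lines of Python. This is only an illustration of the naming convention `clp_<dataset-name>_<type>`, not the indexer's actual code. The helper names `build_table_name` and `parse_args`, the constants, the `--dataset-name` flag form, and the default value `default` are all assumptions made for this example.

```python
import argparse

# Assumed constants for illustration; the real values live in the CLP code base.
TABLE_PREFIX = "clp_"
COLUMN_METADATA_SUFFIX = "column_metadata"


def build_table_name(dataset_name, table_type=COLUMN_METADATA_SUFFIX, prefix=TABLE_PREFIX):
    """Builds a table name of the form <prefix><dataset-name>_<type>."""
    if not dataset_name:
        raise ValueError("dataset_name must be non-empty")
    return f"{prefix}{dataset_name}_{table_type}"


def parse_args(argv):
    """Parses a hypothetical indexer command line that takes a dataset name."""
    parser = argparse.ArgumentParser(description="Indexer CLI sketch")
    # The caller passes only the dataset name; the full table name is derived.
    parser.add_argument("--dataset-name", default="default")
    return parser.parse_args(argv)


args = parse_args(["--dataset-name", "default"])
print(build_table_name(args.dataset_name))  # prints "clp_default_column_metadata"
```

Keeping the prefix and suffix inside one helper means callers never assemble the full table name themselves, which is the centralization the walkthrough describes.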
kirkrodrigues
left a comment
For the PR title, how about:
feat(indexer): Rename CLI arg `table-name` to `dataset-name` for consistency; Update the name of the column metadata table name to put the dataset name first.
Description
As title.
This PR succeeds:
`clp-s` to compression task workers. #819
`dataset` fields to input configs of compression and search jobs. #839
`CLP_S` storage engine. #831
and covers the following parts of the dataset feature implementation plan:
`clp_column_metadata_default` to `clp_default_column_metadata` to comply with the dataset tables naming convention `clp_<dataset-name>_<type>` for all tables.
Since, for `clp_<dataset-name>_<type>`, the table prefix `clp` and the table suffix `<type>` (in this case, `COLUMN_METADATA`) are usually fixed, the indexer only needs a dataset name from the command-line arguments, not the full table name. Hence, in this PR we update the argument name and all its reference sites.
Checklist
breaking change.
Validation performed
`clp-package`. Running indexer generates no errors.
Summary by CodeRabbit