Skip to content

refactor(clp-package): Replace hard-coded CLP metadata table names with table_prefix from connection params and table suffix constants.#864

Merged
Bill-hbrhbr merged 9 commits into
y-scope:mainfrom
Bill-hbrhbr:use-table-constants
Apr 29, 2025

Conversation

@Bill-hbrhbr

@Bill-hbrhbr Bill-hbrhbr commented Apr 29, 2025

Copy link
Copy Markdown
Contributor

Description

  • Replace all hardcoded occurrences of CLP_METADATA_TABLE_PREFIX with the table_prefix field in database_config.get_clp_connection_params_and_type(disable_localhost_socket_connection=true)
  • Replace all archives, tags, archive_tags and files with their corresponding suffix constants.

These will help in future PRs due to:

  • Easier to modify table_prefix into {table_prefix}{dataset}_ when using CLP_S
  • Easier to ensure that all dataset-specific table reference sites have their table_prefix updated by grep-ping *_TABLE_SUFFIX variables.

This PR succeeds:

and prepares for step no. 3 of the dataset feature implementation plan.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Clp compression and search works as expected.
bingranhu@baker22:/home/bingranhu/clp/build/clp-package$ ./sbin/start-clp.sh
2025-04-29T06:28:41.233 INFO [start_clp] Starting database...
2025-04-29T06:31:42.630 INFO [start_clp] Started database.
2025-04-29T06:31:42.630 INFO [start_clp] Creating database tables...
2025-04-29T06:33:17.604 INFO [start_clp] Created database tables.
2025-04-29T06:33:17.604 INFO [start_clp] Starting queue...
2025-04-29T06:34:54.842 INFO [start_clp] Started queue.
2025-04-29T06:34:54.842 INFO [start_clp] Starting redis...
2025-04-29T06:35:20.930 INFO [start_clp] Started redis.
2025-04-29T06:35:20.930 INFO [start_clp] Starting results_cache...
2025-04-29T06:36:47.031 INFO [start_clp] Started results_cache.
2025-04-29T06:36:47.031 INFO [start_clp] Creating results_cache indices...
2025-04-29T06:39:21.753 INFO [start_clp] Created results_cache indices.
2025-04-29T06:39:21.753 INFO [start_clp] Starting compression_scheduler...
2025-04-29T06:40:04.509 INFO [start_clp] Started compression_scheduler.
2025-04-29T06:40:04.509 INFO [start_clp] Starting query_scheduler...
2025-04-29T06:41:40.362 INFO [start_clp] Started query_scheduler.
2025-04-29T06:41:40.362 INFO [start_clp] Starting compression_worker...
2025-04-29T06:43:15.929 INFO [start_clp] Started compression_worker.
2025-04-29T06:43:15.930 INFO [start_clp] Starting query_worker...
2025-04-29T06:44:50.695 INFO [start_clp] Started query_worker.
2025-04-29T06:44:50.695 INFO [start_clp] Starting reducer...
2025-04-29T06:46:26.779 INFO [start_clp] Started reducer.
2025-04-29T06:46:26.779 INFO [start_clp] Starting webui...
2025-04-29T06:48:02.359 INFO [start_clp] Started webui.
2025-04-29T06:48:02.360 INFO [start_clp] Starting log_viewer_webui...
2025-04-29T06:49:37.695 INFO [start_clp] Started log_viewer_webui.
bingranhu@baker22:/home/bingranhu/clp/build/clp-package$ ./sbin/compress.sh --config etc/clp-config.yml fs ~/dataset/data/
2025-04-29T06:53:27.963 INFO [compress] Compression job 1 submitted.
2025-04-29T06:53:28.466 INFO [compress] Compressed 1.77KB into 889.00B (2.04x). Speed: 3.88KB/s.
2025-04-29T06:53:28.967 INFO [compress] Compression finished.
2025-04-29T06:53:28.967 INFO [compress] Compressed 1.77KB into 889.00B (2.04x). Speed: 2.63KB/s.
bingranhu@baker22:/home/bingranhu/clp/build/clp-package$ ./sbin/search.sh --ignore-case "level: INFO"
: {"timestamp":"2025-04-10T23:10:45Z","level":"INFO","service":"auth-service","message":"User login successful","user_id":"abc123","ip":"192.168.1.10"}
: {"timestamp":"2025-04-10T23:10:45Z","level":"INFO","service":"auth-service","message":"User login successful","user_id":"abc123","ip":"192.168.1.10"}
: {"timestamp":"2025-04-10T23:10:45Z","level":"INFO","service":"auth-service","message":"User login successful","user_id":"abc123","ip":"192.168.1.10"}
bingranhu@baker22:/home/bingranhu/clp/build/clp-package$

Summary by CodeRabbit

Summary by CodeRabbit

  • Refactor
    • Updated database table naming to use dynamic prefixes and suffixes from configuration, replacing hardcoded table names across multiple modules.
    • Modified functions and class constructors to accept additional parameters for database connection details, supporting more flexible table naming.
    • Adjusted import statements to use new constants for table suffixes, enhancing consistency and configurability.

@Bill-hbrhbr Bill-hbrhbr requested a review from a team as a code owner April 29, 2025 06:19
@coderabbitai

coderabbitai Bot commented Apr 29, 2025

Copy link
Copy Markdown
Contributor
## Walkthrough

This set of changes refactors how database table names are constructed throughout several components. Hardcoded table name suffixes are replaced with imported constants, and table prefixes are dynamically retrieved from configuration or connection parameters at runtime. Multiple functions and class constructors are updated to accept additional parameters for database connection configuration, enabling dynamic assembly of table names. The modifications affect SQL queries in scripts and core orchestration modules, but do not alter the overall logic, error handling, or control flow.

## Changes

| File(s)                                                                                           | Change Summary                                                                                                                                                                                                                                                                                                                                                                                        |
|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `components/clp-package-utils/clp_package_utils/scripts/native/archive_manager.py`              | Replaces hardcoded table suffixes in SQL queries with imported constants (`ARCHIVE_TAGS_TABLE_SUFFIX`, `ARCHIVES_TABLE_SUFFIX`, `FILES_TABLE_SUFFIX`). Table names are now dynamically constructed using these suffixes and the table prefix from configuration.                                                                                                                                    |
| `components/clp-package-utils/clp_package_utils/scripts/native/decompress.py`                   | Replaces use of the `CLP_METADATA_TABLE_PREFIX` constant with dynamic retrieval of the table prefix from connection parameters and uses the `FILES_TABLE_SUFFIX` constant to construct the files table name in SQL queries.                                                                                                                                    |
| `components/clp-package-utils/clp_package_utils/scripts/start_clp.py`                           | Updates imports to use table suffix constants and the `Database` class. Refactors the `start_webui` function to dynamically construct archive and files table names using the table prefix from the database configuration and the respective suffix constants. Database connection parameters are now accessed via the `Database` object.                         |
| `components/job-orchestration/job_orchestration/executor/compress/compression_task.py`          | Imports `ARCHIVE_TAGS_TABLE_SUFFIX` and updates the `update_tags` function to use this constant for constructing the archive tags table name in SQL insert statements, replacing the hardcoded string.                                                                                                                  |
| `components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py`    | Removes `CLP_METADATA_TABLE_PREFIX` and uses the `TAGS_TABLE_SUFFIX` constant. Table names for tag operations are now dynamically constructed using the table prefix from the connection configuration and the suffix constant in SQL queries.                                                                            |
| `components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py`             | Adds a `table_prefix` parameter to several functions and class constructors. Table names in SQL queries are now built dynamically using the table prefix from connection parameters and imported table suffix constants. Function and constructor signatures are updated accordingly.                           |

## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller
    participant Config
    participant DB

    Caller->>Config: Get DB connection params (includes table_prefix)
    Caller->>Caller: Construct table name (table_prefix + TABLE_SUFFIX)
    Caller->>DB: Execute SQL query using dynamic table name
    DB-->>Caller: Return results

Possibly related PRs

  • y-scope/clp#665: Adds and refactors archive management CLI utilities affecting the same functions (_find_archives, _delete_archives) updated here, showing direct code-level relation.

<!-- walkthrough_end -->


---

<details>
<summary>📜 Recent review details</summary>

**Configuration used: CodeRabbit UI**
**Review profile: CHILL**
**Plan: Pro**

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 28fa098dddaca464582d83123ebfcbd064d66a90 and a0bd99625208b2f98cf288f7f57522361083c441.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `components/clp-package-utils/clp_package_utils/scripts/start_clp.py` (2 hunks)

</details>

<details>
<summary>✅ Files skipped from review due to trivial changes (1)</summary>

* components/clp-package-utils/clp_package_utils/scripts/start_clp.py

</details>

</details>
<!-- internal state start -->


<!-- DwQgtGAEAqAWCWBnSTIEMB26CuAXA9mAOYCmGJATmriQCaQDG+Ats2bgFyQAOFk+AIwBWJBrngA3EsgEBPRvlqU0AgfFwA6NPEgQAfACgjoCEYDEZyAAUASpETZWaCrKNxU3bABsvkCiQBHbGlcABpIcVwvOkgAIn8AMzQxfAoACgYvbjBuZIBrNFIASi4bEm4vZJJIWGdaMCYlegBhABkrSDZcNFpqNAiVaMgMNDZkAHd1WEgAA26BaIB9XhIE+AAPGcgEihYFDHIxeHwsXKpmZEx6eaGHBLX1/cRujFxEUNjIcbRkBwFmdQ0ehyCKwarYRCUSAAIXgPjAsAEFERfB+1jsGEcAihAA4AGwAFn4WFwYMgsjAiCY3BIAHpMtw/OV8Ih1KlZBoYGSGLUMKQmUkUhRkKTqo1qiD/BVkvA+egfDU6uL6PgGAxsBR/BgGNJ+AlZm0rIsALIAUWgAEEACIWy2LS3Q1qmxa2U0AMQAkgANLaTUmg6pzQYkZaJDZbNYkLwqgTdWUxHZ7Ga9eY/ENMDBrIgaUi4RYM/MnQ7iE7LZyjRCLK6LXCyGlpWhIYOLLyqtBeWAsvOIVV5Eh5jPF44YAC8uAowSKM05FtojZLIx8snCooGC2qIzGs2cPMk0hm4SDREQB+3FF3UhrhRP4SuszW0RP6H8TOlOvofumovgfCYmuk3AnI2cp3A8TwvG8GhuAgyCJMkBAULKRDbDKXjqNQuoJHgGrVGQvI6mwrzvPY2A8ugyDMIo8AJLISEBrMNwhisDxbAQKDal42BKJAKZprglI0gw1HwAwPBhusuppCQOYaIeADejGhqsGwAL5ybxkK4Cpiy+mCWAQnRMyGosADKMxFOgGD0GQDj+MgGass87CQNg3AprqbEaf2AmiMJomMQKlBkDqMjyAZcqrkoazkPQMwAFT2hajrOiZACqbqej64GYJBXLVJ4Cr+EEISQAI2BwrQyCuScln0A4aokHQyArBIxwQuiyBmDiACMACc4RdQAzP1kBDd1t5WaNOIEni4QrGcdEJKk9Gkj+dU0Iy+D6pFfSadsJDUDhKDMBUJCEd0C48JUGBQQYcDVLYCgnWhuqfvRvD4CIYgAOT2Sc47wGVl1EOVShoeQlyTagGD4Lg6Clf4aB5HRPKYKQnIAHKiNIiDOPItCqo47DUMOLluRhExBZAJBTFCzA9NUy1KGdMTYAc2OILjLicgAau28ApqT3zNZQS0UGwwLyM8zjiHK7a+Ia9iyE5zBPYB5BEZADZ9AIabhMVwThP4jbEXZ3hvPmyRguETAnXZrKllSYK0N4lD68ELiLE7dCuxQNssCsnPDos4ypH2fuQMVnuhxQ4dGz7OoR+MJACJAqUehN9CtshrUkMnfDJ6n6cWXe9UhYgWFLn4bMYCjAf26TpcHee0y2/TVnERmazi3RpLUJAiu24HDtYE3O7TFh2oLvz4ivWH5HU+sglArd908N4vhFcEzyMJgpXVP4ufJ/Qso8f22i+NQXCwLguDcIgHC0rSRBTNgAgaLbtIUlS+A0vSWS0gKl4Wk+ICS3QsAPFg51fiOHpi4aCjMMBeHkG3ZyNVVrNQ3kyYqO9UCJlVquCEUJkax12LQRCINpDhHGJ2bB28gTw0PvAPO2xdgELJEQ38ihlCqHQvATkbp8A+HwJMCKMEJo4AIPTBhUpkGsL2KuZo3CbAqDUHDRAysaCq3yDDcY0RaCkGuGSJhLCt4lTvLKRsDAMJGP7v0Ex4wvhogBkQUgxtORrw0SreUPZhiwxiH3OG6geL4F1DDOG/gwAOPlIjWg8gHExDbuoSGti4YYKeu3E+lxuAVBEsGYkciaFkEkSwEmokHGXBfLkIhtBboY3wNsDUoo+Am3VEHE44QEiNVoLrBgeRwjLVQZrfO+VditSaMSd6WCzHPFuvoYw4AoBkBVPqNAeBCCkHIFQBhQzOBiX4MIUQ4gpChQUEoKgvDNDaF0GAQwJgoDuGQKgPeayCDEDIMoHZUD2BcCoI4hwTgXClRQdwi5aitA6HmQs0wBgh4nHYIgf+2Rch9MKCQMAeA4SIoLCigopBFiYq8IiqkiFuBvFpCMY5dJx57kWO3NFFANDcFkBwAwsR2UGAgRaD07ytk2JIoC+QW1GC8lIIgBBx1AIUHURdFmrxtjLRmFaHWaYIxsNmDi2QBLxBEo/lkQsmZ4BEF9GiEgS8rgBIae2Xx8ATqpDSbAfwYoTjSyIlwGYFobDNAABIeh5s6S0ABxEyiVkqmXSplU8HqvW+v9SGh0Tpw0ZW9KeO8MxPROnjUlRNaVk0+g8WCSE2VNayKqIqCgtBlRriGJuaooENi6lPiZAAiq0SOHtmETCmKfVcMxFjRVoFWFue4nxpsWGDfsIYaUnIjGzI4LrOQmUEn5eWy4Vr5NrcgMWz4LxUO2HCPdd5p3VG6MeZ8G4RHgQnGIGIII0A5OWb3QtdbsD3A2MWt4EQGm9sUsxcM7EnI9D1C5VkcoHgxGeIhPkyA0J9lmLEY9iBYhRtiA+aQyHM7wePVeY8yGC2oFRnyaoaB7hHJFGSFtbaTKmidM0aA7bKDyFPn2gdQ7d2jsmquK0NHzSmgY4hRtWA+0TpoGxhAM7wg2Q1HXDAjkaDyrvF3I1Gp8kBU3EhAt1Rs4iUk5qZavJaDgyIJhjM44hHbFbI4/w9NT5ztFXQW65hLAWi8DQbZw4RTfrJEoTI5YFzIGFWaqVDDlqeAWCJamrx1CdqMPUkVaNXpU3ptxNikVRCVHcy6/gfBWREEpThAL+ogv2piKFt+aFRLsGi9IKC7LYhGAgGAIwcKNbkoZDkfIaKMU6uxfq3FaLtVYtpCS+AZLEWUr3LSHz9ccZMpZWyjlXKeWbM+RB2BeNgOEbFRK21wX7CyvOgqvgRl2gmnNNaW0FpQ2JtdJGpxsFyiVHfMCyV9rDIZtNFmsNua7v4I1X1rVhLEB6u4AarM05IAehJGSSe86hO5kWKkI1/aD2LAFqeHaqYi2DiOaTM4oxJ3CnPX4xxQWKvqDkf4AGJApD0Fav0ZMAgwdGpzP2fM+qcdw7LOcSs1Zaz1mgBOEgU5MOrgCn+x4qAzXjnggmdVooi3466JQYHeVICUf4/IQJYkadtUQHImp8NagVqreL8SH6tYnaNGaS0No7QJudLdlNFkMzWPk/yt6MxUMHqQ1sVArl3LXAaXE9T1iq5EPolTgTtOGLNgl1sV3GFKUxE9x9r7OaI0pvw/ZeznQqKRmQJ2RxYuN0E5QH9WT45SIMO3auKOWvahwy6bgHkHkyRI5foufdQwPRWhe7kUk4R6bIzlMEpQNIrJoKwBOKLbAeIquxycLMKnLrbLBHwPuo990SXoA5CCjnOXOdc58jzX76I+YyyTLLgWl4lZVHwMLFXIuRBiwYKAcWtseS83Wo1+W7LZbPl80y1k2A2K2lVKwf3Kwiyq1nkphfGS2kgW3qzfxhRawRSRQ61RVIG6yGxxU63xSB2GwYFJXJWlmlXZ24Dm1ZTq0P0gG5V5VWzqnWyBWFU/3FTujJF23tX2wwmgS+CS3z2YSDyZEoikHP3KH8DdxiCtzO1t0u2u0dxsHdBTVqgAxoCAy4OlQiFDmGBYT3xygflmE9R9T9U+wUKTTuzTTT3MJ+yz0h2h0DDILzELnKlnSnmHBXG8wX2dXZjhx4HLGVyJ2cAvUcUEDjBihezDyM3+1ByU2zF4l1khFZwHA5yLFx1LCV15yshrDrBIDSEF0nFPD+xoREi/DJCDHXCUhYnL0XhlxvVqTV2PVULQxFFLy3B7XKOV1SC9n7FlmPAJXJhoCfCsWniBWCJ4lkFD1XSvWr1vRBW1CTxJjEVQAqKWHjy+CmHonIH+VfTAn0KInjjfCfVGV13wHamN0rW4XoAj2FRkJtwu3t2zUUOUPzTVzsjhmFUxxUDTH2CHBqiV0JxSX4GaXsF6KQgexs30m1HswaMeXiyIwFHglSEL0vVXELjTg9F+nnyxxPTaN1HGM0iH3wFajA3VX6HA3oAlwt08kmNGAiwkHbG3jPkQhjz+y+MSJ8L+KwDiJX2HAPy5WP2AM83EKAKvxAJv2CwgPXnC0qyi1gIa0gA/3s2FNXFyz/11CZnSz81PwlLvwAMf2gLlM7X4PgIZk0zz0bDWDd1P3GLQgBAYTYllDcy7201Ek/A6McO6HIJcPgDcL8IsTtS0Oli0QRVqw5RQKa1hQDnhSIlpCEEEDAFSFb0gzFLjMEERxbhCGANpDNVEHWQoHpBm05kLLthxmDm6EQDyCoKQNoPoJW22TW0FU22VIlU/3QDnEhley0M+O5BdQgiMJjVMMSmDRsMzyyjZN7MNRBmAItNdk5Ch3ohmADwwhwyfFhwXE8OqA11lEhCDIO3YCxK9NzGrQ3DL1QEoktKELPwj0ilpIBFEnrUeGjRMP9WHPTxSjHK2HvQnwmVS3KN/XEj90rwOhWXeiYTOP13kAuKrUg0MgQ2HUvFPV93NI12DLlSCUuCJQaUSFSGkQdKtU6HNjhHjH2yg2Qm3SAPUDXSBnoldJEO0BANVIJy4B3MoD6NmDSGwwFhXEKDR1oCnH8J/CeVeG/x3XE1xLPUYnASPzczFJVO8y1KFNANv3APv2lKfxgNf3fwaTYLP1VN/0On/11NUv1KgNlJfxqxrIjOa2jNa0RXjIEETMzJTIXDTKZyTMLRl1cu9hdmiALKHnthLOHmDh8t9mrJoKWwYIbKYKbNYJbI4OqE0I+P1DuPOztyuwdxdCULu2FhEOJJiDvFLWexBCSsMiDXfIsLsIXNXHXNJhmEhHHirGyNCqWG2KvErJvDAtOIhDkWgquJ1zAlSrkMeLDSdyykNybVbX42NO3VYulToibmiCOAimvHu1fCezmPhhDzpNEipLCIYrlyTALC6B6D6HHSZ05wXGZyIAAG1YgALlJ1hYgABdCHOEqkgjE4KQ5PD8TYmYcq0cvNViBpMWdhaoSuXwNTMvWvQofgGkIUlCqatCvg2a2TNi3uVaxa3HFa5CXvZAXREDE9GCCYqYquPfa9B0vEtQkC4DGqhtSk8STGBpWGdfSAOiwZf6XYXwBISzeEsVE06oBA/kmSk/LLP86oC/bU6/IrFSkLSAmU5/ardg7Svmr/eiNUwyjUlpRSuS5SyUtSg08ypWsM5AxrGyu1Oytypy5Mry4cNyjMm27MlqygWkevV2j2LVZ2xlZlagxbZzZbD5aKgVOBIVfUNg1s5U9astVcFqPXXqpUfqogVsXWXwE63iE8ga99fYuGN6foba+82PSo9YgEtzfwzmGIMnESCnLXEGudfzVQ3zTmGYoUcjfuJ05QMQDY/0NOvoE81XNecm6vZE4DGYD0CgU0dYOohcb1K4aIL8yaGYAAKR7AwAnqnuHBnqsjnpJzoR/GI0LrWMAufBBkOzvFyCbuCTYjBCyChFqoXTVzvpAJmAR0Q37W6IapbijRfoQqnWyLQy9lyTzAFkrDFjRwoEWGlyoDhy/rZ2w2rH/sQEAb4s6rTWwzNSQDeC2FqDEOxGKWXPwvQAajJQPqYiPrvAj0vp0r7IpsaSriCH5kjGuDxNOUT3d3YpjvNzekQd8itJIjfUeGzuQDSGfNjTMMytsJ9EPGMNEbfMBsjUPABvEc/MPGsKUaBqKE5CMGqvKIMzWMfT5EWHr0WAcrXLro8JIaqP/Sl0nqgZr3VRmGOvPl4nOoNQwG5wrHnspJ+EhGEIV3BEGITDMay0U2oaHuFGz0i1snxIVA1yjhmuWh7u6D7sJvPzvIiwl3NNMy5os0vTvEoF2D4F0bomswYpcmhIS1oGoU2NXBODkQvJXXruxGOPbPnEbkmg+lyCIDFNpv/LjyPpLqhDvFXD/DsnVmAmQnwcZn1FiY9tIvBPnMDJlV4IRWJx6CEAhAIaSq2JYQCkfI/VVzixZqhHZr4GxGweOD4GeUFIcyczoMFN1vFsAMv3ruMrlvUsNIsuVv+28ch3Hpsfgg3tnukkWDR1rjzEWDSEhC8ASHCAcr4q4Eg1hfTLiK4CtBElwBusRboIwFkGevCB6VcYEsACTCWYEF2UdQEFyFqMGFyAOFgWBF8cJFi6pfI1VF9FzFxl7F3F/F5lg4Hiou8SBligKcN/b5pu5ek4Ne2xwFre4F0FiliFqFmlul2gIVpl66tlsQDliOC0HFvFniXljAYl0l+V8Fql6F9V+l0i9VlFyANFrVrF3V7lg11x/lw+x6oVkVqAZMVYSAb+9jN+8Bj+nkBsQ18IYN2ADV9XZuHkRewQJRKc41n1/Uf18SkB9+mN2AUN11k8yx9YNV0Exq21kyTNuNgQBNrML12YKKP12Bn+pqwdBBpB4BwNsBiB/5uHbNwcSTDtoB8B21qV+CMestito1JNmt1N2leB1HRBtCIByqVtn8dt9ek4LtosN10hj1612o2xttgd3t4d+NllogKt5N2tvMOBv+md5tyqNdvl7LZHf+q1rFi4IgNHfN9iXAcd31ydy8adpYWdil4Bu9jADdvNgtjvFHJYZ9zl1999li14U9mttB9YDBxAED28etmD4V3QPQUqfAIRLYEls9lDtDjD3NiXAtri1V0iiyW5fDwjqyqFe5SLUC15DZQO/lXZX5NAHYpskEcUMFdQCFG5O5RZDJIDhdhJQdJw0T6FKAAABkGkGjxFoAAFYAB2AQAkDTwabqNAXqHqAkBThgHEHEVQAkNANTgQWgDThgWgHEHUAkBIHEFQHEOTljhTtAPEPENAbz3qQaBIbqZT/TvEXqXqXznEWgAAJkGmM9c4EDU+6ns8M7M4U4JG6g8/E4SGi4C9WDQGC5IwYFi5y+i5IA0+M+xF1lEG6gJGc4JAYA0606U+C6y4gHQA07QBIBxEM+6gU567U4SB1Gi+mmi+s96hc/s/67U4YBy6640/U4096iJHmRY406SB6jxG6loB0/C9WBi4YAEEGmi/s905mly5IG6m6jU5G6s96mu/1FW/E5G6SAU8M7nF6GsRmgJDU5xGi4c709i5TiG5s4U8JFoB84M4U7a6gDQAU5s/C7xDG+i/64EGi4SEM9m5G5xASHW805u9i62/68GgYDq8y+Y/E6SXncrGk4gcmn0CAA=== -->

<!-- internal state end -->
<!-- finishing_touch_checkbox_start -->

<details open="true">
<summary>✨ Finishing Touches</summary>

- [ ] <!-- {"checkboxId": "7962f53c-55bc-4827-bfbf-6a18da830691"} --> 📝 Generate Docstrings

</details>

<!-- finishing_touch_checkbox_end -->
<!-- tips_start -->

---

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

<details>
<summary>❤️ Share</summary>

- [X](https://twitter.com/intent/tweet?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A&url=https%3A//coderabbit.ai)
- [Mastodon](https://mastodon.social/share?text=I%20just%20used%20%40coderabbitai%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20the%20proprietary%20code.%20Check%20it%20out%3A%20https%3A%2F%2Fcoderabbit.ai)
- [Reddit](https://www.reddit.com/submit?title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&text=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code.%20Check%20it%20out%3A%20https%3A//coderabbit.ai)
- [LinkedIn](https://www.linkedin.com/sharing/share-offsite/?url=https%3A%2F%2Fcoderabbit.ai&mini=true&title=Great%20tool%20for%20code%20review%20-%20CodeRabbit&summary=I%20just%20used%20CodeRabbit%20for%20my%20code%20review%2C%20and%20it%27s%20fantastic%21%20It%27s%20free%20for%20OSS%20and%20offers%20a%20free%20trial%20for%20proprietary%20code)

</details>

<details>
<summary>🪧 Tips</summary>

### Chat

There are 3 ways to chat with [CodeRabbit](https://coderabbit.ai?utm_source=oss&utm_medium=github&utm_campaign=y-scope/clp&utm_content=864):

- Review comments: Directly reply to a review comment made by CodeRabbit. Example:
  - `I pushed a fix in commit <commit_id>, please review it.`
  - `Generate unit testing code for this file.`
  - `Open a follow-up GitHub issue for this discussion.`
- Files and specific lines of code (under the "Files changed" tab): Tag `@coderabbitai` in a new review comment at the desired location with your query. Examples:
  - `@coderabbitai generate unit testing code for this file.`
  -	`@coderabbitai modularize this function.`
- PR comments: Tag `@coderabbitai` in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
  - `@coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.`
  - `@coderabbitai read src/utils.ts and generate unit testing code.`
  - `@coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.`
  - `@coderabbitai help me debug CodeRabbit configuration file.`

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

### CodeRabbit Commands (Invoked using PR comments)

- `@coderabbitai pause` to pause the reviews on a PR.
- `@coderabbitai resume` to resume the paused reviews.
- `@coderabbitai review` to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
- `@coderabbitai full review` to do a full review from scratch and review all the files again.
- `@coderabbitai summary` to regenerate the summary of the PR.
- `@coderabbitai generate docstrings` to [generate docstrings](https://docs.coderabbit.ai/finishing-touches/docstrings) for this PR.
- `@coderabbitai generate sequence diagram` to generate a sequence diagram of the changes in this PR.
- `@coderabbitai resolve` resolve all the CodeRabbit review comments.
- `@coderabbitai configuration` to show the current CodeRabbit configuration for the repository.
- `@coderabbitai help` to get help.

### Other keywords and placeholders

- Add `@coderabbitai ignore` anywhere in the PR description to prevent this PR from being reviewed.
- Add `@coderabbitai summary` to generate the high-level summary at a specific location in the PR description.
- Add `@coderabbitai` anywhere in the PR title to generate the title automatically.

### CodeRabbit Configuration File (`.coderabbit.yaml`)

- You can programmatically configure CodeRabbit by adding a `.coderabbit.yaml` file to the root of your repository.
- Please see the [configuration documentation](https://docs.coderabbit.ai/guides/configure-coderabbit) for more information.
- If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: `# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json`

### Documentation and Community

- Visit our [Documentation](https://docs.coderabbit.ai) for detailed information on how to use CodeRabbit.
- Join our [Discord Community](http://discord.gg/coderabbit) to get help, request features, and share feedback.
- Follow us on [X/Twitter](https://twitter.com/coderabbitai) for updates and announcements.

</details>

<!-- tips_end -->

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (1)

389-403: 🛠️ Refactor suggestion

Refactor tag-filter query construction to avoid brittle % string formatting

Mixing f-strings with legacy % interpolation is hard to read and easy to break (e.g., if tags is empty, the generated SQL is IN (), causing a syntax error).

A clearer, fully-parameterised approach:

-        filter_clauses.append(
-            f"id IN (SELECT archive_id FROM {table_prefix}{ARCHIVE_TAGS_TABLE_SUFFIX} WHERE "
-            f"tag_id IN (SELECT tag_id FROM {table_prefix}{TAGS_TABLE_SUFFIX} WHERE tag_name IN "
-            f"(%s)))" % ", ".join(["%s" for _ in search_config.tags])
-        )
+        placeholders = ", ".join(["%s"] * len(search_config.tags))
+        filter_clauses.append(
+            f"id IN (SELECT archive_id FROM {table_prefix}{ARCHIVE_TAGS_TABLE_SUFFIX} "
+            f"WHERE tag_id IN (SELECT tag_id FROM {table_prefix}{TAGS_TABLE_SUFFIX} "
+            f"WHERE tag_name IN ({placeholders})))"
+        )

No mixed formatting, and the placeholder count always matches the parameter list.

🧹 Nitpick comments (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

865-868: Re-use the existing Database instance instead of re-parsing it

clp_config.database is already a validated Database model instance.
Re-parsing it with Database.parse_obj introduces unnecessary overhead and double-validation. You can safely drop the extra call:

-    database_config = Database.parse_obj(clp_config.database)
+    database_config = clp_config.database

This keeps the intent clear and shaves a few milliseconds off start-up.

components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (1)

386-388: Use typing.Any instead of the built-in any in type hints

any (lower-case) refers to the built-in function and is not a valid typing annotation.
Using it here disables static type-checking and may confuse IDEs.

-    clp_metadata_db_conn_params: Dict[str, any],
+    clp_metadata_db_conn_params: Dict[str, Any],

Apply the same change to the other new signatures for consistency.

Also applies to: 419-420, 535-536

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 329eb7f and 0336d57.

📒 Files selected for processing (6)
  • components/clp-package-utils/clp_package_utils/scripts/native/archive_manager.py (4 hunks)
  • components/clp-package-utils/clp_package_utils/scripts/native/decompress.py (2 hunks)
  • components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2 hunks)
  • components/job-orchestration/job_orchestration/executor/compress/compression_task.py (2 hunks)
  • components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py (2 hunks)
  • components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (11 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
components/clp-package-utils/clp_package_utils/scripts/native/archive_manager.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • Database (88-167)
components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py (2)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
  • CLPConfig (603-776)
components/job-orchestration/job_orchestration/scheduler/job_config.py (3)
  • ExtractIrJobConfig (62-66)
  • ExtractJsonJobConfig (69-71)
  • SearchJobConfig (74-92)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)
🔇 Additional comments (14)
components/job-orchestration/job_orchestration/executor/compress/compression_task.py (2)

13-13: Good addition of the table suffix import.

Importing this constant aligns with the PR's objective to make table names configurable rather than hardcoded.


72-76: Well implemented table name construction.

The function now properly uses the table_prefix parameter combined with the ARCHIVE_TAGS_TABLE_SUFFIX constant instead of a hardcoded table name. This approach provides more flexibility and consistency in handling database table names.

components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py (3)

18-18: Appropriate replacement of hardcoded prefix with table suffix constant.

The import has been correctly updated to use the TAGS_TABLE_SUFFIX constant instead of the previously hardcoded CLP_METADATA_TABLE_PREFIX.


236-241: Well-implemented dynamic table name construction.

The code now properly extracts the table prefix from database connection parameters and uses it with the TAGS_TABLE_SUFFIX constant to dynamically construct the table name. This replaces the previously hardcoded prefix approach.


244-246: Consistent implementation of dynamic table naming.

The SQL query correctly uses the dynamically constructed table name with the table prefix and suffix constant, maintaining consistency with the overall refactoring approach.

components/clp-package-utils/clp_package_utils/scripts/native/decompress.py (3)

12-16: Appropriate import structure for table suffix constants.

The import statement has been properly updated to include the necessary FILES_TABLE_SUFFIX constant, along with other required imports, aligning with the refactoring goal.


51-52: Good addition of table prefix extraction.

These new lines correctly retrieve the connection parameters and extract the table prefix, which is necessary for constructing the dynamic table name.


57-57: Proper implementation of dynamic table name in SQL query.

The SQL query now correctly uses the dynamically constructed table name by combining the table prefix with the FILES_TABLE_SUFFIX constant, replacing the hardcoded approach.

components/clp-package-utils/clp_package_utils/scripts/native/archive_manager.py (5)

10-15: Good import structure for table suffix constants.

All necessary table suffix constants are properly imported for use throughout the file.


253-255: Well-implemented dynamic table name in query.

The SQL query now correctly uses the table_prefix combined with the ARCHIVES_TABLE_SUFFIX constant to construct the table name dynamically.


316-323: Consistent implementation in DELETE query.

The DELETE query properly uses the dynamic table name construction approach, maintaining consistency with the refactoring strategy.


335-340: Good refactoring of the FILES table reference.

The query now properly uses table_prefix with FILES_TABLE_SUFFIX to construct the table name dynamically.


342-347: Consistent implementation for archive tags table.

The DELETE query for the archive tags table correctly follows the same dynamic table name construction pattern as used elsewhere.

components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)

871-876: Double-check that all metadata tables needed by the WebUI are passed

Only the archives and files table names are updated here.
If the WebUI still queries tags or archive_tags tables (older versions did), missing entries in meteor_settings will result in runtime SQL errors.

Please verify WebUI code paths and, if necessary, add:

"SqlDbClpTagsTableName": f"{table_prefix}{TAGS_TABLE_SUFFIX}",
"SqlDbClpArchiveTagsTableName": f"{table_prefix}{ARCHIVE_TAGS_TABLE_SUFFIX}",

before shipping.

Comment on lines 459 to 464
table_prefix = clp_metadata_db_conn_params["table_prefix"]
query = f"""SELECT archive_id, id as file_split_id
FROM {table_prefix}{FILES_TABLE_SUFFIX} WHERE
orig_file_id = '{orig_file_id}' AND
begin_message_ix <= {msg_ix} AND
(begin_message_ix + num_messages) > {msg_ix}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Prevent SQL-injection by parameterising orig_file_id

orig_file_id is interpolated directly into the SQL string, which can allow injection if the ID ever comes from untrusted input. Parameterise it instead:

-    query = f"""SELECT archive_id, id as file_split_id
-            FROM {table_prefix}{FILES_TABLE_SUFFIX} WHERE
-            orig_file_id = '{orig_file_id}' AND
+    query = f"""SELECT archive_id, id as file_split_id
+            FROM {table_prefix}{FILES_TABLE_SUFFIX} WHERE
+            orig_file_id = %s AND
             begin_message_ix <= {msg_ix} AND
             (begin_message_ix + num_messages) > {msg_ix}
             """
 ...
-        cursor.execute(query)
+        cursor.execute(query, (orig_file_id,))

This eliminates the risk and keeps the style consistent with archive_exists().

Also applies to: 467-468

@kirkrodrigues kirkrodrigues left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the PR title, how about:

refactor(clp-package): Replace hard-coded CLP metadata table names with `table_prefix` from connection params and table suffix constants.

Comment thread components/clp-package-utils/clp_package_utils/scripts/native/archive_manager.py Outdated
Comment thread components/clp-package-utils/clp_package_utils/scripts/start_clp.py Outdated
Comment thread components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py Outdated
@Bill-hbrhbr Bill-hbrhbr changed the title refactor(clp-package): Replace hardcoded metadata table names with table_prefix from Database connection parameters and table suffix constants. refactor(clp-package): Replace hard-coded CLP metadata table names with table_prefix from connection params and table suffix constants. Apr 29, 2025
Bill-hbrhbr and others added 2 commits April 29, 2025 14:10
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
@kirkrodrigues

Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Apr 29, 2025

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants