
feat(clp-package): Use full UUID for sbin container names to prevent collisions.#1870

Merged: junhaoliao merged 2 commits into y-scope:main from junhaoliao:native-container-name on Jan 28, 2026
@junhaoliao junhaoliao commented Jan 15, 2026

Description

This PR changes the container name generation for Docker-based sbin scripts to use the full UUID
instead of only the last 4 characters.

Before: clp-compression-a1b2 (last 4 hex chars of the UUID, 65,536 unique values)
After: clp-compression-989c2953-3111-41d2-a068-22d2dbf29c41 (full UUID; 2^122 possible values for a version-4 UUID like this one)

Motivation

The previous implementation used only the last 4 hex characters of a UUID, which allows just 65,536 (16^4) distinct names per job type. This could lead to container name collisions in:

  • High-volume production environments with many concurrent jobs
  • Rapid successive job submissions
  • Long-running deployments accumulating many jobs over time
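
To put the 65,536 figure in perspective, the collision risk can be estimated with the standard birthday-problem calculation. The following is a rough sketch, not code from this PR: `collision_probability` is a hypothetical helper, and it assumes names are drawn uniformly at random and that all `num_names` names are in use at the same time.

```python
import math


def collision_probability(num_names: int, namespace_size: int) -> float:
    """Return the probability that at least two of `num_names` random names collide.

    Assumes each name is drawn uniformly and independently from
    `namespace_size` possible values (the birthday problem).
    """
    if num_names < 2:
        return 0.0
    if num_names > namespace_size:
        # Pigeonhole principle: a collision is certain.
        return 1.0
    # log P(no collision) = sum over i of log(1 - i / N); log1p keeps precision
    # when i / N is tiny, as it is for a full UUID.
    log_p_unique = sum(math.log1p(-i / namespace_size) for i in range(num_names))
    return -math.expm1(log_p_unique)


if __name__ == "__main__":
    for n in (10, 100, 300, 1000):
        short = collision_probability(n, 16**4)   # last 4 hex characters
        full = collision_probability(n, 2**122)   # random bits in a version-4 UUID
        print(f"{n:>5} names: 4 hex chars = {short:.3g}, full UUID = {full:.3g}")
```

Under these assumptions, a 4-hex-character suffix reaches a roughly 50% chance of at least one collision at about 300 simultaneous names of the same job type, while the full UUID keeps the probability negligible at any realistic scale.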

Impact Assessment

Affected Scripts

The following Docker-based sbin scripts use generate_container_name():

| Script | Job Type | New Container Name Pattern |
| --- | --- | --- |
| compress.sh | compression | clp-compression-<full-uuid> |
| compress-from-s3.sh | compression | clp-compression-<full-uuid> |
| decompress.sh | file-extraction / ir-extraction | clp-file-extraction-<full-uuid> / clp-ir-extraction-<full-uuid> |
| search.sh | search | clp-search-<full-uuid> |
| admin-tools/dataset-manager.sh | dataset-manager | clp-dataset-manager-<full-uuid> |
| admin-tools/archive-manager.sh | archive-manager | clp-archive-manager-<full-uuid> |
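
The change itself is small. A minimal sketch of the before and after naming logic follows, assuming the name is built from a `clp-<job-type>-` prefix plus a UUID generated with Python's `uuid.uuid4()`. The real `generate_container_name()` in `clp_package_utils/general.py` may differ in signature and details, and `generate_container_name_truncated` is a hypothetical name used here only for the old behavior.

```python
import uuid


def generate_container_name_truncated(job_type: str) -> str:
    """Old behavior (sketch): keep only the last 4 hex characters of the UUID."""
    return f"clp-{job_type}-{str(uuid.uuid4())[-4:]}"


def generate_container_name(job_type: str) -> str:
    """New behavior (sketch): append the full 36-character UUID string."""
    return f"clp-{job_type}-{uuid.uuid4()}"


if __name__ == "__main__":
    # Each call produces a fresh random name, so the output varies per run.
    print(generate_container_name_truncated("compression"))
    print(generate_container_name("compression"))
```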

Affected Config Files

Each script also calls dump_container_config() with get_container_config_filename(container_name)
to create a temporary config file in var/log/:

| Before | After |
| --- | --- |
| .clp-compression-a1b2-config.yaml | .clp-compression-<full-uuid>-config.yaml |
| .clp-search-a1b2-config.yaml | .clp-search-<full-uuid>-config.yaml |
| .clp-file-extraction-a1b2-config.yaml | .clp-file-extraction-<full-uuid>-config.yaml |
| .clp-ir-extraction-a1b2-config.yaml | .clp-ir-extraction-<full-uuid>-config.yaml |
| .clp-dataset-manager-a1b2-config.yaml | .clp-dataset-manager-<full-uuid>-config.yaml |
| .clp-archive-manager-a1b2-config.yaml | .clp-archive-manager-<full-uuid>-config.yaml |

These files are created before native container execution and used to pass configuration into the
native container.
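
Judging from the before/after filenames in the table above, the config filename appears to be the container name wrapped in a leading dot and a `-config.yaml` suffix. The sketch below illustrates that mapping. The function body is inferred from the filename pattern rather than taken from the CLP source, and `get_container_config_path` with its `logs_dir` parameter is a hypothetical addition for illustration.

```python
from pathlib import PurePosixPath


def get_container_config_filename(container_name: str) -> str:
    """Map a container name to its hidden temporary config filename (sketch)."""
    return f".{container_name}-config.yaml"


def get_container_config_path(container_name: str, logs_dir: str = "var/log") -> PurePosixPath:
    """Hypothetical helper: place the config file under the logs directory."""
    return PurePosixPath(logs_dir) / get_container_config_filename(container_name)


if __name__ == "__main__":
    name = "clp-search-1495aa61-9206-45a0-a551-9f27e27950c7"
    print(get_container_config_path(name))
```

Because the filename is derived from the container name, a longer container name automatically yields a correspondingly unique config filename with no separate change.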

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  1. Build and start CLP:

    task
    cd build/clp-package
    ./sbin/start-clp.sh
  2. Test compress.sh - verify container name and config file both use full UUID:

    ./sbin/compress.sh --timestamp-key timestamp ~/samples/postgresql.jsonl &
    # Poll until container/config file appears (they are cleaned up after script exits)
    for i in {1..20}; do
        CONTAINER=$(docker ps --filter "name=clp-compression" --format "{{.Names}}")
        CONFIG=$(ls var/log/.clp-compression-*-config.yaml 2>/dev/null)
        if [[ -n "$CONTAINER" && -n "$CONFIG" ]]; then
            echo "$CONTAINER"; echo "$CONFIG"; break
        fi
        sleep 0.2
    done

    Output:

    clp-compression-bec69303-7c35-47eb-bf3e-f83c10244f4c
    var/log/.clp-compression-bec69303-7c35-47eb-bf3e-f83c10244f4c-config.yaml
    

    Both the container name and config file use the same full UUID.

  3. Test search.sh:

    ./sbin/search.sh "ERROR" &
    for i in {1..20}; do
        CONTAINER=$(docker ps --filter "name=clp-search" --format "{{.Names}}")
        CONFIG=$(ls var/log/.clp-search-*-config.yaml 2>/dev/null)
        if [[ -n "$CONTAINER" && -n "$CONFIG" ]]; then
            echo "$CONTAINER"; echo "$CONFIG"; break
        fi
        sleep 0.2
    done

    Output:

    clp-search-1495aa61-9206-45a0-a551-9f27e27950c7
    var/log/.clp-search-1495aa61-9206-45a0-a551-9f27e27950c7-config.yaml
    
  4. Test admin-tools/archive-manager.sh:

    ./sbin/admin-tools/archive-manager.sh find &
    for i in {1..20}; do
        CONTAINER=$(docker ps --filter "name=clp-archive-manager" --format "{{.Names}}")
        CONFIG=$(ls var/log/.clp-archive-manager-*-config.yaml 2>/dev/null)
        if [[ -n "$CONTAINER" && -n "$CONFIG" ]]; then
            echo "$CONTAINER"; echo "$CONFIG"; break
        fi
        sleep 0.2
    done

    Output:

    clp-archive-manager-84b0f179-1497-4860-81b7-82fcd93fa09b
    var/log/.clp-archive-manager-84b0f179-1497-4860-81b7-82fcd93fa09b-config.yaml
    
  5. Test admin-tools/dataset-manager.sh:

    ./sbin/admin-tools/dataset-manager.sh list &
    for i in {1..20}; do
        CONTAINER=$(docker ps --filter "name=clp-dataset-manager" --format "{{.Names}}")
        CONFIG=$(ls var/log/.clp-dataset-manager-*-config.yaml 2>/dev/null)
        if [[ -n "$CONTAINER" && -n "$CONFIG" ]]; then
            echo "$CONTAINER"; echo "$CONFIG"; break
        fi
        sleep 0.2
    done

    Output:

    clp-dataset-manager-1447b279-bc73-44ab-a4ba-69925f767c8c
    var/log/.clp-dataset-manager-1447b279-bc73-44ab-a4ba-69925f767c8c-config.yaml
    
  6. Stop CLP:

    ./sbin/stop-clp.sh

    Output:

    2026-01-28T01:07:11.032 INFO [controller] Stopping all CLP containers using Docker Compose...
    Container clp-package-9fb0-reducer-1 Stopping
    ...
    Container clp-package-9fb0-queue-1 Removed
    Network clp-package-9fb0_default Removed
    2026-01-28T01:07:33.020 INFO [controller] Stopped CLP.
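
The manual checks above compare the container name and the config filename by eye. A small script could make the same comparison mechanically. The following is a sketch only: `parse_container_name` and `config_matches_container` are hypothetical helpers, and the `clp-<job-type>-<uuid>` and `.<container-name>-config.yaml` patterns are taken from the outputs shown in this description.

```python
import re
import uuid
from pathlib import PurePosixPath

# clp-<job-type>-<uuid>, where the UUID is in canonical 8-4-4-4-12 hex form.
_CONTAINER_NAME_RE = re.compile(
    r"^clp-(?P<job_type>[a-z]+(?:-[a-z]+)*)-"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})$"
)


def parse_container_name(container_name: str) -> tuple[str, uuid.UUID]:
    """Split a container name into (job_type, UUID); raise if no full UUID is present."""
    match = _CONTAINER_NAME_RE.match(container_name)
    if match is None:
        raise ValueError(f"Not a full-UUID container name: {container_name!r}")
    return match.group("job_type"), uuid.UUID(match.group("uuid"))


def config_matches_container(container_name: str, config_path: str) -> bool:
    """Check that the config file is named after the given container."""
    parse_container_name(container_name)  # raises if the name lacks a full UUID
    return PurePosixPath(config_path).name == f".{container_name}-config.yaml"


if __name__ == "__main__":
    name = "clp-compression-bec69303-7c35-47eb-bf3e-f83c10244f4c"
    config = "var/log/.clp-compression-bec69303-7c35-47eb-bf3e-f83c10244f4c-config.yaml"
    print(parse_container_name(name))
    print(config_matches_container(name, config))
```

A name with the old 4-character suffix, such as clp-compression-a1b2, fails the pattern and raises `ValueError`, which makes a regression to the truncated form easy to spot.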
    

Summary by CodeRabbit

  • Bug Fixes
    • Improved container name uniqueness by using full identifiers instead of abbreviated versions, reducing the risk of naming collisions.


coderabbitai Bot commented Jan 15, 2026


Walkthrough

The container name generation logic has been updated to use the complete UUID instead of only the final four characters, increasing the entropy and uniqueness of generated container identifiers.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Container naming logic: components/clp-package-utils/clp_package_utils/general.py | Modified UUID truncation in container name generation from last 4 characters to full UUID for improved entropy and collision avoidance |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: using full UUID instead of partial UUID for container names to prevent collisions. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@junhaoliao junhaoliao changed the title from "feat(clp-package): Use full UUID for native sbin scripts container name." to "feat(clp-package): Use full UUID for sbin container names to prevent collisions." Jan 28, 2026
@junhaoliao junhaoliao marked this pull request as ready for review January 28, 2026 01:10
@junhaoliao junhaoliao requested a review from a team as a code owner January 28, 2026 01:10
@junhaoliao junhaoliao merged commit 277e03e into y-scope:main Jan 28, 2026
25 checks passed
@junhaoliao junhaoliao deleted the native-container-name branch May 7, 2026 19:46
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026