docs(deployment): Add multi-host deployment and external database setup guides. #1532

Merged
kirkrodrigues merged 15 commits into y-scope:main from junhaoliao:multi-node-doc
Nov 9, 2025

@junhaoliao junhaoliao commented Oct 30, 2025

Member

Description

This PR adds comprehensive documentation for deploying CLP in multi-node environments and using
external databases.

Key Changes

  1. Multi-node deployment guide (docs/src/user-docs/guides-multi-node.md):

    • Added step-by-step instructions for deploying CLP across multiple hosts using Docker Compose
    • Documented the --setup-only flag workflow for manual orchestration
    • Included configuration steps for setting up hostnames, ports, and shared filesystems
    • Provided detailed service startup commands organized by infrastructure, controller, and worker services
    • Added worker concurrency configuration guidance for optimal performance
  2. External database setup guide (docs/src/user-docs/reference-external-database.md):

    • Created reference documentation for using external MariaDB/MySQL and MongoDB databases
    • Included installation and configuration instructions for Ubuntu
    • Provided connection verification steps and CLP configuration examples

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

  • Validated instructions locally using Debian WSL
  • Tested markdown rendering and link references
    • Verified all internal cross-references resolve correctly

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive "External database" guide for MariaDB/MySQL and MongoDB (including cloud and secure remote setups).
    • Added a "Multi-host deployment" guide and renamed the overview entry accordingly.
    • Removed the previous "Multi-node" guide entry.
    • Added a guidance tip about using object storage with ephemeral hosts and recommending external metadata databases.

coderabbitai Bot commented Oct 30, 2025

Contributor

Walkthrough

Removed the old multi-node guide and added two new guides (multi-host deployment and external-database setup); updated the Guides index/overview and added an object-storage usage tip in the guides-using-object-storage index. No code or public API changes.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Removed guide**<br>`docs/src/user-docs/guides-multi-node.md` | Deleted the previous multi-node Docker Compose deployment guide and its temporary unsupported warning. |
| **New guides**<br>`docs/src/user-docs/guides-multi-host.md`, `docs/src/user-docs/guides-external-database.md` | Added a multi-host deployment guide (manual Docker Compose across hosts, SeaweedFS setup, start/stop/monitor instructions) and an external-database guide (MariaDB/MySQL and MongoDB configuration, remote/AWS options, CLP config examples and YAML fragments). |
| **Index / overview / nav updates**<br>`docs/src/user-docs/index.md`, `docs/src/user-docs/guides-overview.md` | Updated Guides toctree: removed `guides-multi-node`, added `guides-multi-host`, `guides-external-database`, and `guides-using-object-storage/index`; renamed overview card from "Multi-node" to "Multi-host" and added an "External database setup" card. |
| **Object-storage note**<br>`docs/src/user-docs/guides-using-object-storage/index.md` | Added an informational tip advising that if object storage is used because hosts are ephemeral, consider external databases for metadata persistence. |

Sequence Diagram(s)

Not applicable — documentation-only changes.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Review focus:
    • Verify toctree entries and file paths resolve correctly.
    • Check overview grid item titles/links render as intended.
    • Proofread new guides for formatting, YAML/code block correctness, and internal link targets.
    • Confirm the object-storage tip placement and clarity.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title accurately reflects the main changes: adding multi-host deployment and external database setup guides as documented in the changeset. |

@junhaoliao changed the title from "docs(multi-node): Add multi-node deployment guide using Docker Compose and shared filesystem." to "docs(clp-package): Add multi-node deployment guide and external database setup reference." on Nov 3, 2025.
@junhaoliao marked this pull request as ready for review on November 3, 2025 at 09:51.
@junhaoliao requested a review from a team as a code owner on November 3, 2025 at 09:51.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cc777c and 58f617f.

📒 Files selected for processing (4)
  • docs/src/user-docs/guides-multi-node.md (1 hunks)
  • docs/src/user-docs/index.md (1 hunks)
  • docs/src/user-docs/reference-external-database.md (1 hunks)
  • docs/src/user-docs/reference-overview.md (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-25T05:13:13.298Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Applied to files:

  • docs/src/user-docs/guides-multi-node.md
📚 Learning: 2025-06-18T20:39:05.899Z
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.

Applied to files:

  • docs/src/user-docs/guides-multi-node.md
🪛 LanguageTool
docs/src/user-docs/reference-external-database.md

[uncategorized] ~4-~4: Possible missing preposition found.
Context: ...ses for CLP instead of using the Docker Compose managed databases. :::{warning} The [C...

(AI_HYDRA_LEO_MISSING_TO)


[grammar] ~68-~68: The first clause in a conditional statement is not usually in the future tense. Consider removing the modal verb “will”.
Context: ...r remote connections If CLP components will connect from a different host, you need to conf...

(CONDITIONAL_CLAUSE)


[grammar] ~126-~126: The first clause in a conditional statement is not usually in the future tense. Consider removing the modal verb “will”.
Context: ...r remote connections If CLP components will connect from a different host: 1. Edit the Mon...

(CONDITIONAL_CLAUSE)

docs/src/user-docs/guides-multi-node.md

[uncategorized] ~7-~7: Possible missing comma found.
Context: ... deployment using manual Docker Compose orchestration and may change as we actively work to i...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~86-~86: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... of your results cache service. * Update MongoDbPort if you changed the result...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (11)
docs/src/user-docs/reference-external-database.md (4)

1-10: Well-structured introduction with appropriate warnings.

The opening sections clearly establish purpose and scope, with a helpful warning about when this guide applies.


16-111: Comprehensive MariaDB/MySQL setup guidance.

The step-by-step instructions are clear and include practical examples for both local installation and AWS RDS. Security note at line 48–51 about replacing '%' is particularly valuable.


113-169: MongoDB setup sections are thorough and include cloud alternatives.

Security warnings for production deployments (line 148–151) and coverage of AWS DocumentDB and MongoDB Atlas are particularly helpful for users.


170-200: Excellent integration guidance and multi-node considerations.

The configuration examples in YAML are clear, and the note about skipping infrastructure services when using external databases (line 196–199) provides crucial operational guidance. Cross-reference to guides-multi-node.md is well-placed.

docs/src/user-docs/guides-multi-node.md (5)

1-36: Strong introduction and clear requirements section.

The requirements are comprehensive, and the note about manual orchestration (line 28–30) appropriately sets expectations. This effectively replaces the temporary warning with actionable guidance.


37-114: Excellent configuration workflow with practical guidance.

The step-by-step approach (setup host → configure credentials → adjust settings → generate environment → distribute) is logical and well-explained. The note about host customization post-setup (line 68–95) is particularly helpful for operators unfamiliar with the containerization details.


125-130: Well-placed reference to external database guide.

This cross-reference to reference-external-database.md is timely and helpful, enabling operators to skip database services appropriately. The note about still running initialization jobs is crucial operational detail.


115-216: Service startup commands are well-organized and comprehensive.

The categorization by infrastructure, controller, and worker services is clear. Comments and --no-deps flags are appropriate. The reminder about clp-json + Presto deployments (line 121–122) and worker parallelism note (line 219–222) address common deployment variations well.


247-304: Stopping, monitoring, and SeaweedFS sections provide operational completeness.

The generic monitoring commands, stopping workflow, and SeaweedFS setup instructions round out the guide appropriately without being overly prescriptive.

docs/src/user-docs/index.md (1)

87-97: Index update correctly integrates external database reference.

The new reference-external-database entry is properly placed in the Reference section's toctree, maintaining alphabetical ordering and consistency with existing entries.

docs/src/user-docs/reference-overview.md (1)

36-42: Grid card addition is consistent and well-positioned.

The new "External database setup" card follows the established format, includes a descriptive subtitle, and logically extends the reference material offerings. The link target correctly references the new reference-external-database file.


### Configuring MariaDB for remote connections

If CLP components will connect from a different host, you need to configure MariaDB to accept remote

Contributor

⚠️ Potential issue | 🟡 Minor

Grammar: conditional clause should use present tense.

In conditional statements, avoid the future tense modal "will". Rephrase as: "If CLP components connect from a different host, you need to configure MariaDB..."

-If CLP components will connect from a different host, you need to configure MariaDB to accept remote
+If CLP components connect from a different host, configure MariaDB to accept remote

🤖 Prompt for AI Agents
In docs/src/user-docs/reference-external-database.md around line 68, change the
conditional clause from future-tense "If CLP components will connect..." to
present-tense wording; replace with "If CLP components connect from a different
host, you need to configure MariaDB to accept remote" (or similar present-tense
phrasing) so the sentence uses present tense in the conditional clause.


### Configuring MongoDB for remote connections

If CLP components will connect from a different host:

Contributor

⚠️ Potential issue | 🟡 Minor

Grammar: same conditional issue as line 68.

Rephrase to use present tense: "If CLP components connect from a different host:"

-If CLP components will connect from a different host:
+If CLP components connect from a different host:

🤖 Prompt for AI Agents
In docs/src/user-docs/reference-external-database.md around line 126, the
sentence "If CLP components will connect from a different host:" uses future
tense like the earlier issue at line 68; change it to present tense by
rephrasing to "If CLP components connect from a different host:" to match style
and consistency with line 68.

@junhaoliao changed the title from "docs(clp-package): Add multi-node deployment guide and external database setup reference." to "docs(deployment): Add multi-node deployment guide and external database setup reference." on Nov 3, 2025.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 58f617f and d589c0a.

📒 Files selected for processing (1)
  • docs/src/user-docs/guides-multi-node.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-docs/guides-multi-node.md

[style] ~87-~87: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... of your results cache service. * Update MongoDbPort if you changed the result...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (3)
docs/src/user-docs/guides-multi-node.md (3)

1-312: All cross-references verified—documentation is ready.

The verification confirms all components referenced in the guide are properly configured:

  • External database reference link resolves correctly
  • Design documentation reference is valid
  • All three worker concurrency environment variables are defined with defaults in docker-compose-all.yaml
  • The docker-compose file exists at the expected location

The multi-node deployment guide is well-structured, comprehensive, and internally consistent. No corrections or additional changes are required.


306-306: Documentation references verified—no issues found.

The referenced design orchestration document exists at the correct path. The relative link ../dev-docs/design-deployment-orchestration.md from docs/src/user-docs/guides-multi-node.md correctly resolves to docs/src/dev-docs/design-deployment-orchestration.md. Both cross-referenced files are in place.
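The link check described above comes down to resolving a relative path against the directory of the file that contains it. A minimal Python sketch, assuming POSIX-style repository paths; `resolve_doc_link` is a hypothetical helper and not part of CLP or CodeRabbit:

```python
import posixpath


def resolve_doc_link(source_file: str, relative_link: str) -> str:
    """Resolve a relative Markdown link against the file that contains it."""
    source_dir = posixpath.dirname(source_file)
    # normpath collapses the ".." segment so the result is repo-relative.
    return posixpath.normpath(posixpath.join(source_dir, relative_link))


resolved = resolve_doc_link(
    "docs/src/user-docs/guides-multi-node.md",
    "../dev-docs/design-deployment-orchestration.md",
)
print(resolved)
```

The printed path can then be compared against the files that actually exist in the repository.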


103-114: Environment variable names verified as correct.

All three worker concurrency environment variables (CLP_COMPRESSION_WORKER_CONCURRENCY, CLP_QUERY_WORKER_CONCURRENCY, CLP_REDUCER_CONCURRENCY) are confirmed to exist in the codebase at tools/deployment/package/docker-compose-all.yaml and are used correctly with Celery worker --concurrency flags. The variable names in the documentation match the deployment configuration exactly, and the CPU count tuning guidance is appropriate for Celery concurrency settings.
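As a rough illustration of that tuning guidance, the sketch below derives a value for each variable from the host's CPU count and prints lines suitable for an environment file. The three variable names come from the review above; setting each one to the full CPU count, and the helper name `worker_concurrency_env`, are assumptions made for illustration and not CLP's documented defaults.

```python
import os

WORKER_CONCURRENCY_VARS = (
    "CLP_COMPRESSION_WORKER_CONCURRENCY",
    "CLP_QUERY_WORKER_CONCURRENCY",
    "CLP_REDUCER_CONCURRENCY",
)


def worker_concurrency_env(cpu_count=None):
    """Map each concurrency variable to a CPU-count-based value (assumed policy)."""
    if cpu_count is None:
        # os.cpu_count() may return None when the count cannot be determined.
        cpu_count = os.cpu_count() or 1
    cpus = max(1, cpu_count)
    return {name: str(cpus) for name in WORKER_CONCURRENCY_VARS}


for name, value in worker_concurrency_env().items():
    print(f"{name}={value}")
```

On hosts that run several worker types at once, splitting the CPU count between them may be a better starting point than giving each worker every core.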

Comment thread docs/src/user-docs/guides-multi-node.md Outdated
Comment on lines +83 to +87
2. **Edit `var/www/webui/server/dist/settings.json`**:
   * Update `SqlDbHost` to the actual hostname or IP address of your database service.
   * Update `SqlDbPort` if you changed the database port.
   * Update `MongoDbHost` to the actual hostname or IP address of your results cache service.
   * Update `MongoDbPort` if you changed the results cache port.

Contributor

🧹 Nitpick | 🔵 Trivial

Vary sentence structure to avoid repetitive "Update" beginnings.

Lines 84-86 begin three successive bullet points with "Update". Consider rewording some of these to improve readability (e.g., "Ensure SqlDbPort matches..." or "Set MongoDbHost to...").

Apply this diff to improve readability:

   2. **Edit `var/www/webui/server/dist/settings.json`**:
-     * Update `SqlDbHost` to the actual hostname or IP address of your database service.
-     * Update `SqlDbPort` if you changed the database port.
-     * Update `MongoDbHost` to the actual hostname or IP address of your results cache service.
+     * Set `SqlDbHost` to the actual hostname or IP address of your database service.
+     * Ensure `SqlDbPort` matches your database configuration.
+     * Set `MongoDbHost` to the actual hostname or IP address of your results cache service.
      * Update `MongoDbPort` if you changed the results cache port.

🤖 Prompt for AI Agents
In docs/src/user-docs/guides-multi-node.md around lines 83 to 87, the three
consecutive bullet points all start with "Update", making the list repetitive;
rephrase the middle and/or last bullets to vary sentence structure (for example:
"Ensure `SqlDbPort` matches the database port you configured." and "Set
`MongoDbHost` to the hostname or IP address of your results cache service."),
keeping meaning identical and preserving inline code formatting and punctuation.
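For readers who would rather script the `settings.json` edit discussed in this thread, here is a minimal Python sketch. The four key names are taken from the quoted guide text; the starting values, the example hostnames, and the helper `update_webui_settings` are hypothetical, and the real file contains more keys than shown.

```python
import json


def update_webui_settings(settings, sql_host, mongo_host, sql_port=None, mongo_port=None):
    """Return a copy of the settings with the database endpoints replaced."""
    updated = dict(settings)
    updated["SqlDbHost"] = sql_host
    updated["MongoDbHost"] = mongo_host
    # Ports are only overwritten when the caller passes a new value.
    if sql_port is not None:
        updated["SqlDbPort"] = sql_port
    if mongo_port is not None:
        updated["MongoDbPort"] = mongo_port
    return updated


# Hypothetical starting values, for illustration only.
example = {
    "SqlDbHost": "database",
    "SqlDbPort": 3306,
    "MongoDbHost": "results_cache",
    "MongoDbPort": 27017,
}
patched = update_webui_settings(
    example, "db.example.internal", "cache.example.internal", mongo_port=27018
)
print(json.dumps(patched, indent=2))
```

To apply this to the real file, load it with `json.load`, pass the resulting dictionary through the helper, and write the result back with `json.dump`.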

Comment thread docs/src/user-docs/guides-multi-node.md Outdated
* If you're not running as root, ensure Docker can be run
[without superuser privileges][docker-non-root].
* One or more hosts networked together
* When not using S3 storage, a shared filesystem accessible by all worker hosts (e.g., NFS,

Member Author

just realized that seaweedfs is accessible via s3 too. maybe it's better to say "cloud s3" or "aws s3" to avoid confusion. or we can just say it's possible to set up an s3 service with seaweedfs too

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

♻️ Duplicate comments (2)
docs/src/user-docs/guides-external-database.md (2)

69-69: [UNRESOLVED] Fix conditional clause grammar: use present tense, not future.

This issue was flagged in a previous review and remains unresolved. Conditional clauses should use present tense, not future tense. Change "will connect" to "connect".

-If CLP components will connect from a different host, you need to configure MariaDB to accept remote
+If CLP components connect from a different host, you need to configure MariaDB to accept remote

127-127: [UNRESOLVED] Fix conditional clause grammar: use present tense, not future (second occurrence).

Same issue as line 69. Conditional clauses should use present tense. Change "will connect" to "connect".

-If CLP components will connect from a different host:
+If CLP components connect from a different host:
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between d589c0a and ec9ca96.

📒 Files selected for processing (5)
  • docs/src/user-docs/guides-external-database.md (1 hunks)
  • docs/src/user-docs/guides-multi-host.md (1 hunks)
  • docs/src/user-docs/guides-multi-node.md (0 hunks)
  • docs/src/user-docs/guides-overview.md (2 hunks)
  • docs/src/user-docs/index.md (1 hunks)
💤 Files with no reviewable changes (1)
  • docs/src/user-docs/guides-multi-node.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: quinntaylormitchell
Repo: y-scope/clp PR: 968
File: docs/src/user-guide/quick-start/overview.md:73-109
Timestamp: 2025-06-18T20:39:05.899Z
Learning: The CLP project team prefers to use video content to demonstrate detailed procedural steps (like tarball extraction) rather than including every step in the written documentation, keeping the docs focused on conceptual guidance.
📚 Learning: 2025-09-25T05:13:13.298Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Applied to files:

  • docs/src/user-docs/guides-multi-host.md
🪛 LanguageTool
docs/src/user-docs/guides-external-database.md

[uncategorized] ~3-~3: Possible missing preposition found.
Context: ...ses for CLP instead of using the Docker Compose managed databases. :::{warning} The [C...

(AI_HYDRA_LEO_MISSING_TO)


[grammar] ~69-~69: The first clause in a conditional statement is not usually in the future tense. Consider removing the modal verb “will”.
Context: ...r remote connections If CLP components will connect from a different host, you need to conf...

(CONDITIONAL_CLAUSE)


[grammar] ~127-~127: The first clause in a conditional statement is not usually in the future tense. Consider removing the modal verb “will”.
Context: ...r remote connections If CLP components will connect from a different host: 1. Edit the Mon...

(CONDITIONAL_CLAUSE)

docs/src/user-docs/guides-multi-host.md

[grammar] ~8-~8: The word ‘available’ is a noun or an adjective. A verb is missing or misspelled, or maybe a comma is missing.
Context: ...; however, Kubernetes Helm support will available in a future release, which will simplif...

(MD_JJ)


[style] ~100-~100: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...if you changed the database port. * Update MongoDbHost to the actual hostname or...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ss of your results cache service. * Update MongoDbPort if you changed the result...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🔇 Additional comments (10)
docs/src/user-docs/guides-multi-host.md (5)

138-142: Helpful integration with external database guide.

The tip at lines 138–142 appropriately directs users to the external database guide for alternative database configurations, with clear guidance on skipping relevant services. This improves usability in multi-host deployments.


286-308: Stopping and monitoring guidance is clear.

The stop and monitoring commands are concise, well-documented, and follow consistent patterns with the startup section. Good integration with the overall multi-host deployment workflow.


310-343: SeaweedFS setup section is comprehensive and well-organized.

The distributed shared filesystem instructions are clear, with good separation of master/filer/volume server setup and appropriate external documentation references. The placeholder explanations help users adapt commands to their environments.


263-284: Using CLP section appropriately directs to quick-start guides.

The grid-based presentation of quick-start options is consistent with user docs style and provides clear next steps after multi-host deployment setup.


344-350: Reference links are well-sourced and correctly formatted.

Internal and external references use consistent Markdown syntax and point to authoritative sources (official Docker docs, SeaweedFS repository, internal design docs).

docs/src/user-docs/index.md (1)

64-64: Toctree additions correctly reference new guide files.

The two new entries (guides-external-database and guides-multi-host) are properly positioned in the Guides toctree and align with the new documentation files added in this PR. The ordering and syntax are consistent with existing toctree entries.

Also applies to: 66-66

docs/src/user-docs/guides-overview.md (2)

15-20: New external database guide card is properly integrated.

The grid-item-card for "External database setup" is correctly formatted, appropriately positioned among other guides, and uses clear descriptive text matching the PR objectives.


30-33: Multi-host guide reference correctly updated.

The rename from guides-multi-node to guides-multi-host with updated title ("Multi-host deployment") and description is consistent with the new multi-host guide file added in this PR.

docs/src/user-docs/guides-external-database.md (2)

1-10: Introduction appropriately frames the guide's purpose.

The warning block and introduction clearly explain that this guide is for users customizing their deployment with external database services, setting appropriate expectations and scope.


22-113: MariaDB/MySQL setup section is thorough and practical.

The section provides step-by-step installation instructions for Ubuntu, clear user creation with security notes, and practical AWS RDS guidance. Instructions are appropriate for both self-hosted and cloud-managed database deployments.

Comment thread docs/src/user-docs/guides-multi-host.md Outdated
Comment thread docs/src/user-docs/guides-multi-host.md
Comment thread docs/src/user-docs/guides-multi-host.md

All commands below assume you are running them from the root of the CLP package directory.


Member Author

  1. with the latest code, i believe the `--file docker-compose-all.yaml` argument should be removed
  2. some services use `--wait` while others use `--detach`. i believe that's to speed up the launches? if so, shall we add a note / tip / inline comment to tell users that the differences are intentional? (at least they should understand that `--detach` can't be used blindly)

Member

  1. Right. I didn't get a chance, so I left it for now. Feel free to test and remove.
  2. I thought the ones where we didn't have --wait before my changes were because those services didn't support health checks?

Member Author

> Feel free to test and remove.

sure. i will push directly

> I thought the ones where we didn't have --wait before my changes were because those services didn't support health checks?

my bad - i meant to add --wait to all services but somehow i didn't. do you think we can consistently use --wait to avoid confusing users?

Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed this notification. Yeah, let's consistently use --wait if it works.
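
To make the --wait versus --detach distinction in this thread concrete, here is a minimal sketch. It relies only on documented `docker compose up` behaviour: `--wait` blocks until the services are running or healthy and implies detached mode, while `--detach` returns as soon as the containers are started. The service names (`database`, `queue`), the helper functions, the `DRY_RUN` switch, and the use of `--no-deps` are assumptions made for illustration and are not taken from the guide.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Print the command instead of running it when DRY_RUN=1.
# DRY_RUN is a hypothetical switch so the sketch can be tried without Docker.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start a single service and block until it is running/healthy.
# --wait implies detached mode, so --detach is not needed alongside it.
start_and_wait() {
    run docker compose up --no-deps --wait "$1"
}

# Start a single service and return immediately, without checking readiness.
start_detached() {
    run docker compose up --no-deps --detach "$1"
}

# Hypothetical service names, shown in dry-run mode only.
DRY_RUN=1 start_and_wait database   # prints: docker compose up --no-deps --wait database
DRY_RUN=1 start_detached queue      # prints: docker compose up --no-deps --detach queue
```

Using one helper for every service keeps the flags consistent, which is the outcome agreed on above. Whether every service defines a health check that --wait can act on still needs to be verified against the actual Compose file.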

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

♻️ Duplicate comments (1)
docs/src/user-docs/guides-multi-host.md (1)

99-102: Repetitive sentence structure (already flagged in past review).

This comment echoes a previous review finding: lines 99–102 have four successive list items beginning with "Update", creating repetitive phrasing. Although this was marked as addressed in prior commits, the current code still exhibits the repetition.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b19c9b and 2a9b069.

📒 Files selected for processing (2)
  • docs/src/user-docs/guides-external-database.md (1 hunks)
  • docs/src/user-docs/guides-multi-host.md (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-22T04:01:43.409Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 939
File: components/package-template/src/etc/clp-config.yml:64-64
Timestamp: 2025-06-22T04:01:43.409Z
Learning: The webui server logging can be configured via LOG_LEVEL and WEBUI_LOGS_DIR environment variables, with file logging enabled automatically in non-TTY environments when a logs directory is specified.

Applied to files:

  • docs/src/user-docs/guides-multi-host.md
📚 Learning: 2025-09-25T05:13:13.298Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:217-223
Timestamp: 2025-09-25T05:13:13.298Z
Learning: The compression scheduler service in CLP runs with CLP_UID_GID (current user's UID:GID) rather than CLP_SERVICE_CONTAINER_UID_GID (999:999), unlike infrastructure services such as database, queue, redis, and results cache which run with the service container UID:GID.

Applied to files:

  • docs/src/user-docs/guides-multi-host.md
🪛 LanguageTool
docs/src/user-docs/guides-external-database.md

[uncategorized] ~3-~3: Possible missing preposition found.
Context: ...ses for CLP instead of using the Docker Compose managed databases. If the host(s) on wh...

(AI_HYDRA_LEO_MISSING_TO)


[grammar] ~72-~72: The first clause in a conditional statement is not usually in the future tense. Consider removing the modal verb “will”.
Context: ...r remote connections If CLP components will connect from a different host, you need to conf...

(CONDITIONAL_CLAUSE)


[grammar] ~130-~130: The first clause in a conditional statement is not usually in the future tense. Consider removing the modal verb “will”.
Context: ...r remote connections If CLP components will connect from a different host: 1. Edit the Mon...

(CONDITIONAL_CLAUSE)

docs/src/user-docs/guides-multi-host.md

[style] ~101-~101: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...if you changed the database port. * Update MongoDbHost to the actual hostname or...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~102-~102: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ss of your results cache service. * Update MongoDbPort if you changed the result...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build (macos-15)
  • GitHub Check: lint-check (macos-15)
  • GitHub Check: lint-check (ubuntu-24.04)

Comment on lines +72 to +73
If CLP components will connect from a different host, you need to configure MariaDB to accept remote
connections:

Contributor

⚠️ Potential issue | 🟡 Minor

Fix conditional clause grammar (future tense → present tense).

Line 72 uses future tense in the conditional clause. Rephrase to present tense for grammatical correctness:

-If CLP components will connect from a different host, you need to configure MariaDB to accept remote
+If CLP components connect from a different host, you need to configure MariaDB to accept remote
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-If CLP components will connect from a different host, you need to configure MariaDB to accept remote
-connections:
+If CLP components connect from a different host, you need to configure MariaDB to accept remote
+connections:

🤖 Prompt for AI Agents
In docs/src/user-docs/guides-external-database.md around lines 72 to 73, change
the conditional clause from future tense ("If CLP components will connect from a
different host, you need to configure...") to present tense—e.g., "If CLP
components connect from a different host, you need to configure MariaDB to
accept remote connections"—so the sentence uses present-tense grammar while
preserving the original meaning.

Comment on lines +130 to +131
If CLP components will connect from a different host:

Contributor

⚠️ Potential issue | 🟡 Minor

Fix conditional clause grammar (future tense → present tense).

Line 130 repeats the same conditional grammar issue from line 72. Use present tense for grammatical correctness:

-If CLP components will connect from a different host:
+If CLP components connect from a different host:
📝 Committable suggestion


Suggested change
-If CLP components will connect from a different host:
+If CLP components connect from a different host:

🤖 Prompt for AI Agents
In docs/src/user-docs/guides-external-database.md around lines 130-131, the
conditional clause uses future tense ("If CLP components will connect from a
different host:"); change it to present tense to match line 72 and correct
grammar—e.g., "If CLP components connect from a different host:"—and adjust any
matching occurrences to use present tense.

@kirkrodrigues kirkrodrigues left a comment

Member

For the PR title, how about:

docs(deployment): Add multi-host deployment and external database setup guides.

@kirkrodrigues kirkrodrigues changed the title from "docs(deployment): Add multi-node deployment guide and external database setup reference." to "docs(deployment): Add multi-host deployment and external database setup guides." on Nov 9, 2025
@kirkrodrigues kirkrodrigues merged commit a83e156 into y-scope:main Nov 9, 2025
9 checks passed
@junhaoliao junhaoliao deleted the multi-node-doc branch May 7, 2026 19:46
junhaoliao added a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…up guides. (y-scope#1532)

Co-authored-by: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com>