
Inventory: Add more relevant general and specific information#41

Merged
amotl merged 5 commits into main from inventory-cratedb-sql-more
May 19, 2025

Conversation

@amotl
Member

@amotl amotl commented May 18, 2025

About

While curating the outline/index, we found a few important items missing. They should be included in an essential CrateDB knowledge base, which also serves as a convenient sidekick when writing tutorials about CrateDB.

Details

  • partitioned tables
  • column storage
  • replication
  • views
  • {CREATE,ALTER} [FOREIGN] TABLE [AS]
  • COPY {FROM,TO}
  • SQL data types
  • Importing example datasets using the cratedb_toolkit.datasets.load_dataset API
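
For the last item, a minimal sketch of what such an import might look like. The call shape follows the CrateDB Toolkit documentation as far as we know and should be verified against the installed release; the helper name `import_example_dataset` is hypothetical, and the dataset name, table name, and connection URI in the usage note are placeholders.

```python
def import_example_dataset(name: str, dburi: str, table: str) -> None:
    """Load a named example dataset into a CrateDB table."""
    # Imported lazily, because it requires `pip install cratedb-toolkit`.
    from cratedb_toolkit.datasets import load_dataset

    # Resolve the dataset by name, then load it into the target table.
    dataset = load_dataset(name)
    dataset.dbtable(dburi=dburi, table=table).load()
```

A call such as `import_example_dataset("tutorial/weather-basic", "crate://crate@localhost/", "weather_data")` would need a reachable CrateDB instance.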

/cc @surister, @kneth, @bmunkholm

@coderabbitai

coderabbitai bot commented May 18, 2025

Walkthrough

This update reorganizes and expands the CrateDB documentation outline by adding new reference and guide entries, particularly for partitioned tables, storage, replication, views, SQL commands, and data types. It also introduces instructions for importing example datasets and updates the changelog and backlog to reflect these documentation enhancements.

Changes

| File(s) | Change Summary |
|---|---|
| CHANGES.md | Updated the "Unreleased" changelog with new entries about inventory, partitioned tables, storage, replication, views, SQL commands, data types, and importing example datasets. |
| docs/backlog.md | Added a new completed backlog item for documenting DDL/views, COPY FROM ..., and ctk load table for data import. |
| src/cratedb_about/outline/cratedb-outline.yaml | Reorganized and expanded documentation outline: added references for partitioned tables, storage, replication, views, SQL commands, data types, and example dataset import; grouped performance guides. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant DocsOutline
    participant ReferenceDocs
    participant ExamplesSection

    User->>DocsOutline: Access documentation outline
    DocsOutline->>ReferenceDocs: Display new entries (Partitioned tables, Storage, Replication, Views, SQL commands, Data types)
    DocsOutline->>ExamplesSection: Show "Import example datasets" instructions
    User->>ReferenceDocs: Navigate to detailed guides
    User->>ExamplesSection: Follow dataset import instructions

Suggested reviewers

  • bmunkholm
  • kneth
  • surister

Poem

In docs we burrow, deep and wide,
With tables, views, and types inside.
Partitioned tales and storage lore,
Replication guides—there’s always more!
SQL commands now clearly shown,
And sample data seeds are sown.
🐇 Hopping through knowledge, we make it our own!


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3e3a8da and 36b3b57.

📒 Files selected for processing (2)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/outline/cratedb-outline.yaml (4 hunks)
✅ Files skipped from review due to trivial changes (1)
  • docs/backlog.md
🔇 Additional comments (5)
src/cratedb_about/outline/cratedb-outline.yaml (5)

197-231: New "General information" section looks solid.

All added entries for Partitioned tables, Storage, Replication, and Views follow the existing pattern (titles, descriptions, links, html, parents). This fills a gap in the outline by surfacing key DDL concepts.
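
The pattern check described above can be sketched as a small validator. The field names are inferred from this review text rather than read from the YAML file, and the sample entry and its URLs are hypothetical.

```python
# Field names are assumed from the review text, not taken from the YAML file.
REQUIRED_FIELDS = ("title", "description", "link", "html", "parents")


def missing_fields(entry: dict) -> list[str]:
    """Return the required fields that are absent or empty in an outline entry."""
    return [name for name in REQUIRED_FIELDS if not entry.get(name)]


# Hypothetical entry, for illustration only.
entry = {
    "title": "Partitioned tables",
    "description": "A partition is similar to a regular single table.",
    "link": "https://example.org/partitioned-tables.md",
    "html": "https://example.org/partitioned-tables.html",
    "parents": ["reference"],
}

print(missing_fields(entry))
print(missing_fields({"title": "Views"}))
```

An entry that follows the pattern yields an empty list; anything else names the fields still to be filled in.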


246-261: "Performance guidelines" section correctly consolidates guides.

The relocated guides for sharding, query optimization, and design for scale are now under a dedicated Performance section, improving discoverability and logical grouping.


381-432: SQL commands reference updated comprehensively.

New entries for CREATE TABLE, CREATE TABLE AS, CREATE FOREIGN TABLE, ALTER TABLE, COPY FROM, and COPY TO are added as expected. Parentage ([ sql-syntax ]) and tags ([ sql ]) align with the broader SQL reference section.


433-444: SQL data types reference inclusion is correct.

The Data types entry is properly nested under SQL syntax, with link, html, description, and tags in place. This addresses the PR objective to document CrateDB’s data types.


648-657: Added "Toolkit Example Datasets" example entry.

The CrateDB Toolkit import dataset section accurately documents the cratedb_toolkit.datasets.load_dataset API with proper links and descriptive text. This completes the examples coverage for dataset loading.

@amotl amotl marked this pull request as ready for review May 18, 2025 16:53

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
docs/backlog.md (1)

43-44: Backlog: Add missing inventory items for SQL data types and example datasets.

The new “Done” entry tracks DDL/views and COPY support but omits tasks for SQL data types and importing example datasets via the CrateDB Toolkit (both of which are covered in this PR).
Consider adding separate backlog items to reflect those completed tasks for completeness and traceability.

src/cratedb_about/outline/cratedb-outline.yaml (1)

648-658: Examples: Correct the inline comment URL.

The comment above the “Import example datasets” entry still points to the cluster documentation. Update or remove the comment to reference the datasets page (e.g., https://cratedb-toolkit.readthedocs.io/datasets.html) to avoid confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 68fe7b7 and 0d936ee.

📒 Files selected for processing (3)
  • CHANGES.md (1 hunks)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/outline/cratedb-outline.yaml (4 hunks)
🔇 Additional comments (5)
src/cratedb_about/outline/cratedb-outline.yaml (4)

197-231: General information: Approve partitioned tables, storage, replication, and views entries.

The new reference items for partitioned tables, column storage, replication, and views are well-structured, include both link and html fields, and correctly reference the reference parent. This fills a major gap in the outline.


246-261: Performance guidelines: Approve relocated section.

The “Performance guidelines” block has been moved to follow “Data modeling,” matching the AI summary and improving logical grouping of related guides.


381-432: SQL: Commands: Approve new command references.

The additions under “# SQL: Commands” cover key DDL and data-loading statements (CREATE TABLE, ALTER TABLE, COPY FROM/TO, etc.). Entry metadata (links, HTML, parents, tags) is consistent with existing patterns.


433-444: SQL: Data types: Approve new entry.

The “Data types” reference is correctly added under sql-syntax, with proper link, HTML URL, description, parents, and [ sql ] tag.

CHANGES.md (1)

5-9: CHANGES: Approve new Inventory entries.

The changelog correctly documents the addition of general info on partitioned tables, storage, replication, views, SQL commands (CREATE, ALTER, COPY), SQL data types, and example-dataset import via the CrateDB Toolkit.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (2)
src/cratedb_about/outline/cratedb-outline.yaml (2)

208-215: Fix minor grammatical issue in Storage description.
The phrase “in conjunction of turning off indexing” should read “in conjunction with turning off indexing” for clarity and correctness.

- Turning off the Column Store in conjunction of turning off indexing will remove the length limitation.
+ Turning off the Column Store in conjunction with turning off indexing will remove the length limitation.
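
To illustrate the sentence being corrected, here is a minimal sketch that assembles a column definition with both indexing and the column store turned off. The helper `text_column` and the table and column names are hypothetical; the `INDEX OFF` and `STORAGE WITH (columnstore = false)` clauses follow CrateDB's `CREATE TABLE` syntax as we understand it.

```python
def text_column(name: str, columnstore: bool = True, indexed: bool = True) -> str:
    """Build a TEXT column definition for a CREATE TABLE statement."""
    parts = [name, "TEXT"]
    if not indexed:
        parts.append("INDEX OFF")
    if not columnstore:
        parts.append("STORAGE WITH (columnstore = false)")
    return " ".join(parts)


# Hypothetical table and column names.
columns = ["id INTEGER", text_column("payload", columnstore=False, indexed=False)]
ddl = "CREATE TABLE example (" + ", ".join(columns) + ")"
print(ddl)
```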

381-431: Consider reordering CREATE FOREIGN TABLE in the SQL Commands list.
For logical consistency, group all CREATE statements together before ALTER and COPY commands. For example:

# SQL: Commands
- CREATE TABLE
- CREATE TABLE AS
- CREATE FOREIGN TABLE   ← move here
- ALTER TABLE
- COPY FROM
- COPY TO
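
The proposed grouping can be sketched as a stable sort on the leading keyword. The input order below is an assumption for illustration; the review does not show the order currently used in the file.

```python
# Lower rank sorts first: CREATE, then ALTER, then COPY.
VERB_RANK = {"CREATE": 0, "ALTER": 1, "COPY": 2}

# Assumed starting order, with CREATE FOREIGN TABLE out of place.
commands = [
    "CREATE TABLE",
    "CREATE TABLE AS",
    "ALTER TABLE",
    "COPY FROM",
    "COPY TO",
    "CREATE FOREIGN TABLE",
]

# sorted() is stable, so commands sharing a verb keep their relative order.
ordered = sorted(commands, key=lambda cmd: VERB_RANK[cmd.split()[0]])
print(ordered)
```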
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d936ee and 3e3a8da.

📒 Files selected for processing (2)
  • docs/backlog.md (1 hunks)
  • src/cratedb_about/outline/cratedb-outline.yaml (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/backlog.md
🔇 Additional comments (6)
src/cratedb_about/outline/cratedb-outline.yaml (6)

197-207: Approve addition of Partitioned Tables reference entry.
The description clearly explains partitioned tables, links are accurate, and placement before “Data modeling” follows the intended outline structure.


216-223: Approve addition of Replication reference entry.
The replication entry accurately describes table replication behavior, the links are correct, and parents: [ reference ] is consistent with other reference docs.


224-231: Approve addition of Views reference entry.
The views entry succinctly covers stored queries and privilege nuances; links and hierarchy are spot on.


246-261: Approve grouping under new “Performance guidelines” section.
Relocating the sharding, query optimization, and scaling guides into their own section enhances discoverability and aligns with the reorganized outline.


433-443: Approve addition of SQL Data Types reference entry.
This entry comprehensively covers the role of data types in CrateDB, and the provided links and hierarchy are correct.


648-657: Approve addition of Toolkit Example Datasets entry.
This new example aligns perfectly with the PR objectives, and the description correctly highlights the cratedb_toolkit.datasets.load_dataset API usage.

@amotl amotl force-pushed the inventory-cratedb-sql-more branch from 3e3a8da to 596ee08 Compare May 18, 2025 17:33
A partition is similar to a regular single table and consists of one or more shards.
A table becomes a partitioned table by defining partition columns.
When a record with a new distinct combination of values for the configured partition
columns is inserted, a new partition is created and the document will be inserted into this partition.


Suggested change
- columns is inserted, a new partition is created and the document will be inserted into this partition.
+ columns is inserted, a new partition is created, and the document is inserted into this new partition.
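
The behaviour this description refers to can be illustrated with a toy model: one partition per distinct combination of partition column values. This is only a sketch in plain Python, not how CrateDB implements partitions; it ignores shards entirely, and the class, column names, and records are hypothetical.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy model: one partition per distinct tuple of partition column values."""

    def __init__(self, partition_columns):
        self.partition_columns = tuple(partition_columns)
        self.partitions = defaultdict(list)

    def insert(self, record: dict) -> bool:
        """Insert a record; return True if this created a new partition."""
        key = tuple(record[col] for col in self.partition_columns)
        created = key not in self.partitions
        self.partitions[key].append(record)
        return created


table = PartitionedTable(["region", "day"])
first = table.insert({"region": "eu", "day": "2025-05-18", "value": 1})
second = table.insert({"region": "eu", "day": "2025-05-18", "value": 2})
third = table.insert({"region": "eu", "day": "2025-05-19", "value": 3})
print(first, second, third, len(table.partitions))
```

Only the first and third inserts create a partition, because the second record repeats an existing combination of `region` and `day`.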

@amotl amotl merged commit fe3ce1a into main May 19, 2025
6 checks passed
@amotl amotl deleted the inventory-cratedb-sql-more branch May 19, 2025 21:09