Skip to content

docs(kv-ir): Add background necessary to understand the kv-ir stream format.#833

Merged
kirkrodrigues merged 42 commits into
y-scope:mainfrom
kirkrodrigues:kv-ir-stream-docs
May 6, 2025
Merged

docs(kv-ir): Add background necessary to understand the kv-ir stream format.#833
kirkrodrigues merged 42 commits into
y-scope:mainfrom
kirkrodrigues:kv-ir-stream-docs

Conversation

@kirkrodrigues

@kirkrodrigues kirkrodrigues commented Apr 17, 2025

Copy link
Copy Markdown
Member

Description

This PR adds docs explaining any background information necessary to explain the kv-ir stream format. Future PRs will flesh out the format itself.

It takes over where #707 left off.

The theme for Mermaid flowcharts used in this PR is slightly different than the one used in existing flowcharts, so they've also been updated for consistency.

Note that there is seemingly no way to set the theme globally due to some kind of bug in sphinxcontrib-mermaid (mermaid_init_js doesn't work).

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

Validated the new pages render as expected in both light and dark modes.

Summary by CodeRabbit

Summary by CodeRabbit

  • Documentation
    • Added new documentation introducing the KV-IR stream format and its background, including detailed explanations of compression, schema encoding, and storage processes.
    • Updated the developer guide index to include the new KV-IR design documentation.
    • Changed the colour scheme of workflow diagrams for improved visual clarity.

@coderabbitai

coderabbitai Bot commented Apr 17, 2025

Copy link
Copy Markdown
Contributor

Walkthrough

This update introduces new documentation for the KV-IR (key-value internal representation) stream format within the developer guide. It adds a detailed background document explaining the design and compression methods behind KV-IR, including how schemas and values are encoded and serialized. The documentation index is updated to include the new design section. Additionally, a visual style change is made to the Mermaid diagram theme in an existing tooling documentation file, updating specific colour values.

Changes

File(s) Change Summary
docs/src/dev-guide/design-kv-ir-streams/background.md New document providing an in-depth background on the KV-IR stream format, including schema computation, value encoding, serialization, and references to original research.
docs/src/dev-guide/design-kv-ir-streams/index.md New documentation introducing the KV-IR stream format, its advantages, and a placeholder for background information. Includes a reference to the new background document.
docs/src/dev-guide/index.md Updated developer guide index to include the new KV-IR stream format design documentation in the hidden "Design" section.
docs/src/dev-guide/tooling-gh-workflows.md Changed Mermaid diagram theme colours: updated "lineColor" and "secondaryColor" from "#9580ff" to "#007fff" to alter diagram appearance.

Sequence Diagram(s)

sequenceDiagram
    participant LogEvent
    participant SchemaComputation
    participant SchemaEncoding
    participant ValueEncoding
    participant ArchiveSerialization
    participant Disk

    LogEvent->>SchemaComputation: Compute schema (value-types, structure)
    SchemaComputation->>SchemaEncoding: Encode schema, assign node IDs
    SchemaEncoding->>ValueEncoding: Encode event values (various types)
    ValueEncoding->>ArchiveSerialization: Buffer and organize encoded data
    ArchiveSerialization->>Disk: Serialize and write archive to disk
Loading
sequenceDiagram
    participant UnstructuredText
    participant CLPAlgorithm
    participant FormatString
    participant StringVars
    participant EncodedVars

    UnstructuredText->>CLPAlgorithm: Parse and encode text
    CLPAlgorithm->>FormatString: Extract format string with placeholders
    CLPAlgorithm->>StringVars: Extract string variable values
    CLPAlgorithm->>EncodedVars: Encode variable values
Loading

Suggested reviewers

  • junhaoliao

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04abba2 and 781d699.

📒 Files selected for processing (1)
  • docs/src/dev-guide/design-kv-ir-streams/background.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/dev-guide/design-kv-ir-streams/background.md

[uncategorized] ~63-~63: A punctuation mark might be missing here.
Context: ...ect itself, and has no key. (figure-1)= :::{card} ```json lines { "timestamp"...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~92-~92: A punctuation mark might be missing here.
Context: ...**: Two JSON log events. ::: (table-1)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~109-~109: A punctuation mark might be missing here.
Context: ... in Table 3. ::: (table-2)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~125-~125: A punctuation mark might be missing here.
Context: ...n Figure 1. ::: (table-3)= :::{card} | clp-s value-type | Descrip...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[duplication] ~172-~172: Possible typo: you repeated a word.
Context: ... rootObj --> levelVarStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~239-~239: Possible typo: you repeated a word.
Context: ...: #97ff00'>Integer") root --> metadataNamespaceRoot metadataNamespaceRoot --> logEventIdxInt root --> defaultNa...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~241-~241: Possible typo: you repeated a word.
Context: ...spaceRoot --> logEventIdxInt root --> defaultNamespaceRootObj defaultNamespaceRootObj --> timestampInt defaultNamespaceRoot...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~244-~244: Possible typo: you repeated a word.
Context: ...velVarStr defaultNamespaceRootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[style] ~268-~268: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...-s encodes each component separately. * For NullValue values, clp-s doesn't need ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.17.2)
docs/src/dev-guide/design-kv-ir-streams/background.md

42-42: Link fragments should be valid
null

(MD051, link-fragments)


42-42: Link fragments should be valid
null

(MD051, link-fragments)


43-43: Link fragments should be valid
null

(MD051, link-fragments)


44-44: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


105-105: Link fragments should be valid
null

(MD051, link-fragments)


106-106: Link fragments should be valid
null

(MD051, link-fragments)


122-122: Link fragments should be valid
null

(MD051, link-fragments)


162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)


178-178: Link fragments should be valid
null

(MD051, link-fragments)


190-190: Link fragments should be valid
null

(MD051, link-fragments)


191-191: Link fragments should be valid
null

(MD051, link-fragments)


195-195: Link fragments should be valid
null

(MD051, link-fragments)


227-227: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


251-251: Link fragments should be valid
null

(MD051, link-fragments)


259-259: Line length
Expected: 80; Actual: 100

(MD013, line-length)


260-260: Line length
Expected: 80; Actual: 98

(MD013, line-length)


261-261: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Link fragments should be valid
null

(MD051, link-fragments)


265-265: Line length
Expected: 80; Actual: 98

(MD013, line-length)


268-268: Line length
Expected: 80; Actual: 98

(MD013, line-length)


269-269: Line length
Expected: 80; Actual: 100

(MD013, line-length)


275-275: Line length
Expected: 80; Actual: 93

(MD013, line-length)


277-277: Line length
Expected: 80; Actual: 93

(MD013, line-length)


278-278: Line length
Expected: 80; Actual: 93

(MD013, line-length)


279-279: Line length
Expected: 80; Actual: 93

(MD013, line-length)


280-280: Line length
Expected: 80; Actual: 93

(MD013, line-length)


281-281: Line length
Expected: 80; Actual: 93

(MD013, line-length)


282-282: Line length
Expected: 80; Actual: 93

(MD013, line-length)


283-283: Line length
Expected: 80; Actual: 93

(MD013, line-length)


284-284: Line length
Expected: 80; Actual: 93

(MD013, line-length)


285-285: Line length
Expected: 80; Actual: 93

(MD013, line-length)


286-286: Line length
Expected: 80; Actual: 93

(MD013, line-length)


287-287: Line length
Expected: 80; Actual: 93

(MD013, line-length)


293-293: Line length
Expected: 80; Actual: 98

(MD013, line-length)


294-294: Line length
Expected: 80; Actual: 95

(MD013, line-length)


295-295: Line length
Expected: 80; Actual: 96

(MD013, line-length)


296-296: Line length
Expected: 80; Actual: 97

(MD013, line-length)


297-297: Line length
Expected: 80; Actual: 97

(MD013, line-length)


298-298: Line length
Expected: 80; Actual: 98

(MD013, line-length)


303-303: Line length
Expected: 80; Actual: 97

(MD013, line-length)


304-304: Line length
Expected: 80; Actual: 97

(MD013, line-length)


305-305: Line length
Expected: 80; Actual: 94

(MD013, line-length)


306-306: Line length
Expected: 80; Actual: 94

(MD013, line-length)


311-311: Line length
Expected: 80; Actual: 99

(MD013, line-length)


312-312: Line length
Expected: 80; Actual: 100

(MD013, line-length)


313-313: Line length
Expected: 80; Actual: 91

(MD013, line-length)


314-314: Line length
Expected: 80; Actual: 100

(MD013, line-length)


315-315: Line length
Expected: 80; Actual: 99

(MD013, line-length)


320-320: Line length
Expected: 80; Actual: 91

(MD013, line-length)


321-321: Line length
Expected: 80; Actual: 91

(MD013, line-length)


322-322: Line length
Expected: 80; Actual: 94

(MD013, line-length)


322-322: Link fragments should be valid
null

(MD051, link-fragments)


323-323: Line length
Expected: 80; Actual: 99

(MD013, line-length)


324-324: Line length
Expected: 80; Actual: 97

(MD013, line-length)


327-327: Line length
Expected: 80; Actual: 96

(MD013, line-length)


329-329: Line length
Expected: 80; Actual: 93

(MD013, line-length)


331-331: Line length
Expected: 80; Actual: 98

(MD013, line-length)


341-341: Line length
Expected: 80; Actual: 98

(MD013, line-length)


342-342: Line length
Expected: 80; Actual: 90

(MD013, line-length)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@kirkrodrigues kirkrodrigues changed the title WIP: docs: Add kv-ir stream background and specification. docs(kv-ir): Add background necessary to understand kv-ir stream format. May 3, 2025
@kirkrodrigues kirkrodrigues marked this pull request as ready for review May 3, 2025 02:54
@kirkrodrigues kirkrodrigues requested a review from a team as a code owner May 3, 2025 02:54

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🔭 Outside diff range comments (1)
docs/src/dev-guide/design-kv-ir-streams/background.md (1)

314-316: 🧹 Nitpick (assertive)

Review bibliography link ordering
The references [clp-paper] and [clp-s-paper] appear at the end. Confirm they render in the intended sequence, and consider alphabetizing or adding a brief comment to clarify their order.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1f70d17 and 15bca78.

📒 Files selected for processing (4)
  • docs/src/dev-guide/design-kv-ir-streams/background.md (1 hunks)
  • docs/src/dev-guide/design-kv-ir-streams/index.md (1 hunks)
  • docs/src/dev-guide/index.md (1 hunks)
  • docs/src/dev-guide/tooling-gh-workflows.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/dev-guide/design-kv-ir-streams/index.md

[duplication] ~14-~14: Possible typo: you repeated a word.
Context: ... :gutter: 2 :::{grid-item-card} 🔗 background Background ^^^ Any necessary information to unders...

(ENGLISH_WORD_REPEAT_RULE)

docs/src/dev-guide/design-kv-ir-streams/background.md

[uncategorized] ~63-~63: A punctuation mark might be missing here.
Context: ...ect itself, and has no key. (figure-1)= :::{card} ```json lines { "timestamp"...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~92-~92: A punctuation mark might be missing here.
Context: ...**: Two JSON log events. ::: (table-1)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~109-~109: A punctuation mark might be missing here.
Context: ... in Table 3. ::: (table-2)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~125-~125: A punctuation mark might be missing here.
Context: ...n Figure 1. ::: (table-3)= :::{card} | clp-s value-type | Descrip...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[duplication] ~173-~173: Possible typo: you repeated a word.
Context: ...rootObj --> messageClpStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~186-~186: Possible missing comma found.
Context: ... archive's tree. (The leaf node IDs are sufficient since we can determine the predecessor ...

(AI_HYDRA_LEO_MISSING_COMMA)


[duplication] ~226-~226: Possible typo: you repeated a word.
Context: ...rootObj --> messageClpStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[style] ~238-~238: Consider using a synonym to be more concise.
Context: ... values For each log event, clp-s uses a variety of encoding methods to compactly encode it...

(A_VARIETY_OF)


[style] ~251-~251: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...754 double-precision format. * Booleans are encoded as an integer. * `VarString...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~252-~252: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...are encoded as an integer. * VarStrings are encoded using dictionary-encoding--...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~262-~262: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...encoded as the integer 0. VarStrings share a dictionary with the string vari...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.17.2)
docs/src/dev-guide/design-kv-ir-streams/background.md

42-42: Link fragments should be valid
null

(MD051, link-fragments)


42-42: Link fragments should be valid
null

(MD051, link-fragments)


43-43: Link fragments should be valid
null

(MD051, link-fragments)


44-44: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


105-105: Link fragments should be valid
null

(MD051, link-fragments)


106-106: Link fragments should be valid
null

(MD051, link-fragments)


122-122: Link fragments should be valid
null

(MD051, link-fragments)


162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)


178-178: Link fragments should be valid
null

(MD051, link-fragments)


188-188: Link fragments should be valid
null

(MD051, link-fragments)


189-189: Link fragments should be valid
null

(MD051, link-fragments)


214-214: Inline HTML
Element: span

(MD033, no-inline-html)


214-214: Inline HTML
Element: span

(MD033, no-inline-html)


215-215: Inline HTML
Element: span

(MD033, no-inline-html)


215-215: Inline HTML
Element: span

(MD033, no-inline-html)


216-216: Inline HTML
Element: span

(MD033, no-inline-html)


216-216: Inline HTML
Element: span

(MD033, no-inline-html)


217-217: Inline HTML
Element: span

(MD033, no-inline-html)


217-217: Inline HTML
Element: span

(MD033, no-inline-html)


218-218: Inline HTML
Element: span

(MD033, no-inline-html)


218-218: Inline HTML
Element: span

(MD033, no-inline-html)


219-219: Inline HTML
Element: span

(MD033, no-inline-html)


219-219: Inline HTML
Element: span

(MD033, no-inline-html)


220-220: Inline HTML
Element: span

(MD033, no-inline-html)


220-220: Inline HTML
Element: span

(MD033, no-inline-html)


221-221: Inline HTML
Element: span

(MD033, no-inline-html)


221-221: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Link fragments should be valid
null

(MD051, link-fragments)


287-287: Link fragments should be valid
null

(MD051, link-fragments)

🔇 Additional comments (5)
docs/src/dev-guide/design-kv-ir-streams/background.md (3)

1-11: Validate internal link anchors in introduction
The links to #clp-s-compression and #parsing--encoding-unstructured-text must exactly match the generated HTML anchor IDs for their corresponding headings. Please preview or run your Markdown renderer to confirm these fragments resolve correctly and avoid broken links.


14-14: Confirm relative path for clp-s reference
The link [clp-s](../../user-guide/core-clp-s.md) appears correct, but please double-check that this relative path resolves in the final site build.


8-8: Ensure “Parsing & encoding unstructured text” anchor consistency
You’re linking to #parsing--encoding-unstructured-text, yet the header is ## Parsing & encoding unstructured text. Verify the exact slug (likely parsing-encoding-unstructured-text) and update either the link or the header to keep them in sync.

Also applies to: 283-283

docs/src/dev-guide/index.md (1)

78-86: Approve design toctree update
The addition of design-kv-ir-streams/index under the “Design” section correctly integrates the new KV-IR documentation into the developer guide.

docs/src/dev-guide/design-kv-ir-streams/index.md (1)

1-9: Approve KV-IR streams overview
The introductory section clearly outlines the KV-IR format’s goals, trade-offs, and use cases. It’s concise and well-focused.

Comment on lines +24 to +25
"lineColor": "#007fff",
"secondaryColor": "#007fff",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Verify colour contrast and consistency
The lineColor and secondaryColor values have been updated to #007fff. Please confirm these colours maintain sufficient contrast per WCAG guidelines and remain consistent with the project’s branding across all Mermaid diagrams.

Comment thread docs/src/dev-guide/design-kv-ir-streams/background.md Outdated
Comment on lines +147 to +177
:::{mermaid}
%%{
init: {
"theme": "base",
"themeVariables": {
"primaryColor": "#0066cc",
"primaryTextColor": "#fff",
"primaryBorderColor": "transparent",
"lineColor": "#007fff",
"secondaryColor": "#007fff",
"tertiaryColor": "#fff"
}
}
}%%
flowchart LR
rootObj("root: <span style='color: #97ff00'>Object</span>")
timestampInt("timestamp: <span style='color: #97ff00'>Integer</span>")
levelVarStr("level: <span style='color: #97ff00'>VarString</span>")
messageClpStr("message: <span style='color: #97ff00'>ClpString</span>")
timersObj("timers: <span style='color: #97ff00'>Object</span>")
timersStage1Float("stage_1: <span style='color: #97ff00'>Float</span>")
timersStage2Null("stage_2: <span style='color: #97ff00'>NullValue</span>")

rootObj --> timestampInt
rootObj --> levelVarStr
rootObj --> messageClpStr
rootObj --> timersObj
timersObj --> timersStage1Float
timersObj --> timersStage2Null
:::
+++

@coderabbitai coderabbitai Bot May 3, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Avoid inline HTML in Mermaid diagrams
Inline <span> tags trigger MD033. To keep linting green, consider using Mermaid’s theme variables or annotations instead of embedded HTML, or explicitly disable that lint rule for these blocks.

🧰 Tools
🪛 LanguageTool

[duplication] ~173-~173: Possible typo: you repeated a word.
Context: ...rootObj --> messageClpStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)

🪛 markdownlint-cli2 (0.17.2)

162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can I use Mermaid's theme variables or annotations?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kirkrodrigues You can use Mermaid's classDef and class directives to style nodes instead of inline HTML. Here's how you can modify your diagram:

%%{init:{"theme":"base","themeVariables":{"primaryColor":"#0066cc","primaryTextColor":"#fff","primaryBorderColor":"transparent","lineColor":"#007fff","secondaryColor":"#007fff","tertiaryColor":"#fff"}}}%%
flowchart LR
  rootObj["root: Object"]:::objectType
  timestampInt["timestamp: Integer"]:::intType
  levelVarStr["level: VarString"]:::stringType
  messageClpStr["message: ClpString"]:::stringType
  timersObj["timers: Object"]:::objectType
  timersStage1Float["stage_1: Float"]:::floatType
  timersStage2Null["stage_2: NullValue"]:::nullType

  rootObj --> timestampInt
  rootObj --> levelVarStr
  rootObj --> messageClpStr
  rootObj --> timersObj
  timersObj --> timersStage1Float
  timersObj --> timersStage2Null

  classDef objectType fill:#fff,stroke:#333,color:#97ff00
  classDef intType     fill:#fff,stroke:#333,color:#97ff00
  classDef stringType  fill:#fff,stroke:#333,color:#97ff00
  classDef floatType   fill:#fff,stroke:#333,color:#97ff00
  classDef nullType    fill:#fff,stroke:#333,color:#97ff00
Loading

This approach:

  1. Defines classes using classDef (at the bottom)
  2. Applies classes to nodes using the :::className syntax
  3. Achieves the same colored text but without inline HTML

You can also use a single class for all nodes that share the same color:

  %% Alternative approach with one c```mermaid
%% Alternative approach with one class
classDiagram
```classDiagram
  classDef typeHighlight fill:#fff,stroke:#333,color:#97ff00
  class rootObj,timestampInt,levelVarStr,messageClpStr,timersObj,timersStage1Float,timersStage2Null typeHighlight
```g your desired styling.


<!-- <agentComponentType>chat-answer</agentComponentType> -->

<!-- <rawResChunk><answer>@kirkrodrigues You can use Mermaid's classDef and class directives to style nodes instead of inline HTML. Here's how you can modify your diagram:

```mermaid
%%{init:{"theme":"base","themeVariables":{"primaryColor":"#0066cc","primaryTextColor":"#fff","primaryBorderColor":"transparent","lineColor":"#007fff","secondaryColor":"#007fff","tertiaryColor":"#fff"}}}%%
flowchart LR
  rootObj["root: Object"]:::objectType
  timestampInt["timestamp: Integer"]:::intType
  levelVarStr["level: VarString"]:::stringType
  messageClpStr["message: ClpString"]:::stringType
  timersObj["timers: Object"]:::objectType
  timersStage1Float["stage_1: Float"]:::floatType
  timersStage2Null["stage_2: NullValue"]:::nullType

  rootObj --&gt; timestampInt
  rootObj --&gt; levelVarStr
  rootObj --&gt; messageClpStr
  rootObj --&gt; timersObj
  timersObj --&gt; timersStage1Float
  timersObj --&gt; timersStage2Null

  classDef objectType fill:#fff,stroke:#333,color:#97ff00
  classDef intType     fill:#fff,stroke:#333,color:#97ff00
  classDef stringType  fill:#fff,stroke:#333,color:#97ff00
  classDef floatType   fill:#fff,stroke:#333,color:#97ff00
  classDef nullType    fill:#fff,stroke:#333,color:#97ff00
Loading

This approach:

  1. Defines classes using classDef (at the bottom)
  2. Applies classes to nodes using the :::className syntax
  3. Achieves the same colored text but without inline HTML

You can also use a single class for all nodes that share the same color:

  %% Alternative approach with one c```mermaid
%% Alternative approach with one class
classDiagram
```classDiagram
  classDef typeHighlight fill:#fff,stroke:#333,color:#97ff00
  class rootObj,timestampInt,levelVarStr,messageClpStr,timersObj,timersStage1Float,timersStage2Null typeHighlight
```g your desired styling.</answer></rawResChunk> -->

<!-- This is an auto-generated reply by CodeRabbit -->
Loading

Comment on lines +25 to +29
:::{toctree}
:hidden:

background
:::

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Add blank line before toctree for MyST
Ensure there’s an empty line preceding the :::{toctree} directive so MyST parses it cleanly and avoids rendering glitches.

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds documentation to explain the background and design of the new KV-IR stream format. The changes include:

  • Updating the theme colors in the GitHub workflows documentation for consistent diagram rendering.
  • Adding an entry to the developer guide index for the KV-IR design documentation.
  • Introducing a new KV-IR streams documentation page along with its background explanation.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
docs/src/dev-guide/tooling-gh-workflows.md Updated theme colors for Mermaid flowcharts to ensure consistency.
docs/src/dev-guide/index.md Added a new toctree entry for the KV-IR stream design documentation.
docs/src/dev-guide/design-kv-ir-streams/index.md New introductory page for KV-IR streams documentation.
docs/src/dev-guide/design-kv-ir-streams/background.md New documentation page detailing the background and technical details.

* the format string is dictionary-encoded.
* the encoded variable values are encoded natively.
* the string variable values are dictionary-encoded.
* `UnstructuredArray`s are converted to EncodedTextAsts are then encoded similar to `ClpString`s.

Copilot AI May 3, 2025

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sentence is unclear and appears to be missing a connector; consider rephrasing it as: '* UnstructuredArrays are converted to EncodedTextAsts, and then encoded similarly to ClpStrings.'

Suggested change
* `UnstructuredArray`s are converted to EncodedTextAsts are then encoded similar to `ClpString`s.
* `UnstructuredArray`s are converted to EncodedTextAsts, and then encoded similarly to `ClpString`s.

Copilot uses AI. Check for mistakes.
@kirkrodrigues kirkrodrigues changed the title docs(kv-ir): Add background necessary to understand kv-ir stream format. docs(kv-ir): Add background necessary to understand the kv-ir stream format. May 3, 2025

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request adds comprehensive documentation detailing the background for understanding the new KV-IR stream format and updates the overall design theme for flowcharts. The changes include updating theme colors in existing documentation, adding a new section in the developer guide index, and introducing detailed background documentation on the KV-IR streams.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
docs/src/dev-guide/tooling-gh-workflows.md Updated theme colors for flowchart consistency
docs/src/dev-guide/index.md Linked the new KV-IR streams documentation section
docs/src/dev-guide/design-kv-ir-streams/index.md Added a new section header and overview for KV-IR streams
docs/src/dev-guide/design-kv-ir-streams/background.md Added extensive background documentation on the KV-IR stream format
Comments suppressed due to low confidence (1)

docs/src/dev-guide/tooling-gh-workflows.md:24

  • Please verify that the updated color '#007fff' provides sufficient contrast with the document’s background and neighboring text to ensure readability and compliance with accessibility guidelines.
"lineColor": "#007fff",

LinZhihao-723
LinZhihao-723 previously approved these changes May 4, 2025

@LinZhihao-723 LinZhihao-723 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, I don't have more comments.
The PR title should be good.

@kirkrodrigues

Copy link
Copy Markdown
Member Author

@gibber9809 Can you review the clp-s background for accuracy?

@kirkrodrigues kirkrodrigues requested a review from gibber9809 May 4, 2025 17:31
gibber9809
gibber9809 previously approved these changes May 5, 2025

@gibber9809 gibber9809 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clp-s background looks good to me -- very clear. Just have the one comment about the archive level schema tree.

Comment on lines +215 to +237
flowchart LR
rootObj("<span style='color: #ffbe00'>0</span> root: <span style='color: #97ff00'>Object</span>")
timestampInt("<span style='color: #ffbe00'>1</span> timestamp: <span style='color: #97ff00'>Integer</span>")
levelVarStr("<span style='color: #ffbe00'>2</span> level: <span style='color: #97ff00'>VarString</span>")
timersObj("<span style='color: #ffbe00'>4</span> timers: <span style='color: #97ff00'>Object</span>")
timersStage1Float("<span style='color: #ffbe00'>5</span> stage_1: <span style='color: #97ff00'>Float</span>")
timersStage2Null("<span style='color: #ffbe00'>6</span> stage_2: <span style='color: #97ff00'>NullValue</span>")
timersStage2Float("<span style='color: #ffbe00'>7</span> stage_2: <span style='color: #97ff00'>Float</span>")
messageClpStr("<span style='color: #ffbe00'>3</span> message: <span style='color: #97ff00'>ClpString</span>")

rootObj --> timestampInt
rootObj --> levelVarStr
rootObj --> timersObj
timersObj --> timersStage1Float
timersObj --> timersStage2Null
timersObj --> timersStage2Float
rootObj --> messageClpStr
:::
+++
**Figure 3**: The archive's schema tree after adding the log events from [Figure 1](#figure-1). Each
node's label is of the form `<ID> <key>: <type>`.
::::
<!-- markdownlint-enable MD013 -->

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think its worth explaining in detail, but do you think its worth drawing the full archive level schema tree instead of just the default object namespace portion?

i.e. have a separate untyped root, and maybe event include the log_event_idx in the Metadata subtree?

I guess tree would also have

root -> rootObj
root -> metadata
metadata -> log_event_idx

@kirkrodrigues kirkrodrigues dismissed stale reviews from gibber9809 and LinZhihao-723 via 9b90be5 May 5, 2025 21:31

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (3)
docs/src/dev-guide/design-kv-ir-streams/background.md (3)

5-9: Improve bullet phrasing and verify link fragments
The two list items both start with “How clp-s…” which reduces parallelism and may make the list harder to scan. Consider rephrasing for uniformity, for example:

  • Clp-s compression of log events
  • Clp-s parsing and encoding of unstructured text values

Additionally, confirm that the link targets (#clp-s-compression and #parsing--encoding-unstructured-text) match the generated heading IDs or add explicit anchor IDs to avoid broken links (MD051).


162-168: Eliminate inline HTML in Mermaid diagram
Inline <span> tags here trigger MD033 lint errors and make styling brittle. Replace these with Mermaid’s classDef and :::className syntax to apply colours without embedded HTML.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)


227-237: Eliminate inline HTML in the second Mermaid diagram
The <span> tags also appear in the Figure 3 nodes and violate MD033. Please leverage classDef definitions and :::className node annotations instead of inline HTML.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

227-227: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1c2f47 and 9b90be5.

📒 Files selected for processing (1)
  • docs/src/dev-guide/design-kv-ir-streams/background.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/dev-guide/design-kv-ir-streams/background.md

[duplication] ~172-~172: Possible typo: you repeated a word.
Context: ... rootObj --> levelVarStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~239-~239: Possible typo: you repeated a word.
Context: ...: #97ff00'>Integer") root --> metadataNamespaceRoot metadataNamespaceRoot --> logEventIdxInt root --> defaultNa...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~241-~241: Possible typo: you repeated a word.
Context: ...spaceRoot --> logEventIdxInt root --> defaultNamespaceRootObj defaultNamespaceRootObj --> timestampInt defaultNamespaceRoot...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~244-~244: Possible typo: you repeated a word.
Context: ...velVarStr defaultNamespaceRootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[style] ~268-~268: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...-s encodes each component separately. * For NullValue values, clp-s doesn't need ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.17.2)
docs/src/dev-guide/design-kv-ir-streams/background.md

42-42: Link fragments should be valid
null

(MD051, link-fragments)


42-42: Link fragments should be valid
null

(MD051, link-fragments)


43-43: Link fragments should be valid
null

(MD051, link-fragments)


44-44: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


105-105: Link fragments should be valid
null

(MD051, link-fragments)


106-106: Link fragments should be valid
null

(MD051, link-fragments)


122-122: Link fragments should be valid
null

(MD051, link-fragments)


162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)


178-178: Link fragments should be valid
null

(MD051, link-fragments)


190-190: Link fragments should be valid
null

(MD051, link-fragments)


191-191: Link fragments should be valid
null

(MD051, link-fragments)


195-195: Link fragments should be valid
null

(MD051, link-fragments)


227-227: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


251-251: Link fragments should be valid
null

(MD051, link-fragments)


259-259: Line length
Expected: 80; Actual: 100

(MD013, line-length)


260-260: Line length
Expected: 80; Actual: 98

(MD013, line-length)


261-261: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Link fragments should be valid
null

(MD051, link-fragments)


265-265: Line length
Expected: 80; Actual: 98

(MD013, line-length)


268-268: Line length
Expected: 80; Actual: 98

(MD013, line-length)


269-269: Line length
Expected: 80; Actual: 100

(MD013, line-length)


275-275: Line length
Expected: 80; Actual: 93

(MD013, line-length)


277-277: Line length
Expected: 80; Actual: 93

(MD013, line-length)


278-278: Line length
Expected: 80; Actual: 93

(MD013, line-length)


279-279: Line length
Expected: 80; Actual: 93

(MD013, line-length)


280-280: Line length
Expected: 80; Actual: 93

(MD013, line-length)


281-281: Line length
Expected: 80; Actual: 93

(MD013, line-length)


282-282: Line length
Expected: 80; Actual: 93

(MD013, line-length)


283-283: Line length
Expected: 80; Actual: 93

(MD013, line-length)


284-284: Line length
Expected: 80; Actual: 93

(MD013, line-length)


285-285: Line length
Expected: 80; Actual: 93

(MD013, line-length)


286-286: Line length
Expected: 80; Actual: 93

(MD013, line-length)


287-287: Line length
Expected: 80; Actual: 93

(MD013, line-length)


293-293: Line length
Expected: 80; Actual: 98

(MD013, line-length)


294-294: Line length
Expected: 80; Actual: 95

(MD013, line-length)


295-295: Line length
Expected: 80; Actual: 96

(MD013, line-length)


296-296: Line length
Expected: 80; Actual: 97

(MD013, line-length)


297-297: Line length
Expected: 80; Actual: 97

(MD013, line-length)


298-298: Line length
Expected: 80; Actual: 98

(MD013, line-length)


303-303: Line length
Expected: 80; Actual: 97

(MD013, line-length)


304-304: Line length
Expected: 80; Actual: 97

(MD013, line-length)


305-305: Line length
Expected: 80; Actual: 94

(MD013, line-length)


306-306: Line length
Expected: 80; Actual: 94

(MD013, line-length)


311-311: Line length
Expected: 80; Actual: 99

(MD013, line-length)


312-312: Line length
Expected: 80; Actual: 100

(MD013, line-length)


313-313: Line length
Expected: 80; Actual: 91

(MD013, line-length)


314-314: Line length
Expected: 80; Actual: 100

(MD013, line-length)


315-315: Line length
Expected: 80; Actual: 99

(MD013, line-length)


320-320: Line length
Expected: 80; Actual: 91

(MD013, line-length)


321-321: Line length
Expected: 80; Actual: 91

(MD013, line-length)


322-322: Line length
Expected: 80; Actual: 94

(MD013, line-length)


322-322: Link fragments should be valid
null

(MD051, link-fragments)


323-323: Line length
Expected: 80; Actual: 99

(MD013, line-length)


324-324: Line length
Expected: 80; Actual: 97

(MD013, line-length)


327-327: Line length
Expected: 80; Actual: 96

(MD013, line-length)


329-329: Line length
Expected: 80; Actual: 93

(MD013, line-length)


331-331: Line length
Expected: 80; Actual: 98

(MD013, line-length)


341-341: Line length
Expected: 80; Actual: 98

(MD013, line-length)


342-342: Line length
Expected: 80; Actual: 90

(MD013, line-length)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)

Comment on lines +259 to +263
For each log event, clp-s encodes each value using an encoding method for the value's specific type.
The goal of each method is to deduplicate any repetitive information (e.g., deduplicating repeated
`VarString` values with a dictionary) and then represent the value with a 64-bit integer.
[Table 4](#table-4) lists how clp-s encodes each value type. Most value types are encoded
conventionally with the following exceptions:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Wrap long lines for readability
Several sentences and table entries exceed 80 characters (MD013). Please reflow paragraphs and break long lines (including table rows) at approximately 80 characters to improve readability.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

259-259: Line length
Expected: 80; Actual: 100

(MD013, line-length)


260-260: Line length
Expected: 80; Actual: 98

(MD013, line-length)


261-261: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Link fragments should be valid
null

(MD051, link-fragments)


We discuss each below.

## clp-s compression

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Consistent heading casing and terminology
The section title uses sentence case and lowercase “clp-s”. For consistency with other headings and acronyms, use title case and uppercase the acronym, e.g.:

## CLP-S Compression

Comment on lines +318 to +320
## Parsing & encoding unstructured text

clp-s uses [clp](../../user-guide/core-unstructured/clp.md)'s algorithm to parse and encode

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Refine heading style and symbols
The heading “## Parsing & encoding unstructured text” mixes the ampersand and sentence case. For uniformity, spell out “and” and use title case, e.g.:

## Parsing and Encoding Unstructured Text
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

320-320: Line length
Expected: 80; Actual: 91

(MD013, line-length)

@kirkrodrigues kirkrodrigues requested a review from Copilot May 5, 2025 21:36

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the documentation by introducing background information for understanding the KV-IR stream format and updating the visual style for Mermaid flowcharts.

  • Updated the color scheme in tooling workflows documentation for visual consistency.
  • Added a new guide section (design-kv-ir-streams) with an overview and detailed background.
  • Updated the documentation index to include the new KV-IR streams section.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
docs/src/dev-guide/tooling-gh-workflows.md Updated Mermaid theme colors for improved visual clarity.
docs/src/dev-guide/index.md Added reference to the new KV-IR streams documentation.
docs/src/dev-guide/design-kv-ir-streams/index.md Introduced a new documentation section for KV-IR streams background.
docs/src/dev-guide/design-kv-ir-streams/background.md Added detailed background documentation on the KV-IR stream format.
Comments suppressed due to low confidence (3)

docs/src/dev-guide/design-kv-ir-streams/background.md:1

  • [nitpick] Consider breaking this lengthy section into smaller subsections to improve readability as more detailed content is added in future updates.
# Background

docs/src/dev-guide/tooling-gh-workflows.md:24

  • [nitpick] Ensure that the updated color (#007fff) aligns with the overall documentation branding and is consistently applied across all diagrams.
"lineColor": "#007fff",

docs/src/dev-guide/index.md:82

  • [nitpick] Verify that the addition of the KV-IR streams section in the index provides clear navigation to users and is reflected properly in the sidebar.
design-kv-ir-streams/index

LinZhihao-723
LinZhihao-723 previously approved these changes May 5, 2025

@LinZhihao-723 LinZhihao-723 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest changes lgtm.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (5)
docs/src/dev-guide/design-kv-ir-streams/background.md (5)

257-271: Avoid repetitive “For … values” phrasing
Three successive list items start with “For”. Consider rephrasing, for example:

* Dictionary-ID encoding: store value in a dictionary and map to a unique integer
* ClpString encoding: encode each component separately
* NullValue handling: implicit null column — no explicit storage needed
🧰 Tools
🪛 LanguageTool

[uncategorized] ~262-~262: Possible missing comma found.
Context: ...alue type. Most value types are encoded conventionally with the following exceptions: * For t...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~268-~268: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...-s encodes each component separately. * For NullValue values, clp-s doesn't need ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.17.2)

259-259: Line length
Expected: 80; Actual: 100

(MD013, line-length)


260-260: Line length
Expected: 80; Actual: 98

(MD013, line-length)


261-261: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Link fragments should be valid
null

(MD051, link-fragments)


265-265: Line length
Expected: 80; Actual: 98

(MD013, line-length)


268-268: Line length
Expected: 80; Actual: 98

(MD013, line-length)


269-269: Line length
Expected: 80; Actual: 100

(MD013, line-length)


272-292: Wrap long table lines for readability
Several rows exceed 80 characters (e.g., “DateString”, “ClpString” entries). Please reflow or break lines to conform to the project’s 80-column guideline.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

275-275: Line length
Expected: 80; Actual: 93

(MD013, line-length)


277-277: Line length
Expected: 80; Actual: 93

(MD013, line-length)


278-278: Line length
Expected: 80; Actual: 93

(MD013, line-length)


279-279: Line length
Expected: 80; Actual: 93

(MD013, line-length)


280-280: Line length
Expected: 80; Actual: 93

(MD013, line-length)


281-281: Line length
Expected: 80; Actual: 93

(MD013, line-length)


282-282: Line length
Expected: 80; Actual: 93

(MD013, line-length)


283-283: Line length
Expected: 80; Actual: 93

(MD013, line-length)


284-284: Line length
Expected: 80; Actual: 93

(MD013, line-length)


285-285: Line length
Expected: 80; Actual: 93

(MD013, line-length)


286-286: Line length
Expected: 80; Actual: 93

(MD013, line-length)


287-287: Line length
Expected: 80; Actual: 93

(MD013, line-length)


145-176: 🧹 Nitpick (assertive)

Eliminate inline HTML in Mermaid diagram
Inline <span> tags trigger MD033 errors and make future styling brittle. Use Mermaid’s classDef and :::className syntax instead, and remove the repeated init block by extracting theme variables into a shared snippet once the upstream bug is fixed.

🧰 Tools
🪛 LanguageTool

[duplication] ~172-~172: Possible typo: you repeated a word.
Context: ... rootObj --> levelVarStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)

🪛 markdownlint-cli2 (0.17.2)

162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)


209-249: 🧹 Nitpick (assertive)

Replace HTML spans and dedupe theme in Figure 3
As with Figure 2, convert inline <span> styling to classDef directives and apply classes to nodes. Also consolidate the repeated init theme block into a shared include to reduce duplication.

🧰 Tools
🪛 LanguageTool

[duplication] ~239-~239: Possible typo: you repeated a word.
Context: ...: #97ff00'>Integer") root --> metadataNamespaceRoot metadataNamespaceRoot --> logEventIdxInt root --> defaultNa...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~241-~241: Possible typo: you repeated a word.
Context: ...spaceRoot --> logEventIdxInt root --> defaultNamespaceRootObj defaultNamespaceRootObj --> timestampInt defaultNamespaceRoot...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~244-~244: Possible typo: you repeated a word.
Context: ...velVarStr defaultNamespaceRootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)

🪛 markdownlint-cli2 (0.17.2)

227-227: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


318-324: 🧹 Nitpick (assertive)

Standardize heading style and conjunction
Update

## Parsing & encoding unstructured text

to

## Parsing and Encoding Unstructured Text

for title case and to replace & with “and,” matching other section headings.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

320-320: Line length
Expected: 80; Actual: 91

(MD013, line-length)


321-321: Line length
Expected: 80; Actual: 91

(MD013, line-length)


322-322: Line length
Expected: 80; Actual: 94

(MD013, line-length)


322-322: Link fragments should be valid
null

(MD051, link-fragments)


323-323: Line length
Expected: 80; Actual: 99

(MD013, line-length)


324-324: Line length
Expected: 80; Actual: 97

(MD013, line-length)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f25499d and 04abba2.

📒 Files selected for processing (1)
  • docs/src/dev-guide/design-kv-ir-streams/background.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/dev-guide/design-kv-ir-streams/background.md

[uncategorized] ~63-~63: A punctuation mark might be missing here.
Context: ...ect itself, and has no key. (figure-1)= :::{card} ```json lines { "timestamp"...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~92-~92: A punctuation mark might be missing here.
Context: ...**: Two JSON log events. ::: (table-1)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~109-~109: A punctuation mark might be missing here.
Context: ... in Table 3. ::: (table-2)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[uncategorized] ~125-~125: A punctuation mark might be missing here.
Context: ...n Figure 1. ::: (table-3)= :::{card} | clp-s value-type | Descrip...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[duplication] ~172-~172: Possible typo: you repeated a word.
Context: ... rootObj --> levelVarStr rootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~179-~179: Possible missing comma found.
Context: ... form "<key>": <type>, except for the root which doesn't have an explicit name. Ea...

(AI_HYDRA_LEO_MISSING_COMMA)


[duplication] ~239-~239: Possible typo: you repeated a word.
Context: ...: #97ff00'>Integer") root --> metadataNamespaceRoot metadataNamespaceRoot --> logEventIdxInt root --> defaultNa...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~241-~241: Possible typo: you repeated a word.
Context: ...spaceRoot --> logEventIdxInt root --> defaultNamespaceRootObj defaultNamespaceRootObj --> timestampInt defaultNamespaceRoot...

(ENGLISH_WORD_REPEAT_RULE)


[duplication] ~244-~244: Possible typo: you repeated a word.
Context: ...velVarStr defaultNamespaceRootObj --> timersObj timersObj --> timersStage1Float timersObj --> t...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~262-~262: Possible missing comma found.
Context: ...alue type. Most value types are encoded conventionally with the following exceptions: * For t...

(AI_HYDRA_LEO_MISSING_COMMA)


[style] ~268-~268: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...-s encodes each component separately. * For NullValue values, clp-s doesn't need ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

🪛 markdownlint-cli2 (0.17.2)
docs/src/dev-guide/design-kv-ir-streams/background.md

42-42: Link fragments should be valid
null

(MD051, link-fragments)


42-42: Link fragments should be valid
null

(MD051, link-fragments)


43-43: Link fragments should be valid
null

(MD051, link-fragments)


44-44: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


105-105: Link fragments should be valid
null

(MD051, link-fragments)


106-106: Link fragments should be valid
null

(MD051, link-fragments)


122-122: Link fragments should be valid
null

(MD051, link-fragments)


162-162: Inline HTML
Element: span

(MD033, no-inline-html)


163-163: Inline HTML
Element: span

(MD033, no-inline-html)


164-164: Inline HTML
Element: span

(MD033, no-inline-html)


165-165: Inline HTML
Element: span

(MD033, no-inline-html)


166-166: Inline HTML
Element: span

(MD033, no-inline-html)


167-167: Inline HTML
Element: span

(MD033, no-inline-html)


168-168: Inline HTML
Element: span

(MD033, no-inline-html)


178-178: Link fragments should be valid
null

(MD051, link-fragments)


190-190: Link fragments should be valid
null

(MD051, link-fragments)


191-191: Link fragments should be valid
null

(MD051, link-fragments)


195-195: Link fragments should be valid
null

(MD051, link-fragments)


227-227: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


228-228: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


229-229: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


230-230: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


231-231: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


232-232: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


233-233: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


234-234: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


235-235: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


236-236: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


237-237: Inline HTML
Element: span

(MD033, no-inline-html)


251-251: Link fragments should be valid
null

(MD051, link-fragments)


259-259: Line length
Expected: 80; Actual: 100

(MD013, line-length)


260-260: Line length
Expected: 80; Actual: 98

(MD013, line-length)


261-261: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Line length
Expected: 80; Actual: 89

(MD013, line-length)


262-262: Link fragments should be valid
null

(MD051, link-fragments)


265-265: Line length
Expected: 80; Actual: 98

(MD013, line-length)


268-268: Line length
Expected: 80; Actual: 98

(MD013, line-length)


269-269: Line length
Expected: 80; Actual: 100

(MD013, line-length)


275-275: Line length
Expected: 80; Actual: 93

(MD013, line-length)


277-277: Line length
Expected: 80; Actual: 93

(MD013, line-length)


278-278: Line length
Expected: 80; Actual: 93

(MD013, line-length)


279-279: Line length
Expected: 80; Actual: 93

(MD013, line-length)


280-280: Line length
Expected: 80; Actual: 93

(MD013, line-length)


281-281: Line length
Expected: 80; Actual: 93

(MD013, line-length)


282-282: Line length
Expected: 80; Actual: 93

(MD013, line-length)


283-283: Line length
Expected: 80; Actual: 93

(MD013, line-length)


284-284: Line length
Expected: 80; Actual: 93

(MD013, line-length)


285-285: Line length
Expected: 80; Actual: 93

(MD013, line-length)


286-286: Line length
Expected: 80; Actual: 93

(MD013, line-length)


287-287: Line length
Expected: 80; Actual: 93

(MD013, line-length)


293-293: Line length
Expected: 80; Actual: 98

(MD013, line-length)


294-294: Line length
Expected: 80; Actual: 95

(MD013, line-length)


295-295: Line length
Expected: 80; Actual: 96

(MD013, line-length)


296-296: Line length
Expected: 80; Actual: 97

(MD013, line-length)


297-297: Line length
Expected: 80; Actual: 97

(MD013, line-length)


298-298: Line length
Expected: 80; Actual: 98

(MD013, line-length)


303-303: Line length
Expected: 80; Actual: 97

(MD013, line-length)


304-304: Line length
Expected: 80; Actual: 97

(MD013, line-length)


305-305: Line length
Expected: 80; Actual: 94

(MD013, line-length)


306-306: Line length
Expected: 80; Actual: 94

(MD013, line-length)


311-311: Line length
Expected: 80; Actual: 99

(MD013, line-length)


312-312: Line length
Expected: 80; Actual: 100

(MD013, line-length)


313-313: Line length
Expected: 80; Actual: 91

(MD013, line-length)


314-314: Line length
Expected: 80; Actual: 100

(MD013, line-length)


315-315: Line length
Expected: 80; Actual: 99

(MD013, line-length)


320-320: Line length
Expected: 80; Actual: 91

(MD013, line-length)


321-321: Line length
Expected: 80; Actual: 91

(MD013, line-length)


322-322: Line length
Expected: 80; Actual: 94

(MD013, line-length)


322-322: Link fragments should be valid
null

(MD051, link-fragments)


323-323: Line length
Expected: 80; Actual: 99

(MD013, line-length)


324-324: Line length
Expected: 80; Actual: 97

(MD013, line-length)


327-327: Line length
Expected: 80; Actual: 96

(MD013, line-length)


329-329: Line length
Expected: 80; Actual: 93

(MD013, line-length)


331-331: Line length
Expected: 80; Actual: 98

(MD013, line-length)


341-341: Line length
Expected: 80; Actual: 98

(MD013, line-length)


342-342: Line length
Expected: 80; Actual: 90

(MD013, line-length)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: lint-check (macos-latest)
  • GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (15)
docs/src/dev-guide/design-kv-ir-streams/background.md (15)

19-29: Verify numbered-step link anchors
The steps link to #computing-a-log-events-schema, #encoding-log-event-schemas, etc. Please run a quick check or adjust the headings to ensure those anchors resolve correctly.


30-33: Tip block is clear and helpful
The callout directing readers to the original CLP-S paper is well-placed and renders correctly.


34-43: Schema computation section is clear
The description and bullets about value-type determination and schema-tree building are concise and accurate.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

42-42: Link fragments should be valid
null

(MD051, link-fragments)


42-42: Link fragments should be valid
null

(MD051, link-fragments)


43-43: Link fragments should be valid
null

(MD051, link-fragments)


54-62: Schema-tree node explanation is accurate
The notes on internal vs. leaf nodes correctly convey the structure of the schema tree.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

56-56: Link fragments should be valid
null

(MD051, link-fragments)


56-56: Link fragments should be valid
null

(MD051, link-fragments)


63-91: Figure 1 JSON examples render correctly
The JSON code fence and examples illustrate the log events effectively. No changes needed here.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~63-~63: A punctuation mark might be missing here.
Context: ...ect itself, and has no key. (figure-1)= :::{card} ```json lines { "timestamp"...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


92-107: Table 1 formatting is good
The table and its caption are clear, and the use of +++ ensures proper rendering in Sphinx.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~92-~92: A punctuation mark might be missing here.
Context: ...**: Two JSON log events. ::: (table-1)= :::{card} | Key | clp-s valu...

(AI_EN_LECTOR_MISSING_PUNCTUATION)

🪛 markdownlint-cli2 (0.17.2)

105-105: Link fragments should be valid
null

(MD051, link-fragments)


106-106: Link fragments should be valid
null

(MD051, link-fragments)


125-143: Verify anchor for encoded text AST link
The Description column in Table 3 links to #parsing--encoding-unstructured-text. Ensure this matches the slug generated from the heading below (consider simplifying to “Parsing and Encoding Unstructured Text”).

🧰 Tools
🪛 LanguageTool

[uncategorized] ~125-~125: A punctuation mark might be missing here.
Context: ...n Figure 1. ::: (table-3)= :::{card} | clp-s value-type | Descrip...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


183-190: Encoding schemas introduction is clear
The explanation of leaf-node ID encoding and the reference to Figure 3 are accurate and concise.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

190-190: Link fragments should be valid
null

(MD051, link-fragments)


202-208: Merging algorithm bullets are self-contained
The two-item list clearly describes the merge logic without ambiguity.


293-299: Array-type distinctions are well explained
The difference between StructuredArray and UnstructuredArray is clear and accurate.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

293-293: Line length
Expected: 80; Actual: 98

(MD013, line-length)


294-294: Line length
Expected: 80; Actual: 95

(MD013, line-length)


295-295: Line length
Expected: 80; Actual: 96

(MD013, line-length)


296-296: Line length
Expected: 80; Actual: 97

(MD013, line-length)


297-297: Line length
Expected: 80; Actual: 97

(MD013, line-length)


298-298: Line length
Expected: 80; Actual: 98

(MD013, line-length)


301-307: Encoded record table (ERT) section is concise
The description of ERT grouping and search efficiency is well phrased.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

303-303: Line length
Expected: 80; Actual: 97

(MD013, line-length)


304-304: Line length
Expected: 80; Actual: 97

(MD013, line-length)


305-305: Line length
Expected: 80; Actual: 94

(MD013, line-length)


306-306: Line length
Expected: 80; Actual: 94

(MD013, line-length)


309-317: Archive-writing details are informative
The explanation of compression streams and buffering strategies effectively conveys the design trade-offs.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

311-311: Line length
Expected: 80; Actual: 99

(MD013, line-length)


312-312: Line length
Expected: 80; Actual: 100

(MD013, line-length)


313-313: Line length
Expected: 80; Actual: 91

(MD013, line-length)


314-314: Line length
Expected: 80; Actual: 100

(MD013, line-length)


315-315: Line length
Expected: 80; Actual: 99

(MD013, line-length)


327-339: Encoded text AST example is clear
The bullet list and sub-item demonstrate the decomposition effectively. No changes necessary.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

327-327: Line length
Expected: 80; Actual: 96

(MD013, line-length)


329-329: Line length
Expected: 80; Actual: 93

(MD013, line-length)


331-331: Line length
Expected: 80; Actual: 98

(MD013, line-length)


340-347: Note and tip callouts are appropriate
These contextual callouts guide readers to the relevant research paper and clarify terminology.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

341-341: Line length
Expected: 80; Actual: 98

(MD013, line-length)


342-342: Line length
Expected: 80; Actual: 90

(MD013, line-length)


349-351: Reference links are valid and necessary
The two citations correctly point to the original USENIX papers.

Comment thread docs/src/dev-guide/design-kv-ir-streams/background.md
Comment on lines +12 to +18
## clp-s compression

At a high-level, [clp-s](../../user-guide/core-clp-s.md) compresses log events into what we call
archives. Depending on the configured size-threshold for each archive, a set of log events may be
compressed into one or more archives. The archive format is designed so that each archive is
self-contained and independent, allowing archives to be searched concurrently.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use title case and correct hyphenation
For consistency with other headings, rewrite

## clp-s compression

to

## CLP-S Compression

Also change “At a high-level” to “At a high level” to remove the unnecessary hyphen.

Comment on lines +1 to +2
# Background

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Clarify document title for context
The main heading # Background is quite generic. Consider renaming it to something more descriptive, for example:

# Background for KV-IR Stream Format

This immediately signals the scope of the document to readers.

Comment on lines +3 to +9
To understand the KV-IR stream format, we first need to review the following:

* [How clp-s compresses log events](#clp-s-compression), since the process for KV-IR streams is
similar but makes a different trade-off between resource usage, compression ratio, and search
performance.
* [How clp-s parses and encodes unstructured text values](#parsing--encoding-unstructured-text).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify internal link targets and phrasing
The list items reference #clp-s-compression and #parsing--encoding-unstructured-text. Please confirm those anchors match the actual slugified headings (especially the double hyphen for “Parsing & encoding”). You may also simplify phrasing for parallelism, e.g.:

* Clp-S Compression of log events
* Parsing and Encoding Unstructured Text

🏁 Script executed:

#!/bin/bash
# List all markdown headings in the background.md file to verify anchor slugs
grep -nE '^#{1,6} ' docs/src/dev-guide/design-kv-ir-streams/background.md || true

Length of output: 329


Verified internal anchors; phrasing suggestion
I’ve confirmed that the current link targets match the slugified headings in background.md:

  • “clp-s compression” → #clp-s-compression
  • “Parsing & encoding unstructured text” → #parsing--encoding-unstructured-text

For improved parallelism, you may update the list to:

* [Clp-S Compression of log events](#clp-s-compression)
* [Parsing and Encoding Unstructured Text](#parsing--encoding-unstructured-text)

@kirkrodrigues kirkrodrigues requested review from gibber9809 and removed request for gibber9809 May 6, 2025 14:32

@gibber9809 gibber9809 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated docs LGTM.

@kirkrodrigues kirkrodrigues merged commit 902a366 into y-scope:main May 6, 2025
5 checks passed
@kirkrodrigues kirkrodrigues deleted the kv-ir-stream-docs branch May 6, 2025 15:39
junhaoliao pushed a commit to junhaoliao/clp that referenced this pull request May 17, 2026
…format. (y-scope#833)

Co-authored-by: Lin Zhihao <59785146+LinZhihao-723@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants