Docs: add zh-CN entrypoint translations #6300

Merged
joshp123 merged 2 commits into main from docs-zh-cn-entrypoints
Feb 1, 2026

@joshp123 joshp123 commented Feb 1, 2026

Summary

  • add Go-based docs i18n pipeline (pi/Opus 4.5) with translation memory + glossary guidance
  • generate zh-CN versions of the first entry-point docs (index, getting-started, wizard)
  • add topbar language switcher link to /zh-CN

Testing

  • go run ./scripts/docs-i18n --lang zh-CN --docs docs/index.md
  • go run ./scripts/docs-i18n --lang zh-CN --docs docs/start/getting-started.md
  • go run ./scripts/docs-i18n --lang zh-CN --docs docs/start/wizard.md

Greptile Overview

Greptile Summary

This PR adds a Go-based documentation i18n pipeline (scripts/docs-i18n) that segments Markdown/HTML, preserves placeholders, consults a glossary + translation memory, and emits translated docs under docs/zh-CN/. It also updates docs/docs.json to include a topbar link to the new zh-CN entrypoint pages and adds i18n asset files (glossary + TM).

The overall approach fits the repo by keeping translation artifacts under docs/.i18n/ and generating localized markdown into a language-prefixed docs tree, while leaving existing English content intact.

Confidence Score: 3/5

  • Generally safe to merge, but there are a couple of edge cases in the new docs-i18n tool that could cause incorrect output or hard failures on larger inputs.
  • Most changes are additive docs/content and a standalone script. However, the new path-under-root check can be bypassed in some path layouts, TM loading can fail on long JSONL lines due to bufio.Scanner limits, and HTML tag stack handling may lead to incorrect translate/skip decisions on mismatched HTML blocks.
  • scripts/docs-i18n/process.go, scripts/docs-i18n/tm.go, scripts/docs-i18n/html_translate.go

@openclaw-barnacle openclaw-barnacle bot added docs Improvements or additions to documentation scripts Repository scripts labels Feb 1, 2026

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 3 comments

Comment on lines +15 to +21
absPath, err := filepath.Abs(filePath)
if err != nil {
	return err
}
if !strings.HasPrefix(absPath, docsRoot) {
	return fmt.Errorf("file %s not under docs root %s", absPath, docsRoot)
}

[P1] strings.HasPrefix(absPath, docsRoot) is not a safe “is under directory” check: it will accept paths like /docs-root-evil/... when docsRoot is /docs-root, and can also behave unexpectedly with path separator differences. This can let inputs outside the docs root pass the guard (depending on how the tool is invoked). Prefer checking with filepath.Rel and ensuring the result doesn’t start with .. (and isn’t absolute).
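A minimal sketch of the filepath.Rel approach suggested here, for illustration only. The helper name isUnderRoot and the sample paths are assumptions, not code from this PR, and the check does not resolve symlinks.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isUnderRoot reports whether path is root itself or lies inside it.
// Hypothetical helper for illustration; it is not the code from the PR.
func isUnderRoot(root, path string) (bool, error) {
	absRoot, err := filepath.Abs(root)
	if err != nil {
		return false, err
	}
	absPath, err := filepath.Abs(path)
	if err != nil {
		return false, err
	}
	// Rel yields a path that starts with ".." when absPath escapes absRoot.
	rel, err := filepath.Rel(absRoot, absPath)
	if err != nil {
		return false, err
	}
	escapes := rel == ".." ||
		strings.HasPrefix(rel, ".."+string(filepath.Separator)) ||
		filepath.IsAbs(rel)
	return !escapes, nil
}

func main() {
	// Sample paths are made up to mirror the cases named in the comment.
	for _, p := range []string{
		"/docs-root/index.md",
		"/docs-root-evil/index.md",
		"/docs-root/../etc/passwd",
	} {
		ok, err := isUnderRoot("/docs-root", p)
		fmt.Println(p, ok, err)
	}
}
```

Comparing against ".." plus the separator, rather than a bare ".." prefix, keeps a legitimate file named something like "..notes.md" inside the root from being rejected.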

Comment on lines +44 to +52
scanner := bufio.NewScanner(file)
for scanner.Scan() {
	line := strings.TrimSpace(scanner.Text())
	if line == "" {
		continue
	}
	var entry TMEntry
	if err := json.Unmarshal([]byte(line), &entry); err != nil {
		return nil, fmt.Errorf("translation memory decode failed: %w", err)

[P1] bufio.Scanner has a default token limit (~64K). Since TM entries can include long text/translated fields, a larger line will cause scanner.Err() and make LoadTranslationMemory fail on real-world docs. Using a bufio.Reader (ReadString/ReadBytes) or increasing the scanner buffer would avoid hard failures on large segments.

Comment on lines +160 to +166
		return stack[:len(stack)-1]
	}
	for i := len(stack) - 2; i >= 0; i-- {
		if stack[i] == tag {
			return stack[:i]
		}
	}

[P1] popTag returns stack[:i] when it finds a matching tag below the top. This drops the matching tag and everything above it, while keeping any earlier tags beneath it, including ones that were never closed. For nested/mismatched HTML, this can leave the stack in an inconsistent state and cause later text to be translated or skipped incorrectly (e.g., prematurely leaving a code/pre context). Consider carefully whether stack[:i] is the intended truncation point for mismatched input, or whether a different slice bound matches the intended behavior.

joshp123 commented Feb 1, 2026

Greptile fixes are in 6c4d2630b:

  • process.go: replace HasPrefix with filepath.Rel + ../abs guard for safe path-under-root checks.
  • tm.go: use bufio.Reader to avoid scanner 64K token limits on large JSONL lines.
  • html_translate.go: replace tag stack with skipDepth for code/pre/script/style, avoiding mismatched-tag corruption.
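For readers who have not opened the commit, here is a rough sketch of what a skipDepth counter can look like. It is a guess at the idea, not the code in 6c4d2630b: the token type, the pre-tokenized input, and the translatable function are all assumptions made to keep the example self-contained.

```go
package main

import "fmt"

// token is a hypothetical, already-tokenized piece of HTML.
type token struct {
	kind string // "open", "close", or "text"
	val  string // tag name for open/close, content for text
}

// skipTags lists the elements whose content is left untranslated.
var skipTags = map[string]bool{"code": true, "pre": true, "script": true, "style": true}

// translatable returns the text tokens that sit outside every skip tag.
// Only skip tags move the counter, so unmatched tags of other kinds
// cannot corrupt the state the way a full tag stack can.
func translatable(tokens []token) []string {
	var out []string
	skipDepth := 0
	for _, t := range tokens {
		switch t.kind {
		case "open":
			if skipTags[t.val] {
				skipDepth++
			}
		case "close":
			// Guard against stray closing tags driving the depth negative.
			if skipTags[t.val] && skipDepth > 0 {
				skipDepth--
			}
		case "text":
			if skipDepth == 0 {
				out = append(out, t.val)
			}
		}
	}
	return out
}

func main() {
	// Models <p>Run <code>go <b>build</code> now</p>, where <b> is never closed.
	tokens := []token{
		{"open", "p"}, {"text", "Run "},
		{"open", "code"}, {"text", "go "},
		{"open", "b"}, {"text", "build"},
		{"close", "code"}, {"text", " now"},
		{"close", "p"},
	}
	fmt.Printf("%q\n", translatable(tokens)) // prints ["Run " " now"]
}
```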

Let me know if you want me to re-run the translator outputs after these changes.

@joshp123 joshp123 merged commit 0e0e395 into main Feb 1, 2026
35 of 42 checks passed
@joshp123 joshp123 deleted the docs-zh-cn-entrypoints branch February 1, 2026 14:22
buiilding pushed a commit to buiilding/openclaw that referenced this pull request Feb 2, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing
HashWarlock pushed a commit to HashWarlock/openclaw that referenced this pull request Feb 4, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing
uxcu pushed a commit to uxcu/kook-openclaw that referenced this pull request Feb 5, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing
bestNiu pushed a commit to bestNiu/clawdbot that referenced this pull request Feb 5, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing
batao9 pushed a commit to batao9/openclaw that referenced this pull request Feb 7, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing
hughdidit pushed a commit to hughdidit/DAISy-Agency that referenced this pull request Feb 8, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing

(cherry picked from commit 0e0e395)
battman21 pushed a commit to battman21/openclaw that referenced this pull request Feb 12, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing
zooqueen pushed a commit to hanzoai/bot that referenced this pull request Mar 6, 2026
* Docs: add zh-CN entrypoint translations

* Docs: harden docs-i18n parsing