chore:自动i18n by overflow65537 · Pull Request #1123 · MaaEnd/MaaEnd

overflow65537 · 2026-03-08T14:43:44Z

Summary by Sourcery

在 CI 中接入按需执行的自动化流程：将流水线 JSON 中 OCR 期望文本与集中式 i18n 翻译表进行自动同步。

Enhancements:

新增一个 Python 工具，用于基于 CN/TC/EN/JP 的 i18n 表和可选的热修复覆盖，对流水线定义中的 OCR 期望字段进行规范化处理。
在 tools/i18n 下引入翻译子模块及相关忽略规则，将其作为本地化文本 ID 的权威来源。
批量更新流水线 JSON 资源，使 OCR 节点统一使用标准化的、多语言期望文本格式。

CI:

新增一个 i18n-sync GitHub Actions 工作流，用于更新翻译子模块、以写入模式运行 OCR 期望文本同步脚本，并报告变更的文件。

Original summary in English

Summary by Sourcery

Automate synchronization of OCR expected texts in pipeline JSONs with centralized i18n translation tables and wire it into CI for on-demand execution.

Enhancements:

Add a Python utility to normalize OCR expected fields in pipeline definitions using CN/TC/EN/JP i18n tables and optional hotfix overrides.
Introduce a translation submodule and related ignore rules under tools/i18n to serve as the authoritative source for localized text IDs.
Bulk-update pipeline JSON resources so OCR nodes use the standardized, multi-language expected text format.

CI:

Add an i18n-sync GitHub Actions workflow that updates the translation submodule, runs the OCR expected sync script in write mode, and reports changed files.

sourcery-ai

Hey - 我发现了 2 个问题

给 AI Agent 的提示

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tools/i18n/sync_ocr_expected.py" line_range="423-432" />
<code_context>
+    unresolved_texts: List[str] = []
+
+    # 第一轮：唯一命中
+    for text, candidates in candidates_by_text:
+        if len(candidates) == 1:
+            lang_id = next(iter(candidates))
+            if lang_id not in resolved_set:
+                resolved_in_order.append(lang_id)
+                resolved_set.add(lang_id)
+        elif len(candidates) == 0:
+            unresolved_texts.append(text)
+
+    # 第二轮：如果歧义候选与已解析 ID 有交集，用交集兜底
+    for text, candidates in candidates_by_text:
+        if len(candidates) > 1:
+            intersection = [lang_id for lang_id in resolved_in_order if lang_id in candidates]
+            if len(intersection) == 1:
+                lang_id = intersection[0]
+                if lang_id not in resolved_set:
</code_context>
<issue_to_address>
**issue (bug_risk):** 由于 `resolved_set` 的检查，当前歧义文本的解析逻辑永远无法从交集中分配 ID。

在第二轮中，`intersection` 来源于 `resolved_in_order`，而 `resolved_set` 也是基于同一个列表构建的。这意味着对 `intersection` 中任意 `lang_id` 来说，`if lang_id not in resolved_set` 这个条件始终为假，因此分支体永远不会执行，歧义文本也就不会通过这个启发式规则被解析。如果你希望歧义文本能复用已有的 ID，大概率需要 (a) 去掉这个条件判断，并避免重新加入 `resolved_in_order`/`resolved_set`，或者 (b) 允许 `resolved_in_order` 中出现重复项。按当前写法，第二轮的逻辑实际上是死代码。
</issue_to_address>

### Comment 2
<location path="tools/i18n/sync_ocr_expected.py" line_range="472-481" />
<code_context>
+    unresolved_all: List[Tuple[str, str, List[str]]] = []
+
+    for file_path in pipeline_files:
+        try:
+            new_text, changes, unresolved_nodes, ocr_nodes = process_pipeline_file(
+                file_path, tables, reverse_index
+            )
+        except Exception as exc:
+            safe_print(f"[ERROR] {file_path}: {exc}")
+            continue
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** 在处理文件时吞掉所有异常会隐藏损坏的 JSON 或逻辑错误，同时仍然返回退出码 0。

由于循环在捕获到 `Exception` 后只记录日志并继续执行，即使有部分文件解析或处理失败，CI 仍然会显示为成功，导致更新结果悄无声息地变成“只更新了一部分”。建议对失败进行统计（例如使用 `failed_files` 计数或列表），并在 `main()` 中如果存在失败就以非零状态码退出；或者对于像 JSON/解析错误这类关键异常选择性地重新抛出，让任务显式失败。

建议实现如下：

```python
    unresolved_all: List[Tuple[str, str, List[str]]] = []
    failed_files: List[Tuple[str, Exception]] = []

    for file_path in pipeline_files:
        try:
            new_text, changes, unresolved_nodes, ocr_nodes = process_pipeline_file(
                file_path, tables, reverse_index
            )
        except Exception as exc:
            safe_print(f"[ERROR] {file_path}: {exc}")
            failed_files.append((file_path, exc))
            continue

    if failed_files:
        safe_print(
            f"[ERROR] 共有 {len(failed_files)} 个文件处理失败，退出码为 1："  # summary in Chinese to match existing comments
        )
        for path, exc in failed_files:
            safe_print(f"  - {path}: {exc}")
        # 非零退出码以便 CI 正确失败
        sys.exit(1)

```

1. 确保在 `tools/i18n/sync_ocr_expected.py` 文件顶部已经导入了 `sys`：

   - 如果还没有，请和其他 import 放在一起添加 `import sys`。

2. 以上代码片段很可能位于 `main()` 或其他顶层函数中。如果这个循环不在 `main()` 里，请确保最终是由 `main()` 在该函数返回失败状态后调用 `sys.exit(1)`（或等价的非零退出），而不是从调用栈深处直接退出进程。如有需要，请相应地调整控制流。
</issue_to_address>

Sourcery 对开源项目是免费的——如果你觉得我们的评审有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进后续的代码评审。}

Original comment in English

Hey - I've found 2 issues

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="tools/i18n/sync_ocr_expected.py" line_range="423-432" />
<code_context>
+    unresolved_texts: List[str] = []
+
+    # 第一轮：唯一命中
+    for text, candidates in candidates_by_text:
+        if len(candidates) == 1:
+            lang_id = next(iter(candidates))
+            if lang_id not in resolved_set:
+                resolved_in_order.append(lang_id)
+                resolved_set.add(lang_id)
+        elif len(candidates) == 0:
+            unresolved_texts.append(text)
+
+    # 第二轮：如果歧义候选与已解析 ID 有交集，用交集兜底
+    for text, candidates in candidates_by_text:
+        if len(candidates) > 1:
+            intersection = [lang_id for lang_id in resolved_in_order if lang_id in candidates]
+            if len(intersection) == 1:
+                lang_id = intersection[0]
+                if lang_id not in resolved_set:
</code_context>
<issue_to_address>
**issue (bug_risk):** Ambiguous text resolution logic can never assign an ID from the intersection because of the `resolved_set` check.

In the second pass, `intersection` is derived from `resolved_in_order`, and `resolved_set` is built from the same list. That makes the `if lang_id not in resolved_set` guard always false for any `lang_id` in `intersection`, so the body never runs and ambiguous texts are never resolved via this heuristic. If you want ambiguous texts to reuse an existing ID, you likely need to (a) drop this guard and avoid re-adding to `resolved_in_order`/`resolved_set`, or (b) allow duplicate entries in `resolved_in_order`. As written, the second-pass logic is effectively dead code.
</issue_to_address>

### Comment 2
<location path="tools/i18n/sync_ocr_expected.py" line_range="472-481" />
<code_context>
+    unresolved_all: List[Tuple[str, str, List[str]]] = []
+
+    for file_path in pipeline_files:
+        try:
+            new_text, changes, unresolved_nodes, ocr_nodes = process_pipeline_file(
+                file_path, tables, reverse_index
+            )
+        except Exception as exc:
+            safe_print(f"[ERROR] {file_path}: {exc}")
+            continue
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Swallowing all exceptions when processing files can hide broken JSON or logic errors while still returning exit code 0.

Because the loop only logs and continues on `Exception`, CI will show success even when some files fail to parse or process, leading to silently partial updates. Consider tracking failures (e.g., a `failed_files` counter) and exiting non‑zero from `main()` if any occur, or selectively re‑raising for critical exceptions like JSON/parse errors so the job fails visibly.

Suggested implementation:

```python
    unresolved_all: List[Tuple[str, str, List[str]]] = []
    failed_files: List[Tuple[str, Exception]] = []

    for file_path in pipeline_files:
        try:
            new_text, changes, unresolved_nodes, ocr_nodes = process_pipeline_file(
                file_path, tables, reverse_index
            )
        except Exception as exc:
            safe_print(f"[ERROR] {file_path}: {exc}")
            failed_files.append((file_path, exc))
            continue

    if failed_files:
        safe_print(
            f"[ERROR] 共有 {len(failed_files)} 个文件处理失败，退出码为 1："  # summary in Chinese to match existing comments
        )
        for path, exc in failed_files:
            safe_print(f"  - {path}: {exc}")
        # 非零退出码以便 CI 正确失败
        sys.exit(1)

```

1. Ensure `sys` is imported at the top of `tools/i18n/sync_ocr_expected.py`:

   - Add `import sys` alongside the other imports if it's not already present.

2. The shown snippet is likely inside `main()` or another top‑level function. If this loop is not in `main()`, make sure that the `sys.exit(1)` (or equivalent non‑zero exit) is ultimately called from `main()` after this function returns a status indicating failure, instead of exiting directly from deep inside the call stack. Adjust the control flow accordingly if needed.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

tools/i18n/sync_ocr_expected.py

Copilot

Pull request overview

This PR introduces an automated i18n sync mechanism to standardize OCR node expected fields across pipelines by expanding them into CN/TC/EN/JP text, and adds automation support via a translation submodule and a GitHub Action.

Changes:

Added a Python tool to scan pipeline JSON/JSONC and rewrite OCR expected arrays using i18n tables (+ hotfix merge support).
Added a translation referrer submodule and a workflow to run the sync in write mode.
Updated many pipeline expected arrays to include multi-language variants and reorder/append unresolved originals.

Reviewed changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
tools/i18n/sync_ocr_expected.py	New sync tool to expand OCR `expected` using i18n reverse lookup + hotfix merge.
tools/i18n/EndFieldTranslationReferrer	Adds translation tables provider as a submodule gitlink.
tools/i18n/.gitignore	Ignores local i18n table JSON files under `tools/i18n`.
assets/resource_fast/pipeline/SellProduct/SellCore.json	Updates OCR `expected` entries to multi-language set / reorders unresolved items.
assets/resource_fast/pipeline/SellProduct/ChangeGoods.json	Updates OCR `expected` entries and removes inline comments.
assets/resource_fast/pipeline/DeliveryJobs/Wuling.json	Reorders/expands OCR `expected` list.
assets/resource_fast/pipeline/DeliveryJobs/ValleyIV.json	Reorders/expands OCR `expected` lists in multiple nodes.
assets/resource_fast/pipeline/DeliveryJobs/TransferJob.json	Reorders/expands OCR `expected` list.
assets/resource_fast/pipeline/DeliveryJobs/PackCargo.json	Expands OCR `expected` list, preserving original shorter strings at end.
assets/resource_fast/pipeline/Common/Text.json	Reorders/expands OCR `expected` list.
assets/resource_adb/pipeline/SeizeEntrustTask.json	Expands OCR `expected` into CN/TC/EN/JP.
assets/resource_adb/pipeline/BAKER.json	Expands OCR `expected` into CN/TC/EN/JP.
assets/resource/pipeline/Weapon.json	Expands OCR `expected` into CN/TC/EN/JP.
assets/resource/pipeline/SimpleProductionBatch.json	Expands/reorders OCR `expected` lists for multiple nodes.
assets/resource/pipeline/SeizeEntrustTask.json	Expands/reorders OCR `expected` lists; appends unresolved originals.
assets/resource/pipeline/SceneManager/SceneValleyIV.json	Expands/reorders OCR `expected` list.
assets/resource/pipeline/SceneManager/SceneMenu.json	Expands/reorders OCR `expected` lists (including repeated “ENDFIELD”).
assets/resource/pipeline/SceneManager/SceneMap.json	Expands OCR `expected` list with additional translations.
assets/resource/pipeline/SceneManager/SceneLogin.json	Reorders/expands OCR `expected` list (note: introduces a trailing space).
assets/resource/pipeline/Resell/ResellROI.json	Expands/reorders OCR `expected` list.
assets/resource/pipeline/ReadWiki.json	Expands OCR `expected` list with TC/JP.
assets/resource/pipeline/ProtocolSpace/Prepare.json	Expands OCR `expected` list with TC/JP.
assets/resource/pipeline/ProtocolSpace/OperationalManual.json	Expands OCR `expected` list (adds JP string).
assets/resource/pipeline/ItemTransfer.json	Expands/reorders OCR `expected` lists.
assets/resource/pipeline/ImportBluePrints.json	Expands OCR `expected` list with TC/EN/JP.
assets/resource/pipeline/GiftOperator.json	Expands OCR `expected` list with TC/EN/JP.
assets/resource/pipeline/GearAssembly.json	Expands OCR `expected` lists with TC/EN/JP.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/RainbowFin.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/IndoorCrops.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/EternalSunset.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/CollapsedTianshiPillar.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/CleansingJade.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/CisternOriginiumSlugs.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/BeaconDamagedInBlightTide.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/EnvironmentMonitoring/OutskirtsMonitoringTerminal/AncientTree.json	Cleans inline comments and reorders OCR `expected` entries.
assets/resource/pipeline/DijiangRewards/Template/TextTemplate.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/DijiangRewards/Template/Status.json	Expands/reorders OCR `expected` entries; adds time-format variants.
assets/resource/pipeline/DijiangRewards/Template/Location.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/DijiangRewards/RecoveryEmotion.json	Expands/reorders OCR `expected` entries; appends unresolved originals.
assets/resource/pipeline/DijiangRewards/ReceptionRoom.json	Expands OCR `expected` entries (adds JP).
assets/resource/pipeline/DijiangRewards/MainFlow.json	Expands/reorders OCR `expected` entries; appends unresolved original.
assets/resource/pipeline/DijiangRewards/GrowthChamber.json	Expands/reorders OCR `expected` entries; adds multiple English variants.
assets/resource/pipeline/DailyRewards/Tasks.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/DailyRewards/ProtocolPass.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/DailyRewards/Emails.json	Expands OCR `expected` entries; adds multiple mailbox synonyms.
assets/resource/pipeline/DailyRewards/ClaimDeliveryJobs.json	Adds EN/JP variants alongside existing expected items.
assets/resource/pipeline/CreditShopping/Reflash.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/CreditShopping/ClaimCredit.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/Crafting.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/BatchAddFriends.json	Expands OCR `expected` entries to multi-language set.
assets/resource/pipeline/BAKER.json	Expands/reorders OCR `expected` entries; appends unresolved original.
assets/resource/pipeline/AutoSimulationCollect.json	Expands/reorders OCR `expected` entries; appends unresolved original.
.gitmodules	Registers the new translation referrer submodule.
.github/workflows/i18n-sync.yml	Adds a workflow to update submodule and run OCR expected sync in `--write` mode.

Comments suppressed due to low confidence (3)

tools/i18n/sync_ocr_expected.py:1

If a block comment /* ... */ is unterminated, the loop exits when i + 1 >= self.n, but the code still does i += 2, which can advance past EOF and cause downstream out-of-range indexing. Consider explicitly detecting the unterminated case and raising ValueError(\"Unterminated block comment\") before incrementing past the end.
assets/resource/pipeline/SceneManager/SceneLogin.json:1
This string contains a trailing space, which can cause OCR matching to fail or reduce confidence because the expected token no longer matches the UI text exactly. Please remove the trailing whitespace so the expected value is stable.
tools/i18n/sync_ocr_expected.py:1
This function signature exceeds common Python style line-length limits, which hurts readability and makes future diffs noisier. Consider wrapping parameters onto multiple lines (PEP 8 style) for consistency with the rest of the file.

.github/workflows/i18n-sync.yml

…e and improve error handling in sync script - Introduced an optional input to the i18n-sync workflow for updating the translation submodule to the latest commit. - Added a step to display the current commit of the translation submodule. - Enhanced error handling in the sync_ocr_expected.py script to report failed file processing.

.github/workflows/i18n-sync.yml

assets/resource/pipeline/CreditShopping/ClaimCredit.json

.gitmodules

overflow65537 added 3 commits March 8, 2026 22:26

自动i18n

42ab828

ci中自动执行i18n脚本

1649a08

执行效果

a653764

Copilot AI review requested due to automatic review settings March 8, 2026 14:43

sourcery-ai bot reviewed Mar 8, 2026

View reviewed changes

tools/i18n/sync_ocr_expected.py Show resolved Hide resolved

tools/i18n/sync_ocr_expected.py Show resolved Hide resolved

Copilot AI reviewed Mar 8, 2026

View reviewed changes

.github/workflows/i18n-sync.yml Outdated Show resolved Hide resolved

MistEO reviewed Mar 8, 2026

View reviewed changes

.github/workflows/i18n-sync.yml Show resolved Hide resolved

MistEO reviewed Mar 8, 2026

View reviewed changes

assets/resource/pipeline/CreditShopping/ClaimCredit.json Show resolved Hide resolved

MistEO reviewed Mar 8, 2026

View reviewed changes

.gitmodules Outdated Show resolved Hide resolved

overflow65537 added 3 commits March 8, 2026 22:57

移除子模块,每次使用前clone翻译仓库

ab01833

自动pr

33ad1b8

添加一个标志来防止i18n错误更改文本

d2a4046

MistEO merged commit b9e6f83 into main Mar 8, 2026
17 checks passed

MistEO deleted the chore/i18n branch March 8, 2026 15:19

MistEO pushed a commit that referenced this pull request Mar 9, 2026

feat: 自动i18n (#1123)

b538924

MistEO pushed a commit that referenced this pull request Mar 9, 2026

feat: 自动i18n (#1123)

34a0457

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

chore:自动i18n#1123

chore:自动i18n#1123
MistEO merged 7 commits intomainfrom
chore/i18n

overflow65537 commented Mar 8, 2026 •

edited by sourcery-ai bot

Loading

Uh oh!

sourcery-ai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

overflow65537 commented Mar 8, 2026 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Summary by Sourcery

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

overflow65537 commented Mar 8, 2026 •

edited by sourcery-ai bot

Loading