docs: translate Korean text to English for improved LLM performance#7

Closed
junhoyeo wants to merge 2 commits into
code-yeongyu:masterfrom
junhoyeo:docs/translate-korean-to-english

junhoyeo commented Dec 9, 2025

Collaborator

Summary

  • Translate all Korean text in documentation files to English to improve LLM agent performance and prevent unintended Korean responses

Problem

When Korean text exists in project files (like AGENTS.md, planning docs, notes), LLMs sometimes:

  • Start thinking or responding in Korean unexpectedly
  • Consume significantly more tokens for the same semantic content
  • Show degraded performance in reasoning and instruction following

Technical Background: Why LLM Performance Drops with Korean

1. Tokenization Inefficiency

Korean is significantly less token-efficient than English:

| Language | "Hello, how are you?" | Token Count |
|---|---|---|
| English | "Hello, how are you?" | ~5-6 tokens |
| Korean | "안녕하세요, 어떻게 지내세요?" | ~15-20 tokens |

Korean is written in Hangul syllable blocks (가, 나, 다...), and most tokenizer vocabularies (for example, those built with BPE) are learned from predominantly Latin-script text. This means:

  • A single Korean word gets split into many subword tokens
  • More tokens = longer sequences = higher computational cost
  • Context window fills up faster with less semantic content
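
As a rough, tokenizer-independent proxy for the gap above, the sketch below compares UTF-8 byte counts for the two example sentences. Byte-level BPE tokenizers start from bytes before applying merges, so a Hangul syllable (3 bytes in UTF-8) starts at a disadvantage unless the vocabulary contains merges for it. The helper name `utf8_cost` is illustrative only, and byte counts are not token counts; real numbers depend on the tokenizer in use.

```python
import unicodedata


def utf8_cost(text: str) -> dict:
    """Return character and UTF-8 byte counts for a string.

    The text is normalized to NFC first so that each Hangul syllable
    is counted as one precomposed character.
    """
    normalized = unicodedata.normalize("NFC", text)
    return {"chars": len(normalized), "bytes": len(normalized.encode("utf-8"))}


english = utf8_cost("Hello, how are you?")
korean = utf8_cost("안녕하세요, 어떻게 지내세요?")

print("English:", english)
print("Korean: ", korean)
```

The Korean sentence has fewer characters but roughly twice the bytes, which is the raw material a byte-level tokenizer has to compress.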

2. Training Data Imbalance

Most LLMs are trained on predominantly English data:

  • English: ~50-60% of training corpus
  • Korean: Often <1-2%

Less exposure → weaker pattern learning → worse performance on:

  • Nuanced expressions
  • Idioms and colloquialisms
  • Domain-specific terminology

3. Morphological Complexity

Korean is agglutinative — meanings are built by attaching suffixes:

먹다 (eat) → 먹었습니다 (ate, formal past)
         → 먹고 싶어요 (want to eat)
         → 먹을 수 있어요 (can eat)

Each variation may be tokenized differently, fragmenting the semantic relationship the model needs to learn.
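
A minimal sketch of the point above, using only the four forms listed: every form shares the stem 먹, and the meaning change lives entirely in the attached ending. Whether a given tokenizer keeps the stem as its own token is an assumption that varies by vocabulary; this only shows the surface structure the model has to relate.

```python
import os
import unicodedata

forms = ["먹다", "먹었습니다", "먹고 싶어요", "먹을 수 있어요"]

# Normalize to NFC so each syllable block is a single character.
forms = [unicodedata.normalize("NFC", form) for form in forms]

# The longest common prefix of all forms is the shared verb stem.
stem = os.path.commonprefix(forms)

# Everything after the stem is the ending that carries tense, mood, etc.
endings = [form[len(stem):] for form in forms]

print("stem:", stem)
for form, ending in zip(forms, endings):
    print(f"{form} = {stem} + {ending}")
```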

4. Vocabulary Coverage

Tokenizer vocabularies are limited (~32K-100K tokens). With English-centric training:

  • Common English words → single tokens
  • Common Korean words → multiple tokens, or UNK in tokenizers without byte-level fallback

5. Attention Dilution

With more tokens per semantic unit:

  • Self-attention must work across more positions
  • Semantic relationships become "spread out"
  • The model has a harder time capturing long-range dependencies
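
To make the cost concrete: standard full self-attention scores every position against every other position, so the number of pairwise scores grows with the square of the sequence length. A minimal sketch, assuming the 2-3x token inflation estimated in this description and a hypothetical 1,000-token English baseline (both numbers are illustrative, not measured):

```python
def attention_pairs(seq_len: int) -> int:
    """Number of query-key pairs in full (non-sparse) self-attention."""
    return seq_len * seq_len


baseline = 1_000  # hypothetical English token count for some passage

for inflation in (2, 3):
    korean_len = baseline * inflation
    growth = attention_pairs(korean_len) // attention_pairs(baseline)
    print(f"{inflation}x tokens -> {growth}x attention pairs")
```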

Practical Impact: For the same context window (e.g., 8K tokens), roughly one-half to one-third as much Korean content fits as English content, measured by meaning conveyed.
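
The arithmetic behind that estimate can be sketched as follows. The function name `english_equivalent` and the inflation factors are assumptions taken from the rough 2-3x figure above, not measurements of any particular tokenizer.

```python
def english_equivalent(window_tokens: int, inflation: float) -> int:
    """English tokens that would carry the same content as a full window of Korean."""
    return int(window_tokens / inflation)


window = 8_000  # the "8K tokens" example, rounded for readability

for inflation in (2.0, 3.0):
    budget = english_equivalent(window, inflation)
    print(f"inflation {inflation}: ~{budget} English-equivalent tokens")
```

In other words, an 8K window filled with Korean holds about as much meaning as 2.7K to 4K tokens of English under these assumptions.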

Changes

| File | Translation |
|---|---|
| AGENTS.md | Deployment instructions (배포 → Deployment) |
| test-dir/nested/AGENTS.md | "더 specific" → "more specific" |
| notepad.md | "소요 시간" → "Time spent" (all occurrences) |
| local-ignore/comment-checker-ts-plan.md | Full implementation plan document |

Note: README.ko.md was intentionally not translated as it's the designated Korean version of the documentation.

junhoyeo commented Dec 9, 2025

Collaborator Author

@code-yeongyu

Hey! Here's a prompt you can use to have your LLM agent review and merge this PR:


Review and merge this PR if appropriate: https://github.com/code-yeongyu/oh-my-opencode/pull/7

This PR translates Korean text to English in documentation files. Please verify:

1. **Translation accuracy**: Check that Korean → English translations preserve the original meaning and nuance
2. **No functional changes**: Confirm only documentation/comments were modified, no code logic changes
3. **Files changed**: AGENTS.md, notepad.md, test-dir/nested/AGENTS.md, local-ignore/comment-checker-ts-plan.md
4. **README.ko.md preserved**: The Korean README should NOT be translated (it's intentionally Korean)

If the translations are accurate and the changes are appropriate, approve and merge.

The rationale is explained in detail in the PR description - mainly about LLM token efficiency and preventing accidental Korean responses when reading project files.

@code-yeongyu

Owner

Great take, but those files are all AI slop. I have already removed them from my project. Anyway, thank you for your contribution.

calebrosario pushed a commit to calebrosario/oh-my-opencode that referenced this pull request Feb 21, 2026
…-support

ui: Simplify usage display with cleaner formatting
luobosibing2 pushed a commit to luobosibing2/oh-my-opencode that referenced this pull request Feb 23, 2026
…vements

LGTM! Type-safe helpers properly eliminate unsafe as any assertions.
luobosibing2 pushed a commit to luobosibing2/oh-my-opencode that referenced this pull request Feb 23, 2026
…-yeongyu#309)

Critical fixes:
- Bug #1: SubagentStop hook defaulted all agents to 'failed' because SDK
  doesn't provide `success` field. Now defaults to 'completed' when undefined.
- Bug #4: Token stats lost across TokenTracker instances. Constructor now
  restores session stats from global state for the same session ID.
- Bug #5: Ultrawork session isolation bypassed when both session IDs were
  undefined (undefined === undefined). Now rejects all falsy session IDs.

High priority fixes:
- Bug #6: Cancel skill force-clear missed 12+ state files (boulder,
  hud-state, subagent-tracking, checkpoints, etc). Added comprehensive list.
- Bug #7: HUD semverCompare() returned NaN on pre-release versions like
  "3.9.5-beta". Fixed to use parseInt and handle pre-release ordering.
- Bug #8: Silent JSON parse failures in critical state readers. Added
  error logging to ralph and ultrawork state readers.
- Bug #9: Stale task detection had no default behavior when onStaleSession
  callback was not configured. Now auto-cleans after 2x threshold.
- Bug #10: Hardcoded 3-architect assumption in validation. Extracted to
  REQUIRED_ARCHITECTS constant.

Medium priority fixes:
- Bug #11: Auto-invoke history used non-atomic writes. Now uses
  atomicWriteJson to prevent corruption from concurrent sessions.
- Bug #12: Ecomode docs said "all tasks" use Haiku, contradicting the
  escalation paths. Clarified to "most tasks" with upgrade criteria.
- Bug #13: Added safeUnlinkSync/safeRmSync utilities to prevent ENOENT
  crashes during cleanup operations.
- Bug #14: State files containing user prompts written with 0644
  permissions. Now writes with 0600 (owner-only read/write).
- Bug #15: Model names recorded inconsistently (e.g., 'claude-3-5-haiku'
  vs 'claude-haiku-4'). Now normalizes at recording time via exported
  normalizeModelName().

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
thevanshit pushed a commit to thevanshit/oh-my-openagent that referenced this pull request May 24, 2026
…iewer gate

Closes five gaps in the ultrawork prompt versus codex-plugins' parallel
directive, applied to all three model variants (default/Claude, GPT, Gemini)
with prompt-engineering entropy gate (each addition encodes a distinct
binding boolean, not narrative reinforcement):

1. TDD-MANDATORY (was conditional "when test infrastructure exists"):
   every production change follows RED -> GREEN -> SURFACE. Failing test
   first, capture assertion msg, smallest change to flip green, exercise
   real surface, capture artifact. Exemption whitelist: formatting /
   comment-only / version bump / rename-only, each must be justified
   in writing; unjustified exemption = rejection.

2. Scenario contract (was free-form Test Plan Template): require 3+
   scenarios upfront covering happy path, edge (boundary / empty /
   malformed / concurrent), adjacent-surface regression. Each scenario
   binds a binary pass condition, a real-surface artifact source, and
   a test file + test id written test-first.

3. RED->GREEN evidence capture (was "all tests pass"): every scenario
   requires TWO captured artifacts -- RED assertion msg before the
   change AND GREEN assertion msg after -- alongside the real-surface
   artifact (tmux / curl / browser / Playwright / computer-use /
   CLI stdout / parsed config / DB diff). Tests are the floor (always
   required); surface artifact is the ceiling (also required).

4. Durable notepad: mktemp -t ulw-*.md with append-only sections
   (Plan, Scenarios, Now, Todo, Findings, Learnings). Survives context
   loss; resume by re-reading.

5. Reviewer gate: trigger when user said strictly / rigorously /
   "deeply", or task touches 3+ files / 20+ turns / 30+ min, or it is
   refactor / migration / perf / security work. Reviewer verdict is
   binding ("looks good but..." = rejection). Loop until unconditional
   approval.

Plus: TODO format upgraded from vague "track every step" to atomic
`path: <action> for <scenario-id> -- verify by <check>` with a GOOD
test-first / impl pair example and a BAD list including
"production code before its failing test".

Per-variant adaptation:
- default.ts (Claude): full structured sections.
- gpt.ts (GPT-5.x): outcome-first prose, shorter prose per gpt-5.5 guide.
- gemini.ts: explicit enforcement framing + anti-optimism checkpoint
  upgraded with a TDD-violation question (#7).

Verified by:
- bun test src/hooks/keyword-detector/ (119 pass / 0 fail).
- lsp_diagnostics clean on all three files.
- Module-load smoke test confirms each exported message string parses
  and contains the new section anchors (TDD MANDATORY, SCENARIO
  CONTRACT, DURABLE NOTEPAD, REVIEWER GATE).

Char deltas (directive body only):
- default 13646 -> 17144 (+26%)
- gpt     6740  -> 9215  (+37%, was the leanest start)
- gemini  14196 -> 16136 (+14%)

Existing tests only assert presence of "ULTRAWORK MODE ENABLED!" which
is preserved verbatim in every variant.

2 participants