[Translation] Update TRANSLATION_DOCUMENTATION_README.md with accurate body content translation percentages #1098
Description
🎯 Objective
Update the TRANSLATION_DOCUMENTATION_README.md to reflect the actual body content translation status rather than the misleading infrastructure-only metrics.
📋 Background
The current TRANSLATION_DOCUMENTATION_README.md reports quality scores of 85-99% across languages. However, these scores measure infrastructure completeness (hreflang tags, Schema.org metadata, navigation elements) — NOT actual body content translation.
Actual body content translation status (measured via English sentence pattern detection):
| Language | Claimed Quality | Actual Body Translation | Files w/ English Body | Gap |
|---|---|---|---|---|
| 🇳🇱 Dutch | 91%+ | 13% | 84/96 | -78% |
| 🇩🇰 Danish | 95% | 24% | 73/96 | -71% |
| 🇫🇮 Finnish | 98% | 25% | 72/96 | -73% |
| 🇫🇷 French | 85%+ | 29% | 69/96 | -56% |
| 🇸🇪 Swedish | 99.2% | 38% | 60/96 | -61% |
| 🇩🇪 German | 98.9% | 42% | 56/96 | -57% |
| 🇰🇷 Korean | 75%+ | 43% | 55/96 | -32% |
| 🇪🇸 Spanish | 96.6% | 45% | 53/96 | -52% |
| 🇳🇴 Norwegian | 99.5% | 46% | 52/96 | -54% |
| 🇮🇱 Hebrew | 93%+ | 48% | 50/96 | -45% |
| 🇸🇦 Arabic | 67.7% | 60% | 39/96 | -8% |
| 🇨🇳 Chinese | 95%+ | 66% | 33/96 | -29% |
| 🇯🇵 Japanese | 95%+ | 68% | 31/96 | -27% |
The document itself acknowledges this in one section ("~450-488 files (36-39%) need body content translation"), but the per-language quality scores in the summary table remain misleading.
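The "Actual Body Translation" column can be reproduced from the "Files w/ English Body" column. The sketch below assumes the percentage is the share of files without English body text, rounded up to a whole percent; the rounding rule is inferred from the table rather than stated in this issue, and `body_pct` is a hypothetical helper name.

```shell
# Hypothetical helper: body-translation percentage from file counts.
# Assumption: translated files = total - files with English body,
# and the result is rounded up (integer ceiling).
body_pct() {
  local english=$1 total=$2
  local translated=$(( total - english ))
  # Integer ceiling of translated * 100 / total
  echo $(( (translated * 100 + total - 1) / total ))
}

body_pct 84 96   # Dutch: 12 of 96 files translated, prints 13
body_pct 31 96   # Japanese: 65 of 96 files translated, prints 68
```

The Gap column is then the body percentage minus the claimed quality score, in percentage points (for Dutch, 13 - 91 = -78).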
✅ Acceptance Criteria
- TRANSLATION_DOCUMENTATION_README.md updated with two separate metric categories:
  - Infrastructure Completion (hreflang, metadata, navigation, Schema.org): the current high scores
  - Body Content Translation (actual article body text translated to target language): the real gaps shown above
- Each language section clearly states how many files have translated body content vs infrastructure-only
- The summary table shows both metrics side-by-side
- A prioritized "Next Steps" section identifies which languages/categories need translation work most urgently
- Remove or update any "🏆 PERFECT" / "🎉 COMPLETE" markers that are misleading given the body content gap
- Add a methodology note explaining how body content translation is measured vs infrastructure
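For the methodology note, it may help to show what the body content check actually flags. The following is a minimal sketch using the same pattern as the validation command in this issue; `looks_english` is a hypothetical helper, the sample paragraphs are made up for illustration, and `-E` is used instead of `-P` on the assumption that the pattern needs no Perl-specific syntax.

```shell
# Hypothetical helper: succeeds when a line opens a <p> with a capitalized
# ASCII word followed by two lowercase ASCII words.
looks_english() {
  printf '%s\n' "$1" | LC_ALL=C grep -qE '<p>[A-Z][a-z]+ [a-z]+ [a-z]+'
}

# Illustrative sample paragraphs (not taken from the site):
looks_english '<p>This page describes our services.</p>' && echo "flagged"
looks_english '<p>このページではサービスについて説明します。</p>' || echo "not flagged"
# A Latin-script, non-English paragraph also matches, since only ASCII letters are inspected:
looks_english '<p>Dit is een Nederlandse zin.</p>' && echo "flagged"
```

Because the pattern only looks at ASCII letters, a manual spot check of Latin-script languages may be worthwhile before publishing the final counts.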
🛠️ Implementation Guidance
Files to modify:
- TRANSLATION_DOCUMENTATION_README.md (main update target)
Approach:
- Update the main summary table to include both "Infrastructure %" and "Body Content %" columns
- For each language section, add a "Body Content Status" subsection with:
  - Count of files with translated body content
  - Count of files with English body content still pending
  - Breakdown by category (blog, discordian, product, industry, core)
- Update status emojis: use ✅ only when body content is actually translated, and 🚧 for infrastructure-only
- Add a "Translation Gap Analysis" section with the table from this issue
- Update the "Next Steps" section to prioritize based on actual body content gaps
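One way to derive the "Next Steps" ordering is to sort languages by how many files still have an English body. This is only a sketch: `prioritize` is a hypothetical helper, and the `lang english_files total` input format is an assumption, fed here with three rows from the table in this issue.

```shell
# Hypothetical helper: reads "lang english_files total" lines on stdin and
# prints them ordered by English-body file count, largest first
# (ties broken alphabetically by language code).
prioritize() {
  sort -k2,2nr -k1,1
}

printf '%s\n' 'ja 31 96' 'nl 84 96' 'sv 60 96' | prioritize
# prints:
# nl 84 96
# sv 60 96
# ja 31 96
```

Sorting on raw file counts works here because every language has the same total (96); with differing totals, sorting on the percentage would be the safer choice.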
Validation:
Run this command to verify counts:
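    # Count, per language, the files whose body still contains an English-looking paragraph.
    # Requires a grep with -P (PCRE) support, such as GNU grep.
    for lang in ar ko zh ja he da fi no es fr nl sv de; do
      count=0; total=0
      for f in *_${lang}.html; do
        [ -f "$f" ] || continue   # skip the unexpanded glob when no file matches
        total=$((total+1))
        # Heuristic: a <p> that opens with a capitalized word followed by two lowercase words
        grep -P '<p>[A-Z][a-z]+ [a-z]+ [a-z]+' "$f" 2>/dev/null | head -1 | grep -qP '[A-Za-z]{3,}' && count=$((count+1))
      done
      echo "${lang}: ${count}/${total} files have English body"
    done

🤖 Recommended Agent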
Agent: @hack23-isms-ninja
Rationale: This is a documentation accuracy and compliance task. The ISMS Ninja specializes in documentation quality and ensuring accurate status reporting.
For implementation:
- Audit current translation status claims against actual measurements
- Update documentation with accurate, two-tier metrics
- Ensure documentation follows Hack23 ISMS documentation standards
- Create clear, actionable "Next Steps" section for translation prioritization