fix: prevent UTF-8 panics on multi-byte characters by polaminggkub-debug · Pull Request #93 · rtk-ai/rtk

polaminggkub-debug · 2026-02-12T20:38:51Z

Summary

Fix 10 byte-indexed string slicing locations (&s[..n]) that panic on multi-byte UTF-8 characters (Thai, emoji, CJK)
Replace with char-aware operations: chars().take(), .get(), is_char_boundary()
Add safe_char_boundary() utility in utils.rs
Add 20 regression tests covering Thai, emoji, and CJK input

Problem

RTK crashes on non-ASCII content because it slices strings by byte offset, while Rust strings are UTF-8 encoded. Thai characters take 3 bytes and most emoji take 4, so &line[..77] can land mid-character and panic with "byte index is not a char boundary".
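
A minimal sketch of the failure mode and the char-aware alternative. Both helper names (`truncate_chars`, `byte_slice_would_panic`) are hypothetical and not taken from the RTK source; they only illustrate the `chars().take(n)` pattern this PR describes.

```rust
/// Hypothetical helper (not from the RTK source): keep at most `max_chars`
/// Unicode scalar values, never splitting a multi-byte character.
fn truncate_chars(s: &str, max_chars: usize) -> String {
    s.chars().take(max_chars).collect()
}

/// Hypothetical helper: true when `&s[..n]` would panic, i.e. when `n` is
/// past the end of `s` or falls inside a multi-byte character.
fn byte_slice_would_panic(s: &str, n: usize) -> bool {
    !s.is_char_boundary(n)
}
```

For a string such as "ไทย" (3 characters, 9 bytes), byte offsets 1 and 2 fall inside the first character, so `&s[..1]` would panic, while `truncate_chars(s, 1)` returns the first character whole.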

Files Changed

File	Fix
`src/git.rs`	`filter_log_output`: `chars().take(77)` instead of `&line[..77]`; `format_status_output`: `.get()` instead of `&line[0..2]`
`src/log_cmd.rs`	Error/warning truncation: `chars().take(97)` instead of `&original[..97]` (2 locations)
`src/env_cmd.rs`	Long value display: `chars().take(50)`; `mask_value`: char-based prefix/suffix
`src/parser/mod.rs`	`truncate_output`: `is_char_boundary()` loop instead of `&output[..max_chars]`
`src/grep_cmd.rs`	`clean_line`: char-boundary snapping for pattern context; `chars().take()` for fallback
`src/wget_cmd.rs`	`compact_url`, `truncate_line`, `parse_error`: all switched to `chars().take()`
`src/utils.rs`	New `safe_char_boundary()` helper + multi-byte truncation tests
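
The `.get()` replacement listed for `format_status_output` can be sketched as follows. The function name `status_code` and the two-byte status prefix are assumptions made for illustration; the actual RTK code may be structured differently.

```rust
/// Hypothetical sketch: read a two-byte status prefix from the start of a
/// line without risking a panic.
fn status_code(line: &str) -> Option<&str> {
    // `str::get` returns None when the range is out of bounds or does not
    // fall on char boundaries, which are the cases where `&line[0..2]` panics.
    line.get(0..2)
}
```

Callers then handle `None` explicitly (for example by skipping the line) instead of crashing on short or multi-byte input.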

Test plan

cargo test — 284 tests pass (20 new UTF-8 regression tests)
cargo clippy — no new warnings (all pre-existing)
cargo build --release — builds successfully
Manual: create file with Thai name, run rtk git status — should not panic

🤖 Generated with Claude Code

pszymkowiak · 2026-02-12T22:24:25Z

Please check the commit: there is a conflict. Resolve it so I can review the PR.

pszymkowiak · 2026-02-12T22:54:06Z

src/utils.rs

+/// let s = "hello";
+/// assert_eq!(safe_char_boundary(s, 3), 3);
+/// ```
+pub fn safe_char_boundary(s: &str, byte_idx: usize) -> usize {


Is it used somewhere?

pszymkowiak · 2026-02-12T23:04:05Z

src/wget_cmd.rs

@@ -199,11 +199,9 @@ fn compact_url(url: &str) -> String {
    if without_proto.len() <= 50 {


    let char_count = without_proto.chars().count();
    if char_count <= 50 {
        without_proto.to_string()
    } else {
        let prefix: String = without_proto.chars().take(25).collect();
        let suffix: String = without_proto.chars().skip(char_count - 20).collect();
        format!("{}...{}", prefix, suffix)
    }

chars, chars?

Replace all byte-indexed string slicing (`&s[..n]`) with char-aware operations across 10 locations in 7 files. Rust strings are UTF-8, so byte slicing can land mid-character on Thai (3 bytes), emoji (4 bytes), or CJK text and panic at runtime.

Fixes applied:
- git.rs: filter_log_output, format_status_output
- log_cmd.rs: error/warning message truncation (2 locations)
- env_cmd.rs: long value display, mask_value
- parser/mod.rs: truncate_output
- grep_cmd.rs: clean_line (pattern context + fallback truncation)
- wget_cmd.rs: compact_url, truncate_line, parse_error

Also adds safe_char_boundary() utility and 20 regression tests covering Thai, emoji, and CJK input across all affected modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Address review feedback: collect chars into Vec once instead of calling .chars() multiple times. Also fixes byte-vs-char length check. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
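
A sketch of what "collect chars once" might look like. The protocol stripping is an assumption, and the 50/25/20 limits are copied from the snippet quoted in the review; the merged `compact_url` may differ in detail.

```rust
/// Sketch only: shorten a URL by keeping the first 25 and last 20 characters.
/// The protocol handling here is assumed, not taken from the RTK source.
fn compact_url(url: &str) -> String {
    let without_proto = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))
        .unwrap_or(url);

    // Collect once so the length check and both slices count chars, not bytes.
    let chars: Vec<char> = without_proto.chars().collect();
    if chars.len() <= 50 {
        return without_proto.to_string();
    }

    let prefix: String = chars[..25].iter().collect();
    let suffix: String = chars[chars.len() - 20..].iter().collect();
    format!("{}...{}", prefix, suffix)
}
```

Indexing into a `Vec<char>` is safe here because every element is a whole character, at the cost of one allocation per call.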

pszymkowiak · 2026-02-13T11:56:01Z

Waiting for the functions to work on chars throughout, not a mix of string slicing and chars.

Address review: function was defined but never called anywhere. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

polaminggkub-debug · 2026-02-13T12:04:08Z

Just to be clear, you want me to stop mixing string slicing with chars and just do everything through .chars() instead, right?

pszymkowiak · 2026-02-13T13:12:09Z

Yes, go full .chars().

Address review: use consistent .chars() everywhere instead of mixing byte-based is_char_boundary with char-based operations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
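
The two styles discussed in this review can be contrasted in a short sketch. Both function names are hypothetical and the signatures are assumptions; the source does not show the real `truncate_output`.

```rust
/// Byte-budget variant (the style removed in this commit): walk back from
/// `max_bytes` to the nearest char boundary, then slice.
fn truncate_at_boundary(s: &str, max_bytes: usize) -> &str {
    if max_bytes >= s.len() {
        return s;
    }
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1; // index 0 is always a boundary, so the loop terminates
    }
    &s[..end]
}

/// Char-budget variant (the style requested by the reviewer): the limit
/// counts characters, so it means the same for ASCII and multi-byte text.
fn truncate_by_chars(s: &str, max_chars: usize) -> String {
    s.chars().take(max_chars).collect()
}
```

Both variants avoid the panic, but they interpret the limit differently: a budget of 4 keeps one Thai character in the byte-based version and up to four in the char-based version, which is one reason to pick a single convention across the codebase.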

Resolved conflicts:
- Version bumped to 0.15.4 (Cargo.toml, Cargo.lock, .release-please-manifest.json)
- CHANGELOG.md: Added upstream releases (0.15.4, 0.15.3, 0.15.2)
- Hooks: Adopted POSIX character classes ([[:space:]]) from upstream
- src/parser/mod.rs: Added multibyte UTF-8 tests from upstream
- src/ruff_cmd.rs: Kept functions public for lint/format dispatcher feature

Upstream changes integrated:
- rtk-ai#120: git status fix for non-repo folders
- rtk-ai#93: UTF-8 panic prevention on multibyte chars
- rtk-ai#98: POSIX grep compatibility in hooks
- rtk-ai#95, rtk-ai#92: CI reliability and hook coverage improvements

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


polaminggkub-debug force-pushed the fix/utf8-panics branch from 7c5af55 to 74f40e7 on February 12, 2026 22:32

pszymkowiak reviewed Feb 12, 2026


polaminggkub-debug and others added 2 commits February 13, 2026 18:46

fix: collect chars once in compact_url to avoid redundant iteration

f25a2e5


polaminggkub-debug force-pushed the fix/utf8-panics branch from 5c3cfdb to f25a2e5 on February 13, 2026 11:54

fix: remove unused safe_char_boundary function and tests

46b926b


fix: convert remaining is_char_boundary to full .chars() approach

289d86c


pszymkowiak merged commit 155e264 into rtk-ai:master Feb 13, 2026
2 checks passed

github-actions bot mentioned this pull request Feb 13, 2026

chore(master): release 0.15.3 #108

Merged

pszymkowiak mentioned this pull request Feb 15, 2026

feat(docker): add docker compose support #110

Merged

pszymkowiak merged 4 commits into rtk-ai:master from polaminggkub-debug:fix/utf8-panics
