Fix edge cases for input entry #972

Merged
gdamore merged 3 commits into main from more-input-test on Jan 4, 2026

@gdamore gdamore commented Jan 4, 2026

Owner

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for extended UTF-8 characters, including 4-byte sequences and astral plane characters.
    • Improved Alt modifier support for rune keys from escape sequences.
  • Bug Fixes

    • Enhanced handling of terminal control sequences to prevent unintended events.
    • Improved UTF-8 error recovery and extended character processing in input parsing.
  • Tests

    • Expanded test coverage for terminal sequences and UTF-8 character handling.

codecov Bot commented Jan 4, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.31%. Comparing base (80a0969) to head (3a7a49c).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #972      +/-   ##
==========================================
+ Coverage   78.85%   80.31%   +1.45%     
==========================================
  Files          38       38              
  Lines        3675     3678       +3     
==========================================
+ Hits         2898     2954      +56     
+ Misses        638      585      -53     
  Partials      139      139              

coderabbitai Bot commented Jan 4, 2026

Contributor
Walkthrough

Input parsing logic is refined to detect non-ASCII characters at the 0xA0 boundary instead of 0x7F, with added ISO 2022 8-bit control handling. UTF-8 decoding error paths are improved to discard invalid leading bytes and attempt recovery. Test coverage expands to verify SMP characters, escape sequence modifiers, and terminal control sequence handling.

Changes

Cohort / File(s): Summary
  • Input Scanner Logic (input.go): Threshold for non-ASCII detection shifted from 0x7F to 0xA0 in inputParser.scan; ISO 2022 8-bit control handling added via istEsc state and 0x40 adjustment. UTF-8 decoding error path modified to discard invalid leading bytes (utfLen = 1) on utf8.DecodeRune failure and only append valid runes.
  • Test Expansion (input_test.go): Added SMP UTF-8 character case in TestInputUTF8Characters for 4-byte sequence coverage. Extended TestSpecialKeys with Esc-Y case for Alt modifier verification. New TestIgnoredSequences suite validates parsing of terminal control sequences, DEC private mode queries, and OSC/DCS-like sequences without unintended event generation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰 With whiskers twitched at Unicode's call,
The scanner bounds now catch them all—
From 0xA0's extended dance,
To UTF-8's four-byte romance,
Each sequence tested, errors mend,
Input handling's now refined, my friend! 🎯

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 75.00%, which is below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check: ✅ Passed. The title 'Fix edge cases for input entry' accurately reflects the main changes in the PR: modifications to input parsing logic in input.go and expanded test coverage in input_test.go to handle edge cases like ISO 2022 8-bit controls, UTF-8 boundary adjustments, and terminal control sequences.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 80a0969 and 3a7a49c.

📒 Files selected for processing (2)
  • input.go
  • input_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
input_test.go (2)
key.go (2)
  • KeyRune (332-332)
  • EventKey (46-51)
event.go (1)
  • Event (23-26)
input.go (1)
key.go (3)
  • NewEventKey (244-301)
  • KeyRune (332-332)
  • ModNone (318-318)
🔇 Additional comments (5)
input.go (2)

414-424: LGTM! ISO 2022 8-bit control handling looks correct.

The boundary change to >= 0xA0 and the new ISO 2022 C1 control handling (0x80-0x9F range) correctly maps 8-bit control sequences to their 7-bit equivalents. The state machine transition and fall-through logic properly route these through the escape sequence handler.
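For readers unfamiliar with the mapping, here is a minimal sketch of the 0x40 adjustment described above. It is not the code from input.go; the helper name c1To7Bit is hypothetical and only illustrates that a C1 byte in 0x80-0x9F corresponds to ESC followed by the byte minus 0x40.

```go
package main

import "fmt"

// c1To7Bit is a hypothetical helper (not part of the PR). For a byte in the
// C1 range 0x80-0x9F it returns the final byte of the equivalent 7-bit
// sequence "ESC <final>", obtained by subtracting 0x40.
func c1To7Bit(b byte) (byte, bool) {
	if b >= 0x80 && b < 0xA0 {
		return b - 0x40, true
	}
	return 0, false
}

func main() {
	// 0x90 is DCS, 0x9B is CSI, 0x9D is OSC; 'A' and 0xA0 are not C1 controls.
	for _, b := range []byte{0x90, 0x9B, 0x9D, 'A', 0xA0} {
		if final, ok := c1To7Bit(b); ok {
			fmt.Printf("0x%02X is equivalent to ESC %c\n", b, final)
		} else {
			fmt.Printf("0x%02X is not a C1 control\n", b)
		}
	}
}
```

Under this mapping, a parser can treat the 8-bit byte as if ESC had already been seen and feed the adjusted byte to its existing escape handling, which is how the walkthrough describes the istEsc transition.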


1058-1072: LGTM! UTF-8 error recovery strategy is sound.

The boundary change to < 0xA0 maintains consistency with scan(), and the improved error handling now discards invalid bytes without appending the replacement character. This allows the parser to recover more cleanly from malformed UTF-8 sequences by discarding one byte at a time until valid UTF-8 is found.

input_test.go (3)

324-324: Good test coverage for 4-byte UTF-8.

The SMP character test case verifies that the parser correctly handles 4-byte UTF-8 sequences, an important edge case for the UTF-8 decoding changes.
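As a quick illustration of why SMP characters exercise the 4-byte path, the sketch below round-trips a code point above U+FFFF through the standard unicode/utf8 package. The roundTrip helper and the choice of U+1F600 are assumptions for illustration; the rune used by the actual test case is not shown here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// roundTrip is a hypothetical helper (not part of the PR). It encodes r as
// UTF-8 and decodes it again, returning the encoded length and decoded rune.
func roundTrip(r rune) (int, rune) {
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, r)
	decoded, _ := utf8.DecodeRune(buf[:n])
	return n, decoded
}

func main() {
	// U+1F600 lies in the Supplementary Multilingual Plane (U+10000-U+1FFFF).
	n, r := roundTrip(0x1F600)
	fmt.Printf("U+%04X uses %d bytes and decodes to U+%04X\n", 0x1F600, n, r)
}
```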


537-537: Good test for Alt modifier handling.

This test case verifies that escape sequences followed by printable characters correctly produce Alt-modified rune events, providing good coverage for the escape sequence handling logic.
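A simplified sketch of the behavior under test: ESC followed by a printable ASCII byte is reported as that rune with Alt set. The keyEvent type and parseAltRunes function are hypothetical and deliberately ignore real sequence introducers such as ESC [ (CSI), which an actual parser must handle before falling back to the Alt interpretation.

```go
package main

import "fmt"

// keyEvent is a hypothetical stand-in for a key event (not tcell's EventKey).
type keyEvent struct {
	Rune rune
	Alt  bool
}

// parseAltRunes is a toy parser: an ESC byte marks the next printable ASCII
// byte as Alt-modified. Anything else clears the pending ESC.
func parseAltRunes(in []byte) []keyEvent {
	var out []keyEvent
	alt := false
	for _, b := range in {
		switch {
		case b == 0x1B && !alt:
			alt = true
		case b >= 0x20 && b < 0x7F:
			out = append(out, keyEvent{Rune: rune(b), Alt: alt})
			alt = false
		default:
			alt = false
		}
	}
	return out
}

func main() {
	for _, ev := range parseAltRunes([]byte{0x1B, 'Y', 'y'}) {
		fmt.Printf("rune=%c alt=%v\n", ev.Rune, ev.Alt)
	}
}
```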


706-753: Excellent test coverage for ignored terminal sequences.

The test strategy is sound: sending each ignorable sequence followed by a DECID query (as a sentinel) verifies that the parser properly consumes the sequence without generating spurious events. The coverage includes both 7-bit escape sequences and 8-bit C1 control forms, as well as error cases like invalid UTF-8.
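The sentinel idea can be sketched with a toy parser. Everything here is an assumption for illustration: toyParse only knows how to swallow OSC strings (ESC ] ... terminated by BEL or ESC \), and a plain 'x' key stands in for the DECID sentinel used by the real test.

```go
package main

import (
	"bytes"
	"fmt"
)

// toyParse is a hypothetical parser (not tcell's). It silently consumes OSC
// strings and reports every other byte as a key.
func toyParse(in []byte) []byte {
	var keys []byte
	for i := 0; i < len(in); {
		if in[i] == 0x1B && i+1 < len(in) && in[i+1] == ']' {
			i += 2
			for i < len(in) {
				if in[i] == 0x07 { // BEL terminator
					i++
					break
				}
				if in[i] == 0x1B && i+1 < len(in) && in[i+1] == '\\' { // ST
					i += 2
					break
				}
				i++
			}
			continue
		}
		keys = append(keys, in[i])
		i++
	}
	return keys
}

// consumedSilently appends a sentinel key after seq and reports whether the
// sentinel is the only event produced.
func consumedSilently(seq string) bool {
	in := append([]byte(seq), 'x')
	return bytes.Equal(toyParse(in), []byte{'x'})
}

func main() {
	fmt.Println(consumedSilently("\x1b]0;title\x07"))
	fmt.Println(consumedSilently("ab"))
}
```

The sentinel matters because "no event" is otherwise hard to observe: seeing exactly the sentinel proves the preceding bytes were consumed and the parser returned to its initial state.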

