fix(config): sanitize_env substring false positive breaks valid API keys #17241
Closed
Feng-H wants to merge 1 commit into
sanitize_env_lines() uses str.find() to detect concatenated KEY=VALUE
pairs, but this matches key names as substrings of other key names.
For example, LM_API_KEY= is found at position 1 inside
GLM_API_KEY=<value>, causing the sanitizer to split a valid entry into
a bare 'G' line and an 'LM_API_KEY=...' line — destroying the key.
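The false positive is easy to reproduce in isolation. The snippet below is a minimal illustration using only Python's built-in string methods; the value abc123 is a placeholder, and the slicing merely mimics the split described above rather than quoting the sanitizer's code.

```python
line = "GLM_API_KEY=abc123"  # placeholder value, not a real key

# str.find() searches anywhere in the string, so the shorter key name
# is found one character into the longer one.
pos = line.find("LM_API_KEY=")
print(pos)  # 1

# Splitting at that offset yields the two corrupted fragments.
fragments = [line[:pos], line[pos:]]
print(fragments)  # ['G', 'LM_API_KEY=abc123']

# Anchoring the comparison at position 0 avoids the false positive.
print(line.startswith("LM_API_KEY="))   # False
print(line.startswith("GLM_API_KEY="))  # True
```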
This is especially harmful when combined with CJK input methods (fcitx5,
ibus) that can insert a stray newline in the middle of an env var name
during paste, producing a split like:
G
LM_API_KEY=<actual_value>
The sanitizer would then:
1. Fail to merge the split (original behavior had no merge logic)
2. When the split was manually fixed, Pass 1 would re-split it on the
next gateway startup because find('LM_API_KEY=') matches inside
'GLM_API_KEY=...' at position 1
Fix:
- Pass 0 (new): detect and merge split key names where a bare uppercase
fragment on one line + a KEY=VALUE on the next line combine to form a
known env var name.
- Pass 1 (improved): replace str.find() with str.startswith() at
position 0 + forward scanning. This ensures we only match actual key
names at line/segment boundaries, never as substrings inside other
key names.
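A minimal sketch of the Pass 1 idea, assuming the sanitizer has a set of known key names available. The names KNOWN_KEYS, match_key_at, and split_concatenated are hypothetical and are not taken from hermes_cli/config.py; the actual patch may scan differently.

```python
# Hypothetical key set for illustration only.
KNOWN_KEYS = {"GLM_API_KEY", "LM_API_KEY", "KEY1", "KEY2"}


def match_key_at(line, offset, known_keys):
    """Return the longest known 'KEY=' token starting exactly at offset, or None."""
    best = None
    for key in known_keys:
        token = key + "="
        if line.startswith(token, offset):
            if best is None or len(token) > len(best):
                best = token
    return best


def split_concatenated(line, known_keys):
    """Split 'K1=v1K2=v2' into segments, matching keys only at boundaries."""
    first = match_key_at(line, 0, known_keys)
    if first is None:
        return [line]  # no known key at the line start: leave untouched

    segments = []
    start = 0
    i = len(first)  # begin scanning after the first key, inside its value
    while i < len(line):
        token = match_key_at(line, i, known_keys)
        if token is not None:
            segments.append(line[start:i])  # close the previous segment
            start = i
            i += len(token)  # skip past the key name just matched
        else:
            i += 1
    segments.append(line[start:])
    return segments


print(split_concatenated("GLM_API_KEY=abc", KNOWN_KEYS))
print(split_concatenated("KEY1=val1KEY2=val2", KNOWN_KEYS))
```

Because the scan starts after the first matched key and jumps past each later match, offset 1 of GLM_API_KEY= is never examined, so the shorter name cannot match inside the longer one.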
This was referenced Apr 29, 2026
Closing as already fixed on Triage notes (medium confidence): If you still see this on the latest version, please reopen with reproduction steps. (Bulk-closed during a CLI triage sweep.)
Problem
_sanitize_env_lines() in hermes_cli/config.py uses str.find() to detect concatenated KEY=VALUE pairs on a single line. This substring search causes false positives when one known env var name is a substring of another:
1. LM_API_KEY= matches at position 1 inside GLM_API_KEY=<value>, because find() does not require a word boundary.
2. The sanitizer splits the entry into a bare G line and an LM_API_KEY=... line, destroying the key.
3. sanitize_env_file() runs again and sees the two fragments as-is (neither is a valid known key), preserving the corruption permanently.
This is a general bug: any pair of known keys where one name contains another as a substring is affected (e.g., API_KEY= inside OPENAI_API_KEY=). The practical impact is limited because most users do not encounter the initial corruption that triggers the cycle.
Trigger scenario (CJK input methods)
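The bug becomes a persistent data-loss issue when combined with CJK input methods (fcitx5, ibus). These can occasionally insert a stray newline in the middle of a pasted env var name, producing:
G
LM_API_KEY=<actual_value>
Once this corruption exists:
1. The sanitizer does not merge the split, because the original code had no merge logic.
2. If the split is fixed by hand, Pass 1 re-splits it on the next gateway startup, because find('LM_API_KEY=') matches inside GLM_API_KEY=... at position 1.
This makes the .env file unfixable without patching the code.
Fix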
- Pass 0 (new): Detect and merge split key names. If a line contains only uppercase letters/underscores (no =) and concatenating it with the next non-empty line forms a known env var name followed by =, merge the two lines.
- Pass 1 (improved): Replace str.find() (arbitrary substring search) with str.startswith() at position 0 + forward offset scanning. This ensures concatenated key detection only matches actual key names at line/segment boundaries, never as substrings inside other key names.
Testing
Verified with these scenarios:
- G + newline + LM_API_KEY=<value> → GLM_API_KEY=<value>
- GLM_API_KEY=abc → GLM_API_KEY=abc (unchanged)
- KEY1=val1KEY2=val2 → two lines
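The first scenario corresponds to Pass 0. A minimal sketch of that merge step follows, assuming lines arrive without trailing newlines and that a set of known key names is available. KNOWN_KEYS and merge_split_keys are hypothetical names for illustration, not the function in the patch.

```python
import re

# Hypothetical key set for illustration only.
KNOWN_KEYS = {"GLM_API_KEY", "LM_API_KEY"}

# A bare fragment: only uppercase letters and underscores, no '='.
FRAGMENT_RE = re.compile(r"[A-Z_]+")


def merge_split_keys(lines, known_keys):
    """Merge a bare fragment with the next non-empty line when the
    combination starts with a known key name followed by '='."""
    merged = []
    i = 0
    while i < len(lines):
        stripped = lines[i].strip()
        if FRAGMENT_RE.fullmatch(stripped):
            # Locate the next non-empty line, skipping blank ones.
            j = i + 1
            while j < len(lines) and not lines[j].strip():
                j += 1
            if j < len(lines):
                candidate = stripped + lines[j].strip()
                key, sep, _value = candidate.partition("=")
                if sep and key in known_keys:
                    merged.append(candidate)
                    i = j + 1  # consume the fragment and the line it joined
                    continue
        merged.append(lines[i])
        i += 1
    return merged


print(merge_split_keys(["G", "LM_API_KEY=secret"], KNOWN_KEYS))
print(merge_split_keys(["GLM_API_KEY=abc"], KNOWN_KEYS))
```

Requiring the merged name to be a known key keeps unrelated bare lines from being glued onto the following entry.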