
fix: prevent newline accumulation in XML parameter serialization#11767

Merged
enyst merged 2 commits into OpenHands:main from aoi127:fix-xml-newline-accumulation
Nov 25, 2025

Conversation

@aoi127
Contributor

@aoi127 aoi127 commented Nov 18, 2025

Summary of PR

Problem Description

When converting function calling format to XML format for non-native function calling LLMs, the converter was adding extra newlines around multiline parameter values. This caused newlines to accumulate across multiple LLM response cycles, leading to:

  1. str_replace failures due to mismatched old_str with excessive newlines
  2. Generated code containing redundant blank lines that grow over time

Reproduction Example

The newline accumulation follows this pattern:

  • 1st LLM response: <parameter=old_str>\n...code...\n</parameter>
  • After context rebuild: <parameter=old_str>\n\n...code...\n\n</parameter>
  • 2nd LLM response learns this format: \n\n...code...\n\n
  • After rebuild again: <parameter=old_str>\n\n\n...code...\n\n\n</parameter>
  • Newlines keep accumulating with each request-response cycle
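The loop above can be sketched with a toy serializer/parser pair (hypothetical helper names; the real logic lives in `convert_tool_call_to_string()` in `fn_call_converter.py`):

```python
def buggy_serialize(value: str) -> str:
    # pre-fix behavior: wrap multiline values in extra newlines
    if '\n' in value:
        return f'<parameter=old_str>\n{value}\n</parameter>'
    return f'<parameter=old_str>{value}</parameter>'

def parse(xml: str) -> str:
    # naive inverse applied on context rebuild: strip only the tags,
    # keeping everything between them (including the added newlines)
    return xml[len('<parameter=old_str>'):-len('</parameter>')]

value = 'line1\nline2'
for _ in range(3):
    value = parse(buggy_serialize(value))

# each rebuild cycle adds one leading and one trailing newline
print(repr(value))  # '\n\n\nline1\nline2\n\n\n'
```

After three cycles the value has grown three leading and three trailing newlines, matching the accumulation pattern described above.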

Root Cause

The convert_tool_call_to_string() function in openhands/llm/fn_call_converter.py detected multiline content (via is_multiline check) and added \n before and after parameter values. These extra newlines were preserved in context reconstruction, creating a positive feedback loop where:

  1. LLM sees inflated format in context
  2. LLM learns and reproduces this format
  3. Converter adds more newlines
  4. Cycle repeats with growing newlines

Solution

Removed the is_multiline logic that added extra \n characters. Parameter values are now serialized as-is without additional formatting.

Modified Files

  • openhands/llm/fn_call_converter.py (lines 423-434)
    • Removed is_multiline variable and conditional newline insertion
    • Simplified parameter serialization to preserve original content
# Before:
for param_name, param_value in args.items():
    is_multiline = isinstance(param_value, str) and '\n' in param_value
    ret += f'<parameter={param_name}>'
    if is_multiline:
        ret += '\n'  # Extra newline added
    # ... value ...
    if is_multiline:
        ret += '\n'  # Extra newline added
    ret += '</parameter>\n'

# After:
for param_name, param_value in args.items():
    ret += f'<parameter={param_name}>'
    # ... value ...
    ret += '</parameter>\n'
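As a sanity check, the post-fix serialization is stable under repeated serialize/parse cycles. A minimal sketch, assuming simplified stand-in helpers rather than the converter's actual functions:

```python
def fixed_serialize(name: str, value: str) -> str:
    # post-fix behavior: emit the value verbatim, no conditional newlines
    return f'<parameter={name}>{value}</parameter>\n'

def parse(xml: str, name: str) -> str:
    open_tag, close_tag = f'<parameter={name}>', '</parameter>\n'
    return xml[len(open_tag):-len(close_tag)]

original = 'def f():\n    return 1\n'
value = original
for _ in range(10):
    value = parse(fixed_serialize('old_str', value), 'old_str')

assert value == original  # no accumulation, however many cycles run
```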

Impact

Positive Impacts

  • ✅ Fixes str_replace_editor tool reliability
  • ✅ Prevents code quality degradation over long conversations
  • ✅ Maintains backward compatibility with existing prompts
  • ✅ Reduces confusion from growing blank lines in generated code

Testing

  • Manual testing shows parameter values are now preserved exactly
  • No additional newlines are added during serialization
  • Context reconstruction produces identical format to LLM output
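A round-trip regression test along these lines could pin the behavior down (a sketch using simplified stand-ins for the converter, not the project's actual test code):

```python
def serialize(name: str, value: str) -> str:
    # stand-in for the fixed parameter-serialization path
    return f'<parameter={name}>{value}</parameter>'

def deserialize(xml: str, name: str) -> str:
    prefix, suffix = f'<parameter={name}>', '</parameter>'
    assert xml.startswith(prefix) and xml.endswith(suffix)
    return xml[len(prefix):len(xml) - len(suffix)]

def test_multiline_value_round_trip_is_stable():
    value = 'old\ncode\n'
    for _ in range(5):
        value = deserialize(serialize('old_str', value), 'old_str')
    # multiline values must survive repeated cycles byte-for-byte
    assert value == 'old\ncode\n'

test_multiline_value_round_trip_is_stable()
```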

Change Type

  • [x] Bug fix
  • [ ] New feature
  • [ ] Breaking change
  • [ ] Refactor
  • [ ] Other (dependency update, docs, typo fixes, etc.)

Checklist

  • [x] I have read and reviewed the code and I understand what the code is doing.
  • [x] I have tested the code to the best of my ability and ensured it works as expected.

@mamoodi
Collaborator

mamoodi commented Nov 19, 2025

@enyst is this also something that is in agent-sdk now?

@enyst
Collaborator

enyst commented Nov 20, 2025

@aoi127 Thank you for the contribution! Do you have some prompt log where we can see the issue, maybe? Reason I ask is that I don't fully understand it, sorry, I think the tool prompts are recomputed every request...

On the other hand, as Mamoodi said, we are launching OpenHands V1, which is deeply rebuilt and it's based on agent SDK. The LLM is in the agent-sdk, since it's part of the agentic loop.

That means that in really short time we will actually remove llm.py here, and replace with using agent-sdk as a dependency.

I'd suggest to make an issue on agent-sdk, maybe with a log if you can?

@aoi127
Contributor Author

aoi127 commented Nov 21, 2025

logs_251121_001.zip

@enyst Hello, I’ve uploaded a zip file containing logs from the /app/logs/ directory in the Docker container. These logs demonstrate the "newline accumulation" issue.

Case 1

[screenshot: response_012.log]

In llm/25-11-21_03-29/response_012.log, the file_text parameter only contains two leading newline characters (\n).

[screenshot: prompt_037.log]

However, in llm/25-11-21_03-29/prompt_037.log (which contains all messages), there are actually three \n after <parameter=file_text> in the reconstructed assistant message.

Case 2

[screenshots: six paired views of original function calls and their reconstructed assistant messages]

As shown, each reconstructed assistant message contains one additional \n compared to the original function call.

The number of newline characters increases with each call to str_replace_editor, indicating a positive feedback loop.

Root Cause

As noted in the earlier comment, the code bug creates a positive feedback loop that accumulates extra newlines.

@aoi127
Contributor Author

aoi127 commented Nov 21, 2025

@enyst I'm not sure if the agent-sdk has this same issue. I'll test it and submit a PR once I have the chance.

@enyst
Collaborator

enyst commented Nov 21, 2025

Thank you @aoi127 ! It would be great if you could check out agent-sdk!

Maybe if you could take a look at the failing test, I think it's failing precisely because of newlines. It would be great if we could have a unit test for this exact behavior, too, so that we can reproduce it and make sure it doesn't happen again.

When converting function calling format to XML format for non-native
function calling LLMs, the converter was adding extra newlines around
multiline parameter values. This caused newlines to accumulate across
multiple LLM response cycles, leading to:

1. str_replace failures due to mismatched old_str with excessive newlines
2. Generated code containing redundant blank lines that grow over time

For Example:
- 1st LLM response: <parameter=old_str>\n...code...\n</parameter>
- After context rebuild: <parameter=old_str>\n\n...code...\n\n</parameter>
- 2nd LLM response learns this format: \n\n...code...\n\n
- After rebuild again: \n\n\n...code...\n\n\n
- Newlines keep accumulating with each request-response cycle

Root cause:
The convert_tool_call_to_string() function in fn_call_converter.py
detected multiline content (is_multiline check) and added \n before
and after parameter values. These extra newlines were preserved in
context reconstruction, creating a positive feedback loop where LLM
learned the inflated format and reproduced it in subsequent responses.

Solution:
Removed the is_multiline logic that added extra \n characters.
Parameter values are now serialized as-is without additional formatting,
preventing newline accumulation while maintaining the correct XML format
from TOOL_EXAMPLES.

Also update test expectations for multiline parameters in
test_llm_fncall_converter.py to match the new behavior. Multiline
parameter values should NOT have extra newlines before/after to prevent
newline accumulation across multiple LLM response cycles.

Impact:
- Fixes str_replace_editor tool reliability
- Prevents code quality degradation over long conversations
- Maintains backward compatibility with existing prompts
- All tests now pass with the corrected behavior
@aoi127 aoi127 force-pushed the fix-xml-newline-accumulation branch from fb39e11 to 435d18b on November 24, 2025 at 06:45
@aoi127
Contributor Author

aoi127 commented Nov 24, 2025

Hi @enyst, I looked into the failing test. It appears that the failing test case itself was introducing the redundant newlines, which matches the behavior we saw in the logs.
I adjusted the test case; do you think it's OK?

@enyst
Collaborator

enyst commented Nov 24, 2025

I think it's fine for now! Not sure for agent-sdk, we might do some more tests over there.

Let's get this one in, first, though!

@enyst
Collaborator

enyst commented Nov 25, 2025

@OpenHands Do a /codereview-roasted on this PR, including on the discussion in this PR. Read all comments here and look at the screenshots if you can, so you understand very well what is reported. You should think deeply and feel free to tell if you agree it's a real problem, too, and/or where it happens, or could it be a mistake.

From now on, every time you respond to the user, use markdown, with headings, bold, and bulleted lists, so that we can read it easily.

@openhands-ai

openhands-ai Bot commented Nov 25, 2025

I'm on it! enyst can track my progress at all-hands.dev

@openhands-ai

openhands-ai Bot commented Nov 25, 2025

Summary of Work

What I did

  • Reviewed PR #11767 (“fix: prevent newline accumulation in XML parameter serialization”).
  • Pulled the PR branch and inspected the diff across:
    • openhands/llm/fn_call_converter.py
    • tests/unit/llm/test_llm_fncall_converter.py
  • Analyzed the discussion, logs description, and screenshots context provided.

Key Findings

  • The issue is real: extra newlines around multiline parameter values created a positive feedback loop across LLM response cycles, causing:
    • str_replace mismatches
    • Growing blank lines in generated code
  • The fix is correct and minimal: removed conditional newline insertion. Parameter values now serialize verbatim.
  • The test update is appropriate: previous expectations encoded the buggy formatting.

Verdict

  • Worth merging. This is a clean simplification that removes a harmful special case and improves stability.

Recommendations (Follow-ups, not blockers)

  • Relax strict function regex to allow optional whitespace after <function=...>:
    • Change pattern to accept \s* after the opening tag to reduce brittleness.
  • Add round-trip stability tests to prevent regressions:
    • Multiline values remain unchanged over multiple serialize/parse cycles.
    • No phantom leading/trailing newlines.
    • Parser succeeds with or without a newline after <function=...>.
  • Consider expanding type coercion in parsing (boolean, number, object) for broader JSON Schema coverage.
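The first recommendation can be illustrated with a small sketch. The pattern names and exact regexes below are assumptions for illustration, not the converter's actual source:

```python
import re

# strict: requires a newline immediately after the opening tag
STRICT = re.compile(r'<function=([^>]+)>\n(.*?)</function>', re.DOTALL)
# relaxed: tolerates any amount of whitespace there, including none
RELAXED = re.compile(r'<function=([^>]+)>\s*(.*?)</function>', re.DOTALL)

call = '<function=execute_bash><parameter=command>ls</parameter></function>'
print(STRICT.search(call) is None)      # True: the brittle pattern fails
print(RELAXED.search(call) is None)     # False: the relaxed pattern matches
```

Swapping `\n` for `\s*` keeps well-formed output parsing unchanged while no longer rejecting a call that happens to omit the newline.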

No Code Changes Made

  • I did not modify or push any code. This response is a codereview-only summary as requested.


Collaborator

@enyst enyst left a comment


Thank you!

@enyst enyst merged commit f6e4d00 into OpenHands:main Nov 25, 2025
22 checks passed