Skip to content

refactor: extract 6 helpers from _normalize_codex_response#23957

Closed
kshitijk4poor wants to merge 4 commits into
NousResearch:mainfrom
kshitijk4poor:refactor/codex-normalize-response
Closed

refactor: extract 6 helpers from _normalize_codex_response#23957
kshitijk4poor wants to merge 4 commits into
NousResearch:mainfrom
kshitijk4poor:refactor/codex-normalize-response

Conversation

@kshitijk4poor

Copy link
Copy Markdown
Collaborator

Summary

Split _normalize_codex_response (complexity 37) into 6 focused helpers:

Function Purpose Complexity
_normalize_codex_response Response normalizer 37 → 18
_resolve_codex_output Empty output fallback <10
_process_message_item Message text extraction <10
_process_reasoning_item Reasoning + encrypted content <10
_process_function_call_item Merged call types <10
_detect_tool_call_leak Leak pattern detection <10
_determine_response_finish_reason Finish reason logic <10

Also eliminated ~25 lines of duplicated function_call/custom_tool_call code by merging into one handler.

Zero logic changes — pure extraction.

Part 4 of 4 — completes codex_responses_adapter.py structural refactoring.

Depends on: #23954 (stacked commits, same file)

…nput

Split the 192-line dispatcher into three focused functions:

- _convert_assistant_msg_to_responses_items() — reasoning replay,
  message item replay, content emission, and tool call conversion
- _convert_tool_msg_to_responses_item() — tool result formatting
- _chat_messages_to_responses_input() — thin loop dispatcher

Complexity: 45 → <10 on the main dispatcher (C901 cleared).
Zero logic changes — pure extraction.
Split the 210-line validator into five focused functions, one per item type:

- _validate_function_call_item()      — call_id, name, arguments
- _validate_function_call_output_item() — output (string or multimodal array)
- _validate_reasoning_item()          — encrypted content with dedup
- _validate_message_item()            — role=assistant with content parts
- _validate_role_content_item()       — bare role+content items

Main dispatcher complexity: 51 → <10 (C901 cleared).
Zero logic changes — pure extraction.

Depends on: refactor/codex-chat-to-responses (same file)
…eaders

Split tool validation and extra-header normalization out of
_preflight_codex_api_kwargs into focused helpers:

- _normalize_codex_tools()        — validates tool list, normalizes each tool
- _normalize_codex_extra_headers() — validates and normalizes HTTP headers

Main dispatcher complexity: 36 → 21.
Zero logic changes — pure extraction.

Depends on: NousResearch#23952 (refactor/codex-preflight-input)
Split the 210-line response normalizer into focused helpers:

- _resolve_codex_output()           — empty output fallback
- _process_message_item()           — message text extraction
- _process_reasoning_item()         — reasoning text + encrypted content
- _process_function_call_item()     — merged function_call + custom_tool_call
- _detect_tool_call_leak()          — tool-call leak pattern detection
- _determine_response_finish_reason() — finish reason decision tree

Main dispatcher complexity: 37 → 18.
Also eliminated ~25 lines of duplicated function_call/custom_tool_call code.

Zero logic changes — pure extraction.

Depends on: NousResearch#23954 (refactor/codex-preflight-kwargs)
@alt-glitch alt-glitch added type/refactor Code restructuring, no behavior change comp/agent Core agent loop, run_agent.py, prompt builder P3 Low — cosmetic, nice to have labels May 11, 2026
@kshitijk4poor

Copy link
Copy Markdown
Collaborator Author

Closing — re-evaluated against the original tracker goal in #23972 ("ruff complexity reduction"). The actual LOC accounting here is net positive (+26 / +66 / +85 / +139 across the stack), not negative.

The C901 number drops (45/51/39/37 → all <22) come at a real cost: each extracted helper carries ~3-6 lines of decomposition tax (signature + docstring + return-plumbing for shared locals that the original monolithic function held in scope). The helpers themselves are real single-responsibility units, but Python's function-scope semantics don't give us a free way to split without that tax.

For complexity reductions that trade LOC for testability, the bar is "is the function so untestable that the tax is worth it?" — and for these four codex functions, they're already exercised by the agent-loop integration tests in tests/run_agent/ and tests/agent/test_codex_responses_adapter.py. The bisect-during-debug argument is real but doesn't justify +139 LOC in already-covered code.

What's still moving forward from this tracker:

If a future refactor of these codex functions ends up net-negative on LOC (e.g. by collapsing repeated patterns or eliminating dead-code branches alongside the extraction), I'll reopen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp/agent Core agent loop, run_agent.py, prompt builder P3 Low — cosmetic, nice to have type/refactor Code restructuring, no behavior change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants