GPTOSS openmathinstruct and eval#2990
Merged
Merged
Conversation
Signed-off-by: weijiac <weijiac@NVIDIA.com>
Signed-off-by: weijiac <weijiac@NVIDIA.com>
…tchinstruct2_gsm8k Signed-off-by: weijiac <weijiac@NVIDIA.com>
cuichenx
reviewed
Apr 13, 2026
Signed-off-by: weijiac <weijiac@NVIDIA.com>
Contributor
|
/claude review |
Contributor
|
/ok to test 10d1b91 |
Comment on lines
+60
to
+137
|
|
||
|
|
||
| def _strip_intermediate_boxed(text: str) -> str: | ||
| """Replace all \\boxed{content} occurrences in text with just content. | ||
|
|
||
| Uses brace-depth counting to handle nested braces correctly | ||
| (e.g. \\boxed{\\frac{1}{2}} → \\frac{1}{2}). | ||
| """ | ||
| marker = r"\boxed{" | ||
| result = [] | ||
| i = 0 | ||
| while i < len(text): | ||
| idx = text.find(marker, i) | ||
| if idx == -1: | ||
| result.append(text[i:]) | ||
| break | ||
| result.append(text[i:idx]) | ||
| depth = 0 | ||
| end = -1 | ||
| for j in range(idx + len(marker) - 1, len(text)): | ||
| if text[j] == "{": | ||
| depth += 1 | ||
| elif text[j] == "}": | ||
| depth -= 1 | ||
| if depth == 0: | ||
| end = j | ||
| break | ||
| if end == -1: | ||
| # malformed \boxed{, keep as-is | ||
| result.append(text[idx:]) | ||
| break | ||
| result.append(text[idx + len(marker) : end]) | ||
| i = end + 1 | ||
| return "".join(result) | ||
|
|
||
|
|
||
| def process_openmathinstruct2_thinking_packed_example(example: dict, _tokenizer=None) -> dict: | ||
| """Process OpenMathInstruct-2 example into analysis+final channel format. | ||
|
|
||
| Puts the CoT reasoning (generated_solution without the trailing \\boxed{N}) into | ||
| the 'thinking' field (rendered as <|channel|>analysis by the GPT-OSS chat template) | ||
| and the final answer as '#### N' in the 'content' field (rendered as <|channel|>final). | ||
|
|
||
| This separates the reasoning chain from the answer delivery, matching the intended | ||
| GPT-OSS channel structure for math problem solving. | ||
| """ | ||
| solution = example["generated_solution"] | ||
| expected_answer = str(example["expected_answer"]) | ||
|
|
||
| # Extract the reasoning prefix: everything before the final \boxed{N} | ||
| marker = r"\boxed{" | ||
| idx = solution.rfind(marker) | ||
| if idx != -1: | ||
| depth = 0 | ||
| end = -1 | ||
| for i in range(idx + len(marker) - 1, len(solution)): | ||
| if solution[i] == "{": | ||
| depth += 1 | ||
| elif solution[i] == "}": | ||
| depth -= 1 | ||
| if depth == 0: | ||
| end = i | ||
| break | ||
| thinking = re.sub(r"\$?\s*$", "", solution[:idx]).rstrip() if end != -1 else solution.rstrip() | ||
| else: | ||
| thinking = solution.rstrip() | ||
|
|
||
| # Strip any intermediate \boxed{} from the reasoning (replace with just content) | ||
| thinking = _strip_intermediate_boxed(thinking) | ||
|
|
||
| return { | ||
| "input": "", | ||
| "output": "", | ||
| "messages": [ | ||
| {"role": "user", "content": example["problem"]}, | ||
| {"role": "assistant", "thinking": thinking, "content": f"#### {expected_answer}"}, | ||
| ], | ||
| "original_answers": [expected_answer], |
Contributor
There was a problem hiding this comment.
Missing test coverage: _strip_intermediate_boxed and process_openmathinstruct2_thinking_packed_example have non-trivial parsing logic (brace-depth counting, regex stripping, rfind-based extraction) but no tests were added. The existing test file at tests/functional_tests/test_groups/data/hf_processors/test_openmathinstruct2.py only covers process_openmathinstruct2_example.
Worth covering at minimum:
- Basic case: solution with a single trailing
\boxed{N} - Nested braces:
\boxed{\frac{1}{2}} - Multiple
\boxed{}: intermediate ones stripped from thinking, last one used for split - Malformed input: unclosed
\boxed{ - No
\boxed{}at all (fallback path)
cuichenx
reviewed
Apr 25, 2026
cuichenx
left a comment
Contributor
There was a problem hiding this comment.
pls check claude comments, let's add a unit test, otherwise code coverage would fail too. otherwise LGM!
Signed-off-by: weijiac <weijiac@NVIDIA.com>
Contributor
Author
|
/ok to test c4dfc76 |
cuichenx
approved these changes
May 1, 2026
Contributor
|
/ok to test 2e917a0 |
gautham-kollu
pushed a commit
that referenced
this pull request
May 12, 2026
Signed-off-by: weijiac <weijiac@NVIDIA.com>
vasunvidia
pushed a commit
to vasunvidia/Megatron-Bridge
that referenced
this pull request
Jun 10, 2026
Signed-off-by: weijiac <weijiac@NVIDIA.com> Signed-off-by: Vasudevan Rengasamy <vrengasamy@nvidia.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What does this PR do ?
related doc
Changelog
GitHub Actions CI
See the CI sectionin the Contributing doc for how to trigger the CI. A Nvidia developer will need to approve and trigger the CI for external contributors.
Before your PR is "Ready for review"
Pre checks:
If you haven't finished some of the above items you can still open "Draft" PR.
Additional Information