Fix auto padding free logic to respect user passed False #4128
Summary of Changes
This pull request addresses an Out-Of-Memory issue encountered when …
Code Review
This pull request correctly addresses an issue where explicitly setting padding_free=False was not respected, by changing the default value to None and updating the detection logic. The changes look good. I've added one suggestion to improve the robustness of the new check in _should_auto_padding_free.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e9981c3b5b
        ):
            return False
    -   return not getattr(config, "padding_free", False)
    +   return getattr(config, "padding_free", False) is None
Treat missing padding_free as unset for auto mode
The new return getattr(config, "padding_free", False) is None check disables auto padding-free for configs that do not define padding_free at all, because the fallback now becomes False instead of “unset”. This regresses the backwards-compatibility path where callers pass plain transformers.TrainingArguments (handled in _backwards_compatible_trainer), since those configs typically lack this field and will no longer get the previous auto-enable behavior unless users manually inject padding_free.
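The concern above can be reproduced in isolation. The following is a minimal sketch, not the project's code: the two helper functions and the `SimpleNamespace` stand-ins are hypothetical, and only the `getattr(..., False) is None` expression comes from the diff. The `None`-fallback variant is one possible way to treat a missing field as unset, assuming that is the desired behavior.

```python
from types import SimpleNamespace


def auto_mode_with_false_fallback(config):
    # Same expression as the new line in the diff: a missing attribute
    # falls back to False, so the "is None" test fails.
    return getattr(config, "padding_free", False) is None


def auto_mode_with_none_fallback(config):
    # Hypothetical alternative: a missing attribute is treated like an
    # unset (None) value, so auto mode still applies.
    return getattr(config, "padding_free", None) is None


# Stand-ins for real config objects (assumption: only the attribute matters here).
legacy_config = SimpleNamespace()                   # no padding_free field at all
unset_config = SimpleNamespace(padding_free=None)   # field present, left unset
explicit_off = SimpleNamespace(padding_free=False)  # user turned it off

for name, cfg in [
    ("missing", legacy_config),
    ("None", unset_config),
    ("False", explicit_off),
]:
    print(
        name,
        auto_mode_with_false_fallback(cfg),
        auto_mode_with_none_fallback(cfg),
    )
```

Both variants agree for `None` and `False`; they differ only when the attribute is absent, which is the case the review comment describes.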
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
danielhanchen left a comment
Tested and verified. The fix correctly distinguishes between "user didn't set padding_free" (None, auto-enable) and "user explicitly set padding_free=False" (respect it).
Unit tests (7/7 passed): All 4 padding_free states (None, True, False, missing attribute) plus edge cases.
GPU benchmarks (Llama-3.2-1B-Instruct, 61 steps):
| Scenario | Peak Mem | train_loss | Stdout message |
|---|---|---|---|
| Baseline (main) | 1.53 GB | 1.3789 | "Padding-free auto-enabled" |
| PR: default (None) | 1.53 GB | 1.3789 | "Padding-free auto-enabled" |
| PR: explicit False | 1.62 GB | 1.3748 | No padding-free message |
| PR: explicit True | 1.53 GB | 1.3789 | "Padding-free enabled" |
- Baseline vs PR default: losses/grad-norms identical -- zero regression
- Baseline vs PR True: losses/grad-norms identical
- PR explicit False: correctly disables padding-free (slightly higher memory, slightly different losses from different batching)
No additional changes needed.
The Qwen3-14B notebook currently OOMs on a T4 because of the increased VRAM usage when padding-free is turned on. SFTConfig currently defaults to padding_free=False, so the current logic can't tell whether it should auto-enable padding-free or whether the user explicitly asked to turn it off.
This PR patches SFTConfig to default padding_free to None. If it's None, padding-free is auto-enabled (the default). If it's True, it's enabled, and if it's False, it's turned off.
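The three-way behavior described above can be sketched with a `None` default acting as the "not set" sentinel. This is an illustration only: `ExampleConfig` and `resolve_padding_free` are hypothetical names, not the patched SFTConfig or the project's helper, and the sketch ignores any other conditions the real auto-enable check may apply.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ExampleConfig:
    # Hypothetical stand-in for a config whose padding_free defaults to None.
    padding_free: Optional[bool] = None


def resolve_padding_free(config: ExampleConfig) -> Tuple[bool, str]:
    """Return (enabled, source) for the three possible states."""
    value = config.padding_free
    if value is None:
        # The user did not set the field: fall back to auto-enabling.
        return True, "auto"
    # The user set True or False explicitly: respect it as given.
    return bool(value), "explicit"


for value in (None, True, False):
    enabled, source = resolve_padding_free(ExampleConfig(padding_free=value))
    print(f"padding_free={value!r}: enabled={enabled} ({source})")
```

With a `False` default, the first and third cases would be indistinguishable, which is why the default has to change for an explicit `False` to be respected.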
Notebook before fix with padding_free=False: https://colab.research.google.com/drive/1u51CbHLntgBLUrWe4B4lZFRNtPGi1faG?usp=sharing
Working notebook after fix with padding_free=False: https://colab.research.google.com/drive/1GYBXNlm9yP8zP0XAT6LD0kmmL5NficK-?usp=sharing