feat(pt): add AdamW for pt training #4757
Conversation
📝 Walkthrough

The changes add support for the "AdamW" optimizer in the training module, including a new "weight_decay" parameter. The optimizer selection logic and argument validation are updated to accept "AdamW" as a valid option, alongside the existing "Adam" and "LKF" optimizers. No other public interfaces are modified.
Sequence Diagram(s)sequenceDiagram
participant User
participant TrainingConfig
participant TrainingModule
participant Optimizer
User->>TrainingConfig: Specify opt_type ("Adam" or "AdamW") and weight_decay
TrainingConfig->>TrainingModule: Pass configuration
TrainingModule->>Optimizer: Initialize with opt_type and weight_decay
Optimizer-->>TrainingModule: Return optimizer instance (Adam or AdamW)
TrainingModule->>TrainingModule: Proceed with training loop using optimizer
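As an illustration of the flow above, the selection logic could be sketched as the hypothetical helper below. The names (`opt_type`, `opt_param`, `weight_decay`, the `fused` flag) are taken from snippets quoted later in this review; this is not the verbatim diff in `deepmd/pt/train/training.py`.

```python
import torch


def build_optimizer(params, opt_type: str, start_lr: float, opt_param: dict,
                    device: torch.device) -> torch.optim.Optimizer:
    """Hypothetical helper illustrating the Adam/AdamW selection added by this PR."""
    if opt_type == "Adam":
        # fused kernels are only used off-CPU, mirroring the reviewed snippet
        return torch.optim.Adam(params, lr=start_lr, fused=device.type != "cpu")
    if opt_type == "AdamW":
        return torch.optim.AdamW(
            params, lr=start_lr, weight_decay=float(opt_param["weight_decay"])
        )
    raise ValueError(f"Unsupported opt_type: {opt_type}")
```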
Actionable comments posted: 0
🧹 Nitpick comments (1)
deepmd/pt/train/training.py (1)
613-626: Good implementation of AdamW optimizer support.

The code correctly extends the optimizer initialization to handle both "Adam" and "AdamW" types, initializing the appropriate PyTorch optimizer in each case.

Consider simplifying line 618 from `fused=False if DEVICE.type == "cpu" else True` to `fused=DEVICE.type != "cpu"` for better readability:

```diff
- fused=False if DEVICE.type == "cpu" else True,
+ fused=DEVICE.type != "cpu",
```

Also, for consistency, consider adding the `fused` parameter to the AdamW optimizer as well if it's supported in your PyTorch version (see the sketch after the Ruff output below):

```diff
  self.optimizer = torch.optim.AdamW(
      self.wrapper.parameters(),
      lr=self.lr_exp.start_lr,
      weight_decay=float(self.opt_param["weight_decay"]),
+     fused=DEVICE.type != "cpu",
  )
```

🧰 Tools
🪛 Ruff (0.11.9)
618-618: Use `not ...` instead of `False if ... else True`

Replace with `not ...`

(SIM211)
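If the minimum supported PyTorch version is uncertain, one hedged way to act on the `fused` suggestion is to detect whether the installed `torch.optim.AdamW` accepts the keyword before passing it. This is a sketch under that assumption, not part of the PR:

```python
import inspect

import torch

# Only pass `fused` when the installed PyTorch exposes it for AdamW; older
# releases would raise TypeError for an unknown keyword argument.
_ADAMW_HAS_FUSED = "fused" in inspect.signature(torch.optim.AdamW).parameters


def extra_adamw_kwargs(device: torch.device) -> dict:
    """Hypothetical helper: extra kwargs to splat into torch.optim.AdamW(...)."""
    return {"fused": device.type != "cpu"} if _ADAMW_HAS_FUSED else {}
```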
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- deepmd/pt/train/training.py (3 hunks)
- deepmd/utils/argcheck.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.11.9)
deepmd/pt/train/training.py
618-618: Use not ... instead of False if ... else True
Replace with not ...
(SIM211)
🔇 Additional comments (3)
deepmd/utils/argcheck.py (1)
3180-3180: AdamW optimizer addition looks good!

The addition of "AdamW" as a valid optimizer type is appropriate and aligns with the PR objective to add AdamW support for PyTorch training. This change ensures the argument validation system recognizes AdamW as a valid option alongside the existing Adam and LKF optimizers.
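For illustration only, a user-facing training section selecting the new option might look like the sketch below. The key names (`opt_type`, `opt_param`, `weight_decay`) are taken from the snippets quoted in this review and should be verified against `deepmd/utils/argcheck.py`.

```python
# Hypothetical excerpt of a DeePMD-kit input, written as a Python dict.
training_section = {
    "training": {
        "opt_type": "AdamW",        # newly accepted by the argument check
        "opt_param": {
            "weight_decay": 0.001,  # default value noted in this review
        },
        # ... remaining training options unchanged
    }
}
```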
deepmd/pt/train/training.py (2)
161-161: Good addition of weight_decay parameter.

The implementation correctly adds a `weight_decay` parameter with a default value of 0.001, which will be used by the AdamW optimizer.
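A minimal sketch of how such a default could be applied; the helper name is hypothetical and the PR's actual code path may differ:

```python
def resolve_weight_decay(opt_param: dict) -> float:
    """Return the configured weight_decay, falling back to the 0.001 default."""
    return float(opt_param.get("weight_decay", 0.001))
```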
721-721: Good update to the optimizer condition in training step.

The condition has been correctly updated to include "AdamW" along with "Adam" for the optimizer type check, ensuring the learning rate scheduler and optimization step logic work for both optimizers.
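As a rough outline of the per-step flow this comment refers to (generic PyTorch; the actual scheduler handling in `training.py` may differ):

```python
import torch


def train_step(optimizer: torch.optim.Optimizer, scheduler, loss: torch.Tensor,
               opt_type: str) -> None:
    """Hypothetical outline: Adam and AdamW share the same step logic."""
    if opt_type in ("Adam", "AdamW"):
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        scheduler.step()  # advance the learning-rate schedule for either optimizer
```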
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##            devel    #4757      +/-   ##
==========================================
+ Coverage   84.70%   84.73%   +0.02%
==========================================
  Files         697      697
  Lines       67424    67426       +2
  Branches     3541     3540       -1
==========================================
+ Hits        57112    57132      +20
+ Misses       9182     9161      -21
- Partials     1130     1133       +3
```

☔ View full report in Codecov by Sentry.
## Summary by CodeRabbit

- **New Features**
  - Added support for the "AdamW" optimizer in training configurations.
  - Introduced a "weight_decay" parameter for optimizer settings, with a default value of 0.001.

- **Chores**
  - Updated configuration options to allow selection of "AdamW" as an optimizer type.