-
Notifications
You must be signed in to change notification settings - Fork 584
feat(pt): support training/profiling argument in PT
#3897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
WalkthroughWalkthroughRecent updates focus on improved profiling functionality within the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant TrainingClass
participant Profiler
participant ChromeTraceFile
User ->> TrainingClass: Initialize with training_params
TrainingClass ->> TrainingClass: Assign profiling parameters<br>(profiling, profiling_file)
User ->> TrainingClass: Call run()
TrainingClass ->> Profiler: Start profiler session if profiling enabled
loop During Training Steps
TrainingClass ->> Profiler: Step profiler if profiling enabled
end
TrainingClass ->> Profiler: Stop profiler if profiling enabled
Profiler ->> ChromeTraceFile: Export profiling data<br>to Chrome trace file
TrainingClass ->> User: Log message with location<br>of saved profiling trace file
Recent review detailsConfiguration used: CodeRabbit UI Files selected for processing (1)
Additional context usedRuff
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Outside diff range and nitpick comments (4)
deepmd/pt/train/training.py (2)
Line range hint
755-758: Optimize conditional assignment using a ternary operator.For better readability and conciseness, use a ternary operator for setting
pref_lr.- if _step_id < self.warmup_steps: - pref_lr = _lr.start_lr - else: - pref_lr = cur_lr + pref_lr = _lr.start_lr if _step_id < self.warmup_steps else cur_lr
Line range hint
865-865: Remove unused loop control variable.The variable
iiis not used within the loop body. It can be replaced with_to indicate it's intentionally unused.- for ii in range(valid_numb_batch): + for _ in range(valid_numb_batch):deepmd/utils/argcheck.py (2)
Line range hint
75-75: Addstacklevelto warnings for better debugging.The
warnings.warnfunction call should include astacklevelargument to improve the traceability of the warning's origin in larger codebases.- warnings.warn(f"{key} has been removed and takes no effect.", FutureWarning) + warnings.warn(f"{key} has been removed and takes no effect.", FutureWarning, stacklevel=2)
Line range hint
1171-1178: Remove unused variables.Several variables are defined but never used within the function. Removing these can clean up the code and reduce confusion.
- link_lf = make_link("loc_frame", "model/descriptor[loc_frame]") - link_se_e2_a = make_link("se_e2_a", "model/descriptor[se_e2_a]") - link_se_e2_r = make_link("se_e2_r", "model/descriptor[se_e2_r]") - link_se_e3 = make_link("se_e3", "model/descriptor[se_e3]") - link_se_a_tpe = make_link("se_a_tpe", "model/descriptor[se_a_tpe]") - link_hybrid = make_link("hybrid", "model/descriptor[hybrid]") - link_se_atten = make_link("se_atten", "model/descriptor[se_atten]") - link_se_atten_v2 = make_link("se_atten_v2", "model/descriptor[se_atten_v2]")
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #3897 +/- ##
==========================================
- Coverage 82.73% 82.72% -0.01%
==========================================
Files 519 519
Lines 50510 50515 +5
Branches 3018 3019 +1
==========================================
Hits 41788 41788
- Misses 7787 7790 +3
- Partials 935 937 +2 ☔ View full report in Codecov by Sentry. |
wanghan-iapcm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im confused by the relationship between the options enable_profiler and profiling. could you please explain more in the doc?
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
I updated the documentation. While the behavior of the two options never changed, I am wondering whether we should give a more clear name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Outside diff range and nitpick comments (2)
deepmd/utils/argcheck.py (2)
Line range hint
75-75: Specify stack level for better warning traceability.When issuing warnings, it's helpful to specify the
stacklevelso that the warning points to the caller’s line number rather than the line number inside the utility function itself.- warnings.warn(f"{key} has been removed and takes no effect.", FutureWarning) + warnings.warn(f"{key} has been removed and takes no effect.", FutureWarning, stacklevel=2)
Line range hint
1171-1178: Remove unused local variables.Several variables (
link_lf,link_se_e2_a,link_se_e2_r,link_se_e3,link_se_a_tpe,link_hybrid,link_se_atten,link_se_atten_v2) are defined but never used. This could be cleaned up to avoid confusion and maintain cleaner code.- link_lf = make_link("loc_frame", "model/descriptor[loc_frame]") - link_se_e2_a = make_link("se_e2_a", "model/descriptor[se_e2_a]") - link_se_e2_r = make_link("se_e2_r", "model/descriptor[se_e2_r]") - link_se_e3 = make_link("se_e3", "model/descriptor[se_e3]") - link_se_a_tpe = make_link("se_a_tpe", "model/descriptor[se_a_tpe]") - link_hybrid = make_link("hybrid", "model/descriptor[hybrid]") - link_se_atten = make_link("se_atten", "model/descriptor[se_atten]") - link_se_atten_v2 = make_link("se_atten_v2", "model/descriptor[se_atten_v2]")
wanghan-iapcm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we deprecate the profiling option in the future?
) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Added profiling functionality with parameters for enabling profiling and exporting data to a Chrome trace file. - **Documentation** - Updated documentation for profiling-related arguments to clarify export options for performance analysis. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Summary by CodeRabbit
New Features
Documentation