-
Notifications
You must be signed in to change notification settings - Fork 584
feat(pt): add observed-type option for dp show
#4820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 Walkthrough## Walkthrough
Support for a new "observed-type" attribute was added to the `dp show` command, enabling detailed reporting of element types observed during model training. This includes CLI support, backend interface and PyTorch implementation, multi-task model handling, a utility to sort element types by atomic number, updated tests, and documentation enhancements.
## Changes
| File(s) | Change Summary |
|--------------------------------|----------------------------------------------------------------------------------------------------------------------------------|
| deepmd/entrypoints/show.py | Added handling for "observed-type" attribute; logs observed element types for single- and multi-task models. |
| deepmd/infer/deep_eval.py | Added abstract `get_observed_types` method to backend interface and concrete method in `DeepEval` class. |
| deepmd/main.py | Extended CLI "show" subcommand to include "observed-type" as a valid attribute choice. |
| deepmd/pt/infer/deep_eval.py | Added `no_jit` constructor parameter; implemented `get_observed_types` to identify observed element types from model buffers. |
| deepmd/utils/econf_embd.py | Added `sort_element_type` function to sort element symbols by atomic number. |
| deepmd/pt/model/model/ener_model.py | Added `get_observed_type_list` method to `EnergyModel` to retrieve observed element types from data statistics. |
| deepmd/pt/model/model/model.py | Added abstract `get_observed_type_list` method to `BaseModel`. |
| source/tests/pt/test_dp_show.py | Updated tests to include assertions for "observed-type" output in both single- and multi-task model scenarios. |
| doc/model/show-model-info.md | Updated documentation to describe the new "observed-type" attribute and provide example output. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant User
participant CLI
participant ShowEntrypoint
participant DeepEval
participant Model
User->>CLI: deepmd show model observed-type
CLI->>ShowEntrypoint: call show("observed-type")
ShowEntrypoint->>DeepEval: get_observed_types()
DeepEval->>Model: Inspect model buffers for observed types
Model-->>DeepEval: Return observed types info
DeepEval-->>ShowEntrypoint: Return observed types dict
ShowEntrypoint-->>CLI: Output observed types details
CLI-->>User: Display observed types informationSuggested reviewers
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #4820 +/- ##
=======================================
Coverage 84.57% 84.57%
=======================================
Files 699 699
Lines 68039 68120 +81
Branches 3541 3541
=======================================
+ Hits 57542 57614 +72
- Misses 9363 9372 +9
Partials 1134 1134 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
doc/model/show-model-info.md (2)
20-21: Attribute list: consider alphabetical ordering + tighten wordingThe new bullet is very welcome, but the list is now neither alphabetically sorted (
descriptor,fitting-net,model-branch, …) nor grouped by backend support, which makes scanning harder.
While you are touching this section, consider:- - `observed-type`: (Supported Backends: PyTorch) Shows the type (element) coverage for this model. + - `observed-type`: (PyTorch only) Shows the element types observed during training.Shorter parenthesis and moving the bullet to keep alphabetical order (
descriptor,fitting-net,model-branch,observed-type,size,type-map) will improve readability.
64-69: Clarify “coverage” vs. “type-map” & fix double-space grammar glitch
The previous review raised confusion around type map vs type coverage.
Line 68 gives a note, but it still might read as “may differ” without saying why.
A one-liner such as “Coverage reports elements actually present in the training set, whereas the type-map is the full embedding index.” would prevent recurring support questions.There is a double space before the next header (
map.␠␠## Example Output) which trips spelling/grammar checkers. Remove the extra space.- - Note: Type coverage reflects the types observed during training data statistics, which may differ from the type map. + - Note: *Coverage* lists elements present in the training data, whereas *type-map* is the full embedding index; they can therefore differ.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
deepmd/entrypoints/show.py(2 hunks)deepmd/infer/deep_eval.py(2 hunks)deepmd/main.py(1 hunks)deepmd/pt/infer/deep_eval.py(4 hunks)doc/model/show-model-info.md(3 hunks)source/tests/pt/test_dp_show.py(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- deepmd/main.py
- deepmd/infer/deep_eval.py
- deepmd/entrypoints/show.py
- deepmd/pt/infer/deep_eval.py
- source/tests/pt/test_dp_show.py
🧰 Additional context used
🪛 LanguageTool
doc/model/show-model-info.md
[grammar] ~20-~20: Use correct spacing
Context: ...ws the type (element) coverage for this model. ## Example Usage ```bash # For a multi-ta...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~68-~68: Use correct spacing
Context: ...tistics, which may differ from the type map. ## Example Output For a singletask model,...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Test C++ (true)
- GitHub Check: Test C++ (false)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
🔇 Additional comments (1)
doc/model/show-model-info.md (1)
83-85: Example output: keep order consistent with bullet descriptionIn the earlier bullet list you present number then list; the example output shows them reversed (
Numberafter the header line). For quick copy-paste comparison and to reduce cognitive load, swap to match the documented order or update the bullet list accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
doc/model/show-model-info.md (2)
64-69: Tighten wording & fix double-space artefactsMinor copy-editing for clarity and to silence grammar linters.
-**observed-type** +**type-coverage** - - Displays the count and list of element types of the model during data statistics. + - Displays the number and list of element types observed in the training data. - - Note: This info shows the types observed during training data statistics, which may differ from the type map. + - Note: Coverage is computed from training-data statistics and may therefore differ from the static type map.
83-85: Example output may confuse readersThe example shows
type_map = ['O', 'H', 'Au']but the observed set lists only['H', 'O'].
Consider adding a one-line explanation right after the example to stress that elements absent from the training data (here:Au) are intentionally missing from the coverage list.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
deepmd/infer/deep_eval.py(2 hunks)deepmd/pt/infer/deep_eval.py(4 hunks)doc/model/show-model-info.md(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- deepmd/infer/deep_eval.py
- deepmd/pt/infer/deep_eval.py
🧰 Additional context used
🪛 LanguageTool
doc/model/show-model-info.md
[grammar] ~20-~20: Use correct spacing
Context: ...pes (elements) of the model during data statistics. ## Example Usage ```bash # For a multi-ta...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~68-~68: Use correct spacing
Context: ...tistics, which may differ from the type map. ## Example Output For a singletask model,...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (30)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (cpu, cpu)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
doc/model/show-model-info.md (1)
20-21: Attribute name still conflicts with PR title & commit messageThe docs keep
observed-type, while the PR title / description saytype-coverage. This mismatch has already been raised and is still unresolved – please settle on one spelling and align code, docs, tests, and commit messages accordingly.
🧹 Nitpick comments (1)
doc/model/show-model-info.md (1)
64-69: Minor phrasing / style nitpick for “observed-type” sectionShorten the wording and avoid the repeated “data statistics” phrase.
- - Displays the count and list of observed element types of the model during data statistics. + - Displays the count and list of element types observed during training-data statistics.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
doc/model/show-model-info.md(3 hunks)
🧰 Additional context used
🪛 LanguageTool
doc/model/show-model-info.md
[grammar] ~20-~20: Use correct spacing
Context: ...pes (elements) of the model during data statistics. ## Example Usage ```bash # For a multi-ta...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~68-~68: Use correct spacing
Context: ...tistics, which may differ from the type map. ## Example Output For a singletask model,...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
- GitHub Check: CodeQL
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
- GitHub Check: Analyze (javascript-typescript)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (rocm, rocm)
type-coverage option for dp showobserved-types option for dp show
observed-types option for dp showobserved-type option for dp show
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
doc/model/show-model-info.md (2)
20-21: Fix spacing issue after period.Static analysis flagged a spacing issue that should be corrected for consistency.
- - `observed-type`: (Supported Backends: PyTorch) Shows the observed types (elements) of the model during data statistics. Only energy models are supported now. + - `observed-type`: (Supported Backends: PyTorch) Shows the observed types (elements) of the model during data statistics. Only energy models are supported now.
64-69: Fix spacing issue and approve documentation.The documentation clearly explains the new feature. There's a minor spacing issue that should be corrected.
- - Note: This info shows the types observed during training data statistics, which may differ from the type map. + - Note: This info shows the types observed during training data statistics, which may differ from the type map.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
deepmd/pt/infer/deep_eval.py(4 hunks)deepmd/pt/model/model/ener_model.py(1 hunks)deepmd/pt/model/model/model.py(1 hunks)doc/model/show-model-info.md(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/pt/infer/deep_eval.py
🧰 Additional context used
🧠 Learnings (1)
deepmd/pt/model/model/model.py (2)
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4002
File: deepmd/dpmodel/model/model.py:106-106
Timestamp: 2024-10-08T15:32:11.479Z
Learning: The `get_class_by_type` method in `deepmd/utils/plugin.py` includes error handling for unsupported model types by raising a `RuntimeError` and providing a "Did you mean" message if there are close matches.
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4002
File: deepmd/dpmodel/model/model.py:106-106
Timestamp: 2024-07-22T21:18:12.787Z
Learning: The `get_class_by_type` method in `deepmd/utils/plugin.py` includes error handling for unsupported model types by raising a `RuntimeError` and providing a "Did you mean" message if there are close matches.
🧬 Code Graph Analysis (2)
deepmd/pt/model/model/model.py (1)
deepmd/pt/model/model/ener_model.py (1)
get_observed_type_list(48-71)
deepmd/pt/model/model/ener_model.py (4)
deepmd/pt/model/model/model.py (1)
get_observed_type_list(52-59)deepmd/pt/model/atomic_model/base_atomic_model.py (1)
get_type_map(126-128)deepmd/pt/model/model/make_model.py (2)
get_type_map(554-556)get_out_bias(201-202)deepmd/pt/model/atomic_model/dp_atomic_model.py (1)
get_out_bias(260-261)
🪛 LanguageTool
doc/model/show-model-info.md
[grammar] ~20-~20: Use correct spacing
Context: ...stics. Only energy models are supported now. ## Example Usage ```bash # For a multi-ta...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~68-~68: Use correct spacing
Context: ...tistics, which may differ from the type map. ## Example Output For a singletask model,...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Analyze (python)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (cuda120, cuda)
🔇 Additional comments (5)
deepmd/pt/model/model/model.py (1)
51-59: LGTM! Abstract method properly defined.The abstract method is correctly implemented with appropriate
@torch.jit.exportdecorator, clear docstring, and properNotImplementedErrorfor subclasses to override.doc/model/show-model-info.md (1)
83-86: Well-documented example output.The example output clearly demonstrates how the observed types information is presented to users.
deepmd/pt/model/model/ener_model.py (3)
58-62: Proper validation with clear error messages.The assertions provide good validation with descriptive error messages that will help developers debug issues.
63-65: Appropriate threshold for numerical stability.The 1e-6 threshold is a reasonable choice for determining significant bias values while maintaining numerical stability.
47-71: Confirmed:get_out_bias()[0]correctly extracts the first bias sliceI checked the implementation of
get_out_biasand its initialization inBaseAtomicModel.init_out_stat. It registersout_biaswith shape(n_out, ntypes, odims), so indexing[0]yields a 2D tensor of shape(ntypes, odims)as expected. No changes needed.
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu> Signed-off-by: Duo <50307526+iProzd@users.noreply.github.com>
for more information, see https://pre-commit.ci
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **New Features** - Added support for displaying "observed-type" information in the model reporting tool, showing element types observed during training for single-task and multi-task models. - **Tests** - Updated tests to verify correct reporting of observed types for both single-task and multi-task models. - **Documentation** - Updated documentation to include the new "observed-type" attribute in model information display, with examples and explanations. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Duo <50307526+iProzd@users.noreply.github.com> Co-authored-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Summary by CodeRabbit