-
Notifications
You must be signed in to change notification settings - Fork 584
feat: add yaml input file support #4894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add yaml input file support #4894
Conversation
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
📝 WalkthroughWalkthroughUnified config loading across PD/PT entrypoints by switching to deepmd.common.j_loader and updated TF entrypoint to import j_loader from deepmd.common. Added a test to verify YAML input handling in PT training and cleanup of generated artifacts. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test (pytest)
participant Entry as PT Train Entrypoint
participant Loader as deepmd.common.j_loader
participant Train as Training Logic
Test->>Entry: train_entry(input_file="input.yaml", ..., output="out.json")
Entry->>Loader: j_loader("input.yaml")
Loader-->>Entry: config (parsed YAML/JSON)
Entry->>Train: run(config, params...)
Train-->>Entry: writes out.json
Entry-->>Test: return
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
deepmd/pd/entrypoints/main.py (1)
239-239: Using j_loader(input_file) to enable YAML config: looks good; consider adding a Paddle YAML testThis aligns PD with PT/TF. To prevent regressions, consider adding a minimal PD test that:
- writes self.config to input.yaml,
- invokes pd.entrypoints.main.train(..., input_file="input.yaml", skip_neighbor_stat=True, output="out.json"),
- asserts out.json exists.
I can draft that test if helpful.
source/tests/pt/test_training.py (2)
184-201: Solid YAML acceptance test; use a temp directory to avoid cross-test artifact collisionsWriting input.yaml and out.json to CWD can clash under parallel runs. Prefer a temp dir to confine artifacts and avoid extra cleanup.
Apply this diff:
- def test_yaml_input(self) -> None: - import yaml - - yaml_file = Path("input.yaml") - with open(yaml_file, "w") as fp: - yaml.safe_dump(self.config, fp) - train_entry( - input_file=str(yaml_file), - init_model=None, - restart=None, - finetune=None, - init_frz_model=None, - model_branch="main", - skip_neighbor_stat=True, - output="out.json", - ) - self.assertTrue(Path("out.json").exists()) + def test_yaml_input(self) -> None: + import yaml + import tempfile + + with tempfile.TemporaryDirectory() as td: + td_path = Path(td) + yaml_file = td_path / "input.yaml" + out_file = td_path / "out.json" + with open(yaml_file, "w") as fp: + yaml.safe_dump(self.config, fp) + train_entry( + input_file=str(yaml_file), + init_model=None, + restart=None, + finetune=None, + init_frz_model=None, + model_branch="main", + skip_neighbor_stat=True, + output=str(out_file), + ) + self.assertTrue(out_file.exists())
202-206: Optional: conditional cleanup can be dropped if using a temp dirIf you adopt the temp-dir approach above, this manual cleanup becomes unnecessary; otherwise, keeping it is fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
deepmd/pd/entrypoints/main.py(2 hunks)deepmd/pt/entrypoints/main.py(2 hunks)deepmd/tf/entrypoints/train.py(1 hunks)source/tests/pt/test_training.py(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
deepmd/pd/entrypoints/main.py (2)
source/tests/tf/common.py (1)
j_loader(38-39)source/tests/pd/common.py (1)
j_loader(23-24)
source/tests/pt/test_training.py (1)
deepmd/pt/entrypoints/main.py (2)
main(517-576)train(242-365)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test C++ (false)
- GitHub Check: Test C++ (true)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Build C library (2.14, >=2.5.0,<2.15, libdeepmd_c_cu11.tar.gz)
🔇 Additional comments (4)
deepmd/pd/entrypoints/main.py (1)
24-26: Unified config loader import (deepmd.common.j_loader): LGTMCentralizing on a single loader promotes consistency and enables YAML support across entrypoints.
deepmd/pt/entrypoints/main.py (2)
26-29: Unified config loader import (deepmd.common.j_loader): LGTMConsistent with PD/TF; simplifies maintenance and enables YAML parsing uniformly.
258-258: Load config via j_loader(input_file): LGTM and enables YAML supportThis is the key change the new test exercises; no functional regressions expected for JSON.
source/tests/pt/test_training.py (1)
19-19: Expose train_entry alias for tests: LGTMKeeps callsites readable and decoupled from module path changes.
|
If you don’t mind, I’d like to recommend trying the YAML configuration mechanism based on Hydra + OmegaConf. It’s a more powerful YAML configuration approach that is very convenient for automated experiments. We’ve adopted this method in PaddleScience, which makes it easy to run automated serial and parallel experiments. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## devel #4894 +/- ##
==========================================
- Coverage 84.29% 84.29% -0.01%
==========================================
Files 702 702
Lines 68664 68663 -1
Branches 3572 3572
==========================================
- Hits 57883 57881 -2
Misses 9641 9641
- Partials 1140 1141 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * Training entrypoints now accept YAML configuration files in addition to JSON, offering more flexibility when launching training. * Unified configuration loading across frameworks for consistent behavior (PyTorch, Paddle, TensorFlow). * Backward compatible: existing JSON-based workflows continue to work unchanged. * **Tests** * Added coverage to verify YAML input produces the expected training output. * Improved test cleanup to remove generated artifacts after execution. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Summary by CodeRabbit
New Features
Tests