Configuration Cleanup by SahilJain314 · Pull Request #10 · NVIDIA-NeMo/RL

SahilJain314 · 2025-03-21T03:47:48Z

What does this PR do?

Cleans up the configs to provide one 1B example (1 GPU) and one 8B example (8 GPUs).

Also makes small miscellaneous naming changes for clarity.

Signed-off-by: Sahil Jain <sahilj@nvidia.com>

Comments addressed: #3, #5, NVIDIA-NeMo#7, NVIDIA-NeMo#8, NVIDIA-NeMo#9, NVIDIA-NeMo#10, NVIDIA-NeMo#11.

- Rename _load_M -> _get_sparse_projection_matrix and _load_dense_projection -> _get_topk_projection (later removed in favor of module-level cache helpers below).
- Drop unused alignment_student_spans / alignment_teacher_spans from the cross-tokenizer batch payload.
- Remove NRL_XTOKEN_LOSS_DUMP_DIR debug-dump side effect.
- Move Fp32SparseMM, chunk_average_log_probs, valid_chunk_mask to a new shared module nemo_rl/algorithms/x_token/utils.py.
- Extract projection-file parsing into utils.parse_projection_file; tokenalign.py and loss_functions.py both go through it.
- Move per-instance projection-matrix caches to process-local caches in utils.get_sparse_projection_matrix / get_topk_projection. The driver no longer holds large CUDA tensors; each Ray worker fills its own cache on first loss call.

Signed-off-by: Adithya Hanasoge <avenkateshha@nvidia.com>
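The process-local caching described in the last item can be illustrated with a minimal sketch. This is not the code from nemo_rl/algorithms/x_token/utils.py: the function names, signatures, and the stand-in parser below are hypothetical, and plain Python tuples stand in for the sparse tensors. It only shows the general pattern of a module-level cache that each process fills lazily on first use, so a driver that never calls the loss never loads the matrix.

```python
from functools import lru_cache

# Counter used only to demonstrate cache hits (hypothetical, illustration only).
load_counter = {"n": 0}


def parse_projection_stub(path: str) -> dict:
    """Stand-in for real projection-file parsing (assumption: the real parser
    reads `path`; here we return a tiny fixed {(row, col): weight} map)."""
    load_counter["n"] += 1
    return {(0, 0): 1.0, (1, 2): 0.5}


@lru_cache(maxsize=None)
def get_projection_cached(path: str, device: str = "cpu"):
    """Module-level cache keyed by (path, device).

    The cache lives in the module, so every process (for example, each worker)
    that imports it gets its own independent copy, filled on the first call.
    """
    entries = parse_projection_stub(path)
    # A real version would presumably build a sparse tensor on `device`;
    # a sorted tuple keeps this sketch dependency-free.
    return (device, tuple(sorted(entries.items())))


if __name__ == "__main__":
    first = get_projection_cached("projection.pt", "cpu")
    second = get_projection_cached("projection.pt", "cpu")
    print(first is second, load_counter["n"])
```

Keying the cache on both the path and the device means a worker that uses a different device triggers its own load instead of reusing an object built for another device.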

config cleanup

00f776f

Signed-off-by: Sahil Jain <sahilj@nvidia.com>

SahilJain314 added the enhancement (New feature or request) label Mar 21, 2025

SahilJain314 requested a review from parthchadha March 21, 2025 03:47

github-actions Bot added the Documentation (Improvements or additions to documentation) label Mar 21, 2025

SahilJain314 merged commit 38a74b2 into sahilj/more_docs Mar 21, 2025

SahilJain314 deleted the sahilj/config_cleanup branch March 21, 2025 03:51

SahilJain314 merged 1 commit into sahilj/more_docs from sahilj/config_cleanup
