fix(clp-package): Use generic archive directory as mount target in native/dataset_manager.py to match path in metadata database (fixes #1526).#1528
Conversation
WalkthroughCreate a container-visible copy of FS archive-output config and add a bind Changes
Sequence DiagramsequenceDiagram
participant Caller
participant DM as dataset_manager
participant Config as ArchiveConfig(FS)
participant Mount as DockerMount
Caller->>DM: Determine archive_output.storage.type
alt storage.type == FS
DM->>Config: copy config for container visibility
DM->>Config: transform storage path for container
DM->>Mount: create DockerMount (type: BIND) hostPath -> containerPath
DM->>DM: append mount to mounts list
else
DM->>DM: leave mounts unchanged
end
rect rgb(240, 245, 255)
Note over DM: Native deletion (simplified)
Caller->>DM: _try_deleting_archives(dataset_archive_storage_dir)
DM->>DM: resolve Path(dataset_archive_storage_dir)
DM->>DM: check existence and is_dir
DM->>DM: remove directory if checks pass
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related issues
Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
quinntaylormitchell
left a comment
There was a problem hiding this comment.
only one comment; I very much defer to @kirkrodrigues review though.
and fwiw, the "try" in _try_deleting_archives_from_fs (and the other functions in native/dataset_manager.py for that matter) doesn't seem like a useful or meaningful word to use in the function name(s), but that's outside the scope of this PR.
| if clp_config.archive_output.storage.type == StorageType.FS: | ||
| necessary_mounts.append(mounts.archives_output_dir) | ||
| container_archive_output_config = container_clp_config.archive_output.model_copy(deep=True) | ||
| container_archive_output_config.storage.transform_for_container() |
There was a problem hiding this comment.
does this line really need to be here? if container_clp_config is the version of the config meant for use inside the container, then its archive_output paths should already be in container form (from when they get transformed here), so the paths in container_archive_output_config should already be transformed because it's a deep copy, right?
There was a problem hiding this comment.
right, we could have used generate_docker_compose_container_config() (NOTE: it's a different function than generate_container_config() which is the one currently used in dataset_manager.py) to generate a copy of the config and directly reference it, but to limit the memory impacts, though not significantly, i only make a copy of the container_archive_output_config and transform the copy.
generate_container_config will be deprecated when we move all docker orchestrions to Docker Compose so the changes here are temporary as well
let me know if my explanations should be clarified or if i misunderstood your comment
kirkrodrigues
left a comment
There was a problem hiding this comment.
For the PR title, how about:
fix(clp-package): Use generic archive directory as mount target in `native/dataset_manager.py` to match path in metadata database (fixes #1526).
native/dataset_manager.py to match path in metadata database (fixes #1526).
…ative/dataset_manager.py` to match path in metadata database (fixes y-scope#1526). (y-scope#1528)
Description
mounts.archives_output_dirwithDockerMount(DockerMountType.BIND, host_dir, container_dir)derived from a deep-copied, container-transformed archive_output config._try_deleting_archives_from_fsto take onlydataset_archive_storage_dirand remove brittle top-level containment check.Checklist
breaking change.
Validation performed
Summary by CodeRabbit