Hi. I have trouble understanding how and when you send tensors to the GPU. I have a custom training routine where I use your dataloader and some modules built via the build_X methods. When the data comes from your dataloaders, neither the data nor the model appears to be required to be on the GPU. In fact, the tensors must stay on the CPU for the modules to handle them; otherwise they raise an error about mismatched device allocation between the input and the model. For example, in Anchor3dHead, the passed tensor is sent through a Conv2d layer (
```python
cls_score = self.conv_cls(x)
```
), and I cannot find any .to(device) or .cuda() call anywhere along the way. Yet the processing seems really fast. Can you give me some insight into how this is done?
Reference: mmdetection3d/mmdet3d/models/dense_heads/anchor3d_head.py, line 150 at commit 78f4562.
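For context on the error I mean, here is a minimal PyTorch sketch (my own toy example, not mmdetection3d code): nn.Module never moves tensors on its own, so a Conv2d's weights and its input must already share a device before the forward call.

```python
import torch
import torch.nn as nn

# A toy head similar in spirit to Anchor3dHead's conv_cls: a plain Conv2d.
conv_cls = nn.Conv2d(in_channels=4, out_channels=2, kernel_size=1)

# Pick the device explicitly; nothing in nn.Module moves tensors for you.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Both the module's parameters and the input must live on the same device.
conv_cls = conv_cls.to(device)
x = torch.randn(1, 4, 8, 8, device=device)

cls_score = conv_cls(x)  # works: weights and input share a device
print(cls_score.shape)   # torch.Size([1, 2, 8, 8])

# A CPU input against CUDA weights is exactly the mismatch error described above.
if torch.cuda.is_available():
    try:
        conv_cls(torch.randn(1, 4, 8, 8))
    except RuntimeError as e:
        print("device mismatch:", type(e).__name__)
```

This is the usual contract I expected, which is why the absence of any explicit .to(device) in the head's forward path confused me.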