Description
System Info
transformers==4.30.2
Mac 2019, Ventura 13.4
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
ISSUE: I am running a generic model training run with Trainer locally on my Mac. My model is being moved to MPS, but my tensors are staying on CPU.
I can provide more details about my script, but I suspect this is a general library problem. Here are the lines of code I discovered:
When the accelerator is instantiated in the Trainer class, it doesn't get passed any user-specific arguments (e.g., from TrainingArguments) that would give the user control over which device to use. As a result, when running locally on a Mac, Accelerate infers which device we want to use and moves the model to self.device in the non-distributed setting. I'm not sure yet how self.device is set in Accelerate, but Trainer doesn't natively move my data to MPS, so my script crashes.
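To make the failure mode concrete, here is a minimal pure-Python sketch of the mismatch described above. `infer_device`, `Model`, and `Batch` are illustrative stand-ins, not the real Trainer/Accelerate APIs:

```python
def infer_device(mps_available: bool) -> str:
    # Stands in for Accelerate's non-distributed device inference,
    # which picks "mps" on a Metal-capable Mac when the user gives
    # no override.
    return "mps" if mps_available else "cpu"

class Model:
    def __init__(self):
        self.device = "cpu"

    def to(self, device: str) -> "Model":
        self.device = device
        return self

class Batch:
    # Stand-in for a dataloader batch; in the scenario above it is
    # never moved off the CPU.
    device = "cpu"

def train_step(model: Model, batch: Batch) -> None:
    # A real forward pass fails when model weights and input tensors
    # live on different devices.
    if model.device != batch.device:
        raise RuntimeError(
            f"device mismatch: model on {model.device}, batch on {batch.device}"
        )

model = Model().to(infer_device(mps_available=True))  # model lands on mps
batch = Batch()                                       # data stays on cpu
try:
    train_step(model, batch)
except RuntimeError as err:
    print(err)  # device mismatch: model on mps, batch on cpu
```

The point of the sketch: the model's device is decided by inference the user never sees, while the data's device is decided elsewhere, and nothing reconciles the two.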
Expected behavior
Ideally, there would be a flag I can pass into Trainer to avoid MPS if I don't want it and just stick with CPU.
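The shape of the requested behavior could look like the following. `resolve_device` and its parameters are hypothetical names made up for illustration, not part of transformers or accelerate:

```python
from typing import Optional

def resolve_device(requested: Optional[str], mps_available: bool) -> str:
    # An explicit user choice always wins...
    if requested is not None:
        return requested
    # ...otherwise fall back to the current auto-inference.
    return "mps" if mps_available else "cpu"

print(resolve_device("cpu", mps_available=True))  # cpu (flag honored)
print(resolve_device(None, mps_available=True))   # mps (auto-detected)
```

With something like this wired into the accelerator setup, a user on a Mac could opt out of MPS without Accelerate overriding them.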