After #7067, `timm_nfnet` started failing with the following error:

```sh
python xla/benchmarks/experiment_runner.py \
  --suite-name torchbench --accelerator cuda --repeat 8 --iterations-per-run 1 \
  --xla PJRT --dynamo None --test train \
  --filter timm_nfnet
```
```
Traceback (most recent call last):
  File "xla/benchmarks/experiment_runner.py", line 960, in <module>
    main()
  File "xla/benchmarks/experiment_runner.py", line 956, in main
    runner.run()
  File "xla/benchmarks/experiment_runner.py", line 61, in run
    self.run_single_config()
  File "xla/benchmarks/experiment_runner.py", line 256, in run_single_config
    metrics, last_output = self.run_once_and_gather_metrics(
  File "xla/benchmarks/experiment_runner.py", line 349, in run_once_and_gather_metrics
    output, _ = loop(iter_fn=self._default_iter_fn)
  File "xla/benchmarks/experiment_runner.py", line 306, in loop
    output, timing, trace = iter_fn(benchmark_experiment, benchmark_model,
  File "xla/benchmarks/experiment_runner.py", line 224, in _default_iter_fn
    self._mark_step(benchmark_experiment, output)
  File "xla/benchmarks/experiment_runner.py", line 428, in _mark_step
    xm.mark_step()
  File "xla/torch_xla/core/xla_model.py", line 1055, in mark_step
    torch_xla._XLAC._xla_step_marker(
RuntimeError: Bad StatusOr access: INTERNAL: during context [Unknown]: Seen floating point types of different precisions in %multiply.3753 = f32[128,3072,6,6]{3,2,1,0} multiply(f16[128,3072,6,6]{3,2,1,0} %multiply.3730, f32[128,3072,6,6]{3,2,1,0} %add.3752), but mixed precision is disallowed.
```
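For reference, here is a minimal sketch (not from the issue; names and shapes are illustrative) of how to dump the pending HLO for a tensor and look for a multiply with mixed f16/f32 operands like `%multiply.3753` above. Note that PyTorch's type promotion will normally insert a convert for a bare f16 * f32 multiply, so this snippet alone may not reproduce the failure; in `timm_nfnet` the mixed-precision op presumably comes out of an AMP/autocast path.

```python
# Hypothetical sketch: inspect the traced HLO for a mixed-dtype multiply.
import torch
import torch_xla
import torch_xla.core.xla_model as xm

device = xm.xla_device()
a = torch.randn(128, 3072, 6, 6, dtype=torch.float16, device=device)
b = torch.randn(128, 3072, 6, 6, dtype=torch.float32, device=device)
out = a * b  # mixed f16/f32 multiply, shaped like the op in the error

# Print the pending HLO for `out`; search it for a multiply whose
# operands have different floating point types.
print(torch_xla._XLAC._get_xla_tensors_hlo([out]))

# Compilation happens at the step marker; this is where the
# RuntimeError in the traceback above surfaces.
xm.mark_step()
```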
## Environment
- Reproducible on XLA backend [CPU/TPU]: CUDA
- torch_xla version: 62c3ba6
cc @miladm @JackCaoG @vanbasten23 @zpcore