After #7067, `timm_nfnet` started failing with the following error:

```sh
python xla/benchmarks/experiment_runner.py \
  --suite-name torchbench --accelerator cuda --repeat 8 --iterations-per-run 1 \
  --xla PJRT --dynamo None --test train \
  --filter timm_nfnet
```
```
Traceback (most recent call last):
  File "xla/benchmarks/experiment_runner.py", line 960, in <module>
    main()
  File "xla/benchmarks/experiment_runner.py", line 956, in main
    runner.run()
  File "xla/benchmarks/experiment_runner.py", line 61, in run
    self.run_single_config()
  File "xla/benchmarks/experiment_runner.py", line 256, in run_single_config
    metrics, last_output = self.run_once_and_gather_metrics(
  File "xla/benchmarks/experiment_runner.py", line 349, in run_once_and_gather_metrics
    output, _ = loop(iter_fn=self._default_iter_fn)
  File "xla/benchmarks/experiment_runner.py", line 306, in loop
    output, timing, trace = iter_fn(benchmark_experiment, benchmark_model,
  File "xla/benchmarks/experiment_runner.py", line 224, in _default_iter_fn
    self._mark_step(benchmark_experiment, output)
  File "xla/benchmarks/experiment_runner.py", line 428, in _mark_step
    xm.mark_step()
  File "xla/torch_xla/core/xla_model.py", line 1055, in mark_step
    torch_xla._XLAC._xla_step_marker(
RuntimeError: Bad StatusOr access: INTERNAL: during context [Unknown]: Seen floating point types of different precisions in %multiply.3753 = f32[128,3072,6,6]{3,2,1,0} multiply(f16[128,3072,6,6]{3,2,1,0} %multiply.3730, f32[128,3072,6,6]{3,2,1,0} %add.3752), but mixed precision is disallowed.
```
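For reference, here is a minimal sketch (not from the issue; names and shapes are illustrative) of how to dump the pending HLO for a tensor and look for a multiply with mixed f16/f32 operands like `%multiply.3753` above. Note that PyTorch's type promotion will normally insert a convert for a bare f16 * f32 multiply, so this snippet alone may not reproduce the failure; in `timm_nfnet` the mixed-precision op presumably comes out of an AMP/autocast path.

```python
# Hypothetical sketch: inspect the traced HLO for a mixed-dtype multiply.
import torch
import torch_xla
import torch_xla.core.xla_model as xm

device = xm.xla_device()
a = torch.randn(128, 3072, 6, 6, dtype=torch.float16, device=device)
b = torch.randn(128, 3072, 6, 6, dtype=torch.float32, device=device)
out = a * b  # mixed f16/f32 multiply, shaped like the op in the error

# Print the pending HLO for `out`; search it for a multiply whose
# operands have different floating point types.
print(torch_xla._XLAC._get_xla_tensors_hlo([out]))

# Compilation happens at the step marker; this is where the
# RuntimeError in the traceback above surfaces.
xm.mark_step()
```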
## Environment
- Reproducible on XLA backend [CPU/TPU]: CUDA
- torch_xla version: 62c3ba6
cc @miladm @JackCaoG @vanbasten23 @zpcore