Add fallback check to test_core_aten_ops.py #6559

Merged
wonjoo-wj merged 3 commits into master from wonjoo/core-aten-ops/metrics
Feb 17, 2024
@wonjoo-wj
Collaborator

Noticed that if we run a test for an op that is not supported, the op just falls back to CPU and the unit test succeeds silently. This adds an explicit metric check to ensure that the op is actually lowered in torch_xla.
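A minimal sketch of the kind of check described above, mirroring the `assertNotIn` line visible in the traceback further down. The helper name `assert_op_lowered` and the hard-coded `SAMPLE_REPORT` are illustrative assumptions so that the snippet runs without torch_xla; in the actual test the report string comes from `met.metrics_report()`.

```python
import unittest

# Hypothetical stand-in for the string returned by met.metrics_report();
# hard-coded so the sketch runs without torch_xla installed.
SAMPLE_REPORT = (
    "Counter: CreateXlaTensor\n  Value: 2\n"
    "Counter: xla::_to_cpu\n  Value: 1\n"
    "Counter: aten::reflection_pad3d\n  Value: 1\n"
)


def assert_op_lowered(testcase, aten_function_name, metrics_report):
    """Fail if the op name appears in the report, i.e. the op fell back to CPU."""
    testcase.assertNotIn(aten_function_name, metrics_report)


case = unittest.TestCase()

# No counter for this op in the sample report, so the check passes.
assert_op_lowered(case, "aten::add", SAMPLE_REPORT)

# This op has an aten:: counter in the sample report, so the check fails.
try:
    assert_op_lowered(case, "aten::reflection_pad3d", SAMPLE_REPORT)
except AssertionError:
    print("fallback detected for aten::reflection_pad3d")
```

Note that this is a substring check, so an op name that is a prefix of another counter name would also match; that is stricter than necessary but errs on the side of flagging a fallback.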

Running the tests with this check shows that 6 of them were silently passing while falling back to CPU:

  1. test_aten_grid_sampler_2d_0
  2. test_aten_reflection_pad1d_0
  3. test_aten_reflection_pad1d_1
  4. test_aten_reflection_pad3d_0
  5. test_aten_reflection_pad3d_1
  6. test_aten_reflection_pad3d_2

Example output of one such failing test:

======================================================================
FAIL: test_aten_reflection_pad3d_2 (__main__.AtenOpTest) [torch_xla_metric]
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/wonjoo/pytorch/xla/test/test_core_aten_ops.py", line 59, in run_export_and_compare
    testcase.assertNotIn(aten_function_name, met.metrics_report())
AssertionError: 'aten::reflection_pad3d' unexpectedly found in 'Metric: DeviceLockWait\n  TotalSamples: 2\n  Accumulator: 063.560us\n  ValueRate: 10s009ms448.819us / second\n  Rate: 314961 / second\n  Percentiles: 1%=006.920us; 5%=006.920us; 10%=006.920us; 20%=006.920us; 50%=056.640us; 80%=056.640us; 90%=056.640us; 95%=056.640us; 99%=056.640us\nMetric: IrValueTensorToXlaData\n  TotalSamples: 2\n  Accumulator: 509.530us\n  ValueRate: 003ms442.425us / second\n  Rate: 13.5122 / second\n  Percentiles: 1%=201.220us; 5%=201.220us; 10%=201.220us; 20%=201.220us; 50%=308.310us; 80%=308.310us; 90%=308.310us; 95%=308.310us; 99%=308.310us\nMetric: LazyTracing\n  TotalSamples: 7\n  Accumulator: 145ms559.775us\n  ValueRate: 975ms930.568us / second\n  Rate: 47.2089 / second\n  Percentiles: 1%=262.510us; 5%=262.510us; 10%=262.510us; 20%=369.390us; 50%=627.540us; 80%=001ms011.209us; 90%=141ms770.906us; 95%=141ms770.906us; 99%=141ms770.906us\nMetric: TensorToData\n  TotalSamples: 2\n  Accumulator: 415.050us\n  ValueRate: 003ms804.118us / second\n  Rate: 13.5122 / second\n  Percentiles: 1%=154.880us; 5%=154.880us; 10%=154.880us; 20%=154.880us; 50%=260.170us; 80%=260.170us; 90%=260.170us; 95%=260.170us; 99%=260.170us\nMetric: TensorsGraphSize\n  TotalSamples: 1\n  Accumulator: 1.00\n  Percentiles: 1%=1.00; 5%=1.00; 10%=1.00; 20%=1.00; 50%=1.00; 80%=1.00; 90%=1.00; 95%=1.00; 99%=1.00\nMetric: UnwrapXlaData\n  TotalSamples: 2\n  Accumulator: 012.490us\n  ValueRate: 049ms705.350us / second\n  Rate: 7799.1 / second\n  Percentiles: 1%=004.480us; 5%=004.480us; 10%=004.480us; 20%=004.480us; 50%=008.010us; 80%=008.010us; 90%=008.010us; 95%=008.010us; 99%=008.010us\nMetric: WrapXlaData\n  TotalSamples: 1\n  Accumulator: 001.930us\n  Percentiles: 1%=001.930us; 5%=001.930us; 10%=001.930us; 20%=001.930us; 50%=001.930us; 80%=001.930us; 90%=001.930us; 95%=001.930us; 99%=001.930us\nCounter: CreateXlaTensor\n  Value: 2\nCounter: UncachedCompile\n  Value: 1\nCounter: xla::_copy_from\n  Value: 
2\nCounter: xla::_to_copy\n  Value: 2\nCounter: xla::_to_cpu\n  Value: 1\nCounter: xla::empty_symint\n  Value: 2\nMetric: CompileTime\n  TotalSamples: 1\n  Accumulator: 139ms227.076us\n  Percentiles: 1%=139ms227.076us; 5%=139ms227.076us; 10%=139ms227.076us; 20%=139ms227.076us; 50%=139ms227.076us; 80%=139ms227.076us; 90%=139ms227.076us; 95%=139ms227.076us; 99%=139ms227.076us\nMetric: ExecuteTime\n  TotalSamples: 1\n  Accumulator: 177.490us\n  Percentiles: 1%=177.490us; 5%=177.490us; 10%=177.490us; 20%=177.490us; 50%=177.490us; 80%=177.490us; 90%=177.490us; 95%=177.490us; 99%=177.490us\nMetric: InboundData\n  TotalSamples: 1\n  Accumulator: 972.00B\n  Percentiles: 1%=972.00B; 5%=972.00B; 10%=972.00B; 20%=972.00B; 50%=972.00B; 80%=972.00B; 90%=972.00B; 95%=972.00B; 99%=972.00B\nMetric: OutboundData\n  TotalSamples: 2\n  Accumulator: 8.54KB\n  ValueRate: 57.72KB / second\n  Rate: 13.5128 / second\n  Percentiles: 1%=972.00B; 5%=972.00B; 10%=972.00B; 20%=972.00B; 50%=7.59KB; 80%=7.59KB; 90%=7.59KB; 95%=7.59KB; 99%=7.59KB\nMetric: TransferFromDeviceTime\n  TotalSamples: 1\n  Accumulator: 200.740us\n  Percentiles: 1%=200.740us; 5%=200.740us; 10%=200.740us; 20%=200.740us; 50%=200.740us; 80%=200.740us; 90%=200.740us; 95%=200.740us; 99%=200.740us\nMetric: TransferToDeviceTime\n  TotalSamples: 2\n  Accumulator: 183.900us\n  ValueRate: 001ms242.496us / second\n  Rate: 13.5127 / second\n  Percentiles: 1%=075.200us; 5%=075.200us; 10%=075.200us; 20%=075.200us; 50%=108.700us; 80%=108.700us; 90%=108.700us; 95%=108.700us; 99%=108.700us\nCounter: CreateCompileHandles\n  Value: 1\nCounter: CreateDataHandles\n  Value: 3\nCounter: aten::reflection_pad3d\n  Value: 1\n'

The metrics show that aten::reflection_pad3d has fallen back to CPU, which makes sense because it isn't lowered in torch_xla according to https://github.com/pytorch/xla/blob/master/codegen/xla_native_functions.yaml.
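To make a report like the one above easier to read, the `aten::` counters can be pulled out of the text directly. The following is a rough sketch, not part of this PR: `fallback_counters` is a hypothetical helper, and `REPORT_EXCERPT` is a short excerpt of the counter lines from the output above, assuming the `Counter: <name>` / `Value: <n>` layout shown there.

```python
import re

# Excerpt of the counter section from the failing test's metrics report.
REPORT_EXCERPT = (
    "Counter: xla::_to_cpu\n  Value: 1\n"
    "Counter: xla::empty_symint\n  Value: 2\n"
    "Counter: CreateDataHandles\n  Value: 3\n"
    "Counter: aten::reflection_pad3d\n  Value: 1\n"
)

# Matches a "Counter: aten::<op>" line followed by its "Value: <n>" line.
_ATEN_COUNTER = re.compile(r"^Counter: (aten::\S+)\n\s*Value: (\d+)", re.MULTILINE)


def fallback_counters(metrics_report):
    """Return {counter_name: value} for every aten:: counter in the report."""
    return {name: int(value) for name, value in _ATEN_COUNTER.findall(metrics_report)}


print(fallback_counters(REPORT_EXCERPT))
```

Counters prefixed with `xla::` are skipped on purpose, since those correspond to ops that torch_xla handled itself.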

The good news is that only 3 ops (grid_sampler_2d, reflection_pad1d, and reflection_pad3d) were silently passing, so not much extra work is required on our end. I'll add these to the core aten op issues.

cc @qihqi
