
Add explicit types to code analyzer scripts#3

Closed
bobrenjc93 wants to merge 6 commits into main from
type-code-analyzer-scripts

Conversation

@bobrenjc93
Owner

Summary

  • add typed argparse namespaces for the code analyzer entrypoints
  • replace untyped YAML/debug-info handling with explicit typed aliases and casts
  • tighten helper signatures in gen_op_registration_allowlist.py, gen_oplist.py, and gen_operators_yaml.py
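The typed-argparse pattern from the first bullet can be sketched as follows. This is a minimal illustration of the general technique, not code from the PR; the class name and flags are invented for the example.

```python
# Sketch of typing argparse results (names illustrative, not from the PR):
# declare a Namespace subclass annotating the expected attributes, then
# parse into an instance of it so downstream code sees concrete types
# instead of an untyped Namespace.
import argparse


class AnalyzerArgs(argparse.Namespace):
    op_registration_allow_list: str
    output_path: str
    verbose: bool


def parse_args(argv: list[str]) -> AnalyzerArgs:
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--op-registration-allow-list",
        dest="op_registration_allow_list",
        required=True,
    )
    parser.add_argument("--output-path", dest="output_path", default="out.yaml")
    parser.add_argument("--verbose", action="store_true")
    # Parsing into an AnalyzerArgs instance preserves the declared type.
    return parser.parse_args(argv, namespace=AnalyzerArgs())


args = parse_args(["--op-registration-allow-list", "aten::add"])
print(args.op_registration_allow_list)  # aten::add
```

Type checkers then see `args.output_path` as `str` rather than `Any`, which is what makes the tightened helper signatures in the scripts checkable.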

Testing

  • PYTHONPATH=/tmp/repos/bobrenjc93/pytorch/tools/code_analyzer:/tmp/repos/bobrenjc93/pytorch python3 -m unittest -q tools.test.gen_oplist_test tools.test.gen_operators_yaml_test
  • python3 -m py_compile /tmp/repos/bobrenjc93/pytorch/tools/code_analyzer/gen_op_registration_allowlist.py /tmp/repos/bobrenjc93/pytorch/tools/code_analyzer/gen_oplist.py /tmp/repos/bobrenjc93/pytorch/tools/code_analyzer/gen_operators_yaml.py

@bobrenjc93
Owner Author

@codex review

@bobrenjc93
Owner Author

@codex

@chatgpt-codex-connector

To use Codex here, create a Codex account and connect to github.

@bobrenjc93
Owner Author

@codex

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

bobrenjc93 pushed a commit that referenced this pull request Mar 16, 2026
…nces between x86 vs aarch64 (pytorch#176085)

In the test:

```
python test/cpp_extensions/test_libtorch_agnostic.py TestLibtorchAgnosticCUDA.test_std_cuda_check_error_show_cpp_stacktraces_True_cuda
```
it raises an exception when calling `STD_CUDA_CHECK(cudaSetDevice(99999));`, which produces the expected `CUDA error: invalid device` message. However, the expected C++ stack-trace string differs between `x86` and `aarch64`, perhaps due to these issues:
  - pytorch#119905
  - pytorch#134387

In the current setup when getting a stack trace string:
- x86 contains `C++ CapturedTraceback:`
- aarch64 contains `Exception raised from` + `frame #`

An example of the full string from an aarch64 system:
```
AssertionError: 'C++ CapturedTraceback:' not found in 'CUDA error: invalid device ordinal\nGPU device may be out of range, do you have enough GPUs?\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1\nCompile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.\n\nException raised from test_std_cuda_check_error at /opt/pytorch/pytorch/test/cpp_extensions/libtorch_agn_2_10_extension/csrc/test_std_cuda_check.cu:23 (most recent call first):\nframe #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0xd4 (0xe471ebcd39f4 in /usr/local/lib/python3.12/dist-packages/torch/lib/libc10.so)\nframe #1: <unknown function> + 0x43f998 (0xe471ebdcf998 in /usr/local/lib/python3.12/dist-packages/torch/lib/libc10_cuda.so)\nframe #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, unsigned int, bool) + 0x1bc (0xe471ebdcfc0c in /usr/local/lib/python3.12/dist-packages/torch/lib/libc10_cuda.so)\nframe #3: torch_c10_cuda_check_msg + 0x1c (0xe471ef335c4c in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_cuda.so)\nframe #4: test_std_cuda_check_error() + 0x58 (0xe470cd396678 in /opt/pytorch/pytorch/test/cpp_extensions/libtorch_agn_2_10_extension/install/usr/local/lib/python3.12/dist-packages/libtorch_agn_2_10/_C.so)\nframe pytorch#5: c10::BoxedKernel::makeFromFunctor<StableIValueBoxedKernel>(std::unique_ptr<StableIValueBoxedKernel, std::default_delete<StableIValueBoxedKernel> >)::{lambda(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*)#1}::_FUN(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) + 0x16c (0xe47211cd419c in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_cpu.so)\nframe 
pytorch#6: <unknown function> + 0x61d34bc (0xe47211cf34bc in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_cpu.so)\nframe pytorch#7: <unknown function> + 0xe6c324 (0xe4721532c324 in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_python.so)\nframe pytorch#8: <unknown function> + 0xe6c7e0 (0xe4721532c7e0 in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_python.so)\nframe pytorch#9: <unknown function> + 0xd3907c (0xe472151f907c in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_python.so)\nframe pytorch#10: <unknown function> + 0x5ccbf8 (0xe47214a8cbf8 in /usr/local/lib/python3.12/dist-packages/torch/lib/libtorch_python.so)\nframe pytorch#11: /usr/bin/python() [0x504a34]\nframe pytorch#12: PyObject_Call + 0x6c (0x4c633c in /usr/bin/python)\nframe pytorch#13: _PyEval_EvalFrameDefault + 0x3ea0 (0x568564 in /usr/bin/python)\nframe pytorch#14: _PyObject_Call_Prepend + 0xc4 (0x4c5934 in /usr/bin/python)\nframe pytorch#15: /usr/bin/python() [0x52a070]\nframe pytorch#16: _PyObject_MakeTpCall + 0x78 (0x4c3e58 in /usr/bin/python)\nframe pytorch#17: _PyEval_EvalFrameDefault + 0x8a0 (0x564f64 in /usr/bin/python)\nframe pytorch#18: PyEval_EvalCode + 0x130 (0x5632b4 in /usr/bin/python)\nframe pytorch#19: PyRun_StringFlags + 0xe0 (0x59c330 in /usr/bin/python)\nframe pytorch#20: PyRun_SimpleStringFlags + 0x44 (0x67ebc4 in /usr/bin/python)\nframe pytorch#21: Py_RunMain + 0x390 (0x68b380 in /usr/bin/python)\nframe pytorch#22: Py_BytesMain + 0x28 (0x68ae88 in /usr/bin/python)\nframe pytorch#23: <unknown function> + 0x284c4 (0xe47216b084c4 in /lib/aarch64-linux-gnu/libc.so.6)\nframe pytorch#24: __libc_start_main + 0x98 (0xe47216b08598 in /lib/aarch64-linux-gnu/libc.so.6)\nframe pytorch#25: _start + 0x30 (0x5f6770 in /usr/bin/python)\n\n'

To execute this test, run the following from the base repo dir:
    python test/cpp_extensions/test_libtorch_agnostic.py TestLibtorchAgnosticCUDA.test_std_cuda_check_error_show_cpp_stacktraces_True_cuda
```

Pull Request resolved: pytorch#176085
Approved by: https://github.com/eqy
bobrenjc93 pushed a commit that referenced this pull request Mar 18, 2026
… mode + replication padding (pytorch#177166)

Fixes pytorch#170079

## Context

`torch.compile(ReplicationPad1d(...), fullgraph=True)` crashes when
`torch.use_deterministic_algorithms(True)` is set on CUDA. The error: Dynamo can't trace
through `importlib.import_module`.

The deterministic code path exists because the native `replication_pad1d_backward` CUDA
kernel uses `atomicAdd` (non-deterministic). `functional.py` calls `_replication_pad` — a
Python decomposition using `_unsafe_index`, whose backward uses `index_put` (deterministic).
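The reason `atomicAdd`-based reductions are non-deterministic while a fixed-order `index_put` accumulation is not comes down to floating-point addition not being associative: concurrent atomic adds commit in an arbitrary order, and the order changes the rounded result. A small pure-Python demonstration of the underlying effect (not PyTorch code):

```python
# Floating-point addition is not associative, so summing the same values
# in a different order (as concurrent atomicAdds effectively do) can
# change the result. A deterministic kernel fixes the accumulation order.
vals = [1e16, 1.0, -1e16, 1.0]

# Left-to-right: 1e16 + 1.0 rounds back to 1e16 (the 1.0 is lost),
# then -1e16 gives 0.0, then +1.0 gives 1.0.
order_a = sum(vals)

# Cancel the large terms first: both 1.0s survive, giving 2.0.
order_b = vals[0] + vals[2] + vals[1] + vals[3]

print(order_a, order_b)  # 1.0 2.0
```

This is why the deterministic path routes the backward through `index_put` instead of the `atomicAdd` kernel: same values, but a reproducible accumulation order.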

## Dynamo limitations encountered

Three separate Dynamo tracing barriers prevented calling `_replication_pad` directly:

### 1. `importlib.import_module` is marked as skipped

```python
@torch.compile(fullgraph=True)
def fn(x):
    import importlib
    return importlib.import_module("torch").sin(x)
fn(torch.randn(3))  # Unsupported: function marked as skipped
```

### 2. `elementwise_dtypes` returns non-Tensor (from `@pw_cast_for_opmath`)

```python
@torch.compile(fullgraph=True)
def fn(x):
    from torch._prims_common import elementwise_dtypes, ELEMENTWISE_TYPE_PROMOTION_KIND
    dt, _ = elementwise_dtypes(x, type_promotion_kind=ELEMENTWISE_TYPE_PROMOTION_KIND.DEFAULT)
    return x.to(dt)
fn(torch.randn(3))  # Unsupported: torch.* op returned non-Tensor
```

### 3. `torch._check` with closure lambda

```python
@torch.compile(fullgraph=True)
def fn(x):
    dim = x.dim()
    torch._check(dim in (2, 3), lambda: f"expected 2D or 3D, got {dim}D")
    return x + 1
fn(torch.randn(3, 3))  # Unsupported: Can't extract message from torch._check()
```

## Iteration log

| # | Approach | Who | Tests | Reviewer pushback | Why it failed |
|---|----------|-----|-------|-------------------|---------------|
| 1 | Replace `importlib` with `from...import` | Claude | bilinear/trilinear pass, replicate fails | "why do we need bilinear/trilinear tests?" — scoped fix to reported bug only | Hit limitation #2: `@pw_cast_for_opmath` |
| 2 | Skip decomposition under compile via `is_compiling()`, rely on AOTAutograd's `@register_decomposition` | Claude | forward-only `backend="eager"` passes | "can you verify at inductor level this is actually deterministic?" — inspect AOT graph | No backward decomposition registered; backward still uses native `replication_pad1d_backward` (non-deterministic) |
| 3 | Unwrap `@pw_cast_for_opmath` via `__wrapped__` | Claude | N/A — fails immediately | N/A | Hit limitation #3: `torch._check()` closure |
| 4 | `@nonstrict_trace` — Dynamo skips body, AOTAutograd traces through | Reviewer suggestion | `backend="aot_eager"`, forward + backward under `DeterministicGuard(True)` | N/A — fix is correct | N/A |

## Key insight

The fix isn't about making Dynamo trace the decomposition or skipping it entirely — it's
about putting the boundary in the right place. Dynamo doesn't need to see inside; AOTAutograd
does. `@nonstrict_trace` is exactly this boundary.
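The boundary idea can be illustrated with a toy analogy. Everything below is invented for illustration and is not Dynamo's API: a "strict" phase refuses to step into certain operations, while a `@nonstrict` decorator makes a function opaque to that phase, so its body runs normally and only the call is recorded. This mirrors how `@nonstrict_trace` hides a body from Dynamo while leaving it visible to a later stage.

```python
# Toy sketch (all names invented) of placing a tracing boundary:
# strict mode rejects some operations, but a function marked nonstrict
# is treated as a single opaque call whose body escapes the strict check.
import functools

STRICT = False
TRACE_LOG = []


def forbidden_in_strict_mode():
    # Stands in for something a strict tracer rejects,
    # like importlib.import_module under Dynamo.
    if STRICT:
        raise RuntimeError("Unsupported: function marked as skipped")
    return 42


def nonstrict(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        global STRICT
        prev, STRICT = STRICT, False  # body is invisible to the strict phase
        try:
            result = fn(*args, **kwargs)
        finally:
            STRICT = prev
        TRACE_LOG.append(f"opaque call: {fn.__name__}")
        return result
    return wrapper


@nonstrict
def decomposition():
    return forbidden_in_strict_mode()


STRICT = True
try:
    forbidden_in_strict_mode()  # direct call fails under strict tracing
except RuntimeError as e:
    print("direct:", e)
print("wrapped:", decomposition())  # succeeds; recorded as one opaque call
print(TRACE_LOG)
```

The point of the analogy is only the placement: the strict phase sees one opaque call at the boundary, and the body is left for a more permissive stage to examine.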

Each "obvious" fix had passing tests that weren't testing the right thing. Only when the
reviewer pushed for backward determinism verification and AOT graph inspection did the
weaknesses surface. The backward completing without error under `DeterministicGuard(True)`
proves determinism — PyTorch explicitly raises `RuntimeError` if any non-deterministic CUDA
kernel executes under this mode.

Authored with Claude.

Pull Request resolved: pytorch#177166
Approved by: https://github.com/mlazos, https://github.com/williamwen42
bobrenjc93 closed this Apr 5, 2026