Suppress C++ stacktrace on XLA_CHECK*() calls. #9448

Merged
ysiraichi merged 4 commits into master from ysiraichi/xla-check-dont-show-cpp-stacktrace on Jul 10, 2025
@ysiraichi (Collaborator)

This PR makes PyTorch/XLA error messages more user-friendly by suppressing the C++ stack trace shown when an XLA check fails. Currently, when an XLA_CHECK*() fails, the error output includes a lengthy C++ stack trace. While such traces can be useful to developers for in-depth debugging, they mostly add noise for end users.

Key Changes:

Before:

Traceback (most recent call last):
  File "dot.py", line 6, in <module>
    torch.dot(a, b)
RuntimeError: torch_xla/csrc/aten_xla_bridge.cpp:110 : Check failed: xtensor
*** Begin stack trace ***
        tsl::CurrentStackTrace[abi:cxx11]()
        torch_xla::bridge::GetXlaTensor(at::Tensor const&)
        torch_xla::XLANativeFunctions::dot(at::Tensor const&, at::Tensor const&)

        c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
        c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const
        c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const



        c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const


        at::_ops::dot::redispatch(c10::DispatchKeySet, at::Tensor const&, at::Tensor const&)





        at::_ops::dot::call(at::Tensor const&, at::Tensor const&)
        at::Tensor::dot(at::Tensor const&) const



        _PyObject_MakeTpCall
        _PyEval_EvalFrameDefault

        PyEval_EvalCode



        _PyRun_SimpleFileObject
        _PyRun_AnyFileObject
        Py_RunMain
        Py_BytesMain
        __libc_start_main
        _start
*** End stack trace ***
Input tensor is not an XLA tensor: torch.FloatTensor

After:

Traceback (most recent call last):
  File "dot.py", line 6, in <module>
    torch.dot(a, b)
RuntimeError: Check failed: xtensor: Input tensor is not an XLA tensor: torch.FloatTensor
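
The difference between the two layouts can be modeled in a few lines of Python. This is only a minimal sketch for illustration, not the PR's C++ implementation: `format_check_failure` and its parameters are hypothetical names, and the layouts are copied from the Before/After output above.

```python
from typing import Optional, Sequence


def format_check_failure(
    condition: str,
    message: str,
    location: Optional[str] = None,
    frames: Optional[Sequence[str]] = None,
) -> str:
    """Build a check-failure message in the old or the new layout.

    Hypothetical helper for illustration only. Passing `frames` yields the
    old, verbose layout; omitting it yields the concise layout shown under
    "After".
    """
    if frames is None:
        # New layout: condition and explanation on a single line.
        return f"Check failed: {condition}: {message}"

    # Old layout: source location, then the C++ stack trace, then the message.
    prefix = f"{location} : " if location else ""
    lines = [f"{prefix}Check failed: {condition}", "*** Begin stack trace ***"]
    lines.extend(f"\t{frame}" for frame in frames)
    lines.append("*** End stack trace ***")
    lines.append(message)
    return "\n".join(lines)


concise = format_check_failure(
    "xtensor", "Input tensor is not an XLA tensor: torch.FloatTensor"
)
print(concise)
```

In the concise layout the failed condition and the explanation sit on one line, so the Python traceback ends with the information a user most likely needs.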

@zhanyong-wan (Collaborator) left a comment

Thanks!

Comment thread torch_xla/csrc/runtime/tf_logging.cpp
Comment thread torch_xla/csrc/status.h Outdated
Comment thread torch_xla/csrc/runtime/debug_macros.h
Comment thread torch_xla/csrc/runtime/tf_logging.cpp
@ysiraichi ysiraichi requested a review from zhanyong-wan July 8, 2025 18:08
@zhanyong-wan (Collaborator) left a comment

Thanks!

@ysiraichi ysiraichi force-pushed the ysiraichi/xla-check-dont-show-cpp-stacktrace branch from f96ad24 to 7d8baa0 Compare July 9, 2025 20:32
@ysiraichi ysiraichi force-pushed the ysiraichi/xla-check-dont-show-cpp-stacktrace branch from 7d8baa0 to fa8ac92 Compare July 10, 2025 12:07
@ysiraichi ysiraichi merged commit 5496a36 into master Jul 10, 2025
23 of 24 checks passed
2 participants