Introducing the StableIValue representation of list :D by janeyx99 · Pull Request #165953 · pytorch/pytorch

janeyx99 · 2025-10-21T00:37:26Z

Some important notes:
a) Just like IValues steal the ownership of ArrayRefs and any std::vectors in order to convert the inner elements into IValues, we do the same thing with StableIValue. This O(N) traversal is unavoidable.
b) As a result, since StableIValues are owning and our contract is that to<T>(StableIValue) transfers ownership, you cannot ever convert from StableIValue to a nonowning HeaderOnlyArrayRef<V>.

We handle memory similarly to AtenTensorHandle, but with a StableListHandle!
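For illustration, the ownership notes above can be sketched with a mock opaque handle. This is a minimal sketch under stated assumptions: `MockListHandle`, `mock_new_list`, `mock_delete_list`, `OwnedList`, and `to_owned_list` are hypothetical stand-ins, not the shim API added in this PR. It only shows that building an owning list copies every element (the O(N) traversal) and that whoever holds the handle is responsible for deleting it.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical opaque handle type; the real StableListHandle may differ.
struct MockListOpaque;
using MockListHandle = MockListOpaque*;

// Counts live lists so that a leak would be observable.
static int g_live_lists = 0;

inline MockListHandle mock_new_list() {
  ++g_live_lists;
  return reinterpret_cast<MockListHandle>(new std::vector<uint64_t>());
}

inline void mock_delete_list(MockListHandle handle) {
  delete reinterpret_cast<std::vector<uint64_t>*>(handle);
  --g_live_lists;
}

// RAII owner: the handle is deleted exactly once, when the owner goes away.
struct MockListDeleter {
  void operator()(MockListOpaque* handle) const { mock_delete_list(handle); }
};
using OwnedList = std::unique_ptr<MockListOpaque, MockListDeleter>;

// Converting into the owning representation visits every element: O(N).
inline OwnedList to_owned_list(const std::vector<uint64_t>& src) {
  OwnedList owned(mock_new_list());
  auto* storage = reinterpret_cast<std::vector<uint64_t>*>(owned.get());
  storage->reserve(src.size());
  for (uint64_t value : src) {
    storage->push_back(value);
  }
  return owned;
}

inline size_t owned_list_size(const OwnedList& list) {
  return reinterpret_cast<const std::vector<uint64_t>*>(list.get())->size();
}
```

Because the owner frees the storage when it is destroyed, handing out a non-owning view of that storage from a conversion that transfers ownership would leave the view dangling, which matches note (b).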


pytorch-bot · 2025-10-21T00:37:30Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165953

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit c01ee49 with merge base d7e2d0a:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / linux-jammy-cuda12.8-py3.10-gcc11 / test (default, 3, 5, linux.g6.4xlarge.experimental.nvidia.gpu) (gh) (similar failure)
test_ops

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

trunk / linux-jammy-py3-clang12-executorch / test (executorch, 1, 1, linux.2xlarge, unstable) (gh) (#166072)
examples/models/llama/tests/test_export_llama_lib.py::ExportLlamaLibTest::test_has_expected_ops_and_op_counts

This comment was automatically generated by Dr. CI and updates every 15 minutes.


swolchok

I had a lot of confusion around the ownership model before I went back and reviewed how ownership of tensors works and confirmed that (I think) StableIValue owns lists to the same extent that it owns tensors -- it takes care of its own cleanup. I think it's probably OK but might be something to think about.

swolchok · 2025-11-05T00:41:19Z

+    StableListHandle* new_handle);
+
+AOTI_TORCH_EXPORT AOTITorchError
+torch_list_size(StableListHandle list_handle, size_t* size);


torch_list_get_size?

I'm following the semantic where if it's blah->api(), I'm naming it torch_blah_api, so this one stays.

Hmm, on one hand I understand the desire to keep the error_t foo(inp, *out) semantic, but on the other hand, the only way it could fail is if list_handle is nullptr, isn't it? Should we, for the sake of speed, change the semantic a bit to size_t torch_list_get_size(StableListHandle)?

I'm leaning towards no for now, because it would mean we should guarantee in the long term that the function should never fail, and that's not worth the small speed bump to me.
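The trade-off in this thread can be sketched with two mock signatures. This is only an illustration: `MockError`, `MockList`, `mock_list_size`, and `mock_list_get_size` are hypothetical, and the null check in the first function exists purely to show a failure path (the shim discussed here does not necessarily check for null).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical error type and list type used only for this sketch.
using MockError = int32_t;
constexpr MockError MOCK_SUCCESS = 0;
constexpr MockError MOCK_FAILURE = 1;
using MockList = std::vector<uint64_t>;

// Convention A: always return an error code and write the result through an
// out-parameter. The signature leaves room for the call to fail in the future.
inline MockError mock_list_size(const MockList* list, size_t* size) {
  if (list == nullptr || size == nullptr) {
    return MOCK_FAILURE;  // illustrative failure path only
  }
  *size = list->size();
  return MOCK_SUCCESS;
}

// Convention B: return the value directly. Slightly cheaper for the caller,
// but the signature promises that the call can never report an error.
inline size_t mock_list_get_size(const MockList* list) {
  return list->size();
}
```

Once convention B is part of a stable ABI there is no way to report a new failure mode without changing the signature, which is the long-term guarantee the reply above is reluctant to make.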


malfet · 2025-11-05T21:45:36Z

+    StableListHandle* new_handle);
+
+AOTI_TORCH_EXPORT AOTITorchError
+torch_list_size(StableListHandle list_handle, size_t* size);


Hmm, on one hand I understand the desire to keep the error_t foo(inp, *out) semantic, but on the other hand, the only way it could fail is if list_handle is nullptr, isn't it? Should we, for the sake of speed, change the semantic a bit to size_t torch_list_get_size(StableListHandle)?

malfet · 2025-11-05T21:47:20Z

+AOTI_TORCH_EXPORT AOTITorchError torch_list_at(
+    StableListHandle list_handle,
+    size_t index,
+    StableIValue* element);


Same as above: if at::List::operator[] doesn't do boundary checks, this API shouldn't do them either

malfet · 2025-11-05T21:47:39Z

+      [[maybe_unused]] uint64_t extension_build_version,
+      [[maybe_unused]] bool is_internal) {


Just curious, what is the purpose of those?

These are scaffolding so that, in the unlikely case that the memory layout of a struct passed as a StableIValue changes between two releases, we can appropriately modify from and to (and the from_ivalue + to_ivalue that call into them) to handle this. See https://github.com/pytorch/pytorch/pull/165284/files#diff-e260bc7cf1dcf33d3ce5fa484f918553929e57bc8ae76d398d4b1d884fe8654eR420-R437 for an example use.

(I'll have better developer documentation to come; tests that ensure the memory layout of the relevant structs doesn't change are also coming.)
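As a rough illustration of why a build-version argument can be useful, here is a minimal sketch of a reader that picks a struct layout based on the version the caller was built against. Everything in it is hypothetical: `LayoutV1`, `LayoutV2`, `kHypotheticalLayoutChangeVersion`, and `read_value` are invented for this example and do not describe any real layout change in PyTorch.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical: suppose a field was widened between two releases.
struct LayoutV1 {
  int32_t value;
};
struct LayoutV2 {
  int64_t value;
};

// Hypothetical version at which the layout changed.
constexpr uint64_t kHypotheticalLayoutChangeVersion = 2;

// Interprets raw bytes according to the version the extension was built with.
inline int64_t read_value(const void* raw, uint64_t extension_build_version) {
  if (extension_build_version < kHypotheticalLayoutChangeVersion) {
    LayoutV1 old_layout;
    std::memcpy(&old_layout, raw, sizeof(old_layout));
    return old_layout.value;
  }
  LayoutV2 new_layout;
  std::memcpy(&new_layout, raw, sizeof(new_layout));
  return new_layout.value;
}
```

While no layout has changed, the parameter goes unused, which is consistent with the `[[maybe_unused]]` annotations in the diff.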

malfet · 2025-11-05T21:49:29Z

+    TORCH_ERROR_CODE_CHECK(torch_list_size(list_handle, &size));
+    std::vector<T> result;
+    result.reserve(size);
+    for (size_t i = 0; i < size; i++) {


Why can't we use irange for stable ABI?

Suggested change

for (size_t i = 0; i < size; i++) {

for (const auto i : c10::irange(size)) {

We'd have to move irange to headeronly first, but otherwise we can
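For context, a range helper of this kind needs very little machinery, so it can live in a header with no library dependency. The following is a minimal sketch, not the `c10::irange` implementation: `SimpleIRange` and `sum_indices` are hypothetical names, and the sketch only supports `size_t` ranges starting at zero.

```cpp
#include <cstddef>

// Hypothetical minimal stand-in for an irange-style helper.
class SimpleIRange {
 public:
  class Iterator {
   public:
    explicit Iterator(size_t value) : value_(value) {}
    size_t operator*() const { return value_; }
    Iterator& operator++() {
      ++value_;
      return *this;
    }
    bool operator!=(const Iterator& other) const {
      return value_ != other.value_;
    }

   private:
    size_t value_;
  };

  explicit SimpleIRange(size_t end) : end_(end) {}
  Iterator begin() const { return Iterator(0); }
  Iterator end() const { return Iterator(end_); }

 private:
  size_t end_;
};

// Usage: the loop variable is const, so it cannot be modified by accident.
inline size_t sum_indices(size_t size) {
  size_t total = 0;
  for (const auto i : SimpleIRange(size)) {
    total += i;
  }
  return total;
}
```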

malfet · 2025-11-05T21:51:46Z

+}
+
+AOTI_TORCH_EXPORT AOTITorchError
+torch_list_size(StableListHandle list_handle, size_t* size) {


Do you want to check first that list_handle and size are not nullptrs?

Ehhhh these are meant to be used mostly internal so I'm not convinced we need to add checks

OK, then why do you want to wrap this code with AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE? Semantically this code will never throw, will it?

FWIW, it's harmless if size() gets inlined and so I am weakly in favor of it because it makes everything visually uniform (lack of AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE is known-bad). https://godbolt.org/z/zbqbhez87

I see your point; I have just learned that dereferencing a bad pointer will lead to crash/UB but not throw an Exception. Regardless, the way I'm thinking of adding these APIs is to be defensive against future code modifications. I would feel more comfortable removing the macro once we have a test to ensure that all shim APIs never throw Exceptions.

I indeed will land the PR as is but this is a good consideration. Whatever policy we adopt I would want to apply across all APIs --> if we check for nullptr here we should check in the Tensor shims too.
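The pattern under discussion can be sketched as a macro that wraps a function body in try/catch and turns any exception into an error code. This is an assumption-laden sketch: `DEMO_CONVERT_EXCEPTION_TO_ERROR_CODE`, `DemoError`, and `demo_list_at` are hypothetical, and the real AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE may be defined differently.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical error codes for this sketch.
using DemoError = int32_t;
constexpr DemoError DEMO_SUCCESS = 0;
constexpr DemoError DEMO_FAILURE = 1;

// Hypothetical macro: run the body, and report any C++ exception as a
// failure code so nothing propagates across the C ABI boundary.
#define DEMO_CONVERT_EXCEPTION_TO_ERROR_CODE(...) \
  try {                                           \
    __VA_ARGS__                                   \
  } catch (...) {                                 \
    return DEMO_FAILURE;                          \
  }                                               \
  return DEMO_SUCCESS;

inline DemoError demo_list_at(
    const std::vector<uint64_t>* list,
    size_t index,
    uint64_t* out) {
  DEMO_CONVERT_EXCEPTION_TO_ERROR_CODE({
    // std::vector::at throws std::out_of_range on a bad index, which the
    // macro converts into DEMO_FAILURE.
    *out = list->at(index);
  })
}
```

As noted in the thread, a wrapper like this only helps with exceptions. Dereferencing a null or dangling pointer inside the body is undefined behavior and is not converted into an error code.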

swolchok · 2025-11-05T22:40:49Z

+AOTI_TORCH_EXPORT AOTITorchError
+torch_list_push_back(StableListHandle list_handle, StableIValue element);


is it permissible to be wrong about the reserved size?

I think reserving size is just a performance optimization; it doesn't actually populate the list. So after reserving, the size is still 0.
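Assuming the list's reserve behaves like `std::vector::reserve` (an assumption, since the underlying list type is not shown in this thread), the behavior described in the reply looks like this. The helper names `size_after_reserve` and `size_after_overfill` are made up for the example.

```cpp
#include <cstddef>
#include <vector>

// Reserving capacity allocates storage but adds no elements.
inline size_t size_after_reserve(size_t reserved) {
  std::vector<int> values;
  values.reserve(reserved);
  return values.size();
}

// Pushing more elements than were reserved is still valid: the vector
// simply reallocates and grows.
inline size_t size_after_overfill(size_t reserved, size_t pushed) {
  std::vector<int> values;
  values.reserve(reserved);
  for (size_t i = 0; i < pushed; ++i) {
    values.push_back(static_cast<int>(i));
  }
  return values.size();
}
```

Under that assumption, a wrong reserved size costs at most an extra reallocation or some unused capacity, not correctness.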


swolchok

looking pretty good

swolchok · 2025-11-06T20:18:51Z

+    for (const auto& elem : val) {
+      TORCH_ERROR_CODE_CHECK(torch_list_push_back(new_list_handle, from(elem)));
+    }


you're going to leak memory for the list and for previously-pushed elements if this TORCH_ERROR_CODE_CHECK ever fails, courtesy of StableIValue not having a destructor

swolchok · 2025-11-06T20:19:50Z

+    result.reserve(size);
+    for (size_t i = 0; i < size; i++) {
+      StableIValue element;
+      TORCH_ERROR_CODE_CHECK(torch_list_get_item(list_handle, i, &element));


ditto leaky


swolchok · 2025-11-07T17:25:13Z

+            torch_list_push_back(new_list_handle, from(elem)));
+      }
+      return from(new_list_handle);
+    } catch (const std::runtime_error& e) {


don't you need to catch all flavors of exception? why is it ok to leak on things that aren't runtime_error?

The only thing that TORCH_ERROR_CODE_CHECK can throw is std::runtime_error

fair enough, nothing should get thrown through ABI boundary
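The leak concern and its fix can be sketched with mock functions. This is a minimal sketch under stated assumptions: `demo_new_list`, `demo_delete_list`, `demo_push_back`, `demo_check`, and `build_list` are hypothetical, with `demo_check` standing in for an error-code check that throws `std::runtime_error` on failure, as described in the reply above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

using DemoList = std::vector<uint64_t>;

// Counts live lists so that a leak would be observable.
static int g_live_demo_lists = 0;

inline DemoList* demo_new_list() {
  ++g_live_demo_lists;
  return new DemoList();
}

inline void demo_delete_list(DemoList* list) {
  delete list;
  --g_live_demo_lists;
}

// Hypothetical push that reports failure via an error code. It rejects the
// value 0 purely to simulate a failing call.
inline int demo_push_back(DemoList* list, uint64_t value) {
  if (value == 0) {
    return 1;
  }
  list->push_back(value);
  return 0;
}

// Hypothetical error-code check: throws std::runtime_error on failure.
inline void demo_check(int error_code) {
  if (error_code != 0) {
    throw std::runtime_error("shim call failed");
  }
}

// Builds a list and frees it if any push fails, then rethrows.
inline DemoList* build_list(const std::vector<uint64_t>& values) {
  DemoList* list = demo_new_list();
  try {
    for (uint64_t value : values) {
      demo_check(demo_push_back(list, value));
    }
    return list;
  } catch (const std::runtime_error&) {
    demo_delete_list(list);  // without this, the partially built list leaks
    throw;
  }
}
```

Catching only `std::runtime_error` is sufficient here because, in this sketch, that is the only exception type the check can throw.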

swolchok · 2025-11-07T17:25:29Z

+        // clean up memory if an error was thrown
+        TORCH_ERROR_CODE_CHECK(torch_delete_list(new_list_handle));
+      }
+      throw e;


drop the e, just throw;. it's cleaner.

Wait does that do the same thing? Does that rethrow the error?

Ohhh it issss

generally rethrowing can also do stuff like cause backtraces to be preserved.

throw e; can cause object slicing, while throw; preserves the type of the original exception?
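The difference is easy to demonstrate in standard C++. In this sketch, `DerivedError`, `rethrow_by_copy`, and `rethrow_in_place` are names made up for the example. `throw e;` copy-initializes a new exception object from the static type of `e`, while a bare `throw;` rethrows the original object.

```cpp
#include <stdexcept>
#include <string>

// A more specific exception type derived from std::runtime_error.
struct DerivedError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// `throw e;` throws a copy whose type is the static type of `e`
// (std::runtime_error), so the DerivedError part is sliced away.
inline std::string rethrow_by_copy() {
  try {
    try {
      throw DerivedError("boom");
    } catch (const std::runtime_error& e) {
      throw e;
    }
  } catch (const DerivedError&) {
    return "DerivedError";
  } catch (const std::runtime_error&) {
    return "std::runtime_error";
  }
  return "none";
}

// A bare `throw;` rethrows the original exception object, so its dynamic
// type is preserved.
inline std::string rethrow_in_place() {
  try {
    try {
      throw DerivedError("boom");
    } catch (const std::runtime_error&) {
      throw;
    }
  } catch (const DerivedError&) {
    return "DerivedError";
  } catch (const std::runtime_error&) {
    return "std::runtime_error";
  }
  return "none";
}
```

Here `rethrow_by_copy()` returns `"std::runtime_error"` and `rethrow_in_place()` returns `"DerivedError"`.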

swolchok · 2025-11-07T17:25:54Z

+    } catch (const std::runtime_error& e) {
+      // clean up memory if an exception is thrown, and rethrow
+      TORCH_ERROR_CODE_CHECK(torch_delete_list(list_handle));
+      throw e;


same as above


swolchok · 2025-11-07T17:38:33Z

LGTM. I think @malfet has a couple unaddressed comments?

malfet · 2025-11-07T18:39:10Z

+}
+
+AOTI_TORCH_EXPORT AOTITorchError
+torch_list_size(StableListHandle list_handle, size_t* size) {


OK, then why do you want to wrap this code with AOTI_TORCH_CONVERT_EXCEPTION_TO_ERROR_CODE? Semantically this code will never throw, will it?

pytorchmergebot · 2025-11-07T19:20:03Z

Starting merge as part of PR stack under #167126

Pull Request resolved: #167126 Approved by: https://github.com/Skylion007 ghstack dependencies: #164991, #165152, #165153, #165953

ghstack-source-id: 36f837f Pull Request resolved: pytorch/pytorch#165953

Pull Request resolved: pytorch#165953 Approved by: https://github.com/malfet ghstack dependencies: pytorch#164991, pytorch#165152, pytorch#165153

Pull Request resolved: pytorch#167126 Approved by: https://github.com/Skylion007 ghstack dependencies: pytorch#164991, pytorch#165152, pytorch#165153, pytorch#165953

[draft] look away this commit only has a sample test

0be66b0

[ghstack-poisoned]

This was referenced Oct 21, 2025

Refactor out headeronly ArrayRef #164991

Closed

Widen ops support to take in IntHOArrayRef vs only std::vec #165152

Closed

pytorch-bot Bot added the topic: not user facing topic category label Oct 21, 2025

janeyx99 mentioned this pull request Oct 21, 2025

Add torch::stable::Tensor sizes and strides #165153

Closed

janeyx99 added a commit that referenced this pull request Oct 21, 2025

[draft] look away this commit only has a sample test

1e28543

ghstack-source-id: 9b72dab Pull Request resolved: #165953

Update on "[draft] look away this commit only has a sample test"

cc62812

[ghstack-poisoned]

janeyx99 added a commit that referenced this pull request Oct 24, 2025

[draft] look away this commit only has a sample test

ab98ff2

ghstack-source-id: 55d2891 Pull Request resolved: #165953

Update on "[draft] look away this commit only has a sample test"

8094f2d

[ghstack-poisoned]

janeyx99 added a commit that referenced this pull request Oct 29, 2025

[draft] look away this commit only has a sample test

6614ab5

ghstack-source-id: 6b5192b Pull Request resolved: #165953

pytorch-bot Bot added ciflow/inductor release notes: inductor (aoti) labels Oct 29, 2025

Update on "[draft] look away this commit only has a sample test"

319cc48

[ghstack-poisoned]

janeyx99 added a commit that referenced this pull request Nov 4, 2025

Introducing the StableIValue representation of list :D

c65241e

ghstack-source-id: c5e7fb7 Pull Request resolved: #165953

janeyx99 changed the title ~~[draft] look away this commit only has a sample test~~ Introducing the StableIValue representation of list :D Nov 4, 2025

Update on "Introducing the StableIValue representation of list :D"

9e2886c

[ghstack-poisoned]

janeyx99 added a commit that referenced this pull request Nov 4, 2025

Introducing the StableIValue representation of list :D

4a67fbd

ghstack-source-id: 86ef382 Pull Request resolved: #165953

janeyx99 marked this pull request as ready for review November 4, 2025 23:45

janeyx99 requested a review from mikaylagawarecki as a code owner November 4, 2025 23:45

Update on "Introducing the StableIValue representation of list :D"

ff8843a

[ghstack-poisoned]

janeyx99 added a commit that referenced this pull request Nov 4, 2025

Introducing the StableIValue representation of list :D

04abdc4

ghstack-source-id: a94a7f3 Pull Request resolved: #165953

janeyx99 requested review from albanD and swolchok November 4, 2025 23:48

janeyx99 added the ciflow/trunk Trigger trunk jobs on your pull request label Nov 5, 2025

swolchok reviewed Nov 5, 2025


janeyx99 added a commit that referenced this pull request Nov 5, 2025

Introducing the StableIValue representation of list :D

7ba498e

ghstack-source-id: a6b358f Pull Request resolved: #165953

janeyx99 added a commit that referenced this pull request Nov 5, 2025

Introducing the StableIValue representation of list :D

c8a39e3

ghstack-source-id: ae49f48 Pull Request resolved: #165953

janeyx99 mentioned this pull request Nov 5, 2025

[BE] use undeprecated from/to in libtorch_agnostic tests #167126

Closed

janeyx99 requested a review from malfet November 5, 2025 21:41

malfet reviewed Nov 5, 2025


swolchok reviewed Nov 5, 2025


swolchok reviewed Nov 6, 2025


swolchok reviewed Nov 7, 2025


malfet approved these changes Nov 7, 2025


pytorchmergebot added the Merged label Nov 7, 2025

pytorchmergebot closed this in 84b2147 Nov 7, 2025

pytorchmergebot pushed a commit that referenced this pull request Nov 7, 2025

[BE] use undeprecated from/to in libtorch_agnostic tests (#167126)

46516ef

Pull Request resolved: #167126 Approved by: https://github.com/Skylion007 ghstack dependencies: #164991, #165152, #165153, #165953

Khanaksahu pushed a commit to Khanaksahu/pytorch that referenced this pull request Nov 17, 2025

Introducing the StableIValue representation of list :D

ce50af5

ghstack-source-id: 36f837f Pull Request resolved: pytorch/pytorch#165953

github-actions Bot deleted the gh/janeyx99/317/head branch December 8, 2025 02:20

