
functionalization fix for inplace_view ops #77126

Closed
bdhirsh wants to merge 16 commits into gh/bdhirsh/224/base from gh/bdhirsh/224/head

Conversation

@facebook-github-bot (Contributor) commented May 10, 2022

🔗 Helpful links

✅ No Failures (0 Pending)

As of commit 8844a19 (more details on the Dr. CI page):

💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.

{meta_conversion_str}
reference_tensor_output = at::_ops::{noop_api_name}::call({', '.join(meta_call_args)});
}}
at::functionalization::impl::mutate_view_meta({view_tensor_name}, view_meta);
bdhirsh (Collaborator, Author) commented:

For inplace_view ops, mutate_view_meta() is responsible for replaying the newly-added view, and it updates the wrapper's size/stride metadata (e.g. with the new strides after running transpose_()).

Before, I was running mutate_view_meta() before running the reference meta function, so the reference function ran with the input meta tensors already mutated. The reference_tensor_output's metadata was then wrong, and set_sizes_strides_offset ended up setting the wrong size/stride info on the wrapper.
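As a rough illustration, here is a plain-Python sketch of the ordering bug (the `Wrapper`, `transpose_meta`, and `mutate_view_meta` helpers below are hypothetical stand-ins, not the real codegen): the reference meta computation has to see the pre-mutation metadata, and only then should the view be replayed onto the wrapper.

```python
def contiguous_strides(sizes):
    # Row-major strides for a contiguous tensor of the given sizes.
    strides = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides

class Wrapper:
    # Stand-in for a FunctionalTensorWrapper's size/stride metadata.
    def __init__(self, sizes):
        self.sizes = list(sizes)
        self.strides = contiguous_strides(sizes)

def transpose_meta(w, dim0, dim1):
    # Reference meta function: computes output metadata without
    # mutating the input.
    sizes, strides = list(w.sizes), list(w.strides)
    sizes[dim0], sizes[dim1] = sizes[dim1], sizes[dim0]
    strides[dim0], strides[dim1] = strides[dim1], strides[dim0]
    return sizes, strides

def mutate_view_meta(w, dim0, dim1):
    # Replays the inplace view on the wrapper, updating its metadata.
    w.sizes[dim0], w.sizes[dim1] = w.sizes[dim1], w.sizes[dim0]
    w.strides[dim0], w.strides[dim1] = w.strides[dim1], w.strides[dim0]

# Fixed order: run the reference meta function first...
w = Wrapper([2, 3])
ref_sizes, ref_strides = transpose_meta(w, 0, 1)
# ...then replay the view on the wrapper; the two now agree.
mutate_view_meta(w, 0, 1)
assert (w.sizes, w.strides) == (ref_sizes, ref_strides)  # ([3, 2], [1, 3])

# Buggy order: mutating first makes the reference output wrong; the
# reference transposes the already-transposed metadata, undoing the view.
w2 = Wrapper([2, 3])
mutate_view_meta(w2, 0, 1)
bad_sizes, _ = transpose_meta(w2, 0, 1)
assert bad_sizes == [2, 3]  # back to the original sizes: wrong
```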

A reviewer (Contributor) commented:

These PR comments should just be comments in the code; they will be completely lost here.

// See Note [Propagating strides in the functionalization pass]
// XLA/LTC don't implement the logic to propagate strides correctly, so we need to rely
// on a reference implementation here (instead of relying on the output from the forward lambda
// having the correct stride info)
bdhirsh (Collaborator, Author) replied:

My first attempt was just to say "hey, we don't actually need to run the meta tensor reference implementation! We can just rely on sizes/strides/storage_offset being correct after running the view op on the inner tensor!"

We can't rely on that, though, because XLA/LTC's implementations of transpose won't actually compute the strides properly for you.
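To make that concrete, here is a small plain-Python sketch (hypothetical helpers, not PyTorch code) of why the backend output can't be trusted for strides: a backend like XLA/LTC materializes the transpose, so the tensor it reports is simply contiguous in the new sizes, whereas the reference meta computation preserves the true view strides.

```python
def contiguous_strides(sizes):
    # Row-major strides for a contiguous tensor of the given sizes.
    strides = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides

def backend_transpose(sizes, strides, dim0, dim1):
    # XLA/LTC-style backend: materializes the result, so the output
    # it reports is contiguous regardless of the input strides.
    out_sizes = list(sizes)
    out_sizes[dim0], out_sizes[dim1] = out_sizes[dim1], out_sizes[dim0]
    return out_sizes, contiguous_strides(out_sizes)

def meta_transpose(sizes, strides, dim0, dim1):
    # Reference meta implementation: transpose swaps the strides too.
    out_sizes, out_strides = list(sizes), list(strides)
    out_sizes[dim0], out_sizes[dim1] = out_sizes[dim1], out_sizes[dim0]
    out_strides[dim0], out_strides[dim1] = out_strides[dim1], out_strides[dim0]
    return out_sizes, out_strides

print(backend_transpose([2, 3], [3, 1], 0, 1))  # ([3, 2], [2, 1])
print(meta_transpose([2, 3], [3, 1], 0, 1))     # ([3, 2], [1, 3])
```

The two disagree on strides for the same op, which is why the codegen runs the reference meta function instead of reading strides off the forward lambda's output.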

@bdhirsh bdhirsh requested review from albanD, ezyang and zou3519 May 10, 2022 01:36
Another subtle bugfix that came from the LTC <-> Functionalization integration.

[ghstack-poisoned]
y = x[0]
y.add_(tmp)
return y
return x
A reviewer (Contributor) commented:

does the return here matter?

bdhirsh (Collaborator, Author) replied:

Yep; first I got a repro from LTC tests, then I tried to swizzle this test until I could get a repro from it.

I also added self.assertEqual(out_ref.size(), out_functional.size()) to assert_functionalization above.

This version of the test now fails the new assert without the change, because the FunctionalTensorWrapper for x used to have its sizes set incorrectly.
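For reference, a minimal sketch of the pattern under test (an approximation of the test shape, not the exact code in the PR):

```python
import torch

def f(x):
    tmp = torch.ones(2)
    y = x[0]       # y is a view into x
    y.add_(tmp)    # mutate x through the view
    return y       # returning the view (rather than x) is what exposed the bug

out = f(torch.ones(2, 2))
print(out.shape)  # torch.Size([2])
```

Under functionalization, the eager reference output and the functionalized output must agree on sizes, which the new assertEqual on out_ref.size() vs out_functional.size() checks.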

@ezyang (Contributor) commented May 11, 2022

Do the test case modifications actually exercise the change here?

bdhirsh added 3 commits May 11, 2022 07:31

Another subtle bugfix that came from the LTC <-> Functionalization integration.

[ghstack-poisoned]
bdhirsh added 11 commits May 17, 2022 19:58

Another subtle bugfix that came from the LTC <-> Functionalization integration.

[ghstack-poisoned]
facebook-github-bot pushed a commit that referenced this pull request May 25, 2022
Summary:
Pull Request resolved: #77126

Approved by: https://github.com/ezyang

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/22d566acdae7b61ee4432d68905f3eaa869bd275

Reviewed By: mehtanirav

Differential Revision: D36668389

Pulled By: bdhirsh

fbshipit-source-id: 1d530a8a8bcde6a6171a5a95c386d3ba54d32def
@facebook-github-bot facebook-github-bot deleted the gh/bdhirsh/224/head branch May 28, 2022 14:16

3 participants