
Switch default memory format of to (and similar) operators to Preserve #30088

Closed
VitalyFedyunin wants to merge 8 commits into gh/VitalyFedyunin/88/base from gh/VitalyFedyunin/88/head

Conversation

VitalyFedyunin commented Nov 19, 2019

Stack from ghstack:

Differential Revision: D18624984


kostmo commented Dec 12, 2019

CircleCI build failures summary

As of commit 2c146b9:

  • 2/3 broken upstream at merge base 20a2e52 (see grid view)
    • You may want to rebase on the viable/strict branch (see its recency history):
      • If your commit is newer than viable/strict, you can try basing on an older, stable commit:
        git fetch viable/strict
        git rebase --onto viable/strict $(git merge-base origin/master HEAD)
        
      • If your commit is older than viable/strict:
        git fetch viable/strict
        git rebase viable/strict
        
  • 1/3 failures introduced in this PR

Detailed failure analysis

One may explore the probable reasons each build failed interactively on the Dr. CI website.

1 new failure recognized by patterns

The following build failures don't appear to be due to upstream breakage:

See CircleCI build pytorch_xla_linux_xenial_py3_6_clang7_test (1/1)

Step: "Test" (details)

Dec 13 02:21:48 +++ eval 'extract_trap_cmd ' 
Dec 13 02:21:48 ++++ extract_trap_cmd 
Dec 13 02:21:48 ++++ printf '%s\n' '' 
Dec 13 02:21:48 +++ printf '%s\n' cleanup 
Dec 13 02:21:48 ++ trap -- ' 
Dec 13 02:21:48 cleanup' EXIT 
Dec 13 02:21:48 ++ which sccache 
Dec 13 02:21:48 ++ sccache --stop-server 
Dec 13 02:21:49 Stopping sccache server... 
Dec 13 02:21:49 error: couldn't connect to server 
Dec 13 02:21:49 caused by: Connection refused (os error 111) 
Dec 13 02:21:49 ++ true 
Dec 13 02:21:49 ++ rm /var/lib/jenkins/sccache_error.log 
Dec 13 02:21:49 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Dec 13 02:21:49 ++ SCCACHE_IDLE_TIMEOUT=1200 
Dec 13 02:21:49 ++ RUST_LOG=sccache::server=error 
Dec 13 02:21:49 ++ sccache --start-server 
Dec 13 02:21:49 Starting sccache server... 
Dec 13 02:21:49 ++ sccache --zero-stats 
Dec 13 02:21:49 Compile requests                 0 
Dec 13 02:21:49 Compile requests executed        0 

2 failures not recognized by patterns:

Job Step Status
CircleCI pytorch_linux_xenial_cuda9_cudnn7_py3_build Build 🛑 Broken upstream
CircleCI pytorch_libtorch_linux_xenial_cuda9_cudnn7_py3_build Build 🛑 Broken upstream

This comment was automatically generated by Dr. CI.

static inline Tensor to_impl(const Tensor& self, const TensorOptions& options, bool non_blocking, bool copy, c10::optional<c10::MemoryFormat> optional_memory_format) {
  auto memory_format =
-     optional_memory_format.value_or(MemoryFormat::Contiguous);
+     optional_memory_format.value_or(MemoryFormat::Preserve);
Contributor:

wasn't this broken before? Because if I didn't pass in a memory format, memory_format would be set to contiguous.

But then suggest_memory_format wouldn't be contiguous, so I create a new tensor and copy. But that doesn't seem like the correct semantics for the function.

Contributor Author:

Indeed, it looks like (in master) if we have a channels-last tensor and call .to with the same dtype etc., we will get a contiguous output instead of self.

This PR will solve the issue.
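The before/after behavior can be modeled with a small pure-Python sketch (the function and format names here are illustrative stand-ins, not the actual ATen C++ code): value_or picks the default only when no memory format is passed, and before this PR that default was Contiguous rather than Preserve.

```python
def resolve(optional_memory_format, default):
    # Mirrors c10::optional::value_or: use the explicit format if given,
    # otherwise fall back to the default.
    return optional_memory_format if optional_memory_format is not None else default

def to_output_format(self_format, optional_memory_format, default):
    # What layout .to() produces: "preserve" keeps the input's layout,
    # any concrete format forces that layout.
    fmt = resolve(optional_memory_format, default)
    return self_format if fmt == "preserve" else fmt

# Master (before this PR): default "contiguous", so a channels-last
# tensor passed through .to() silently comes back contiguous.
assert to_output_format("channels_last", None, "contiguous") == "contiguous"

# With this PR: default "preserve", so the layout survives.
assert to_output_format("channels_last", None, "preserve") == "channels_last"

# An explicitly requested format still wins over the default.
assert to_output_format("contiguous", "channels_last", "preserve") == "channels_last"
```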

Contributor:

Is that behavior in 1.4? Should we introduce a patch to fix it?

Contributor:

It would probably be good to fix that in a separate patch -- in case this gets reverted (which happens with BC changes), we can have the bug fix in without this.

Contributor:

And it seems worthy of a test :).
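Such a test could be sketched as follows. This is a hedged sketch using an illustrative pure-Python stand-in (to_memory_format is hypothetical, not a real PyTorch API); an actual regression test would construct a channels_last tensor in PyTorch and assert that .to() without an explicit memory_format keeps that layout.

```python
def to_memory_format(input_format, requested=None):
    # Stand-in for the fixed behavior: an omitted memory_format
    # defaults to "preserve" instead of "contiguous".
    resolved = requested if requested is not None else "preserve"
    return input_format if resolved == "preserve" else resolved

# A channels-last input keeps its layout when no format is requested
# (this is the case that regressed to "contiguous" before the fix)...
assert to_memory_format("channels_last") == "channels_last"

# ...while an explicitly requested format is still honored.
assert to_memory_format("channels_last", "contiguous") == "contiguous"
```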

@facebook-github-bot

@VitalyFedyunin merged this pull request in fde3d70.

@facebook-github-bot facebook-github-bot deleted the gh/VitalyFedyunin/88/head branch December 18, 2019 15:17
xxtEchjovs44 pushed a commit to xxtEchjovs44/pytorch that referenced this pull request Jan 29, 2020
wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
Summary: Pull Request resolved: pytorch#30088

Test Plan: Imported from OSS

Differential Revision: D18624984

Pulled By: VitalyFedyunin

fbshipit-source-id: 54901786d7496c7dce785140b0585ac9093b1d86