Skip to content

Propagate and report the file/line number of registrations on errors.#36264

Closed
ezyang wants to merge 16 commits intogh/ezyang/712/basefrom
gh/ezyang/712/head
Closed

Propagate and report the file/line number of registrations on errors.#36264
ezyang wants to merge 16 commits intogh/ezyang/712/basefrom
gh/ezyang/712/head

Conversation

@ezyang
Copy link
Copy Markdown
Contributor

@ezyang ezyang commented Apr 8, 2020

Stack from ghstack:

Pretty simple 1-2 PR:

  • Start recording the passed in FILE and LINE from the
    new API in Library, and propagate it to the existing debug
    strings
  • Extend OperatorEntry to record a debug string corresponding
    to where the schema came from
  • Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang ezyang@fb.com

Differential Revision: D20929260

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
@dr-ci
Copy link
Copy Markdown

dr-ci Bot commented Apr 8, 2020

💊 Build failures summary and remediations

As of commit 32dff1e (more details on the Dr. CI page):


  • 2/3 failures introduced in this PR

  • 1/3 broken upstream at merge base fb70b4f from Apr 13 until Apr 14 (4 commits; 8f501f3 - ed2d1cb)

    Please rebase on the viable/strict branch (expand for instructions)

    If your commit is newer than viable/strict, you can try basing on an older, stable commit:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase --onto FETCH_HEAD $(git merge-base origin/master HEAD)
    

    If your commit is older than viable/strict:

    git fetch https://github.com/pytorch/pytorch viable/strict
    git rebase FETCH_HEAD
    

    Check out the recency history of this "viable master" tracking branch.


🕵️ 2 new failures recognized by patterns

The following build failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_py3_clang5_mobile_custom_build_dynamic (1/2)

Step: "Build" (full log | pattern match details | 🔁 rerun)

Apr 14 13:50:44 caused by: Connection refused (os error 111)
Apr 14 13:50:44 +++ eval 'extract_trap_cmd ' 
Apr 14 13:50:44 ++++ extract_trap_cmd 
Apr 14 13:50:44 ++++ printf '%s\n' '' 
Apr 14 13:50:44 +++ printf '%s\n' cleanup 
Apr 14 13:50:44 ++ trap -- ' 
Apr 14 13:50:44 cleanup' EXIT 
Apr 14 13:50:44 ++ which sccache 
Apr 14 13:50:44 ++ sccache --stop-server 
Apr 14 13:50:44 Stopping sccache server... 
Apr 14 13:50:44 error: couldn't connect to server 
Apr 14 13:50:44 caused by: Connection refused (os error 111) 
Apr 14 13:50:44 ++ true 
Apr 14 13:50:44 ++ rm /var/lib/jenkins/sccache_error.log 
Apr 14 13:50:44 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory 
Apr 14 13:50:44 ++ true 
Apr 14 13:50:44 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log 
Apr 14 13:50:44 ++ SCCACHE_IDLE_TIMEOUT=1200 
Apr 14 13:50:44 ++ RUST_LOG=sccache::server=error 
Apr 14 13:50:44 ++ sccache --start-server 
Apr 14 13:50:44 Starting sccache server... 
Apr 14 13:50:44 ++ sccache --zero-stats 

See CircleCI build pytorch_linux_xenial_py3_6_gcc5_4_build (2/2)

Step: "Build" (full log | pattern match details | 🔁 rerun)

Apr 14 14:57:08 /var/lib/jenkins/workspace/aten/src/ATen/core/op_registration/op_registration_test.cpp:1526:22: error: 'import' is not a member of 'c10'
Apr 14 14:57:03 Scanning dependencies of target elementwise_op_test 
Apr 14 14:57:03 [ 92%] Building CXX object caffe2/CMakeFiles/elementwise_op_test.dir/operators/elementwise_op_test.cc.o 
Apr 14 14:57:03 [ 92%] Built target AlgorithmsTest 
Apr 14 14:57:03 Scanning dependencies of target GraphTest 
Apr 14 14:57:03 [ 92%] Building CXX object caffe2/CMakeFiles/GraphTest.dir/core/nomnigraph/tests/GraphTest.cc.o 
Apr 14 14:57:04 [ 92%] Linking CXX executable ../bin/GraphTest 
Apr 14 14:57:05 [ 92%] Built target GraphTest 
Apr 14 14:57:05 Scanning dependencies of target time_observer_test 
Apr 14 14:57:05 [ 92%] Building CXX object caffe2/CMakeFiles/time_observer_test.dir/observers/time_observer_test.cc.o 
Apr 14 14:57:08 /var/lib/jenkins/workspace/aten/src/ATen/core/op_registration/op_registration_test.cpp: In member function 'virtual void {anonymous}::OperatorRegistrationTest_whenDeregisteringOp_thenHandleBecomesInvalid_Test::TestBody()': 
Apr 14 14:57:08 /var/lib/jenkins/workspace/aten/src/ATen/core/op_registration/op_registration_test.cpp:1526:22: error: 'import' is not a member of 'c10' 
Apr 14 14:57:08      auto registrar = c10::import() 
Apr 14 14:57:08                       ^ 
Apr 14 14:57:08 make[2]: *** [caffe2/CMakeFiles/op_registration_test.dir/__/aten/src/ATen/core/op_registration/op_registration_test.cpp.o] Error 1 
Apr 14 14:57:08 caffe2/CMakeFiles/op_registration_test.dir/build.make:62: recipe for target 'caffe2/CMakeFiles/op_registration_test.dir/__/aten/src/ATen/core/op_registration/op_registration_test.cpp.o' failed 
Apr 14 14:57:08 CMakeFiles/Makefile2:8480: recipe for target 'caffe2/CMakeFiles/op_registration_test.dir/all' failed 
Apr 14 14:57:08 make[1]: *** [caffe2/CMakeFiles/op_registration_test.dir/all] Error 2 
Apr 14 14:57:08 make[1]: *** Waiting for unfinished jobs.... 
Apr 14 14:57:13 [ 92%] Linking CXX executable ../bin/elementwise_op_test 
Apr 14 14:57:13 [ 92%] Built target elementwise_op_test 
Apr 14 14:57:14 [ 93%] Linking CXX executable ../bin/time_observer_test 

🚧 1 upstream failure:

These were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

See how this bot performed.

This comment has been revised 87 times.

… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 8, 2020
Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: a5aa0dd
Pull Request resolved: #36264
@ezyang ezyang requested review from bhosmer and smessmer April 8, 2020 22:12
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 9, 2020
Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 702bf0d
Pull Request resolved: #36264
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
ezyang added a commit that referenced this pull request Apr 9, 2020
Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 05456fd
Pull Request resolved: #36264
Copy link
Copy Markdown

@bhosmer bhosmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍😍😍

if (ns_.has_value()) name.setNamespaceIfNotSet(ns_->c_str());
TORCH_CHECK(!(f.dispatch_key_.has_value() && dispatch_key_.has_value()), "Cannot specify a different dispatch key inside a TORCH_LIBRARY_IMPL; please declare a separate TORCH_LIBRARY_IMPL for your dispatch key");
if (ns_.has_value()) {
TORCH_CHECK(name.setNamespaceIfNotSet(ns_->c_str()), "Attempted to impl ", toString(name), " which is explicitly qualified with a namespace inside a TORCH_LIBRARY, which is not allowed. If TORCH_LIBRARY's namespace matches the explicitly given namespace, remove the qualifier; otherwise, please place it in an separate TORCH_LIBRARY_IMPL block to make it clear that you are overriding behavior for an operator in a different library. Registration site was ", file_, ":", line_);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about wrapping these messages? I guess the idea is to keep them out of the way of the code, but they can be informative and the side scrolling gets pretty crazy. YMMV I guess.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point it will be kind of annoying to adjust the formatting in the stack diff because of merge conflicts. I could do this after though.

… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
ezyang added 4 commits April 10, 2020 11:20
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
listeners_->callOnOperatorRegistered(op);
} else {
checkSchemaCompatibility(op, schema);
checkSchemaCompatibility(op, schema, debug);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd probably also add the std::move here. It doesn't change anything because the checkSchemaCompatibility function takes a reference, but that's not immediately visible at the call site, so the call site is easier to check for consistent std::move usage if we just add it everywhere where it is valid. That also future-proofs in case checkSchemaCompatibility decides to take it by value later.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't make sense to me. Semantically, it doesn't make any sense for schema compatibility to take ownership of the schema. It is after all just going to read the schema, not store it anywhere. Isn't adding a std::move here "just because" it's the terminal use obscuring the semantic meaning of the code here?

}

RegistrationHandleRAII Dispatcher::registerFallback(DispatchKey dispatchKey, KernelFunction kernel) {
RegistrationHandleRAII Dispatcher::registerFallback(DispatchKey dispatchKey, KernelFunction kernel, std::string debug) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're not using debug here, right? In that case, you could also make it a const&.

General recommendation: If a function takes an argument by value or by const& should not depend on the call site but only on if the function internally takes ownership (e.g. moves the value further) or not. The API contract (i.e. what the call site needs to know) for by-value and const& is the same, it's just a compiler optimization. Both calling conventions look the same at the call site, so by reading you cannot distinguish them. That means the call site should ignore if the function takes an argument by value or as const& and just std::move it in whenever that is valid (and the type is large enough so moving in general makes sense).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one should morally take it by move; the problem is I'm not actually saving debug information for fallbacks (but I should!) There's a more generic problem which should get fixed at some point, which is we're not doing adequate error checking for fallbacks, c.f. this existing comment:

RegistrationHandleRAII Dispatcher::registerFallback(DispatchKey dispatchKey, KernelFunction kernel, std::string debug) {
  // TODO: fallbacks clobber each other completely unsafely, unlike regular
  // kernels

OperatorName name_;
c10::optional<FunctionSchema> schema_;
c10::optional<std::string> debug_;
// INVARIANT: schema_.has_value() == debug_.has_value()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we explicitly model this instead by having a c10::optional<SchemaInfo> with

struct SchemaInfo final {
    FunctionSchema schema;
    std::string debug;
}

That would make the invariant be enforced by the compiler instead of having to rely on the comment here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered it, but adding another struct seemed annoying. Up to you.


void setNamespaceIfNotSet(const char* ns) {
name_.setNamespaceIfNotSet(ns);
bool setNamespaceIfNotSet(const char* ns) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: comment on when this returns true/false. Or even better, make it return an enum with two possible (and descriptive) values.

ezyang added 3 commits April 13, 2020 14:07
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
… on errors."

Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: [D20929260](https://our.internmc.facebook.com/intern/diff/D20929260)

[ghstack-poisoned]
ljk53 pushed a commit to ljk53/pytorch that referenced this pull request Apr 14, 2020
Pretty simple 1-2 PR:
- Start recording the passed in __FILE__ and __LINE__ from the
  new API in Library, and propagate it to the existing debug
  strings
- Extend OperatorEntry to record a debug string corresponding
  to where the schema came from
- Print out these newly recorded debug strings wherever possible

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

ghstack-source-id: 61e4f5f
Pull Request resolved: pytorch#36264
@ezyang
Copy link
Copy Markdown
Contributor Author

ezyang commented Apr 15, 2020

Folded into #36258

@ezyang ezyang closed this Apr 15, 2020
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/712/head branch May 16, 2020 14:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants