
Generate Dynamic Shapes #37693

Closed
Krovatkin wants to merge 6 commits into pytorch:master from Krovatkin:krovatkin/gen_syms

Conversation

@Krovatkin
Contributor

Yay!

@Krovatkin Krovatkin requested a review from zdevito May 1, 2020 21:18
@Krovatkin Krovatkin requested a review from apaszke as a code owner May 1, 2020 21:18
@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label May 1, 2020

dr-ci Bot commented May 1, 2020

💊 CI failures summary and remediations

As of commit 399704a (more details on the Dr. CI page):


  • 2/3 failures possibly* introduced in this PR
    • 1/2 non-CircleCI failure(s)
  • 1/3 broken upstream at merge base 363a2d9 since May 19

1 job timed out:

  • binary_windows_libtorch_3_7_cpu_release_build

🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet:


ci.pytorch.org: 1 failed



Contributor

@zdevito zdevito left a comment


Partial review: I didn't get to profiling_record.cpp/h yet; I'll try to get to that part tonight.

Comment thread: torch/csrc/jit/runtime/interpreter.cpp (Outdated)
push(stack, pttp->isSubtypeOf(expected));
const TypePtr& expected = af.types[inst.X];
auto expected_type = expected->cast<TensorType>();
bool pass = true;
Contributor


This looks more complicated than it needs to be

Contributor Author


Agreed, this can be simplified.

Comment thread: torch/csrc/jit/runtime/interpreter.cpp (Outdated)
Function** functions;
std::function<void(std::vector<IValue>&)>* profile_functions;
TypePtr* types;
ShapeSymbolTable symbols2dims;
Contributor


ActiveFrame is a way of keeping a bunch of pointers on the stack in the interpreter rather than reloading after every instruction. But here it is holding an entire hash table, which is destroyed every time we enter/exit a function. This symbol table should be local to Frame, not ActiveFrame, I believe.

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes... I think I actually moved it in one of the branches :-( but I guess I lost it in one of the rebases.

pass &= bindSymbolicShapes(
af, t.sizes(), expected_type->symbolic_sizes());
if (pass) {
pttp = expected_type->merge(pttp, false);
Contributor


I think the composition of bindSymbolicShapes, merging, and isSubtypeOf here is not ideal. First, it is a bit hard to follow the direct intent of tracking whether the guard matches. I am still not sure what merge, and false, are supposed to mean here. Second, this is in performance-critical code. isSubtypeOf checks are slow, and there is a lot we can do here to combine bindSymbolicShapes and isSubtypeOf to make a single pass over the tensor. Finally, I am confused as to whether returning false from bindSymbolicShapes will cause the guard to fail. I think it should, since we found evidence of the program no longer matching its profiled model, but it doesn't directly lead to a push false, so I am not sure.

Contributor Author


it doesn't directly lead to a push false, so I am not sure.

Yes, it's not too obvious: expected_type contains either static or symbolic symbols, and isSubtypeOf will fail when comparing against pttp (which only has static symbols) unless they are the same or we run bindSymbolicShapes. I explicitly added && bound_successfully && to the push.

@zdevito zdevito self-requested a review May 4, 2020 22:06
Contributor

@zdevito zdevito left a comment


This is getting there -- inline comments highlight a few places where I can't follow the data structures, making the logic difficult to understand.

GRAPH_DEBUG(symbol, " is now bound to ", new_size);
new_symbols.push_back(symbol);
} else {
if (symbol_table.getValue(symbol) == new_sizes[i]->static_size()) {
Contributor


use new_size

Contributor Author


Fixed.

c10::ShapeSymbol ShapeSymbolTable::toSymbol(
Dimension val,
std::map<Dimension, c10::ShapeSymbol>& dims2symbols,
ProfilingRecord* pr) {
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pr is unused here

c10::ShapeSymbol ShapeSymbolTable::GetSymbolInSet(
Dimension new_size,
c10::ShapeSymbol set,
ProfilingRecord* pr) {
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pr is unused here

namespace torch {
namespace jit {

c10::ShapeSymbol ShapeSymbolTable::toSymbol(
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the name toSymbol; getOrCreateSymbolForDim? Shouldn't this be a method on whatever owns dims2symbols? It is weird that this is taking the map as an argument.

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed toSymbol and added a comment to getSymbolInSet

std::shared_ptr<Graph> profiled_graph_;
std::mutex mutex_;
size_t profiling_count_;
std::map<int64_t, std::map<Value*, TensorTypePtr>> profiled_types_per_frame_;
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't follow this data structure in the code. There are no comments about what it stores, and it results in a lot of "x.first" and "x.second" confusion.

void reset() {
sets_.clear();
};
std::map<c10::ShapeSymbol, std::map<Dimension, c10::ShapeSymbol>> sets_;
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what this stores; is it a nested map of ShapeSymbol to Dimension to ShapeSymbol?

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment.
We are implementing the algorithm we discussed but the implementation merges by every dimension location, not by a whole set at once. It makes the implementation so much simpler, but it's a bit of a break from the conceptual algorithm we discussed.

// A helper structure used for partitioning a set
// associated with a ShapeSymbol into subsets, each
// associated with a unique Dimension (dimension value).
// If the same symbol is assigned to multiple `Dimension`s
// in one of the profiling runs, the set of dimension
// locations associated with the set will be split into
// two subsets, each associated with the new Dimension
// value.
struct ShapeSymbolSets {

@Krovatkin Krovatkin requested a review from zdevito May 6, 2020 17:12
Contributor

@zdevito zdevito left a comment


This looks good now. I have minor nits about speed along the hot path, but those can be addressed in a different PR since this one seems functionally correct and the profiler is not on by default.

continue;
}
auto symbol = *sym_shapes[i];
if (!isBound(symbol)) {
Contributor


This is in the hot path, so be careful about hash table lookups, this is doing two lookups even though it only needs to be doing 1.

TORCH_INTERNAL_ASSERT(!s.static_size());
data_[s] = v;
}
std::map<c10::ShapeSymbol, Dimension> data_;
Contributor


unordered map? A red-black tree along the fast path is going to be pretty slow.

// to assign dimension values to ShapeSymbols
// and fail a guard if the same symbol
// is assigned more than one dimension value.
struct ShapeSymbolTable {
Contributor


A class used by the interpreter but not specific to profiling should probably go in a different file.

// * For every two runs, we iterate over all `symbolic_shapes_` and compare their `ShapeSymbol`s in the same position.
// * If we observe that for every dimension location that has `ShapeSymbol S1` in run #1 there is **only one** `ShapeSymbol S2` in the same dimension location in run #2, we conclude that the invariant holds.
// * However, if we observe that some dimension locations in run #2 have `ShapeSymbol S2` and the other ones have `ShapeSymbol S3`, we would like to partition the virtual set of dimension locations associated with `ShapeSymbol S1` into two new subsets, so the invariant holds.
// * The partitioning works by assigning a new symbol to the dimension locations (associated with `ShapeSymbol S1`) that have `ShapeSymbol S2` and another new symbol to the dimension locations that have `ShapeSymbol S3`. In other words,
Contributor


Thanks for adding this comment, it makes it much clearer what is happening in the merging algorithm.

@Krovatkin Krovatkin force-pushed the krovatkin/gen_syms branch from d612343 to 6160115 Compare May 19, 2020 05:28
Contributor

@facebook-github-bot facebook-github-bot left a comment


@Krovatkin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.



@Krovatkin Krovatkin force-pushed the krovatkin/gen_syms branch from 51e8b78 to 2d8035b Compare May 19, 2020 23:20

Krovatkin added 6 commits May 19, 2020 16:49
address Zach's feedback

address Zach feedback

rename data structures; cleanly separate ShapeSymbolTable from PartitioningSetHelper

address feedback

add hash function for shapesymbols

fix bbrks

rollback 3rd party deps

swith back to map

clang-format
@Krovatkin Krovatkin force-pushed the krovatkin/gen_syms branch from 2d8035b to 399704a Compare May 19, 2020 23:49

@facebook-github-bot
Contributor

@Krovatkin merged this pull request in 42870dd.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026
Summary:
Yay!
Pull Request resolved: pytorch#37693

Differential Revision: D21641663

Pulled By: Krovatkin

fbshipit-source-id: 64e70138b31800371887d24ceb1c5d18945b4412

Labels

Merged oncall: jit Add this issue/PR to JIT oncall triage queue



4 participants