Skip to content

C++ API parity: MultiheadAttention#27309

Closed
pbelevich wants to merge 65 commits intogh/pbelevich/30/basefrom
gh/pbelevich/30/head
Closed

C++ API parity: MultiheadAttention#27309
pbelevich wants to merge 65 commits intogh/pbelevich/30/basefrom
gh/pbelevich/30/head

Conversation

@pbelevich
Copy link
Contributor

@pbelevich pbelevich commented Oct 3, 2019

Stack from ghstack:

Differential Revision: D17766736

@pytorchbot pytorchbot added the module: cpp Related to C++ API label Oct 3, 2019
pbelevich added a commit that referenced this pull request Oct 3, 2019
ghstack-source-id: a276603
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Oct 4, 2019
ghstack-source-id: b7a13fc
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Oct 5, 2019
ghstack-source-id: 151895f
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Oct 5, 2019
ghstack-source-id: cc9194a
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Oct 5, 2019
ghstack-source-id: 72bf7b8
Pull Request resolved: #27309
@pbelevich pbelevich mentioned this pull request Oct 5, 2019
@pbelevich pbelevich mentioned this pull request Oct 6, 2019
pbelevich added a commit that referenced this pull request Dec 5, 2019
ghstack-source-id: 76ade4d
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Dec 5, 2019
ghstack-source-id: ba4576d
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Dec 6, 2019
ghstack-source-id: 0a18c70
Pull Request resolved: #27309
@pbelevich pbelevich requested a review from yf225 December 6, 2019 22:06
pbelevich added a commit that referenced this pull request Dec 7, 2019
ghstack-source-id: 141683e
Pull Request resolved: #27309
pbelevich added a commit that referenced this pull request Dec 15, 2019
ghstack-source-id: a0c9cce
Pull Request resolved: #27309
@kostmo
Copy link
Member

kostmo commented Dec 15, 2019

CircleCI build failures summary

As of commit d1aadb6:

  • 3/3 broken upstream at merge base 9dc3d87 (see grid view)
    • You may want to rebase on the viable/strict branch (see its recency history):
      • If your commit is newer than viable/strict, you can try basing on an older, stable commit:
        git fetch viable/strict
        git rebase --onto viable/strict $(git merge-base origin/master HEAD)
        
      • If your commit is older than viable/strict:
        git fetch viable/strict
        git rebase viable/strict
        
  • 0/3 failures introduced in this PR

Detailed failure analysis

One may explore the probable reasons each build failed interactively on the Dr. CI website.

3 upstream failures recognized by patterns:

These builds matched patterns, but were probably caused by upstream breakages:


This comment was automatically generated by Dr. CI.
Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker.

This comment has been revised 2 times.

TORCH_ARG(bool, add_zero_attn) = false;

/// total number of features in key. Default: c10::nullopt.
TORCH_ARG(int64_t, kdim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where is it declared as an optional variable? I am thinking if we should change it to:
TORCH_ARG(c10::optional<int64_t>, kdim) = c10::nullopt;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TORCH_ARG(int64_t, kdim);

/// total number of features in key. Default: c10::nullopt.
TORCH_ARG(int64_t, vdim);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above
TORCH_ARG(c10::optional<int64_t>, vdim) = c10::nullopt;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@anjali411 anjali411 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :)

}

void MultiheadAttentionImpl::reset() {
_qkv_same_embed_dim = options.kdim() == options.embed_dim() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and after making kdim and vdim optional in options/activation.h, we can add code corresponding to these lines in activation.py
self.kdim = kdim if kdim is not None else embed_dim
self.vdim = vdim if vdim is not None else embed_dim
if(options.kdim() != c10::nullopt) { options.kdim(options.embed_dim()); }
if(options.vdim() != c10::nullopt) { options.vdim(options.embed_dim()); }

@facebook-github-bot
Copy link
Contributor

@pbelevich merged this pull request in 47766e6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Merged module: cpp Related to C++ API

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants