Remove clone in fused rnn double back quick fix #1683
csarofeen wants to merge 7 commits into pytorch:master
Conversation
What's the memory footprint of these changes? It seems to hold quite a large workspace. Can't it e.g. save

That goes out of scope after the forward call and is not particularly big. But there were some larger memory-usage issues that I got rid of.
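For illustration of the workspace being discussed: the diff below allocates a buffer of `hx.numel() * 5` elements. A minimal pure-Python sketch of that sizing (the helper name and the byte arithmetic are hypothetical; the factor of 5 is taken from the `resize_(hx.numel() * 5)` line in the diff):

```python
def fused_gru_workspace_numel(batch_size, hidden_size, factor=5):
    # The diff allocates input_gate.new().resize_(hx.numel() * 5),
    # i.e. `factor` hidden-state-sized slabs per forward call.
    hx_numel = batch_size * hidden_size
    return hx_numel * factor

# e.g. batch 32, hidden 512, float32 (4 bytes per element)
elems = fused_gru_workspace_numel(32, 512)
bytes_needed = elems * 4
```

As the comment above notes, this buffer goes out of scope after the forward call, so it is transient rather than held for the lifetime of the module.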
apaszke left a comment
Looks good, but the linter is unhappy
```cpp
*oghn = ghn;
DEVICE_LINEAR_GET(hidden, offset+0*hsz) = grg;
DEVICE_LINEAR_GET(hidden, offset+1*hsz) = gig;
DEVICE_LINEAR_GET(hidden, offset+2*hsz) = ghn;
```
```python
ibias = ibias.view(1, -1)
if hbias.dim() == 1:
    hbias.unsqueeze_(0)
    hbias = hbias.view(1, -1)
```
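For context on the snippet above: on a 1-D bias of `n` elements, both `unsqueeze_(0)` and `view(1, -1)` produce a `(1, n)` row. A pure-Python sketch of that shape change, using a nested list in place of a tensor (the helper name is hypothetical):

```python
def as_row(vec):
    # Mimic tensor.view(1, -1) / tensor.unsqueeze(0) on a 1-D sequence:
    # wrap the n elements in an outer list, giving shape (1, n).
    return [list(vec)]

row = as_row([0.1, 0.2, 0.3])
# one row, three columns
```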
```python
self.backend = type2backend[type(input_gate)]

hy = input_gate.new()
storage = input_gate.new().resize_(hx.numel() * 5)
```
```python
igc = input_gate.clone()
hgc = hidden_gate.clone()
gradInputHx = gradOutput.new()
gradInInput = gradOutput.new().resize_(*self.igate_size)
```
```python
return igc, hgc, gradInput, gb1, gb2
if self.hasBias:
    gb1 = gradInInput.sum(0).squeeze()
    gb2 = gradInHidden.sum(0).squeeze()
```
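The `gb1`/`gb2` lines above reduce the gate gradients over the batch dimension (`sum(0)`) to obtain bias gradients. A minimal pure-Python equivalent of that reduction, with plain nested lists standing in for tensors (the helper name is hypothetical):

```python
def bias_grad(grad_gates):
    # Equivalent of gradInInput.sum(0): sum each column over the
    # batch (row) dimension, leaving one value per gate unit.
    return [sum(col) for col in zip(*grad_gates)]

g = bias_grad([[1.0, 2.0],
               [3.0, 4.0],
               [5.0, 6.0]])
# g == [9.0, 12.0]
```

Since every batch element shares the same bias, its gradient is the sum of the per-element gradients, which is what `sum(0)` computes.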
```python
gradInInput = gradOutput.new().resize_(*self.igate_size)
gradInHidden = gradOutput.new().resize_(*self.hgate_size)

storage = self.buffer
```
```python
hy = input_gate.new()
storage = input_gate.new().resize_(hx.numel() * 5)

self.hasBias = False
```
Merged into master.
Better fix for #1532