Fix PersimmonIntegrationTest OOM #26750

Merged
ydshieh merged 7 commits into main from fix_persimmon
Oct 12, 2023

ydshieh (Collaborator) commented Oct 12, 2023

What does this PR do?

Fix the out-of-memory (OOM) failure in PersimmonIntegrationTest by loading the model in 8-bit.
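For intuition on why 8-bit helps with OOM, here is a rough back-of-the-envelope sketch of weight memory at different precisions. The parameter count below is a hypothetical round number chosen for illustration, not a figure from this PR, and the estimate covers weights only (no activations, KV cache, or quantization overhead).

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Hypothetical parameter count, used only for illustration.
NUM_PARAMS = 9_000_000_000

fp16_gib = weight_memory_gib(NUM_PARAMS, 16)
int8_gib = weight_memory_gib(NUM_PARAMS, 8)

print(f"fp16 weights: {fp16_gib:.1f} GiB")
print(f"int8 weights: {int8_gib:.1f} GiB")
```

In practice the saving is somewhat less than half, since 8-bit loading typically keeps some modules in higher precision.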

ydshieh requested a review from LysandreJik October 12, 2023 07:55
ydshieh marked this pull request as draft October 12, 2023 08:13
ydshieh removed the request for review from LysandreJik October 12, 2023 08:13
Comment on lines +402 to +408
  EXPECTED_MEAN = torch.tensor(
-     [[-11.2879, -11.2628, -11.2498, -11.2534, -11.2676, -11.2638, -11.2501, -11.2431]], dtype=torch.float16
+     [[-11.4726, -11.1495, -11.2694, -11.2223, -10.9452, -11.0663, -11.0031, -11.1028]], dtype=torch.float16
  )
- torch.testing.assert_close(out.cpu().mean(-1), EXPECTED_MEAN, atol=1e-4, rtol=1e-4)
+ # change dtype to `torch.float32` before calling `mean` to avoid `nan` values
+ torch.testing.assert_close(out.cpu().to(torch.float32).mean(-1), EXPECTED_MEAN, atol=1e-4, rtol=1e-4)
  # fmt: off
- EXPECTED_SLICE = torch.tensor([-16.9670, -16.9647, -16.9649, -16.9630, -16.9577, -16.9623, -17.0164, -16.9673, -16.9648, -16.9668, -17.0160, -16.9651, -17.0156, -16.9668, -16.9655, -16.9653, -16.9665, -16.9682, -17.0112, -16.9667, -16.9717, -16.9654, -16.9650, -16.9701, -16.9657, -17.0160, -16.9676, -17.0138, -16.9610, -16.9695])
+ EXPECTED_SLICE = torch.tensor([16.9062, 16.9062, 16.9062, 16.9062, 16.8906, 16.9062, 16.9531, 16.9062, 16.9062, 16.9062, 16.9531, 16.9062, 16.9531, 16.9062, 16.9062, 16.9062, 16.9062, 16.9062, 16.9531, 16.9062, 16.9062, 16.9062, 16.9062, 16.9062, 16.9062, 16.9531, 16.9062, 16.9531, 16.9062, 16.9062])
ydshieh (Collaborator, Author) commented:

These expected values need to be updated, because loading the model in 8-bit changes the outputs.
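The diff also casts the logits to `torch.float32` before calling `mean`. The PR only states that this avoids `nan` values; one plausible mechanism, assumed here rather than confirmed by the source, is that an intermediate sum over a large last dimension falls outside the float16 range (maximum finite value 65504). The sketch below uses only the standard library to round values to half precision; `VOCAB_SIZE` and `LOGIT` are hypothetical illustration values.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x to the nearest half-precision value; out-of-range becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the half-precision range.
        return math.copysign(math.inf, x)


# Hypothetical sizes, for illustration only.
VOCAB_SIZE = 262_144
LOGIT = -11.0

exact_sum = VOCAB_SIZE * LOGIT       # computed in double precision
sum_as_fp16 = to_fp16(exact_sum)     # not representable in float16
mean_as_fp16 = to_fp16(exact_sum / VOCAB_SIZE)  # the mean itself fits easily

print("sum stored as fp16: ", sum_as_fp16)
print("mean stored as fp16:", mean_as_fp16)
# Once an infinity appears, later arithmetic can turn it into nan.
print("inf - inf:", sum_as_fp16 - sum_as_fp16)
```

The final mean is well within float16 range; only the intermediate quantity is not, which is why accumulating in a wider dtype sidesteps the problem.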

ydshieh marked this pull request as ready for review October 12, 2023 08:39
ydshieh marked this pull request as draft October 12, 2023 08:48
ydshieh marked this pull request as ready for review October 12, 2023 08:57
ydshieh (Collaborator, Author) commented Oct 12, 2023

Finally got it working completely by also calling torch.cuda.empty_cache() and gc.collect().
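A minimal sketch of what such a cleanup step might look like between tests. The helper name `release_memory` and the `Node` class are hypothetical and not taken from the PR; the torch import is optional so the snippet also runs on machines without PyTorch or a GPU.

```python
import gc
import weakref


def release_memory() -> int:
    """Collect unreachable Python objects, then release cached CUDA blocks."""
    # Collect first so tensors held only by reference cycles are actually freed.
    unreachable = gc.collect()
    try:
        import torch
    except ImportError:
        return unreachable
    if torch.cuda.is_available():
        # Return unused cached memory from PyTorch's allocator to the driver.
        torch.cuda.empty_cache()
    return unreachable


class Node:
    """Stand-in for an object kept alive by a reference cycle."""

    def __init__(self) -> None:
        self.ref = self


node = Node()
probe = weakref.ref(node)
del node          # the cycle keeps the object alive until the collector runs
release_memory()
print("object freed:", probe() is None)
```

Running the garbage collector before emptying the cache is a design choice in this sketch: `torch.cuda.empty_cache()` can only release blocks that no live tensor still occupies.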

HuggingFaceDocBuilderDev commented Oct 12, 2023

The documentation is not available anymore as the PR was closed or merged.

ydshieh merged commit 72256bc into main Oct 12, 2023
ydshieh deleted the fix_persimmon branch October 12, 2023 09:24

3 participants