simplify Float8Linear #2594

Merged
vkuzo merged 1 commit into main from gh/vkuzo/94/head
Jul 24, 2025
@vkuzo vkuzo commented Jul 24, 2025

Contributor

Summary:

Removing code which we no longer need after #2356. `torch.compile` now does the
right thing automatically, and the relevant config has been deprecated.

Also fix links in float8 training benchmark README.md to point to
updated locations.

Test Plan:

```
./test/float8/test_everything.sh
```

```
// before
with-proxy TORCHTITAN_ROOT=~/local/torchtitan/ FLOAT8_RECIPE_WITH_BEST_SETTINGS="tensorwise" ./torchtitan_benchmark.sh
...
Median Tokens/Second (excluding step 1): 7999.0
Max Memory Usage: 36.68 GiB

// after
with-proxy TORCHTITAN_ROOT=~/local/torchtitan/ FLOAT8_RECIPE_WITH_BEST_SETTINGS="tensorwise" ./torchtitan_benchmark.sh
...
Median Tokens/Second (excluding step 1): 8038.5
Max Memory Usage: 36.68 GiB
```
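
To put the before/after numbers side by side, here is a minimal sketch that parses the two summary lines and computes the relative change. The log snippets are copied from the test plan above; `parse_summary` and the regexes are hypothetical helpers written for this note, not part of the benchmark script. Each figure comes from a single run, so a difference of this size may well be within run-to-run noise.

```python
import re

# Summary lines copied verbatim from the test plan output above.
BEFORE = """\
Median Tokens/Second (excluding step 1): 7999.0
Max Memory Usage: 36.68 GiB
"""
AFTER = """\
Median Tokens/Second (excluding step 1): 8038.5
Max Memory Usage: 36.68 GiB
"""

# Hypothetical patterns matching the two summary lines shown in the logs.
_TPS = re.compile(r"Median Tokens/Second \(excluding step 1\): ([0-9.]+)")
_MEM = re.compile(r"Max Memory Usage: ([0-9.]+) GiB")


def parse_summary(log):
    """Return (median tokens/second, max memory in GiB) from a log snippet."""
    tps = float(_TPS.search(log).group(1))
    mem = float(_MEM.search(log).group(1))
    return tps, mem


before_tps, before_mem = parse_summary(BEFORE)
after_tps, after_mem = parse_summary(AFTER)

# Relative throughput change and absolute memory change.
tps_change_pct = 100.0 * (after_tps - before_tps) / before_tps
mem_change_gib = after_mem - before_mem

print(f"throughput change: {tps_change_pct:+.2f}%")
print(f"memory change: {mem_change_gib:+.2f} GiB")
```

For the numbers above this works out to a throughput change of roughly +0.5% with no change in peak memory, which is consistent with the PR being a simplification rather than a performance change.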

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo commented Jul 24, 2025

Contributor Author

pytorch-bot Bot commented Jul 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2594

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1c1ad9c with merge base 12ff479:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo added a commit that referenced this pull request Jul 24, 2025
ghstack-source-id: 1211464
ghstack-comment-id: 3113439246
Pull Request resolved: #2594
@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 24, 2025
@vkuzo vkuzo requested review from danielvegamyhre and drisspg July 24, 2025 13:15
@vkuzo vkuzo added the module: not user facing label Jul 24, 2025
Comment thread torchao/float8/README.md
- bf16 + compile: `TORCHTITAN_ROOT=<path> ./float8_training_benchmark.sh`
- float8 tensorwise with float8 all-gather + compile: `TORCHTITAN_ROOT=<path> FLOAT8_RECIPE_WITH_BEST_SETTINGS="tensorwise" ./float8_training_benchmark.sh`
- float8 rowwise with bf16 all-gather + compile: `TORCHTITAN_ROOT=<path> FLOAT8_RECIPE_WITH_BEST_SETTINGS="rowwise" ./float8_training_benchmark.sh`
3. From the `torchao/benchmarks/float8/training/` directory, you can run the following commands to reproduce the benchmarks above:

I think this should be `ao/benchmarks/float8/training/` (the repo is named `ao`, and the `benchmarks` dir is in the repo root dir). Same for the other places in this PR.
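
For reference, the three README variants quoted in this thread differ only in which environment variables are set in front of the script. A minimal sketch of that mapping follows, assuming the variable names and script name shown in the README lines above; `benchmark_command` is a hypothetical helper for illustration, not something in torchao, and it only builds the command string without running anything.

```python
import shlex

# Script name and variable names are copied from the README lines quoted in
# this thread; benchmark_command itself is a hypothetical helper.
DEFAULT_SCRIPT = "./float8_training_benchmark.sh"


def benchmark_command(torchtitan_root, recipe=None, script=DEFAULT_SCRIPT):
    """Build the shell command for one benchmark variant (does not run it).

    recipe=None corresponds to the bf16 + compile baseline, where
    FLOAT8_RECIPE_WITH_BEST_SETTINGS is left unset.
    """
    parts = ["TORCHTITAN_ROOT=" + shlex.quote(torchtitan_root)]
    if recipe is not None:
        parts.append("FLOAT8_RECIPE_WITH_BEST_SETTINGS=" + shlex.quote(recipe))
    parts.append(script)
    return " ".join(parts)


# Placeholder path; substitute the real torchtitan checkout location.
for recipe in (None, "tensorwise", "rowwise"):
    print(benchmark_command("/path/to/torchtitan", recipe))
```

Whichever directory name ends up in the README, the command is meant to be run from the directory that contains the script.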

@vkuzo vkuzo merged commit c6de9b4 into main Jul 24, 2025
53 of 55 checks passed
liangel-02 pushed a commit that referenced this pull request Aug 25, 2025
Update

[ghstack-poisoned]
3 participants