[float8] Allow specifying arbitrary dtype for each tensor by lw · Pull Request #1326 · pytorch/ao

lw · 2024-11-22T09:49:52Z

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

ghstack-source-id: 7dabc91 Pull Request resolved: #1326

pytorch-bot · 2024-11-22T09:49:56Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1326

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 97c9983 with merge base 1a0dbf1:

NEW FAILURE - The following job has failed:

Code Analysis with Ruff / build (3.9) (gh)
test/float8/test_dtensor.py:24:1: I001 [*] Import block is un-sorted or un-formatted

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vkuzo · 2024-11-26T22:44:52Z

    scaling_type: ScalingType = ScalingType.DYNAMIC
    scaling_granularity: ScalingGranularity = ScalingGranularity.TENSORWISE
    static_scale: Optional[torch.Tensor] = None
+    dtype: Optional[torch.dtype] = None


nit:

can we add a comment on what this is used for, and that None means the default e4m3|e5m2 value will be used?

optional - thoughts about naming this in a more specific way such as target_dtype, lowp_dtype, etc? dtype is a bit ambiguous across torchao unfortunately :(
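A minimal sketch of what the requested comment and a more specific field name could look like. This is not the code from the PR: `CastConfigSketch`, `target_dtype`, and `resolve_target_dtype` are hypothetical names, plain strings stand in for `torch.dtype` values so the snippet runs without PyTorch, and the fallback assumes the conventional split of e4m3 for forward tensors and e5m2 for gradients.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-ins for torch.float8_e4m3fn / torch.float8_e5m2 (assumption: strings
# are used here only so the sketch runs without PyTorch installed).
E4M3 = "float8_e4m3fn"
E5M2 = "float8_e5m2"


@dataclass(frozen=True)
class CastConfigSketch:
    # Low-precision dtype this tensor is cast to.
    # None means "use the default for this tensor's role" (e4m3 or e5m2).
    target_dtype: Optional[str] = None


def resolve_target_dtype(cfg: CastConfigSketch, is_grad: bool) -> str:
    """Return the explicit dtype if one was set, else the role-based default."""
    if cfg.target_dtype is not None:
        return cfg.target_dtype
    # Assumed defaults: e5m2 for gradients, e4m3 for inputs and weights.
    return E5M2 if is_grad else E4M3
```

With this shape, a gradient can opt into e4m3 by passing `CastConfigSketch(target_dtype=E4M3)`, while configs that leave the field unset keep today's behaviour.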

vkuzo · 2024-11-26T22:46:42Z

        # grad_input_hp = grad_output_fp8_axiswise_dim0 @ weight_fp8_tensorwise
-        cc_go = CastConfig(scaling_granularity=ScalingGranularity.AXISWISE)
+        cc_go = CastConfig(
+            scaling_granularity=ScalingGranularity.AXISWISE, dtype=e4m3_dtype


nit: maybe we can also add some context in the comments on L353:L363 that it also uses e4m3 for grads?

vkuzo · 2024-11-26T22:48:18Z

-    NoopFwToFloat8E5M2BwDelayed,
-    NoopFwToFloat8E5M2BwDynamic,
-    NoopFwToFloat8E5M2BwStatic,
+    NoopFwToFloat8BwDelayed,


thanks for updating these!

vkuzo · 2024-11-26T22:50:23Z

        # Calculate the new scales from the updated history stacks
        new_input_scales = amax_history_to_scale_stack(
-            fp8_input_amax_history_stack, e4m3_dtype, x_dtype, scale_fn_recipe
+            fp8_input_amax_history_stack, input_dtype, x_dtype, scale_fn_recipe


will likely have to rebase on top of #1329 which changed this line
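For context on why the dtype argument in this hunk matters, here is a rough sketch of how a scale can depend on the target float8 format. It is an illustration only, not the body of `amax_history_to_scale_stack`: `amax_to_scale_sketch` and `EPS` are hypothetical, and the dtypes are represented by their names so the snippet runs without PyTorch. The two maxima are the largest finite values of `torch.float8_e4m3fn` and `torch.float8_e5m2`.

```python
# Largest finite value representable in each float8 format.
FP8_MAX = {"float8_e4m3fn": 448.0, "float8_e5m2": 57344.0}

# Hypothetical floor that keeps the division well defined when amax is 0.
EPS = 1e-12


def amax_to_scale_sketch(amax: float, target_dtype: str) -> float:
    """Scale that maps the observed absolute maximum onto the dtype's max."""
    return FP8_MAX[target_dtype] / max(amax, EPS)
```

The same `amax` yields a different scale for each format, so hardcoding `e4m3_dtype` would produce the wrong scale for an input configured as e5m2.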

vkuzo · 2024-11-26T22:53:44Z

    static_scale: Optional[torch.Tensor] = None
+    dtype: Optional[torch.dtype] = None

    def short_str(self):


can we also add the dtype here, so it appears when we print an instance of Float8Linear? Float8Linear.extra_repr calls this method.
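One way this could look, as a sketch rather than the actual torchao implementation: `CastConfigSketch` is a hypothetical stand-in, the abbreviations `"dyn"` and `"ten"` are made up for illustration, and strings replace the real enum and `torch.dtype` values so the snippet runs on its own. The dtype is appended only when it was set explicitly, so existing printed output stays unchanged for default configs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CastConfigSketch:
    scaling_type: str = "dyn"         # stand-in for ScalingType
    scaling_granularity: str = "ten"  # stand-in for ScalingGranularity
    dtype: Optional[str] = None       # stand-in for Optional[torch.dtype]

    def short_str(self) -> str:
        """Compact summary used when printing the owning module."""
        parts = [self.scaling_type, self.scaling_granularity]
        if self.dtype is not None:
            # Only mention the dtype when it overrides the default.
            parts.append(self.dtype)
        return "_".join(parts)
```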

vkuzo · 2024-11-26T22:54:10Z

This is great! LGTM, had some comments but all are pretty nitty. CI is green - ship it!

lw · 2024-12-04T15:57:39Z

Superseded by #1378

Update

e511579

[ghstack-poisoned]

lw mentioned this pull request Nov 22, 2024

[float8] Re-enable slow-accum in the bwd of axis-wise scaling schemes #1325

Merged

lw added a commit that referenced this pull request Nov 22, 2024

[float8] Allow specifying arbitrary dtype for each tensor

28949f8

ghstack-source-id: 7dabc91 Pull Request resolved: #1326

facebook-github-bot added the CLA Signed label Nov 22, 2024

Update

51acb5b

[ghstack-poisoned]

lw added a commit that referenced this pull request Nov 22, 2024

[float8] Allow specifying arbitrary dtype for each tensor

a57d7c8

ghstack-source-id: d9c0e02 Pull Request resolved: #1326

lw added the topic: new feature label Nov 22, 2024

Update

9e89f6a

[ghstack-poisoned]

lw added a commit that referenced this pull request Nov 22, 2024

[float8] Allow specifying arbitrary dtype for each tensor

b4876df

ghstack-source-id: e86e1f6 Pull Request resolved: #1326

Update

97b9cf8

[ghstack-poisoned]

lw added a commit that referenced this pull request Nov 22, 2024

[float8] Allow specifying arbitrary dtype for each tensor

a749a5f

ghstack-source-id: aa9f551 Pull Request resolved: #1326

Update

810ad91

[ghstack-poisoned]

lw added a commit that referenced this pull request Nov 22, 2024

[float8] Allow specifying arbitrary dtype for each tensor

71f2ea1

ghstack-source-id: c339ea0 Pull Request resolved: #1326

Update

b9672f5

[ghstack-poisoned]

lw added a commit that referenced this pull request Nov 22, 2024

[float8] Allow specifying arbitrary dtype for each tensor

7d28acf

ghstack-source-id: 4b3a2f0 Pull Request resolved: #1326

vkuzo reviewed Nov 26, 2024

Update

97c9983

[ghstack-poisoned]

lw added a commit that referenced this pull request Dec 4, 2024

[float8] Allow specifying arbitrary dtype for each tensor

541454b

ghstack-source-id: d8300e2 Pull Request resolved: #1326

lw mentioned this pull request Dec 4, 2024

[float8] Allow specifying arbitrary dtype for each tensor #1378

Merged

yanbing-j pushed a commit to yanbing-j/ao that referenced this pull request Dec 9, 2024

lm_eval: fix links and pin version to 0.4.2 (pytorch#1326)

2ea11b0

lw wants to merge 7 commits into gh/lw/2/base from gh/lw/2/head

3 participants