This repository was archived by the owner on Mar 2, 2026. It is now read-only.

[SYCL][Doc] Simplify queue recorded node definition and expand sub-graph section #254

Merged: EwanC merged 1 commit into sycl-graph-update from ewan/kernel_to_command on Jul 11, 2023

EwanC (Collaborator) commented on Jul 11, 2023

Simplify the definition of a node in the Record & Replay API by using "command" terminology rather than "kernel" to be more generic.

Wording around sub-graphs is also moved to the sub-graph section and expanded based on recent implementation experience.

Simplify the definition of a node in the Record & Replay API.
Move the wording around sub-graphs to the sub-graph section,
and use "command" terminology rather than "kernel" to be
more generic.

Actions Gordon's feedback from:
* Lack of clarity over inclusion of queue shortcut functions:
  intel#5626 (comment)
* Say "command" rather than "kernel":
  intel#5626 (comment)

EwanC added the Graph Specification label (Extension Specification related) on Jul 11, 2023
reble (Owner) left a comment:

LGTM

EwanC merged commit 498686c into sycl-graph-update on Jul 11, 2023
Bensuo deleted the ewan/kernel_to_command branch on July 12, 2023 16:37
Bensuo pushed a commit that referenced this pull request Apr 16, 2024
…ndling (#76644)

Fold BICi if all destination bits are already known to be zeroes

```llvm
define <8 x i16> @haddu_known(<8 x i8> %a0, <8 x i8> %a1) {
  %x0 = zext <8 x i8> %a0 to <8 x i16>
  %x1 = zext <8 x i8> %a1 to <8 x i16>
  %hadd = call <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16> %x0, <8 x i16> %x1)
  %res = and <8 x i16> %hadd, <i16 511, i16 511, i16 511, i16 511, i16 511, i16 511, i16 511, i16 511>
  ret <8 x i16> %res
}
declare <8 x i16> @llvm.aarch64.neon.uhadd.v8i16(<8 x i16>, <8 x i16>)
```

```
haddu_known:                            // @haddu_known
        ushll   v0.8h, v0.8b, #0
        ushll   v1.8h, v1.8b, #0
        uhadd   v0.8h, v0.8h, v1.8h
        bic     v0.8h, #254, lsl #8     // <-- this one will be removed as we know high bits are zero extended
        ret
```
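
The reasoning behind the fold can be checked with a small scalar model of one 16-bit lane. This is only a sketch, not the LLVM implementation: the helper names (`uhadd16`, `bic16`, `bic_is_redundant`) are hypothetical, and it assumes `uhadd` computes `(x + y) >> 1` without intermediate overflow and that `bic` with an immediate clears the immediate's bits in each lane. Under those assumptions, halving the sum of two zero-extended 8-bit values never exceeds 255, so the bits cleared by `#254, lsl #8` (0xFE00) are already zero.

```python
MASK16 = 0xFFFF        # width of one .8h lane
BIC_IMM = 254 << 8     # 0xFE00, the immediate in `bic v0.8h, #254, lsl #8`
AND_MASK = 511         # 0x01FF, the constant used by the IR `and`


def uhadd16(x: int, y: int) -> int:
    """Model of an unsigned halving add on one 16-bit lane (assumed semantics)."""
    return ((x + y) >> 1) & MASK16


def bic16(x: int, imm: int) -> int:
    """Model of bit-clear on one 16-bit lane: x AND NOT imm."""
    return x & ~imm & MASK16


def bic_is_redundant() -> bool:
    """Exhaustively check every pair of zero-extended 8-bit inputs."""
    for a in range(256):
        for b in range(256):
            h = uhadd16(a, b)
            if bic16(h, BIC_IMM) != h:
                return False
    return True


if __name__ == "__main__":
    # The `and` with 511 and the `bic` with 0xFE00 are the same operation.
    print((~BIC_IMM & MASK16) == AND_MASK)
    print(bic_is_redundant())
```

The exhaustive loop stands in for the known-bits analysis the compiler performs symbolically; it only demonstrates that the instruction is a no-op for this particular input pattern.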

Fixes #53881
Fixes #53622