[VPlan] Embed widening semantics of recipes (NFCI)#195385
Conversation
|
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-transforms Author: Ramkumar Ramachandra (artagnon) ChangesWe currently make widening decisions in an ad-hoc fashion, and have helpers that unnecessarily do recursive reasoning over and over: isSingleScalar, onlyFirstLaneUsed, and onlyScalarValuesUsed. The goal is to make widening decisions in a principled fashion, simply by analyzing the Plan, and this patch is the first step towards the goal. By embedding widening decisions in recipes, which can be refined iteratively, we can eliminate the hard-coded wide-equivalents of VPInstruction, including VPWidenRecipe, VPWidenGEPRecipe, and VPWidenCastRecipe. The design we have picked would also allow us to eliminate VPReplicateRecipe. This patch is intentionally non-functional, but it should have enough information to see what the follow-ups would be. Patch is 40.85 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/195385.diff 7 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index e912751525fc7..e74726d90d690 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7848,7 +7848,7 @@ static SmallVector<Instruction *> preparePlanForEpilogueVectorLoop(
[](const VPUser *U) {
return isa<VPScalarIVStepsRecipe>(U) ||
isa<VPDerivedIVRecipe>(U) ||
- cast<VPRecipeBase>(U)->isScalarCast() ||
+ cast<VPInstruction>(U)->isScalarCast() ||
cast<VPInstruction>(U)->getOpcode() ==
Instruction::Add;
}) &&
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 592d83af2295f..0d3755ff4d1fa 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -409,6 +409,9 @@ class LLVM_ABI_FOR_TEST VPRecipeBase
/// Subclass identifier (for isa/dyn_cast).
const unsigned char SubclassID;
+ /// Two bits of widening information, which takes values in VPWideningTy.
+ unsigned char WideningInfo : 2;
+
/// Each VPRecipe belongs to a single VPBasicBlock.
VPBasicBlock *Parent = nullptr;
@@ -466,9 +469,21 @@ class LLVM_ABI_FOR_TEST VPRecipeBase
VPLastPHISC = VPReductionPHISC,
};
- VPRecipeBase(const unsigned char SC, ArrayRef<VPValue *> Operands,
+ /// An enumeration for keeping track of whether the widening status of the
+ /// recipe.
+ /// Narrow indicates that the recipe necessarily produces a narrow scalar.
+ /// PossiblyNarrow indicates that the recipe is wide, although it could be
+ /// narrowed to produce a scalar if profitable.
+ /// Wide indicates that the recipes necessarily produces a wide result.
+ /// ReplicatePart indicates that the recipe needs to be replicated, producing
+ /// scalars for each part, which the unroller handles.
+ using VPWideningTy = enum { Narrow, PossiblyNarrow, Wide, ReplicatePart };
+
+ VPRecipeBase(const unsigned char SC, unsigned char WideInfo,
+ ArrayRef<VPValue *> Operands,
DebugLoc DL = DebugLoc::getUnknown())
- : VPDef(), VPUser(Operands), SubclassID(SC), DL(DL) {}
+ : VPDef(), VPUser(Operands), SubclassID(SC), WideningInfo(WideInfo),
+ DL(DL) {}
~VPRecipeBase() override = default;
@@ -552,8 +567,16 @@ class LLVM_ABI_FOR_TEST VPRecipeBase
/// Returns the debug location of the recipe.
DebugLoc getDebugLoc() const { return DL; }
- /// Return true if the recipe is a scalar cast.
- bool isScalarCast() const;
+ /// Methods for querying and setting WideningInfo.
+ bool isNarrow() const { return WideningInfo == Narrow; }
+ void markNarrow() { WideningInfo = Narrow; }
+ bool maybeNarrow() const {
+ return is_contained({Narrow, PossiblyNarrow}, WideningInfo);
+ }
+ void markPossiblyNarrow() { WideningInfo = PossiblyNarrow; }
+ bool isWide() const { return WideningInfo == Wide; }
+ bool isReplicatePart() const { return WideningInfo == ReplicatePart; }
+ void markReplicatePart() { WideningInfo = ReplicatePart; }
/// Set the recipe's debug location to \p NewDL.
void setDebugLoc(DebugLoc NewDL) { DL = NewDL; }
@@ -604,13 +627,15 @@ class LLVM_ABI_FOR_TEST VPRecipeBase
/// Note that VPRecipeBase must be inherited from before VPValue.
class VPSingleDefRecipe : public VPRecipeBase, public VPRecipeValue {
public:
- VPSingleDefRecipe(const unsigned char SC, ArrayRef<VPValue *> Operands,
+ VPSingleDefRecipe(const unsigned char SC, unsigned char WideInfo,
+ ArrayRef<VPValue *> Operands,
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeBase(SC, Operands, DL), VPRecipeValue(this) {}
+ : VPRecipeBase(SC, WideInfo, Operands, DL), VPRecipeValue(this) {}
- VPSingleDefRecipe(const unsigned char SC, ArrayRef<VPValue *> Operands,
- Value *UV, DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeBase(SC, Operands, DL), VPRecipeValue(this, UV) {}
+ VPSingleDefRecipe(const unsigned char SC, unsigned char WideInfo,
+ ArrayRef<VPValue *> Operands, Value *UV,
+ DebugLoc DL = DebugLoc::getUnknown())
+ : VPRecipeBase(SC, WideInfo, Operands, DL), VPRecipeValue(this, UV) {}
static inline bool classof(const VPRecipeBase *R) {
switch (R->getVPRecipeID()) {
@@ -1107,10 +1132,10 @@ static_assert(sizeof(VPIRFlags) <= 3, "VPIRFlags should not grow");
/// A pure-virtual common base class for recipes defining a single VPValue and
/// using IR flags.
struct VPRecipeWithIRFlags : public VPSingleDefRecipe, public VPIRFlags {
- VPRecipeWithIRFlags(const unsigned char SC, ArrayRef<VPValue *> Operands,
- const VPIRFlags &Flags,
+ VPRecipeWithIRFlags(const unsigned char SC, unsigned char WideInfo,
+ ArrayRef<VPValue *> Operands, const VPIRFlags &Flags,
DebugLoc DL = DebugLoc::getUnknown())
- : VPSingleDefRecipe(SC, Operands, DL), VPIRFlags(Flags) {}
+ : VPSingleDefRecipe(SC, WideInfo, Operands, DL), VPIRFlags(Flags) {}
static inline bool classof(const VPRecipeBase *R) {
return R->getVPRecipeID() == VPRecipeBase::VPBlendSC ||
@@ -1483,12 +1508,17 @@ class LLVM_ABI_FOR_TEST VPInstruction : public VPRecipeWithIRFlags,
/// Returns true if this VPInstruction produces a scalar value from a vector,
/// e.g. by performing a reduction or extracting a lane.
+ /// TODO: Remove in favor of WideningInfo.
bool isVectorToScalar() const;
/// Returns true if this VPInstruction's operands are single scalars and the
/// result is also a single scalar.
+ /// TODO: Remove in favor of WideningInfo.
bool isSingleScalar() const;
+ /// Returns true if this VPInstruction is a scalar cast.
+ bool isScalarCast() const { return Instruction::isCast(getOpcode()); }
+
/// Returns the symbolic name assigned to the VPInstruction.
StringRef getName() const { return Name; }
@@ -1524,8 +1554,6 @@ class VPInstructionWithType : public VPInstruction {
static inline bool classof(const VPRecipeBase *R) {
// VPInstructionWithType are VPInstructions with specific opcodes requiring
// type information.
- if (R->isScalarCast())
- return true;
auto *VPI = dyn_cast<VPInstruction>(R);
if (!VPI)
return false;
@@ -1536,7 +1564,7 @@ class VPInstructionWithType : public VPInstruction {
case Instruction::Load:
return true;
default:
- return false;
+ return VPI->isScalarCast();
}
}
@@ -1684,7 +1712,8 @@ class VPIRInstruction : public VPRecipeBase {
/// VPIRInstruction::create() should be used to create VPIRInstructions, as
/// subclasses may need to be created, e.g. VPIRPhi.
VPIRInstruction(Instruction &I)
- : VPRecipeBase(VPRecipeBase::VPIRInstructionSC, {}), I(I) {}
+ : VPRecipeBase(VPRecipeBase::VPIRInstructionSC, VPRecipeBase::Narrow, {}),
+ I(I) {}
public:
~VPIRInstruction() override = default;
@@ -1780,7 +1809,8 @@ class LLVM_ABI_FOR_TEST VPWidenRecipe : public VPRecipeWithIRFlags,
VPWidenRecipe(Instruction &I, ArrayRef<VPValue *> Operands,
const VPIRFlags &Flags = {}, const VPIRMetadata &Metadata = {},
DebugLoc DL = {})
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenSC, Operands, Flags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenSC, VPRecipeBase::Wide,
+ Operands, Flags, DL),
VPIRMetadata(Metadata), Opcode(I.getOpcode()) {
setUnderlyingValue(&I);
}
@@ -1788,7 +1818,8 @@ class LLVM_ABI_FOR_TEST VPWidenRecipe : public VPRecipeWithIRFlags,
VPWidenRecipe(unsigned Opcode, ArrayRef<VPValue *> Operands,
const VPIRFlags &Flags = {}, const VPIRMetadata &Metadata = {},
DebugLoc DL = {})
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenSC, Operands, Flags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenSC, VPRecipeBase::Wide,
+ Operands, Flags, DL),
VPIRMetadata(Metadata), Opcode(Opcode) {}
~VPWidenRecipe() override = default;
@@ -1841,7 +1872,8 @@ class VPWidenCastRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
CastInst *CI = nullptr, const VPIRFlags &Flags = {},
const VPIRMetadata &Metadata = {},
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenCastSC, Op, Flags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenCastSC, VPRecipeBase::Wide, Op,
+ Flags, DL),
VPIRMetadata(Metadata), Opcode(Opcode), ResultTy(ResultTy) {
assert(flagsValidForOpcode(Opcode) &&
"Set flags not supported for the provided opcode");
@@ -1903,8 +1935,8 @@ class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
const VPIRFlags &Flags = {},
const VPIRMetadata &MD = {},
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenIntrinsicSC, CallArguments,
- Flags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenIntrinsicSC,
+ VPRecipeBase::Wide, CallArguments, Flags, DL),
VPIRMetadata(MD), VectorIntrinsicID(VectorIntrinsicID), ResultTy(Ty),
MayReadFromMemory(CI.mayReadFromMemory()),
MayWriteToMemory(CI.mayWriteToMemory()),
@@ -1917,8 +1949,8 @@ class VPWidenIntrinsicRecipe : public VPRecipeWithIRFlags, public VPIRMetadata {
const VPIRFlags &Flags = {},
const VPIRMetadata &Metadata = {},
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenIntrinsicSC, CallArguments,
- Flags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenIntrinsicSC,
+ VPRecipeBase::Wide, CallArguments, Flags, DL),
VPIRMetadata(Metadata), VectorIntrinsicID(VectorIntrinsicID),
ResultTy(Ty) {
LLVMContext &Ctx = Ty->getContext();
@@ -1992,8 +2024,8 @@ class LLVM_ABI_FOR_TEST VPWidenCallRecipe : public VPRecipeWithIRFlags,
ArrayRef<VPValue *> CallArguments,
const VPIRFlags &Flags = {},
const VPIRMetadata &Metadata = {}, DebugLoc DL = {})
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenCallSC, CallArguments, Flags,
- DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenCallSC, VPRecipeBase::Wide,
+ CallArguments, Flags, DL),
VPIRMetadata(Metadata), Variant(Variant) {
setUnderlyingValue(UV);
assert(
@@ -2044,7 +2076,8 @@ class VPHistogramRecipe : public VPRecipeBase {
public:
VPHistogramRecipe(unsigned Opcode, ArrayRef<VPValue *> Operands,
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeBase(VPRecipeBase::VPHistogramSC, Operands, DL),
+ : VPRecipeBase(VPRecipeBase::VPHistogramSC, VPRecipeBase::Wide, Operands,
+ DL),
Opcode(Opcode) {}
~VPHistogramRecipe() override = default;
@@ -2094,7 +2127,8 @@ class LLVM_ABI_FOR_TEST VPWidenGEPRecipe : public VPRecipeWithIRFlags {
VPWidenGEPRecipe(GetElementPtrInst *GEP, ArrayRef<VPValue *> Operands,
const VPIRFlags &Flags = {},
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeWithIRFlags(VPRecipeBase::VPWidenGEPSC, Operands, Flags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPWidenGEPSC, VPRecipeBase::Wide,
+ Operands, Flags, DL),
SourceElementTy(GEP->getSourceElementType()) {
setUnderlyingValue(GEP);
SmallVector<std::pair<unsigned, MDNode *>> Metadata;
@@ -2153,8 +2187,8 @@ class VPVectorEndPointerRecipe : public VPRecipeWithIRFlags {
public:
VPVectorEndPointerRecipe(VPValue *Ptr, VPValue *VF, Type *SourceElementTy,
int64_t Stride, GEPNoWrapFlags GEPFlags, DebugLoc DL)
- : VPRecipeWithIRFlags(VPRecipeBase::VPVectorEndPointerSC, {Ptr, VF},
- GEPFlags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPVectorEndPointerSC,
+ VPRecipeBase::Narrow, {Ptr, VF}, GEPFlags, DL),
SourceElementTy(SourceElementTy), Stride(Stride) {
assert(Stride < 0 && "Stride must be negative");
}
@@ -2222,7 +2256,8 @@ class VPVectorPointerRecipe : public VPRecipeWithIRFlags {
public:
VPVectorPointerRecipe(VPValue *Ptr, Type *SourceElementTy,
GEPNoWrapFlags GEPFlags, DebugLoc DL)
- : VPRecipeWithIRFlags(VPRecipeBase::VPVectorPointerSC, Ptr, GEPFlags, DL),
+ : VPRecipeWithIRFlags(VPRecipeBase::VPVectorPointerSC,
+ VPRecipeBase::Narrow, Ptr, GEPFlags, DL),
SourceElementTy(SourceElementTy) {}
VP_CLASSOF_IMPL(VPRecipeBase::VPVectorPointerSC)
@@ -2296,7 +2331,8 @@ class LLVM_ABI_FOR_TEST VPHeaderPHIRecipe : public VPSingleDefRecipe,
protected:
VPHeaderPHIRecipe(unsigned char VPRecipeID, Instruction *UnderlyingInstr,
VPValue *Start, DebugLoc DL = DebugLoc::getUnknown())
- : VPSingleDefRecipe(VPRecipeID, Start, UnderlyingInstr, DL) {}
+ : VPSingleDefRecipe(VPRecipeID, VPRecipeBase::Wide, Start,
+ UnderlyingInstr, DL) {}
const VPRecipeBase *getAsRecipe() const override { return this; }
@@ -2585,7 +2621,8 @@ class LLVM_ABI_FOR_TEST VPWidenPHIRecipe : public VPSingleDefRecipe,
/// debug location \p DL and \p Name.
VPWidenPHIRecipe(ArrayRef<VPValue *> IncomingValues,
DebugLoc DL = DebugLoc::getUnknown(), const Twine &Name = "")
- : VPSingleDefRecipe(VPRecipeBase::VPWidenPHISC, IncomingValues, DL),
+ : VPSingleDefRecipe(VPRecipeBase::VPWidenPHISC, VPRecipeBase::Wide,
+ IncomingValues, DL),
Name(Name.str()) {}
VPWidenPHIRecipe *clone() override {
@@ -2780,7 +2817,8 @@ class LLVM_ABI_FOR_TEST VPBlendRecipe : public VPRecipeWithIRFlags {
/// all other incoming values are merged into it.
VPBlendRecipe(PHINode *Phi, ArrayRef<VPValue *> Operands,
const VPIRFlags &Flags, DebugLoc DL)
- : VPRecipeWithIRFlags(VPRecipeBase::VPBlendSC, Operands, Flags, DL) {
+ : VPRecipeWithIRFlags(VPRecipeBase::VPBlendSC, VPRecipeBase::Wide,
+ Operands, Flags, DL) {
assert(Operands.size() >= 2 && "Expected at least two operands!");
setUnderlyingValue(Phi);
}
@@ -2861,8 +2899,8 @@ class LLVM_ABI_FOR_TEST VPInterleaveBase : public VPRecipeBase,
ArrayRef<VPValue *> Operands,
ArrayRef<VPValue *> StoredValues, VPValue *Mask,
bool NeedsMaskForGaps, const VPIRMetadata &MD, DebugLoc DL)
- : VPRecipeBase(SC, Operands, DL), VPIRMetadata(MD), IG(IG),
- NeedsMaskForGaps(NeedsMaskForGaps) {
+ : VPRecipeBase(SC, VPRecipeBase::Wide, Operands, DL), VPIRMetadata(MD),
+ IG(IG), NeedsMaskForGaps(NeedsMaskForGaps) {
// TODO: extend the masked interleaved-group support to reversed access.
assert((!Mask || !IG->isReverse()) &&
"Reversed masked interleave-group not supported.");
@@ -3046,8 +3084,8 @@ class LLVM_ABI_FOR_TEST VPReductionRecipe : public VPRecipeWithIRFlags {
FastMathFlags FMFs, Instruction *I,
ArrayRef<VPValue *> Operands, VPValue *CondOp,
ReductionStyle Style, DebugLoc DL)
- : VPRecipeWithIRFlags(SC, Operands, FMFs, DL), RdxKind(RdxKind),
- Style(Style) {
+ : VPRecipeWithIRFlags(SC, VPRecipeBase::Wide, Operands, FMFs, DL),
+ RdxKind(RdxKind), Style(Style) {
if (CondOp) {
IsConditional = true;
addOperand(CondOp);
@@ -3188,9 +3226,6 @@ class LLVM_ABI_FOR_TEST VPReductionEVLRecipe : public VPReductionRecipe {
/// a single scalar, only one copy will be generated.
class LLVM_ABI_FOR_TEST VPReplicateRecipe : public VPRecipeWithIRFlags,
public VPIRMetadata {
- /// Indicator if only a single replica per lane is needed.
- bool IsSingleScalar;
-
/// Indicator if the replicas are also predicated.
bool IsPredicated;
@@ -3199,9 +3234,11 @@ class LLVM_ABI_FOR_TEST VPReplicateRecipe : public VPRecipeWithIRFlags,
bool IsSingleScalar, VPValue *Mask = nullptr,
const VPIRFlags &Flags = {}, VPIRMetadata Metadata = {},
DebugLoc DL = DebugLoc::getUnknown())
- : VPRecipeWithIRFlags(VPRecipeBase::VPReplicateSC, Operands, Flags, DL),
- VPIRMetadata(Metadata), IsSingleScalar(IsSingleScalar),
- IsPredicated(Mask) {
+ : VPRecipeWithIRFlags(VPRecipeBase::VPReplicateSC,
+ IsSingleScalar ? VPRecipeBase::Narrow
+ : VPRecipeBase::ReplicatePart,
+ Operands, Flags, DL),
+ VPIRMetadata(Metadata), IsPredicated(Mask) {
setUnderlyingValue(I);
if (Mask)
addOperand(Mask);
@@ -3211,7 +3248,7 @@ class LLVM_ABI_FOR_TEST VPReplicateRecipe : public VPRecipeWithIRFlags,
VPReplicateRecipe *clone() override {
auto *Copy = new VPReplicateRecipe(
- getUnderlyingInstr(), operands(), IsSingleScalar,
+ getUnderlyingInstr(), operands(), isNarrow(),
isPredicated() ? getMask() : nullptr, *this, *this, getDebugLoc());
Copy->transferFlags(*this);
return Copy;
@@ -3228,7 +3265,8 @@ class LLVM_ABI_FOR_TEST VPReplicateRecipe : public VPRecipeWithIRFlags,
InstructionCost computeCost(ElementCount VF,
VPCostContext &Ctx) const override;
- bool isSingleScalar() const { return IsSingleScalar; }
+ /// TODO: Remove.
+ bool isSingleScalar() const { return isNarrow(); }
bool isPredicated() const { return IsPredicated; }
@@ -3271,7 +3309,8 @@ class LLVM_ABI_FOR_TEST VPReplicateRecipe : public VPRecipeWithIRFlags,
class LLVM_ABI_FOR_TEST VPBranchOnMaskRecipe : public VPRecipeBase {
public:
VPBranchOnMaskRecipe(VPValue *BlockInMask, DebugLoc DL)
- : VPRecipeBase(VPRecipeBase::VPBranchOnMaskSC, {BlockInMask}, DL) {}
+ : VPRecipeBase(VPRecipeBase::VPBranchOnMaskSC, VPRecipeBase::Narrow,
+ {BlockInMask}, DL) {}
VPBranchOnMaskRecipe *clone() override {
return new VPBranchOnMaskRecipe(getOperand(0), getDebugLoc());
@@ -3459,8 +3498,10 @@ class LLVM_ABI_FOR_TEST VPPredInstPHIRecipe : public VPSingleDefRecipe {
public:
/// Construct a VPPredInstPHIRecipe given \p PredInst whose value needs a phi
/// nodes after merging back from a Branch-on-Mask.
+ /// FIXME: The WideInfo should be narrow.
VPPredInstPHIRecipe(VPValue *PredV, DebugLoc DL)
- : VPSingleDefRecipe(VPRecipeBase::VPPredInstPHISC, PredV, DL) {}
+ : VPSingleDefRecipe(VPRecipeBase::VPPredInstPHISC, VPRecipeBase::Wide,
+ PredV, DL) {}
~VPPredInstPHIRecipe() override = default;
VPPredInstPHIRecipe *clone() override {
@@ -3523,7 +3564,8 @@ class LLVM_ABI_FOR_TEST VPWidenMemoryRecipe : public VPRecipeBase,
std::initializer_list<VPValue *> Operands,
bool Consecutive, const VPIRMetadata &Metadata,
DebugLoc DL)
- : VPRecipeBase(SC, Operands, DL), VPIRMetadata(Metadata), Ingredient(I),
+ : VPRecipeBase(SC, VPRecipeBase::Wide, Operands, DL),
+ VPIRMetadata(Metadata), Ingredient(I),
Alignment(getLoadStoreAlignment(&I)), Consecutive(Consecutive) {}
public:
@@ -3753,7 +3795,9 @@ class VPExpandSCEVRecipe : public VPSingleDefRecipe {
public:
VPExpandSCEVRecipe(const SCEV *Expr)
- : VPSingleDefRecipe(VPRecipeBase::VPExpandSCEVSC, {}), Expr(Expr) {}
+ : VPSingleDefRecipe(VPRecipeBase::VPExpandSCEVSC, VPRecipeBase::Narrow,
+ {}),
+ Expr(Expr) {}
~VPExpandSCEVRecipe() override = default;
@@ -3862,7 +3906,8 @@ class VPWidenCanonicalIVRecipe : public VPSingleDefRecipe,
public VPUnrollPartAccessor<1> {
public:
V...
[truncated]
|
🐧 Linux x64 Test Results
✅ The build succeeded and all tests passed. |
🪟 Windows x64 Test Results
✅ The build succeeded and all tests passed. |
347e5cb to
fb1bcb5
Compare
fb1bcb5 to
f0d6d76
Compare
fdb7742 to
58cacd9
Compare
fhahn
left a comment
There was a problem hiding this comment.
For the transition towards folding various recipes into VPInstruction & co to reduce unnecessary duplication, we need a way to express the widening decision directly inside VPInstruction, which currently is the main recipe that does not directly imply a widening decisions. There are a few others that produce a single-scalar/scalar-for-all lanes like ScalarIVSteps, but VPInstruction is the main outlier, handling all three widening options somewhat implicitly.
It would be good to start focusing on VPInstruction and trying to subsume obvious duplicated recipes. We should be able to get there incrementally, by starting with tracking the widening decision in VPInstruction. Existing #129712 and #141429 could serve as starting point, by tracking single-scalar first.
This could then be easily extended to encode the full range with an enum as proposed here, coupled with moving cases from VPReplicateRecipe over to VPInstruction.
I think it would still be good to tie the widening decision immutably to a recipe instance (somewhat fixing the current outlier VPInstruction). Almost all current recipes already encode this, so I am not sure we should add a field to track the state that is already explicit for most recipes as first step.
At least initially, I think it would also be good to keep the distinction between the widening encoded in a recipe, and changes done via analysis + transform (e.g. if we can determine that a recipe producing a vector is only used as scalar, we should convert it to a single-scalar recipe)
|
There are several special-cases that I have unified here, and recipes other than VPInstruction already have ambiguous semantics: there can be a Widen that can never be narrowed like Widen(Call|Intrinsic) due to Function-type mismatch, a Widen that can be narrowed like Widen(GEP|Cast), a Narrow (what we call "single-scalar" in some places), a ReplicatePart that is illegal to narrow like a side-effecting one, a ReplicatePart that is legal to narrow like a BinOp or Cast, VectorToScalar that produce scalars from a vector, and ScalarToVector that produce a vector from a scalar. Of course, we could match the recipe and the opcode to find out which of these classes the instruction belongs to, but I thought it would be good to embed the information in VPRecipeBase to clean up some code and avoid recursive reasoning. I agree that we should try to consolidate recipes into VPInstruction: I also proposed #193510. I'm not sure about an immutable widening decision though -- the issue is that recipes can go from a Widen/ReplicatePart decision into a Narrow decision at any point: this depends both on legality and profitability, which do change over transforms. Currently, we widen recipes upfront (using the legacy costing), creating hard-coded wide recipes, and we see these pretty early in the VPlan: isn't it wasteful to extract the data from one kind of recipe just to create another kind of recipe to replace it? I found out that "single-scalar" is ill-defined, which is the reason I separated out concerns in the WideningTy bits. The reason I'm proposing this prior to the consolidation is so that we get the terminology right upfront, and avoid falling into the same pattern of ill-defined terms and hacks to work around it. For example, in the new vputils::isSingleScalar, it is clear that the isAgnostic case should be removed, and the helper should be renamed to vputils::couldProduceNarrowResultByNarrowing (which is what it really is). I've improved usesScalarsOnly in this patch, and usesFirstLaneOnly can also be improved in a follow-up to avoid recursive reasoning, to check against WideningInfo. Yes, this has the downside that a marking a hard-coded wide recipe Narrow would not change anything, and the goal is not to make it work, but rather remove the hard-coded wide recipes like Widen(GEP|Cast) altogether, and unify the execute in the wide-narrow-unified recipe to respect WideningInfo. I'm not a fan of creating dead code/paths either, but I would really like to see the enum land, so I can make forward progress? I tried out a lot of experiments (see the closed PRs linked in this PR) to discover these properties, and I'm confident that I've captured all the widening semantics now? I did try to show enough of an upside in the current PR though: many casts and TypeSwitches would get cleaned up when simply reasoning about the widening semantics of a recipe -- in this PR, vputils::isSingleScalar got simplified, and as a follow up, narrowToSingleScalars can be simplified with optimization benefits? A planned follow-up is to directly be able to query VPValue to check its widening information: roughly speaking, a live-in will not require a broadcast if all its users are narrow'able recipes. Another planned follow-up is to extend narrowToSingleScalars to narrow all narrow'able recipes -- an example of such a recipe that cannot be absorbed into VPInstruction is VPPredInstPHI. I could move the enum to VPInstruction, and do a TypeSwitch over recipes, querying their opcodes for this WideningInfo, but seems wasteful and would create unnecessary duplication and confusion? Even if the recipes' WideningInfo is fixed and immutable, I think it's valuable to have it embedded in the recipe, as this information is not present elsewhere, and it is up to contributors to inspect each recipe and figure out the widening semantics? The main simplifications would come in the recipes' execute, which would not use hacks like onlyFirstLaneUsed or isSingleScalar to re-analyze the VPlan, and determine the IsScalar boolean of State.get or State.set: they would simply use the WideningInfo? I think the alternative of having virtual functions is unworkable for the moment, as VPInstruction isn't the only recipe that mutates the WideningInfo. On the analysis/transform separation, how would we convert any recipe to a "single-scalar" to commit the result of the analysis? Do you propose to create a "single-scalar" versions of recipes like Blends, PredInstPHI, and Reductions? Would you duplicate their execute members, inserting true for State.(get|set) in one place, and false in another? If you ask me, the only difference the widening decision makes to a recipe is this boolean in the execute? p.s. I see that your VPI-consolidation patches are quite old -- I think we can go ahead with the scalar cast patch (I'll try to review it), but I really think we should try to review this PR before the widen cast. |
6f6e527 to
36f36f8
Compare
We currently make widening decisions in an ad-hoc fashion, and have helpers that unnecessarily do recursive reasoning over and over: isSingleScalar, onlyFirstLaneUsed, and onlyScalarValuesUsed. The goal is to make widening decisions in a principled fashion, simply by analyzing the Plan, and this patch is the first step towards the goal. By embedding widening decisions in recipes, which can be refined iteratively, we can eliminate the hard-coded wide-equivalents of VPInstruction, including VPWidenRecipe, VPWidenGEPRecipe, and VPWidenCastRecipe. The design we have picked would also allow us to eliminate VPReplicateRecipe. This patch is intentionally non-functional, but it should have enough information to see what the follow-ups would be.
36f36f8 to
fb27825
Compare
|
I have made it immutable for now -- WideningInfo is now query'able information holding the widening semantics of different recipes, and this data is not present anywhere else at the moment. We can discuss the mutability aspect separately. |
|
I had some more time to think about this. I'll write in bullets to avoid writing the same kind of long prose as my previous post:
|
|
Better idea: #196181. |
The goal is to make widening decisions in a principled fashion, simply by analyzing the Plan, and this patch is the first step towards the goal. By embedding widening semantics in recipes, we make explicit data that is otherwise non-trivial to infer from the recipe.