Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Value-number byref loads#9169

Merged
JosephTremoulet merged 4 commits intodotnet:masterfrom
JosephTremoulet:UntrackedMem
Feb 8, 2017
Merged

Value-number byref loads#9169
JosephTremoulet merged 4 commits intodotnet:masterfrom
JosephTremoulet:UntrackedMem

Conversation

@JosephTremoulet
Copy link

This change teaches value numbering to track states of "byref exposed" memory,
like it currently tracks states of heap memory; the default value number for a load
can then become a function of the memory state and pointer VN. This enables
CSE and loop hoisting to kick in when there are no intervening stores.
It also clears the way for us to chain "byref exposed" memory VNs similar to how
heap VNs are chained (via VNF_MapStore), to enable value propagation through
byref params and/or address-taken locals when there are no intervening conflicting
stores.

There are a few constituent changes:

  1. A minor refactoring to facilitate the subsequent changes
  2. Abstracting the current "heap" tracking into a slightly more general
    "memory" tracking that allows for multiple memory kinds, of which
    "heap" is one.
  3. Adding the "byref exposed" memory kind
  4. Value-numbering "byref exposed" memory states and loads

The first three changes are no-diff.

@JosephTremoulet
Copy link
Author

@briansull @dotnet/jit-contrib PTAL.

I expect it would be easiest to review the constituent commits individually.

I've confirmed this gets the desired loop hoisting from #9036, and the desired CSEs from #7903, and recognizes the motivating CSE from #7914 as legal, though CSE's heuristics decide not to fire in that case (though many similar methods improve as expected).

Diffs generally consist of improved redundant load elimination, invariant code motion, and bounds-check elimination.

Desktop SPMI diffs report a codesize win of 2% across 35,000 affected methods.

Lab runs appear to show a 7% improvement in the Matinv4 benchmark, in which a number of redundant loads and bounds checks have been removed.

@JosephTremoulet
Copy link
Author

@dotnet-bot test Windows_NT perf

inline MemoryKindSet memoryKindSet(MemoryKind id, MemoryKinds... ids)
{
return (1U << id) | memoryKindSet(ids...);
}
Copy link

@briansull briansull Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this syntax:
typename... MemoryKinds... and ids...
What is this called?
And is it supported across our compilers?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's called a variadic template, and yes it's supported across our compilers (and already used in a couple other places in the jit)

src/jit/block.h Outdated
// Bitmask for an empty MemoryKindSet
inline MemoryKindSet memoryKindSet()
{
return 0;
Copy link

@briansull briansull Jan 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about just using two consts: emptyMemoryKindSet' and fullMemoryKindSet`

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks for the suggestion; having emptyMemoryKindSet instead of memoryKindSet() at those uses reads much cleaner I think.

if ((tree->gtFlags & GTF_IND_VOLATILE) != 0)
{
// For any Volatile indirection, we must handle it as a
// definition of the global heap

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

global heap => GCHeap/ByrefExposed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (and likewise for the rest of these)

GcHeap, // Includes actual GC heap, and also static fields
MemoryKindCount, // Number of MemoryKinds
};
#ifdef DEBUG

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to mention that we really only support two kinds of memory and that if a third kind is every added then you need to examine the use of the variable byrefStatesMatchGcHeapStates.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I avoided baking that assumption into the places that use byrefStatesMatchGcHeapStates. I used a few patterns:

1 - places where output[GcHeap] and output[ByrefExposed] reference the same data structure:

  for (MemoryKind memoryKind : allMemoryKinds()) {
    if (memoryKind == GcHeap && statesMatch) {
      // already updated
      assert(memoryKind > ByrefExposed);
      assert(input[memoryKind] == input[ByrefExposed]); // and/or assert(output[memoryKind] == f(input[memoryKind]))
      continue;
    }
    output[memoryKind] = f(input[memoryKind]);
  }

2a - places where output[GcHeap] and output[ByrefExposed] need to be copies of each other:

  for (MemoryKind memoryKind : allMemoryKinds()) {
    if (memoryKind == GcHeap && statesMatch) {
      // copy result from ByrefExposed
      assert(memoryKind > ByrefExposed);
      output[memoryKind] = output[ByrefExposed];
      continue;
    }
    output[memoryKind] = f(input[memoryKind]);
  }

2b - other places where output[GcHeap] and output[ByrefExposed] need to be copies of each other:

  for (MemoryKind memoryKind : allMemoryKinds()) {
    if (memoryKind == GcHeap && statesMatch) {
      // already updated
      assert(memoryKind > ByrefExposed);
      assert(input[memoryKind] == input[ByrefExposed]); // and/or assert(output[memoryKind] == f(input[memoryKind]))
      continue;
    }
    output[memoryKind] = f(input[memoryKind]);
    if (memoryKind == GcHeap && statesMatch) {
      // copy over to ByrefExposed
      output[ByrefExposed] = output[memoryKind];
    }
  }

(2a was more convenient for some of the particular updates, 2b for others).

In all cases, if a third kind were added the loop would just process it independently of GcHeap and ByrefExposed, which I think is a good default position.

Does that address your concern, or am I misunderstanding the feedback?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for the explaination

{
m_memorySsaMap[memoryKind] = nullptr;
}
m_refAnyClass = nullptr;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be willing to reorder m_refAnyClass to come before the loop


bool heapHavoc = false; // True ==> there's a call or a memory store that has arbitrary heap effects.
// MemoryKinds for which an in-loop call or store has arbitrary effects.
MemoryKindSet memoryHavoc = memoryKindSet();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would read better as
memoryHavoc = emptyMemoryKindSet();


// Bitmask containing all the MemoryKinds
const MemoryKindSet fullMemoryKindSet = (1 << MemoryKindCount) - 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add emptyMemoryKindSet


case GT_CLS_VAR:
// For Volatile indirection, first mutate the global heap
// For Volatile indirection, first mutate the GC heap

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC andf Byref


case GT_IND:
// For Volatile indirection, first mutate the global heap
// For Volatile indirection, first mutate GcHeap

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC and Byref

}

// In minopts, all stores conservatively affect both GcHeap and ByrefExposed memory
byrefStatesMatchGcHeapStates = true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We track these during MinOpts?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, perhaps the comment was misleading. I don't expect byrefStatesMatchGcHeapStates to be checked in minOpts, but leaving it unset didn't seem right. I've updated the comment to

        // In minopts, we don't explicitly build SSA or value-number; GcHeap and
        // ByrefExposed implicitly (conservatively) change state at each instr.

Is that better?

if ((fgCurMemoryHavoc & memoryKindSet(memoryKind)) != 0)
{
printf("*");
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could reduce what we print in the dumps by adding:

if (byrefStatesMatchGcHeapStates) { break; }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually had it like that at first, but regretted it as soon as I started debugging this stuff. Perhaps if that meant that ByrefExposed (or GcHeap) were just never mentioned in the dumps of byrefStatesMatchGcHeapStates methods, it would be useful, but since some of the liveness code that runs before statesMatch has been computed will end up dumping both, it just turns out confusing; I think better a more understandable dump than a smaller one, if we have to choose (and it seems in this case we have to choose).

if ((fgCurMemoryUse & memoryKindSet(memoryKind)) != 0)
{
printf(" + %s", memoryKindNames[memoryKind]);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could reduce what we print in the dumps by adding:

if (byrefStatesMatchGcHeapStates) { break; }

m_heapLiveIn = false;
m_heapLiveOut = false;
m_memoryLiveIn = memoryKindSet();
m_memoryLiveOut = memoryKindSet();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emptyMemoryKindSet

if ((block->bbMemoryLiveIn & memoryKindSet(memoryKind)) != 0)
{
printf(" + %s", memoryKindNames[memoryKind]);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (byrefStatesMatchGcHeapStates) { break; }

if ((block->bbMemoryLiveOut & memoryKindSet(memoryKind)) != 0)
{
printf(" + %s", memoryKindNames[memoryKind]);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (byrefStatesMatchGcHeapStates) { break; }

}

if (heapHavoc)
if (memoryHavoc != 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can memoryHavoc actually be different for GcHeap vs ByrefExposed?

If so it would be useful to have a test case in this area

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we assume all object field stores, static field stores, and array element stores alias all byref loads, we treat them as ByrefExposed-havoc (but not GcHeap-havoc). So every loop that has these but not GcHeap-havoc will have one kind of havoc but not the other.

//
// Return Value:
// The value number corresopnding to the heap after the assignment.
// The value number corresopnding to memory after the assignment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo corresopnding

assert(compCurBB->bbHeapDef);

fgCurHeapVN = vnStore->VNForMapStore(TYP_REF, fgCurHeapVN, elemTypeEqVN, newValAtArrType);
return vnStore->VNForMapStore(TYP_REF, fgCurMemoryVN[GcHeap], elemTypeEqVN, newValAtArrType);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we try to keep the old assert?
assert(compCurBB->bbHeapDef);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code used to assign fgCurHeapVN but doesn't anymore. Now it just returns the new value number, which then gets passed to recordGcHeapStore to do the update to fgCurMemoryVN[GcHeap], and that function does have the updated assert.

assert(fgCurHeapVN != ValueNumStore::NoVN);
}
else
// Now do the same for memory.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now do the same for both kinds of memory.

}
else
// Now do the same for memory.
for (MemoryKind memoryKind : allMemoryKinds())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to twice the work here, or should we check the bool byrefStatesMatchGcHeapStates

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do check byrefStatesMatchGcHeapStates, on line 4512 and line 4572. It made more sense to push them into the else branch of the conditional in the loop, because the if branch needs to do the same processing for both memoryKinds regardless.

Copy link

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now

#ifdef DEBUG
// Sometimes we query the heap ssa map, and need a dummy location for the ignored result.
unsigned heapSsaNum;
// Sometimes we query the memory ssa map, and need a dummy location for the ignored result.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes we query the memory ssa map in an assert

}

// We just mutate the heap if isVolatile is true, and then do the read as normal.
// We just mutate GcHeap if isVolatile is true, and then do the read as normal.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC or Byref

}
else if (lvaVarAddrExposed(lcl->gtLclNum))
{
// We could use MapStore here and MapSelect on reads of untracked locals

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this say of address exposed locals rather than untracked locals

// This side-effects ByrefExposed. Just use a new opaque VN.
// As with GT_LCL_VAR, we could probably use MapStore here and MapSelect at corresponding
// loads, but to do so would have to identify the subset of locals which aren't tracked
// but whose fields can be disambiguated.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this say aren't address exposed rather than aren't tracked

else if (lvaVarAddrExposed(lclNum))
{
// Need to record the effect on ByrefExposed.
// We could use MapStore here and MapSelect on reads of untracked locals

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

untracked => address exposed

}

// We model statics as indices into the heap variable.
// We model statics as indices into the GC heap variable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC/Byref

#ifdef DEBUG
// This helps test some performance pathologies related to "evaluation" of VNF_MapSelect terms,
// especially relating to the heap. We count the number of applications of such terms we consider,
// especially relating to the GcHeap. We count the number of applications of such terms we consider,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GC/Byref heaps

GcHeap, // Includes actual GC heap, and also static fields
MemoryKindCount, // Number of MemoryKinds
};
#ifdef DEBUG

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for the explaination

@JosephTremoulet JosephTremoulet force-pushed the UntrackedMem branch 4 times, most recently from fd32e72 to 908dc2f Compare February 4, 2017 01:48
Consolodate the code for updating the current heap VN and HeapSsaMap
entries into new helper method `recordHeapStore`, to make it easier to
evolve that code.  Change `fgValueNumberArrIndexAssign` to return the new
heap VN to allow callers to pass it down into `recordHeapStore`
accordingly.

This is just a refactoring; no change to compiler behavior.
Re-cast the notion of "heap" (in liveness, SSA, and value-numbering) as
one of potentially many `MemoryKind`s, called `GcHeap`.  Update names,
comments, data structures, and signatures as appropriate to parameterize
relevant data/methods over `MemoryKind`.  This change is a no-diff
refactoring, and currently `GcHeap` is the only `MemoryKind`.  Generally,
codepaths which will generically need to process all `MemoryKinds`s
(initializing, dumping, dataflow propagation) now iterate over all
`MemoryKinds`, and codepaths which are sensitive to the semantics of the
specific `MemoryKind` (def/use identification in liveness and value
numbering) are changed to specifically operate on `MemoryKind::GcHeap`.

One notable exception is that `lvMemoryPerSsaData` and `CountForMemoryDef`
are *not* parameterized over `MemoryKind`; there's a single "space" of SSA
defnums for memory defs (though the same tree can incur different defs for
different memory kinds [in which case their defnums will differ]), to
facilitate subsequently sharing SSA nodes across memory kinds when
appropriate.
Add a new `MemoryKind` to represent "any memory that byrefs may
reference", called `ByrefExposed`.  All definition points of `GcHeap`
become also definition points of `ByrefExposed`, and additionally, so do
definitions of address-exposed locals.

Because it is a common case (currently happening in 90% of methods in
System.Private.CoreLib) that a method has no definitions of
address-exposed locals, have liveness detect this condition and set the
new `byrefStatesMatchGcHeapStates` flag accordingly; when the states
match, the same `MemoryPhi`s, defnums, and `PerSsaData` are then shared
between the two memory kinds, to avoid excess allocations/processing.

This change defines `ByrefExposed` memory and its def/use points, and
builds SSA for it, but no optimizations make use of it, so this is a
no-diff change.
Teach value numbering to keep track of the current value number for
`ByrefExposed` memory, updating it appropriately at relevant stores.
Introduce a new VN operator `ByrefExposedLoad`, used for loads of
address-exposed locals and loads by indirs that don't have more specific
value numbers, which is a function of the current `ByrefExposed` memory VN
and the pointer VN.  This allows loop hoisting and CSE to recognize
redundant loads via byrefs when there are no intervening stores.

Fixes #7903.
@JosephTremoulet JosephTremoulet merged commit 1e731c4 into dotnet:master Feb 8, 2017
@JosephTremoulet JosephTremoulet deleted the UntrackedMem branch February 8, 2017 16:17
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants