Description
In allocateRegisters() and allocateRegistersMinimal(), we iterate through RefPositions and update regsToFree, delayRegsToFree, regsToMakeInactive, delayRegsToMakeInactive and copyRegsToFree. These are regMaskTP local variables, and as such they incur a significant regression once regMaskTP has to represent more than 64 registers.
This primarily applies to the following variables:
regMaskTP regsToFree = RBM_NONE;
regMaskTP delayRegsToFree = RBM_NONE;
regMaskTP regsToMakeInactive = RBM_NONE;
regMaskTP delayRegsToMakeInactive = RBM_NONE;
regMaskTP copyRegsToFree = RBM_NONE;
To optimize the operations on these variables, they can be grouped into a struct of SingleTypeRegSet fields:
struct RegSetMasks
{
    SingleTypeRegSet regsToFree              = RBM_NONE;
    SingleTypeRegSet delayRegsToFree         = RBM_NONE;
    SingleTypeRegSet regsToMakeInactive      = RBM_NONE;
    SingleTypeRegSet delayRegsToMakeInactive = RBM_NONE;
    SingleTypeRegSet copyRegsToFree          = RBM_NONE;
};
and declared in one of the following ways:
Option A
RegSetMasks lowRegSet;
RegSetMasks highRegSet;
RegSetMasks *currRegSet = &lowRegSet;
Option B
RegSetMasks intRegSet;
RegSetMasks fltRegSet;
RegSetMasks mskRegSet;
Either option reduces the overhead of operating on regMaskTP, because each update then touches only a SingleTypeRegSet, which is a uint64.
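As a rough illustration of Option B, the sketch below keeps one RegSetMasks per register file and routes each update to a single 64-bit field. It is not the JIT's actual code: `RegKind`, `PendingRegSets`, `forKind`, `markToFree` and `promoteDelayed` are hypothetical names, the `SingleTypeRegSet` and `RBM_NONE` definitions are simplified stand-ins, and the bit index is assumed to be relative to the register file.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-ins for the JIT types (assumption: the real definitions differ).
using SingleTypeRegSet = std::uint64_t;
constexpr SingleTypeRegSet RBM_NONE = 0;

// Hypothetical tag for the register file a register belongs to.
enum class RegKind { Int, Float, Mask };

struct RegSetMasks
{
    SingleTypeRegSet regsToFree              = RBM_NONE;
    SingleTypeRegSet delayRegsToFree         = RBM_NONE;
    SingleTypeRegSet regsToMakeInactive      = RBM_NONE;
    SingleTypeRegSet delayRegsToMakeInactive = RBM_NONE;
    SingleTypeRegSet copyRegsToFree          = RBM_NONE;
};

// Option B: one group of 64-bit sets per register file.
struct PendingRegSets
{
    RegSetMasks intRegSet;
    RegSetMasks fltRegSet;
    RegSetMasks mskRegSet;

    // Select the group for a register file; every later update is a plain uint64 op.
    RegSetMasks& forKind(RegKind kind)
    {
        switch (kind)
        {
            case RegKind::Int:   return intRegSet;
            case RegKind::Float: return fltRegSet;
            default:             return mskRegSet;
        }
    }
};

// Record a register to free. The index is relative to its register file (assumption).
inline void markToFree(PendingRegSets& sets, RegKind kind, unsigned regIndexInFile)
{
    assert(regIndexInFile < 64);
    sets.forKind(kind).regsToFree |= (SingleTypeRegSet{1} << regIndexInFile);
}

// Illustrative hand-over step: delayed frees become the next pending frees.
inline void promoteDelayed(RegSetMasks& masks)
{
    masks.regsToFree      = masks.delayRegsToFree;
    masks.delayRegsToFree = RBM_NONE;
}
```

With this layout, the per-RefPosition updates never touch a multi-word mask. The trade-off is that the code must select the right group first, which Option A would do through the `currRegSet` pointer instead of a switch.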
Methods from the list of regressed methods that this is likely to improve:
- allocateRegisters()
- allocateRegistersMinimal()
- freeRegisters()
The effect of switching to more than 64 registers on these methods, measured by profiling without this optimization, is shown below.
| Method | InsCountDiff | InsPercentageDiff | ContributionPercentage |
|---|---|---|---|
| allocateRegistersMinimal@LinearScan | 3524107314 | 33.71% | 10.36% |
| allocateRegisters@LinearScan | 2453448194 | 23.65% | 7.21% |
| freeRegisters@LinearScan | 1676484336 | 62.85% | 4.93% |