Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Fix join order in isArgumentForParameter #7392

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

@MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Dec 14, 2021

This join was previously fixed in #3813, but it seems like it's broken again. This is another attempt that will hopefully be more robust.

main (on ChakraCore):

Tuple counts for AliasAnalysis::isArgumentForParameter#fff/3@5f8d3acp after 1.7s:
  1040290  ~0%      {3} r1 = JOIN Operand::NonPhiOperand#fff_10#join_rhs WITH Instruction::CallInstruction::getStaticCallTarget_dispred#ff ON FIRST 1 OUTPUT Lhs.1 'operand', Lhs.0 'ci', Rhs.1
  942858   ~0%      {3} r2 = r1 AND NOT IRCppLanguage::isFunctionVirtual#f(Lhs.2)
  937251   ~0%      {3} r3 = r2 AND NOT Alias::AliasFunction#class#f(Lhs.2)
  
  937251   ~0%      {3} r4 = r3 AND NOT IRCppLanguage::isFunctionVirtual#f(Lhs.2)
  937251   ~0%      {3} r5 = r4 AND NOT Alias::AliasFunction#class#f(Lhs.2)
  391018   ~2%      {4} r6 = JOIN r5 WITH Operand::PositionalArgumentOperand::getIndex_dispred#ff ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.0 'operand', Lhs.1 'ci'
  387998   ~0%      {3} r7 = JOIN r6 WITH Function::Function::getParameter_dispred#fff ON FIRST 2 OUTPUT Rhs.2, Lhs.2 'operand', Lhs.3 'ci'
  316602   ~13%     {3} r8 = JOIN r7 WITH Instruction::InitializeParameterInstruction::getParameter_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'init', Lhs.1 'operand', Lhs.2 'ci'
  316602   ~4%      {3} r9 = JOIN r8 WITH project#Instruction::InitializeParameterInstruction#class#ff ON FIRST 1 OUTPUT Lhs.2 'ci', Lhs.1 'operand', Lhs.0 'init'
  
  219363   ~0%      {3} r10 = JOIN r3 WITH Operand::ThisArgumentOperand#class#ffff ON FIRST 1 OUTPUT Lhs.0 'operand', Lhs.1 'ci', Lhs.2
  219363   ~0%      {3} r11 = r10 AND NOT IRCppLanguage::isFunctionVirtual#f(Lhs.2)
  219363   ~0%      {3} r12 = r11 AND NOT Alias::AliasFunction#class#f(Lhs.2)
  219363   ~0%      {3} r13 = SCAN r12 OUTPUT In.2, In.0 'operand', In.1 'ci'
  10784166 ~0%      {3} r14 = JOIN r13 WITH AliasAnalysis::isArgumentForParameter#fff#join_rhs ON FIRST 1 OUTPUT Rhs.1 'init', Lhs.1 'operand', Lhs.2 'ci'
  366572   ~1%      {3} r15 = JOIN r14 WITH project#Instruction::InitializeParameterInstruction#class#ff ON FIRST 1 OUTPUT Lhs.0 'init', Lhs.1 'operand', Lhs.2 'ci'
  367997   ~0%      {4} r16 = JOIN r15 WITH IRConstruction::Raw::getInstructionVariable#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'operand', Lhs.2 'ci', Lhs.0 'init'
  190284   ~3%      {3} r17 = JOIN r16 WITH IRVariable::IRThisVariable#class#fffff ON FIRST 1 OUTPUT Lhs.2 'ci', Lhs.1 'operand', Lhs.3 'init'
  
  506886   ~4%      {3} r18 = r9 UNION r17
                    return r18

This PR:

[2021-12-14 13:29:29] (167s) Tuple counts for AliasAnalysis::initializeParameterInstructionHasVariable#ff/2@b2762c2g after 13ms:
  193946 ~0%     {2} r1 = JOIN project#Instruction::InitializeParameterInstruction#class#ff WITH IRConstruction::Raw::getInstructionVariable#ff ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.0 'init'
                  return r1
...
[2021-12-14 13:29:29] (167s) Tuple counts for AliasAnalysis::instructionInitializesThisInFunction#ff/2@aa4c75mm after 11ms:
  60518 ~0%     {1} r1 = SCAN IRVariable::IRThisVariable#class#fffff OUTPUT In.0
  60518 ~1%     {1} r2 = JOIN r1 WITH AliasAnalysis::initializeParameterInstructionHasVariable#ff ON FIRST 1 OUTPUT Rhs.1 'init'
  60518 ~1%     {1} r3 = JOIN r2 WITH project#Instruction::InitializeParameterInstruction#class#ff ON FIRST 1 OUTPUT Lhs.0 'init'
  60518 ~0%     {2} r4 = JOIN r3 WITH Instruction::Instruction::getEnclosingIRFunction_dispred#2#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'init'
  60518 ~4%     {2} r5 = JOIN r4 WITH IRFunctionBase::IRFunctionBase#ff ON FIRST 1 OUTPUT Rhs.1 'f', Lhs.1 'init'
                return r5
...
Tuple counts for AliasAnalysis::isArgumentForParameter#fff/3@b135b8cc after 1.1s:
  1040290 ~0%     {3} r1 = JOIN Operand::NonPhiOperand#fff_10#join_rhs WITH Instruction::CallInstruction::getStaticCallTarget_dispred#ff ON FIRST 1 OUTPUT Lhs.1 'operand', Lhs.0 'ci', Rhs.1
  942858  ~0%     {3} r2 = r1 AND NOT IRCppLanguage::isFunctionVirtual#f(Lhs.2)
  937251  ~0%     {3} r3 = r2 AND NOT Alias::AliasFunction#class#f(Lhs.2)
  
  219363  ~0%     {3} r4 = JOIN r3 WITH Operand::ThisArgumentOperand#class#ffff ON FIRST 1 OUTPUT Lhs.0 'operand', Lhs.1 'ci', Lhs.2
  219363  ~0%     {3} r5 = r4 AND NOT IRCppLanguage::isFunctionVirtual#f(Lhs.2)
  219363  ~0%     {3} r6 = r5 AND NOT Alias::AliasFunction#class#f(Lhs.2)
  219363  ~9%     {3} r7 = SCAN r6 OUTPUT In.2, In.0 'operand', In.1 'ci'
  190284  ~0%     {3} r8 = JOIN r7 WITH AliasAnalysis::instructionInitializesThisInFunction#ff ON FIRST 1 OUTPUT Lhs.2 'ci', Lhs.1 'operand', Rhs.1 'init'
  
  937251  ~0%     {3} r9 = r3 AND NOT IRCppLanguage::isFunctionVirtual#f(Lhs.2)
  937251  ~0%     {3} r10 = r9 AND NOT Alias::AliasFunction#class#f(Lhs.2)
  391018  ~2%     {4} r11 = JOIN r10 WITH Operand::PositionalArgumentOperand::getIndex_dispred#ff ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.0 'operand', Lhs.1 'ci'
  387998  ~0%     {3} r12 = JOIN r11 WITH Function::Function::getParameter_dispred#fff ON FIRST 2 OUTPUT Rhs.2, Lhs.2 'operand', Lhs.3 'ci'
  316602  ~4%     {3} r13 = JOIN r12 WITH Instruction::InitializeParameterInstruction::getParameter_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.2 'ci', Lhs.1 'operand', Rhs.1 'init'
  
  506886  ~0%     {3} r14 = r8 UNION r13
                  return r14
@MathiasVP MathiasVP added the C++ label Dec 14, 2021
@MathiasVP MathiasVP requested review from as code owners Dec 14, 2021
@github-actions github-actions bot added the C# label Dec 14, 2021
@andersfugmann
Copy link
Contributor

@andersfugmann andersfugmann commented Dec 14, 2021

The first commit seems to have broken C#

243 tests passed; 5 tests failed:
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/PrintAst.qlref
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/definitions.qlref
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/DB-CHECK
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/CONSISTENCY/CfgConsistency.ql
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/CONSISTENCY/SsaConsistency.ql

@MathiasVP
Copy link
Contributor Author

@MathiasVP MathiasVP commented Dec 14, 2021

The first commit seems to have broken C#

243 tests passed; 5 tests failed:
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/PrintAst.qlref
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/definitions.qlref
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/DB-CHECK
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/CONSISTENCY/CfgConsistency.ql
  FAILED: /home/runner/work/semmle-code/semmle-code/ql/csharp/ql/test/library-tests/definitions/CONSISTENCY/SsaConsistency.ql

That's .... odd (considering the fact that the IR part of C# is in the experimental directory). I'll take a look. Thanks for the heads up!

@MathiasVP
Copy link
Contributor Author

@MathiasVP MathiasVP commented Dec 14, 2021

Looks like it was a transient failure, @andersfugmann 🎉.

@MathiasVP
Copy link
Contributor Author

@MathiasVP MathiasVP commented Dec 15, 2021

DCA hit an out-of-memory error when checking the PR for bad join-orders, so unfortunately the experiment didn't automatically create the execution-timg log. However, I was able to generate it locally, and it looks like:

Log time comparisons for execute-queries.log (data/MathiasVP/pr-7392-dd6085f__Differences__security-extend) 
╔══════════════════════════════╤═══════╤════════╤═══════╤════════╤════════╤══════════╗
║                                   v1   stddev      v2   stddev     diff   relative ║
╟──────────────────────────────┼───────┼────────┼───────┼────────┼────────┼──────────╢
║ Overall (excl. partials!)      15425            15485             60.67      1.004 ║
╟──────────────────────────────┼───────┼────────┼───────┼────────┼────────┼──────────╢
║ microsoft__calculator (1, 1)      56        0      44        0      -12      0.786 ║
║ dsp-testing__putty (3, 3)      265.7    31.21   254.3    43.92   -11.33      0.957 ║
║ an-tao__drogon (1, 1)            199        0     191        0       -8       0.96 ║
║ abseil__abseil-cpp (1, 1)        115        0     111        0       -4      0.965 ║
║ nlohmann__json (1, 1)            260        0     252        0       -8      0.969 ║
╟──────────────────────────────┼───────┼────────┼───────┼────────┼────────┼──────────╢
║ boostorg__boost (2, 2)         544.5    0.707     531     39.6    -13.5      0.975 ║
║ facebook__yoga (1, 1)             55        0      54        0       -1      0.982 ║
║ git__git (1, 1)                  495        0     490        0       -5       0.99 ║
║ microsoft__ChakraCore (1, 1)     574        0     570        0       -4      0.993 ║
║ vim__vim (1, 1)                  397        0     395        0       -2      0.995 ║
╟──────────────────────────────┼───────┼────────┼───────┼────────┼────────┼──────────╢
║ wireshark__wireshark (1, 1)     5736        0    5718        0      -18      0.997 ║
║ php__php-src (1, 1)              746        0     747        0        1      1.001 ║
║ torvalds__linux (1, 1)           858        0     866        0        8      1.009 ║
║ kamailio__kamailio (1, 1)       3448        0    3526        0       78      1.023 ║
║ openjdk__jdk (2, 2)             1676     60.1    1736    36.77     60.5      1.036 ║
╚══════════════════════════════╧═══════╧════════╧═══════╧════════╧════════╧══════════╝

I've checked locally that the increase on openjdk/jdk is just noise: the tuple counts have improved on that project similar to how it's improved on all the other projects.

@@ -266,6 +266,20 @@ private predicate operandReturned(Operand operand, IntValue bitOffset) {
bitOffset = Ints::unknown()
}

pragma[nomagic]
Copy link
Contributor

@redsun82 redsun82 Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you explain to a newbie what does this incantation do? 🙂

Copy link
Contributor Author

@MathiasVP MathiasVP Dec 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can! "Magic" is a datalog-specific optimization that is implemented by the QL compiler. This optimization gathers "context" from places where the predicate is called, and pushes that context into the predicate body in order to make the number of tuples in the predicate smaller (and thus the evaluation faster). pragma[nomagic] disables that optimization for this predicate. It's described very well in this paper: https://dl.acm.org/doi/10.1145/1376616.1376673

Normally, when we factor out some joins into their own predicate, we do this because we want those joins to happen in isolation. If we allow magic in that predicate, chances are that the compiler will just insert those additional joins (that we wanted to prevent) back into this new predicate.

So by marking the predicate as nomagic we're saying "I pulled out these joins, and I don't want the compiler to insert additional joins". Does that make sense?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants