New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby: expand DataFlow API #11024
base: main
Are you sure you want to change the base?
Ruby: expand DataFlow API #11024
Conversation
This is the inverse of getALocalSource()
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
99c8eba
to
be1d931
Compare
04a5c91
to
c2e33a2
Compare
These had the same name and ended up being unified
a3b5401
to
be87c74
Compare
be87c74
to
ab4e341
Compare
| @@ -170,6 +170,24 @@ module ConstantValue { | |||
|
|
|||
| /** A constant `nil` value. */ | |||
| class ConstantNilValue extends ConstantValue, TNil { } | |||
|
|
|||
| /** Gets the integer constant `x`. */ | |||
| ConstantValue getInt(int x) { result.getInt() = x } | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels a bit strange that the "inverse" of getInt is named getInt . Wouldn't ConstantValue::fromInt be a better name?
| @@ -23,6 +24,20 @@ class Module extends TModule { | |||
| /** Gets an `include`d module. */ | |||
| Module getAnIncludedModule() { result = getAnIncludedModule(this) } | |||
|
|
|||
| /** Gets the super class or an included or prepended module. */ | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is very similar to https://github.com/github/codeql/pull/10747/files#diff-9333de60c5c1b8283b484ae3b7e3cbef24b7bf52aee695df420d0e3f4d55403b
The main difference is that you use * instead of + which would make all modules ancestors of themselves, which is odd (even though it is handy in some cases).
@hvitved suggested a pragma[inline] annotation on getAnAncestor , perhaps you should do that as well.
| * For example, the canonical enclosing module of `A::B` is `A`, and `A` itself has no canonical enclosing module. | ||
| */ | ||
| pragma[nomagic] | ||
| Module getCanonicalEnclosingModule() { result.getQualifiedName() = this.getEnclosingModuleName() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand what a "canonical" enclosing module means. Could you give an example of a non-canonical enclosing module?
| Node getAnElementWriteValue() { | ||
| exists(CallNode call | | ||
| call = this.getAMethodCall("[]=") and | ||
| call.getNumberOfArguments() = 2 and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can have assignments like
obj[a, b, c] = x
I'm not sure we want to capture such things in this predicate though. Just wanted to make sure you ignored such calls on purpose.
Adds some classes to do basic stuff with
DataFlownodes without having to map down to CFG and AST nodes, and a few other tidbits I found myself missing.To ensure some degree of battle-hardening I (mostly) ported three library models to use the new nodes where it makes sense. The affected models are
Rails.qll,Railties.qll, andActionController.qll.Migrating from AST nodes to DataFlow nodes revealed a couple of pre-existing issues, some of which have now been fixed or fixed enough to avoid a regression:
Some other issues which were too hard to fix prior to opening this PR:
self, but of course that's not always the case. There is currently no way to tell the analysis what the value ofselfis, so we have to work around it for now. This means using things likegetParent+()and treating the moduleselfas if it's an instance of the module.getConst() and what about API graphs?
One of the changes that may look odd is the introduction of
DataFlow::getConst()andConstRef. It provides convenient access to constants that may resolve to a given qualified name. For example, the incantationgives us all descendents of
ActionController::Base.This was probably the biggest use case for API graphs in Ruby, as you can get to (almost) the same thing with
API::getTopLevelMember("ActionController").getMember("Base"). So why addgetConst()?getConst()doesn't rely on the call graph, which means it can be used for things likegetACallSimple()and other pre-call graph customizations (we have quite a few of those in JS and I expect we'll find the need in Ruby as well).getConst()handles the rules of constant lookup more precisely than API graphs, in particular when related toinclude, nested modules, and modules with more than one declaration. I think API graphs could be extended to make this work, but it's not really a natural fit for its use/def tracking, it would need to be a bunch of new predicates specific to constant lookup, which might as well exist earlier in the pipeline.I replaced some uses of API graphs with
getConst, not because it matters in practice, I just had to ensuregetConst()is being used enough to test it in the wild.I hope the name
getConst()makes more sense in the context of Ruby. I think the name in API graphs,getMember, was used mostly because that's what JS called it.Evaluation
Evaluation shows
flowsToThe new sources and sinks are mainly due to resolving methods upwards in the hierarchy in
ActionController. The combinationgetADescendentModule().getAnInstanceMethod()takes "sideways" method resolution into account without having to really think about it, so I like that.