New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Swift: add DataFlow::Content for arrays #13741
base: main
Are you sure you want to change the base?
Conversation
|
Some more checking to do about |
swift/ql/lib/codeql/swift/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
swift/ql/lib/codeql/swift/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
swift/ql/lib/codeql/swift/dataflow/internal/DataFlowPrivate.qll
Outdated
Show resolved
Hide resolved
|
Also this needs a DCA run. But probably not before the above fixes have been implemented. |
|
Oh, one more thing: We need to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not looking at this sooner.
This looks great. Thank you for including models-as-data support (ArrayElement), there are a number of models that should benefit from using this in the near future.
| var matrix = [[source()]] | ||
| sink(arg: matrix[0]) | ||
| sink(arg: matrix[0][0]) // $ flow=645 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does assignment to a multi-dimensional array work as well?
var matrix2 = [[1]]
matrix2[0][0] = source()
sink(arg: matrix2[0][0])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it doesn't, and now that I look at the cases I based this off of I'm not sure we handle nested fields or tuples properly either...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm. Then we should probably create an issue for this (in all content types) rather than trying to address it for just arrays here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After adding some tests it seems to work for nested fields... I'm not sure why, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because of "reverse reads": if readStep(n2.(PostUpdateNode).getPreUpdateNode(), f, n1.(PostUpdateNode).getPreUpdateNode()) then the dataflow library generates storeStep(n1, f, n2). So if this doesn't work for array content then it's either because the post-update/pre-update isn't working, or because the readStep isn't working.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's actually that there's a step through an InOutExpr between the store and read step:
# 650| getElement(11): [AssignExpr] ... = ...
# 650| getDest(): [SubscriptExpr] ...[...]
# 650| getBase(): [InOutExpr] &...
# 650| getSubExpr(): [SubscriptExpr] ...[...]
# 650| getBase(): [InOutExpr] &...
# 650| getSubExpr(): [DeclRefExpr] matrix2
| var arr3 = [1] | ||
| var arr4 = arr2 + arr3 | ||
| sink(arg: arr3[0]) | ||
| sink(arg: arr4[0]) // $ MISSING: flow=642 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a model of Array.+(_:_:) in ArraySummaries fix this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should.
| sink(arg: arr4[0]) // $ MISSING: flow=642 | ||
| var arr5 = Array(repeating: source(), count: 2) | ||
| sink(arg: arr5[0]) // $ MISSING: flow=654 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again it's possible a model of the initializer is all we need here. It can wait as follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed.
|
DCA shows a 6.6x analysis slowdown, which hopefully @MathiasVP s suggestion will fix. We can tolerate some slowdown for an improvement like this. |
|
New DCA run LGTM. Not a hint of performance issues. Good work! |
| component.getName() = "ArrayElement" and | ||
| content instanceof Content::ArrayContent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be pulled into its own predicate to follow the structure of parseField?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a considerably simpler case than parseField, I'd be happy either way TBH.
| @@ -111,7 +111,8 @@ private module Cached { | |||
| hasExprNode(n, | |||
| [ | |||
| any(Argument arg | modifiable(arg)).getExpr(), any(MemberRefExpr ref).getBase(), | |||
| any(ApplyExpr apply).getQualifier(), any(TupleElementExpr te).getSubExpr() | |||
| any(ApplyExpr apply).getQualifier(), any(TupleElementExpr te).getSubExpr(), | |||
| any(SubscriptExpr se).getBase(), any(InOutExpr ioe).getSubExpr() | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a better fix maybe be to hide InOutExpr from the AST entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe - we might break some other things by doing that... I can try it though
| @@ -705,7 +709,7 @@ predicate storeStep(Node node1, ContentSet c, Node node2) { | |||
| node1.asExpr() = assign.getSource() and | |||
| node2.(PostUpdateNode).getPreUpdateNode().asExpr() = subscript.getBase() and | |||
| subscript = assign.getDest() and | |||
| subscript.getBase().getType().(InOutType).getObjectType() instanceof ArrayType and | |||
| // subscript.getBase().getType().(InOutType).getObjectType() instanceof ArrayType and | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to comment this out? Based on 42cc644#r1267210752, I gathered that conjunct was important
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh oops
No description provided.