[vf] Provide expression type information to Visualforce rules to avoid false positives#2864
Conversation
Addresses the general issue raised in pmd#1092 This commit removes false positives from expressions in apex tags. The specific use case raised in 1092 isn't reproducible and represents a false negative that will be fixed separately. The existing Visualforce rules don't have any information about the data types referenced in the Visualforce page. This results in false positives when attempting to identify expressions that are vulnerable to XSS attacks. The rules should not warn about XSS attacks when the expression refers to a type such as Integer or Boolean. The VfExpressionTypeVisitor visits the Visualforce page and extracts the datatypes from Salesforce metadata. Data type information can come from either Apex classes or Object Fields. The Salesforce metadata is generally located in a sibling directory of the Visualforce directory. By default the code looks in directories relative to the Visualforce file to find the metadata. The conventional locations for the metadata are "../classes" and "../objects", the user can override this default with other directories if required.
jbartolotta-sfdc
left a comment
There was a problem hiding this comment.
@adangel @rsoesemann - can you let me know your thoughts on this change?
It has some aspects of Multi file and Cross language tactically applied to solve a specific Visualforce issue.
Thanks.
| final List<ASTIdentifier> ids = expr.findChildrenOfType(ASTIdentifier.class); | ||
|
|
||
| for (final ASTIdentifier id : ids) { | ||
| ExpressionType expressionType = getExpressionType(id); |
There was a problem hiding this comment.
This is the code that fixes the false positives for this rule.
|
|
||
| public class VfUnescapeElTest extends PmdRuleTst { | ||
| // no additional unit tests | ||
| public static final String EXPECTED_RULE_MESSAGE = "Avoid unescaped user controlled content in EL"; |
There was a problem hiding this comment.
Added methods that run rules against files found in the test/resources/ directory. I didn't add the runRule method to the base class because the use case is very specific to this change and I didn't want to confuse others that might be writing rule tests.
Renamed ExpressionType to IdentifierType since this is more accurate. Removed usage of google.collect classes that were causing UnsupportedClassVersionError exception in the Travis CI run.
adangel
left a comment
There was a problem hiding this comment.
Thanks for tackling this!
This change is similar to some of the ideas presented in #237 but is tactically focused on Visualforce. This code could be the basis of a more general approach or could be modified to conform to a more general approach when one becomes available.
Yes, I agree. We should start small+focused and improve/generalize it over time.
pmd-visualforce depends on pmd-apex. pmd-apex relies on Java 8. This change configures pmd-visualforce to also require Java 8. This is a breaking change that will need to be documented.
Generated by 🚫 Danger |
Allow subclasses of AbstractRuleSetFactoryTest to filter out languages that show up in the classpath but should not be tested. Change VFTestContstants to final instead of abstract.
06a6b85 to
ba2e91b
Compare
Store the IdentifierType on ASTIdentifier node instead of in a separate map. Use the existing TypeResolution pattern to configure the visitor instead deriving from an abstract rule. Changed ParserOptions to extend AbstractPropertySource with the ability to override the defaults via environment variables.
| private <T> String errorForPropCapture(PropertyDescriptor<T> descriptor) { | ||
| return descriptor.errorFor(getProperty(descriptor)); | ||
| } | ||
|
|
There was a problem hiding this comment.
Added equals and hashCode based on the ParserOptions JavaDoc comment.
|
The CI failed with what looks like a transient error. Hopefully it will pass when I push some more code.
|
| @@ -0,0 +1,154 @@ | |||
| /** | |||
There was a problem hiding this comment.
I just found ParserOptionsTest in the pmd-test project I will move these tests over. This class is not currently running as a unit test because it is in the src/main hierarchy. All of the other classes in pmd-test/src/main are abstract classes meant to be extended. ParserOptionsTest#verifyOptionsEqualsHashcode is referenced from XmlParserOptionsTest
I will refactor #verifyOptionsEqualsHashcode into a utils class and move the test methods over to pmd-test so that they are run.
LanguageVersionHandler#getTypeResolutionFacade is deprecated. Moved the VfExpressionTypeVisitor creation and execution to VfParser#parse instead. ParsingOptionsTest located in pmd-test wasn't running previously because it was in the src/main hierarchy. Moved this test into the src/test hierarchy and consolidated the methods from the similarly named class from pmd-core.
The latest CI run fails with a similar error but different jar, is this something that I have caused? It's not in a module that I have modified, but maybe it's related somehow. |
It's nothing you have caused. It's a combination of multiple problems when running long maven build within github's infrastructure (which is Azure). |
|
I just want to thank @adangel and @oowekyala for assisting so much here. I couldn't not do such a detailed code review myself, simply because this is not a trivial rule conversion I did some years ago. And I am not a Java pro. Thanks also to @jbartolotta-sfdc for pushing VF rules out of their dark and buggy corner. Far out! |
Changed method for how the Visualforce strings are reconstructed from the AST. The previous implementation had incorrect assumptions about the structure of the AST. Added tests to more thoroughly test these situations. Changed name of IdentifierType to DataType. This information can be stored on either ASTIdentifier or ASTLiteral nodes. Changes based on PR feedgack: - Restored ParserOptionsTest in order to avoid binary compatibilty issues. - Changed ParserOptions to contain a PropertySource instead of extending AbtractPropertySource.
|
@adangel @oowekyala The latest commit contains changes based on the PR feedback. It also introduces some limitations to the original PR on how the DataType information is calculated. The limitation reduces the scope of the DataType resolution to a common use case that will remove some of the most common false positives, but the change does not remove all false positives. This limitation will avoid an over correction that introduces false negatives. We will iterate to remove more false positives over time. Given the size of this change, I think it would be best to target the PR to |
| * | ||
| * See https://developer.salesforce.com/docs/atlas.en-us.api_meta.meta/api_meta/meta_field_types.htm#meta_type_fieldtype | ||
| */ | ||
| public enum DataType { |
There was a problem hiding this comment.
I renamed this from IdentifierType. This value can be stored on both ASTIdentifier and ASTLiteral nodes.
| @@ -0,0 +1,41 @@ | |||
| /* | |||
There was a problem hiding this comment.
New common class for ASTIdentifier and ASTLiteral.
| * Responsible for storing a mapping of Fields that can be referenced from Visualforce to the type of the field. The | ||
| * fields are identified by in a case insensitive manner. | ||
| */ | ||
| abstract class SalesforceFieldTypes { |
There was a problem hiding this comment.
Common code for ApexClassPropertyTypes and ObjectFieldTypes.
| * @throws DataNodeStateException if the results of this method could have been incorrect. Callers should typically | ||
| * not rethrow this exception, as it will happen often and doesn't represent a terminal exception. | ||
| */ | ||
| public Map<AbstractVFDataNode, String> getDataNodes() throws DataNodeStateException { |
There was a problem hiding this comment.
This method can be iterated on over time to handle the cases listed in the Javadoc.
@oowekyala I wanted to follow up to see if this can be released in |
Sure. If that's so I'll try to make the cli extension for language properties by then |
Thanks @oowekyala |
@oowekyala we have finished our QA on this PR. We are ready for it to be merged and released when you are ready. Thanks. FYI @rmohan20 |
This DataType does not need to be escaped, it is always escaped by the server.
|
@jbartolotta-sfdc I added a paragraph to the release notes about this PR. Could you review this and add any relevant information, eg, maybe when configuration is required (and when the defaults work), or whether code needs to be compiled or something. You can also credit other members of your team if you want (also, I found your first name on LinkedIn, if it's incorrect sorry, please correct it) Also, it looks like this fixes #1092, right? |
@oowekyala the docs look great, there isn't any need to compile the code, the defaults follow the existing conventions and should cover 99% of the cases. I think it's best to leave the directions simple for now, but I can revisit if anyone has issues. 1092 is in an odd state. The test case as provided doesn't produce any violations, even when tested using the version mentioned in the report(6.3). The fact that it doesn't produce any violations is a False Negative. @rmohan20 has a change ready that fixes these False Negatives, but she was waiting for this PR to avoid too many False Positives. Thank you @oowekyala and @adangel for all of the help. |
|
Thanks for your efforts on this! I'll merge today |
Describe the PR
Fixes false positives for VfUnescapeEl
Addresses the general issue raised in Issue 1092 [vf] Id fields considered unescaped and is related to Issue 237 [core] RFC: Analyzing embedded snippets from other languages
This PR removes false positives from expressions in
apextags found in Visualforce pages forVfUnescapeElRule. The specific use case raised in 1092 isn't currently reproducible and represents a false negative that will be fixed separately by @rmohan20.The existing Visualforce rules don't have any information about the data types referenced in the Visualforce page. This results in false positives when attempting to identify expressions that are vulnerable to XSS attacks. The rules should not warn about XSS attacks when the expression refers to a type such as Integer or Boolean.
The VfExpressionTypeVisitor visits the Visualforce page and extracts the datatypes from Salesforce metadata. Data type information can come from either Apex classes or Object Fields. The Salesforce metadata is generally located in a sibling directory of the Visualforce directory. By default the code looks in directories relative to the Visualforce file to find the metadata. The conventional locations for the metadata are "../classes" and "../objects", the user can override this default with other directories if required.
This change is similar to some of the ideas presented in #237 but is tactically focused on Visualforce. This code could be the basis of a more general approach or could be modified to conform to a more general approach when one becomes available.
The significant differences from #237 are
Related issues
#1092
#237
Ready?
./mvnw clean verifypasses (checked automatically by travis)