Skip to content

[vf] Provide expression type information to Visualforce rules to avoid false positives#2864

Merged
oowekyala merged 22 commits into
pmd:masterfrom
jbartolotta-sfdc:issue1092-vf-escape-false-positives
Dec 11, 2020
Merged

[vf] Provide expression type information to Visualforce rules to avoid false positives#2864
oowekyala merged 22 commits into
pmd:masterfrom
jbartolotta-sfdc:issue1092-vf-escape-false-positives

Conversation

@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor

Describe the PR

Fixes false positives for VfUnescapeEl

Addresses the general issue raised in Issue 1092 [vf] Id fields considered unescaped and is related to Issue 237 [core] RFC: Analyzing embedded snippets from other languages

This PR removes false positives from expressions in apex tags found in Visualforce pages for VfUnescapeElRule. The specific use case raised in 1092 isn't currently reproducible and represents a false negative that will be fixed separately by @rmohan20.

The existing Visualforce rules don't have any information about the data types referenced in the Visualforce page. This results in false positives when attempting to identify expressions that are vulnerable to XSS attacks. The rules should not warn about XSS attacks when the expression refers to a type such as Integer or Boolean.

The VfExpressionTypeVisitor visits the Visualforce page and extracts the datatypes from Salesforce metadata. Data type information can come from either Apex classes or Object Fields. The Salesforce metadata is generally located in a sibling directory of the Visualforce directory. By default the code looks in directories relative to the Visualforce file to find the metadata. The conventional locations for the metadata are "../classes" and "../objects", the user can override this default with other directories if required.

This change is similar to some of the ideas presented in #237 but is tactically focused on Visualforce. This code could be the basis of a more general approach or could be modified to conform to a more general approach when one becomes available.

The significant differences from #237 are

  • The referenced entities aren't required to be part of the files under analysis
  • Some of the referenced entities(Custom Fields) aren't languages covered by PMD
  • The logic is a post-parser step instead of being embedded in the parser. This code could be moved into the parser, but the rule allows some configuration which is currently tied to Rules and not parsers.
  • The information is not stored in the AST. This was an intentional decision because the same rule could be configured multiple different ways via a rule ref. These different configurations would result in different data which would conflict since all rules share the same AST.

Related issues

#1092
#237

Ready?

  • Added unit tests for fixed bug/feature
  • Passing all unit tests
  • Complete build ./mvnw clean verify passes (checked automatically by travis)
  • Added (in-code) documentation (if needed)

Addresses the general issue raised in pmd#1092 This commit removes false positives from expressions in apex tags. The specific use case raised in 1092 isn't reproducible and represents a false negative that will be fixed separately.

The existing Visualforce rules don't have any information about the data types referenced in the Visualforce page. This results in false positives when attempting to identify expressions that are vulnerable to XSS attacks. The rules should not warn about XSS attacks when the expression refers to a type such as Integer or Boolean.

The VfExpressionTypeVisitor visits the Visualforce page and extracts the datatypes from Salesforce metadata. Data type information can come from either Apex classes or Object Fields. The Salesforce metadata is generally located in a sibling directory of the Visualforce directory. By default the code looks in directories relative to the Visualforce file to find the metadata. The conventional locations for the metadata are "../classes" and "../objects", the user can override this default with other directories if required.

@jbartolotta-sfdc jbartolotta-sfdc left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adangel @rsoesemann - can you let me know your thoughts on this change?

It has some aspects of Multi file and Cross language tactically applied to solve a specific Visualforce issue.

Thanks.

Comment thread pmd-visualforce/pom.xml
final List<ASTIdentifier> ids = expr.findChildrenOfType(ASTIdentifier.class);

for (final ASTIdentifier id : ids) {
ExpressionType expressionType = getExpressionType(id);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the code that fixes the false positives for this rule.


public class VfUnescapeElTest extends PmdRuleTst {
// no additional unit tests
public static final String EXPECTED_RULE_MESSAGE = "Avoid unescaped user controlled content in EL";

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added methods that run rules against files found in the test/resources/ directory. I didn't add the runRule method to the base class because the use case is very specific to this change and I didn't want to confuse others that might be writing rule tests.

@jbartolotta-sfdc jbartolotta-sfdc marked this pull request as ready for review October 22, 2020 19:09
Comment thread pmd-visualforce/src/main/java/net/sourceforge/pmd/lang/vf/ExpressionType.java Outdated
@adangel adangel added this to the 6.30.0 milestone Oct 23, 2020
Renamed ExpressionType to IdentifierType since this is more accurate.

Removed usage of google.collect classes that were causing UnsupportedClassVersionError exception in the Travis CI run.
@adangel adangel added the an:enhancement An improvement on existing features / rules label Oct 28, 2020

@adangel adangel left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this!

This change is similar to some of the ideas presented in #237 but is tactically focused on Visualforce. This code could be the basis of a more general approach or could be modified to conform to a more general approach when one becomes available.

Yes, I agree. We should start small+focused and improve/generalize it over time.

Comment thread pmd-visualforce/pom.xml
Comment thread pmd-visualforce/src/test/java/net/sourceforge/pmd/lang/vf/RuleSetFactoryTest.java Outdated
Comment thread pmd-visualforce/src/test/java/net/sourceforge/pmd/lang/vf/VFTestContstants.java Outdated
Comment thread pmd-visualforce/src/test/java/net/sourceforge/pmd/lang/vf/VFTestContstants.java Outdated
Comment thread pmd-visualforce/src/main/java/net/sourceforge/pmd/lang/vf/IdentifierType.java Outdated
pmd-visualforce depends on pmd-apex. pmd-apex relies on Java 8. This change configures pmd-visualforce to also require Java 8.

This is a breaking change that will need to be documented.
Comment thread pmd-dist/pom.xml
@ghost

ghost commented Oct 28, 2020

Copy link
Copy Markdown
1 Message
📖 This changeset changes violations,
introduces new violations, new errors and new configuration errors,
removes violations, errors and configuration errors.
Full report
This changeset changes violations,
introduces new violations, new errors and new configuration errors,
removes violations, errors and configuration errors.
Full report
This changeset changes violations,
introduces new violations, new errors and new configuration errors,
removes violations, errors and configuration errors.
Full report
This changeset introduces 0 new violations, 0 new errors and 0 new configuration errors,
removes 0 violations, 0 errors and 0 configuration errors.
Full report
This changeset introduces 0 new violations, 0 new errors and 0 new configuration errors,
removes 0 violations, 0 errors and 0 configuration errors.
Full report
No java rules are changed!

Generated by 🚫 Danger

Allow subclasses of AbstractRuleSetFactoryTest to filter out languages
that show up in the classpath but should not be tested.

Change VFTestContstants to final instead of abstract.
@jbartolotta-sfdc jbartolotta-sfdc force-pushed the issue1092-vf-escape-false-positives branch from 06a6b85 to ba2e91b Compare November 11, 2020 05:02
Store the IdentifierType on ASTIdentifier node instead of in a separate map.

Use the existing TypeResolution pattern to configure the visitor instead deriving from an abstract rule.

Changed ParserOptions to extend AbstractPropertySource with the ability to override the defaults via environment variables.
private <T> String errorForPropCapture(PropertyDescriptor<T> descriptor) {
return descriptor.errorFor(getProperty(descriptor));
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added equals and hashCode based on the ParserOptions JavaDoc comment.

Comment thread pmd-visualforce/src/main/java/net/sourceforge/pmd/lang/vf/VfHandler.java Outdated
@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor Author

The CI failed with what looks like a transient error. Hopefully it will pass when I push some more code.

Error: Failed to execute goal on project pmd-apex-jorje: Could not resolve dependencies for project net.sourceforge.pmd:pmd-apex-jorje:pom:6.30.0-SNAPSHOT: Failed to collect dependencies at cglib:cglib:jar:3.2.0: Failed to read artifact descriptor for cglib:cglib:jar:3.2.0: Could not transfer artifact cglib:cglib:pom:3.2.0 from/to central (https://repo.maven.apache.org/maven2): Connection timed out (Read failed) -> [Help 1]

@@ -0,0 +1,154 @@
/**

@jbartolotta-sfdc jbartolotta-sfdc Nov 12, 2020

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just found ParserOptionsTest in the pmd-test project I will move these tests over. This class is not currently running as a unit test because it is in the src/main hierarchy. All of the other classes in pmd-test/src/main are abstract classes meant to be extended. ParserOptionsTest#verifyOptionsEqualsHashcode is referenced from XmlParserOptionsTest

I will refactor #verifyOptionsEqualsHashcode into a utils class and move the test methods over to pmd-test so that they are run.

LanguageVersionHandler#getTypeResolutionFacade is deprecated. Moved the VfExpressionTypeVisitor creation and execution to VfParser#parse instead.

ParsingOptionsTest located in pmd-test wasn't running previously because it was in the src/main hierarchy. Moved this test into the src/test hierarchy and consolidated the methods from the
similarly named class from pmd-core.
@jbartolotta-sfdc

jbartolotta-sfdc commented Nov 14, 2020

Copy link
Copy Markdown
Contributor Author

The CI failed with what looks like a transient error. Hopefully it will pass when I push some more code.

Error: Failed to execute goal on project pmd-apex-jorje: Could not resolve dependencies for project net.sourceforge.pmd:pmd-apex-jorje:pom:6.30.0-SNAPSHOT: Failed to collect dependencies at cglib:cglib:jar:3.2.0: Failed to read artifact descriptor for cglib:cglib:jar:3.2.0: Could not transfer artifact cglib:cglib:pom:3.2.0 from/to central (https://repo.maven.apache.org/maven2): Connection timed out (Read failed) -> [Help 1]

The latest CI run fails with a similar error but different jar, is this something that I have caused? It's not in a module that I have modified, but maybe it's related somehow. Error: Failed to execute goal on project pmd-apex-jorje: Could not resolve dependencies for project net.sourceforge.pmd:pmd-apex-jorje:pom:6.30.0-SNAPSHOT: The following artifacts could not be resolved: com.google.code.gson:gson:jar:2.7, com.google.errorprone:error_prone_annotations:jar:2.1.3: Could not transfer artifact com.google.code.gson:gson:jar:2.7 from/to central (https://repo.maven.apache.org/maven2): Connection timed out (Read failed) -> [Help 1]

Comment thread pmd-visualforce/src/main/java/net/sourceforge/pmd/lang/vf/ast/ASTIdentifier.java Outdated
Comment thread pmd-visualforce/src/main/java/net/sourceforge/pmd/lang/vf/ObjectFieldTypes.java Outdated
Comment thread pmd-visualforce/src/main/java/net/sourceforge/pmd/lang/vf/ObjectFieldTypes.java Outdated
Comment thread pmd-core/src/main/java/net/sourceforge/pmd/lang/ParserOptions.java Outdated
@adangel

adangel commented Nov 14, 2020

Copy link
Copy Markdown
Member

The CI failed with what looks like a transient error. Hopefully it will pass when I push some more code.
Error: Failed to execute goal on project pmd-apex-jorje: Could not resolve dependencies for project net.sourceforge.pmd:pmd-apex-jorje:pom:6.30.0-SNAPSHOT: Failed to collect dependencies at cglib:cglib:jar:3.2.0: Failed to read artifact descriptor for cglib:cglib:jar:3.2.0: Could not transfer artifact cglib:cglib:pom:3.2.0 from/to central (https://repo.maven.apache.org/maven2): Connection timed out (Read failed) -> [Help 1]

The latest CI run fails with a similar error but different jar, is this something that I have caused? It's not in a module that I have modified, but maybe it's related somehow. Error: Failed to execute goal on project pmd-apex-jorje: Could not resolve dependencies for project net.sourceforge.pmd:pmd-apex-jorje:pom:6.30.0-SNAPSHOT: The following artifacts could not be resolved: com.google.code.gson:gson:jar:2.7, com.google.errorprone:error_prone_annotations:jar:2.1.3: Could not transfer artifact com.google.code.gson:gson:jar:2.7 from/to central (https://repo.maven.apache.org/maven2): Connection timed out (Read failed) -> [Help 1]

It's nothing you have caused. It's a combination of multiple problems when running long maven build within github's infrastructure (which is Azure).
That should be fixed now on master. The builds with Github Actions are working now. If you add another commit to this PR (or merge/rebase against master) github actions should automatically run again.

@rsoesemann

Copy link
Copy Markdown
Member

I just want to thank @adangel and @oowekyala for assisting so much here. I couldn't not do such a detailed code review myself, simply because this is not a trivial rule conversion I did some years ago. And I am not a Java pro.

Thanks also to @jbartolotta-sfdc for pushing VF rules out of their dark and buggy corner. Far out!

Changed method for how the Visualforce strings are reconstructed from the AST. The previous implementation had incorrect assumptions about the structure of the AST. Added tests to more thoroughly test these situations.

Changed name of IdentifierType to DataType. This information can be stored on either ASTIdentifier or ASTLiteral nodes.

Changes based on PR feedgack:
- Restored ParserOptionsTest in order to avoid binary compatibilty issues.
- Changed ParserOptions to contain a PropertySource instead of extending AbtractPropertySource.
@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor Author

@adangel @oowekyala The latest commit contains changes based on the PR feedback. It also introduces some limitations to the original PR on how the DataType information is calculated.

The limitation reduces the scope of the DataType resolution to a common use case that will remove some of the most common false positives, but the change does not remove all false positives. This limitation will avoid an over correction that introduces false negatives.

We will iterate to remove more false positives over time.

Given the size of this change, I think it would be best to target the PR to 6.31.0 instead of 6.30.0. This will give my team more time to expand our sample set of Visualforce and ensure the change doesn't introduce any false negatives.

*
* See https://developer.salesforce.com/docs/atlas.en-us.api_meta.meta/api_meta/meta_field_types.htm#meta_type_fieldtype
*/
public enum DataType {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this from IdentifierType. This value can be stored on both ASTIdentifier and ASTLiteral nodes.

@@ -0,0 +1,41 @@
/*

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New common class for ASTIdentifier and ASTLiteral.

* Responsible for storing a mapping of Fields that can be referenced from Visualforce to the type of the field. The
* fields are identified by in a case insensitive manner.
*/
abstract class SalesforceFieldTypes {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Common code for ApexClassPropertyTypes and ObjectFieldTypes.

* @throws DataNodeStateException if the results of this method could have been incorrect. Callers should typically
* not rethrow this exception, as it will happen often and doesn't represent a terminal exception.
*/
public Map<AbstractVFDataNode, String> getDataNodes() throws DataNodeStateException {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method can be iterated on over time to handle the cases listed in the Javadoc.

@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor Author

@adangel @oowekyala The latest commit contains changes based on the PR feedback. It also introduces some limitations to the original PR on how the DataType information is calculated.

The limitation reduces the scope of the DataType resolution to a common use case that will remove some of the most common false positives, but the change does not remove all false positives. This limitation will avoid an over correction that introduces false negatives.

We will iterate to remove more false positives over time.

Given the size of this change, I think it would be best to target the PR to 6.31.0 instead of 6.30.0. This will give my team more time to expand our sample set of Visualforce and ensure the change doesn't introduce any false negatives.

@oowekyala I wanted to follow up to see if this can be released in 6.31.0 instead of 6.30.0. Thanks.

@oowekyala

Copy link
Copy Markdown
Member

@oowekyala I wanted to follow up to see if this can be released in 6.31.0 instead of 6.30.0. Thanks.

Sure. If that's so I'll try to make the cli extension for language properties by then

@oowekyala oowekyala modified the milestones: 6.30.0, 6.31.0 Nov 24, 2020
@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor Author

@oowekyala I wanted to follow up to see if this can be released in 6.31.0 instead of 6.30.0. Thanks.

Sure. If that's so I'll try to make the cli extension for language properties by then

Thanks @oowekyala

@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor Author

@oowekyala I wanted to follow up to see if this can be released in 6.31.0 instead of 6.30.0. Thanks.

Sure. If that's so I'll try to make the cli extension for language properties by then

Thanks @oowekyala

@oowekyala we have finished our QA on this PR. We are ready for it to be merged and released when you are ready. Thanks. FYI @rmohan20

@oowekyala oowekyala self-assigned this Dec 10, 2020
@oowekyala oowekyala modified the milestones: 6.31.0, 6.30.0 Dec 10, 2020
jbartolotta-sfdc and others added 4 commits December 9, 2020 20:11
This DataType does not need to be escaped, it is always escaped by the
server.
@oowekyala

Copy link
Copy Markdown
Member

@jbartolotta-sfdc I added a paragraph to the release notes about this PR. Could you review this and add any relevant information, eg, maybe when configuration is required (and when the defaults work), or whether code needs to be compiled or something. You can also credit other members of your team if you want (also, I found your first name on LinkedIn, if it's incorrect sorry, please correct it)

Also, it looks like this fixes #1092, right?

@jbartolotta-sfdc

Copy link
Copy Markdown
Contributor Author

@jbartolotta-sfdc I added a paragraph to the release notes about this PR. Could you review this and add any relevant information, eg, maybe when configuration is required (and when the defaults work), or whether code needs to be compiled or something. You can also credit other members of your team if you want (also, I found your first name on LinkedIn, if it's incorrect sorry, please correct it)

Also, it looks like this fixes #1092, right?

@oowekyala the docs look great, there isn't any need to compile the code, the defaults follow the existing conventions and should cover 99% of the cases. I think it's best to leave the directions simple for now, but I can revisit if anyone has issues.

1092 is in an odd state. The test case as provided doesn't produce any violations, even when tested using the version mentioned in the report(6.3). The fact that it doesn't produce any violations is a False Negative. @rmohan20 has a change ready that fixes these False Negatives, but she was waiting for this PR to avoid too many False Positives.

Thank you @oowekyala and @adangel for all of the help.

@oowekyala

Copy link
Copy Markdown
Member

Thanks for your efforts on this! I'll merge today

@oowekyala oowekyala merged commit f775d79 into pmd:master Dec 11, 2020
oowekyala added a commit that referenced this pull request Dec 11, 2020
@oowekyala oowekyala mentioned this pull request Feb 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

an:enhancement An improvement on existing features / rules

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants