Recognize classes used in web.xml as main classes #264
slawekjaranowski merged 8 commits into apache:master
A web application may include classes provided by dependency artifacts. Class references defined in web.xml are now recognized as main used classes.
    throws IOException, SAXException, XPathExpressionException, ParserConfigurationException {

    DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
    documentBuilderFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
I think you need to set the namespace feature to true here
When I set:
documentBuilderFactory.setNamespaceAware(true);
the XPath stops working. A web.xml can use either namespace, the old javaee one or the new jakartaee one,
so building the XPath becomes more complicated.
If the XPath stops working, then the XPath is wrong. You likely have two errors that are cancelling each other out, but that's very brittle.
web.xml in both namespaces is a problem, and that's a big reason not to change namespaces with new versions. It causes exactly this sort of problem.
You can match on local-name() instead, or rearrange the code to use separate paths for the two different namespaces.
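For illustration, a minimal sketch of the local-name() option, not the code from this PR. It assumes a namespace-aware parser and an expression that ignores the namespace URI, so the same XPath matches javaee, jakartaee, prefixed, and namespace-less documents. The class name LocalNameXPathSketch and the method classNames are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class LocalNameXPathSketch {

    // Selects the three elements by local name, whatever namespace (or none) they are in.
    private static final String CLASS_ELEMENTS =
            "//*[local-name()='filter-class' or local-name()='listener-class'"
                    + " or local-name()='servlet-class']";

    static List<String> classNames(String webXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Without this, DOM nodes carry no local name and local-name() cannot match.
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(webXml)));

            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(CLASS_ELEMENTS, doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            // A sketch-level policy: report unreadable input as an unchecked exception.
            throw new IllegalArgumentException("Cannot read web.xml content", e);
        }
    }
}
```

The trade-off is the one discussed in this thread: matching on the local name alone is lenient and also accepts a document with a wrong or missing namespace.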
Done: the parser is now namespace-aware and the XPath matches on local-name().
    private XPathExpression[] getXPathExpression() throws XPathExpressionException {
        XPathFactory xPathFactory = XPathFactory.newInstance();
        XPath xpath = xPathFactory.newXPath();
        return new XPathExpression[] {
Just noticed you have multiple XPath expressions here already, so you probably should just double them up, one for each namespace. You will need to provide a NamespaceContext object that binds the prefixes you use here to the actual namespace URIs.
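A rough sketch of what that could look like, not the code from this PR. It assumes the Java EE 7/8 namespace (http://xmlns.jcp.org/xml/ns/javaee) and the Jakarta EE namespace (https://jakarta.ee/xml/ns/jakartaee); the older java.sun.com namespace is left out for brevity. The class name PrefixedXPathSketch, the method classNames, and the prefixes javaee and jakartaee are hypothetical choices.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class PrefixedXPathSketch {

    // Prefixes are local to this sketch; only the URIs have to match the document.
    private static final Map<String, String> NAMESPACES = new LinkedHashMap<>();
    static {
        NAMESPACES.put("javaee", "http://xmlns.jcp.org/xml/ns/javaee");
        NAMESPACES.put("jakartaee", "https://jakarta.ee/xml/ns/jakartaee");
    }

    private static final String[] ELEMENTS = {"filter-class", "listener-class", "servlet-class"};

    // Binds the prefixes used in the XPath expressions to namespace URIs.
    private static final NamespaceContext CONTEXT = new NamespaceContext() {
        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return NAMESPACES.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
        }

        @Override
        public String getPrefix(String namespaceURI) {
            for (Map.Entry<String, String> entry : NAMESPACES.entrySet()) {
                if (entry.getValue().equals(namespaceURI)) {
                    return entry.getKey();
                }
            }
            return null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            String prefix = getPrefix(namespaceURI);
            return prefix == null
                    ? Collections.<String>emptyIterator()
                    : Collections.singletonList(prefix).iterator();
        }
    };

    static List<String> classNames(String webXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(webXml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(CONTEXT);

            List<String> result = new ArrayList<>();
            // One expression per (namespace, element) pair, i.e. the list "doubled up".
            for (String prefix : NAMESPACES.keySet()) {
                for (String element : ELEMENTS) {
                    NodeList nodes = (NodeList) xpath.evaluate(
                            "//" + prefix + ":" + element, doc, XPathConstants.NODESET);
                    for (int i = 0; i < nodes.getLength(); i++) {
                        result.add(nodes.item(i).getTextContent().trim());
                    }
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot read web.xml content", e);
        }
    }
}
```

Unlike a local-name() match, this variant returns nothing for a web.xml whose elements are in no namespace or in an unknown one.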
When I use local-name() I don't need a NamespaceContext.
I don't want to complicate it that much.
The goal is to read some values from web.xml, not to validate it.
If web.xml has wrong or missing namespaces, I don't want to break the build.
I did as you proposed:
You can match on the local-name() instead or rearrange the code to use separate paths for the two different namespaces.
I still think this should break in the case of a missing namespace. I suggested local-name only because I didn't yet notice that you were already using a list.
OK, once again: the one goal here is to read some values from web.xml, which should be a quick and easy task.
Validating syntax, missing namespaces, and so on is not a goal of this task; here we are doing dependency analysis.
What benefit would strict namespace validation give us here?
Pull request overview
This pull request adds support for recognizing classes referenced in web.xml as main used classes for WAR projects. This enables proper dependency analysis for web applications where classes from dependency artifacts are referenced in deployment descriptors but may not be directly referenced in compiled code.
Changes:
- Implements a new WarMainDependencyClassesProvider that parses web.xml files to extract class references from filter-class, listener-class, and servlet-class elements
- Adds comprehensive unit tests including edge cases (malformed XML, missing files, namespace variations)
- Includes integration tests demonstrating the feature working with Maven projects
- Changes plexus-xml dependency scope from test to provided since Xpp3Dom is used in the main code
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| src/main/java/org/apache/maven/shared/dependency/analyzer/dependencyclasses/WarMainDependencyClassesProvider.java | New provider implementation that parses web.xml files to extract class references for WAR projects |
| src/test/java/org/apache/maven/shared/dependency/analyzer/dependencyclasses/WarMainDependencyClassesProviderTest.java | Unit tests covering various web.xml scenarios including standard location, custom location, and edge cases |
| src/test/resources/webapp/src/main/webapp/WEB-INF/web.xml | Test resource with standard web.xml containing filter, listener, and servlet class references |
| src/test/resources/webapp/examples/*.xml | Additional test resources for edge cases (empty, malformed, no namespace, multiple entries) |
| src/it/web-application/pom.xml | Integration test parent POM declaring dependencies to test against |
| src/it/web-application/web1/pom.xml | Integration test module using standard web.xml location |
| src/it/web-application/web2/pom.xml | Integration test module using custom web.xml location via maven-war-plugin configuration |
| src/it/web-application/web1/src/main/webapp/WEB-INF/web.xml | Integration test web.xml at standard location |
| src/it/web-application/web2/webapp/WEB-INF/web.xml | Integration test web.xml at custom location |
| src/it/web-application/verify.groovy | Integration test verification script checking that classes from web.xml are properly detected |
| pom.xml | Moves plexus-xml dependency from test to provided scope as it's now used in main code |
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.
@elharo I have changed the implementation: I now simply use getElementsByTagName, without XPath. I hope we can now finish the discussion on this PR.
    }

    private void processClassesFromTags(Document doc, List<String> classes, String tagName) {
        NodeList tags = doc.getElementsByTagName(tagName);
The problem with getElementsByTagName is that it only works with the exact tag name with the exact prefix (or lack thereof) specified. So now you're not recognizing a lot of tags you want to recognize.
With DOM or XPath or anything else, you need to work with namespaces as designed. That means search by local name and namespace URI. Trying to avoid that is brittle at best and likely will break sooner or later.
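To make the difference concrete, here is a small sketch, not the code from this PR, that runs both lookups against a web.xml fragment which binds the Jakarta EE namespace to a prefix. The class name TagNameLookupSketch, its methods, and the PREFIXED sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class TagNameLookupSketch {

    static final String JAKARTAEE_NS = "https://jakarta.ee/xml/ns/jakartaee";

    // Made-up fragment: well-formed XML that uses a prefix instead of a default namespace.
    static final String PREFIXED =
            "<j:web-app xmlns:j=\"" + JAKARTAEE_NS + "\">"
                    + "<j:servlet><j:servlet-class>com.example.MyServlet</j:servlet-class></j:servlet>"
                    + "</j:web-app>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse XML", e);
        }
    }

    // Compares the qualified tag name, prefix included.
    static int countByTagName(String xml, String tagName) {
        return parse(xml).getElementsByTagName(tagName).getLength();
    }

    // Compares namespace URI and local name; the prefix is irrelevant.
    static int countByNamespace(String xml, String namespaceUri, String localName) {
        return parse(xml).getElementsByTagNameNS(namespaceUri, localName).getLength();
    }
}
```

With the PREFIXED document, countByTagName(PREFIXED, "servlet-class") finds nothing because the tag name is j:servlet-class, while countByNamespace(PREFIXED, JAKARTAEE_NS, "servlet-class") finds the element.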
Can you provide an example web.xml that breaks my implementation?
Please look again; it now uses getElementsByTagNameNS.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 1 comment.