Skip to content

Fix updating of maven model when parent pom was updated previously in same cycle#5051

Closed
mryan43 wants to merge 35 commits intoopenrewrite:mainfrom
mryan43:fix-maven-model-update
Closed

Fix updating of maven model when parent pom was updated previously in same cycle#5051
mryan43 wants to merge 35 commits intoopenrewrite:mainfrom
mryan43:fix-maven-model-update

Conversation

@mryan43
Copy link
Copy Markdown

@mryan43 mryan43 commented Feb 18, 2025

What's changed?

  • Move projectPoms to the ExecutionContext instead of PomDownloader + MavenResolutionResult + UpdateMavenModel
  • in MavenVisitor, update the current maven resolution result if a parent pom from the project was updated
  • Added minimal test to showcase issue caused by projectPoms not being shared between MavenResolutionResults

What's your motivation?

  • In multi-module maven projects where both the parent and child are in the same project, updates made to the parent pom are not visible to recipes updating children poms, causing various bugs.
  • The typical scenario that I want to fix is :
  1. A first recipe modifies a parent pom,
  2. Another recipe attempts to modify a child of this pom and requires the changes from step 1
  3. Since the MavenResolutionResult is a Marker, the parent and the child each have their own and UpdateMavenModel does not update the marker on the child document when the parent model is updated.
  4. Also, the PomDownloader uses its own projectPoms which can be stale, making the resolution process return out-of-date poms.
  5. the recipe that should modify the child fails because the model is not up to date between documents

See added test for a very simple real-life scenario.
I also believe that this PR could fix a number of similar issues with many different combinations of recipes, such as this one for example : #4949

Have you considered any alternatives or workarounds?

  • I have considered using the ExecutionContext to share MavenResolutionResult created by UpdateMavenModel. It doesn't work because the PomDownloader uses projectPoms that are not shared between MavenResolutionResult during the resolution process.
  • I have considered adding the ExecutionContext as a transient field in MavenResolutionResult but this solution was creating really bad coupling and would cause maintenance issues over time.

Checklist

  • I've added unit tests to cover both positive and negative cases
  • I've read and applied the recipe conventions and best practices
  • I've used the IntelliJ IDEA auto-formatter on affected files

Manuel Ryan and others added 2 commits February 18, 2025 01:58
…ionResult + UpdateMavenModel

- Only update project poms in ExecutionContext when UpdateMavenModel runs
- Resolve updated maven model before maven visitor visits a document
- Added minimal test to showcase issue caused by projectPoms not being shared between MavenResolutionResults
- WIP: a few tests fail. Also resolution of updated maven model can be improved to run only when a previous recipe actually made a change. Also not 100% sure about new implementation of MavenResolutionResult#parentPomIsProjectPom
@mryan43 mryan43 changed the title Move projectPoms to the ExecutionContext instead of PomDownloader + MavenResolutionResult + UpdateMavenModel Fix updating of maven model when parent pom was updated Feb 27, 2025
@mryan43 mryan43 marked this pull request as ready for review February 27, 2025 15:05
@mryan43 mryan43 changed the title Fix updating of maven model when parent pom was updated Fix updating of maven model when parent pom was updated previously in same cycle Feb 27, 2025
@timtebeek
Copy link
Copy Markdown
Member

Thanks a lot @mryan43 ! Tagging Knut for review as he proposed something similar for constructors just today.

github-actions[bot]

This comment was marked as off-topic.

…ToParentThenRemoveRedundantDependencyTest.java

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions[bot]

This comment was marked as off-topic.

@timtebeek timtebeek requested a review from MBoegers March 3, 2025 15:30
@MBoegers
Copy link
Copy Markdown
Contributor

MBoegers commented Mar 3, 2025

Hi @mryan43, looks neat! I will have a detailed look tomorrow in the morning 📓

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Mar 4, 2025

PR is ready for another round of review.

PS: Huge thank you for all the work everyone has put in this PR so far !

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Mar 10, 2025

Merged origin/main into this branch, please approve again for CI. This PR is still awaiting a re-review after comments were addressed.

@MBoegers
Copy link
Copy Markdown
Contributor

Hi @mryan43 as you see from the number of participants, updating MavenResolutionResults after changing the Maven Pom content is a pain point for OpenRewrite, and we are very pleased by your contribution.
You already noticed the public activity dropped, please let me explain. We noticed that your solution utilizes the ExecutionContext to store Pom instances. The ExecutionContext is designed to store meta information and configurations. Accumulators of SearchRecipes, Maven Configurations and likewise are data to store in the ExecutionContext. Poms in contrast, are LST elements and the main object of work for OpenRewrite, like a person for a CRM system. With this assumption, build in storing Poms in the ExecutionContext can cause major problems in downstream products like the Moderne SaaS or Moderne CLI.
Because of this fear, we came to the conclusion that we shouldn't merge this PR with the current implementation but seek for a solution that does utilize the ExecutionContext as designed to transport meta information.

I apologize for the delay and late intervention. I hope you can use your incredible skill and motivation to find a better solution with us.

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Mar 12, 2025

Thanks for the answer and the call we had recently.

After investigating a bit more, the ExecutionContext seems to already contain instances of the Pom class before those changes :
image
image
image

Could you share a bit more details about why the changes from this PR are dangerous for downstream products but those are ok ? That would help me find a better solution that fits everyone's needs

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Mar 12, 2025

I would also like to challenge the idea that the Pom class represents a LST element.
I will argue that it is already meta-data:

  • MavenVisitor extends TreeVisitor<XML,P>, it does not have a visitPom or visitDependency method for example, only methods for visiting xml tags.
  • MavenVisitors being an XML visitor, they do not mutate the fields of the Pom class when visiting. That is most likely why the UpdateMavenModel visitor was created in the first place and needs to run after modifications to the XML, to update the metadata (therefore MavenResolutionResult is also meta-data)
  • Poms can be returned by the MavenPomDownloader class. This class can return poms downloaded from remote artifact repositories, how are those downloaded poms LST elements in OpenRewrite ? Could they be modified by visitors for example ? obviously not...

For all those reasons... I think only XML documents and tags can be considered LST elements in the current implementation.

@knutwannheden
Copy link
Copy Markdown
Contributor

You are right, Poms are not actually LST elements for the reasons you mention. There is also no technical reason why LST elements can't be stored in the ExecutionContext. Sometimes that can be the best way to achieve something. The reason we discourage storing LSTs in the ExecutionContext is that these can consume a lot of memory, especially if they have type attribution attached to them as in the case of Java. In the case of Maven the MavenResolutionResult can also consume quite a lot of memory due to all the transitive dependencies. I am not sure how much of that is attributed to the Pom objects, but I think it should probably be OK and I don't really see any other option anyway.

@knutwannheden
Copy link
Copy Markdown
Contributor

knutwannheden commented Mar 18, 2025

Also note that while using the InMemoryLargeSourceSet for recipe execution this isn't a problem, because all LST are already loaded into memory anyway and thus adding a reference to the ExecutionContext doesn't occupy any additional memory. With the Moderne SaaS and CLI the situation is different as they support processing huge repos by not loading all LSTs at once into memory. By carrying LSTs forward in the ExecutionContext the memory of those LSTs can never be released.

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Mar 18, 2025

Thanks a lot for clarifying those concerns !

I was wondering if there exists any mechanism other than the ExecutionContext to propagate the update between parent and child pom that could be better suited ? I didn't find any in my research but this is my first contribution so I might have missed something

@knutwannheden
Copy link
Copy Markdown
Contributor

The only alternative I can come up with is if we would register something like a UnaryOperator<Pom> (probably a dedicated functional interface would be the better alternative) in the ExecutionContext as a function to transform the parent Pom. At least theoretically this seems like it should also be doable. This function could then be applied to transform the parent POM both for the parent project and for child projects. I am just putting this out there for discussion.

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Apr 7, 2025

I'm not convinced that we can find a better solution without heavily refactoring the whole maven module...

The question remains about what should happen to this PR.
I'm a bit sad that the fix seems to be blocked only because the architecture of the downstream closed source project.

@knutwannheden
Copy link
Copy Markdown
Contributor

@mryan43 What about my proposal with a transformer function? Would that end up capturing a lot of objects from the LST model in the lambda or is it a lot of work to refactor? I haven't looked at the changes very closely and it is a bit hard to tell from a distance.

@mryan43
Copy link
Copy Markdown
Author

mryan43 commented Apr 11, 2025

I can add an interface with an implementation that contains the full Pom object and replaces it. Then downstream closed source product maintainers can replace it by whatever they want. Would that work for you ?

This would be similar to how the ExecutionContext vs InMemoryExecutionContext works

@github-actions
Copy link
Copy Markdown
Contributor

This PR is stale because it has been open for 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

@crankydillo
Copy link
Copy Markdown
Contributor

crankydillo commented Sep 10, 2025

How can we progress on this? Ultimately, when a recipe is crawling over a child pom, we need all the parent metadata to be up to date. I even tried multi-cycle as a quick hack to make things work, but it didn't. I didn't really dig into why, but I seem to recall some caching.

Is the function approach mentioned here being discussed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

7 participants