Skip to content

OPENNLP-1708: Introduce new OpenNLP module structure#786

Merged
mawiesne merged 8 commits intomainfrom
OPENNLP-1708-Introduce-new-module-structure
Jun 18, 2025
Merged

OPENNLP-1708: Introduce new OpenNLP module structure#786
mawiesne merged 8 commits intomainfrom
OPENNLP-1708-Introduce-new-module-structure

Conversation

@mawiesne
Copy link
Contributor

@mawiesne mawiesne commented Jun 1, 2025

Prolog

IMPORTANT:

  • ⚠️ The relocations of classes, test classes and resources are substantial!
  • DO NOT try to read the DIFF here on GH.
  • 👨🏽‍🔧 Instead, checkout the branch locally, switch to it and run a
    • mvn clean install and
    • inspect the module structure and internal dependencies.

Changes

Notes

  • Tests separated, all passing!
  • opennlp-ml module is just a stub for now and can / will be filled next.

Tasks

Thank you for contributing to Apache OpenNLP.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn clean install at the root opennlp folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file in opennlp folder?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found in opennlp folder?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.

@mawiesne mawiesne requested review from jzonthemtn and rzo1 June 1, 2025 19:23
@mawiesne mawiesne self-assigned this Jun 1, 2025
@mawiesne mawiesne added the java Pull requests that update Java code label Jun 1, 2025
@mawiesne mawiesne requested a review from atarora June 1, 2025 19:23
@rzo1
Copy link
Contributor

rzo1 commented Jun 2, 2025

Great things are happening here. Thanks for the push forward @mawiesne

I have some questions / thoughts / points for further discussion:

  • (A) What is the (planned) content to be put under opennlp-ml? I think it aims to carve out the maxent parts (or everything under ml package? opennlp-dl and opennlp-dl-gpu are technically also part of machine learning. Maybe we can put up opennlp-ml as a pom module and add opennlp-maxent and the dl stuff as children under it?

  • (B) Do we (still) aim for (package) compatibility or should we use this major version to align the package imports with ASF standards and move to org.apache.opennlp ?

  • (C) I think we need to double check the provided package-info.java for JPMS (or drop JPMS metadata doc support because our current multi maven structure with split packages would result in JPMS errors anyway). At the moment, it looks a bit inconsistent (but is also a thing for 2.x, I guess) because package-info.java files are meant for metadata, having them in multiple modules defining the same package can confuse tools (like static analyzers, etc), even outside JPMS.

➜  opennlp git:(OPENNLP-1708-Introduce-new-module-structure) find . -name package-info.java

./opennlp-api/src/main/java/opennlp/tools/commons/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/muc/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/ad/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/frenchtreebank/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/masc/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/brat/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/leipzig/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/conllu/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/nkjp/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/letsmt/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/irishsentencebank/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/ontonotes/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/formats/package-info.java
./opennlp-core/opennlp-formats/src/main/java/opennlp/tools/package-info.java
./opennlp-core/opennlp-cli/src/main/java/opennlp/tools/cmdline/lemmatizer/package-info.java
./opennlp-core/opennlp-cli/src/main/java/opennlp/tools/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/lemmatizer/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/ext/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/featuregen/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/util/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/namefind/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/entitylinker/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/languagemodel/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/parser/chunking/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/parser/treeinsert/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/parser/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ngram/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/postag/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/naivebayes/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/maxent/io/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/maxent/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/maxent/quasinewton/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/perceptron/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/model/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/ml/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/dictionary/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/log/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/doccat/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/sentdetect/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/chunker/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/langdetect/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/tokenize/package-info.java
./opennlp-core/opennlp-runtime/src/main/java/opennlp/tools/package-info.java
./opennlp-tools/src/main/java/opennlp/tools/util/package-info.java
./opennlp-tools/src/main/java/opennlp/tools/package-info.java

Other Notes

  • Build did succeed on my machine. Checked the diff in IDE + the new classes -> lgtm.
  • If we have more input from the others, I would suggest to just make the move and merge the big changes into main and do the rest of the work "step-by-step" in iteration mode via smaller PRs to reduce review burden :)

@rzo1 rzo1 marked this pull request as ready for review June 5, 2025 09:22
@mawiesne
Copy link
Contributor Author

mawiesne commented Jun 5, 2025

FYI @jzonthemtn - We worked on Richard's comments this morning:

Comment (A) was addressed and we further modularized the ML part as suggested by him in a joint session.
Comment (B) is up for discussion / decision and has received a separate issue number: OPENNLP-1739.
Comment (C) is addressed in this PR with @rzo1's latest commit / push.

@mawiesne mawiesne force-pushed the OPENNLP-1708-Introduce-new-module-structure branch from 3c5786c to 4760ccf Compare June 7, 2025 19:25
@mawiesne mawiesne marked this pull request as draft June 7, 2025 19:31
@rzo1 rzo1 force-pushed the OPENNLP-1708-Introduce-new-module-structure branch from 4760ccf to 3c5786c Compare June 7, 2025 19:32
@rzo1 rzo1 marked this pull request as ready for review June 7, 2025 19:34
@rzo1
Copy link
Contributor

rzo1 commented Jun 16, 2025

Any additional feedback? Otherwise, I propose that we move on after a rebase to fix pom conflicts.

mawiesne and others added 8 commits June 18, 2025 07:28
- defines new Maven module hierarchy
- resolves OPENNLP-1709 (API)
- resolves OPENNLP-1710 (Core)
- resolves OPENNLP-1711 (CLI)
- resolves OPENNLP-1712 (Tools)
- resolves OPENNLP-1713 (Extensions)
- resolves OPENNLP-1717 (API in core/dl)
- resolves OPENNLP-1718 (API, RT, etc. in uima & morfologik)

Note: tests separated, all passing!
…ation to separate the other ML related packages into custom modules
@rzo1 rzo1 force-pushed the OPENNLP-1708-Introduce-new-module-structure branch from 5ee99b2 to 098d717 Compare June 18, 2025 05:29
@rzo1
Copy link
Contributor

rzo1 commented Jun 18, 2025

Rebased against main.

@jzonthemtn
Copy link
Contributor

It's a large PR but the changes I looked at closely are ok. I'm personally a fan of getting unavoidable big merges out of the way and then tackling the details. So I'm ok to merge this into main.

@mawiesne mawiesne merged commit b4559df into main Jun 18, 2025
10 checks passed
@mawiesne
Copy link
Contributor Author

Thx @smarthi & @jzonthemtn for the feedback.

@mawiesne mawiesne deleted the OPENNLP-1708-Introduce-new-module-structure branch July 7, 2025 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

java Pull requests that update Java code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants