Update jruby to 9.4.12.1 by lfoppiano · Pull Request #1293 · grobidOrg/grobid

lfoppiano · 2025-05-26T06:08:07Z

This PR fixes the regression introduced by the update to JRuby 9.4.12.0 (in #1261):

Update JRuby to 9.4.12.1
Update pragmatic segmenter to 0.3.24

More tests are required, ideally on a few thousand PDFs. Before testing, switch to the pragmatic segmenter in the configuration:

  sentenceDetectorFactory: "org.grobid.core.lang.impl.PragmaticSentenceDetectorFactory"
#  sentenceDetectorFactory: "org.grobid.core.lang.impl.OpenNLPSentenceDetectorFactory"


coveralls · 2025-05-26T06:16:56Z

Coverage: 40.576% (remained the same) when pulling a0a82bb on bugfix/fix-jruby-update into 23eef0f on master.

Copilot

Pull Request Overview

This PR updates JRuby to 9.4.12.1 and bumps the pragmatic segmenter version to 0.3.24 to address a regression introduced in the previous update. Key changes include updating version numbers, refactoring method calls from instance calls to the new Rule.apply class method across multiple files, and adjusting text initialization from Text.new(text) to text.dup.
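The move from instance-level calls to a class-level Rule.apply can be pictured with a small Ruby sketch. The Rule struct below is a simplified stand-in written for illustration, not the segmenter's actual source; the PROTECT_DR rule and the sample sentence are assumptions.

```ruby
# Hypothetical stand-in for the segmenter's Rule class (illustration only).
Rule = Struct.new(:pattern, :replacement) do
  # Class-level entry point: applies each rule to str in place and returns it.
  def self.apply(str, *rules)
    rules.flatten.each { |rule| str.gsub!(rule.pattern, rule.replacement) }
    str
  end
end

# Hypothetical rule: mask the period of "Dr." so it is not read as a sentence end.
PROTECT_DR = Rule.new(/Dr\./, "Dr∯")

text = "Dr. Smith arrived. He sat down."

# Work on a plain String copy instead of a wrapper object.
result = Rule.apply(text.dup, PROTECT_DR)
puts result
```

Because the rule is applied to a duplicate, the caller's string is left unchanged while the returned copy carries the substitution.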

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated no comments.


File	Description
grobid-home/sentence-segmentation/pragmatic_segmenter/version.rb	Bumped version from "0.3.22" to "0.3.24".
grobid-home/sentence-segmentation/pragmatic_segmenter/types.rb	Refactored rule application to use the new Rule.apply class method.
grobid-home/sentence-segmentation/pragmatic_segmenter/punctuation_replacer.rb	Updated rule calls to use Rule.apply for both @text and local variables.
grobid-home/sentence-segmentation/pragmatic_segmenter/processor.rb	Replaced instance .apply calls with Rule.apply for consistency.
grobid-home/sentence-segmentation/pragmatic_segmenter/list.rb	Changed Text instantiation to text duplication (text.dup) and updated rule calls accordingly.
grobid-home/sentence-segmentation/pragmatic_segmenter/languages/*	Standardized rule application for various language-specific processors.
grobid-home/sentence-segmentation/pragmatic_segmenter/cleaner.rb	Updated text initialization and switched to Rule.apply throughout.
grobid-home/sentence-segmentation/pragmatic_segmenter/abbreviation_replacer.rb	Applied similar changes to text handling and rule application.

Comments suppressed due to low confidence (2)

grobid-home/sentence-segmentation/pragmatic_segmenter/languages/common/numbers.rb:50

Ensure the updated regex pattern still correctly matches all intended numbered references; consider adding specific test cases to cover potential edge cases.

NUMBERED_REFERENCE_REGEX = /(?<=[^\d\s])(\.|∯)((\[(\d{1,3},?\s?-?\s?)?\b\d{1,3}\])+|((\d{1,3}\s?){0,3}\d{1,3}))(\s)(?=[A-Z])/
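A minimal sketch of the kind of test cases the comment asks for. The pattern is copied verbatim from the line above; the sample sentences are hypothetical, and the expected values follow from reading the pattern itself (a non-digit, non-space character before the period, a bare or bracketed reference number, then a space and an uppercase letter).

```ruby
# Pattern copied verbatim from the review comment above.
NUMBERED_REFERENCE_REGEX = /(?<=[^\d\s])(\.|∯)((\[(\d{1,3},?\s?-?\s?)?\b\d{1,3}\])+|((\d{1,3}\s?){0,3}\d{1,3}))(\s)(?=[A-Z])/

# Hypothetical sample inputs mapped to whether the pattern should match.
cases = {
  "as shown previously.1 The next study" => true,     # bare reference number
  "as shown previously.[12] The next study" => true,  # bracketed reference
  "as shown previously.[1-3] The next study" => true, # bracketed range
  "the value is 3.5 The next study" => false,         # digit before the period
  "as shown previously.1 the next study" => false     # lowercase after the space
}

cases.each do |input, expected|
  actual = NUMBERED_REFERENCE_REGEX.match?(input)
  puts format("%-46s expected=%-5s actual=%s", input.inspect, expected, actual)
end
```

These cases only exercise the pattern in isolation; they do not cover how the segmenter uses the match afterwards.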

grobid-home/sentence-segmentation/pragmatic_segmenter/list.rb:51

Verify that replacing 'Text.new(text)' with 'text.dup' preserves any specialized behavior provided by the Text class and does not introduce unwanted side effects.

@text = text.dup
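What text.dup itself guarantees can be checked in isolation with plain Ruby strings, as sketched below with hypothetical sample text. This says nothing about extra methods the Text class may have provided; that part still has to be verified against the segmenter's source.

```ruby
original = "First sentence. Second sentence."
copy = original.dup

# In-place edits on the copy leave the caller's string untouched.
copy.gsub!(".", "∯")

puts original
puts copy

# dup returns a distinct, plain String object.
puts copy.equal?(original)
puts copy.class

# dup also yields an unfrozen copy, so in-place rules can run on frozen input.
frozen_input = "Frozen text.".freeze
unfrozen = frozen_input.dup
puts unfrozen.frozen?
```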

Update jruby to 9.4.12.1, update pragmatic segmenter to comply with the new jruby version

a0a82bb

lfoppiano requested a review from Copilot May 26, 2025 19:46

Copilot AI reviewed May 26, 2025


lfoppiano mentioned this pull request May 26, 2025

Address vulnerabilities in docker images #1262

Closed

lfoppiano merged commit 11d3763 into master May 26, 2025
11 checks passed

lfoppiano deleted the bugfix/fix-jruby-update branch May 26, 2025 19:56
