
Backport PR #16482 to 8.x: Bugfix for BufferedTokenizer to completely consume lines in case of lines bigger than sizeLimit (#16569)

Merged
andsel merged 1 commit into 8.x from backport_16482_8.x on Oct 16, 2024

Conversation

@github-actions
Contributor

Backport PR #16482 to 8.x branch, original message:


Release notes

[rn:skip]

What does this PR do?

Updates BufferedTokenizerExt so that it can accumulate token fragments coming from different data segments. When a "buffer full" condition is matched, it records this state in a local field so that on the next data segment it can consume all the token fragments up to the next token delimiter.
Changed the accumulation variable from a RubyArray of strings to a StringBuilder that holds the head token; the remaining token fragments are stored in the input array.
Ports the tests from `describe FileWatch::BufferedTokenizer do` to Java.
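As a rough illustration of the buffer-full handling described above, here is a minimal standalone sketch. The class name, the newline delimiter, and the drop-instead-of-raise policy for oversized lines are assumptions made for this example; the real `BufferedTokenizerExt` API and behaviour differ in detail:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified sketch of the idea: a StringBuilder accumulates
// the head token across data segments, and a boolean field remembers the
// "buffer full" state so the rest of an oversized line is consumed (dropped)
// up to the next delimiter. Not the actual Logstash implementation.
class SketchTokenizer {
    private final StringBuilder head = new StringBuilder(); // head-token accumulator
    private final int sizeLimit;
    private boolean bufferFull = false; // persists across extract() calls

    SketchTokenizer(int sizeLimit) {
        this.sizeLimit = sizeLimit;
    }

    List<String> extract(String data) {
        List<String> tokens = new ArrayList<>();
        String[] fragments = data.split("\n", -1); // -1 keeps the trailing empty fragment
        for (int i = 0; i < fragments.length; i++) {
            boolean isLast = (i == fragments.length - 1);
            if (bufferFull) {
                if (isLast) {
                    continue; // still inside the oversized line: keep dropping
                }
                bufferFull = false; // delimiter reached: resume normal accumulation
                continue;
            }
            head.append(fragments[i]);
            if (!isLast) {
                if (head.length() <= sizeLimit) {
                    tokens.add(head.toString()); // complete token within the limit
                }
                head.setLength(0); // oversized complete lines are silently dropped
            } else if (head.length() > sizeLimit) {
                head.setLength(0);
                bufferFull = true; // oversized partial line: record the state
            }
        }
        return tokens;
    }
}
```

The point of the persistent `bufferFull` field is that an oversized line arriving split across several segments is fully consumed, so subsequent well-formed lines are tokenized correctly instead of being glued to the tail of the oversized one.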

Why is it important/What is the impact to the user?

Fixes the tokenizer's behaviour so that it works properly when buffer-full conditions are met.

Checklist

  • [x] My code follows the style guidelines of this project
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files (and/or docker env variables)
  • [x] I have added tests that prove my fix is effective or that my feature works

Author's Checklist

How to test this PR locally

Follow the instructions in #16483

Related issues

Use cases

Screenshots

Logs

…ines bigger than sizeLimit (#16482)

Fixes the tokenizer's behaviour so that it works properly when buffer-full conditions are met.

Updates BufferedTokenizerExt so that it can accumulate token fragments coming from different data segments. When a "buffer full" condition is matched, it records this state in a local field so that on the next data segment it can consume all the token fragments up to the next token delimiter.
Changed the accumulation variable from a RubyArray of strings to a StringBuilder that holds the head token; the remaining token fragments are stored in the input array.
Furthermore it translates the `buftok_spec` tests into JUnit tests.

(cherry picked from commit 85493ce)
@elastic-sonarqube

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarQube

@elasticmachine

💚 Build Succeeded

cc @andsel

@andsel andsel merged commit 27bd2a0 into 8.x Oct 16, 2024
donoghuc added a commit to donoghuc/logstash that referenced this pull request Nov 20, 2024
…r to completely consume lines in case of lines bigger than sizeLimit (elastic#16569)"

This reverts commit 27bd2a0.
donoghuc added a commit that referenced this pull request Nov 20, 2024
…mpletely consume lines in case of lines bigger than sizeLimit (#16569)" (#16705)

This reverts commit 27bd2a0.
donoghuc added a commit to donoghuc/logstash that referenced this pull request Nov 21, 2024
…r to completely consume lines in case of lines bigger than sizeLimit (elastic#16569)"

This reverts commit 27bd2a0.
@donoghuc donoghuc deleted the backport_16482_8.x branch November 21, 2024 16:15
donoghuc added a commit that referenced this pull request Nov 21, 2024
…mpletely consume lines in case of lines bigger than sizeLimit (#16569)" (#16714)

This reverts commit 27bd2a0.
