[BUG] PreparedDictionaryImpl data gets removed by garbage collection #189

@bwollmer

Describe the bug
PreparedDictionary instances are losing their content because the garbage collector removes it. This can happen silently: the code does not crash, but the compression ratio is poor because no dictionary data is actually used. The problem seems to be a missing reference to rawData within the PreparedDictionaryImpl class.

To Reproduce
I would love to provide a test, but since garbage collection is involved and can run at any time, the test would be flaky.
The general idea:

  1. Call Encoder.prepareDictionary
  2. Use the dictionary for compression
  3. Let the garbage collector run
  4. Repeat step 2 and compare the results
  5. The output of the second run should be much larger than that of the first, since an empty dictionary was used
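
The steps above can be sketched as a small harness. This is only an illustration under assumptions: `GcRepro` and `compressedSizesAroundGc` are hypothetical names, and the actual compression call (step 2) is passed in as a `Supplier` because the exact encoder invocation is not shown in this report. `System.gc()` is only a hint to the JVM, so a run that shows no difference proves nothing.

```java
import java.util.function.Supplier;

/** Hypothetical harness for steps 2 to 5; not part of the library. */
final class GcRepro {

    /**
     * Runs the supplied compression twice with a GC request in between.
     * Returns {sizeBeforeGc, sizeAfterGc} in bytes.
     */
    static int[] compressedSizesAroundGc(Supplier<byte[]> compressWithDictionary) {
        // Step 2: compress once while the dictionary data is still intact.
        int before = compressWithDictionary.get().length;

        // Step 3: ask the JVM to collect. This is a request, not a guarantee.
        System.gc();

        // Step 4: compress the same input again with the same dictionary.
        int after = compressWithDictionary.get().length;

        return new int[] {before, after};
    }

    /** Step 5: a larger second result suggests the dictionary was lost. */
    static boolean looksAffected(int[] sizes) {
        return sizes[1] > sizes[0];
    }
}
```

The supplier would wrap whatever code compresses a fixed input with the prepared dictionary. For the bug to be observable, nothing in the supplier may keep the original ByteBuffer reachable.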

Expected behavior
Custom dictionaries should be safe from garbage collection for as long as they are in use.

Platform (please complete the following information):
I saw this behavior on Linux and macOS, but it should be platform independent.

Additional context
We have seen this in production, where we load a dictionary once at startup and use it multiple times. As a workaround, we keep a reference to the ByteBuffer passed to Encoder.prepareDictionary, which solved the problem for us.
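
A minimal sketch of that workaround, for anyone hitting the same problem. `PinnedDictionary` is a hypothetical wrapper, not part of the library. The prepare step is passed in as a function so the sketch does not guess at the exact parameters of Encoder.prepareDictionary. It also assumes the dictionary is supplied as a direct ByteBuffer.

```java
import java.nio.ByteBuffer;
import java.util.Objects;
import java.util.function.Function;

/**
 * Hypothetical holder that keeps the source buffer strongly reachable
 * for as long as the prepared dictionary itself is reachable.
 */
final class PinnedDictionary<D> {
    // Never read for its contents here; the field exists only so the
    // garbage collector cannot reclaim the buffer while this object lives.
    private final ByteBuffer rawData;
    private final D prepared;

    private PinnedDictionary(ByteBuffer rawData, D prepared) {
        this.rawData = rawData;
        this.prepared = prepared;
    }

    /** Copies the bytes into a direct buffer (an assumption) and prepares the dictionary. */
    static <D> PinnedDictionary<D> create(byte[] bytes, Function<ByteBuffer, D> prepare) {
        ByteBuffer direct = ByteBuffer.allocateDirect(bytes.length);
        direct.put(bytes);
        direct.flip(); // make the written bytes readable from position 0
        D prepared = Objects.requireNonNull(prepare.apply(direct), "prepare returned null");
        return new PinnedDictionary<>(direct, prepared);
    }

    /** The prepared dictionary to hand to the encoder. */
    D prepared() {
        return prepared;
    }

    /** Size of the pinned dictionary data in bytes. */
    int size() {
        return rawData.capacity();
    }
}
```

The application would keep the `PinnedDictionary` instance (for example in a long-lived field) rather than only the prepared dictionary, so both objects share one lifetime.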
