Describe the bug
PreparedDictionaries are losing their content because the garbage collector reclaims the underlying data. This can happen "silently": the code does not crash, but the compression ratio is poor because no dictionary data is actually used. The problem seems to be that the PreparedDictionaryImpl class does not hold a reference to rawData.
To Reproduce
I would love to provide a test, but since garbage collection is involved and can run at any time, the test would be flaky.
The general idea is:
- Call Encoder.prepareDictionary
- Use dictionary for compression
- Let the garbage collection run
- Repeat the compression step with the same dictionary and compare the results
- The output of the second run should be much larger than that of the first, since an effectively empty dictionary was used
Expected behavior
Custom dictionaries should be safe from garbage collection for as long as they are still in use.
Platform (please complete the following information):
I saw the behavior on Linux and macOS, but it should be platform independent.
Additional context
We have seen this in production, where we load a dictionary once at startup and use it many times. As a workaround, we keep a reference to the ByteBuffer passed to Encoder.prepareDictionary, which solved the problem for us.
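The workaround can be sketched roughly as follows. This is only an illustration, not our production code: PinnedDictionary is a hypothetical helper name, and the prepare function is a stand-in for the call to Encoder.prepareDictionary (which, as far as I can tell, expects a direct ByteBuffer). The idea is simply to store the buffer and the prepared dictionary in the same object, so the buffer cannot be collected while the dictionary is still reachable.

```java
import java.nio.ByteBuffer;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Brotli): pairs a prepared dictionary with
 * the ByteBuffer it was built from, so both stay reachable together.
 */
final class PinnedDictionary<T> {
    // Strong reference: keeps the dictionary bytes alive as long as this object is.
    private final ByteBuffer rawData;
    private final T prepared;

    private PinnedDictionary(ByteBuffer rawData, T prepared) {
        this.rawData = rawData;
        this.prepared = prepared;
    }

    /**
     * Copies the dictionary bytes into a direct buffer and prepares it.
     * In real use, prepare is assumed to be something like
     * buf -> Encoder.prepareDictionary(buf, type).
     */
    static <T> PinnedDictionary<T> create(byte[] bytes, Function<ByteBuffer, T> prepare) {
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
        buf.put(bytes);
        buf.flip(); // position 0, limit = number of dictionary bytes
        return new PinnedDictionary<>(buf, prepare.apply(buf));
    }

    /** The prepared dictionary to attach to the encoder. */
    T prepared() {
        return prepared;
    }

    /** Read-only view of the pinned bytes, mainly useful for debugging. */
    ByteBuffer rawData() {
        return rawData.asReadOnlyBuffer();
    }
}
```

The application then keeps the PinnedDictionary instance (for example in a long-lived field) for as long as it compresses with prepared(). A fix inside PreparedDictionaryImpl would presumably do the same thing internally by holding rawData as a field.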