Problem
The JDK serialization that the security plugin uses to serialize and deserialize various headers is slow.
Proposal
This is a proposal to change the implementation of `Base64Helper::serializeObject` and `Base64Helper::deserializeObject` to use a faster serialization protocol. I explored Fast Serialization (FST), Protostuff, Kryo, Avro, and OpenSearch's custom serialization as alternatives to JDK serialization and ran a few benchmarks. The results are below.
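For context, the baseline being replaced can be pictured roughly as follows. This is a minimal sketch, assuming (as the helper's name suggests) that the serialized bytes are Base64-encoded into a string header. `JdkHeaderCodec` is a hypothetical stand-in, not the plugin's actual `Base64Helper`, which may do more than this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.InetSocketAddress;
import java.util.Base64;

// Hypothetical stand-in for illustration only; not the plugin's Base64Helper.
final class JdkHeaderCodec {

    // JDK serialization followed by Base64, so the result fits in a string header.
    static String serializeObject(Serializable object) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Reverse path: Base64-decode, then let ObjectInputStream rebuild the object graph.
    static Serializable deserializeObject(String encoded) throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return (Serializable) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        InetSocketAddress address = new InetSocketAddress("127.0.0.1", 9300);
        String header = serializeObject(address);
        System.out.println(header.length() + " chars -> " + deserializeObject(header));
    }
}
```

JDK serialization writes class descriptors alongside the field data and relies on reflection, which is one plausible reason for the cost measured in the benchmarks.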
Benchmarking Environment
Framework used - JMH, 1000 warm-up iterations, 30000 test iterations
EC2 InstanceType - c5.2xlarge
JDK - Corretto JDK 11
OS - Amazon Linux 2 x86_64
| Operation | Type | Serializer | Avg Time (ns/op) | Error (± ns/op) | Diff vs. Java (%) |
|---|---|---|---|---|---|
| deserialize | User | Java | 26062.709 | 847.012 | |
| deserialize | User | FST | 4299.802 | 251.09 | -83.50209 |
| deserialize | User | FST (Pre) | 3674.455 | 133.466 | -85.90148 |
| deserialize | User | Proto | 808.423 | 40.851 | -96.89816 |
| deserialize | User | Custom (OpenSearch) | 834.74 | 56.749 | -96.79719 |
| deserialize | InetSocketAddress | Java | 9732.072 | 309.654 | |
| deserialize | InetSocketAddress | FST | 3957.335 | 287.201 | -59.33718 |
| deserialize | InetSocketAddress | FST (Pre) | 3417.478 | 134.756 | -64.88437 |
| deserialize | InetSocketAddress | Kryo (Pre) | 1274.085 | 20.928 | -86.90839 |
| deserialize | SourceFieldsContext | Java | 7892.943 | 333.835 | |
| deserialize | SourceFieldsContext | FST | 2168.463 | 66.373 | -72.52656 |
| deserialize | SourceFieldsContext | FST (Pre) | 868.976 | 48.215 | -88.99047 |
| deserialize | SourceFieldsContext | Proto | 1003.155 | 29.785 | -87.29048 |
| deserialize | SourceFieldsContext | Custom (OpenSearch) | 834.987 | 30.013 | -89.42109 |
| serialize | User | Java | 10370.249 | 319.919 | |
| serialize | User | FST | 3104.632 | 161.298 | -70.06213 |
| serialize | User | FST (Pre) | 2899.691 | 131.584 | -72.03837 |
| serialize | User | Proto | 1423.777 | 59.772 | -86.27056 |
| serialize | User | Custom (OpenSearch) | 1115.154 | 69.707 | -89.2466 |
| serialize | InetSocketAddress | Java | 4749.54 | 168.423 | |
| serialize | InetSocketAddress | FST | 2578.204 | 115.172 | -45.71676 |
| serialize | InetSocketAddress | FST (Pre) | 2368.224 | 101.214 | -50.13782 |
| serialize | InetSocketAddress | Kryo (Pre) | 1544.436 | 55.018 | -67.48241 |
| serialize | SourceFieldsContext | Java | 4023.138 | 146.527 | |
| serialize | SourceFieldsContext | FST | 1427.189 | 63.018 | -64.52548 |
| serialize | SourceFieldsContext | FST (Pre) | 756.986 | 38.476 | -81.18419 |
| serialize | SourceFieldsContext | Proto | 1138.412 | 70.829 | -71.70338 |
| serialize | SourceFieldsContext | Custom (OpenSearch) | 1123.486 | 37.035 | -72.07439 |
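The Diff % figures appear to be the relative change in average time against the Java (JDK serialization) row for the same type and operation, so a negative value means faster than the baseline. The snippet below is a small sketch of that arithmetic using the User deserialization numbers; `DiffPercent` is a name made up for illustration.

```java
final class DiffPercent {

    // Relative change of a candidate's average time against the JDK baseline, in percent.
    static double diffPercent(double baselineNs, double candidateNs) {
        return (candidateNs - baselineNs) / baselineNs * 100.0;
    }

    public static void main(String[] args) {
        // User deserialization: Java at 26062.709 ns/op, FST at 4299.802 ns/op.
        System.out.printf("%.5f%n", diffPercent(26062.709, 4299.802));
    }
}
```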
- Although FST is highly performant and the simplest to use of all the options, it has shortcomings. It no longer appears to be actively maintained: the last commit was made 2 years ago, there are 102 open issues, and it has a history of breaking changes even in minor version upgrades.
- Protostuff is also highly performant, but it needs explicit handling for certain classes such as `InetSocketAddress`, by writing Delegates. It doesn't seem to be actively maintained either; the last commit was 1 year ago.
- Kryo does not work out of the box. It does not work with classes that lack a zero-arg constructor, so we would have to write serializers. For complex objects such as `java.util.Collections$SynchronizedMap`, we would also have to register separate serializers. The kryo-serializers repo has many such serializers that we could use. Given that we already have a highly optimised custom serialization framework (`StreamOutput`, `StreamInput`) within OpenSearch, expending effort to integrate with another library seems unnecessary.
- Custom serialization using OpenSearch's `BytesStreamOutput` and `BytesStreamInput` classes is a promising approach. It too is highly performant. Classes defined within the security plugin, such as `User` and `SourceFieldsContext`, can implement the `Writeable` interface. For classes such as `InetSocketAddress`, which we cannot change, we'll have to add Writer and Reader methods to the `StreamOutput` and `StreamInput` classes in order to use the `writeGenericObject` and `readGenericObject` methods. This is in line with how OpenSearch deals with third-party classes today. [source code]
To conclude, we propose to use custom serialization for headers in the security plugin.
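To make the custom-serialization idea concrete, here is a self-contained sketch of the pattern. It uses `java.io.DataOutputStream` and `DataInputStream` as stand-ins for OpenSearch's stream classes so that it runs without OpenSearch on the classpath. `FieldWriter`, `DemoUser`, `writeAddress`, and `readAddress` are hypothetical names, and the fields of `DemoUser` are assumptions; the plugin's `User` class and the real `Writeable` contract will differ in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

final class CustomSerializationSketch {

    // Hypothetical stand-in for a Writeable-style contract: the class writes its own fields.
    interface FieldWriter {
        void writeTo(DataOutputStream out) throws IOException;
    }

    // Hypothetical, simplified user type; the plugin's User carries more state.
    static final class DemoUser implements FieldWriter {
        final String name;
        final List<String> roles;

        DemoUser(String name, List<String> roles) {
            this.name = name;
            this.roles = roles;
        }

        // Read constructor: consumes fields in exactly the order writeTo emits them.
        DemoUser(DataInputStream in) throws IOException {
            this.name = in.readUTF();
            int count = in.readInt();
            List<String> read = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                read.add(in.readUTF());
            }
            this.roles = read;
        }

        @Override
        public void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(name);
            out.writeInt(roles.size());
            for (String role : roles) {
                out.writeUTF(role);
            }
        }
    }

    // For a class we cannot modify, the writer and reader live outside the class.
    static void writeAddress(DataOutputStream out, InetSocketAddress address) throws IOException {
        if (address.isUnresolved()) {
            throw new IllegalArgumentException("sketch handles resolved addresses only");
        }
        byte[] ip = address.getAddress().getAddress();
        out.writeByte(ip.length); // 4 for IPv4, 16 for IPv6
        out.write(ip);
        out.writeInt(address.getPort());
    }

    static InetSocketAddress readAddress(DataInputStream in) throws IOException {
        byte[] ip = new byte[in.readByte()];
        in.readFully(ip);
        return new InetSocketAddress(InetAddress.getByAddress(ip), in.readInt());
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            new DemoUser("admin", List.of("all_access")).writeTo(out);
            writeAddress(out, new InetSocketAddress("127.0.0.1", 9300));
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
            DemoUser user = new DemoUser(in);
            System.out.println(user.name + " " + user.roles + " " + readAddress(in));
        }
    }
}
```

Because the writer and reader agree on field order ahead of time, no class metadata has to travel with the payload. The trade-off is that any change to the field layout has to be handled explicitly, which is why the version upgrade scenario needs testing.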
Solution
This change is proposed for introduction in OS 3.0, with no intention to backport it. We can break the solution down into the following action items:
- Add Writer and Reader methods to the `StreamOutput` and `StreamInput` classes, respectively, for third-party classes directly involved in serialization within the security plugin. [will update the list below]
- Change the `Base64Helper::serializeObject` and `Base64Helper::deserializeObject` methods to use custom serialization.

I've raised an initial draft PR for serialization using Protostuff and am working towards testing the version upgrade scenario (from OS 2.5 to OS 2.7). For testing purposes, the change is currently assumed to be introduced as part of the OS 2.7 release. We may need to bump this version.
I will raise another PR with custom serialization.
Next Steps