[BUG] Field masking has inconsistent memory issues with certain queries #4031

@peternied

What is the bug?
We've seen reports of heap memory usage spiking with field masking enabled. Because masking is implemented at the leaf level, it's possible that certain types of queries cause the masked fields to be materialized even when they are not used.

How can one reproduce the bug?
Steps to reproduce the behavior:

  • Check out this branch: main...peternied:masking-perf
  • Run ./gradlew integrationTest --tests org.opensearch.security.MaskingTests
  • Analyze the results
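When analyzing the results, it helps to be explicit about how a "heap used" sample might be taken. The following is a minimal sketch and not the code on the masking-perf branch: the `HeapProbe` class and its method names are hypothetical, and reading heap usage through `MemoryMXBean` around a single action is an assumption about the measurement approach.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper; the actual test may sample heap differently.
final class HeapProbe {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    private HeapProbe() {}

    // Current heap usage in bytes as reported by the JVM.
    static long usedHeapBytes() {
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    // Difference in used heap across one action. System.gc() is only a hint,
    // so the result is noisy and can be negative if a collection runs mid-action.
    static long heapDelta(Runnable action) {
        System.gc();
        long before = usedHeapBytes();
        action.run();
        return usedHeapBytes() - before;
    }

    public static void main(String[] args) {
        long delta = heapDelta(() -> {
            byte[] scratch = new byte[1 << 20]; // stand-in for executing a search request
            scratch[0] = 1;
        });
        System.out.println("Heap delta in bytes: " + delta);
    }
}
```

Because a single sample is noisy, repeating the measurement many times and comparing distributions, as the tables in this issue do, is more informative than any one reading.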

Baseline behavior

Baseline query with the following format:

final SearchSourceBuilder ssb = new SearchSourceBuilder();
ssb.size(0);
final SearchRequest request = new SearchRequest(INDEX_NAME_PREFIX + "*");
request.source(ssb);

Creating 3 Indices with 5000 Documents

| Role | Condition | Count | Attempts | Avg Heap Used | Max Heap Used | Min Heap Used | Std Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| admin | | 100 | 106 | 888,814 | 3,621,248 | 191,136 | 480,198 |
| reader | | 100 | 105 | 808,317 | 1,470,368 | 517,640 | 195,083 |
| reader | ROLE_WITH_NO_MASKING | 100 | 105 | 813,600 | 1,602,272 | 536,416 | 206,571 |
| reader | MASKING_LOW_REPEAT_VALUE | 100 | 105 | 936,948 | 1,631,088 | 618,704 | 195,653 |
| reader | MASKING_RANDOM_LONG | 100 | 105 | 919,187 | 1,593,456 | 548,264 | 236,892 |
| reader | MASKING_RANDOM_STRING | 100 | 105 | 951,826 | 1,656,432 | 564,784 | 191,008 |
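
For readers comparing rows, the Avg, Max, Min, and Std Deviation columns can be derived from the per-request heap samples roughly as follows. This is a sketch under assumptions: the `HeapSummary` class is hypothetical, and the issue does not say whether the test uses the population or the sample standard deviation (the population form is used here).

```java
import java.util.Arrays;

// Hypothetical summary type; not taken from the test code in the branch.
final class HeapSummary {
    final double avg;
    final long max;
    final long min;
    final double stdDev;

    private HeapSummary(double avg, long max, long min, double stdDev) {
        this.avg = avg;
        this.max = max;
        this.min = min;
        this.stdDev = stdDev;
    }

    // Summarizes heap samples; assumes the population standard deviation.
    static HeapSummary of(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("at least one sample is required");
        }
        double avg = Arrays.stream(samples).average().getAsDouble();
        long max = Arrays.stream(samples).max().getAsLong();
        long min = Arrays.stream(samples).min().getAsLong();
        double variance = Arrays.stream(samples)
            .mapToDouble(s -> (s - avg) * (s - avg))
            .sum() / samples.length;
        return new HeapSummary(avg, max, min, Math.sqrt(variance));
    }

    public static void main(String[] args) {
        HeapSummary s = HeapSummary.of(new long[] { 2, 4, 4, 4, 5, 5, 7, 9 });
        System.out.printf("avg=%.1f max=%d min=%d stdDev=%.1f%n", s.avg, s.max, s.min, s.stdDev);
    }
}
```

A large gap between Max and Avg together with a high Std Deviation suggests that a few outlier samples, for example around garbage collection, dominate the maximum, so the Avg column is the more stable basis for comparing conditions.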

Creating 3 Indices with 50000 Documents

| Role | Condition | Count | Attempts | Avg Heap Used | Max Heap Used | Min Heap Used | Std Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| admin | | 100 | 110 | 873,379 | 4,872,552 | 575,208 | 508,483 |
| reader | | 100 | 109 | 753,489 | 4,020,832 | 567,304 | 341,413 |
| reader | ROLE_WITH_NO_MASKING | 100 | 108 | 715,624 | 988,688 | 553,504 | 78,209 |
| reader | MASKING_LOW_REPEAT_VALUE | 100 | 108 | 970,238 | 1,222,008 | 650,520 | 89,596 |
| reader | MASKING_RANDOM_LONG | 100 | 108 | 942,349 | 1,215,448 | 679,768 | 115,690 |
| reader | MASKING_RANDOM_STRING | 100 | 108 | 971,219 | 1,646,856 | 660,856 | 119,863 |

Query with Aggregate Filter

Query with the following format:

SearchSourceBuilder ssb = new SearchSourceBuilder();
ssb.aggregation(AggregationBuilders.filters("my-filter", QueryBuilders.queryStringQuery("last")));
ssb.aggregation(AggregationBuilders.count("counting").field("genre.keyword"));
ssb.aggregation(AggregationBuilders.avg("averaging").field("longId"));
ssb.size(0);
final SearchRequest request = new SearchRequest(INDEX_NAME_PREFIX + "*");
request.source(ssb);

Creating 3 Indices with 5000 Documents

| Role | Condition | Count | Attempts | Avg Heap Used | Max Heap Used | Min Heap Used | Std Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| admin | | 100 | 106 | 1,069,939 | 4,905,008 | 288,144 | 679,704 |
| reader | | 100 | 105 | 877,725 | 1,604,048 | 562,032 | 215,923 |
| reader | ROLE_WITH_NO_MASKING | 100 | 106 | 898,739 | 1,860,632 | 354,032 | 260,686 |
| reader | MASKING_LOW_REPEAT_VALUE | 100 | 106 | 2,441,865 | 3,504,944 | 2,040,768 | 235,363 |
| reader | MASKING_RANDOM_LONG | 100 | 106 | 2,500,135 | 3,215,712 | 1,984,096 | 247,712 |
| reader | MASKING_RANDOM_STRING | 100 | 106 | 2,414,665 | 3,330,960 | 2,075,848 | 232,634 |

Creating 3 Indices with 50000 Documents

| Role | Condition | Count | Attempts | Avg Heap Used | Max Heap Used | Min Heap Used | Std Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| admin | | 100 | 113 | 1,265,168 | 2,698,248 | 993,528 | 301,134 |
| reader | | 100 | 112 | 1,308,585 | 5,408,936 | 851,672 | 579,648 |
| reader | ROLE_WITH_NO_MASKING | 100 | 109 | 1,021,555 | 1,504,480 | 798,400 | 136,735 |
| reader | MASKING_LOW_REPEAT_VALUE | 100 | 114 | 2,420,922 | 7,183,000 | 1,828,456 | 660,701 |
| reader | MASKING_RANDOM_LONG | 100 | 112 | 2,225,070 | 2,815,568 | 1,994,176 | 142,038 |
| reader | MASKING_RANDOM_STRING | 100 | 111 | 2,169,297 | 2,673,376 | 1,964,144 | 128,983 |

Term Match Query

Query with the following format:

SearchSourceBuilder ssb = new SearchSourceBuilder();
ssb.aggregation(AggregationBuilders.filters("my-filter", QueryBuilders.termQuery("title", "last")));
ssb.aggregation(AggregationBuilders.count("counting").field("genre.keyword"));
ssb.aggregation(AggregationBuilders.avg("averaging").field("longId"));
ssb.size(0);
final SearchRequest request = new SearchRequest(INDEX_NAME_PREFIX + "*");
request.source(ssb);

Creating 3 Indices with 5000 Documents

| Role | Condition | Count | Attempts | Avg Heap Used | Max Heap Used | Min Heap Used | Std Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| admin | | 100 | 106 | 1,154,269 | 2,497,744 | 441,920 | 353,223 |
| reader | | 100 | 105 | 906,467 | 1,586,928 | 563,872 | 203,493 |
| reader | ROLE_WITH_NO_MASKING | 100 | 105 | 873,926 | 1,390,368 | 453,184 | 199,539 |
| reader | MASKING_LOW_REPEAT_VALUE | 100 | 105 | 1,184,290 | 1,857,944 | 840,680 | 218,853 |
| reader | MASKING_RANDOM_LONG | 100 | 106 | 1,184,382 | 1,847,496 | 795,520 | 224,178 |
| reader | MASKING_RANDOM_STRING | 100 | 105 | 1,162,373 | 1,811,936 | 782,312 | 215,824 |

Creating 3 Indices with 50000 Documents

| Role | Condition | Count | Attempts | Avg Heap Used | Max Heap Used | Min Heap Used | Std Deviation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| admin | | 100 | 110 | 966,775 | 1,264,696 | 825,536 | 79,341 |
| reader | | 100 | 109 | 971,058 | 1,393,544 | 779,968 | 95,795 |
| reader | ROLE_WITH_NO_MASKING | 100 | 108 | 964,071 | 1,392,504 | 765,104 | 90,630 |
| reader | MASKING_LOW_REPEAT_VALUE | 100 | 108 | 1,249,295 | 1,707,896 | 1,028,104 | 120,058 |
| reader | MASKING_RANDOM_LONG | 100 | 109 | 1,225,250 | 1,717,792 | 982,744 | 111,090 |
| reader | MASKING_RANDOM_STRING | 100 | 108 | 1,227,667 | 1,667,544 | 991,424 | 116,841 |

Labels: bug, documentation, help wanted, performance, triaged