This issue was found while adding new heap attack tests for subqueries. It is exposed when multiple subqueries read many large keyword fields, or even a single giant text field.
Memory consumed in the following places is not yet tracked properly:
ValuesSourceReaderOperator.FieldWork
BlockSourceReader.scratch
The two queries below reference two indices:
- Index manybigfields has 1000 keyword fields; each field is a random 1KB string, each document is 1MB, and there are 500 documents.
- Index bigtext has 1 text field; each field (and so each document) is a random 5MB string, and there are 40 documents.
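For a sense of scale, the raw payload of each index can be estimated from the numbers above. This is only a back-of-the-envelope sketch: it assumes 1KB = 1024 bytes and ignores Lucene index overhead and compression, and the class and method names are made up for illustration.

```java
public class IndexSizeEstimate {
    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    // manybigfields: 500 documents x 1000 keyword fields x 1KB per field
    public static long manyBigFieldsBytes() {
        return 500L * 1000L * KB;
    }

    // bigtext: 40 documents x 1 text field x 5MB per field
    public static long bigTextBytes() {
        return 40L * 5L * MB;
    }

    public static void main(String[] args) {
        System.out.println("manybigfields raw bytes: " + manyBigFieldsBytes());
        System.out.println("bigtext raw bytes: " + bigTextBytes());
    }
}
```

Under these assumptions manybigfields holds roughly 500MB of field data and bigtext roughly 200MB, before any subquery multiplies the reads.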
Query #1
FROM
(FROM manybigfields)
, (FROM manybigfields)
, (FROM manybigfields)
, (FROM manybigfields)
, (FROM manybigfields)
, (FROM manybigfields)
, (FROM manybigfields)
, (FROM manybigfields)
ValuesSourceReaderOperator seems to hold some untracked memory consumed by Lucene, and its size in the dominator tree does not fully reflect it. There are 1000 ValuesSourceReaderOperator.FieldWork instances; the heap dump reports them as tiny, which is questionable. Block[]#1 in the screenshot is populated by ValuesSourceReaderOperator#3. The heap dump reports ValuesSourceReaderOperator#3 itself as about 92KB, yet Block[]#1 is 6MB. There may be additional memory usage that the heap dump does not show.
Query #2
FROM
(FROM bigtext)
, (FROM bigtext)
, (FROM bigtext)
, (FROM bigtext)
, (FROM bigtext)
, (FROM bigtext)
, (FROM bigtext)
, (FROM bigtext)
| LIMIT 30
BlockSourceReader.scratch is about 15MB in the heap dump, and it is not tracked by the circuit breaker.
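One possible direction, sketched below, is to reserve bytes against a breaker before growing a scratch buffer and to release them when the buffer is dropped. This is a minimal illustration only, not the actual Elasticsearch code: `SimpleBreaker`, `TrackedScratch`, `reserve`, `release`, and `ensureCapacity` are all hypothetical names, and the real fix may look quite different.

```java
import java.util.Arrays;

public class TrackedScratch {
    /** Hypothetical stand-in for a circuit breaker: a byte budget that rejects over-allocation. */
    public static final class SimpleBreaker {
        private final long limitBytes;
        private long usedBytes;

        public SimpleBreaker(long limitBytes) {
            this.limitBytes = limitBytes;
        }

        /** Reserve bytes, failing if the budget would be exceeded. */
        public void reserve(long bytes) {
            if (usedBytes + bytes > limitBytes) {
                throw new IllegalStateException(
                    "would use " + (usedBytes + bytes) + " bytes, limit is " + limitBytes);
            }
            usedBytes += bytes;
        }

        /** Return previously reserved bytes to the budget. */
        public void release(long bytes) {
            usedBytes -= bytes;
        }

        public long used() {
            return usedBytes;
        }
    }

    private final SimpleBreaker breaker;
    private byte[] scratch = new byte[0];

    public TrackedScratch(SimpleBreaker breaker) {
        this.breaker = breaker;
    }

    /** Grow the scratch buffer if needed, accounting for the growth before allocating. */
    public byte[] ensureCapacity(int size) {
        if (size > scratch.length) {
            long delta = (long) size - scratch.length;
            breaker.reserve(delta); // throws before the large allocation happens
            scratch = Arrays.copyOf(scratch, size);
        }
        return scratch;
    }

    /** Drop the buffer and give its bytes back to the breaker. */
    public void close() {
        breaker.release(scratch.length);
        scratch = new byte[0];
    }
}
```

With a 10MB budget in this sketch, growing the buffer to 5MB succeeds, while a later request for 15MB is rejected up front instead of silently consuming untracked heap.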
