Skip to content

Fix bug with additional file processing in cache compressed buffer#4913

Merged
alexey-milovidov merged 4 commits intomasterfrom
fix_additional_seek
Apr 6, 2019
Merged

Fix bug with additional file processing in cache compressed buffer#4913
alexey-milovidov merged 4 commits intomasterfrom
fix_additional_seek

Conversation

@alesapin
Copy link
Copy Markdown
Member

@alesapin alesapin commented Apr 5, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

For changelog. Remove if this is non-significant change.

Category (leave one):

  • Bug Fix

Short description (up to few sentences):
Fix bug with additional file seek and read when block already in uncompressed cache.

@alesapin alesapin changed the title Fix bug with additional seek in cache compressed buffer Fix bug with additional file processing in cache compressed buffer Apr 5, 2019
Copy link
Copy Markdown
Contributor

@nvartolomei nvartolomei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still see one cache miss. Seems to be due to EOF which we don't cache. Curious if it make senses to cache it so we don't have to seek at all for some queries.

@alexey-milovidov
Copy link
Copy Markdown
Member

Your own test has failed.

@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented Apr 5, 2019

Have no idea why. Seems like query was absent in system.query_log, but server log contains:

2019.04.05 16:36:30.836471 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> executeQuery: (from [::1]:56788) SELECT ProfileEvents.Values[indexOf(ProfileEvents.Names, 'Seek')], ProfileEvents.Values[indexOf(ProfileEvents.Names, 'ReadCompressedBytes')], ProfileEvents.Values[indexOf(ProfileEvents.Names, 'UncompressedCacheHits')] AS hit FROM system.query_log WHERE (query_id = 'test-query-uncompressed-cache') AND (type = 2) ORDER BY event_time DESC LIMIT 1
2019.04.05 16:36:30.839870 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "type = 2" moved to PREWHERE
2019.04.05 16:36:30.851788 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> system.query_log (SelectExecutor): Key condition: unknown, unknown, and
2019.04.05 16:36:30.851848 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> system.query_log (SelectExecutor): MinMax index condition: unknown, unknown, and
2019.04.05 16:36:30.851928 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> system.query_log (SelectExecutor): Selected 1 parts by date, 1 parts by key, 1 marks to read from 1 ranges
2019.04.05 16:36:30.853315 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Trace> system.query_log (SelectExecutor): Reading approx. 1024 rows with 1 streams
2019.04.05 16:36:30.853632 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2019.04.05 16:36:30.855353 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> executeQuery: Query pipeline:
Limit
 Expression
  MergeSorting
   PartialSorting
    Expression
     Filter
      MergeTreeThread

2019.04.05 16:36:30.858977 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Information> executeQuery: Read 10 rows, 3.61 KiB in 0.022 sec., 448 rows/sec., 161.78 KiB/sec.
2019.04.05 16:36:30.859126 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> MemoryTracker: Peak memory usage (for query): 1.03 MiB.
2019.04.05 16:36:30.859697 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Debug> MemoryTracker: Peak memory usage (total): 1.03 MiB.
2019.04.05 16:36:30.859786 [ 29 ] {a718ce00-620d-4975-898f-906bbcbecca0} <Information> TCPHandler: Processed in 0.028 sec.

How is it possible?

@alexey-milovidov
Copy link
Copy Markdown
Member

alexey-milovidov commented Apr 5, 2019

There is a (logical) race condition in SystemLog.
SYSTEM FLUSH LOGS does not guarantee that all data is written, because the data may be still in queue. The method flush called by user, should also drain the queue.

The issue may happen in other tests as well.

$CLICKHOUSE_CLIENT --use_uncompressed_cache=1 --query_id="test-query-uncompressed-cache" --query="$cached_query" &> /dev/null

$CLICKHOUSE_CLIENT --query="SYSTEM FLUSH LOGS"
sleep 1
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be before SYSTEM FLUSH LOGS so that SystemLog background thread has time to pull log items from queue to data vector.

@alexey-milovidov alexey-milovidov merged commit 27f7f09 into master Apr 6, 2019
@abyss7 abyss7 added the pr-bugfix Pull request with bugfix, not backported by default label Apr 8, 2019
abyss7 pushed a commit that referenced this pull request Apr 15, 2019
Fix bug with additional file processing in cache compressed buffer

(cherry picked from commit 27f7f09)
@abyss7 abyss7 added the v19.5 label Apr 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants