[iceberg] Reduce redundant getTable calls in IcebergHiveMetadata#24376
Conversation
e24d787 to
219ef2e
Compare
219ef2e to
70fe67e
Compare
hantangwangd
left a comment
There was a problem hiding this comment.
Thanks for the change, one little question, otherwise looks good to me.
| throw e; | ||
| } | ||
| catch (ExecutionException e) { | ||
| throw new RuntimeException("Unexpected checked exception by cache load from metastore", e); | ||
| } |
There was a problem hiding this comment.
A little question, is there a reason not throw a PrestoException here?
There was a problem hiding this comment.
Sorry missed this comment.
Ideally, an ExecutionException should never be thrown in this case since metastore.getTable does not throw any checked exceptions. Any exceptions from the getTable call would be wrapped in an UncheckedExecutionException, from which we will throw a PrestoException. Because of this, I opted to keep it generic here. However, if you feel it’s better to throw a PrestoException here in any case, I can make that change.
|
Thanks for the release note! Nit to update the placeholder number in the pull request tag. |
hantangwangd
left a comment
There was a problem hiding this comment.
I'm ok, not have a strong feeling about this.
Description
After #24326, redundant metastore calls can lead to significant performance degradation.
This PR introduces a per-query caching mechanism for Hive table objects in the
IcebergHiveMetadataclass to improve performance by reducing repeated calls to the metastore.The chart below compares TPCH queries executed in my local environment. It illustrates execution times before disabling the metastore cache, the current master, and after the optimization introduced in this PR.
Impact
No user impact
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.