Skip to content

[iceberg] Reduce redundant getTable calls in IcebergHiveMetadata#24376

Merged
ZacBlanco merged 1 commit into
prestodb:masterfrom
imjalpreet:reduceGetTableIceberg
Jan 21, 2025
Merged

[iceberg] Reduce redundant getTable calls in IcebergHiveMetadata#24376
ZacBlanco merged 1 commit into
prestodb:masterfrom
imjalpreet:reduceGetTableIceberg

Conversation

@imjalpreet

@imjalpreet imjalpreet commented Jan 16, 2025

Copy link
Copy Markdown
Member

Description

After #24326, redundant metastore calls can lead to significant performance degradation.

This PR introduces a per-query caching mechanism for Hive table objects in the IcebergHiveMetadata class to improve performance by reducing repeated calls to the metastore.

The chart below compares TPCH queries executed in my local environment. It illustrates execution times before disabling the metastore cache, the current master, and after the optimization introduced in this PR.

image

Impact

No user impact

Test Plan

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Iceberg Connector Changes
* Optimize redundant getTable calls in Iceberg Connector :pr:`24376 `

@imjalpreet imjalpreet self-assigned this Jan 16, 2025
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Jan 16, 2025
@prestodb-ci prestodb-ci requested review from a team, BryanCutler and nishithakbhaskaran and removed request for a team January 16, 2025 15:22
@imjalpreet imjalpreet force-pushed the reduceGetTableIceberg branch from e24d787 to 219ef2e Compare January 18, 2025 10:27
@imjalpreet imjalpreet force-pushed the reduceGetTableIceberg branch from 219ef2e to 70fe67e Compare January 18, 2025 13:08
@imjalpreet imjalpreet marked this pull request as ready for review January 18, 2025 16:28
@imjalpreet imjalpreet requested review from a team, ZacBlanco and hantangwangd as code owners January 18, 2025 16:28
@imjalpreet imjalpreet requested a review from presto-oss January 18, 2025 16:28

@hantangwangd hantangwangd left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the change, one little question, otherwise looks good to me.

Comment on lines +222 to +226
throw e;
}
catch (ExecutionException e) {
throw new RuntimeException("Unexpected checked exception by cache load from metastore", e);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little question, is there a reason not throw a PrestoException here?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry missed this comment.

Ideally, an ExecutionException should never be thrown in this case since metastore.getTable does not throw any checked exceptions. Any exceptions from the getTable call would be wrapped in an UncheckedExecutionException, from which we will throw a PrestoException. Because of this, I opted to keep it generic here. However, if you feel it’s better to throw a PrestoException here in any case, I can make that change.

@steveburnett

Copy link
Copy Markdown
Contributor

Thanks for the release note! Nit to update the placeholder number in the pull request tag.

== RELEASE NOTES ==

Iceberg Connector Changes
* Optimize redundant getTable calls in Iceberg Connector. :pr:`24376`

@hantangwangd hantangwangd left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok, not have a strong feeling about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants