-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Adds DELETE and HEAD instrumentation to CLI #18206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
74976f8 to
dae5002
Compare
- Adds instrumentation to head requests in the instrumented object store - Adds instrumentatin to delete requests in the instrumented object store - Adds tests for new code and updates existing tests
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @BlakeOrth -- love the code 🚋 that is currently going on
| | Get | duration | ...NORMALIZED...| 1 | | ||
| | Get | size | 1006 B | 1006 B | 1006 B | 1006 B | 1 | | ||
| | Head | duration | ...NORMALIZED...| 1 | | ||
| | Head | size | | | | | 1 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is interesting that there are (even more!) requests going on to read files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb Yes! This feature continues to provide interesting data. That being said, I'm 99% sure the head requests on read in this case is due to the parquet metadata cache! Refer to my comment on its initial PR here here: #16971 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess what I am saying is that if we tally up all the object store operations that are (currently) required to read a parquet file there is a lot of room for improvement. I think the next release of DataFusion is going to "magically get faster" for a bunch of people because we are optimizing these calls systematically.
I for one am very excited
## Which issue does this PR close? This does not fully close, but is an incremental building block component for: - apache#17207 The full context of how this code is likely to progress can be seen in the POC for this effort: - apache#17266 ## Rationale for this change Further fills out method instrumentation ## What changes are included in this PR? - Adds instrumentation to head requests in the instrumented object store - Adds instrumentatin to delete requests in the instrumented object store - Adds tests for new code ## Are these changes tested? Yes. New unit tests have been added. ## Are there any user-facing changes? No-ish ## cc @alamb
## Which issue does this PR close? This does not fully close, but is an incremental building block component for: - apache#17207 The full context of how this code is likely to progress can be seen in the POC for this effort: - apache#17266 ## Rationale for this change Further fills out method instrumentation ## What changes are included in this PR? - Adds instrumentation to head requests in the instrumented object store - Adds instrumentatin to delete requests in the instrumented object store - Adds tests for new code ## Are these changes tested? Yes. New unit tests have been added. ## Are there any user-facing changes? No-ish ## cc @alamb
Which issue does this PR close?
This does not fully close, but is an incremental building block component for:
datafusion-cli] Add a way to see what object store requests are made #17207The full context of how this code is likely to progress can be seen in the POC for this effort:
Rationale for this change
Further fills out method instrumentation
What changes are included in this PR?
Are these changes tested?
Yes. New unit tests have been added.
Are there any user-facing changes?
No-ish
cc @alamb