(object_store): Instantiate object store from provided url #4077
(object_store): Instantiate object store from provided url #4077chitralverma wants to merge 26 commits intoapache:masterfrom
Conversation
tustvold
left a comment
There was a problem hiding this comment.
Looking good, mostly just some comments on error handling
|
@tustvold one question I had regarding the tests - I want to test parse_url for each supported store. so shall I put them separately in each module or create 1 big test in the same file? |
I suspect this will make the feature flags easier |
| /// # Examples | ||
| /// | ||
| /// ``` | ||
| /// |
There was a problem hiding this comment.
still a WIP 😂
there are some open items in my checklist
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
object_store/src/options.rs
Outdated
| .iter() | ||
| .map(|(key, value)| { | ||
| let conf_key = | ||
| AzureConfigKey::from_str(&key.to_ascii_lowercase()).unwrap(); |
There was a problem hiding this comment.
should we either return an error or omit the keys that will panic here?
There was a problem hiding this comment.
I meant making the function fallible and return the error, rather then unwrapping and thus panicing ...
There was a problem hiding this comment.
@roeap Thanks for the review.
I'm not sure what exactly you meant here, but based on my understanding and a little searching, I found something called error propagation and have made some changes for this.
Can you please check if this is what you meant and if this is on the right track?
|
I've marked this as a draft as it doesn't appear to be ready for review |
|
also one more question - Since each object store has its own |
| pub fn parse_url( | ||
| url: impl AsRef<str>, | ||
| store_options: Option<impl Into<StoreOptions>>, | ||
| _from_env: bool, |
There was a problem hiding this comment.
I wonder if we could instead use a builder pattern here 🤔 I'll have a play over the weekend and see what I can come up with
There was a problem hiding this comment.
yes, I thought about that as well. let me do something about this and then we can merge our ideas on this.
goals -
- allow URL based obejct_store instantiation
- store options are either explicitly passed or picked from env or both (preference given to explicit over env)
- optionally allow internal
ClientOptionsto be passed - should work for all object stores like local, HTTP, AWS, GCS, Azure, mem
- user facing API should be simple and natural
There was a problem hiding this comment.
Created a builder pattern for this.
let me know if you had something like this in mind?
https://github.com/apache/arrow-rs/pull/4077/files#r1182195434
| /// .with_env_variables(true) | ||
| /// .build(); | ||
| /// ``` | ||
| pub struct ObjectStoreBuilder { |
There was a problem hiding this comment.
@tustvold Based on your comment I have created a builder pattern for object store.
If this looks good, the we can get rid of the options.rs I add as it contains duplicate code.
Also added better examples :)
|
Apologies I have a bit of an interrupted week, and want to give some thought into how this would integrate with things like DataFusion's ObjectStoreRegistry or downstream ObjectStore implementations like HDFS. Thank you for sticking with this, I'll get back to you as soon as I am able |
|
Thank you for your work on this, I've raised #4184 with a proposal inspired by this. PTAL and let me know what you think |
|
Closing this as #4200 has now been merged, thank you for your work on this |
Which issue does this PR close?
Closes apache/arrow-rs-object-store#176.
Closes #2304
Rationale for this change
This PR proposes a standardized implementation to create an object store from provided URL and options. It would make things significantly simple for developers using the crate.
What changes are included in this PR?
Check list
parse_url( ... )from_envtoparse_urlparse_url( ... )Are there any user-facing changes?
Yes,
parse_url( ... )is user-facing.No breaking changes.