[CR] waterbutler provider for Microsoft OneDrive#102
[CR] waterbutler provider for Microsoft OneDrive#102caseyrygt wants to merge 60 commits intoCenterForOpenScience:developfrom
Conversation
…displays the folder as a child folder
…e does not have revisions
…. http://localhost:7777/...)
…orks with a single file in the project root folder; todo: test upload to subfolder
|
Hey @caseyrygt, Re: async: Yes, WB can do this. Look at Re: intra_move vs. intra_copy: Where are you seeing this? Is this in v0 or v1? Re: WaterButlerPath: No, sorry, str(path) is pretty standard, and we don't have any really good examples. The Box.com provider also uses an id rather than folder/name, so the WBPath-using code in there might prove helpful. Cheers, |
|
Hey @felliott: RE async: thanks, I'll take a look there. RE intra_move vs Intra_copy: I just tried to reproduce it and I must have been confused or thinking I did a copy instead of a move. Sorry about that. I'll also look at Box for the path handling. Thanks, |
|
I've run into an interesting snag...when creating the path parts for WaterButlerPath, I'm creating the names from the parentReference path: /ryan-test1/sub1/sub2; the ids are created from the parent ID and child ID. So I have something like this: In this case when methods call path.idenfifier (e.g. download, metadata), path.identifier is none. I suspect it is because WBPath thinks there are 4 parts in total (3 folders, 1 file) and only 2 IDs were passed in. If that's correct, I could work around it by just using the parent folder path (/sub2) and its ID. Is my understanding correct and is there a better way to create a WaterButlerPath for OneDrive? Thanks! |
IIRC, we usually just pad the id list with None for the entries we don’t have ids for. So in your example, ids would be [None, None, “1234!1”, “1234!2”]. |
|
Happy New Year! I've got the name/id path code working for most use cases except downloading a folder as a zip file. There I'm running into an issue in download where path.identifier is None. It looks like the Zip function in core/provider is calling revalidate_path. Can you shed some light on what this function needs to do? I looked at the box revalidate_path method and it looks like it gets the name and id for a path but I'm not 100% sure if that is everything it needs to do. Thanks, |
Hey Ryan, You are correct. Cheers, |
|
REDACTED see next post for details Hello @caseyrygt . Nice work on this new provider! I have been working on an issue with another provider and was curious how you were handling it in your code, so started snooping. It seems that you may run into the same problem. Both GoogleDrive and OneDrive File/Folder IDs can have multiple parents. The provider might need to handle that possibility. I have come to a bit of a dead end trying to resolve this in GoogleDrive. I hope you are able avoid this situation or find a solution. Food for thought @felliott . Thanks, Tom |
That’s a great catch, @TomBaxter! Could you send me a link to docs where it talks about multiple parents? My browse-fu is weak today. @caseyrygt, the issue we’re running into with GDrive is that its folders are not really folders, but labels. This breaks WB’s expectations of filesystem-like behavior, where @TomBaxter, @caseyrygt: I’ve talked with our Product Manager about what we should do for this, and I think we’ve come to a decision. I’m still in the middle of writing it up formally and probing for weaknesses, but what we want to do is this: WB is supposed to be a storage-provider abstraction providing filesystem-like semantics. When I delete given /foo/baz.txt and /bar/baz.txt are the same file with ID 1234, rename /bar/baz.txt to /bar/quux.txtremove label ‘bar’ from file 1234 Likewise, moves and copies will work on the same principal, with mutating actions creating new copies of multi-labeled entities. I’m still working through some examples on paper to make sure this’ll work how we expect, but I think it will so far. One other change will be in the ID assignments. Since we have to support the possibility of multi-labeled files, GDrive and OneDrive ids will have to be changed to materialized ids, a cross between paths and ids. Copying from another PR response, given:
For now, I’m still figuring out the cases we’ll need to support, and making sure this approach will work. I’ll update this ticket and the GDrive one when I’ve finished. Thank you, both of you! Cheers, |
|
That is very interesting. Thanks for bringing it up. Could you provide a link to the OneDrive (Personal Edition) documentation for labels / multiple parents? I went through the API just now and couldn't find anything regarding labels. (API: https://dev.onedrive.com/README.htm) Thanks, |
|
I apologize for jumping the gun with my post. I thought I had found several references to multiple OneDrive parents yesterday but have been unable to find them this morning. I suspect that my google searches of OneDrive were returning hits for GoogleDrive. I dove into the OneDrive API and see now that they use "remoteItem" to reference shares instead of using multiple "parentReference" entries and as near as I can tell they don't support symbolic links within a single "drive" at all, unless perhaps that is also done with the "remoteItem". Again I apologise for any anxiety I caused by bringing this up prematurely. Embarrassingly, Tom |
@caseyrygt: Sorry, I’ve lost track. Is this PR ready for review again? Cheers, |
|
@TomBaxter, no worries! I'm glad you were thinking about other providers On Tue, Jan 5, 2016 at 9:38 AM, Fitz Elliott notifications@github.com
@felliott, let me implement revalidate_path first. I did have a question Can you take a look at intra_copy? Basically, I did not want to return to If you want to look at intra_copy once everything else is done, I Thanks, |
Sounds good.
I don’t know off the top of my head, but I’ve made a note to look at it. I’m going to be spending all day writing stuff up, so it might be tomorrow or Thursday until I get back to you. Thanks! Cheers, |
…ad as zip on project root folder
|
Since this has gotten long and I believe all of the issues here have been address, I'm going to close this and submit a new PR with a new question. Thanks! |
This feature was done as a COS-funded Integration Grant by: * Ryan Casey (@caseyrygt) Additional work was done on read-write support by Alexandr Melnikov (@alexandr-melnikov-dev-pro). Read-write will be added in a future release. Thank you both for all your hard work on this! [SVCS-269] [SVCS-458] Closes: #205 Related: #102, #151, #177
Initial Work In Progress PR for Microsoft OneDrive WaterButler Provider
Notes
We copied Dropbox as the starting point for this provider. There is still some functionality in comments from Dropbox and that is not specific to OneDrive. Likewise there are debug and info logging statements that we'll remove before the final PR.
Also, invoke test passes but there are no OneDrive unit tests, yet. When I was first working in this, I just got formatting errors from invoke test and then once I got all of those fixed I saw that there are in fact WaterButler tests.
I will get tests written for OneDrive.
Design Approach
I think the biggest difference between the Dropbox and OneDrive provider (as written), is I used the OneDrive "id" for the file/folder rather than the path and name.
Limitations
Issues under development
Functionality in Place (i.e. "dev complete")
Questions
Thanks!