Existing file corruption, change, and/or deletion issues

1. What happens if a user changes a file directly in their cloud storage? What happens if a file is corrupted (unintentional change)?
2. What happens if a user deletes a file?
3. What happens if a user changes the file name?
4. What happens if the owner of a file revokes their creds and the file thus becomes inaccessible?
5. What about corruption in network transmission? I.e., the checksum changes from that on the server to that computed by the client.

See 
https://github.com/crspybits/SyncServerII/issues/75
https://github.com/crspybits/SyncServerII/issues/63

Currently SyncServerII and the iOS client are sensitive to these operations. E.g., if a file was deleted, that would cause a fatal error on the client-- one that the client/server couldn't recover from. But these events can happen. Our model or concept of this should be more along the lines of what happens when a web page cannot download a particular image file: a default icon is displayed indicating that the file could not be downloaded-- a graceful degradation of service. It seems like this is going to require work at each of the levels of the system: server, client, and SharedImages. Let's analyze each of these situations.

1. What happens if a user changes a file directly in their cloud storage? What happens if a file is corrupted (unintentional change)?

First, note that this doesn't have to be a fatal problem in terms of the client. For example, on iOS in the Google Drive folder if you open an image to view it and happen to save that file, you may get a change in the file. This will definitely happen if you rotate an image. If that file was downloaded to the SharedImages client, there should be no problem-- aside from any current checking of the file size. That is, the content of the file should be fine in terms of SharedImages.

Of course, the file content change doesn't need to be quite so benign. A JSON file could have keys removed. An image file could be replaced with a text file. And so on. And it doesn't really matter if this change by the user is intentional.

Some things to note here:
a) This will result in a change relative to the current fileSizeBytes attribute-- which is only tracking the size of the file as last/originally uploaded. This will currently cause a download failure in the server-- this is a show stopper right now. Changing to using a checksum can alleviate this.
b) Once we shift to using a checksum, the emphasis is going to shift to validation of the file contents by the client. For example, if the client successfully reads the file as an image file, the file is valid. If that image file load fails, the file is invalid. Thus, we need work to resolve this in the end-client app (SharedImages in this case). And some documentation of the fact that the file contents need end-client validation.
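A minimal sketch of the kind of end-client validation described in b), written in Python only for brevity. The function names and the `required_keys` parameter are hypothetical and are not part of SharedImages. The byte prefixes are the standard PNG and JPEG file signatures. A real client would more likely attempt a full image decode rather than a signature check.

```python
import json

# Standard file signatures (magic bytes) for PNG and JPEG.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_like_image(data: bytes) -> bool:
    """Cheap check: does the content start with a known image signature?"""
    return data.startswith(PNG_SIGNATURE) or data.startswith(JPEG_SIGNATURE)


def is_valid_json_object(data: bytes, required_keys=()) -> bool:
    """True if the content parses as a JSON object containing every required key."""
    try:
        parsed = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        # Not UTF-8 text, or not parseable as JSON.
        return False
    return isinstance(parsed, dict) and all(key in parsed for key in required_keys)
```

With checks like these, an image file that was replaced by a text file, or a JSON file with keys removed, is reported as invalid to the app instead of causing a fatal error.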

2. What happens if a user deletes a file?

This needs to be handled. Specifically, we need to evaluate what the current cloud storage APIs do when you try to download a file that's not there, and provide a reliable and detectable response back to the client on this basis. This response needs to be standard for the CloudStorage protocol.

3. What happens if a user changes the file name?

This is really the same as 2) above, assuming that the new file name differs from the names of the other files. From the server's point of view, it's a file deletion.

4. What happens if the owner of a file revokes their creds and the file thus becomes inaccessible?

I need to experiment with this. I'm not 100% sure where this is going to be detectable. It might depend on the particular cloud storage API. E.g., for Google Drive, it may be detectable when you attempt to refresh creds. My goal here is, as with 2), to provide a standard response via the CloudStorage protocol.
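One possible shape for such a standard response, covering both the missing-file case in 2) and the revoked-creds case here, is sketched below in Python for brevity. All names (`DownloadStatus`, `DownloadResult`, `result_from_http_status`) are hypothetical, and the mapping from HTTP status codes is an assumption. Each cloud storage API would need to be checked to see what it actually returns in these situations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DownloadStatus(Enum):
    SUCCESS = "success"
    FILE_NOT_FOUND = "fileNotFound"    # file deleted or renamed in cloud storage
    ACCESS_REVOKED = "accessRevoked"   # owner's creds no longer work
    OTHER_FAILURE = "otherFailure"


@dataclass
class DownloadResult:
    status: DownloadStatus
    data: Optional[bytes] = None
    checksum: Optional[str] = None


def result_from_http_status(http_status: int,
                            data: Optional[bytes] = None,
                            checksum: Optional[str] = None) -> DownloadResult:
    """Assumed mapping from a storage API's HTTP status to a standard result."""
    if 200 <= http_status < 300:
        return DownloadResult(DownloadStatus.SUCCESS, data, checksum)
    if http_status == 404:
        return DownloadResult(DownloadStatus.FILE_NOT_FOUND)
    if http_status in (401, 403):
        return DownloadResult(DownloadStatus.ACCESS_REVOKED)
    return DownloadResult(DownloadStatus.OTHER_FAILURE)
```

A client receiving `FILE_NOT_FOUND` or `ACCESS_REVOKED` could then show a placeholder for that one file and carry on, rather than treating the download as a fatal error.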

5. What about corruption in network transmission? I.e., the checksum changes from that on the server to that computed by the client.

This is effectively the original reason to use a checksum. I'm going to implement a checksum. For Dropbox, see https://www.dropbox.com/developers/reference/content-hash
For Google Drive, see
https://stackoverflow.com/questions/23462168/google-drive-md5-checksum-for-files
https://developers.google.com/drive/api/v3/reference/files
"The MD5 checksum for the content of the file. This is only applicable to files with binary content in Drive."

I thought initially that Google Drive was distinguishing between binary and text files. However, it looks like it's broader than that. E.g., folders/directories don't have checksums. The text file and (PNG) image file I've tried *do* have checksums.
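For reference, here is a sketch of how a client could compute both kinds of checksum locally and compare them to what the server reports. It is in Python for brevity and the function names are hypothetical. The Dropbox algorithm follows the content-hash page linked above: split the file into 4 MB blocks, SHA-256 each block, concatenate the binary digests, and SHA-256 the result. The Google Drive value is assumed to be a plain hex MD5 of the file content, per the quoted description.

```python
import hashlib

DROPBOX_BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB blocks, per the Dropbox content-hash docs


def dropbox_content_hash(data: bytes) -> str:
    """SHA-256 over the concatenated SHA-256 digests of each 4 MB block."""
    block_digests = b"".join(
        hashlib.sha256(data[start:start + DROPBOX_BLOCK_SIZE]).digest()
        for start in range(0, len(data), DROPBOX_BLOCK_SIZE)
    )
    return hashlib.sha256(block_digests).hexdigest()


def google_drive_md5(data: bytes) -> str:
    """Hex MD5 of the file content (assumed to match Drive's md5Checksum)."""
    return hashlib.md5(data).hexdigest()


def matches_server_checksum(data: bytes, server_checksum: str, storage: str) -> bool:
    """Compare a locally computed checksum with the one the server reported.

    `storage` is a hypothetical tag: "dropbox" or "google".
    """
    if storage == "dropbox":
        local = dropbox_content_hash(data)
    elif storage == "google":
        local = google_drive_md5(data)
    else:
        raise ValueError(f"unknown storage type: {storage}")
    return local == server_checksum.lower()
```

A mismatch after download would indicate the transmission corruption described in 5), and the client could retry the download.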

