I feel like part of this is solved with #1703
But basically we should also do an initial dedup check on the initial chunk of the previous version of the file, and if it matches a shard, load it into the cache.
Maybe an API that takes the previous file and an edit: `{offset: ..., length: ..., replacement: ...}` and does everything smartly (for now we can just optimize the case where the beginning of the file is edited).
This would allow GGUF metadata editing cc @mishig25
Note that this will still require downloading the whole file into memory (at least in streaming), even if only a little data is uploaded.
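To make the proposed edit shape concrete, here's a minimal sketch of what applying such an edit to the previous file contents could look like. All names (`FileEdit`, `applyEdit`) are illustrative, not a final API; a real implementation would stream the file and dedup unchanged chunks against existing shards rather than materializing everything as shown here.

```typescript
// Hypothetical edit descriptor matching the {offset, length, replacement} idea above.
interface FileEdit {
	offset: number; // byte offset in the previous file where the edit starts
	length: number; // number of bytes of the previous file being replaced
	replacement: Uint8Array; // bytes to insert in their place
}

// Naive reference implementation: build the new file contents from the
// previous contents and one edit. Everything before `offset` and after
// `offset + length` is unchanged, so those ranges are the dedup candidates.
function applyEdit(previous: Uint8Array, edit: FileEdit): Uint8Array {
	const result = new Uint8Array(previous.length - edit.length + edit.replacement.length);
	result.set(previous.subarray(0, edit.offset), 0);
	result.set(edit.replacement, edit.offset);
	result.set(previous.subarray(edit.offset + edit.length), edit.offset + edit.replacement.length);
	return result;
}
```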
#1703 should be done first