Inspiration

Cloud storage is fragmented. Every provider offers free space, but it is isolated and capped. We wanted to build a unified backend that treats multiple cloud providers as one distributed system. Instead of relying on a single vendor, Unibase distributes trust, storage, and reliability across many.

What it does

Unibase is a distributed chunk storage API. It accepts large file uploads, splits them into deterministic chunks, distributes those chunks across multiple cloud providers such as Google Drive, Box, and Dropbox, and stores the metadata required to reliably reassemble the file.

From the user’s perspective, it behaves like one seamless storage system backed by many providers.

How we built it

We built a FastAPI service with a deterministic upload pipeline:

  1. Receive file via POST /upload
  2. Split into fixed-size chunks
  3. Hash each chunk using SHA-256
  4. Hash the full file for integrity verification
  5. Route chunks across providers using a balancing rule
  6. Store metadata for reconstruction
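The splitting and hashing steps of the pipeline can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation; the chunk size and function names are hypothetical, and the real service streams uploads rather than holding whole files in memory.

```python
import hashlib
from math import ceil

CHUNK_SIZE = 8 * 1024 * 1024  # hypothetical 8 MiB default


def split_and_hash(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split a file into fixed-size chunks and hash each one with SHA-256."""
    chunks = []
    for index in range(ceil(len(data) / chunk_size)):
        chunk = data[index * chunk_size : (index + 1) * chunk_size]
        chunks.append({
            "index": index,                                  # order for reassembly
            "sha256": hashlib.sha256(chunk).hexdigest(),     # per-chunk integrity
            "data": chunk,
        })
    # Whole-file hash, used to verify the reassembled file end to end
    file_hash = hashlib.sha256(data).hexdigest()
    return chunks, file_hash
```

Because the split points depend only on the chunk size, the same input always produces the same chunks and hashes, which is what makes the pipeline deterministic.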

At a high level:

N = ceil(S / C)

  • S = file size in bytes
  • C = chunk size in bytes
  • N = total number of chunks
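In code, the ceiling division is usually done in integer arithmetic to avoid floating-point rounding on very large files (the function name here is illustrative):

```python
def chunk_count(size_bytes: int, chunk_size: int) -> int:
    # Integer equivalent of N = ceil(S / C)
    return (size_bytes + chunk_size - 1) // chunk_size


chunk_count(10 * 1024 * 1024, 4 * 1024 * 1024)  # → 3 chunks for a 10 MiB file
```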

Each chunk stores:

  • Index
  • Provider name
  • Provider file ID
  • Chunk hash

This metadata enables ordered reconstruction, integrity validation, and safe deletion.
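A sketch of what such a metadata record might look like, together with one plausible balancing rule. The source does not specify the exact routing rule, so the round-robin choice below is an assumption, as are the field and provider names:

```python
from dataclasses import dataclass


@dataclass
class ChunkRecord:
    index: int             # position within the original file
    provider: str          # e.g. "google_drive", "box", "dropbox" (hypothetical names)
    provider_file_id: str  # ID returned by the provider's upload API
    sha256: str            # per-chunk hash, checked on retrieval


def route(index: int, providers: list[str]) -> str:
    # Round-robin balancing: chunk i goes to provider i mod P.
    # Deterministic, so reconstruction can recompute the same mapping.
    return providers[index % len(providers)]
```

Because routing is a pure function of the chunk index, the metadata store never needs to record why a chunk landed where it did, only where it is.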

Challenges we ran into

  • Cross-provider API inconsistencies in upload formats and authentication
  • Filename and path normalization across different cloud constraints
  • Handling partial failures while preserving metadata consistency
  • Designing clean error propagation from external SDKs through our API
  • Ensuring chunk-level integrity before marking uploads as complete

Accomplishments that we're proud of

  • A working distributed upload backend across multiple cloud providers
  • Deterministic chunk routing with reproducible reconstruction
  • End-to-end integrity validation using cryptographic hashing
  • A clean API surface that abstracts away provider complexity
  • Reliable metadata tracking for retrieval and lifecycle operations

What we learned

  • Distributed systems complexity lives in edge cases, not the happy path
  • Metadata design is as important as storage design
  • Hashing is essential when distributing file fragments
  • Naming conventions and state transitions directly affect reliability
  • Infrastructure design matters as much as application logic

What's next for Unibase

  • Parallel chunk uploads for improved throughput
  • Redundancy strategies such as replication or erasure coding
  • Smart provider routing based on quota availability
  • End-to-end encryption before chunk distribution
  • A public developer API with authentication and rate limiting
  • A frontend dashboard to visualize chunk distribution and storage usage
