Under the hood, SkyBin is built around three components.
Renter nodes store files with other users in SkyBin. They perform operations
like encrypting and uploading files, determining where to store chunks of data,
and downloading and reconstructing files from data chunks. This is the component
that the typical user interacts with to perform basic operations in SkyBin, and
it runs on the user's computer as part of the SkyBin application.
Provider nodes provide storage to the network, negotiating storage agreements
with renters, storing file blocks, and serving other users' content. Any user
with spare disk space can run a provider node to rent out their space for cash.
Finally, the SkyBin Metaserver stores file and user metadata,
manages payments, and serves as a meeting place for renters and providers.
This is a global service operated by SkyBin.

Storage in SkyBin revolves around agreements called storage contracts. When a
renter wants to reserve storage space with a provider, they do so through a
storage contract, which sets terms such as how much space the provider will
supply, the duration of the agreement, and the total fee the renter must pay
for the space. These agreements are negotiated behind the scenes by the SkyBin
system; users don't have to think about them.
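
To make the terms of a storage contract concrete, here is a minimal sketch of how one might be modeled. The class name, field names, and helper methods are all hypothetical and chosen for illustration; they are not SkyBin's actual data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class StorageContract:
    """Hypothetical record of an agreement between one renter and one provider."""
    renter_id: str
    provider_id: str
    storage_bytes: int      # how much space the provider supplies
    start: datetime         # when the agreement begins
    duration: timedelta     # how long the agreement lasts
    total_fee_cents: int    # total fee the renter pays for the space

    @property
    def end(self) -> datetime:
        return self.start + self.duration

    def is_active(self, now: datetime) -> bool:
        """True while the agreement is in force."""
        return self.start <= now < self.end

    def can_store(self, used_bytes: int, block_size: int) -> bool:
        """True if one more block of block_size bytes fits in the reserved space."""
        return used_bytes + block_size <= self.storage_bytes
```

A renter could consult `is_active` and `can_store` before placing a block with a provider, which is the kind of bookkeeping the system handles on the user's behalf.
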

Files are stored in SkyBin as encrypted chunks after being preprocessed by the renter
who uploads them. Preprocessing consists of compressing the file, encrypting it with a
symmetric cipher (AES), breaking it into fixed-size data blocks, and then computing parity
blocks from the data blocks using an erasure coding scheme. Both data and parity blocks
are then stored on the network with providers with which the renter already holds storage
contracts. Metadata about the file, including the location of the blocks, is stored with
the SkyBin Metaserver.
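
The preprocessing steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not SkyBin's implementation: the function names and block size are hypothetical, the encryption step is a caller-supplied function (Python's standard library has no AES, so a real renter would plug in an AES routine from a cryptography library), and a single XOR parity block, the simplest erasure code, stands in for SkyBin's actual scheme.

```python
import zlib
from functools import reduce
from typing import Callable, List, Tuple

BLOCK_SIZE = 16  # hypothetical; kept tiny so the example is easy to inspect


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split data into fixed-size blocks, zero-padding the final block."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    if blocks and len(blocks[-1]) < block_size:
        blocks[-1] = blocks[-1].ljust(block_size, b"\x00")
    return blocks


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def preprocess(plaintext: bytes,
               encrypt: Callable[[bytes], bytes],
               block_size: int = BLOCK_SIZE) -> Tuple[List[bytes], bytes, int]:
    """Compress, encrypt, split into blocks, and compute one parity block."""
    compressed = zlib.compress(plaintext)
    ciphertext = encrypt(compressed)  # assumption: caller supplies AES here
    data_blocks = split_into_blocks(ciphertext, block_size)
    parity = xor_blocks(data_blocks)
    # The unpadded length is metadata needed to strip padding on download.
    return data_blocks, parity, len(ciphertext)


def recover_block(surviving_blocks: List[bytes], parity: bytes) -> bytes:
    """Rebuild the one missing data block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [parity])


def restore(data_blocks: List[bytes], length: int,
            decrypt: Callable[[bytes], bytes]) -> bytes:
    """Reverse preprocess: join blocks, strip padding, decrypt, decompress."""
    ciphertext = b"".join(data_blocks)[:length]
    return zlib.decompress(decrypt(ciphertext))
```

XOR parity tolerates the loss of only one block per group; it is used here purely because it fits in a few lines. The block list, parity, and unpadded length correspond to what a renter would upload to providers and record as metadata.
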

The erasure-coding scheme used to create parity blocks for the file is based on
Reed-Solomon codes. This is an error correction technique that allows SkyBin to
recover file content despite partial data loss. If a data block is
unrecoverable or corrupted, a combination of the remaining data and parity
blocks can be used to reconstruct it. The approach is more space-efficient than
the more obvious recovery technique of storing multiple copies of files because
any single parity block can stand in for any single lost data block, so a file
with k data blocks and m parity blocks survives the loss of any m blocks. For
more information about this interesting technique, the Backblaze blog has a
useful primer: