How to Store Files on the Blockchain Using IPFS

Learn how data such as images and documents is stored in decentralized storage

Ayodele Samuel Adebayo
·Sep 28, 2022·

6 min read

In this series, we're going to discuss the on-chain and off-chain methods of storing files on the blockchain.

And later on, we'll learn how to store files such as images, PDFs, or any other digital asset off-chain using the InterPlanetary File System (IPFS) network powered by Moralis and Infura IPFS.

By the end of this tutorial series, you'll be able to store and access files on the IPFS network.

Blockchain as a Database

Blockchain is an immutable database that is capable of storing data

As a database, a blockchain is an immutable digital ledger of transactions that is distributed across a network of computers. It enables us to store data, such as NFT metadata (including files), and retrieve it in the same way as any other database.

Storing Files on the Blockchain (On-Chain)

On-chain refers to verified activities or transactions that take place directly on the blockchain. Uploading files directly to a blockchain is also an on-chain activity in this case.

Should we, however, store files directly on-chain? Is it a good idea to keep files on-chain? And how much does blockchain storage cost?

Cost of Storing Files on the Blockchain (On-Chain)

What is the cost of storing files on the blockchain? Storing large files on the blockchain can be very expensive: a gigabyte of storage costs around $100 on the blockchain

Storing large files on the blockchain can be very expensive. According to page 9 of IBM's "Storage Needs for Blockchain Technology - Point of View" document, a gigabyte of storage costs around $100 on the blockchain, which is 500 times more expensive than traditional storage.
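To make these figures concrete, here is a minimal sketch of the arithmetic. It assumes the two numbers quoted above ($100 per gigabyte on-chain and a 500x ratio) and treats them as rough estimates; the function names and the 5 MB file size are hypothetical and only for illustration.

```python
# Figures quoted in the article (attributed to IBM); rough estimates only.
ONCHAIN_USD_PER_GB = 100.0
ONCHAIN_TO_TRADITIONAL_RATIO = 500.0


def onchain_cost_usd(size_gb):
    """Estimated cost of storing `size_gb` gigabytes directly on-chain."""
    return size_gb * ONCHAIN_USD_PER_GB


def traditional_cost_usd(size_gb):
    """Cost implied for traditional storage by the quoted 500x ratio."""
    return onchain_cost_usd(size_gb) / ONCHAIN_TO_TRADITIONAL_RATIO


if __name__ == "__main__":
    size_gb = 0.005  # hypothetical 5 MB image, taking 1 GB = 1000 MB
    print(f"on-chain:    ${onchain_cost_usd(size_gb):.4f}")
    print(f"traditional: ${traditional_cost_usd(size_gb):.4f}")
```

Under these assumptions, the quoted ratio implies roughly $0.20 per gigabyte for traditional storage.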

Performance of Storing Files on the Blockchain (On-Chain)

Storing a large number of files on a blockchain can increase the access latency, which makes the blockchain system perform slowly

Apart from being expensive, storing a large number of files on a blockchain can increase the access latency of the files (increase the time it takes to upload/download the files from the blockchain).

File storage requires low latency for fast access. However, when latency increases because of large files, it can slow down the blockchain system and make maintenance very difficult.

For these reasons, it is not advisable to store non-transaction data such as files, contracts, documents, PDFs, and personal information directly on the blockchain. Consider storing them off-chain instead.

Storing Files Outside the Blockchain (Off-Chain)

The term "off-chain" refers to activities or transactions that take place outside of the blockchain. In this context, an off-chain asset is a file that is not directly uploaded on the blockchain.

Since it is not advisable to store non-transaction data on-chain, the file is uploaded to another server or database (IPFS, MongoDB, Oracle, etc.), and the hash ID generated for the uploaded document is stored on the blockchain as metadata.
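The pattern described here can be sketched in a few lines. This is only an illustration under stated assumptions: a plain dictionary stands in for the off-chain store, a list stands in for the blockchain, and a SHA-256 hex digest stands in for the hash ID. None of these names come from a real blockchain or IPFS library.

```python
import hashlib

# Hypothetical stand-ins: a dict for the off-chain store (IPFS, a database, ...)
# and a list of records for the blockchain.
off_chain_store = {}
on_chain_ledger = []


def store_off_chain(file_bytes, owner):
    """Save the file off-chain and record only its hash on-chain."""
    file_hash = hashlib.sha256(file_bytes).hexdigest()
    off_chain_store[file_hash] = file_bytes
    on_chain_ledger.append({"owner": owner, "file_hash": file_hash})
    return file_hash


def fetch_and_verify(file_hash):
    """Load the file from off-chain storage and check it against the hash."""
    file_bytes = off_chain_store[file_hash]
    if hashlib.sha256(file_bytes).hexdigest() != file_hash:
        raise ValueError("off-chain file does not match the on-chain hash")
    return file_bytes
```

Because only the short hash goes on-chain, the expensive storage holds a few bytes while the file itself lives where storage is cheap, and anyone can still check that the file was not swapped.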

What is IPFS?

Web3 Database 🤯

InterPlanetary File System (IPFS) is a decentralized storage system based on content addressing

The InterPlanetary File System (IPFS) is a decentralized storage system. It is a protocol and peer-to-peer (p2p) network for storing, accessing, and sharing data in a distributed file system. Like a blockchain, it is spread across many nodes rather than hosted on a single server.

IPFS is based on content identifiers (CIDs), an approach known as Content-Based Addressing, which is designed to be faster than the traditional Location-Based Addressing method of saving files.

What is Location-Based Addressing?

HTTP: "Where" is the content you want 🤷‍♂️

Location-Based Addressing is the traditional method of accessing content such as photos, music, and files on the internet and it requires specifying where the content is hosted by providing an absolute path or address to the content as shown below:

Example of accessing content on the web using Location-based Addressing

The limitation of Location-Based Addressing is that it relies on a centralized server, so the content becomes unavailable if the server that hosts it goes down.

Centralized platforms such as Twitter and Facebook each control how their users' saved content is delivered (i.e. through their own URLs), so you can only access a Twitter profile image via a Twitter URL and not via a Facebook URL.
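The following sketch illustrates this limitation. It is a toy model, not real networking code: the host name, path, and the `servers` and `servers_online` dictionaries are all hypothetical, and the lookup only mimics what an HTTP request does with the host and path parts of a URL.

```python
from urllib.parse import urlsplit

# Hypothetical servers: host -> {path -> content}
servers = {
    "images.example.com": {"/profile/ada.png": b"<png bytes>"},
}
servers_online = {"images.example.com": True}


def fetch_by_location(url):
    """Resolve content by WHERE it lives: the host first, then the path."""
    parts = urlsplit(url)
    if not servers_online.get(parts.netloc, False):
        raise ConnectionError(f"{parts.netloc} is unreachable")
    return servers[parts.netloc][parts.path]
```

If `servers_online["images.example.com"]` is set to `False`, the same URL raises an error, even if an identical copy of the image exists somewhere else.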

What is Content-Based Addressing?

IPFS: "What" content do you want 🤔

In Content-Based Addressing, or Content-Addressed Storage (CAS), every piece of uploaded content has its own unique identifier known as a hash, which can be compared to a fingerprint. In practice, no two different pieces of content share the same hash.

Content such as photos, music, and files is accessible through its unique IPFS hash rather than its location.

Example of accessing content with their IPFS hash on the web using Content-Based Addressing

Any content uploaded to IPFS is accessible via any supported public IPFS gateway. For instance, when a file is uploaded via Moralis IPFS, we can then use the Infura IPFS URL to access the same file.
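Here is a minimal sketch of content addressing, under a few loudly stated assumptions. A SHA-256 hex digest stands in for the IPFS hash (real CIDs use a different, multihash-based encoding), a dictionary stands in for the network, and `https://ipfs.io` is used as an example public gateway. The function names are hypothetical, and a URL built from a hex digest will not resolve on a real gateway.

```python
import hashlib

GATEWAY = "https://ipfs.io"  # example public gateway; any gateway could be used

content_store = {}  # hypothetical stand-in for the IPFS network


def add(content):
    """Store content under an identifier derived from the content itself."""
    content_id = hashlib.sha256(content).hexdigest()  # stand-in for a real CID
    content_store[content_id] = content  # identical content reuses the same key
    return content_id


def get(content_id):
    """Fetch content by WHAT it is, not by where it is hosted."""
    return content_store[content_id]


def gateway_url(content_id, gateway=GATEWAY):
    """Build a path-style gateway URL of the form <gateway>/ipfs/<id>."""
    return f"{gateway.rstrip('/')}/ipfs/{content_id}"
```

Adding the same bytes twice returns the same identifier and stores only one copy, and the same identifier can be appended to any gateway's address.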

Uploaded content is stored in an IPFS object, which is a data structure with two fields: Data, which can hold up to 256 KB of blob data, and Links, which is an array of links to other IPFS objects.

/* IPFS Object Example */

{
   Data: "",
   Links: [
      {
         Name: "",
         Hash: "",
         Size: 256000
      }
   ]
}

When content is larger than 256 KB, IPFS automatically breaks it down into multiple objects and creates an empty IPFS object that links all of their hashes together.
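The splitting step can be sketched as follows. This is a simplified model rather than the real IPFS implementation: it assumes 256 KB means 262,144 bytes, uses SHA-256 hex digests in place of real IPFS hashes, and derives the root object's hash from its links in a made-up way. The function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # assumption: 256 KB taken as 262,144 bytes


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


def build_objects(content, chunk_size=CHUNK_SIZE):
    """Return (root_hash, objects), where objects maps hash -> IPFS-like object."""
    objects = {}
    if len(content) <= chunk_size:
        # Small content fits in a single object with no links.
        root_hash = sha256_hex(content)
        objects[root_hash] = {"Data": content, "Links": []}
        return root_hash, objects

    links = []
    for offset in range(0, len(content), chunk_size):
        chunk = content[offset:offset + chunk_size]
        chunk_hash = sha256_hex(chunk)
        objects[chunk_hash] = {"Data": chunk, "Links": []}
        links.append({"Name": "", "Hash": chunk_hash, "Size": len(chunk)})

    # The root object carries no data of its own; it only links the chunks.
    # Hashing the joined link hashes is an illustrative shortcut.
    root_hash = sha256_hex("".join(link["Hash"] for link in links).encode())
    objects[root_hash] = {"Data": b"", "Links": links}
    return root_hash, objects


def read_content(root_hash, objects):
    """Reassemble the original bytes by following the root object's links."""
    root = objects[root_hash]
    if not root["Links"]:
        return root["Data"]
    return b"".join(objects[link["Hash"]]["Data"] for link in root["Links"])
```

With these assumptions, a 640,000-byte file becomes three chunk objects plus one empty root object, and reading it back only requires the root hash.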

As on the blockchain, content stored on IPFS cannot be changed. Instead, IPFS supports versioning of your content through commit objects, each of which links to the objects of the previous version.

IPFS was originally created by Juan Benet, the founder of Protocol Labs, and launched in February 2015.

Where Is IPFS Data Stored?

Cached Folders 🗂

Data stored on IPFS is saved locally in a cache folder on the uploader's computer and served to others who request it via an IPFS gateway. The data is then also cached on the requesting user's computer.

Cached data is discarded during garbage collection, which can lead to data loss. To keep your data on IPFS permanently, you need to pin it on an IPFS node, for example through a pinning service.
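The relationship between caching, pinning, and garbage collection can be modeled with a small class. This is a toy simulation with hypothetical names, not the API of a real IPFS node; it only shows why unpinned data can disappear.

```python
class Node:
    """Toy model of an IPFS node's local storage (hypothetical, not a real API)."""

    def __init__(self):
        self.blocks = {}     # content ID -> cached bytes
        self.pinned = set()  # content IDs that must survive garbage collection

    def cache(self, content_id, data):
        """Keep a local copy, as happens when content is added or fetched."""
        self.blocks[content_id] = data

    def pin(self, content_id):
        """Mark cached content as permanent on this node."""
        if content_id not in self.blocks:
            raise KeyError(f"{content_id} is not stored on this node")
        self.pinned.add(content_id)

    def collect_garbage(self):
        """Drop every cached block that is not pinned; return the removed IDs."""
        removed = [cid for cid in self.blocks if cid not in self.pinned]
        for cid in removed:
            del self.blocks[cid]
        return removed
```

In this model, content that was cached but never pinned is gone after `collect_garbage()`, while pinned content stays available on the node.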

What Are the Advantages of IPFS?

  1. Fast record retrieval.

  2. It works with every technology.

  3. Identical content is stored only once (no duplicates).

  4. Uploaded content can be verified against its unique identifier.

  5. Uploaded content cannot be altered.

Difference Between IPFS and HTTP Storage

| IPFS (Content-Based Addressing) | HTTP (Location-Based Addressing) |
| --- | --- |
| It is based on a decentralized server | It is based on a centralized origin server |
| It makes use of content-addressing | It makes use of location-addressing |
| All objects with the same content are stored only once | All objects with the same content can be stored multiple times |
| Files are shared across multiple nodes and are always accessible | Files are not accessible when the server is down |
| It has a high market | It has a low market |
| Immutable (Versioning) | Mutable |

Wrapping Up

InterPlanetary File System (IPFS) is a reliable and decentralized storage system. It is also widely regarded as the future of file storage.

In this article, we learned about the differences between HTTP and IPFS storage, as well as the on-chain and off-chain (IPFS) methods of storing files on the blockchain.

Where Do You Go Next?

Now that you know how files are stored on the blockchain and how IPFS works:

  • Learn How to Store Files on the Blockchain (IPFS) With Infura 🗳 (Coming soon)

  • Learn How to Store Files on IPFS With Moralis React SDK ⛑ (Coming soon)


This article is a part of the Hashnode Web3 blog, where a team of curated writers are bringing out new resources to help you discover the universe of web3. Check us out for more on NFTs, DAOs, blockchains, and the decentralized future.

 