Beginner's Guide to IPFS

Update - 2024

This article is about an older implementation of IPFS called js-ipfs, which is now deprecated and has been replaced by Helia. The links in this article may no longer point to the same pages they did when it was written.

Overview

First, a disclaimer: I am still learning about IPFS and any feedback/tips are more than welcome. This article may be overly simplified (without jargon) so take it with a grain of salt and correct me if needed.

I wrote this article because IPFS and its packages have gone through a lot of changes in the last year, and the resources for understanding and implementing them are all over the place. This picture perfectly encapsulates my reaction the whole time I was trying to understand the IPFS docs.

This article hopes to simplify most of the information that's currently floating around about IPFS and make it easier for you to get started with the latest version of IPFS.


What IPFS is and isn't

It stands for InterPlanetary File System and that's what it is: a file system. It's a system to store and share files. It is a network of nodes (computers) connected in a peer-to-peer structure. There is no central authority like a server. This results in a decentralized network that is used to store and share data.

It is not a storage service provider, a database, or a cloud service provider. In its essence, IPFS is just a set of protocols (rules) that make up an algorithm. When implemented, it creates a decentralized file system. It can be used with databases and cloud services but it's not one by itself.


Why we care

For the most part, it aims to solve a lot of problems with the current web (web2) and it does a great job at it. You can read in detail about these problems here.

Probably the biggest solution it provides is content-based addressing. Our current web uses location-based addressing. This means that when you want some data (say a web page), you have to tell the internet its location using a URL. You are basically telling the internet where the web page is stored and asking for it.

In content-based addressing, instead of asking for data using its location, we ask for the data using some unique identifier that we have for that data. You can think of this identifier as a short name we just made up to refer to the data, and it directly depends on what the data is. If the data changes, its name changes.
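To make the idea concrete, here is a minimal sketch of a toy content-addressed store in Node.js. It is not how IPFS is implemented: the `put` and `get` helpers are made up for illustration, and a plain hex digest stands in for a real identifier.

```javascript
const { createHash } = require('node:crypto');

// A toy content-addressed store: the key is derived from the data itself.
// Real IPFS identifiers carry extra encoding; a hex digest stands in for them here.
const store = new Map();

function put(data) {
  const id = createHash('sha256').update(data).digest('hex');
  store.set(id, data);
  return id; // the "name" we hand out is computed from the content
}

function get(id) {
  return store.get(id);
}

const idA = put('hello world');
const idB = put('hello world!'); // one extra character

console.log(idA === idB);                // false: different content, different name
console.log(put('hello world') === idA); // true: same content, same name
console.log(get(idA));                   // hello world
```

Notice that nobody chooses where the data lives. Whoever holds the name can ask any store for it and re-hash the result to check they got the right thing.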


How IPFS works

The most important role of IPFS is to create these names for the data it gets. Without these names, we would have no way of accessing the data. From now on we will call these names CIDs (content identifiers). By default, IPFS uses a hashing algorithm called SHA256 to generate them.

SHA256 is an extremely useful algorithm that's applied in many fields. But here we care most about the following points:

  • It takes some data and generates a pseudo-random string. Pseudo-random means something that looks like complete gibberish but it's actually not.

  • These strings are called Hashes. Every piece of data is associated with a single unique hash. There should be no other data associated with the same hash. We need these hashes to be unique for every piece of data.

  • We also need these hashes to be of the same length regardless of the data size. Even if the data ranges from a simple text file to a whole 3-hour long movie, the length of the hashes must be the same in every case.

Sounds cool I know. How does SHA256 do this? I'm glad you asked. I have no clue. For all I know it could be a magical box given to devs by a wizard who rides unicorns. So IPFS takes these Hashes and uses them as CIDs, to identify and address data.
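You don't need to know how the magic box works to poke at it. The sketch below uses Node's built-in `crypto` module to check the points above. The `sha256Hex` helper is just a name I picked, and the output is a raw hex hash, not a full CID.

```javascript
const { createHash } = require('node:crypto');

function sha256Hex(data) {
  return createHash('sha256').update(data).digest('hex');
}

const small = sha256Hex('hi');
const large = sha256Hex('x'.repeat(5_000_000)); // roughly 5 MB of text

console.log(small.length, large.length); // 64 64 (256 bits as hex characters)
console.log(sha256Hex('hi') === small);  // true: same input, same hash
console.log(sha256Hex('hi!') === small); // false: tiny change, different hash
```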


How to use IPFS for storing your data

Protocol Labs has created desktop applications for using IPFS. You can find an installation guide for your operating system in their docs here. The resources for using this app are widely available and it's outside the scope of this article.

On the same page, you might notice a JS implementation of IPFS called js-ipfs. That's what the rest of this article is about.


What is an IPFS API/Gateway

Picture a decentralized network of nodes. The way for us to connect to this network is by using HTTP gateways. These gateways are made up of a protocol, a hostname and a port. If the gateway looks like http://localhost:5001 then the protocol is http, the hostname is localhost and the port is 5001. These gateways can be used as APIs to access the network and write/read data.
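If you want to see those three parts for yourself, the built-in `URL` class in Node.js (and browsers) will split an address for you. This is only a small illustration using the example address from this section.

```javascript
const gateway = new URL('http://localhost:5001');

console.log(gateway.protocol); // "http:"
console.log(gateway.hostname); // "localhost"
console.log(gateway.port);     // "5001"
```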

There are platforms like Infura and Quicknode which provide IPFS gateways. Sometimes they’re free (with restrictions) and sometimes they’re paid. The Infura IPFS public gateway was deprecated on 10th August 2022. That’s the reason you will find many other tutorials/resources using a gateway that looks like ipfs.infura.io but it doesn’t work the same way anymore (if at all). Learn more about it here.

I guess this is a good time to inform you of this warning if you’re thinking of making your own public gateway and exposing it.

To use these platforms and their networks, we may need to include their secret keys in the headers of our requests kinda like this.
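Here is a rough sketch of what that can look like. It assumes the provider uses HTTP Basic auth with a project ID and secret, which is how Infura's IPFS API worked when I tried it. Your provider may expect something different, so check their docs. The credentials below are placeholders.

```javascript
// Hypothetical credentials: real ones come from your provider's dashboard.
const projectId = 'my-project-id';
const projectSecret = 'my-project-secret';

// HTTP Basic auth: base64 of "id:secret", sent in the Authorization header.
const authorization =
  'Basic ' + Buffer.from(`${projectId}:${projectSecret}`).toString('base64');

// Options you would pass along with each request to the provider.
const requestOptions = {
  method: 'POST',
  headers: { Authorization: authorization },
};

console.log(requestOptions.headers.Authorization.startsWith('Basic ')); // true
```

Keep real secrets out of front-end code and out of your repository. Loading them from environment variables is the usual approach.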

So these HTTP gateways come in different types/formats and can look different. These gateways can also have different levels of security. You can explore everything about these gateways on this page but beginners don’t need to understand all of this.

You can also view the list of all public gateways available for you to use on this page. It is provided by Protocol Labs.

I should mention there is also a way to use the Kubo RPC API (What is an RPC API?) for building JS apps using the Go implementation but I’m not gonna go into the details of it here. We will learn to set it up with Docker further down in this article. However, I am providing detailed resources about this for interested readers:


Javascript packages for IPFS

There is a library called js-ipfs (sometimes used through ipfs-http-client) but now it’s deprecated and was very recently replaced by Helia. Most of the resources/tutorials that you find on the web for IPFS will likely be using js-ipfs, which is why all this information looks so confusing to complete beginners.

Getting started with Helia is simple and you don’t need much setup or configuration to see your app running. Check out the helia-examples repo and find the right starter for your stack. For React-based web apps, I recommend the helia-nextjs example.

The fundamental way to create an IPFS node using Helia is to call the createHelia() function. It returns a node that joins the public IPFS network by default. You can pass options to this function to change how the node stores data and how it connects to other peers.

Once your node has added some data and other peers can reach it, a public gateway can fetch that data, for example at https://ipfs.io/ipfs/${cid}.
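The URL pattern above is easy to build with a small helper. This is just a sketch: `gatewayUrl` is a name I made up, the CID is a placeholder, and the localhost address assumes the gateway port mapping from the Docker setup later in this article.

```javascript
function gatewayUrl(cid, gateway = 'https://ipfs.io') {
  // Strip trailing slashes so we don't end up with "//ipfs/".
  const base = gateway.replace(/\/+$/, '');
  return `${base}/ipfs/${cid}`;
}

const exampleCid = 'bafy-example-cid'; // placeholder, not a real CID

console.log(gatewayUrl(exampleCid));
console.log(gatewayUrl(exampleCid, 'http://localhost:8089/'));
```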

Uploading data to the network using Helia takes more steps than calling the simple add() function in js-ipfs. We need to create a filesystem object and an encoder object manually. Then we can share data with IPFS… by using Uint8Array iterables and asynchronous for loops 💀 (yeah I know this was supposed to be a no-jargon guide but Helia doesn’t exactly go easy on us)
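To take some of the fear out of that jargon, here is a sketch of the two language features on their own, with no Helia involved. The `readInChunks` generator is a made-up stand-in for data arriving from the network in pieces, and `readAll` is a helper name I picked.

```javascript
// Text must become bytes (a Uint8Array) before it can be stored.
const encoder = new TextEncoder();

// Stand-in for a network read: yields the data in small Uint8Array chunks.
async function* readInChunks(bytes, chunkSize = 4) {
  for (let i = 0; i < bytes.length; i += chunkSize) {
    yield bytes.subarray(i, i + chunkSize);
  }
}

async function readAll(source) {
  const decoder = new TextDecoder();
  let text = '';
  // "for await" pulls chunks one at a time as they become available.
  for await (const chunk of source) {
    // stream: true copes with characters split across two chunks.
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // flush anything still buffered
}

const bytes = encoder.encode('Hello IPFS 👋');
readAll(readInChunks(bytes)).then((text) => console.log(text));
```

The pattern is the same when the chunks come from a real node: encode text to bytes on the way in, loop over chunks and decode them on the way out.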

Given how new Helia is, the proper documentation and support for it is probably still underway but you can find almost all the things you need as a beginner to do the basic stuff. And I will most likely create another post focusing only on Helia and everything I learn about it.


Creating a local IPFS gateway using Docker

If you are interested in running a private network for your app development and don’t want to use a public gateway then follow along.

💡
You can also check out the IPFS docs for installing Kubo inside Docker if you prefer. The below steps are what worked for me.

First, you need to install Docker on your machine. Then add the following 3 files to your working directory.

docker-compose.yml

version: "3"

services:
  ipfs:
    image: ipfs/go-ipfs:v0.12.1
    environment:
      IPFS_SWARM_KEY: "/key/swarm/psk/1.0.0/\n/base16/\n${IPFS_SWARM_KEY}"
    ports:
      - "5001:5001"
      - "8089:8080"
    volumes:
      - ./local-data/ipfs:/data/ipfs

setup.sh

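#!/usr/bin/env bash
# Generate a swarm key once and store it in .env (skipped if .env already exists).
if [ ! -f ".env" ]; then
    echo "Generating IPFS swarm key..."
    # No -it here: a TTY would add carriage returns to the piped output.
    GENERATED_KEY=$(docker run --rm mattjtodd/ipfs-swarm-key-gen | tail -n 1 | tr -d '\r')

    echo "Copying environment variables"
    cp .env.example .env

    echo "Setting swarm key"
    # GNU sed syntax; on macOS use: sed -i '' "s/.../.../" .env
    sed -i "s/IPFS_SWARM_KEY=/IPFS_SWARM_KEY=$GENERATED_KEY/" .env
fi
echo "Setup complete (delete .env and run again to redo setup)"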

.env.example

# must be set by running setup.sh
IPFS_SWARM_KEY=

If you’re on Windows then you also need to install WSL2 along with Ubuntu, plus the terminal app from the Microsoft Store to connect to Ubuntu. Make sure Ubuntu is enabled in the Docker Desktop settings under “Resources”.

If you’re on Linux or Mac then just use the terminal and save those files anywhere you want. The steps for you guys will be much simpler.

For Windows users, open the Ubuntu terminal and type code . to open VS Code connected to the Ubuntu distro. There, create a temporary folder tempDir in the root and move the above 3 files into it.

In the terminal run

cd tempDir/
bash setup.sh
docker compose up

The second command will generate the swarm key and store it in a .env file.
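If you're curious what a swarm key looks like, the sketch below builds one in Node.js. I'm assuming the key is simply 32 random bytes written as hex, which matches what I saw in my .env file. The three-line layout is the one docker-compose.yml assembles for IPFS_SWARM_KEY.

```javascript
const { randomBytes } = require('node:crypto');

// Assumption: a swarm key is 32 random bytes, written as 64 hex characters.
const key = randomBytes(32).toString('hex');

// The three-line format that docker-compose.yml puts into IPFS_SWARM_KEY.
const swarmKeyFile = `/key/swarm/psk/1.0.0/\n/base16/\n${key}`;

console.log(swarmKeyFile);
```

Only nodes that share the same key can talk to each other, which is what makes the network private.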

The third command will start the environment, which can be accessed on port 5001. If the daemon doesn't run, make sure the .env file has been created and the swarm key has been set properly.
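A quick way to check that the daemon is really listening is to ask it for its version. This sketch assumes Node.js 18 or newer (for the built-in `fetch`) and that the API is reachable at localhost:5001 as mapped in docker-compose.yml. The helper names are my own.

```javascript
// Build a URL for a command on the daemon's RPC API.
function rpcUrl(command, base = 'http://localhost:5001') {
  return `${base}/api/v0/${command}`;
}

async function getDaemonVersion() {
  // The RPC API expects POST requests, even for read-only commands.
  const response = await fetch(rpcUrl('version'), { method: 'POST' });
  if (!response.ok) {
    throw new Error(`Daemon responded with HTTP ${response.status}`);
  }
  const body = await response.json();
  return body.Version;
}

// With the container running, uncomment to try it:
// getDaemonVersion().then(console.log).catch(console.error);
```

If the call fails with a connection error, the container probably isn't up or port 5001 isn't mapped.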

Now if you’re using ipfs-http-client, you can simply call the create() function and start using it because the default gateway in this package is http://localhost:5001 so it should connect automatically. Follow these docs to know more about it.

To know more about the docker images of IPFS, check out their page on Docker Hub. As this page says, go-ipfs is a legacy name and you can also use Kubo instead.


Something I don't yet know

Even after as much research as I did, there are a couple of things I haven't been able to fully figure out (yet) which is why I haven't discussed them in this article.

  1. Connecting to localhost using Helia

    I am looking for the right parameters to pass to the createHelia() function that will connect me to the local network instead of the default ipfs.io gateway. So far I have not been successful in doing this.

  2. Connecting to a public gateway using js-ipfs

    The opposite of the above problem. I have not been successful in connecting to any public gateway using js-ipfs or ipfs-http-client. I tried the latest versions of both, because many resources and codesandboxes use much older versions of these packages.


The End

Welp, that's the end of it.

Everything discussed in this article was just the basics of setting it up and oversimplified theory. The aim of this article was to make it easier for beginners to understand what's going on and give some context on the background of IPFS as it stands in 2023. We have barely scratched the surface of this innovative file system.

As mentioned earlier, I am completely new to the concept of Web3 and decentralization. I encourage you to get into discussions in the comments and provide any tips and feedback. If you can share any more information regarding these packages, it would be a great bonus for everyone.

Let me know if you'd like me to cover any specific topics or talk more about the black magic that is SHA256. You can also follow me on Hashnode to stay updated.