IPFS: the InterPlanetary File System demystified

Leon Klingele

In this article we will explore IPFS, the InterPlanetary File System. IPFS is a system for storing and accessing files, websites and other kinds of data, just like the Web we use every day. Unlike the Web, however, IPFS is peer-to-peer based and automatically distributes its content across the network.

A copy of this post is available on the IPFS network at /ipns/ipfs.leonklingele.de/edu/hdm/ipfs/index.html:
https://ipfs.io/ipns/ipfs.leonklingele.de/edu/hdm/ipfs/ (seriously, go read it over there, the syntax highlighting in this blog is terrible!)

First, some terminology:

  • IPFS, in uppercase letters, as used throughout this post, refers to the
    IPFS protocol
  • ipfs, in lowercase letters, refers to the IPFS command line utility written in Go

Introducing distributed networks

From its outset, the Internet[1] was designed to be a decentralized network, able to deliver data packets and find alternate routing paths in case of a network or node failure.
While the Internet itself is resilient to failures[2], the vast majority of content hosted on it is served by centralized organizations and servers, rendering it inaccessible in case of server or network failures.

In a centralized computer network, a single, central node controls the communication flow between all other nodes. If that central hub goes down, no communication is possible and the network becomes completely unusable.

A decentralized network, on the other hand, is a network where multiple such hubs exist. In most cases data packets still need to pass through one of these central nodes, but when one of the hubs crashes, at least part of the network will continue to function.

A distributed network is the most resilient kind of network where any single node can fail while all remaining nodes are still able to communicate with each other.

Image source: Paul Baran, 1962. On distributed communications network, pp. 3–4

IPFS basics

IPFS creates such a distributed network on top of the Internet, although other types of networks are supported as well. IPFS does not rely on or assume access to IP.

Instead of addressing data by its location, such as a domain name or IP address, IPFS addresses data by its content[3].

Some of the benefits include:

  • Making files available from many different locations (similar to BitTorrent)
  • Making content accessible elsewhere when a server hosting the content goes offline, e.g. by an attack (resilience)
  • Making caching trivial: since data is addressed by its content[3], the same address always refers to the same bytes
  • Making it harder to censor content
  • Speeding up content delivery[4], similar to CDNs and other P2P networks
  • Establishing trust implicitly: content is requested by its hash, so the received data can be verified against that hash
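
To make content addressing a bit more tangible, here is a minimal sketch in Python. It is not how IPFS is implemented (IPFS uses multihash-based identifiers and splits data into blocks); the class name ContentStore and the use of a plain SHA-256 hex digest as the address are assumptions made purely for illustration.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key of every entry is the hash of its value."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content, never chosen by the caller.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Anyone can re-hash the data to check that it matches the address.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
addr = store.put(b"Hello World")

# Identical content always maps to the same address (caching and deduplication) ...
same_addr = store.put(b"Hello World")
# ... while different content yields a different address.
other_addr = store.put(b"Hello World!")
```

Because the address is computed from the data, a cache never has to ask whether its copy is still current: a given address can only ever refer to one sequence of bytes.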

Diving into the ipfs command line utility

Let’s get our hands dirty and start exploring the ipfs CLI utility along with some details of the inner workings of the IPFS protocol.

Unfortunately, at the time of this writing, no official Debian packages for ipfs exist, which means we need to install the tool on our own.

We first show how to get ipfs up and running with Docker and Docker Compose. An alternative, installing it directly on a Debian-based distro, is explained further below.

Installing ipfs with Docker & Docker Compose

The easiest way to install ipfs at the time of writing is to run it within Docker. The procedure is straightforward and only requires the docker and docker-compose tools to be installed.

First, change to a dedicated directory where all our files for ipfs will be put. For demonstration purposes, we use a temporary directory. Be advised that any data stored there will be lost on a system reboot.

$ cd $(mktemp -d)

We start by creating a docker-compose.yml file as follows:

$ cat <<EOF> docker-compose.yml
version: "3"

services:
  ipfs:
    image: ipfs/go-ipfs:latest
    restart: unless-stopped
    cap_drop:
      - ALL
    cap_add:
      - SETUID
      - SETGID
      - CHOWN
    ports:
      - "4001:4001" # Swarm TCP
      - "127.0.0.1:5001:5001" # Daemon API
      - "127.0.0.1:8080:8080" # Web Gateway
      - "127.0.0.1:8081:8081" # Swarm Websockets
    working_dir: /shared
    volumes:
      - "./data/ipfs:/data/ipfs"
      - "./data/shared:/shared"
EOF

and launch the Docker container, then exec into it with:

$ docker-compose up -d

$ docker-compose exec ipfs sh

That’s it. We are now ready to use the ipfs CLI utility.

Installing ipfs on Debian without Docker

First, install two dependencies which do not ship by default:

# As root
$ apt install tar curl

and continue by changing to a dedicated directory where all our files for ipfs will be put.
Again, be advised that any data put to a temporary directory as used in this tutorial will be lost on a system reboot.

$ cd $(mktemp -d)

Let’s install the CLI tool!

$ VERSION=0.4.23
$ curl -O https://dist.ipfs.io/go-ipfs/v${VERSION}/go-ipfs_v${VERSION}_linux-amd64.tar.gz

Verify the tarball’s integrity and unpack it:

$ sha256sum go-ipfs_v${VERSION}_linux-amd64.tar.gz
639492d0aec98f845d7de8cdb251389bcac924d9f3940921504481923b532e2f  go-ipfs_v0.4.23_linux-amd64.tar.gz

$ tar xzfv go-ipfs_v${VERSION}_linux-amd64.tar.gz
go-ipfs/install.sh
go-ipfs/ipfs
go-ipfs/LICENSE
go-ipfs/README.md
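
Note that the sha256sum call above only prints the checksum; the integrity check is only meaningful if the printed value is compared against one obtained from a trusted source. A minimal Python sketch of that comparison follows. The helper names sha256_of_file and verify_file are made up for this example.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large tarballs need not fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: str, expected_hex: str) -> bool:
    """Return True only if the file's SHA-256 equals the expected checksum."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For the tarball downloaded above, this would be called as verify_file("go-ipfs_v0.4.23_linux-amd64.tar.gz", "639492d0aec98f845d7de8cdb251389bcac924d9f3940921504481923b532e2f"), with the expected checksum taken from the release announcement rather than from the download itself.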

Continue as follows:

# Copy the ipfs binary to a dir in your path
$ cp go-ipfs/ipfs /usr/local/bin/

# Set up an ipfs user so the binary does not run as root
$ adduser --gecos ipfs --disabled-password ipfs

# Set up a systemd service to make starting and stopping the IPFS daemon easy
$ cat <<EOF > /etc/systemd/system/ipfs.service
[Unit]
Description=IPFS Daemon
After=network.target

[Service]
ExecStart=/usr/local/bin/ipfs daemon --init --migrate --enable-gc
KillSignal=SIGINT
User=ipfs
Group=ipfs

[Install]
WantedBy=multi-user.target
EOF

# Reload the systemd daemon
$ systemctl daemon-reload

# Enable the ipfs service so it automatically starts upon boot
$ systemctl enable ipfs

The IPFS daemon can now be started with a simple

$ systemctl start ipfs

Check that it’s really up and running:

$ systemctl status ipfs
[..]
Feb 25 18:00:00 service-ipfs ipfs[5055]: Daemon is ready

Nice! Now configure your firewall to open up port 4001/tcp which is required
to connect to other peers:

$ ufw allow 4001/tcp

As a last step, switch to the ipfs user.

$ su ipfs

We are now ready to use the ipfs CLI utility.

Exploring the ipfs CLI utility

Finally it’s time to play around with ipfs!

ipfs has already been initialized (ipfs init) and its daemon was started (ipfs daemon). Upon first startup, the IPFS node created a public/private key pair which uniquely identifies the node on the IPFS network. The public key is hashed[5], yielding the peer’s unique ID, which will become useful later on.

# Show own peer ID
$ ipfs id --format='<id>\n'
QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1
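
It is no coincidence that this ID, like every hash shown in this post, starts with Qm. The following Python sketch shows why, assuming the common case of a base58-encoded multihash using SHA-256: the first byte (0x12) names the hash function, the second (0x20) gives the digest length of 32 bytes. The helper names are made up for this example, and the input bytes are a placeholder, not a real serialized public key.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading "1".
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + B58_ALPHABET.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def sha256_multihash(data: bytes) -> bytes:
    digest = hashlib.sha256(data).digest()
    # <hash function code><digest length><digest>; both prefixes fit in one byte here.
    return bytes([0x12, len(digest)]) + digest


def parse_multihash(mh: bytes):
    code, length, digest = mh[0], mh[1], mh[2:]
    if len(digest) != length:
        raise ValueError("digest length does not match the length prefix")
    return code, length, digest


encoded = b58encode(sha256_multihash(b"placeholder for a serialized public key"))
code, length, digest = parse_multihash(b58decode(encoded))
```

Since the hash function is recorded inside the identifier, a receiver knows which function to use for verification without any out-of-band agreement.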

The node should already have connected to other peers on the IPFS network. These were found through the bootstrapping / seeding process, which works much like in nearly every other P2P network (e.g. Bitcoin and Monero):

$ ipfs swarm peers
[..] long list of peers, an excerpt:
/ip4/37.120.190.6/tcp/4001/ipfs/Qma5q8kiKopYw1G3sSTwtRXDgx1AQ7a7jgJy6gdERvvWEY
/ip4/94.16.118.23/tcp/4001/ipfs/Qmdoy815C6fWDiUCeCq8ETVBwaWN2gsdmGTf9f4ST9P7X7
/ip4/94.16.118.250/tcp/4001/ipfs/QmV1gHWEBkfVDXRNbrUJ6qLfk9GiCR5gSbUGYdVXwzZCeK

Time to ping them:

$ ipfs ping Qma5q8kiKopYw1G3sSTwtRXDgx1AQ7a7jgJy6gdERvvWEY
PING Qma5q8kiKopYw1G3sSTwtRXDgx1AQ7a7jgJy6gdERvvWEY.
Pong received: time=0.56 ms
Pong received: time=0.57 ms
Pong received: time=0.61 ms
^C
Average latency: 0.58ms

# … smooth!

Upon first connecting, peers exchange their public keys and check whether the hash of their partner’s public key really equals the remote’s node ID. If not, the connection is terminated.

Every node has a local storage where IPFS blocks such as file objects are stored. This storage is either in RAM, some kind of database or simply on the node’s filesystem. Ultimately, all blocks available on IPFS are in some node’s local storage. When an object is requested, the node searches for a peer serving it, downloads the object from that peer and stores it temporarily in its local storage. This provides fast lookups of the same object for some time thereafter. Stored blocks can optionally be pinned so that they stay in the local node’s storage permanently instead of expiring.
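
The interplay between temporarily cached and pinned blocks can be sketched in a few lines of Python. This is a deliberately simplified model, not the go-ipfs datastore: the class LocalBlockStore and its methods are hypothetical, and the sketch ignores that a real node triggers garbage collection based on storage limits and pins whole object graphs rather than single blocks.

```python
class LocalBlockStore:
    """Toy model of a node's local storage with pinning and garbage collection."""

    def __init__(self):
        self._blocks = {}     # hash -> block data
        self._pinned = set()  # hashes that must survive garbage collection

    def put(self, key: str, data: bytes, pin: bool = False) -> None:
        self._blocks[key] = data
        if pin:
            self._pinned.add(key)

    def pin(self, key: str) -> None:
        if key not in self._blocks:
            raise KeyError(key)
        self._pinned.add(key)

    def has(self, key: str) -> bool:
        return key in self._blocks

    def gc(self) -> list:
        """Drop every block that is not pinned and return the removed hashes."""
        removed = [key for key in self._blocks if key not in self._pinned]
        for key in removed:
            del self._blocks[key]
        return removed


store = LocalBlockStore()
store.put("hash-of-cached-block", b"fetched from another peer")
store.put("hash-of-own-website", b"our own content", pin=True)
removed = store.gc()
```

After the garbage collection run, only the pinned block is left; the cached one would have to be fetched from the network again.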

Adding files to IPFS

Every node on IPFS — including our own — is capable of hosting files and delivering them to other peers. Adding new files to IPFS is easy:

# Create a new file on IPFS
$ echo "Hello World, today is $(date '+%Y-%m-%d')" | ipfs add
added QmVDE7P8cWERGeoexGtLsLGzABoiVQvwaJP7FTv4X6dEbk QmVDE7P8cWERGeoexGtLsLGzABoiVQvwaJP7FTv4X6dEbk
33 B / 33 B [============================================================================] 100.00%

# Note: if you run the command above on a different day, it will return a different hash as the file content will be different.

ipfs adds the file to the local object storage and returns its hash: QmVDE7P8cWERGeoexGtLsLGzABoiVQvwaJP7FTv4X6dEbk. The file’s hash is added to the DHT and the local node starts serving the file to other nodes requesting it by its hash value.

As the hash uniquely identifies the file, we can retrieve it with:

# View the file we just added
$ ipfs cat QmVDE7P8cWERGeoexGtLsLGzABoiVQvwaJP7FTv4X6dEbk
Hello World, today is 2020-02-27

# Yup, working fine!

Upon adding the file, ipfs also automatically announces its existence to other peers on the network. (Try to ipfs cat it from another node — it will return the same content!)

In IPFS, data distribution happens by exchanging blocks with peers using a
BitTorrent-inspired protocol: BitSwap. Like BitTorrent, BitSwap peers are
looking to acquire a set of blocks (the want_list), and have another set of
blocks to offer in exchange (the have_list). Unlike BitTorrent, BitSwap is not
limited to the blocks in one torrent. BitSwap operates as a persistent
marketplace where nodes can acquire the blocks they need, regardless of what
files those blocks are part of. The blocks could come from completely unrelated
files in the filesystem. Nodes come together to barter in the marketplace.

In the base case, BitSwap nodes have to provide direct value to each other in
the form of blocks. This works fine when the distribution of blocks across nodes
is complementary, meaning they have what the other wants. Often, this will not
be the case. In some cases, nodes must work for their blocks. In the case that
a node has nothing that its peers want (or nothing at all), it seeks the pieces
its peers want, with lower priority than what the node wants itself. This
incentivizes nodes to cache and disseminate rare pieces, even if they are not
interested in them directly.

The protocol must also incentivize nodes to seed when they do not need anything
in particular, as they might have the blocks others want. Thus, BitSwap nodes
send blocks to their peers optimistically, expecting the debt to be repaid. But
leeches (free-loading nodes that never share) must be protected against. A
simple credit-like system solves the problem:

1. Peers track their balance (in bytes verified) with other nodes.
2. Peers send blocks to debtor peers probabilistically, according to a function
that falls as debt increases.

Note that if a node decides not to send to a peer, the node subsequently ignores
the peer for a while. This prevents senders from trying to game the probability
by just causing more dice-rolls.

The debt ratio is a measure of trust: lenient to debts between nodes that have
previously exchanged lots of data successfully, and merciless to unknown,
untrusted nodes.
Source: the five paragraphs above are quoted from the IPFS whitepaper, which describes this in more detail.
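
The whitepaper also gives a concrete function for step 2: with the debt ratio r = bytes_sent / (bytes_recv + 1), a node sends to a peer with probability P(send | r) = 1 − 1 / (1 + exp(6 − 3r)). The Python sketch below simply evaluates that formula. It reflects the whitepaper's description only; whether and how a particular ipfs version implements it is not covered here, and the function names are made up for this example.

```python
import math
import random


def debt_ratio(bytes_sent: int, bytes_recv: int) -> float:
    """Debt ratio of a peer: what we sent it relative to what it sent us."""
    return bytes_sent / (bytes_recv + 1)


def send_probability(r: float) -> float:
    """Probability of sending to a peer; falls as the peer's debt ratio rises."""
    return 1 - 1 / (1 + math.exp(6 - 3 * r))


def should_send(r: float, rng: random.Random) -> bool:
    """Roll the dice once for a peer with debt ratio r."""
    return rng.random() < send_probability(r)


for r in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(f"debt ratio {r:.1f} -> send probability {send_probability(r):.4f}")
```

The function drops quickly around a debt ratio of 2: peers that have received about twice as much as they have given back are served only every other time, and heavier debtors almost never.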

Adding files was easy. Let’s deploy a small static website to IPFS.

Deploying a static website to IPFS

First, a website is required! Fruits are awesome, so let’s showcase our favorite fruits on a website:

$ WEBDIR=demo/websites
$ SITEDIR=$WEBDIR/yummyfruits
$ mkdir -p $SITEDIR
$ echo '<h1 style="color: green;">I really like green Apples! 🍏</h1>' > $SITEDIR/apple.html
$ echo '<h1 style="color: yellow;">Bananas are one of my favorite fruits! 🍌</h1>' > $SITEDIR/banana.html
$ ipfs add -r $WEBDIR
added QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y websites/yummyfruits/apple.html
added QmZkMyLFKAQqRXer1MExU3vnfAFmxCN7rFkpmsUQBG81UN websites/yummyfruits/banana.html
added QmbVPuewTkDPSLJku69JFkAox3VUcMiokK2fk2HniT7e1u websites/yummyfruits
added QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr websites
140 B / 140 B [==========================================================================] 100.00%

… the whole website has just been added to IPFS and is ready to be requested.

To list the websites we currently host, ipfs ls the hash of the websites object:

# Show our websites
$ ipfs ls QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr
QmbVPuewTkDPSLJku69JFkAox3VUcMiokK2fk2HniT7e1u - yummyfruits/

Paths work as they do in traditional UNIX filesystems and the Web:

# List files on the YummyFruits website
$ ipfs ls QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits
QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y 64 apple.html
QmZkMyLFKAQqRXer1MExU3vnfAFmxCN7rFkpmsUQBG81UN 76 banana.html

To retrieve webpages from our YummyFruits website:

$ ipfs cat QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits/apple.html
<h1 style="color: green;">I really like green Apples! 🍏</h1>
$ ipfs cat QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits/banana.html
<h1 style="color: yellow;">Bananas are one of my favorite fruits! 🍌</h1>

Alternatively, the files can be accessed directly using their hash:

# Resolve hash of apple.html, or simply look up its hash above
$ ipfs resolve QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits/apple.html
/ipfs/QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y

# Address apple.html directly by its hash
$ ipfs cat QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y
<h1 style="color: green;">I really like green Apples! 🍏</h1>

A different notation to achieve exactly the same is to prefix the IPFS object
with /ipfs/:

# Address apple.html directly by its hash and the /ipfs/ prefix
$ ipfs cat /ipfs/QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y
<h1 style="color: green;">I really like green Apples! 🍏</h1>

Other prefixes exist as well, for example /ipns/, which we’ll explore later on.

So far so good. Wouldn’t it be cool to view our website in an actual browser?

Accessing our website in a web browser

The IPFS daemon sets up a Web Gateway running locally on port 8080 which can be accessed using a web browser or with other clients such as curl:

$ curl localhost:8080/ipfs/QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits/apple.html
<h1 style="color: green;">I really like green Apples! 🍏</h1>

Website running on IPFS

Alternatively, the website can be viewed through public IPFS Web Gateways such as the one run by ipfs.io: https://ipfs.io/ipfs/QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits/apple.html

Feel free to share this link with all your friends so they know exactly what kind of fruits you are into.

Adding more fruits to our website

Showcasing only two fruits is quite pointless, so we add more:

$ echo '<h1 style="color: darkred;">Cherries are so sweet! 🍒</h1>' > $SITEDIR/cherry.html
$ echo '<h1 style="color: brown;">I only ever date dates! 🤓</h1>' > $SITEDIR/date.html
$ ipfs add -r $WEBDIR
added QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y websites/yummyfruits/apple.html
added QmZkMyLFKAQqRXer1MExU3vnfAFmxCN7rFkpmsUQBG81UN websites/yummyfruits/banana.html
added QmVW36JGRo1Q4SjNjktWPnd9SF3SEynH7nfTu4KBF7HSBn websites/yummyfruits/cherry.html
added QmYdEHFhCX8U3zgGEfayTpfvM6fj21pedBmG7jbPUrio1B websites/yummyfruits/date.html
added QmTmtiurZozG98k7QnTvbQ2i95b3VPou3By8HQm8TRHEKq websites/yummyfruits
added QmPHxSJX9VeFgkXoESJ29zLoCVh6GP1yCkb68fu97drSts websites
261 B / 261 B [==========================================================================] 100.00%

But what’s that? The hash value for our website has changed!

Recall that IPFS objects are addressed by their content[6] (to be precise, by their hash) and not by their location. Once we change a file or folder, its hash must change too. We didn’t touch apple.html and banana.html, so their hash values remained the same. On the other hand, the hash of the yummyfruits directory object changed because two new files, cherry.html and date.html, were added to it. Consequently, as the hash for yummyfruits changed, its parent’s hash must change too. This chain of hash updates propagates recursively to all parents of a modified object (child).
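
This propagation can be reproduced with a toy Merkle tree in Python. The sketch hashes files and directories with plain SHA-256 and an ad-hoc encoding, which is an assumption made for illustration and not the format ipfs uses, so the resulting hashes will not match the ones shown above.

```python
import hashlib


def hash_file(content: bytes) -> str:
    return hashlib.sha256(b"file:" + content).hexdigest()


def hash_dir(entries: dict) -> str:
    """Hash a directory from the names and hashes of its children."""
    h = hashlib.sha256(b"dir:")
    for name in sorted(entries):
        h.update(name.encode() + b"\0" + entries[name].encode() + b"\n")
    return h.hexdigest()


def hash_tree(node) -> str:
    """Files are bytes, directories are dicts mapping names to child nodes."""
    if isinstance(node, bytes):
        return hash_file(node)
    return hash_dir({name: hash_tree(child) for name, child in node.items()})


fruits_v1 = {"apple.html": b"apple", "banana.html": b"banana"}
fruits_v2 = dict(fruits_v1, **{"cherry.html": b"cherry", "date.html": b"date"})
websites_v1 = {"yummyfruits": fruits_v1}
websites_v2 = {"yummyfruits": fruits_v2}

# Untouched files keep their hash ...
apple_unchanged = hash_tree(fruits_v1["apple.html"]) == hash_tree(fruits_v2["apple.html"])
# ... but the directory and every parent above it get a new one.
dir_changed = hash_tree(fruits_v1) != hash_tree(fruits_v2)
root_changed = hash_tree(websites_v1) != hash_tree(websites_v2)
```

Since a directory's hash is computed from the hashes of its children, a single changed leaf is enough to give every ancestor up to the root a new hash.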

Confirming that everything we just did really works:

$ ipfs cat QmPHxSJX9VeFgkXoESJ29zLoCVh6GP1yCkb68fu97drSts/yummyfruits/{cherry,date}.html
<h1 style="color: darkred;">Cherries are so sweet! 🍒</h1>
<h1 style="color: brown;">I only ever date dates! 🤓</h1>
# This on the other hand will not work as `cherry.html` is not available under the old resource hash
$ ipfs cat QmeoPDguuLaYsZXnniL5Be7t9hmbU55XU6B8LE5cuS7TJr/yummyfruits/cherry.html
Error: no link named "cherry.html" under QmbVPuewTkDPSLJku69JFkAox3VUcMiokK2fk2HniT7e1u

This new hash also needs to be shared with your friends, otherwise they would never know you like cherries too.

Enter IPNS, the InterPlanetary Naming System

The takeaway from the previous section is that modifying the content of an existing resource produces a new hash under which it can then be accessed. Using an old hash will still return the old content.

Distributing a new hash every time a resource has changed would be really cumbersome, which is where IPNS, the InterPlanetary Naming System, comes in handy. Using it, a dynamic resource (such as a website which might change from time to time) can always be addressed by the same, static reference. This static reference is your peer ID, the hash of your peer’s public key.

Note: each peer ID can only reference a single resource such as our YummyFruits website. Multiple peer IDs can be created to reference more than just one dynamic resource. See ipfs key gen --help.
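
Conceptually, IPNS is a mutable pointer from a fixed name to a changing hash. The Python sketch below models just that idea. The classes NameRecord and NameSystem are hypothetical, and the sketch leaves out what makes real IPNS records trustworthy, namely that each record is signed with the private key belonging to the peer ID; the sequence number used to order records is likewise a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NameRecord:
    value: str     # the path the name currently points to, e.g. "/ipfs/<hash>"
    sequence: int  # grows with every publish so newer records win


class NameSystem:
    """Toy registry mapping a static name (a peer ID) to its latest record."""

    def __init__(self):
        self._records = {}

    def publish(self, name: str, value: str) -> NameRecord:
        previous = self._records.get(name)
        sequence = 0 if previous is None else previous.sequence + 1
        record = NameRecord(value, sequence)
        self._records[name] = record
        return record

    def resolve(self, name: str) -> str:
        return self._records[name].value


peer_id = "QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1"
ipns = NameSystem()
ipns.publish(peer_id, "/ipfs/QmTmtiurZozG98k7QnTvbQ2i95b3VPou3By8HQm8TRHEKq")
first = ipns.resolve(peer_id)
# The website changes, the name stays the same.
ipns.publish(peer_id, "/ipfs/QmV9z6Dccx9onMnggZ6YHtofAsAw9xWHMnzMbMVt4gobkr")
second = ipns.resolve(peer_id)
```

The name handed out to others never changes; only the record behind it is replaced on every publish.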

Publishing an IPNS name

To make our YummyFruits website available on our peer ID, use ipfs name publish followed by the hash of the updated yummyfruits object:

$ ipfs name publish QmTmtiurZozG98k7QnTvbQ2i95b3VPou3By8HQm8TRHEKq
Published to QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1: /ipfs/QmTmtiurZozG98k7QnTvbQ2i95b3VPou3By8HQm8TRHEKq
#            ^ This is the peer ID

Using ipfs name resolve we can now resolve our IPNS name to the IPFS object:

$ IPFS_PEER_ID=$(ipfs id --format='<id>\n')
$ echo $IPFS_PEER_ID
QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1

$ ipfs name resolve $IPFS_PEER_ID
/ipfs/QmTmtiurZozG98k7QnTvbQ2i95b3VPou3By8HQm8TRHEKq

The website can now be accessed by our peer ID:

#           v Note the use of IPNS here
$ ipfs cat /ipns/$IPFS_PEER_ID/cherry.html
<h1 style="color: darkred;">Cherries are so sweet! 🍒</h1>

Adding even more yummy fruits

To better illustrate the usefulness of IPNS, let’s first add another one of our favorite fruits:

$ echo '<h1 style="color: dark;">Elderberries are nice but stain clothes, be careful! ⚠️</h1>' > $SITEDIR/elder.html
$ ipfs add -r $WEBDIR
added QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y websites/yummyfruits/apple.html
added QmZkMyLFKAQqRXer1MExU3vnfAFmxCN7rFkpmsUQBG81UN websites/yummyfruits/banana.html
added QmVW36JGRo1Q4SjNjktWPnd9SF3SEynH7nfTu4KBF7HSBn websites/yummyfruits/cherry.html
added QmYdEHFhCX8U3zgGEfayTpfvM6fj21pedBmG7jbPUrio1B websites/yummyfruits/date.html
added QmR9hSAqq425pWzvjtA8g6xEak1Ks96fE8kuLU4xDTQTCX websites/yummyfruits/elder.html
added QmV9z6Dccx9onMnggZ6YHtofAsAw9xWHMnzMbMVt4gobkr websites/yummyfruits
added QmaX2HpHbf2VrAuenhFDYJibMNmk66ZTy4Y8Cq84WB4PwN websites
351 B / 351 B [==========================================================================] 100.00%
# This doesn't work yet…
$ ipfs cat /ipns/$IPFS_PEER_ID/elder.html
Error: no link named "elder.html" under QmTmtiurZozG98k7QnTvbQ2i95b3VPou3By8HQm8TRHEKq

# … we need to re-publish the updated hash first …
$ ipfs name publish QmV9z6Dccx9onMnggZ6YHtofAsAw9xWHMnzMbMVt4gobkr
Published to QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1: /ipfs/QmV9z6Dccx9onMnggZ6YHtofAsAw9xWHMnzMbMVt4gobkr

# … now it does!
$ ipfs cat /ipns/$IPFS_PEER_ID/elder.html
<h1 style="color: dark;">Elderberries are nice but stain clothes, be careful! ⚠️</h1>

As a last step, add an index.html file with references to all our favorite fruits added so far:

$ echo '<ul>
    <li><a href="apple.html">Apple</a></li>
    <li><a href="banana.html">Banana</a></li>
    <li><a href="cherry.html">Cherry</a></li>
    <li><a href="date.html">Date</a></li>
    <li><a href="elder.html">Elder</a></li>
</ul>' > $SITEDIR/index.html

Then add and re-publish:

$ ipfs add -r $WEBDIR
added QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y websites/yummyfruits/apple.html
added QmZkMyLFKAQqRXer1MExU3vnfAFmxCN7rFkpmsUQBG81UN websites/yummyfruits/banana.html
added QmVW36JGRo1Q4SjNjktWPnd9SF3SEynH7nfTu4KBF7HSBn websites/yummyfruits/cherry.html
added QmYdEHFhCX8U3zgGEfayTpfvM6fj21pedBmG7jbPUrio1B websites/yummyfruits/date.html
added QmR9hSAqq425pWzvjtA8g6xEak1Ks96fE8kuLU4xDTQTCX websites/yummyfruits/elder.html
added QmTbcghvnaEVg86PzATNTYu6QgZ9V5iRv3Ss7BV81djSEb websites/yummyfruits/index.html
added QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4 websites/yummyfruits
added QmPBkXse5gUKXakJW8JjP8ukVQ4LNN5X3Um91iHmcQCKnM websites
574 B / 574 B [==========================================================================] 100.00%

$ ipfs name publish QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4
Published to QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1: /ipfs/QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4

Fire up a browser and access the IPNS name via either the local Web Gateway or a public one, e.g. the one provided by ipfs.io: https://ipfs.io/ipns/QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1/.

Website running with IPNS

The only hash value you ever need to tell your friends from now on is the peer ID, QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1.

Really? I am used to sharing nice-looking URLs with my friends!

All IPFS resources are self-authenticating. This means that when a resource is requested by its hash from a (potentially malicious) node on the IPFS network, neither that node nor anyone in between (a man-in-the-middle, MITM) can inject additional data into the requested object, as that would result in a different hash.

Simple illustration:

# A colleague creates an arbitrary file on IPFS and sends you the hash of it over an authenticated channel
$ echo 'Hello World!' | ipfs add -q
QmfM2r8seH2GiRaC4esTjeraXEachRt8ZsSeGaWTPLyMoG

As you are not yet in possession of a resource with that hash, you begin querying other peers which you are connected to. Once a providing node has been found and the file transfer has completed, your (receiving) IPFS node computes the hash of whatever data the other node sent. If that hash doesn’t match the hash value you were initially given by your colleague, your node knows something suspicious happened and discards the data. IPFS then asks another node for the file. Nodes will be penalized[7] locally when they frequently send such junk data.
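
The verify-then-retry loop described above can be sketched as follows. This is an illustrative Python model, not BitSwap: providers are simulated by plain functions, the hash is a bare SHA-256 digest instead of a multihash, and the function and peer names are invented for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def fetch_verified(wanted: str, providers):
    """Ask each provider in turn and return the first answer whose hash matches."""
    rejected = []
    for peer_id, fetch in providers:
        data = fetch(wanted)
        if data is not None and content_hash(data) == wanted:
            return peer_id, data, rejected
        # Wrong or missing data: remember the peer and try the next one.
        rejected.append(peer_id)
    raise LookupError("no provider returned data matching " + wanted)


original = b"Hello World!\n"
wanted = content_hash(original)  # received from the colleague over an authenticated channel

providers = [
    ("malicious-peer", lambda h: b"Hello World! Buy my stuff!\n"),
    ("offline-peer", lambda h: None),
    ("honest-peer", lambda h: original),
]
peer_id, data, rejected = fetch_verified(wanted, providers)
```

The list of rejected peers is what a node could use as input for penalizing senders of junk data; the trust decision itself never depends on who delivered the bytes, only on whether they hash to the requested value.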

To find nodes which provide the file, run the following command:

$ ipfs dht findprovs QmfM2r8seH2GiRaC4esTjeraXEachRt8ZsSeGaWTPLyMoG
[..] long list of peers serving the file, an excerpt:
QmXWf53PNW5nSrP2voZg9GHfYpqPWrYo7677saX6yFV8Z1
Qma5q8kiKopYw1G3sSTwtRXDgx1AQ7a7jgJy6gdERvvWEY
Qmdoy815C6fWDiUCeCq8ETVBwaWN2gsdmGTf9f4ST9P7X7
QmV1gHWEBkfVDXRNbrUJ6qLfk9GiCR5gSbUGYdVXwzZCeK

Creating beautiful[8] IPNS names

Hashes are hard to remember. On the other hand, domain names as we use them every day can easily be remembered.
IPFS supports DNSLink, a system that maps domain names to IPFS addresses, creating memorable aliases for the objects referred to.

For that, it uses DNS TXT records. To make an IPFS resource (e.g. our latest YummyFruits website, QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4) available under /ipns/yummyfruits.ipfs.leonklingele.de/, simply create a TXT DNS record on the _dnslink.yummyfruits.ipfs subdomain of your zone as follows:

--- a/zones/leonklingele.de
+++ b/zones/leonklingele.de
@@ -2,3 +2,3 @@ $TTL     3600
 @       IN SOA  ns1.leonklingele.de. hostmaster.leonklingele.de. (
-            2020022400    ; Serial number
+            2020022500    ; Serial number
             1800          ; Refresh (30 minutes)
@@ -424,2 +424,3 @@ www      IN A    185.183.159.234

+_dnslink.yummyfruits.ipfs   IN TXT  "dnslink=/ipfs/QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4"

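Alternatively, specify your IPNS name in the dnslink directive (dnslink=/ipns/ followed by your peer ID), so that the DNS record does not have to be changed every time the website is updated.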

Confirm that it really works:

$ dig +short -t TXT _dnslink.yummyfruits.ipfs.leonklingele.de
"dnslink=/ipfs/QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4"

$ ipfs dns yummyfruits.ipfs.leonklingele.de
/ipfs/QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4

$ ipfs cat /ipfs/QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4/index.html
<ul>
    <li><a href="apple.html">Apple</a></li>
    <li><a href="banana.html">Banana</a></li>
    <li><a href="cherry.html">Cherry</a></li>
    <li><a href="date.html">Date</a></li>
    <li><a href="elder.html">Elder</a></li>
</ul>
# Or, using the IPNS name directly
$ ipfs cat /ipns/yummyfruits.ipfs.leonklingele.de/index.html
[..] same as above

# Yay!
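
What a resolver has to do with the TXT record is little more than string handling. The Python sketch below extracts the target path from a list of TXT record values such as those printed by dig above. It performs no DNS lookups itself, and the function name parse_dnslink as well as the exact validation rules are assumptions made for this example.

```python
from typing import Iterable, Optional


def parse_dnslink(txt_records: Iterable[str]) -> Optional[str]:
    """Return the first /ipfs/ or /ipns/ path found in a dnslink= TXT record."""
    for record in txt_records:
        # dig prints TXT values wrapped in double quotes.
        value = record.strip().strip('"')
        if not value.startswith("dnslink="):
            continue
        path = value[len("dnslink="):]
        if path.startswith(("/ipfs/", "/ipns/")):
            return path
    return None


records = [
    '"v=spf1 -all"',
    '"dnslink=/ipfs/QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4"',
]
target = parse_dnslink(records)
```

Unrelated TXT records on the same name are simply skipped, which is why the dnslink= prefix is part of the record value.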

In your favorite browser, head over to https://ipfs.io/ipns/yummyfruits.ipfs.leonklingele.de/ and view it in action. Beware that — as you are not directly accessing a hash here but only a DNS-resolved IPFS address — MITM attacks become a problem if the domain doesn’t employ additional security mechanisms such as DNSSEC to add authenticity to the returned TXT record. Also note that DNSLink uses DNS which is a centralized system that can break more easily — we wanted to get decentralized and distributed after all, remember?

Website running with IPNS using a custom DNSLink domain

Dealing with IPFS blocks

So far, we’ve always dealt with IPFS objects. Objects are blocks represented in a Merkle Directed Acyclic Graph (DAG) but are additionally encoded in the UnixFS protobuf data format. Among other tasks, the UnixFS data format is responsible for encoding large files into multiple blocks. Files are first broken down into blocks and then arranged in a tree-like structure using link nodes to tie them together. A given file’s hash is actually the hash of the root (uppermost) node in the DAG. See Dealing with Blocks for more details.

Citing the specs, the IPFS Merkle DAG is a directed acyclic graph whose edges are Merkle-links which are cryptographic hashes of the targets embedded in the sources. This means that links to objects can authenticate the objects themselves, and that every object contains a secure representation of its children.
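
How a large file turns into several blocks tied together by a link node can be sketched in Python as follows. The sketch is a strong simplification: it uses a single level of links, plain SHA-256 and an ad-hoc encoding instead of the UnixFS protobuf format, and the default chunk size of 256 KiB is an assumption. The function names are made up for this example.

```python
import hashlib


def _hash(kind: bytes, payload: bytes) -> str:
    return hashlib.sha256(kind + b":" + payload).hexdigest()


def add_file(data: bytes, blocks: dict, chunk_size: int = 256 * 1024) -> str:
    """Split data into chunks, store them as leaf blocks and return the root hash."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    leaf_hashes = []
    for chunk in chunks:
        leaf = _hash(b"leaf", chunk)
        blocks[leaf] = chunk
        leaf_hashes.append(leaf)
    if len(leaf_hashes) == 1:
        return leaf_hashes[0]  # small files consist of a single block
    # The link node contains nothing but the hashes of its children.
    root = _hash(b"links", "\n".join(leaf_hashes).encode())
    blocks[root] = leaf_hashes
    return root


def read_file(root: str, blocks: dict) -> bytes:
    """Reassemble a file by following the links from the root node."""
    node = blocks[root]
    if isinstance(node, bytes):
        return node
    return b"".join(read_file(child, blocks) for child in node)


blocks = {}
data = bytes(range(256)) * 10  # 2560 bytes
root = add_file(data, blocks, chunk_size=1000)
restored = read_file(root, blocks)
```

With a chunk size of 1000 bytes, the 2560-byte input ends up as three leaf blocks plus one link node, and the hash handed out to others is that of the link node.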

If this format is not required, using blocks directly is recommended.
To create a raw block, use the ipfs block subcommand:

$ echo 'This is a raw IPFS block!' | ipfs block put
QmZLxXbxEzCy8v9RBiCNuHKWYqgQJWMfGFJX78MAUZo5BX

$ ipfs block get QmZLxXbxEzCy8v9RBiCNuHKWYqgQJWMfGFJX78MAUZo5BX
This is a raw IPFS block!

# `ipfs cat` requires blocks be encoded in the UnixFS format, so `ipfs cat`'ing
# a non-UnixFS block will fail:
$ ipfs cat QmZLxXbxEzCy8v9RBiCNuHKWYqgQJWMfGFJX78MAUZo5BX
Error: failed to decode Protocol Buffers: incorrectly formatted merkledag node: unmarshal failed. proto: PBNode: wiretype end group for non-group

Dealing with raw IPFS objects

IPFS objects can be explored with ipfs object:

# No child elements are linked, as this object is a leaf (file) in the DAG
$ ipfs object get QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y | jq
{
  "Links": [],
  "Data": "\b\u0002\u0012@<h1 style=\"color: green;\">I really like green Apples! 🍏</h1>\n\u0018@"
}

# Modify the raw object and add it to IPFS
$ ipfs object get QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y | sed 's/color: green/color: orange/' | sed 's/green Apples/big Oranges/' | sed 's/🍏/🍊/' | ipfs object put
added QmPxnroqXbaFkZ2dkqwKKZDdLuxZQBGKE5x5TB7821FzPB

$ ipfs cat QmPxnroqXbaFkZ2dkqwKKZDdLuxZQBGKE5x5TB7821FzPB
<h1 style="color: orange;">I really like big Oranges! 🍊</h1>

Displaying objects which refer to a directory reveals the Merkle DAG structure:

$ ipfs object get QmNr95KriEevBHikGKR5JvPfnKckiQRyNEpkb8z7yqAHg4 | jq
{
  "Links": [
    {
      "Name": "apple.html",
      "Hash": "QmUx6PWeRAPxCTqmK4eGq95KnvPCJwNhfU1cgw5RWCEG8y",
      "Size": 72
    },
    {
      "Name": "banana.html",
      "Hash": "QmZkMyLFKAQqRXer1MExU3vnfAFmxCN7rFkpmsUQBG81UN",
      "Size": 84
    },
    {
      "Name": "cherry.html",
      "Hash": "QmVW36JGRo1Q4SjNjktWPnd9SF3SEynH7nfTu4KBF7HSBn",
      "Size": 69
    },
    {
      "Name": "date.html",
      "Hash": "QmYdEHFhCX8U3zgGEfayTpfvM6fj21pedBmG7jbPUrio1B",
      "Size": 68
    },
    {
      "Name": "elder.html",
      "Hash": "QmR9hSAqq425pWzvjtA8g6xEak1Ks96fE8kuLU4xDTQTCX",
      "Size": 98
    },
    {
      "Name": "index.html",
      "Hash": "QmTbcghvnaEVg86PzATNTYu6QgZ9V5iRv3Ss7BV81djSEb",
      "Size": 234
    }
  ],
  "Data": "\b\u0001"
}

Conclusion

This was our small introduction to ipfs and some details of the inner workings of the IPFS protocol. If you’re interested, please don’t hesitate to read more about it in the referenced articles.

References & further reading


  1. Note: Internet != Web. The World Wide Web is a service on the Internet, just like email, BitTorrent, IRC, SSH, etc.
  2. Not resilient to all kinds of failures, for sure…
  3. By the hash[5] of the content.
  4. This is where IPFS got its name from. See What is IPFS?
  5. The hash is in fact a base58-encoded multihash, also known as the content identifier (CID), from now on simply referred to as the hash. The hash function used to produce a hash is stored in the multihash alongside the hash value itself. This allows switching hash functions without breaking backwards compatibility.
  6. This is commonly referred to as content-based addressing.
  7. See the BitSwap / ledger specification.
  8. Beauty is in the eye of the beholder.
