What’s Under the Hood of the IPFS, or Where Your NFTs Are Actually Stored
The IPFS is the interplanetary file system in which Tezos NFT marketplaces store content. Why interplanetary? The joke is that judging by the download speeds, the data is downloaded from Mars.
But seriously, the IPFS is the infrastructural backbone of web3, borrowing ideas from BitTorrent and blockchain protocols. In this post, we explain in simple terms how it works and what its advantages and disadvantages are.
IPFS and Its Role in Tezos
The IPFS (InterPlanetary File System) is a decentralized data storage and access protocol. The network distributes the recorded files among the nodes and, when a file is requested, constructs a route to the nearest node that has it to ensure maximum download speed. If a file is large and consists of parts, the IPFS acts like BitTorrent and allows different fragments to be downloaded from different nodes.
The IPFS is free, so NFT marketplaces use it to store NFTs. On any Tezos marketplace, you can find links to content on an IPFS mirror, and in the NFT metadata, you can find direct IPFS links to the content. For example, Objkt.com provides everything at once: a direct link, an object on the Objkt.com server, and an IPFS link to the metadata.
Tezos also supports IPFS links to contract metadata, as seen in TZIP-16.
The Main Visual Difference Between HTTP and IPFS
The Internet is a network for exchanging data between participants. Still, simply connecting two computers by cable is not enough. You also need a protocol: rules for routing requests, encrypting traffic, transferring data, and other operations.
Different protocols are used for different types of data and tasks. For instance, IP is needed to reach a host by its address in the network, TCP and UDP are used to transfer data between hosts, and DNS is used to look up IP addresses by host names.
The best-known Internet protocol is HTTP, the HyperText Transfer Protocol. It carries requests and responses between the client and the server. Roughly speaking, when you type a website address into a browser, the browser first looks up the IP address of the server hosting the site and then sends an HTTP request for data to that address.
The IPFS looks up the data itself instead of the host address. That is, instead of “http://WEBSITE.COM/image.jpg” it would simply be “ipfs://image.” The IPFS itself will find hosts that have the files you are looking for.
The different link format is the main visual difference between HTTP and IPFS. That said, only some browsers, like Brave and Opera, can open IPFS links. In Chrome and others, you need to install an extension.
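Browsers without native support can usually still reach the content through an HTTP gateway, which is what the mirror links on marketplaces are. The sketch below shows how such a link might be built. It assumes the common gateway layout of gateway address, then /ipfs/, then the CID, and uses the public ipfs.io gateway as the default. The function name and the CID in the example are made up for illustration.

```python
def ipfs_to_gateway_url(uri: str, gateway: str = "https://ipfs.io") -> str:
    """Rewrite ipfs://<CID>/<path> as <gateway>/ipfs/<CID>/<path>."""
    prefix = "ipfs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an ipfs:// link: {uri!r}")
    rest = uri[len(prefix):]
    if not rest:
        raise ValueError("the link contains no CID")
    return f"{gateway.rstrip('/')}/ipfs/{rest}"


# "QmExampleCid" is a placeholder, not a real CID.
print(ipfs_to_gateway_url("ipfs://QmExampleCid/image.jpg"))
# prints https://ipfs.io/ipfs/QmExampleCid/image.jpg
```

Note that the link names only the content. Which host ends up serving it is decided by the network, not by the address.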
How the IPFS Stores and Searches Data
The IPFS network consists of thousands of nodes storing and replicating data, much as blockchain protocols do. The exact number of nodes and the volume of data stored cannot be measured precisely, yet the Protocol Labs (protocol.ai) team estimates there are at least 200,000 nodes and around 125 terabytes of traffic per week.
The IPFS protocol assigns a unique hash identifier (CID) to each uploaded file. It is what you see at the end of a link like this one:
If the same file is uploaded to the IPFS twice, the copy will get the same CID but will likely end up on a different node. If the file changes, the new version will get a new CID and co-exist in the IPFS with the old one.
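A toy illustration of this behavior: the identifier is derived from the bytes of the file, so identical bytes give identical identifiers and any change gives a new one. Real CIDs wrap the hash in extra encoding, so the plain SHA-256 digest used here is a simplification, and toy_cid is a made-up helper, not part of any IPFS library.

```python
import hashlib


def toy_cid(data: bytes) -> str:
    """Simplified stand-in for a CID: the SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


original = b"picture bytes, version 1"
copy = b"picture bytes, version 1"
edited = b"picture bytes, version 2"

print(toy_cid(original) == toy_cid(copy))    # True: same content, same identifier
print(toy_cid(original) == toy_cid(edited))  # False: changed content, new identifier
```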
When a file larger than 256KB is uploaded, the node splits it into chunks, each with its own CID. These CIDs are then linked into a Merkle directed acyclic graph (Merkle DAG) that describes the connections between the fragments and the final file. It reads like the instruction “To get the original picture A, combine chunks A1, A2 and A3.”
With Merkle graphs, the IPFS can:
- store and host large projects like websites and web applications without the risk of losing fragments;
- update individual file fragments and keep a history of changes instead of overwriting the whole file;
- store different chunks on different nodes and reassemble them into an original file without problems.
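The chunk-and-link idea can be sketched in a few lines. This is a deliberately simplified model, not the real IPFS data format: it uses a plain SHA-256 digest in place of a CID, a single root that lists its chunks in order, and a 4-byte chunk size in the example so the output stays readable. All names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256KB, the threshold mentioned in the text


def toy_cid(data: bytes) -> str:
    """Simplified stand-in for a CID: the SHA-256 hex digest of the content."""
    return hashlib.sha256(data).hexdigest()


def build_toy_dag(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into chunks and return (root_cid, chunk_cids, store)."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    chunk_cids = [toy_cid(chunk) for chunk in chunks]
    store = dict(zip(chunk_cids, chunks))  # CID -> chunk bytes
    # The root identifier depends on the ordered list of chunk identifiers,
    # so changing any chunk also changes the root.
    root_cid = toy_cid("\n".join(chunk_cids).encode())
    return root_cid, chunk_cids, store


def reassemble(chunk_cids, store) -> bytes:
    """Follow the instruction 'combine chunks A1, A2 and A3'."""
    return b"".join(store[cid] for cid in chunk_cids)


root, cids, store = build_toy_dag(b"AAAABBBBCC", chunk_size=4)
print(len(cids))                                  # 3
print(reassemble(cids, store) == b"AAAABBBBCC")   # True

# Changing only the last chunk leaves the first two identifiers untouched.
new_root, new_cids, _ = build_toy_dag(b"AAAABBBBCD", chunk_size=4)
print(new_cids[:2] == cids[:2], new_root != root)  # True True
```

The last line shows why fragments can be updated individually: unchanged chunks keep their identifiers, and only the changed chunk and the root get new ones.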
To find files by their CIDs, the IPFS uses a distributed hash table (DHT), its equivalent of DNS. Nodes write information into it about which files or fragments they are storing. If a user is looking for an image with CID A, the hash table will show a list of nodes from which this file can be requested. Finding and downloading a file split into fragments works similarly: by the CID of the large file, you find the corresponding Merkle graph, and from the Merkle graph, you find the CIDs of the fragments and the addresses of the nodes that store them.
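As a rough model of that lookup, the sketch below keeps the “who stores what” records in a single in-memory dictionary. A real DHT spreads these records across the nodes themselves, so this only illustrates the two-step logic (root CID to fragment CIDs, fragment CIDs to nodes). The class, the node names and the CIDs are all invented for the example.

```python
from collections import defaultdict


class ToyProviderIndex:
    """Single-process stand-in for the DHT records of which node stores which CID."""

    def __init__(self):
        self._providers = defaultdict(set)

    def announce(self, node_id: str, cid: str) -> None:
        """A node records that it stores the content with this CID."""
        self._providers[cid].add(node_id)

    def find_providers(self, cid: str) -> list:
        """Return the nodes from which this CID can be requested."""
        return sorted(self._providers.get(cid, set()))


def plan_download(root_cid: str, manifests: dict, index: ToyProviderIndex) -> dict:
    """Map each fragment of a file to the nodes that can serve it."""
    return {cid: index.find_providers(cid) for cid in manifests[root_cid]}


# Hypothetical picture A made of fragments A1, A2 and A3.
manifests = {"cid-A": ["cid-A1", "cid-A2", "cid-A3"]}

index = ToyProviderIndex()
index.announce("node-1", "cid-A1")
index.announce("node-2", "cid-A1")
index.announce("node-2", "cid-A2")
index.announce("node-3", "cid-A3")

print(plan_download("cid-A", manifests, index))
# {'cid-A1': ['node-1', 'node-2'], 'cid-A2': ['node-2'], 'cid-A3': ['node-3']}
```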
How to Use IPFS
To open IPFS links and retrieve content, you have to install a suitable browser or extension. To upload to the IPFS yourself, you must install the IPFS Desktop client and run your own node.
You can find detailed instructions and links to those apps on the IPFS Tech website.
Three Upsides and One Downside of IPFS
The first upside is the immutability of the data. Recorded files cannot be changed, and a file disappears from the network only if every node that keeps a copy deletes it.
The second upside is that the IPFS records files free of charge. Still, if a file is not requested for a long time, the nodes may delete it as unused. Some services offer guaranteed storage on the IPFS for $0.10-0.20 per GB.
The third upside is decentralization and lack of censorship. As with blockchain protocols, disabling a few nodes will not affect the protocol’s operation.
That said, there is a downside. While in blockchain protocols all nodes store an identical copy of the blockchain, in the IPFS each node has a different set of files. For example, if you have four nodes storing your picture, and they all decide to clean up unused files, you will no longer be able to access it.
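The four-node example can be played out in a small simulation. It also shows the usual remedy, which the IPFS calls pinning: a node marks a file as protected from cleanup, and this is what guaranteed-storage services do for you. The model is a sketch under simplifying assumptions (cleanup removes everything that is not pinned), and the class and CID names are made up.

```python
class ToyNode:
    """A node that holds some CIDs and protects the pinned ones from cleanup."""

    def __init__(self, name: str):
        self.name = name
        self.blocks = set()  # CIDs this node currently stores
        self.pinned = set()  # CIDs that cleanup must keep

    def add(self, cid: str, pin: bool = False) -> None:
        self.blocks.add(cid)
        if pin:
            self.pinned.add(cid)

    def collect_garbage(self) -> None:
        """Simplified cleanup: drop everything that is not pinned."""
        self.blocks &= self.pinned


def is_available(cid: str, nodes) -> bool:
    """A file is reachable while at least one node still stores it."""
    return any(cid in node.blocks for node in nodes)


nodes = [ToyNode(f"node-{i}") for i in range(1, 5)]
for node in nodes:
    node.add("cid-picture")  # four unpinned copies

for node in nodes:
    node.collect_garbage()
print(is_available("cid-picture", nodes))  # False: every copy was cleaned up

nodes[0].add("cid-picture", pin=True)  # re-upload and pin on one node
for node in nodes:
    node.collect_garbage()
print(is_available("cid-picture", nodes))  # True: the pinned copy survives
```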