IPFS, or InterPlanetary File System, is a decentralized file system that seeks to guarantee the security, privacy and censorship resistance of your data.
The InterPlanetary File System (IPFS) is a curious project with a fairly clear objective: to create a computer network with global reach that stores information in a completely decentralized manner, with high scalability and, of course, strong resistance to censorship of any kind.
You can imagine it as a huge network holding vast amounts of information spread all over the world, which you can access in a totally transparent and secure way. It is, without a doubt, the perfect complement to the ever-growing Internet, which now extends to even the smallest electronic devices, such as our watches, alarm clocks and even coffee makers.
IPFS, the beginnings of the project
IPFS is the brainchild of Juan Benet, a programmer who founded the company Protocol Labs in 2014. However, it was not until 2015 that Benet presented IPFS to the world. The idea is to build a P2P network that allows its participants to store and distribute information in a completely decentralized way across the planet. The system is based on the well-known distributed hash table (DHT) technology, the same one used in the BitTorrent protocol, from which IPFS borrows some functions for its P2P network.
Since then, IPFS has been in constant development, and version 0.26.0 of the system is currently available. Although it is still a development version, IPFS already offers many of its planned features in a stable form, and many of them are still being improved, making it clear that it is a system we can already use today.
A file system for the Internet
Why was IPFS created? Basically, IPFS was created to address the gigantic and constantly growing need for storage space on today's Internet. It is estimated that a total of 42 zettabytes of information was generated worldwide in 2019. That is 42 billion terabytes of new information, on top of all the data already generated in previous years.
But the main problem with this data is that it ends up in the hands of third parties, who generally exploit it for their own economic activities. For example, it is not uncommon for companies like Meta to take the data you share on their social networks and sell it to third parties interested in your tastes or activities, in order to build profiles that allow them to offer you other products and services. It may sound innocent, but it is not. In fact, this is a violation of your privacy, since not only is your data on the social network used, but all your activity inside and even outside the network is tracked so that it can be sold to third parties.
Other services like Google Drive, for their part, are capable of analyzing what you write and save on their servers, and if they find something that "violates their regulations" they will simply delete it, without giving you the chance to move that information elsewhere. In short, the big Internet companies use your data to sell it and practice intolerable censorship in every way.
IPFS was born to solve this problem. The idea of IPFS is to transform the way data is stored, so that it becomes completely decentralized and access control remains in your hands at all times. Not only that, IPFS allows our computer to store data from a website and serve it to whoever needs it. If this sounds similar to what BitTorrent does to share files, you are not mistaken; in this case, however, the protocol would be integrated into the applications and websites we use all the time, making the interaction with IPFS completely transparent.
Not only that, with IPFS the world's storage capacity would increase dramatically, because our computers would become part of a huge data disk storing information from the entire Internet. This would help solve (or greatly reduce) the shortage of storage space needed to cope with worldwide demand. In addition, it would help decentralize the network, and even allow us to keep a complete history of the information that interests us in a resilient, censorship-resistant network.
How does IPFS work?
IPFS is a system that works under the "search by content" scheme, that is, every time we do a search in IPFS, we must tell the system "what we are looking for" instead of telling it "where to look for it".
Let's take a look at what all of this means for a moment, and use the current Internet as an example. When we visit a website on the Internet, what our browser does is the following:
- Take the URL or address and perform a DNS query, to find out in which IP address that server is located.
- Once it has the IP address, the browser makes a request for information to the server and begins to download the information.
- It shows us the information of the URL that we have indicated.
This is a fairly simplified description of everything that happens every time we use our web browser. This type of operation is called "search by location", because we need to know where the information is "located" in order to access it. That location is the IP address of the server, and this leads to a situation nobody wants: if the server is down, you cannot access the information you are looking for, because the location is not available.
However, in the case of IPFS, "search by content" works completely differently. In fact, we can break it down as follows:
- You tell the system what content you are looking for.
- The system takes your request for information and sends it to the network, where the system nodes will begin to respond to you. Furthermore, such information is protected by encryption, a data hashing system and digital signatures, to prevent anyone from accessing it without permission.
- You will receive the response of the nodes showing you the versions of the content available throughout the network.
- If you choose an option, you will be able to access the content and even its entire history, since if that option has been displayed it is because it is active on the network at the time of your request.
This means that IPFS performs searches that are defined by content, and the nodes on the network answer them. For example, if you want to visit Bit2Me Academy on IPFS, you just ask for Bit2Me Academy, and the nodes that store information from this website will show you all the content they hold, which you can then access at any time.
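The "search by content" idea can be illustrated with a minimal sketch. It assumes that the address of a piece of content is simply the SHA-256 digest of its bytes; in practice IPFS identifies content by a hash-derived identifier rather than by a typed name. The names `ContentStore`, `put` and `get` are hypothetical and are not part of any IPFS API.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store: the key is derived from the data itself."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        # The address is the SHA-256 digest of the content, not a location.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Anyone can verify that the bytes received match what was requested.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


# Two independent nodes that happen to hold the same page.
node_a, node_b = ContentStore(), ContentStore()
page = b"<h1>Bit2Me Academy</h1>"
address = node_a.put(page)
node_b.put(page)

# The same content yields the same address on every node,
# so either node can answer the request.
assert node_b.get(address) == page
```

Because the address depends only on the content, it does not matter which node answers: if one of them is down, any other node holding the same bytes can serve the request.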
DHT, the starting point of IPFS
The starting point of IPFS is the distributed hash table or DHT. It is responsible for indexing each piece of content in the system under a unique and unrepeatable hash. Not only that, it also provides a global search index for the entire distributed network, making sure that content is not duplicated across the network and that searches are routed to the correct nodes, so that we can access the information whenever we want.
Simply put, the DHT builds a huge library of unique and unrepeatable hashes that allows us to quickly find the content we want. For this system, IPFS uses the well-known SHA-256 hash, the same one used in Bitcoin and many other cryptocurrencies. The reason? It is simple to implement and secure, and current hardware can compute it with little computing power.
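To make the role of the DHT more concrete, here is a small sketch of a hash table spread over several nodes. It is an illustration under stated assumptions, not the actual IPFS implementation: the rule that assigns a key to the node with the smallest XOR distance is borrowed from Kademlia-style DHTs and is an assumption here, and `ToyDHT`, `key_of`, `publish` and `find_providers` are hypothetical names.

```python
import hashlib


def key_of(data: bytes) -> int:
    """SHA-256 digest of the data, interpreted as an integer key."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


class ToyDHT:
    """Each key is assigned to the node whose ID is closest by XOR distance."""

    def __init__(self, node_names):
        # Node IDs live in the same 256-bit space as content keys.
        self.nodes = {key_of(name.encode()): name for name in node_names}
        self.records = {node_id: {} for node_id in self.nodes}

    def closest_node(self, key: int) -> int:
        return min(self.nodes, key=lambda node_id: node_id ^ key)

    def publish(self, data: bytes, provider: str) -> int:
        key = key_of(data)
        # Store only a pointer to who has the content, not the content itself.
        self.records[self.closest_node(key)].setdefault(key, set()).add(provider)
        return key

    def find_providers(self, key: int) -> set:
        return self.records[self.closest_node(key)].get(key, set())


dht = ToyDHT(["node-1", "node-2", "node-3", "node-4"])
key = dht.publish(b"hello ipfs", provider="node-2")
dht.publish(b"hello ipfs", provider="node-4")
print(dht.find_providers(key))
```

The point of the sketch is that no single node holds the whole index: each node is responsible for the keys closest to its own ID, and a lookup only needs to reach that node to learn who stores the content.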
A DAG to manage the network
Another important part of how IPFS works is that its network is structured into a huge DAG or Directed Acyclic Graph. In this case, the IPFS DAG is specifically a Merkle DAG, that is, one in which each node has a unique identifier which is a hash of the contents of the node.
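A minimal sketch of that property follows, assuming a node is serialized as JSON before hashing (real IPFS nodes use their own encoding, so the serialization and the helper name `node_id` are purely illustrative).

```python
import hashlib
import json


def node_id(data: bytes, links: list) -> str:
    """Identifier of a DAG node: hash of its data plus the IDs of its children."""
    payload = json.dumps({"data": data.hex(), "links": links}).encode()
    return hashlib.sha256(payload).hexdigest()


# Leaves hold raw content; a parent links to its children by their IDs.
leaf_a = node_id(b"chapter 1", [])
leaf_b = node_id(b"chapter 2", [])
root = node_id(b"book", [leaf_a, leaf_b])

# Changing any leaf changes its ID, and therefore the ID of every ancestor.
leaf_b_edited = node_id(b"chapter 2, revised", [])
root_edited = node_id(b"book", [leaf_a, leaf_b_edited])
assert root_edited != root
```

Since a node can only link to children whose identifiers already exist, a link can never point back to an ancestor, which is what keeps the graph acyclic.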
The Merkle DAG used is only a slight variation on a blockchain, where each block has a Merkle root that summarizes the data of that block. The DAG construction was chosen over a blockchain for a powerful technical reason: it lets IPFS run asynchronously and scale better. In addition, total immutability is not a goal of this design (although it is possible to configure the system to be immutable), and of course there is no need to protect against attacks such as double spending or 51% attacks, among others.
Given this technical landscape, the IPFS DAG is designed to allow more efficient content routing and searching between nodes. Not only that, a Merkle DAG allows the creation of "change histories" that track how individual files change over time, letting us navigate through them without problems. In this way, we can preserve not only the latest version of a website, but also its complete history from its inception to the present. Additionally, this feature enables three important functions:
- The first is the well-known "deduplication" that prevents us from having duplicate content on the node, and throughout the network.
- The second is known as "delta storage" in which small files are created that allow us to know exactly what content has been changed between different versions. Thus, taking a certain base content and adding the respective deltas we can recreate a more current (or older) content than the base content that has been taken.
- Finally, the third function is that this DAG lets the whole network take part in serving a given piece of information to the user. Thus, for example, if a piece of data is stored on two or more nodes, the user can download it from all of those locations at once, improving download time and the general responsiveness of the network.
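The first two functions can be sketched together. The example below assumes fixed-size chunks and a plain dictionary as the block store, which is a simplification chosen for readability; `split`, `add_version` and `restore` are hypothetical helper names.

```python
import hashlib

CHUNK_SIZE = 8  # unrealistically small, chosen so the example stays readable


def split(data: bytes) -> list:
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def add_version(store: dict, data: bytes) -> list:
    """Store each chunk under its hash and return the list of chunk hashes."""
    manifest = []
    for chunk in split(data):
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # identical chunks are stored once
        manifest.append(digest)
    return manifest


def restore(store: dict, manifest: list) -> bytes:
    return b"".join(store[digest] for digest in manifest)


store = {}
v1 = add_version(store, b"AAAAAAAABBBBBBBBCCCCCCCC")
v2 = add_version(store, b"AAAAAAAABBBBBBBBDDDDDDDD")

# Only the chunks missing from v1 had to be added for v2: this is the "delta".
delta = [digest for digest in v2 if digest not in v1]
```

Both versions remain fully recoverable with `restore`, yet the store holds four chunks instead of six, because the two chunks the versions share are kept only once.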
Privacy in IPFS
However, the idea of storing our data on computers scattered around the world is not something many people like. The danger this could pose to our privacy is immense, so how does IPFS solve this problem?
Well, first of all, you should know that everything in IPFS lives on a public network, which anyone can access simply by running a client. Every piece of data you put into IPFS becomes part of the DHT and the Merkle DAG of the network, which means that everything is accessible.
This can be solved thanks to the fact that IPFS is free software: any person or group of developers can add privacy features on top of the network, anonymizing data and even adding advanced cryptography to protect it from unauthorized access. This is in fact what several projects that use IPFS already do.
IPFS use cases
Now, let's look at some quite striking use cases of this technology:
Filecoin
In the midst of the ICO fever of 2017, and looking for a way to finance his idea, Juan Benet and his company Protocol Labs launched the ICO of Filecoin, a sister project of IPFS. The idea of Filecoin is to create an incentive system that encourages IPFS users to store the files that others want stored. Filecoin allows people to rent out storage space, which is paid for using the FIL token.
The history of Filecoin began in 2017, but it was not until October 29, 2020 that its network finally went live. The launch of the project attracted a lot of attention, and the network currently stores about 1.4 exabytes of information. It also has a market capitalization of more than $1.2 billion, at a price of more than $29 per FIL token.
Audius
Audius is a music and audio sharing platform designed to provide artists with a direct link to their listeners. Using decentralized technology, Audius is able to grant artists rights and control over their own music. All this, through a platform resistant to censorship for the expression and distribution of works and artistic compositions. To create a user-owned and operated platform, it was key to have a distributed cloud storage network as the foundation for the system. Audius uses IPFS as the core component of decentralized storage in its mission to give everyone the freedom to share, monetize, and listen to any audio.
OpenBazaar
OpenBazaar is a peer-to-peer e-commerce platform in which buyers and sellers can participate anonymously and privately without data collection by providers or any other central authority. The OpenBazaar platform is developed by OB1, who also created Haven, a mobile version of OpenBazaar that offers shopping, chat, and the ability to send cryptocurrencies privately.
IPFS serves as the data storage layer for OpenBazaar and Haven. On the network, merchants and buyers can run storage nodes, eliminating the need for a central server. By using IPFS to create this collaborative network, OpenBazaar enables buyers and sellers to trade without risks such as centralized data collection or the threat of their personal information being hacked.
OB1 has been successfully building on IPFS since 2015. The peer-to-peer network that IPFS provides lets the team offer a platform where people freely exchange goods, while allowing OB1 to remain just a technology provider: not a seller of products, not the "owner" of the network, and not a party to commercial matters between peers.
Pros and cons of IPFS
Among the pros of IPFS we can mention:
- The storage system is completely decentralized.
- The network is built in order to be highly scalable.
- The network can withstand denial of service attacks among others because it is fully decentralized. In this way, timely access to information is guaranteed at all times.
- Its use is completely free, and the source code is available under free software licenses.
- It is extensible, which allows anyone to add new functions without major problems. For example, privacy modules or connections to Tor and I2P, among others, can be added.
For its cons we can mention:
- It is still an evolving project, so its use in production is not yet very widespread.
- It is complex to use for people without experience in this type of system.
- It does not have privacy extensions by default.
- Unlike projects like Sia, IPFS was not designed with an incentive model at its core. Because of this, its developers have had to create separate projects such as Filecoin, which are limited in their integration.