Byzantine Fault Tolerance, or BFT, is one of the most important concepts in blockchain and perhaps one of the least known. Without it, blockchain technology as we know it would not be possible.
The term Byzantine fault derives from the Byzantine Generals Problem (BGP). In a nutshell, this logical problem assumes that a group of actors must agree on a concerted strategy to avoid catastrophic system failure, even though some actors within the system may not be reliable. Given this, the system must create mechanisms that guarantee that these malicious actors cannot cause an unrecoverable failure. These mechanisms are precisely what grant tolerance to Byzantine failures.
It may sound simple, but the reality is very different. Achieving Byzantine Fault Tolerance is one of the most difficult challenges in computing, to the point that the first design to solve it satisfactorily in an open, permissionless network was Satoshi Nakamoto's Bitcoin. This marked a milestone, one that has accompanied blockchain technology ever since.
Byzantine Fault Tolerance is the ability of a distributed computer system to withstand Byzantine faults.
These failures can be:
- Consensus failures.
- Validation failures.
- Data verification failures.
- Failures in the protocol's response to network conditions.
This tolerance is linked to the ability of the network, as a whole, to create a consensus mechanism. The purpose of this mechanism is to give a coherent response to a system failure.
How does Byzantine Fault Tolerance work?
Byzantine Fault Tolerance works by defining a set of rules that solve the Byzantine Generals Problem in a satisfactory way. Achieving this is difficult, since a Byzantine failure places no restrictions on how a faulty component may behave: it can stop, send wrong data, or send conflicting data to different peers. However, in many computer systems this tolerance is a requirement. Therefore, a Byzantine fault-tolerant system must meet at least the following:
- Each process must start in an undecided state (neither YES nor NO). At this point, the network proposes a series of deterministic values applicable to the process.
- To share the values, a means of communication must be guaranteed so that messages can be delivered safely. This medium also serves to communicate with and identify the parties unequivocally.
- At this point, the nodes compute the values and move to a decided state (YES or NO). Each node must generate its own state as part of a purely deterministic process.
- Once every node has decided, the decisions are tallied and the state with the largest number of decisions in favor wins.
These four points define the basic operation of a Byzantine fault tolerant algorithm.
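The four points above can be sketched in a few lines of Python. This is a minimal illustration, not a real consensus protocol: the `Node` class, the `tally` function, and the validity rule passed in as `is_valid` are hypothetical names introduced here, and the secure communication channel from the second point is assumed (plain function calls stand in for it).

```python
from collections import Counter
from enum import Enum


class State(Enum):
    UNDECIDED = "UNDECIDED"
    YES = "YES"
    NO = "NO"


class Node:
    """A hypothetical node that decides deterministically on a proposed value."""

    def __init__(self, name, faulty=False):
        self.name = name
        self.faulty = faulty
        self.state = State.UNDECIDED  # point 1: every process starts undecided

    def decide(self, value, is_valid):
        # Point 3: an honest node applies the same deterministic rule as its peers.
        decision = State.YES if is_valid(value) else State.NO
        if self.faulty:
            # A Byzantine node may answer anything; in this sketch it inverts its vote.
            decision = State.NO if decision is State.YES else State.YES
        self.state = decision
        return self.state


def tally(nodes, value, is_valid):
    # Point 4: count the decisions; the most common state wins.
    votes = Counter(node.decide(value, is_valid) for node in nodes)
    winner, count = votes.most_common(1)[0]
    return winner, count


def demo():
    # Three honest nodes and one faulty node vote on the value 42.
    nodes = [Node(f"n{i}") for i in range(3)] + [Node("n3", faulty=True)]
    return tally(nodes, 42, lambda v: v % 2 == 0)
```

Calling `demo()` lets the three honest nodes outvote the faulty one. Real protocols exchange and cross-check votes over several rounds, which this single tally does not attempt to model.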
A closer explanation
The above case can certainly be a bit complex. Therefore, a simpler explanation applied to the blockchain would be:
Let's imagine that John performs a Bitcoin transaction.
Each node on the network starts by holding the transaction in an undecided state (an unconfirmed TX). Confirming the transaction requires mining work (our consensus protocol). The mining process verifies that the hash of the transaction is correct and includes it in a block.
This verification process is computationally intensive and is only possible by deterministic means. With each new confirmation (decided state) of the transaction given by the majority of the network, John can be more certain that the transaction has been accepted as valid.
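The asymmetry between costly mining and cheap, deterministic verification can be illustrated with a toy proof of work. This is a simplified sketch under stated assumptions: `mine` and `verify` are hypothetical helpers, the difficulty is expressed as a number of leading zero hex digits, and a single SHA-256 over a text string stands in for Bitcoin's actual double SHA-256 over a binary block header compared against a numeric target.

```python
import hashlib


def mine(block_data: str, difficulty: int = 2):
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1  # the expensive part: try nonce after nonce


def verify(block_data: str, nonce: int, digest: str, difficulty: int = 2) -> bool:
    """Any node can recompute one hash and reach the same verdict as every other node."""
    recomputed = hashlib.sha256(f"{block_data}:{nonce}".encode()).hexdigest()
    return recomputed == digest and digest.startswith("0" * difficulty)
```

For example, `nonce, digest = mine("John pays 1 BTC")` may take many attempts, while `verify("John pays 1 BTC", nonce, digest)` needs a single hash. Changing the transaction text makes the check fail, which is why nodes can independently agree on whether a block is valid.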
Byzantine Fault Tolerance use cases
Byzantine Fault Tolerance can solve a variety of problems. Below, we discuss some of the most relevant ones to better illustrate how broadly useful it is:
Case # 1: Use in software compilers
A source code compiler is one of the most complex computing tools we know. Compilers convert the source code of a program into a binary that the computer can execute. This means that they convert something close to human language (such as C/C++ or Go) into machine or binary language.
In the midst of all this, a compiler splits its work across various sub-programs that perform the following actions:
- Translate the source code to the desired processor architecture. For example, we can decide whether to compile for x86-32 (PC) or ARM (mobile). In this example, we choose x86-32.
- Adjust the parameters to the capabilities of the target processor family or generation. This is a point to keep in mind, since code built for a newer generation may not run on an older one. At this point, we decide to target 2nd Generation Core i3 processors.
- The compiler starts to process the code, and all the sub-programs transform it into machine code. In parallel, the sub-programs decide where the code can be optimized and which optimizations to apply. The end result is our program, compiled and ready to run.
In this process, Byzantine Fault Tolerance is vital, as it guarantees the following:
- That the sub-programs correctly apply the parameters and optimizations for the chosen architecture and generation. Failure to do so will result in errors and failures.
- When applying optimizations, the compiler must ensure that they do not introduce data duplication. At the same time, de-duplication at the binary level must not affect the functioning of the binary's parts. At this point, a Byzantine fault analysis is necessary, and compilers should be able to perform it.
Case # 2: Data Storage Systems
Another use case of Byzantine fault tolerance is data storage systems. Many database systems and even file systems implement it to improve the reliability of stored data. An example of this is the ZFS file system. This file system offers advanced hashing, replication, de-duplication, error-correction, and management capabilities for storing large amounts of data.
To achieve this, ZFS makes use of Byzantine fault tolerance schemes to ensure:
- That no essential processing step is omitted for data already stored, or being stored, in the file system. For example, hashing of data and metadata, compression, error correction, or de-duplication.
- That no undesirable steps are taken in data processing, such as a de-duplication that leads to data loss or an error correction that corrupts information.
- That the create, read, and write processes in nested ZFS structures remain consistent at all times.
Thanks to all this, ZFS keeps the data stored in its structure in a secure way. This is why it is regarded as one of the most secure and advanced file systems in the computing world.
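The idea of hashing stored data and repairing bad copies can be sketched as follows. This is only an illustration of checksum-verified reads with self-healing from a redundant copy, not ZFS's actual code or on-disk format: the `MirroredBlock` class is hypothetical, and SHA-256 is used here purely for convenience.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical block stored as several copies plus one expected checksum."""

    def __init__(self, data: bytes, copies: int = 2):
        self.expected = checksum(data)
        self.replicas = [bytearray(data) for _ in range(copies)]

    def read(self) -> bytes:
        # Find the first copy whose checksum still matches the expected value.
        good = next(
            (bytes(r) for r in self.replicas if checksum(bytes(r)) == self.expected),
            None,
        )
        if good is None:
            raise IOError("all replicas failed checksum verification")
        # Overwrite any copy that no longer matches with the verified one.
        for i, replica in enumerate(self.replicas):
            if checksum(bytes(replica)) != self.expected:
                self.replicas[i] = bytearray(good)
        return good
```

If one copy is silently corrupted, `read()` still returns the original bytes and rewrites the damaged copy. Only when every copy fails verification does the read report an error instead of returning bad data.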
Case # 3: Avionics Systems
This is the case of the Aircraft Information Management System, a system that works in real time and has Byzantine fault tolerance.
Each of the aircraft's sensors communicates with the command and control systems, providing information in real time. The failure of a sensor must never mean a catastrophic failure for the aircraft. To achieve this, Byzantine fault tolerance is used to compensate for the data from the damaged sensor or systems and keep the aircraft safe.
In fact, applying Byzantine fault tolerance here is quite a challenge because of the number of systems and the different scenarios to handle. Avionics must cope with cases such as reconfiguration, duplication, and the failure of entire systems, and still offer resistance to this type of failure. While 100% resistance is impossible, engineers and programmers do a tremendous job in this regard.
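One common way to compensate for a single bad sensor is to read several redundant sensors and take the median. The sketch below assumes three or more redundant readings of the same quantity; the `vote` function and the `tolerance` parameter are hypothetical and not taken from any real avionics system.

```python
from statistics import median


def vote(readings, tolerance):
    """Return the median reading and the indexes of sensors that disagree with it."""
    if len(readings) < 3:
        raise ValueError("need at least three redundant readings to outvote one fault")
    agreed = median(readings)
    # Flag every sensor whose value is further from the median than the tolerance.
    suspects = [i for i, r in enumerate(readings) if abs(r - agreed) > tolerance]
    return agreed, suspects
```

With readings such as `[250.1, 249.8, 900.0]` and a tolerance of `5.0`, the faulty third value cannot move the median, and its index is reported so the system can isolate that sensor.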
Case # 4: Blockchain Consensus Protocols
Blockchain consensus protocols such as Proof of Work (PoW) are tolerant of Byzantine failures. They allow a consensus to be reached in a distributed network under Byzantine conditions. When Satoshi Nakamoto designed Bitcoin, he took this kind of tolerance into account. To do this, he created a series of rules and applied the PoW consensus protocol to create software with Byzantine fault tolerance. However, this tolerance is not 100%: for example, it assumes that honest miners control the majority of the computing power.
Despite this, PoW has proven to be one of the most secure and reliable implementations for blockchain networks. In that sense, the proof-of-work consensus algorithm designed by Satoshi Nakamoto is considered by many to be one of the best solutions for Byzantine failures. Proof of Stake (PoS) and Delegated Proof of Stake (DPoS), for their part, are not completely tolerant of Byzantine failures, which is why they are usually complemented by other security measures.
Advantages and disadvantages
Advantages
- Ability to guarantee the correctness of data and information in distributed systems, even under scenarios that are hostile to such tasks.
- Solves the problem of information processing in heterogeneous environments.
- High efficiency in computational and energy terms.
- It offers implementations that positively impact scalability if they are well built.
- The more nodes applying Byzantine fault tolerance, the more secure the model is.
Disadvantages
- The creation of these solutions is complex. This can lead to other security issues in the implementation.
- Ensuring its correct operation requires the system to be increasingly distributed. The more nodes applying the process, the more secure it is. But this also has a negative impact on the scalability and bandwidth of the network.
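The bandwidth cost in the last point can be made concrete with a rough count. The sketch below assumes a hypothetical scheme in which every node sends its vote directly to every other node once per round; real networks may use gossip or other patterns, so the numbers only illustrate the trend.

```python
def messages_per_round(nodes: int) -> int:
    """Messages sent when every node sends its vote to every other node once."""
    return nodes * (nodes - 1)


def growth_table(sizes):
    # Pair each network size with its per-round message count.
    return [(n, messages_per_round(n)) for n in sizes]
```

Under this assumption, `growth_table([4, 10, 100])` gives 12, 90, and 9900 messages respectively: a network 25 times larger sends 825 times more messages per round.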