Any protocol or network, like Bitcoin or Ethereum, needs a ledger to store transactions or any other kind of information relating to ownership. This ledger needs to be public and distributed to all users participating in the network, so that every time an update occurs they all stay aligned on one shared, global, and immutable history. This ledger is verifiable by anyone at any time.
Such a ledger is the blockchain. A chain of blocks that create an unbroken and immutable history of transactions. A chain of ownership.
A blockchain is a data structure. It can very well be centralized. But that wouldn’t be optimal since regular databases are far more efficient to maintain and easier to build.
If you can centralize, centralize. If you can’t centralize, then decentralize. - Dan Bohen
The blockchain is often criticized as being just an inefficient database. The truth is that in order to achieve some objective, which in the case of blockchain is decentralization, transparency, and public verifiability you have to make some trade-offs.
When a system is very centralized it is very easy to be torn down if one of the pieces is removed. Think of the 2008 financial crisis where because of banks going bust the whole system collapsed. In this case, there was a single point of failure. The banks. If you want to hack a conventional database you only need to hack the company and you now have access to the entire database all due to the fact that there was a single point of failure.
On the other hand, the more decentralized a system is, the more resilient it becomes. The network is composed of innumerable nodes and there is no single point of failure, which means an attacker has no single target whose compromise brings down the whole system.
If you want to attack a blockchain, which is peer-to-peer and distributed, you would need to compromise all the nodes (meaning users) involved simultaneously and rewrite the history of all existing blocks, or bring the majority of nodes under your control. This could happen in theory, but it would require enormous amounts of computational power, which makes it practically infeasible.
Since being decentralized is the blockchain’s main characteristic and objective, no single entity can be in charge of interacting with it. Instead, this is done with consensus mechanisms, which have been studied in the field of distributed systems for years. Any decentralized system must have some way to come to an agreement without trust as a prerequisite or a central authority enforcing it.
Bitcoin, for example, uses proof-of-work as its consensus mechanism, which takes place in mining. There are different consensus mechanisms, such as proof-of-stake or delegated-proof-of-stake, that other protocols use. Each one of these tries to achieve consensus in a different way. Replacing trust with math.
(More on various consensus mechanisms here)
Cryptographic Hash Functions
In order to understand the mechanics of a block and how new ones are being added to the chain, we must first get familiar with the concept of cryptographic hash functions.
A cryptographic hash function, such as the SHA-256 that Bitcoin uses, is a function that takes an input of any size and returns a string of alphanumeric characters of a fixed length. In essence, the output is a number in hexadecimal form. It appears to be random, but in fact it is derived entirely from the input we gave it. If we change the input even slightly, we get a completely different output. The guarantee here is that it is deterministic: the same message will always produce the same hash. It is also a one-way function, meaning it is practically infeasible to reverse the process and derive the initial message from the hash.
These hash functions are being used quite extensively as we will see below.
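The properties described above can be checked with a few lines of Python. This is a minimal sketch using the standard library's `hashlib`; the helper name `sha256_hex` and the sample messages are made up for illustration.

```python
import hashlib

def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a UTF-8 string as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()

first = sha256_hex("hello")
again = sha256_hex("hello")
changed = sha256_hex("hellp")                 # one character differs
long_input = sha256_hex("a much longer input " * 100)

print(first)
print(first == again)                # deterministic: same input, same hash
print(first == changed)              # a tiny change gives a different hash
print(len(first), len(long_input))   # the output length does not depend on the input
```

The digest is always 64 hexadecimal characters (256 bits), whatever the size of the input.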
Anatomy of a Block
A block consists of a body, where transactions are stored, and the block header, where information about the block itself is stored. Each network will have a slightly different block header structure. For example, Bitcoin’s header consists of 6 fields whereas Ethereum has 15 fields. There are some main ones, though, that act as blockchain primitives, meaning they are present in virtually all chains.
Version and TimeStamp
These are pretty self-explanatory. The version is the software version that was used, and the timestamp is the approximate time at which the block was created.
Merkle Root
Each block consists of many transactions. All transactions are summarized in a tree-like data structure, the Merkle tree, where each parent node has two children. Each parent is the hash (using SHA-256 twice) of its children, going all the way up the tree and resulting in a single hash value called the Merkle root. A Merkle path can be produced with which you can verify that a transaction is included in the block without having to download the whole blockchain. Much like a binary search, the number of steps grows only logarithmically with the number of transactions. If a single transaction in the Merkle tree is modified, the Merkle root will be an entirely different hash.
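As a rough sketch of the idea, the snippet below builds a Merkle root with double SHA-256 and verifies a Merkle path. The function names and the placeholder transactions (`b"tx-a"` and so on) are illustrative assumptions. Duplicating the last hash when a level has an odd number of entries follows what Bitcoin does, but serialization and byte-order details of real transactions are left out.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Hash the data with SHA-256 twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def next_level(level: list) -> list:
    """Pair up the hashes of one level and hash each pair into a parent."""
    return [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(transactions: list) -> bytes:
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [double_sha256(tx) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: duplicate the last hash
        level = next_level(level)
    return level[0]

def merkle_path(transactions: list, index: int) -> list:
    """Collect the sibling hashes from the leaf at `index` up to the root."""
    level = [double_sha256(tx) for tx in transactions]
    path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # the neighbour within the same pair
        path.append((level[sibling], sibling < index))  # flag: sibling is on the left
        level = next_level(level)
        index //= 2
    return path

def verify_path(transaction: bytes, path: list, root: bytes) -> bool:
    """Recompute the root from one transaction and its sibling hashes."""
    current = double_sha256(transaction)
    for sibling, sibling_is_left in path:
        pair = sibling + current if sibling_is_left else current + sibling
        current = double_sha256(pair)
    return current == root

txs = [b"tx-a", b"tx-b", b"tx-c"]
root = merkle_root(txs)
path = merkle_path(txs, 2)
print(verify_path(b"tx-c", path, root))   # the transaction is in the block
print(verify_path(b"tx-x", path, root))   # a different transaction fails
print(merkle_root([b"tx-a", b"tx-b", b"tx-X"]) == root)  # any change alters the root
```

The verifier only needs the transaction, the root from the block header, and one sibling hash per tree level.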
Previous block hash
This is the hash of the previous block, meaning its unique identifier. The previous block is also called the parent block, since the next block is its direct child. This is what actually ties blocks together in an unbroken chain.
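To see why this field makes history hard to rewrite, here is a toy sketch. It is not Bitcoin's real format: a block here is a plain dictionary hashed through its JSON encoding, whereas Bitcoin hashes a compact binary header with SHA-256 twice. All names (`make_block`, `chain_is_valid`, the sample transactions) are hypothetical.

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Hash a block's contents (toy encoding: sorted JSON)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()

def make_block(prev_hash: str, transactions: list) -> dict:
    return {"prev_hash": prev_hash, "transactions": transactions}

def chain_is_valid(chain: list) -> bool:
    """Every block must reference the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

genesis = make_block("0" * 64, ["coinbase"])
block_1 = make_block(block_hash(genesis), ["alice->bob"])
block_2 = make_block(block_hash(block_1), ["bob->carol"])
chain = [genesis, block_1, block_2]

valid_before = chain_is_valid(chain)
block_1["transactions"] = ["alice->mallory"]  # tamper with an old block
valid_after = chain_is_valid(chain)

print(valid_before)  # the untouched chain links up
print(valid_after)   # block_2 no longer points at block_1's hash
```

Changing an old block changes its hash, so every later block would have to be rebuilt for the chain to line up again.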
Difficulty Target and Nonce
If the network uses proof-of-work as its consensus mechanism, then the difficulty target and the nonce will be present, since their role is crucial to the mining process. Miners contribute computational power in order to solve a cryptographic puzzle. The more miners compete to solve the puzzle, the sooner one of them is likely to get lucky, so blocks arrive faster. Conversely, the fewer miners in the network, the longer it takes to find a block. The difficulty adjustment compensates for both cases. For example, in Bitcoin a block is validated and added to the blockchain every 10 minutes on average. The interval between individual blocks is random and not fixed in any way. It might be 5 minutes or 15 minutes. The 10-minute average is the result of the difficulty adjustment doing its work.
The difficulty target is just a number. Miners take the block header, which includes a nonce (an arbitrary number the miner is free to change), feed it to a hash function, and try to produce a number less than the difficulty target. If the result is too large, they change the nonce and try again. Many think that the miner has to find x leading zeros at the front of a hash, but this is not the case. The hash just has to be a number smaller than the difficulty target. The zeros come about as placeholders. For example, if you have the number 1000, a smaller number would be 999. This number would be represented as 0999 so the same length of characters is preserved.
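The search for a valid nonce can be sketched as a simple loop. This is a toy model under stated assumptions: the header is an arbitrary byte string, the target `2**244` is far easier than any real one so the loop finishes quickly, and the byte order used to read the hash as a number is simplified. `mine` and `TARGET` are names made up for this example.

```python
import hashlib

TARGET = 2**244  # toy target: roughly 1 in 4096 hashes falls below it

def mine(header: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces until the double SHA-256 of header + nonce is below target."""
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(4, "little")
        digest = hashlib.sha256(hashlib.sha256(candidate).digest()).digest()
        value = int.from_bytes(digest, "big")  # read the hash as a number
        if value < target:
            return nonce, value
    raise RuntimeError("no valid nonce found in range")

nonce, value = mine(b"example block header", TARGET)
print(nonce)
print(f"{value:064x}")  # zero-padded to 64 hex characters
```

Because the value is below `2**244`, its 64-character hexadecimal form necessarily starts with zeros. The leading zeros are a by-product of being a small number, not the rule itself.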
The network expects 2016 blocks to be created in a span of two weeks. If more blocks than that were created, mining had become too easy, so the difficulty is raised, meaning the difficulty target is lowered. On the other hand, if fewer than 2016 blocks were created, the difficulty is lowered, meaning the difficulty target is raised so blocks can be mined faster.
Another way to think about it is as a limbo dance (props to Andreas Antonopoulos), where the lower the stick gets, the harder the dance becomes.
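The adjustment itself is simple proportional arithmetic. The sketch below follows the rule Bitcoin uses: the old target is scaled by the ratio of the time the last 2016 blocks actually took to the expected two weeks, with the change limited to a factor of four in either direction. The function name and the sample numbers are illustrative, and the protocol's upper limit on the target is omitted.

```python
EXPECTED_TIMESPAN = 14 * 24 * 60 * 60  # two weeks in seconds (2016 blocks * 10 minutes)

def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how long the last 2016 blocks actually took."""
    # Limit the adjustment to a factor of 4 in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN

old_target = 1_000_000  # hypothetical target, tiny compared with real ones

on_schedule = retarget(old_target, EXPECTED_TIMESPAN)
too_fast = retarget(old_target, EXPECTED_TIMESPAN // 2)   # blocks took one week
too_slow = retarget(old_target, EXPECTED_TIMESPAN * 2)    # blocks took four weeks

print(on_schedule)  # unchanged
print(too_fast)     # lower target, harder puzzle
print(too_slow)     # higher target, easier puzzle
```

A lower target means fewer hashes qualify, which is the shorter limbo stick.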
The Genesis Block
The first block on any blockchain is called the genesis block. In Bitcoin’s case, the Genesis Block is the first-ever block added to the Bitcoin network. Being the first one, it has no previous block whose hash it could reference. For this reason, the genesis block is hardcoded into the software.
Transactions and Digital Signatures
In order to validate the authenticity and integrity of a transaction, it has to be signed by the issuer. In the real world, if you want to sign a document you just use your handwritten signature, but in the digital world you use a digital signature, which adds a layer of complexity and security in order for the signature to retain its uniqueness. Digital signatures leverage the power of public-key cryptography (PKC), which is a branch of cryptography that dates back to the 70s, and of the hash functions mentioned above.
In a nutshell, you as a signer have a public key that is known by everyone and a private key that is known only by you. Both keys are mathematically related so encryption and decryption can take place. You can sign a transaction with your private key and share your public key so the receiver can use it to verify the authenticity of the transaction.
By adding hash functions to the picture, the process becomes more robust. We hash the data and pass both the private key and the now hashed data through a signature algorithm (RSA, DSA, etc.) in order to generate the signature. Hashing the data is good practice because passing all the data to the signature algorithm would be costly, since its size would be unpredictable. With hashing, we can be sure the data have a fixed size. The receiver, having the public key, the message, and the signature, can now pass them through a verification algorithm and confirm two things. First, that the sender is indeed who they claim to be, and second, that the data were not changed in any way during the transaction process. The receiver can be sure of this since any change in the information would result in a completely different hash, and thus the verification would fail.
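The hash-then-sign flow can be illustrated with textbook RSA in pure Python. This is a deliberately insecure toy: the key is built from two small, publicly known Mersenne primes, there is no padding, and the hash is reduced modulo the key size. Bitcoin itself signs with ECDSA rather than RSA. All names (`sign`, `verify`, `digest_as_int`) are assumptions for this example.

```python
import hashlib

# Toy key material: two known Mersenne primes. Never use this for real security.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Hash the message to a fixed size, then map it into the range of N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Sign the hash of the message with the private exponent."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Check the signature against the message hash using only public values."""
    return pow(signature, E, N) == digest_as_int(message)

message = b"alice pays bob 5"
signature = sign(message)

print(verify(message, signature))              # authentic and unchanged
print(verify(b"alice pays bob 500", signature))  # altered data fails
```

The verifier never sees `D`. It only needs the message, the signature, and the public pair `(N, E)`.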
The Double Spend Problem and Order of Transactions
Every transaction needs to be confirmed. Once a transaction happens (you send money), it goes into a pool of unconfirmed transactions waiting to be confirmed by miners. Every transaction can optionally include a small fee for the miners, incentivizing them to pick your transaction faster.
Once it is added to a block, the transaction has one confirmation. After roughly 10 minutes another confirmation happens (a new block is added on top). A transaction from 10 blocks back has 10 confirmations. For small transactions, you the recipient might choose to accept a transaction with just one confirmation, while for large transactions it is advised to wait for 6 confirmations. Each additional block adds trust in the network and in the immutability of its history. The deeper a block is buried in the blockchain, the more permanent it is.
When we have a temporary history divergence of one block, that is, two competing histories, the transaction has only one confirmation. If you as the recipient are willing to take the risk and accept it, one of two scenarios will follow.
- The block with your transaction was the one that was added to the blockchain history making it successful.
- The block with your transaction was discarded so it is like the transaction never took place. It’s not added to the blockchain’s history and you just lost money.
If you are exchanging, let’s say, 10 dollars of value, one confirmation might do since it’s only 10 dollars. If you are sending one million, one confirmation might be too risky and you might need 5 or 6.
What will happen if I make two payments with the same funds before either of them is picked up by the miners?
This is the classic double-spend problem. Once the first transaction is confirmed, ownership is assigned to the recipient. When the second transaction reaches the verification process it will be marked as invalid, since the history will have already been updated.
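One way to picture this check is with a set of unspent coins. The sketch below assumes a simplified model in which each transaction consumes named coins and creates new ones. The coin labels and the function `apply_transaction` are invented for illustration.

```python
def apply_transaction(unspent: set, tx: dict) -> bool:
    """Apply tx if all of its inputs are still unspent; otherwise reject it."""
    if not all(coin in unspent for coin in tx["inputs"]):
        return False  # at least one input was already spent
    unspent.difference_update(tx["inputs"])  # the inputs are now spent
    unspent.update(tx["outputs"])            # the outputs become spendable
    return True

unspent = {"coin-1"}
pay_bob = {"inputs": ["coin-1"], "outputs": ["coin-2"]}
pay_carol = {"inputs": ["coin-1"], "outputs": ["coin-3"]}  # same coin again

first_accepted = apply_transaction(unspent, pay_bob)
second_accepted = apply_transaction(unspent, pay_carol)

print(first_accepted)   # the first payment goes through
print(second_accepted)  # the second one is rejected
print(unspent)          # only Bob's new coin remains spendable
```

Whichever of the two payments is confirmed first wins. The other is rejected because its input no longer exists in the updated history.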
But what happens if two miners solve this cryptographic puzzle at the same time?
This is a scenario that is expected to happen in most protocols. Let’s take the case of the Bitcoin network as an example.
If we have two blocks extending the latest block we end up in a place where we have two competing histories and half the network sees A and the other half sees B. Again this is expected.
This happens approximately once a week, but it gets resolved. If two more blocks are again mined at the same time, one built on top of history A and the other on top of history B, then again we have to wait until one history becomes the longest. This occurs even more rarely, a couple of times a year. A diverged history of three blocks, which is even rarer, might happen in theory once every few years.
(In Bitcoin’s case, this has never been observed since its inception.)
A four-block divergence might happen every few decades. That is why waiting for six confirmations is recommended. A six-block divergence, in theory, can happen once every few millennia.
And what happens to all of the transactions in the block that was rejected?
Transactions of a discarded block will go again to the mempool of unconfirmed ones. If the accepted block and the discarded block share the same transaction it will stay confirmed as is.
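The bookkeeping after a divergence resolves can be sketched as follows. This is a simplified model: each block is just a list of transaction ids, and the branch with more blocks wins, whereas Bitcoin compares accumulated proof-of-work. Ties go to the first branch here purely to keep the example short. `resolve_fork` is a hypothetical helper.

```python
def resolve_fork(branch_a: list, branch_b: list, mempool: set) -> list:
    """Keep the longer branch and return the loser's orphaned transactions to the mempool."""
    if len(branch_a) >= len(branch_b):
        winner, loser = branch_a, branch_b
    else:
        winner, loser = branch_b, branch_a
    confirmed = {tx for block in winner for tx in block}
    for block in loser:
        for tx in block:
            if tx not in confirmed:
                mempool.add(tx)           # unconfirmed again, waits for a new block
    mempool.difference_update(confirmed)  # confirmed transactions leave the pool
    return winner

history_a = [["tx1", "tx2"], ["tx4"]]  # two blocks built on the fork point
history_b = [["tx1", "tx3"]]           # one competing block
mempool = set()

winner = resolve_fork(history_a, history_b, mempool)
print(winner)
print(mempool)  # tx3 was only in the discarded block
```

Here `tx1` appears in both histories, so it stays confirmed, while `tx3` goes back to waiting.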
Here, in this thread, Andreas Antonopoulos elaborates on a case of an expected temporary history divergence that was picked up (and twisted) by the media.
Scaling a Blockchain
A recurring concern and topic of discussion is scaling. Can a blockchain really scale? Sure it can operate with a few hundred users but what happens with millions of users making transactions all the time?
Centralized systems scale better because they do have a central authority controlling and developing them. Constant iterations take place and bugs are fixed on the fly. The concern is how can you scale a system that by nature should be hard to update and impossible to reverse.
Layer 1 is the blockchain itself, the one the protocol interacts with directly. Here things move slowly, simply because there is a lot of work to be done: validation, mining, consensus, and so on. Blocks are added to the chain at a set pace, and a block has a limit on how many transactions it can hold. If a network is flooded with users, it will be hard to settle all transactions. This results in transactions queuing up, which in turn pushes transaction fees up and makes the network expensive to use.
One way to combat this problem and scale a blockchain is with Layer 2 solutions. Layer 2 is built on top of Layer 1 and it is a way to handle transactions off the main chain and interact with it fewer times. An example of a Layer 2 solution is the Lightning Network with Bitcoin being the Layer 1 network.
There are several Layer 2 approaches being explored, like Rollups and State machines (depending on the protocol), among many others using technologies like zero-knowledge proofs. This is a whole other chapter in and of itself.
Each protocol has a set of rules on how to interact with a blockchain, which leads us to the famous block size discussion. Some protocols allow many transactions per second, whereas others, like Bitcoin, allow very few. A comparison that is often made is between the transactions per second of cryptocurrencies and those of Visa. The story goes that cryptocurrencies need to mimic credit card providers like Visa, which can process thousands of transactions per second, in order to scale and be used by millions of users. This is not a valid comparison though. Visa is a layer built on top of the Fedwire system (this is the case for the US; other countries work in a similar way), which handles around 10 transactions per second, not much more than Bitcoin. What Visa does is batch thousands upon thousands of transactions into a single larger one and then send that to Fedwire. Something similar is being done by the Lightning Network as well.
All the concepts introduced here can be explored much further than this high-level overview allows. Blockchains are still being explored and we are yet to see how far they can be stretched. The future looks promising.