As we’ve seen in the basic introduction to bitcoin transactions, bitcoins only exist as an amount associated with an address, and you own an address if you can prove to the network that you hold the associated private key. Beneath their simple appearance, very similar to that of a bank account number, addresses hide a number of complex phenomena.
This post introduces some cryptographic basics to explain how keys and addresses are generated, presents some of the address types, and explains some of the interesting phenomena mentioned above, such as address re-use and complex transaction scripts.
Cryptography is ubiquitous in today’s society. It is the pillar of secure communication throughout the internet, encryption of sensitive information, and much more. When talking about the blockchain, one specific type of cryptography is used, called public-private key cryptography. It is based upon the existence of special relationships between pairs of numbers, called private and public keys.
For encryption, distributing the public key is like handing out open crates with locks. People can put whatever message they want in a crate and lock it, knowing that only the owner of the private key will be able to open the lock. As such, they do not need to worry about someone intercepting it, since the intruder wouldn’t be able to open it anyway. The technical explanation is out of this post’s scope, but if you are interested, this article provides a simple explanation of RSA, one of the oldest such cryptosystems.
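To make the crate-and-lock picture a bit more concrete, here is a toy sketch of textbook RSA with tiny primes. The function names (make_keypair, encrypt, decrypt) and the chosen primes are my own illustrative assumptions, and numbers this small offer no security at all; real systems use very large primes and padding schemes.

```python
def make_keypair(p: int, q: int, e: int = 17):
    """Build a toy RSA key pair from two small primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e; requires Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def encrypt(public_key, message: int) -> int:
    """Lock the crate: anyone holding the public key can do this."""
    n, e = public_key
    return pow(message, e, n)


def decrypt(private_key, ciphertext: int) -> int:
    """Open the crate: only the private key holder can do this."""
    n, d = private_key
    return pow(ciphertext, d, n)


public_key, private_key = make_keypair(61, 53)
ciphertext = encrypt(public_key, 65)
print("ciphertext:", ciphertext)
print("decrypted:", decrypt(private_key, ciphertext))
```

The message here is just a number smaller than n; encoding real text into such numbers is left out of the sketch.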
For signatures, no secret message needs to be transferred. Instead, the goal is to be able to prove that you are the source of a message. These algorithms process any message together with a private key and generate a signature. They have the property that the public key can verify whether the signature was generated by the associated private key. Therefore, by providing a message, its signature, and the public key, the source of the message can be publicly authenticated.
In bitcoin, the Elliptic Curve Digital Signature Algorithm (ECDSA) is used for signing (as the name suggests!). For a more detailed explanation of how ECDSA works, I would suggest this great article, which explains it in detail (Warning: it’s a technical article!).
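For the curious, the sign-then-verify flow can be sketched in pure Python. This is a minimal, unoptimised illustration over the secp256k1 curve, the curve bitcoin uses; the helper names are mine, the message is hashed with a single SHA-256 for simplicity, and none of this is hardened against real-world attacks, so please don't use it to protect actual funds.

```python
import hashlib
import secrets

# secp256k1 domain parameters (curve: y^2 = x^3 + 7 over the field of size P)
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return (x3, (slope * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with the double-and-add method."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def message_hash(message: bytes) -> int:
    # Simplification: a single SHA-256 of the raw message.
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key: int, message: bytes):
    """Produce an (r, s) signature using a fresh random nonce."""
    z = message_hash(message)
    while True:
        k = secrets.randbelow(N - 1) + 1
        r = scalar_mult(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * private_key) % N
        if r and s:
            return (r, s)


def verify(public_key, message: bytes, signature) -> bool:
    """Check a signature using only the public key."""
    r, s = signature
    if not (1 <= r < N and 1 <= s < N):
        return False
    s_inv = pow(s, -1, N)
    u1 = message_hash(message) * s_inv % N
    u2 = r * s_inv % N
    point = point_add(scalar_mult(u1, G), scalar_mult(u2, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1
public_key = scalar_mult(private_key, G)  # the public key is a curve point
signature = sign(private_key, b"pay 1 BTC to Alice")
print(verify(public_key, b"pay 1 BTC to Alice", signature))
print(verify(public_key, b"pay 100 BTC to Mallory", signature))
```

Notice that verification never touches the private key: the message, the signature and the public key are enough.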
Another important concept connected to cryptography and widely used in bitcoin transactions is hashing. Hashing algorithms, given any message, generate a “hash” (i.e. a string of seemingly random characters) of a fixed length, in such a way that it is straightforward to compute the hash from the input message, but very hard to get the input message back from the hash. Here’s an interactive website to see what hashes look like.
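If you'd rather try it locally, here is a small sketch using SHA-256 from Python's standard hashlib module. SHA-256 is one of the hash functions bitcoin relies on; the helper name sha256_hex and the sample inputs are just mine.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 hash of a text message as 64 hexadecimal characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Nearly identical inputs produce completely unrelated-looking hashes.
for text in ("bitcoin", "bitcoin!", "Bitcoin"):
    print(f"{text!r:>10} -> {sha256_hex(text)}")
```

Whatever the input length, the output always has the same size, and changing a single character scrambles the whole result.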
Addresses come in a few shapes, each with its own characteristics. Up to now, I’ve been using the public key as the bitcoin address, but really this is only one type of bitcoin address, with its own payment method called Pay To Public Key, and one which is increasingly deprecated. Instead, nowadays most bitcoin addresses are derived from the public key. There are two important bitcoin address formats: Public Key Hash and Bech32.
Both are generated by hashing the public key in specific ways. These address types were created for two main reasons: to make addresses more legible by shortening them, and to add an extra layer of privacy.
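As a rough sketch of the Public Key Hash format: the public key is hashed with SHA-256 and then RIPEMD-160 to get a 20-byte value, which is then wrapped with a version byte and a checksum and written in Base58. The snippet below starts from the 20-byte hash, because RIPEMD-160 is not available in every Python build; the function names are my own.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def base58_encode(data: bytes) -> str:
    """Encode bytes in Base58; each leading zero byte becomes a '1'."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


def p2pkh_address(pubkey_hash: bytes) -> str:
    """Turn a 20-byte public key hash into a mainnet Public Key Hash address."""
    if len(pubkey_hash) != 20:
        raise ValueError("expected a 20-byte public key hash")
    payload = b"\x00" + pubkey_hash          # 0x00 is the mainnet version byte
    checksum = double_sha256(payload)[:4]    # guards against typos
    return base58_encode(payload + checksum)


# Placeholder hash for illustration; a real one comes from hashing a public key.
example_hash = bytes.fromhex("00" * 20)
print(p2pkh_address(example_hash))
```

The version byte is why these addresses start with a "1", and the checksum is what lets a wallet reject a mistyped address before any money is sent.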
Many wallet programs and services show your wallet as a single bank account with a given balance, so that when you make a transaction, you only specify the target address and the amount, and the rest is taken care of for you. This gives the impression that you only have one private-public key pair, and that all transactions simply go in or out of it. In reality, what this software hides is that you have (or rather should have!) multiple private keys, along with the associated public keys and addresses.
As mentioned in the previous article, a transaction does not require all inputs to come from the same address; it only requires that you prove ownership of each of the addresses. Thanks to this, having your bitcoins dispersed across several addresses is not a problem. The software’s job is to act as a merger, taking care of all the details when making a transaction, including creating new key pairs very often, usually even after every transaction.
There are good reasons for that. The main one is privacy. One of the downsides of bitcoin’s decentralised nature is that all the information is available to everyone. If you were to use the same bitcoin address every time, it would become very easy to track every single transaction you are involved in by simply monitoring your address. Besides being intrusive, this may also be dangerous. If someone could link the address to you (e.g. you posted your bitcoin address on your website), they would know exactly how much you own, and could find you and extort you in one way or another. Instead, by constantly changing addresses, you make this tracking much harder.
Using a past transaction output
As previously mentioned, transactions are made to a bitcoin address. When you wish to use an unspent output for a transaction, you first need to provide the information to find it in the blockchain. Then, in the simplest case, you provide the public key as well as a signature, which together prove that you are the owner of the private-public key pair, and give you clearance to use the unspent amount as an input for the transaction.
You can repeat this for multiple unspent outputs in order to reach the amount you wish to transact. A consequence of the indivisibility of bitcoin outputs is that it is very unlikely that they add up exactly to the value you wish to transact. Instead, you will pretty much every time end up with a total higher than what you wish to use.
If you did nothing about it, this extra amount would automatically become a transaction fee, which the miner of the block (i.e. the computer which processes the transaction) would keep. Of course, you might not want that, since this bitcoin “change” can be quite large, so what is commonly done is simply sending it back to a new address of yours as an unspent output.
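Here is a small sketch of that bookkeeping: pick unspent outputs until the target amount plus a fee is covered, and compute the change to send back to yourself. The UnspentOutput class, the select_inputs function and the "largest first" strategy are illustrative assumptions of mine; real wallets use more refined selection rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnspentOutput:
    txid: str     # transaction that created the output
    index: int    # position of the output within that transaction
    amount: int   # value in satoshis


def select_inputs(utxos, target: int, fee: int):
    """Greedily pick outputs (largest first) until target + fee is covered.

    Returns the chosen outputs and the change to send back to yourself.
    """
    selected, total = [], 0
    for utxo in sorted(utxos, key=lambda u: u.amount, reverse=True):
        selected.append(utxo)
        total += utxo.amount
        if total >= target + fee:
            return selected, total - target - fee
    raise ValueError("insufficient funds")


wallet = [
    UnspentOutput("aa" * 32, 0, 20_000),
    UnspentOutput("bb" * 32, 1, 50_000),
    UnspentOutput("cc" * 32, 0, 30_000),
]
inputs, change = select_inputs(wallet, target=60_000, fee=1_000)
print([u.amount for u in inputs], "change:", change)
```

If the change output were left out, the whole difference between inputs and outputs would go to the miner as a fee.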
However, transactions can actually be a lot more complicated. When creating an output, the sender can choose to add conditions, which are encoded in the scriptPubKey. These conditions can vary widely: they can include the need for a password, for multiple specific signatures, etc.
The scripts can become very long and complicated, and since larger transactions take up more space in a block and put more strain on the network, they usually require higher transaction fees to be worth a miner’s while. The core developers noticed this, and decided to create the Pay-To-Script-Hash addresses.
In the bitcoin world, a script is a list of instructions, which can be made of any number of operations. There’s a list of allowed functions, which you can find here. Most of the basic operations you can think of are allowed, such as if conditions, bitwise logic, and of course some more complex cryptographic functions such as hashing algorithms.
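To give a feel for how such a list of instructions is evaluated, here is a toy stack-based interpreter supporting just four operations. It is only a sketch under my own simplifications (scripts are Python lists rather than raw bytes, and only a handful of opcodes exist); it mimics the idea of Bitcoin Script without being compatible with it. The example locks an output behind a password by storing the password's hash.

```python
import hashlib


def run_script(script) -> bool:
    """Evaluate a toy script: bytes are pushed on the stack, strings are operations."""
    stack = []
    try:
        for item in script:
            if isinstance(item, bytes):
                stack.append(item)                      # data push
            elif item == "OP_DUP":
                stack.append(stack[-1])
            elif item == "OP_SHA256":
                stack.append(hashlib.sha256(stack.pop()).digest())
            elif item == "OP_EQUAL":
                a, b = stack.pop(), stack.pop()
                stack.append(b"\x01" if a == b else b"")
            elif item == "OP_VERIFY":
                if not any(stack.pop()):
                    return False
            else:
                raise ValueError(f"unsupported operation: {item}")
    except IndexError:
        return False                                    # ran out of stack items
    # The script succeeds if the top item is non-empty and non-zero.
    return bool(stack) and any(stack[-1])


# Condition set by the sender: "whoever knows the password may spend".
password = b"open sesame"
locking_script = ["OP_SHA256", hashlib.sha256(password).digest(), "OP_EQUAL"]

print(run_script([b"open sesame"] + locking_script))
print(run_script([b"wrong guess"] + locking_script))
```

The spender's data runs first and the sender's conditions run after it, which is roughly how an unlocking script and a scriptPubKey are combined.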
Pay-To-Script-Hash (P2SH) addresses are very interesting, since they are generated directly by hashing the script which contains all the conditions, called the “redeem script”. That is, we literally hash the bytes of the serialized instructions, and use that hash as the address. In addition to this size (and therefore transaction fee) saving for the sender, it moves the responsibility of providing the redeem script from the sender to the receiver.
The redeem script is nothing but the script that sets the conditions of payment. The most common operation used in this script is OP_CHECKMULTISIG, which requires signatures from several of a set of public keys (for example 2 out of 3) to redeem the bitcoin output. It is important to notice that this redeem script is not public until the unspent output is used as an input: it is the payee’s responsibility to provide the script!
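As a sketch of what this looks like in bytes, the snippet below serializes a 2-of-3 multi-signature redeem script (using the OP_CHECKMULTISIG opcode of Bitcoin Script) and the short script the sender actually puts in the output. The function names and the dummy public keys are assumptions for illustration. The hash160 helper depends on RIPEMD-160, which some Python builds do not ship, so the example below does not call it.

```python
import hashlib

OP_HASH160 = 0xA9
OP_EQUAL = 0x87
OP_CHECKMULTISIG = 0xAE


def small_int_opcode(value: int) -> int:
    """OP_1 .. OP_16 are encoded as the bytes 0x51 .. 0x60."""
    if not 1 <= value <= 16:
        raise ValueError("value must be between 1 and 16")
    return 0x50 + value


def multisig_redeem_script(required: int, pubkeys) -> bytes:
    """Serialize: <required> <pubkey>... <number of keys> OP_CHECKMULTISIG."""
    if not 1 <= required <= len(pubkeys):
        raise ValueError("required signatures must be between 1 and the key count")
    script = bytes([small_int_opcode(required)])
    for key in pubkeys:
        if len(key) != 33:
            raise ValueError("expected 33-byte compressed public keys")
        script += bytes([len(key)]) + key   # length byte doubles as the push opcode
    return script + bytes([small_int_opcode(len(pubkeys)), OP_CHECKMULTISIG])


def hash160(data: bytes) -> bytes:
    """SHA-256 then RIPEMD-160; raises ValueError where RIPEMD-160 is unavailable."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()


def p2sh_script_pubkey(script_hash: bytes) -> bytes:
    """What the sender writes: OP_HASH160 <20-byte script hash> OP_EQUAL."""
    if len(script_hash) != 20:
        raise ValueError("expected a 20-byte script hash")
    return bytes([OP_HASH160, 20]) + script_hash + bytes([OP_EQUAL])


# Dummy keys for illustration only; they are not valid curve points.
dummy_keys = [bytes([0x02]) + bytes([i]) * 32 for i in (1, 2, 3)]
redeem_script = multisig_redeem_script(2, dummy_keys)
print(len(redeem_script), "bytes in the redeem script")
print(len(p2sh_script_pubkey(bytes(20))), "bytes in the output script")
```

Whatever the size of the redeem script, the sender only ever pays for the short fixed-size output script; the full script appears on the blockchain when the payee spends.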
The multi-signature operation is just one of the conventional basic scripts. As stated earlier, there are several operations allowed by these scripts, making it possible to create complex conditions, often called “digital contracts”. Let’s look at a practical example. Suppose I’m the middle man in an agreement, responsible for a deposit in case of dispute. I could make a script such that either the client or the owner gets the money, but which can only be unlocked if I provide some hidden information, for instance a signature using my private key. This way, in case of a dispute, the money is frozen (at no cost) until I make a choice as a judge, and give the required information to one of the parties to unlock the deposit.
There is a lot more to be said about transactions. For instance, we didn’t go through exactly how the transactions are signed and verified. Additionally, the segwit soft fork (which was an update to the network) was not discussed, even though it is directly connected to how transactions are signed, and to why P2SH and bech32 addresses are becoming more and more popular, which I will discuss further in the future.
In the next articles, we will be looking more closely at the blockchain, and shall discuss the security of transactions. If you’ve enjoyed the series up until now, don’t hesitate to follow so you’ll be notified when the next episode is published!