Hash functions

A function that can map any data of arbitrary size to some data of fixed size is called a hash function. More precisely, if f(x) is a hash function, then for every possible string x, there exists a value of fixed size y such that y = f(x). The fixed size is known as the size of the hash. Another important thing that makes hash functions useful is that they are quickly computable, which accounts for their usage in hash tables to quickly locate a piece of data stored earlier.

To explain with a simple analogy, this is the same thing that is used by a dictionary. Let's say that f(x) = The first letter of the string x. Then essentially we are putting all strings starting with the same letter in the same category (a bucket). A good dictionary is marked with bookmarks at the starting page of each letter, and is sorted lexicographically, therefore if we want to search for the word currency for example, we would directly jump to the c bookmark (here c is the hash for the word 'currency') and do a linear search from there. This makes the search extremely fast. Of course, multiple strings are bound to have the same hashes (see collisions).

Cryptographic hash functions

For us, not all hash functions are interesting. We are rather interested particularly in cryptographic hash functions, which are special hash functions with these additional properties:

  • They are collision resistant
  • The hashes they produce cannot be reverse traced to the input
  • It is not feasible to produce any randomly desired output as a hash

Let's explore these properties in a little detail.

Collision resistance

In the context of hash functions, when two different inputs produce the same output, it is known as a collision. In the above example of a dictionary, all the words starting with the same letter will collide as they all have the same hashes. Now if you focus on the definition of hash functions, see that there exist collisions for each and every hash function. We can never have a hash function which has absolutely 0 collisions. Why? because the input can be of arbitrary size, while the output is limited to strings of some fixed size only. Hence there must exist collisions, no hash function is collision free. It is important to understand the difference between collision-free and collision-resistant.

So what do we mean by collision resistance?

A hash function \(H\) is said to be collision resistant if it is infeasible to find ​two values, \(x\) and \(y\), such that \(x \neq y\), but \(H(x) = H(y)\).

Note that the definition says 'infeasible' and not 'impossible'. In plain words, there exist collisions, a lot of collisions, an infinite number of collisions, but nobody can find a single one of them. That's what makes up the first property of cryptographic hash functions. It may sound a loosely held assumption that nobody can find a collision, but let's have faith for now. On a side note, can you think of a procedure that is always guaranteed to find a collision? ... One such method is as follows: Suppose the fixed size of the output of \(H(x)\) is \(256\) bits, then we know that there are only \(2^{256}\) possible outputs. So we can simply take any random and distinct \(2^{256} + 1\) strings and compute their hashes, and there is guaranteed to be at least 1 collision.

Just look at that huge number! On average, one needs to compute \(2^{128}\) hashes to find at least one collision in a 256 bit hash. Even if a computer computes 1 million hashes / second, it will take about \(10^{25}\) years to find a collision. If that doesn't tell you how difficult it is, recall that the age of the universe is just about \(1.38 \times 10^{9}\) years.

Obscure inverse function

Don't know what an inverse function is? Look it up here.

If \(H(x)\) is a cryptographic hash function, then given a particular value \(y\), it must be infeasible to figure out the value of \(x\) such that \(y = H(x)\).

In simple terms, it must be infeasible to find \(H^{-1}(y)\) for any randomly given \(y\). But note that this is clearly not achievable if the number of possible inputs (domain) and thus the output space (range) is very small. For example in a dictionary, for a given hash (let's say C), we can very easily find numerous values of \(x\) such that \(H(x) = C\) because every word starting with the letter C was mapped to the hash 'C'. But if we consider our previous example of the 256 bit hash, then it's not very difficult to see that it is quite impractical to do it with this hash function.

Fortunately, we can very easily obscure the inverting of even a small range hash function, by combining each input with a highly spread out random variable. Therefore, formally, a hash function \(H(x)\) fulfills the obscure inverting property if on choosing a value \(a\) from a high min-entropy probability distribution and being given the value of \(H(a + x)\) (where + means concatenation), it is infeasible to find \(x\).

To have a better sense of what min-entropy means, let's try to befriend the term. Min-entropy is a measure of how predictable an outcome is. Now intuitively, for higher security we want the outcome to be the least predictable, or most random, and this idea is beautifully captured by high min-entropy. So high min-entropy simply means that if we sample a data point from the distribution in consideration, there is no particular value which is noticeably more likely to occur than other values. Min-entropy simply gives this likelihood a number and measures it as a mathematical quantity, which makes it easy to mathematically quantize the obscureness.

A real puzzle

This is the third property of a cryptographic hash function, also known as puzzle friendliness. To understand this completely, we will first look at the mathematical definition and then look at the intuition to realize why it is so useful.

A hash function \(H\) is said to be puzzle friendly if for every n-bit output value \(y\), if \(k\) is chosen randomly from a distribution with high min-entropy, then it is infeasible to find \(x\), such that \(H(x + k) = y\) in time significantly less than \(2^n\).

What does this mean. Well, go ahead with the intuition, it means that for a given hash function, it must be super-difficult for anyone to make the hash function output a specifically desired value, which means that no one can make the hash function produce whatever and whenever they want just by controlling the input \(x\), and there is no solving strategy for finding \(x\) which is much better than simply trying random values of \(x\) to produce \(y\).


SHA (pronounced sha as in sharp, abbreviation for Secure Hash Algorithm) is a cryptographic hash algorithm (you know what that means now). Why are we talking about this particular algorithm when there are so many other cryptographic hash algorithms? Because this is used a lot in bitcoin and it is one of the top-notch guys in it's category.

Merkle-Damgard Transform

We want our hash function to work on inputs of arbitrary length and produce fixed size output rather than working on fixed length. But we have not seen up to now how to create this feature out of ordinary fixed length input hash functions. Merkle-Damgard transform is a scheme designed for specifically this purpose. SHA-256 also uses this scheme. So we want to convert a function \(H(x)\) (a hash function which will be called a compression function now) which takes fixed size inputs and produces fixed size outputs to a function \(F(x)\) which can take arbitrary inputs and produce fixed size outputs. Let's see how we can wrap the function \(H(x)\) inside a scheme and use it to work on inputs of arbitrary lengths.

The scheme is very simple (except the super-cool name). If the compression function \(H(x)\) takes inputs of size (in bits) \(m\) and generates output of fixed size \(n\) where \(m > n\), then this is what we do:

  1. Initialize a sequence of \(n\) bits. This is known as the Initialization Vector (IV) and you can look it up in a standards document. An example document section for SHA-256 can be found here
  2. Break the input into blocks of size \(m - n\)
  3. Take the first block and concatenate it to \(IV\), and feed the result to the compression function. Note that the total input length is now \((m-n) + n = m\), which is what the compression function takes
  4. Take the output in 3 as the new \(IV\) and repeat with subsequent blocks of the original input
  5. Return the output for the last block as the final result

For SHA-256, \(m = 768\) and \(n = 256\). The original input is padded so that it becomes a multiple of \(m-n = 512\).

In the next section, we will look at using hash functions to build data structures that will help us implement the bitcoin architecture.

results matching ""

    No results matching ""