What is Hashing? How Do Hashing Algorithms Work?

İnanç Yılmaz
5 min read · Dec 4, 2022


If you're interested in software, you've probably come across terms like hashing, MD5, and SHA. Each of them was a turning point for me when I learned it. We'll go a little deeper into these terms than a client developer strictly needs to.

What is Hashing?

Hashing is the irreversible transformation of any given key or string of characters into another value.

What Are We Using Hashing For?

Hashing is widely used, especially in banking applications such as e-signatures and POS devices, where it helps provide security, and in many other areas. Assume we have a website and need to design a login flow. We can add a username field quickly, but how do we store and verify users' passwords? Should we store passwords exactly as they are? Or do we need an extra step to secure them?

It's dangerous to store users' passwords in plain text in a DB. To avoid that, we use hashing algorithms. For example, let's create a password and choose "123".

Hashing Process (MD5)

We transformed the password into a value using the MD5 hashing algorithm (we'll talk about it later), so now we have a meaningless-looking string that we can store in our DB instead of the password itself. When a user tries to log in, we hash the password they entered and compare the result with the hash stored under that user's nickname in our DB. The main reason to use hashing algorithms, therefore, is security. (MD5 is used here only to keep the example simple; for real password storage, a salted and deliberately slow algorithm such as bcrypt or Argon2 is the safer choice.)
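To make the login flow above concrete, here is a minimal sketch in Python. The `users` dictionary stands in for the DB, and the nickname "alice" and the helper names `md5_hex` and `login` are my own assumptions for illustration, not part of any real system.

```python
import hashlib
import hmac


def md5_hex(password: str) -> str:
    """Return the MD5 hash of a password as a hexadecimal string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Hypothetical stand-in for the DB: nickname -> stored password hash.
# Only the hash of "123" is stored, never the password itself.
users = {"alice": md5_hex("123")}


def login(nickname: str, entered_password: str) -> bool:
    """Hash the entered password and compare it with the stored hash."""
    stored_hash = users.get(nickname)
    if stored_hash is None:
        return False  # unknown nickname
    # compare_digest compares the two strings in constant time
    return hmac.compare_digest(stored_hash, md5_hex(entered_password))


print(login("alice", "123"))  # correct password
print(login("alice", "124"))  # wrong password
```

The point of the sketch is only that the comparison happens between hashes; the DB never needs to see the original password.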

How Does Hashing Work?

Hashing is a method of transforming data into a fixed-size value known as a hash value. Unlike encryption, it is not meant to be reversed with a key. The hash value can be used to verify the integrity of the original data.

The process of hashing typically involves taking a piece of data and passing it through a mathematical algorithm, which produces a hash value that is, for practical purposes, unique to that data. This value is typically much shorter than the original data and is often represented as a string of characters.
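A quick way to see the fixed-size property is to hash a very short input and a very long one. This is a small sketch using SHA-256 from Python's standard `hashlib` module; the helper name `sha256_hex` and the sample inputs are assumptions for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")              # 1 byte of input
long_digest = sha256_hex(b"a" * 1_000_000)   # 1,000,000 bytes of input

# Both hash values have the same length, whatever the input size.
print(len(short_digest), len(long_digest))  # 64 64
```

A SHA-256 hash is always 256 bits, which is 64 hexadecimal characters.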

One of the key properties of a good hash function is that it is one-way, meaning that it is easy to compute the hash value for any given piece of data, but very difficult to determine the original data from the hash value. This makes it difficult for anyone who obtains a hash to recover the data behind it.

Another important property of a good hash function is that it is collision-resistant, meaning that it is very unlikely that two different pieces of data will produce the same hash value. This makes hash values unique in practice and helps to detect data tampering.
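The tampering case can be sketched like this: we publish the hash of some data, and later check whether the data we received still produces the same hash. The message contents and the helper names `sha256_hex` and `is_intact` are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_hash: str) -> bool:
    """Check that the data still matches a previously published hash."""
    return sha256_hex(data) == expected_hash


original = b"transfer 100 to account 42"
published_hash = sha256_hex(original)

# Someone changes a single character of the message.
tampered = b"transfer 900 to account 42"

print(is_intact(original, published_hash))  # True
print(is_intact(tampered, published_hash))  # False
```

Even a one-character change produces a different hash value, so the modification is detected.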

Overall, hashing is a useful way to protect data and ensure its integrity.

Why Is Hashing Irreversible?

When I was doing some research about hashing, I realized that I had forgotten the topic of functions from maths. I had assumed that every function could be reversed, as long as it produced a definite result.

Let's take a look at functions to understand better.

Diagram of Injective Function

In an injective function, as you can see, every value maps to its own unique equivalent. That is exactly where I struggled. But not all functions have to be injective. Let me explain simply.

Sample function results

As you can see, when we take the mod of those values, the results can be the same :). That means that even if you know the result of the mod operation and the number the mod was taken by, you can't get back to the original value.
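The same idea in a few lines of Python. The inputs and the modulus 10 are values I picked for illustration; they are not the ones from the figure.

```python
def mod_n(value: int, n: int = 10) -> int:
    """A deliberately simple many-to-one function: value mod n."""
    return value % n


inputs = [3, 13, 23, 103]
results = [mod_n(v) for v in inputs]
print(results)  # [3, 3, 3, 3]


def candidates(result: int, n: int, limit: int) -> list[int]:
    """List every value below `limit` that gives `result` when taken mod n."""
    return [v for v in range(limit) if v % n == result]


# Knowing the result (3) and the modulus (10) still leaves many possible inputs.
print(candidates(3, 10, 50))  # [3, 13, 23, 33, 43]
```

Because several inputs share one output, there is no way to tell which input was the original.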

Diagram of Surjective Function

As you can see above, we can't reverse the function's result. One thing to remember: don't think of the B values in the diagram as the output of hashing; we're only looking at how a hashing function works inside. It's just one small function in the algorithm, but it is what makes the algorithm irreversible. We'll take a look at producing unique values below. For now, we have learned the basics of why hashing functions are irreversible.

So far so good, but in the diagram, different values in column A can map to the same result. So how does hashing manage to produce unique values?

How Do Hashing Algorithms Produce Unique Values?

It may sound as if hash functions generate a truly unique value for every possible input. That doesn't make sense; as we just saw, it's impossible. What hashing actually does is mix the value, and it mixes it many times over. I did some research, and even though it's hard to understand how hashing mixes values, I'll try to explain.

Let's pick a value to hash and imagine this is its equivalent in the ASCII table.

The equivalent of our value in the ASCII table
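In Python, the ASCII equivalent of a value can be looked up with the built-in `ord`. The value "abc" is just an example I chose; it is not necessarily the one from the figure.

```python
value = "abc"  # hypothetical value to hash

# ord() returns the character's code, which for these letters is its ASCII code.
ascii_codes = [ord(character) for character in value]
print(ascii_codes)  # [97, 98, 99]
```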

And let's try to create some confusion functions.

Custom confusion algorithms

So I created some custom algorithms. These are functions whose results cannot easily be traced back to the main value, and they are used more than once during the calculation. By applying them, we can generate values that look unique. Real hashing algorithms such as SHA are built on the same idea, but their confusion functions are much longer and far more carefully designed. When you combine all these functions (along with a few other small steps), you produce the final hash value.
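Here is a toy version of that idea in Python. It is only a sketch: the function names, the constants, and the number of rounds are arbitrary choices of mine, it is not how SHA works internally, and it must not be used for anything security-related.

```python
MASK_32 = 0xFFFFFFFF  # keep the state within 32 bits


def rotate_left(x: int, bits: int) -> int:
    """Rotate a 32-bit integer to the left by the given number of bits."""
    return ((x << bits) | (x >> (32 - bits))) & MASK_32


def confuse(state: int, number: int) -> int:
    """One toy confusion step: XOR, multiply, then rotate."""
    state ^= number
    state = (state * 0x01000193) & MASK_32  # arbitrary odd multiplier
    return rotate_left(state, 5)


def toy_hash(text: str, rounds: int = 3) -> str:
    """Mix the ASCII codes of the text into a 32-bit state."""
    state = 0x811C9DC5  # arbitrary starting value
    for code in text.encode("ascii"):
        state = confuse(state, code)  # the same function, applied repeatedly
    for round_index in range(rounds):
        state = confuse(state, round_index)  # a few extra mixing rounds
    return f"{state:08x}"


print(toy_hash("abc"))
print(toy_hash("abd"))  # one letter changed, different result
```

The same input always gives the same result, while a one-letter change gives a different one. With only 32 bits of state, though, different inputs will eventually collide, which is one reason real algorithms use much larger states and many more rounds.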

How Can Hashing Algorithms Produce Unique Values?

The answer is that they use confusion functions. I know that's not a surprise. I'll cover how SHA256 works in full later, but for now, let's try to create a confusion function together.

Difference Between MD5 and SHA

Long Form

MD5 stands for Message-Digest Algorithm, while SHA stands for Secure Hash Algorithm.

Security

An important difference between MD5 and SHA is that the SHA family, and in particular the SHA-2 variants such as SHA-256, is more secure than MD5.

Speed

Another difference between MD5 and SHA is that MD5 is generally faster to compute than SHA.

Attacks

Practical collision attacks have been demonstrated against MD5 (and against the older SHA-1), whereas no practical attacks are known against the SHA-2 family, such as SHA-256. This is another difference between MD5 and SHA.
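One difference that is easy to check yourself is the size of the output. This sketch uses Python's standard `hashlib` module, with SHA-256 standing in for the SHA family; the sample message is arbitrary.

```python
import hashlib

message = b"123"

md5_hash = hashlib.md5(message)
sha256_hash = hashlib.sha256(message)

# digest_size is given in bytes, so multiply by 8 to get bits.
md5_bits = md5_hash.digest_size * 8
sha256_bits = sha256_hash.digest_size * 8

print("MD5:    ", md5_bits, "bits,", len(md5_hash.hexdigest()), "hex characters")
print("SHA-256:", sha256_bits, "bits,", len(sha256_hash.hexdigest()), "hex characters")
```

MD5 produces a 128-bit value and SHA-256 a 256-bit value. A longer output is not the whole story, but it gives an attacker a much larger space to search for collisions.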

Written by İnanç Yılmaz

Android Developer, Industrial Engineer