What is Hashing? – A complete Guide for Beginners
- What is Hashing?
- Benefits of using hashing.
- What are the properties of hash functions?
- What are hashing algorithms or What are Hashing Techniques?
What is Hashing?
Hashing is the process of using an algorithm (a hash function) to convert input of any length, such as a string of characters or a whole file, into a fixed-size value using mathematical operations. The data to be hashed is known as the “input”, the algorithm used is known as the “hash function”, and the output is called the “hash value”; some people call the hash value a message digest. The hash value acts as a fingerprint of what exactly is in the file: change the contents and the hash value changes too. It is usually written as a hexadecimal string.
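To make the terms above concrete, here is a minimal sketch in Python using the standard hashlib module. The choice of SHA-256 and the helper name hash_value are assumptions made for illustration; any hash function would show the same idea. It shows that a tiny input and a large input both produce a hash value of the same size.

```python
import hashlib


def hash_value(data: bytes) -> str:
    """Return the SHA-256 hash value of `data` as a hexadecimal string.

    `hash_value` is a hypothetical helper name used only for this example.
    """
    return hashlib.sha256(data).hexdigest()


# A very short input and a much longer one.
short_digest = hash_value(b"hi")
long_digest = hash_value(b"a much longer input " * 1000)

# Both hash values have the same length: 64 hexadecimal characters (256 bits).
print(len(short_digest), len(long_digest))
```

The input can be of any size, but the output size is fixed by the hash function that was chosen.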
Let’s put it this way. Assume you have many letters at home from telecommunication companies. You take all of the letters, put them into an envelope and close it. On that envelope you write “Telecom letter”. Now whenever you need to look for letters from telecommunication companies, you look for the envelope with “Telecom letter” on it.
In this case “Telecom letter” is the hash value of all the letters from telecom companies that you stored in that envelope. It is what makes this envelope different from the other envelopes you have at home.
Hashing allows us to protect information and store it securely. Hashing is also known as a one-way procedure because it can’t be reversed. It is not like encryption, where once you encrypt a file you can decrypt it back to its original state to read what is in it. The main difference between encryption and hashing is that encryption is used to keep the contents of a file secret, while hashing is used to verify that the file has not been changed. Once a file is hashed, the hash value can’t be turned back into the original file.
Now the question is: if we cannot reverse hashed data, what is the use of hashing and why is it important?
Benefits of using hashing.
1. Password Protection
Password security: Companies use hashing to protect their customers' passwords. Instead of storing the password itself, the website stores only its hash value, so even someone who can read the database does not see the actual password. This adds a layer of security on top of any encryption used to protect the credentials.
Joe enters the password “Idonotunderstand1&” on the website www.example.com. If this company really cares about its customers, it will hash the password so that nobody can read it from the database.
Hashed password (illustrative value, not a real digest): a5842qw821a2d48d57d2q578962d0q958752
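As a rough sketch of how a website might store Joe's password, the Python example below hashes it with PBKDF2 from the standard hashlib module. The use of a random salt, PBKDF2 with SHA-256, the iteration count and the function names hash_password and verify_password are all assumptions made for this example; the article itself only says that the password is hashed.

```python
import hashlib
import hmac
import os

# Assumed work factor for this sketch; real systems tune this value.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, digest) for a password. Hypothetical helper."""
    if salt is None:
        # A fresh random salt makes equal passwords hash differently.
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Hash the attempt with the stored salt and compare the results."""
    _, digest = hash_password(password, salt)
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(digest, expected_digest)


# The database would store only the salt and the digest, never the password.
stored_salt, stored_digest = hash_password("Idonotunderstand1&")
print(verify_password("Idonotunderstand1&", stored_salt, stored_digest))
```

When Joe logs in again, the website hashes what he typed and compares the result with the stored value, so the original password never needs to be kept.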
2. Transaction Protection
Hashing is also used in online and offline transactions. Once you make a payment, a hash value of your card details and other credentials is calculated and sent to the server along with them. The server calculates the hash value again and compares the two to see whether anything was changed on the way over the internet. If the hash value calculated when the user submitted their credentials and the hash value calculated by the server are different, the server will not proceed with the transaction.
3. File Integrity
Hashing is used to verify file integrity. As more and more attacks are performed, it is important to verify whether the file we are about to download is the legitimate file or not, because attackers sometimes replace a file on a website's server with a copy that contains malware (malicious software such as a virus). As soon as you hit download, the file is downloaded to your machine and the malware comes with it. That is why some websites publish the hash value of the file. After downloading the file, hash it yourself: if the output does not match the hash value given on the website, the file is corrupted or has been tampered with.
Hash functions are known for a few properties that make them useful. Let’s take a look at some of them.
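The check described above can be sketched in Python as follows. This assumes the website publishes a SHA-256 hash value as a hexadecimal string; the function names sha256_of_file and matches_published_hash and the chunk size are made up for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file piece by piece so large downloads fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_published_hash(path, published_hex):
    """Return True if the file's hash equals the value from the website."""
    # Published values are often upper case or have stray whitespace.
    return sha256_of_file(path) == published_hex.strip().lower()
```

A call such as matches_published_hash("installer.exe", value_from_website), where both arguments are placeholders, returns False when even a single byte of the download differs from the original file.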
What are the properties of hash functions?
1. One Way
As we already know, hashing is a one-way procedure: once a file is hashed, the hash value cannot be reversed to get the file back. This is why hashing is used to verify the integrity of a file, or to check whether any changes have been made to it.
2. Collision Free
Ideally, every different file gets a different hash value from a hash function.
Example: Bob and Mike create different passwords, so the hash values of the two passwords will be different.
There is only a minimal chance of two different files having the same hash value (this is called a collision). A point to note: if the same file is hashed again using the same hash function, it always produces the same output.
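Both points can be seen in a few lines of Python. This is only a sketch: SHA-256, the sample passwords and the helper names sha256_hex and differing_hex_digits are assumptions made for the example.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 hash value of a string as hexadecimal."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_hex_digits(a, b):
    """Count the positions at which two equal-length digests differ."""
    return sum(x != y for x, y in zip(a, b))


bob = sha256_hex("bob-password")
mike = sha256_hex("mike-password")

# Hashing the same input again gives exactly the same hash value.
bob_again = sha256_hex("bob-password")

# Changing a single character gives a completely different hash value.
almost_bob = sha256_hex("bob-passwore")

print(bob == bob_again)   # same input, same output
print(bob == mike)        # different inputs, different outputs
print(differing_hex_digits(bob, almost_bob), "of 64 hex digits differ")
```

Even a one-character change alters most of the digits, which is why a hash value works well for spotting changes.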
3. Fast Speed
A hash function that takes a long time to hash a file will slow down the whole procedure. A hash function should be fast enough to hash large amounts of data quickly while still being secure.
What are hashing algorithms or What are Hashing Techniques?
1. Message Digest (MD)
The MD family contains different versions of hash functions such as MD2, MD4 and MD5. MD5 was the most popular and widely used hash function in its early days. MD5 was designed by Ronald Rivest, known as one of the “fathers” of modern cryptography.
MD5 was published in April 1992 to replace MD4. MD5 digests have been widely used in the software world to provide assurance about the integrity of transferred files. But in 1996 a weakness was found in the design of MD5, and in 2004 researchers showed that two different inputs with the same MD5 hash value can actually be produced, so MD5 is not collision free. Collisions make many kinds of fraud possible, and hence MD5 is no longer recommended.
2. Secure Hash Algorithm (SHA)
The SHA family is published by NIST, and its earlier members (SHA-0, SHA-1 and SHA-2) were designed by the National Security Agency. There are different types of SHA functions such as:
SHA-0 is the first function of the SHA family. It was published in 1993 by the National Institute of Standards and Technology (NIST). It was not widely used because it had some vulnerabilities that were overcome by the new algorithm SHA-1.
SHA-1 was a widely used algorithm. It produces a 160-bit (20-byte) digest, usually rendered as a 40-digit hexadecimal number. It was used in several widely used applications and protocols, including Secure Sockets Layer (SSL), which was used to secure websites. In 2005 collision attacks against SHA-1 were published and it was considered compromised. After multiple successful attacks, SHA-1 was effectively ended for SSL certificates in 2017, when three big companies, Microsoft, Google and Mozilla, stopped accepting SHA-1 SSL certificates in their browsers.
The SHA-2 family has six hash functions with digests of 224, 256, 384 or 512 bits: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224 and SHA-512/256. The number in each name gives the size of the hash value it produces (for SHA-512/224 and SHA-512/256, it is the number after the slash). So far no practical attack has been found against SHA-2, but because SHA-2 shares the same structure and mathematical operations as its predecessor (SHA-1), there is a chance that SHA-2 could be attacked in the future.
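The digest sizes mentioned above can be checked with a short Python sketch. It uses only the algorithms that the standard hashlib module always provides, so SHA-512/224 and SHA-512/256 are left out, and SHA-1 is included just for comparison. The helper name digest_bits and the sample input are assumptions for this example.

```python
import hashlib

# SHA-1 for comparison, followed by four members of the SHA-2 family.
ALGORITHMS = ["sha1", "sha224", "sha256", "sha384", "sha512"]


def digest_bits(name, data=b"hello"):
    """Return the size in bits of the hash value produced by `name`."""
    return hashlib.new(name, data).digest_size * 8


sizes = {name: digest_bits(name) for name in ALGORITHMS}

for name, bits in sizes.items():
    # Each hexadecimal digit encodes 4 bits.
    print(f"{name}: {bits} bits, {bits // 4} hex digits")
```

The number in each SHA-2 name matches the number of bits reported, and SHA-1 comes out at 160 bits, or 40 hexadecimal digits.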
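SHA-3 came out of a public competition run by NIST: in 2012 NIST chose the Keccak algorithm as the new SHA-3 standard, which was published in 2015. Keccak was developed by Guido Bertoni, Joan Daemen, Michaël Peeters and Gilles Van Assche. SHA-3 uses a different internal design from SHA-2, so a weakness found in SHA-2 would not automatically apply to it. Its speed compared to SHA-2 depends on the platform: it tends to be fast in hardware, while in software SHA-2 is often faster.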
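As a small illustration, Python's standard hashlib module offers SHA-3 next to SHA-2. The sketch below hashes the same message with SHA-256 and SHA3-256; the sample message is an assumption for this example.

```python
import hashlib

message = b"hello world"

sha2_digest = hashlib.sha256(message).hexdigest()
sha3_digest = hashlib.sha3_256(message).hexdigest()

# Both are 256-bit hash values (64 hex digits), but they come from
# different algorithms, so the values themselves are unrelated.
same_length = len(sha2_digest) == len(sha3_digest) == 64

print(sha2_digest)
print(sha3_digest)
```

The two functions are used in exactly the same way, so a program can change the function name to move from SHA-2 to SHA-3.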
3. Hash-Based Message Authentication Code (HMAC)
In HMAC a hash function is combined with a secret key, as in HMAC-MD5 or HMAC-SHA1. Because only someone who knows the key can produce the right value, HMAC provides authenticity along with data integrity. It is used in network security protocols such as IPsec and TLS (the newer protocol that replaced SSL for securing data).
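Here is a minimal sketch of HMAC using Python's standard hmac module. It uses HMAC-SHA256 in place of the HMAC-MD5 and HMAC-SHA1 variants named above; the key, the messages and the function names sign and verify are made up for this example.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return the HMAC-SHA256 tag of `message` under the secret `key`."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it with the one received."""
    # compare_digest avoids leaking information through timing.
    return hmac.compare_digest(sign(key, message), tag)


# Hypothetical shared secret known only to sender and receiver.
secret_key = b"shared-secret"
tag = sign(secret_key, b"pay 10 to Bob")

print(verify(secret_key, b"pay 10 to Bob", tag))     # unchanged message
print(verify(secret_key, b"pay 1000 to Bob", tag))   # tampered message
```

A changed message or a wrong key both make the check fail, which is how HMAC gives integrity and authenticity at the same time.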
4. RIPEMD
RIPEMD stands for RACE Integrity Primitives Evaluation Message Digest. It was designed in the open research community and is generally known as a family of European hash functions. The family includes RIPEMD, RIPEMD-128 and RIPEMD-160, and there are also 256-bit and 320-bit versions of the algorithm. The original RIPEMD (128-bit) is based on the design principles used in MD4 and has some security vulnerabilities.
RIPEMD-128 was created to replace the original RIPEMD and fix its vulnerabilities. RIPEMD-160 is an improved version and the most widely used in the family. The 256-bit and 320-bit versions reduce the chance of accidental collision, but do not offer a higher level of security than RIPEMD-128 and RIPEMD-160 respectively.