How JavaScript uses hashing

What is Hashing?

- Hashing is generating a value or values from a string of text using a mathematical function.
- A hash function is any function that can be used to map data of arbitrary size to data of a fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.
- Hashing is one way to enable security during the process of message transmission when the message is intended for a particular recipient only. A formula generates the hash, which helps to protect the security of the transmission against tampering.
- Hashing is also a method of sorting key values in a database table in an efficient manner, because it is faster.
- A cyclic redundancy check (CRC) is an error-detecting code often used for detection of accidental changes to data. Encoding the same data string using CRC32 will always result in the same hash output, thus CRC32 is sometimes used as a hash algorithm for file integrity checks.
- Hashes play a role in security systems where they're used to ensure that transmitted messages have not been tampered with. The sender generates a hash of the message, encrypts it, and sends it with the message itself. The recipient then decrypts both the message and the hash, produces another hash from the received message, and compares the two hashes. If they're the same, there is a very high probability that the message was transmitted intact.
(techopedia.com & wikipedia.org & ebopedia.com)

What is Hash Table?

Hash Table is a data structure which stores data in an associative manner. In a hash table, data is stored in an array format, where each data value has its own unique index value. Access of data becomes very fast if we know the index of the desired data. Thus, it becomes a data structure in which insertion and search operations are very fast irrespective of the size of the data. Hash Table uses an array as a storage medium and uses hash technique to generate an index where an element is to be inserted or is to be located from.
Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions must be accommodated in some way.
In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table. Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized) constant average cost per operation.

There are two types of bucket.

- Open Hashing (Separate Chaining): In open hashing, keys are stored in linked lists attached to cells of a hash table. A scheme in which each position in the hash table has a list to handle collisions. Each position may be just a link to the list (direct chaining) or may be an item and a link, essentially, the head of a list.

- Closed Hashing (Open Addressing): In closed hashing, all keys are stored in the hash table itself without the use of linked lists.
Well-known probe sequences include:
Linear probing in which the interval between probes is fixed — often set to 1 and  is a form of open addressing. In these schemes, each cell of a hash table stores a single key–value pair. When the hash function causes a collision by mapping a new key to a cell of the hash table that is already occupied by another key, linear probing searches the table for the closest following free location and inserts the new key there. Lookups are performed in the same way, by searching the table sequentially starting at the position given by the hash function, until finding a cell with a matching key or an empty cell.
Quadratic probing in which the interval between probes increases linearly (hence, the indices are described by a quadratic function). When an incoming data's hash value indicates it should be stored in an already-occupied slot or bucket. Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found. Quadratic probing can be a more efficient algorithm, since it better avoids the clustering problem that can occur with linear probing, although it is not immune. It also provides good memory caching because it preserves some locality of reference; however, linear probing has greater locality and, thus, better cache performance.
Double hashing in which the interval between probes is fixed for each record but is computed by another hash function. Double hashing is implemented in many popular libraries, it uses one hash value as a starting point and then repeatedly steps forward an interval until the desired value is located, an empty location is reached, or the entire table has been searched; but this interval is decided using a second, independent hash function (hence the name double hashing). Unlike linear probing and quadratic probing, the interval depends on the data, so that even values mapping to the same location have different bucket sequences; this minimizes repeated collisions and the effects of clustering.

What causes a collision in hashing?

A collision or clash occurs when two different inputs to a function, typically one used to compress large data items into a smaller or fixed size, produce the same output, called (depending on the application) a hash value, checksum, fingerprint, or digest.
In other words: A Hash Collision Attack is an attempt to find two input strings of a hash function that produce the same hash result. Because hash functions have infinite input length and a predefined output length, there is inevitably going to be the possibility of two different inputs that produce the same output hash.

What is Map Objects?

Map is mainly used for fast searching and looking up data.
Map objects are collections of key/value pairs where both the keys and values may be arbitrary ECMAScript language values. A distinct key value may only occur in one key/value pair within the Map’s collection. Distinct key values are discriminated using the SameValueZero comparison algorithm.
Map object must be implemented using either hash tables or other mechanisms that, on average, provide access times that are sublinear on the number of elements in the collection. The data structures used in this Map objects specification is only intended to describe the required observable semantics of Map objects. It is not intended to be a viable implementation model.
In other words:
Map is a data collection type (in a more fancy way — abstract data structure type), in which, data is stored in a form of pairs, which contains a unique key and value mapped to that key. And because of the uniqueness of each stored key, there is no duplicate pair stored.
Example: {(1, “smile”), (2, “cry”), (42, “happy”)}
And what about Object?
Regular Object in Javascript is dictionary type of data collection . Which means it also follows key-value stored concept like Map.
Example: {1: ‘smile’, 2: ‘cry’, 42: ‘happy’}
Therefore, by definition, Object and Map are based on the same concept — using key-value for storing data. However, like we always say — same same but different — they are indeed quite different from each other
(tutorialspoint.com & wikipedia.org & stackoverflow.com & learncryptography.com & xlinux.nist.gov & developer.mozilla.org & medium.com)

What is difference between Hashing and Encryption?

The key difference between encryption and hashing is that encrypted strings can be reversed back into their original decrypted form if you have the right key. In symmetric key encryption, the key to both encrypt and decrypt is exactly the same. This is what most people think of when they think of encryption.

Encryption should only ever be used over hashing when it is a necessity to decrypt the resulting message. For example, if you were trying to send secure messages to someone on the other side of the world, you would need to use encryption rather than hashing, as the message is no use to the receiver if they cannot decrypt it.
(securityinnovationeurope.com)

How do hashing improve the security of your applications in JavaScript?

It was announced that a crypto api will be released for javascript. Meaning it will be possible to do real encrypting and decrypting, and generating hashes directly in the web browser. Now it's actually possible! It should be noted though that even if the api is a candidate recommendation, the browser support isn't the best just yet. When this is written most modern browsers supports it, except internet explorer. But given some time it should surely have improved. The crypto api will surely open up a lot of doors for web development. But there's actually a couple different opinions if this is really a good or a bad thing.
The SubtleCrypto interface represents a set of cryptographic primitives. It is available via the Crypto.subtle properties available in a window context (via Window.crypto).
This is something really great since it gives us the possibility to do things like computing authentication codes, signing and verifying documents, encrypting and decrypting values, importing and exporting keys, generating new keys, and even generating cryptographically strong random numbers in the browser and to verify that data has not been tampered with by a third-party.
This interface allows a script to access the following primitives:
- digest, the ability to compute a hash of an arbitrary block of data, in order to detect any change in it.
- mac, the ability to compute a message authentication code.
- sign and verify, the ability to digitally sign a document, and to verify a signature.
- encrypt and decrypt, the ability to encode or decode a document.
- import and export, the ability to import a key or export a key.
- key generation, the ability to create a cryptographically secure key, or key pair, without the use of base key, but using the available entropy of the local system.
- key wrapping and unwrapping, the ability to transmit, and to receive, a key from a third party, encoded using another key, without exposing the underlying key to JavaScript.
- random, the ability to generate cryptographically sound pseudo-random numbers.
(developer.mozilla.org & webbjocke.com)

Comments

Popular posts from this blog

Concurrency in Web Development

Big-O notation basics for web developers