The concept of hashing is fundamental to programming, and involves the use of mathematical functions to place incoming data into a table in a fast, organized fashion. One of the highest priorities in hashing is to organize the table in such a fashion that lookup performance is as close to O(1) as possible.
In general, hashing refers to the use of a hashing function that takes a key as input. The hash function maps the key to its home bucket or bin, the location in the table where the associated data is to be stored. In other words:
h(k) -> {0, 1, 2, ..., m-2, m-1}

where k is our key, and 0..m-1 are our buckets/bins.
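The mapping above can be sketched in a few lines of Python. This is a minimal illustration, not a production hash table; the names h, M, and buckets are chosen here for clarity and the simple modulo hash is just one possible choice of function.

```python
M = 8  # m, the number of buckets/bins

def h(k: int) -> int:
    """Map an integer key to its home bucket, an index in 0..M-1."""
    return k % M

# Each key lands in exactly one home bucket.
buckets = [[] for _ in range(M)]
for key in [15, 23, 7, 42, 8]:
    buckets[h(key)].append(key)

print([h(k) for k in [15, 23, 7, 42, 8]])  # → [7, 7, 7, 2, 0]
```

Note that 15, 23, and 7 all map to bucket 7: that is a collision, the central difficulty the ideals below are concerned with.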
Some of the ideals of hashing and hash functions are:
There is no set method or practice that will systematically decrease the collision rate in every situation. In fact, in some schemes tolerating more collisions can be acceptable or even beneficial for overall performance. Unfortunately, what is "desired" for collisions is ambiguous, and must be considered on a case-by-case basis.
For uniform key distribution, the condition must exist that P_i, the probability that h(k) = i, is equal to 1/m for every bucket i, where m is the number of buckets:

P_i = P(h(k) = i) = 1/m, for i = 0, 1, ..., m-1
Minimal key computation complexity is rather simple to understand. Simply put, we don't want to multiply our data by irrational constants, take mods of the results, and engage in other such nonsense. The hashing algorithm should be at least moderately intuitive and cheap to compute.
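As a sketch of what "moderately intuitive" might look like, here is a simple polynomial string hash using only small integer arithmetic. The base 31 is a common convention (used, for example, by Java's String.hashCode), not a requirement; the function name is illustrative.

```python
def simple_hash(s: str, m: int) -> int:
    """A simple polynomial hash: cheap, deterministic, easy to follow."""
    value = 0
    for ch in s:
        # Multiply by a small constant base and add the next character code.
        value = (value * 31 + ord(ch)) % m
    return value

print(simple_hash("hash", 16))  # a bucket index in 0..15
```

Every step is an integer multiply, add, and mod, so the cost per key is low and the behavior is easy to reason about, which is exactly the ideal described above.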