Hash Map (Part 1)

Hash map (Hash table) is a data structure that maps keys to values with constant time. It is a very useful and commonly used data structure by programmers either for development or competitive programming. C++ programmer often use STL map container class as a hash map and Java programmers use HashMap collection class. Being a programmer, everyone knows that put and get operations of hash map have same complexity and it is O(1). Having this highly efficient data structure programmers often use hash map to store key-value information without any delay. But most of us perhaps never thought how hash map achieve this constant time complexity. In this article we will cover basic ideas behind hash map implementation and a simple implementation of hash map.

Simple Hash Map Implementation:

There are 3 things needed to implement a simple hash map.

  1. Hash code: You can generate hash code (typically an int or long variable) of any other objects mathematically.
  2. Bucket: A bucket is nothing but an index of a finite array.
  3. Linked List: Here singly linked list if enough.

In this simple implementation lets consider we have an array (bucket list) of finite size 8. Our hash method will return value less than 8 guaranteed. When putting key-value pair in hash map, if hash method gives output 3 as hash code of the key, we will simply put that key-value pair in index 3 of the array. Note that hash code of 2 different keys could be same and this situation is known as collision of hash code. As collision is unavoidable because of infinite keys and finite size of array there should be some other mechanism to encounter this collision. That's why to keep multiple key-value pairs in same bucket, linked list is required. Each bucket maintains it's own linked list so that in case collision happen it could store many key-value pair. Isn't it simple ?

Lets see steps of put and get operation precisely:

Put Operation:

Consider a key value pair (Key: "hello", Value: "world").

  1. Calculate hash value of key "hello". Suppose hashCode("hello") = 3.
  2. Put ("hello","world") in bucket 3. If there is already a pair in this bucket add ("hello","world") at end of the linked list maintained as this bucket.
  • Complexity: O(1) at best case, O(n) in worst case

Get Operation:

Consider a key value pair (Key: "hello", Value: "world").

  1. Calculate hash value of key "hello". Suppose hashCode("hello") = 3.
  2. Go to bucket 3. Search for the key-value in the linked list maintained as this bucket.
  • Complexity: O(1) at best case, O(n) in worst case
Insights:
  1. Performance of bucket based hash map implementation mostly depends on hash calculation. Collisions degrade hash map performance. For 100% collision lookup time is O(n). For a good implementation of hash calculation that keeps minimum collision (almost 0% over time) lookup time is O(1).
  2. Performance of bucket based hash map could be increased by using resizable bucket list (an array indeed). That means when all buckets are full with at least one key-value pair the bucket array doubles in size. Each doulbing takes only O(n) time to copy existing elements to newly allocated array space. This way amortized runtime would be O(1) for n operations. Number of buckets known as capacity.
  3. Performance of bucket based hash map also depends on another thing known as load factor. This indicates maximum number of elements a bucket can contain. If any of the existing bucket contains more than load factor amount of elements than bucket list/capacity will be increased.
  4. Java HashMap: Java HashMap implementation exactly depends on capacity and load factor and work as I described in last 3 insights.
  5. STP Map: STL Map use Red-Black tree in which put, get, remove operation both have O(lg n) complexity but it keeps element in order where bucket based hashmap unordered.
Full Code:
package com.mahbub.algorithms;

public class HashMapSimple_Demo {

    public static void main(String args[]) {
        HashMapSimple hashMapSimple = new HashMapSimple();
        hashMapSimple.put("email", "mahbub.kuet@gmail.com");
        hashMapSimple.put("username", "iamcrypticcoder");
        hashMapSimple.put("designation", "Sr. Software Engineer");

        // Replace current value
        hashMapSimple.put("username", "another_username");

        System.out.println("email: " + hashMapSimple.get("email"));
        System.out.println("username: " + hashMapSimple.get("username"));
        System.out.println("designation: " + hashMapSimple.get("designation"));
    }

    static class HashMapSimple {
        public static final int BIG_PRIME = 101;
        public static final int BUCKET_SIZE = 8;
        Node[] buckets;

        public HashMapSimple() {
            buckets = new Node[BUCKET_SIZE];
        }

        // Complexity: O(1) ~ O(n)
        public boolean put(String key, String value) {
            int hash = calcHash(key);
            int bucketIndex = hash % BUCKET_SIZE;

            // No node available in this bucket
            if (null == buckets[bucketIndex]) {
                buckets[bucketIndex] = new Node(key, value);
                return true;
            }

            Node current = buckets[bucketIndex];

            // If duplicate key found in the first node replace by new value
            if (current.key.equals(key)) {
                current.value = value;
                return true;
            }

            // If duplicate key found in any node replace by new value
            boolean isReplaced = false;
            while (null != current.next) {
                if (current.key.equals(key)) {
                    current.value = value;
                    isReplaced = true;
                    break;
                }
                current = current.next;
            }

            // If not replaced add new node
            if (false == isReplaced)
                current.next = new Node(key, value);

            return true;
        }

        // Complexity: O(1) ~ O(n)
        public String get(String key) {
            int hash = calcHash(key);
            int bucketIndex = hash % BUCKET_SIZE;

            Node current = buckets[bucketIndex];

            while (null != current.next) {
                if (current.key.equals(key)) break;
            }

            return current.value;
        }

        // Complexity: O(1) ~ O(n)
        public boolean remove(String key) {
            int hash = calcHash(key);
            int bucketIndex = hash % BUCKET_SIZE;

            Node current = buckets[bucketIndex];
            Node prev = current;

            while (null != current.next) {
                if (current.key.equals(key)) break;
                prev = current;
                current = current.next;
            }

            // If matched to first node of linked list
            if (prev == current) {
                buckets[bucketIndex] = null;
                return true;
            }

            prev.next = current.next;
            return true;
        }

        private int calcHash(String str) {
            int sum = 0;
            for (int i=0; i < str.length(); i++)
                sum = (sum + (int)str.charAt(i)) % BIG_PRIME;
            return sum % BIG_PRIME;
        }
    }

    static class Node {
        String key, value;
        Node next;

        public Node(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }
}
Blog Comments powered by Disqus.

Next Post Previous Post