Pages

Thursday, June 14, 2012

LinkedHashMap

Internal workings of LinkedHashMap

LinkedHashMap vs. HashMap


LinkedHashMap is a HashMap that also defines the iteration ordering using an additional data structure, a double linked list. By default, the iteration order is same as insertion-order. It can also be the order in which its entries were last accessed so it can be easily extended to build LRU cache.

Data Structure


The data structure of LinkedHashMap extends that of HashMap.

In HashMap, the data structure is based on array and linked list. An entry finds its location in the array based on its hash value. If an array element is already occupied, the new entry replaces the old entry and the old entry is linked to the new one.


In HashMap, there is no control on the iteration order.

In LinkedHashMap, the iteration order is defined, either by the insertion order or access order.

LinkedHashMap differs from HashMap in that it maintains a doubly-linked list running through all of its entries. The below one is a modified example of the above data structure. It defines the iteration ordering based on the order in which keys were inserted into the map. In order to do so, the entry element is extended to keep track of the after and before element. A zero size LinkedHashMap contains just the Head element with before and after pointing to itself.


LinkedHashMap data structure
LinkedHashMap data strutcture




Below is the HashMap data structure:

HashMap Data Structure
HashMap data structure









Entry



LinkedHashMap's Entry extends the HashMap's Entry so it also inherits the same properties key, value, hash and the next Entry sharing the index. Other than these, it also has couple of additional properties to maintain the double-linked list, after and before entries.


LinkedHashMap Entry class
LinkedHashMap Entry class




New Entry


LinkedHashMap inherits HashMap so its internal data structure is same as that of HashMap. Apart from that it also maintains a double-linked list which is circularly linked via the sentinel node called head. Each node contains references to the previous and to the next node . A new node is always added to the end of the list. In order to do so, the last node's and the header node's links have to be adjusted.

  1. The new node's next reference will point to the head.
  2. The new node's previous reference will point to the current last node.
  3. The current last node's next reference will point to the new node instead of head.
  4. Head's previous reference will point to the new node.
after  = head;
before = head.before;
before.after = this;
after.before = this;



Double linked-list
Double linked-list





Performance is likely to be just slightly below that of HashMap, due to the added expense of maintaining the linked list.

Access Ordered


A special LinkedHashMap(capacity, loadFactor, accessOrderBoolean) constructor is provided to create a linked hash map whose order of iteration is the order in which its entries were last accessed, from least-recently accessed to most-recently. Invoking the put or get method results in an access to the corresponding entry. If the enclosing Map is access-ordered, it moves the entry to the end of the list; otherwise, it does nothing.
   public void testLinkedHashMap() {
        LinkedHashMap lru = new LinkedHashMap(16, 0.75f, true);
        lru.put("one", null);
        lru.put("two", null);
        lru.put("three", null);

        Iterator itr = lru.keySet().iterator();
        while(itr.hasNext()) {
            System.out.println(itr.next());
        }

        System.out.println("** Access one, will move it to end **");
        lru.get("one");

        itr = lru.keySet().iterator();
        while(itr.hasNext()) {
            System.out.println(itr.next());
        }

        System.out.println("** Access two, will move it to end **");
        lru.put("two", "two");

        itr = lru.keySet().iterator();
        while(itr.hasNext()) {
            System.out.println(itr.next());
        }
    }
Result:
one
two
three
** Access one, will move it to end **
two
three
one
** Access two, will move it to end **
three
one
two
Thus in access-ordered linked hash maps, merely querying the map with get is a structural modification.

Iterator


In HashMap, the iterator has to traverse through each table element and the element's own linked list, requiring time proportional to its capacity.
In LinkedHashMap, it simply has to traverse through its own double-linked list thus requires time proportional to the size of the map and not its capacity so HashMap iteration is likely to be more expensive.

Transfer


Re-sizing is supposed to be faster as it iterates through its double-linked list to transfer the contents into a new table array.
    void transfer(HashMap.Entry[] newTable) {
        int newCapacity = newTable.length;
        for (Entry e = header.after; e != header; e = e.after) {
            int index = indexFor(e.hash, newCapacity);
            e.next = newTable[index];
            newTable[index] = e;
        }
    }

Contains Value


containsValue() is Overridden to take advantage of the faster iterator.

LRU Cache


If access ordered is true, order of iteration is the order in which its entries were last accessed, from least-recently accessed to most-recently (access-order). This kind of map is well-suited to building LRU caches. The removeEldestEntry(Entry) method may be overridden to impose a policy for removing stale mappings automatically when new mappings are added to the map. For example,
    protected boolean removeEldestEntry(Map.Entry eldest) {
        return size() > maxCacheSize;
    }
The eldest entry is returned by header.after. The default implementation of removeEldestEntry() returns false.

2 comments:

  1. Hi,

    This is Yashwanth. One of the nice explanation about linkedhashmap that i could find on the net. But, i think you need to explain bit more on entry array in linkedhashmap.

    HashMap maintains entry array and in each array index, a singly linkedlist runs. But in case of linkedhashmap, it extends Hashmap and uses doubly linkedlist. My question is how doubly linkedlist improves the performance of linkedhashmap. What will be the significance of array index in linkedhashmap

    Thanks,
    Yashwanth

    ReplyDelete
    Replies
    1. Hi Yashwanth,
      Thanks!
      The LinkedHashMap is HashMap + double linked list. This double linked list is in addition to the linked list that each entry of hashmMap contains. LinkedHashMap's main purpose is to retain the order in which the elements are added. Whenever a new element is added, it is also added to the double linked list. The double linked list can also be used to create a LRU cache. Please see http://javaopensourcecode.blogspot.in/2012/11/hashmap-q.html to know how HashMap iterates, there is a diagram in the end of FAQ.
      Thanks
      Ram Satish

      Delete