LRU算法及Apache LRUMap源码

时间:2022-05-24 fenglllle 人气:0

1. 什么是LRU

LRU(least recently used) : 最近最少使用

LRU就是一种经典的算法，在容器中，对元素定义一个最后使用时间，当新的元素写入的时候，如果容器已满，则淘汰最近最少使用的元素，把新的元素写入。

1.1 自定义实现LRU的要求

比如redis，如何自己实现简易版的redis缓存。

那么我们需要一种数据结构，支持set和get操作。

1） get操作时间复杂度O(1);

2）需要支持RLU算法，空间不足时，需要将使用最少的元素移除，为新元素让空间；

3）时间失效remove(这个先不谈，比较麻烦)。

1.2 Apache LRUMap示例

1.2.1 pom依赖

        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-collections4</artifactId>
            <version>4.2</version>
        </dependency>

1.2.2 demo

        LRUMap<String, String> map = new LRUMap<>(3);
        map.put("1", "1");
        map.put("2", "2");
        map.put("3", "3");
 
        map.get("2");
 
        System.out.println("---------------------------------");
        map.forEach((k,v)->
            System.out.println(k+"\t"+v)
        );
 
        map.put("4", "4");
        map.put("5", "5");
 
        System.out.println("---------------------------------");
        map.forEach((k,v)->
                System.out.println(k+"\t"+v)
        );
 
        map.put("6", "6");
 
        System.out.println("---------------------------------");
        map.forEach((k,v)->
                System.out.println(k+"\t"+v)
        );

结果如下：

---------------------------------

1 1

3 3

2 2

---------------------------------

2 2

4 4

5 5

---------------------------------

4 4

5 5

6 6

可以看出在get("2")，2的位置挪后，然后移除的顺序就延后。

容量不足时，总是移除，使用最少的，时间最远的。

2. 源码解析

2.1 设计

public class LRUMap<K, V>
        extends AbstractLinkedMap<K, V> implements BoundedMap<K, V>, Serializable, Cloneable {

进一步查看AbstractLinkedMap，AbstractHashedMap

public abstract class AbstractLinkedMap<K, V> extends AbstractHashedMap<K, V> implements OrderedMap<K, V> {

public class AbstractHashedMap<K, V> extends AbstractMap<K, V> implements IterableMap<K, V> {

本质是自定义AbstractMap

我们看看HashMap LinkedHashMap

public class LinkedHashMap<K,V>
    extends HashMap<K,V>
    implements Map<K,V>

public class HashMap<K,V> extends AbstractMap<K,V>
    implements Map<K,V>, Cloneable, Serializable {

可以看出AbstractMap，AbstractHashedMap，LRUMap的本质其实也是HashMap。

2.2 数据结构

protected static final int DEFAULT_MAX_SIZE = 100;
 
public LRUMap() {
        this(DEFAULT_MAX_SIZE, DEFAULT_LOAD_FACTOR, false);
}

可以看出默认初始化容量100，最大容量也是100.

进一步跟踪

public LRUMap(final int maxSize, final float loadFactor, final boolean scanUntilRemovable) {
        this(maxSize, maxSize, loadFactor, scanUntilRemovable);
}
 
/**
     * Constructs a new, empty map with the specified max / initial capacity and load factor.
     *
     * @param maxSize  the maximum size of the map
     * @param initialSize  the initial size of the map
     * @param loadFactor  the load factor
     * @param scanUntilRemovable  scan until a removeable entry is found, default false
     * @throws IllegalArgumentException if the maximum size is less than one
     * @throws IllegalArgumentException if the initial size is negative or larger than the maximum size
     * @throws IllegalArgumentException if the load factor is less than zero
     * @since 4.1
     */
    public LRUMap(final int maxSize,
                  final int initialSize,
                  final float loadFactor,
                  final boolean scanUntilRemovable) {
 
        super(initialSize, loadFactor);
        if (maxSize < 1) {
            throw new IllegalArgumentException("LRUMap max size must be greater than 0");
        }
        if (initialSize > maxSize) {
            throw new IllegalArgumentException("LRUMap initial size must not be greather than max size");
        }
        this.maxSize = maxSize;
        this.scanUntilRemovable = scanUntilRemovable;
    }

跟踪super(initialSize, loadFactor);

public abstract class AbstractLinkedMap<K, V> extends AbstractHashedMap<K, V> implements OrderedMap<K, V> {
 
    protected AbstractLinkedMap(final int initialCapacity, final float loadFactor) {
        super(initialCapacity, loadFactor);
    }
//又super，再上一层追踪
 
public class AbstractHashedMap<K, V> extends AbstractMap<K, V> implements IterableMap<K, V> {
    //定义一些基本初始化数据
    /** The default capacity to use */
    protected static final int DEFAULT_CAPACITY = 16;
    /** The default threshold to use */
    protected static final int DEFAULT_THRESHOLD = 12;
    /** The default load factor to use */
    protected static final float DEFAULT_LOAD_FACTOR = 0.75f;
    /** The maximum capacity allowed */
    protected static final int MAXIMUM_CAPACITY = 1 << 30;
 
    /** Load factor, normally 0.75 */
    transient float loadFactor;
    /** The size of the map */
    transient int size;
    /** Map entries */
    transient HashEntry<K, V>[] data;
    /** Size at which to rehash */
    transient int threshold;
    /** Modification count for iterators */
    transient int modCount;
    /** Entry set */
    transient EntrySet<K, V> entrySet;
    /** Key set */
    transient KeySet<K> keySet;
    /** Values */
    transient Values<V> values;
 
    protected AbstractHashedMap(int initialCapacity, final float loadFactor) {
        super();
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Initial capacity must be a non negative number");
        }
        if (loadFactor <= 0.0f || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Load factor must be greater than 0");
        }
        this.loadFactor = loadFactor;
        initialCapacity = calculateNewCapacity(initialCapacity);
        this.threshold = calculateThreshold(initialCapacity, loadFactor);
        this.data = new HashEntry[initialCapacity];
        init();
    }
 
    /**
     * Initialise subclasses during construction, cloning or deserialization.
     */
    protected void init() {
        //没有任何逻辑，仅用于子类构造
    }

DEFAULT_LOAD_FACTOR = 0.75f; 负载因子0.75

可以看出LRUMap的本质，HashEntry数组。

上面的init方法没有实现逻辑，但是在他的子类中AbstractLinkedMap有相关的定义。

    /** Header in the linked list */
    transient LinkEntry<K, V> header;
 
    /**
     * Creates an entry to store the data.
     * <p>
     * This implementation creates a new LinkEntry instance.
     *
     * @param next  the next entry in sequence
     * @param hashCode  the hash code to use
     * @param key  the key to store
     * @param value  the value to store
     * @return the newly created entry
     */
    @Override
    protected LinkEntry<K, V> createEntry(final HashEntry<K, V> next, final int hashCode, final K key, final V value) {
        return new LinkEntry<>(next, hashCode, convertKey(key), value);
    }
 
    protected static class LinkEntry<K, V> extends HashEntry<K, V> {
        /** The entry before this one in the order */
        protected LinkEntry<K, V> before;
        /** The entry after this one in the order */
        protected LinkEntry<K, V> after;
 
        /**
         * Constructs a new entry.
         *
         * @param next  the next entry in the hash bucket sequence
         * @param hashCode  the hash code
         * @param key  the key
         * @param value  the value
         */
        protected LinkEntry(final HashEntry<K, V> next, final int hashCode, final Object key, final V value) {
            super(next, hashCode, key, value);
        }
    }
 
    /**
     * Initialise this subclass during construction.
     * <p>
     * NOTE: As from v3.2 this method calls
     * {@link #createEntry(HashEntry, int, Object, Object)} to create
     * the map entry object.
     */
    @Override
    protected void init() {
        header = createEntry(null, -1, null, null);
        header.before = header.after = header;
    }

这个很关键。可以看出LRUMap是持有LinkEntry header，的双链表结构，初始header为null，前后节点都是自身。将HashEntry转成LinkEntry。

解析HashEntry

transient HashEntry<K, V>[] data;
//构造初始化
this.data = new HashEntry[initialCapacity];

再跟踪

 protected static class HashEntry<K, V> implements Map.Entry<K, V>, KeyValue<K, V> {
        /** The next entry in the hash chain */
        protected HashEntry<K, V> next;
        /** The hash code of the key */
        protected int hashCode;
        /** The key */
        protected Object key;
        /** The value */
        protected Object value;

key，value，next单链表。

public int hashCode() {
            return (getKey() == null ? 0 : getKey().hashCode()) ^
                   (getValue() == null ? 0 : getValue().hashCode());
        }

hashCode方法可以看出是key的hash与value的hash按位^运算。

在此我们看透LRU的本质了，数组+单链表。同时是持有头结点的双链表结构（怎么看就是LinkedHashMap的结构，只是有尾节点）。

public class LinkedHashMap<K,V>
    extends HashMap<K,V>
    implements Map<K,V>
{    
    /**
     * The head (eldest) of the doubly linked list.
     */
    transient LinkedHashMap.Entry<K,V> head;
 
    /**
     * The tail (youngest) of the doubly linked list.
     */
    transient LinkedHashMap.Entry<K,V> tail;

那么LRUMap是如何实现LRU算法的？

2.3 方法解析put get remove

2.3.1 get方法

public V get(final Object key) {
        return get(key, true);
}
 
public V get(final Object key, final boolean updateToMRU) {
        final LinkEntry<K, V> entry = getEntry(key);
        if (entry == null) {
            return null;
        }
        if (updateToMRU) {
            moveToMRU(entry);
        }
        return entry.getValue();
}
 
//父类方法获取值entry
protected HashEntry<K, V> getEntry(Object key) {
        key = convertKey(key);
        final int hashCode = hash(key);
        HashEntry<K, V> entry = data[hashIndex(hashCode, data.length)]; // no local for hash index
        while (entry != null) {
            if (entry.hashCode == hashCode && isEqualKey(key, entry.key)) {
                return entry;
            }
            entry = entry.next;
        }
        return null;
}

下面看不一样的moveToMRU(entry);

/**
     * Moves an entry to the MRU position at the end of the list.
     * <p>
     * This implementation moves the updated entry to the end of the list.
     *
     * @param entry  the entry to update
     */
    protected void moveToMRU(final LinkEntry<K, V> entry) {
        if (entry.after != header) {
            modCount++;
            // remove
            if(entry.before == null) {
                throw new IllegalStateException("Entry.before is null." +
                    " Please check that your keys are immutable, and that you have used synchronization properly." +
                    " If so, then please report this to dev@commons.apache.org as a bug.");
            }
            entry.before.after = entry.after;
            entry.after.before = entry.before;
            // add first
            entry.after = header;
            entry.before = header.before;
            header.before.after = entry;
            header.before = entry;
        } else if (entry == header) {
            throw new IllegalStateException("Can't move header to MRU" +
                " (please report this to dev@commons.apache.org)");
        }
    }

看出LRU的一个本质，每次get方法拨动指针，将get的元素移动到header的前一个位置。

2.3.2 remove方法

remove方法使用的父类的方法

    /**
     * Removes the specified mapping from this map.
     *
     * @param key  the mapping to remove
     * @return the value mapped to the removed key, null if key not in map
     */
    @Override
    public V remove(Object key) {
        key = convertKey(key);
        final int hashCode = hash(key);
        final int index = hashIndex(hashCode, data.length);
        HashEntry<K, V> entry = data[index];
        HashEntry<K, V> previous = null;
        while (entry != null) {
            if (entry.hashCode == hashCode && isEqualKey(key, entry.key)) {
                final V oldValue = entry.getValue();
                removeMapping(entry, index, previous);
                return oldValue;
            }
            previous = entry;
            entry = entry.next;
        }
        return null;
    }
 
    /**
     * Removes a mapping from the map.
     * <p>
     * This implementation calls <code>removeEntry()</code> and <code>destroyEntry()</code>.
     * It also handles changes to <code>modCount</code> and <code>size</code>.
     * Subclasses could override to fully control removals from the map.
     *
     * @param entry  the entry to remove
     * @param hashIndex  the index into the data structure
     * @param previous  the previous entry in the chain
     */
    protected void removeMapping(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        modCount++;
        removeEntry(entry, hashIndex, previous);
        size--;
        destroyEntry(entry);
    }
 
    protected void removeEntry(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        if (previous == null) {
            data[hashIndex] = entry.next;
        } else {
            previous.next = entry.next;
        }
    }
 
    protected void destroyEntry(final HashEntry<K, V> entry) {
        entry.next = null;
        entry.key = null;
        entry.value = null;
    }

这里并没有移除header双链表的数据。

2.3.3 put方法

    /**
     * Puts a key-value mapping into this map.
     *
     * @param key  the key to add
     * @param value  the value to add
     * @return the value previously mapped to this key, null if none
     */
    @Override
    public V put(final K key, final V value) {
        final Object convertedKey = convertKey(key);
        final int hashCode = hash(convertedKey);
        final int index = hashIndex(hashCode, data.length);
        HashEntry<K, V> entry = data[index];
        //仅在元素存在才循环，更新updateEntry，header前一个位置
        while (entry != null) {
            if (entry.hashCode == hashCode && isEqualKey(convertedKey, entry.key)) {
                final V oldValue = entry.getValue();
                updateEntry(entry, value);
                return oldValue;
            }
            entry = entry.next;
        }
 
        addMapping(index, hashCode, key, value);
        return null;
    }

updateEntry(entry, value);

    /**
     * Updates an existing key-value mapping.
     * <p>
     * This implementation moves the updated entry to the end of the list
     * using {@link #moveToMRU(AbstractLinkedMap.LinkEntry)}.
     *
     * @param entry  the entry to update
     * @param newValue  the new value to store
     */
    @Override
    protected void updateEntry(final HashEntry<K, V> entry, final V newValue) {
        moveToMRU((LinkEntry<K, V>) entry);  // handles modCount
        entry.setValue(newValue);
    }

moveToMRU((LinkEntry<K, V>) entry); // handles modCount

上面get方法有讲，更新了链表的指针，新添加的元素在双链表的header前一个位置，仅在元素存在的时候，while循环才生效。

那么新增的元素呢？

下面看重点 addMapping(index, hashCode, key, value); 这句代码定义了，容量满了的处理策略。

    /**
     * Adds a new key-value mapping into this map.
     * <p>
     * This implementation checks the LRU size and determines whether to
     * discard an entry or not using {@link #removeLRU(AbstractLinkedMap.LinkEntry)}.
     * <p>
     * From Commons Collections 3.1 this method uses {@link #isFull()} rather
     * than accessing <code>size</code> and <code>maxSize</code> directly.
     * It also handles the scanUntilRemovable functionality.
     *
     * @param hashIndex  the index into the data array to store at
     * @param hashCode  the hash code of the key to add
     * @param key  the key to add
     * @param value  the value to add
     */
    @Override
    protected void addMapping(final int hashIndex, final int hashCode, final K key, final V value) {
        //容量是否已满
        if (isFull()) {
            LinkEntry<K, V> reuse = header.after;
            boolean removeLRUEntry = false;
            //默认是false
            if (scanUntilRemovable) {
                //这里不知道干啥，难道是以后扩展？
                while (reuse != header && reuse != null) {
                    //removeLRU一定返回true，很奇怪，估计以后扩展用
                    if (removeLRU(reuse)) {
                        removeLRUEntry = true;
                        break;
                    }
                    reuse = reuse.after;
                }
                if (reuse == null) {
                    throw new IllegalStateException(
                        "Entry.after=null, header.after" + header.after + " header.before" + header.before +
                        " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                        " Please check that your keys are immutable, and that you have used synchronization properly." +
                        " If so, then please report this to dev@commons.apache.org as a bug.");
                }
            } else {
                //一定返回true
                removeLRUEntry = removeLRU(reuse);
            }
 
            if (removeLRUEntry) {
                if (reuse == null) {
                    throw new IllegalStateException(
                        "reuse=null, header.after=" + header.after + " header.before" + header.before +
                        " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                        " Please check that your keys are immutable, and that you have used synchronization properly." +
                        " If so, then please report this to dev@commons.apache.org as a bug.");
                }
                reuseMapping(reuse, hashIndex, hashCode, key, value);
            } else {
                super.addMapping(hashIndex, hashCode, key, value);
            }
        } else {
            super.addMapping(hashIndex, hashCode, key, value);
        }
    }
 
    protected boolean removeLRU(final LinkEntry<K, V> entry) {
        return true;
    }

先判断容量

public boolean isFull() {
        return size >= maxSize;
}

未满就直接添加

super.addMapping(hashIndex, hashCode, key, value);

    protected void addMapping(final int hashIndex, final int hashCode, final K key, final V value) {
        modCount++;
        final HashEntry<K, V> entry = createEntry(data[hashIndex], hashCode, key, value);
        addEntry(entry, hashIndex);
        size++;
        checkCapacity();
    }

//这里调用了AbstractLinkedMap的方法

addEntry(entry, hashIndex);

    /**
     * Adds an entry into this map, maintaining insertion order.
     * <p>
     * This implementation adds the entry to the data storage table and
     * to the end of the linked list.
     *
     * @param entry  the entry to add
     * @param hashIndex  the index into the data array to store at
     */
    @Override
    protected void addEntry(final HashEntry<K, V> entry, final int hashIndex) {
        final LinkEntry<K, V> link = (LinkEntry<K, V>) entry;
        link.after  = header;
        link.before = header.before;
        header.before.after = link;
        header.before = link;
        data[hashIndex] = link;
    }

放在header的前一个位置，最早的元素链接到header。

双向环回链表。

如果容量满了，执行LRU算法 reuseMapping(reuse, hashIndex, hashCode, key, value);

    /**
     * Reuses an entry by removing it and moving it to a new place in the map.
     * <p>
     * This method uses {@link #removeEntry}, {@link #reuseEntry} and {@link #addEntry}.
     *
     * @param entry  the entry to reuse
     * @param hashIndex  the index into the data array to store at
     * @param hashCode  the hash code of the key to add
     * @param key  the key to add
     * @param value  the value to add
     */
    protected void reuseMapping(final LinkEntry<K, V> entry, final int hashIndex, final int hashCode,
                                final K key, final V value) {
        // find the entry before the entry specified in the hash table
        // remember that the parameters (except the first) refer to the new entry,
        // not the old one
        try {
            //要干掉的元素下标
            final int removeIndex = hashIndex(entry.hashCode, data.length);
            final HashEntry<K, V>[] tmp = data;  // may protect against some sync issues
            HashEntry<K, V> loop = tmp[removeIndex];
            HashEntry<K, V> previous = null;
            //避免已经被删除
            while (loop != entry && loop != null) {
                previous = loop;
                loop = loop.next;
            }
            //如果被其他线程删除，抛异常
            if (loop == null) {
                throw new IllegalStateException(
                    "Entry.next=null, data[removeIndex]=" + data[removeIndex] + " previous=" + previous +
                    " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                    " Please check that your keys are immutable, and that you have used synchronization properly." +
                    " If so, then please report this to dev@commons.apache.org as a bug.");
            }
 
            // reuse the entry
            modCount++;
            //双链表移除旧元素,AbstractHashedMap移除旧元素
            removeEntry(entry, removeIndex, previous);
            //复用移除的对象，减少创建对象和GC；增加AbstractHashedMap单链表next指向
            reuseEntry(entry, hashIndex, hashCode, key, value);
            //复用的元素加AbstractLinkedMap双链表和AbstractHashedMap单链表
            addEntry(entry, hashIndex);
        } catch (final NullPointerException ex) {
            throw new IllegalStateException(
                    "NPE, entry=" + entry + " entryIsHeader=" + (entry==header) +
                    " key=" + key + " value=" + value + " size=" + size + " maxSize=" + maxSize +
                    " Please check that your keys are immutable, and that you have used synchronization properly." +
                    " If so, then please report this to dev@commons.apache.org as a bug.");
        }
    }

    /**
     * Removes an entry from the map and the linked list.
     * <p>
     * This implementation removes the entry from the linked list chain, then
     * calls the superclass implementation.
     *
     * @param entry  the entry to remove
     * @param hashIndex  the index into the data structure
     * @param previous  the previous entry in the chain
     */
    @Override
    protected void removeEntry(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        final LinkEntry<K, V> link = (LinkEntry<K, V>) entry;
        link.before.after = link.after;
        link.after.before = link.before;
        link.after = null;
        link.before = null;
        super.removeEntry(entry, hashIndex, previous);
    }
 
    /**
     * Removes an entry from the chain stored in a particular index.
     * <p>
     * This implementation removes the entry from the data storage table.
     * The size is not updated.
     * Subclasses could override to handle changes to the map.
     *
     * @param entry  the entry to remove
     * @param hashIndex  the index into the data structure
     * @param previous  the previous entry in the chain
     */
    protected void removeEntry(final HashEntry<K, V> entry, final int hashIndex, final HashEntry<K, V> previous) {
        if (previous == null) {
            data[hashIndex] = entry.next;
        } else {
            previous.next = entry.next;
        }
    }
 
    /**
     * Reuses an existing key-value mapping, storing completely new data.
     * <p>
     * This implementation sets all the data fields on the entry.
     * Subclasses could populate additional entry fields.
     *
     * @param entry  the entry to update, not null
     * @param hashIndex  the index in the data array
     * @param hashCode  the hash code of the key to add
     * @param key  the key to add
     * @param value  the value to add
     */
    protected void reuseEntry(final HashEntry<K, V> entry, final int hashIndex, final int hashCode,
                              final K key, final V value) {
        entry.next = data[hashIndex];
        entry.hashCode = hashCode;
        entry.key = key;
        entry.value = value;
    }
 
    /**
     * Adds an entry into this map, maintaining insertion order.
     * <p>
     * This implementation adds the entry to the data storage table and
     * to the end of the linked list.
     *
     * @param entry  the entry to add
     * @param hashIndex  the index into the data array to store at
     */
    @Override
    protected void addEntry(final HashEntry<K, V> entry, final int hashIndex) {
        final LinkEntry<K, V> link = (LinkEntry<K, V>) entry;
        link.after  = header;
        link.before = header.before;
        header.before.after = link;
        header.before = link;
        data[hashIndex] = link;
    }

3. 总结

LRU的本质了，数组+单链表。同时是持有头结点的环回双链表结构

LRU最新使用的元素放在双链表的header的前一个位置，如果，新增元素容量已满就会移除header的后一个元素。

加载全部内容