深入理解ThreadLocal

ThreadLocal 的作用是提供线程内的局部变量，这种变量只在线程的生命周期内起作用。

ThreadLocal 可以用来减少同一个线程内多个函数或者组件之间一些公共变量的传递的复杂度，但是如果滥用就可能会导致内存泄漏。

背景

在并发编程中时常有这样一种需求：每条线程都需要存取一个同名变量，但每条线程中该变量的值均不相同。

常规的思路如下：使用一个线程共享的 Map，Map中的 key 为线程对象，value 即为需要存储的值。通过map.get(Thread.currentThread()) 就可获取本线程中该变量的值。

虽然这可以实现需求，但这种方式需要同步，效率低。

由于这个map对象需要被所有线程共享，因此需要加锁来保证线程安全性。当然我们可以使用ConcurrentHashMap 提高并发效率，但这种方法只能降低锁的粒度，不能从根本上避免同步锁。而ThreadLocal就能很好地解决这一问题。

源码分析

ThreadLocal 的内部结构如下图所示：

ThreadLocal并不维护 ThreadLocalMap，它只是提供了操作该容器的方法。而ThreadLocal的静态内部类 ThreadLocalMap 才是存储数据的容器，并且该容器由 Thread 维护。

每一个Thread对象均含有一个ThreadLocalMap类型的成员变量threadLocals，它存储本线程中所有ThreadLocal对象及其对应的值。

ThreadLocal

public class ThreadLocal<T> {
    
    /**
     * Returns the value in the current thread's copy of this
     * thread-local variable.  If the variable has no value for the
     * current thread, it is first initialized to the value returned
     * by an invocation of the {@link #initialValue} method.
     *
     * @return the current thread's value of this thread-local
     */
    public T get() {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t); // 获取线程关联的ThreadLocalMap哈希表
        if (map != null) {
            ThreadLocalMap.Entry e = map.getEntry(this); // 获取entry
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }

    /**
     * Variant of set() to establish initialValue. Used instead
     * of set() in case user has overridden the set() method.
     *
     * @return the initial value
     */
    private T setInitialValue() {
        T value = initialValue(); // 这里其实是返回null
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t); // 获取线程关联的ThreadLocalMap哈希表
        // 如果ThreadLocalMap存在，则直接插入；不存在，则新建ThreadLocalMap
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value);
        return value;
    }

    /**
     * Sets the current thread's copy of this thread-local variable
     * to the specified value.  Most subclasses will have no need to
     * override this method, relying solely on the {@link #initialValue}
     * method to set the values of thread-locals.
     *
     * @param value the value to be stored in the current thread's copy of
     *        this thread-local.
     */
    public void set(T value) {
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null)
            map.set(this, value);
        else
            createMap(t, value); // 如果 ThreadLocalMap 为空则创建
    }

    /**
     * Removes the current thread's value for this thread-local
     * variable.  If this thread-local variable is subsequently
     * {@linkplain #get read} by the current thread, its value will be
     * reinitialized by invoking its {@link #initialValue} method,
     * unless its value is {@linkplain #set set} by the current thread
     * in the interim.  This may result in multiple invocations of the
     * {@code initialValue} method in the current thread.
     *
     * @since 1.5
     */
     public void remove() {
         ThreadLocalMap m = getMap(Thread.currentThread());
         if (m != null)
             m.remove(this); // 移除
     }

    /**
     * Get the map associated with a ThreadLocal. Overridden in
     * InheritableThreadLocal.
     *
     * @param  t the current thread
     * @return the map
     */
    ThreadLocalMap getMap(Thread t) {
        // 从 Thread 中获取 threadLocals
        return t.threadLocals;
    }
    
    /**
     * Create the map associated with a ThreadLocal. Overridden in
     * InheritableThreadLocal.
     *
     * @param t the current thread
     * @param firstValue value for the initial entry of the map
     */
    void createMap(Thread t, T firstValue) {
        t.threadLocals = new ThreadLocalMap(this, firstValue);
    }
}

ThreadLocalMap

static class ThreadLocalMap {
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;
        // Entry的key是ThreadLocal对象，并且是一个弱引用
        // 当没指向key的强引用后，该key就会被垃圾收集器回收
        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
    private static final int INITIAL_CAPACITY = 16;
    private Entry[] table;
    private int threshold; // Default to 0
    
    ThreadLocalMap(ThreadLocal<?> firstKey, Object firstValue) {
        table = new Entry[INITIAL_CAPACITY];
        // 计算索引下标
        int i = firstKey.threadLocalHashCode & (INITIAL_CAPACITY - 1);
        // ThreadLocal作为key
        table[i] = new Entry(firstKey, firstValue);
        size = 1;
        setThreshold(INITIAL_CAPACITY); // 设置扩容阈值为 容量的 * 2/3
    }
    private void set(ThreadLocal<?> key, Object value) {
        // We don't use a fast path as with get() because it is at
        // least as common to use set() to create new entries as
        // it is to replace existing ones, in which case, a fast
        // path would fail more often than not.

        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1); // 计算数组下标

        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();

            if (k == key) {
                e.value = value;
                return;
            }

            if (k == null) { 
                // key为空，则代表该Entry不再需要，设置 Entry的value指针和Entry指针为null，帮助GC
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        tab[i] = new Entry(key, value);
        int sz = ++size;
        // 如果没有清除entry并且容量到达扩容阈值
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }
    // 将staleSlot位置的脏Entry替换为新Entry(key, value)
    private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                                   int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;
        Entry e;

        // Back up to check for prior stale entry in current run.
        // We clean out whole runs at a time to avoid continual
        // incremental rehashing due to garbage collector freeing
        // up refs in bunches (i.e., whenever the collector runs).
        // 向前环形遍历找到第一个key为null的Entry, staleEntry 翻译过来就是 脏Entry
        int slotToExpunge = staleSlot;
        for (int i = prevIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = prevIndex(i, len))
            if (e.get() == null)
                slotToExpunge = i;

        // Find either the key or trailing null slot of run, whichever
        // occurs first
        // 从staleSlot开始向后环形遍历tab直到tab[i]为空
        for (int i = nextIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();

            // If we find key, then we need to swap it
            // with the stale entry to maintain hash table order.
            // The newly stale slot, or any other stale slot
            // encountered above it, can then be sent to expungeStaleEntry
            // to remove or rehash all of the other entries in run.
            if (k == key) {
                // 如果在向后环形查找过程中发现key相同的Entry就和脏Entry进行交换
                e.value = value;

                tab[i] = tab[staleSlot];
                tab[staleSlot] = e;

                // Start expunge at preceding stale entry if it exists
                // 如果在查找过程中还未发现脏Entry，也就是开始向前环形遍历时没有发现脏Entry
                // 那么就以当前位置作为cleanSomeSlots的起点
                if (slotToExpunge == staleSlot)
                    slotToExpunge = i;
                // 搜索脏Entry并进行清理
                cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                return;
            }

            // If we didn't find stale entry on backward scan, the
            // first stale entry seen while scanning for key is the
            // first still present in the run.
            // 如果向后环形遍历时发现脏Entry，并且之前向前环形遍历时未发现脏Entry,
            // 后面就以此时这个位置作为起点执行cleanSomeSlots
            if (k == null && slotToExpunge == staleSlot)
                slotToExpunge = i;
        }

        // If key not found, put new entry in stale slot
        //如果在查找过程中没有找到可以覆盖的Entry，则将新的entry插入在脏Entry
        tab[staleSlot].value = null;
        tab[staleSlot] = new Entry(key, value);

        // If there are any other stale entries in run, expunge them
        if (slotToExpunge != staleSlot)
            // 从slotToExpunge位置开始清理脏Entry
            cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
    }
    // 遍历时发现脏Entry，会调用该方法清除脏Entry，expunge翻译过来就是清除的意思
    // 从 staleSlot 开始
    private int expungeStaleEntry(int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;

        // expunge entry at staleSlot
        // 清除掉当前的脏Entry(也就是遍历时发现的脏Entry)，帮助GC
        tab[staleSlot].value = null;
        tab[staleSlot] = null;
        size--; // tab的容量 - 1

        // Rehash until we encounter null
        Entry e;
        int i;
        //往后环形遍历直到遇到tab[i]为空
        for (i = nextIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();
            if (k == null) {
                // 遇到脏Entry，将其清理掉
                e.value = null;
                tab[i] = null;
                size--;
            } else {
                // 处理rehash的情况
                int h = k.threadLocalHashCode & (len - 1);
                if (h != i) {
                    tab[i] = null;

                    // Unlike Knuth 6.4 Algorithm R, we must scan until
                    // null because multiple entries could have been stale.
                    while (tab[h] != null)
                        h = nextIndex(h, len);
                    tab[h] = e;
                }
            }
        }
        return i;
    }
    // 从数组下标i+1开始清除一些key为null的entry
    private boolean cleanSomeSlots(int i, int n) {
        boolean removed = false;
        Entry[] tab = table;
        int len = tab.length;
        do {
            i = nextIndex(i, len);
            Entry e = tab[i];
            if (e != null && e.get() == null) {
                n = len;
                removed = true;
                i = expungeStaleEntry(i);
            }
        } while ( (n >>>= 1) != 0); // n >>>= 1 相当于 log2(n)次
        return removed;
    }
    // 获取 Entry
    private Entry getEntry(ThreadLocal<?> key) {
        int i = key.threadLocalHashCode & (table.length - 1);
        Entry e = table[i];
        if (e != null && e.get() == key)
            return e;
        else
            return getEntryAfterMiss(key, i, e);
    }
    // 从i出找出的元素为空或者与key不相等(可能发生hash冲突)，
    private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
        Entry[] tab = table;
        int len = tab.length;

        while (e != null) {
            ThreadLocal<?> k = e.get();
            if (k == key)
                return e;
            if (k == null)
                expungeStaleEntry(i);
            else
                // 采用线性探测法解决hash冲突
                // i = ((i + 1 < len) ? i + 1 : 0)
                i = nextIndex(i, len);
            e = tab[i];
        }
        return null;
    }
    // 移除元素
    private void remove(ThreadLocal<?> key) {
        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);
        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            if (e.get() == key) {
                e.clear();
                expungeStaleEntry(i);
                return;
            }
        }
    }
}

InheritableThreadLocal

使用类 InheritableThreadLocal 可使子线程继承父线程的值，通俗点说就是可以在子线程中获取父线程中设置的值。

public class InheritableThreadLocal<T> extends ThreadLocal<T> {

    protected T childValue(T parentValue) {
        return parentValue;
    }

    ThreadLocalMap getMap(Thread t) {
       return t.inheritableThreadLocals;
    }
    
    void createMap(Thread t, T firstValue) {
        t.inheritableThreadLocals = new ThreadLocalMap(this, firstValue);
    }
}

InheritableThreadLocal 源码很简单，只是重写了 ThreadLocal 的三个方法，将存储数据的容器换成Thread对象中 inheritableThreadLocals。

既然其他方法没有改变，那它是如何继承父线程中的数据的呢？其实是 Thread 构造方法中的 init() 方法实现的。

public Thread() {
    init(null, null, "Thread-" + nextThreadNum(), 0);
}
/**
 * Initializes a Thread with the current AccessControlContext.
 * @see #init(ThreadGroup,Runnable,String,long,AccessControlContext,boolean)
 */
private void init(ThreadGroup g, Runnable target, String name,
                    long stackSize) {
    init(g, target, name, stackSize, null, true);
}
/**
 * Initializes a Thread.
 *
 * @param g the Thread group
 * @param target the object whose run() method gets called
 * @param name the name of the new Thread
 * @param stackSize the desired stack size for the new thread, or
 *        zero to indicate that this parameter is to be ignored.
 * @param acc the AccessControlContext to inherit, or
 *            AccessController.getContext() if null
 * @param inheritThreadLocals if {@code true}, inherit initial values for
 *            inheritable thread-locals from the constructing thread
 */
private void init(ThreadGroup g, Runnable target, String name,
                  long stackSize, AccessControlContext acc,
                  boolean inheritThreadLocals) {
    if (name == null) {
        throw new NullPointerException("name cannot be null");
    }

    this.name = name;

    Thread parent = currentThread();
    SecurityManager security = System.getSecurityManager();
    if (g == null) {
        /* Determine if it's an applet or not */

        /* If there is a security manager, ask the security manager
               what to do. */
        if (security != null) {
            g = security.getThreadGroup();
        }

        /* If the security doesn't have a strong opinion of the matter
               use the parent thread group. */
        if (g == null) {
            g = parent.getThreadGroup();
        }
    }

    /* checkAccess regardless of whether or not threadgroup is
           explicitly passed in. */
    g.checkAccess();

    /*
         * Do we have the required permissions?
         */
    if (security != null) {
        if (isCCLOverridden(getClass())) {
            security.checkPermission(SUBCLASS_IMPLEMENTATION_PERMISSION);
        }
    }

    g.addUnstarted();

    this.group = g;
    this.daemon = parent.isDaemon();
    this.priority = parent.getPriority();
    if (security == null || isCCLOverridden(parent.getClass()))
        this.contextClassLoader = parent.getContextClassLoader();
    else
        this.contextClassLoader = parent.contextClassLoader;
    this.inheritedAccessControlContext =
        acc != null ? acc : AccessController.getContext();
    this.target = target;
    setPriority(priority);
    // inheritThreadLocals 默认为 true，如果 父线程中的inheritableThreadLocals不为空
    // 就将其拷贝到子线程中的inheritableThreadLocals
    if (inheritThreadLocals && parent.inheritableThreadLocals != null)
        this.inheritableThreadLocals =
        ThreadLocal.createInheritedMap(parent.inheritableThreadLocals);
    /* Stash the specified stack size in case the VM cares */
    this.stackSize = stackSize;

    /* Set thread ID */
    tid = nextThreadID();
}

注意：如果父线程中的 inheritableThreadLocal 重新执行了 set 方法，那么子线程执行 get 方法是获取不到父线程中 set 的数据，因为它们是两个不同的引用，和深拷贝浅拷贝无关。

内存泄漏分析

ThreadLocalMap使用 ThreadLocal 的弱引用作为key，如果一个ThreadLocal没有强引用指向它， GC 的时候就会被回收，这样 ThreadLocalMap 中就会出现key为null的Entry，就没有办法访问这些key为null的Entry的value。

不过，从源码中可以看出，ThreadLocal 的设计者已经考虑到了这种情况，并且在 ThreadLocal 的get()，set()，remove()的时候都会清除线程 ThreadLocalMap 里所有key为 null 的 value 。

弱引用

内存泄漏的根源并非在于使用了弱引用。

分情况讨论：

key 使用强引用：引用的 ThreadLocal 的对象被回收了，但是 ThreadLocalMap 还持有 ThreadLocal 的强引用，如果没有手动删除，ThreadLocal 不会被回收，导致整个 Entry 内存泄漏。
key 使用弱引用：引用的 ThreadLocal 的对象被回收了，由于 ThreadLocalMap 持有 ThreadLocal 的弱引用，即使没有手动删除，ThreadLocal 也会被回收，但 value 会内存泄漏。不过最终 value 会在下一次ThreadLocalMap调用set,get，remove的时候会被清除。

为什么 value 不使用弱引用呢？

不设置为弱引用，是因为不清楚这个value 除了 ThreadLocalMap 的引用还是否还存在其他引用，如果不存在其他引用，当GC的时候 value 干掉了，而此时我们的 ThreadLocal 还处于使用期间，就会造成 value 为null的错误，所以将其设置为强引用。

WeakHashMap 则是将整个 Entry（key 和 value）都设置成了弱引用，不过它们被设计出来的作用不一样。

static

ThreadLocal 变量一般都会采用 static 修饰，这样做既有好处，也有坏处。

将 ThreadLocal 声明为某个类的实例变量，那么每创建一个该类的实例就会导致一个新的 ThreadLocal 实例被创建，可能会导致浪费。

将 ThreadLocal 声明为某个类的静态变量，那么就有了强引用，即使发生 GC 也不会回收threadLocalMap中的key。分以下情况讨论：

如果是新建线程，线程消亡时，threadLocalMap整个都会被回收。
如果是使用线程池，下一次使用时则会覆盖 value。

小结

内存泄漏的根源在于 ThreadLocalMap 的生命周期跟Thread一样长，不管使用弱引用还是强引用都会导致内存泄漏，但是使用弱引用可以多一层保障：弱引用 ThreadLocal不会内存泄漏，对应的 value 在下一次ThreadLocalMap调用set,get,remove的时候会被清除。

要想避免内存泄漏：每次使用完 ThreadLocal，都调用它的remove()方法，清除数据。

应用场景分析

数据库连接
数据库 session 管理
Spring 的声明式事务
Spring 的 RequestContextHolder
web 项目的日志切面
web 项目的用户信息
避免一些复杂的参数传递

哈希冲突分析

通常用如下几种解决 Hash 冲突的方法：

开放定址法：其基本思想是：当关键字key的哈希地址p=H（key）出现冲突时，以p为基础，产生另一个哈希地址p1，如果p1仍然冲突，再以p为基础，产生另一个哈希地址p2，…，直到找出一个不冲突的哈希地址pi ，将相应元素存入其中。
- 线性探测
- 二次探测
- 伪随机探测
拉链法：对于相同的值，使用链表进行连接，使用数组存储每一个链表。
再哈希：对于冲突的哈希值再次进行哈希处理，直至没有哈希冲突。
建立公共溢出区：建立公共溢出区存储所有哈希冲突的数据。

拉链法与开放地址法对比

拉链法优点：

开放定址法为减少冲突，要求装填因子α较小，故当结点规模较大时会浪费很多空间。而拉链法中可取α≥1，且结点较大时，拉链法中增加的指针域可忽略不计，因此节省空间；
在用拉链法构造的散列表中，删除结点的操作易于实现。只要简单地删去链表上相应的结点即可。而对开放地址法构造的散列表，删除结点不能简单地将被删结点的空间置为空，否则将截断在它之后填人散列表的同义词结点的查找路径。这是因为各种开放地址法中，空地址单元(即开放地址)都是查找失败的条件。因此在用开放地址法处理冲突的散列表上执行删除操作，只能在被删结点上做删除标记，而不能真正删除结点。
拉链法处理冲突简单，且无堆积现象，即非同义词决不会发生冲突，因此平均查找长度较短；
由于拉链法中各链表上的结点空间是动态申请的，故它更适合于造表前无法确定表长的情况；

拉链法缺点：

指针需要额外的空间，故当结点规模较小时，开放定址法较为节省空间，而若将节省的指针空间用来扩大散列表的规模，可使装填因子变小，这又减少了开放定址法中的冲突，从而提高平均查找速度。