A bug in the thread pool, watch out!

Source: https://segmentfault.com/a/1190000021109130

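Problem description

A few days ago I was helping a colleague troubleshoot an intermittent thread pool error in production.

The logic is simple: a thread pool runs an asynchronous task that returns a result. Recently, though, it has occasionally thrown this error: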

java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask@a5acd19 rejected from java.util.concurrent.ThreadPoolExecutor@30890a38[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]

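The simulation code and the problems described in this article were all written and reproduced on HotSpot Java 8 (1.8.0_221).

Below is the simulation code. It creates a single-thread pool via Executors.newSingleThreadExecutor and then has the caller fetch the result from the Future.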

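import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class ThreadPoolTest {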

    public static void main(String[] args) {
        final ThreadPoolTest threadPoolTest = new ThreadPoolTest();
        for (int i = 0; i < 8; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {

                        Future<String> future = threadPoolTest.submit();
                        try {
                            String s = future.get();
                        } catch (InterruptedException e) {
                            e.printStackTrace();
                        } catch (ExecutionException e) {
                            e.printStackTrace();
                        } catch (Error e) {
                            e.printStackTrace();
                        }
                    }
                }
            }).start();
        }
        
        // A separate thread triggers GC continuously to simulate occasional GCs
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    System.gc();
                }
            }
        }).start();
    }

    /**
     * Runs a task asynchronously.
     * @return a Future holding the task's result
     */
    public Future<String> submit() {
        // Key point: create a single-thread pool via Executors.newSingleThreadExecutor
        ExecutorService executorService = Executors.newSingleThreadExecutor();
        FutureTask<String> futureTask = new FutureTask<>(new Callable<String>() {
            @Override
            public String call() throws Exception {
                Thread.sleep(50);
                return System.currentTimeMillis() + "";
            }
        });
        executorService.execute(futureTask);
        return futureTask;
    }

}

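Analysis & questions

The first question to consider: why was the thread pool shut down, when nothing in the code shuts it down manually? Let's look at the source of Executors.newSingleThreadExecutor: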

public static ExecutorService newSingleThreadExecutor() {
    return new FinalizableDelegatedExecutorService
            (new ThreadPoolExecutor(1, 1,
                    0L, TimeUnit.MILLISECONDS,
                    new LinkedBlockingQueue<Runnable>()));
}

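What gets created here is actually a FinalizableDelegatedExecutorService. This wrapper class overrides finalize, which means that before the object is reclaimed by GC, it first calls the thread pool's shutdown method.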
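
To connect this with the error at the top: once shutdown has been called, any later execute is rejected. The following is only a minimal sketch of that behavior; the class name RejectAfterShutdownDemo and the method rejectsAfterShutdown are made up for illustration, and shutdown is called by hand here to stand in for the call that finalize would make.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo class, not part of the original code.
class RejectAfterShutdownDemo {

    /** Returns true if the executor rejects a task submitted after shutdown(). */
    static boolean rejectsAfterShutdown() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Called manually here; in the bug scenario finalize() makes this call.
        executor.shutdown();
        try {
            executor.execute(() -> { });
            return false;
        } catch (RejectedExecutionException e) {
            // The message includes the task and the state of the pool.
            System.out.println(e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + rejectsAfterShutdown());
    }
}
```

So if finalize runs between the creation of the pool and the call to execute, the caller would see the same kind of RejectedExecutionException.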

Here is the puzzle: GC only reclaims unreachable objects, and until the stack frame of submit has finished and been popped, executorService should still be reachable.

For this question, here is the conclusion first:

finalize may run even while the object is still in scope (its stack frame is still live)

The Java Language Specification published by Oracle (section 12.6.1, on implementing finalization) has a passage about this:

A reachable object is any object that can be accessed in any potential continuing computation from any live thread.

Optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. For example, a Java compiler or code generator may choose to set a variable or parameter that will no longer be used to null to cause the storage for such an object to be potentially reclaimable sooner.

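Roughly: a reachable object is any object that can be accessed in any potential continuing computation from any live thread. A Java compiler or code generator may set a variable that will no longer be used to null, so that the object can be reclaimed sooner.

In other words, under JVM optimizations, an object that is no longer used may be nulled out and reclaimed early.

Here is an example to verify this, taken from: https://stackoverflow.com/questions/24376768/can-java-finalize-an-object-when-it-is-still-in-scope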

class A {
    @Override protected void finalize() {
        System.out.println(this + " was finalized!");
    }

    public static void main(String[] args) throws InterruptedException {
        A a = new A();
        System.out.println("Created " + a);
        for (int i = 0; i < 1_000_000_000; i++) {
            if (i % 1_000_00 == 0)
                System.gc();
        }
        System.out.println("done.");
    }
}

// Output
Created A@1be6f5c3
A@1be6f5c3 was finalized! // printed by finalize()
done.

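The example shows that if a is no longer used after the loop, finalize can run before the loop finishes. In terms of scope the method has not returned and its stack frame has not been popped, yet finalize still runs early.

Now add one line at the very end that prints a, so that the compiler/code generator sees a later reference to the object: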

...
System.out.println(a);

// Output
Created A@1be6f5c3
done.
A@1be6f5c3

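Judging from the output, finalize never ran at all (the process exits as soon as main finishes), so there is no premature finalization either.

Building on that result, let's test one more case: set a to null before the loop, and still print a at the end to keep a reference to the variable.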

A a = new A();
System.out.println("Created " + a);
a = null; // manually set to null
for (int i = 0; i < 1_000_000_000; i++) {
    if (i % 1_000_00 == 0)
        System.gc();
}
System.out.println("done.");
System.out.println(a);

// Output
Created A@1be6f5c3
A@1be6f5c3 was finalized!
done.
null

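Judging from the output, manually setting the variable to null also lets the object be reclaimed early. The variable is still referenced at the end, but by then it holds null.

Now back to the thread pool problem. By the mechanism described above, once analysis shows that there are no further references, the object can be finalized early.

But in the code above, executorService is clearly still referenced before the return, in executorService.execute(futureTask). Why is it finalized early anyway?

My guess is that execute calls into the ThreadPoolExecutor, which creates and starts a new thread. That triggers an active thread switch, which makes the object unreachable from the live threads.

Combined with the passage quoted above ("a reachable object is any object that can be accessed in any potential continuing computation from any live thread"), it may be that an explicit thread switch caused the object to be considered unreachable, so the thread pool was finalized early.

Let's verify the guess: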

// Entry point
public class FinalizedTest {
    public static void main(String[] args) {
        final FinalizedTest finalizedTest = new FinalizedTest();
        for (int i = 0; i < 8; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    while (true) {
                        TFutureTask future = finalizedTest.submit();
                    }
                }
            }).start();
        }
        new Thread(new Runnable() {
            @Override
            public void run() {
                while (true) {
                    System.gc();
                }
            }
        }).start();
    }
    public TFutureTask submit(){
        TExecutorService executorService = Executors.create();
        executorService.execute();
        return null;
    }
}

// Executors.java, mimics Executors from java.util.concurrent
public class Executors {
    /**
     * Mimics Executors.newSingleThreadExecutor
     * @return a wrapper whose finalize shuts down the underlying pool
     */
    public static TExecutorService create(){
        return new FinalizableDelegatedTExecutorService(new TThreadPoolExecutor());
    }

    static class FinalizableDelegatedTExecutorService extends DelegatedTExecutorService {

        FinalizableDelegatedTExecutorService(TExecutorService executor) {
            super(executor);
        }
        
        /**
         * The finalizer calls shutdown, which changes the pool state
         * @throws Throwable
         */
        @Override
        protected void finalize() throws Throwable {
            super.shutdown();
        }
    }

    static class DelegatedTExecutorService extends TExecutorService {

        protected TExecutorService e;

        public DelegatedTExecutorService(TExecutorService executor) {
            this.e = executor;
        }

        @Override
        public void execute() {
            e.execute();
        }

        @Override
        public void shutdown() {
            e.shutdown();
        }
    }
}

// TThreadPoolExecutor.java, mimics ThreadPoolExecutor from java.util.concurrent
import java.util.concurrent.atomic.AtomicBoolean;

public class TThreadPoolExecutor extends TExecutorService {

    /**
     * Pool state. false: running, true: shut down
     */
    private AtomicBoolean ctl = new AtomicBoolean();

    @Override
    public void execute() {
        // Start a new thread, mimicking ThreadPoolExecutor.execute
        new Thread(new Runnable() {
            @Override
            public void run() {

            }
        }).start();
        // Mimics ThreadPoolExecutor: after starting the new thread, poll the pool state in a loop
        // to check whether finalize has shut it down. If the pool was shut down early, throw.
        for (int i = 0; i < 1_000_000; i++) {
            if(ctl.get()){
                throw new RuntimeException("reject!!!["+ctl.get()+"]");
            }
        }
    }

    @Override
    public void shutdown() {
        ctl.compareAndSet(false,true);
    }
}
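
The listings above use TExecutorService and TFutureTask without showing them. The following is a guess at the smallest definitions that would make the code compile; the type names come from the listings, but the bodies and the SupportingTypesDemo helper are assumptions, not the author's code.

```java
// Assumed base type: the listings extend it and override execute() and shutdown().
abstract class TExecutorService {
    public abstract void execute();
    public abstract void shutdown();
}

// Assumed placeholder result type; submit() in the listing always returns null.
class TFutureTask {
}

// Hypothetical helper that only checks the assumed types can be used as in the listings.
class SupportingTypesDemo {

    /** Builds a trivial TExecutorService and reports whether shutdown() was observed. */
    static boolean shutdownIsObserved() {
        final boolean[] closed = {false};
        TExecutorService service = new TExecutorService() {
            @Override
            public void execute() {
                // no work needed for this check
            }

            @Override
            public void shutdown() {
                closed[0] = true;
            }
        };
        service.execute();
        service.shutdown();
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println("shutdown observed = " + shutdownIsObserved());
    }
}
```

With two types like these on the classpath, the FinalizedTest program above compiles and can be run.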

After running for a while, it fails with:

Exception in thread "Thread-1" java.lang.RuntimeException: reject!!![true]

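Judging from the error, the "thread pool" was shut down early here as well. But is creating a new thread necessarily the cause?

Let's replace the thread creation with Thread.sleep and test again: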

// TThreadPoolExecutor.java, the modified execute method
public void execute() {
    try {
        // Explicitly sleep for 1 ns to force a thread switch
        TimeUnit.NANOSECONDS.sleep(1);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    // Mimics ThreadPoolExecutor: after the sleep, poll the pool state in a loop
    // to check whether finalize has shut it down. If the pool was shut down early, throw.
    for (int i = 0; i < 1_000_000; i++) {
        if(ctl.get()){
            throw new RuntimeException("reject!!!["+ctl.get()+"]");
        }
    }
}

The result is the same error:

Exception in thread "Thread-3" java.lang.RuntimeException: reject!!![true]

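This suggests that if an explicit thread switch happens during execution, the compiler/code generator may treat the outer wrapper object as unreachable.

Summary

Although GC only reclaims objects that are unreachable from the GC roots, optimizations by the compiler (the text does not say which one; it may well be the JIT) or code generator can null out a reference early, or a thread switch can make an object "unreachable early".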

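So if you want to do something in a finalize method, be sure to explicitly reference the object at the end of the code that uses it (calling toString or hashCode is enough) to keep the object reachable.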
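
On Java 9 and later, Reference.reachabilityFence offers a documented way to keep an object reachable up to a given point. The sketch below is illustrative only: FenceDemo, Resource and finalizedDuringLoop are hypothetical names, and the loop count is arbitrary. It checks that finalize does not run while the loop is executing when a fence follows the loop.

```java
import java.lang.ref.Reference;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; requires Java 9+ for Reference.reachabilityFence.
class FenceDemo {

    /** Object whose finalize() records that it ran. */
    static class Resource {
        private final AtomicBoolean finalized;

        Resource(AtomicBoolean finalized) {
            this.finalized = finalized;
        }

        @Override
        protected void finalize() {
            finalized.set(true);
        }
    }

    /** Returns true if finalize() was observed while the loop was still running. */
    static boolean finalizedDuringLoop() {
        AtomicBoolean finalized = new AtomicBoolean(false);
        Resource resource = new Resource(finalized);
        boolean early = false;
        try {
            for (int i = 0; i < 20; i++) {
                System.gc();
                early |= finalized.get();
            }
        } finally {
            // Keeps resource strongly reachable at least until this point.
            Reference.reachabilityFence(resource);
        }
        return early;
    }

    public static void main(String[] args) {
        System.out.println("finalized during loop = " + finalizedDuringLoop());
    }
}
```

The try/finally shape follows the usage shown in the JDK 11 snippet near the end of this article; on Java 8 the fence is not available and the toString/hashCode approach described above is the fallback.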

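The claim above that a thread switch makes the object unreachable has no official documentation behind it. It is only the result of my own testing, so corrections are welcome.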

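To sum up, this reclamation behavior is not a JDK bug but an optimization strategy that simply reclaims objects earlier. However, the way Executors.newSingleThreadExecutor relies on finalize to shut the pool down automatically is buggy: after optimization, the pool may be shut down prematurely, which leads to the exception.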
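
On the calling side, one way to avoid depending on finalize is to shut the pool down explicitly. The sketch below assumes a short-lived, per-call executor as in the simulation code; ExplicitShutdownDemo and runOnce are hypothetical names. Because the local variable is used again in the finally block, the wrapper should remain reachable for the whole submit call.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class, not part of the original code.
class ExplicitShutdownDemo {

    /** Runs one task on a short-lived executor and always shuts it down explicitly. */
    static String runOnce() {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = executor.submit(() -> "task finished");
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            // Explicit shutdown: releases the worker thread and uses the
            // executor reference after the task has been submitted.
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

In real code, a single long-lived executor held in a field and shut down when the application stops is usually a better fit than creating one per call.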

This thread pool problem is also listed in the JDK bug tracker as a public, unresolved issue: https://bugs.openjdk.java.net/browse/JDK-8145304.

In JDK 11, however, the problem has been fixed:

JUC Executors.DelegatedExecutorService (the superclass of FinalizableDelegatedExecutorService)
public void execute(Runnable command) {
    try {
        e.execute(command);
    } finally { reachabilityFence(this); }
}
