
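NullReferenceException is probably the exception .NET programmers run into most often. It occurs so frequently
that a great deal of effort has gone into language features and constraints meant to prevent it, yet it still gives many programmers headaches today. In this post I will explain how this troublesome exception actually comes about.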

Source code that can cause a NullReferenceException

First, let's look at what kind of code can cause a NullReferenceException:

The first sample: calling a method when this is null triggers the exception

using System;

namespace ConsoleApp1
{
    class Program
    {
        public class MyClass
        {
            public int MyMember;
            public void MyMethod() { }
        }
        
        static void Main(string[] args)
        {
            MyClass obj = null;
            obj.MyMethod();
        }
    }
}

The second sample: accessing a member when this is null triggers the exception

using System;

namespace ConsoleApp1
{
    class Program
    {
        public class MyClass
        {
            public int MyMember;
            public void MyMethod() { }
        }
        
        static void Main(string[] args)
        {
            MyClass obj = null;
            Console.WriteLine(obj.MyMember);
        }
    }
}

Looking at the generated IL

Now let's look at the generated IL:

IL for the first sample

.method private hidebysig static 
    void Main (
        string[] args
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 11 (0xb)
    .maxstack 1
    .entrypoint
    .locals init (
        [0] class ConsoleApp1.Program/MyClass
    )

    IL_0000: nop
    IL_0001: ldnull
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: callvirt instance void ConsoleApp1.Program/MyClass::MyMethod()
    IL_0009: nop
    IL_000a: ret
} // end of method Program::Main

IL for the second sample

.method private hidebysig static 
    void Main (
        string[] args
    ) cil managed 
{
    // Method begins at RVA 0x2050
    // Code size 16 (0x10)
    .maxstack 1
    .entrypoint
    .locals init (
        [0] class ConsoleApp1.Program/MyClass
    )

    IL_0000: nop
    IL_0001: ldnull
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: ldfld int32 ConsoleApp1.Program/MyClass::MyMember
    IL_0009: call void [System.Console]System.Console::WriteLine(int32)
    IL_000e: nop
    IL_000f: ret
} // end of method Program::Main

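Notice anything? Probably not, and neither do I. Neither listing contains an explicit null-check instruction, which tells us the null check is not implemented at the IL level, so we need to keep digging.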

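Looking at the generated assembly

Here is the generated assembly:

Assembly generated for the first sample (the output differs by architecture; the listing below was generated on Windows x64)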

    10:         static void Main(string[] args) {
00007FF9F5C30482 56                   push        rsi  
00007FF9F5C30483 48 83 EC 30          sub         rsp,30h  
00007FF9F5C30487 48 8B EC             mov         rbp,rsp  
00007FF9F5C3048A 33 C0                xor         eax,eax  
00007FF9F5C3048C 48 89 45 20          mov         qword ptr [rbp+20h],rax  
00007FF9F5C30490 48 89 45 28          mov         qword ptr [rbp+28h],rax  
00007FF9F5C30494 48 89 4D 50          mov         qword ptr [rbp+50h],rcx  
00007FF9F5C30498 83 3D 49 48 EA FF 00 cmp         dword ptr [7FF9F5AD4CE8h],0  
00007FF9F5C3049F 74 05                je          00007FF9F5C304A6  
00007FF9F5C304A1 E8 1A B5 C0 5F       call        00007FFA5583B9C0  
00007FF9F5C304A6 90                   nop  
    11:             MyClass obj = null;
00007FF9F5C304A7 33 C9                xor         ecx,ecx  
00007FF9F5C304A9 48 89 4D 20          mov         qword ptr [rbp+20h],rcx  
    12:             obj.MyMethod();
00007FF9F5C304AD 48 8B 4D 20          mov         rcx,qword ptr [rbp+20h]  
00007FF9F5C304B1 39 09                cmp         dword ptr [rcx],ecx  
00007FF9F5C304B3 E8 E8 FB FF FF       call        00007FF9F5C300A0  
00007FF9F5C304B8 90                   nop  
    13:         }

Assembly generated for the second sample

    10:         static void Main(string[] args) {
00007FF9F5C20B22 56                   push        rsi  
00007FF9F5C20B23 48 83 EC 30          sub         rsp,30h  
00007FF9F5C20B27 48 8B EC             mov         rbp,rsp  
00007FF9F5C20B2A 33 C0                xor         eax,eax  
00007FF9F5C20B2C 48 89 45 20          mov         qword ptr [rbp+20h],rax  
00007FF9F5C20B30 48 89 45 28          mov         qword ptr [rbp+28h],rax  
00007FF9F5C20B34 48 89 4D 50          mov         qword ptr [rbp+50h],rcx  
00007FF9F5C20B38 83 3D A9 41 EA FF 00 cmp         dword ptr [7FF9F5AC4CE8h],0  
00007FF9F5C20B3F 74 05                je          00007FF9F5C20B46  
00007FF9F5C20B41 E8 7A AE C1 5F       call        00007FFA5583B9C0  
00007FF9F5C20B46 90                   nop  
    11:             MyClass obj = null;
00007FF9F5C20B47 33 C9                xor         ecx,ecx  
00007FF9F5C20B49 48 89 4D 20          mov         qword ptr [rbp+20h],rcx  
    12:             Console.WriteLine(obj.MyMember);
00007FF9F5C20B4D 48 8B 4D 20          mov         rcx,qword ptr [rbp+20h]  
00007FF9F5C20B51 8B 49 08             mov         ecx,dword ptr [rcx+8]  
00007FF9F5C20B54 E8 87 FB FF FF       call        00007FF9F5C206E0  
00007FF9F5C20B59 90                   nop  
    13:         }

The assembly gives us a clue. Note the following instruction in the first sample

00007FF9F5C304B1 39 09                cmp         dword ptr [rcx],ecx

and the following instruction in the second sample

00007FF9F5C20B51 8B 49 08             mov         ecx,dword ptr [rcx+8]

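The first sample contains an odd-looking extra cmp instruction.
It compares the value stored at the address in rcx with ecx, but nothing uses the result of the comparison (no je, jne, or similar follows).
This instruction is the null check in disguise.
The rcx register holds the pointer to obj, which is also the first argument (this) of the call instruction below it.
If rcx is 0 (obj is null), reading [rcx] fails and the instruction faults.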

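In the second sample, mov ecx,dword ptr [rcx+8] loads the MyMember field of the obj pointed to by rcx into ecx.
You can read it as the C statement int myMember = obj->MyMember; or int myMember = *(int*)(((char*)obj)+8);,
where 8 is the offset of MyMember from the start of the object.
Now imagine obj is null: rcx+8 equals 8,
and because nothing is mapped at memory address 8, the instruction faults.
Since this instruction already doubles as a null check, the second sample has no separate cmp instruction like the first one.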
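To make the offset arithmetic concrete, here is a minimal C sketch. The struct fake_object and its field names are made up for illustration. It assumes the object begins with one pointer-sized header (the method table pointer), as the listing above suggests, and it is not CoreCLR's real object layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a managed object:
   a pointer-sized header followed by the first field. */
struct fake_object {
    void *method_table; /* assumed header: one pointer */
    int32_t my_member;  /* plays the role of MyClass.MyMember */
};

/* Offset of my_member from the start of the object. */
size_t member_offset(void) {
    return offsetof(struct fake_object, my_member);
}

/* Read the field the way source code writes it: obj->my_member. */
int32_t read_member(const struct fake_object *obj) {
    return obj->my_member;
}

/* Read the same field through explicit pointer arithmetic:
   base address plus member offset. */
int32_t read_member_by_offset(const struct fake_object *obj) {
    return *(const int32_t *)((const char *)obj + member_offset());
}
```

With obj equal to NULL, both read functions would compute the address 0 + offset and fault there, which is the behavior the generated machine code relies on.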

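Readers familiar with C may ask: doesn't the program exit immediately when an instruction like this faults?
Yes, it does. Without special handling, accessing ((MyClass*)NULL)->MyMember terminates the program immediately.
So how does CoreCLR deal with it?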

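After the instruction faults

When a CPU instruction fails (an invalid memory access, a division by zero, and so on), the CPU raises a hardware exception to the kernel, and the kernel then terminates the offending process.
Before terminating it, however, the kernel gives the process a chance to recover, and the mechanism differs between Windows and Linux.

On Linux, a failed memory access can be handled by catching SIGSEGV. Sample code: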

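#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <setjmp.h>

jmp_buf recover_point;

static void sigsegv_handler(int sig, siginfo_t* si, void* unused) {
    // fprintf is not async-signal-safe; it is used here only to keep the demo short
    fprintf(stderr, "caught sigsegv\n");
    // jump back to the setjmp call in main, which then returns 1
    longjmp(recover_point, 1);
}

int main() {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_sigaction = sigsegv_handler;
    action.sa_flags = SA_SIGINFO; // use the three-argument sa_sigaction handler
    sigemptyset(&action.sa_mask);
    if (sigaction(SIGSEGV, &action, NULL) != 0) {
        perror("bind signal handler failed");
        abort();
    }
    
    if (setjmp(recover_point) == 0) {
        // volatile keeps the compiler from optimizing away the faulting store
        volatile int* ptr = NULL;
        *ptr = 1; // faults and raises SIGSEGV
    } else {
        printf("recover success\n");
    }
    return 0;
}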

On Windows, hardware exceptions can be handled by registering a VectoredExceptionHandler. Sample code:

#include "stdafx.h"
#include <stdio.h>
#include <Windows.h>
#include <setjmp.h>

void* gVectoredExceptionHandler = NULL;
jmp_buf gRecoverPoint;

LONG WINAPI MyVectoredExceptionHandler(PEXCEPTION_POINTERS pExceptionInfo)
{
    if (pExceptionInfo->ExceptionRecord->ExceptionCode == STATUS_ACCESS_VIOLATION)
    {
        fprintf(stderr, "caught access violation\n");
        longjmp(gRecoverPoint, 1);
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

int main()
{
    gVectoredExceptionHandler = AddVectoredExceptionHandler(
        TRUE, (PVECTORED_EXCEPTION_HANDLER)MyVectoredExceptionHandler);

    if (setjmp(gRecoverPoint) == 0)
    {
        int* ptr = NULL;
        *ptr = 1;
    }
    else
    {
        printf("recover success\n");
    }
    return 0;
}

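In the code above I use longjmp to recover from the exception. It is the simplest approach, but it also brings many problems. Next, let's see how CoreCLR handles these exceptions.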
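One of those problems is easy to show. The signal being handled is blocked while its handler runs, and POSIX does not guarantee that a plain longjmp restores the signal mask, so a second fault may no longer reach the handler. The sketch below is one possible remedy, not what CoreCLR does: it uses sigsetjmp and siglongjmp, which save and restore the mask. It assumes a POSIX system such as Linux, and the names g_recover_point, install_sigsegv_handler, and try_store are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf g_recover_point; /* hypothetical name */

static void on_sigsegv(int sig) {
    (void)sig;
    /* Jump back to sigsetjmp; the saved signal mask is restored as well. */
    siglongjmp(g_recover_point, 1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_sigsegv_handler(void) {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_handler = on_sigsegv;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGSEGV, &action, NULL);
}

/* Try to store value through p.
   Returns 1 on success and 0 if the store faulted. */
int try_store(volatile int *p, int value) {
    if (sigsetjmp(g_recover_point, 1) == 0) { /* 1 = also save the signal mask */
        *p = value; /* faults when p is NULL */
        return 1;
    }
    return 0; /* reached through siglongjmp from the handler */
}
```

After install_sigsegv_handler() succeeds, try_store(NULL, 1) can be called repeatedly and reports the failure each time, because SIGSEGV is unblocked again after every recovery.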

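Handling in CoreCLR (Linux, OSX)

Let's first look at how CoreCLR handles this on Linux. The code below comes from CoreCLR 1.1.0, and the logic on OSX is the same as on Linux.

First, CoreCLR registers a handler for SIGSEGV. The following code can be found in pal\src\exception\signal.cpp: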

BOOL SEHInitializeSignals(DWORD flags)
{
    TRACE("Initializing signal handlers\n");

    /* we call handle_signal for every possible signal, even
       if we don't provide a signal handler.

       handle_signal will set SA_RESTART flag for specified signal.
       Therefore, all signals will have SA_RESTART flag set, preventing
       slow Unix system calls from being interrupted. On systems without
       siginfo_t, SIGKILL and SIGSTOP can't be restarted, so we don't
       handle those signals. Both the Darwin and FreeBSD man pages say
       that SIGKILL and SIGSTOP can't be handled, but FreeBSD allows us
       to register a handler for them anyway. We don't do that.

       see sigaction man page for more details
       */
    handle_signal(SIGILL, sigill_handler, &g_previous_sigill);
    handle_signal(SIGTRAP, sigtrap_handler, &g_previous_sigtrap);
    handle_signal(SIGFPE, sigfpe_handler, &g_previous_sigfpe);
    handle_signal(SIGBUS, sigbus_handler, &g_previous_sigbus);
    handle_signal(SIGSEGV, sigsegv_handler, &g_previous_sigsegv);
    handle_signal(SIGINT, sigint_handler, &g_previous_sigint);
    handle_signal(SIGQUIT, sigquit_handler, &g_previous_sigquit);

Besides SIGSEGV, handlers for several other signals are registered here as well. Next, the body of sigsegv_handler:

static void sigsegv_handler(int code, siginfo_t *siginfo, void *context)
{
    if (PALIsInitialized())
    {
        // TODO: First variable parameter says whether a read (0) or write (non-0) caused the
        // fault. We must disassemble the instruction at record.ExceptionAddress
        // to correctly fill in this value.
        if (common_signal_handler(code, siginfo, context, 2, (size_t)0, (size_t)siginfo->si_addr))
        {
            return;
        }
    }

    if (g_previous_sigsegv.sa_sigaction != NULL)
    {
        g_previous_sigsegv.sa_sigaction(code, siginfo, context);
    }
    else
    {
        // Restore the original or default handler and restart h/w exception
        restore_signal(code, &g_previous_sigsegv);
    }

    PROCNotifyProcessShutdown();
}
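The pattern in sigsegv_handler (try to handle the signal, otherwise fall back to whatever was registered before) can be tried out in isolation. The following is a simplified sketch, not CoreCLR code: it uses SIGUSR1 so that nothing actually has to crash, and every name in it (first_handler, second_handler, register_handler, g_claim_signal, and so on) is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static struct sigaction g_previous_action;        /* what was registered before us */
static volatile sig_atomic_t g_first_calls = 0;
static volatile sig_atomic_t g_second_calls = 0;
static volatile sig_atomic_t g_claim_signal = 0;  /* 1 = we handle the signal ourselves */

/* Stands in for a handler that some other component installed earlier. */
static void first_handler(int sig, siginfo_t *info, void *context) {
    (void)sig; (void)info; (void)context;
    g_first_calls++;
}

/* Our handler: if we do not handle the signal ourselves,
   forward it to the previously registered handler. */
static void second_handler(int sig, siginfo_t *info, void *context) {
    g_second_calls++;
    if (g_claim_signal) {
        return; /* handled here, do not forward */
    }
    if ((g_previous_action.sa_flags & SA_SIGINFO) &&
        g_previous_action.sa_sigaction != NULL) {
        g_previous_action.sa_sigaction(sig, info, context);
    }
}

/* Register handler for sig and store the old registration in *previous (may be NULL). */
static int register_handler(int sig, void (*handler)(int, siginfo_t *, void *),
                            struct sigaction *previous) {
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_sigaction = handler;
    action.sa_flags = SA_SIGINFO | SA_RESTART;
    sigemptyset(&action.sa_mask);
    return sigaction(sig, &action, previous);
}

/* Install both handlers on SIGUSR1; returns 0 on success. */
int install_chain(void) {
    if (register_handler(SIGUSR1, first_handler, NULL) != 0) return -1;
    return register_handler(SIGUSR1, second_handler, &g_previous_action);
}

int first_calls(void) { return g_first_calls; }
int second_calls(void) { return g_second_calls; }
void set_claim_signal(int claim) { g_claim_signal = claim; }
```

Calling install_chain() and then raise(SIGUSR1) runs both handlers; after set_claim_signal(1), only the newer handler runs, which mirrors the early return in sigsegv_handler when common_signal_handler reports success.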

The body of common_signal_handler:

static bool common_signal_handler(int code, siginfo_t *siginfo, void *sigcontext, int numParams, ...)
{
    sigset_t signal_set;
    CONTEXT *contextRecord;
    EXCEPTION_RECORD *exceptionRecord;
    native_context_t *ucontext;

    ucontext = (native_context_t *)sigcontext;

    AllocateExceptionRecords(&exceptionRecord, &contextRecord);

    // Convert the signal to an exception code; here the code equals STATUS_ACCESS_VIOLATION (0xC0000005L) on Windows
    exceptionRecord->ExceptionCode = CONTEXTGetExceptionCodeForSignal(siginfo, ucontext);
    exceptionRecord->ExceptionFlags = EXCEPTION_IS_SIGNAL;
    exceptionRecord->ExceptionRecord = NULL;
    exceptionRecord->ExceptionAddress = GetNativeContextPC(ucontext);
    exceptionRecord->NumberParameters = numParams;

    va_list params;
    va_start(params, numParams);

    for (int i = 0; i < numParams; i++)
    {
        exceptionRecord->ExceptionInformation[i] = va_arg(params, size_t);
    }

    // Capture the context at the point where the exception occurred
    // Pre-populate context with data from current frame, because ucontext doesn't have some data (e.g. SS register)
    // which is required for restoring context
    RtlCaptureContext(contextRecord);

    ULONG contextFlags = CONTEXT_CONTROL | CONTEXT_INTEGER | CONTEXT_FLOATING_POINT;

#if defined(_AMD64_)
    contextFlags |= CONTEXT_XSTATE;
#endif

    // Fill context record with required information. from pal.h:
    // On non-Win32 platforms, the CONTEXT pointer in the
    // PEXCEPTION_POINTERS will contain at least the CONTEXT_CONTROL registers.
    CONTEXTFromNativeContext(ucontext, contextRecord, contextFlags);

    /* Unmask signal so we can receive it again */
    sigemptyset(&signal_set);
    sigaddset(&signal_set, code);
    int sigmaskRet = pthread_sigmask(SIG_UNBLOCK, &signal_set, NULL);
    if (sigmaskRet != 0)
    {
        ASSERT("pthread_sigmask failed; error number is %d\n", sigmaskRet);
    }

    contextRecord->ContextFlags |= CONTEXT_EXCEPTION_ACTIVE;
    // The exception object takes ownership of the exceptionRecord and contextRecord
    PAL_SEHException exception(exceptionRecord, contextRecord);

    // Convert to the same SEH exception type used on Windows and continue processing
    if (SEHProcessException(&exception))
    {
        // Exception handling may have modified the context, so update it.
        CONTEXTToNativeContext(contextRecord, ucontext);
        return true;
    }

    return false;
}
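The core idea of common_signal_handler, turning a Unix signal plus a variable number of parameters into a Windows-style exception record, can be sketched in a few lines of C. This is a simplified illustration under stated assumptions: simple_exception_record, exception_code_for_signal, and make_record are made-up names, and the signal-to-code mapping ignores si_code, which the real CONTEXTGetExceptionCodeForSignal also receives.

```c
#include <signal.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PARAMS 15 /* same value as EXCEPTION_MAXIMUM_PARAMETERS on Windows */

/* Windows status codes (values from the Windows SDK). */
#define STATUS_ACCESS_VIOLATION       0xC0000005u
#define STATUS_ILLEGAL_INSTRUCTION    0xC000001Du
#define STATUS_INTEGER_DIVIDE_BY_ZERO 0xC0000094u

/* Hypothetical, reduced version of EXCEPTION_RECORD. */
struct simple_exception_record {
    uint32_t code;
    int num_params;
    size_t info[MAX_PARAMS];
};

/* Simplified mapping; only the signal number is inspected here. */
uint32_t exception_code_for_signal(int sig) {
    switch (sig) {
    case SIGSEGV: return STATUS_ACCESS_VIOLATION;
    case SIGILL:  return STATUS_ILLEGAL_INSTRUCTION;
    case SIGFPE:  return STATUS_INTEGER_DIVIDE_BY_ZERO;
    default:      return 0; /* not treated as a hardware exception in this sketch */
    }
}

/* Build a record from a signal and num_params extra values.
   Every extra argument must be passed as size_t. */
struct simple_exception_record make_record(int sig, int num_params, ...) {
    struct simple_exception_record record = {0};
    va_list params;

    if (num_params < 0) num_params = 0;
    if (num_params > MAX_PARAMS) num_params = MAX_PARAMS;
    record.code = exception_code_for_signal(sig);
    record.num_params = num_params;

    va_start(params, num_params);
    for (int i = 0; i < num_params; i++) {
        record.info[i] = va_arg(params, size_t);
    }
    va_end(params);
    return record;
}
```

A call such as make_record(SIGSEGV, 2, (size_t)0, (size_t)0x8) corresponds to the way sigsegv_handler passes 2, (size_t)0, and the faulting address to common_signal_handler.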

Following the code any further would get very long, so I will only list the traced call flow here:

sigsegv_handler is triggered (pal\src\exception\signal.cpp)
    calls common_signal_handler (pal\src\exception\signal.cpp)
        calls SEHProcessException (pal\src\exception\seh.cpp)
            calls HandleHardwareException (vm\exceptionhandling.cpp)
                calls DispatchManagedException (vm\exceptionhandling.cpp)
                    calls UnwindManagedExceptionPass1 (vm\exceptionhandling.cpp:4503)
                        calls ProcessCLRException (vm\exceptionhandling.cpp:751)
                        calls UnwindManagedExceptionPass2 (vm\exceptionhandling.cpp:4357)
                            calls ProcessCLRException (vm\exceptionhandling.cpp:751)
                                calls ExceptionTracker::GetOrCreateTracker (vm\exceptionhandling.cpp:3613)
                                    calls ExceptionTracker::CreateThrowable (vm\exceptionhandling.cpp:4004)
                                        calls CreateCOMPlusExceptionObject (vm\excep.cpp:6978)
                                            calls MapWin32FaultToCOMPlusException (vm\excep.cpp:6996)
                                                STATUS_ACCESS_VIOLATION is converted to NullReferenceException here
                                                the resulting NullReferenceException is a preallocated global object
                                calls ExceptionTracker::ProcessOSExceptionNotification (vm\exceptionhandling.cpp:1589)
                                    this function calls the code in finally blocks
                                    calls ExceptionTracker::ProcessManagedCallFrame (vm\exceptionhandling.cpp:2321)
                                        calls ExceptionTracker::CallHandler (vm\exceptionhandling.cpp:3273)
                                calls ExceptionTracker::ResumeExecution (vm\exceptionhandling.cpp:3972)
                                    calls RtlRestoreContext (pal\src\arch\i386\context2.S)
                                        execution jumps to the matching exception-handling (catch) code here
                                        after the jump, processing continues there and never returns

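Summary:

On Linux
- Accessing a method or member of a null object raises the SIGSEGV signal
- After catching SIGSEGV, CoreCLR builds a Windows-style EXCEPTION_POINTERS structure from the signal
  - This lets it share the handling code with Windows
- While handling the exception, the exception code (0xC0000005L) is converted into a CLR NullReferenceException object
- The stack is unwound and the code in finally blocks is called
- Execution jumps to the matching exception-handling (catch) code

Exception handling is not the focus of this post, so I will not explain it in detail here (I have not fully figured it out yet).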
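The conversion step in the summary can be pictured as a lookup from a Windows-style exception code to a managed exception type. The C sketch below is only an illustration of that idea: managed_exception_for_fault is a made-up name, the table is an assumption that covers a few well-known codes, and the real MapWin32FaultToCOMPlusException handles more codes and additional conditions.

```c
#include <stdint.h>

/* Map a Windows-style exception code to the name of a managed exception type.
   Simplified and partly assumed; not a copy of the CoreCLR function. */
const char *managed_exception_for_fault(uint32_t code) {
    switch (code) {
    case 0xC0000005u: /* STATUS_ACCESS_VIOLATION */
        return "System.NullReferenceException";
    case 0xC0000094u: /* STATUS_INTEGER_DIVIDE_BY_ZERO */
        return "System.DivideByZeroException";
    case 0xC00000FDu: /* STATUS_STACK_OVERFLOW */
        return "System.StackOverflowException";
    default:          /* assumption: everything else surfaces as a generic SEH exception */
        return "System.Runtime.InteropServices.SEHException";
    }
}
```

Because the mapping works on the exception code alone, it can be shared by the Linux path (where the code is synthesized from a signal) and the Windows path (where the code comes from the operating system).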

Handling in CoreCLR (Windows)

On Windows, CoreCLR registers a VectoredHandler to handle hardware exceptions:

This is the CLRAddVectoredHandlers function in vm\excep.cpp, which is called at startup

void CLRAddVectoredHandlers(void)
{
#ifndef FEATURE_PAL

    // We now install a vectored exception handler on all supporting Windows architectures.
    g_hVectoredExceptionHandler = AddVectoredExceptionHandler(TRUE, (PVECTORED_EXCEPTION_HANDLER)CLRVectoredExceptionHandlerShim);
    if (g_hVectoredExceptionHandler == NULL)
    {
        LOG((LF_EH, LL_INFO100, "CLRAddVectoredHandlers: AddVectoredExceptionHandler() failed\n"));
        COMPlusThrowHR(E_FAIL);
    }

    LOG((LF_EH, LL_INFO100, "CLRAddVectoredHandlers: AddVectoredExceptionHandler() succeeded\n"));
#endif // !FEATURE_PAL
}

When a hardware exception occurs, this handler is called. Its code is also in vm\excep.cpp:

LONG WINAPI CLRVectoredExceptionHandlerShim(PEXCEPTION_POINTERS pExceptionInfo)
{
    //
    // HandleManagedFault will take a Crst that causes an unbalanced
    // notrigger scope, and this contract will whack the thread's
    // ClrDebugState to what it was on entry in the dtor, which causes
    // us to assert when we finally release the Crst later on.
    //
//    CONTRACTL
//    {
//        NOTHROW;
//        GC_NOTRIGGER;
//        MODE_ANY;
//    }
//    CONTRACTL_END;

    //
    // WARNING WARNING WARNING WARNING WARNING WARNING WARNING
    //
    // o This function should not call functions that acquire
    //   synchronization objects or allocate memory, because this
    //   can cause problems.  <-- quoteth MSDN  -- probably for
    //   the same reason as we cannot use LOG(); we'll recurse
    //   into a stack overflow.
    //
    // o You cannot use LOG() in here because that will trigger an
    //   exception which will cause infinite recursion with this
    //   function.  We work around this by ignoring all non-error
    //   exception codes, which serves as the base of the recursion.
    //   That way, we can LOG() from the rest of the function
    //
    // The same goes for any function called by this
    // function.
    //
    // WARNING WARNING WARNING WARNING WARNING WARNING WARNING
    //

    // If exceptions (or runtime) have been disabled, then simply return.
    if (g_fForbidEnterEE || g_fNoExceptions)
    {
        return EXCEPTION_CONTINUE_SEARCH;
    }

    // WARNING
    //
    // We must preserve this so that GCStress=4 eh processing doesnt kill last error.
    // Note that even GetThread below can affect the LastError.
    // Keep this in mind when adding code above this line!
    //
    // WARNING
    DWORD dwLastError = GetLastError();

#if defined(_TARGET_X86_)
    // Capture the FPU state before we do anything involving floating point instructions
    FPUStateHolder captureFPUState;
#endif // defined(_TARGET_X86_)

#ifdef FEATURE_INTEROP_DEBUGGING
    // For interop debugging we have a fancy exception queueing stunt. When the debugger
    // initially gets the first chance exception notification it may not know whether to
    // continue it handled or unhandled, but it must continue the process to allow the
    // in-proc helper thread to work. What it does is continue the exception unhandled which
    // will let the thread immediately execute to this point. Inside this worker the thread
    // will block until the debugger knows how to continue the exception. If it decides the
    // exception was handled then we immediately resume execution as if the exeption had never
    // even been allowed to run into this handler. If it is unhandled then we keep processing
    // this handler
    //
    // WARNING: This function could potentially throw an exception, however it should only
    // be able to do so when an interop debugger is attached
    if(g_pDebugInterface != NULL)
    {
        if(g_pDebugInterface->FirstChanceSuspendHijackWorker(pExceptionInfo->ContextRecord,
            pExceptionInfo->ExceptionRecord) == EXCEPTION_CONTINUE_EXECUTION)
        return EXCEPTION_CONTINUE_EXECUTION;
    }
#endif


    // Get the exception code
    DWORD dwCode = pExceptionInfo->ExceptionRecord->ExceptionCode;
    if (dwCode == DBG_PRINTEXCEPTION_C || dwCode == EXCEPTION_VISUALCPP_DEBUGGER)
    {
        return EXCEPTION_CONTINUE_SEARCH;
    }

#if defined(_TARGET_X86_)
    if (dwCode == EXCEPTION_BREAKPOINT || dwCode == EXCEPTION_SINGLE_STEP)
    {
        // For interop debugging, debugger bashes our managed exception handler.
        // Interop debugging does not work with real vectored exception handler :(
        return EXCEPTION_CONTINUE_SEARCH;
    }
#endif

    bool bIsGCMarker = false;

#ifdef USE_REDIRECT_FOR_GCSTRESS
    // This is AMD64 & ARM specific as the macro above is defined for AMD64 & ARM only
    bIsGCMarker = IsGcMarker(dwCode, pExceptionInfo->ContextRecord);
#elif defined(_TARGET_X86_) && defined(HAVE_GCCOVER)
    // This is the equivalent of the check done in COMPlusFrameHandler, incase the exception is
    // seen by VEH first on x86.
    bIsGCMarker = IsGcMarker(dwCode, pExceptionInfo->ContextRecord);
#endif // USE_REDIRECT_FOR_GCSTRESS

    // Do not update the TLS with exception details for exceptions pertaining to GCStress
    // as they are continueable in nature.
    if (!bIsGCMarker)
    {
        SaveCurrentExceptionInfo(pExceptionInfo->ExceptionRecord, pExceptionInfo->ContextRecord);
    }


    LONG result = EXCEPTION_CONTINUE_SEARCH;

    // If we cannot obtain a Thread object, then we have no business processing any
    // exceptions on this thread.  Indeed, even checking to see if the faulting
    // address is in JITted code is problematic if we have no Thread object, since
    // this thread will bypass all our locks.
    Thread *pThread = GetThread();

    // Also check if the exception was in the EE or not
    BOOL fExceptionInEE = FALSE;
    if (!pThread)
    {
        // Check if the exception was in EE only if Thread object isnt available.
        // This will save us from unnecessary checks
        fExceptionInEE = IsIPInEE(pExceptionInfo->ExceptionRecord->ExceptionAddress);
    }

    // We are going to process the exception only if one of the following conditions is true:
    //
    // 1) We have a valid Thread object (implies exception on managed thread)
    // 2) Not a valid Thread object but the IP is in the execution engine (implies native thread within EE faulted)
    // Continue processing if there is a Thread object (managed thread) or the faulting IP is inside the execution engine (native thread within the EE)
    if (pThread || fExceptionInEE)
    {
        if (!bIsGCMarker)
            result = CLRVectoredExceptionHandler(pExceptionInfo);
        else
            result = EXCEPTION_CONTINUE_EXECUTION;

        if (EXCEPTION_EXECUTE_HANDLER == result)
        {
            result = EXCEPTION_CONTINUE_SEARCH;
        }

#ifdef _DEBUG
#ifndef FEATURE_PAL
#ifndef WIN64EXCEPTIONS
        {
            CantAllocHolder caHolder;

            PEXCEPTION_REGISTRATION_RECORD pRecord = GetCurrentSEHRecord();
            while (pRecord != EXCEPTION_CHAIN_END)
            {
                STRESS_LOG2(LF_EH, LL_INFO10000, "CLRVectoredExceptionHandlerShim: FS:0 %p:%p\n",
                            pRecord, pRecord->Handler);
                pRecord = pRecord->Next;
            }
        }
#endif // WIN64EXCEPTIONS

        {
            // The call to "CLRVectoredExceptionHandler" above can return EXCEPTION_CONTINUE_SEARCH
            // for different scenarios like StackOverFlow/SOFT_SO, or if it is forbidden to enter the EE.
            // Thus, if we dont have a Thread object for the thread that has faulted and we came this far
            // because the fault was in MSCORWKS, then we work with the frame chain below only if we have
            // valid Thread object.

            if (pThread)
            {
                CantAllocHolder caHolder;

                TADDR* sp;
                sp = (TADDR*)&sp;
                DWORD count = 0;
                void* stopPoint = pThread->GetCachedStackBase();
                // If Frame chain is corrupted, we may get AV while accessing frames, and this function will be
                // called recursively.  We use Frame chain to limit our search range.  It is not disaster if we
                // can not use it.
                if (!(dwCode == STATUS_ACCESS_VIOLATION &&
                      IsIPInEE(pExceptionInfo->ExceptionRecord->ExceptionAddress)))
                {
                    // Find the stop point (most jitted function)
                    Frame* pFrame = pThread->GetFrame();
                    for(;;)
                    {
                        // skip GC frames
                        if (pFrame == 0 || pFrame == (Frame*) -1)
                            break;

                        Frame::ETransitionType type = pFrame->GetTransitionType();
                        if (type == Frame::TT_M2U || type == Frame::TT_InternalCall)
                        {
                            stopPoint = pFrame;
                            break;
                        }
                        pFrame = pFrame->Next();
                    }
                }
                STRESS_LOG0(LF_EH, LL_INFO100, "CLRVectoredExceptionHandlerShim: stack");
                while (count < 20 && sp < stopPoint)
                {
                    if (IsIPInEE((BYTE*)*sp))
                    {
                        STRESS_LOG1(LF_EH, LL_INFO100, "%pK\n", *sp);
                        count ++;
                    }
                    sp += 1;
                }
            }
        }
#endif // !FEATURE_PAL
#endif // _DEBUG

#ifndef WIN64EXCEPTIONS
        {
            CantAllocHolder caHolder;
            STRESS_LOG1(LF_EH, LL_INFO1000, "CLRVectoredExceptionHandlerShim: returning %d\n", result);
        }
#endif // WIN64EXCEPTIONS

    }

    SetLastError(dwLastError);

    return result;
}

Likewise, following the code further gets very long, so I will only list the traced flow:

CLRVectoredExceptionHandlerShim (vm\excep.cpp:8171)
    calls CLRVectoredExceptionHandler (vm\excep.cpp:7464)
        calls CLRVectoredExceptionHandlerPhase2 (vm\excep.cpp:7622)
            calls CLRVectoredExceptionHandlerPhase3 (vm\excep:7803)
                calls HandleManagedFault (vm\excep:7311)
                    calls SetNakedThrowHelperArgRegistersInContext (vm\excep:7297)
                        stores the address of the faulting instruction in rcx (the first argument)
                        sets the IP to NakedThrowHelper (vm\amd64\RedirectedHandledJITCase.asm)
NakedThrowHelper (vm\amd64\RedirectedHandledJITCase.asm)
    calls LinkFrameAndThrow (vm\excep.cpp:7278)
        calls RaiseException (https://msdn.microsoft.com/en-us/library/windows/desktop/ms680552(v=vs.85).aspx)
NakedThrowHelper is wrapped
    GenerateRedirectedStubWithFrame NakedThrowHelper, FixContextHandler, NakedThrowHelper2 (vm\amd64\RedirectedHandledJITCase.asm)
    FixContextHandler is used to handle the exception raised by RaiseException
    the macro GenerateRedirectedStubWithFrame uses the macro NESTED_ENTRY, see vm\amd64\AsmMacros.inc
FixContextHandler (vm\exceptionhandler.cpp:5631)
    calls FixupDispatcherContext (vm\exceptionhandler.cpp:5525)
        calls another overload of FixupDispatcherContext (vm\exceptionhandler.cpp:5399)
            sets pDispatcherContext->LanguageHandler = (PEXCEPTION_ROUTINE)GetEEFuncEntryPoint(ProcessCLRException);
ProcessCLRException (vm\exceptionhandling.cpp:751)
    calls ExceptionTracker::GetOrCreateTracker (vm\exceptionhandling.cpp:3613)
        calls ExceptionTracker::CreateThrowable (vm\exceptionhandling.cpp:4004)
            calls CreateCOMPlusExceptionObject (vm\excep.cpp:6978)
                calls MapWin32FaultToCOMPlusException (vm\excep.cpp:6996)
                    STATUS_ACCESS_VIOLATION is converted to NullReferenceException here
                    the resulting NullReferenceException is a preallocated global object
    calls ExceptionTracker::ProcessOSExceptionNotification (vm\exceptionhandling.cpp:1589)
        this function calls the code in finally blocks
        calls ExceptionTracker::ProcessManagedCallFrame (vm\exceptionhandling.cpp:2321)
            calls ExceptionTracker::CallHandler (vm\exceptionhandling.cpp:3273)
    calls ClrUnwindEx (vm\exceptionhandling.cpp:5229)
        calls RtlUnwindEx (an API provided by Windows)

        execution jumps to the matching exception-handling (catch) code here
        after the jump, processing continues there and never returns

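Summary:

On Windows
- Accessing a method or member of a null object raises a hardware exception
- CoreCLR catches the exception through CLRVectoredExceptionHandlerShim
- It calls the native RaiseException to throw the exception to ProcessCLRException for handling
- While handling the exception, the exception code (0xC0000005L) is converted into a CLR NullReferenceException object
- The stack is unwound and the code in finally blocks is called
- Execution jumps to the matching exception-handling (catch) code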
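The redirection step in the trace (store the faulting instruction address in rcx, then point the instruction pointer at NakedThrowHelper) amounts to editing the saved register context before execution resumes. The C sketch below only simulates this on a plain struct. mini_context and redirect_to_throw_helper are made-up names, and returning "continue execution" after the redirect is my reading of the trace, not a quote from the CoreCLR source.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced register context
   (the real Windows CONTEXT has many more fields). */
struct mini_context {
    uint64_t rip; /* instruction pointer */
    uint64_t rcx; /* first integer argument in the Windows x64 calling convention */
};

/* Same numeric values as the Windows SDK constants. */
enum {
    CONTINUE_EXECUTION = -1, /* EXCEPTION_CONTINUE_EXECUTION */
    CONTINUE_SEARCH = 0      /* EXCEPTION_CONTINUE_SEARCH */
};

/* If the fault happened in managed code, rewrite the context so that the
   thread resumes in throw_helper with the faulting address as its first
   argument. Otherwise leave the context alone and let other handlers run. */
int redirect_to_throw_helper(struct mini_context *ctx,
                             uint64_t throw_helper,
                             int fault_in_managed_code) {
    if (!fault_in_managed_code) {
        return CONTINUE_SEARCH;
    }
    ctx->rcx = ctx->rip;     /* remember where the fault happened */
    ctx->rip = throw_helper; /* resume in the helper instead of retrying the fault */
    return CONTINUE_EXECUTION;
}
```

The design point is that the vectored handler itself does not throw. It only arranges for the faulting thread to continue in a helper that raises the exception from a normal execution context.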

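Null checks in a special case

Did you notice that the access violation in the second sample above happened when address 0x8 was accessed?
What if the member sits much further in, say at 0x10000, and there is actually content mapped at 0x10000? Can the null access still be detected?
Here I simulate this special case to see whether CoreCLR handles it correctly.

The test code: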

using System;
using System.Diagnostics;
using System.Reflection;
using System.Runtime.InteropServices;

namespace ConsoleApp1
{
    class Program
    {
        struct st_32 { long a; long b; long c; long d; }
        struct st_128 { st_32 a; st_32 b; st_32 c; st_32 d; }
        struct st_512 { st_128 a; st_128 b; st_128 c; st_128 d; }
        struct st_2048 { st_512 a; st_512 b; st_512 c; st_512 d; }
        struct st_10240 { st_2048 a; st_2048 b; st_2048 c; st_2048 d; st_2048 e; }
        struct st_51200 { st_10240 a; st_10240 b; st_10240 c; st_10240 d; st_10240 e; }
        struct st_65536 { st_51200 a; st_10240 b; st_2048 c; st_2048 d; }
        struct padding { st_10240 a; st_10240 b; st_10240 c; public int x; }
        class A
        {
            public padding a;
        }

        const uint PAGE_EXECUTE_READWRITE = 0x40;
        const uint MEM_COMMIT = 0x1000;
        const uint MEM_RESERVE = 0x2000;
        [DllImport("kernel32.dll", SetLastError = true)]
        static extern IntPtr VirtualAlloc(IntPtr lpAddress, uint dwSize, uint flAllocationType, uint flProtect);

        static void Main(string[] args)
        {
            var body = new byte[4] { 1, 0, 0, 0 };
            IntPtr buf = VirtualAlloc((IntPtr)0x10000, 1024, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
            buf = (IntPtr)0x10008; // 0x10000 + 8, where 8 is the size of the method table pointer
            Marshal.Copy(body, 0, buf, body.Length);

            var a = new A();
            a = null;
            Console.WriteLine(a.a.x);
        }
    }
}

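The assembly at run time:

Note the part in the red box in the figure: CoreCLR added an extra cmp and successfully avoided the trap set with VirtualAlloc.

You may also ask whether allocating memory at 0x8 with VirtualAlloc could fool CoreCLR.
In fact VirtualAlloc cannot allocate memory at 0x8, because the range of virtual addresses that can be allocated is limited.
If a member's offset is greater than the lowest virtual address that can be allocated, CoreCLR inserts an extra check, so CoreCLR cannot be fooled this way.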
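The rule described above can be written as a small decision function. This is a sketch of the idea only: needs_explicit_null_check is a made-up name, the threshold is passed in as a parameter instead of being taken from CoreCLR, and the actual constant the JIT uses is not shown in this post.

```c
#include <stdbool.h>
#include <stddef.h>

/* Decide whether a field access needs an explicit null check.
   If null + member_offset could land on an address that can be mapped,
   the hardware fault is no longer guaranteed, so an explicit check is needed.
   lowest_mappable_address is an input here, not a value taken from CoreCLR. */
bool needs_explicit_null_check(size_t member_offset, size_t lowest_mappable_address) {
    return member_offset >= lowest_mappable_address;
}
```

With a threshold of 0x10000 (the address used with VirtualAlloc in the test above), an offset of 8 can rely on the implicit fault, while an offset of 0x10008 needs the extra cmp. The comparison uses >= because an offset equal to the threshold already points at an address that could be mapped.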

Performance test

Let's also measure how much the performance differs between a NullReferenceException thrown automatically and one thrown manually

The test code:

public static string GetString()
{
    return null;
}

public static void BenchmarkNullReferenceException()
{
    for (int x = 0; x < 100000; ++x)
    {
        try
        {
            string str = GetString();
            int length = str.Length;
        }
        catch (Exception ex)
        {
        }
    }
}

public static void BenchmarkManualNullReferenceException() {
    for (int x = 0; x < 100000; ++x)
    {
        try
        {
            string str = GetString();
            if (str == null)
            {
                throw new NullReferenceException();
            }
            int length = str.Length;
        }
        catch (Exception ex)
        {

        }
    }
}

Test results:

BenchmarkNullReferenceException: 0.9024312s
BenchmarkManualNullReferenceException: 0.9746265s

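The results are rather surprising:
BenchmarkNullReferenceException and BenchmarkManualNullReferenceException both take about 1 second under the Debug and Release configurations.
This shows that the cost of handling a hardware exception is not large compared with the cost of handling a CLR exception, and in this test it is even slightly lower than the cost of throwing manually.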

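Why implement the null check this way

The most common and easiest-to-understand null check would probably be to generate low-level code like test rcx, rcx; jne 1f; call ThrowNullReferenceException; 1:,
but CoreCLR does not take this approach. My personal guess is that the reasons are:

- It reduces the size of the generated code: a cmp instruction used as a check takes only 2 bytes
- It speeds up the check: when accessing a member, a single mov register, [object register + member offset] both loads the value and checks for null, with no extra check instruction needed
- It can catch faults in unmanaged code: a memory access error in called code written in C can be caught as well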
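For contrast, this is roughly what the two strategies look like when written by hand in C. It is only an illustration of the trade-off: my_class, get_member_checked, and get_member_unchecked are made-up names, and the error code stands in for throwing an exception.

```c
#include <stddef.h>

/* Hypothetical object layout: a pointer-sized header followed by one field. */
struct my_class {
    void *header;
    int my_member;
};

/* Explicit check: a compare and a branch on every access,
   like the test/jne sequence. Returns 0 on success, -1 for a null object. */
int get_member_checked(const struct my_class *obj, int *out) {
    if (obj == NULL) {
        return -1; /* stands in for throwing NullReferenceException */
    }
    *out = obj->my_member;
    return 0;
}

/* Implicit check: no compare and no branch. A null obj makes the load itself
   fault, and recovering from that is left to a signal or exception handler. */
int get_member_unchecked(const struct my_class *obj) {
    return obj->my_member;
}
```

The unchecked version does less work on the common non-null path, at the price of needing the fault-handling machinery described earlier for the rare null case.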

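Reference links

This post drew on the following links, and I also asked a related question in the CoreCLR repository on GitHub

This post is relatively easy to follow. The JIT post I promised earlier will be delayed further, so please be patient.

CoreCLR Source Code Exploration (6): How NullReferenceException Happens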