遙遙領先.NET 7, .NET 8 性能大幅提升

来源:https://www.cnblogs.com/yyww/archive/2023/09/15/17703243.html
-Advertisement-
Play Games

以下內容均來自Gitee的開源倉庫,具體的使用請移步Gitee:https://gitee.com/pojianbing/lazy-captcha 以下是我自己使用的具體方式 首先安裝NuGet包: Microsoft.Extensions.Caching.StackExchangeRedis La ...


每個版本必有的性能提升彙總文章又來了。大家可以學習閱讀了。

 

微軟 .NET 開發團隊的工程師 Stephen Toub 發表博Performance Improvements in .NET 8詳細介紹了 .NET 8 中的性能改進。

 

一言蔽之:

.NET 7 was super fast. .NET 8 is faster.

.NET 8 比 .NET 7 的超級快更快!

這篇博客全方位介紹了 .NET 8 的性能表現,包括 JIT、原生 AOT、VM、GC、Mono、線程、文件 I/O、網路、JSON 處理、日誌等。

 

Benchmarking Setup

基準設置

在本文中,將使用微基準來突出討論的改進方面。這些微基準大多使用BenchmarkDotNet v0.13.8實現,除非另有說明,否則每個基準都有一個簡單的設置。

要跟隨進行,首先確保您已安裝.NET 7和.NET 8。在本文中,我使用的是.NET 8 Release Candidate(8.0.0-rc.1.23419.4)。

完成這些先決條件後,在一個新的基準目錄中創建一個新的C#項目:

dotnet new console -o benchmarks cd benchmarks

該目錄將包含兩個文件:benchmarks.csproj(包含有關應該如何構建應用程式的信息的項目文件)和Program.cs(應用程式的代碼)。將benchmarks.csproj的全部內容替換為以下內容:

Exe net8.0;net7.0 Preview enable true true

上述項目文件告訴構建系統我們想要:

構建一個可運行的應用程式(而不是庫), 能夠在.NET 8和.NET 7上運行(以便BenchmarkDotNet可以運行多個進程,一個使用.NET 7,一個使用.NET 8,以便能夠比較結果), 儘管C# 12尚未正式發佈,但能夠使用C#語言的所有最新功能, 自動導入常用命名空間, 在代碼中能夠使用unsafe關鍵字, 並將垃圾回收器(GC)配置為“伺服器”配置,這影響它在記憶體消耗和吞吐量之間做出的權衡(這不是嚴格必需的,我只是習慣使用它,並且對於ASP.NET應用程式來說,這是預設配置)。

最後的從NuGet中引入BenchmarkDotNet,以便我們能夠在Program.cs中使用該庫。(一些基準需要添加其他包;我已經在適用的位置做了說明。)

然後,我將每個基準的完整Program.cs源代碼包含在了裡面;只需將該代碼複製並粘貼到Program.cs中,替換其全部內容。在每個測試中,您會註意到幾個屬性可以應用於Tests類。[記憶體診斷器]屬性表示我想跟蹤托管分配,[反彙編診斷器]屬性表示我想報告實際為測試生成的彙編代碼(預設情況下還有一個層級的函數調用),[隱藏列]屬性僅僅抑制了BenchmarkDotNet可能預設輸出但對我們在這裡的目的無關緊要的一些數據列。

然後,運行基準非常簡單。每個顯示的測試還包括一個以dotnet命令開頭的註釋,用於運行基準測試。通常是這樣的:

dotnet run -c Release -f net7.0 --filter "*" --runtimes net7.0 net8.0

上述dotnet run命令:

以發佈版本構建基準。這對性能測試很重要,因為大多數優化在調試構建中都被禁用了,包括C#編譯器和JIT編譯器。 針對主機項目選擇的是.NET 7。通常情況下,對於BenchmarkDotNet,您需要針對您將執行的所有運行時的最低公共標準進行目標設定,以確保所有被使用的API在需要的地方都可用。 運行整個程式中的所有基準。--filter參數可以縮小範圍,僅對所需基準的子集進行範圍限制,但“*”表示“運行所有基準”。 在.NET 7和.NET 8上運行測試。

整篇文章中,我展示了許多基準和我運行它們時得到的結果。所有的代碼在所有支持的操作系統和架構上都運行良好。除非另有說明,否則所示的基準結果均來自在Linux(Ubuntu 22.04)上(一個x64處理器)運行時的結果(唯一的例外是當我使用[反彙編診斷器]顯示彙編代碼時,我在Windows 11上運行了它們,因為在Unix上使用[反彙編診斷器]運行.NET 7並不總是產生所請求的彙編)。我的標準警告:這些是微基準,通常測量非常短的操作時間,並且當這些時間的改進通過一遍又一遍的執行而累積起來時,其影響是顯著的。不同的硬體、不同的操作系統、您的電腦上運行的其他內容、您當前的心情以及您早餐吃了什麼都可能影響涉及的數字。簡而言之,不要指望您看到的數字與我在這裡報告的數字完全匹配,儘管我選擇的示例中,所引用的差異的數量級可完全重現。

解釋完了,我們開始吧...”

 

 

全文請看:   Performance Improvements in .NET 8 - .NET Blog (microsoft.com)

 

JIT

Code generation permeates every single line of code we write, and it’s critical to the end-to-end performance of applications that the compiler doing that code generation achieves high code quality. In .NET, that’s the job of the Just-In-Time (JIT) compiler, which is used both “just in time” as an application executes as well as in Ahead-Of-Time (AOT) scenarios as the workhorse to perform the codegen at build-time. Every release of .NET has seen significant improvements in the JIT, and .NET 8 is no exception. In fact, I dare say the improvements in .NET 8 in the JIT are an incredible leap beyond what was achieved in the past, in large part due to dynamic PGO…

Tiering and Dynamic PGO

To understand dynamic PGO, we first need to understand “tiering.” For many years, a .NET method was only ever compiled once: on first invocation of the method, the JIT would kick in to generate code for that method, and then that invocation and every subsequent one would use that generated code. It was a simple time, but also one frought with conflict… in particular, a conflict between how much the JIT should invest in code quality for the method and how much benefit would be gained from that enhanced code quality. Optimization is one of the most expensive things a compiler does; a compiler can spend an untold amount of time searching for additional ways to shave off an instruction here or improve the instruction sequence there. But none of us has an infinite amount of time to wait for the compiler to finish, especially in a “just in time” scenario where the compilation is happening as the application is running. As such, in a world where a method is compiled once for that process, the JIT has to either pessimize code quality or pessimize how long it takes to run, which means a tradeoff between steady-state throughput and startup time.

As it turns out, however, the vast majority of methods invoked in an application are only ever invoked once or a small number of times. Spending a lot of time optimizing such methods would actually be a deoptimization, as likely it would take much more time to optimize them than those optimizations would gain. So, .NET Core 3.0 introduced a new feature of the JIT known as “tiered compilation.” With tiering, a method could end up being compiled multiple times. On first invocation, the method would be compiled in “tier 0,” in which the JIT prioritizes speed of compilation over code quality; in fact, the mode the JIT uses is often referred to as “min opts,” or minimal optimization, because it does as little optimization as it can muster (it still maintains a few optimizations, primarily the ones that result in less code to be compiled such that the JIT actually runs faster). In addition to minimizing optimizations, however, it also employs call counting “stubs”; when you invoke the method, the call goes through a little piece of code (the stub) that counts how many times the method was invoked, and once that count crosses a predetermined threshold (e.g. 30 calls), the method gets queued for re-compilation, this time at “tier 1,” in which the JIT throws every optimization it’s capable of at the method. Only a small subset of methods make it to tier 1, and those that do are the ones worthy of additional investment in code quality. Interestingly, there are things the JIT can learn about the method from tier 0 that can lead to even better tier 1 code quality than if the method had been compiled to tier 1 directly. For example, the JIT knows that a method “tiering up” from tier 0 to tier 1 has already been executed, and if it’s already been executed, then any static readonly fields it accesses are now already initialized, which means the JIT can look at the values of those fields and base the tier 1 code gen on what’s actually in the field (e.g. if it’s a static readonly bool, the JIT can now treat the value of that field as if it were const bool). If the method were instead compiled directly to tier 1, the JIT might not be able to make the same optimizations. Thus, with tiering, we can “have our cake and eat it, too.” We get both good startup and good throughput. Mostly…

One wrinkle to this scheme, however, is the presence of longer-running methods. Methods might be important because they’re invoked many times, but they might also be important because they’re invoked only a few times but end up running forever, in particular due to looping. As such, tiering was disabled by default for methods containing backward branches, such that those methods would go straight to tier 1. To address that, .NET 7 introduced On-Stack Replacement (OSR). With OSR, the code generated for loops also included a counting mechanism, and after a loop iterated to a certain threshold, the JIT would compile a new optimized version of the method and jump from the minimally-optimized code to continue execution in the optimized variant. Pretty slick, and with that, in .NET 7 tiering was also enabled for methods with loops.

But why is OSR important? If there are only a few such long-running methods, what’s the big deal if they just go straight to tier 1? Surely startup isn’t significantly negatively impacted? First, it can be: if you’re trying to trim milliseconds off startup time, every method counts. But second, as noted before, there are throughput benefits to going through tier 0, in that there are things the JIT can learn about a method from tier 0 which can then improve its tier 1 compilation. And the list of things the JIT can learn gets a whole lot bigger with dynamic PGO.

Profile-Guided Optimization (PGO) has been around for decades, for many languages and environments, including in .NET world. The typical flow is you build your application with some additional instrumentation, you then run your application on key scenarios, you gather up the results of that instrumentation, and then you rebuild your application, feeding that instrumentation data into the optimizer, allowing it to use the knowledge about how the code executed to impact how it’s optimized. This approach is often referred to as “static PGO.” “Dynamic PGO” is similar, except there’s no effort required around how the application is built, scenarios it’s run on, or any of that. With tiering, the JIT is already generating a tier 0 version of the code and then a tier 1 version of the code… why not sprinkle some instrumentation into the tier 0 code as well? Then the JIT can use the results of that instrumentation to better optimize tier 1. It’s the same basic “build, run and collect, re-build” flow as with static PGO, but now on a per-method basis, entirely within the execution of the application, and handled automatically for you by the JIT, with zero additional dev effort required and zero additional investment needed in build automation or infrastructure.

Dynamic PGO first previewed in .NET 6, off by default. It was improved in .NET 7, but remained off by default. Now, in .NET 8, I’m thrilled to say it’s not only been significantly improved, it’s now on by default. This one-character PR to enable it might be the most valuable PR in all of .NET 8: dotnet/runtime#86225.

There have been a multitude of PRs to make all of this work better in .NET 8, both on tiering in general and then on dynamic PGO in particular. One of the more interesting changes is dotnet/runtime#70941, which added more tiers, though we still refer to the unoptimized as “tier 0” and the optimized as “tier 1.” This was done primarily for two reasons. First, instrumentation isn’t free; if the goal of tier 0 is to make compilation as cheap as possible, then we want to avoid adding yet more code to be compiled. So, the PR adds a new tier to address that. Most code first gets compiled to an unoptimized and uninstrumented tier (though methods with loops currently skip this tier). Then after a certain number of invocations, it gets recompiled unoptimized but instrumented. And then after a certain number of invocations, it gets compiled as optimized using the resulting instrumentation data. Second, crossgen/ReadyToRun (R2R) images were previously unable to participate in dynamic PGO. This was a big problem for taking full advantage of all that dynamic PGO offers, in particular because there’s a significant amount of code that every .NET application uses that’s already R2R’d: the core libraries. ReadyToRun is an AOT technology that enables most of the code generation work to be done at build-time, with just some minimal fix-ups applied when that precompiled code is prepared for execution. That code is optimized and not instrumented, or else the instrumentation would slow it down. So, this PR also adds a new tier for R2R. After an R2R method has been invoked some number of times, it’s recompiled, again with optimizations but this time also with instrumentation, and then when that’s been invoked sufficiently, it’s promoted again, this time to an optimized implementation utilizing the instrumentation data gathered in the previous tier. Code flow between JIT tiers

There have also been multiple changes focused on doing more optimization in tier 0. As noted previously, the JIT wants to be able to compile tier 0 as quickly as possible, however some optimizations in code quality actually help it to do that. For example, dotnet/runtime#82412 teaches it to do some amount of constant folding (evaluating constant expressions at compile time rather than at execution time), as that can enable it to generate much less code. Much of the time the JIT spends compiling in tier 0 is for interactions with the Virtual Machine (VM) layer of the .NET runtime, such as resolving types, and so if it can significantly trim away branches that won’t ever be used, it can actually speed up tier 0 compilation while also getting better code quality. We can see this with a simple repro app like the following:

// dotnet run -c Release -f net8.0

MaybePrint(42.0);

static void MaybePrint<T>(T value)
{
    if (value is int)
        Console.WriteLine(value);
}

I can set the DOTNET_JitDisasm environment variable to *MaybePrint*; that will result in the JIT printing out to the console the code it emits for this method. On .NET 7, when I run this (dotnet run -Release -f net7.0), I get the following tier 0 code:

; Assembly listing for method Program:<<Main>$>g__MaybePrint|0_0[double](double)
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; Tier-0 compilation
; MinOpts code
; rbp based frame
; partially interruptible

G_M000_IG01:                ;; offset=0000H
       55                   push     rbp
       4883EC30             sub      rsp, 48
       C5F877               vzeroupper
       488D6C2430           lea      rbp, [rsp+30H]
       33C0                 xor      eax, eax
       488945F8             mov      qword ptr [rbp-08H], rax
       C5FB114510           vmovsd   qword ptr [rbp+10H], xmm0

G_M000_IG02:                ;; offset=0018H
       33C9                 xor      ecx, ecx
       85C9                 test     ecx, ecx
       742D                 je       SHORT G_M000_IG03
       48B9B877CB99F97F0000 mov      rcx, 0x7FF999CB77B8
       E813C9AE5F           call     CORINFO_HELP_NEWSFAST
       488945F8             mov      gword ptr [rbp-08H], rax
       488B4DF8             mov      rcx, gword ptr [rbp-08H]
       C5FB104510           vmovsd   xmm0, qword ptr [rbp+10H]
       C5FB114108           vmovsd   qword ptr [rcx+08H], xmm0
       488B4DF8             mov      rcx, gword ptr [rbp-08H]
       FF15BFF72000         call     [System.Console:WriteLine(System.Object)]

G_M000_IG03:                ;; offset=0049H
       90                   nop

G_M000_IG04:                ;; offset=004AH
       4883C430             add      rsp, 48
       5D                   pop      rbp
       C3                   ret

; Total bytes of code 80

The important thing to note here is that all of the code associated with the Console.WriteLine had to be emitted, including the JIT needing to resolve the method tokens involved (which is how it knew to print “System.Console:WriteLine”), even though that branch will provably never be taken (it’s only taken when value is int and the JIT can see that value is a double). Now in .NET 8, it applies the previously-reserved-for-tier-1 constant folding optimizations that recognize the value is not an int and generates tier 0 code accordingly (dotnet run -Release -f net8.0):

; Assembly listing for method Program:<<Main>$>g__MaybePrint|0_0[double](double) (Tier0)
; Emitting BLENDED_CODE for X64 with AVX - Windows
; Tier0 code
; rbp based frame
; partially interruptible

G_M000_IG01:                ;; offset=0x0000
       push     rbp
       mov      rbp, rsp
       vmovsd   qword ptr [rbp+0x10], xmm0

G_M000_IG02:                ;; offset=0x0009

G_M000_IG03:                ;; offset=0x0009
       pop      rbp
       ret

; Total bytes of code 11

dotnet/runtime#77357 and dotnet/runtime#83002 also enable some JIT intrinsics to be employed in tier 0 (a JIT intrinsic is a method the JIT has some special knowledge of, either knowing about its behavior so it can optimize around it accordingly, or in many cases actually supplying its own implementation to replace the one in the method’s body). This is in part for the same reason; many intrinsics can result in better dead code elimination (e.g. if (typeof(T).IsValueType) { ... }). But more so, without recognizing intrinsics as being special, we might end up generating code for an intrinsic method that we would never otherwise need to generate code for, even in tier 1. dotnet/runtime#88989 also eliminates some forms of boxing in tier 0.

Collecting all of this instrumentation in tier 0 instrumented code brings with it some of its own challenges. The JIT is augmenting a bunch of methods to track a lot of additional data; where and how does it track it? And how does it do so safely and correctly when multiple threads are potentially accessing all of this at the same time? For example, one of the things the JIT tracks in an instrumented method is which branches are followed and how frequently; that requires it to count each time code traverses that branch. You can imagine that happens, well, a lot. How can it do the counting in a thread-safe yet efficient way?

The answer previously was, it didn’t. It used racy, non-synchronized updates to a shared value, e.g. _branches[branchNum]++. This means that some updates might get lost in the presence of multithreaded access, but as the answer here only needs to be approximate, that was deemed ok. As it turns out, however, in some cases it was resulting in a lot of lost counts, which in turn caused the JIT to optimize for the wrong things. Another approach tried for comparison purposes in dotnet/runtime#82775 was to use interlocked operations (e.g. if this were C#, Interlocked.Increment); that results in perfect accuracy, but that explicit synchronization represents a huge potential bottleneck when heavily contended. dotnet/runtime#84427 provides the approach that’s now enabled by default in .NET 8. It’s an implementation of a scalable approximate counter that employs some amount of pseudo-randomness to decide how often to synchronize and by how much to increment the shared count. There’s a great description of all of this in the dotnet/runtime repo; here is a C# implementation of the counting logic based on that discussion:

static void Count(ref uint sharedCounter)
{
    uint currentCount = sharedCounter, delta = 1;
    if (currentCount > 0)
    {
        int logCount = 31 - (int)uint.LeadingZeroCount(currentCount);
        if (logCount >= 13)
        {
            delta = 1u << (logCount - 12);
            uint random = (uint)Random.Shared.NextInt64(0, uint.MaxValue + 1L);
            if ((random & (delta - 1)) != 0)
            {
                return;
            }
        }
    }

    Interlocked.Add(ref sharedCounter, delta);
}

For current count values less than 8192, it ends up just doing the equivalent of an Interlocked.Add(ref counter, 1). However, as the count increases to beyond that threshold, it starts only doing the add randomly half the time, and when it does, it adds 2. Then randomly a quarter of the time it adds 4. Then an eighth of the time it adds 8. And so on. In this way, as more and more increments are performed, it requires writing to the shared counter less and less frequently.

We can test this out with a little app like the following (if you want to try running it, just copy the above Count into the program as well):

// dotnet run -c Release -f net8.0

using System.Diagnostics;

uint counter = 0;
const int ItersPerThread = 100_000_000;

while (true)
{
    Run(
              
您的分享是我們最大的動力!

-Advertisement-
Play Games
更多相關文章
  • Notion相信大家都不陌生了,一款非常好用的筆記軟體,TJ君也一直在用來記筆記和寫文章。關於Notion的替代品,之前有給大家推薦AFFiNE ,但這個還是一個比較成型的軟體。 那麼如果想開發一個類Notion的工具,又或者在自己的應用中增加一個類Notion的內容編輯功能,是否有好用的開源工具呢 ...
  • 1.什麼是迴圈依賴? 迴圈依賴是指一個或多個對象之間存在直接或間接的依賴關係,這種依賴關係構成一個環形調用 , 舉個例子 : A 依賴B , B依賴C , C依賴A , 這樣就形成了迴圈依賴; 2.spring對迴圈依賴的處理有三種情況: ①構造器的迴圈依賴:這種依賴spring是處理不了的,直接拋 ...
  • JavaSe 變數和運算符: 基本數據類型介紹 java中浮點數精度怎麼解決,有瞭解過實現嗎,為什麼有精度問題 BigDecimal,如何判斷BigDecimal是否相等。如何進行計算、怎麼四捨五入 基本類型幾種,分別占用空間 int和Integer區別--包裝類,int有幾個位元組。 包裝類常量池 ...
  • 今天在看開源項目的時候發現了這樣一句代碼 import static com.abin.mallchat.common.common.service.frequencycontrol.FrequencyControlStrategyFactory.TOTAL_COUNT_WITH_IN_FIX_TI ...
  • 背景 在分散式系統中,經常需要用到全局唯一ID發生器,標識需要存儲的數據。我們需要什麼樣的ID生成器? ID生成器除了是數據的唯一標識以外,一般需要在系統中承擔更多的責任,概括起來有以下幾點: 唯一性:“全局唯一” vs “業務唯一”? 分散式系統使用唯一的ID生成器,會有非常嚴重的申請互斥問題。互 ...
  • IAT(Import Address Table)Hook是一種針對Windows操作系統的API Hooking 技術,用於修改應用程式對動態鏈接庫(DLL)中導入函數的調用。IAT是一個數據結構,其中包含了應用程式在運行時使用的導入函數的地址。IAT Hook的原理是通過修改IAT中的函數指針,... ...
  • 假設你有一行 String condition = "A or B and C"; 語句,請問怎麼做才能變成一行真正的邏輯表達式(能在電腦中運行計算)? Resolution 聲明一個List<List<String>>結構; 先分割 or ; 變成 [ A, B and C ] 不包含and的, ...
  • 前言 插件式架構,一種全新的、開放性的、高擴展性的架構體系。插件式架構設計好處很多,把擴展功能從框架中剝離出來,降低了框架的複雜度,讓框架更容易實現。擴展功能與框架以一種很松的方式耦合,兩者在保持介面不變的情況下,可以獨立變化和發佈。基於插件設計並不神秘,相反它比起一團泥的設計更簡單,更容易理解。 ...
一周排行
    -Advertisement-
    Play Games
  • 前言 插件化的需求主要源於對軟體架構靈活性的追求,特別是在開發大型、複雜或需要不斷更新的軟體系統時,插件化可以提高軟體系統的可擴展性、可定製性、隔離性、安全性、可維護性、模塊化、易於升級和更新以及支持第三方開發等方面的能力,從而滿足不斷變化的業務需求和技術挑戰。 一、插件化探索 在WPF中我們想要開 ...
  • 歡迎ReaLTaiizor是一個用戶友好的、以設計為中心的.NET WinForms項目控制項庫,包含廣泛的組件。您可以使用不同的主題選項對項目進行個性化設置,並自定義用戶控制項,以使您的應用程式更加專業。 項目地址:https://github.com/Taiizor/ReaLTaiizor 步驟1: ...
  • EDP是一套集組織架構,許可權框架【功能許可權,操作許可權,數據訪問許可權,WebApi許可權】,自動化日誌,動態Interface,WebApi管理等基礎功能於一體的,基於.net的企業應用開發框架。通過友好的編碼方式實現數據行、列許可權的管控。 ...
  • Channel 是乾什麼的 The System.Threading.Channels namespace provides a set of synchronization data structures for passing data between producers and consume ...
  • efcore如何優雅的實現按年分庫按月分表 介紹 本文ShardinfCore版本 本期主角: ShardingCore 一款ef-core下高性能、輕量級針對分表分庫讀寫分離的解決方案,具有零依賴、零學習成本、零業務代碼入侵適配 距離上次發文.net相關的已經有很久了,期間一直在從事java相關的 ...
  • 前言 Spacesniffer 是一個免費的文件掃描工具,通過使用樹狀圖可視化佈局,可以立即瞭解大文件夾的位置,幫助用戶處理找到這些文件夾 當前系統C盤空間 清理後系統C盤空間 下載 Spacesniffer 下載地址:https://spacesniffer.en.softonic.com/dow ...
  • EDP是一套集組織架構,許可權框架【功能許可權,操作許可權,數據訪問許可權,WebApi許可權】,自動化日誌,動態Interface,WebApi管理等基礎功能於一體的,基於.net的企業應用開發框架。通過友好的編碼方式實現數據行、列許可權的管控。 ...
  • 一、ReZero簡介 ReZero是一款.NET中間件 : 全網唯一開源界面操作就能生成API , 可以集成到任何.NET6+ API項目,無破壞性,也可讓非.NET用戶使用exe文件 免費開源:MIT最寬鬆協議 , 一直從事開源事業十年,一直堅持開源 1.1 純ReZero開發 適合.Net Co ...
  • 一:背景 1. 講故事 停了一個月沒有更新文章了,主要是忙於寫 C#內功修煉系列的PPT,現在基本上接近尾聲,可以回頭繼續更新這段時間分析dump的一些事故報告,有朋友微信上找到我,說他們的系統出現了大量的http超時,程式不響應處理了,讓我幫忙看下怎麼回事,dump也抓到了。 二:WinDbg分析 ...
  • 開始做項目管理了(本人3年java,來到這邊之後真沒想到...),天天開會溝通整理需求,他們講話的時候忙裡偷閑整理一下常用的方法,其實語言還是有共通性的,基本上看到方法名就大概能猜出來用法。出去打水的時候看到外面太陽好好,真想在外面坐著曬太陽,回來的時候好兄弟三年前送給我的鍵盤D鍵不靈了,在打"等待 ...