在資料庫伺服器異常斷電重啟後,資料庫會進行實例恢復,那麼實例恢復的過程中Oracle做了什麼操作呢?參考官網在這裡做一下解釋,菜鳥水平有限,歡迎勘正。 首先說下實例恢復的定義: Instance recovery is the process of applying records in the o ...
在資料庫伺服器異常斷電重啟後,資料庫會進行實例恢復,那麼實例恢復的過程中Oracle做了什麼操作呢?參考官網在這裡做一下解釋,菜鳥水平有限,歡迎勘正。
首先說下實例恢復的定義:
Instance recovery is the process of applying records in the online redo log to data files to reconstruct changes made after the most recent checkpoint. Instance recovery occurs automatically when an administrator attempts to open a database that was previously shut down inconsistently.
Oracle Database performs instance recovery automatically in the following situations:
-
The database opens for the first time after the failure of a single-instance database or all instances of an Oracle RAC database. This form of instance recovery is also called crash recovery. Oracle Database recovers the online redo threads of the terminated instances together.
-
Some but not all instances of an Oracle RAC database fail. Instance recovery is performed automatically by a surviving instance in the configuration.
The SMON background process performs instance recovery, applying online redo automatically. No user intervention is required.
因此我們知道非一致性關閉會引發實例恢復(一致性關閉不會,參考shutdown immediate的官方定義)同時RAC節點宕機也會在一個存活節點進行實例恢復,其過程就是重構記憶體中的臟塊並提交,同時對未提交的做出回滾,這個過程由smon後臺進程負責。
實例恢復分兩階段: 1.前滾:Rolling Forward Oracle根據redo日誌中的記載: 1)對於提交的事務,根據日誌進行記憶體中的臟塊重現,然後進行commit,並按正常機制正常寫入磁碟。 2)對於未提交的事務,也根據redo進行臟塊重現(原因是檢查點隊列里記錄的未提交事務的臟塊也可能在已提交事務的臟塊之前),但只是重現,oracle在此階段完全不對這些臟塊做其他操作。 由於一些大事務的更改已經被寫入磁碟,以及前滾過程中生成的未提交事務的臟塊,oracle必須進行第二步的回滾。2.回滾:Rolling Back 對於所有未提交但是已經被寫入磁碟的數據,oracle根據undo進行回滾。
這裡上一幅官網的圖:
圖:Basic Instance Recovery Steps: Rolling Forward and Rolling Back
圖解: 我們看到實例恢復前redo日誌中記載的日誌對應著四種更改塊(redo只記載更改): 1)已提交且被寫入磁碟的更改塊,oracle對這種塊無需做任何操作。 2)已提交但未被寫入磁碟的更改塊,oracle會在前滾過程中在記憶體重現臟塊,然後按正常機制提交。 3)未提交且未被寫入磁碟的更改塊。 4)未提交但已被寫入磁碟的更改塊。 由於回滾是按事務為單位進行處理的,因此對於3、4兩種塊的處理全部是在回滾階段,oracle根據undo進行所有未提交事務的回滾操作,用前鏡像覆蓋掉磁碟中的數據,這樣就會處理掉第3、4種塊。 此外,從上不難看出oracle預設undo中記載的事務進度是和redo日誌中的完全一致的,不存在undo記載了事務被提交但是redo日誌記錄未提交的情況。 但是並不是所有情況都符合Oracle預設的預期,有時候資料庫頻繁掉電就可能出現無法成功進行實例恢復的情況,此時只能採取一些特殊手段對數據文件頭和SCN做一些改動。 一般除非特別緊急的狀況,否則不要用BBED、強制推進SCN等”偏方“去打開資料庫,對於一個成功的DBA來說,做好備份和災備永遠是最重要的工作。Instance Recovery Phases
The first phase of instance recovery is called cache recovery or rolling forward, and involves reapplying all of the changes recorded in the online redo log to the data files. Because rollback data is recorded in the online redo log, rolling forward also regenerates the corresponding undo segments.
Rolling forward proceeds through as many online redo log files as necessary to bring the database forward in time. After rolling forward, the data blocks contain all committed changes recorded in the online redo log files. These files could also contain uncommitted changes that were either saved to the data files before the failure, or were recorded in the online redo log and introduced during cache recovery.
After the roll forward, any changes that were not committed must be undone. Oracle Database uses the checkpoint position, which guarantees that every committed change with an SCN lower than the checkpoint SCN is saved on disk. Oracle Database applies undo blocks to roll back uncommitted changes in data blocks that were written before the failure or introduced during cache recovery. This phase is called rolling back or transaction recovery.