Media recovery Terminated with ORA-00600

Recently worked on huge transactional production database of 11.2.0.2, It has Standby database with Real-time Apply, In part of journal archiving the data will be inserted into other tables and same data will be purged from the Huge Ledger tables.

Log History on 29th September between 1 AM to 3AM.

Date      Day   Total h00 h01 h02 h03
--------- ----- ----- --- --- --- ---
27-SEP-12 Thu   1207  0   5   0   12
28-SEP-12 Fri   728   1   84  21  97
29-SEP-12 Sat   668   90  372 99  1 
30-SEP-12 Sun   83    1   2   1   1

Total number of rows archived are “19351374” and according to above log history on average total archives are 472 and each redo log file size is 100mb which is 472 * 100mb = 46 GB of redo in just 2 hours. This production database has standby database with Real-Time apply, During high redo unable to archive quickly on standby database of course all of the archives are transferred. But MRP went smooth up to archive log sequence “147613”, When MRP is trying to apply the next sequence “147614”, Process been terminated and returned internal errors as below.

Sat Sep 29 06:49:21 2012
Completed: ALTER DATABASE RECOVER MANAGED STANDBY DATABASE  THROUGH ALL SWITCHOVER DISCONNECT  USING CURRENT LOGFILE
Media Recovery Log /u04/arch/ckpt/1_147614_751616314.dbf
Errors in file /u00/app/oracle/diag/rdbms/ckptsby1/ckpt/trace/ckpt_pr00_21586.trc  (incident=582898):
ORA-00600: internal error code, arguments: [2619], [147614], [], [], [], [], [], [], [], [], [], []
Incident details in: /u00/app/oracle/diag/rdbms/ckptsby1/ckpt/incident/incdir_582898/ckpt_pr00_21586_i582898.trc

Whenever Internal errors occurred we usually check look up tool , if there is any listed bugs found we follow work around and submit SR to Oracle Support, But not easy to get response on weekends and need to fix this issue ASAP. Lets do some troubleshooting here. Initially bounced standby database still errors remain same and MRP been terminated as  usual.

Total How many Archives been generated on Primary?

SQL> select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
147729
SQL>

Now what is the latest sequence been applied on Standby?

SQL> select max(sequence#) from v$archived_log where applied='YES';
MAX(SEQUENCE#)
--------------
147613
SQL>

so around 116 of archives are not applied and still many of archives can be generated in primary database, Quickly most of them think to either to perform 1) Incremental Roll forward (or) 2) Rebuild standby database because of internal errors when there is no outstanding time to work. But in some cases still we can deal Internal errors(ORA-00600) depending on our analysis.

Now revise our findings, The maximum sequence applied on standby is “147613”, Wait am giving you small TIP 🙂 check what is the second argument in internal error? It’s “ORA-00600: internal error code, arguments: [2619], [147614]” So whatever the first argument either may it relates to the FRA (or) archive location size but second argument referring to the Next sequence which is not applied on standby.. BINGO!!! So the issue is with the archive log sequence “147614“.

So if the archive log is corrupted then how to resolve this GAP? Straight away answer is Incremental Roll forward, You can also have a try with Primary archive log(May be Primary archive also corrupted). Because there may be corruption in Archive log of standby may be during the transport over network or any physical corruption of Standby redo log files.

So first copy the primary archive log sequence to standby database and see if there is any differences in Sizes of both archives logs of primary and standby Archive logs.

oracle@ora2-11.ckpt.net> ls -ltr 1_147614_751616314.dbf_primary 1_147614_751616314.dbf_standby
-rw-r----- 1 oracle oinstall 69396480 Sep 29 10:53 1_147614_751616314.dbf_standby
-rw-r----- 1 oracle oinstall 69396480 Sep 29 10:56 1_147614_751616314.dbf_primary
oracle@ora2-11.ckpt.net>

Here both archive logs permissions are same, File sizes are also same. Hmm. Will see any difference in both of the files by Linux commands

oracle@ora2-11.ckpt.net> diff 1_147614_751616314.dbf_primary 1_147614_751616314.dbf_standby
Binary files 1_147614_751616314.dbf_primary and 1_147614_751616314.dbf_standby differ
oracle@ora2-11.ckpt.net> echo $?
2
oracle@ora2-11.ckpt.net>

So according to the above command, both the archives from primary and standby are differ. Now will perform more investigation using Log Miner.

SQL> EXECUTE DBMS_LOGMNR.ADD_LOGFILE('/u00/app/oracle/standby_arch/1_147614_751616314.dbf_standby' ,options => DBMS_LOGMNR.NEW);
PL/SQL procedure successfully completed.
SQL> EXECUTE DBMS_LOGMNR.START_LOGMNR();
PL/SQL procedure successfully completed.
SQL> SELECT rbasqn, rbablk, rbabyte, operation, status, info FROM V$LOGMNR_CONTENTS;
SELECT rbasqn, rbablk, rbabyte, operation, status, info FROM V$LOGMNR_CONTENTS
                                                             *
ERROR at line 1:
ORA-00600: internal error code, arguments: [2619], [147614], [], [], [], [],
[], [], [], [], [], []
SQL>

If analyzed primary archive redo log file, Log miner able to read without any errors.

SQL> SELECT rbasqn, rbablk, rbabyte, operation, status, info FROM V$LOGMNR_CONTENTS;
    RBASQN     RBABLK    RBABYTE OPERATION                            STATUS INFO
---------- ---------- ---------- -------------------------------- ---------- --------------------------------
    147614       3262        368 COMMIT                                    0
    147614       3264        472 DELETE                                    2 Dictionary Mismatch
    147614       3266        184 DELETE                                    2 Dictionary Mismatch
    147614       3267        392 DELETE                                    2 Dictionary Mismatch
    147614       3269        104 DELETE                                    2 Dictionary Mismatch
    147614       3270        312 DELETE                                    2 Dictionary Mismatch
    147614       3272         24 DELETE                                    2 Dictionary Mismatch
    147614       3273        232 DELETE                                    2 Dictionary Mismatch
    147614       3275         16 DELETE                                    2 Dictionary Mismatch
    147614       3276        224 DELETE                                    2 Dictionary Mismatch
    147614       3277        432 DELETE                                    2 Dictionary Mismatch

So there is no corruption also, But internal errors which is almost our out of Hands? Still you can perform manual recovery of standby with the only log which is corrupted and then lets start MRP and see how it goes.

SQL> recover standby database until cancel;
ORA-00279: change 2209191282893 generated at 09/29/2012
01:58:45 needed for thread 1
ORA-00289: suggestion :/u04/arch/ckpt/1_147614_751616314.dbf
ORA-00280: change 2209191282893 for thread 1 is in sequence #147614
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u00/app/oracle/standby_arch/1_147614_751616314.db f
ORA-00279: change 2209191284000 generated at 09/29/2012 01:59:00 needed for thread 1
ORA-00289: suggestion : /u04/arch/ckpt/1_147615_751616314.dbf
ORA-00280: change 2209191284000 for thread 1 is in sequence #147615
ORA-00278: log file '/u00/app/oracle/standby_arch/1_147614_751616314.dbf' no longer needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
/u00/app/oracle/standby_arch/1_147615_751616314.dbf

Now manually archive log sequence of primary “147614” Successfully applied and requesting for the next sequence. Alert log information as below

Sat Sep 29 07:41:56 2012
ALTER DATABASE RECOVER  standby database until cancel
Media Recovery Start
 started logmerger process
Sat Sep 29 07:41:56 2012
Managed Standby Recovery not using Real Time Apply
Parallel Media Recovery started with 32 slaves
ORA-279 signalled during: ALTER DATABASE RECOVER  standby database until cancel  ...
Sat Sep 29 07:42:24 2012
ALTER DATABASE RECOVER    LOGFILE '/u00/app/oracle/standby_arch/1_147614_751616314.dbf'
Media Recovery Log /u00/app/oracle/standby_arch/1_147614_751616314.dbf
ORA-279 signalled during: ALTER DATABASE RECOVER    LOGFILE '/u00/app/oracle/standby_arch/1_147614_751616314.dbf'  ...

Now cancelled the recovery and started Broker, So that MRP process will be initiated.

SQL> alter system set dg_broker_start=true;
System altered.
SQL>

When started Broker(DMON) sucessfully MRP also started and applying archives from “147615” without any errors, All the archives applied and now standby is completely in SYNC with primary database.

Primary:-
SQL> select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
--------------
147849
SQL>

Standby:-
SQL> select max(sequence#) from v$archived_log where applied='YES';
MAX(SEQUENCE#)
--------------
147848
SQL>

Now the issue is resolved, But what is the issue with archive log sequence “147614” ? Still its unanswered? Aaah.. I just tried to delete the archive log using RMAN after performing manual copy and below is the issue.

Sat Sep 29 08:01:01 2012
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
***
Corrupt block seq: 147614 blocknum=1.
Bad header found during deleting archived log
Data in bad block - flag:1. format:34. bno:1. seq:147614
beg:0 cks:47474
calculated check value: 47474
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
Reread of seq=147614, blocknum=1, file=/u04/arch/ckpt/1_147614_751616314.dbf, found same corrupt data
archivelog header validation failure for file /u04/arch/ckpt/1_147614_751616314.dbf
***

But why Log Miner unable to show the Block corruption issues when this archive log is analyzed? But I think its issue more with the BAD HEADER…  So in short there may be less control on Internal errors still sometimes we can escape from those errors some times.

Happy Reading… Thank you