接同事电话,ADG备库同步出了问题。
在备库查看,大量的redolog没有apply。
在备库查看GAP,竟然没有查到。
查看备库日志,当前只有1号、3号的日志不停的传了过来,唯独缺了2号的,如下:
Tue Aug 21 09:28:00 2018 RFS[16]: Selected log 22 for thread 1 sequence 91327 dbid 1638344406 branch 954674776 Tue Aug 21 09:28:06 2018 Archived Log entry 76689 added for thread 1 sequence 91326 ID 0x61a75fd3 dest 2: Tue Aug 21 09:32:55 2018 RFS[16]: Selected log 21 for thread 1 sequence 91328 dbid 1638344406 branch 954674776 Tue Aug 21 09:32:58 2018 Archived Log entry 76690 added for thread 1 sequence 91327 ID 0x61a75fd3 dest 2: Tue Aug 21 09:36:41 2018 RFS[16]: Selected log 22 for thread 1 sequence 91329 dbid 1638344406 branch 954674776 Tue Aug 21 09:36:42 2018 RFS[17]: Selected log 41 for thread 3 sequence 73888 dbid 1638344406 branch 954674776 Tue Aug 21 09:36:45 2018 Archived Log entry 76691 added for thread 3 sequence 73887 ID 0x61a75fd3 dest 2: Tue Aug 21 09:36:46 2018 Archived Log entry 76692 added for thread 1 sequence 91328 ID 0x61a75fd3 dest 2: Tue Aug 21 09:40:10 2018 RFS[16]: Selected log 21 for thread 1 sequence 91330 dbid 1638344406 branch 954674776 Tue Aug 21 09:40:21 2018 Archived Log entry 76693 added for thread 1 sequence 91329 ID 0x61a75fd3 dest 2: Tue Aug 21 09:41:29 2018 RFS[16]: Selected log 22 for thread 1 sequence 91331 dbid 1638344406 branch 954674776 Tue Aug 21 09:41:33 2018 Archived Log entry 76694 added for thread 1 sequence 91330 ID 0x61a75fd3 dest 2: Tue Aug 21 09:42:33 2018 RFS[16]: Selected log 21 for thread 1 sequence 91332 dbid 1638344406 branch 954674776 Tue Aug 21 09:42:35 2018 RFS[17]: Selected log 42 for thread 3 sequence 73889 dbid 1638344406 branch 954674776
看来是2号节点那边出了问题。
查看2号节点日志,果然是无法登陆adg节点,如下:
Tue Aug 21 08:41:30 2018 ORA-16198: LGWR received timedout error from KSR LGWR: Attempting destination LOG_ARCHIVE_DEST_1 network reconnect (16198) LGWR: Destination LOG_ARCHIVE_DEST_1 network reconnect abandoned Error 16198 for archive log file 9 to 'jkdbadg' ORA-16198: LGWR received timedout error from KSR LGWR: Error 16198 disconnecting from destination LOG_ARCHIVE_DEST_1 standby host 'jkdbadg' Tue Aug 21 08:41:51 2018 LGWR: Failed to archive log 9 thread 2 sequence 73164 (16198) ,,, Tue Aug 21 08:47:13 2018 Error 1017 received logging on to the standby ------------------------------------------------------------ Check that the primary and standby are using a password file and remote_login_passwordfile is set to SHARED or EXCLUSIVE, and that the SYS password is same in the password files. returning error ORA-16191 ------------------------------------------------------------ PING[ARC2]: Heartbeat failed to connect to standby 'jkdbadg'. Error is 16191.
让同事协助重启2号节点,依然有问题,从1号节点拷贝来密码文件,命名为2号的密码文件,再次重启,查看备库的GAP,这次GAP信息对了,如下:
SYS@jkdb1> SELECT * FROM v$archive_gap; THREAD# LOW_SEQUENCE# HIGH_SEQUENCE# --------------- --------------- --------------- 2 73164 73213 SYS@jkdb1> SELECT thread#, MAX(sequence#) FROM v$archived_log WHERE applied = 'YES' GROUP BY thread#; THREAD# MAX(SEQUENCE#) --------------- --------------- 1 91317 2 73163 3 73875
可见,2号节点的73164-73213共计50个redolog没有传过来,2号的redo只在备库应用到73163,并堵塞了1号、3号的日志应用。
只能从2号主库去找这些日志了,不幸的是,2号上的日志只保留到了73124开始往后的,如下:
select name from v$archived_log where name like '+FRA/jkdb/archivelog/2018_08_21/thread_2%' order by sequence#; NAME ---------------------------------------------------------------------------------------------------- +FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73214.4526.984744477 +FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73215.4486.984744967 +FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73216.2786.984745033 +FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73217.1013.984745537 +FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73218.3876.984745721 ,,,
可见,前面的日志已经在备份后删除了,只能从备份恢复出来,再搞到备库。
此处先暂停2号节点的自动备份与删除任务,恢复日志备份,如下:
restore archivelog from logseq 73164 until logseq 73213 thread 2; 单通道恢复太慢!改为多通道,效率提升明显,如下: run { allocate channel c1 type disk; allocate channel c2 type disk; allocate channel c3 type disk; allocate channel c4 type disk; restore archivelog from logseq 73170 until logseq 73213 thread 2; release channel c1; release channel c2; release channel c3; release channel c4; }
耗时几个小时!
此时查看可见恢复了这些日志:
select name from v$archived_log where name like '+FRA/jkdb/archivelog/2018_08_21/thread_2%' order by sequence#;
此处显示结果略。
将这些短缺的redolog拷贝到备库,如下:
run { allocate channel c1 type disk; allocate channel c2 type disk; allocate channel c3 type disk; allocate channel c4 type disk; copy archivelog '+FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73164.3169.984774029' to '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73164.3169.984774029'; copy archivelog '+FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73165.2819.984774871' to '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73165.2819.984774871'; copy archivelog '+FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73166.3733.984776431' to '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73166.3733.984776431'; ,,, copy archivelog '+FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73211.3202.984780775' to '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73211.3202.984780775'; copy archivelog '+FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73212.3220.984782767' to '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73212.3220.984782767'; copy archivelog '+FRA/jkdb/archivelog/2018_08_21/thread_2_seq_73213.3700.984778461' to '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73213.3700.984778461'; release channel c1; release channel c2; release channel c3; release channel c4; }
又耗费几个小时,汗。
然后再备库注册这些日志,如下:
alter database register logfile '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73164.3169.984774029'; alter database register logfile '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73165.2819.984774871'; alter database register logfile '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73166.3733.984776431'; ,,, alter database register logfile '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73211.3202.984780775'; alter database register logfile '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73212.3220.984782767'; alter database register logfile '/nfs/bak/archivelog/2018_08_21/thread_2_seq_73213.3700.984778461';
需要注意的是,注册时提示已经注册了。但是备库并没有主动开始应用这些日志。
在备库关闭同步,再打开同步,让它自己慢慢apply吧!预计至少几个小时吧,因为截至此时,已经积攒了600多个日志待apply。
睡觉,第二天早上7点查看,备库已经完全同步了。
文章评论