RAC split-brain caused by a heartbeat disconnect

January 20, 2021

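When the network heartbeat IP is shut down while the disk heartbeat is still healthy, the two nodes of the primary database cluster go into split-brain, which triggers a database reconfiguration.
The basic rule of reconfiguration: the sub-cluster with more nodes survives; if the sub-clusters contain the same number of nodes, the sub-cluster that contains the lowest-numbered node survives.
So in a 2-node cluster, once all heartbeat IPs are stopped, node1 survives and keeps serving clients.
The actual order of events in the logs is:
1. The database instance on node2 is forcibly terminated.
2. Once node1 has detected that the node2 instance is dead, it reconfigures itself and continues to provide database service.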
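
The survival rule described above can be sketched in a few lines of Python. This only illustrates the rule as stated in this post, not Oracle Clusterware's actual implementation. The function name `surviving_subcluster` and the representation of sub-clusters as sets of node numbers are assumptions made for the example.

```python
def surviving_subcluster(subclusters):
    """Pick the surviving sub-cluster under the rule described in the post.

    subclusters: iterable of non-empty sets of node numbers (hypothetical model).
    Rule: the sub-cluster with the most nodes wins; on a tie, the one
    containing the lowest-numbered node wins.
    """
    groups = [frozenset(g) for g in subclusters]
    if not groups or any(not g for g in groups):
        raise ValueError("need at least one non-empty sub-cluster")
    # Sort key: larger size first (hence the negation), then lowest node number.
    return min(groups, key=lambda g: (-len(g), min(g)))


if __name__ == "__main__":
    # Two-node cluster with the interconnect cut: {1} and {2} have equal size,
    # so the sub-cluster holding node 1 survives.
    print(sorted(surviving_subcluster([{1}, {2}])))     # prints [1]
    # Three nodes split 1 / 2: the larger sub-cluster survives without node 1.
    print(sorted(surviving_subcluster([{1}, {2, 3}])))  # prints [2, 3]
```

The first call matches the test in this post: with both sub-clusters at one node each, the tie-break on the lowest node number keeps node1 alive.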

■■Stop all heartbeat IPs

■■node2 log

2021-01-20T16:20:15.835897+08:00
PMON (ospid: 86042): terminating the instance due to ORA error 499
2021-01-20T16:20:15.836015+08:00
Cause - 'Instance is being terminated due to fatal process death (pid: 5, ospid: 86119, IPC0)'
2021-01-20T16:20:16.873708+08:00
Instance terminated by PMON, pid = 86042
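
When several alert logs have to be checked, the termination details can be pulled out with a small script. The following is a minimal sketch that assumes the line layout shown in the node2 excerpt above. The regular expressions and the function name `summarize_termination` are illustrative assumptions, and other Oracle versions may word these messages differently.

```python
import re

# The node2 excerpt from this post, copied verbatim.
NODE2_LOG = """\
2021-01-20T16:20:15.835897+08:00
PMON (ospid: 86042): terminating the instance due to ORA error 499
2021-01-20T16:20:15.836015+08:00
Cause - 'Instance is being terminated due to fatal process death (pid: 5, ospid: 86119, IPC0)'
2021-01-20T16:20:16.873708+08:00
Instance terminated by PMON, pid = 86042
"""

# Patterns written against the excerpt above (an assumption, not a documented format).
TERMINATING = re.compile(
    r"^(\w+) \(ospid: (\d+)\): terminating the instance due to ORA error (\d+)",
    re.MULTILINE,
)
CAUSE = re.compile(r"fatal process death \(pid: (\d+), ospid: (\d+), (\w+)\)")


def summarize_termination(text):
    """Return who terminated the instance, the ORA error, and the dead process."""
    term = TERMINATING.search(text)
    if term is None:
        return None  # no termination message in this text
    cause = CAUSE.search(text)
    return {
        "terminated_by": term.group(1),
        "ora_error": int(term.group(3)),
        "dead_process": cause.group(3) if cause else None,
    }


if __name__ == "__main__":
    print(summarize_termination(NODE2_LOG))
    # prints {'terminated_by': 'PMON', 'ora_error': 499, 'dead_process': 'IPC0'}
```

For this excerpt the summary shows that PMON terminated the instance after the IPC0 process died.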

■■node1 log

2021-01-20T16:20:15.858020+08:00
Increasing priority of 4 RS
Reconfiguration started (old inc 14, new inc 16)
List of instances (total 1) :
 1
Dead instances (total 1) :
 2
My inst 1   
publish big name space -  dead or down/up instance detected, invalidate domain 0 
 Global Resource Directory frozen
* dead instance detected - domain 0 invalid = TRUE 
* dead instance detected - domain 2 invalid = TRUE, need cdb-level instance recovery
* dead instance detected - domain 3 invalid = TRUE, need cdb-level instance recovery
* dead instance detected - domain 4 invalid = TRUE, need cdb-level instance recovery
* dead instance detected - domain 6 invalid = TRUE, need cdb-level instance recovery
 Communication channels reestablished
 Master broadcasted resource hash value bitmaps
 Non-local Process blocks cleaned out
2021-01-20T16:20:15.919916+08:00
 LMS 2: 11 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2021-01-20T16:20:15.919965+08:00
 LMS 0: 3 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2021-01-20T16:20:15.920014+08:00
 LMS 1: 0 GCS shadows cancelled, 0 closed, 0 Xw survived, skipped 0
2021-01-20T16:20:15.920098+08:00
 LMS 3: 11 GCS shadows cancelled, 1 closed, 0 Xw survived, skipped 0
2021-01-20T16:20:17.126005+08:00
 Set master node info 
 Dwn-cvts replayed, VALBLKs dubious
 All grantable enqueues granted
2021-01-20T16:20:17.155938+08:00
Set instance recovery to serial mode.
Set instance recovery parallelism to 1.
Post SMON to start 1st pass IR (domain 0).
2021-01-20T16:20:21.225167+08:00
Reconfiguration complete (total time 5.4 secs) 
Decreasing priority of 4 RS
2021-01-20T16:20:21.368705+08:00
CDB IR excluding pdb 2 which was cleanly closed.
2021-01-20T16:20:21.368775+08:00
CDB IR excluding pdb 5 which is either deleted, or not valid.
2021-01-20T16:20:21.368842+08:00
Instance recovery: looking for dead threads
2021-01-20T16:20:21.370570+08:00
Beginning instance recovery of 1 threads
 Thread 2: Recovery starting at checkpoint rba (logseq 499 block 1977), scn 0
2021-01-20T16:20:21.494881+08:00
Instance Recovery pruning the Recovery Set that was constructed by RMS0 per Buddy Instance feature for thread 2, RMS0's currba 499.28423.16, RMS0's nxscn 0x00000f46a99fb0c8, RMS0's Start RBA 499.1977.0, IR's Start RBA 499.1977.0.
2021-01-20T16:20:21.499527+08:00
Instance Recovery "using" the Recovery Set constructed by RMS0 after pruning per Buddy Instance feature.
2021-01-20T16:20:21.499679+08:00
Started redo scan from RMS0's current RBA
2021-01-20T16:20:21.528774+08:00
ALTER SYSTEM SET remote_listener=' wydb-scan:1521' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:21.530033+08:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:22.281296+08:00
Completed redo scan (from RMS0's current RBA to end of thread)
 read 303593 KB redo, 24318 data blocks need recovery
2021-01-20T16:20:23.662794+08:00
validate pdb 0, flags x10, valid 0, pdb flags x884 
* validated domain 0, flags = 0x880
* validated domain 2, flags = 0x40080
* validated domain 3, flags = 0x40080
* validated domain 4, flags = 0x40080
* validated domain 6, flags = 0x40080
2021-01-20T16:20:23.681095+08:00
Started redo application at
 Thread 2: logseq 499, block 1977, offset 0
2021-01-20T16:20:23.687876+08:00
Recovery of Online Redo Log: Thread 2 Group 4 Seq 499 Reading mem 0
  Mem# 0: +DATA1/WYDB/ONLINELOG/group_4.270.1059677885
  Mem# 1: +FRA/WYDB/ONLINELOG/group_4.471.1059677889
2021-01-20T16:20:23.821150+08:00
ALTER SYSTEM SET remote_listener=' wydb-scan:1521' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:23.822138+08:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:25.989678+08:00
minact-scn: master detected reconf/inst-rec stop us-scan old-inc#:14 new-inc#:16
PDBAPP(3):minact-scn: master detected reconf/inst-rec stop us-scan old-inc#:16 new-inc#:16
PDBHIS(4):minact-scn: master detected reconf/inst-rec stop us-scan old-inc#:16 new-inc#:16
PDBCOL(6):minact-scn: master detected reconf/inst-rec stop us-scan old-inc#:16 new-inc#:16
2021-01-20T16:20:26.193643+08:00
Completed redo application of 270.39MB
2021-01-20T16:20:26.193736+08:00
Completed instance recovery at
 Thread 2: RBA 499.635609.16, nab 635609, scn 0x00000f46a9a444bb
 24304 data blocks read, 24377 data blocks written, 303593 redo k-bytes read
2021-01-20T16:20:26.233302+08:00
Thread 2 advanced to log sequence 500 (thread recovery)
2021-01-20T16:20:26.241454+08:00
Redo thread 2 internally disabled at seq 500 (SMON)
CDB instance recovery complete: pdb 2 valid 1 (flags x10, pdb flags x40080) 
CDB instance recovery complete: pdb 3 valid 1 (flags x10, pdb flags x40080) 
CDB instance recovery complete: pdb 4 valid 1 (flags x10, pdb flags x40080) 
CDB instance recovery complete: pdb 6 valid 1 (flags x10, pdb flags x40080) 
CDB instance recovery complete: pdb 0 valid 1 (flags x10, pdb flags x880) 
2021-01-20T16:20:26.403458+08:00
ALTER SYSTEM SET remote_listener=' wydb-scan:1521' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:26.404195+08:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:26.713883+08:00
ARC3 (PID:78108): Archived Log entry 13435 added for T-2.S-499 ID 0x4dbccd0a LAD:1
2021-01-20T16:20:26.898149+08:00
ARC2 (PID:78106): SRL selected for T-2.S-499 for LAD:2
2021-01-20T16:20:27.305071+08:00
ARC0 (PID:78090): Archiving disabled T-2.S-500
2021-01-20T16:20:27.315861+08:00
ARC0 (PID:78090): Archived Log entry 13437 added for T-2.S-500 ID 0x4dbccd0a LAD:1
2021-01-20T16:20:28.263876+08:00
Thread 1 advanced to log sequence 444 (LGWR switch),  current SCN: 16796168342713
  Current log# 2 seq# 444 mem# 0: +DATA1/WYDB/ONLINELOG/group_2.259.1055527355
  Current log# 2 seq# 444 mem# 1: +FRA/WYDB/ONLINELOG/group_2.258.1055527359
2021-01-20T16:20:29.474094+08:00
ARC0 (PID:78090): Archived Log entry 13439 added for T-1.S-443 ID 0x4dbccd0a LAD:1
2021-01-20T16:20:29.950566+08:00
TT02 (PID:243573): SRL selected for T-1.S-444 for LAD:2
2021-01-20T16:20:32.029310+08:00
Decreasing number of high priority LMS from 4 to 0
2021-01-20T16:20:33.070513+08:00
ALTER SYSTEM SET remote_listener=' wydb-scan:1521' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:33.071607+08:00
ALTER SYSTEM SET listener_networks='' SCOPE=MEMORY SID='wydb1';
2021-01-20T16:20:56.037592+08:00
PDBAPP(3):minact-scn: master continuing after IR
PDBHIS(4):minact-scn: master continuing after IR
PDBCOL(6):minact-scn: master continuing after IR
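
To see how long node1 took to reconfigure and recover, the timestamps in the log above can be turned into a timeline. The sketch below assumes the alert log layout shown here, where a timestamp line applies to the message lines that follow it. The milestone strings are copied from the excerpt, and the function name `milestone_times` is hypothetical.

```python
import re
from datetime import datetime

# A timestamp line as it appears in the log, e.g. 2021-01-20T16:20:15.858020+08:00
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{2}:\d{2}$")

# Message prefixes taken from the node1 log in this post.
MILESTONES = (
    "Reconfiguration started",
    "Reconfiguration complete",
    "Beginning instance recovery",
    "Completed instance recovery",
)


def milestone_times(lines):
    """Map each milestone to the timestamp line that most recently preceded it."""
    current = None
    found = {}
    for raw in lines:
        line = raw.strip()
        if TIMESTAMP.match(line):
            current = datetime.fromisoformat(line)
            continue
        for marker in MILESTONES:
            # Keep only the first occurrence of each milestone.
            if current is not None and marker not in found and line.startswith(marker):
                found[marker] = current
    return found


# Lines copied from the node1 log above (only the ones needed for the timeline).
EXCERPT = """\
2021-01-20T16:20:15.858020+08:00
Increasing priority of 4 RS
Reconfiguration started (old inc 14, new inc 16)
2021-01-20T16:20:21.225167+08:00
Reconfiguration complete (total time 5.4 secs)
2021-01-20T16:20:21.370570+08:00
Beginning instance recovery of 1 threads
2021-01-20T16:20:26.193736+08:00
Completed instance recovery at
""".splitlines()

if __name__ == "__main__":
    times = milestone_times(EXCERPT)
    start = times["Reconfiguration started"]
    for marker in MILESTONES:
        offset = (times[marker] - start).total_seconds()
        print(f"{marker}: +{offset:.3f}s")
```

For this excerpt the gap between the start and end of reconfiguration is about 5.4 seconds, which agrees with the `total time 5.4 secs` that the log itself reports. Instance recovery of thread 2 completes roughly 10.3 seconds after reconfiguration began.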
