A database on one of our projects crashed three times in a row. Checking the alert log showed the same error before each crash:
ORA-00600: internal error code, arguments: [15709], [29], [1]
The error occurs when parallel rollback in the database triggers Oracle Bug 6954722.
From Metalink:
Solution :
To implement solution for unpublished Bug: 6954722, please execute one of the following steps:
- Use the following workaround
Set fast_start_parallel_rollback=false and recovery_parallelism=0
Setting fast_start_parallel_rollback=false and recovery_parallelism=0 simply tells Oracle to recover failed/aborted transactions in serial mode. There is no harm in setting these, as that should not be a common operation.
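Going through the alert log, large numbers of deadlocks had been appearing on and off over the preceding days. Oracle resolves a deadlock by rolling back the work of one of the sessions involved (strictly speaking, the statement that detected the deadlock) so that the other can proceed. With that many deadlocks, the likely consequence is a large amount of parallel rollback, which triggered the bug and brought the database down. So although this is a bug, it is also tied to an unreasonable application design or configuration.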
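To see whether the instance really is recovering dead transactions in parallel, the fast-start recovery views can be queried. This is only a minimal sketch, assuming a session with access to the V$ views; it was not part of the original investigation.

```sql
-- Transactions that SMON is recovering or has recently recovered,
-- with progress measured in undo blocks.
SELECT usn, state, undoblocksdone, undoblockstotal
  FROM v$fast_start_transactions;

-- Parallel recovery server processes, if any are active.
SELECT state, undoblocksdone, pid
  FROM v$fast_start_servers;
```

If the second query returns rows while transactions are in the RECOVERING state, rollback is being done in parallel.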
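Following the Metalink recommendation, I changed these two parameters to avoid hitting the bug, then restarted the database and watched the result.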
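The change can be sketched roughly as follows. This assumes the instance was started from an spfile and that the statements are run as a privileged user such as SYSDBA; with a plain init.ora the two parameters would be edited in that file instead. recovery_parallelism is a static parameter, which is why a restart is needed.

```sql
-- Dynamic parameter: applies immediately and is persisted to the spfile.
ALTER SYSTEM SET fast_start_parallel_rollback = FALSE SCOPE = BOTH;

-- Static parameter: can only be written to the spfile,
-- takes effect at the next instance startup.
ALTER SYSTEM SET recovery_parallelism = 0 SCOPE = SPFILE;

-- After the restart, confirm both values.
SELECT name, value
  FROM v$parameter
 WHERE name IN ('fast_start_parallel_rollback', 'recovery_parallelism');
```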
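The deadlocks are still occurring, however, so I recommend reviewing and optimizing the application to keep them from causing other problems or triggering other bugs.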