dataguard备库出现GAP修复
1. 产生原因
巡检发现备库的日志应用有滞后的情况
DGMGRL> show database sxcmpdg 'RecvQEntries'
STANDBY_RECEIVE_QUEUE
STATUS RESETLOGS_ID THREAD LOG_SEQ TIME_GENERATED TIME_COMPLETED FIRST_CHANGE# NEXT_CHANGE# SIZE (KBs) NOT_APPLIED 943612304 1 11606 09/03/2017 02:59:24 09/03/2017 03:06:03 167076036 167089250 160017 NOT_APPLIED 943612304 1 11607 09/03/2017 03:06:03 09/03/2017 03:22:06 167089250 167128494 668992 NOT_APPLIED 943612304 1 12110 09/04/2017 02:58:30 09/04/2017 03:04:30 177789138 177810714 267788
去备库查询滞后情况
select * from v$archive_gap;
THREAD# LOW_SEQUENCE# HIGH_SEQUENCE#
---------- ------------- --------------
1 11442 11605
去broker中查询dg的配置情况
DGMGRL> show configuration;
Configuration - sxdg
Protection Mode: MaxAvailability
Databases:
sxcmpdb - Primary database Error: ORA-16810: multiple errors or warnings detected for the database sxcmpdg - Physical standby database Warning: ORA-16857: standby disconnected from redo source for longer than specified threshold
Fast-Start Failover: DISABLED
Configuration Status:
ERROR
发现有fast_recovery_area不可用的报错,并且无法继续归档。
查询配置
sql> select * from v$flash_recovery_area_usage;
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE
-------------------- ------------------ -------------------------
NUMBER_OF_FILES
CONTROL FILE 0 0
0
REDO LOG 2.6 0
13
ARCHIVED LOG 97.37 0
1118
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE
-------------------- ------------------ -------------------------
NUMBER_OF_FILES
BACKUP PIECE 0 0
0
IMAGE COPY 0 0
0
FLASHBACK LOG 0 0
0
FILE_TYPE PERCENT_SPACE_USED PERCENT_SPACE_RECLAIMABLE
-------------------- ------------------ -------------------------
NUMBER_OF_FILES
FOREIGN ARCHIVED LOG 0 0
0
7 rows selected.
sql> show parameter db_recover
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
db_recovery_file_dest string /u01/app/oracle/fast_recovery_
area
db_recovery_file_dest_size big integer 500G
[oracle@haixindg ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 3.3T 1.8T 1.5T 56% /
devtmpfs 126G 0 126G 0% /dev
tmpfs 126G 0 126G 0% /dev/shm
tmpfs 126G 178M 126G 1% /run
tmpfs 126G 0 126G 0% /sys/fs/cgroup
/dev/sda2 497M 119M 379M 24% /boot
/dev/sda1 200M 9.5M 191M 5% /boot/efi
tmpfs 26G 0 26G 0% /run/user/0
原来fast_recovery_area的空间被另一个备库的归档所填充很大空间,并且路径相同.
2. 修复过程
修改fast_recovery_area大小
sql> alter system set db_recovery_file_dest_size = 3072G;
System altered.
重新启动配置
DGMGRL> enable configuration;
Enabled.
观察后台日志,备库开始追赶滞后情况
Mon Sep 04 09:45:41 2017
Media Recovery Log /u01/app/oracle/fast_recovery_area/SXCMPDG/archivelog/2017_09_04/o1_mf_1_11860_dtsd15hp_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/SXCMPDG/archivelog/2017_09_04/o1_mf_1_11861_dtsd15wz_.arc
Mon Sep 04 09:45:48 2017
Archived Log entry 2682 added for thread 1 sequence 11863 rlc 943612304 ID 0x6e44f610 dest 2:
RFS[66]: Opened log for thread 1 sequence 11866 dbid 1850015504 branch 943612304
Mon Sep 04 09:45:54 2017
Media Recovery Log /u01/app/oracle/fast_recovery_area/SXCMPDG/archivelog/2017_09_04/o1_mf_1_11862_dtsd19fg_.arc
Media Recovery Log /u01/app/oracle/fast_recovery_area/SXCMPDG/archivelog/2017_09_04/o1_mf_1_11863_dtsd1mr9_.arc
Media Recovery Waiting for thread 1 sequence 11864 (in transit)
Mon Sep 04 09:46:12 2017
Archived Log entry 2683 added for thread 1 sequence 11864 rlc 943612304 ID 0x6e44f610 dest 2:
RFS[67]: Opened log for thread 1 sequence 11867 dbid 1850015504 branch 943612304
Mon Sep 04 09:46:14 2017
Archived Log entry 2684 added for thread 1 sequence 11865 rlc 943612304 ID 0x6e44f610 dest 2:
RFS[68]: Opened log for thread 1 sequence 11868 dbid 1850015504 branch 943612304
Mon Sep 04 09:46:15 2017
Archived Log entry 2685 added for thread 1 sequence 11866 rlc 943612304 ID 0x6e44f610 dest 2:
RFS[66]: Opened log for thread 1 sequence 11869 dbid 1850015504 branch 943612304
配置重新启用,并有告警信息产生
DGMGRL> show configuration;
Configuration - sxdg
Protection Mode: MaxAvailability
Databases:
sxcmpdb - Primary database Warning: ORA-16629: database reports a different protection level from the protection mode sxcmpdg - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
WARNING
还需要追赶几百个归档文件
rw-r-----. 1 oracle oinstall 974596608 Sep 4 09:48 o1_mf_1_11878_dtsd7jsg_.arc
-rw-r-----. 1 oracle oinstall 1053221888 Sep 4 09:48 o1_mf_1_11879_dtsd7jwl_.arc
-rw-r-----. 1 oracle oinstall 1044361216 Sep 4 09:48 o1_mf_1_11880_dtsd7k0r_.arc
-rw-r-----. 1 oracle oinstall 274215936 Sep 4 03:07 o1_mf_1_12110_dtrnprd3_.arc
-rw-r-----. 1 oracle oinstall 986118144 Sep 4 09:16 o1_mf_1_12111_dtsbco0h_.arc
-rw-r-----. 1 oracle oinstall 588338688 Sep 4 09:22 o1_mf_1_12174_dtsbpy68_.arc
-rw-r-----. 1 oracle oinstall 1043307008 Sep 4 09:17 o1_mf_1_12175_dtsbfms5_.arc
-rw-r-----. 1 oracle oinstall 148306944 Sep 4 09:17 o1_mf_1_12176_dtsbg2hn_.arc
-rw-r-----. 1 oracle oinstall 927752704 Sep 4 09:37 o1_mf_1_12177_dtsckvx6_.arc
-rw-r-----. 1 oracle oinstall 357376 Sep 4 09:37 o1_mf_1_12178_dtscm96t_.arc
调整归档删除天数,由原来的保留7天,改为保留2天。
追平后备库的归档已经连续
-rw-r-----. 1 oracle oinstall 148306944 Sep 4 09:17 o1_mf_1_12176_dtsbg2hn_.arc
-rw-r-----. 1 oracle oinstall 927752704 Sep 4 09:37 o1_mf_1_12177_dtsckvx6_.arc
-rw-r-----. 1 oracle oinstall 357376 Sep 4 09:37 o1_mf_1_12178_dtscm96t_.arc
-rw-r-----. 1 oracle oinstall 923085824 Sep 4 10:05 o1_mf_1_12179_dtsf6dgp_.arc
-rw-r-----. 1 oracle oinstall 791576064 Sep 4 10:30 o1_mf_1_12180_dtsgozgc_.arc
[oracle@haixindg 2017_09_04]$
备库没有gap产生
sql> select * from v$archive_gap;
no rows selected
后台日志已经应用到redo
Mon Sep 04 10:31:12 2017
Media Recovery Log /u01/app/oracle/fast_recovery_area/SXCMPDG/archivelog/2017_09_04/o1_mf_1_12180_dtsgozgc_.arc
Media Recovery Waiting for thread 1 sequence 12181 (in transit)
Recovery of Online Redo Log: Thread 1 Group 7 Seq 12181 Reading mem 0
Mem# 0: /u01/app/oracle/fast_recovery_area/SXCMPDG/onlinelog/o1_mf_7_drqfh50g_.log
dg的配置已经正常
DGMGRL> show configuration;
Configuration - sxdg
Protection Mode: MaxAvailability
Databases:
sxcmpdb - Primary database sxcmpdg - Physical standby database
Fast-Start Failover: DISABLED
Configuration Status:
SUCCESS
主备库的保护模式以及保护级别已经统一。
sql> select PROTECTION_MODE,PROTECTION_LEVEL from v$database;
PROTECTION_MODE PROTECTION_LEVEL
-------------------- --------------------
MAXIMUM AVAILABILITY MAXIMUM AVAILABILITY
3. 总结
定期巡检不要忘记,在多实例的备库服务器中,要设置合理的fast_recovery_area大小,合理的保存备库归档冗余,有助于快速恢复,如果备库归档文件已经删除,只能通过主库来增量恢复。