Troubleshooting the Nagios check_oracle_rman_backup_problems alert

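I am not an Oracle DBA and do not know Oracle. When the alert fired, the ops team would not touch it and said it was DBA work. In their eyes MySQL, Oracle, Sybase, Redis and MongoDB all belong to the DBA and have nothing to do with them.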
1. Open nrpe.cfg, find the check_oracle_rman_backup_problems check, and run its command by hand.
cat /usr/local/nagios/etc/nrpe.cfg
![](http://i2.51cto.com/images/blog/201803/09/6ac77908871d3a4587a289d7f718f8a4.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk=)
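Rather than reading the whole file, the definition can be pulled out directly. The following is a minimal sketch, assuming the check is declared in the usual NRPE `command[<name>]=<command line>` form; the helper name `nrpe_command_line` is made up for illustration.

```shell
#!/bin/sh
# Print the NRPE command definition(s) for a given check name.
# $1: path to nrpe.cfg, $2: check name.
# (nrpe_command_line is a hypothetical helper, not part of NRPE.)
nrpe_command_line() {
    cfg=$1
    name=$2
    # -F treats the pattern as a fixed string, so the brackets are literal.
    grep -F "command[${name}]=" "$cfg"
}

# Example, using the path from the article:
# nrpe_command_line /usr/local/nagios/etc/nrpe.cfg check_oracle_rman_backup_problems
```

Everything after the `=` in the printed line is the command to run by hand.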
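2. The check is implemented by the check_oracle_health script (written in Perl), so open it and see how it obtains the value it monitors.
Searching for rman-backup-problems leads to the @mode array.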

The following code is also in the script. The SQL in it is what we are really after: it is how the RMAN backup status is monitored.
    elsif ($params{mode} =~ /server::instance::rman::backup::problems/) {
      $self->{rman_backup_problems} = $self->{handle}->fetchrow_array(q{
          SELECT COUNT(*) FROM v$rman_status
          WHERE
            operation = 'BACKUP'
          AND
            status != 'COMPLETED'
          AND
            status != 'RUNNING'
          AND
            start_time > sysdate-3
      });
    # ... (the matching branch that reports the result appears separately)
    } elsif ($params{mode} =~ /server::instance::rman::backup::problems/) {
      $self->add_nagios(
          $self->check_thresholds($self->{rman_backup_problems},1,2),
          sprintf "rman had %d problems during the last 3 days",
          $self->{rman_backup_problems});
      $self->add_perfdata(sprintf "rman_backup_problems=%d;%d;%d",
          $self->{rman_backup_problems},
          $self->{warningrange},$self->{criticalrange});
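The plugin only returns a count, so when investigating it helps to list the rows behind that count. The sketch below prints a query with the same filter as the excerpt above, selecting a few `v$rman_status` columns instead of `COUNT(*)`. The helper name `rman_problem_sql` and the use of OS-authenticated `sqlplus` are assumptions for illustration, not something the plugin provides.

```shell
#!/bin/sh
# Print a query listing the rows that the plugin's COUNT(*) would count.
# $1: look-back window in days (the plugin hard-codes 3).
# (rman_problem_sql is a hypothetical helper.)
rman_problem_sql() {
    days=${1:-3}
    # The v$rman_status line is single-quoted so the shell does not
    # try to expand "$rman_status" as a variable.
    printf '%s\n' \
        'SELECT operation, object_type, status, start_time, end_time' \
        'FROM v$rman_status' \
        "WHERE operation = 'BACKUP'" \
        "AND status NOT IN ('COMPLETED', 'RUNNING')" \
        "AND start_time > sysdate - ${days}" \
        'ORDER BY start_time;'
}

# Assumed usage on the database host, with OS authentication:
# rman_problem_sql 3 | sqlplus -S "/ as sysdba"
```

The `status` and `start_time` of each row show which backup run to look up in the backup log.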
So the alert is caused by the RMAN backup. Running the SQL and reading the backup log turns up the following error:
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle


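Control File Copy 69 2017-12-20 11:22:41 /data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on ORA_DISK_1 channel at 03/06/2018 01:15:28
ORA-19606: Cannot copy or restore to snapshot control file

Once the error is known, it is easy to fix. The obsolete control file copy that RMAN is trying to delete has the same path as the snapshot control file, hence ORA-19606. What I found online boils down to this: point the snapshot control file at another name for a moment, crosscheck and delete the stale copy, then restore the setting:

CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f_bak';
crosscheck controlfilecopy '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f';
delete expired controlfilecopy '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f';
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f';
CONFIGURE SNAPSHOT CONTROLFILE NAME clear;

Summary: to follow this you need to be able to read object-oriented Perl. Here `package xxx` plays the role of a class declaration, and the `new` function is the usual constructor. Not knowing something is nothing to be afraid of, because you can go and learn it. I picked up some Perl along the way, so it was time well spent.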
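If the same fix is needed on more than one instance, the RMAN commands listed above can be generated for any snapshot control file path. This is only a sketch: the helper name `snapcf_fix_commands` and the `/tmp/fix_snapcf.rman` file are hypothetical, and `NOPROMPT` is added to the `DELETE` (a change from the commands as quoted) so the command file does not stop for confirmation.

```shell
#!/bin/sh
# Print the RMAN commands from the article for a given snapshot control file.
# $1: current snapshot control file path (also the path of the stale copy).
# (snapcf_fix_commands is a hypothetical helper.)
snapcf_fix_commands() {
    snapcf=$1
    cat <<EOF
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '${snapcf}_bak';
CROSSCHECK CONTROLFILECOPY '${snapcf}';
DELETE NOPROMPT EXPIRED CONTROLFILECOPY '${snapcf}';
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '${snapcf}';
CONFIGURE SNAPSHOT CONTROLFILE NAME CLEAR;
EOF
}

# Assumed usage: write the commands to a file, review it, then run it.
# snapcf_fix_commands /data/ora11g/product/11.2.0/db_1/dbs/snapcf_oradb2.f > /tmp/fix_snapcf.rman
# rman target / cmdfile=/tmp/fix_snapcf.rman
```

Writing the commands to a file first leaves a chance to check the path before anything is deleted.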
