1. View disk group information
asmcmd lsdg
This shows the details of each disk group (total size, free space, and so on).
2. Check the current state of the voting disk (votedisk)
crsctl query css votedisk
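3. Move the voting disk to ASM disk group DG2
crsctl replace votedisk +DG2
This moves the voting disk from the DATA disk group to the ASM disk group DG2.
With the votedisk now in ASM disk group DG2, we tried to manually add another voting disk in ASM disk group DG1, but the add failed. Voting files stored in ASM are managed per disk group, so they can be moved with replace but not added one at a time.
Voting disks also cannot be split between an ASM disk group and an OCFS file system.
For example, while the voting disk is in disk group DG2, the following command, which tries to add a voting disk on the OCFS file system, also fails: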
crsctl add css votedisk /myocfs1/vdfile5
To move the voting disk onto the OCFS file system, use replace instead:
crsctl replace votedisk /myocfs1/vd4
Check whether it succeeded:
crsctl query css votedisk
Surplus voting disks can also be deleted. A specific voting file is deleted by its FUID (File Universal Id):
crsctl delete css votedisk afb49b9a67304f9ebfaf4278c2eeeb71
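The FUID passed to the delete command comes from the query output. A minimal sketch of pulling it out with awk, assuming the column layout that crsctl query css votedisk prints (index, STATE, File Universal Id, file name, disk group); votedisk_fuids is a hypothetical helper name, not part of any Oracle tool:

```shell
# Print the File Universal Id of every ONLINE voting file.
# Reads `crsctl query css votedisk` output on stdin.
# Assumes the FUID is the third whitespace-separated field of each entry.
votedisk_fuids() {
    awk '$2 == "ONLINE" { print $3 }'
}
```

Typical use would be crsctl query css votedisk | votedisk_fuids, reviewing the list before passing any value to crsctl delete css votedisk.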
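6. Managing the OCR
Check the OCR status with ocrcheck.
The OCR is currently in the DATA disk group.
If the system has an OCFS file system, an OCR mirror can be added on it. First create the mirror file: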
touch /myocfs1/ocr_mirror1
chown root:oinstall /myocfs1/ocr_mirror1
chmod 640 /myocfs1/ocr_mirror1
Add the OCR mirror:
ocrconfig -add /myocfs1/ocr_mirror1
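The touch, chown, chmod sequence is repeated every time a new OCR location is prepared on the file system. A minimal sketch of wrapping it in one function; prepare_ocr_file is a hypothetical helper name, and the owner argument is optional here only so the function can be tried without root (the walkthrough itself uses root:oinstall):

```shell
# Create an empty placeholder file for an OCR location with mode 640.
# $1 = path of the file to create
# $2 = optional owner:group to apply (changing ownership needs root)
prepare_ocr_file() {
    ocr_path=$1
    ocr_owner=${2:-}
    touch "$ocr_path" || return 1
    chmod 640 "$ocr_path" || return 1
    if [ -n "$ocr_owner" ]; then
        chown "$ocr_owner" "$ocr_path" || return 1
    fi
}
```

Run as root, prepare_ocr_file /myocfs1/ocr_mirror1 root:oinstall would reproduce the three commands above.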
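An OCR mirror can also be added in an ASM disk group:
ocrconfig -add +DG2
After adding, check whether it succeeded:
ocrcheck
OCR mirrors can also be deleted. To delete the mirror on the file system:
ocrconfig -delete /myocfs1/ocr_mirror1
To delete the mirror in the ASM disk group:
ocrconfig -delete +DG2
Check whether the deletion succeeded:
ocrcheck
DG2 is no longer listed.
7. Replacing the OCR on the OCFS file system
Create the new OCR file first: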
touch /myocfs1/ocr_new
chown root:oinstall /myocfs1/ocr_new
chmod 640 /myocfs1/ocr_new
Replace /myocfs1/ocr_mirror1 with the new location /myocfs1/ocr_new:
ocrconfig -replace /myocfs1/ocr_mirror1 -replacement /myocfs1/ocr_new
That location can in turn be replaced with a new location in DG1:
ocrconfig -replace /myocfs1/ocr_new -replacement +DG1
Check whether it succeeded:
ocrcheck
8. Backing up the OCR
First check the existing backups with ocrconfig -showbackup.
Then take a manual backup:
ocrconfig -manualbackup
Check the OCR backups again:
ocrconfig -showbackup
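9. Backing up the OCR, part 2
The ocrdump utility can dump the contents of a specified OCR backup file.
By default the output is written to the current directory in a file named OCRDUMPFILE: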
ocrdump -backupfile /taryartar/12c/grid_home/cdata/mycluster/backup_20150520_231258.ocr
Exporting the OCR:
Export the OCR to a file named ocrEXP in the current directory:
ocrconfig -export ocrEXP
Importing the OCR:
ocrconfig -import ocrEXP
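10. Choosing between the two OCR backup methods
ocrconfig -manualbackup and ocrconfig -restore
ocrconfig -export and ocrconfig -import
The first method is generally recommended for backup and recovery.
To use the second method (export and import), shut down the cluster first and then run the export or import, so that a complete OCR file is obtained.
Files produced by ocrconfig -manualbackup and files produced by ocrconfig -export have different formats.
A file produced by ocrconfig -manualbackup must therefore be restored with ocrconfig -restore,
and a file exported with ocrconfig -export must be imported with ocrconfig -import.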
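Because the two file formats are not interchangeable, it helps to record how each file was produced and pick the recovery command from that. A minimal sketch that only prints the matching command; ocr_recover_cmd and the kind labels backup and export are hypothetical names chosen for illustration:

```shell
# Print the ocrconfig command that matches how an OCR file was produced.
# $1 = "backup" (file made by ocrconfig -manualbackup)
#      or "export" (file made by ocrconfig -export)
# $2 = path of the file
# Nothing is executed; the command is only printed for review.
ocr_recover_cmd() {
    case $1 in
        backup) printf 'ocrconfig -restore %s\n' "$2" ;;
        export) printf 'ocrconfig -import %s\n' "$2" ;;
        *) echo "unknown kind: $1 (expected backup or export)" >&2
           return 1 ;;
    esac
}
```

For example, ocr_recover_cmd export ocrEXP prints the import command, and any other kind is rejected rather than guessed.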
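11. Backing up and restoring voting disks (voting files)
1. The voting disk is backed up as part of the OCR backup.
3. If all voting disks are lost, manual intervention is required.
Recovery takes five main steps:
1. Restore the OCR.
2. Restore the voting disk.
3. Start the cluster.
4. Check the integrity of the OCR and the voting disk.
5. Check the cluster status.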
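The five steps can be written down as a checklist before starting. A minimal dry-run sketch that only prints commands, pairing each step with the commands used in the cluster-file-system experiment in this section; recovery_plan is a hypothetical helper name, and in the ASM case the order differs because the disk group has to be recreated in exclusive mode before the OCR is restored:

```shell
# Dry run: print the five recovery steps with the commands used for each.
# $1 = OCR backup file, $2 = voting disk target (a path or +DISKGROUP)
# Nothing is executed.
recovery_plan() {
    cat <<EOF
1. ocrconfig -restore $1
2. crsctl start crs -excl; crsctl replace votedisk $2; crsctl stop crs -f
3. crsctl start crs
4. cluvfy comp ocr -n all -verbose; cluvfy comp vdisk -n all -verbose
5. crsctl check cluster -all; crsctl stat res -t
EOF
}
```

Printing the plan first makes it easier to confirm the backup file name and the voting disk target before any destructive step is run.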
Experiment:
=== OCR and voting disk recovery test (cluster file system) ===
First check the basic OCR status:
ocrcheck
Note the OCR location:
/myocfs1/ocr_new
crsctl query css votedisk
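Note the voting disk location:
/myocfs1/storage/vd6B
Take a manual OCR backup:
ocrconfig -manualbackup
Note the OCR backup file that was just created:
/taryartar/12c/grid_home/cdata/mycluster/backup_20150521_152003.ocr
ocrconfig -showbackup
Delete the OCR that the cluster is currently using:
rm -rf /myocfs1/ocr_new
Delete the voting disk that the cluster is currently using:
rm -rf /myocfs1/storage/vd6B
Check the OCR and voting file status:
ocrcheck
The OCR can no longer be queried; the command reports an error.
crsctl query css votedisk
The votedisk can still be queried, but the underlying file no longer exists.
Check the current cluster nodes:
olsnodes
Force the cluster to stop on the current node. The -f option is required; otherwise a cluster in a failed state cannot be shut down completely.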
crsctl stop crs -f
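Switch to the grid user (sudo su - grid) and shut down the instance:
SQL> shutdown abort;
Kill any other cluster-related processes directly at the operating system level (grep -v grep keeps the grep process itself out of the list):
ps -ef |grep ora|grep -v grep|awk '{print $2}'|xargs kill -9
ps -ef |grep asm|grep -v grep|awk '{print $2}'|xargs kill -9
ps -ef |grep grid|grep -v grep|awk '{print $2}'|xargs kill -9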
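Since kill -9 on a broad pattern is easy to get wrong, it can help to list the matching PIDs first and review them. A minimal sketch, assuming the usual ps -ef layout where the PID is the second column; pids_matching is a hypothetical helper name:

```shell
# Print the PIDs of processes whose `ps -ef` line matches a pattern.
# Reads `ps -ef` output on stdin; $1 = pattern (an awk regular expression).
# Lines belonging to grep or awk helpers are skipped so the filter
# never reports itself.
pids_matching() {
    awk -v pat="$1" '$0 ~ pat && $0 !~ /grep|awk/ { print $2 }'
}
```

For example, ps -ef | pids_matching asm prints the candidate PIDs without killing anything; the reviewed list can then be passed to kill.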
ocrconfig -showbackup
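On one node, check whether the OCR file that was in use before the cluster stopped still exists.
Here it does not exist:
ll /myocfs1/ocr_new
Once it is confirmed missing, recreate the file:
touch /myocfs1/ocr_new
chown root:oinstall /myocfs1/ocr_new
chmod 640 /myocfs1/ocr_new
The file size is 0:
ll /myocfs1/ocr_new
Restore the OCR from the specified backup file: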
ocrconfig -restore /taryartar/12c/grid_home/cdata/mycluster/backup_20150521_152003.ocr
ll /myocfs1/ocr_new
After the restore, the OCR file size is no longer 0,
and ocrcheck shows the OCR information normally again.
ocrcheck
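The size check done by eye with ll can also be scripted. A minimal sketch using the shell's -s test, which is true only when the file exists and is larger than zero bytes; ocr_file_populated is a hypothetical helper name, and a non-empty file is only a quick sanity check, not a substitute for ocrcheck:

```shell
# Succeed only if the given OCR file exists and has a size greater than 0.
# $1 = path of the OCR file on the cluster file system
ocr_file_populated() {
    [ -s "$1" ]
}
```

For example: ocr_file_populated /myocfs1/ocr_new && echo "OCR file has content".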
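Next, restore the votedisk.
Start the cluster in exclusive mode on one node only:
crsctl start crs -excl
Once it is up, check the voting disk status:
crsctl query css votedisk
If the original location is usable, the voting disk can be replaced straight back onto it. Here the original location was deleted and is unavailable, so the following command fails:
crsctl replace votedisk /myocfs1/storage/vd6B
Instead, add a new voting disk:
crsctl add css votedisk /myocfs1/storage/vd6C
Then check the voting disk status:
crsctl query css votedisk
Now stop the cluster that was started in exclusive mode:
crsctl stop crs -f
The OCR and votedisk are now both in place, so the cluster can be started normally. Run on all nodes:
crsctl start crs
Confirm that the cluster is healthy.
Check OCR integrity across all cluster nodes:
cluvfy comp ocr -n all -verbose
Check votedisk integrity across all cluster nodes: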
cluvfy comp vdisk -n all -verbose
crsctl check cluster -all
crsctl stat res -t
12. OCR and voting disk recovery test (ASM)
First check the current OCR status of the cluster:
ocrcheck
The OCR is in disk group DG2.
Check the votedisk status:
[root@rac1 ~]# crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   3d071f92464a4fe2bf8fa4ce05f897f0 (/dev/raw/raw4) [DG2]
Located 1 voting disk(s).
[root@rac1 ~]#
The votedisk is also in disk group DG2, so both the OCR and the votedisk are stored there.
Next, determine which disks make up disk group DG2:
[root@rac1 ~]# su - grid
[grid@rac1 ~]$ sqlplus / as sysasm
SQL*Plus: Release 12.1.0.2.0 Production on Thu Nov 9 22:31:33 2017
Copyright (c) 1982,2014,Oracle. All rights reserved.
Connected to:
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL> column path format a20;
SQL> set linesize 500
SQL> select dg.NAME as disk_group,d.NAME,MOUNT_STATUS,HEADER_STATUS,MODE_STATUS,PATH from V$ASM_DISK d,V$ASM_DISKGROUP dg
  2  where d.GROUP_NUMBER=dg.GROUP_NUMBER
  3  order by dg.NAME;
DISK_GROUP                     NAME                           MOUNT_S HEADER_STATU MODE_ST PATH
------------------------------ ------------------------------ ------- ------------ ------- --------------------
DATA                           DATA_0001                      CACHED  MEMBER       ONLINE  /dev/raw/raw2
DATA                           DATA_0000                      CACHED  MEMBER       ONLINE  /dev/raw/raw1
DG1                            DG1_0000                       CACHED  MEMBER       ONLINE  /dev/raw/raw3
DG2                            DG2_0000                       CACHED  MEMBER       ONLINE  /dev/raw/raw4
SQL>
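When the query output is saved to a file, the disks belonging to one disk group can be picked out mechanically. A minimal sketch, assuming the column order of the query above (disk group name first, PATH last); disks_in_group is a hypothetical helper name:

```shell
# Print the PATH column for every row of the disk listing that belongs
# to the given disk group.
# Reads the query output on stdin; $1 = disk group name as shown in the
# DISK_GROUP column (for example DG2).
disks_in_group() {
    awk -v dg="$1" '$1 == dg { print $NF }'
}
```

With the listing above as input, disks_in_group DG2 yields the device that the next step overwrites, which is worth double-checking before running dd against it.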
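Corrupt the disk used by the disk group that holds the OCR and votedisk:
dd if=/dev/zero of=/dev/raw/raw4 bs=1024k count=1
Run the SQL query above again: disk group DG2 is no longer visible, which shows that it has been destroyed.
Next, check which nodes make up the cluster:
olsnodes
There are four nodes, but only two are running here.
Force the cluster to stop on every node: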
crsctl stop crs -f
Node 1:
[root@rac1 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac1'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.DG2.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.tar.db' on 'rac1'
CRS-2673: Attempting to stop 'ora.mgmtdb' on 'rac1'
CRS-2673: Attempting to stop 'ora.DG1.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.DATA.dg' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac1'
CRS-2673: Attempting to stop 'ora.oc4j' on 'rac1'
CRS-2677: Stop of 'ora.cvu' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.cvu' on 'rac2'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.LISTENER_SCAN3.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan3.vip' on 'rac1'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
CRS-2677: Stop of 'ora.scan3.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan3.vip' on 'rac2'
CRS-2677: Stop of 'ora.LISTENER_SCAN2.lsnr' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.scan2.vip' on 'rac1'
CRS-2676: Start of 'ora.scan3.vip' on 'rac2' succeeded
CRS-2676: Start of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2677: Stop of 'ora.scan2.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.scan2.vip' on 'rac2'
CRS-2677: Stop of 'ora.DG2.dg' on 'rac1' succeeded
CRS-2676: Start of 'ora.scan2.vip' on 'rac2' succeeded
CRS-2677: Stop of 'ora.DG1.dg' on 'rac1' succeeded
CRS-2677: Stop of 'ora.DATA.dg' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.tar.db' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mgmtdb' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.MGMTLSNR' on 'rac1'
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac1'
CRS-2677: Stop of 'ora.ASMNET1LSNR_ASM.lsnr' on 'rac1' succeeded
CRS-2677: Stop of 'ora.MGMTLSNR' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.MGMTLSNR' on 'rac2'
CRS-2676: Start of 'ora.MGMTLSNR' on 'rac2' succeeded
CRS-2676: Start of 'ora.cvu' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER_SCAN3.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.rac4.vip' on 'rac1'
CRS-2672: Attempting to start 'ora.LISTENER_SCAN2.lsnr' on 'rac2'
CRS-2672: Attempting to start 'ora.mgmtdb' on 'rac2'
CRS-2677: Stop of 'ora.rac4.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac4.vip' on 'rac2'
CRS-2676: Start of 'ora.rac4.vip' on 'rac2' succeeded
CRS-2675: Stop of 'ora.oc4j' on 'rac1' Failed
CRS-2679: Attempting to clean 'ora.oc4j' on 'rac1'
CRS-2681: Clean of 'ora.oc4j' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.oc4j' on 'rac2'
CRS-2676: Start of 'ora.LISTENER_SCAN3.lsnr' on 'rac2' succeeded
CRS-2676: Start of 'ora.LISTENER_SCAN2.lsnr' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.rac3.vip' on 'rac1'
CRS-2677: Stop of 'ora.rac3.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.rac3.vip' on 'rac2'
CRS-2676: Start of 'ora.rac3.vip' on 'rac2' succeeded
CRS-2676: Start of 'ora.oc4j' on 'rac2' succeeded
CRS-2676: Start of 'ora.mgmtdb' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ons' on 'rac1'
CRS-2677: Stop of 'ora.ons' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.net1.network' on 'rac1'
CRS-2677: Stop of 'ora.net1.network' on 'rac1' succeeded
CRS-2792: Shutdown of Cluster Ready Services-managed resources on 'rac1' has completed
CRS-2677: Stop of 'ora.crsd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac1'
CRS-2673: Attempting to stop 'ora.storage' on 'rac1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac1'
CRS-2677: Stop of 'ora.storage' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.crf' on 'rac1'
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac1'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac1'
CRS-2673: Attempting to stop 'ora.asm' on 'rac1'
CRS-2677: Stop of 'ora.crf' on 'rac1' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac1' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac1'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac1'
CRS-2677: Stop of 'ora.cssd' on 'rac1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac1'
CRS-2677: Stop of 'ora.gipcd' on 'rac1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac1 ~]#
Node 2:
[root@rac2 ~]# crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.crsd' on 'rac2'
CRS-2790: Starting shutdown of Cluster Ready Services-managed resources on 'rac2'
CRS-2673: Attempting to stop 'ora.tar.db' on 'rac2'
CRS-2673: Attempting to stop 'ora.rac3.vip' on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN3.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER_SCAN2.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac2'
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac2'
CRS-2673: Attempting to stop 'ora.DG1.dg' on 'rac2'
CRS-2673: Attempting to stop 'ora.cvu' on 'rac2'
CRS-2673: Attempting to stop 'ora.DG2.dg' on 'rac2'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac2' succeeded
CRS-2677: Stop of 'ora.rac3.vip' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.rac2.vip' on 'rac2'
CRS-2677: Stop of 'ora.cvu' on 'rac2' succeeded
CRS-2677: Stop of 'ora.rac2.vip' on 'rac2' succeeded
CRS-2797: Shutdown is already in progress for 'rac2', waiting for it to complete
CRS-2797: Shutdown is already in progress for 'rac2', waiting for it to complete
CRS-2677: Stop of 'ora.crsd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.storage' on 'rac2'
CRS-2673: Attempting to stop 'ora.crf' on 'rac2'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'rac2'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'rac2'
CRS-2677: Stop of 'ora.storage' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.asm' on 'rac2'
CRS-2677: Stop of 'ora.crf' on 'rac2' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.mdnsd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.asm' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'rac2'
CRS-2673: Attempting to stop 'ora.evmd' on 'rac2'
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'rac2'
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'rac2' succeeded
CRS-2677: Stop of 'ora.ctssd' on 'rac2' succeeded
CRS-2677: Stop of 'ora.evmd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'rac2'
CRS-2677: Stop of 'ora.cssd' on 'rac2' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'rac2'
CRS-2677: Stop of 'ora.gipcd' on 'rac2' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'rac2' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@rac2 ~]#
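In a long shutdown log it is easy to miss a resource that did not stop cleanly, such as ora.oc4j on rac1 in the node 1 output. A minimal sketch that scans a saved log (one message per line) for CRS-2675 stop failures; failed_stops is a hypothetical helper name, and the pattern assumes the message wording shown in this log:

```shell
# Print "resource node" for every CRS-2675 stop failure in a
# `crsctl stop crs -f` log read from stdin.
# Accepts both "Failed" and "failed" at the end of the message.
failed_stops() {
    sed -n "s/^CRS-2675: Stop of '\([^']*\)' on '\([^']*\)' [Ff]ailed.*/\1 \2/p"
}
```

An empty result means no stop failure was logged. A failure followed by a successful CRS-2681 clean, as in this log, did not prevent the shutdown from completing.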
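Clean up leftover processes.
For the ASM instance, switch to the grid user (sudo su - grid) and shut down the instance:
SQL> shutdown abort;
Then clean up the remaining processes at the operating system level:
ps -ef |grep ora|grep -v grep|awk '{print $2}'|xargs kill -9
ps -ef |grep asm|grep -v grep|awk '{print $2}'|xargs kill -9
ps -ef |grep grid|grep -v grep|awk '{print $2}'|xargs kill -9
Perform this cleanup on node 1 and on node 2.
Start the cluster in exclusive mode on one node: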
crsctl start crs -excl -nocrs
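Once it is up, connect to the ASM instance with sqlplus and check the disk groups.
Note: query the disk groups on the same node where exclusive mode was started.
su - grid
sqlplus / as sysasm
select name,state from v$asm_diskgroup;
No disk group information is returned.
Even after a disk group has been destroyed, the instance may still hold leftover information about it, so drop the disk group first:
drop diskgroup dg2 force including contents;
Any error here can be ignored, because disk group DG2 no longer exists.
After the drop, recreate disk group DG2: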
create diskgroup dg2
external redundancy
disk
'/dev/raw/raw4' name nr_1;
After creation, set the disk group compatibility attributes (three statements):
ALTER DISKGROUP dg2 SET ATTRIBUTE 'compatible.asm' = '12.1.0.0.0' ;
ALTER DISKGROUP dg2 SET ATTRIBUTE 'compatible.rdbms' = '12.1.0.0.0';
ALTER DISKGROUP dg2 SET ATTRIBUTE 'compatible.advm' = '12.1.0.0.0';
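The three statements differ only in the attribute name, so they can be generated for any disk group and version. A minimal sketch that prints the statements for review; compat_statements is a hypothetical helper name, and applying the same version to all three attributes is an assumption carried over from this example:

```shell
# Print the three ALTER DISKGROUP statements that set compatible.asm,
# compatible.rdbms and compatible.advm to the same version.
# $1 = disk group name, $2 = version string (for example 12.1.0.0.0)
compat_statements() {
    for attr in asm rdbms advm; do
        printf "ALTER DISKGROUP %s SET ATTRIBUTE 'compatible.%s' = '%s';\n" \
            "$1" "$attr" "$2"
    done
}
```

Running compat_statements dg2 12.1.0.0.0 reproduces the statements above, ready to paste into the sqlplus session connected as sysasm.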
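Query the disk groups again:
sudo su - grid
sqlplus / as sysasm
select name,state from v$asm_diskgroup;
DG2 is now visible.
Next, find the most recent OCR backup:
ocrconfig -showbackup
The most recent one is /taryartar/12c/grid_home/cdata/mycluster/backup00.ocr
Restore the OCR:
ocrconfig -restore /taryartar/12c/grid_home/cdata/mycluster/backup00.ocr
This time it succeeded.
Afterwards, check the OCR status:
ocrcheck
Next, restore the voting disk.
First check the voting disk status:
crsctl query css votedisk
There is currently no voting disk.
Restore the voting disk to DG2:
crsctl replace votedisk +dg2
Check the votedisk status:
crsctl query css votedisk
A voting disk is now listed.
The OCR and the voting disk are both restored, so the cluster can be stopped:
crsctl stop crs -f
Note: the stop is only needed on node 1, because the cluster was started in exclusive mode on node 1 only.
Then restart the cluster in normal mode (on all nodes):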
crsctl start crs
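Run it on node 1 and on node 2.
Check that the cluster services are all healthy:
crsctl check crs
Run it on node 1 and on node 2.
crsctl check cluster -all
Check that the cluster resources are all healthy:
crsctl stat res -t
Check OCR integrity:
su - grid
cluvfy comp ocr -n all -verbose
[grid@rac2 ~]$ cluvfy comp ocr -n all -verbose
Verifying OCR integrity
Checking OCR integrity...
Checking the absence of a non-clustered configuration...
All nodes free of non-clustered, local-only configurations
Checking daemon liveness...
Check: Liveness for "CRS daemon"
  Node Name                             Running?
  ------------------------------------  ------------------------
  rac2                                  yes
  rac1                                  yes
Result: Liveness check passed for "CRS daemon"
Checking OCR config file "/etc/oracle/ocr.loc"...
OCR config file "/etc/oracle/ocr.loc" check successful
Disk group for ocr location "+DG2/mycluster/OCRFILE/registry.255.959643883" is available on all the nodes
Checking OCR dump functionality
OCR dump check passed
NOTE:
This check does not verify the integrity of the OCR contents. Execute 'ocrcheck' as a privileged user to verify the contents of OCR.
OCR integrity check passed
Verification of OCR integrity was successful.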