一些版本信息:
Operating system is Ubuntu 11.10,on EC2,kernel is 3.0.0-16-virtual and the application info is: Version: 8.3.11 (api:88) GIT-hash: 0de839cee13a4160eed6037c4bddd066645e23c5 build by buildd@allspice,2011-07-05 19:51:07
在dmesg(见下文)中获得一些奇怪的错误,没有复制发生.我已将我的第一个节点设为主节点并显示:
drbd driver loaded OK; device status: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 m:res cs ro ds p mounted fstype 0:r0 StandAlone Primary/Unknown UpToDate/DUnknown r----s ext3
我的辅助节点显示:
drbd driver loaded OK; device status: version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 m:res cs ro ds p mounted fstype 0:r0 StandAlone Secondary/Unknown Inconsistent/DUnknown r----s
version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----s ns:0 nr:0 dw:4 dr:1073 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:262135964
在奴隶上显示/ proc / drbd表明没有任何东西被转移……
version: 8.3.11 (api:88/proto:86-96) srcversion: DA5A13F16DE6553FC7CE9B2 0: cs:StandAlone ro:Secondary/Unknown ds:Inconsistent/DUnknown r----s ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:262135964
这是我的配置……
resource r0 { protocol C; startup { wfc-timeout 15; degr-wfc-timeout 60; } net { cram-hmac-alg sha1; shared-secret "test123; } on drbd01 { device /dev/drbd0; disk /dev/xvdm; address 23.XX.XX.XX:7788; # blocked out ip Meta-disk internal; } on drbd02 { device /dev/drbd0; disk /dev/xvdm; address 184.XX.XX.XX:7788; #blocked out ip Meta-disk internal; } }
我在主人身上运行了以下内容:
sudo drbdadm -- --overwrite-data-of-peer primary all
系统之间没有防火墙.
这是dmesg有一些错误:
[2285172.969955] drbd: initialized. Version: 8.3.11 (api:88/proto:86-96) [2285172.969960] drbd: srcversion: DA5A13F16DE6553FC7CE9B2 [2285172.969962] drbd: registered as block device major 147 [2285172.969965] drbd: minor_table @ 0xffff88000276ea00 [2285173.000952] block drbd0: Starting worker thread (from drbdsetup [1300]) [2285173.003971] block drbd0: disk( Diskless -> Attaching ) [2285173.006150] block drbd0: No usable activity log found. [2285173.006154] block drbd0: Method to ensure write ordering: flush [2285173.006158] block drbd0: max BIO size = 4096 [2285173.006165] block drbd0: drbd_bm_resize called with capacity == 524271928 [2285173.008512] block drbd0: resync bitmap: bits=65533991 words=1023969 pages=2000 [2285173.008518] block drbd0: size = 250 GB (262135964 KB) [2285173.079566] block drbd0: bitmap READ of 2000 pages took 17 jiffies [2285173.081189] block drbd0: recounting of set bits took additional 1 jiffies [2285173.081194] block drbd0: 250 GB (65533991 bits) marked out-of-sync by on disk bit-map. [2285173.081203] block drbd0: Suspended AL updates [2285173.081210] block drbd0: disk( Attaching -> UpToDate ) [2285173.081214] block drbd0: attached to UUIDs 1C1291D39584C1D1:0000000000000004:0000000000000000:0000000000000000 [2285173.095016] block drbd0: conn( StandAlone -> Unconnected ) [2285173.095046] block drbd0: Starting receiver thread (from drbd0_worker [1301]) [2285173.099297] block drbd0: receiver (re)started [2285173.099304] block drbd0: conn( Unconnected -> WFConnection ) [2285173.099330] block drbd0: bind before connect Failed,err = -99 [2285173.099346] block drbd0: conn( WFConnection -> Disconnecting ) [2285173.295788] block drbd0: Discarding network configuration. [2285173.295815] block drbd0: Connection closed [2285173.295826] block drbd0: conn( Disconnecting -> StandAlone ) [2285173.295840] block drbd0: receiver terminated [2285173.295844] block drbd0: Terminating drbd0_receiver
编辑:
阅读其他一些类似的问题,有人建议做一个’drbdadm dump all’,所以我认为它不会受到伤害.
ubuntu@drbd01:~$drbdadm dump all /etc/drbd.conf:19: in resource r0,on drbd01: IP 23.XX.XX.XX not found on this host.
和奴隶:
root@drbd02:~# drbdadm dump all /etc/drbd.conf:25: in resource r0,on drbd02: IP 184.XX.XX.XX not found on this host.
奇怪的是它没有找到自己的ip,但是,这是一个使用弹性IP的Amazon EC2系统…这里是我的ipconfigs两者…
主:
ubuntu@drbd01:~$ifconfig eth0 Link encap:Ethernet HWaddr 22:00:0a:1c:27:11 inet addr:10.28.39.17 Bcast:10.28.39.63 Mask:255.255.255.192 inet6 addr: fe80::2000:aff:fe1c:2711/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:1569 errors:0 dropped:0 overruns:0 frame:0 TX packets:1169 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:124409 (124.4 KB) TX bytes:213601 (213.6 KB) Interrupt:26 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
奴隶:
root@drbd02:~# ifconfig eth0 Link encap:Ethernet HWaddr 12:31:3f:00:14:9d inet addr:10.160.27.107 Bcast:10.160.27.255 Mask:255.255.254.0 inet6 addr: fe80::1031:3fff:fe00:149d/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:915 errors:0 dropped:0 overruns:0 frame:0 TX packets:774 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:75381 (75.3 KB) TX bytes:109673 (109.6 KB) Interrupt:26 lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
解决方法
你实际上不需要运行sudo drbdadm – –overwrite-data-of-peer primary all
只要/ dev / drbd您应该做到以下几点
Step01)sudo service MysqL在DRBD Primary上停止,因此不会为DRBD同步进行额外的更改
Step02)sudo drbdadm连接所有DRBD Secondary
Step03)DRBD Secondary上的sudo cat / proc / drbd以确保连接统计信息是WFConnection
Step04)sudo drbdadm连接所有DRBD Primary
步骤05)在DRBD主服务器上执行sudo cat / proc / drbd以确保连接状态为SyncTarget.
Step06)sudo服务MysqL停在DRBD主服务器上,所以MysqL可以恢复.同步将继续.您不必等待DRBD在步骤05中完全同步.
警告
不应在地理距离上使用DRBD.我使用通过CrossOver电缆通过192.168.x.x连接DRBD对的设置.