I've run into a mysterious problem with my failover cluster.
Cluster name: PrintCluster01.domain.com
Members: PrintServer01.domain.com and PrintServer02.domain.com
In Failover Cluster Manager, under Cluster Events, I received critical error messages 1135 and 1177:
    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          15/06/2011 9:07:49 PM
    Event ID:      1177
    Task Category: None
    Level:         Critical
    Keywords:
    User:          SYSTEM
    Computer:      PrintServer01.domain.com
    Description:
    The Cluster service is shutting down because quorum was lost. This could be due to the loss of network connectivity between some or all nodes in the cluster, or a failover of the witness disk.
    Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapter. Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          15/06/2011 9:07:28 PM
    Event ID:      1135
    Task Category: None
    Level:         Critical
    Keywords:
    User:          SYSTEM
    Computer:      PrintServer01.domain.com
    Description:
    Cluster node 'PrintServer02' was removed from the active failover cluster membership. The Cluster service on this node may have stopped. This could also be due to the node having lost communication with other active nodes in the failover cluster.
    Run the Validate a Configuration wizard to check your network configuration. If the condition persists, check for hardware or software errors related to the network adapters on this node. Also check for failures in any other network components to which the node is connected such as hubs, or bridges.
After further investigation I found some interesting errors, starting with the first critical error message logged in Event Viewer on PrintServer02:
    Log Name:      System
    Source:        Tcpip
    Date:          15/06/2011 9:07:29 PM
    Event ID:      4199
    Task Category: None
    Level:         Error
    Keywords:      Classic
    User:          N/A
    Computer:      PrintServer02-VM.domain.com
    Description:
    The system detected an address conflict for IP address 192.168.127.142 with the system having network hardware address 00-50-56-AE-29-23. Network operations on this system may be disrupted as a result.
192.168.127.142 -> a secondary IP address on PrintServer01
How can this address be in conflict when the hardware address belongs to the PrintServer01 node itself? The details are below:
**From PrintServer01**

    Ethernet adapter Local Area Connection* 8:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Microsoft Failover Cluster Virtual Adapter
       Physical Address. . . . . . . . . : 02-50-56-AE-29-23
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 169.254.1.183(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.0.0
       Default Gateway . . . . . . . . . :
       NetBIOS over Tcpip. . . . . . . . : Enabled
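I have double-checked every cluster member, and all IP addresses are currently unique.

I am also certain that my IP addresses are static rather than assigned by DHCP, as the IPCONFIG output shows: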
From **PrintServer01** (the Active Node)

    Windows IP Configuration

       Host Name . . . . . . . . . . . . : PrintServer01
       Primary Dns Suffix . . . . . . . : domain.com
       Node Type . . . . . . . . . . . . : Hybrid
       IP Routing Enabled. . . . . . . . : No
       WINS Proxy Enabled. . . . . . . . : No
       DNS Suffix Search List. . . . . . : domain.com
                                           domain.com.au

    Ethernet adapter Local Area Connection* 8:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Microsoft Failover Cluster Virtual Adapter
       Physical Address. . . . . . . . . : 02-50-56-AE-29-23
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 169.254.1.183(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.0.0
       Default Gateway . . . . . . . . . :
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter Cluster Public Network:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Intel® PRO/1000 MT Network Connection
       Physical Address. . . . . . . . . : 00-50-56-AE-29-23
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 192.168.127.155(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       IPv4 Address. . . . . . . . . . . : 192.168.127.88(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       IPv4 Address. . . . . . . . . . . : 192.168.127.142(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       IPv4 Address. . . . . . . . . . . : 192.168.127.143(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       IPv4 Address. . . . . . . . . . . : 192.168.127.144(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . : 192.168.127.254
       DNS Servers . . . . . . . . . . . : 192.168.127.10
                                           192.168.127.11
       Primary WINS Server . . . . . . . : 192.168.127.10
       Secondary WINS Server . . . . . . : 192.168.127.11
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter Cluster Private Network:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Intel® PRO/1000 MT Network Connection #2
       Physical Address. . . . . . . . . : 00-50-56-AE-43-EC
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 10.184.2.2(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . :
       NetBIOS over Tcpip. . . . . . . . : Disabled

From **PrintServer02**

    Windows IP Configuration

       Host Name . . . . . . . . . . . . : PrintServer02
       Primary Dns Suffix . . . . . . . : domain.com
       Node Type . . . . . . . . . . . . : Hybrid
       IP Routing Enabled. . . . . . . . : No
       WINS Proxy Enabled. . . . . . . . : No
       DNS Suffix Search List. . . . . . : domain.com
                                           domain.com.au

    Ethernet adapter Local Area Connection* 8:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Microsoft Failover Cluster Virtual Adapter
       Physical Address. . . . . . . . . : 02-50-56-AE-5F-E5
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 169.254.2.86(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.0.0
       Default Gateway . . . . . . . . . :
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter Cluster Public Network:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Intel® PRO/1000 MT Network Connection
       Physical Address. . . . . . . . . : 00-50-56-AE-79-FA
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 192.168.127.172(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       IPv4 Address. . . . . . . . . . . : 192.168.127.119(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . : 192.168.127.254
       DNS Servers . . . . . . . . . . . : 192.168.127.10
                                           192.168.127.11
       Primary WINS Server . . . . . . . : 192.168.127.11
       Secondary WINS Server . . . . . . : 192.168.127.10
       NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter Cluster Private Network:

       Connection-specific DNS Suffix . :
       Description . . . . . . . . . . . : Intel® PRO/1000 MT Network Connection #2
       Physical Address. . . . . . . . . : 00-50-56-AE-77-8D
       DHCP Enabled. . . . . . . . . . . : No
       Autoconfiguration Enabled . . . . : Yes
       IPv4 Address. . . . . . . . . . . : 10.184.2.3(Preferred)
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . :
       NetBIOS over Tcpip. . . . . . . . : Disabled
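One way to cross-check the hardware address reported in Event 4199 is to look it up in the `ipconfig /all` output. The following is only a minimal Python sketch: the helper names `parse_adapters` and `find_owner` are hypothetical, the input is an abbreviated excerpt of the PrintServer01 output above, and the parsing assumes the English `ipconfig` label text.

```python
import re

# Abbreviated excerpt of the PrintServer01 `ipconfig /all` output (assumed English labels).
IPCONFIG_TEXT = """\
Ethernet adapter Local Area Connection* 8:
   Physical Address. . . . . . . . . : 02-50-56-AE-29-23
   IPv4 Address. . . . . . . . . . . : 169.254.1.183(Preferred)
Ethernet adapter Cluster Public Network:
   Physical Address. . . . . . . . . : 00-50-56-AE-29-23
   IPv4 Address. . . . . . . . . . . : 192.168.127.155(Preferred)
   IPv4 Address. . . . . . . . . . . : 192.168.127.142(Preferred)
Ethernet adapter Cluster Private Network:
   Physical Address. . . . . . . . . : 00-50-56-AE-43-EC
   IPv4 Address. . . . . . . . . . . : 10.184.2.2(Preferred)
"""


def parse_adapters(text):
    """Return {adapter name: {"mac": str or None, "ips": [str, ...]}}."""
    adapters = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"^Ethernet adapter (.+):\s*$", line)
        if header:
            # Start collecting properties for a new adapter section.
            current = adapters.setdefault(header.group(1), {"mac": None, "ips": []})
            continue
        if current is None:
            continue  # Ignore lines that appear before the first adapter header.
        mac = re.search(r"Physical Address[ .]*: ([0-9A-Fa-f-]{17})", line)
        if mac:
            current["mac"] = mac.group(1).upper()
        ip = re.search(r"IPv4 Address[ .]*: (\d+\.\d+\.\d+\.\d+)", line)
        if ip:
            current["ips"].append(ip.group(1))
    return adapters


def find_owner(adapters, mac):
    """Return the name of the adapter with the given MAC, or None if absent."""
    for name, info in adapters.items():
        if info["mac"] == mac.upper():
            return name
    return None


adapters = parse_adapters(IPCONFIG_TEXT)
owner = find_owner(adapters, "00-50-56-AE-29-23")  # MAC taken from Event 4199
print(owner, adapters[owner]["ips"] if owner else [])
```

With this excerpt, the MAC from Event 4199 resolves to the Cluster Public Network adapter on PrintServer01, which also holds 192.168.127.142. The Microsoft Failover Cluster Virtual Adapter has a similar-looking address that differs in the first octet (02 instead of 00), so the two are easy to confuse.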
Any help would be greatly appreciated.
Thanks,
AWT
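IP address conflict errors occur when more than one node in the cluster tries to bring a resource group (and its associated IP addresses) online at the same time.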
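This can happen if the cluster nodes briefly lose contact with each other. Each node assumes the other has failed, so the "passive" node brings all resource groups online while they are in fact still online on the "active" node.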
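The event timestamps quoted in the question fit this sequence. As a rough illustration, the Python sketch below sorts the three events by time. It assumes the dates are in dd/mm/yyyy format and that the clocks on both nodes are reasonably in sync; the `build_timeline` helper and the short summaries are my own, not part of the logs.

```python
from datetime import datetime

# (host, event ID, timestamp as logged, short summary) copied from the question.
EVENTS = [
    ("PrintServer01", 1177, "15/06/2011 9:07:49 PM", "quorum lost, Cluster service shutting down"),
    ("PrintServer01", 1135, "15/06/2011 9:07:28 PM", "PrintServer02 removed from cluster membership"),
    ("PrintServer02", 4199, "15/06/2011 9:07:29 PM", "address conflict for 192.168.127.142"),
]


def build_timeline(events):
    """Parse the timestamps and return the events sorted chronologically."""
    parsed = [
        (datetime.strptime(ts, "%d/%m/%Y %I:%M:%S %p"), host, event_id, summary)
        for host, event_id, ts, summary in events
    ]
    return sorted(parsed)


timeline = build_timeline(EVENTS)
start = timeline[0][0]
for when, host, event_id, summary in timeline:
    offset = (when - start).total_seconds()
    print(f"+{offset:>4.0f}s  {host:<14} {event_id}  {summary}")
```

Sorted this way, the membership loss (1135) comes first, the address conflict on PrintServer02 (4199) follows one second later, and the quorum loss (1177) is logged about twenty seconds after that, which is the order you would expect if the nodes lost contact and both tried to own the same IP address.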
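I have seen this problem in VMware environments when one of the ESX(i) hosts is overloaded. Sometimes even an HBA bus rescan is enough: the MSCS nodes suddenly lose contact with each other and this chaos follows.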