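CentOS is a community rebuild of RedHat, so it uses the same cluster suite as RedHat ES: RHCS. In the RHCS architecture the fence device is critical, because failure recovery and data consistency both depend on it. Here we use the IPMI device provided on the servers' motherboards to implement fencing.
IPMI is a protocol for remotely querying and controlling the basic state of a server, and most mainstream server products today provide an implementation of it. On Linux, the OpenIPMI package can be used to configure and use IPMI.
Start the IPMI service: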
[zarra@node1 ~]$ sudo service ipmi start
Starting ipmi drivers: [  OK  ]
Set the IPMI interface's network address and netmask, and turn on its ARP replies (the default gateway is set the same way, with `lan set 1 defgw ipaddr`):
[zarra@node1 ~]$ sudo ipmitool lan set 1 ipsrc static
[zarra@node1 ~]$ sudo ipmitool lan set 1 ipaddr 10.0.0.10
[zarra@node1 ~]$ sudo ipmitool lan set 1 netmask 255.255.255.0
[zarra@node1 ~]$ sudo ipmitool lan set 1 arp respond on
[zarra@node1 ~]$ sudo ipmitool lan set 1 arp generate on
[zarra@node1 ~]$ sudo ipmitool lan set 1 arp interval 5
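The same LAN settings have to be applied on every node, each with its own address, so it can help to wrap them in a small helper. The following is a minimal sketch, assuming channel 1 as in the transcript above. The function name `ipmi_lan_commands` is made up for illustration; it only prints the commands, so they can be reviewed before anything touches the BMC.

```shell
#!/bin/sh
# Hypothetical helper (not part of ipmitool): print the ipmitool commands
# that configure a BMC LAN channel, without running them.
# Usage: ipmi_lan_commands CHANNEL IPADDR NETMASK [GATEWAY]
ipmi_lan_commands() {
    chan=$1
    ip=$2
    mask=$3
    gw=${4:-}
    echo "ipmitool lan set $chan ipsrc static"
    echo "ipmitool lan set $chan ipaddr $ip"
    echo "ipmitool lan set $chan netmask $mask"
    # The gateway is optional; skip the command when none is given.
    if [ -n "$gw" ]; then
        echo "ipmitool lan set $chan defgw ipaddr $gw"
    fi
    echo "ipmitool lan set $chan arp respond on"
    echo "ipmitool lan set $chan arp generate on"
    echo "ipmitool lan set $chan arp interval 5"
}

# Review the commands for node1's BMC:
#   ipmi_lan_commands 1 10.0.0.10 255.255.255.0 10.10.71.240
# Apply them once they look right:
#   ipmi_lan_commands 1 10.0.0.10 255.255.255.0 10.10.71.240 | sudo sh
```

Printing first and piping to `sudo sh` second keeps a typo in an address from locking you out of the management interface.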
Set the IPMI user and password:
[zarra@node1 ~]$ sudo ipmitool lan set 1 user
[zarra@node1 ~]$ sudo ipmitool lan set 1 access on
[zarra@node1 ~]$ sudo ipmitool user list
ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
2   root             true    true       true       ADMINISTRATOR
[zarra@node1 ~]$ sudo ipmitool user set password 2
Password for user 2: Enter your password
Password for user 2: Enter your password
With the IPMI address and user configured, you can check the server's basic state over the network, for example:
[zarra@node1 ~]$ sudo ipmitool -H 10.0.0.10 -I lan -U root -P password power status
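Since fencing only works if each node can reach the other node's BMC, it is worth running this check against both BMC addresses. Below is a rough sketch under a few assumptions: the helper names `power_state` and `check_bmc` are invented here, the user is `root` as above, and the password is passed through the `IPMI_PASSWORD` environment variable (which ipmitool reads when given `-E`) rather than on the command line.

```shell
#!/bin/sh
# Hypothetical helper: reduce the text printed by "ipmitool ... power status"
# to a single word: on, off, or unknown.
power_state() {
    case $1 in
        *"Chassis Power is on"*)  echo on ;;
        *"Chassis Power is off"*) echo off ;;
        *)                        echo unknown ;;
    esac
}

# Query one BMC over the LAN interface and report its state.
# IPMI_PASSWORD must be exported first; -E makes ipmitool read the
# password from that variable.
check_bmc() {
    out=$(ipmitool -I lan -H "$1" -U root -E power status 2>&1)
    echo "$1: $(power_state "$out")"
}

# Example, run from either node:
#   export IPMI_PASSWORD=...
#   for bmc in 10.0.0.10 10.0.0.11; do check_bmc "$bmc"; done
```

An `unknown` result usually means the BMC could not be reached or the login failed, which is exactly the situation in which a fence operation would also fail.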
That is essentially the whole IPMI setup. The IPMI parameters of the two servers are as follows:
[zarra@node1 ~]$ sudo ipmitool lan print 1
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5
                        : User     : MD2 MD5
                        : Operator : MD2 MD5
                        : Admin    : MD2 MD5
                        : OEM      : MD2 MD5
IP Address Source       : Static Address
IP Address              : 10.0.0.10
Subnet Mask             : 255.255.255.0
MAC Address             : 00:22:19:d6:05:38
SNMP Community String   : public
IP Header               : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
Default Gateway IP      : 10.10.71.240
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
Cipher Suite Priv Max   : Not Available
[zarra@node2 ~]$ sudo ipmitool lan print 1
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5
                        : User     : MD2 MD5
                        : Operator : MD2 MD5
                        : Admin    : MD2 MD5
                        : OEM      : MD2 MD5
IP Address Source       : Static Address
IP Address              : 10.0.0.11
Subnet Mask             : 255.255.255.0
MAC Address             : 00:22:19:d5:f4:61
SNMP Community String   : public
IP Header               : TTL=0x40 Flags=0x40 Precedence=0x00 TOS=0x10
Default Gateway IP      : 10.10.71.240
Default Gateway MAC     : 00:00:00:00:00:00
Backup Gateway IP       : 0.0.0.0
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : Disabled
802.1q VLAN Priority    : 0
Cipher Suite Priv Max   : Not Available
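The BMC addresses printed here are the ones the fence devices in cluster.conf must point at, so it can be handy to pull a single field out of this output instead of reading it by eye. A small sketch follows; `lan_field` is a name made up for this example, and it assumes the `Label : value` layout shown above.

```shell
#!/bin/sh
# Hypothetical helper: read "ipmitool lan print" output on stdin and print
# the value of one field. The label must match exactly, so "IP Address"
# does not also match "IP Address Source".
lan_field() {
    awk -F: -v want="$1" '
        {
            label = $1
            sub(/[ \t]+$/, "", label)          # drop padding before the colon
            if (label == want) {
                value = substr($0, index($0, ":") + 1)
                sub(/^[ \t]+/, "", value)      # drop the space after the colon
                print value
                exit
            }
        }'
}

# Example:
#   sudo ipmitool lan print 1 | lan_field "IP Address"
#   sudo ipmitool lan print 1 | lan_field "MAC Address"
```

The value is taken from everything after the first colon, so fields that contain colons themselves, such as the MAC address, come back intact.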
Edit the /etc/hosts file on both nodes so that it reads as follows:
[zarra@node1 ~]$ cat /etc/hosts
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1 localhost.localdomain localhost
10.0.0.1 node1.test.com node1
10.0.0.2 node2.test.com node2
10.10.71.42 cluster.test.com cluster
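The node names used later in cluster.conf (node1.test.com and node2.test.com) have to resolve to the same addresses on both nodes, so a quick lookup against the file on each machine is a cheap sanity check. This is only a sketch; `hosts_lookup` is a hypothetical helper that reads the file directly rather than going through the system resolver.

```shell
#!/bin/sh
# Hypothetical helper: print the address that a hosts-format file assigns
# to a name (canonical name or alias), ignoring comments.
# Usage: hosts_lookup NAME [FILE]    (FILE defaults to /etc/hosts)
hosts_lookup() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                     # strip comments
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }' "${2:-/etc/hosts}"
}

# Example, run on each node and compare the results:
#   for n in node1.test.com node2.test.com cluster.test.com; do
#       echo "$n -> $(hosts_lookup "$n")"
#   done
```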
Network settings on node 1:
[zarra@node1 ~]$ cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=localhost.localdomain
GATEWAY=10.10.71.240
[zarra@node1 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express
DEVICE=eth0
BOOTPROTO=static
HWADDR=00:22:19:D6:05:36
ONBOOT=yes
NETMASK=255.255.255.0
IPADDR=10.0.0.1
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
[zarra@node1 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth1
# Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:22:19:D6:05:37
ONBOOT=yes
IPADDR=10.10.71.40
NETMASK=255.255.255.0
TYPE=Ethernet
Network settings on node 2:
[zarra@node2 ~]$ cat /etc/sysconfig/network
NETWORKING=yes
NETWORKING_IPV6=no
HOSTNAME=localhost.localdomain
GATEWAY=10.10.71.240
[zarra@node2 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth0
# Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express
DEVICE=eth0
BOOTPROTO=none
HWADDR=00:22:19:D5:F4:5F
ONBOOT=yes
IPADDR=10.0.0.2
NETMASK=255.255.255.0
TYPE=Ethernet
USERCTL=no
IPV6INIT=no
[zarra@node2 ~]$ cat /etc/sysconfig/network-scripts/ifcfg-eth1
# Broadcom Corporation NetXtreme BCM5721 Gigabit Ethernet PCI Express
DEVICE=eth1
BOOTPROTO=static
HWADDR=00:22:19:D5:F4:60
IPADDR=10.10.71.41
NETMASK=255.255.255.0
ONBOOT=yes
TYPE=Ethernet
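With two interfaces per node (eth0 on the 10.0.0.0/24 network, eth1 on 10.10.71.0/24), it is easy to mix up a file. The sketch below reads individual settings back out of an ifcfg file so the two nodes can be compared side by side; `ifcfg_get` is a made-up name, and it assumes plain `KEY=value` lines like the ones shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the value of KEY from an ifcfg-style file.
# If the key appears more than once, the last assignment wins, as it would
# when the file is sourced by the network scripts. Surrounding double
# quotes are removed.
# Usage: ifcfg_get KEY FILE
ifcfg_get() {
    sed -n "s/^$1=//p" "$2" | tail -n 1 | tr -d '"'
}

# Example:
#   for dev in eth0 eth1; do
#       f=/etc/sysconfig/network-scripts/ifcfg-$dev
#       echo "$dev $(ifcfg_get IPADDR "$f") $(ifcfg_get NETMASK "$f") onboot=$(ifcfg_get ONBOOT "$f")"
#   done
```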
If Clustering support was selected during system installation, the RHCS cluster suite is installed automatically. To install it by hand, run:
[zarra@node1 ~]$ sudo yum groupinstall clustering
After installation, check the ricci service; its status should look like this:
[zarra@node1 ~]$ chkconfig ricci --list
ricci 0:off 1:off 2:on 3:on 4:on 5:on 6:off
If it is not enabled at boot, run:
[zarra@node1 ~]$ sudo chkconfig ricci on
Then start the ricci service:
[zarra@node1 ~]$ sudo service ricci start
Repeat the same steps on node2 to complete the installation of the cluster suite.
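Because the steps are identical on every node, they can be generated once and run per host. The following is a sketch only: it assumes you can ssh to each node as a user with sudo rights, which the original setup does not state, and `ricci_setup_commands` is a name invented for this example. It prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: print the commands that install the cluster suite
# and enable ricci on one node, wrapped in ssh so they can be run remotely.
# ssh -t allocates a terminal so that sudo can prompt for a password.
ricci_setup_commands() {
    node=$1
    for cmd in "yum -y groupinstall clustering" \
               "chkconfig ricci on" \
               "service ricci start"; do
        echo "ssh -t $node sudo $cmd"
    done
}

# Example:
#   for n in node1.test.com node2.test.com; do ricci_setup_commands "$n"; done
```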
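RedHat ES 5.0 provides Conga, a web-based cluster configuration tool that makes cluster setup easier. Conga is best installed on a server outside the cluster. On the server where Conga is to be installed, run: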
[zarra@localhost ~]$ sudo yum install luci
After installation, set the password for luci's admin user:
[zarra@localhost ~]$ sudo luci_admin init
Follow the prompts to set the password. Once that is done, enable and start the luci service:
[zarra@localhost ~]$ sudo chkconfig luci on
[zarra@localhost ~]$ sudo service luci start
Starting luci: [  OK  ]
Point your web browser to https://node1.test.com:8084 to access luci
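As the message says, log in at https://node1.test.com:8084 to configure the cluster.
Open a web browser and log in to the Conga server, as shown in the figure:
Go to the "Cluster" tab and click the "Create a New Cluster" button to create the cluster, as shown in the figure:
Click the "Add a Failover Domain" button to create a failover domain, as shown in the figure:
Click the "Add a Resource" button and create two IP address resources. When finished, the Resource page looks as shown in the figure:
Click the "Add a Service" button to create the iptables service. When finished, it should look as shown in the figure:
Configure the fence device for each node in turn. When finished, it looks as shown in the figure: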
Node1
Node2
Once configuration is complete, cluster.conf should look like this:
[zarra@node1 ~]$ sudo cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="cluster_test" config_version="17" name="cluster_test">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="node1.test.com" nodeid="1" votes="1">
            <fence>
                <method name="1">
                    <device name="node1_ipmi"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="node2.test.com" nodeid="2" votes="1">
            <fence>
                <method name="1">
                    <device name="node2_ipmi"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <cman expected_votes="1" two_node="1"/>
    <fencedevices>
        <fencedevice agent="fence_ipmilan" ipaddr="10.0.0.11" login="root" name="node2_ipmi" passwd="adaptor"/>
        <fencedevice agent="fence_ipmilan" ipaddr="10.0.0.10" login="root" name="node1_ipmi" passwd="adaptor"/>
    </fencedevices>
    <rm>
        <failoverdomains>
            <failoverdomain name="firewall" nofailback="0" ordered="1" restricted="1">
                <failoverdomainnode name="node1.test.com" priority="1"/>
                <failoverdomainnode name="node2.test.com" priority="10"/>
            </failoverdomain>
        </failoverdomains>
        <resources>
            <ip address="10.10.71.42" monitor_link="1"/>
            <ip address="10.0.0.4" monitor_link="1"/>
        </resources>
        <service autostart="1" domain="firewall" exclusive="0" name="iptables" recovery="relocate">
            <ip ref="10.10.71.42"/>
            <ip ref="10.0.0.4"/>
        </service>
    </rm>
</cluster>
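In this file each `<clusternode>` refers to its fence device by name, and that name has to match a `<fencedevice>` entry whose `ipaddr` is the node's BMC address. A mismatch here tends to show up only when a node actually needs to be fenced, so a quick cross-check is worthwhile. The sketch below is an assumption-laden illustration: `unresolved_fence_devices` is a made-up helper, it uses plain text matching rather than a real XML parser, and it expects one element per line as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: list fence device names that a <clusternode> refers
# to but that have no matching <fencedevice> entry. Prints nothing when
# every reference resolves.
# Usage: unresolved_fence_devices [FILE]   (default /etc/cluster/cluster.conf)
unresolved_fence_devices() {
    conf=${1:-/etc/cluster/cluster.conf}
    # Collect the name="..." of every <device .../> line ...
    sed -n 's/.*<device [^>]*name="\([^"]*\)".*/\1/p' "$conf" |
    while read -r dev; do
        # ... and report it if no <fencedevice> carries the same name.
        grep -q "<fencedevice [^>]*name=\"$dev\"" "$conf" || echo "$dev"
    done
}

# Example:
#   sudo sh -c '. ./fence_check.sh; unresolved_fence_devices'
# The fence agent itself can also be exercised by hand before relying on it,
# for instance (password elided):
#   fence_ipmilan -a 10.0.0.10 -l root -p ... -o status
```

The file name `fence_check.sh` in the example is likewise only a placeholder for wherever the function is saved.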