heartbeat的作用就可以增加drbd的可用性,它能在某节点故障后,自动切换drbd块到备份节点,并自动进行虚IP从新绑定,DRBD块提权,磁盘挂载以及启动NFS等脚本操作,这一系列操作因为只在他后端节点间完成,前端用户访问的是heartbeat的虚IP,所以对用户来说无任何感知。
一、配置DBRS ,请查看DBRS配置篇 ,就在下面
二、Heartbeat配置
1、 安装heartbeat(CentOS6.3中默认不带有Heartbeat包,因此需要从第三方下载
(Master , Slave 同样操作) [root@localhost ~]#wget ftp://mirror.switch.ch/pool/4/mirror/scientificlinux/6rolling/i386/os/Packages/epel-release-6-5.noarch.rpm //如果版本找不到 , 可以到ftp://mirror.switch.ch/pool/选择新的版本 [root@localhost ~]#rpm –ivUh epel-release-6-5.noarch.rpm [root@localhost ~]#yum –enablerepo=epel install heartbeat –y
2、 配置beartbeat
(Master)
1) 配置ha.cf配置文件
[root@localhost ~]#vim /etc/ha.d/ha.cf
#日志 logfile /var/log/ha-log logfacility local0 #心跳检测时间 keepalive 2 #死亡四件 deadtime 6 #制定对方IP , 这里的网卡需要对应 ucast eth0 192.168.232.165 #服务器正常后由主服务器接口资源 , 另一台放弃资源 auto_failback off #自定义节点 node Master Slave
2)认证文件authkeys用于配置心跳的加密方式,该文件主要是用于集群中两个节点的认证,采用的算法和密钥在集群中节点上必须相同,目前提供了3种算法:md5,sha1和crc。其中crc不能够提供认证,它只能够用于校验数据包是否损坏,而sha1,md5需要一个密钥来进行认证。
[root@localhost ~]# vim /etc/ha.d/ha.cf [root@localhost ~]# dd if=/dev/random bs=512 count=1 | openssl md5 0+1 records in 0+1 records out 20 bytes (20 B) copied, 0.000126508 s, 158 kB/s (stdin)= 12f3afe7de1c1a9948444feeacf11479 //获得的md5加密的随机码 [root@localhost ~]# vim /etc/had.d/authkeys auth 3 3 md5 12f3afe7de1c1a9948444feeacf11479 //authkeys文件需要更改权限为600 , 否则heartbeat启动会失败 [root@localhost ~]#chmod 600 /etc/had.d/authkeys
3)生成集群资源文件haresource
[root@localhost ~]# vim /etc/ha.d/haresources Master IPaddr::192.168.232.254/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::ext3 killnfsd
注:
该文件内IPaddr,Filesystem等脚本存放路径在/etc/ha.d/resource.d/下,也可在该目录下存放服务启动脚本(例如:mysql,www),将相同脚本名称添加到/etc/ha.d/haresources内容中,从而跟随heartbeat启动而启动该脚本。
IPaddr::192.168.7.90/24/eth0:用IPaddr脚本配置浮动VIP
drbddisk::r0:用drbddisk脚本实现DRBD主从节点资源组的挂载和卸载
Filesystem::/dev/drbd0::/data::ext3:用Filesystem脚本实现磁盘挂载和卸载
生成killnfsd文件 , 如果不存在nfs系统请运行yum install nfs -y
[root@localhost ~]#vim /etc/ha.d/resource.d/killnfsd killall -9 nfsd ; /etc/init.d/nfs restart;exit 0; [root@localhost ~]#chmod 755 /etc/ha.d/resource.d/killnfsd
4)生成drbddisk启动脚本
[root@localhost ~]# vim /etc/ha.d/resource.d/drbddisk
[root@localhost ~]# vim /etc/ha.d/ha.cf [root@localhost ~]# dd if=/dev/random bs=512 count=1 | openssl md5 0+1 records in 0+1 records out 20 bytes (20 B) copied, 0.000126508 s, 158 kB/s (stdin)= 12f3afe7de1c1a9948444feeacf11479 //获得的md5加密的随机码 [root@localhost ~]# vim /etc/had.d/authkeys auth 3 3 md5 12f3afe7de1c1a9948444feeacf11479 //authkeys文件需要更改权限为600 , 否则heartbeat启动会失败 [root@localhost ~]#chmod 600 /etc/had.d/authkeys #!/bin/bash # # This script is inteded to be used as resource script by heartbeat # # Copright 2003-2008 LINBIT Information Technologies # Philipp Reisner, Lars Ellenberg # ### DEFAULTFILE="/etc/default/drbd" DRBDADM="/sbin/drbdadm" if [ -f $DEFAULTFILE ]; then . $DEFAULTFILE fi if [ "$#" -eq 2 ]; then RES="$1" CMD="$2" else RES="all" CMD="$1" fi ## EXIT CODES # since this is a "legacy heartbeat R1 resource agent" script, # exit codes actually do not matter that much as long as we conform to # http://wiki.linux-ha.org/HeartbeatResourceAgent # but it does not hurt to conform to lsb init-script exit codes, # where we can. # http://refspecs.linux-foundation.org/LSB_3.1.0/ #LSB-Core-generic/LSB-Core-generic/iniscrptact.html #### drbd_set_role_from_proc_drbd() { local out if ! test -e /proc/drbd; then ROLE="Unconfigured" return fi dev=$( $DRBDADM sh-dev $RES ) minor=${dev#/dev/drbd} if [[ $minor = *[!0-9]* ]] ; then # sh-minor is only supported since drbd 8.3.1 minor=$( $DRBDADM sh-minor $RES ) fi if [[ -z $minor ]] || [[ $minor = *[!0-9]* ]] ; then ROLE=Unknown return fi if out=$(sed -ne "/^ *$minor: cs:/ { s/:/ /g; p; q; }" /proc/drbd); then set -- $out ROLE=${5%/**} : ${ROLE:=Unconfigured} # if it does not show up else ROLE=Unknown fi } case "$CMD" in start) # try several times, in case heartbeat deadtime # was smaller than drbd ping time try=6 while true; do $DRBDADM primary $RES && break let "--try" || exit 1 # LSB generic error sleep 1 done ;; stop) # heartbeat (haresources mode) will retry failed stop # for a number of times in addition to this internal retry. try=3 while true; do $DRBDADM secondary $RES && break # We used to lie here, and pretend success for anything != 11, # to avoid the reboot on failed stop recovery for "simple # config errors" and such. But that is incorrect. # Don't lie to your cluster manager. # And don't do config errors... let --try || exit 1 # LSB generic error sleep 1 done ;; status) if [ "$RES" = "all" ]; then echo "A resource name is required for status inquiries." exit 10 fi ST=$( $DRBDADM role $RES ) ROLE=${ST%/**} case $ROLE in Primary|Secondary|Unconfigured) # expected ;; *) # unexpected. whatever... # If we are unsure about the state of a resource, we need to # report it as possibly running, so heartbeat can, after failed # stop, do a recovery by reboot. # drbdsetup may fail for obscure reasons, e.g. if /var/lock/ is # suddenly readonly. So we retry by parsing /proc/drbd. drbd_set_role_from_proc_drbd esac case $ROLE in Primary) echo "running (Primary)" exit 0 # LSB status "service is OK" ;; Secondary|Unconfigured) echo "stopped ($ROLE)" exit 3 # LSB status "service is not running" ;; *) # NOTE the "running" in below message. # this is a "heartbeat" resource script, # the exit code is _ignored_. echo "cannot determine status, may be running ($ROLE)" exit 4 # LSB status "service status is unknown" ;; esac ;; *) echo "Usage: drbddisk [resource] {start|stop|status}" exit 1 ;; esac exit 0
[root@localhost ~]# chmod 755 /etc/ha.d/resource/drbddisk
5)将上面新建的文件复制给Slave文件
[root@localhost ~]# scp /etc/ha.d/ha.cf root@192.168.232.165:/etc/ha.d/ [root@localhost ~]# scp /etc/had.d/authkeys root@192.168.232.165:/etc/ha.d/ [root@localhost ~]# scp /etc/ha.d/haresources root@192.168.232.165:/etc/ha.d/ [root@localhost ~]# scp /etc/ha.d/resource.d/drbddisk root@192.168.232.165:/etc/ha.d/resource/
(以下是Slave操作)
6)更改Master传过来的文件 , 到那时只改ha.cf就可以了
[root@localhost ~]# vim /etc/ha.d/ha.cf //以后一下的内容就好 Ucast eth0 192.168.232.165
三、启动heartbeat并测试
(Master , Slave操作) 1、启动 [root@localhost ~]# service heartbeat start [root@localhost ~]# chkconfig heartbeat on 2、 测试 , ping heartbeat中生成的虚拟ip是否成功 [root@localhost ~]# ping 192.168.232.254
四、配置NFS
(Master , Slave操作) [root@localhost ~]# vim /etc/exports/data *(rw,no_root_squash) //重启NFS服务 [root@localhost ~]# service rpcbind restart [root@localhost ~]# service nfs restart [root@localhost ~]# chkconfig rpcbind on //开机不启动nfs , 因为启动heartbeat的脚本就已经启动nfs了 [root@localhost ~]# chkconfig nfs off
五、测试
在另外一台linux客户端中挂载虚ip : 192.168.232.254 , 挂载成功表示整个实验成功了
这里我就在slave测试
1)挂载nfs系统
[root@localhost ~]# mkdir /test [root@localhost ~]# mount –t nfs 192.168.232.254:/data /test [root@localhost ~]# df –h [root@localhost ~]# cd /test [root@localhost ~]# ls //如果刚刚测试的数据都在 , 证明OK了 [root@localhost ~]#touch ttt //测试能不能写入
2)测试当primary挂掉时 , 看备份的机子成功接管Primary
(Master操作) [root@localhost ~]# init 0 (Slave操作) [root@localhost ~]# service drbd status //如果出现提升为primary证明系统成功切换 [root@localhost ~]# cd test //再来回查挂载的系统能不能使用
版权声明:未经博主允许不得转载。http://www.smister.com/post-25.html