ActiveMQ 5.18.3: A Practical Guide to Single-Node and Cluster Deployment

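Foreword: As enterprise applications grow more complex, communication and data exchange between systems have become central to business processes. Real-time financial transaction processing, order flow on e-commerce platforms, and data collection from IoT devices all depend on efficient, reliable message delivery. Traditional synchronous communication often runs into performance bottlenecks, tight coupling between systems, and limited scalability. Message-oriented middleware addresses these problems, and ActiveMQ is one of the widely used options for enterprise application integration.

ActiveMQ is an open-source message broker written in Java. It implements the JMS (Java Message Service) specification and supports several messaging protocols, so clients written in other languages and running on other platforms can integrate with it. Beyond message delivery, it offers flexible deployment options, high-availability configurations, and an active community, which together meet enterprise requirements for reliability and scalability.

The sections that follow walk through deploying ActiveMQ as a single node and as a cluster, then cover operational practices for keeping the cluster healthy. Whether you are tuning an existing architecture or evaluating a new messaging solution, the steps below should give you a working starting point.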

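The steps below cover single-node and cluster deployment of ActiveMQ 5.18.3 (the latest stable release at the time of writing), together with the JDK installation steps: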

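I. Installing the JDK

1. Download a JDK package for your system from the Oracle website. ActiveMQ 5.18.x requires Java 11 or later.

2. Upload the JDK package to a directory on the server, such as /mpjava.

3. Install the JDK with the following commands:

cd /mpjava
rpm -ivh jdk-<version>-linux-x64.rpm

Alternatively, install OpenJDK with yum:

yum install java-11-openjdk

4. Verify that the JDK is installed:

java -version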

II. ActiveMQ 5.18.3 Single-Node Deployment

1. Install the JDK as described above.

2. Upload the ActiveMQ archive (apache-activemq-5.18.3-bin.tar.gz) to a directory on the server (for example, /mpjava).

3. Extract the archive:

cd /mpjava
tar xvf apache-activemq-5.18.3-bin.tar.gz

4. Start ActiveMQ:

cd /mpjava/apache-activemq-5.18.3
bin/activemq start

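5. Verify that ActiveMQ is running:

  • Check that port 61616 is listening:

    netstat -an | grep 61616
  • From a machine that can reach the ActiveMQ server, open http://<ActiveMQ server IP>:8161/admin/ in a browser to confirm that the web console is up. In ActiveMQ 5.16 and later the console listens on 127.0.0.1 by default; to reach it from another machine, set the host property of the jettyPort bean in conf/jetty.xml to 0.0.0.0.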
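The port check can also be scripted so that it returns a clean exit status. The following is a minimal sketch rather than part of the original procedure: the function name port_is_listening is hypothetical, and it assumes the `ss -ltn` output layout, in which the fourth column holds the local address and port.

```shell
#!/bin/bash
# Return 0 if the `ss -ltn`-style text on stdin shows a listener on the port.
# $1: port number. Assumes column 4 is "Local Address:Port" (hypothetical helper).
port_is_listening() {
    local port="$1"
    awk -v suffix=":${port}" '
        # Skip the header row, then compare the tail of column 4 with ":<port>".
        NR > 1 && substr($4, length($4) - length(suffix) + 1) == suffix { found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Typical use on the broker host (61616 is the OpenWire port used above):
#   ss -ltn | port_is_listening 61616 && echo "OpenWire is listening"
```

Reading from stdin keeps the function independent of the tool that produced the listing, so it can be tried against saved output before being used on a live host.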

6. Configure start on boot:

vim /etc/rc.d/rc.local

Add the following line:

/mpjava/apache-activemq-5.18.3/bin/activemq start

Save and exit, then make the startup file executable:

chmod +x /etc/rc.d/rc.local

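III. ActiveMQ 5.18.3 Cluster Deployment

1. Cluster plan:

  • ZooKeeper nodes: 172.26.223.71, 172.26.223.72, 172.26.223.73; ZooKeeper port: 2181.
  • ActiveMQ nodes: 172.26.223.71, 172.26.223.72, 172.26.223.73. Each server runs two brokers, one per cluster, on the following ports (cluster1, cluster2):
    • openwire: 61616, 61626
    • amqp: 5672, 5682
    • stomp: 61613, 61623
    • mqtt: 1883, 1884
    • ws: 61614, 61624
    • admin: 8161, 8162
  • Topology: mq11, mq12, and mq13 form the high-availability cluster cluster1; mq21, mq22, and mq23 form the high-availability cluster cluster2. cluster1 and cluster2 are bridged for load sharing and together form the cluster mqcluster.
  • On all three servers, upload apache-activemq-5.18.3-bin.tar.gz and extract it into two directories:
    • /mpjava/activemq-cluster1/
    • /mpjava/activemq-cluster2/

2. Deploy the ZooKeeper cluster (see my separate post on ZooKeeper deployment).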

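3. High-availability configuration. Note that the LevelDB store used below has been deprecated by the ActiveMQ project and appears to have been removed from the 5.17+ distributions, so confirm that your build still includes replicatedLevelDB before relying on this configuration.

  • Edit conf/activemq.xml (the example is for cluster1 on 172.26.223.71; set hostname to the IP address of the node being configured):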

    <broker xmlns="http://activemq.apache.org/schema/core" brokerName="cluster1" dataDirectory="${activemq.data}">
      <persistenceAdapter>
        <replicatedLevelDB
          directory="${activemq.data}/leveldb"
          replicas="3"
          bind="tcp://0.0.0.0:0"
          zkAddress="172.26.223.71:2181,172.26.223.72:2181,172.26.223.73:2181"
          hostname="172.26.223.71"
          sync="local_disk"
          zkPath="/activemq/leveldb-stores/cluster1"/>
      </persistenceAdapter>
      <transportConnectors>
        <transportConnector name="openwire" uri="tcp://0.0.0.0:61616?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
        <transportConnector name="amqp" uri="amqp://0.0.0.0:5672?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
        <transportConnector name="stomp" uri="stomp://0.0.0.0:61613?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
        <transportConnector name="mqtt" uri="mqtt://0.0.0.0:1883?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
        <transportConnector name="ws" uri="ws://0.0.0.0:61614?maximumConnections=1000&amp;wireFormat.maxFrameSize=104857600"/>
      </transportConnectors>
    </broker>
  • Edit conf/jetty.xml (port 8161 for cluster1, 8162 for cluster2):

    <bean id="jettyPort" class="org.apache.activemq.web.WebConsolePort" init-method="start">
      <property name="host" value="0.0.0.0"/>
      <property name="port" value="8161"/>
    </bean>
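The configuration above is shown for cluster1 only. A cluster2 copy can be derived from it by swapping the ports listed in the cluster plan. The sketch below is one way to do that with sed. The function name to_cluster2 is hypothetical, and the assumption that cluster2 uses brokerName="cluster2" and a matching zkPath is mine, not stated in the original.

```shell
#!/bin/bash
# Rewrite cluster1 settings read from stdin into their cluster2 equivalents.
# Port pairs come from the cluster plan; brokerName/zkPath renaming is an assumption.
to_cluster2() {
    # In basic regular expressions "?" is a literal character, so ":61616?"
    # only matches transport connector URIs, not networkConnector targets.
    sed -e 's/:61616?/:61626?/' \
        -e 's/:5672?/:5682?/' \
        -e 's/:61613?/:61623?/' \
        -e 's/:1883?/:1884?/' \
        -e 's/:61614?/:61624?/' \
        -e 's/value="8161"/value="8162"/' \
        -e 's/brokerName="cluster1"/brokerName="cluster2"/' \
        -e 's#leveldb-stores/cluster1#leveldb-stores/cluster2#'
}

# Example: preview the cluster2 version of a cluster1 connector line.
echo '<transportConnector name="openwire" uri="tcp://0.0.0.0:61616?maximumConnections=1000"/>' | to_cluster2
```

The same function can be applied to a whole file, for example `to_cluster2 < conf/activemq.xml`, but the result should be reviewed by hand before it is used.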

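4. Network of brokers (load sharing between the two clusters):

  • On every cluster1 node, add the following to activemq.xml (before the persistenceAdapter element). It connects cluster1 to the cluster2 brokers on port 61626: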

    <networkConnectors>
      <networkConnector uri="static:(tcp://172.26.223.71:61626,tcp://172.26.223.72:61626,tcp://172.26.223.73:61626)" duplex="true"/>
    </networkConnectors>
  • On every cluster2 node, add the following to activemq.xml (before the persistenceAdapter element). It connects cluster2 to the cluster1 brokers on port 61616:

    <networkConnectors>
      <networkConnector uri="static:(tcp://172.26.223.71:61616,tcp://172.26.223.72:61616,tcp://172.26.223.73:61616)" duplex="true"/>
    </networkConnectors>

5. Start ActiveMQ:

  • In the bin directory of each ActiveMQ installation, run:

    ./activemq start

6. Configure start on boot:

vim /etc/rc.d/rc.local

Add the following lines:

/mpjava/activemq-cluster1/bin/activemq start
/mpjava/activemq-cluster2/bin/activemq start

Save and exit, then make the startup file executable:

chmod +x /etc/rc.d/rc.local

Operations Hardening and Failure Prevention

1. Application configuration:

  • In ly-mp-other.properties, set mp.component.amqUrl to a failover URL that lists every broker in both clusters:

    mp.component.amqType=2
    mp.component.amqUrl = failover:(tcp://172.26.223.71:61616,tcp://172.26.223.71:61626,tcp://172.26.223.72:61616,tcp://172.26.223.72:61626,tcp://172.26.223.73:61616,tcp://172.26.223.73:61626)
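A long failover URL is easy to mistype when it is edited by hand. As a minimal sketch, the list can be generated from the host and port lists in the cluster plan; the function name build_failover_url is hypothetical and not part of the original setup.

```shell
#!/bin/bash
# Build a failover: URL covering every host/port combination.
# $1: space-separated host list; $2: space-separated port list.
build_failover_url() {
    local hosts="$1" ports="$2" host port list=""
    for host in $hosts; do
        for port in $ports; do
            # Prefix a comma only when the list already has an entry.
            list+="${list:+,}tcp://${host}:${port}"
        done
    done
    printf 'failover:(%s)\n' "$list"
}

# Hosts and OpenWire ports taken from the cluster plan above.
build_failover_url "172.26.223.71 172.26.223.72 172.26.223.73" "61616 61626"
```

With the hosts and ports from the cluster plan, the generated entries follow the same host-then-port order as the hand-written value.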
    

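2. ActiveMQ Watchdog and Monitoring Script

Overview:

This script monitors the ActiveMQ cluster instances on a node and restarts any instance that has exited. Before restarting, it moves the instance's data directory aside under a timestamped name, so the broker starts with an empty store and messages in the old store are not carried over. The script also scans the ActiveMQ log for LevelDB corruption errors (such as "Could not load message seq" or "No reader available for position") and, when new ones appear, appends a detailed alert to activemq-cluster-error-alert.log. It is intended for high-availability monitoring and failure recovery of production ActiveMQ clusters.

Feature matrix:

| Module | Implementation | Key metric (author's figures) |
| --- | --- | --- |
| Process liveness monitoring | Checks the activemq-cluster1 and activemq-cluster2 processes every 60 seconds | Detection accuracy: 100% |
| Automatic recovery | When a process is gone: 1. move the data directory aside with a timestamp; 2. restart the service | Recovery time: < 30 s |
| LevelDB health check | Scans the log on every cycle for the key errors "Could not load message seq" and "No reader available" | Error detection rate: 95% |
| Alerting | Structured entries appended to activemq-cluster-error-alert.log | Alert latency: < 60 s |
| Self-protection | Runs in the background via nohup; configured to start on boot | Availability: 24/7 |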

Usage:
① Start manually: su - root -c '/mpjava/amqwatch.sh &'
② Start on boot:

Add the following command to /etc/rc.local: su - root -c '/mpjava/amqwatch.sh &'
③ Or create a systemd service file (recommended):

cat <<EOF > /etc/systemd/system/amqwatch.service
[Unit]
Description=ActiveMQ Watcher Service
After=network.target

[Service]
ExecStart=/mpjava/amqwatch.sh
Restart=always
User=root

[Install]
WantedBy=multi-user.target
EOF

systemctl enable amqwatch.service
systemctl start amqwatch.service

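Sample log output:

When the script finds new LevelDB errors in an instance's log, it appends an alert in the following format to activemq-cluster-error-alert.log (the field labels are in Chinese, exactly as the script writes them):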

start-----------------------------------------------------
日志编号:监控脚本-2001
时间:2025-02-17 14:30:00
主机名称:amq-cluster-01(192.168.1.101)
步骤:0
类:ActiveMQ-61616
功能描述:ActiveMQ数据文件leveldb同步异常,文件损坏!
用户名:SYS
日志:ActiveMQ数据文件leveldb同步异常,文件损坏!异常日志内容:Could not load message seq 和 No reader available for position
解决方案:建议在非生产时间,结束所有ActiveMQ进程。
级别:ERROR
其他参数:
end----------------------------------------------------

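Notes:

**① Permissions:** Run the script as root so that it can manage the ActiveMQ instances and read their log files.

**② Data directory backup:** Before restarting an ActiveMQ instance, the script renames its data directory to data.bak<timestamp>, so a corrupted store cannot block the restart. The renamed directory is kept as a backup.

**③ Alert log path:** Make sure the directory that holds activemq-cluster-error-alert.log exists; create it manually if it does not.

**④ Check interval:** The script runs on a 60-second cycle; adjust the interval to suit your needs.

With this script in place, the cluster recovers from instance failures automatically, which reduces the risk of business interruption caused by crashed instances or corrupted data.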
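The log check at the heart of the script is a count-and-compare: count the lines that match the corruption signatures, and alert only when the count has grown since the previous cycle. The sketch below isolates that logic. The function names are hypothetical, and treating a missing log file as a count of zero is an assumption made for this example.

```shell
#!/bin/bash
# Count log lines that match the LevelDB corruption signatures.
# Prints 0 when the log file does not exist yet (assumption for this sketch).
count_leveldb_errors() {
    local log_file="$1"
    [ -f "$log_file" ] || { echo 0; return 0; }
    # grep -c prints the count; "|| true" hides its non-zero status on zero matches.
    grep -c -E "Could not load message seq|No reader available for position" "$log_file" || true
}

# Return 0 (raise an alert) only when the count grew since the previous check.
# $1: previous count; $2: current count.
has_new_errors() {
    local previous="$1" current="$2"
    [ "$current" -gt "$previous" ]
}

# Typical use inside a monitoring loop:
#   current=$(count_leveldb_errors /mpjava/activemq-cluster1/data/activemq.log)
#   has_new_errors "$previous" "$current" && echo "new LevelDB errors"
#   previous=$current
```

Comparing against the previous count keeps the same error lines from raising a fresh alert on every cycle.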

Script:

#!/bin/bash

#add for chkconfig
#chkconfig: 2345 70 30
#description: AmqWatch shell script (ActiveMQ watchdog)
#processname: AmqWatch
# To start on boot, add to /etc/rc.local: su - root -c '/mpjava/amqwatch.sh &'
P1=/mpjava/activemq-cluster1/
P2=/mpjava/activemq-cluster2/

errorLogNum61616=0
errorLogNum61626=0
errorLogFile="/mpjava/activemq-cluster-error-alert.log"
local_host=`hostname`
local_ip=`/sbin/ifconfig -a|grep inet|grep -v 127.0.0.1|grep -v 192.168.122.1|grep -v inet6|awk '{print $2}'|tr -d "addr:"`
while true; do

		# Get the command paths of the running ActiveMQ processes
		pidpath=$(ps x | grep activemq | grep -v grep | awk '{print $9}')
		echo $pidpath
		datetime=`date +%Y%m%d_%H%M%S_%N |cut -b1-20`
		currTime=$(date +"%Y-%m-%d %T")
		date=$(date +%Y%m%d)
		if [[ $pidpath =~ $P1 ]]
		then
			echo "$P1 已经存在"
		else
			echo "start activemq. $P1 bin/activemq"
			mv ${P1}data ${P1}data.bak$datetime
			nohup ${P1}bin/activemq start >/dev/null 2>&1 &
		fi
		# Check the cluster1 (61616) log for LevelDB corruption errors
		#tmpLogNum1=`cat /mpjava/activemq-cluster1/data/activemq.log |grep -E "Could not load message seq|No reader available for position" |wc -l`
		tmpLogNum1=`grep -E "Could not load message seq|No reader available for position" ${P1}/data/activemq.log   |wc -l`
		#tmpLogNum1=$?
		echo "tmpLogNum1:${tmpLogNum1}"
		if [[ $tmpLogNum1 -gt $errorLogNum61616 ]]
		then
			echo "start-----------------------------------------------------"  | tee -a $errorLogFile
			echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
			echo "       时间:$currTime" | tee -a $errorLogFile
			echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
			echo "       步骤:0" | tee -a $errorLogFile
			echo "        类:ActiveMQ-61616" | tee -a $errorLogFile
			echo "     功能描述:ActiveMQ数据文件levelDB同步异常,文件损坏!" | tee -a $errorLogFile
			echo "      用户名:SYS" | tee -a $errorLogFile
			echo "       日志:ActiveMQ数据文件levelDB同步异常,文件损坏!异常日志内容:Could not load message seq 和 No reader available for position" | tee -a $errorLogFile
			echo "     解决方案:建议在非生产时间,结束3台ActiveMQ进程。" | tee -a $errorLogFile
			echo "       级别:ERROR" | tee -a $errorLogFile
			echo "     其他参数:" | tee -a $errorLogFile
			echo "end----------------------------------------------------" | tee -a $errorLogFile
		fi
		errorLogNum61616=$tmpLogNum1;
		
		if [[ $pidpath =~ $P2 ]]
		then
			echo "$P2 已经存在"
		else
			echo "start activemq. $P2 bin/activemq"
			mv ${P2}data ${P2}data.bak$datetime
			nohup ${P2}bin/activemq start >/dev/null 2>&1 &
		fi
		# Check the cluster2 (61626) log for LevelDB corruption errors
		#tmpLogNum2=`cat /mpjava/activemq-cluster2/data/activemq.log |grep -E "Could not load message seq|No reader available for position" |wc -l`
		tmpLogNum2=`grep -E "Could not load message seq|No reader available for position" ${P2}/data/activemq.log   |wc -l`
		#tmpLogNum2=$?
		echo "tmpLogNum2:${tmpLogNum2}"
		if [[ $tmpLogNum2 -gt $errorLogNum61626 ]]
		then
			echo "start-----------------------------------------------------"  | tee -a $errorLogFile
			echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
			echo "       时间:$currTime" | tee -a $errorLogFile
			echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
			echo "       步骤:0" | tee -a $errorLogFile
			echo "        类:ActiveMQ-61626(${P2})" | tee -a $errorLogFile
			echo "     功能描述:ActiveMQ数据文件levelDB同步异常,文件损坏!" | tee -a $errorLogFile
			echo "      用户名:SYS" | tee -a $errorLogFile
			echo "       日志:ActiveMQ数据文件levelDB同步异常,文件损坏!异常日志内容:Could not load message seq 和 No reader available for position" | tee -a $errorLogFile
			echo "     解决方案:建议在非生产时间,结束3台ActiveMQ进程。" | tee -a $errorLogFile
			echo "       级别:ERROR" | tee -a $errorLogFile
			echo "     其他参数:" | tee -a $errorLogFile
			echo "end----------------------------------------------------" | tee -a $errorLogFile
		fi
		errorLogNum61626=$tmpLogNum2;
        # Sleep 60 seconds between checks
        sleep 60
done

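3. ActiveMQ Scheduled Archiving Script (amqportwatch.sh)

This script must be deployed on a server other than the three ActiveMQ hosts (for example, a monitoring server). It does two things: (1) it detects when a port is down on all three nodes; (2) it archives ActiveMQ on a schedule, meaning that when no messages are pending it stops all six broker processes on the three ActiveMQ servers. The amqwatch.sh watchdog on each node then moves the data directory aside and restarts the broker.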
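The "no pending messages" decision is based on the XML that the web console serves at /admin/xml/queues.jsp. The sketch below isolates that decision. The function name check_pending is hypothetical, and the sample layout it expects (a stats element with a size attribute on the line after each queue element) is an assumption about the console output. It uses grep -A 2 where the full script uses -C 2, so that lines belonging to the preceding queue are not counted as dead-letter entries.

```shell
#!/bin/bash
# Decide whether archiving is safe, given a saved queues.jsp XML file.
# Prints 0 = file unusable, 1 = safe to archive, 2 = pending messages remain.
check_pending() {
    local xml_file="$1" pending dlq
    # A usable download contains at least the <queues> root or one <queue> entry.
    if ! grep -q -E '<queues>|<queue name=' "$xml_file" 2>/dev/null; then
        echo 0
        return
    fi
    # Lines whose size attribute is non-zero, across all queues.
    pending=$(grep -c -E 'size="[1-9][0-9]*"' "$xml_file" || true)
    # The same count restricted to the lines right after the dead-letter queue.
    dlq=$(grep -A 2 -F '<queue name="ActiveMQ.DLQ">' "$xml_file" | grep -c -E 'size="[1-9][0-9]*"' || true)
    if [ "$pending" -eq 0 ] || [ "$pending" -eq "$dlq" ]; then
        echo 1
    else
        echo 2
    fi
}

# Typical use after downloading the queue list with curl:
#   result=$(check_pending /mpjava/amqportwatch-log/queues.xml)
```

Returning a three-way code mirrors the script's own convention, so a failed download is never mistaken for an empty broker.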

Start manually: su - root -c '/mpjava/amqportwatch.sh &'

Start on boot: add su - root -c '/mpjava/amqportwatch.sh &' to /etc/rc.local.
Script:

#!/bin/bash

#add for chkconfig
#chkconfig: 2345 70 30
#description: AmqPortWatch shell script (port check and scheduled archiving)
#processname: AmqPortWatch
# To start on boot, add to /etc/rc.local: su - root -c '/mpjava/amqportwatch.sh &'
P1=/mpjava/activemq-cluster1/
P2=/mpjava/activemq-cluster2/

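# ActiveMQ cluster servers (replace with the IP addresses of your broker hosts)
remote_hosts="192.168.1.101 192.168.1.102 192.168.1.103"
# SSH user on the ActiveMQ servers
remote_host_user=root
# SSH password for that user (placeholder; never publish a real password, and prefer key-based SSH where possible)
remote_host_right_password="CHANGE_ME"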

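# Archive switch: 1 = scheduled archiving on; 0 = off
data_keep_run=1;
# Archive weekday: 1-6 = Monday to Saturday, 0 = Sunday
data_keep_week=3;
# Archive hour, two digits 00-23 (05 = 5 a.m., 17 = 5 p.m.)
data_keep_hour=15;
# ActiveMQ web console gateway address (Nginx VIP), used by default, e.g. http://172.26.152.173:8161/admin/xml/queues.jsp
NginxGatewayServerIP="172.26.152.173"
# ActiveMQ web console user name and password
data_keep_xml_user="admin";
data_keep_xml_password="admin";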


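# If archiving has not succeeded for more than data_keep_day_count days (30 by default), force it at the next scheduled time.
data_keep_day_count=30;
# Directory for archive logs
logPath="/mpjava/amqportwatch-log/"
mkdir -p $logPath
# Archive log file (set on every loop iteration)
logfile=""
# Alert log, written in the monitoring system's standard start---- / end---- format
errorLogFile="${logPath}activemq-cluster-error-alert.log"
# Per-node watchdog script (archiving can also be done remotely, so deploying it is optional)
amqwatchPathFile="/mpjava/amqwatch.sh"
# Current time
currTime=$(date +"%Y-%m-%d %T")
# Timestamp used in file names
cur_datetime="`date +%Y-%m-%d-%H-%M-%S`";
# Overall archive state: 0 = not archived; 1 = archiving in progress
data_keep_state=0;

# Per-cluster archive state, used to control retries during the archive window
data_keep_61616_state=0;
data_keep_61626_state=0;

# Time of the last successful archive; archiving is forced once data_keep_day_count days have passed
data_keep_61616_last_date=$currTime
data_keep_61626_last_date=$currTime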

checkPortReslut=0
checkPortServerIP=$NginxGatewayServerIP

check61616ServerIP=$NginxGatewayServerIP
check61626ServerIP=$NginxGatewayServerIP

local_host=`hostname`
local_ip=`/sbin/ifconfig -a|grep inet|grep -v 127.0.0.1|grep -v 192.168.122.1|grep -v inet6|awk '{print $2}'|tr -d "addr:"`

function checkAMQPort()
{
	currTime=$(date +"%Y-%m-%d %T")
	AMQPort=$1; AMQPath=$2
	echo "${currTime}检查${AMQPort}端口begin" | tee -a $logfile
	checkncatreslut="`rpm -qa |grep ncat`"
	echo "checkncatreslut:${checkncatreslut}" | tee -a $logfile
	checksshpassreslut="`rpm -qa sshpass`"
	echo "checksshpassreslut:${checksshpassreslut}" | tee -a $logfile
	if [ -n "$checkncatreslut" ] && [ -n "$checksshpassreslut" ];then 
		#check begin
		checkPortReslut=0
		checkPortServerIP=$NginxGatewayServerIP
		for itemServer in $remote_hosts
		do
			echo $itemServer  | tee -a $logfile
			ping -c2 -i0.3 -W1 $itemServer &>/dev/null
			pingResult=$?;
			echo "pingResult:${pingResult}" | tee -a $logfile
			if [ $pingResult -ne 0 ]; then
				echo "${itemServer},无法ping通!" | tee -a $logfile
			else
				echo "${itemServer},ping OK!"  | tee -a $logfile
				ncat -w 1 $itemServer $AMQPort  </dev/null
				ncatResult=$?;
				echo "ncatResult:${ncatResult}" | tee -a $logfile
				if [ $ncatResult -eq 0 ]; then
					echo "ncat -w 1 ${itemServer}:${AMQPort},OK!"   | tee -a $logfile
					checkPortReslut=1;
					checkPortServerIP=$itemServer;
				else
					echo "ncat -w 1 ${itemServer}:${AMQPort},NG!" | tee -a $logfile
				fi 
			fi
			
		done
		echo "checkPortReslut:${checkPortReslut}" | tee -a $logfile
		#check end
		if [ $checkPortReslut -ne 0 ]; then
			echo "${AMQPort}端口正常!" | tee -a $logfile
		else
			echo "${AMQPort}端口全部异常!" | tee -a $logfile
			
			echo "start-----------------------------------------------------"  | tee -a $errorLogFile
			echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
			echo "       时间:$currTime" | tee -a $errorLogFile
			echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
			echo "       步骤:0" | tee -a $errorLogFile
			echo "        类:ActiveMQ" | tee -a $errorLogFile
			echo "     功能描述:ActiveMQ${AMQPort}端口全部异常" | tee -a $errorLogFile
			echo "      用户名:SYS" | tee -a $errorLogFile
			echo "       日志:ActiveMQ${AMQPort}端口全部异常" | tee -a $errorLogFile
			echo "     解决方案:结束3台ActiveMQ${AMQPort}进程进行归档。" | tee -a $errorLogFile
			echo "       级别:ERROR" | tee -a $errorLogFile
			echo "     其他参数:" | tee -a $errorLogFile
			echo "end----------------------------------------------------" | tee -a $errorLogFile
			
			for itemServer in $remote_hosts
			do
				ping -c2 -i0.3 -W1 $itemServer &>/dev/null
				if [ $? -ne 0 ]; then
					echo "${itemServer},无法ping通!" | tee -a $logfile
				else
					echo "开始结束进程:${itemServer}:${AMQPort}:${AMQPath}" | tee -a $logfile
					stopresult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "ps -ef |grep "$AMQPath" | grep -v grep| awk '{print \$2}'| xargs kill -9 >/dev/null 2>&1 &")
					echo "stopresult:${stopresult}" | tee -a $logfile
					amqwatchCheckResult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "ps -ef |grep amqwatch.sh |grep -v grep |wc -l;")
					echo "amqwatchCheckResult:${amqwatchCheckResult}" | tee -a $logfile
					if [ $amqwatchCheckResult -ne 0 ]; then
						echo "${itemServer}监控脚本amqwatch.sh正常!" | tee -a $logfile 
					else
						echo "${itemServer}监控脚本amqwatch.sh不正常!" | tee -a $logfile
						echo "确定脚本文件amqwatch.sh:${itemServer}:${AMQPort}:${AMQPath}" | tee -a $logfile
						amqwatchFileResult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "[ -e ${amqwatchPathFile} ] && echo 1 || echo 0;")
						echo "amqwatchFileResult:${amqwatchFileResult}" | tee -a $logfile
						if [ $amqwatchFileResult = 1 ]; then
							echo "远程启动amqwatch.sh:${itemServer}:${AMQPort}:${AMQPath}" | tee -a $logfile
							amqwatchRunResult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "su - root -c '${amqwatchPathFile} &' >/dev/null 2>&1 &")
							echo "amqwatchRunResult:${amqwatchRunResult}" | tee -a $logfile
						else
							echo "远程归档mv:${itemServer}:${AMQPort}:${AMQPath}" | tee -a $logfile
							amqMVDataResult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "mv ${AMQPath}data ${AMQPath}data.bak${cur_datetime} &") 
							echo "amqMVDataResult${amqMVDataResult}" | tee -a $logfile
							echo "远程启动./activemq start:${itemServer}:${AMQPort}:${AMQPath}" | tee -a $logfile
							amqStartResult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "${AMQPath}bin/./activemq start >/dev/null 2>&1 &") 
							echo "amqStartResult:${amqStartResult}" | tee -a $logfile
						fi
					fi
				fi 
			done
		fi
	else
		echo "请安装ncat和sshpass!" | tee -a $logfile
	fi
	echo "检查${AMQPort}端口end" | tee -a $logfile
}


while true; do
	# Scheduled ActiveMQ archiving: begin
	cur_date="`date +%Y-%m-%d`";
	currTime=$(date +"%Y-%m-%d %T")
	cur_datetime="`date +%Y-%m-%d-%H-%M-%S`";
	# Minute
	cur_date_M="`date +%M`";
	# Hour
	cur_date_H="`date +%H`";
	# Day of week
	cur_date_W="`date +%w`";
	# Log output pattern: echo "message" | tee -a $logfile
	logfile="${logPath}/${cur_date}.log" 
	
	# Archive switch: 1 = scheduled archiving on; 0 = off
	if [ $data_keep_run = 1 ]; then
		#if [ $cur_date_H = "05" ]; then
		# (alternative: run every day at 05:00)
		if [ $cur_date_W = $data_keep_week ] && [ $cur_date_H = $data_keep_hour ]; then
		# Runs on the configured weekday (data_keep_week) during the configured hour (data_keep_hour)
			echo "${currTime}执行定时归档[周${data_keep_week},时${data_keep_hour}]![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
			if [ $data_keep_state = 0 ]; then
				# Mark archiving as in progress
				data_keep_state=1;
				
				# Check for pending (unconsumed) MQ messages: begin
				echo "开始确认是否有待消费(Number Of Pending Messages)MQ消息![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
				cur_curl_datetime="`date +%Y-%m-%d-%H-%M-%S`";
				
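				# Pending-message check result: 0 = failed to download the queue XML; 1 = no pending queues, or only the dead-letter queue (archive allowed); 2 = pending queues exist (do not archive)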
				check61616PendingMessagesResult=0;
				activemq61616PathXml="${logPath}/activemq61616-${cur_curl_datetime}.xml";
				activemq61616PathUrl="http://${check61616ServerIP}:8161/admin/xml/queues.jsp"
				#curl -u admin:admin -o activemq61616-data.xml "http://172.26.153.110:8161/admin/xml/queues.jsp"
				curl -u $data_keep_xml_user:$data_keep_xml_password -o $activemq61616PathXml $activemq61616PathUrl
				
				# Confirm that the downloaded 61616 queue XML looks valid
				curl61616Result=`grep -E '<queues>|<queue name=' $activemq61616PathXml |wc -l`
				echo "curl61616Result:$curl61616Result" | tee -a $logfile
				if [ $curl61616Result = 0 ]; then
					echo "下载61616队列xml文件错误${activemq61616PathUrl},无法确认未消费队列[curl61616Result:$curl61616Result],此次定时归档失败!" | tee -a $logfile
					echo "start-----------------------------------------------------"  | tee -a $errorLogFile
					echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
					echo "       时间:$currTime" | tee -a $errorLogFile
					echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
					echo "       步骤:0" | tee -a $errorLogFile
					echo "        类:ActiveMQ" | tee -a $errorLogFile
					echo "     功能描述:定时归档[周${data_keep_week},时${data_keep_hour}]" | tee -a $errorLogFile
					echo "      用户名:SYS" | tee -a $errorLogFile
					echo "       日志:下载61616队列xml文件错误${activemq61616PathUrl},无法确认未消费队列[curl61616Result:$curl61616Result],此次定时归档失败!" | tee -a $errorLogFile
					echo "     解决方案:确认ActiveMQ-8161web管理服务正常。" | tee -a $errorLogFile
					echo "       级别:ERROR" | tee -a $errorLogFile
					echo "     其他参数:activemq61616PathXml:${activemq61616PathXml},data_keep_xml_user:${data_keep_xml_user}" | tee -a $errorLogFile
					echo "end----------------------------------------------------" | tee -a $errorLogFile
			
				else
					echo "下载61616队列xml文件正常${activemq61616PathUrl},[curl61616Result:$curl61616Result]" | tee -a $logfile
					# Count 61616 queues that still hold pending messages
					#grep -E --color 'size="[1-9][0-9]*"' activemq61616-data.xml
					ActiveMQ61616_Pending_Num=`grep -E --color 'size="[1-9][0-9]*"' $activemq61616PathXml |wc -l`
					echo "统计61616未消费队列[ActiveMQ61616_Pending_Num:$ActiveMQ61616_Pending_Num]" | tee -a $logfile
					# Check whether the 61616 dead-letter queue is non-empty
					ActiveMQ61616_DLQ_Num=`grep -C 2 -E --color '<queue name="ActiveMQ.DLQ">' $activemq61616PathXml |grep  -E --color 'size="[1-9][0-9]*"' |wc -l`
					echo "统计61616死信队列[ActiveMQ61616_DLQ_Num:$ActiveMQ61616_DLQ_Num]" | tee -a $logfile
					if [ $ActiveMQ61616_Pending_Num = 0 ] || [ $ActiveMQ61616_Pending_Num = $ActiveMQ61616_DLQ_Num ]; then
						# No pending queues, or only the dead-letter queue: 61616 may be archived
						check61616PendingMessagesResult=1;
					else
						# Pending queues exist: 61616 cannot be archived; an alert is raised later
						check61616PendingMessagesResult=2;
					fi
				fi
				echo "完成确认61616是否有待消费(Number Of Pending Messages)MQ消息![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}][check61616PendingMessagesResult:$check61616PendingMessagesResult]" | tee -a $logfile
				
				
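				# Pending-message check result: 0 = failed to download the queue XML; 1 = no pending queues, or only the dead-letter queue (archive allowed); 2 = pending queues exist (do not archive)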
				check61626PendingMessagesResult=0;
				activemq61626PathXml="${logPath}/activemq61626-${cur_curl_datetime}.xml";
				activemq61626PathUrl="http://${check61626ServerIP}:8162/admin/xml/queues.jsp"
				curl -u $data_keep_xml_user:$data_keep_xml_password -o $activemq61626PathXml $activemq61626PathUrl
				# Confirm that the downloaded 61626 queue XML looks valid
				curl61626Result=`grep -E '<queues>|<queue name=' $activemq61626PathXml |wc -l`
				echo "curl61626Result:$curl61626Result" | tee -a $logfile
				if [ $curl61626Result = 0 ]; then
					echo "下载61626队列xml文件错误${activemq61626PathUrl},无法确认未消费队列[curl61626Result:$curl61626Result],此次定时归档失败!" | tee -a $logfile
					echo "start-----------------------------------------------------"  | tee -a $errorLogFile
					echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
					echo "       时间:$currTime" | tee -a $errorLogFile
					echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
					echo "       步骤:0" | tee -a $errorLogFile
					echo "        类:ActiveMQ" | tee -a $errorLogFile
					echo "     功能描述:定时归档[周${data_keep_week},时${data_keep_hour}]" | tee -a $errorLogFile
					echo "      用户名:SYS" | tee -a $errorLogFile
					echo "       日志:下载61626队列xml文件错误${activemq61626PathUrl},无法确认未消费队列[curl61626Result:$curl61626Result],此次定时归档失败!" | tee -a $errorLogFile
					echo "     解决方案:确认ActiveMQ-8162web管理服务正常。" | tee -a $errorLogFile
					echo "       级别:ERROR" | tee -a $errorLogFile
					echo "     其他参数:activemq61626PathXml:${activemq61626PathXml},data_keep_xml_user:${data_keep_xml_user}" | tee -a $errorLogFile
					echo "end----------------------------------------------------" | tee -a $errorLogFile
				else
					echo "下载61626队列xml文件正常${activemq61626PathUrl},[curl61626Result:$curl61626Result]" | tee -a $logfile
					# Count 61626 queues that still hold pending messages
					#grep -E --color 'size="[1-9][0-9]*"' activemq61626-data.xml
					ActiveMQ61626_Pending_Num=`grep -E --color 'size="[1-9][0-9]*"' $activemq61626PathXml |wc -l`
					echo "统计61626未消费队列[ActiveMQ61626_Pending_Num:$ActiveMQ61626_Pending_Num]" | tee -a $logfile
					# Check whether the 61626 dead-letter queue is non-empty
					ActiveMQ61626_DLQ_Num=`grep -C 2 -E --color '<queue name="ActiveMQ.DLQ">' $activemq61626PathXml |grep  -E --color 'size="[1-9][0-9]*"' |wc -l`
					echo "统计61626死信队列[ActiveMQ61626_DLQ_Num:$ActiveMQ61626_DLQ_Num]" | tee -a $logfile
					
					if [ $ActiveMQ61626_Pending_Num = 0 ] || [ $ActiveMQ61626_Pending_Num = $ActiveMQ61626_DLQ_Num ]; then
						# No pending queues, or only the dead-letter queue: 61626 may be archived
						check61626PendingMessagesResult=1;
					else
						# Pending queues exist: 61626 cannot be archived; an alert is raised later
						check61626PendingMessagesResult=2;
					fi
				fi
				echo "完成确认61626是否有待消费(Number Of Pending Messages)MQ消息![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}][check61626PendingMessagesResult:$check61626PendingMessagesResult]" | tee -a $logfile
				# Check for pending (unconsumed) MQ messages: end
				
				# Track the archive result of this run
				data_keep_61616_state_now=$data_keep_61616_state;
				data_keep_61626_state_now=$data_keep_61626_state;
				for itemServer in $remote_hosts
				do
					ping -c2 -i0.3 -W1 $itemServer &>/dev/null
					if [ $? -ne 0 ]; then
						echo "${itemServer},无法ping通!" | tee -a $logfile
					else
						echo "开始执行归档和清理begin" | tee -a $logfile
						if [ $data_keep_61616_state -ne 1 ]; then
							echo "开始结束进程:${itemServer}:${P1}" | tee -a $logfile
							lastDateTime=`date -d "${data_keep_61616_last_date}" +%s`;
							nowDate=$(date +"%Y-%m-%d %T");
							nowDateTime=`date -d "${nowDate}" +%s`;
							CountSeconds=$(($nowDateTime-$lastDateTime));
							CountDays=0;
							if [ $CountSeconds -gt 86400 ]; then
								CountDays=`expr $CountSeconds / 86400`;
							fi
							echo "61616距离上次归档天数:[CountSeconds:${CountSeconds},CountDays:${CountDays}]" | tee -a $logfile
							
							# Archive if allowed, or if more than data_keep_day_count days (30) have passed since the last successful archive
							if [ $check61616PendingMessagesResult = 1 ] || [ $CountDays -ge $data_keep_day_count ]; then
								stopresult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "ps -ef |grep "$P1" | grep -v grep| awk '{print \$2}'| xargs kill -9 >/dev/null 2>&1 &")
								data_keep_61616_last_date=$(date +"%Y-%m-%d %T")
								data_keep_61616_state_now=1;
								echo "${data_keep_61616_last_date}完成结束进程!stopresult:${stopresult}" | tee -a $logfile
							else
								data_keep_61616_state_now=0;
								echo "未执行结束进程,8161有待消费MQ消息或者下载数据异常:${itemServer}:${P1}" | tee -a $logfile
							fi
						else
							echo "61616今天归档成功,不需要重复归档![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
						fi
						
						if [ $data_keep_61626_state -ne 1 ]; then
							echo "开始结束进程:${itemServer}:${P2}" | tee -a $logfile
							
							lastDateTime=`date -d "${data_keep_61626_last_date}" +%s`;
							nowDate=$(date +"%Y-%m-%d %T");
							nowDateTime=`date -d "${nowDate}" +%s`;
							CountSeconds=$(($nowDateTime-$lastDateTime));
							CountDays=0;
							if [ $CountSeconds -gt 86400 ]; then
								CountDays=`expr $CountSeconds / 86400`;
							fi
							echo "61626距离上次归档天数:[CountSeconds:${CountSeconds},CountDays:${CountDays}]" | tee -a $logfile
							# Archive if allowed, or if more than data_keep_day_count days (30) have passed since the last successful archive
							if [ $check61626PendingMessagesResult = 1 ] || [ $CountDays -ge $data_keep_day_count ]; then
								stopresult2=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "ps -ef |grep "$P2" | grep -v grep| awk '{print \$2}'| xargs kill -9 >/dev/null 2>&1 &")
								data_keep_61626_last_date=$(date +"%Y-%m-%d %T");
								data_keep_61626_state_now=1;
								echo "${data_keep_61626_last_date}完成结束进程!stopresult2:${stopresult2}" | tee -a $logfile
							else
								data_keep_61626_state_now=0;
								echo "未执行结束进程,8162有待消费MQ消息或者下载数据异常:${itemServer}:${P2}" | tee -a $logfile
							fi
						else
							echo "61626今天归档成功,不需要重复归档![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
						fi
						
						
						if [ $data_keep_61616_state -ne 1 ]; then
							echo "开始清理:${itemServer}:${P1}" | tee -a $logfile
							#find /mpjava/activemq-cluster1/ -mtime +7 -type d -name "data.bak*" -exec rm -rf {} \;
							rmresult=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "find "$P1" -mtime +7 -type d -name 'data.bak*' -exec rm -rf {} \;")
							echo "rmresult:${rmresult}" | tee -a $logfile
						else
							echo "61616今天清理成功,不需要再清理![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
						fi
						
						if [ $data_keep_61626_state -ne 1 ]; then
							echo "开始清理:${itemServer}:${P2}" | tee -a $logfile
							#find /mpjava/activemq-cluster2/ -mtime +7 -type d -name "data.bak*" -exec rm -rf {} \;
							rmresult2=$(sshpass -p "${remote_host_right_password}" ssh -o StrictHostKeyChecking=no $remote_host_user@$itemServer "find "$P2" -mtime +7 -type d -name 'data.bak*' -exec rm -rf {} \;")
							echo "rmresult2:${rmresult2}" | tee -a $logfile
						else
							echo "61626今天清理成功,不需要再清理![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
						fi
						echo "完成执行归档和清理end" | tee -a $logfile
					fi 
				done
				
				# Store the archive result of this run
				data_keep_61616_state=$data_keep_61616_state_now;
				data_keep_61626_state=$data_keep_61626_state_now;
				
				# Delete .log and .xml files older than 7 days from ${logPath}
				echo "清理7天前${logPath}*.log和*.xml文件" | tee -a $logfile
				find ${logPath} -mtime +7 -type f -name "*.log" -exec rm -rf {} \;
				find ${logPath} -mtime +7 -type f -name "*.xml" -exec rm -rf {} \;
				
				# If the 61616 or 61626 archive failed, retry after 10 minutes
				if [ $data_keep_61616_state -ne 1 ] || [ $data_keep_61626_state -ne 1 ]; then
					data_keep_state=0;
				#if [ $data_keep_state = 1 ]; then 
					if [ $data_keep_61616_state -ne 1 ]; then 
					# 61616 archive failed
						echo "61616归档失败!" | tee -a $logfile
						echo "start-----------------------------------------------------"  | tee -a $errorLogFile
						echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
						echo "       时间:$currTime" | tee -a $errorLogFile
						echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
						echo "       步骤:0" | tee -a $errorLogFile
						echo "        类:ActiveMQ" | tee -a $errorLogFile
						echo "     功能描述:定时归档[周${data_keep_week},时${data_keep_hour}]" | tee -a $errorLogFile
						echo "      用户名:SYS" | tee -a $errorLogFile
						echo "       日志:61616归档失败!原因是有待消费MQ消息或者下载数据异常。" | tee -a $errorLogFile
						echo "     解决方案:非生产时间,手动执行归档(结束三台MQ服务器的6个activemq进程)。" | tee -a $errorLogFile
						echo "       级别:ERROR" | tee -a $errorLogFile
						echo "     其他参数:" | tee -a $errorLogFile
						echo "end----------------------------------------------------" | tee -a $errorLogFile
					fi
					if [ $data_keep_61626_state -ne 1 ]; then 
					# 61626 archive failed
						echo "61626归档失败!" | tee -a $logfile
						echo "start-----------------------------------------------------"  | tee -a $errorLogFile
						echo "     日志编号:监控脚本-2001" | tee -a $errorLogFile
						echo "       时间:$currTime" | tee -a $errorLogFile
						echo "     主机名称:$local_host($local_ip)" | tee -a $errorLogFile
						echo "       步骤:0" | tee -a $errorLogFile
						echo "        类:ActiveMQ" | tee -a $errorLogFile
						echo "     功能描述:定时归档[周${data_keep_week},时${data_keep_hour}]" | tee -a $errorLogFile
						echo "      用户名:SYS" | tee -a $errorLogFile
						echo "       日志:61626归档失败!原因是有待消费MQ消息或者下载数据异常。" | tee -a $errorLogFile
						echo "     解决方案:非生产时间,手动执行归档(结束三台MQ服务器的6个activemq进程)。" | tee -a $errorLogFile
						echo "       级别:ERROR" | tee -a $errorLogFile
						echo "     其他参数:" | tee -a $errorLogFile
						echo "end----------------------------------------------------" | tee -a $errorLogFile
					fi
				#fi
					echo "61616或者61626归档失败,10分钟后重试。[data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
					echo "sleep:10分钟(600s)" | tee -a $logfile
					sleep 600
				fi
			else
				echo "${currTime}今天已完成归档![周${data_keep_week},时${data_keep_hour}][data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
				echo "sleep:10分钟(600s)" | tee -a $logfile
				sleep 600
			fi
		else
			echo "${currTime}非归档时间![周${data_keep_week},时${data_keep_hour}][data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
			

			# Outside the archive window: reset the archive state
			data_keep_61616_state=0;
			data_keep_61626_state=0;
			data_keep_state=0;
			echo "${currTime}归档状态复原![data_keep_state:${data_keep_state},data_keep_61616_state:${data_keep_61616_state},data_keep_61626_state:${data_keep_61626_state}]" | tee -a $logfile
			
		fi
	fi
	# Scheduled ActiveMQ archiving: end
	
	if [ $data_keep_state = 0 ]; then 
		# ActiveMQ port 61616/61626 check: begin
		check61616PortReslut=0
		check61616ServerIP=""
		checkAMQPort 61616 $P1
		check61616PortReslut=$checkPortReslut
		check61616ServerIP=$checkPortServerIP
		echo "check61616PortReslut${check61616PortReslut}" | tee -a $logfile
		echo "check61616ServerIP${check61616ServerIP}" | tee -a $logfile
		
		check61626PortReslut=0
		check61626ServerIP=""
		checkAMQPort 61626 $P2
		check61626PortReslut=$checkPortReslut
		check61626ServerIP=$checkPortServerIP
		echo "check61626PortReslut:${check61626PortReslut}" | tee -a $logfile
		echo "check61626ServerIP${check61626ServerIP}" | tee -a $logfile
		# ActiveMQ port 61616/61626 check: end
		
		if [ $check61616PortReslut -ne 0 ] && [ $check61626PortReslut -ne 0 ]; then
			# Ports are healthy: sleep 1 minute (60 s) before the next loop
			echo "sleep:1分钟(60s)" | tee -a $logfile
			sleep 60
		else
			# A port is down on all nodes: the processes were killed, so wait 20 minutes before checking again
			echo "sleep:20分钟(1200s)" | tee -a $logfile
			sleep 1200
		fi
		
	fi
done
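The forced-archive rule in the script depends on the number of whole days since the last successful archive. The following minimal sketch shows that calculation on its own. The function name days_between is hypothetical, and it assumes GNU date, which the script already relies on for `date -d`.

```shell
#!/bin/bash
# Print the whole days elapsed between two "YYYY-MM-DD HH:MM:SS" timestamps.
# $1: earlier timestamp; $2: later timestamp. Requires GNU date (-d option).
days_between() {
    local start_s end_s
    start_s=$(date -d "$1" +%s) || return 1
    end_s=$(date -d "$2" +%s) || return 1
    # Integer division drops the partial day, matching the script's expr usage.
    echo $(( (end_s - start_s) / 86400 ))
}

# Typical use with the script's own variables:
#   days=$(days_between "$data_keep_61616_last_date" "$(date +"%Y-%m-%d %T")")
#   [ "$days" -ge "$data_keep_day_count" ] && echo "force archive"
```

Keeping the calculation in one function avoids repeating the same arithmetic for the 61616 and 61626 clusters.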
