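Doris BE monitoring
When a BE process dies, a scheduled Azkaban job detects it and restarts the BE service.
After the restart completes, the check script still exits with a non-zero status, so the Azkaban job is marked as failed and an alert is sent.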
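The behaviour described above relies on an exit-code convention: a healthy service yields status 0, while a dead service is restarted and the job still reports failure so that the scheduler's failure alert fires. A minimal sketch of that pattern follows; `check_and_restart`, `is_alive_cmd` and `restart_cmd` are hypothetical names used only for illustration and are not part of Doris or Azkaban.

```shell
#!/bin/bash
# Sketch of the check / restart / fail pattern.
# is_alive_cmd and restart_cmd are hypothetical placeholders, not Doris commands.

check_and_restart() {
    local is_alive_cmd=$1
    local restart_cmd=$2

    if $is_alive_cmd; then
        echo "service is healthy"
        return 0
    fi

    echo "service is down, restarting"
    $restart_cmd
    # Return non-zero even after the restart was attempted, so the
    # scheduler marks the job as failed and its failure alert fires.
    return 1
}
```

Calling `check_and_restart true true` returns 0, while `check_and_restart false true` runs the restart command and returns 1.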
Azkaban Zip scripts
azkaban.project
    azkaban-flow-version: 2.0
doris_check.flow
    nodes:
      - name: doris_node170_be_was_dead_and_restart_complited_now
        type: command
        config:
          command: sh /opt/sync/sync_script/sink_doris/az_doris_be_check.sh 10.0.14.170
      - name: doris_node171_be_was_dead_and_restart_complited_now
        type: command
        config:
          command: sh /opt/sync/sync_script/sink_doris/az_doris_be_check.sh 10.0.14.171
      - name: doris_node172_be_was_dead_and_restart_complited_now
        type: command
        config:
          command: sh /opt/sync/sync_script/sink_doris/az_doris_be_check.sh 10.0.14.172
az_doris_be_check.sh
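    #! /bin/bash
    # Usage: az_doris_be_check.sh <be_host>
    host=$1
    doris_check_script_path=/opt/doris/be/bin

    # The heredoc delimiter is unquoted, so $doris_check_script_path is
    # expanded locally before the text is sent to the remote shell.
    ssh "root@$host" << eeooff
    $doris_check_script_path/doris_be_check.sh
    eeooff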
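The wrapper script works as an Azkaban health check because `ssh` exits with the exit status of the remote command (or 255 if the connection itself fails), and a shell script exits with the status of its last command. The sketch below illustrates that propagation locally; `run_remote` is a hypothetical stand-in that pipes the heredoc into a local `bash -s` instead of a real `ssh root@$host` session.

```shell
#!/bin/bash
# run_remote is a hypothetical stand-in for "ssh root@$host":
# it feeds the heredoc to a child shell over stdin, as ssh would.
run_remote() {
    bash -s
}

# The child shell exits with status 1, as doris_be_check.sh does
# after it has had to restart a dead BE.
run_remote << eeooff
exit 1
eeooff
status=$?

echo "wrapper sees exit status: $status"
```

Because the `ssh` call is the last command in az_doris_be_check.sh, that status becomes the status of the Azkaban command node.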
BE node script
$DORIS_HOME/be/bin/doris_be_check.sh
    #! /bin/bash

    # Base name of the running BE binary; empty if no palo_be process exists.
    check="$(ps -ef | grep palo_be | grep -v grep | awk '{print $8}' | awk -F '[/]' '{print $NF}')"
    doris_path="/opt/doris"

    # Restart the BE, wait 10 seconds, then check whether the process came back.
    start(){
        now=$(date "+%Y-%m-%d %H:%M:%S")
        echo "Restarting BE... restart time: $now..."
        $doris_path/be/bin/start_be.sh --daemon
        sleep 10s
        test_after_restart="$(ps -ef | grep palo_be | grep -v grep | awk '{print $8}' | awk -F '[/]' '{print $NF}')"
        if [[ $test_after_restart = "palo_be" ]]; then
            echo "BE restarted successfully..."
        else
            echo "BE failed to start..."
        fi
    }

    if [[ $check = "palo_be" ]]; then
        echo "BE is running normally..."
        exit 0
    else
        echo "BE is down..."
        start
        # Exit non-zero even after a restart so that Azkaban fails the job and alerts.
        exit 1
    fi
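The script waits a fixed 10 seconds before re-checking the process. One possible variation, shown here only as a sketch and not as part of the original setup, is to poll until the check succeeds or a limit is reached. `wait_until` is a hypothetical helper name; the check command passed to it is whatever the caller chooses.

```shell
#!/bin/bash
# wait_until: hypothetical helper, not part of Doris.
# Runs the given check command up to <attempts> times, sleeping
# <interval> seconds after each failed try.
# Returns 0 as soon as the check succeeds, 1 if every attempt fails.
wait_until() {
    local attempts=$1
    local interval=$2
    shift 2

    local i
    for ((i = 0; i < attempts; i++)); do
        if "$@"; then
            return 0
        fi
        sleep "$interval"
    done
    return 1
}
```

Inside `start()`, the fixed sleep could then be replaced by something like `wait_until 10 1 be_is_running`, where `be_is_running` would be a small function wrapping the same `ps` and `grep` pipeline used above.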