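Monitoring Tomcat with Nagios is both simple and complicated. It is simple if you only need to know whether a single Tomcat service on a web application server is running. It becomes harder if you also want to monitor other Tomcat metrics, such as the connection count or JVM memory usage, because a Google search turns up no suitable monitoring script. And if one web application server hosts several Tomcat instances, many of which sit behind redirects, considerably more work is needed.
The usual approach is to check Tomcat's TCP port. This has a flaw, however: when Tomcat hangs, the TCP port still responds OK even though the web applications deployed in Tomcat can no longer be reached. In that case you need to monitor Tomcat's status over HTTP.
This article therefore describes how to use HTTP checks to monitor multiple Tomcat application servers running on one web server.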
1 Install the NRPE client on the Tomcat web server:
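The RPM packages can be downloaded from:
Free download site: http://linux.linuxidc.com/
The user name and password are both www.linuxidc.com
The files are in the directory /2014年资料/6月/17日/Nagios通过check_http监控一台Web应用服务器上多个Tomcat服务
For download instructions, see http://www.linuxidc.com/Linux/2013-07/87684.htm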
1.1 Install the NRPE client from RPM packages
- [root@localhost nagios]# ll
- total 768
- -rw-r--r-- 1 root root 713389 12-16 12:08 nagios-plugins-1.4.11-1.x86_64.rpm
- -rw-r--r-- 1 root root 32706 12-16 12:09 nrpe-2.12-1.x86_64.rpm
- -rw-r--r-- 1 root root 18997 12-16 12:08 nrpe-plugin-2.12-1.x86_64.rpm
- [root@localhost nagios]# rpm -ivh *.rpm --nodeps --force
- Preparing... ########################################### [100%]
- 1:nagios-plugins ########################################### [ 33%]
- id: nagios: No such user
- 2:nrpe ########################################### [ 67%]
- 3:nrpe-plugin ########################################### [100%]
- [root@cache-1 ~]#
1.2 Append the command definitions and the monitoring server's IP address to the end of the configuration file
- [root@localhost nagios]# vim /etc/nagios/nrpe.cfg
- # added by tim on 2014-06-11
- command[check_users]=/usr/local/nagios/libexec/check_users -w 8 -c 15
- command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
- command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda
- command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
- #command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 50 -c 80
- command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 750 -c 800
- command[check-host-alive]=/usr/local/nagios/libexec/check_ping -H 10.xx.xx.10 -w 3000.0,80% -c 5000.0,100% -p 5
- allowed_hosts = 127.0.0.1,10.xx.xxx.xx1
Check that the command works:
- [root@webserver nrpe-2.15]# /usr/local/nagios/libexec/check_users -w 8 -c 15
- USERS OK - 2 users currently logged in |users=2;8;15;0
- [root@webserver nrpe-2.15]#
The output "USERS OK - ..." shows that the command works.
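The plugin output above has two parts: the readable status text and, after the `|`, performance data in the form `label=value;warn;crit;min`. The following is a minimal sketch of how such a line could be taken apart in a shell script. The function names `plugin_text`, `plugin_perfdata` and `perf_field` are made up for this illustration and are not part of Nagios; labels containing spaces are not handled.

```shell
#!/bin/sh
# Hypothetical helpers for splitting one line of Nagios plugin output.

# Print the human-readable part (everything before the first "|").
plugin_text() {
    printf '%s\n' "${1%%|*}" | sed 's/[[:space:]]*$//'
}

# Print the performance data (everything after the first "|"), if any.
plugin_perfdata() {
    case "$1" in
        *"|"*) printf '%s\n' "${1#*|}" | sed 's/^[[:space:]]*//' ;;
        *)     printf '\n' ;;
    esac
}

# Print field N of the perfdata item with the given label.
# Field numbers: 1=value, 2=warn, 3=crit, 4=min, 5=max.
perf_field() {
    perf=$1 label=$2 n=$3
    for item in $perf; do
        case "$item" in
            "$label"=*)
                printf '%s\n' "${item#*=}" | cut -d';' -f"$n"
                return 0 ;;
        esac
    done
    return 1
}

line='USERS OK - 2 users currently logged in |users=2;8;15;0'
plugin_text "$line"
perf_field "$(plugin_perfdata "$line")" users 1   # current value
perf_field "$(plugin_perfdata "$line")" users 2   # warning threshold
```

The warning and critical fields echo the `-w 8 -c 15` thresholds passed to check_users, which makes this a quick way to confirm that the thresholds in nrpe.cfg are the ones actually in effect.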
1.3 Starting nrpe fails with the following error:
- [root@webserver ~]# service nrpe restart
- Shutting down nrpe: [FAILED]
- Starting nrpe: /usr/sbin/nrpe: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory
- [FAILED]
- [root@webserver ~]#
- [root@db-m2-slave-1 nagios_client]# service nrpe start
- Starting nrpe: /usr/sbin/nrpe: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory
- [FAILED]
- [root@db-m2-slave-1 nagios_client]#
Create a symbolic link:
[root@db-m2-slave-1 nagios_client]# ln -s /usr/lib64/libssl.so /usr/lib64/libssl.so.6
(If libssl.so does not exist, link to libssl.so.10 instead: ln -s /usr/lib64/libssl.so.10 /usr/lib64/libssl.so.6)
[root@db-m2-slave-1 nagios_client]#
Then start nrpe again:
- [root@webserver nagios_client]# service nrpe start
- Starting nrpe: /usr/sbin/nrpe: error while loading shared libraries: libcrypto.so.6: cannot open shared object file: No such file or directory
- [FAILED]
- [root@web-10 ~]# ll /usr/lib64/libcrypto.so
- lrwxrwxrwx. 1 root root 18 Oct 13 2013 /usr/lib64/libcrypto.so -> libcrypto.so.1.0.0
- [root@webserver nagios_client]#
Create a second symbolic link, this time for libcrypto:
- [root@webserver nagios_client]# ln -s /usr/lib64/libcrypto.so /usr/lib64/libcrypto.so.6
- (If libcrypto.so does not exist, link to libcrypto.so.10 instead: ln -s /usr/lib64/libcrypto.so.10 /usr/lib64/libcrypto.so.6)
- [root@webserver nagios_client]# service nrpe start
- Starting nrpe: [ OK ]
- [root@webserver nagios_client]#
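The two symlink steps follow the same rule: use the unversioned `.so` if it exists, otherwise fall back to `.so.10`. As a rough sketch, that rule could be wrapped in a small script. The function names `pick_lib` and `link_compat_lib` are hypothetical, and the script only covers the two candidate names mentioned above.

```shell
#!/bin/sh
# Hypothetical helpers that reproduce the manual "ln -s" steps.

# Print the first existing candidate in directory $1 for base name $2
# (for example "libssl"), trying "<base>.so" before "<base>.so.10".
pick_lib() {
    dir=$1 base=$2
    for cand in "$base.so" "$base.so.10"; do
        if [ -e "$dir/$cand" ]; then
            printf '%s\n' "$dir/$cand"
            return 0
        fi
    done
    return 1
}

# Create <dir>/<base>.so.6 as a symlink, unless that name is already taken.
link_compat_lib() {
    dir=$1 base=$2
    link="$dir/$base.so.6"
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "skip: $link already exists"
        return 0
    fi
    target=$(pick_lib "$dir" "$base") || {
        echo "error: no $base.so or $base.so.10 in $dir" >&2
        return 1
    }
    ln -s "$target" "$link" && echo "linked: $link -> $target"
}

# Usage on the monitored host (as root):
#   link_compat_lib /usr/lib64 libssl
#   link_compat_lib /usr/lib64 libcrypto
```

Because an existing `.so.6` is left alone, the script can be run again safely. Note that the link only satisfies the loader's name lookup; whether the nrpe binary actually works with the newer library still has to be confirmed by starting the service, as done above.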
1.4 Check that nrpe is running properly:
Run the check from the Nagios server:
- [root@cache-2 ~]# /usr/local/nagios/libexec/check_nrpe -H 10.xx.xx.10
- NRPE v2.12
- [root@cache-2 ~]#
The reply "NRPE v2.12" shows that the connection succeeded.
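When a check like this is run from a script rather than by hand, the exit code matters more than the text: Nagios plugins return 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. Below is a small sketch of a wrapper that reports both. `state_name` and `run_check` are names invented for this example, and the demo call uses a stand-in command so that it runs without a Nagios installation.

```shell
#!/bin/sh
# Hypothetical wrapper around a Nagios plugin call.

# Map a plugin exit code to its state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run a command, print its first output line followed by the state
# derived from its exit code, and return that exit code.
run_check() {
    out=$("$@" 2>&1) && rc=0 || rc=$?
    printf '%s [%s]\n' "$(printf '%s\n' "$out" | head -n 1)" "$(state_name "$rc")"
    return "$rc"
}

# Stand-in command for demonstration. On the Nagios server the call would be:
#   run_check /usr/local/nagios/libexec/check_nrpe -H 10.xx.xx.10
run_check sh -c 'echo "NRPE v2.12"; exit 0'
```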
1.5 Add a test JSP file to the web application
(1) Create the test file
- vim ./webapps/nagios_test_0611/nagios_test_0611.jsp
- <%@ page language="java" contentType="text/html; charset=gb2312"
- pageEncoding="gb2312"%>
- <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
- <html>
- <head>
- <meta http-equiv="Content-Type" content="text/html; charset=gb2312">
- <title>nagios test here</title>
- </head>
- <body>
- <center>Now time is: <%=new java.util.Date()%></center>
- </body>
- </html>
(2) Test the check_http command
- [root@webserver ~]# /usr/local/nagios/libexec/check_http -I 10.xx.xx.10 -p 8300 -u /nagios_test_0611/nagios_test_0611.jsp -e 200
- HTTP CRITICAL - Invalid HTTP response received from host on port 8300: HTTP/1.1 404 Not Found
Tomcat has to be restarted before the newly added JSP can be served. Run the following stop and start commands:
/usr/local/app/apache-tomcat-6.0.37_8300/bin/shutdown.sh
/usr/local/app/apache-tomcat-6.0.37_8300/bin/startup.sh
Run the check_http command again:
- [root@webserver ~]# /usr/local/nagios/libexec/check_http -I 10.xx.xx.10 -p 8300 -u /nagios_test_0611/nagios_test_0611.jsp -e 200
- HTTP OK: Status line output matched "200" - 571 bytes in 0.882 second response time |time=0.882479s;;;0.000000 size=571B;;;0
- [root@webserver ~]#
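The check above covers the Tomcat instance on port 8300. One possible way to extend it to several instances on the same host is to generate one NRPE command definition per port. The sketch below is only an illustration under assumptions: the command names `check_tomcat_<port>` and the ports 8301 and 8302 are hypothetical, and it assumes the same test JSP has been deployed in every instance.

```shell
#!/bin/sh
# Emit one nrpe.cfg command definition per Tomcat HTTP port.
# PLUGIN, HOST and URI are taken from the check_http call shown above;
# the command names and every port other than 8300 are hypothetical.
PLUGIN=/usr/local/nagios/libexec/check_http
HOST=10.xx.xx.10
URI=/nagios_test_0611/nagios_test_0611.jsp

tomcat_commands() {
    for port in "$@"; do
        printf 'command[check_tomcat_%s]=%s -I %s -p %s -u %s -e 200\n' \
            "$port" "$PLUGIN" "$HOST" "$port" "$URI"
    done
}

# Print the definitions so that they can be reviewed before being
# appended to /etc/nagios/nrpe.cfg.
tomcat_commands 8300 8301 8302
```

Printing the lines instead of writing them straight into the configuration file keeps the step reviewable; nrpe has to be restarted after the file changes.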
1.6 View the NRPE monitoring commands
- [root@webserver nrpe-2.15]# cat /etc/nagios/nrpe.cfg |grep -v "^#"|grep -v "^$"
- log_facility=daemon
- pid_file=/var/run/nrpe.pid
- server_port=5666
- nrpe_user=nagios
- nrpe_group=nagios
- dont_blame_nrpe=0
- debug=0
- command_timeout=60
- connection_timeout=300
- command[check_users]=/usr/local/nagios/libexec/check_users -w 8 -c 15
- command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
- command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda
- command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
- command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 750 -c 800
- command[check-host-alive]=/usr/local/nagios/libexec/check_ping -H 10.xx.xx.10 -w 3000.0,80% -c 5000.0,100% -p 5
- allowed_hosts=127.0.0.1,10.xx.xxx.xx1
- [root@webserver nrpe-2.15]#
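The `cat | grep -v | grep -v` pipeline above can be condensed into a single grep, and the command names can be pulled out with sed. This is a small sketch only; `active_lines` and `command_names` are names chosen for this example.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting an nrpe.cfg file.

# Print the active lines of a config file, skipping comments and blank
# lines (leading whitespace before "#" is tolerated).
active_lines() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Print the names defined by active command[...] lines, one per line.
command_names() {
    active_lines "$1" | sed -n 's/^command\[\([^]]*\)]=.*/\1/p'
}

# Usage on the monitored host:
#   active_lines /etc/nagios/nrpe.cfg
#   command_names /etc/nagios/nrpe.cfg
```

For the configuration listed above, `command_names` would print check_users, check_load, check_sda1, check_zombie_procs, check_total_procs and check-host-alive, which are the names the Nagios server can request through check_nrpe with `-c`.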