ELK + Kafka Log System Setup: A Hands-On Walkthrough

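Logs fall into three main categories: system logs, application logs, and security logs. Operations staff and developers use them to learn about a server's hardware and software, to catch configuration mistakes, and to find out why an error occurred. Analyzing logs regularly also shows a server's load, performance, and security posture, so problems can be corrected early.

Logs are usually scattered across many machines. If you manage dozens or hundreds of servers and still log in to each one to read its logs, the work is tedious and inefficient. The first step is centralized log management, for example using the open-source syslog to collect the logs from every server in one place.

Once logs are centralized, searching and aggregating them becomes the next headache. Linux commands such as grep, awk, and wc handle basic searches and counts, but they fall short when the queries, sorting, and statistics get more demanding and the number of machines grows.

Elasticsearch is an open-source distributed search engine. Its features include distributed operation, zero configuration, automatic node discovery, automatic index sharding, index replicas, a RESTful interface, multiple data sources, and automatic search load balancing.
Logstash is a fully open-source tool that collects and filters your logs and stores them for later use (for example, searching).
Kibana is also free and open source. It gives Logstash and Elasticsearch a friendly web interface for log analysis, helping you aggregate, analyze, and search important log data.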

How the ELK + Kafka log system works (the data flowing through it is logs)

Logstash on Windows/Linux (client) ---> Kafka (queue) ---> Logstash on the Kafka host (also a client) ---> ES (storage) ---> Kibana (UI)

Roles:

10.10.13.17 ES, Java

10.10.13.18 ES, Java

10.10.13.15 Logstash, Kafka, Java

10.10.13.12 nginx, Kibana

10.10.12.7 Linux client: Logstash, Java

1. Install ES (10.10.13.17)

yum install -y java-1.7.0-openjdk*

groupadd elkuser

useradd -g elkuser elkuser

wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-1.7.3.tar.gz

tar -xf elasticsearch-1.7.3.tar.gz -C /home    # /home is on a dedicated mounted partition, so ES is installed there

mv /home/elasticsearch-1.7.3 /home/elasticsearch

chown -R elkuser:elkuser /home/elasticsearch

Edit the configuration file:

vi /home/elasticsearch/config/elasticsearch.yml

node.data: true

index.number_of_shards: 5

index.number_of_replicas: 0

bootstrap.mlockall: true

network.bind_host: 10.10.13.17

network.host: 10.10.13.17

http.port: 16780

http.cors.enabled: true

http.cors.allow-origin: "*"

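Note: by default the index data is stored under /elasticsearch/data/elasticsearch/nodes/0/indices/. The commands from here on use the path /elasticsearch/elasticsearch-1.5.2, whereas the steps above unpack 1.7.3 into /home/elasticsearch; substitute your own installation directory.

Start ES

# /elasticsearch/elasticsearch-1.5.2/bin/service/elasticsearch start    # or stop; starts/stops ES manually

Restart script restartES.sh

# cat /elasticsearch/elasticsearch-1.5.2/restartES.sh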

/elasticsearch/elasticsearch-1.5.2/bin/service/elasticsearch stop

sleep 10s

/elasticsearch/elasticsearch-1.5.2/bin/service/elasticsearch start

To verify that Elasticsearch is deployed correctly, open http://10.10.13.17:16780/; it should return something like the following:

{
  "status" : 200,
  "name" : "All-American",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.5.2",
    "build_hash" : "62ff9868b4c8a0c45860bebb259e21980778ab1c",
    "build_timestamp" : "2015-04-27T09:21:06Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}
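
The same check can be scripted instead of read by eye. The sketch below is not from the original setup: es_version is a hypothetical helper name, and it assumes the response has the shape shown above, with the version string under a "number" key. It uses only sed, so it reads the JSON as text rather than parsing it.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original article):
# print the value of "number" from an Elasticsearch root-endpoint response
# read on standard input. This is a text match, not a real JSON parser.
es_version() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Intended use against the node from this article (needs curl and a
# running node, so it is left as a comment here):
#   curl -s http://10.10.13.17:16780/ | es_version
```

An empty result means the node did not answer or answered with something other than the expected document, which is usually enough for a quick smoke test.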

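Install Elasticsearch plugins:

head (shows nearly all cluster information, supports simple search queries, lets you watch automatic recovery, and so on)

bigdesk (shows the cluster's JVM information, disk I/O, index creation and deletion activity, and so on)

kopf (provides a simple way to carry out common tasks on an Elasticsearch cluster)

# /elasticsearch/elasticsearch-1.5.2/bin/plugin -install mobz/elasticsearch-head

# /elasticsearch/elasticsearch-1.5.2/bin/plugin -install lukas-vlcek/bigdesk

# /elasticsearch/elasticsearch-1.5.2/bin/plugin -install lmenezes/elasticsearch-kopf/1.6

Once installed, head is available at http://10.10.13.17:16780/_plugin/head. The cluster holds no data yet, so the page is empty.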

Clean up old logs regularly (script cleanES.sh) and schedule the cleanup with crontab

# cat /root/cleanES.sh

#!/bin/bash

DAY=$(date -d "20 days ago" +%Y.%m.%d)

DAYy=$(date -d "21 days ago" +%Y.%m.%d)

DAY1=$(date -d "30 days ago" +%Y.%m.%d)

DAY3=$(date -d "300 days ago" +%Y.%m.%d)

DAY2=$(date -d "120 days ago" +%Y.%m)

# haproxy logs (logstash-YYYY.mm.dd indices): moved out of the live index directory after 20 days; backups older than 30 days are deleted

#echo logstash-$DAY logstash-YYYY.mm.dd

#echo logstash-$DAY2

#echo logstash-$DAY1 logstash-YYYY.mm

mv /elasticsearch/elasticsearch-1.5.2/data/elasticsearch/nodes/0/indices/logstash-$DAY /elasticsearch/elasticsearch-1.5.2/bakup_data/

mv /elasticsearch/elasticsearch-1.5.2/data/elasticsearch/nodes/0/indices/linuxlog-$DAY /elasticsearch/elasticsearch-1.5.2/bakup_data/

#rm -rf /elasticsearch/elasticsearch-1.5.2/data/elasticsearch/nodes/0/indices/logstash-$DAY2

rm -rf /elasticsearch/elasticsearch-1.5.2/bakup_data/logstash-$DAY1

rm -rf /elasticsearch/elasticsearch-1.5.2/bakup_data/linuxlog-$DAY3

# Delete the moved-out indices from the cluster

curl -XDELETE "http://10.10.13.17:16780/linuxlog-$DAYy"

curl -XDELETE "http://10.10.13.17:16780/logstash-$DAYy"

Crontab entry for the cleanup job

# crontab -e

30 3 * * * sh /root/cleanES.sh
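
Before letting cron delete anything, it can help to see which index names the date arithmetic in cleanES.sh produces. The sketch below is only an illustration: index_name is a hypothetical helper that is not part of cleanES.sh, and it takes an explicit base date so the result is reproducible. Like the script itself, it assumes GNU date.

```shell
#!/bin/sh
# Hypothetical helper (not part of cleanES.sh): print "<prefix>-YYYY.MM.DD"
# for the day that lies <days> days before <base-date> (YYYY-MM-DD).
# Requires GNU date, which cleanES.sh already relies on for "date -d".
index_name() {
    prefix=$1
    days=$2
    base=$3
    printf '%s-%s\n' "$prefix" "$(date -u -d "$base -$days days" +%Y.%m.%d)"
}

# Example: the index that a run on 2016-01-21 would move out of the
# live data directory (20 days back):
#   index_name logstash 20 2016-01-21    # prints logstash-2016.01.01
```

Printing the names for 20, 21, and 30 days back on a given date makes it easy to confirm that the mv, curl -XDELETE, and rm -rf lines in the script refer to the indices you expect.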

2. Install ES (10.10.13.18)

The steps are the same as in part 1; only the configuration file differs.

Edit the configuration file:

vi /home/elasticsearch/config/elasticsearch.yml

node.data: true

index.number_of_shards: 5

index.number_of_replicas: 0

bootstrap.mlockall: true

network.bind_host: 10.10.13.18

network.host: 10.10.13.18

http.port: 16780

http.cors.enabled: true

http.cors.allow-origin: "*"

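Start ES

# /elasticsearch/elasticsearch-1.5.2/bin/service/elasticsearch start    # or stop; starts/stops ES manually

The restart script and the log cleanup script are the same as in part 1.

Install Elasticsearch plugins:

head (shows nearly all cluster information, supports simple search queries, lets you watch automatic recovery, and so on)

bigdesk (shows the cluster's JVM information, disk I/O, index creation and deletion activity, and so on)

kopf (provides a simple way to carry out common tasks on an Elasticsearch cluster)

# /elasticsearch/elasticsearch-1.5.2/bin/plugin -install mobz/elasticsearch-head

# /elasticsearch/elasticsearch-1.5.2/bin/plugin -install lukas-vlcek/bigdesk

# /elasticsearch/elasticsearch-1.5.2/bin/plugin -install lmenezes/elasticsearch-kopf/1.6

Once installed, head is available at http://10.10.13.18:16780/_plugin/head. The cluster holds no data yet, so the page is empty.

The ES cluster deployment is now complete.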

====================================================

3. Install Logstash on the Linux client and test log output to ES (10.10.12.7)

yum install -y java-1.7.0-openjdk*

wget https://download.elasticsearch.org/logstash/logstash/logstash-2.2.0.tar.gz

tar xf logstash-2.2.0.tar.gz -C /usr/local/

cd /usr/local/logstash-2.2.0/

mkdir etc    # create a directory for configuration files

cd etc/

vi logstash-agent.conf

input {
  file {
    type => "system-messages"
    path => "/var/log/messages"
  }
}

output {
  elasticsearch {
    hosts => ["10.10.13.17:16780","10.10.13.18:16780"]
  }
}

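This sends the system log of 10.10.12.7 to ES.

Start Logstash:

/usr/local/logstash-2.2.0/bin/logstash -f /usr/local/logstash-2.2.0/etc/logstash-agent.conf &

After a short while, check http://10.10.13.17:16780/_plugin/head for incoming log entries.

Entries appearing there confirm that Logstash is writing to ES.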
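
A missing closing brace is the easiest mistake to make in a Logstash config with nested input and output blocks. As a rough first check, the sketch below counts opening and closing braces in a file. braces_balanced is a hypothetical helper, and the check is deliberately naive: it also counts braces inside strings and comments, so it can only flag an obvious imbalance and is no substitute for a real syntax check.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not from the original article):
# succeed if the file named by $1 contains as many "{" as "}".
# Naive on purpose: braces inside strings or comments are counted too.
braces_balanced() {
    opens=$(($(tr -cd '{' < "$1" | wc -c)))
    closes=$(($(tr -cd '}' < "$1" | wc -c)))
    [ "$opens" -eq "$closes" ]
}

# Intended use before starting Logstash:
#   braces_balanced /usr/local/logstash-2.2.0/etc/logstash-agent.conf \
#       || echo "unbalanced braces in config" >&2
```

Logstash 2.x also accepts a --configtest flag together with -f, which parses the file fully and is the more reliable check once the brace count looks right.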

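The system log is now being collected and written to the ES cluster. In the setup above, Logstash writes directly to ES. When the log volume is modest, pointing the output at the ES cluster like this is enough; when the volume is large, a message queue should sit in between to relieve pressure on the ES cluster. I previously used a single Redis instance as the queue, but Redis cannot run its list type as a cluster, so the Redis single point of failure cannot be solved. That is why I chose Kafka here, although resource limits mean only one Kafka broker is used for now. A Kafka cluster needs a ZooKeeper cluster installed first; Kafka ships with ZooKeeper, so it only has to be unpacked and configured.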

4. Install Kafka (10.10.13.15)

yum install -y java-1.7.0-openjdk*

# wget http://apache.fayea.com/kafka/0.9.0.1/kafka_2.11-0.8.2.2.tgz

# tar -zxvf kafka_2.11-0.8.2.2.tgz

# mv kafka_2.11-0.8.2.2 /usr/local/kafka

vi /usr/local/kafka/config/server.properties

broker.id=0

listeners=PLAINTEXT://:9092

host.name=10.10.13.15

advertised.host.name=10.10.13.15

num.network.threads=3

num.io.threads=8

socket.send.buffer.bytes=102400

socket.receive.buffer.bytes=102400

socket.request.max.bytes=104857600

log.dirs=/elasticsearch/kafka-logs

num.partitions=4

num.recovery.threads.per.data.dir=1

log.retention.hours=168

log.segment.bytes=1073741824

log.retention.check.interval.ms=300000

zookeeper.connect=localhost:2181

zookeeper.connection.timeout.ms=6000

# vi /usr/local/kafka/config/zookeeper.properties

dataDir=/kafka/zookeeper

clientPort=2181

maxClientCnxns=0

autopurge.snapRetainCount=100

autopurge.purgeInterval=12

Start the ZooKeeper and Kafka services. Use netstat -tnlp to confirm that ports 2181 and 9092 are listening. The services can also be set to start at boot.

# /usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties > /dev/null &

# /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties > /dev/null &
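
The port check can be automated by filtering the netstat output. The sketch below is an illustration under assumptions: port_listening is a hypothetical helper, and it expects the usual Linux netstat -tnl column layout, where the fourth column is the local address and ends in ":port".

```shell
#!/bin/sh
# Hypothetical helper (not from the original article): succeed if the
# "netstat -tnl" output on standard input shows a listener on TCP port $1.
# Assumes the Linux net-tools layout: column 4 is "address:port".
port_listening() {
    awk -v port="$1" '
        /LISTEN/ {
            n = split($4, parts, ":")      # the last ":"-separated field is the port
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Intended use on the Kafka host:
#   netstat -tnl | port_listening 2181 && echo "ZooKeeper is listening"
#   netstat -tnl | port_listening 9092 && echo "Kafka is listening"
```

Splitting on ":" and taking the last field keeps the check working for both IPv4 entries such as 0.0.0.0:9092 and IPv6 entries such as :::9092.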

Common Kafka commands (run any script under bin with --help for usage):

<> Create a topic

# bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

# Note: the replication factor cannot exceed the number of brokers

<> List all topics

# bin/kafka-topics.sh --list --zookeeper localhost:2181

__consumer_offsets

bar

foo

haproxy

linuxlog    (create this topic first for the Linux clients; their Logstash can then write to it. This is the topic used to collect Linux logs with ELK)

m-test-topic

test-1 - marked for deletion

winlog    (create this topic first for the Windows clients; their Logstash can then write to it. This is the topic used to collect Windows logs with ELK)

winlog2
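
The rule that the replication factor cannot exceed the number of brokers is easy to encode in a small wrapper. The sketch below is hypothetical: topic_create_cmd is not a Kafka tool, the broker count is passed in by hand rather than discovered, and the function only prints the kafka-topics.sh command from this article instead of running it.

```shell
#!/bin/sh
# Hypothetical wrapper (an assumption, not a Kafka tool): refuse a
# replication factor larger than the broker count, otherwise print the
# kafka-topics.sh command so it can be reviewed before it is run.
# Arguments: <topic> <partitions> <replication-factor> <broker-count>
topic_create_cmd() {
    topic=$1
    partitions=$2
    factor=$3
    brokers=$4
    if [ "$factor" -gt "$brokers" ]; then
        echo "replication factor $factor exceeds broker count $brokers" >&2
        return 1
    fi
    echo "bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor $factor --partitions $partitions --topic $topic"
}

# Example for the single-broker setup in this article:
#   topic_create_cmd linuxlog 4 1 1
```

With only one broker, as here, the only valid replication factor is 1; the wrapper makes that constraint explicit before the command reaches Kafka.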

Test:

Send messages, acting as the producer:

/bin/bash /usr/local/kafka/bin/kafka-console-producer.sh --broker-list 10.10.13.15:9092 --topic summer

This is a messages

welcome to kafka

Receive messages, acting as the consumer:

# /usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper 10.10.13.15:2181 --topic summer --from-beginning

This is a messages

welcome to kafka

<> Start the services at boot

vim /etc/rc.local

touch /var/lock/subsys/local

mount /dev/sdc /elasticsearch/    # mount the data storage directory at boot

/usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties > /dev/null &

/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties > /dev/null &

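So how do we read the data from Kafka and hand it to the ES cluster?

5. Install Logstash on the Kafka host to forward received logs to ES (10.10.13.15)

yum install -y java-1.7.0-openjdk*

wget https://download.elasticsearch.org/logstash/logstash/logstash-2.2.0.tar.gz

tar xf logstash-2.2.0.tar.gz -C /usr/local/

cd /usr/local/logstash-2.2.0/

mkdir etc    # create a directory for configuration files

cd etc/

vi linuxlog-es.conf    # the Logstash config file can have any name, as long as it is passed at startup

Forward the logs that Kafka receives on the linuxlog topic to ES: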

input {
  kafka {
    zk_connect => "10.10.13.15:2181"
    topic_id => "linuxlog"
    codec => plain
    reset_beginning => false
    consumer_threads => 5
    decorate_events => true
  }
}

output {
  elasticsearch {
    hosts => ["10.10.13.17:16780","10.10.13.18:16780"]
    index => "linuxlog-%{+YYYY-MM}"
  }
}
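
The index setting above yields one index per month, named from the event timestamp, which Logstash formats in UTC. The sketch below mimics that naming with GNU date so you can predict where an event will land. linuxlog_index is a hypothetical helper, not a Logstash feature. Note that this pattern is monthly and hyphen-separated, while cleanES.sh in part 1 builds daily, dot-separated names such as linuxlog-YYYY.mm.dd, so the cleanup script would need matching names to find these indices.

```shell
#!/bin/sh
# Hypothetical helper (not a Logstash feature): print the index name that
# the pattern "linuxlog-%{+YYYY-MM}" would give for a timestamp, using the
# UTC month of that timestamp. Requires GNU date.
linuxlog_index() {
    date -u -d "$1" +linuxlog-%Y-%m
}

# Examples:
#   linuxlog_index "2016-11-11 08:00:00 UTC"      # prints linuxlog-2016-11
#   linuxlog_index "2016-12-01 07:30:00 +0800"    # prints linuxlog-2016-11
```

The second example shows the effect of UTC: an event logged early on 1 December in a UTC+8 time zone still belongs to the November index.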

One thing remains before the result is visible: the Linux client (10.10.12.7) must send its logs to Kafka.

On 10.10.12.7, rewrite the Logstash configuration file:

vi /usr/local/logstash-2.2.0/etc/logstash-agent.conf

input {
  file {
    type => "redis"
    path => "/var/log/redis/redis.log"    # the Redis log on 10.10.12.7 serves as the example here
  }
}

output {
  kafka {
    topic_id => "linuxlog"
    bootstrap_servers => "10.10.13.15:9092"
    workers => 2
  }
}

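The logs now show up on the ES monitoring page at http://10.10.13.17:16780/_plugin/head.

nginx and Kibana provide the access point and a polished presentation of the data.

6. Install nginx and Kibana (10.10.13.12)

Install and deploy nginx and adjust its configuration

Download the nginx tarball from the nginx website, unpack it, enter its directory, then build it and start the service to test it

# wget http://nginx.org/download/nginx-1.6.2.tar.gz

# tar -zxf nginx-1.6.2.tar.gz -C /usr/local/

# mv /usr/local/nginx-1.6.2 /usr/local/nginx

# cd /usr/local/nginx

# ./configure --prefix=/usr/local/nginx

# configure may fail here: nginx depends on the PCRE library, so install it first and then rerun ./configure

yum install pcre pcre-devel

make && make install

# Start the service; http://10.10.13.12 should then show the nginx welcome page

# /usr/local/nginx/sbin/nginx

Install and deploy Kibana 3 and adjust its configuration

Download the Kibana 3 package from https://github.com/elasticsearch/kibana and unpack it into /usr/local/nginx/html

# wget https://github.com/elasticsearch/kibana/kibana-3.1.2-linux-x64.tar.gz

# tar -zxf kibana-3.1.2-linux-x64.tar.gz -C /usr/local/nginx/html/kibana-latest/    # create the kibana-latest directory first if it does not exist

In config.js under the kibana-latest directory, point the elasticsearch setting at the ES node: elasticsearch: "http://"+"10.10.13.17"+":16780"

Note: keep the "+" concatenation exactly as shown here, otherwise Kibana may fail to connect to ES

# vim /usr/local/nginx/html/kibana-latest/config.js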

/** @scratch /configuration/config.js/1
 *
 * == Configuration
 * config.js is where you will find the core Kibana configuration. This file contains parameter that
 * must be set before kibana is run for the first time.
 */
define(['settings'],
function (Settings) {

  /** @scratch /configuration/config.js/2
   *
   * === Parameters
   */
  return new Settings({

    /** @scratch /configuration/config.js/5
     *
     * ==== elasticsearch
     *
     * The URL to your elasticsearch server. You almost certainly don't
     * want +http://localhost:9200+ here. Even if Kibana and Elasticsearch are on
     * the same host. By default this will attempt to reach ES at the same host you have
     * kibana installed on. You probably want to set it to the FQDN of your
     * elasticsearch host
     *
     * Note: this can also be an object if you want to pass options to the http client. For example:
     *
     *  +elasticsearch: {server: "http://localhost:9200", withCredentials: true}+
     */
    elasticsearch: "http://"+"10.10.13.17"+":16780",

    /** @scratch /configuration/config.js/5
     *
     * ==== default_route
     *
     * This is the default landing page when you don't specify a dashboard to load. You can specify
     * files, scripts or saved dashboards here. For example, if you had saved a dashboard called
     * `WebLogs' to elasticsearch you might use:
     *
     * default_route: '/dashboard/elasticsearch/WebLogs',
     */
    default_route : '/dashboard/file/default.json',

    /** @scratch /configuration/config.js/5
     *
     * ==== kibana-int
     *
     * The default ES index to use for storing Kibana specific object
     * such as stored dashboards
     */
    kibana_index: "kibana-int",

    /** @scratch /configuration/config.js/5
     *
     * ==== panel_name
     *
     * An array of panel modules available. Panels will only be loaded when they are defined in the
     * dashboard, but this list is used in the "add panel" interface.
     */
    panel_names: [
      'histogram',
      'map',
      'goal',
      'table',
      'filtering',
      'timepicker',
      'text',
      'hits',
      'column',
      'trends',
      'bettermap',
      'query',
      'terms',
      'stats',
      'sparklines'
    ]
  });
});

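Verify that Kibana displays correctly, then add a few simple charts.

Open http://10.10.13.12; it should show the Elasticsearch data. By default, Kibana's Logstash dashboard contains only a histogram of events over time and an all-events table.

# To show other charts, add them in the UI:

<> Click ADD A ROW at the bottom right to open Dashboard Settings > create a new row > Save

<> In the new row, click Add panel to empty row > add panel > terms > field > Save

<> The field length, the display width, and whether to show the others and missing values are up to you.

Source: http://blog.csdn.net/wangdaoge/article/details/53130263?locationNum=9&fps=1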
