ELK: logstash

ELK study notes: Logstash

Installing logstash

# Logstash requires Java 7; check the Java version first
$ java -version
java version "1.7.0_101"
OpenJDK Runtime Environment (IcedTea 2.6.6) (7u101-2.6.6-0ubuntu0.14.04.1)
OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)

# Installing from package repositories
# Download and install the Public Signing Key
$ wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

# Add the repository definition
$ echo "deb https://packages.elastic.co/logstash/2.3/debian stable main" | sudo tee -a /etc/apt/sources.list
$ sudo apt-get update && sudo apt-get install logstash

Basic Logstash example

$ cd /opt/logstash
$ bin/logstash -e 'input { stdin { } } output { stdout {} }'
Hello World!
2016-05-24T13:34:42.423Z ubuntu Hello World!

Logstash pipeline

A Logstash pipeline has two required elements, input and output, and one optional element, filter.
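A minimal skeleton illustrating the three sections (the filter is commented out to mark it optional; stdin/stdout and the rubydebug codec are the same plugins used elsewhere in these notes):

```
input  { stdin { } }                        # required: where events come from
# filter { ... }                            # optional: how events are transformed
output { stdout { codec => rubydebug } }    # required: where events go
```

A one-liner like this can also be passed directly on the command line with `bin/logstash -e '...'`, as in the basic example above.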

Input plugin: File

Use a specific file as the input:

input {
#   stdin {}
    file {
        path => "/home/fan/sourcecode/elk/grok/access.log"
        start_position => "beginning"
        ignore_older => 0
    }
}

We often need to re-import the same file. Logstash records which files it has already processed in a .sincedb database, so to re-import a file, the .sincedb file must be deleted manually.

fan@ubuntu:~$ ls -ltr .sincedb*
-rw-rw-r-- 1 fan fan 19 Jun 28 15:56 .sincedb_39998516454e71a3e5bbefcf9b8bc709
fan@ubuntu:~$ rm .sincedb*
fan@ubuntu:~$ 
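An alternative to deleting the file by hand is the file input's `sincedb_path` option: pointing it at `/dev/null` makes Logstash discard its read positions, so the file is re-imported on every run.

```
input {
    file {
        path => "/home/fan/sourcecode/elk/grok/access.log"
        start_position => "beginning"
        sincedb_path => "/dev/null"   # don't persist read positions
    }
}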

Output plugin: Stdout

Point the output at the console first; this makes early verification easier.

output {
    stdout {
        # pretty-print events for easier debugging
        codec => rubydebug
    }
}

Filter plugin: Grok

Grok is currently the best way in logstash to parse crappy unstructured log data into something structured and queryable.

Grok ships with many built-in patterns, and most cases can be matched using these stock patterns.

# text: www.google.com
# pattern: %{HOSTNAME:host}
output:
{
  "host": [
    [
      "www.google.com"
    ]
  ]
}
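Under the hood, a grok pattern is essentially a named regular expression. A rough Python sketch of what `%{HOSTNAME:host}` does (the regex below is a simplification for illustration, not grok's actual HOSTNAME definition):

```python
import re

# Simplified stand-in for grok's HOSTNAME pattern, captured into a named group.
HOSTNAME = r"(?P<host>\b(?:[0-9A-Za-z][0-9A-Za-z-]*\.)+[A-Za-z]{2,}\b)"

match = re.search(HOSTNAME, "www.google.com")
print(match.groupdict())   # {'host': 'www.google.com'}
```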

You can also define custom patterns, as follows:

# (?<field_name>the pattern here) 
# text: text/html
# pattern: (?<mime>%{WORD:mime_type}/%{WORD:mime_subtype}|-)
output:
{
  "mime": [
    [
      "text/html"
    ]
  ],
  "mime_type": [
    [
      "text"
    ]
  ],
  "mime_subtype": [
    [
      "html"
    ]
  ]
}
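The `(?<name>...)` syntax is ordinary named-group regex syntax (Python spells it `(?P<name>...)`). The mime pattern above translates roughly to:

```python
import re

# Python analogue of (?<mime>%{WORD:mime_type}/%{WORD:mime_subtype}|-)
MIME = r"(?P<mime>(?P<mime_type>\w+)/(?P<mime_subtype>\w+)|-)"

print(re.fullmatch(MIME, "text/html").groupdict())
# {'mime': 'text/html', 'mime_type': 'text', 'mime_subtype': 'html'}
print(re.fullmatch(MIME, "-").groupdict())
# {'mime': '-', 'mime_type': None, 'mime_subtype': None}
```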

You can verify that a newly built Grok pattern works correctly with the Grokdebug website.

Example

The following example shows how to parse Squid3's default access log:

# example
# 1463987745.513  55287 127.0.0.1 TCP_MISS/200 147235 CONNECT ssl.gstatic.com:443 - HIER_DIRECT/216.58.221.99 -
# 1463884248.230   2251 127.0.0.1 TCP_MISS/404 491 GET http://tp.client.xunlei.com/update/xml/1.1.2.259_0.xml - HIER_DIRECT/119.188.94.188 text/html

# configuration
# The # character at the beginning of a line indicates a comment. Use 
# comments to describe your configuration. 
input { 
    file { 
        path => "/home/fan/sourcecode/elk/grok/access_10.log" 
        start_position => "beginning"
        ignore_older => 0  
    } 
} 
# The filter part of this file is commented out to indicate that it is 
# optional. 
filter { 
    grok { 
        match => { "message" => "%{NUMBER:timestamp}\s+%{NUMBER:duration} %{IPV4:client} (%{WORD:squid_request_status}/%{NUMBER:http_status}) %{NUMBER:reply_size} %{WORD:method} (?<request_url>%{URI}|%{URIHOST}) - (%{WORD:squid_hierarchy_status}/%{IPV4:server_ip}) (?<mime>%{WORD:mime_type}/%{WORD:mime_subtype}|-)" } 
    } 
 
    date { 
        match => [ "timestamp", "UNIX" ] 
    } 
} 
output { 
    elasticsearch { 
    } 
} 


# output
{
                   "message" => "1463987745.513  55287 127.0.0.1 TCP_MISS/200 147235 CONNECT ssl.gstatic.com:443 - HIER_DIRECT/216.58.221.99 -",
                  "@version" => "1",
                "@timestamp" => "2016-05-24T12:35:32.870Z",
                      "host" => "ubuntu",
                 "timestamp" => "1463987745.513",
                  "duration" => "55287",
                    "client" => "127.0.0.1",
      "squid_request_status" => "TCP_MISS",
               "http_status" => "200",
                "reply_size" => "147235",
                    "method" => "CONNECT",
               "request_url" => "ssl.gstatic.com:443",
                      "port" => "443",
    "squid_hierarchy_status" => "HIER_DIRECT",
                 "server_ip" => "216.58.221.99",
                      "mime" => "-"
}

{
                   "message" => "1463884248.230   2251 127.0.0.1 TCP_MISS/404 491 GET http://tp.client.xunlei.com/update/xml/1.1.2.259_0.xml - HIER_DIRECT/119.188.94.188 text/html",
                  "@version" => "1",
                "@timestamp" => "2016-05-24T12:35:42.517Z",
                      "host" => "ubuntu",
                 "timestamp" => "1463884248.230",
                  "duration" => "2251",
                    "client" => "127.0.0.1",
      "squid_request_status" => "TCP_MISS",
               "http_status" => "404",
                "reply_size" => "491",
                    "method" => "GET",
               "request_url" => "http://tp.client.xunlei.com/update/xml/1.1.2.259_0.xml",
    "squid_hierarchy_status" => "HIER_DIRECT",
                 "server_ip" => "119.188.94.188",
                      "mime" => "text/html",
                 "mime_type" => "text",
              "mime_subtype" => "html"
}
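For comparison, the same extraction can be sketched in plain Python with a regex that mirrors the grok expression above (simplified: `\S+` here does not cover every URL form that grok's `%{URI}` handles). The `date` filter's job corresponds to the last step, interpreting the UNIX timestamp as the event time:

```python
import re
from datetime import datetime, timezone

# Simplified mirror of the grok expression used in the filter above.
SQUID = re.compile(
    r"(?P<timestamp>\d+\.\d+)\s+(?P<duration>\d+) (?P<client>\S+) "
    r"(?P<squid_request_status>\w+)/(?P<http_status>\d+) (?P<reply_size>\d+) "
    r"(?P<method>\w+) (?P<request_url>\S+) - "
    r"(?P<squid_hierarchy_status>\w+)/(?P<server_ip>\S+) (?P<mime>\S+)"
)

line = ("1463884248.230   2251 127.0.0.1 TCP_MISS/404 491 GET "
        "http://tp.client.xunlei.com/update/xml/1.1.2.259_0.xml - "
        "HIER_DIRECT/119.188.94.188 text/html")

fields = SQUID.match(line).groupdict()
# Equivalent of date { match => [ "timestamp", "UNIX" ] }:
event_time = datetime.fromtimestamp(float(fields["timestamp"]), tz=timezone.utc)
print(fields["http_status"], fields["method"], event_time.isoformat())
```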

References

  • https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html#_custom_patterns
  • http://www.squid-cache.org/Doc/config/logformat/

Written with StackEdit.
