Logstash Filter¶
Logstash filter plugins are used to perform intermediary processing on Logs. In our production, we use
grok
filter to extract various information from different service logs and add tags to them.
The reason why we use filters is that it will be very helpful for us to do search and data visualization in Kibana.
What is grok filter?¶
There are many filter plugins that already developed for you to use. You could find them here
For our production Logstash, we use grok filter to parse logs collected by Filebeat.
Grok works by combining text patterns into something that matches your logs.
The syntax for a grok pattern is %{SYNTAX:SEMANTIC}
.
Text patterns
are defined by regular expressions. You could find many predefined patterns
here.
If you want to build your own patterns, you will find
this one and
this one applications quite useful.
The SYNTAX
is the name of the pattern that will match your text. For example, 3.44
will be matched by the
NUMBER
pattern and 55.3.244.1
will be matched by the IP
pattern. The syntax is how you match.
The SEMANTIC
is the identifier you give to the piece of text being matched. For example, 3.44
could be the
duration of an event, so you could call it simply duration
Further, a string 55.3.244.1
might identify the client
making a request.
For the above example, your grok filter would look something like this:
%{NUMBER:duration} %{IP:client}
Production Logstash filter¶
The Production logstash filter is configured like below:
filter {
if [type] == "neutron" or [type] == "horizon" or [type] == "nova" or [type] == "glance" or [type] == "cinder" or [type] == "keystone" or [type] == "ceilometer" or [type] == "heat" {
grok {
patterns_dir => ["/opt/logstash/patterns"]
match => { 'message' => '%{STACKPATTERN}'}
}
}
if [type] == "horizon" and [module] == "openstack_auth.forms" {
grok {
match => { 'logmessage' => '%{QUOTEDSTRING:username}'}
}
}
}
When Filebeat collectting logs it will add ‘type’ field to the log according to what openstack service the log message comes from. Such as ‘nova’, ‘glance’, ‘horizon’…
This is for use in Logstash. When Logstash received these logs from Filebeat, it will parse them using (maybe) different patterns according to service types.
The grok patterns are defined in /opt/logstash/patterns
. The production patterns is shown below:
MODULE \b(?:[0-9A-Za-z_][0-9A-Za-z-_]{0,62})(?:\.(?:[0-9A-Za-z_][0-9A-Za-z-_]{0,62}))*(\.?|\b)
STACKPATTERN %{TIMESTAMP_ISO8601:date} %{BASE10NUM:processID} %{LOGLEVEL:loglevel} %{MODULE:module} ?(\[(%{NOTSPACE:request})?\s?(%{NOTSPACE:user})?\s?(%{NOTSPACE:tenant})?\s?(%{NOTSPACE:domain})?\s?(%{NOTSPACE:user_domain})?\s?(%{NOTSPACE:project_domain})?\])? %{GREEDYDATA:logmessage}
CEPHPATTERN %{TIMESTAMP_ISO8601:date} %{GREEDYDATA:logmessage}
As you could see above, we could extract many informations like datetime
, processID
, loglevel
, module
from log messages. It is used for future search and visualization in Kibana.