How to extract filename from filebeat shipped logs

May 9, 2019

This post will show how to extract the filename from Filebeat-shipped logs, using Elasticsearch ingest pipelines and grok. I will also show how to deal with the failures you usually see in real life. With that said, let's get started.

WHY

It is very common to create log files with names containing an identifier. In manufacturing, the log files often start with the serial number of the device under manufacture or test. In the case of an application server, the log name sometimes starts with the transaction id for that session. In short, it is quite common for the name of a log file to contain some useful information. And with that comes the inevitable request: can you please extract the filename from the Filebeat-shipped logs and make it available?

HOW

Filebeat ships the fully qualified filename of the log along with every event.
Something like this:
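For example, an event for a file under a hypothetical `C:\Upload` folder carries something like this (in recent Filebeat versions the field is `log.file.path`; older versions used `source`):

```json
{
  "log": {
    "file": {
      "path": "C:\\Upload\\ABC1234_20190509.log"
    }
  },
  "message": "the log line itself"
}
```

Notice the backslashes are already doubled here, because JSON itself needs them escaped. It only gets worse from here.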

First you have to define a grok pattern to match it. Use the Grok Debugger provided in the Dev Tools section of Kibana.
Here is a sample screen showing how to use it. See how the backslashes never miss a chance to make life difficult: they need to be escaped.

[Screenshot: Grok Debugger matching the shipped file path]
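In text form, the debugger inputs look something like this (the path and filename are made up for illustration; in the debugger a literal backslash in the path has to be written as `\\`):

```
Sample data:
C:\Upload\ABC1234_20190509.log

Grok pattern:
%{GREEDYDATA:folder}\\%{GREEDYDATA:filename}

Structured result:
{
  "folder": "C:\\Upload",
  "filename": "ABC1234_20190509.log"
}
```

Because `GREEDYDATA` is greedy, the first one swallows everything up to the last backslash, which is exactly what leaves the bare filename in the second field.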

In case you are wondering about USER, NUMBER and GREEDYDATA: yes, they are the regex monsters called grok patterns. You can look up what each of them matches in the standard grok patterns reference that ships with Logstash and Elasticsearch.
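For the curious, these are roughly their definitions in the standard grok pattern library:

```
GREEDYDATA   .*
USER         %{USERNAME}, where USERNAME is [a-zA-Z0-9._-]+
NUMBER       (?:%{BASE10NUM}), i.e. an optionally signed decimal number
```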

Now we are able to extract the filename.
Sometimes the requirement is to extract something from the filename, such as the serial number, and discard the date part. Here is a screen showing how to do that.

[Screenshot: Grok Debugger extracting the serial number from the filename]
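In text form, a pattern along these lines does it (assuming, for illustration, filenames shaped like `<serial>_<date>.log`):

```
Sample data:
C:\Upload\ABC1234_20190509.log

Grok pattern:
%{GREEDYDATA}\\%{USER:identifier}_%{NUMBER:filedate}\.log

Structured result:
{
  "identifier": "ABC1234",
  "filedate": "20190509"
}
```

The unnamed `%{GREEDYDATA}` eats the folder part without capturing it, and the date lands in its own field so it can be dropped later.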

I will continue with this pattern to make it look more real-world. With the grok pattern sorted, let's put everything together in an ingest pipeline.
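A pipeline along these lines does the job. Treat it as a sketch, not a drop-in: the pipeline name, the `failed-` index prefix, the `log.file.path` field and the `<serial>_<date>.log` filename layout described above are my assumptions.

```json
PUT _ingest/pipeline/extract-filename
{
  "description": "Pull the identifier out of the shipped log file name",
  "processors": [
    {
      "grok": {
        "field": "log.file.path",
        "patterns": [
          "%{GREEDYDATA}\\\\%{USER:identifier}_%{NUMBER:filedate}\\.log"
        ]
      }
    },
    {
      "set": {
        "field": "Identifier",
        "value": "{{identifier}}"
      }
    },
    {
      "remove": {
        "field": ["identifier", "filedate"]
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "field": "_index",
        "value": "failed-{{ _index }}"
      }
    }
  ]
}
```

Note how a single literal backslash in the path has become four: one doubling for the regex, and another doubling because the pattern lives inside a JSON string.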

Again, look at the backslashes. These guys don't give up.

Here is what is happening.

The grok processor splits the source filename into separate fields.
The set processor creates a field called Identifier using the value of identifier.
The remove processor drops the fields we do not need.
The on_failure section handles any error that might come up.

The on_failure section of the pipeline is very important. Why?
Because someone can sneak a file into the monitored folder that does not conform to the grok pattern you wrote.
For example
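A hypothetical offender, which the pattern above will not match:

```
C:\Upload\readme.txt
```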

The result is a failure.
[Screenshot: the grok processor failure returned by the pipeline]
The default behaviour of the pipeline is to halt at the first error. You can change that to whatever you wish using the on_failure block. The recommended approach here is to index the offending data into another index and continue processing the rest of the logs. This is what you see happening in the block shown above.

Now just create a Filebeat yml file to start shipping data to your Elasticsearch cluster. You can refer to my previous post to see how Filebeat is configured.
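A minimal sketch of that yml, assuming the pipeline was created as `extract-filename` and the hypothetical `C:\Upload` folder from earlier (hosts and paths are placeholders for your own setup):

```yaml
filebeat.inputs:
  - type: log
    paths:
      - C:\Upload\*.log

output.elasticsearch:
  hosts: ["localhost:9200"]
  pipeline: "extract-filename"
```

The pipeline option on the elasticsearch output is what routes every shipped event through the ingest pipeline.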

And that’s it. You now know how to extract the filename from Filebeat-shipped logs. Keep rocking.
