Using Filebeat to ingest Apache logs

This tutorial shows how to get a working Filebeat setup for ingesting Apache logs in a jiffy. I will not go into minute details, since I want to keep this post short and sweet. I will show only the bare minimum needed to make the system work.

WHY

Apache logs are everywhere. Even Buzz Lightyear knew that.

There is also a growing base of users who rely on the ELK stack to handle their logs. Sooner or later you will end up with Apache logs that you want to push into an Elasticsearch cluster.

There are two popular ways of getting logs into an Elasticsearch cluster: Filebeat and Logstash. Filebeat is a lightweight application, whereas Logstash is a big, heavy application with a correspondingly richer feature set.

HOW

Filebeat is highly configurable so that it can handle a large variety of log formats. In the real world, however, a few industry-standard log formats are very common, so to make life easier Filebeat comes with modules. Each standard logging format has its own module, and all you have to do is enable it. No messing around in the config files, no need to handle edge cases. Everything has been handled. Since I am using Filebeat to ingest Apache logs, I will enable the apache2 module.

First install and start Elasticsearch and Kibana. Then install two Elasticsearch ingest plugins.

sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-geoip
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-user-agent

If you have a multi-node cluster, you have to install these on all the nodes.
I also had to restart all the nodes for the changes to take effect. This is not a bug: a node picks up newly installed plugins only when it starts.
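
To confirm that every node has picked up both plugins, you can ask the cluster itself. The following is a minimal sketch, assuming Elasticsearch is reachable over plain HTTP at yourhostname:9200 with no security (the same placeholder host used in the Filebeat configuration in this post). The ES_HOST variable is my own placeholder, not something Elasticsearch defines.

```shell
# Assumption: plain HTTP, no security; ES_HOST is a placeholder variable.
ES_HOST="${ES_HOST:-yourhostname:9200}"

# _cat/plugins prints one row per node and installed plugin.
PLUGINS_URL="http://${ES_HOST}/_cat/plugins?v"

# Every node should list both ingest-geoip and ingest-user-agent.
curl -s --max-time 5 "$PLUGINS_URL" || echo "could not reach ${ES_HOST}"
```

If a node is missing from the output, or lists only one of the two plugins, install the missing plugin on that node and restart it.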

Then install Filebeat.

curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.5.2-x86_64.rpm
sudo rpm -vi filebeat-6.5.2-x86_64.rpm

Then edit the /etc/filebeat/filebeat.yml file to specify the connections. Since I am not using security, this section is easy.

setup.kibana:
  host: "yourhostname:5601"

output.elasticsearch:
  hosts: ["yourhostname:9200"]

Then you enable the apache2 module.

sudo filebeat modules enable apache2

The settings for this module are in /etc/filebeat/modules.d/apache2.yml. If you open it, you will see an option to provide the paths for the access and error logs. If the logs are in a custom location rather than the usual place (for a given logging format and OS), you can provide the paths to the logs there.

- module: apache2
  # Access logs
  access:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

  # Error logs
  error:
    enabled: true

    # Set custom paths for the log files. If left empty,
    # Filebeat will choose the paths depending on your OS.
    #var.paths:

Best practice is to leave it as is and let Filebeat figure out the location based on the OS you are using. I will do the same.

With that done the next command to run is

sudo filebeat setup

Setup loads the index template into Elasticsearch, so that the fields present in the logs get the right mappings, and loads the Kibana dashboards.

Before we start using Filebeat to ingest Apache logs, we should check that things are OK. Use this command:

sudo filebeat test output

You want to see all OK there.

Once that is done, start Filebeat.

sudo service filebeat start

To stop it

sudo service filebeat stop

However, since I do not have an Apache server running, I downloaded some logs for demo purposes and pass their paths on the command line. Hence I need to run Filebeat in the foreground.

sudo filebeat -e -M "apache2.access.var.paths=[/home/elastic/scratch/apacheLogs/access.log*]" -M "apache2.error.var.paths=[/home/elastic/scratch/apacheLogs/error.log*]"
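
If you have neither a running Apache server nor downloaded logs at hand, a couple of hand-written lines are enough for a smoke test. The following is a minimal sketch that writes a tiny access log in Apache's Combined Log Format. The script, the LOG_DIR variable, its /tmp/apacheLogs default and the sample requests are all my own assumptions for illustration; the client IPs come from the 192.0.2.0/24 documentation range, so do not expect a GeoIP location for them.

```shell
#!/bin/sh
# Hypothetical helper, not part of Filebeat: writes a two-line Apache access
# log in Combined Log Format. LOG_DIR is a placeholder; point it at the
# directory you pass to Filebeat with -M.
LOG_DIR="${LOG_DIR:-/tmp/apacheLogs}"
mkdir -p "$LOG_DIR"

# Apache timestamps look like 10/Oct/2000:13:55:36 -0700. LC_ALL=C keeps the
# month abbreviation in English regardless of the system locale.
NOW="$(LC_ALL=C date '+%d/%b/%Y:%H:%M:%S %z')"

# Fields: client ident user [time] "request" status bytes "referer" "user-agent"
printf '%s - - [%s] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64)"\n' \
  "192.0.2.10" "$NOW" > "$LOG_DIR/access.log"
printf '%s - - [%s] "GET /missing.html HTTP/1.1" 404 209 "-" "curl/7.61.0"\n' \
  "192.0.2.11" "$NOW" >> "$LOG_DIR/access.log"

# Show how many lines were written.
wc -l < "$LOG_DIR/access.log"
```

Run it with LOG_DIR=/home/elastic/scratch/apacheLogs to match the paths in the command above, or adjust the -M overrides to wherever the file ends up.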

And that is it.
By default Filebeat creates indices whose names start with filebeat-. Check your cluster to see whether the logs were indexed. Better still, use Kibana to visualize them.
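
One quick way to check the cluster from the command line is sketched below. It assumes Elasticsearch is reachable over plain HTTP at yourhostname:9200 without security, as in the configuration earlier in this post; ES_HOST is my own placeholder variable.

```shell
# Assumption: plain HTTP, no security; ES_HOST is a placeholder variable.
ES_HOST="${ES_HOST:-yourhostname:9200}"

# _cat/indices lists indices; the filebeat-* pattern limits it to Filebeat's.
INDICES_URL="http://${ES_HOST}/_cat/indices/filebeat-*?v"
# _count returns the number of documents in the matching indices.
COUNT_URL="http://${ES_HOST}/filebeat-*/_count?pretty"

curl -s --max-time 5 "$INDICES_URL" || echo "could not reach ${ES_HOST}"
curl -s --max-time 5 "$COUNT_URL" || echo "could not reach ${ES_HOST}"
```

A non-zero document count means events made it into the cluster. If the count is zero, look at the Filebeat output in the foreground terminal for errors.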

Bonus
From Kibana 6.5.2 onwards you get the Logs view (it is still in beta). It supports infinite scroll, something the community has been asking for for a long time. Do try it, since you already have Apache logs in the cluster now.
