Logstash/Kibana for rsyslog anyone?


Been looking at Logstash and Kibana for collating and presenting rsyslog data. They run on top of Elasticsearch, which uses Lucene as its backend.

http://lucene.apache.org/
http://www.elasticsearch.org/overview/
http://www.elasticsearch.org/overview/logstash/
http://www.elasticsearch.org/overview/kibana/

Now whilst I can get lucene3 and elasticsearch RPMs through the Katello repos, I can only find a JAR file for Logstash, which I think includes Elasticsearch and Lucene.

It kind of falls into the "too cool for school" category in my mind at the moment, but it's already showing remarkable capability. Does anybody know if there are modular RPMs for Logstash and Kibana anywhere?

Cheers

D

Responses

Not familiar with this myself, but I'm certainly interested to hear whether anyone has had experience with these tools.

Interesting. I'd like to play with this idea as I could have a use for it. I'll see if I can find some time to get this working and will follow up.

As far as a Logstash RPM goes, you'll need to roll your own. Get the conf, wrapper script and jar for Logstash from GitHub and rpmbuild it.
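For anyone who wants to go that route, a bare-bones spec might look something like the sketch below. It's only an illustration: the version, source file names and install paths are placeholders, and you'd drop the jar, your config and an init wrapper into ~/rpmbuild/SOURCES before running rpmbuild -bb logstash.spec.

Name:           logstash
Version:        1.3.3
Release:        1%{?dist}
Summary:        Logstash log collector (repackaged upstream flatjar)
License:        ASL 2.0
URL:            http://logstash.net/
Source0:        logstash-%{version}-flatjar.jar
Source1:        logstash.conf
Source2:        logstash.init
BuildArch:      noarch
Requires:       java

%description
Repackages the upstream Logstash flatjar with a config file and an init wrapper.

%install
install -D -m 0644 %{SOURCE0} %{buildroot}/opt/logstash/logstash-%{version}-flatjar.jar
install -D -m 0644 %{SOURCE1} %{buildroot}/etc/logstash/logstash.conf
install -D -m 0755 %{SOURCE2} %{buildroot}/etc/init.d/logstash

%files
/opt/logstash/logstash-%{version}-flatjar.jar
%config(noreplace) /etc/logstash/logstash.conf
/etc/init.d/logstash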

I was messing around with this configuration for JBoss central logging on a cluster. I have it working, but I haven't had much use for it.

I'll try to give a short summary:

1.) mkdir /opt/logs && cd /opt/logs
2.) Download latest logstash.jar
3.) Download and install elasticsearch into /opt/logs/elasticsearch
4.) Download and install Kibana into /opt/logs/Kibana-x.x.x
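In case it helps, the downloads for steps 2 and 3 look roughly like this; the version numbers and URLs are illustrative only, so grab whatever is current upstream:

mkdir -p /opt/logs && cd /opt/logs
# step 2: the Logstash flatjar
wget https://download.elasticsearch.org/logstash/logstash/logstash-1.3.3-flatjar.jar
# step 3: Elasticsearch, unpacked into /opt/logs/elasticsearch
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.9.tar.gz
tar xzf elasticsearch-0.90.9.tar.gz && mv elasticsearch-0.90.9 elasticsearch
# step 4: do the same for whichever Kibana release you want, unpacking it into /opt/logs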

Elasticsearch

I had to change "network.bind_host" to my external IP in /opt/logs/elasticsearch/config/elasticsearch.yml
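In other words, a one-line change in that file (X.X.X.X being the box's external IP here):

# /opt/logs/elasticsearch/config/elasticsearch.yml (excerpt)
network.bind_host: X.X.X.X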

To start it:

cd /opt/logs/elasticsearch/bin && ./elasticsearch

Logstash

I'm going to give you a JBoss config; there are tutorials out there for syslog configurations. You need to create your own config file and call it whatever you want. This is the contents of /opt/logs/jboss.conf:

input {
  file {
    path => '/opt/jboss-eap-6.1/domain/servers/robtest1/log/server.log'
    format => 'json_event'
    type => 'log4j'
    tags => 'robtest1'
  }
  file {
    path => '/opt/jboss-eap-6.1/domain/servers/robtest2/log/server.log'
    format => 'json_event'
    type => 'log4j'
    tags => 'robtest2'
  }
}

filter {
  multiline {
    type => "log4j"
    pattern => "^\\s"
    what => "previous"
  }
  mutate {
    add_field => [ "log4j_ip", "%{@source_host}" ]
  }
  mutate {
    gsub => ["log4j_ip", ":.*$", ""]
  }
}

output {
  elasticsearch_http {
    host => 'X.X.X.X'
    port => 9200
    type => 'log4j'
    flush_size => 10
  }
  stdout { }
}

X.X.X.X is my external IP in this case.

I have a bash script set to start logstash:

#!/bin/bash

java -jar logstash-1.1.12-flatjar.jar agent -f jboss.conf -- web --backend elasticsearch:///?local

Kibana

cd /opt/logs/Kibana-X.X.X/bin && ruby Kibana.rb

That will start the Kibana web interface. Browse to X.X.X.X:5601 and see if content is showing up.
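If nothing shows up, one quick way to confirm events are actually reaching Elasticsearch is to query it directly (adjust the host to wherever Elasticsearch is bound):

curl 'http://X.X.X.X:9200/_search?q=*&size=1&pretty=true'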

That is a small primer on what I had to do to get it working.

Just noticed I was using an old version of Kibana; the newer installation instructions are here:

http://www.elasticsearch.org/overview/kibana/installation/

Much easier this way; I just had to copy the content to my document root and hit the browser.
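For anyone following along, the newer (static HTML/JS) Kibana install really is just that. Roughly, with the version number and document root as placeholders:

wget https://download.elasticsearch.org/kibana/kibana/kibana-3.0.0.tar.gz
tar xzf kibana-3.0.0.tar.gz
cp -r kibana-3.0.0/* /var/www/html/
# then edit config.js in the document root so the "elasticsearch:" setting
# points at your Elasticsearch HTTP port, e.g. http://X.X.X.X:9200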

Robert,

Thanks a lot for sharing.

I have given this a try with the latest version of logstash (logstash-1.3.3-flatjar.jar) and have made some changes to jboss.conf to get rid of some "deprecation" warnings:

jboss.conf

input {
  file {
    path => '<my-path>/jboss-eap-6.2/standalone/log/server.log'
    codec => json {
      charset => "UTF-8"
    }
    type => 'log4j'
    tags => 'A-Server'
  }
}

filter {
  multiline {
    pattern => "^\\s"
    what => "previous"
  }
  mutate {
    add_field => [ "log4j_ip", "%{@source_host}" ]
  }
  mutate {
    gsub => ["log4j_ip", ":.*$", ""]
  }
}

output {
  elasticsearch_http {
    host => '<my-ip>'
    port => 9200
    flush_size => 10
  }
  stdout { }
}

Also, the command-line option --backend elasticsearch:///?local seems to have gone, so I have changed the script to:

#!/bin/bash
java -jar logstash-1.3.3-flatjar.jar agent -f jboss.conf -- web

Hi folks,

Try the following to send rsyslog output as JSON:

/etc/rsyslog.d/logstash.conf

template(name="ls_json"
         type="list"
         option.json="on") {
           constant(value="{")
             constant(value="\"@timestamp\":\"")         property(name="timegenerated" dateFormat="rfc3339")
             constant(value="\",\"@version\":\"1")
             constant(value="\",\"message\":\"")         property(name="msg")
             constant(value="\",\"host\":\"")            property(name="hostname")
             constant(value="\",\"logsource\":\"")       property(name="fromhost")
             constant(value="\",\"severity_label\":\"")  property(name="syslogseverity-text")
             constant(value="\",\"severity\":\"")        property(name="syslogseverity")
             constant(value="\",\"facility_label\":\"")  property(name="syslogfacility-text")
             constant(value="\",\"facility\":\"")        property(name="syslogfacility")
             constant(value="\",\"program\":\"")         property(name="programname")
             constant(value="\",\"pid\":\"")             property(name="procid")
             constant(value="\",\"priority\":\"")        property(name="pri")
             constant(value="\",\"rawmsg\":\"")          property(name="rawmsg")
             constant(value="\",\"syslogtag\":\"")       property(name="syslogtag")
             constant(value="\",\"appname\":\"")         property(name="app-name")
           constant(value="\"}\n")
         }

*.* @logserver.example.com:5500;ls_json

On the Logstash side (I'm using the latest logstash-1.3.3-flatjar too for now):
/opt/logstash/syslog.conf

input {
  syslog {
    type => syslog
    port => 5544
  }
  udp {
    type => syslogjson
    port => 5500
    codec => "json"
  }
}

output {
  elasticsearch {
    embedded => true
  }
}

I've been using port 5544 to take raw rsyslog data from machines, but was getting frustrated that some messages would produce a _grokparsefailure (run-parts most often, but also sssd sometimes). The raw feed also parses the "program" field for things like postfix into "postfix/local", "postfix/pickup", "postfix/qmgr", "postfix/cleanup", etc.

Getting rsyslog to chuck out native JSON directly to port 5500 leaves me with no _grokparsefailures, but I'm now wondering what I can do about the "program" fields. I haven't used the rsyslog property replacer before, so I'm trying to figure out how to create a new "subprogram" field. My current setup puts (using postfix as an example again) "postfix" in the program field, but the remainder of the field - the "local", "pickup", "qmgr", "cleanup" - doesn't go anywhere. I'm sending through the "rawmsg" at the moment while I try to figure it out.
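One way to get a "subprogram" field without touching the rsyslog side would be to split the existing field in Logstash. This is only a sketch, assuming the program field arrives as e.g. "postfix/qmgr"; the program_base/subprogram field names are made up here:

filter {
  grok {
    # "postfix/qmgr" -> program_base = "postfix", subprogram = "qmgr"
    # the subprogram capture is optional so plain "sshd" still matches
    match => [ "program", "^(?<program_base>[^/]+)(?:/(?<subprogram>.+))?$" ]
  }
}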

It would be nice to send JSON to Logstash that requires no parsing and populates extra fields with some intelligence. Also, Kibana has a habit of splitting the programs up a bit too much when looking at a program pie chart: 47 "run-parts" log entries will show up as 47 "run" entries and 47 "parts" entries. Not the worst example, but splitting puppet-agent into "puppet" and "agent", and postfix into "postfix", "qmgr", "cleanup", "pickup", "local", etc. starts to blur the numbers a bit too much.
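On the pie-chart splitting: that happens because Elasticsearch analyses (tokenises) string fields by default, so "run-parts" is indexed as the separate terms "run" and "parts". One common workaround, sketched here on the assumption of a 0.90/1.x-era Elasticsearch and the default logstash-* index names, is an index template that leaves the field unanalysed (it only takes effect for newly created indices):

curl -XPUT 'http://X.X.X.X:9200/_template/program_raw' -d '{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "properties": {
        "program": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'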

Cheers

Duncan

Jochen,

My logstash start command is:

/usr/bin/java -Djava.io.tmpdir=/opt/logstash/tmp -jar logstash-1.3.3-flatjar.jar agent --config /opt/logstash/syslog.conf --log /opt/logstash/logstash.log

And then I installed Kibana as the web interface. I tried using the "web" interface on the startup command line, but replaced it with Kibana as soon as I saw what Kibana was capable of.

Duncan

There's also an RPM out there, but I've not got round to trying it out yet.

D

Since this topic is about experiences, I'd like to know if anyone has any tips regarding scaling of Elasticsearch, or its limitations.
For example, things one should be aware of when dumping data into an index: sharding, parent-child relationships between documents, and so on.

That's my next step as it happens. I'm planning to roll out a design with failover Redis servers at each site, failover Logstash servers at each site, and an ElasticSearch cluster at one site.
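For reference, the usual shape of that pipeline is shippers pushing events into Redis and the central indexers pulling them back out. A rough sketch (the host name and list key are placeholders, and this assumes the stock redis input/output plugins in Logstash 1.3):

# on each shipper
output {
  redis {
    host      => "redis1.example.com"
    data_type => "list"
    key       => "logstash"
  }
}

# on the central indexers
input {
  redis {
    host      => "redis1.example.com"
    data_type => "list"
    key       => "logstash"
  }
}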

I've heard that ES could get confused if the inter-site link went down, but both sides were still receiving data from their agents.
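From what I've read, that's the classic split-brain scenario. The usual mitigation is to require a majority of master-eligible nodes before a master can be elected; a sketch for elasticsearch.yml, picking (N / 2) + 1 for N master-eligible nodes:

# elasticsearch.yml
# with 3 master-eligible nodes, a majority is (3 / 2) + 1 = 2
discovery.zen.minimum_master_nodes: 2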

I've looked into the Paramedic plugin for ES clusters. Can't give any feedback other than that it looks good on the demo site.

Hi Duncan,

I've started an ELK project myself and I want to thank you for your contribution and for sharing your experience with the community.

Do you have any other advice for new users?

Regards,
Gabriel

Gabriel,
My setup has now been running for around a year in total and has morphed into a 5-node ES cluster, set up with the JVM memory locked to the process and unswappable (I don't know how necessary this is as I'm still at low enough volume, but every high-volume user says it helps a lot). I have two Logstash servers running the indexing and joined to the ES cluster as nodes, which provides great failover resilience. I've moved our systems to rsyslog7 so that the JSON export is easier. Each client pushes logs to one of the Logstash servers, with the other Logstash server acting as a failover. I've created some grok patterns to help parse details out of things like AVC messages and iptables denials.
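For reference, the memory locking mentioned above is normally just a couple of settings; a sketch assuming an Elasticsearch 1.x RPM install (paths and the heap size are illustrative, not recommendations):

# /etc/elasticsearch/elasticsearch.yml
bootstrap.mlockall: true

# /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=8g
MAX_LOCKED_MEMORY=unlimited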
Check my gists at https://gist.github.com/duncaninnes as I keep stuff there updated with latest config files etc.
Trying to get my head around Kibana 4 at the moment as it's a totally different beast to Kibana 3.
