
Three Ways to Process JSON-Formatted Log Files with Logstash (Database category, CSDN blog "很多时候,你缺少的不是知识而是热情")


Suppose each line of the log file is a JSON record, for example:

{"Method":"JSAPI.JSTicket","Message":"JSTicket:kgt8ON7yVITDhtdwci0qeZg4L-Dj1O5WF42Nog47n_0aGF4WPJDIF2UA9MeS8GzLe6MPjyp2WlzvsL0nlvkohw","CreateTime":"2015/10/13 9:39:59","AppGUID":"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d","_PartitionKey":"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d","_RowKey":"1444700398710_ad4d33ce-a9d9-4d11-932e-e2ccebdb726c","_UnixTS":1444700398710}

With the default configuration, once Logstash has processed the record and inserted it into Elasticsearch, a query returns this:

{"_index":"logstash-2015.10.16","_type":"voip_feedback","_id":"sheE9eXiQASMDVtRJ0EYcg","_version":1,"found":true,"_source":{"message":"{\"Method\":\"JSAPI.JSTicket\",\"Message\":\"JSTicket:kgt8ON7yVITDhtdwci0qeZg4L-Dj1O5WF42Nog47n_0aGF4WPJDIF2UA9MeS8GzLe6MPjyp2WlzvsL0nlvkohw\",\"CreateTime\":\"2015/10/13 9:39:59\",\"AppGUID\":\"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d\",\"_PartitionKey\":\"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d\",\"_RowKey\":\"1444700398710_ad4d33ce-a9d9-4d11-932e-e2ccebdb726c\",\"_UnixTS\":1444700398710}","@version":"1","@timestamp":"2015-10-16T00:39:51.252Z","type":"voip_feedback","host":"ipphone","path":"/usr1/data/voip_feedback.txt"}}

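That is, the whole JSON record is stored as a single string under "message". What I want, though, is for Logstash to parse the JSON record automatically and store each field in Elasticsearch. There are three ways to configure this.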

Method 1: set format => json directly

input {
    file {
        type => "voip_feedback"
        path => ["/usr1/data/voip_feedback.txt"]
        # Parse each line as JSON instead of storing it as a plain string
        format => json
        sincedb_path => "/home/jfy/soft/logstash-1.4.2/voip_feedback.access"
    }
}

The query result with this configuration is:

{"_index":"logstash-2015.10.16","_type":"voip_feedback","_id":"NrNX8HrxSzCvLl4ilKeyCQ","_version":1,"found":true,"_source":{"Method":"JSAPI.JSTicket","Message":"JSTicket:kgt8ON7yVITDhtdwci0qeZg4L-Dj1O5WF42Nog47n_0aGF4WPJDIF2UA9MeS8GzLe6MPjyp2WlzvsL0nlvkohw","CreateTime":"2015/10/13 9:39:59","AppGUID":"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d","_PartitionKey":"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d","_RowKey":"1444700398710_ad4d33ce-a9d9-4d11-932e-e2ccebdb726c","_UnixTS":1444700398710,"@version":"1","@timestamp":"2015-10-16T00:16:11.455Z","type":"voip_feedback","host":"ipphone","path":"/usr1/data/voip_feedback.txt"}}

As you can see, the JSON record has been parsed directly into individual fields under _source, but the original record text is not kept.

Method 2: use codec => json

input {
    file {
        type => "voip_feedback"
        path => ["/usr1/data/voip_feedback.txt"]
        sincedb_path => "/home/jfy/soft/logstash-1.4.2/voip_feedback.access"
        # Decode each line as JSON at the input stage
        codec => json {
            charset => "UTF-8"
        }
    }
}

The query result is the same as with the first method: the fields are parsed, and the original record text is not kept either.

Method 3: use the json filter

filter {
    if [type] == "voip_feedback" {
        json {
            source => "message"
            #target => "doc"
            #remove_field => ["message"]
        }        
    }
}

The query result with this configuration looks like this:

{"_index":"logstash-2015.10.16","_type":"voip_feedback","_id":"CUtesLCETAqhX73NKXZfug","_version":1,"found":true,"_source":{"message":"{\"Method222\":\"JSAPI.JSTicket\",\"Message\":\"JSTicket:kgt8ON7yVITDhtdwci0qeZg4L-Dj1O5WF42Nog47n_0aGF4WPJDIF2UA9MeS8GzLe6MPjyp2WlzvsL0nlvkohw\",\"CreateTime\":\"2015/10/13 9:39:59\",\"AppGUID\":\"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d\",\"_PartitionKey\":\"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d\",\"_RowKey\":\"1444700398710_ad4d33ce-a9d9-4d11-932e-e2ccebdb726c\",\"_UnixTS\":1444700398710}","@version":"1","@timestamp":"2015-10-16T00:28:20.018Z","type":"voip_feedback","host":"ipphone","path":"/usr1/data/voip_feedback.txt","Method222":"JSAPI.JSTicket","Message":"JSTicket:kgt8ON7yVITDhtdwci0qeZg4L-Dj1O5WF42Nog47n_0aGF4WPJDIF2UA9MeS8GzLe6MPjyp2WlzvsL0nlvkohw","CreateTime":"2015/10/13 9:39:59","AppGUID":"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d","_PartitionKey":"cb54ba2d-1d38-45f2-9ed1-abff0bf7dd3d","_RowKey":"1444700398710_ad4d33ce-a9d9-4d11-932e-e2ccebdb726c","_UnixTS":1444700398710,"tags":["111","222"]}}

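As you can see, the original record is kept and the fields are also parsed and stored. If you are sure you do not need the original record, add the setting remove_field => ["message"].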
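
The remove_field setting described above can be enabled by uncommenting it in the Method 3 filter. This is a minimal sketch that reuses the type and field names from the example; adjust them to your own config.

```
filter {
    if [type] == "voip_feedback" {
        json {
            source => "message"
            # Drop the raw JSON string once it has been parsed into fields.
            remove_field => ["message"]
        }
    }
}
```

remove_field is one of Logstash's common filter options and is documented as being applied only when the filter succeeds, so a line that fails to parse as JSON should still keep its message field.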

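Comparing the three methods, the most convenient and direct is setting format => json in the file input. Note that format is an input option from early Logstash releases that was deprecated in favor of codec, so on newer versions Method 2 is the equivalent.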

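Also note that when inserting data into ES, Logstash by default adds three fields under _source: type, host, and path. If the JSON content itself contains type, host, or path fields, the parsed values overwrite these Logstash defaults. This matters most for type, which also serves as the type part of index/type: once it is overwritten, the record is stored in ES under the type taken from the JSON data rather than the type value set in the Logstash config.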

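In that case, set filter.json.target. With this option set, the parsed JSON fields are no longer placed directly under _source but under the configured field, "doc" here: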

{"_index":"logstash-2015.10.20","_type":"3alogic_log","_id":"xfj3ngd5S3iH2YABjyU6EA","_version":1,"found":true,"_source":{"@version":"1","@timestamp":"2015-10-20T11:36:24.503Z","type":"3alogic_log","host":"server114","path":"/usr1/app/log/mysql_3alogic_log.log","doc":{"id":633796,"identity":"13413602120","type":"EAP_TYPE_PEAP","apmac":"88-25-93-4E-1F-96","usermac":"00-65-E0-31-62-5D","time":"20151020-193624","apmaccompany":"TP-LINK TECHNOLOGIES CO.,LTD","usermaccompany":""}}}

This way the type, host, and path values under _source are not overwritten.
In Kibana, the fields are then displayed as doc.type, doc.id, …
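
For reference, a filter that would produce the nested "doc" layout shown above might look like the following. This is a sketch only: the type value is taken from the sample result, and the rest mirrors the Method 3 filter with target uncommented.

```
filter {
    if [type] == "3alogic_log" {
        json {
            source => "message"
            # Nest the parsed fields under "doc" so they cannot
            # overwrite the event's own type, host, and path.
            target => "doc"
        }
    }
}
```

Nesting under target costs longer field names in queries (doc.type instead of type), but it keeps the type set in the Logstash config in control of the index/type the record is written to.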

