fluent-plugin-rewrite-tag-filter

Overview

Rewrite Tag Filter for Fluentd. It rewrites tags in a way similar to Apache's mod_rewrite.
It re-emits a record with a rewritten tag when a field value matches (or does not match) a regular expression.
For example, you can retag Apache log records by domain, status code (e.g. 500 errors),
user agent, request URI, regex backreference and so on.

This is an output plugin because Fluentd's filter plugins cannot rewrite tags.

Requirements

fluent-plugin-rewrite-tag-filter | Fluentd    | Ruby
>= 2.0.0                         | >= v0.14.2 | >= 2.1
< 2.0.0                          | >= v0.12.0 | >= 1.9

Installation

Install with the gem or td-agent-gem command:

# for system installed fluentd
$ gem install fluent-plugin-rewrite-tag-filter

# for td-agent2 (with fluentd v0.12)
$ sudo td-agent-gem install fluent-plugin-rewrite-tag-filter -v 1.6.0

# for td-agent3 (with fluentd v0.14)
$ sudo td-agent-gem install fluent-plugin-rewrite-tag-filter

For more details, see Plugin Management

Configuration

  • rewriterule<num> (string) (optional) <attribute> <regex_pattern> <new_tag>
    • Obsoleted: Use <rule> section
  • capitalize_regex_backreference (bool) (optional): Capitalize the first letter of every matched regex backreference (e.g. maps -> Maps). For more details, see Usage.
    • Default value: no
  • remove_tag_prefix (string) (optional): Remove tag prefix for tag placeholder. (see the section of "Tag placeholder")
  • remove_tag_regexp (regexp) (optional): Remove the part of the tag that matches this regexp for tag placeholder. (see the section of "Tag placeholder")
  • hostname_command (string) (optional): Override hostname command for placeholder. (see the section of "Tag placeholder")
    • Default value: hostname
  • emit_mode (enum) (optional): Emit mode, either batch or record. batch emits one event stream per rewritten tag, which reduces IO. record emits each record individually.
    • Default value: batch
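
The difference between the two emit_mode values can be pictured with a small Ruby sketch. It is not the plugin's code: emit_calls and the retag lambda are hypothetical stand-ins, and each returned pair only represents one emit call (a tag plus the records sent with it).

```ruby
# Hypothetical model of emit_mode. `retag` stands in for rule evaluation.
def emit_calls(events, mode)
  retag = ->(record) { "site.#{record['domain']}" }
  case mode
  when :record
    # One emit per record.
    events.map { |record| [retag.call(record), [record]] }
  when :batch
    # One emit per distinct rewritten tag, carrying all of its records.
    events.group_by { |record| retag.call(record) }.to_a
  else
    raise ArgumentError, "unknown emit_mode: #{mode}"
  end
end

events = [
  { 'domain' => 'maps' },
  { 'domain' => 'news' },
  { 'domain' => 'maps' },
]
puts emit_calls(events, :record).size  # 3
puts emit_calls(events, :batch).size   # 2
```

With three records and two distinct rewritten tags, the record branch makes three emit calls and the batch branch makes two.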

<rule> section (optional) (multiple)

  • key (string) (required): The field name to which the regular expression is applied
  • pattern (regexp) (required): The regular expression. The /regexp/ style is preferred because it supports patterns that contain character classes, such as /[a-z]/. A pattern written without slashes causes errors if it starts with a character class.
  • tag (string) (required): New tag
  • label (string) (optional): New label. If specified, label can be changed per-rule.
  • invert (bool) (optional): If true, rewrite the tag when the value does not match the pattern
    • Default value: false
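
How the rule parameters interact can be sketched in a few lines of Ruby. This is a simplified model rather than the plugin's implementation: Rule and rewrite_tag are hypothetical names, and the sketch assumes rules are tried in order with the first applicable rule deciding the tag.

```ruby
# Hypothetical model of a <rule> section: key, pattern, tag, invert.
Rule = Struct.new(:key, :pattern, :tag, :invert)

# Returns the tag of the first rule that applies, or nil when none does.
def rewrite_tag(rules, record)
  rules.each do |rule|
    value = record[rule.key].to_s
    matched = rule.pattern.match?(value)
    # With invert, the rule applies when the pattern does NOT match.
    return rule.tag if matched != rule.invert
  end
  nil
end

rules = [
  Rule.new('status', /^200$/, 'clear', true),
  Rule.new('domain', /^maps\.example\.com$/, 'site.ExampleMaps', false),
  Rule.new('domain', /.+/, 'site.unmatched', false),
]

puts rewrite_tag(rules, { 'status' => '500', 'domain' => 'maps.example.com' })  # clear
puts rewrite_tag(rules, { 'status' => '200', 'domain' => 'maps.example.com' })  # site.ExampleMaps
```

A record that no rule applies to gets nil here, which is why a catch-all rule at the end is useful.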

Usage

This sample excludes logs for some static files before splitting the tag by domain.

<source>
  @type tail
  path /var/log/httpd/access_log
  format apache2
  time_format %d/%b/%Y:%H:%M:%S %z
  tag td.apache.access
  pos_file /var/log/td-agent/apache_access.pos
</source>

# "capitalize_regex_backreference yes" converts the first letter of every matched backreference to upper case. ex: maps -> Maps
# The 2nd <rule> retags records whose status code is not 200 as "clear".
# The 3rd <rule> retags records whose domain does not end with ".com" as "clear".
# In the 6th <rule>, "site.$2$1" becomes "site.ExampleMail" because of the capitalize_regex_backreference option.
<match td.apache.access>
  @type rewrite_tag_filter
  capitalize_regex_backreference yes
  <rule>
    key     path
    pattern /\.(gif|jpe?g|png|pdf|zip)$/
    tag     clear
  </rule>
  <rule>
    key     status
    pattern /^200$/
    tag     clear
    invert  true
  </rule>
  <rule>
    key     domain
    pattern /^.+\.com$/
    tag     clear
    invert  true
  </rule>
  <rule>
    key     domain
    pattern /^maps\.example\.com$/
    tag     site.ExampleMaps
  </rule>
  <rule>
    key     domain
    pattern /^news\.example\.com$/
    tag     site.ExampleNews
  </rule>
  <rule>
    key     domain
    pattern /^(mail)\.(example)\.com$/
    tag     site.$2$1
  </rule>
  # Note: Specify a catch-all rule in the last block so that unmatched records are not lost
  <rule>
    key     domain
    pattern /.+/
    tag     site.unmatched
  </rule>
</match>

<match site.*>
  @type mongo
  host localhost
  database apache_access
  remove_tag_prefix site
  tag_mapped
  capped
  capped_size 100m
</match>

<match clear>
  @type null
</match>
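
As a rough illustration of how the backreferences in the 6th rule turn into site.ExampleMail, the following Ruby sketch substitutes $1, $2, ... with capture groups. The helper expand_backreferences is hypothetical, and using String#capitalize is an assumption about how the capitalization is applied, not a quote of the plugin's code.

```ruby
# Replace $1, $2, ... in tag_template with the capture groups of `match`.
def expand_backreferences(tag_template, match, capitalize: false)
  tag_template.gsub(/\$(\d+)/) do
    group = match[Regexp.last_match(1).to_i].to_s
    capitalize ? group.capitalize : group
  end
end

m = /^(mail)\.(example)\.com$/.match('mail.example.com')
puts expand_backreferences('site.$2$1', m)                    # site.examplemail
puts expand_backreferences('site.$2$1', m, capitalize: true)  # site.ExampleMail
```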

Result

$ mongo
MongoDB shell version: 2.2.0
> use apache_access
switched to db apache_access
> show collections
ExampleMaps
ExampleNews
ExampleMail
unmatched

Debug

When td-agent starts, it logs the registered rules as shown below.

$ tail -f /var/log/td-agent/td-agent.log
2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]

Nested attributes

Dot notation:

<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key $.kubernetes.namespace_name
    pattern ^(.+)$
    tag $1.${tag}
  </rule>
</match>

Bracket notation:

<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key $['kubernetes']['namespace_name']
    pattern ^(.+)$
    tag $1.${tag}
  </rule>
</match>

These example configurations can process nested attributes like the following:

{
  "kubernetes": {
    "namespace_name": "default"
  }
}

When the original tag is kubernetes.var.log, it is converted to default.kubernetes.var.log.
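
The lookup and the resulting tag can be traced by hand in Ruby. This is only a sketch of the idea: Hash#dig stands in for the nested key access, and the string interpolation stands in for the $1.${tag} template.

```ruby
record = { 'kubernetes' => { 'namespace_name' => 'default' } }

# Both notations in the configs above address this path of keys.
value = record.dig('kubernetes', 'namespace_name')

original_tag = 'kubernetes.var.log'
m = /^(.+)$/.match(value)

# Equivalent of the "$1.${tag}" template for this record.
new_tag = "#{m[1]}.#{original_tag}"
puts new_tag  # default.kubernetes.var.log
```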

Tag placeholder

The following placeholders are supported in new_tag (the rewritten tag).

  • ${tag}
  • __TAG__
  • ${tag_parts[n]}
  • __TAG_PARTS[n]__
  • ${hostname}
  • __HOSTNAME__

The placeholders ${tag_parts[n]} and __TAG_PARTS[n]__ access the n-th element of the tag split by "." (dot).
For example, with the tag td.apache.access, ${tag_parts[0]} yields td and ${tag_parts[1]} yields apache.

Note: Currently, range expressions such as ${tag_parts[0..2]} are not supported.
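
A minimal Ruby sketch of the placeholder expansion described above follows. The function expand_placeholders is hypothetical, and the hostname is passed in as an argument here instead of being obtained from a command.

```ruby
# Expand the placeholders listed above in a tag template.
def expand_placeholders(template, tag, hostname)
  parts = tag.split('.')
  result = template.gsub(/\$\{tag_parts\[(\d+)\]\}|__TAG_PARTS\[(\d+)\]__/) do
    index = (Regexp.last_match(1) || Regexp.last_match(2)).to_i
    parts[index].to_s
  end
  result = result.gsub(/\$\{tag\}|__TAG__/) { tag }
  result.gsub(/\$\{hostname\}|__HOSTNAME__/) { hostname }
end

puts expand_placeholders('rewritten.${tag_parts[1]}.${tag_parts[2]}',
                         'app.game.pool.activity', 'app30-124')
# rewritten.game.pool
puts expand_placeholders('web.__TAG__.${hostname}', 'td.apache.access', 'app30-124')
# web.td.apache.access.app30-124
```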

Placeholder Options

  • remove_tag_prefix

This option removes the given prefix from the tag before it is substituted for ${tag} or __TAG__.

  • remove_tag_regexp

This option removes the part of the tag that matches the given regexp before it is substituted for ${tag} or __TAG__.

  • hostname_command

By default, the hostname command is executed to get the full hostname.
If needed, you can override this command with the hostname_command option.
For example, specifying hostname_command hostname -s yields the short hostname.
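
The two tag-stripping options can be sketched in Ruby as follows. The helper names are hypothetical, and dropping the dot that follows the prefix is an assumption chosen to agree with this README's placeholder samples, where apache.access becomes access.

```ruby
# Assumption: remove_tag_prefix also drops the dot that follows the prefix.
def strip_prefix(tag, prefix)
  tag.sub(/\A#{Regexp.escape(prefix)}\.?/, '')
end

# remove_tag_regexp: drop the first part of the tag that matches the regexp.
def strip_regexp(tag, regexp)
  tag.sub(regexp, '')
end

puts strip_prefix('apache.access', 'apache')                             # access
puts strip_regexp('input.nginx.access.log', /^input\.(apache|nginx)\./)  # access.log
```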

Placeholder Usage

These samples rewrite a tag with placeholders.

# The tag becomes "rewritten.access.ExampleMail"
<match apache.access>
  @type rewrite_tag_filter
  capitalize_regex_backreference yes
  remove_tag_prefix apache
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.${tag}.$2$1
  </rule>
</match>

# The tag becomes "rewritten.access.ExampleMail"
<match apache.access>
  @type rewrite_tag_filter
  capitalize_regex_backreference yes
  remove_tag_regexp /^apache\./
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.${tag}.$2$1
  </rule>
</match>

# The tag becomes "http.access.log"
<match input.{apache,nginx}.access.log>
  @type rewrite_tag_filter
  remove_tag_regexp /^input\.(apache|nginx)\./
  <rule>
    key     domain
    pattern ^.+$
    tag     http.${tag}
  </rule>
</match>

# The tag becomes "rewritten.ExampleMail.app30-124.foo.com" when the hostname is "app30-124.foo.com"
<match apache.access>
  @type rewrite_tag_filter
  capitalize_regex_backreference yes
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.$2$1.${hostname}
  </rule>
</match>

# The tag becomes "rewritten.ExampleMail.app30-124" when the hostname is "app30-124.foo.com"
<match apache.access>
  @type rewrite_tag_filter
  capitalize_regex_backreference yes
  hostname_command hostname -s
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.$2$1.${hostname}
  </rule>
</match>

# The tag becomes "rewritten.game.pool"
<match app.game.pool.activity>
  @type rewrite_tag_filter
  <rule>
    key     domain
    pattern ^.+$
    tag     rewritten.${tag_parts[1]}.${tag_parts[2]}
  </rule>
</match>

Altering Labels

In addition to changing tags, you can also change an event's route by setting the label of the re-emitted event.

For example, given this configuration:

<match apache.access>
  @type rewrite_tag_filter
  <rule>
    key     domain
    pattern ^www\.example\.com$
    tag     web.${tag}
  </rule>
  <rule>
    key     domain
    pattern ^(.*)\.example\.com$
    tag     other.$1
    label   other
  </rule>
</match>

With this configuration, the message {"domain": "www.example.com"} has its tag changed to web.apache.access, while the message {"domain": "api.example.com"} has its tag changed to other.api and is sent to the label other.

TODO

Pull requests are very welcome!!

Copyright

Copyright : Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
License : Apache License, Version 2.0