Source: https://www.elastic.co/guide/en/logstash/2.2/plugins-filters-mutate.html
Contents
- Syntax
- Test data
- Configuration options
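The mutate plugin performs transformations on fields, including renaming, removing, replacing, and modifying them. It is one of the most commonly used plugins.
For example:
- You have used a Grok expression to split a Tomcat log line into fields, and you want to convert the status code, byte size, or response time to integers;
- You have used a regular expression to split a log line into fields, but the values come in mixed case, which is not much help for full-text search in Elasticsearch. In that case you can use this plugin to convert the field contents to lowercase.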
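As a rough sketch of the second case, the snippet below lowercases a single field. The field name loglevel is hypothetical and stands in for whatever field your own pattern extracts.

```
filter {
  mutate {
    # "loglevel" is a hypothetical field name used only for illustration
    lowercase => [ "loglevel" ]
  }
}
```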
Syntax
The plugin's settings must be wrapped in a mutate block, as shown below:
mutate {}
The available configuration options are listed in the table below:
Setting | Input type | Required | Default |
add_field | hash | No | {} |
add_tag | array | No | [] |
convert | hash | No | |
gsub | array | No | |
join | hash | No | |
lowercase | array | No | |
merge | hash | No | |
periodic_flush | boolean | No | false |
remove_field | array | No | [] |
remove_tag | array | No | [] |
rename | hash | No | |
replace | hash | No | |
split | hash | No | |
strip | array | No | |
update | hash | No | |
uppercase | array | No | |
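Of these, add_field, remove_field, add_tag, and remove_tag are common to all Logstash filter plugins. They take effect only after the filter has been applied successfully. Although Logstash calls these plugins "filters", they do much more than filter.
The purpose of a tag is to mark an event during processing so that later stages can act on it. Logstash keeps a built-in tags array on each event, holding every tag produced along the way, whether it was added by Logstash itself or by you. For example, if you parse a log line with grok and the match fails, Logstash adds a _grokparsefailure tag on its own. In the output stage you can then choose to do nothing with the lines that failed to parse.
A field option, in contrast, operates on fields, for example to create a new field from existing ones. More on this later.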
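The tag-based handling described above might look like the following sketch, which prints only the events that grok parsed successfully. Using stdout here is an assumption made for illustration; any output plugin could sit inside the conditional.

```
output {
  # Skip events that grok could not parse
  if "_grokparsefailure" not in [tags] {
    stdout {
      codec => rubydebug
    }
  }
}
```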
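You may also notice that every option in the table is either a verb or a verb-object phrase. As you have probably guessed, you can think of each option as a Ruby function, and whatever follows the "=>" is its list of arguments (if you were writing the program, you would do the same). The first argument is typically the field you want the function to act on, and everything after it is the specific parameters.
What is a field? Suppose you parse a Tomcat log and split one access log line into the client IP, byte size, response time, and so on, storing each in a named variable. Each of those variables is a field.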
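To make the "field first, then parameters" shape concrete, here is a small sketch using two options from the table. The names fieldname, old_name, and new_name are placeholders, not fields from this article's log.

```
filter {
  mutate {
    # Array form: field to act on, regular expression to match, replacement string
    gsub => [ "fieldname", "/", "_" ]
    # Hash form: the key is the field to act on, the value is its new name
    rename => { "old_name" => "new_name" }
  }
}
```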
The options are described in detail below.
Test data
Suppose we have the following Tomcat access log:
192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET "/goLogin" "" 8080 200 1692 23 "http://10.1.8.193:8080/goMain" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0"
192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET "/js/common/jquery-1.10.2.min.js" "" 8080 304 - 67 "http://10.1.8.193:8080/goLogin" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0"
192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET "/css/common/login.css" "" 8080 304 - 75 "http://10.1.8.193:8080/goLogin" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0"
192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET "/js/system/login.js" "" 8080 304 - 53 "http://10.1.8.193:8080/goLogin" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0"
It was produced by the following Tomcat configuration:
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"
       prefix="localhost_access_log." suffix=".txt"
       pattern="%h %l %u %t %m &quot;%U&quot; &quot;%q&quot; %p %s %b %D &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot;" />
If we parse this log with the following Grok expression:
%{IPORHOST:clientip} %{NOTSPACE:identd} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] %{WORD:http_method} %{NOTSPACE:request} %{NOTSPACE:request_query|-} %{NUMBER:port} %{NUMBER:statusCode} (%{NOTSPACE:bytes}|-) %{NUMBER:reqTime} %{QS:referer} %{QS:userAgent}
we get the following result:
{
"message" => "192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET \"/goLogin\" \"\" 8080 200 1692 23 \"http://10.1.8.193:8080/goMain\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0\"",
"@version" => "1",
"@timestamp" => "2016-05-17T08:26:07.794Z",
"host" => "vcyber",
"clientip" => "192.168.6.25",
"identd" => "-",
"auth" => "-",
"timestamp" => "24/Apr/2016:01:25:53 +0800",
"http_method" => "GET",
"request" => "\"/goLogin\"",
"request_query" => "\"\"",
"port" => "8080",
"statusCode" => "200",
"bytes" => "1692",
"reqTime" => "23",
"referer" => "\"http://10.1.8.193:8080/goMain\"",
"userAgent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0\""
}
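Note the data types of the fields after the log line has been split. The port, statusCode, bytes, and reqTime fields ought to be numbers, but for now they are left as strings. Conversion is covered later, and the examples below all build on this result.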
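As a preview, one way to turn these strings into numbers is the convert option from the table above. This is only a sketch: bytes is left out because it can be "-" in this log, and how that value should be handled is a separate decision.

```
filter {
  mutate {
    # Convert the numeric-looking string fields to integers
    convert => {
      "port"       => "integer"
      "statusCode" => "integer"
      "reqTime"    => "integer"
    }
  }
}
```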
Configuration options
add_field
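- The value is a hash, that is, a set of key-value pairs, for example add_field => {"field1"=>"value1" "field2"=>"value2"}.
- The default is an empty hash, i.e.
{}
Adds new fields to the event.
Example: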
input {
  stdin {
  }
}
filter {
  grok {
    match=>["message","%{IPORHOST:clientip} %{NOTSPACE:identd} %{NOTSPACE:auth} \[%{HTTPDATE:timestamp}\] %{WORD:http_method} %{NOTSPACE:request} %{NOTSPACE:request_query|-} %{NUMBER:port} %{NUMBER:statusCode} (%{NOTSPACE:bytes}|-) %{NUMBER:reqTime} %{QS:referer} %{QS:userAgent}"]
  }
  mutate {
    add_field=>{
      "SayHi"=>"Hello , %{clientip}"
    }
  }
}
output{
  stdout{
    codec=>rubydebug
  }
}
Note the mutate block. If you parse the Tomcat access log shown earlier with this configuration, you get the following result:
{
"message" => "192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET \"/goLogin\" \"\" 8080 200 1692 23 \"http://10.1.8.193:8080/goMain\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0\"",
"@version" => "1",
"@timestamp" => "2016-05-17T04:52:02.031Z",
"host" => "vcyber",
"clientip" => "192.168.6.25",
"identd" => "-",
"auth" => "-",
"timestamp" => "24/Apr/2016:01:25:53 +0800",
"http_method" => "GET",
"request" => "\"/goLogin\"",
"request_query" => "\"\"",
"port" => "8080",
"statusCode" => "200",
"bytes" => "1692",
"reqTime" => "23",
"referer" => "\"http://10.1.8.193:8080/goMain\"",
"userAgent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0\"",
"SayHi" => "Hello , 192.168.6.25"
}
You will see an extra SayHi field. Its name is hard-coded here, but it can also be dynamic. If you change
"SayHi"=>"Hello , %{clientip}"
to:
"another_%{clientip}"=>"Hello , %{clientip}"
you will see the following result:
{
"message" => "192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET \"/goLogin\" \"\" 8080 200 1692 23 \"http://10.1.8.193:8080/goMain\" \"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0\"",
"@version" => "1",
"@timestamp" => "2016-05-17T06:38:04.427Z",
"host" => "vcyber",
"clientip" => "192.168.6.25",
"identd" => "-",
"auth" => "-",
"timestamp" => "24/Apr/2016:01:25:53 +0800",
"http_method" => "GET",
"request" => "\"/goLogin\"",
"request_query" => "\"\"",
"port" => "8080",
"statusCode" => "200",
"bytes" => "1692",
"reqTime" => "23",
"referer" => "\"http://10.1.8.193:8080/goMain\"",
"userAgent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0\"",
"another_192.168.6.25" => "Hello , 192.168.6.25"
}
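Although this example is somewhat contrived, it shows that the values of existing fields can be used to build both a new field's name and its value.
The example above adds only one field, but you can also add several at once: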
add_field=>{
"another_%{clientip}"=>"Hello , %{clientip}"
"another_%{http_method}"=>"Hello, %{http_method}"
}
add_tag
- The value is an array
- The default is an empty array, i.e.
[]
Adds new tags to the event.
Example:
mutate {
add_tag=>[
"foo_%{clientip}"
]
}
You will see the following result:
{
"message" => "192.168.6.25 - - [24/Apr/2016:01:25:53 +0800] GET \"/goLogin\" \"\" 8080 200 1692 23 \"http://10.1.8.193:8080/goMain\"