Raw data:
183.49.46.228 - - [18/Sep/2013:06:49:23 +0000] "-" 400 0 "-" "-"
163.177.71.12 - - [18/Sep/2013:06:49:33 +0000] "HEAD / HTTP/1.1" 200 20 "-" "DNSPod-Monitor/1.0"
163.177.71.12 - - [18/Sep/2013:06:49:36 +0000] "HEAD / HTTP/1.1" 200 20 "-" "DNSPod-Monitor/1.0"
101.226.68.137 - - [18/Sep/2013:06:49:42 +0000] "HEAD / HTTP/1.1" 200 20 "-" "DNSPod-Monitor/1.0"
101.226.68.137 - - [18/Sep/2013:06:49:45 +0000] "HEAD / HTTP/1.1" 200 20 "-" "DNSPod-Monitor/1.0"
60.208.6.156 - - [18/Sep/2013:06:49:48 +0000] "GET /wp-content/uploads/2013/07/rcassandra.png HTTP/1.0" 200 185524 "http://cos.name/category/software/packages/" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"
222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "GET /images/my.jpg HTTP/1.1" 200 19939 "http://www.angularjs.cn/A00n" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36"
222.68.172.190 - - [18/Sep/2013:06:50:08 +0000] "-" 400 0 "-" "-"
Regular expressions needed
IP address:
\\d+[.]\\d+[.]\\d+[.]\\d+
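Here "\\d+" matches one or more consecutive digits (the backslash is doubled because the pattern is written as a Java string literal), and "[.]" matches a literal dot. In a regular expression a bare "." is a wildcard that matches any character, so the dot is placed inside square brackets.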
Time:
\\d+/[A-Za-z]+/\\d+:\\d+:\\d+:\\d+
The "[A-Za-z]+" in the middle matches one or more consecutive letters, because the month is written as Jan, Feb, Mar, ...
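Requested resource:
\"[a-zA-Z]+[^\"]+\"
"[^\"]+" matches one or more consecutive non-quote characters, so matching continues until the next quote is reached; inside square brackets a leading "^" means "not". Because the pattern requires letters right after the opening quote, an empty request logged as "-" does not match.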
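Using regular expressions in Java (matching the IP address as an example):
// Requires java.util.regex.Pattern and java.util.regex.Matcher
// Compile the regular expression
Pattern pip = Pattern.compile("\\d+[.]\\d+[.]\\d+[.]\\d+");
// Match the pattern against the string line
Matcher mip = pip.matcher(line);
// If a match is found
if (mip.find()) {
    // take the matched text
    ip = mip.group();
} else {
    System.out.println("not found");
}
Each call to Matcher.find() looks for the next match. It is called only once here, so only the first match in line, the IP address, is retrieved.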
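The single find() call can be tried on all three patterns at once. The following is a minimal, self-contained sketch that uses one line of the raw data above; the class name LogPatternDemo and the helper first are hypothetical names introduced only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogPatternDemo {
    // The three patterns from the text, written as Java string literals.
    static final Pattern IP = Pattern.compile("\\d+[.]\\d+[.]\\d+[.]\\d+");
    static final Pattern TIME = Pattern.compile("\\d+/[A-Za-z]+/\\d+:\\d+:\\d+:\\d+");
    static final Pattern REQUEST = Pattern.compile("\"[a-zA-Z]+[^\"]+\"");

    // Hypothetical helper: returns the first match of p in line, or null if there is none.
    static String first(Pattern p, String line) {
        Matcher m = p.matcher(line);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String line = "163.177.71.12 - - [18/Sep/2013:06:49:33 +0000] "
                + "\"HEAD / HTTP/1.1\" 200 20 \"-\" \"DNSPod-Monitor/1.0\"";
        System.out.println(first(IP, line));      // 163.177.71.12
        System.out.println(first(TIME, line));    // 18/Sep/2013:06:49:33
        System.out.println(first(REQUEST, line)); // "HEAD / HTTP/1.1"
    }
}
```

Note that group() returns the request together with its surrounding quotes, because the quotes are part of the pattern.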
To retrieve every match, call find() in a while loop:
while (mip.find()) {
    System.out.println(mip.group());
}
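As a rough illustration of the loop, the sketch below collects every match into a list. The class name FindAllDemo and the method findAll are hypothetical. The input is one of the raw-data lines above, and it shows a side effect worth knowing: run over a whole line, the IP pattern also picks up the Chrome version number in the user agent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Hypothetical helper: returns every non-overlapping match of regex in text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {          // each call advances to the next match
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String line = "222.68.172.190 - - [18/Sep/2013:06:49:57 +0000] "
                + "\"GET /images/my.jpg HTTP/1.1\" 200 19939 \"http://www.angularjs.cn/A00n\" "
                + "\"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
                + "Chrome/29.0.1547.66 Safari/537.36\"";
        // Prints [222.68.172.190, 29.0.1547.66]
        System.out.println(findAll("\\d+[.]\\d+[.]\\d+[.]\\d+", line));
    }
}
```

For the client IP of a log line, a single find() is therefore usually the safer choice, since the address is the first thing on the line.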
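The other fields, such as the time, the requested resource, the traffic (bytes sent), and the status code, can be handled in the same way.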
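The post gives no pattern for the status code or the traffic, so the following is only one possible sketch. It assumes the two numbers always follow the closing quote of the request, as they do in the raw data above, and it uses capture groups to pull them out separately. The pattern, the class name StatusBytesDemo, and the method parse are assumptions, not something from the original.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StatusBytesDemo {
    // Assumed pattern: closing quote of the request, then status code, then bytes sent.
    static final Pattern STATUS_BYTES = Pattern.compile("\" (\\d+) (\\d+) ");

    // Hypothetical helper: returns {status, bytes}, or null if the line does not match.
    static long[] parse(String line) {
        Matcher m = STATUS_BYTES.matcher(line);
        if (!m.find()) {
            return null;
        }
        // group(1) and group(2) are the two parenthesised parts of the pattern.
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String line = "163.177.71.12 - - [18/Sep/2013:06:49:33 +0000] "
                + "\"HEAD / HTTP/1.1\" 200 20 \"-\" \"DNSPod-Monitor/1.0\"";
        long[] r = parse(line);
        System.out.println("status=" + r[0] + " bytes=" + r[1]); // status=200 bytes=20
    }
}
```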
-----------------------------------
author: ZKe