Linux Wildcards and Regular Expressions

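I. Wildcards

Wildcards (globs) are expanded by the shell in command arguments to match file and directory names, for example: *.txt *.sh lidao{1,4}.txt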

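Symbol	Meaning
*	Matches any string, including an empty one
{}	Brace expansion: generates a sequence of words
[]	Matches one character from the set, e.g. [a-z] matches one lowercase letter; one bracket pair stands for exactly one character
[^]	Negation: matches one character that is not in the set
?	Matches any single character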

1.1 The '*' wildcard

Matches any string, e.g. *.log *.txt

# List .log files in the current directory and find .avi files under it
ls *.log
find ./ -name '*.avi'

# Find files anywhere on the system whose names contain catalina
find / -type f -name "*catalina*"
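
To try `*` without touching real files, the sketch below works in a throwaway directory. The file names and the use of `mktemp` are illustrative assumptions, not part of the original walkthrough.

```shell
#!/usr/bin/env bash
# Sketch: try the * wildcard in a throwaway directory (all file names are made up).
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch app.log error.log notes.txt catalina.out

# The shell expands *.log before echo runs, so echo receives two arguments.
logs=$(echo *.log)
echo "$logs"

# Quoting the pattern stops the shell from expanding it, so find does the matching itself.
found=$(find . -type f -name '*catalina*')
echo "$found"

cd / && rm -rf "$tmpdir"
```

This is also why the find examples above quote their patterns: an unquoted `*.avi` would be expanded by the shell first if a matching file happened to exist in the current directory.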

1.2 Brace expansion '{}'

Generates sequences of numbers and letters.

#01
[root@localhost ~]# echo {01..10}
01 02 03 04 05 06 07 08 09 10
[root@localhost ~]# echo {A..Z}
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

#02
[root@localhost ~]# echo {01..10}YANG
01YANG 02YANG 03YANG 04YANG 05YANG 06YANG 07YANG 08YANG 09YANG 10YANG
[root@localhost ~]# echo YANG{01..10}
YANG01 YANG02 YANG03 YANG04 YANG05 YANG06 YANG07 YANG08 YANG09 YANG10

#03
[root@localhost ~]# echo {1,5,9}
1 5 9
[root@localhost ~]# echo A{,B}
A AB

# Generate sequences with a step (for reference)
[root@localhost ~]# seq 1 2 10
1
3
5
7
9
[root@localhost ~]# echo {1..10..2}
1 3 5 7 9
[root@localhost ~]# echo {a..z..2}
a c e g i k m o q s u w y
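
Brace expansion generates words regardless of which files exist, which is what makes it handy for creating files. The following is a minimal bash sketch; the `lidao` file names reuse the pattern from the introduction, and the temporary directory is an assumption made for safety.

```shell
#!/usr/bin/env bash
# Sketch: brace expansion produces words; it does not look at existing files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# {1,4} is a list (two items); {1..4} is a range (four items).
list=$(echo lidao{1,4}.txt)
range=$(echo lidao{1..4}.txt)
echo "$list"
echo "$range"

# Because the words are generated, they can be used to create files in one go.
touch lidao{1..4}.txt
count=$(ls | wc -l)
echo "$count"

# An empty alternative is a common way to copy a file to a backup name.
cp lidao1.txt{,.bak}      # same as: cp lidao1.txt lidao1.txt.bak
bak=$(ls lidao1.txt.bak)
echo "$bak"

cd / && rm -rf "$tmpdir"
```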

1.3 The '?' wildcard

Matches exactly one character of any kind; it acts as a single-character placeholder.

#01
[root@localhost ~]# ls /bin/?
/bin/[  /bin/w
[root@localhost ~]# ls /bin/??
/bin/ar  /bin/cd  /bin/dd  /bin/fc  /bin/ln  /bin/nl  /bin/ps  /bin/rz  /bin/su  /bin/ul
/bin/as  /bin/ci  /bin/df  /bin/fg  /bin/ls  /bin/nm  /bin/rb  /bin/sb  /bin/sx  /bin/vi
/bin/bg  /bin/co  /bin/du  /bin/id  /bin/m4  /bin/od  /bin/rm  /bin/sg  /bin/sz  /bin/wc
/bin/cc  /bin/cp  /bin/ex  /bin/ld  /bin/mv  /bin/pr  /bin/rx  /bin/sh  /bin/tr  /bin/xz
[root@localhost ~]# ls /bin/???
/bin/a2p  /bin/cat  /bin/dwp  /bin/g++  /bin/idn  /bin/ocs  /bin/rev  /bin/seq  /bin/tbl  /bin/vim
/bin/awk  /bin/cmp  /bin/dwz  /bin/gcc  /bin/ldd  /bin/pic  /bin/rpm  /bin/ssh  /bin/tee  /bin/who
/bin/c++  /bin/col  /bin/env  /bin/gdb  /bin/lex  /bin/ptx  /bin/rvi  /bin/sum  /bin/tic  /bin/xxd
/bin/c89  /bin/cpp  /bin/eqn  /bin/gio  /bin/lua  /bin/pwd  /bin/s2p  /bin/svn  /bin/toe  /bin/yes
/bin/c99  /bin/cut  /bin/f95  /bin/git  /bin/lz4  /bin/raw  /bin/scp  /bin/tac  /bin/top  /bin/yum
/bin/cal  /bin/dir  /bin/fmt  /bin/gpg  /bin/man  /bin/rcs  /bin/sed  /bin/tar  /bin/tty  /bin/zip
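
The table at the top also lists `[]` and `[^]`, which have no example of their own. The sketch below shows them next to `?` on made-up file names in a temporary directory; setting `LC_ALL=C` is an assumption made so that the `[a-z]` range and the sort order behave the same on every system.

```shell
#!/usr/bin/env bash
# Sketch: ?, [] and [^] globs on made-up file names.
export LC_ALL=C            # keeps the [a-z] range limited to ASCII lowercase letters
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a1 b2 C3 abc d.txt

two=$(echo ??)             # names that are exactly two characters long
lower=$(echo [a-z]?)       # a lowercase letter, then any one character
other=$(echo [^a-z]?)      # first character is NOT a lowercase letter
echo "$two"
echo "$lower"
echo "$other"

cd / && rm -rf "$tmpdir"
```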

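II. Special Characters

2.1 Quoting: single quotes, double quotes, no quotes, backticks

Quoting style	Behavior
Single quotes	What you see is what you get: the content is passed through unchanged (for most commands)
Double quotes	Like single quotes, but variables and command substitutions inside are expanded; brace expansion and wildcards are not
No quotes	Like double quotes, and brace expansion and wildcards are expanded as well
Backticks	The enclosed command runs first and its output is substituted in place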

[root@localhost ~]# echo 'Yang-zs  $LANG `hostname` $(whoami) {1..5}'
Yang-zs  $LANG `hostname` $(whoami) {1..5}

[root@localhost ~]# echo "Yang-zs  $LANG `hostname` $(whoami) {1..5}"
Yang-zs  zh_CN.UTF-8 localhost.localdomain root {1..5}

[root@localhost ~]# echo Yang-zs  $LANG `hostname` $(whoami) {1..5}
Yang-zs zh_CN.UTF-8 localhost.localdomain root 1 2 3 4 5
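
The same comparison can be reproduced without depending on the host name or locale of a particular machine. This is a small bash sketch; the variable `name` and the inner `echo` commands are placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch: the same text under the three quoting styles (the variable name is made up).
name='Linux'

single='$name $(echo hi) {1..3}'          # nothing is expanded
double="$name $(echo hi) {1..3}"          # variables and command substitution only
unquoted=$(echo $name $(echo hi) {1..3})  # brace expansion happens as well

echo "$single"
echo "$double"
echo "$unquoted"

# Backticks and $( ) both run the inner command first; $( ) nests more easily.
old=`echo inner`
new=$(echo "$(echo inner)")
echo "$old $new"
```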

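III. Regular Expressions

3.1 Overview

Regular expressions match patterns of characters; the "three musketeers" (grep, sed, awk) use them to filter file content.
Typical targets: mobile phone numbers, ID card numbers.
The pattern describes the text, and the tool returns the content that matches it.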
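
As a taste of what such a pattern looks like, here is a hedged sketch that filters lines resembling a mobile number. The format rule (11 digits: a 1, then a digit from 3 to 9, then nine more digits) is an assumption made for this example, and the sample numbers are invented; each piece of syntax is explained in sections 3.5 and 3.6.

```shell
#!/usr/bin/env bash
# Sketch: filter lines that look like an 11-digit mobile number.
# Assumption (not from the article): a number is 1, then a digit 3-9, then nine more digits.
matches=$(printf '%s\n' 13800138000 12345 19912345678 2381234567890 |
  grep -E '^1[3-9][0-9]{9}$')
echo "$matches"
```

The `^` and `$` anchors matter here: without them, a longer digit string that merely contains eleven suitable digits would also pass.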

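3.2 Notes

1️⃣ All symbols must be ASCII (half-width) characters, not full-width Chinese punctuation
2️⃣ When starting out, use grep to see what a pattern matches and how the regex is applied
3️⃣ Be aware of the system language and character set (for reference)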

3.3 Types of regular expressions

Type	Metacharacters
Basic regular expressions (BRE)	^ $ ^$ . * .* [] [^]
Extended regular expressions (ERE)	| + {} () ?

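3.4 Wildcards vs. regular expressions

Type	What it processes	Where it is supported
Wildcards	File and directory names; they operate on command arguments	Most Linux commands
Regular expressions	Filtering and searching for content inside a file; they operate on characters	The three musketeers (grep, sed, awk) and programming languages such as Python and Go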
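
The difference is easiest to see with `*`, which exists in both worlds but means different things. A minimal sketch, using two made-up file names in a temporary directory:

```shell
#!/usr/bin/env bash
# Sketch: the same character * means different things to the shell and to grep.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch ab.txt b.txt

# Glob: a* means "names starting with a, followed by anything".
glob=$(echo a*)
echo "$glob"

# Regex: a* means "zero or more a", which matches every line, even one without an a.
regex=$(printf '%s\n' ab.txt b.txt | grep 'a*')
echo "$regex"

# The regex that behaves like the glob a* is ^a.*
anchored=$(printf '%s\n' ab.txt b.txt | grep '^a.*')
echo "$anchored"

cd / && rm -rf "$tmpdir"
```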

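3.5 Basic regular expressions (BRE)

Prepare the test file (later outputs show trailing periods and extra lines, so the file appears to have been edited as the examples progress)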

[root@localhost ibjs]# cat re.txt 
I  an  ShuaiGe
I  like Linux


my blog is http://127.0.0.1
My qq is 28728222
not  882812311


my god,i  am not Shuaige,but OLDBOY!

BRE metacharacters

1️⃣ ^ lines that begin with ...

#01  ^ lines beginning with ...
[root@localhost ibjs]# grep '^my' re.txt 
my blog is http://127.0.0.1
my god,i  am not Shuaige,but OLDBOY!
[root@localhost ibjs]# grep '^M' re.txt 
My qq is 28728222

2️⃣ $ lines that end with ...

# $  lines ending with ...
[root@localhost ibjs]# grep 'x$' re.txt 
I  like Linux
[root@localhost ibjs]# grep '1$' re.txt 
my blog is http://127.0.0.1
not  882812311

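3️⃣ ^$ matches empty lines

# ^$  matches empty lines
# An empty line contains no characters at all; a line that contains spaces is not empty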
[root@localhost ibjs]# grep -n  '^$' re.txt 
3:
4:
8:
9:

#  Exclude the blank lines and comment lines of a file
[root@localhost ibjs]# grep -n  -v '^$' /etc/ssh/sshd_config | grep -v '#'
22:HostKey /etc/ssh/ssh_host_rsa_key
24:HostKey /etc/ssh/ssh_host_ecdsa_key
25:HostKey /etc/ssh/ssh_host_ed25519_key
32:SyslogFacility AUTHPRIV
47:AuthorizedKeysFile	.ssh/authorized_keys
65:PasswordAuthentication yes
69:ChallengeResponseAuthentication no
79:GSSAPIAuthentication yes
80:GSSAPICleanupCredentials no
96:UsePAM yes
101:X11Forwarding yes
126:AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
127:AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
128:AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
129:AcceptEnv XMODIFIERS
132:Subsystem	sftp	/usr/libexec/openssh/sftp-server
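
The two-step pipeline above drops every line that contains a `#` anywhere, including settings with a trailing comment. A single-pass alternative is sketched below on a made-up temporary file rather than the real sshd_config; it uses `-E` (covered in section 3.6) and the `[[:space:]]` class, and keeps lines whose first non-blank character is not `#`.

```shell
#!/usr/bin/env bash
# Sketch: drop blank lines and comment lines in one pass (the sample file is made up).
conf=$(mktemp)
printf '%s\n' '# main settings' '' 'Port 22' '   ' '  # indented comment' 'UsePAM yes # keep' > "$conf"

# -v inverts the match; the ERE matches lines that are empty, whitespace-only,
# or whose first non-blank character is #.
active=$(grep -Ev '^[[:space:]]*(#|$)' "$conf")
echo "$active"

rm -f "$conf"
```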

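4️⃣ . (dot) matches any single character

# Lines that start with any character; empty lines drop out because . needs one character to match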
[root@localhost ibjs]# grep -n '^.' re.txt 
1:I  an  ShuaiGe
2:I  like Linux
5:my blog is http://127.0.0.1
6:My qq is 28728222
7:not  882812311
10:my god,i  am not Shuaige,but OLDBOY!


# Exercise: filter the lines that end with a literal . (escape it as \. so it loses its special meaning)
[root@localhost ibjs]# grep -n '\.$' re.txt 
1:I  an  ShuaiGe.
2:I  like Linux.
9:yes.

Escape sequence	Meaning
\	Escape: removes the special meaning of the next character
\n	Newline
\t	Tab

# \n
[root@localhost ibjs]# echo -e 'lll\nllll' 
lll
llll
#  \t
[root@localhost ibjs]# echo -e 'lllll\tllllll'
lllll	llllll
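
Forgetting the backslash is a common source of patterns that match too much. The sketch below counts matches with and without escaping the dot; the four sample lines are made up for the comparison.

```shell
#!/usr/bin/env bash
# Sketch: an unescaped . matches any character, an escaped \. only a literal dot.
lines=$(printf '%s\n' 'yes.' 'yes!' '127.0.0.1' '127x0y0z1')

any=$(echo "$lines" | grep -c '.$')      # every non-empty line ends with "some character"
dot=$(echo "$lines" | grep -c '\.$')     # only lines whose last character is a dot
ip=$(echo "$lines" | grep '127\.0\.0\.1')
loose=$(echo "$lines" | grep -c '127.0.0.1')
echo "$any $dot $ip $loose"
```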

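5️⃣ * the preceding character repeated zero or more times

# Consecutive occurrences
		    # zero occurrences = nothing at all
0  			# one occurrence
000			# three occurrences
0000		# several occurrences

#01 Consecutive 0s in the file (0* also matches zero occurrences, so every line is printed)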
[root@localhost ibjs]# grep '0*'  re.txt 
I  an  ShuaiGe.
I  like Linux.


my blog is http://127.0.0.1
My qq is 28728222
not  882812311

yes.

my god,i  am not Shuaige,but OLDBOY!
## When a regex expresses repetition or "everything", it matches as much as possible; this is called greediness

6️⃣ .* matches everything (any sequence of characters, including none)

#01 Everything from the start of the line up to an o
[root@localhost ibjs]# grep '^.*o' re.txt 
my blog is http://127.0.0.1
not  882812311
my god,i  am not Shuaige,but OLDBOY!

#02  Lines that start with I and end with a literal .
[root@localhost ibjs]# grep '^I.*\.$' re.txt 
I  an  ShuaiGe.
I  like Linux.
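
Greediness is hard to see when grep prints whole lines. With `-o`, which prints only the matched text, it becomes visible. The sketch below uses the last line of re.txt; the second pattern relies on `[^o]`, which is introduced under 8️⃣.

```shell
#!/usr/bin/env bash
# Sketch: greediness shown with grep -o, which prints only the matched text.
line='my god,i  am not Shuaige,but OLDBOY!'

greedy=$(echo "$line" | grep -o '^.*o')       # runs to the LAST lowercase o on the line
short=$(echo "$line" | grep -o '^[^o]*o')     # stops at the FIRST lowercase o
echo "$greedy"
echo "$short"
```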

7️⃣ [] e.g. [abc] matches any one of a, b, or c; the whole bracket expression stands for a single character

#01
[root@localhost ibjs]# grep '[abc]' re.txt 
I  an  ShuaiGe.
my blog is http://127.0.0.1
my god,i  am not Shuaige,but OLDBOY!

#02 Shorthand ranges for digits and letters
[root@localhost ibjs]# grep '[0-9]' re.txt 
my blog is http://127.0.0.1
My qq is 28728222
not  882812311
[root@localhost ibjs]# grep '[a-z]' re.txt 
I  an  ShuaiGe.
I  like Linux.
my blog is http://127.0.0.1
My qq is 28728222
not  882812311
yes.
my god,i  am not Shuaige,but OLDBOY!

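#03 Write exactly what you want to match inside the brackets, nothing extra (both cases are listed here, so -i is redundant)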
[root@localhost ibjs]# grep -i '[a-zA-Z]' re.txt 
I  an  ShuaiGe.
I  like Linux.
my blog is http://127.0.0.1
My qq is 28728222
not  882812311
yes.
my god,i  am not Shuaige,but OLDBOY!

# Exercises:
#01 Match the lines that start with an uppercase letter
[root@localhost ibjs]# grep '^[A-Z]' re.txt 
I  an  ShuaiGe.
I  like Linux.
My qq is 28728222
My xxxxxxxx xx xxxxx
Ix 1233333 
Bi jjjxjd


#02 Match the lines that start with an uppercase letter and end with a lowercase letter or a space
[root@localhost ibjs]# grep '^[A-Z].*[a-z ]$' re.txt 
My xxxxxxxx xx xxxxx
Ix 1233333 
Bi jjjxjd

8️⃣ [^] negation: [^abc] matches one character that is not a, not b, and not c

#01
[root@localhost ibjs]# grep '[^abc]' re.txt 
I  an  ShuaiGe.
I  like Linux.
my blog is http://127.0.0.1
My qq is 28728222
not  882812311
yes.
my god,i  am not Shuaige,but OLDBOY!
My xxxxxxxx xx xxxxx
Ix 1233333 
Bi jjjxjd
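
Every non-empty line is printed above because each contains at least one character other than a, b, or c. In other words, `[^abc]` does not mean "lines without a, b, or c". The sketch below contrasts the three readings on made-up input.

```shell
#!/usr/bin/env bash
# Sketch: [^abc] is not the same as "lines without a, b or c".
lines=$(printf '%s\n' 'abc' 'cab' 'abcd' 'xyz')

# [^abc] needs ONE character that is not a, b or c, so abcd matches because of d.
has_other=$(echo "$lines" | grep '[^abc]')
# To drop every line that contains a, b or c, invert a positive match with -v.
none=$(echo "$lines" | grep -v '[abc]')
# Lines made up ONLY of a, b and c need anchors plus *.
only=$(echo "$lines" | grep '^[abc]*$')
echo "$has_other"; echo "$none"; echo "$only"
```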

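BRE summary

BRE	Meaning
^	Lines that begin with ...
$	Lines that end with ...
^$	Empty line, containing no characters at all
.	Any single character
*	The preceding character repeated zero or more times
.*	Everything; greedy: repetition matches as much as possible
[]	[abc], [a-z], [0-9], [a-z0-9A-Z]: matches one character from the set, i.e. a or b or c
[^]	[^abc]: matches one character that is not a, b, or c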

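3.6 Extended regular expressions (ERE)

1️⃣ + the preceding character occurs one or more times

egrep is now deprecated; use grep -E to enable extended regular expressions
#01 Filter the lines with one or more consecutive e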
[root@localhost ibjs]# grep -E 'e+' re.txt 
I  an  ShuaiGe.
I  like Linux.
yes.

#02 Filter the lines with one or more consecutive digits
[root@localhost ibjs]# grep -E '[0-9]+' re.txt 
my blog is http://127.0.0.1
My qq is 28728222
not  882812311
Ix 1233333 

#03 Filter MY case-insensitively (note: + applies only to the Y; repeating the whole word needs a group, (MY)+)
[root@localhost ibjs]# grep -E -i 'MY+' re.txt 
my blog is http://127.0.0.1
My qq is 28728222
my god,i  am not Shuaige,but OLDBOY!
My xxxxxxxx xx xxxxx

2️⃣ | means "or" (alternation)

# Match I or my
[root@localhost ibjs]# grep -E 'I|my' re.txt 
I  an  ShuaiGe.
I  like Linux.
my blog is http://127.0.0.1
my god,i  am not Shuaige,but OLDBOY!
Ix 1233333 

# Difference between | and []
# In common: both can express "or"
# Difference:
# | chooses between words or characters: a|b|c, am|my
# [] chooses between single characters only: [abc]

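3️⃣ {} a{n,m}: the preceding character a repeated at least n and at most m times

Form	Meaning
a{n,m}	The preceding character occurs at least n and at most m times
a{n}	The preceding character occurs exactly n times in a row
a{n,}	The preceding character occurs at least n times
a{,m}	The preceding character occurs at most m times

#01  Match the digit 0 repeated at least once and at most 3 times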
[root@localhost ibjs]# grep -E '0{1,3}' re.txt 
my blog is http://127.0.0.1
0
000
000000000
000000
0000

#02 Match the digit 0 repeated 3 times in a row
[root@localhost ibjs]# grep -E '0{3}' re.txt 
000
000000000
000000
0000
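
Lines with more than three zeros still match `0{3}` because the pattern only has to match somewhere inside the run. The sketch below, on made-up input, shows how `{1,3}` splits a long run and how anchors make the count exact.

```shell
#!/usr/bin/env bash
# Sketch: how {n,m} consumes a run of zeros, shown with grep -o.
chunks=$(echo '0000000' | grep -oE '0{1,3}' | paste -s -d ' ' -)
echo "$chunks"

# Without anchors, 0{3} also matches inside longer runs; anchors make it exact.
exact=$(printf '%s\n' 0 000 0000 | grep -E '^0{3}$')
echo "$exact"
```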

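4️⃣ () the enclosed content is treated as a single unit (a group); groups are also what sed back-references refer to

# Match 0, then either a literal . or 0, then 0
[root@localhost ibjs]# grep -E '0(\.|0)0' re.txt 
my blog is http://127.0.0.1
000
000000000
000000
0000

5️⃣ ? the preceding character occurs zero or one time

# Match gd or god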
[root@localhost ibjs]# grep -E 'gd|god' re.txt 
my god,i  am not Shuaige,but OLDBOY!
god
gd


[root@localhost ibjs]# grep -E 'go?d' re.txt 
my god,i  am not Shuaige,but OLDBOY!
god
gd
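
The sed back-reference mentioned under `()` has no example above, so here is a minimal sketch together with a grouped quantifier. The sample strings are made up; `sed -E` is assumed to be available, as it is in GNU and BSD sed.

```shell
#!/usr/bin/env bash
# Sketch: ( ) groups in sed -E are numbered left to right and reused as \1, \2.
swapped=$(echo 'hello_world' | sed -E 's/^(.*)_(.*)$/\2_\1/')
echo "$swapped"

# Grouping also lets a quantifier apply to a whole word instead of one character.
repeated=$(printf '%s\n' 'MYMY' 'MYY' | grep -E '^(MY)+$')
echo "$repeated"

# The same works with ?: the group as a whole becomes optional.
optional=$(printf '%s\n' 'http' 'https' 'httpss' | grep -E '^http(s)?$' | paste -s -d ' ' -)
echo "$optional"
```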

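6️⃣ ERE summary

ERE	Meaning
+	The preceding character occurs one or more times
|	Alternation (or)
{}	a{n,m}: the preceding character occurs at least n and at most m times
()	1. Grouping: treats the content as one unit 2. Captures text for sed back-references
?	The preceding character occurs zero or one time

7️⃣ Bracket expressions with POSIX character classes (for reference)

# [[:alnum:]]
# Matches digits 0-9 and letters a-z A-Z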
[root@localhost ibjs]# grep '[[:alnum:]]' re.txt 
I  an  ShuaiGe.
I  like Linux.
my blog is http://127.0.0.1
My qq is 28728222
not  882812311
yes.
my god,i  am not Shuaige,but OLDBOY!
My xxxxxxxx xx xxxxx
Ix 1233333 
Bi jjjxjd
0
000
000000000
000000
0000
432511120011123341
god
df
gd
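
`[[:alnum:]]` is one of several POSIX character classes. The sketch below tries a few others on made-up lines; note that a class name such as `[:digit:]` only works inside an outer pair of brackets.

```shell
#!/usr/bin/env bash
# Sketch: a few POSIX character classes; they go INSIDE a bracket expression.
lines=$(printf '%s\n' 'My qq is 28728222' 'yes.' '   ' 'OLDBOY!')

digits=$(echo "$lines" | grep -c '[[:digit:]]')        # lines containing a digit
upper=$(echo "$lines" | grep '^[[:upper:]]')           # lines starting with an uppercase letter
blank=$(echo "$lines" | grep -c '^[[:space:]]*$')      # empty or whitespace-only lines
punct=$(echo "$lines" | grep -o '[[:punct:]]' | paste -s -d ' ' -)
echo "$digits"; echo "$upper"; echo "$blank"; echo "$punct"
```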

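8️⃣ grep options

Option	Meaning
-n	Show line numbers
-v	Invert the match (print the lines that do not match)
-i	Ignore case
-o	Print only the matched part of each line (useful for seeing exactly what a pattern matches)
-E	Use extended regular expressions; the basic metacharacters still work
-w	Match whole words only
-R	Search directories recursively
-l (lowercase L)	Print only the names of files that contain a match
-A NUM	Print each match plus NUM lines after it
-B NUM	Print each match plus NUM lines before it
-C NUM	Print each match plus NUM lines before and after it

# grep whole-word match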
[root@localhost ibjs]# grep -w '0' re.txt 
my blog is http://127.0.0.1
0

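# Find the files under /etc that contain oldboy
# Method 1: general (note: the backtick form breaks on file names that contain spaces)
grep 'oldboy' `find /etc -type f `

# Method 2: for simpler cases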
grep -R 'oldboy' /etc

# Check whether any file under a directory contains the malicious link www.bingdu.com
grep -Rl 'www.bingdu.com' /etc
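
Several options in the table have no example above. The sketch below tries `-o`, `-w` (combined with `-c`, which counts matching lines), and `-A` on a small temporary file whose contents are borrowed from re.txt; the file itself is an assumption made so the commands run anywhere.

```shell
#!/usr/bin/env bash
# Sketch: -o, -w, -c and -A on a small made-up file.
f=$(mktemp)
printf '%s\n' 'my blog is http://127.0.0.1' 'My qq is 28728222' 'not  882812311' 'end' > "$f"

only=$(grep -oE '[0-9]+' "$f" | head -n 4 | paste -s -d ' ' -)  # -o: matched text only
word=$(grep -cw 'is' "$f")                                       # -w: "is" as a whole word
after=$(grep -A 1 'qq' "$f")                                     # -A 1: match plus one line after
echo "$only"; echo "$word"; echo "$after"

rm -f "$f"
```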

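3.7 Regex summary

Repetition
+	One or more times
{}	Any range of repetitions
*	Zero or more times
?	Zero or one time

Grouping and sets
[]	[abc]: a or b or c
()	Grouping
[^]	Negation
|	Alternation (or)

Other
^	Beginning of line
$	End of line
^$	Empty line
.*	Everything
.	Any single character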