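Each section of python3-cookbook is split into three parts (problem, solution, discussion) and explores the best way to solve a particular class of problem in Python 3, or, put differently, how to make better use of Python 3's own data structures, functions, classes and other features for that class of problem. The book noticeably helps with deepening one's use of Python 3 and improving Python programming skills, especially with how to improve Py ...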
2.1 Splitting Strings on Multiple Delimiters
>>> import re
>>> line = 'asdf fjdk; afed, fjek,asdf, foo'
>>> re.split(r'[;,\s]\s*', line)
['asdf', 'fjdk', 'afed', 'fjek', 'asdf', 'foo']
>>> fields = re.split(r'(;|,|\s)\s*', line)  # text matched by a capture group also appears in the result
>>> fields
['asdf', ' ', 'fjdk', ';', 'afed', ',', 'fjek', ',', 'asdf', ',', 'foo']
>>>
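As a follow-up sketch (the names `plain`, `values`, `delimiters` and `rebuilt` are illustrative and not from the original notes), a non-capturing group `(?:...)` groups the alternatives without adding the delimiters to the result, while the delimiters captured by the second call can be used to reassemble a string:

```python
import re

line = 'asdf fjdk; afed, fjek,asdf, foo'

# Non-capturing group: the alternatives are grouped, but delimiters are dropped.
plain = re.split(r'(?:,|;|\s)\s*', line)

# Capturing group: the delimiters land at the odd indexes of the result.
fields = re.split(r'(;|,|\s)\s*', line)
values = fields[::2]
delimiters = fields[1::2] + ['']  # pad so both lists have the same length

# Rebuild a string from values and delimiters. The whitespace that followed
# each delimiter was consumed by \s* and is not restored.
rebuilt = ''.join(v + d for v, d in zip(values, delimiters))

print(plain)
print(rebuilt)
```

The capturing form is mainly useful when the delimiters themselves carry meaning and need to be kept for later output.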
2.3 Matching Strings Using Shell Wildcard Patterns
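When the ordinary string methods are not enough for a match but a regular expression would be overkill, consider fnmatch.fnmatch or fnmatch.fnmatchcase. Both match strings using the wildcards common in Unix shells. The difference is that the former follows the case-sensitivity rules of the underlying operating system, while the latter matches case exactly as written in the pattern.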
>>> from fnmatch import fnmatch, fnmatchcase
>>> fnmatch('foo.txt', '*.txt')
True
>>> fnmatch('foo.txt', '?oo.txt')
True
>>> fnmatch('Dat45.csv', 'Dat[0-9]*')
True
>>>
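The session above only exercises `fnmatch`. A minimal sketch of the case-sensitivity difference described earlier, plus filtering a list of names, might look like this (the file names in `names` are made up for illustration):

```python
from fnmatch import fnmatch, fnmatchcase

# fnmatchcase() compares case exactly, independent of the operating system.
exact_upper = fnmatchcase('foo.txt', '*.TXT')   # False on every platform
exact_lower = fnmatchcase('foo.txt', '*.txt')   # True on every platform

# fnmatch() normalizes case with os.path.normcase() first, so this result
# depends on the platform (case-insensitive on Windows, exact on POSIX).
platform_dependent = fnmatch('foo.txt', '*.TXT')

# Filter a list of hypothetical file names with a wildcard pattern.
names = ['Dat1.csv', 'Dat2.csv', 'config.ini', 'foo.py']
csv_files = [name for name in names if fnmatchcase(name, 'Dat*.csv')]

print(exact_upper, exact_lower, platform_dependent)
print(csv_files)
```

Using `fnmatchcase` is the safer choice when the same script has to behave identically across operating systems.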
2.13 Aligning Text Strings
>>> text = 'Hello World'
>>> text.ljust(20)
'Hello World         '
>>> text.rjust(20)
'         Hello World'
>>> text.center(20)
'    Hello World     '
>>> text.rjust(20, '=')
'=========Hello World'
>>> text.center(20, '*')
'****Hello World*****'
>>>
>>> # Formatting strings
>>> format(text, '>20')
'         Hello World'
>>> format(text, '<20')
'Hello World         '
>>> format(text, '^20')
'    Hello World     '
>>> format(text, '=<20s')
'Hello World========='
>>> format(text, '*^20s')
'****Hello World*****'
>>> # Formatting numbers
>>> x = 1.2345
>>> format(x, '^10.2f')
'   1.23   '
>>> # The str.format method
>>> '{:>10s} {:>10s}'.format('Hello', 'World')
'     Hello      World'
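As a small illustration of how the alignment codes combine in practice, the sketch below lines up a two-column table. The `rows` data and the column widths are invented for the example.

```python
# Hypothetical data: (name, price) pairs to print as an aligned table.
rows = [('apple', 1.5), ('banana', 12.25), ('kiwi', 0.333)]

lines = []
for name, price in rows:
    # '<10' left-aligns the name in 10 columns; '>8.2f' right-aligns the
    # number in 8 columns with two decimal places.
    lines.append('{:<10}{:>8.2f}'.format(name, price))

print('\n'.join(lines))
```

Because every row uses the same widths, the decimal points end up in the same column regardless of how long each name is.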
2.16 Reformatting Text to a Fixed Number of Columns
>>> import textwrap
>>> s = "Look into my eyes, look into my eyes, the eyes, the eyes, the eyes, not around the eyes, don't look around the eyes, look into my eyes, you're under."
>>> print(textwrap.fill(s, 70))
Look into my eyes, look into my eyes, the eyes, the eyes, the eyes,
not around the eyes, don't look around the eyes, look into my eyes,
you're under.
>>> print(textwrap.fill(s, 40))
Look into my eyes, look into my eyes,
the eyes, the eyes, the eyes, not around
the eyes, don't look around the eyes,
look into my eyes, you're under.
>>> print(textwrap.fill(s, 40, initial_indent='    '))
    Look into my eyes, look into my
eyes, the eyes, the eyes, the eyes, not
around the eyes, don't look around the
eyes, look into my eyes, you're under.
>>> print(textwrap.fill(s, 40, subsequent_indent='    '))
Look into my eyes, look into my eyes,
    the eyes, the eyes, the eyes, not
    around the eyes, don't look around
    the eyes, look into my eyes, you're
    under.
>>>
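When the text is meant for a terminal, the width does not have to be hard-coded. The sketch below assumes a helper named `fill_to_width` (a name made up here) that falls back to the terminal width reported by `shutil.get_terminal_size` when no width is given:

```python
import shutil
import textwrap


def fill_to_width(text, width=None):
    """Wrap text to `width` columns, defaulting to the current terminal width."""
    if width is None:
        # Falls back to 80 columns when the terminal size cannot be determined.
        width = shutil.get_terminal_size(fallback=(80, 24)).columns
    return textwrap.fill(text, width)


s = ("Look into my eyes, look into my eyes, the eyes, the eyes, the eyes, "
     "not around the eyes, don't look around the eyes, look into my eyes, "
     "you're under.")

print(fill_to_width(s))       # wraps to whatever the terminal reports
print(fill_to_width(s, 40))   # explicit width, same as textwrap.fill(s, 40)
```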
2.17 Handling HTML and XML Entities in Text
>>> import html
>>> s = 'Elements are written as "<tag>text</tag>".'
>>> print(s)
Elements are written as "<tag>text</tag>".
>>> print(html.escape(s))
Elements are written as &quot;&lt;tag&gt;text&lt;/tag&gt;&quot;.
>>> print(html.escape(s, quote=False))
Elements are written as "&lt;tag&gt;text&lt;/tag&gt;".
>>>
>>> s = 'Spicy &quot;Jalape&#241;o&quot;.'
>>> html.unescape(s)  # HTMLParser().unescape() was removed in Python 3.9
'Spicy "Jalapeño".'
>>>
>>> from xml.sax.saxutils import unescape
>>> t = 'The prompt is &gt;&gt;&gt;'
>>> unescape(t)
'The prompt is >>>'
>>>
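A short sketch of two related points: `html.escape` and `html.unescape` round-trip a string, and non-ASCII characters can be turned into numeric character references with the `xmlcharrefreplace` error handler when encoding to ASCII. The variable names are illustrative.

```python
import html

s = 'Elements are written as "<tag>text</tag>".'

# Escaping followed by unescaping returns the original text.
escaped = html.escape(s)
restored = html.unescape(escaped)

# Emit ASCII-only output, replacing non-ASCII characters with
# numeric character references such as &#241;.
ascii_bytes = 'Spicy Jalapeño'.encode('ascii', errors='xmlcharrefreplace')

print(escaped)
print(restored == s)
print(ascii_bytes)
```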
2.18 Tokenizing Text
import re

# One named capture group per token type
NAME = r'(?P<NAME>[a-zA-Z_][a-zA-Z_0-9]*)'
NUM = r'(?P<NUM>\d+)'
PLUS = r'(?P<PLUS>\+)'
TIMES = r'(?P<TIMES>\*)'
EQ = r'(?P<EQ>=)'
WS = r'(?P<WS>\s+)'

master_pat = re.compile('|'.join([NAME, NUM, PLUS, TIMES, EQ, WS]))

# scanner.match() returns successive matches and None once nothing matches
scanner = master_pat.scanner('foo = 42')
for m in iter(scanner.match, None):
    print(m.lastgroup, m.group())
NAME foo
WS  
EQ =
WS  
NUM 42
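The scanning loop can be wrapped in a generator so that tokens are easy to filter. The following is a minimal sketch under that assumption. `Token` and `generate_tokens` are names introduced here for illustration, and `scanner()` is an undocumented method of compiled pattern objects.

```python
import re
from collections import namedtuple

Token = namedtuple('Token', ['type', 'value'])

NAME = r'(?P<NAME>[a-zA-Z_][a-zA-Z_0-9]*)'
NUM = r'(?P<NUM>\d+)'
PLUS = r'(?P<PLUS>\+)'
TIMES = r'(?P<TIMES>\*)'
EQ = r'(?P<EQ>=)'
WS = r'(?P<WS>\s+)'

master_pat = re.compile('|'.join([NAME, NUM, PLUS, TIMES, EQ, WS]))


def generate_tokens(pat, text):
    """Yield a Token for each successive match of pat in text."""
    scanner = pat.scanner(text)
    # Iteration stops silently at the first position where nothing matches.
    for m in iter(scanner.match, None):
        yield Token(m.lastgroup, m.group())


# Drop whitespace tokens while consuming the stream.
tokens = [tok for tok in generate_tokens(master_pat, 'foo = 42')
          if tok.type != 'WS']
print(tokens)
```

Because scanning stops at the first unmatched character, the patterns need to cover every character sequence that can appear in the input, otherwise the tail of the text is ignored without an error.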