Python built-in functions (47): open

Source: http://www.cnblogs.com/sesshoumaru/archive/2016/11/09/6047046.html


English documentation:

open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

Open file and return a corresponding file object. If the file cannot be opened, an OSError is raised.

file is either a string or bytes object giving the pathname (absolute or relative to the current working directory) of the file to be opened or an integer file descriptor of the file to be wrapped. (If a file descriptor is given, it is closed when the returned I/O object is closed, unless closefd is set to False.)

mode is an optional string that specifies the mode in which the file is opened. It defaults to 'r' which means open for reading in text mode. Other common values are 'w' for writing (truncating the file if it already exists), 'x' for exclusive creation and 'a' for appending (which on some Unix systems, means that all writes append to the end of the file regardless of the current seek position). In text mode, if encoding is not specified the encoding used is platform dependent: locale.getpreferredencoding(False) is called to get the current locale encoding. (For reading and writing raw bytes use binary mode and leave encoding unspecified.) The available modes are:

Character  Meaning
'r' open for reading (default)
'w' open for writing, truncating the file first
'x' open for exclusive creation, failing if the file already exists
'a' open for writing, appending to the end of the file if it exists
'b' binary mode
't' text mode (default)
'+' open a disk file for updating (reading and writing)
'U' universal newlines mode (deprecated)

The default mode is 'r' (open for reading text, synonym of 'rt'). For binary read-write access, the mode 'w+b' opens and truncates the file to 0 bytes. 'r+b' opens the file without truncation.
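The difference between 'x', 'r+b' and 'w+b' can be checked with a short sketch. It works in a temporary directory, and the file name demo.bin is only a placeholder chosen for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")  # hypothetical file name

    # 'x' creates the file and fails if it already exists.
    with open(path, "xb") as f:
        f.write(b"0123456789")
    try:
        open(path, "xb")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # 'r+b' keeps the existing bytes and allows reading and writing.
    with open(path, "r+b") as f:
        f.write(b"AB")  # overwrites the first two bytes only
        f.seek(0)
        after_update = f.read()

    # 'w+b' truncates the file to 0 bytes before anything is written.
    with open(path, "w+b") as f:
        after_truncate = f.read()

print(second_create_failed, after_update, after_truncate)
```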

As mentioned in the Overview, Python distinguishes between binary and text I/O. Files opened in binary mode (including 'b' in the mode argument) return contents as bytes objects without any decoding. In text mode (the default, or when 't' is included in the mode argument), the contents of the file are returned as str, the bytes having been first decoded using a platform-dependent encoding or using the specified encoding if given.

Note

Python doesn’t depend on the underlying operating system’s notion of text files; all the processing is done by Python itself, and is therefore platform-independent.

buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer. When no buffering argument is given, the default buffering policy works as follows:

  • Binary files are buffered in fixed-size chunks; the size of the buffer is chosen using a heuristic trying to determine the underlying device’s “block size” and falling back on io.DEFAULT_BUFFER_SIZE. On many systems, the buffer will typically be 4096 or 8192 bytes long.
  • “Interactive” text files (files for which isatty() returns True) use line buffering. Other text files use the policy described above for binary files.
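One way to observe the buffering policy is to look at the type of object that open returns. The sketch below assumes a regular file in a temporary directory; demo.bin is a placeholder name.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"data")

    # buffering=0 in binary mode returns the raw, unbuffered stream.
    with open(path, "rb", buffering=0) as raw:
        raw_is_fileio = isinstance(raw, io.FileIO)

    # The default policy wraps the raw stream in a buffered reader.
    with open(path, "rb") as buffered:
        default_is_buffered = isinstance(buffered, io.BufferedReader)

    # buffering=1 selects line buffering, which only applies to text mode.
    with open(path, "r", encoding="ascii", buffering=1) as text:
        line_buffered = text.line_buffering

    # Unbuffered text I/O is rejected.
    try:
        open(path, "r", encoding="ascii", buffering=0)
        unbuffered_text_rejected = False
    except ValueError:
        unbuffered_text_rejected = True

print(raw_is_fileio, default_is_buffered, line_buffered, unbuffered_text_rejected)
```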

encoding is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.
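Because the default encoding depends on the platform, passing encoding explicitly is usually the safer choice. A minimal sketch, using a placeholder file name and sample text invented for the example:

```python
import os
import tempfile

text = "na\u00efve caf\u00e9"  # sample text containing non-ASCII characters

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Write and read back with the same explicit encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, "r", encoding="utf-8") as f:
        round_trip = f.read()

    # Binary mode returns the undecoded bytes.
    with open(path, "rb") as f:
        raw = f.read()

    # Decoding with a different encoding silently yields different text.
    with open(path, "r", encoding="latin-1") as f:
        wrong = f.read()

print(round_trip == text, raw, wrong)
```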

errors is an optional string that specifies how encoding and decoding errors are to be handled–this cannot be used in binary mode. A variety of standard error handlers are available (listed under Error Handlers), though any error handling name that has been registered with codecs.register_error() is also valid. The standard names include:

  • 'strict' to raise a ValueError exception if there is an encoding error. The default value of None has the same effect.
  • 'ignore' ignores errors. Note that ignoring encoding errors can lead to data loss.
  • 'replace' causes a replacement marker (such as '?') to be inserted where there is malformed data.
  • 'surrogateescape' will represent any incorrect bytes as low surrogate code units ranging from U+DC80 to U+DCFF. These surrogate code units will then be turned back into the same bytes when the surrogateescape error handler is used when writing data. This is useful for processing files in an unknown encoding.
  • 'xmlcharrefreplace' is only supported when writing to a file. Characters not supported by the encoding are replaced with the appropriate XML character reference &#nnn;.
  • 'backslashreplace' replaces malformed data by Python’s backslashed escape sequences.
  • 'namereplace' (also only supported when writing) replaces unsupported characters with \N{...} escape sequences.
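The handlers listed above can be compared on a small file containing one byte that is not valid UTF-8. This is only a sketch; the file name and the sample bytes are made up for the illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # b"\xe9" is 'é' in Latin-1 but is not valid UTF-8 on its own.
    with open(path, "wb") as f:
        f.write(b"caf\xe9 ok")

    def read_with(handler):
        """Decode the file as UTF-8 using the given error handler."""
        with open(path, "r", encoding="utf-8", errors=handler) as f:
            return f.read()

    try:
        read_with("strict")
        strict_raised = False
    except UnicodeDecodeError:  # a subclass of ValueError
        strict_raised = True

    ignored = read_with("ignore")
    replaced = read_with("replace")
    escaped = read_with("backslashreplace")
    smuggled = read_with("surrogateescape")

    # surrogateescape restores the original bytes when writing.
    with open(path, "w", encoding="utf-8", errors="surrogateescape") as f:
        f.write(smuggled)
    with open(path, "rb") as f:
        round_trip = f.read()

    # xmlcharrefreplace only applies when encoding (writing).
    with open(path, "w", encoding="ascii", errors="xmlcharrefreplace") as f:
        f.write("caf\u00e9")
    with open(path, "rb") as f:
        xml_bytes = f.read()

print(strict_raised, ignored, escaped, round_trip, xml_bytes)
```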

newline controls how universal newlines mode works (it only applies to text mode). It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

  • When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If it is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If it has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.
  • When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
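The reading and writing rules above can be illustrated with a file that mixes all three line endings. The sketch below uses a placeholder file name and writes the raw bytes in binary mode so that no translation happens during setup.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Three lines, each with a different line ending.
    with open(path, "wb") as f:
        f.write(b"a\r\nb\rc\n")

    def lines(newline):
        """Read all lines using the given newline setting."""
        with open(path, "r", encoding="ascii", newline=newline) as f:
            return f.readlines()

    translated = lines(None)   # universal newlines, translated to '\n'
    untranslated = lines("")   # universal newlines, endings kept as-is
    only_lf = lines("\n")      # only '\n' terminates a line
    only_cr = lines("\r")      # only '\r' terminates a line

    # When writing, '\n' is translated to the given string ...
    with open(path, "w", encoding="ascii", newline="\r\n") as f:
        f.write("x\ny\n")
    with open(path, "rb") as f:
        written_crlf = f.read()

    # ... unless newline is '' or '\n', in which case nothing is translated.
    with open(path, "w", encoding="ascii", newline="") as f:
        f.write("x\ny\n")
    with open(path, "rb") as f:
        written_raw = f.read()

print(translated, untranslated, only_lf, only_cr, written_crlf, written_raw)
```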

If closefd is False and a file descriptor rather than a filename was given, the underlying file descriptor will be kept open when the file is closed. If a filename is given closefd must be True (the default) otherwise an error will be raised.
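A small sketch of closefd=False, assuming a descriptor obtained from os.open; the file name is a placeholder. After the file object is closed, the descriptor is still usable and has to be closed by the caller.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name

    # Obtain a raw file descriptor ourselves.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)

    # Wrap it without handing over ownership of the descriptor.
    with open(fd, "w", encoding="utf-8", closefd=False) as f:
        f.write("first")

    # The file object is closed, but the descriptor is still open.
    os.write(fd, b" second")
    os.close(fd)  # closing it is now the caller's job

    with open(path, encoding="utf-8") as f:
        content = f.read()

    # closefd=False is rejected when a file name is given.
    try:
        open(path, closefd=False)
        closefd_rejected = False
    except ValueError:
        closefd_rejected = True

print(content, closefd_rejected)
```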

A custom opener can be used by passing a callable as opener. The underlying file descriptor for the file object is then obtained by calling opener with (file, flags). opener must return an open file descriptor (passing os.open as opener results in functionality similar to passing None).
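As an illustration, an opener can record how it is called and pass a permission mode to os.open. The function name recording_opener, the file name and the 0o600 mode are choices made for this sketch, not part of the documented API.

```python
import os
import tempfile

calls = []  # records the (file, flags) pairs passed to the opener

def recording_opener(path, flags):
    """Hypothetical opener: log the call, then create the file owner-only."""
    calls.append((path, flags))
    return os.open(path, flags, 0o600)  # must return an open file descriptor

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "secret.txt")  # hypothetical file name

    with open(path, "w", encoding="utf-8", opener=recording_opener) as f:
        f.write("secret")

    with open(path, encoding="utf-8") as f:
        content = f.read()

print(content, len(calls))
```

The flags value is computed by open from the mode string, so for 'w' it includes os.O_WRONLY and os.O_CREAT. On POSIX systems the resulting permissions are still subject to the process umask.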

 

Notes:

  1. open opens a file and returns a file object, which can then be used to read from or write to the file.

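  2. The file parameter is the path of the file to open, either relative to the current working directory or absolute. When opening for reading, an error is raised if no file exists at that path. An integer file descriptor can be passed instead of a path.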

>>> a = open('test.txt') # relative path
>>> a
<_io.TextIOWrapper name='test.txt' mode='r' encoding='cp936'>
>>> a.close()

>>> a = open(r'D:\Python\Python35-32\test.txt') # absolute path
>>> a
<_io.TextIOWrapper name='D:\\Python\\Python35-32\\test.txt' mode='r' encoding='cp936'>
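The sessions above close the file by hand (or not at all). A with statement closes it automatically, and a missing path raises FileNotFoundError, a subclass of OSError. A minimal sketch with placeholder file names:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test.txt")             # hypothetical file name
    missing = os.path.join(tmp, "no_such_file.txt")  # deliberately absent

    # Opening a non-existent file for reading fails.
    try:
        open(missing)
        missing_raised = False
    except FileNotFoundError as exc:
        missing_raised = isinstance(exc, OSError)

    with open(path, "w", encoding="utf-8") as f:
        f.write("some text")

    # The file is closed when the with block exits, even on error.
    with open(path, encoding="utf-8") as f:
        text = f.read()
    closed_after_block = f.closed

print(missing_raised, text, closed_after_block)
```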

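  3. The mode parameter specifies how the file is opened. The common modes are listed below and can be combined as needed.

    'r': open for reading only (the default); the file must already exist.
    'w': open for writing only. An existing file is truncated to zero length; a missing file is created. The containing directory must exist, but the file itself need not. The read*() methods cannot be used in this mode.

    'a': open for appending. Writes go to the end of an existing file; a missing file is created. The read*() methods cannot be used in this mode.

  

  The following four characters are used in combination with the modes above:
    'b': binary mode

    't': text mode (the default)
    '+': open for updating (reading and writing)
    'U': universal newlines mode (deprecated)

  Common mode combinations:


    'r' or 'rt':   the default; read in text mode
    'w' or 'wt':   write in text mode (the file is truncated on opening)
    'rb':          read in binary mode
    'ab':          append in binary mode
    'wb':          write in binary mode (the file is truncated on opening)
    'r+':          read and write in text mode; writes can go anywhere in the file. The position starts at the beginning, so writing overwrites existing content from there
    'w+':          read and write in text mode (the file is truncated on opening); read*() can be used
    'a+':          read and write in text mode (writes go to the end of the file); read*() can be used
    'rb+':         read and write in binary mode
    'wb+':         read and write in binary mode (the file is truncated on opening)
    'ab+':         read and append in binary mode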
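The 'a+' combination is the one that most often surprises: the position starts at the end of the file, so read() returns nothing until seek(0) is called, and writes still land at the end. The sketch below shows the behaviour commonly seen on Linux, macOS and Windows; since append semantics come from the operating system, treat the last result as an assumption about the platform. The file name is a placeholder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")  # hypothetical file name
    with open(path, "w", encoding="ascii") as f:
        f.write("0123456789")

    with open(path, "a+", encoding="ascii") as f:
        at_open = f.read()      # position starts at the end of the file
        f.seek(0)
        from_start = f.read()   # now the existing content is readable
        f.seek(0)
        f.write("XY")           # appended, despite the seek to the start
        f.seek(0)
        after_append = f.read()

print(repr(at_open), from_start, after_append)
```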

 

# 't' selects text mode, 'b' selects binary mode
>>> a = open('test.txt','rt')
>>> a.read()
'some text'
>>> a = open('test.txt','rb')
>>> a.read()
b'some text'

# 'r' is read-only and cannot write; 'w' is write-only and cannot read
>>> a = open('test.txt','rt')
>>> a.write('more text')
Traceback (most recent call last):
  File "<pyshell#67>", line 1, in <module>
    a.write('more text')
io.UnsupportedOperation: write
>>> a = open('test.txt','wt')
>>> a.read()
Traceback (most recent call last):
  File "<pyshell#69>", line 1, in <module>
    a.read()
io.UnsupportedOperation: not readable

# The remaining combinations are not shown one by one

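  4. buffering sets the buffering policy used when reading from and writing to the file.

      0:    buffering switched off (binary mode only)
      1:    line buffering (text mode only)
      >1:  size in bytes of a fixed-size chunk buffer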

  5. The encoding parameter is the text encoding used when reading or writing the file.

  Suppose test.txt stores the following text encoded as UTF-8:

     

>>> a = open('test.txt','rt') # encoding not specified correctly, so decoding may fail
>>> a.read()
Traceback (most recent call last):
  File "<pyshell#87>", line 1, in <module>
    a.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xac in position 8: illegal multibyte sequence

>>> a = open('test.txt','rt',encoding = 'utf-8')
>>> a.read()
'我是第1行文本,我將被顯示在屏幕\n我是第2行文本,我將被顯示在屏幕\n我是第3行文本,我將被顯示在屏幕'
>>> 

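  6. The errors parameter specifies how encoding and decoding errors are handled when reading or writing the file.

  Common handlers include:

  • 'strict': raise an exception on any encoding or decoding error. This is the default; passing None for errors has the same effect.
  • 'ignore': silently skip data that cannot be encoded or decoded.
  • 'replace': substitute a replacement marker for malformed data (U+FFFD when decoding, '?' when encoding).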
>>> a = open('test.txt','rt',encoding = 'utf-8')
>>> a.read()
'我是第1行文本,我將被顯示在屏幕\n我是第2行文本,我將被顯示在屏幕\n我是第3行文本,我將被顯示在屏幕'
>>> a = open('test.txt','rt')
>>> a.read()
Traceback (most recent call last):
  File "<pyshell#91>", line 1, in <module>
    a.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xac in position 8: illegal multibyte sequence
>>> a = open('test.txt','rt',errors = 'ignore' )
>>> a.read()
'鎴戞槸絎1琛屾枃鏈錛屾垜灝嗚鏄劇ず鍦ㄥ睆騫\n鎴戞槸絎2琛屾枃鏈錛屾垜灝嗚鏄劇ず鍦ㄥ睆騫\n鎴戞槸絎3琛屾枃鏈錛屾垜灝嗚鏄劇ず鍦ㄥ睆騫'
>>> a = open('test.txt','rt',errors = 'replace' )
>>> a.read()
'鎴戞槸絎�1琛屾枃鏈�錛屾垜灝嗚��鏄劇ず鍦ㄥ睆騫�\n鎴戞槸絎�2琛屾枃鏈�錛屾垜灝嗚��鏄劇ず鍦ㄥ睆騫�\n鎴戞槸絎�3琛屾枃鏈�錛屾垜灝嗚��鏄劇ず鍦ㄥ睆騫�'

  7. newline controls how line endings are handled (text mode only). The accepted values are None, '', '\n', '\r' and '\r\n'.

>>> a = open('test.txt','rt',encoding = 'utf-8',newline = '\r')
>>> a.readline()
'我是第1行文本,我將被顯示在屏幕\r'
>>> a = open('test.txt','rt',encoding = 'utf-8',newline = '\n')
>>> a.readline()
'我是第1行文本,我將被顯示在屏幕\r\n'

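  8. closefd controls whether the underlying file descriptor is closed when the file object is closed (the default is True). It must be True when file is a path; it may be set to False only when file is a file descriptor.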

>>> a = open('test.txt','rt',encoding = 'utf-8',newline = '\n',closefd = False)
Traceback (most recent call last):
  File "<pyshell#115>", line 1, in <module>
    a = open('test.txt','rt',encoding = 'utf-8',newline = '\n',closefd = False)
ValueError: Cannot use closefd=False with file name
>>> a = open('test.txt','rt',encoding = 'utf-8',newline = '\n',closefd = True)

 

