Reading Text File Data with Python (ZenDei)

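Key points of this article:

(I) Functions for reading data in text file formats: read_csv, read_table

1. To read text files with different delimiters, use the sep parameter.

2. To read text files that have no column names (header), use the names parameter.

3. To specify an index for the data, use index_col.

4. To skip lines while reading, use skiprows.

5. When the data is too large to load at once, read it in chunks with chunksize.

(II) Function for writing data out in text file format: to_csv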

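Examples follow.

(I) Reading data sets in text file formats

1. The difference between read_csv and read_table:

# read_csv reads comma-delimited files by default, so sep is not needed here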

import pandas as pd
pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.csv')

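# If the file uses a delimiter other than a comma, read_csv needs sep to specify it.
# Otherwise each line comes back unsplit, exactly as it appears in the file.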
import pandas as pd
pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt')

# Compare this with the previous example
import pandas as pd
pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|')

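# read_table uses a tab as its default delimiter, so for any other delimiter sep must be given.
# Without it, each line of this comma-delimited file comes back unsplit.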
import pandas as pd
pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.csv')

# For this pipe-delimited file, read_table needs sep='|'
import pandas as pd
pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|')
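The contrast above can be reproduced without any files on disk. The following is a minimal sketch that assumes small in-memory samples (the strings `comma_text` and `pipe_text` are made up for illustration and are not the article's data.csv or data.txt):

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the article's files.
comma_text = "a,b,c\n1,2,3\n4,5,6\n"
pipe_text = "a|b|c\n1|2|3\n4|5|6\n"

# read_csv splits on commas by default.
df_comma = pd.read_csv(io.StringIO(comma_text))

# Without sep, the pipe-delimited lines are not split: one column per line.
df_unsplit = pd.read_csv(io.StringIO(pipe_text))

# With sep='|', the columns are split as intended.
df_pipe = pd.read_csv(io.StringIO(pipe_text), sep="|")

# read_table defaults to a tab delimiter, so it also needs sep='|' here.
df_table = pd.read_table(io.StringIO(pipe_text), sep="|")

print(df_comma.shape, df_unsplit.shape, df_pipe.shape, df_table.shape)
```

Comparing the shapes makes the effect visible: the unsplit frame has a single column, while the others have three.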

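2. If neither header nor names is given when reading a text file, the first line is used as the header by default.

# header=None tells pandas the data has no header line; column labels and the index are then filled with integers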
pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|',header=None)

# names lets you supply your own column names
pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|',
                    names=['x1','x2','x3','x4','x5'])
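To see the three header behaviours side by side, here is a small sketch on a made-up headerless sample (the string `headerless` and the three-column layout are assumptions for illustration only):

```python
import io
import pandas as pd

# Hypothetical pipe-delimited data with no header line.
headerless = "1|2|3\n4|5|6\n"

# Default: the first line is consumed as the header, leaving one data row.
df_default = pd.read_csv(io.StringIO(headerless), sep="|")

# header=None: columns are labelled 0, 1, 2 and both lines are kept as data.
df_none = pd.read_csv(io.StringIO(headerless), sep="|", header=None)

# names: custom column labels; both lines are kept as data.
df_named = pd.read_csv(io.StringIO(headerless), sep="|", names=["x1", "x2", "x3"])

print(df_default)
print(df_none)
print(df_named)
```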

3. By default the index is filled with integers; use index_col to make one of the columns the index.

names=['x1','x2','x3','x4','x0']
pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|',
                   names=names,index_col='x0')
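A self-contained sketch of the same idea, assuming a made-up sample whose last column holds row labels (the data and the labels `foo` and `bar` are hypothetical):

```python
import io
import pandas as pd

# Hypothetical headerless data; the last field is used as a row label.
text = "1|2|3|foo\n4|5|6|bar\n"
names = ["x1", "x2", "x3", "x0"]

# index_col='x0' moves column x0 out of the data and into the index.
df = pd.read_csv(io.StringIO(text), sep="|", names=names, index_col="x0")

print(df)
# Rows can now be looked up by label instead of by integer position.
print(df.loc["bar", "x2"])
```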

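4. The examples below use skiprows to skip the lines containing hello and read the remaining lines. Whether or not the first line is treated as a header, line numbers are counted from the first line of the file, which is line 0.

Compare the plain read with the three skiprows variants that follow it to see the difference.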

pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt')

names=['x1','x2','x3','x4','x0']
pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt',names=names,
            skiprows=[0,3,6])

pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt',
            skiprows=[0,3,6])

pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt',header=None,
            skiprows=[0,3,6])
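The line-counting rule can be checked on a small in-memory sample. This is only a sketch: the four-line string below is made up and is not the article's data1.txt.

```python
import io
import pandas as pd

# Hypothetical file content: line 0 is the header, line 2 contains "hello".
text = "a,b,msg\n1,2,one\n3,4,hello\n5,6,two\n"

# Skip line 2 (0-based, counted from the first line of the file).
kept = pd.read_csv(io.StringIO(text), skiprows=[2])

# Skip line 0: the header is dropped, so line 1 is promoted to header.
promoted = pd.read_csv(io.StringIO(text), skiprows=[0])

# Skip line 0 with header=None: the remaining three lines are all data.
no_header = pd.read_csv(io.StringIO(text), skiprows=[0], header=None)

print(kept)
print(promoted)
print(no_header)
```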

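5. Reading in chunks: data1.txt has 8 lines in total. With chunksize=3 it is read in three passes: 3 rows, then 3 rows, then 1 row.

Note that, unlike skiprows, chunking does not count the header line as a row: when the first line is used as the header, only the 7 data rows are split into chunks. With header=None all 8 lines are data, so the last chunk holds 2 rows. Compare the two examples below.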

chunker = pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt',chunksize=3)
for m in chunker:
    print(len(m))
    print(m)

chunker = pd.read_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data1.txt',header=None,
                      chunksize=3)
for m in chunker:
    print(len(m))
    print(m)
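The chunk sizes can be verified with an in-memory sample of the same length. This sketch assumes a made-up 8-line file (one header line plus 7 data rows), not the article's actual data1.txt:

```python
import io
import pandas as pd

# Hypothetical 8-line file: 1 header line followed by 7 data rows.
lines = ["a,b"] + [f"{i},{i * 10}" for i in range(7)]
text = "\n".join(lines) + "\n"

# First line used as header: only the 7 data rows are chunked.
sizes_with_header = [
    len(chunk) for chunk in pd.read_csv(io.StringIO(text), chunksize=3)
]

# header=None: all 8 lines count as data rows.
sizes_no_header = [
    len(chunk)
    for chunk in pd.read_csv(io.StringIO(text), header=None, chunksize=3)
]

print(sizes_with_header)
print(sizes_no_header)
```

Iterating over the reader keeps only one chunk in memory at a time, which is the point of chunksize for large files.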

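(II) Writing data out in text format with to_csv

Using data.txt as the example. Note that by default the index is written to the output file as well.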

data=pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|')
print(data)

# index=False prevents the index from being written
data=pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|')
data.to_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\outdata.txt',sep='!',index=False)

# columns selects which columns are written
data=pd.read_table('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\data.txt',sep='|')
data.to_csv('C:\\Users\\xiaoxiaodexiao\\pythonlianxi\\test0424\\outdata2.txt',sep=',',index=False,
            columns=['a','c','d'])
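The effect of these options is easy to inspect without writing files, because to_csv returns the text as a string when no path is given. A minimal sketch on a made-up DataFrame (the column names and values are assumptions):

```python
import pandas as pd

# Hypothetical data standing in for the contents of data.txt.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Default: the integer index is written as an unnamed first column.
with_index = df.to_csv()

# index=False drops the index; sep changes the delimiter.
no_index = df.to_csv(index=False, sep="!")

# columns restricts the output to the listed columns, in that order.
subset = df.to_csv(index=False, columns=["a", "c"])

print(with_index)
print(no_index)
print(subset)
```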
