4. Reading and writing CSV files with Python

This post covers two steps: 1. Scrape the Douban Top 250 books and write them to a CSV file. 2. Save the books rated 9.0 or higher to books_out.csv.

1. Scraping the Douban Top 250 books

import csv

import requests
from bs4 import BeautifulSoup

books = []

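def book_name(url):
    """Scrape one list page, append its books to `books`, and return the
    URL of the next page (or 0 when there is no next page)."""
    # Assumption: Douban may reject the default requests User-Agent,
    # so a browser-like one is sent here.
    res = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
    html = res.text
    soup = BeautifulSoup(html, 'html.parser')
    # Each book on the page is rendered as its own <table>.
    items = soup.find(class_="grid-16-8 clearfix").find(class_="indent").find_all('table')

    for i in items:
        book = []
        title = i.find(class_="pl2").find('a')
        book.append('《' + title.text.replace(' ', '').replace('\n', '') + '》')

        # Store the rating as a bare number (e.g. '9.0') so that it can be
        # converted with float() when the CSV file is read back.
        star = i.find(class_="star clearfix").find(class_="rating_nums")
        book.append(star.text.strip())

        # Some books have no one-line description; find() then returns None
        # and the attribute access raises AttributeError.
        try:
            brief = i.find(class_="quote").find(class_="inq").text
        except AttributeError:
            brief = 'No description available'
        book.append(brief)

        book.append(title['href'])

        # `books` is only mutated, never rebound, so no `global` is needed.
        books.append(book)

        print(book)

    try:
        next_url = soup.find(class_="paginator").find(class_="next").find('a')['href']
    except TypeError:
        # On the last page the "next" element has no link, so find('a')
        # returns None and indexing it raises TypeError.
        return 0
    else:
        return next_url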


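# `url` is used instead of `next` to avoid shadowing the built-in next().
url = 'https://book.douban.com/top250?start=0&filter='
count = 0

while url != 0:
    count += 1
    url = book_name(url)
    print('----------- End of page ' + str(count) + ' -----------')

# Saved as top250.csv, the file name that step 2 reads.
# newline='' keeps csv.writer from producing blank rows on Windows.
with open('top250.csv', 'w', newline='', encoding='utf-8') as csv_file:
    w = csv.writer(csv_file)
    w.writerow(['Title', 'Rating', 'Description', 'Link'])
    for b in books:
        w.writerow(b)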
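
The writing step relies on `csv.writer` to quote fields that contain commas or quotation marks, which book descriptions often do. As a minimal sketch of that behavior, the snippet below round-trips a few rows through an in-memory buffer instead of a file. The sample rows and the helper names `rows_to_csv_text` and `csv_text_to_rows` are made up for illustration and are not part of the original script.

```python
import csv
import io


def rows_to_csv_text(rows):
    """Serialize rows to CSV text with csv.writer, as the script does for a file."""
    buffer = io.StringIO(newline='')
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


def csv_text_to_rows(text):
    """Parse CSV text back into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text, newline='')))


# Hypothetical sample data: the description contains a comma and quotes.
sample_rows = [
    ['Title', 'Rating', 'Description', 'Link'],
    ['Book A', '9.1', 'Short, "quoted" description', 'https://example.com/a'],
]

csv_text = rows_to_csv_text(sample_rows)
restored_rows = csv_text_to_rows(csv_text)
print(csv_text)
print(restored_rows == sample_rows)
```

Because the writer escapes the awkward field and the reader undoes the escaping, the restored rows compare equal to the originals. Building CSV lines by hand with `','.join(...)` would not give that guarantee.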


2. Saving the books rated 9.0 or higher to books_out.csv

'''
1. Scrape the Douban Top 250 books and save them as top250.csv
2. Read top250.csv and save the books rated 9.0 or higher to another CSV file
'''

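import csv

# Open with encoding='utf-8'; otherwise reading can fail on systems whose
# default encoding is not UTF-8.
with open('top250.csv', newline='', encoding='utf-8') as rf:
    reader = csv.reader(rf)
    # Read the header row
    headers = next(reader)
    # newline='' keeps csv.writer from producing blank rows on Windows
    with open('books_out.csv', 'w', newline='', encoding='utf-8') as wf:
        writer = csv.writer(wf)
        # Write the header row to the output file
        writer.writerow(headers)

        for book in reader:
            # Get the rating, dropping a trailing '分' if the file stores
            # it as e.g. '9.0分' (float() cannot parse that suffix)
            score = book[1].rstrip('分')
            # Keep only the books rated 9.0 or higher
            if score and float(score) >= 9.0:
                writer.writerow(book)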
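
The filtering rule can also be isolated into a small function, which makes it easy to check without touching any files. This is only a sketch: `parse_rating` and `filter_by_rating` are hypothetical helper names, the sample rows are invented, and tolerating a trailing '分' is an assumption meant to cover CSV files where the rating was saved with that suffix.

```python
def parse_rating(text):
    """Return the rating as a float, or None if the cell is empty or not numeric."""
    cleaned = text.strip().rstrip('分').strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def filter_by_rating(rows, threshold=9.0, rating_index=1):
    """Keep the rows whose rating column is at least `threshold`."""
    kept = []
    for row in rows:
        rating = parse_rating(row[rating_index])
        if rating is not None and rating >= threshold:
            kept.append(row)
    return kept


# Hypothetical sample rows in the same column order as the CSV file.
sample_rows = [
    ['Book A', '9.1', 'desc', 'https://example.com/a'],
    ['Book B', '8.9分', 'desc', 'https://example.com/b'],
    ['Book C', '9.0分', 'desc', 'https://example.com/c'],
    ['Book D', '', 'desc', 'https://example.com/d'],
]

for row in filter_by_rating(sample_rows):
    print(row[0], row[1])
```

Returning `None` for unparseable cells keeps one malformed row from aborting the whole run; whether such rows should instead be logged or rejected depends on how much the input file is trusted.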
