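Series contents: "1. Searching for Audiobooks", "2. Analyzing the Detail-Page URLs", "3. Batch-Downloading the mp3 Files"
This is the final installment. If you have read the first two, rest assured there will be no fourth. As for the software's download address and how the finished program turned out, see for yourselves.
I. Finding the download address of the mp3 file
First we need the download address. It is not in the detail page of the pingshu (storytelling) title, so we open the playback page to see what can be found, as shown in the figure below.
Go straight to the element that plays the mp3. It is an iframe, which means this page is not the real playback page. A quick look turned up nothing of value, so I opened the real playback page directly to look for answers, but got the following prompt:
Switching browsers did not help, so I brought out the big guns: on the pingshu detail page press F12, select the Network tab, press F5 to refresh the page, and find the request record for the URL shown in the iframe. With IE, many characters in the URL came out garbled, so I recommend Chrome for this step, as shown below.
Type mp3 into the filter box. The keyword mp3 was a pure guess, and you should adjust it to the situation, but it hit on the first try: the two results are exactly what I wanted. The first is the address of the real playback page, and the second is the mp3 download address.
Select the playback-page record first and look at the content in the red box on the right. Add this item (the Referer header, as set in the code) when the program issues its request, and the page content can then be retrieved. The code is as follows: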
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko";
request.Accept = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
request.ContentType = "application/x-www-form-urlencoded";
request.KeepAlive = true;
// Replace this with the address of each episode's page.
request.Referer = "http://www.tingchina.com/pingshu/1228/play_1228_0.htm";
request.Method = "GET";
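With the mp3 download address in hand, look at the HTML of the mp3 playback page to see how that address is generated. Select the Elements tab in the developer tools, click the magnifier at the far left, and select the element that plays the mp3. Then press Ctrl+F to open the search box and enter part of the mp3 download address; I entered mp3?key=. Press Enter and it jumps to the content we want. In practice, if nothing is found, try a different search keyword; if that still fails, the only option left is to read the code line by line. See the figure below.
With this data, a quick analysis shows that url[2] plus url[3] equals the mp3 download address. The rest is simple: start downloading.
II. The complete code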
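The concatenation can be tried in isolation. The following is a minimal sketch, assuming the playback page contains JavaScript assignments of the form `url[n]= "...";`. The class `PlayPageParser` and its methods are hypothetical names introduced for illustration and are not part of the original program, which reassembles the address from captured pieces instead.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original program.
public static class PlayPageParser
{
    // Collects every assignment of the form  url[n]= "value"  into a dictionary keyed by n.
    // Assumption: the page writes the array entries as plain string literals.
    public static Dictionary<int, string> ParseUrlArray(string html)
    {
        var result = new Dictionary<int, string>();
        MatchCollection matches = Regex.Matches(
            html, @"url\[(?<Index>\d+)\]\s*=\s*""(?<Value>[^""]*)""");
        foreach (Match m in matches)
        {
            result[int.Parse(m.Groups["Index"].Value)] = m.Groups["Value"].Value;
        }
        return result;
    }

    // Returns url[2] + url[3], or null when either entry is missing.
    public static string BuildMp3Url(string html)
    {
        Dictionary<int, string> urls = ParseUrlArray(html);
        string host;
        string path;
        if (urls.TryGetValue(2, out host) && urls.TryGetValue(3, out path))
        {
            return host + path;
        }
        return null;
    }
}
```

Given a made-up fragment such as `url[2]= "http://t44.tingchina.com"; url[3]= "/pingshu/demo/001.mp3?key=abc_123";`, `BuildMp3Url` returns the two values joined into one address. The host number, path, and key in that fragment are invented for the example.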
C# Code
using System;
using System.ComponentModel;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;
using System.Windows.Forms;

// The members below belong to the download form class.

/// <summary>
/// Fetches the content of a web page.
/// </summary>
/// <param name="Url">Address of the page</param>
/// <param name="myEncoding">Encoding used to decode the response</param>
/// <param name="Referer">Value sent in the Referer request header</param>
/// <returns>The page HTML, or an empty string if the request fails</returns>
public string GetHtml(string Url, Encoding myEncoding, string Referer)
{
    string HtmlString = "";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(Url);
    request.Timeout = 15 * 1000;
    request.KeepAlive = true;
    request.AllowWriteStreamBuffering = true;
    request.Credentials = System.Net.CredentialCache.DefaultCredentials;
    request.MaximumResponseHeadersLength = -1;
    request.Referer = Referer;
    request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko";
    request.Accept = "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5";
    request.ContentType = "application/x-www-form-urlencoded";
    request.Method = "GET";
    try
    {
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream resStream = response.GetResponseStream())
        using (StreamReader sr = new StreamReader(resStream, myEncoding))
        {
            HtmlString = sr.ReadToEnd();
        }
    }
    catch (Exception)
    {
        // Swallowed on purpose: callers treat an empty string as a failed fetch.
    }
    return HtmlString;
}

/// <summary>
/// Background worker that downloads the episodes.
/// </summary>
/// <param name="sender"></param>
/// <param name="e"></param>
private void bw_Download_DoWork(object sender, DoWorkEventArgs e)
{
    // Local folder the episodes are saved to.
    string LocalPath = e.Argument.ToString();
    // Select every episode that has not been downloaded yet.
    var query = from m in this._list
                where m.Status != 1
                orderby m.ID ascending
                select m;
    // Parallel loop, limited here to one episode at a time.
    var loopResult = Parallel.ForEach(
        query,
        new ParallelOptions { MaxDegreeOfParallelism = 1 },
        (sound, loopState) =>
        {
            // Fetch the episode's detail page.
            string Html = GetHtml(sound.Url, Encoding.GetEncoding("GB2312"), "");
            if (Html != "")
            {
                // Extract the player parameters from the iframe address.
                Match ms_Info = Regex.Match(Html, @"src=""/play/" + this._category + @"/flash.asp\?id=[\d]*&inum=[\d]*&flei=(?<Category>[\s\S]*?)&bookname=(?<BookName>[\s\S]*?)&filename=(?<FileName>[\s\S]*?).mp3&nexturl", RegexOptions.IgnoreCase | RegexOptions.Multiline);
                if (ms_Info.Success)
                {
                    // Work out the final title and the performer.
                    string[] tmp = ms_Info.Groups["BookName"].Value.Split('_');
                    if (tmp.Length == 2)
                    {
                        sound.Title = tmp[0];
                        sound.Performer = tmp[1];
                    }
                    else
                    {
                        sound.Title = ms_Info.Groups["BookName"].Value;
                        sound.Performer = ms_Info.Groups["Category"].Value;
                    }
                    // Address of the page that plays the mp3.
                    string PlayUrl = "http://www.tingchina.com" + ms_Info.Value.Replace(@"src=""", "").Replace(@"&nexturl", "");
                    // The real playback page is embedded in the detail page as a frame,
                    // so it has to be fetched separately, with the detail page as Referer.
                    Html = GetHtml(PlayUrl, Encoding.Default, sound.Url);
                    if (Html != "")
                    {
                        // Extract the mp3 host.
                        MatchCollection ms = Regex.Matches(Html, @"url\[[\d]{1}\]= ""http://t(?<Number>[\d]*).tingchina.com""", RegexOptions.IgnoreCase | RegexOptions.Multiline);
                        // Extract the key required for the download.
                        Match ms_Down = Regex.Match(Html, @"key=(?<key>[\d\w_]*)"";", RegexOptions.IgnoreCase | RegexOptions.Multiline);
                        if (ms.Count > 0 && ms_Down.Success)
                        {
                            // Download address of the mp3.
                            string DownUrl = string.Format("http://t{0}.tingchina.com/{1}/{2}/{3}/{4}.mp3?key={5}",
                                ms[0].Groups["Number"].Value,
                                this._category,
                                ms_Info.Groups["Category"].Value,
                                ms_Info.Groups["BookName"].Value,
                                ms_Info.Groups["FileName"].Value,
                                ms_Down.Groups["key"].ToString());
                            using (WebClient client = new WebClient())
                            {
                                client.Headers.Add("Accept", "*/*");
                                client.Headers.Add("Accept-Encoding", "gzip, deflate");
                                client.Headers.Add("Cache-Control", "no-cache");
                                client.Headers.Add("Host", "t" + ms[0].Groups["Number"].Value + ".tingchina.com");
                                client.Headers.Add("Cookie", "Hm_lvt_99c9da471c839d239f4f41b80b233115=1445870536,1446212277,1446474813");
                                // The header name is "User-Agent"; the original "UserAgent" is not a recognized header.
                                client.Headers.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko");
                                client.Headers.Add("Referer", "http://www.tingchina.com/play/newflashv4.swf");
                                client.Headers.Add("x-flash-version", "20,0,0,267");
                                try
                                {
                                    // Download the mp3 as 001.mp3, 002.mp3, ...
                                    client.DownloadFile(DownUrl, LocalPath + (sound.ID + 1).ToString().PadLeft(3, '0') + @".mp3");
                                    sound.Status = 1;
                                }
                                catch (Exception ex)
                                {
                                    sound.Status = -1;
                                    sound.Error = "Failed to download the mp3: " + ex.Message;
                                }
                            }
                        }
                        else
                        {
                            sound.Status = -1;
                            sound.Error = "Failed to parse the playback page.";
                        }
                    }
                    else
                    {
                        sound.Status = -1;
                        sound.Error = "Failed to fetch the HTML of the playback page.";
                    }
                }
                else
                {
                    sound.Status = -1;
                    sound.Error = "Failed to parse the HTML of the detail page.";
                }
            }
            else
            {
                sound.Status = -1;
                sound.Error = "Failed to fetch the HTML of the detail page.";
            }
        }
    );
}

private void bw_Download_RunWorkerCompleted(object sender, RunWorkerCompletedEventArgs e)
{
    if (!this.IsDisposed)
    {
        // Anything not marked as downloaded can be retried.
        bool hasPending = this._list.Any(m => m.Status != 1);
        this.btn_Download.Text = hasPending ? "Resume download" : "Download complete";
    }
}

private void btn_Download_Click(object sender, EventArgs e)
{
    // Local folder the episodes are saved to.
    string LocalPath = Path.Combine(this._outpath, this._title) + @"\";
    if (!Directory.Exists(LocalPath))
    {
        try
        {
            Directory.CreateDirectory(LocalPath);
        }
        catch (Exception ex)
        {
            MessageBox.Show("Failed to create the download folder: " + ex.Message);
            return; // Without the folder there is nowhere to save the files.
        }
    }
    // Not wrapped in a using block: the worker has to outlive this method.
    BackgroundWorker bw_Download = new BackgroundWorker();
    bw_Download.WorkerReportsProgress = true; // Allow progress reporting.
    bw_Download.RunWorkerCompleted += new RunWorkerCompletedEventHandler(bw_Download_RunWorkerCompleted);
    bw_Download.DoWork += new DoWorkEventHandler(bw_Download_DoWork);
    bw_Download.RunWorkerAsync(LocalPath);
}
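The parsing steps in the listing are easier to debug outside the WinForms program. The following is a minimal sketch, assuming the detail page embeds the player with an iframe whose src looks like `/play/<category>/flash.asp?id=...&inum=...&flei=...&bookname=...&filename=....mp3&nexturl=...`. `DetailPageParser` and its methods are hypothetical names, and the literal dots in the pattern are escaped here, which the listing does not do.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for trying the detail-page pattern on its own.
public static class DetailPageParser
{
    // Matches the player iframe address and captures Category, BookName and FileName.
    public static Match MatchPlayerFrame(string html, string category)
    {
        string pattern = @"src=""/play/" + Regex.Escape(category)
            + @"/flash\.asp\?id=\d*&inum=\d*&flei=(?<Category>[\s\S]*?)"
            + @"&bookname=(?<BookName>[\s\S]*?)&filename=(?<FileName>[\s\S]*?)\.mp3&nexturl";
        return Regex.Match(html, pattern, RegexOptions.IgnoreCase);
    }

    // Splits "Title_Performer" into two parts. When the value does not contain
    // exactly one underscore, the whole value is the title and the fallback is the performer.
    public static string[] SplitBookName(string bookName, string fallbackPerformer)
    {
        string[] parts = bookName.Split('_');
        return parts.Length == 2 ? parts : new[] { bookName, fallbackPerformer };
    }

    // Builds the local file name the same way as the listing: ID 0 becomes "001.mp3".
    public static string LocalFileName(int id)
    {
        return (id + 1).ToString().PadLeft(3, '0') + ".mp3";
    }
}
```

Feeding `MatchPlayerFrame` a saved copy of a detail page shows quickly whether the pattern still fits the site's markup, without starting a download. If it stops matching, the listing reports a parse failure for the detail page.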
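Finally done. As a beginner I may have been long-winded, but my intent is for readers to be able to build this software step by step from the three tutorials.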