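Deep learning study notes, from artificial neural networks to convolutional neural networks, part 3: building a CNN with TensorFlow to classify the notMNIST data ...
3: Building a neural network with TensorFlow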
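Why TensorFlow? Because Google is its dad. Some people say Caffe is better suited to images, or that MXNet is more efficient, and so on, but dad is dad; Android got that popular for the same reason. In practice, once you know one of these frameworks you know them all, and mostly only the syntax differs. So let's start with TensorFlow.
For installing tf, see my other post. Most of what I learned about TensorFlow here comes from Google's deep learning course on Udacity.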
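1: The TensorFlow computation graph
Writing TensorFlow code splits into two parts. First you define a computation flow, also called a computation graph. Then you create a task (a session) that lets TensorFlow use system resources to actually run it. For example: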
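import tensorflow as tf  # import the TensorFlow library

matrix1 = tf.constant([[3., 3.]])      # constant node, shape (1, 2)
matrix2 = tf.constant([[2.], [2.]])    # constant node, shape (2, 1)
product = tf.matmul(matrix1, matrix2)  # matrix multiplication node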
The code above does not compute any concrete value; it only builds a computation graph.
The actual computation needs a session:
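sess = tf.Session()  # launch the default graph; a pile of startup messages is printed here
# run() executes the matmul node; product stands for that node's output
result = sess.run(product)
print(result)
sess.close()  # close the session once the work is done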
That is the basic way tf runs. Variables are defined with tf.Variable:
W1 = tf.Variable(tf.zeros((2, 2)), name="weights")
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())  # variables must be initialized before use
    print(sess.run(W1))
Another example:
state = tf.Variable(0, name="counter")
new_value = tf.add(state, tf.constant(1))  # state + 1
update = tf.assign(state, new_value)       # write the incremented value back to state

# Using `with` saves the explicit close() and also cleans up if an op raises
# (try/finally would work too).
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    print(sess.run(state))  # print the counter value
    for _ in range(3):
        sess.run(update)
        print(sess.run(state))
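Why did dad Google make this so fiddly? Wouldn't it be nicer to just compute directly? Actually it is not a pain at all: with this design, the same computation graph can be handed to different devices, or to a distributed set of devices, to run: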
with tf.Session() as sess:
    with tf.device("/gpu:1"):
        ...
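Another benefit: Python is slow at numerical work, so the design is to describe the graph in Python and then hand the computation to an engine outside Python (for example the underlying C++).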
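The define-then-run split can be illustrated without TensorFlow. What follows is only a toy sketch in plain Python, not how TensorFlow is implemented; the `Node` class and its `run` method are hypothetical names made up for this illustration. Building nodes only records what to compute, and nothing is evaluated until `run` is called.

```python
class Node:
    """A toy graph node: stores an operation and its inputs, computes nothing yet."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # "const", "add" or "mul"
        self.inputs = inputs  # upstream nodes
        self.value = value    # only used by "const" nodes

    def run(self):
        """Evaluate this node by recursively evaluating its inputs."""
        if self.op == "const":
            return self.value
        args = [node.run() for node in self.inputs]
        if self.op == "add":
            return args[0] + args[1]
        if self.op == "mul":
            return args[0] * args[1]
        raise ValueError("unknown op: %s" % self.op)


# Step 1: describe the computation (3 * 2) + (3 * 2); no arithmetic happens here.
a = Node("const", value=3.0)
b = Node("const", value=2.0)
product = Node("mul", (a, b))
total = Node("add", (product, product))

# Step 2: only now is anything computed, loosely like sess.run(...).
result = total.run()
print(result)
```

Because the description is separate from the evaluation, a different evaluator (another device, or a faster engine) could in principle walk the same structure, which is the point the text makes about TensorFlow.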
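2: Building a convolutional neural network with TensorFlow
Here I walk through the code Google published on Udacity for classifying the notMNIST data with a CNN. The code ships in the examples directory of the TensorFlow source: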
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/udacity
<1>: Preparing the data (notMNIST)
The first part of the code loads the data:
# These are all the modules we'll be using later. Make sure you can import them
# before proceeding further.
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import os
import sys
import tarfile
from IPython.display import display, Image
from scipy import ndimage
from sklearn.linear_model import LogisticRegression
from six.moves.urllib.request import urlretrieve
import tensorflow as tf
from six.moves import cPickle as pickle
from six.moves import range

# Config the matplotlib backend as plotting inline in IPython
%matplotlib inline

url = 'http://commondatastorage.googleapis.com/books1000/'
last_percent_reported = None

def download_progress_hook(count, blockSize, totalSize):
    """A hook to report the progress of a download. This is mostly intended for
    users with slow internet connections. Reports every 5% change in download
    progress.
    """
    global last_percent_reported
    percent = int(count * blockSize * 100 / totalSize)

    if last_percent_reported != percent:
        if percent % 5 == 0:
            sys.stdout.write("%s%%" % percent)
            sys.stdout.flush()
        else:
            sys.stdout.write(".")
            sys.stdout.flush()

        last_percent_reported = percent

def maybe_download(filename, expected_bytes, force=False):
    """Download a file if not present, and make sure it's the right size."""
    if force or not os.path.exists(filename):
        print('Attempting to download:', filename)
        filename, _ = urlretrieve(url + filename, filename,
                                  reporthook=download_progress_hook)
        print('\nDownload Complete!')
    statinfo = os.stat(filename)
    if statinfo.st_size == expected_bytes:
        print('Found and verified', filename)
    else:
        raise Exception(
            'Failed to verify ' + filename + '. Can you get to it with a browser?')
    return filename

train_filename = maybe_download('notMNIST_large.tar.gz', 247336696)
test_filename = maybe_download('notMNIST_small.tar.gz', 8458043)
The download code above fetches the dataset archives we need; the next step is to extract them.
num_classes = 10
np.random.seed(133)

def maybe_extract(filename, force=False):
    root = os.path.splitext(os.path.splitext(filename)[0])[0]  # remove .tar.gz
    if os.path.isdir(root) and not force:
        # You may override by setting force=True.
        print('%s already present - Skipping extraction of %s.' % (root, filename))
    else:
        print('Extracting data for %s. This may take a while. Please wait.' % root)
        tar = tarfile.open(filename)
        sys.stdout.flush()
        tar.extractall()
        tar.close()
    data_folders = [
        os.path.join(root, d) for d in sorted(os.listdir(root))
        if os.path.isdir(os.path.join(root, d))]
    if len(data_folders) != num_classes:
        raise Exception(
            'Expected %d folders, one per class. Found %d instead.' % (
                num_classes, len(data_folders)))
    print(data_folders)
    return data_folders

train_folders = maybe_extract(train_filename)
test_folders = maybe_extract(test_filename)
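The `root` line in `maybe_extract` is easy to misread, so here is a small check of what the double `os.path.splitext` does to the archive name. It uses only the standard library; the intermediate variable `without_gz` is a name introduced for this illustration.

```python
import os

filename = 'notMNIST_large.tar.gz'

# The first splitext strips '.gz', the second strips '.tar'.
without_gz = os.path.splitext(filename)[0]
root = os.path.splitext(without_gz)[0]

print(without_gz)
print(root)
```

The resulting `root` is the folder name that `maybe_extract` checks for before deciding whether to extract again.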
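After extraction, the directory holding the code contains two folders, notMNIST_large and notMNIST_small (the downloaded archives are still there too). large is used for training (and validation), small for testing. Each of them contains 10 subfolders holding the 28×28 images of the letters A to J; these images are the dataset and the labels are A to J. The next step converts the data into pickle format, which is easier to handle in Python. To make sure everything fits in memory, each class is converted into its own pickle file, and the data is zero-centred and normalized along the way. Some files may turn out to be unreadable; just skip them, it doesn't matter: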
image_size = 28  # Pixel width and height.
pixel_depth = 255.0  # Number of levels per pixel.

def load_letter(folder, min_num_images):
    """Load the data for a single letter label."""
    image_files = os.listdir(folder)
    dataset = np.ndarray(shape=(len(image_files), image_size, image_size),
                         dtype=np.float32)
    print(folder)
    num_images = 0
    for image in image_files:
        image_file = os.path.join(folder, image)  # build the full file path
        try:
            # Zero-centre and normalize: values end up roughly in [-0.5, 0.5].
            # Note: ndimage.imread has been removed from newer SciPy releases.
            image_data = (ndimage.imread(image_file).astype(float) -
                          pixel_depth / 2) / pixel_depth
            if image_data.shape != (image_size, image_size):
                raise Exception('Unexpected image shape: %s' % str(image_data.shape))
            dataset[num_images, :, :] = image_data
            num_images = num_images + 1
        except IOError as e:
            print('Could not read:', image_file, ':', e, '- it\'s ok, skipping.')

    dataset = dataset[0:num_images, :, :]
    if num_images < min_num_images:
        raise Exception('Many fewer images than expected: %d < %d' %
                        (num_images, min_num_images))

    print('Full dataset tensor:', dataset.shape)
    print('Mean:', np.mean(dataset))
    print('Standard deviation:', np.std(dataset))
    return dataset

def maybe_pickle(data_folders, min_num_images_per_class, force=False):
    dataset_names = []
    for folder in data_folders:  # here: notMNIST_large/A, notMNIST_large/B, and so on
        set_filename = folder + '.pickle'  # one pickle file per letter folder, A to J
        dataset_names.append(set_filename)  # collect the file names
        if os.path.exists(set_filename) and not force:
            # You may override by setting force=True.
            print('%s already present - Skipping pickling.' % set_filename)
        else:
            print('Pickling %s.' % set_filename)
            dataset = load_letter(folder, min_num_images_per_class)
            try:
                with open(set_filename, 'wb') as f:
                    pickle.dump(dataset, f, pickle.HIGHEST_PROTOCOL)
            except Exception as e:
                print('Unable to save data to', set_filename, ':', e)

    return dataset_names

train_datasets = maybe_pickle(train_folders, 45000)
test_datasets = maybe_pickle(test_folders, 1800)
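The normalization inside `load_letter` can be checked by hand. The sketch below applies the same formula to three raw pixel values; the helper name `normalize` is introduced here for illustration only. Note that the formula subtracts the midpoint 127.5 rather than the measured mean of the dataset, so the result is only approximately zero-mean.

```python
pixel_depth = 255.0  # same constant as in load_letter


def normalize(pixel):
    """Shift a raw 0..255 pixel value to be centred on 0 and scale it by 255."""
    return (pixel - pixel_depth / 2) / pixel_depth


for raw in (0, 127.5, 255):
    print(raw, '->', normalize(raw))
```

Black and white pixels map to the two ends of the range, which is why the printed mean in `load_letter` should land near zero and the standard deviation below 0.5.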
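maybe_pickle above stores each class's data in a pickle file so that later programs can reuse it, which is one reason we do not read the raw images every time. The next step merges and splits the data from these pickle files into a training set, a validation set and a test set. How much training data you can use depends on memory; if you really need more data than fits in memory, you have to process it in separate chunks.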
def make_arrays(nb_rows, img_size):
    # Used by merge_datasets: allocates a (number of images, img_size, img_size)
    # array (img_size is 28 here) plus a label vector of length nb_rows.
    if nb_rows:
        dataset = np.ndarray((nb_rows, img_size, img_size), dtype=np.float32)
        labels = np.ndarray(nb_rows, dtype=np.int32)
    else:
        dataset, labels = None, None
    return dataset, labels

def merge_datasets(pickle_files, train_size, valid_size=0):
    num_classes = len(pickle_files)
    valid_dataset, valid_labels = make_arrays(valid_size, image_size)
    train_dataset, train_labels = make_arrays(train_size, image_size)
    vsize_per_class = valid_size // num_classes
    tsize_per_class = train_size // num_classes

    start_v, start_t = 0, 0
    end_v, end_t = vsize_per_class, tsize_per_class
    end_l = vsize_per_class + tsize_per_class
    # Merge the data spread over the 10 pickle files into one tensor.
    for label, pickle_file in enumerate(pickle_files):
        try:
            with open(pickle_file, 'rb') as f:
                letter_set = pickle.load(f)
                # Shuffle the images read from this pickle file.
                np.random.shuffle(letter_set)
                if valid_dataset is not None:
                    valid_letter = letter_set[:vsize_per_class, :, :]
                    valid_dataset[start_v:end_v, :, :] = valid_letter
                    valid_labels[start_v:end_v] = label
                    start_v += vsize_per_class
                    end_v += vsize_per_class

                train_letter = letter_set[vsize_per_class:end_l, :, :]
                train_dataset[start_t:end_t, :, :] = train_letter
                train_labels[start_t:end_t] = label
                start_t += tsize_per_class
                end_t += tsize_per_class
        except Exception as e:
            print('Unable to process data from', pickle_file, ':', e)
            raise

    return valid_dataset, valid_labels, train_dataset, train_labels

train_size = 200000
valid_size = 10000
test_size = 10000

valid_dataset, valid_labels, train_dataset, train_labels = merge_datasets(
    train_datasets, train_size, valid_size)
_, _, test_dataset, test_labels = merge_datasets(test_datasets, test_size)

print('Training:', train_dataset.shape, train_labels.shape)
print('Validation:', valid_dataset.shape, valid_labels.shape)
print('Testing:', test_dataset.shape, test_labels.shape)
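The slicing in `merge_datasets` is mostly index bookkeeping. The sketch below reproduces just that arithmetic with the sizes used in this post, without touching any image data; the loop over the first three labels is an illustration, not part of the original code.

```python
train_size, valid_size, num_classes = 200000, 10000, 10

vsize_per_class = valid_size // num_classes   # validation images taken per letter
tsize_per_class = train_size // num_classes   # training images taken per letter
end_l = vsize_per_class + tsize_per_class     # last index used within one letter

for label in range(3):  # first three letters only, to keep the output short
    start_v = label * vsize_per_class
    start_t = label * tsize_per_class
    print('label %d: letter_set[0:%d] -> valid[%d:%d], letter_set[%d:%d] -> train[%d:%d]' % (
        label, vsize_per_class, start_v, start_v + vsize_per_class,
        vsize_per_class, end_l, start_t, start_t + tsize_per_class))
```

Each letter therefore has to supply 21000 images, which is comfortably below the minimum of 45000 per class that `maybe_pickle` checked for the training folders.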
Finally, the data is shuffled once more and saved, which gives the final pickle file.
def randomize(dataset, labels):
    permutation = np.random.permutation(labels.shape[0])
    shuffled_dataset = dataset[permutation, :, :]
    shuffled_labels = labels[permutation]
    return shuffled_dataset, shuffled_labels

train_dataset, train_labels = randomize(train_dataset, train_labels)
test_dataset, test_labels = randomize(test_dataset, test_labels)
valid_dataset, valid_labels = randomize(valid_dataset, valid_labels)

pickle_file = 'notMNIST.pickle'

try:
    f = open(pickle_file, 'wb')
    save = {  # store everything in one dictionary
        'train_dataset': train_dataset,  # shape (num, 28, 28)
        'train_labels': train_labels,    # shape (num,), integer class ids 0..9
        'valid_dataset': valid_dataset,  # likewise for the other sets
        'valid_labels': valid_labels,
        'test_dataset': test_dataset,
        'test_labels': test_labels,
    }
    pickle.dump(save, f, pickle.HIGHEST_PROTOCOL)
    f.close()
except Exception as e:
    print('Unable to save data to', pickle_file, ':', e)
    raise

statinfo = os.stat(pickle_file)
print('Compressed pickle size:', statinfo.st_size)
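The key property of `randomize` is that images and labels are reordered with the same permutation, so each image keeps its own label. Here is a small sketch with made-up data: four fake 2×2 "images", where image i is filled with the value i and labelled i. The fixed seed and the variable names `rng` and `still_paired` are assumptions chosen for this illustration.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed, chosen only for this illustration

# Four fake 2x2 "images"; image i is filled with the value i, and its label is i.
dataset = np.stack([np.full((2, 2), i, dtype=np.float32) for i in range(4)])
labels = np.arange(4, dtype=np.int32)

permutation = rng.permutation(labels.shape[0])
shuffled_dataset = dataset[permutation, :, :]
shuffled_labels = labels[permutation]

# Every shuffled image still carries its own label.
still_paired = all(
    shuffled_dataset[i, 0, 0] == shuffled_labels[i] for i in range(4))
print(permutation, still_paired)
```

Shuffling the two arrays independently (for example with two separate `np.random.shuffle` calls) would break this pairing.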
Those were the preprocessing steps. Now we read the pickle file back to get the data the convolutional neural network will use:
pickle_file = 'notMNIST.pickle'

with open(pickle_file, 'rb') as f:
    save = pickle.load(f)
    train_dataset = save['train_dataset']
    train_labels = save['train_labels']
    valid_dataset = save['valid_dataset']
    valid_labels = save['valid_labels']
    test_dataset = save['test_dataset']
    test_labels = save['test_labels']
    del save  # hint to help gc free up memory
    print('Training set', train_dataset.shape, train_labels.shape)
    print('Validation set', valid_dataset.shape, valid_labels.shape)
    print('Test set', test_dataset.shape, test_labels.shape)
Output:
Training set (200000, 28, 28) (200000,)
Validation set (10000, 28, 28) (10000,)
Test set (10000, 28, 28) (10000,)
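This shows the raw format of the training, validation and test sets. To feed the data into a plain (fully connected) neural network, each image would have to be flattened into a vector of length width×height. For a convolutional neural network we instead reshape each image to width×height×depth, and at the same time convert the labels to one-hot encodings. So: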
image_size = 28
num_labels = 10
num_channels = 1  # grayscale; this would be 3 for RGB data

import numpy as np

def reformat(dataset, labels):
    # -1 means "I can't be bothered to work out this number": NumPy infers it
    # from the array's total size and the other dimensions (paraphrasing a
    # line from Zhihu that I found very pithy).
    dataset = dataset.reshape(
        (-1, image_size, image_size, num_channels)).astype(np.float32)
    # The next line looks cryptic, so here is how it works:
    # labels[:, None] turns labels of shape (200000,) into shape (200000, 1),
    # i.e. [1, 2, 3, 4, ...] becomes [[1], [2], [3], ...];
    # np.arange(num_labels) is the array [0, 1, 2, ..., 9] of shape (10,).
    # Comparing shape (10,) with shape (200000, 1) broadcasts like an outer
    # product whose "multiplication" is an equality test (equal gives True,
    # different gives False), so the result is a 200000 x 10 array.
    labels = (np.arange(num_labels) == labels[:, None]).astype(np.float32)
    return dataset, labels

train_dataset, train_labels = reformat(train_dataset, train_labels)
valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)
test_dataset, test_labels = reformat(test_dataset, test_labels)
print('Training set', train_dataset.shape, train_labels.shape)
print('Validation set', valid_dataset.shape, valid_labels.shape)
print('Test set', test_dataset.shape, test_labels.shape)
Output:
Training set (200000, 28, 28, 1) (200000, 10)
Validation set (10000, 28, 28, 1) (10000, 10)
Test set (10000, 28, 28, 1) (10000, 10)
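The one-hot line in `reformat` is the least obvious part, so here is the same broadcasting trick on three made-up labels. The intermediate names `column` and `classes` are introduced only to show the shapes involved.

```python
import numpy as np

num_labels = 10
labels = np.array([1, 0, 3])            # shape (3,)

column = labels[:, None]                # shape (3, 1)
classes = np.arange(num_labels)         # shape (10,)

# Broadcasting compares every label with every class id, giving shape (3, 10).
one_hot = (classes == column).astype(np.float32)

print(one_hot.shape)
print(one_hot)
```

Each row contains a single 1.0 at the column whose index equals the label, which is exactly what `np.argmax(labels, 1)` later undoes in the accuracy function.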
Next, we define a function that measures prediction accuracy:
def accuracy(predictions, labels):
    # Note that argmax returns the index of the largest entry, not its value.
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])
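To see what `accuracy` returns, here is a worked example on made-up numbers: four samples, three classes, with softmax-like rows invented for this illustration. The function body is copied from the post.

```python
import numpy as np


def accuracy(predictions, labels):
    # Same computation as in the post: compare the most likely class per row.
    return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))
            / predictions.shape[0])


# Made-up softmax outputs for 4 samples and 3 classes.
predictions = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.8, 0.1],
                        [0.3, 0.3, 0.4],
                        [0.6, 0.3, 0.1]])
# One-hot ground truth: classes 0, 1, 2, 1.
labels = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

print(accuracy(predictions, labels))
```

The predicted classes are 0, 1, 2, 0 against true classes 0, 1, 2, 1, so three of four match and the function returns a percentage of 75.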
<2>:Draw a graph
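As mentioned earlier, computing in TensorFlow starts with building a graph. Here we build a convolutional neural network with two convolutional layers and one fully connected hidden layer. Running this needs a pretty expensive graphics card, so the depth and the number of nodes in the fully connected layer are kept small.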
batch_size = 16  # number of images per SGD step
patch_size = 5   # convolution window size
depth = 16       # convolution depth, i.e. the number of feature maps
num_hidden = 64  # size of the fully connected hidden layer

graph = tf.Graph()

with graph.as_default():

    # Input data.
    # batch_size images take part in each step.
    tf_train_dataset = tf.placeholder(
        tf.float32, shape=(batch_size, image_size, image_size, num_channels))
    tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
    tf_valid_dataset = tf.constant(valid_dataset)
    tf_test_dataset = tf.constant(test_dataset)

    # Variables.
    # First conv layer: randomly initialized 5x5 kernels,
    # num_channels inputs -> depth feature maps.
    layer1_weights = tf.Variable(tf.truncated_normal(
        [patch_size, patch_size, num_channels, depth], stddev=0.1))
    layer1_biases = tf.Variable(tf.zeros([depth]))  # first conv layer biases start at 0
    # Second conv layer: randomly initialized 5x5 kernels,
    # depth inputs -> depth feature maps.
    layer2_weights = tf.Variable(tf.truncated_normal(
        [patch_size, patch_size, depth, depth], stddev=0.1))
    layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))  # biases start at 1.0
    # Fully connected layer 1. The // 4 is there because the model below uses
    # stride 2 twice: 28 -> 14 -> 7, so the flattened input is 7 * 7 * 16 = 784.
    layer3_weights = tf.Variable(tf.truncated_normal(
        [image_size // 4 * image_size // 4 * depth, num_hidden], stddev=0.1))
    layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))
    # Fully connected layer 2 (output layer).
    layer4_weights = tf.Variable(tf.truncated_normal(
        [num_hidden, num_labels], stddev=0.1))
    layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))

    # Model.
    def model(data):
        # [1, 2, 2, 1] is the stride, one entry per dimension of data
        # (batch, height, width, channels). SAME padding pads with zeros;
        # it keeps the shape arithmetic simple, so it is the usual choice.
        conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')
        hidden = tf.nn.relu(conv + layer1_biases)
        conv = tf.nn.conv2d(hidden, layer2_weights, [1, 2, 2, 1], padding='SAME')
        hidden = tf.nn.relu(conv + layer2_biases)
        shape = hidden.get_shape().as_list()
        reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])
        hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
        return tf.matmul(hidden, layer4_weights) + layer4_biases

    # Training computation.
    logits = model(tf_train_dataset)
    # Keyword arguments avoid mixing up the argument order across TF versions.
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=tf_train_labels))

    # Optimizer.
    optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)  # gradient descent

    # Predictions for the training, validation, and test data.
    train_prediction = tf.nn.softmax(logits)
    valid_prediction = tf.nn.softmax(model(tf_valid_dataset))
    test_prediction = tf.nn.softmax(model(tf_test_dataset))
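The input size of `layer3_weights` follows from the shape arithmetic of the two strided convolutions. With 'SAME' padding the spatial output size is the input size divided by the stride, rounded up. The sketch below checks this in plain Python; the helper `same_conv_output_size` is a name made up for this illustration, not a TensorFlow function.

```python
import math


def same_conv_output_size(size, stride):
    """Spatial output size of a convolution with 'SAME' padding."""
    return int(math.ceil(size / float(stride)))


image_size, depth, num_hidden = 28, 16, 64

after_conv1 = same_conv_output_size(image_size, 2)   # 28 -> 14
after_conv2 = same_conv_output_size(after_conv1, 2)  # 14 -> 7
flattened = after_conv2 * after_conv2 * depth        # inputs to layer3

# The expression used for layer3_weights gives the same number.
from_post = image_size // 4 * image_size // 4 * depth

print(after_conv1, after_conv2, flattened, from_post)
```

So each image reaches the fully connected layer as a 7×7 map with 16 channels, flattened to 784 values. The unparenthesized expression in the post happens to agree because 28 is divisible by 4; writing `(image_size // 4) * (image_size // 4) * depth` would make the intent clearer.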
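Reading the graph code, it looks as though things are being computed, especially in the last few lines, but nothing has actually been computed yet. The computation only happens in the next step, which runs the graph in a session.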
num_steps = 1001
# batch_size = 16

with tf.Session(graph=graph) as session:
    # On TensorFlow 0.12 and later, use tf.global_variables_initializer().run() instead.
    session.run(tf.initialize_all_variables())
    print('Initialized')
    for step in range(num_steps):
        # The modulo wraps the offset around, so that a large number of steps
        # never indexes past the end of the training set.
        offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        batch_data = train_dataset[offset:(offset + batch_size), :, :, :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
        _, l, predictions = session.run(
            [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 50 == 0):
            print('Minibatch loss at step %d: %f' % (step, l))
            print('Minibatch accuracy: %.1f%%' % accuracy(predictions, batch_labels))
            print('Validation accuracy: %.1f%%' % accuracy(
                valid_prediction.eval(), valid_labels))
    print('Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels))
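The `offset` expression in the training loop is easier to follow with small numbers. The sketch below uses a made-up dataset of 100 examples instead of 200000 and records the offsets for the first eight steps.

```python
num_examples, batch_size = 100, 16  # tiny made-up sizes instead of 200000

offsets = []
for step in range(8):
    offset = (step * batch_size) % (num_examples - batch_size)
    offsets.append(offset)
    # Every slice [offset, offset + batch_size) stays inside the dataset.
    assert offset + batch_size <= num_examples

print(offsets)
```

The offsets climb in steps of 16 and wrap around once they would run past the end. In the actual run above, 1001 steps of 16 images only touch about 16000 of the 200000 training examples, so the wrap-around never triggers there.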
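That is the whole program for classifying the notMNIST data with a simple convolutional neural network. I have not added pooling layers yet, nor dropout to prevent overfitting; to be continued. Next I'll take a look at R-FCN.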