Training on Your Own Dataset with TensorFlow.NET in C#
In this post I walk through the code needed to train a CNN model with TensorFlow.NET from the SciSharp STACK. The model performs image classification. The code can be ported directly to run on either CPU or GPU, and you can use it to train and run inference on your own local image dataset. TensorFlow.NET is a complete TensorFlow implementation built on .NET Standard, so it supports both .NET Framework and .NET Core, which makes it a strong machine-learning framework choice for .NET developers.
SciSharp STACK:https://github.com/SciSharp
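What is TensorFlow.NET?
TensorFlow.NET is a contribution of the SciSharp STACK open-source community. Its mission is to build a machine-learning platform that belongs entirely to .NET developers; for C# developers in particular, it aims to be a platform with "zero" learning cost. It bundles a large number of APIs and low-level wrappers so that the coding style and habits of TensorFlow's Python API carry over to .NET as seamlessly as possible. The figure below compares the syntax of the same TensorFlow task implemented in Python and in C#, which gives a first impression of how similar they are.
Because TensorFlow.NET performs well on .NET and works together with SciSharp modules such as NumSharp, SharpCV, Pandas.NET, Keras.NET, and Matplotlib.Net, it can be used without any Python environment. It has been integrated into the underlying algorithms of Microsoft's ML.NET, and Google has included it in the tutorials on the official TensorFlow website as a recommendation to developers worldwide.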
- SciSharp product structure
- Integrated into the underlying algorithms of Microsoft ML.NET
- Officially recommended by Google to .NET developers
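Project description
This article uses TensorFlow.NET to build a simple image classification model for single-character OCR of printed characters on an industrial production line. Large raw images are captured from an industrial camera. OpenCV is used up front for image preprocessing and character segmentation, which yields one small image per character. Each small image is fed to TensorFlow for inference, the results are concatenated in order into the complete string, and that string is returned to the main program logic for the subsequent production-line steps.
In practice, to train on your own images you only need to replace the images in the training folders with your own, keeping the prescribed folder layout. Both GPU and CPU are supported. The complete code for this project is on GitHub: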
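Model overview
The CNN in this project consists mainly of two convolution + pooling stages and one fully connected layer (followed by the output layer), with the common ReLU activation function. It is a fairly shallow convolutional neural network. One of the hyperparameters, the learning rate, follows a custom schedule that decays during training; this is described in detail later. For the shape of each layer, see the figure below: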
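The layer shapes can also be derived by hand. The sketch below assumes "SAME" padding for both convolution and pooling (as in the graph code later in this article), where each spatial dimension becomes ceil(input / stride). The filter counts (16 and 32) and the convolution stride of 1 used in the usage note are illustrative assumptions, not values taken from the project, and the class and method names are hypothetical.

```csharp
using System;

public static class CnnShapes
{
    // With "SAME" padding, TensorFlow yields ceil(size / stride) outputs per spatial axis.
    public static int SameOut(int size, int stride) => (size + stride - 1) / stride;

    // Shape (height, width, channels) after one convolution followed by one max-pool.
    public static (int h, int w, int c) ConvPool(int h, int w, int convStride, int numFilters, int poolStride = 2)
    {
        int convH = SameOut(h, convStride);
        int convW = SameOut(w, convStride);
        return (SameOut(convH, poolStride), SameOut(convW, poolStride), numFilters);
    }

    // Number of features that reach the first fully connected layer after two conv + pool stages.
    public static int FlattenedFeatures(int imgH, int imgW, int stride1, int filters1, int stride2, int filters2)
    {
        var s1 = ConvPool(imgH, imgW, stride1, filters1);
        var s2 = ConvPool(s1.h, s1.w, stride2, filters2);
        return s2.h * s2.w * s2.c;
    }
}
```

Under those assumptions, a 64 × 64 input shrinks to 32 × 32 after the first stage and 16 × 16 after the second, so `CnnShapes.FlattenedFeatures(64, 64, 1, 16, 1, 32)` gives 16 × 16 × 32 = 8192 inputs to the fully connected layer.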
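Dataset description
To keep training fast for this demonstration, the image dataset is a small excerpt of OCR characters (X, Y, Z). Its characteristics are:
- Number of classes: 3 classes [X/Y/Z]
- Image size: Width 64 × Height 64
- Image channels: 1 channel (grayscale)
- Dataset size:
  - train: X - 384 pcs; Y - 384 pcs; Z - 384 pcs
  - validation: X - 96 pcs; Y - 96 pcs; Z - 96 pcs
  - test: X - 96 pcs; Y - 96 pcs; Z - 96 pcs
- Other notes: the dataset has already been augmented with random flipping, translation, scaling, and mirroring
The overall dataset is shown in the figure below: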
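Code walkthrough
Environment setup
- .NET framework: .NET Framework 4.7.2 or later, or .NET Core 2.2 or later
- CPU configuration: either Any CPU or x64
- GPU configuration: you need to set up CUDA and the environment variables yourself; CUDA v10.1 and cuDNN v7.5 are recommended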
Library and namespace references
- Install the required dependencies from NuGet, mainly the SciSharp libraries, as shown in the figure below:
Note: install the latest versions of the libraries where possible. For computer vision you must use SciSharp's SharpCV, so that variables can be passed between the libraries easily.

    <PackageReference Include="Colorful.Console" Version="1.2.9" />
    <PackageReference Include="Newtonsoft.Json" Version="12.0.3" />
    <PackageReference Include="SciSharp.TensorFlow.Redist" Version="1.15.0" />
    <PackageReference Include="SciSharp.TensorFlowHub" Version="0.0.5" />
    <PackageReference Include="SharpCV" Version="0.2.0" />
    <PackageReference Include="SharpZipLib" Version="1.2.0" />
    <PackageReference Include="System.Drawing.Common" Version="4.7.0" />
    <PackageReference Include="TensorFlow.NET" Version="0.14.0" />

- Reference the namespaces, including NumSharp, Tensorflow, and SharpCV:

    using NumSharp;
    using NumSharp.Backends;
    using NumSharp.Backends.Unmanaged;
    using SharpCV;
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.IO;
    using System.Linq;
    using System.Runtime.CompilerServices;
    using Tensorflow;
    using static Tensorflow.Binding;
    using static SharpCV.Binding;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;
Main logic structure
Main logic:
- Prepare the data
- Build the computation graph
- Train
- Predict

    public bool Run()
    {
        PrepareData();
        BuildGraph();

        using (var sess = tf.Session())
        {
            Train(sess);
            Test(sess);
        }

        TestDataOutput();

        return accuracy_test > 0.98;
    }
Loading the dataset
Downloading and extracting the dataset
- Dataset URL: https://github.com/SciSharp/SciSharp-Stack-Examples/blob/master/data/data_CnnInYourOwnData.zip
- Download and extraction code (for some of the wrapped helper methods, see the complete code on GitHub):

    string url = "https://github.com/SciSharp/SciSharp-Stack-Examples/blob/master/data/data_CnnInYourOwnData.zip";
    Directory.CreateDirectory(Name);
    Utility.Web.Download(url, Name, "data_CnnInYourOwnData.zip");
    Utility.Compress.UnZip(Name + "\\data_CnnInYourOwnData.zip", Name);
Creating the label dictionary
Read the names of the sub-folders in the directory and use them as the classification dictionary, which is convenient for one-hot encoding later.

    private void FillDictionaryLabel(string DirPath)
    {
        string[] str_dir = Directory.GetDirectories(DirPath, "*", SearchOption.TopDirectoryOnly);
        int str_dir_num = str_dir.Length;
        if (str_dir_num > 0)
        {
            Dict_Label = new Dictionary<Int64, string>();
            for (int i = 0; i < str_dir_num; i++)
            {
                // Strip the parent path so that only the sub-folder name remains
                string label = (str_dir[i].Replace(DirPath + "\\", "")).Split('\\').First();
                Dict_Label.Add(i, label);
                print(i.ToString() + " : " + label);
            }
            n_classes = Dict_Label.Count;
        }
    }
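The method above extracts folder names by splitting on `\\`, which ties it to Windows paths. A minimal sketch of the same idea using only `System.IO.Path`, so that it also works with `/` separators, might look like the following. The class and method names are hypothetical, and sorting the folder names is an added assumption made so that the index assignment is reproducible.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LabelDictionary
{
    // Maps class index -> class name, one entry per immediate sub-folder of dirPath.
    // Names are sorted (ordinal) so the same folder layout always yields the same indices.
    public static Dictionary<long, string> Build(string dirPath)
    {
        string[] names = Directory.GetDirectories(dirPath, "*", SearchOption.TopDirectoryOnly)
            .Select(d => Path.GetFileName(d.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)))
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();

        var dict = new Dictionary<long, string>();
        for (int i = 0; i < names.Length; i++)
        {
            dict.Add(i, names[i]);
        }
        return dict;
    }

    // Returns the label index of an image, taken from the name of its parent folder.
    public static long LabelOf(string filePath, IReadOnlyDictionary<string, long> nameToIndex)
    {
        string parent = Path.GetFileName(Path.GetDirectoryName(filePath));
        return nameToIndex[parent];
    }
}
```

Building the inverse map once with `dict.ToDictionary(kv => kv.Value, kv => kv.Key)` and passing it to `LabelOf` avoids scanning the dictionary for every file.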
Reading and shuffling the file lists
Read the train, validation, and test file lists from the folders and shuffle them randomly.
- Read the directories

    ArrayFileName_Train = Directory.GetFiles(Name + "\\train", "*.*", SearchOption.AllDirectories);
    ArrayLabel_Train = GetLabelArray(ArrayFileName_Train);

    ArrayFileName_Validation = Directory.GetFiles(Name + "\\validation", "*.*", SearchOption.AllDirectories);
    ArrayLabel_Validation = GetLabelArray(ArrayFileName_Validation);

    ArrayFileName_Test = Directory.GetFiles(Name + "\\test", "*.*", SearchOption.AllDirectories);
    ArrayLabel_Test = GetLabelArray(ArrayFileName_Test);

- Get the labels

    private Int64[] GetLabelArray(string[] FilesArray)
    {
        Int64[] ArrayLabel = new Int64[FilesArray.Length];
        for (int i = 0; i < ArrayLabel.Length; i++)
        {
            // The label is the name of the folder that directly contains the file
            string[] labels = FilesArray[i].Split('\\');
            string label = labels[labels.Length - 2];
            ArrayLabel[i] = Dict_Label.Single(k => k.Value == label).Key;
        }
        return ArrayLabel;
    }

- Shuffle randomly

    public (string[], Int64[]) ShuffleArray(int count, string[] images, Int64[] labels)
    {
        ArrayList mylist = new ArrayList();
        string[] new_images = new string[count];
        Int64[] new_labels = new Int64[count];
        Random r = new Random();
        for (int i = 0; i < count; i++)
        {
            mylist.Add(i);
        }

        for (int i = 0; i < count; i++)
        {
            // Pick one of the remaining indices at random, then remove it from the pool
            int rand = r.Next(mylist.Count);
            new_images[i] = images[(int)(mylist[rand])];
            new_labels[i] = labels[(int)(mylist[rand])];
            mylist.RemoveAt(rand);
        }
        print("shuffle array list: " + count.ToString());
        return (new_images, new_labels);
    }
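`ShuffleArray` removes an element from an `ArrayList` on every step, which makes the whole shuffle quadratic in the number of files. For larger datasets, a Fisher-Yates shuffle does the same job in linear time. The sketch below is an alternative, not the project's code; the class name and the choice to pass in the `Random` instance are assumptions.

```csharp
using System;

public static class Shuffler
{
    // Returns shuffled copies of images and labels; element i of both outputs
    // always comes from the same input position, so image/label pairs stay aligned.
    public static (string[] images, long[] labels) Shuffle(string[] images, long[] labels, Random rng)
    {
        if (images.Length != labels.Length)
        {
            throw new ArgumentException("images and labels must have the same length");
        }

        var newImages = (string[])images.Clone();
        var newLabels = (long[])labels.Clone();

        // Fisher-Yates: walk backwards, swapping position i with a random position j in [0, i].
        for (int i = newImages.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (newImages[i], newImages[j]) = (newImages[j], newImages[i]);
            (newLabels[i], newLabels[j]) = (newLabels[j], newLabels[i]);
        }
        return (newImages, newLabels);
    }
}
```

Passing a seeded `Random` (for example `new Random(42)`) makes the order reproducible between runs, which can help when comparing training results.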
Preloading part of the dataset
The validation and test images and labels are loaded once, up front, into NDArray format.

    private void LoadImagesToNDArray()
    {
        //Load labels
        y_valid = np.eye(Dict_Label.Count)[new NDArray(ArrayLabel_Validation)];
        y_test = np.eye(Dict_Label.Count)[new NDArray(ArrayLabel_Test)];
        print("Load Labels To NDArray : OK!");

        //Load Images
        x_valid = np.zeros(ArrayFileName_Validation.Length, img_h, img_w, n_channels);
        x_test = np.zeros(ArrayFileName_Test.Length, img_h, img_w, n_channels);
        LoadImage(ArrayFileName_Validation, x_valid, "validation");
        LoadImage(ArrayFileName_Test, x_test, "test");
        print("Load Images To NDArray : OK!");
    }

    private void LoadImage(string[] a, NDArray b, string c)
    {
        for (int i = 0; i < a.Length; i++)
        {
            b[i] = ReadTensorFromImageFile(a[i]);
            Console.Write(".");
        }
        Console.WriteLine();
        Console.WriteLine("Load Images To NDArray: " + c);
    }

    private NDArray ReadTensorFromImageFile(string file_name)
    {
        using (var graph = tf.Graph().as_default())
        {
            var file_reader = tf.read_file(file_name, "file_reader");
            var decodeJpeg = tf.image.decode_jpeg(file_reader, channels: n_channels, name: "DecodeJpeg");
            var cast = tf.cast(decodeJpeg, tf.float32);
            var dims_expander = tf.expand_dims(cast, 0);
            var resize = tf.constant(new int[] { img_h, img_w });
            var bilinear = tf.image.resize_bilinear(dims_expander, resize);
            // Normalize: (pixel - mean) / std
            var sub = tf.subtract(bilinear, new float[] { img_mean });
            var normalized = tf.divide(sub, new float[] { img_std });

            using (var sess = tf.Session(graph))
            {
                return sess.run(normalized);
            }
        }
    }
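Two small operations carry most of the preprocessing above: `np.eye(n)[labels]` turns integer labels into one-hot rows, and the subtract/divide pair normalizes pixel values. The plain-array sketch below shows what each computes, without TensorFlow.NET or NumSharp. The class name is hypothetical, and the mean of 0 and standard deviation of 255 mentioned in the usage note are assumed values, since this article does not list `img_mean` and `img_std`.

```csharp
using System;

public static class Preprocess
{
    // Same result as np.eye(nClasses)[labels]: row i is all zeros except a 1 at column labels[i].
    public static float[,] OneHot(long[] labels, int nClasses)
    {
        var result = new float[labels.Length, nClasses];
        for (int i = 0; i < labels.Length; i++)
        {
            if (labels[i] < 0 || labels[i] >= nClasses)
            {
                throw new ArgumentOutOfRangeException(nameof(labels), "label outside [0, nClasses)");
            }
            result[i, (int)labels[i]] = 1f;
        }
        return result;
    }

    // Element-wise (pixel - mean) / std, matching the subtract and divide ops in the graph.
    public static float[] Normalize(byte[] pixels, float mean, float std)
    {
        var result = new float[pixels.Length];
        for (int i = 0; i < pixels.Length; i++)
        {
            result[i] = (pixels[i] - mean) / std;
        }
        return result;
    }
}
```

For the three classes X, Y, Z (indices 0, 1, 2), the label 2 becomes the row `[0, 0, 1]`. With the assumed mean of 0 and standard deviation of 255, `Normalize` maps 8-bit grayscale pixels into the range [0, 1].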
Building the computation graph
Build the static CNN computation graph. The learning rate is decayed once every n epochs.

    #region BuildGraph
    public Graph BuildGraph()
    {
        var graph = new Graph().as_default();

        tf_with(tf.name_scope("Input"), delegate
        {
            x = tf.placeholder(tf.float32, shape: (-1, img_h, img_w, n_channels), name: "X");
            y = tf.placeholder(tf.float32, shape: (-1, n_classes), name: "Y");
        });

        var conv1 = conv_layer(x, filter_size1, num_filters1, stride1, name: "conv1");
        var pool1 = max_pool(conv1, ksize: 2, stride: 2, name: "pool1");
        var conv2 = conv_layer(pool1, filter_size2, num_filters2, stride2, name: "conv2");
        var pool2 = max_pool(conv2, ksize: 2, stride: 2, name: "pool2");
        var layer_flat = flatten_layer(pool2);
        var fc1 = fc_layer(layer_flat, h1, "FC1", use_relu: true);
        var output_logits = fc_layer(fc1, n_classes, "OUT", use_relu: false);

        // Save some important parameters with the graph so they are easy to load later
        var img_h_t = tf.constant(img_h, name: "img_h");
        var img_w_t = tf.constant(img_w, name: "img_w");
        var img_mean_t = tf.constant(img_mean, name: "img_mean");
        var img_std_t = tf.constant(img_std, name: "img_std");
        var channels_t = tf.constant(n_channels, name: "img_channels");

        // Learning rate decay
        gloabl_steps = tf.Variable(0, trainable: false);
        learning_rate = tf.Variable(learning_rate_base);

        // Create the graph that loads training images
        tf_with(tf.variable_scope("LoadImage"), delegate
        {
            decodeJpeg = tf.placeholder(tf.@byte, name: "DecodeJpeg");
            var cast = tf.cast(decodeJpeg, tf.float32);
            var dims_expander = tf.expand_dims(cast, 0);
            var resize = tf.constant(new int[] { img_h, img_w });
            var bilinear = tf.image.resize_bilinear(dims_expander, resize);
            var sub = tf.subtract(bilinear, new float[] { img_mean });
            normalized = tf.divide(sub, new float[] { img_std }, name: "normalized");
        });

        tf_with(tf.variable_scope("Train"), delegate
        {
            tf_with(tf.variable_scope("Loss"), delegate
            {
                loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels: y, logits: output_logits), name: "loss");
            });

            tf_with(tf.variable_scope("Optimizer"), delegate
            {
                optimizer = tf.train.AdamOptimizer(learning_rate: learning_rate, name: "Adam-op").minimize(loss, global_step: gloabl_steps);
            });

            tf_with(tf.variable_scope("Accuracy"), delegate
            {
                var correct_prediction = tf.equal(tf.argmax(output_logits, 1), tf.argmax(y, 1), name: "correct_pred");
                accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name: "accuracy");
            });

            tf_with(tf.variable_scope("Prediction"), delegate
            {
                cls_prediction = tf.argmax(output_logits, axis: 1, name: "predictions");
                prob = tf.nn.softmax(output_logits, axis: 1, name: "prob");
            });
        });

        return graph;
    }

    /// <summary>
    /// Create a 2D convolution layer
    /// </summary>
    /// <param name="x">input from previous layer</param>
    /// <param name="filter_size">size of each filter</param>
    /// <param name="num_filters">number of filters(or output feature maps)</param>
    /// <param name="stride">filter stride</param>
    /// <param name="name">layer name</param>
    /// <returns>The output array</returns>
    private Tensor conv_layer(Tensor x, int filter_size, int num_filters, int stride, string name)
    {
        return tf_with(tf.variable_scope(name), delegate
        {
            var num_in_channel = x.shape[x.NDims - 1];
            var shape = new[] { filter_size, filter_size, num_in_channel, num_filters };
            var W = weight_variable("W", shape);
            // var tf.summary.histogram("weight", W);
            var b = bias_variable("b", new[] { num_filters });
            // tf.summary.histogram("bias", b);
            var layer = tf.nn.conv2d(x, W,
                                     strides: new[] { 1, stride, stride, 1 },
                                     padding: "SAME");
            layer += b;
            return tf.nn.relu(layer);
        });
    }

    /// <summary>
    /// Create a max pooling layer
    /// </summary>
    /// <param name="x">input to max-pooling layer</param>
    /// <param name="ksize">size of the max-pooling filter</param>
    /// <param name="stride">stride of the max-pooling filter</param>
    /// <param name="name">layer name</param>
    /// <returns>The output array</returns>
    private Tensor max_pool(Tensor x, int ksize, int stride, string name)
    {
        return tf.nn.max_pool(x,
                              ksize: new[] { 1, ksize, ksize, 1 },
                              strides: new[] { 1, stride, stride, 1 },
                              padding: "SAME",
                              name: name);
    }

    /// <summary>
    /// Flattens the output of the convolutional layer to be fed into fully-connected layer
    /// </summary>
    /// <param name="layer">input array</param>
    /// <returns>flattened array</returns>
    private Tensor flatten_layer(Tensor layer)
    {
        return tf_with(tf.variable_scope("Flatten_layer"), delegate
        {
            var layer_shape = layer.TensorShape;
            var num_features = layer_shape[new Slice(1, 4)].size;
            var layer_flat = tf.reshape(layer, new[] { -1, num_features });
            return layer_flat;
        });
    }

    /// <summary>
    /// Create a weight variable with appropriate initialization
    /// </summary>
    /// <param name="name"></param>
    /// <param name="shape"></param>
    /// <returns></returns>
    private RefVariable weight_variable(string name, int[] shape)
    {
        var initer = tf.truncated_normal_initializer(stddev: 0.01f);
        return tf.get_variable(name,
                               dtype: tf.float32,
                               shape: shape,
                               initializer: initer);
    }

    /// <summary>
    /// Create a bias variable with appropriate initialization
    /// </summary>
    /// <param name="name"></param>
    /// <param name="shape"></param>
    /// <returns></returns>
    private RefVariable bias_variable(string name, int[] shape)
    {
        var initial = tf.constant(0f, shape: shape, dtype: tf.float32);
        return tf.get_variable(name,
                               dtype: tf.float32,
                               initializer: initial);
    }

    /// <summary>
    /// Create a fully-connected layer
    /// </summary>
    /// <param name="x">input from previous layer</param>
    /// <param name="num_units">number of hidden units in the fully-connected layer</param>
    /// <param name="name">layer name</param>
    /// <param name="use_relu">boolean to add ReLU non-linearity (or not)</param>
    /// <returns>The output array</returns>
    private Tensor fc_layer(Tensor x, int num_units, string name, bool use_relu = true)
    {
        return tf_with(tf.variable_scope(name), delegate
        {
            var in_dim = x.shape[1];
            var W = weight_variable("W_" + name, shape: new[] { in_dim, num_units });
            var b = bias_variable("b_" + name, new[] { num_units });
            var layer = tf.matmul(x, W) + b;
            if (use_relu)
                layer = tf.nn.relu(layer);
            return layer;
        });
    }
    #endregion
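The graph only stores `learning_rate` as a variable; the stepwise decay described at the start of this section is applied from outside the graph, once every n epochs. As a minimal sketch, the schedule can be written as a pure function. The parameter names mirror the `learning_rate_base`, `learning_rate_decay`, `learning_rate_step`, and `learning_rate_min` fields of the project, while the class name and the concrete values in the usage note are assumptions.

```csharp
using System;

public static class LearningRateSchedule
{
    // Learning rate in effect during the given zero-based epoch.
    // Every `step` epochs (except epoch 0) the rate is multiplied by `decay`
    // and clamped so that it never drops below `minRate`.
    public static float At(int epoch, float baseRate, float decay, int step, float minRate)
    {
        if (step <= 0)
        {
            return baseRate; // a step of 0 disables the decay
        }

        float rate = baseRate;
        for (int e = 1; e <= epoch; e++)
        {
            if (e % step == 0)
            {
                rate = Math.Max(rate * decay, minRate);
            }
        }
        return rate;
    }
}
```

With an assumed base rate of 0.001, a decay factor of 0.1, a step of 2, and a floor of 0.000001, epochs 0 and 1 run at 0.001, epochs 2 and 3 at roughly 0.0001, and the rate then keeps shrinking until it reaches the floor.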
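Training and saving the model
- Batches are read with cv2.imread from SharpCV, which loads local image files directly into an NDArray and so connects the CV and NumPy-style APIs seamlessly;
- .NET's thread-safe blocking queue BlockingCollection<T> is used to implement the equivalent of TensorFlow's native queue manager, FIFOQueue;
  - When training a model, samples have to be read from disk into memory before they can be used. We run multiple threads within the session and add a queue manager that handles enqueueing and dequeueing between the threads, with a bounded queue capacity. The main thread trains on the data in the queue while another thread performs the local file I/O, so reading data and training the model happen asynchronously, which reduces training time.
- The model can be saved either after every epoch or only when it is the best model so far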
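The producer/consumer pattern described above can be shown in isolation, without TensorFlow. In the sketch below, `loadBatch` stands in for reading a batch of images from disk and `consume` stands in for one training step; both delegates, the class name, and the method name are hypothetical. The `try/finally` around the producer and the final `Wait()` are additions that are not part of the project's code: they ensure the consumer loop ends and the error surfaces if loading fails.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchPipeline
{
    // Loads batches on a background task while the calling thread consumes them.
    // Returns the iteration indices in the order they were consumed.
    public static List<int> Run(int numBatches, int capacity, Func<int, float[]> loadBatch, Action<float[]> consume)
    {
        var order = new List<int>();

        // Bounded capacity: the producer blocks in Add() once `capacity` batches are waiting.
        using (var queue = new BlockingCollection<(float[] data, int iter)>(capacity))
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    for (int i = 0; i < numBatches; i++)
                    {
                        queue.Add((loadBatch(i), i));
                    }
                }
                finally
                {
                    // Always signal completion, otherwise the consumer loop would wait forever.
                    queue.CompleteAdding();
                }
            });

            // Blocks while the queue is empty and ends once CompleteAdding() was called and the queue is drained.
            foreach (var item in queue.GetConsumingEnumerable())
            {
                consume(item.data);
                order.Add(item.iter);
            }

            producer.Wait(); // rethrows any exception raised while loading
        }
        return order;
    }
}
```

Because there is a single producer and the queue is first-in, first-out, batches are consumed in the order they were loaded. The capacity limits how many batches sit in memory at once.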

    #region Train
    public void Train(Session sess)
    {
        // Number of training iterations in each epoch
        var num_tr_iter = (ArrayLabel_Train.Length) / batch_size;

        var init = tf.global_variables_initializer();
        sess.run(init);

        var saver = tf.train.Saver(tf.global_variables(), max_to_keep: 10);

        path_model = Name + "\\MODEL";
        Directory.CreateDirectory(path_model);

        float loss_val = 100.0f;
        float accuracy_val = 0f;

        var sw = new Stopwatch();
        sw.Start();
        foreach (var epoch in range(epochs))
        {
            print($"Training epoch: {epoch + 1}");
            // Randomly shuffle the training data at the beginning of each epoch
            (ArrayFileName_Train, ArrayLabel_Train) = ShuffleArray(ArrayLabel_Train.Length, ArrayFileName_Train, ArrayLabel_Train);
            y_train = np.eye(Dict_Label.Count)[new NDArray(ArrayLabel_Train)];

            //decay learning rate
            if (learning_rate_step != 0)
            {
                if ((epoch != 0) && (epoch % learning_rate_step == 0))
                {
                    learning_rate_base = learning_rate_base * learning_rate_decay;
                    if (learning_rate_base <= learning_rate_min) { learning_rate_base = learning_rate_min; }
                    sess.run(tf.assign(learning_rate, learning_rate_base));
                }
            }

            //Load local images asynchronously,use queue,improve train efficiency
            BlockingCollection<(NDArray c_x, NDArray c_y, int iter)> BlockC = new BlockingCollection<(NDArray C1, NDArray C2, int iter)>(TrainQueueCapa);
            Task.Run(() =>
            {
                foreach (var iteration in range(num_tr_iter))
                {
                    var start = iteration * batch_size;
                    var end = (iteration + 1) * batch_size;
                    (NDArray x_batch, NDArray y_batch) = GetNextBatch(sess, ArrayFileName_Train, y_train, start, end);
                    BlockC.Add((x_batch, y_batch, iteration));
                }
                BlockC.CompleteAdding();
            });

            foreach (var item in BlockC.GetConsumingEnumerable())
            {
                sess.run(optimizer, (x, item.c_x), (y, item.c_y));

                if (item.iter % display_freq == 0)
                {
                    // Calculate and display the batch loss and accuracy
                    var result = sess.run(new[] { loss, accuracy }, new FeedItem(x, item.c_x), new FeedItem(y, item.c_y));
                    loss_val = result[0];
                    accuracy_val = result[1];
                    print("CNN:" + ($"iter {item.iter.ToString("000")}: Loss={loss_val.ToString("0.0000")}, Training Accuracy={accuracy_val.ToString("P")} {sw.ElapsedMilliseconds}ms"));
                    sw.Restart();
                }
            }

            // Run validation after every epoch
            (loss_val, accuracy_val) = sess.run((loss, accuracy), (x, x_valid), (y, y_valid));
            print("CNN:" + "---------------------------------------------------------");
            print("CNN:" + $"gloabl steps: {sess.run(gloabl_steps) },learning rate: {sess.run(learning_rate)}, validation loss: {loss_val.ToString("0.0000")}, validation accuracy: {accuracy_val.ToString("P")}");
            print("CNN:" + "---------------------------------------------------------");

            if (SaverBest)
            {
                if (accuracy_val > max_accuracy)
                {
                    max_accuracy = accuracy_val;
                    saver.save(sess, path_model + "\\CNN_Best");
                    print("