ML.NET Machine Learning Framework Study Notes [8]: Object Detection (Using the YOLO2 Model)

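I. Overview

This article shows how to use a YOLO model for object detection. The original code comes from: https://github.com/dotnet/machinelearning-samples

The program takes an image as input, detects the objects in it, and marks each detected object on the image with a red bounding box.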

About YOLO

YOLO (You Only Look Once) is a state-of-the-art real-time object detection system. Official website: https://pjreddie.com/darknet/yolo/

This article uses the TinyYolo2 model, which can detect the following object classes: "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor".

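About ONNX

ONNX (Open Neural Network Exchange) is a common standard for representing deep learning models, so that a model can be exchanged between different frameworks. Its specification and code are developed mainly by Microsoft, Amazon, Facebook, IBM and other companies. With the ONNX standard, ML.NET code can use models that were trained and saved with other machine learning frameworks.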

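II. Code Analysis

1. Main method

    static void Main(string[] args)
    {
        TrainAndSave();
        LoadAndPredict();

        Console.WriteLine("Press any key to exit!");
        Console.ReadKey();
    }

TrainAndSave only needs to run the first time. Once the local model file has been generated, the production code (LoadAndPredict) can be run on its own.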
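The "run TrainAndSave only once" rule could be automated with a file-existence check. The following is a minimal sketch and not the author's code: ModelBootstrap and EnsureModel are hypothetical names, and the build step is passed in as a delegate so that the snippet compiles without ML.NET.

```csharp
using System;
using System.IO;

public static class ModelBootstrap
{
    // Hypothetical helper: runs buildModel only when no model file exists yet.
    // Returns true if the model had to be built, false if an existing file was reused.
    public static bool EnsureModel(string modelPath, Action buildModel)
    {
        if (File.Exists(modelPath))
            return false;

        buildModel();
        return true;
    }
}
```

In Main, a call such as ModelBootstrap.EnsureModel(ObjectDetectionModelFilePath, TrainAndSave) would then stand in for the unconditional TrainAndSave() call, assuming TrainAndSave writes the model to that path.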

2. Training and saving the model

    static readonly string tagsTsv = Path.Combine(trainImagesFolder, "tags.tsv");

    private static void TrainAndSave()
    {
        var mlContext = new MLContext();
        var trainData = mlContext.Data.LoadFromTextFile<ImageNetData>(tagsTsv);

        // Load image -> resize -> extract pixels -> run the ONNX model
        var pipeline = mlContext.Transforms.LoadImages(
                outputColumnName: "image",
                imageFolder: trainImagesFolder,
                inputColumnName: nameof(ImageNetData.ImagePath))
            .Append(mlContext.Transforms.ResizeImages(
                outputColumnName: "image",
                imageWidth: ImageNetSettings.imageWidth,
                imageHeight: ImageNetSettings.imageHeight,
                inputColumnName: "image"))
            .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "image"))
            .Append(mlContext.Transforms.ApplyOnnxModel(
                modelFile: YOLO_ModelFilePath,
                outputColumnNames: new[] { TinyYoloModelSettings.ModelOutput },
                inputColumnNames: new[] { TinyYoloModelSettings.ModelInput }));

        var model = pipeline.Fit(trainData);

        using (var file = File.OpenWrite(ObjectDetectionModelFilePath))
            mlContext.Model.Save(model, trainData.Schema, file);

        Console.WriteLine("Save Model success!");
    }

The ImageNetData class is defined as follows:

    public class ImageNetData
    {
        [LoadColumn(0)]
        public string ImagePath;

        [LoadColumn(1)]
        public string Label;
    }

The tags.tsv file contains only a single sample row: the model is already trained, so there is no point in training it again. One sample image is enough, because the Fit call here only builds the data-processing pipeline model.

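The ApplyOnnxModel method loads the third-party ONNX model.

    public struct TinyYoloModelSettings
    {
        // input tensor name
        public const string ModelInput = "image";

        // output tensor name
        public const string ModelOutput = "grid";
    }

The input and output column names are dictated by the model. A tool such as Netron can be installed to inspect the ONNX file in detail, including the names of its input and output columns.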

3. Applying the model

    private static void LoadAndPredict()
    {
        var mlContext = new MLContext();

        ITransformer trainedModel;
        using (var stream = File.OpenRead(ObjectDetectionModelFilePath))
        {
            trainedModel = mlContext.Model.Load(stream, out var modelInputSchema);
        }
        var predictionEngine = mlContext.Model.CreatePredictionEngine<ImageNetData, ImageNetPrediction>(trainedModel);

        DirectoryInfo testdir = new DirectoryInfo(testimagesFolder);
        foreach (var jpgfile in testdir.GetFiles("*.jpg"))
        {
            ImageNetData image = new ImageNetData { ImagePath = jpgfile.FullName };
            var Predicted = predictionEngine.Predict(image);
            PredictImage(image.ImagePath, Predicted);
        }
    }

The code iterates over the JPG files in a folder, runs each file through the model, and obtains a prediction. The ImageNetPrediction class is defined as follows:

    public class ImageNetPrediction
    {
        [ColumnName(TinyYoloModelSettings.ModelOutput)]
        public float[] PredictedLabels;
    }

The "grid" output column is a float array whose meaning cannot be read directly, so code is needed to convert it into a more understandable form.

    YoloWinMlParser _parser = new YoloWinMlParser();
    IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(Predicted.PredictedLabels, 0.4f);

YoloWinMlParser.ParseOutputs converts the float array into a list of YoloBoundingBox objects. The second parameter is the confidence threshold: only results whose confidence exceeds it are returned.
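To make the conversion less of a black box, here is a rough sketch of how a Tiny YOLOv2 output array is commonly decoded. It is not the YoloWinMlParser source. It assumes the usual Pascal VOC configuration: a 416x416 input, a 13x13 grid, 5 anchor boxes per cell, 25 values per box (4 coordinates, 1 objectness score, 20 class scores), and a flat array in channel, row, column order. The anchor values and the DecodedBox and TinyYoloDecodeSketch names are likewise assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class DecodedBox
{
    public float X, Y, Width, Height, Score; // pixels in the assumed 416x416 model space
    public int ClassIndex;
}

public static class TinyYoloDecodeSketch
{
    // Assumed layout: 13x13 grid, 5 boxes per cell, 5 box features + 20 classes per box.
    const int Rows = 13, Cols = 13, BoxesPerCell = 5, BoxFeatures = 5, ClassCount = 20;
    const float CellSize = 32f; // 416 px / 13 cells

    // Assumed anchor sizes as (width, height) pairs, in grid-cell units.
    static readonly float[] Anchors =
        { 1.08f, 1.19f, 3.42f, 4.41f, 6.63f, 11.38f, 9.42f, 5.11f, 16.62f, 10.52f };

    static float Sigmoid(float v) => 1f / (1f + (float)Math.Exp(-v));

    // Index into the flat array, assuming channel-major (channel, row, column) order.
    static int Offset(int col, int row, int channel) => channel * Rows * Cols + row * Cols + col;

    public static List<DecodedBox> Decode(float[] output, float threshold)
    {
        var boxes = new List<DecodedBox>();
        for (int row = 0; row < Rows; row++)
        for (int col = 0; col < Cols; col++)
        for (int b = 0; b < BoxesPerCell; b++)
        {
            int ch = b * (BoxFeatures + ClassCount);

            float objectness = Sigmoid(output[Offset(col, row, ch + 4)]);
            if (objectness < threshold) continue;

            // Softmax over the class scores, shifted by the maximum for numerical stability.
            float[] raw = Enumerable.Range(0, ClassCount)
                .Select(c => output[Offset(col, row, ch + BoxFeatures + c)]).ToArray();
            float max = raw.Max();
            float[] exp = raw.Select(s => (float)Math.Exp(s - max)).ToArray();
            int best = Array.IndexOf(exp, exp.Max());
            float score = objectness * exp[best] / exp.Sum();
            if (score < threshold) continue;

            // Box centre is relative to its cell; box size is relative to its anchor.
            float cx = (col + Sigmoid(output[Offset(col, row, ch)])) * CellSize;
            float cy = (row + Sigmoid(output[Offset(col, row, ch + 1)])) * CellSize;
            float w = (float)Math.Exp(output[Offset(col, row, ch + 2)]) * Anchors[b * 2] * CellSize;
            float h = (float)Math.Exp(output[Offset(col, row, ch + 3)]) * Anchors[b * 2 + 1] * CellSize;

            boxes.Add(new DecodedBox
            {
                X = cx - w / 2, Y = cy - h / 2, Width = w, Height = h,
                ClassIndex = best, Score = score
            });
        }
        return boxes;
    }
}
```

Under these assumptions, ClassIndex would index into the 20-label list from the overview, provided the labels are kept in that order, and the coordinates would still need scaling from 416x416 to the original image size before drawing.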

The YoloBoundingBox class is defined as follows:

    class YoloBoundingBox
    {
        public string Label { get; set; }
        public float Confidence { get; set; }

        public float X { get; set; }
        public float Y { get; set; }
        public float Height { get; set; }
        public float Width { get; set; }

        public RectangleF Rect
        {
            get { return new RectangleF(X, Y, Width, Height); }
        }
    }

Here, Label is the object class and Confidence is the confidence score.

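Because of how YOLO works, the same object is often reported several times with nearly identical boxes, so the results must also be filtered to drop those that overlap heavily.

    YoloWinMlParser _parser = new YoloWinMlParser();
    IList<YoloBoundingBox> boundingBoxes = _parser.ParseOutputs(Predicted.PredictedLabels, 0.4f);
    var filteredBoxes = _parser.NonMaxSuppress(boundingBoxes, 5, 0.6F);

The second parameter of YoloWinMlParser.NonMaxSuppress is the maximum number of results to keep. The third is the overlap threshold: results whose overlap ratio exceeds it are discarded.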
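The filtering step can be illustrated with a generic greedy non-maximum suppression. This is a sketch only and not the YoloWinMlParser implementation: it assumes that "overlap ratio" means intersection over union (IoU), and Box and NmsSketch are hypothetical stand-ins so the snippet is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Box
{
    public float X, Y, Width, Height, Confidence;
}

public static class NmsSketch
{
    // Intersection over union of two axis-aligned rectangles; 0 when they do not overlap.
    public static float IoU(Box a, Box b)
    {
        float left = Math.Max(a.X, b.X);
        float top = Math.Max(a.Y, b.Y);
        float right = Math.Min(a.X + a.Width, b.X + b.Width);
        float bottom = Math.Min(a.Y + a.Height, b.Y + b.Height);

        float intersection = Math.Max(0f, right - left) * Math.Max(0f, bottom - top);
        float union = a.Width * a.Height + b.Width * b.Height - intersection;
        return union <= 0f ? 0f : intersection / union;
    }

    // Greedy NMS: visit boxes from highest to lowest confidence and keep a box only if
    // its IoU with every box kept so far is at most iouThreshold. Stops after limit boxes.
    public static List<Box> NonMaxSuppress(IEnumerable<Box> boxes, int limit, float iouThreshold)
    {
        var kept = new List<Box>();
        foreach (var candidate in boxes.OrderByDescending(b => b.Confidence))
        {
            if (kept.Count >= limit) break;
            if (kept.All(k => IoU(k, candidate) <= iouThreshold))
                kept.Add(candidate);
        }
        return kept;
    }
}
```

With a threshold of 0.6, two 10x10 boxes offset by one pixel in each direction (IoU of 81/119, about 0.68) collapse into the higher-confidence one, while a box elsewhere in the image is kept.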

III. Resources

Source code: https://github.com/seabluescn/Study_ML.NET

Project name: YOLO_ObjectDetection

Assets: https://gitee.com/seabluescn/ML_Assets (ObjectDetection)

Table of contents for the ML.NET study notes series: <https://www.cnblogs.com/seabluescn/p/10904391.html>
