This blog series is for learning TensorFlow and Python. Since I am a beginner, please point out any mistakes you find.

The whole project can be downloaded from Baidu cloud: <https://pan.baidu.com/s/1f2JPJpE7m5M2kSifMP0-Lw> (password: 9p8v)

One. Training data preparation

Training data preparation mainly includes the following three parts:

* How to parse the KITTI dataset for vehicle detection training
* How to augment the data to increase the diversity of the training set
* How to supply the model with batches of training data during the training stage

1. Reading the KITTI dataset

First, download the vehicle detection dataset from the KITTI official website:
<http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d>

Concretely, just download the following 3 compressed packages (an email address is required to receive the download links):

* Download left color images of object data set (12 GB): <http://www.cvlibs.net/download.php?file=data_object_image_2.zip>
* Download training labels of object data set (5 MB): <http://www.cvlibs.net/download.php?file=data_object_label_2.zip>
* Download object development kit (1 MB): <http://kitti.is.tue.mpg.de/kitti/devkit_object.zip>

The KITTI dataset pairs each image with one annotation file. The annotation file is in TXT format and contains N rows of 15 space-separated columns (a 16th column appears only in result files). The columns are:

| Column | Name | Description |
| --- | --- | --- |
| 1 | Category | Target category, 8 common classes: 'Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc', or 'DontCare' |
| 2 | Truncation | Whether the target extends beyond the image boundary: 0 = non-truncated, 1 = truncated |
| 3 | Occlusion | 0 = fully visible, 1 = partly occluded, 2 = largely occluded, 3 = unknown |
| 4 | Observation angle | The target's observation angle, in the range [-pi..pi] |
| 5-8 | Target bbox | Pixel coordinates starting from 0: [left, top, right, bottom] |
| 9-11 | 3D dimensions | 3D object dimensions: height, width, length (in meters) |
| 12-14 | 3D location | 3D object location x, y, z in camera coordinates (in meters) |
| 15 | Rotation ry | Rotation ry around the Y-axis in camera coordinates, [-pi..pi] |
| 16 | Confidence score | Only used in test results; a floating-point number used to plot the p/r curve |

Remarks: 'DontCare' marks regions that were left unlabeled, for example because they were out of range of the laser scanner. At test time, results in these regions are automatically ignored. They can also be ignored during training, to prevent them from repeatedly triggering hard mining.

Because we are only doing vehicle detection here, we only care about the category and bbox information. In addition, the 3 classes 'Car', 'Van', and 'Truck' are merged into a single positive-sample target, and everything else is treated as background.

First, we need to read the annotation files in batch:
```python
# readKITTI.py: used for parsing the KITTI dataset
import os

# Get the list of files with the specified suffix
def get_filelist(path, ext):
    # Get all files under the folder
    filelist_temp = os.listdir(path)
    filelist = []
    # Select all annotation files by comparing suffixes
    for i in filelist_temp:
        if os.path.splitext(i)[1] == ext:
            filelist.append(os.path.splitext(i)[0])
    return filelist

# Parse an annotation file and return its bounding box information, shape Nx4
def get_bbox(filename):
    bbox = []
    # Check whether the file exists
    if os.path.exists(filename):
        with open(filename) as fi:
            label_data = fi.readlines()
        # Read the annotation of each line in turn
        for l in label_data:
            data = l.split()
            # Record the bounding box if a vehicle target exists
            if data[0] in ['Van', 'Car', 'Truck']:
                bbox.append((float(data[4]), float(data[5]),
                             float(data[6]), float(data[7])))
    return bbox

# Get bounding box information from annotation files in batch
def get_bboxlist(rootpath, imagelist):
    bboxlist = []
    for i in imagelist:
        bboxlist.append(get_bbox(rootpath + i + '.txt'))
    return bboxlist
```
By calling the above functions, we can read the KITTI dataset into the form we need:
```python
import readKITTI

IMAGE_DIR = './image/training/image_2/'
LABEL_DIR = './label/training/label_2/'

imagelist = readKITTI.get_filelist(IMAGE_DIR, '.png')
bboxlist = readKITTI.get_bboxlist(LABEL_DIR, imagelist)
```
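For reference, a single KITTI label line can be parsed like this. The line below is illustrative only, not taken from a real annotation file:

```python
# A hypothetical KITTI label line (values are made up for illustration)
line = "Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 1.65 1.67 3.64 -0.65 1.71 46.70 -1.59"
data = line.split()
category = data[0]                          # column 1: class name
bbox = tuple(float(v) for v in data[4:8])   # columns 5-8: [left, top, right, bottom]
print(category, bbox)
```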
2. Data augmentation

Data augmentation is commonly used when training deep learning models. Among the techniques, random scaling, cropping, and random flipping are probably the most widely used and effective. (As for contrast adjustment, color adjustment, PCA, and the like, their benefit is harder to judge.)

For object detection, one thing is very important: while adjusting the image, we must also keep the target bounding boxes valid and correct.

Scaling

To allow batching in subsequent model training, we usually fix input images to the same size, so resizing the image (and unifying its color format) is necessary.
```python
# imAugment.py: provides some functions for data augmentation
import cv2

# Scale the image to the specified size,
# adjusting the bounding boxes and color at the same time
def imresize(in_img, in_bbox, out_w, out_h, is_color=True):
    # If a string is passed in, treat it as a file path
    if isinstance(in_img, str):
        in_img = cv2.imread(in_img)
    # Get the image width and height
    height, width = in_img.shape[:2]
    out_img = cv2.resize(in_img, (out_w, out_h))
    # Adjust the image color
    if is_color == True and in_img.ndim == 2:
        out_img = cv2.cvtColor(out_img, cv2.COLOR_GRAY2BGR)
    elif is_color == False and in_img.ndim == 3:
        out_img = cv2.cvtColor(out_img, cv2.COLOR_BGR2GRAY)
    # Adjust the bounding boxes
    s_h = out_h / height
    s_w = out_w / width
    out_bbox = []
    for i in in_bbox:
        out_bbox.append((i[0]*s_w, i[1]*s_h, i[2]*s_w, i[3]*s_h))
    return out_img, out_bbox
```
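The bbox adjustment inside `imresize` is just a per-axis rescale. As a standalone sketch (the helper name `scale_bbox` is mine, not from the project):

```python
def scale_bbox(bbox, in_w, in_h, out_w, out_h):
    # Same math as the bbox loop in imresize:
    # scale x by out_w/in_w and y by out_h/in_h
    s_w, s_h = out_w / in_w, out_h / in_h
    return (bbox[0] * s_w, bbox[1] * s_h, bbox[2] * s_w, bbox[3] * s_h)

# Halving the width of a 1242x375 KITTI image halves the x coordinates only
print(scale_bbox((100.0, 50.0, 300.0, 150.0), 1242, 375, 621, 375))
```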
Horizontal flip

For vehicle detection there is no need for vertical flips, so we only do a horizontal flip here, flipping the bounding boxes accordingly.
```python
# imAugment.py: provides some functions for data augmentation
# Flip the image horizontally, adjusting the bounding boxes at the same time
def immirror(in_img, in_bbox):
    # If a string is passed in, treat it as a file path
    if isinstance(in_img, str):
        in_img = cv2.imread(in_img)
    # Flip the image horizontally
    out_img = cv2.flip(in_img, 1)
    # Get the image width
    width = out_img.shape[1]
    # Relocate the targets on the flipped image;
    # left and right swap roles so that left < right still holds
    out_bbox = []
    for i in in_bbox:
        out_bbox.append((width - i[2], i[1], width - i[0], i[3]))
    return out_img, out_bbox
```
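The coordinate flip can be checked in isolation. After mirroring, the new left edge comes from the old right edge, so left < right is preserved (the helper name `mirror_bbox` is mine):

```python
def mirror_bbox(bbox, width):
    left, top, right, bottom = bbox
    # Horizontal flip maps x -> width - x, and left/right swap roles
    return (width - right, top, width - left, bottom)

flipped = mirror_bbox((100.0, 50.0, 300.0, 150.0), 1242)
print(flipped)
```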
Random cropping

In fact, random cropping has many constraints and precautions; the main points are:

* A minimum crop size must be specified; otherwise a crop block that is too small is useless for training.
* An image that is already too small should not be cropped further.
* Because we cannot reliably judge whether a partially cut target is still a valid, recognizable target, the crop region should contain all of the bounding boxes.

```python
# imAugment.py: provides some functions for data augmentation
import random

# Randomly crop the image, adjusting the bounding boxes at the same time;
# min_hw is the minimum width/height of the crop block
def imcrop(in_img, in_bbox, min_hw):
    # If a string is passed in, treat it as a file path
    if isinstance(in_img, str):
        in_img = cv2.imread(in_img)
    # Get the image width and height
    height, width = in_img.shape[:2]
    # If the image is too small, give up cropping
    if height <= min_hw and width <= min_hw:
        return in_img, in_bbox
    # To prevent valid targets from being truncated by the crop,
    # the crop region should contain all targets:
    # find the smallest rectangle containing all of them
    min_x1, min_y1, min_x2, min_y2 = width - 1, height - 1, 0, 0
    for i in in_bbox:
        min_x1 = min(min_x1, int(i[0]))
        min_y1 = min(min_y1, int(i[1]))
        min_x2 = max(min_x2, int(i[2]))
        min_y2 = max(min_y2, int(i[3]))
    # Based on this minimal enclosing box, randomly generate a rectangle,
    # keeping it within the image range
    rand_x1, rand_y1, rand_x2, rand_y2 = 0, 0, width, height
    if min_x1 <= 1:
        rand_x1 = 0
    else:
        rand_x1 = random.randint(0, min(min_x1, max(width - min_hw, 1)))
    if min_y1 <= 1:
        rand_y1 = 0
    else:
        rand_y1 = random.randint(0, min(min_y1, max(height - min_hw, 1)))
    if min_x2 >= width or rand_x1 + min_hw >= width:
        rand_x2 = width
    else:
        rand_x2 = random.randint(max(rand_x1 + min_hw, min_x2), width)
    if min_y2 >= height or rand_y1 + min_hw >= height:
        rand_y2 = height
    else:
        rand_y2 = random.randint(max(rand_y1 + min_hw, min_y2), height)
    # Crop the image
    out_img = in_img[rand_y1:rand_y2-1, rand_x1:rand_x2-1]
    # Adjust the bounding boxes
    out_bbox = []
    for i in in_bbox:
        out_bbox.append((i[0] - rand_x1, i[1] - rand_y1,
                         i[2] - rand_x1, i[3] - rand_y1))
    return out_img, out_bbox
```
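The first step of `imcrop`, finding the smallest rectangle that contains every bounding box, can be sketched on its own (the helper name `min_enclosing_rect` is mine):

```python
def min_enclosing_rect(bboxes, width, height):
    # Start from degenerate extremes and grow the rectangle to cover every box
    x1, y1, x2, y2 = width - 1, height - 1, 0, 0
    for b in bboxes:
        x1 = min(x1, int(b[0]))
        y1 = min(y1, int(b[1]))
        x2 = max(x2, int(b[2]))
        y2 = max(y2, int(b[3]))
    return x1, y1, x2, y2

rect = min_enclosing_rect([(10, 20, 100, 80), (50, 5, 200, 90)], 1242, 375)
print(rect)
```

Any random crop rectangle then only has to enclose this one rectangle to be guaranteed not to truncate any target.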
Here are the rendered results (top: the original picture; below: horizontal flip, scaling, and random crop):



3. Batch generation


In the training stage we need to generate batches one by one for training. The parameters usually required include: batch size, training image size, color, whether to shuffle the data, whether to randomly crop, and so on. Based on this, here is the batch-supplying code:
```python
# coding=utf-8
# genBatch.py: provides training data during the training stage
import random
import readKITTI
import imAugment
import cv2

class genBatch:
    image_dir, label_dir = [], []
    image_list, bbox_list = [], []
    initOK = False
    readPos = 0

    # Initialize and read the data
    def initdata(self, imagedir, labeldir):
        self.image_dir, self.label_dir = imagedir, labeldir
        self.image_list = readKITTI.get_filelist(imagedir, '.png')
        self.bbox_list = readKITTI.get_bboxlist(labeldir, self.image_list)
        # Check that the data is not empty and that image and label counts match
        if len(self.image_list) > 0 and len(self.image_list) == len(self.bbox_list):
            self.initOK = True
        else:
            print("The amount of images is %d, while the amount of "
                  "corresponding labels is %d"
                  % (len(self.image_list), len(self.bbox_list)))
            self.initOK = False
        return self.initOK

    # Generate a new batch
    def genbatch(self, batchsize, newh, neww, iscolor=True, isshuffle=False,
                 mirrorratio=0.0, cropratio=0.0):
        if self.initOK == False:
            print("The initdata() function must be "
                  "successfully called first.")
            return []
        batch_data, batch_bbox = [], []
        for i in range(batchsize):
            # When the data has been fully traversed, start over
            if self.readPos >= len(self.image_list) - 1:
                self.readPos = 0
                if isshuffle == True:
                    # Use the same random seed so that images and labels
                    # are shuffled into the same order
                    r_seed = random.random()
                    random.seed(r_seed)
                    random.shuffle(self.image_list)
                    random.seed(r_seed)
                    random.shuffle(self.bbox_list)
            img = cv2.imread(self.image_dir + self.image_list[self.readPos] + '.png')
            bbox = self.bbox_list[self.readPos]
            self.readPos += 1
            # Crop with the specified probability;
            # remember that cropping should happen before resizing
            if cropratio > 0 and random.random() < cropratio:
                img, bbox = imAugment.imcrop(img, bbox, min(neww, newh))
            # Adjust the image size and color
            img, bbox = imAugment.imresize(img, bbox, neww, newh, iscolor)
            # Mirror with the specified probability
            if mirrorratio > 0 and random.random() < mirrorratio:
                img, bbox = imAugment.immirror(img, bbox)
            batch_data.append(img)
            batch_bbox.append(bbox)
        return batch_data, batch_bbox
```
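The shuffle-with-the-same-seed trick used in `genbatch` is what keeps images and labels paired after shuffling. In isolation (the file names and labels below are made up):

```python
import random

# Hypothetical image names and their paired labels
images = ['000000', '000001', '000002', '000003']
labels = [0, 1, 2, 3]

# Seeding the generator identically before each shuffle
# produces the same permutation for both lists
r_seed = random.random()
random.seed(r_seed)
random.shuffle(images)
random.seed(r_seed)
random.shuffle(labels)

# Each image name still lines up with its original label
print(images, labels)
```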