System environment

Ubuntu 14.04

Python 2.7


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


Note: for the detailed setup of py-faster-rcnn in a CPU-only environment, see my other article:
<https://blog.csdn.net/weixin_40369473/article/details/79941908>

Below I walk through training a model on your own data with py-faster-rcnn, starting from building the dataset.

Building the dataset


Before building your own dataset, download the VOC2007 dataset first.

Baidu Cloud link: http://pan.baidu.com/s/1gfdSFRX

Extract it and place it under py-faster-rcnn-master/data. Later you will replace its contents with your own training data: your Annotations, ImageSets and JPEGImages folders take the place of the corresponding folders in py-faster-rcnn/data/VOCdevkit2007/VOC2007.


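The file structure is as follows:

Annotations holds all the XML files
JPEGImages holds all the training images
Main holds four txt files: test.txt is the test set, train.txt the training set, val.txt the validation set, and trainval.txt the training and validation sets combined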
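Before swapping in your own data, it can help to confirm that the folder really has the layout described above. The following is a minimal sketch, not part of py-faster-rcnn; the function name missing_voc_entries and the voc_root argument are made up for illustration.

```python
import os

# Sub-paths that the layout described above expects under VOC2007.
REQUIRED_DIRS = ["Annotations", "JPEGImages", os.path.join("ImageSets", "Main")]
REQUIRED_SPLITS = ["train.txt", "val.txt", "trainval.txt", "test.txt"]


def missing_voc_entries(voc_root):
    """Return the expected directories and split files that are absent under voc_root."""
    missing = []
    for d in REQUIRED_DIRS:
        if not os.path.isdir(os.path.join(voc_root, d)):
            missing.append(d)
    main_dir = os.path.join(voc_root, "ImageSets", "Main")
    for name in REQUIRED_SPLITS:
        if not os.path.isfile(os.path.join(main_dir, name)):
            missing.append(os.path.join("ImageSets", "Main", name))
    return missing
```

Calling missing_voc_entries("data/VOCdevkit2007/VOC2007") from the repository root should return an empty list once everything is in place.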


(I) Naming the images


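To train on our own data we need to put it into VOC2007 format, so the first step is to rename the images to the "000001.jpg" pattern, which is the standard VOC2007 naming. Put all the training images into a single folder; when I first tested this, mine were in /home/wlw/VS_code_projects/cat_dog_picture. The Python script below renames them in bulk.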

# -*- coding: utf-8 -*-
import os

pic_path = "/home/wlw/VS_code_projects/cat_dog_picture"


def rename():
    piclist = os.listdir(pic_path)
    i = 1
    for pic in piclist:
        if pic.endswith(".jpg"):
            # os.path.abspath returns the absolute path
            old_path = os.path.join(os.path.abspath(pic_path), pic)
            new_path = os.path.join(os.path.abspath(pic_path),
                                    '000' + format(str(i), '0>3') + '.jpg')
            os.renames(old_path, new_path)
            print("Renamed %s to %s" % (old_path, new_path))
            i = i + 1
    # Report only the .jpg files that were actually renamed
    print("%d images were renamed to 000001.jpg ~ %s.jpg"
          % (i - 1, '000' + format(str(i - 1), '0>3')))


rename()

After running it, the images are named 000001.jpg, 000002.jpg, and so on.
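One detail of the naming scheme: '000' + format(str(i), '0>3') gives six digits only while i is below 1000; for i = 1000 it yields 0001000.jpg, which has seven. A minimal sketch of a fixed-width alternative follows; the helper name voc_name is hypothetical.

```python
def voc_name(index, width=6, ext=".jpg"):
    """Return a zero-padded VOC-style file name such as 000001.jpg."""
    # str.zfill pads with leading zeros up to the requested width.
    return str(index).zfill(width) + ext
```

For indices below 1000 this produces the same names as the script above, so it could replace the '000' + format(...) expression if the dataset grows.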





(II) Drawing bounding boxes and generating the XML files automatically

Here I used the labelImg tool. It lets you draw custom bounding boxes around the targets and generates the XML files automatically:
<https://github.com/tzutalin/labelImg>


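To sanity-check an annotation file after labelling, you can read the class names and boxes back out of the XML. This is a minimal sketch that assumes the usual PASCAL VOC layout (object / name / bndbox with xmin, ymin, xmax, ymax); the sample XML and the function name read_voc_objects are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in PASCAL VOC style, trimmed to the fields used here.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <object>
    <name>cat</name>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>"""


def read_voc_objects(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(float(box.find(tag).text))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.find("name").text.strip(), coords))
    return objects
```

Every class name returned this way should match one of the labels you later put into pascal_voc.py.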

(III) Generating the four txt files in ImageSets/Main from the XML files with Python



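Each txt file lists image names without the extension, one per line. test.txt is the test set, train.txt the training set, val.txt the validation set, and trainval.txt the training and validation sets combined. Here I make trainval roughly 80% of the whole dataset and test the remaining 20%; train is roughly 80% of trainval and val the remaining 20%. The Python code is as follows: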

# -*- coding: utf-8 -*-
import os
import random

xmlfilepath = "/home/wlw/VS_code_projects/pic_xml"
txtsavepath = "/home/wlw/VS_code_projects/pic_txt"
trainval_percent = 0.8  # trainval's share of the whole dataset; the rest is test
train_percent = 0.8     # train's share of trainval; the rest is val


def xml_to_txt():
    xmllist = os.listdir(xmlfilepath)  # list of XML files
    xml_num = len(xmllist)             # number of XML files
    num_list = range(xml_num)          # refer to each XML file by its index

    trainval_num = int(xml_num * trainval_percent)
    # randomly pick part of the XML files as the trainval set
    trainval = random.sample(num_list, trainval_num)
    train_num = int(trainval_num * train_percent)
    # randomly pick part of trainval as the train set
    train = random.sample(trainval, train_num)

    ftrainval = open(txtsavepath + '/trainval.txt', 'w')
    ftest = open(txtsavepath + '/test.txt', 'w')
    ftrain = open(txtsavepath + '/train.txt', 'w')
    fval = open(txtsavepath + '/val.txt', 'w')

    for i in num_list:
        name = xmllist[i][:-4] + '\n'  # strip the ".xml" suffix
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)

    ftrainval.close()
    ftrain.close()
    fval.close()
    ftest.close()


xml_to_txt()
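The split is random, so it is worth checking that the four files are consistent with each other: train and val should not overlap, trainval should be exactly their union, and test should share nothing with trainval. The sketch below does only that; read_ids and check_splits are hypothetical helper names, and main_dir is whichever folder holds the four txt files.

```python
import os


def read_ids(path):
    """Read one image id per line, ignoring blank lines."""
    with open(path) as f:
        return set(line.strip() for line in f if line.strip())


def check_splits(main_dir):
    """Return a list of human-readable problems found in the four split files."""
    train = read_ids(os.path.join(main_dir, "train.txt"))
    val = read_ids(os.path.join(main_dir, "val.txt"))
    trainval = read_ids(os.path.join(main_dir, "trainval.txt"))
    test = read_ids(os.path.join(main_dir, "test.txt"))

    problems = []
    if train & val:
        problems.append("train and val overlap")
    if (train | val) != trainval:
        problems.append("trainval is not the union of train and val")
    if trainval & test:
        problems.append("trainval and test overlap")
    return problems
```

An empty list means the files agree with the 80/20 scheme described above, at least structurally.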

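The dataset is now basically ready. Replace the corresponding folders in py-faster-rcnn/data/VOCdevkit2007/VOC2007 with yours: Annotations holds all the XML files, JPEGImages all the training images, and ImageSets/Main the four txt files (test.txt, train.txt, val.txt and trainval.txt).

That completes the dataset work. Next come the modifications needed before training.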

Modification steps


(1) Because training runs on the CPU, first open py-faster-rcnn-master/tools/train_faster_rcnn_alt_opt.py.

    Comment out the GPU-related part on lines 34-36:

def parse_args():
    """
    Parse input arguments
    """
    parser = argparse.ArgumentParser(description='Train a Faster R-CNN network')
    #parser.add_argument('--gpu', dest='gpu_id',
    #                    help='GPU device id to use [0]',
    #                    default=0, type=int)

    Also comment out cfg.GPU_ID = args.gpu_id on line 213:

if __name__ == '__main__':
    args = parse_args()

    print('Called with args:')
    print(args)

    if args.cfg_file is not None:
        cfg_from_file(args.cfg_file)
    if args.set_cfgs is not None:
        cfg_from_list(args.set_cfgs)
    #cfg.GPU_ID = args.gpu_id

    On line 102, change caffe.set_mode_gpu() to CPU mode:

def _init_caffe(cfg):
    """Initialize pycaffe in a training process.
    """

    import caffe
    # fix the random seeds (numpy and caffe) for reproducibility
    np.random.seed(cfg.RNG_SEED)
    caffe.set_random_seed(cfg.RNG_SEED)
    # set up caffe
    caffe.set_mode_cpu()
    #caffe.set_device(cfg.GPU_ID)
(2) Open py-faster-rcnn-master/experiments/scripts/faster_rcnn_alt_opt.sh and comment out the GPU-related parts. From line 46 to the end, the script becomes the following. The relative paths (data/..., experiments/...) assume it is launched from the repository root:

#time ./tools/train_faster_rcnn_alt_opt.py --gpu ${GPU_ID} \
time python /home/wlw/Downloads/py-faster-rcnn-master/tools/train_faster_rcnn_alt_opt.py \
  --net_name ${NET} \
  --weights data/imagenet_models/${NET}.v2.caffemodel \
  --imdb ${TRAIN_IMDB} \
  --cfg experiments/cfgs/faster_rcnn_alt_opt.yml \
  ${EXTRA_ARGS}

set +x
NET_FINAL=`grep "Final model:" ${LOG} | awk '{print $3}'`
set -x

#time ./tools/test_net.py --gpu ${GPU_ID} \
time python /home/wlw/Downloads/py-faster-rcnn-master/tools/test_net.py \
  --def /home/wlw/Downloads/py-faster-rcnn-master/models/${PT_DIR}/${NET}/faster_rcnn_alt_opt/faster_rcnn_test.pt \
  --net ${NET_FINAL} \
  --imdb ${TEST_IMDB} \
  --cfg experiments/cfgs/faster_rcnn_alt_opt.yml \
  ${EXTRA_ARGS}

That completes the switch from GPU to CPU. The remaining steps adjust the model and dataset definitions before training.


(3) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt, modify:

layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3" # set per your training set: number of classes + 1
  }
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 3 # set per your training set: number of classes + 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 12 # set per your training set: (number of classes + 1) * 4
    weight_filler { type: "gaussian" std: 0.001 }
    bias_filler { type: "constant" value: 0 }
  }
}
(4) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt, modify:

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3" # set per your training set: number of classes + 1
  }
}
(5) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt, modify:

layer {
  name: 'data'
  type: 'Python'
  top: 'data'
  top: 'rois'
  top: 'labels'
  top: 'bbox_targets'
  top: 'bbox_inside_weights'
  top: 'bbox_outside_weights'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3" # set per your training set: number of classes + 1
  }
}
layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 3 # set per your training set: number of classes + 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  param { lr_mult: 1.0 }
  param { lr_mult: 2.0 }
  inner_product_param {
    num_output: 12 # set per your training set: (number of classes + 1) * 4
    weight_filler { type: "gaussian" std: 0.001 }
    bias_filler { type: "constant" value: 0 }
  }
}
(6) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt, modify:

layer {
  name: 'input-data'
  type: 'Python'
  top: 'data'
  top: 'im_info'
  top: 'gt_boxes'
  python_param {
    module: 'roi_data_layer.layer'
    layer: 'RoIDataLayer'
    param_str: "'num_classes': 3" # set per your training set: number of classes + 1
  }
}
(7) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt, modify:

layer {
  name: "cls_score"
  type: "InnerProduct"
  bottom: "fc7"
  top: "cls_score"
  inner_product_param {
    num_output: 3 # set per your training set: number of classes + 1
  }
}
layer {
  name: "bbox_pred"
  type: "InnerProduct"
  bottom: "fc7"
  top: "bbox_pred"
  inner_product_param {
    num_output: 12 # set per your training set: (number of classes + 1) * 4
  }
}
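The three numbers edited in steps (3) to (7) all follow from the label list, so they can be computed rather than worked out by hand. A minimal sketch, with a hypothetical helper name head_sizes:

```python
def head_sizes(class_names):
    """Compute the prototxt values for a list of foreground class names.

    The background class is not part of class_names; it is added here.
    """
    num_classes = len(class_names) + 1  # +1 for '__background__'
    return {
        "num_classes": num_classes,               # param_str of the data layers
        "cls_score_num_output": num_classes,      # one score per class
        "bbox_pred_num_output": num_classes * 4,  # four box coordinates per class
    }
```

For the two labels used in this article, head_sizes(['cat', 'dog']) gives 3, 3 and 12, matching the values in the listings.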
(8) In py-faster-rcnn-master/lib/datasets/pascal_voc.py, modify:

class pascal_voc(imdb):
    def __init__(self, image_set, year, devkit_path=None):
        imdb.__init__(self, 'voc_' + year + '_' + image_set)
        self._year = year
        self._image_set = image_set
        self._devkit_path = self._get_default_path() if devkit_path is None \
                            else devkit_path
        self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
        self._classes = ('__background__', # always index 0
                         #'aeroplane', 'bicycle', 'bird', 'boat',
                         #'bottle', 'bus', 'car', 'cat', 'chair',
                         #'cow', 'diningtable', 'dog', 'horse',
                         #'motorbike', 'person', 'pottedplant',
                         #'sheep', 'sofa', 'train', 'tvmonitor'
                         'cat', 'dog') # change to your own labels

(9) In py-faster-rcnn-master/lib/datasets/imdb.py, modify (the two print statements are debugging output for the flipped x-coordinates):

def append_flipped_images(self):
    num_images = self.num_images
    widths = [PIL.Image.open(self.image_path_at(i)).size[0]
              for i in xrange(num_images)]
    for i in xrange(num_images):
        boxes = self.roidb[i]['boxes'].copy()
        oldx1 = boxes[:, 0].copy()
        oldx2 = boxes[:, 2].copy()
        boxes[:, 0] = widths[i] - oldx2 - 1
        print boxes[:, 0]
        boxes[:, 2] = widths[i] - oldx1 - 1
        print boxes[:, 0]
        assert (boxes[:, 2] >= boxes[:, 0]).all()
        entry = {'boxes' : boxes,
                 'gt_overlaps' : self.roidb[i]['gt_overlaps'],
                 'gt_classes' : self.roidb[i]['gt_classes'],
                 'flipped' : True}
        self.roidb.append(entry)
    self._image_index = self._image_index * 2
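    Note: to avoid mixing things up with earlier models, delete the output folder before training (or rename it), and also delete the files in py-faster-rcnn-master/data/cache and in py-faster-rcnn-master/data/VOCdevkit2007/annotations_cache, if there are any.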
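The cache clean-up can be scripted so it is not forgotten between runs. The sketch below only touches the two cache directories named in the note; the function name clear_stale_files and the dry_run flag are assumptions of this example, and the output folder is left for you to delete or rename by hand.

```python
import os
import shutil


def clear_stale_files(repo_root, dry_run=True):
    """List (and, unless dry_run, delete) everything inside the two cache folders."""
    cache_dirs = [
        os.path.join(repo_root, "data", "cache"),
        os.path.join(repo_root, "data", "VOCdevkit2007", "annotations_cache"),
    ]
    found = []
    for cache_dir in cache_dirs:
        if not os.path.isdir(cache_dir):
            continue  # nothing cached yet
        for name in sorted(os.listdir(cache_dir)):
            path = os.path.join(cache_dir, name)
            found.append(path)
            if not dry_run:
                if os.path.isdir(path):
                    shutil.rmtree(path)
                else:
                    os.remove(path)
    return found
```

With dry_run=True the function only reports what it would remove, which is a safer first call.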

    
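Settings such as the learning rate are in the four solver files in py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt. The iteration counts can be changed in py-faster-rcnn-master/tools/train_faster_rcnn_alt_opt.py:
max_iters = [80000, 40000, 80000, 40000]
These are the iteration counts for the four stages (RPN stage 1, Fast R-CNN stage 1, RPN stage 2, Fast R-CNN stage 2). Change them to whatever you need.

If you change these values, also edit the corresponding solver files (there are four) in py-faster-rcnn/models/pascal_voc/ZF/faster_rcnn_alt_opt so that each stepsize is smaller than the iteration count set above for that stage.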
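The stepsize rule is easy to check mechanically. The sketch below reads the stepsize field from the text of a solver file and compares it with the iteration count of the matching stage; the helper names and the sample solver text are illustrative assumptions, not copied from the repository.

```python
import re


def read_stepsize(solver_text):
    """Return the integer stepsize from solver text, or None if there is none."""
    match = re.search(r"^\s*stepsize\s*:\s*(\d+)", solver_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def stepsize_ok(solver_text, max_iter):
    """True if the solver has no stepsize or its stepsize is below max_iter."""
    stepsize = read_stepsize(solver_text)
    return stepsize is None or stepsize < max_iter


# Hypothetical solver contents, for illustration only.
EXAMPLE_SOLVER = 'base_lr: 0.001\nlr_policy: "step"\ngamma: 0.1\nstepsize: 60000\n'
```

With this example, a stage that runs 80000 iterations passes the check, while cutting the same stage to 40000 iterations without lowering stepsize does not.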

In principle, every modification is now in place and training should work.

Training


Run the following from py-faster-rcnn-master:
./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc
A series of errors may appear at this point, for example:
AssertionError: num_images (2) must divide BATCH_SIZE (1)
This one traces back to py-faster-rcnn-master/lib/roi_data_layer/minibatch.py. For more on the minibatch code, see <https://blog.csdn.net/u010668907/article/details/51945917>.

def get_minibatch(roidb, num_classes):
    """Given a roidb, construct a minibatch sampled from it."""
    num_images = len(roidb)
    # Sample random scales to use for each image in this batch
    random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                                    size=num_images)
    # BATCH_SIZE must be an exact multiple of num_images
    assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
        'num_images ({}) must divide BATCH_SIZE ({})'. \
        format(num_images, cfg.TRAIN.BATCH_SIZE)

So we need to change __C.TRAIN.BATCH_SIZE in py-faster-rcnn-master/lib/fast_rcnn/config.py. It was originally 1; I changed it to 8.
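The assertion is a plain divisibility test, which is why 1 fails with two images per batch and 8 passes. A minimal stand-alone sketch of the same condition (the function name is made up for this example):

```python
def batch_size_is_valid(batch_size, num_images):
    """Mirror the check in get_minibatch: batch_size must be a multiple of num_images."""
    return batch_size % num_images == 0
```

Here batch_size_is_valid(1, 2) is False, reproducing the error above, and batch_size_is_valid(8, 2) is True.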




Training then runs. When it finishes, py-faster-rcnn-master/output/faster_rcnn_alt_opt/voc_2007_trainval/ contains ZF_faster_rcnn_final.caffemodel, which is the final model trained on our own dataset.




Testing



Copy the ZF_faster_rcnn_final.caffemodel from above into py-faster-rcnn-master/data/faster_rcnn_models, then modify py-faster-rcnn/tools/demo.py:

CLASSES = ('__background__',
           #'aeroplane', 'bicycle', 'bird', 'boat',
           #'bottle', 'bus', 'car', 'cat', 'chair',
           #'cow', 'diningtable', 'dog', 'horse',
           #'motorbike', 'person', 'pottedplant',
           #'sheep', 'sofa', 'train', 'tvmonitor'
           'cat', 'dog') # your own labels

def parse_args():
    """Parse input arguments."""
    parser = argparse.ArgumentParser(description='Faster R-CNN demo')
    parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
                        default=0, type=int)
    parser.add_argument('--cpu', dest='cpu_mode',
                        help='Use CPU mode (overrides --gpu)',
                        action='store_true')
    parser.add_argument('--net', dest='demo_net', help='Network to use [vgg16]',
                        choices=NETS.keys(), default='zf') # default model changed to zf

    args = parser.parse_args()

    return args

    # Warmup on a dummy image
    im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
    for i in xrange(2):
        _, _= im_detect(net, im)

    path = '/home/wlw/Downloads/py-faster-rcnn-master/data/demo' # folder of test images
    for filename in os.listdir(path):
        im_name = filename
        print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'
        print 'Demo for data/demo/{}'.format(im_name)
        demo(net, im_name)
        #plt.savefig("/home/wlw/Downloads/py-faster-rcnn-master/data/testfig/"+im_name)

    plt.show()

Then run in a terminal:
wlw@wlw:~/Downloads/py-faster-rcnn-master/tools$ python demo.py --cpu


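Because this was only a small practice run, my whole dataset had just 100 images, and I also trained for too few iterations, so the test images all came out blank. The training pipeline itself works, though. Next I will enlarge the dataset and retrain.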

Over!