Caffe Study Part 3: Training Faster R-CNN on Your Own Data

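This post assumes you have already finished the installation and can run demo.py.

If you have not installed it yet, or you want to work with the PASCAL VOC dataset, see my two other posts.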

https://www.cnblogs.com/elitphil/p/11527732.html

Caffe Study Part 2: Configuring py-faster-rcnn and running faster_rcnn_end2end with VGG_CNN_M_1024 (Ubuntu 16.04)

https://www.cnblogs.com/elitphil/p/11547429.html

Once you have worked through the two posts above, training Faster R-CNN on your own data becomes much easier.

Step 1: Prepare your own dataset

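(1). Your own images (photographed yourself or downloaded from the web) may have too high a resolution, which is bad for training, so first scale them down to roughly the size of the images in VOC.

In /py-faster-rcnn/data/VOCdevkit2007/VOC2007 (use the corresponding directory on your machine), create a new Python file (for example trans2voc_format.py).

Paste the following into it and run it to scale down your images proportionally: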

# coding=utf-8
from __future__ import print_function, division

import os
from PIL import Image

start_path = './JPEGImages/'   # The only thing to change: point this at your own image directory.
max_width = 333                # maximum image width
max_height = 500               # maximum image height

count = 0
for pic in os.listdir(start_path):
    path = os.path.join(start_path, pic)
    print(path)
    im = Image.open(path)
    w, h = im.size

    # Shrink proportionally so the image fits within max_width x max_height.
    # Images that already fit are left untouched.
    scale = min(1.0, max_width / w, max_height / h)
    if scale < 1.0:
        w_new = max(1, int(w * scale))
        h_new = max(1, int(h * scale))
        out = im.resize((w_new, h_new), Image.LANCZOS)
        # Save next to the original with a "_new" suffix, e.g. a.jpg -> a_new.jpg.
        name, ext = os.path.splitext(pic)
        out.save(os.path.join(start_path, name + '_new' + ext))
        count = count + 1
        print('Resized image: ' + pic)

print('END')
print('%d images were resized' % count)
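
To see what the proportional scaling does to a given image size without touching any files, the size calculation can be pulled out into a small function. This is only a sketch: fit_within is a name I made up for illustration, and it uses integer arithmetic so the result does not depend on floating-point rounding.

```python
def fit_within(w, h, max_w, max_h):
    """Return (new_w, new_h) scaled down proportionally to fit in max_w x max_h.

    Sizes that already fit are returned unchanged; nothing is ever enlarged.
    """
    if w <= max_w and h <= max_h:
        return w, h
    if w * max_h > h * max_w:
        # The width is the tighter constraint: pin it to max_w.
        return max_w, max(1, h * max_w // w)
    # Otherwise the height is the tighter constraint: pin it to max_h.
    return max(1, w * max_h // h), max_h


if __name__ == '__main__':
    # Hypothetical image sizes, checked against the 333 x 500 limits used above.
    for w, h in [(1000, 500), (300, 1000), (320, 240)]:
        print((w, h), '->', fit_within(w, h, 333, 500))
```

Whichever side exceeds its limit by the larger factor decides the scale, so the result always fits both limits and the aspect ratio is kept (up to rounding down to whole pixels).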



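(2). Now that the images are ready, rename them (strictly speaking, training also works without renaming).
    Again in /py-faster-rcnn/data/VOCdevkit2007/VOC2007 (use the corresponding directory on your machine), create a new Python file (for example pic_rename.py).
    Paste the following into it and run it to rename the images (with one hundred images, for example, they are renamed 000001 to 000100):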

# coding=utf-8
from __future__ import print_function

import os


class BatchRename(object):
    def __init__(self):
        self.path = './JPEGImages'    # Again, change this to your own image directory.

    def rename(self):
        filelist = os.listdir(self.path)
        total_num = len(filelist)
        start = 1                     # First image number; make sure it does not collide with the original VOC numbering.
        i = start
        for item in filelist:
            if item.endswith('.jpg'):
                src = os.path.join(os.path.abspath(self.path), item)
                # Zero-pad the number to six digits, e.g. 1 -> 000001.jpg.
                dst = os.path.join(os.path.abspath(self.path), '%06d.jpg' % i)
                try:
                    os.rename(src, dst)
                    print('converting %s to %s ...' % (src, dst))
                    i = i + 1
                except OSError:
                    continue
        print('total %d to rename & converted %d jpgs' % (total_num, i - start))


if __name__ == '__main__':
    demo = BatchRename()
    demo.rename()
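
One thing to watch out for: on Linux, os.rename silently replaces an existing destination file, so a bad starting number can overwrite images. Below is a rough dry-run sketch that only works on a list of file names and reports such collisions before anything is renamed. plan_renames and its parameters are hypothetical names of mine, and unlike the script above it sorts the names first so the numbering is reproducible.

```python
def plan_renames(filenames, start=1, ext='.jpg'):
    """Return a list of (old_name, new_name) pairs without touching the disk.

    Raises ValueError if a rename would land on a name that is still
    occupied by another file at that point in the sequence.
    """
    sources = sorted(f for f in filenames if f.endswith(ext))
    plan = [(src, '%06d%s' % (start + k, ext)) for k, src in enumerate(sources)]

    occupied = set(filenames)  # names present in the directory right now
    for src, dst in plan:
        if dst != src and dst in occupied:
            raise ValueError('%s would overwrite existing file %s' % (src, dst))
        # Simulate the rename: src disappears, dst appears.
        occupied.discard(src)
        occupied.add(dst)
    return plan


if __name__ == '__main__':
    # Hypothetical directory listing.
    for old, new in plan_renames(['dog.jpg', 'cat.jpg', 'notes.txt']):
        print(old, '->', new)
```

In practice you would pass it os.listdir('./JPEGImages') and only run the real renaming once the plan comes back without an error.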

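(3). Next, annotate the images by hand. I recommend the labelImg tool, which is simple and convenient.

Download: https://github.com/tzutalin/labelImg

It is very easy to use: set the directory where the xml files are saved, open your image directory, and annotate the images one by one.

(Figure borrowed from the second reference link.)

Annotate every image, so that a corresponding .xml file is generated for each one.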
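
Before moving on, it can help to confirm that the generated xml files contain the boxes you expect. The sketch below assumes labelImg was set to save in Pascal VOC XML format (object / name / bndbox with xmin, ymin, xmax, ymax). read_voc_boxes and the sample annotation are made up for illustration, not output copied from the tool.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) from VOC-style XML."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall('object'):
        name = obj.findtext('name')
        bndbox = obj.find('bndbox')
        coords = tuple(int(float(bndbox.findtext(tag)))
                       for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
        boxes.append((name,) + coords)
    return boxes


# Hypothetical annotation, trimmed to the fields read above.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>333</width><height>250</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>48</xmin><ymin>30</ymin><xmax>195</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

if __name__ == '__main__':
    print(read_voc_boxes(SAMPLE_XML))
```

For a real file, read its contents with open(path).read() and pass the text in. An empty list means the image was saved without any boxes.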

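Next, go to VOC2007 and delete (or back up) the original images and xml files, then put your own images and xml files in their place. The directories are: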

/home/py-faster-rcnn/data/VOCdevkit2007/VOC2007/JPEGImages
/home/py-faster-rcnn/data/VOCdevkit2007/VOC2007/Annotations

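We remove the originals because we do not need the other dataset. We only want to train on our own data, and this also makes training faster.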
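
Once your own files are in the two directories, every image should have exactly one xml file with the same base name. Here is a minimal sketch for checking that. unmatched is a hypothetical helper of mine, and it assumes the images end in .jpg and the annotations in .xml.

```python
import os


def unmatched(image_names, xml_names):
    """Return (images_without_xml, xmls_without_image) as sorted base names."""
    images = set(os.path.splitext(n)[0] for n in image_names if n.endswith('.jpg'))
    xmls = set(os.path.splitext(n)[0] for n in xml_names if n.endswith('.xml'))
    return sorted(images - xmls), sorted(xmls - images)


if __name__ == '__main__':
    # Run from the VOC2007 directory; skipped if the folders are not there.
    if os.path.isdir('JPEGImages') and os.path.isdir('Annotations'):
        no_xml, no_image = unmatched(os.listdir('JPEGImages'),
                                     os.listdir('Annotations'))
        print('images without xml:', no_xml)
        print('xml without image:', no_image)
```

Both lists should be empty before you generate the index files in the next step.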

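(4) With the images and annotations in place, generate the txt index files needed for training and testing. The program uses these indexes to find the images.

In /py-faster-rcnn/data/VOCdevkit2007/VOC2007 (use the corresponding directory on your machine), create a new Python file (for example xml2txt.py).

Paste the following into it and run it to generate the index files: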

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
import random

trainval_percent = 0.8           # fraction of all samples that go into trainval (the rest becomes test)
train_percent = 0.7              # fraction of trainval that goes into train (the rest becomes val)
xmlfilepath = 'Annotations'
txtsavepath = 'ImageSets/Main'   # directory where the index files are written (must already exist)
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'   # strip the ".xml" extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
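
To make the two percentages concrete: with 10 annotation files, 0.8 and 0.7 give int(10 * 0.8) = 8 names in trainval, int(8 * 0.7) = 5 of those in train, the remaining 3 in val, and the other 2 in test. The sketch below reproduces the same split logic on a plain list of names so the sizes can be checked without writing any files. split_names and the seed parameter are my own additions, the seed only makes the split repeatable.

```python
import random


def split_names(names, trainval_percent=0.8, train_percent=0.7, seed=0):
    """Split names into trainval/train/val/test the same way as the script above."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    names = sorted(names)
    tv = int(len(names) * trainval_percent)
    tr = int(tv * train_percent)
    trainval = rng.sample(names, tv)
    train = set(rng.sample(trainval, tr))
    trainval_set = set(trainval)
    return {
        'trainval': sorted(trainval_set),
        'train': sorted(train),
        'val': sorted(trainval_set - train),
        'test': sorted(set(names) - trainval_set),
    }


if __name__ == '__main__':
    # Hypothetical base names 000001 .. 000010.
    parts = split_names(['%06d' % k for k in range(1, 11)])
    for key in ('trainval', 'train', 'val', 'test'):
        print(key, len(parts[key]), parts[key])
```

train and val together make up trainval, and test never overlaps with it, which is what the four txt files are expected to satisfy.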


The generated index files are located here