How to train your custom object with Tensorflow Object Detection API

Recently I was assigned to work on object detection for BTS antennas using deep learning with Tensorflow. It was very challenging for me and gave me my first hands-on project with deep learning, so in this blog I’d like to take a tour and review what I’ve done during my internship.

First of all, Google provides an Object Detection API that already includes some models trained on the COCO dataset, which work well on the 90 commonly found objects in that dataset. It also provides tutorials that helped me create a custom object detector from datasets collected from a drone and the internet. This tutorial uses SSD as the base model for training on the datasets, and the result is used as the model for object detection.

Here I extended the API to train on a new object that is not part of the COCO dataset. In this case I chose the BTS antenna as the object for the training set. See gif below. So far, I have been impressed by the performance of the API. The steps highlighted here can be extended to any single or multiple object detector that you want to build.

Figure 2: An example of the custom objects that the model was trained on.

In the following steps I’ll take you through the process of training my own object. For the installation part you can follow the Object Detection API instructions or follow along inside this blog.

Installation

Dependencies

Tensorflow Object Detection API depends on the following libraries:

Protobuf 3.0.0
Python-tk
Pillow 1.0
lxml
tf Slim (which is included in the “tensorflow/models/research/” checkout)
Jupyter notebook
Matplotlib
Tensorflow
Cython
contextlib2
cocoapi

For detailed steps to install Tensorflow, follow the Tensorflow installation instructions. A typical user can install Tensorflow using one of the following commands:

# For CPU
pip install tensorflow
# For GPU
pip install tensorflow-gpu

The remaining libraries can be installed on Ubuntu 16.04 via apt-get:

sudo apt-get install protobuf-compiler python-pil python-lxml python-tk
pip install --user Cython
pip install --user contextlib2
pip install --user jupyter
pip install --user matplotlib

Alternatively, users can install dependencies using pip:

pip install --user Cython
pip install --user contextlib2
pip install --user pillow
pip install --user lxml
pip install --user jupyter
pip install --user matplotlib

Note that sometimes “sudo apt-get install protobuf-compiler” will install a Protobuf 3+ version for you, and some users have issues when using 3.5. If that is your case, it is suggested that you download and install Protobuf 3.0.0 (available here).
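To see which Protobuf compiler is actually on your PATH before compiling, a small check like the one below may help. It is only a sketch: it assumes that `protoc --version` prints a line such as `libprotoc 3.0.0`, and the helper names `parse_protoc_version` and `installed_protoc_version` are my own, not part of the API.

```python
import re
import shutil
import subprocess


def parse_protoc_version(output):
    """Turn text like 'libprotoc 3.0.0' into a tuple such as (3, 0, 0)."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def installed_protoc_version():
    """Return the version of the protoc found on PATH, or None if absent."""
    if shutil.which("protoc") is None:
        return None
    result = subprocess.run(
        ["protoc", "--version"], capture_output=True, text=True, check=False
    )
    return parse_protoc_version(result.stdout)
```

Calling `installed_protoc_version()` and comparing the result with `(3, 0, 0)` tells you whether you are on the version this blog was written against.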

COCO API installation

Download the cocoapi and copy the pycocotools subfolder to the tensorflow/models/research directory if you are interested in using COCO evaluation metrics. The default metrics are based on those used in Pascal VOC evaluation. To use the COCO object detection metrics add metrics_set: "coco_detection_metrics" to the eval_config message in the config file. To use the COCO instance segmentation metrics add metrics_set: "coco_mask_metrics" to the eval_config message in the config file.

git clone https://github.com/cocodataset/cocoapi.git
cd cocoapi/PythonAPI
make
cp -r pycocotools <path_to_tensorflow>/models/research/

Protobuf Compilation

The Tensorflow Object Detection API uses Protobufs to configure model and training parameters. Before the framework can be used, the Protobuf libraries must be compiled. This should be done by running the following command from the tensorflow/models/research/ directory:

# From tensorflow/models/research/
protoc object_detection/protos/*.proto --python_out=.

Add Libraries to PYTHONPATH

When running locally, the tensorflow/models/research/ and slim directories should be appended to PYTHONPATH. This can be done by running the following from tensorflow/models/research/:

# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

Note: This command needs to run from every new terminal you start. If you wish to avoid running this manually, you can add it as a new line to the end of your ~/.bashrc file, replacing `pwd` with the absolute path of tensorflow/models/research on your system.
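If you would rather not touch PYTHONPATH at all, the same two directories can be added from inside a Python script or notebook. This is a minimal sketch of that alternative, not something from the official instructions; the function name `add_research_paths` is my own.

```python
import os
import sys


def add_research_paths(research_dir):
    """Put research_dir and research_dir/slim on sys.path if missing."""
    research_dir = os.path.abspath(research_dir)
    added = []
    for path in (research_dir, os.path.join(research_dir, "slim")):
        if path not in sys.path:
            sys.path.append(path)
            added.append(path)
    return added
```

Call it once at the top of your script with the location of your own tensorflow/models/research checkout, before importing anything from object_detection.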

Testing the Installation

You can test that you have correctly installed the Tensorflow Object Detection
API by running the following command:

python object_detection/builders/model_builder_test.py

Collect data

In this step you can collect images by finding your target objects on the internet or by using pictures that you’ve captured yourself. For my samples I collected 400+ images overall, which is enough to use for a training set. Then you’ll need to create 3 folders: “Images”, “Train” and “Test”.

Images:

“Train” is a folder for the images that contain the objects you want to train on.
“Test” is a folder for the images used to test the model you’ve trained.
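Sorting the images into the two folders by hand gets tedious with 400+ files, so here is a rough sketch of how it could be scripted. The `test_fraction` default, the `seed`, and the function name `split_images` are assumptions of mine; pick whatever split suits your dataset.

```python
import os
import random
import shutil


def split_images(image_dir, test_fraction=0.2, seed=0):
    """Copy images from image_dir into Train/ and Test/ subfolders."""
    extensions = (".jpg", ".jpeg", ".png")
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(extensions)
    )
    # Shuffle with a fixed seed so the split is repeatable.
    random.Random(seed).shuffle(names)
    test_count = int(len(names) * test_fraction)
    groups = {"Test": names[:test_count], "Train": names[test_count:]}
    for folder, members in groups.items():
        target = os.path.join(image_dir, folder)
        os.makedirs(target, exist_ok=True)
        for name in members:
            shutil.copy(os.path.join(image_dir, name), os.path.join(target, name))
    return groups
```

The originals stay in place because the files are copied rather than moved, so you can rerun the split with a different fraction.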

Label an image

In this process I use labelImg as a tool to create image annotations in the Pascal VOC format. This tool lets you define a custom object by labeling it in your collected images. To do so, simply drag a box around the object and name it.

After you finish all the labeling, you will notice that a new XML file is created each time you save.
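To get a feel for what is inside those XML files, the sketch below reads one annotation with Python's standard library. The sample annotation, the class name `bts_antenna`, and the function name `read_voc_boxes` are made up for illustration; your own files will carry your own file names and labels.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout that labelImg writes.
SAMPLE_XML = """
<annotation>
  <filename>image1.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>bts_antenna</name>
    <bndbox>
      <xmin>100</xmin><ymin>150</ymin><xmax>300</xmax><ymax>450</ymax>
    </bndbox>
  </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return one dict per labeled box in a Pascal VOC annotation string."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "filename": root.find("filename").text,
            "width": int(size.find("width").text),
            "height": int(size.find("height").text),
            "class": obj.find("name").text,
            "xmin": int(bndbox.find("xmin").text),
            "ymin": int(bndbox.find("ymin").text),
            "xmax": int(bndbox.find("xmax").text),
            "ymax": int(bndbox.find("ymax").text),
        })
    return boxes


boxes = read_voc_boxes(SAMPLE_XML)
```

Each dict holds the image size, the label, and the box corners in pixels, which is the information the later conversion steps work from.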

Create your TF Record for Dataset

The Tensorflow API wants the datasets to be in the TFRecord file format. This is probably the trickiest part. However, Tensorflow has provided a couple of handy scripts to get you started: “xml_to_csv.py” and “tf_record.py”. I was able to use tf_record.py with minimal edits since labelImg already creates annotations in the correct format. I also like that this script randomly takes 30% of the data and creates a validation TFRecord file.
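One detail that is easy to miss when editing the TFRecord script: as far as I know, the dataset scripts that ship with the API store each box corner as a fraction of the image size rather than in pixels. The sketch below shows only that arithmetic; `normalize_box` is a hypothetical helper of mine, not a function from the scripts.

```python
def normalize_box(xmin, ymin, xmax, ymax, width, height):
    """Scale pixel box corners to fractions of the image size."""
    if not (0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height):
        raise ValueError("box lies outside the image or is empty")
    return (xmin / width, ymin / height, xmax / width, ymax / height)


# A 200x300 pixel box inside an 800x600 image.
normalized = normalize_box(100, 150, 300, 450, 800, 600)
```

The check at the top is there because a box drawn past the image border is a common reason for a record to be rejected later.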

After you run the two Python scripts you will have:

test_labels.csv
train.xml
test.record
train.record

You will also need to create a label map (.pbtxt) file that is used to convert each label name to a numeric id.
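The label map is a short text file with one `item` entry per class, and ids start at 1 because 0 is reserved for the background. The sketch below generates that text from a list of names; the class name `bts_antenna` and the function name `build_label_map` are placeholders of mine.

```python
def build_label_map(class_names):
    """Render class names as label map text, numbering ids from 1."""
    items = []
    for index, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (index, name))
    return "\n".join(items)


label_map_text = build_label_map(["bts_antenna"])
```

Write `label_map_text` to your .pbtxt file and make sure the names match the labels you typed in labelImg exactly.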

I have included the label_map.pbtxt file and the create_pet_tf_records.py file on my github. In case you are getting stuck anywhere, I highly recommend the Racoon Detector walkthrough provided by Tensorflow.

Creating a model config

Once the TFRecord datasets are created, you first need to decide whether you will use an existing model and fine-tune it or build one from scratch. I highly recommend using an existing model, since most of the features learnt by CNNs are object agnostic and fine-tuning an existing model is usually an easy and accurate process. Please note that if you do decide to build from scratch you will need much more than 150 images and training will take days. The API provides 5 different models that offer a trade-off between speed of execution and accuracy in placing bounding boxes. See table below:

5. Training the model

After the long preparation of the training data, it’s time for the real training. Before running the training command, you need to edit the model config below: change the checkpoint name/path, num_classes to 1, num_examples to 12, and label_map_path: “training/object-detect.pbtxt”.

# SSD with Mobilenet v1, configured for the BTS Antenna dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "${YOUR_GCS_BUCKET}" to find the fields that
# should be configured.
 
model {
  ssd {
    num_classes: 1
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 1
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
          batch_norm {
            train: true,
            scale: true,
            center: true,
            decay: 0.9997,
            epsilon: 0.001,
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_mobilenet_v1'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid {
          anchorwise_output: true
        }
      }
      localization_loss {
        weighted_smooth_l1 {
          anchorwise_output: true
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}
 
train_config: {
  batch_size: 10
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco_11_06_2017/model.ckpt"
  from_detection_checkpoint: true
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}
 
train_input_reader: {
  tf_record_input_reader {
    input_path: "data/train.record"
  }
  label_map_path: "data/object-detection.pbtxt"
}
 
eval_config: {
  num_examples: 40
}
 
eval_input_reader: {
  tf_record_input_reader {
    input_path: "data/test.record"
  }
  label_map_path: "training/object-detection.pbtxt"
  shuffle: false
  num_readers: 1
}
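If you start from one of the sample configs, fields such as num_classes and num_examples can also be changed with a few lines of Python instead of by hand. This is only a text-level sketch: `set_config_value` is a hypothetical helper, the sample values 90 and 8000 are placeholders, and a key that appears more than once (label_map_path does) is replaced everywhere.

```python
import re


def set_config_value(config_text, key, value):
    """Replace the value of every `key: ...` line in a pipeline config string."""
    pattern = re.compile(
        r"^([ \t]*%s:[ \t]*).*$" % re.escape(key), re.MULTILINE
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, config_text)
    if count == 0:
        raise KeyError(key)
    return new_text


# Placeholder fragment standing in for a sample config file.
sample = (
    "model {\n  ssd {\n    num_classes: 90\n  }\n}\n"
    "eval_config: {\n  num_examples: 8000\n}\n"
)
patched = set_config_value(sample, "num_classes", "1")
patched = set_config_value(patched, "num_examples", "12")
```

For string fields, pass the value with its quotes included, for example `'"data/train.record"'`.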

Now you’re ready to train on your data.

From the models/research/object_detection directory, which contains train.py, start training with this command:
python3 train.py --logtostderr --train_dir=training/ \
    --pipeline_config_path=training/ssd_mobilenet_v1_coco.config

If there are no errors, you should see output like this:

INFO:tensorflow:global step 11788: loss = 0.6717 (0.398 sec/step)
INFO:tensorflow:global step 11789: loss = 0.5310 (0.436 sec/step)
INFO:tensorflow:global step 11790: loss = 0.6614 (0.405 sec/step)
INFO:tensorflow:global step 11791: loss = 0.7758 (0.460 sec/step)
INFO:tensorflow:global step 11792: loss = 0.7164 (0.378 sec/step)
INFO:tensorflow:global step 11793: loss = 0.8096 (0.393 sec/step)
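If you save this output to a file, the step number and loss can be pulled out with a regular expression, for example to plot the loss yourself. The sketch below assumes the log lines look exactly like the ones above; `parse_training_log` is a name I made up.

```python
import re

LOG_PATTERN = re.compile(
    r"global step (\d+): loss = ([\d.]+) \(([\d.]+) sec/step\)"
)


def parse_training_log(lines):
    """Extract (step, loss, seconds_per_step) from train.py log lines."""
    records = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match:
            records.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
    return records


sample_lines = [
    "INFO:tensorflow:global step 11788: loss = 0.6717 (0.398 sec/step)",
    "INFO:tensorflow:global step 11789: loss = 0.5310 (0.436 sec/step)",
]
records = parse_training_log(sample_lines)
```

Lines that do not match the pattern are skipped, so warnings mixed into the log do no harm.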

You can also run TensorBoard to see a summary of your training. From models/object_detection, via terminal, start TensorBoard with:

tensorboard --logdir='training'

This runs on 127.0.0.1:6006 in your browser.

In models/object_detection/training you will see the new training output that has been created from our datasets.

6. Testing the model

Lastly, after a long training run on your data, you need to export the inference graph. In the models/research/object_detection directory, there is a script that does this for us: export_inference_graph.py

To run this, you just need to pass in your checkpoint and your pipeline config, then wherever you want the inference graph to be placed. For example:

python3 export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path training/ssd_mobilenet_v1_pets.config \
    --trained_checkpoint_prefix training/model.ckpt-10856 \
    --output_directory BTS_Antenna_inference_graph
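The step number in --trained_checkpoint_prefix has to match a checkpoint that really exists in your training directory. The sketch below looks for the highest one, assuming the checkpoints are saved as model.ckpt-<step>.index next to their .meta and .data files; `latest_checkpoint_prefix` is a hypothetical helper of mine.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the 'model.ckpt-N' prefix with the largest step N, or None."""
    steps = []
    for name in os.listdir(train_dir):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return os.path.join(train_dir, "model.ckpt-%d" % max(steps))
```

Calling `latest_checkpoint_prefix("training")` gives you the value to pass on the command line.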

Now we’ll go to models/research/object_detection/tutorial, open Jupyter notebook, and select object_detection_tutorial.ipynb to test out our trained model.

Here is what you need to change in the following code in order to test your data:

import os

# Name of the directory that holds the exported inference graph.
MODEL_NAME = 'BTS_Antenna_inference_graph'

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = MODEL_NAME + '/frozen_inference_graph.pb'

# Path to the label map, which is used to add the correct label to each box.
PATH_TO_LABELS = os.path.join('training', 'object-detection.pbtxt')

NUM_CLASSES = 1

Finally, in the Detection section, change the TEST_IMAGE_PATHS var to:

# Adjust range(3, 8) to match the numbers in your test image file names.
TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.jpg'.format(i)) for i in range(3, 8)]
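If your test images are not numbered in a neat range, you could collect them by file pattern instead. This is an alternative sketch rather than what the notebook does; `list_test_images` and its `pattern` default are my own choices.

```python
import glob
import os


def list_test_images(image_dir, pattern="*.jpg"):
    """Return the sorted paths of the images in image_dir matching pattern."""
    return sorted(glob.glob(os.path.join(image_dir, pattern)))
```

In the notebook you would then write `TEST_IMAGE_PATHS = list_test_images(PATH_TO_TEST_IMAGES_DIR)`.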