Object Detection
Object detection with the YOLOv5 model.
This document explains the arguments of each script.
A tutorial for fine-tuning a pretrained model on the COCO128 dataset is available in the `tutorial` folder as `tutorial/README.md`.
An IPython notebook version of the tutorial is also provided in the `tutorial` folder as `tutorial/tutorial.ipynb`. You can upload and run this notebook on Google Colab.
# Prerequisites
- Python 3.8 or above
# Installation
```bash
$ pip install -U pip
$ pip install -r requirements.txt
```
# Dataset & Preparation
The image data, annotations and dataset.yaml are required.
## MS COCO
Our training script accepts the MS COCO dataset. You can download the dataset from the following link:
- Download [2017 MS COCO Dataset](https://cocodataset.org/#download)
## Custom Datasets
You can also train the model on a custom dataset.
### Annotations Format
After using a tool like [CVAT](https://github.com/openvinotoolkit/cvat), [makesense.ai](https://www.makesense.ai) or [Labelbox](https://labelbox.com) to label your images, export your labels to YOLO format, with one `*.txt` file per image (if no objects in image, no `*.txt` file is required). The `*.txt` file specifications are:
- One row per object
- Each row is `class x_center y_center width height` format.
- Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide `x_center` and `width` by the image width, and `y_center` and `height` by the image height.
- Class numbers are zero-indexed (start from 0).
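The pixel-to-normalized conversion described in the list above can be sketched as follows. This is a minimal illustration and not part of the repository: the helper name `pixel_box_to_yolo` and its corner-based input (`x_min, y_min, x_max, y_max` in pixels) are assumptions.

```python
def pixel_box_to_yolo(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space corner box to one YOLO label row (hypothetical helper)."""
    # Centers and sizes are divided by the image width (x) or height (y).
    x_center = (x_min + x_max) / 2.0 / img_w
    y_center = (y_min + y_max) / 2.0 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 320x240 box centered in a 640x480 image, class 0.
    print(pixel_box_to_yolo(0, 160, 120, 480, 360, 640, 480))
    # prints: 0 0.500000 0.500000 0.500000 0.500000
```

Each returned string is one row of the `*.txt` label file for that image.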
For example, the label file for an image that contains 2 persons (class 0) and a tie (class 27) has three rows: two starting with `0` and one starting with `27`.
### Directory Organization
Your own datasets are expected to have the following structure. We assume `/dataset` is next to the `/yolov5` directory. YOLOv5 locates labels automatically for each image by replacing the last instance of `/images/` in each image path with `/labels/`.
```bash
- Dataset name
  - images
    - train
      - img001.jpg
      - ...
    - val
      - img002.jpg
      - ...
  - labels
    - train
      - img001.txt
      - ...
    - val
      - img002.txt
      - ...
- yolov5
- generate_npy
- exporting
```
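The label lookup rule described above (replace the last `/images/` in the image path with `/labels/`) can be sketched like this. The function name `image_path_to_label_path` is hypothetical, and the sketch assumes POSIX-style paths and a `.txt` label extension; it is not the repository's own implementation.

```python
import posixpath


def image_path_to_label_path(image_path):
    """Map an image path to its label path (hypothetical helper).

    Only the LAST '/images/' component is replaced, so a dataset stored
    under a parent folder that is itself called 'images' still resolves.
    """
    head, sep, tail = image_path.rpartition("/images/")
    if not sep:
        raise ValueError(f"no '/images/' component in {image_path!r}")
    stem = posixpath.splitext(tail)[0]  # drop the image extension
    return f"{head}/labels/{stem}.txt"


if __name__ == "__main__":
    print(image_path_to_label_path("/dataset/images/train/img001.jpg"))
    # prints: /dataset/labels/train/img001.txt
```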
### dataset.yaml
The yaml file for the COCO dataset is provided as `./data/coco.yaml`. For a custom dataset, you need to prepare your own yaml file and save it under `./data/`. The yaml file is expected to have the following format:
```bash
# train and val datasets (image directory or *.txt file with image paths)
train: ./datasets/images/train/
val: ./datasets/images/val/
# number of classes
nc: number of classes
# class names
names: list of class names
```
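As an illustration of the format above, the sketch below renders a `dataset.yaml` from a list of class names so that `nc` always matches the length of `names`. The helper `render_dataset_yaml` and the two example class names are assumptions made for this example; the repository does not ship such a function.

```python
def render_dataset_yaml(train_dir, val_dir, names):
    """Return dataset.yaml text with nc derived from the class-name list (hypothetical helper)."""
    if not names:
        raise ValueError("at least one class name is required")
    # YAML single-quoted strings escape a quote character by doubling it.
    quoted = ", ".join("'" + n.replace("'", "''") + "'" for n in names)
    lines = [
        "# train and val datasets (image directory or *.txt file with image paths)",
        f"train: {train_dir}",
        f"val: {val_dir}",
        "# number of classes",
        f"nc: {len(names)}",
        "# class names",
        f"names: [{quoted}]",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dataset_yaml("./datasets/images/train/",
                              "./datasets/images/val/",
                              ["person", "tie"]))
```

Writing the returned text to a file under `./data/` gives a yaml file in the expected layout.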
# Train
For training on MS COCO, execute commands in the folder `yolov5`:
```shell
CUDA_VISIBLE_DEVICES='0' python train.py --data coco.yaml --cfg yolov5s-noupsample.yaml --weights '' --batch-size 64
```
`CUDA_VISIBLE_DEVICES='0'` specifies the GPU IDs to use.
`--data` the dataset yaml file. (located under `./data/`)
`--cfg` the model configuration. (located under `./model/`) (`yolov5s-noupsample.yaml` for 520, `yolov5s.yaml` for 720)
`--hyp` the path to the hyperparameters file. (located under `./data/`)
`--weights` the path to the pretrained model weights. ('' to train from scratch)
`--epochs` the number of epochs to train. (Default: 300)
`--batch-size` batch size. (Default: 16)
`--img-size` the input size of the model. (Default: (640, 640))
`--workers` the maximum number of dataloader workers. (Default: 8)
By default, the trained models are saved under `./runs/train/`.
## Generating .npy for different model input
You can generate `.npy` files for different model input sizes with `yolov5_generate_npy.py`. Run the following command in the `generate_npy` folder:
```shell
python yolov5_generate_npy.py --input-h 640 --input-w 640
```
`--input-h` the input height. (Default: 640)
`--input-w` the input width. (Default: 640)
This produces the `*.npy` files.
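The names `grid20`, `grid40`, and `grid80` used in the paths yaml file in the next section are consistent with a 640x640 input divided by the usual YOLOv5 detection strides of 32, 16, and 8. The sketch below only computes those grid dimensions for a given input size. It assumes these three strides and does not reproduce `yolov5_generate_npy.py`; the function name `grid_sizes` is hypothetical.

```python
def grid_sizes(input_h, input_w, strides=(8, 16, 32)):
    """Return {stride: (rows, cols)} for each detection scale (hypothetical helper).

    The strides are an assumption based on the standard YOLOv5 heads.
    """
    sizes = {}
    for stride in strides:
        if input_h % stride or input_w % stride:
            raise ValueError(
                f"input {input_h}x{input_w} is not divisible by stride {stride}"
            )
        sizes[stride] = (input_h // stride, input_w // stride)
    return sizes


if __name__ == "__main__":
    for stride, (rows, cols) in grid_sizes(640, 640).items():
        print(f"stride {stride}: {rows}x{cols} grid")
```

Under this assumption, changing `--input-h` or `--input-w` changes the grid dimensions proportionally, which is why the `.npy` files need to be regenerated for each input size.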
# Configure the paths yaml file
You are expected to create a yaml file which stores all the paths related to the trained models. This yaml file will be used in the following sections. You can check and modify the `pretrained_paths_520.yaml` and `pretrained_paths_720.yaml` under `/yolov5/data/`. The yaml file is expected to contain the following information:
```shell
grid_dir: path_to_npy_file_directory
grid20_path: path_to_grid20_npy_file
grid40_path: path_to_grid40_npy_file
grid80_path: path_to_grid80_npy_file
yolov5_dir: path_to_yolov5_directory
path: path_to_pretrained_yolov5_model_weights_pt_file
yaml_path: path_to_the_model_configuration_yaml_file
pt_path: path_to_export_yolov5_model_weights_kneron_supported_file
onnx_export_file: path_to_export_yolov5_onnx_model_file
input_w: model_input_width
input_h: model_input_height
nc: number_of_classes
names: list_of_class_names
```
# Save and Convert to ONNX
This section explains how to save the trained model in a format supported by PyTorch 1.4 and how to convert it to ONNX.
## Exporting ONNX model in the PyTorch 1.7 environment
You can convert the model to ONNX with `yolov5_export.py`. Run the following command in the `yolov5` folder:
```shell
python ../exporting/yolov5_export.py --data path_to_pretrained_path_yaml_file
```
`--data` the path to pretrained model paths yaml file (Default: ../yolov5/data/pretrained_paths_520.yaml)
This produces the ONNX model.
## Converting onnx by tool chain
Pull the latest [ONNX converter](https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts) from GitHub, and refer to its documentation there for the latest instructions on converting ONNX models. Run the following commands in the `ONNX_Convertor/optimizer_scripts` folder:
```shell
python -m onnxsim input_onnx_model output_onnx_model
python pytorch2onnx.py input.pth output.onnx
```
This produces the converted ONNX model.
# Inference
Before running inference, we assume that the model has been converted to ONNX as described in the previous section (this applies even if you only run inference with the pth model). Create a yaml file containing the path information. To run inference on a single image, run the following command in the `yolov5` folder:
```shell
python inference.py --data path_to_pretrained_path_yaml_file --img-path path_to_image --save-path path_to_saved_image
```
`--img-path` the path to the image.
`--save-path` the path to draw and save the image with bbox.
`--data` the path to pretrained model paths yaml file. (Default: data/pretrained_paths_520.yaml)
`--conf_thres` the score threshold of bounding boxes. (Default: 0.3)
`--iou_thres` the iou threshold for NMS. (Default: 0.3)
`--onnx` whether to run inference with the ONNX model.
You can find the preprocessing and postprocessing code under the folder `exporting/yolov5/`.
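To show how the score threshold and the IoU threshold interact, here is a minimal sketch of score filtering followed by greedy non-maximum suppression. It is class-agnostic and written in plain Python for readability; it is not the postprocessing code shipped in `exporting/yolov5/`. The box layout `(x1, y1, x2, y2, score)` and the function names are assumptions for this example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf_thres=0.3, iou_thres=0.3):
    """Greedy NMS over (x1, y1, x2, y2, score) tuples, highest score first."""
    # Step 1: drop boxes below the score threshold.
    candidates = sorted(
        (d for d in detections if d[4] >= conf_thres),
        key=lambda d: d[4],
        reverse=True,
    )
    # Step 2: keep a box only if it does not overlap a kept box too much.
    kept = []
    for det in candidates:
        if all(iou(det, k) <= iou_thres for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    boxes = [
        (0, 0, 10, 10, 0.9),    # kept: highest score
        (1, 1, 11, 11, 0.8),    # suppressed: IoU with the first box is about 0.68
        (20, 20, 30, 30, 0.7),  # kept: no overlap
        (0, 0, 5, 5, 0.1),      # dropped: below the score threshold
    ]
    print(nms(boxes))
```

Raising the IoU threshold keeps more overlapping boxes; raising the score threshold discards more low-confidence boxes before suppression starts.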
# Evaluation
## Evaluation Metric
We will use mean Average Precision (mAP) for evaluation. You can find the script for computing mAP in `test.py`.
`mAP`: mAP is the average of Average Precision (AP). AP summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight:

$$AP = \sum_n (R_n - R_{n-1}) P_n$$

where $P_n$ and $R_n$ are the precision and recall at the nth threshold. The mAP compares the ground-truth bounding box to the detected box and returns a score. The higher the score, the more accurate the model is in its detections.
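The weighted-mean definition of AP above can be sketched in a few lines. This is a direct transcription of that definition, with the recall before the first threshold taken as 0; `test.py` may use a different interpolation of the precision-recall curve, so treat this only as an illustration. The function names are hypothetical.

```python
def average_precision(precisions, recalls):
    """AP = sum over n of (R_n - R_{n-1}) * P_n, with the initial recall set to 0."""
    if len(precisions) != len(recalls):
        raise ValueError("precisions and recalls must have the same length")
    ap = 0.0
    prev_recall = 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p  # weight = increase in recall
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the mean of the per-class AP values."""
    if not per_class_ap:
        raise ValueError("need at least one class")
    return sum(per_class_ap) / len(per_class_ap)


if __name__ == "__main__":
    ap = average_precision([1.0, 0.5, 0.75], [0.25, 0.5, 1.0])
    print(ap)  # 0.25*1.0 + 0.25*0.5 + 0.5*0.75 = 0.75
    print(mean_average_precision([ap, 0.25]))  # (0.75 + 0.25) / 2 = 0.5
```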
## Evaluation on a Dataset
To evaluate the trained model on a dataset:
```shell
python test.py --weights path_to_pth_model_weight --data path_to_data_yaml_file
```
`--weights` The path to pretrained model weight. (Default: best.pt)
`--data` The path to data yaml file. (Default: data/coco128.yaml)
`--img-size` Input shape of the model (Default: (640, 640))
`--conf-thres` Object confidence threshold. (Default: 0.001)
`--device` CUDA device, e.g. 0 or 0,1,2,3 or cpu. (Default: cpu)
`--verbose` Whether to report mAP by class.
## End-to-End Evaluation
If you would like to perform an end-to-end test with an image dataset, you can use `inference_e2e.py` under the directory `yolov5` to obtain the prediction results.
You have to prepare an initial parameter yaml file for the inference runner. You may check `utils/init_params.yaml` for the format.
```shell
python inference_e2e.py --img-path path_to_dataset_folder --params path_to_init_params_file --save-path path_to_save_json_file
```
`--img-path` Path to the dataset directory
`--params` Path to initial parameter yaml file for the inference runner
`--save-path` Path to save the prediction to a json file
`--gpu` GPU id (-1 if cpu) (Default: -1)
The predictions will be saved into a json file that has the following structure:
```bash
[
  {'img_path': image_path_1,
   'bbox': [[l,t,w,h,score,class_id], [l,t,w,h,score,class_id]]
  },
  {'img_path': image_path_2,
   'bbox': [[l,t,w,h,score,class_id], [l,t,w,h,score,class_id]]
  },
...
]
```
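A saved prediction file with this structure could be post-processed as sketched below. The sketch assumes the file is valid JSON with the `img_path` and `bbox` keys shown above, and that `l, t, w, h` are the left, top, width, and height of each box. The function names and the `min_score` parameter are hypothetical and not part of `inference_e2e.py`.

```python
import json


def ltwh_to_xyxy(bbox):
    """Convert [l, t, w, h, score, class_id] to [x1, y1, x2, y2, score, class_id]."""
    left, top, width, height, score, class_id = bbox
    return [left, top, left + width, top + height, score, int(class_id)]


def parse_predictions(json_text, min_score=0.0):
    """Return {img_path: [converted boxes]} keeping boxes with score >= min_score."""
    results = {}
    for entry in json.loads(json_text):
        boxes = [ltwh_to_xyxy(b) for b in entry["bbox"] if b[4] >= min_score]
        results[entry["img_path"]] = boxes
    return results


if __name__ == "__main__":
    # Inline sample standing in for the contents of the saved json file.
    sample = json.dumps([
        {"img_path": "a.jpg",
         "bbox": [[10, 20, 30, 40, 0.9, 0], [0, 0, 5, 5, 0.1, 27]]},
    ])
    print(parse_predictions(sample, min_score=0.5))
    # prints: {'a.jpg': [[10, 20, 40, 60, 0.9, 0]]}
```

To process a real file, read its contents with `open(path).read()` and pass the text to `parse_predictions`.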
# Model
Backbone | Input Size | FPS on 520 | FPS on 720 | Model Size | mAP
--- | --- |:---:|:---:|:---:|:---:
[YOLOv5s (no upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s-noupsample) | 640x640 | 4.91429 | - | 13.1M | 40.4%
[YOLOv5s (with upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s) | 640x640 | - | 24.4114 | 14.6M | 50.9%
[YOLOv5s (no upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s-noupsample) is the YOLOv5s model backbone without the upsampling operation, because the 520 hardware does not support upsampling.