# Object Detection

Object detection with the YOLOv5 model. This document explains the arguments of each script. A tutorial for fine-tuning a pretrained model on the COCO128 dataset is provided under the `tutorial` folder as `tutorial/README.md`. An IPython notebook version is also available as `tutorial/tutorial.ipynb`; you may upload and run it on Google Colab.

# Prerequisites

- Python 3.8 or above

# Installation

```bash
$ pip install -U pip
$ pip install -r requirements.txt
```

# Dataset & Preparation

The image data, annotations and `dataset.yaml` are required.

## MS COCO

Our training script accepts the MS COCO dataset. You may download it via the following link:

- Download [2017 MS COCO Dataset](https://cocodataset.org/#download)

## Custom Datasets

You can also train the model on a custom dataset.

### Annotations Format

After using a tool like [CVAT](https://github.com/openvinotoolkit/cvat), [makesense.ai](https://www.makesense.ai) or [Labelbox](https://labelbox.com) to label your images, export your labels to YOLO format, with one `*.txt` file per image (if an image contains no objects, no `*.txt` file is required). The `*.txt` file specifications are:

- One row per object.
- Each row is in `class x_center y_center width height` format.
- Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide `x_center` and `width` by the image width, and `y_center` and `height` by the image height.
- Class numbers are zero-indexed (start from 0).
For example, a label file for an image containing 2 persons (class 0) and a tie (class 27) would contain three rows, one per object.
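The conversion from pixel coordinates to a normalized YOLO label row can be sketched as follows (the function name and the sample values are illustrative, not part of this repository):

```python
def to_yolo_row(class_id, left, top, box_w, box_h, img_w, img_h):
    """Convert a pixel-space box (left, top, width, height) into a
    normalized YOLO label row: class x_center y_center width height."""
    x_center = (left + box_w / 2) / img_w
    y_center = (top + box_h / 2) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {box_w / img_w:.6f} {box_h / img_h:.6f}"

# A person box (class 0) of 100x200 px at (50, 40) in a 640x480 image:
print(to_yolo_row(0, 50, 40, 100, 200, 640, 480))
# -> 0 0.156250 0.291667 0.156250 0.416667
```

Writing one such row per object into `imgname.txt` produces a label file in the format described above.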
### Directory Organization

Your own datasets are expected to have the following structure. We assume `/dataset` is next to the `/yolov5` directory. YOLOv5 locates labels automatically for each image by replacing the last instance of `/images/` in each image path with `/labels/`.

```bash
- Dataset name
  - images
    - train
      - img001.jpg
      - ...
    - val
      - img002.jpg
      - ...
  - labels
    - train
      - img001.txt
      - ...
    - val
      - img002.txt
      - ...
- yolov5
- generate_npy
- exporting
```

### dataset.yaml

The yaml file for the COCO dataset has been prepared in `./data/coco.yaml`. For a custom dataset, you need to prepare the yaml file and save it under `./data/`. The yaml file is expected to have the following format:

```yaml
# train and val datasets (image directory or *.txt file with image paths)
train: ./datasets/images/train/
val: ./datasets/images/val/

# number of classes
nc: number of classes

# class names
names: list of class names
```

# Train

For training on MS COCO, execute the following in the `yolov5` folder:

```shell
CUDA_VISIBLE_DEVICES='0' python train.py --data coco.yaml --cfg yolov5s-noupsample.yaml --weights '' --batch-size 64
```

- `CUDA_VISIBLE_DEVICES='0'` indicates the GPU ids.
- `--data` the dataset yaml file (located under `./data/`).
- `--cfg` the model configuration (located under `./model/`; `yolov5s-noupsample.yaml` for 520, `yolov5s.yaml` for 720).
- `--hyp` the path to the hyperparameters file (located under `./data/`).
- `--weights` the path to pretrained model weights (`''` to train from scratch).
- `--epochs` the number of epochs to train. (Default: 300)
- `--batch-size` batch size. (Default: 16)
- `--img-size` the input size of the model. (Default: (640, 640))
- `--workers` the maximum number of dataloader workers. (Default: 8)

By default, the trained models are saved under `./runs/train/`.

## Generating .npy for different model input

We can generate `.npy` files for different model inputs using `yolov5_generate_npy.py`.
Execute the following in the `generate_npy` folder:

```shell
python yolov5_generate_npy.py --input-h 640 --input-w 640
```

- `--input-h` the input height. (Default: 640)
- `--input-w` the input width. (Default: 640)

This produces the `*.npy` files.

# Configure the paths yaml file

You are expected to create a yaml file which stores all the paths related to the trained models. This yaml file will be used in the following sections. You can check and modify `pretrained_paths_520.yaml` and `pretrained_paths_720.yaml` under `/yolov5/data/`. The yaml file is expected to contain the following information:

```yaml
grid_dir: path_to_npy_file_directory
grid20_path: path_to_grid20_npy_file
grid40_path: path_to_grid40_npy_file
grid80_path: path_to_grid80_npy_file
yolov5_dir: path_to_yolov5_directory
path: path_to_pretrained_yolov5_model_weights_pt_file
yaml_path: path_to_the_model_configuration_yaml_file
pt_path: path_to_export_yolov5_model_weights_kneron_supported_file
onnx_export_file: path_to_export_yolov5_onnx_model_file
input_w: model_input_width
input_h: model_input_height
nc: number_of_classes
names: list_of_class_names
```

# Save and Convert to ONNX

This section introduces how to save the trained model in a PyTorch 1.4 supported format and convert it to ONNX.

## Exporting ONNX model in the PyTorch 1.7 environment

We can convert the model to ONNX using `yolov5_export.py`. Execute the following in the `yolov5` folder:

```shell
python ../exporting/yolov5_export.py --data path_to_pretrained_path_yaml_file
```

- `--data` the path to the pretrained model paths yaml file. (Default: ../yolov5/data/pretrained_paths_520.yaml)

This produces the ONNX model.

## Converting onnx by toolchain

Pull the latest [ONNX converter](https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts) from GitHub. You may read the latest documentation on GitHub for converting the ONNX model.
Execute the following in the `ONNX_Convertor/optimizer_scripts` folder (reference: https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts):

```shell
python -m onnxsim input_onnx_model output_onnx_model
python pytorch2onnx.py input.pth output.onnx
```

This produces the converted ONNX model.

# Inference

Before model inference, we assume that the model has been converted to an ONNX model as in the previous section (even if you only run inference with the `.pth` model). Create a yaml file containing the path information. For model inference on a single image, execute the following in the `yolov5` folder:

```shell
python inference.py --data path_to_pretrained_path_yaml_file --img-path path_to_image --save-path path_to_saved_image
```

- `--img-path` the path to the image.
- `--save-path` the path to draw and save the image with bounding boxes.
- `--data` the path to the pretrained model paths yaml file. (Default: data/pretrained_paths_520.yaml)
- `--conf_thres` the score threshold of bounding boxes. (Default: 0.3)
- `--iou_thres` the IoU threshold for NMS. (Default: 0.3)
- `--onnx` whether to run ONNX model inference.

You can find the preprocessing and postprocessing code under the folder `exporting/yolov5/`.

# Evaluation

## Evaluation Metric

We use mean Average Precision (mAP) for evaluation. You can find the script for computing mAP in `test.py`.

`mAP`: mAP is the average of Average Precision (AP) over classes. AP summarizes a precision-recall curve as the weighted mean of precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight:

`AP = Σ_n (R_n - R_{n-1}) P_n`

where `P_n` and `R_n` are the precision and recall at the nth threshold. The mAP compares the ground-truth bounding boxes to the detected boxes and returns a score. The higher the score, the more accurate the model's detections.

## Evaluation on a Dataset

To evaluate the trained model on a dataset:

```shell
python test.py --weights path_to_pth_model_weight --data path_to_data_yaml_file
```

- `--weights` The path to the pretrained model weights.
(Default: best.pt)
- `--data` The path to the data yaml file. (Default: data/coco128.yaml)
- `--img-size` Input shape of the model. (Default: (640, 640))
- `--conf-thres` Object confidence threshold. (Default: 0.001)
- `--device` CUDA device, i.e. 0 or 0,1,2,3 or cpu. (Default: cpu)
- `--verbose` Whether to report mAP by class.

## End-to-End Evaluation

If you would like to perform an end-to-end test with an image dataset, you can use `inference_e2e.py` under the `yolov5` directory to obtain the prediction results. You have to prepare an initial parameter yaml file for the inference runner. You may check `utils/init_params.yaml` for the format.

```shell
python inference_e2e.py --img-path path_to_dataset_folder --params path_to_init_params_file --save-path path_to_save_json_file
```

- `--img-path` Path to the dataset directory.
- `--params` Path to the initial parameter yaml file for the inference runner.
- `--save-path` Path to save the predictions to a json file.
- `--gpu` GPU id (-1 for cpu). (Default: -1)

The predictions will be saved into a json file with the following structure:

```bash
[
  {'img_path': image_path_1,
   'bbox': [[l,t,w,h,score,class_id], [l,t,w,h,score,class_id]]
  },
  {'img_path': image_path_2,
   'bbox': [[l,t,w,h,score,class_id], [l,t,w,h,score,class_id]]
  },
  ...
]
```

# Model

Backbone | Input Size | FPS on 520 | FPS on 720 | Model Size | mAP
--- | --- |:---:|:---:|:---:|:---:
[YOLOv5s (no upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s-noupsample) | 640x640 | 4.91429 | - | 13.1M | 40.4%
[YOLOv5s (with upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s) | 640x640 | - | 24.4114 | 14.6M | 50.9%

[YOLOv5s (no upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s-noupsample) is the yolov5s model backbone without the upsampling operation, since the 520 hardware does not support upsampling.
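For reference, the end-to-end prediction json described earlier can be loaded and filtered by score as in the following sketch (the function and file names are illustrative, not part of this repository; it assumes the predictions were saved as valid json):

```python
import json

def load_predictions(json_path, score_thres=0.3):
    """Load end-to-end prediction results and keep boxes above a score threshold.

    Each entry holds an image path and a list of [l, t, w, h, score, class_id] boxes.
    """
    with open(json_path) as f:
        results = json.load(f)
    filtered = []
    for entry in results:
        boxes = [box for box in entry["bbox"] if box[4] >= score_thres]
        filtered.append({"img_path": entry["img_path"], "bbox": boxes})
    return filtered
```

A snippet like this can serve as the starting point for computing dataset-level metrics from the saved predictions.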