
<h1 align="center"> Image Classification </h1>
Image classification is a fundamental task that assigns a specific label to an input image. This tutorial covers the basics of the task: we will walk through a concrete example of training an image classification model on our AI training platform, using a provided dataset of bees and ants. The platform supplies the training script for building a classification model.
# Prerequisites
First, install the required libraries. Python 3.6 or above is required; the remaining dependencies are listed in the `requirements.txt` file. Install them by running:
```
pip install -r requirements.txt
```
# Dataset & Preparation
Next, we need a dataset to train the model on.
## Custom Datasets
You can train the model on a custom dataset. Your own datasets are expected to have the following structure:
```shell
- Dataset name
-- train
--- Class1
--- Class2
-- val
--- Class1
--- Class2
```
## Example
Let's go through a toy example for preparing a custom dataset. Suppose we are going to classify bees and ants.
<div align="center">
<img src="./image_data/train/ants/0013035.jpg" width="33%" /> <img src="./image_data/train/bees/1092977343_cb42b38d62.jpg" width="33%" />
</div>
First, split the bee and ant images into a training set and a validation set (a ratio of 8:2 is recommended). Then move the images into different folders named after their classes. The dataset folder will then have the following structure:
```shell
- image data
-- train
--- ants
--- bees
-- val
--- ants
--- bees
```
Now, we have finished preparing the dataset.
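The split-and-move step above can be scripted. The following is a minimal sketch (the source layout `raw_images/<class>/*.jpg` and the helper name `split_dataset` are assumptions, not part of the platform):

```python
import random
import shutil
from pathlib import Path

def split_dataset(src_root, dst_root, val_ratio=0.2, seed=0):
    """Copy images from src_root/<class>/ into dst_root/{train,val}/<class>/."""
    random.seed(seed)
    for class_dir in Path(src_root).iterdir():
        if not class_dir.is_dir():
            continue
        images = sorted(class_dir.glob("*.jpg"))
        random.shuffle(images)
        n_val = int(len(images) * val_ratio)  # 8:2 split by default
        for split, files in (("val", images[:n_val]), ("train", images[n_val:])):
            out_dir = Path(dst_root) / split / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for img in files:
                shutil.copy(img, out_dir / img.name)

# Example: split_dataset("raw_images", "image_data", val_ratio=0.2)
```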
# Train
Following the example in the previous section, let's fine-tune a pretrained model on our custom dataset. The pretrained model used here is MobileNetV2. Download it from the [Model_Zoo](https://github.com/kneron/Model_Zoo/tree/main/classification/MobileNetV2) with:
```shell
wget https://raw.githubusercontent.com/kneron/Model_Zoo/main/classification/MobileNetV2/MobileNetV2.pth
```
Since our dataset is quite small, we freeze the backbone and fine-tune only the last layer. Following the instructions above, run:
```shell
python train.py --gpu -1 --freeze-backbone 1 --backbone mobilenetv2 --early-stop 1 --snapshot MobileNetV2.pth --snapshot-path snapshots/exp/ ./tutorial/image_data
```
The following training messages will be printed:
```shell
{'data_dir': './tutorial/image_data', 'model_name': 'model_ft', 'model_def_path': None, 'lr': 0.001, 'backbone': 'mobilenetv2', 'gpu': -1, 'epochs': 100, 'freeze_backbone': 1, 'batch_size': 64, 'snapshot': 'MobileNetV2.pth', 'snapshot_path': 'snapshots/exp/', 'optimizer': 'SGD', 'loss': 'cross_entropy', 'early_stop': 1, 'patience': 7}
Initializing Datasets and Dataloaders...
-------------Label mapping to Idx:--------------
{0: 'ants', 1: 'bees'}
------------------------------------------------
Params to learn:
model.classifier.1.weight
model.classifier.1.bias
Epoch 0/99
----------
train Loss: 0.7786 Acc: 0.4303
val Loss: 0.6739 Acc: 0.6056
Validation loss decreased (inf --> 0.673929). Saving model ...
...
```
When the validation loss stops decreasing for 7 consecutive epochs (the `patience` setting), early stopping is triggered and training terminates. The trained model is saved under the `./snapshots/exp` folder. In addition, the class-label-to-index mapping is printed and automatically saved to `./eval_utils/class_id.json`.
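The patience-based early-stopping behaviour can be sketched as follows. This is a simplified illustration, not the training script's actual code, and the class name `EarlyStopper` is hypothetical:

```python
class EarlyStopper:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best_loss = float("inf")
        self.counter = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss  # improvement: this is where the model is saved
            self.counter = 0
        else:
            self.counter += 1  # no improvement this epoch
        return self.counter >= self.patience
```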
# Converting to ONNX
You may check the [Toolchain manual](http://doc.kneron.com/docs/#toolchain/manual/) for converting a PyTorch model to an ONNX model. Let's go through an example of converting the FP_classifier PyTorch model to ONNX.
Execute commands in the folder `classification`:
```shell
python pytorch2onnx.py --backbone mobilenetv2 --num_classes 2 --snapshot snapshots/exp/model_ft_best.pth --save-path snapshots/exp/model_ft_best.onnx
```
We could get `model_ft_best.onnx`.
Next, clone the [ONNX_Convertor](https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts) repository:
```shell
git clone https://github.com/kneron/ONNX_Convertor.git
```
Then run the optimizer script in `ONNX_Convertor/optimizer_scripts` from the directory where the repository was cloned:
```shell
python ONNX_Convertor/optimizer_scripts/pytorch2onnx.py snapshots/exp/model_ft_best.onnx snapshots/exp/model_ft_best_convert.onnx
```
We could get `model_ft_best_convert.onnx`.
# Inference
In this section, we will go through an example of using a trained network for inference. We will use the script `inference.py`, which takes an image and predicts its class label, returning the top $K$ most likely classes along with their probabilities. Let's run our network on the following image, a bee image from our custom dataset:
<div align="center">
<img src="./image_data/val/bees/10870992_eebeeb3a12.jpg" width="30%" />
</div>
```shell
python inference.py --gpu -1 --backbone mobilenetv2 --snapshot snapshots/exp/model_ft_best.pth --model-def-path models/MobileNetV2/ --class_id_path eval_utils/class_id.json --img-path tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg
{'img_path': 'tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg', 'backbone': 'mobilenetv2', 'class_id_path': 'eval_utils/class_id.json', 'gpu': -1, 'model_def_path': 'models/MobileNetV2/', 'snapshot': 'snapshots/exp/model_ft_best.pth', 'save_path': 'inference_result.json', 'onnx': False}
Label Probability
bees 0.836
ants 0.164
```
Note that the class ID mapping file `eval_utils/class_id.json` was created during the training process. After inference, we obtain `inference_result.json`, which contains the following information:
```bash
{"img_path": "/home/ziyan/git_repo/ai_training/ai_training/classification/tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg", "0_0": [[0.8359974026679993, 1], [0.16400262713432312, 0]]}
```
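The `"0_0"` entry holds `[score, class_index]` pairs, best first, which can be decoded back into readable labels with the class-ID mapping. A minimal sketch (the helper name `decode_result` is made up; note that keys read from a JSON file arrive as strings and may need converting to `int`):

```python
def decode_result(result, id_to_label):
    """Turn [[score, class_idx], ...] pairs into (label, score) tuples, best first."""
    return [(id_to_label[idx], round(score, 3)) for score, idx in result["0_0"]]

# Example with the output shown above:
result = {"0_0": [[0.8359974026679993, 1], [0.16400262713432312, 0]]}
id_to_label = {0: "ants", 1: "bees"}  # contents of eval_utils/class_id.json
print(decode_result(result, id_to_label))  # [('bees', 0.836), ('ants', 0.164)]
```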
For ONNX inference, add the `--onnx` argument when executing `inference.py`:
```shell
python inference.py --img-path tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg --snapshot snapshots/exp/model_ft_best_convert.onnx --onnx
{'img_path': 'tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg', 'backbone': 'resnet18', 'class_id_path': './eval_utils/class_id.json', 'gpu': -1, 'model_def_path': None, 'snapshot': 'snapshots/exp/model_ft_best_convert.onnx', 'save_path': 'inference_result.json', 'onnx': True}
Label Probability
bees 0.836
ants 0.164
```
# Evaluation
## Evaluation on a dataset
In this section, we will go through an example of evaluating a trained network on a dataset. Here, we are going to evaluate a trained model on the validation set of our custom dataset. The script `./eval_utils/eval.py` reports the top-K accuracy and per-class F1 scores for the model on a test dataset. The evaluation statistics will be saved to `eval_results.txt`.
```shell
python eval_utils/eval.py --gpu -1 --backbone mobilenetv2 --snapshot snapshots/exp/model_ft_best.pth --data-dir ./tutorial/image_data/val/
{'data_dir': './tutorial/image_data/val/', 'model_def_path': None, 'backbone': 'mobilenetv2', 'snapshot': 'snapshots/exp/model_ft_best.pth', 'gpu': -1, 'preds': None, 'gts': None}
top 1 accuracy: 0.9225352112676056
Label Precision Recall F1 score
ants 0.887 0.932 0.909
bees 0.950 0.916 0.933
```
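The per-class statistics in the table above follow the standard definitions of precision, recall, and F1. A minimal sketch of the computation (not the evaluation script's actual code; the helper name `prf1` is made up):

```python
def prf1(preds, gts, label):
    """Precision, recall and F1 for one class, from parallel prediction/ground-truth lists."""
    tp = sum(1 for p, g in zip(preds, gts) if p == label and g == label)
    fp = sum(1 for p, g in zip(preds, gts) if p == label and g != label)
    fn = sum(1 for p, g in zip(preds, gts) if p != label and g == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```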
## End-to-End Evaluation
For end-to-end testing, the prediction results are expected to be saved as JSON files, one file per image, in the following format:
```bash
{"img_path": image_path,
"0_0":[[score, label], [score, label], ...]
}
```
The prediction JSON files for all images are expected to be saved under the same folder. The ground truth JSON file is expected to have the following format:
```bash
{image1_path: label,
image2_path: label,
...
}
```
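Both formats can be produced with plain `json` calls. The following is a minimal sketch (the helper names and the example paths and labels are made up for illustration):

```python
import json
from pathlib import Path

def save_prediction(out_dir, img_path, scored_labels):
    """Write one per-image prediction file: {"img_path": ..., "0_0": [[score, label], ...]}."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {"img_path": img_path, "0_0": scored_labels}
    out_file = out_dir / (Path(img_path).stem + ".json")
    out_file.write_text(json.dumps(record))
    return out_file

def save_ground_truth(gt_file, labels_by_image):
    """Write the ground-truth file as a single {image_path: label, ...} mapping."""
    Path(gt_file).write_text(json.dumps(labels_by_image))
```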
For this tutorial, we generated some random prediction data saved under the folder `tutorial/eval_data/preds/`, and the ground truth is saved in `tutorial/eval_data/gts.json`. You may check these files for the format. To compute the evaluation statistics, execute the command in the folder `classification`:
```shell
python eval_utils/eval.py --preds tutorial/eval_data/preds/ --gts tutorial/eval_data/gts.json
{'model_def_path': None, 'data_dir': None, 'backbone': 'resnet18', 'preds': 'tutorial/eval_data/preds/', 'gts': 'tutorial/eval_data/gts.json', 'snapshot': None, 'gpu': -1}
top 1 accuracy: 1.0
Label Precision Recall F1 score
0 1.000 1.000 1.000
1 1.000 1.000 1.000
2 1.000 1.000 1.000
```
The evaluation statistics will be saved to `eval_results.txt`.