Image Classification
This tutorial covers the basics of the image classification task. We will walk through a concrete example of training an image classification model on our AI training platform, using a provided dataset of bees and ants.
Image classification is a fundamental task that assigns a specific label to an image. Our AI training platform provides a training script for building image classification models.
Prerequisites
First, install the required libraries. Python 3.6 or above is required; the remaining dependencies are listed in the requirements.txt file. Install them by running:
pip install -r requirements.txt
Dataset & Preparation
Next, we need a dataset to train the model on.
Custom Datasets
You can train the model on a custom dataset. Your own datasets are expected to have the following structure:
- Dataset name
-- train
--- Class1
--- Class2
-- val
--- Class1
--- Class2
Example
Let's go through a toy example for preparing a custom dataset. Suppose we are going to classify bees and ants.
First, split the images of bees and ants into training and validation sets (an 8:2 split is recommended). Then move the images into separate folders named after their classes. The dataset folder will have the following structure:
- image data
-- train
--- ants
--- bees
-- val
--- ants
--- bees
Now, we have finished preparing the dataset.
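The 8:2 split described above can be scripted. Below is a minimal sketch using only the Python standard library; the source layout (one folder per class of unsplit images) and function name are illustrative, not part of the platform's tooling:

```python
import random
import shutil
from pathlib import Path

def split_dataset(src_dir, dst_dir, val_ratio=0.2, seed=0):
    """Split images under src_dir/<class>/ into dst_dir/train and dst_dir/val."""
    random.seed(seed)
    for class_dir in Path(src_dir).iterdir():
        if not class_dir.is_dir():
            continue
        images = sorted(class_dir.glob("*"))
        random.shuffle(images)
        n_val = int(len(images) * val_ratio)
        for split, files in (("val", images[:n_val]), ("train", images[n_val:])):
            out = Path(dst_dir) / split / class_dir.name
            out.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy(f, out / f.name)
```

Running split_dataset("raw", "image_data") on a folder raw/ containing ants/ and bees/ subfolders produces the train/val layout shown above.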
Train
Following the examples in the previous section, let's fine-tune a pretrained model on our custom dataset. The pretrained model used here is MobileNetV2. We download it from Model_Zoo by:
wget https://raw.githubusercontent.com/kneron/Model_Zoo/main/classification/MobileNetV2/MobileNetV2.pth
Since our dataset is quite small, we choose to freeze the backbone and fine-tune only the last layer. Following the instructions above, run:
python train.py --gpu -1 --freeze-backbone 1 --backbone mobilenetv2 --early-stop 1 --snapshot MobileNetV2.pth --snapshot-path snapshots/exp/ ./tutorial/image_data
The following training messages will be printed:
{'data_dir': './tutorial/image_data', 'model_name': 'model_ft', 'model_def_path': None, 'lr': 0.001, 'backbone': 'mobilenetv2', 'gpu': -1, 'epochs': 100, 'freeze_backbone': 1, 'batch_size': 64, 'snapshot': 'MobileNetV2.pth', 'snapshot_path': 'snapshots/exp/', 'optimizer': 'SGD', 'loss': 'cross_entropy', 'early_stop': 1, 'patience': 7}
Initializing Datasets and Dataloaders...
-------------Label mapping to Idx:--------------
{0: 'ants', 1: 'bees'}
------------------------------------------------
Params to learn:
model.classifier.1.weight
model.classifier.1.bias
Epoch 0/99
----------
train Loss: 0.7786 Acc: 0.4303
val Loss: 0.6739 Acc: 0.6056
Validation loss decreased (inf --> 0.673929). Saving model ...
...
When the validation loss stops decreasing for 7 consecutive epochs (the patience value), early stopping is triggered and the training process terminates. The trained model is saved under the ./snapshots/exp folder. In addition, the class-label-to-index mapping is printed and automatically saved to ./eval_utils/class_id.json.
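The early-stopping behaviour described above amounts to a patience counter on the validation loss. The sketch below illustrates the general mechanism; it is not the platform's actual implementation:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=7):
        self.patience = patience
        self.best_loss = float("inf")
        self.counter = 0

    def step(self, val_loss):
        """Return True when training should stop."""
        if val_loss < self.best_loss:
            # Improvement: this is where a checkpoint would be saved.
            self.best_loss = val_loss
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience
```

With patience=7 this matches the 'patience': 7 entry in the training configuration printed above.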
Converting to ONNX
You may check the Toolchain manual for details on converting a PyTorch model to an ONNX model. Let's go through an example of converting the FP_classifier PyTorch model to ONNX.
Execute commands in the folder classification:
python pytorch2onnx.py --backbone mobilenetv2 --num_classes 2 --snapshot snapshots/exp/model_ft_best.pth --save-path snapshots/exp/model_ft_best.onnx
This produces model_ft_best.onnx.
Execute commands in the folder ONNX_Convertor/optimizer_scripts:
(reference: https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts)
git clone https://github.com/kneron/ONNX_Convertor.git
python ONNX_Convertor/optimizer_scripts/pytorch2onnx.py snapshots/exp/model_ft_best.onnx snapshots/exp/model_ft_best_convert.onnx
This produces model_ft_best_convert.onnx.
Inference
In this section, we will go through an example of using a trained network for inference. We will use inference.py, which takes an image and predicts its class label, returning the top-K most likely classes along with their probabilities. Let's run our network on the following image, a bee image from our custom dataset:
python inference.py --gpu -1 --backbone mobilenetv2 --snapshot snapshots/exp/model_ft_best.pth --model-def-path models/MobileNetV2/ --class_id_path eval_utils/class_id.json --img-path tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg
{'img_path': 'tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg', 'backbone': 'mobilenetv2', 'class_id_path': 'eval_utils/class_id.json', 'gpu': -1, 'model_def_path': 'models/MobileNetV2/', 'snapshot': 'snapshots/exp/model_ft_best.pth', 'save_path': 'inference_result.json', 'onnx': False}
Label Probability
bees 0.836
ants 0.164
Note that the class ID mapping file eval_utils/class_id.json was created during the training process. After inference, we obtain inference_result.json, which contains the following information:
{"img_path": "/home/ziyan/git_repo/ai_training/ai_training/classification/tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg", "0_0": [[0.8359974026679993, 1], [0.16400262713432312, 0]]}
For ONNX inference, add the --onnx argument when executing inference.py:
python inference.py --img-path tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg --snapshot snapshots/exp/model_ft_best_convert.onnx --onnx
{'img_path': 'tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg', 'backbone': 'resnet18', 'class_id_path': './eval_utils/class_id.json', 'gpu': -1, 'model_def_path': None, 'snapshot': 'snapshots/exp/model_ft_best_convert.onnx', 'save_path': 'inference_result.json', 'onnx': True}
Label Probability
bees 0.836
ants 0.164
Evaluation
Evaluation on a dataset
In this section, we will go through an example of evaluating a trained network on a dataset. Here, we evaluate the trained model on the validation set of our custom dataset. The script ./eval_utils/eval.py reports the top-K accuracy and the per-class F1 score on a test dataset. The evaluation statistics are saved to eval_results.txt.
python eval_utils/eval.py --gpu -1 --backbone mobilenetv2 --snapshot snapshots/exp/model_ft_best.pth --data-dir ./tutorial/image_data/val/
{'data_dir': './tutorial/image_data/val/', 'model_def_path': None, 'backbone': 'mobilenetv2', 'snapshot': 'snapshots/exp/model_ft_best.pth', 'gpu': -1, 'preds': None, 'gts': None}
top 1 accuracy: 0.9225352112676056
Label Precision Recall F1 score
ants 0.887 0.932 0.909
bees 0.950 0.916 0.933
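The per-class scores in the table follow the standard definitions (precision = TP/(TP+FP), recall = TP/(TP+FN), F1 = harmonic mean of the two). A minimal pure-Python sketch of that computation, independent of the platform's eval script:

```python
from collections import Counter

def per_class_scores(gts, preds):
    """Return {label: (precision, recall, f1)} from parallel label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gt, pred in zip(gts, preds):
        if gt == pred:
            tp[gt] += 1
        else:
            fp[pred] += 1  # predicted this label, but it was wrong
            fn[gt] += 1    # missed this label
    scores = {}
    for label in set(gts):
        p = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        r = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        scores[label] = (p, r, f1)
    return scores
```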
End-to-End Evaluation
For end-to-end testing, the prediction results are expected to be saved as JSON files, one file per image, with the following format:
{"img_path": image_path,
"0_0":[[score, label], [score, label], ...]
}
The prediction JSON files for all images are expected to be saved under the same folder. The ground truth JSON file is expected to have the following format:
{image1_path: label,
image2_path: label,
...
}
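Given the two formats above, top-1 accuracy can be computed with a short standard-library script. This is a hedged sketch of the comparison logic, not the platform's eval.py; folder and file names are illustrative:

```python
import json
from pathlib import Path

def top1_accuracy(preds_dir, gts_path):
    """Compare each prediction file's highest-scoring label against the ground truth."""
    gts = json.loads(Path(gts_path).read_text())
    correct = total = 0
    for pred_file in Path(preds_dir).glob("*.json"):
        pred = json.loads(pred_file.read_text())
        # "0_0" holds [score, label] pairs; take the label with the highest score.
        best_label = max(pred["0_0"], key=lambda p: p[0])[1]
        if gts[pred["img_path"]] == best_label:
            correct += 1
        total += 1
    return correct / total if total else 0.0
```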
For this tutorial, we generated some random prediction data saved under the folder tutorial/eval_data/preds/, and the ground truth is saved in tutorial/eval_data/gts.json. You may check these files for the format. To compute the evaluation statistics, execute commands in the folder classification:
python eval_utils/eval.py --preds tutorial/eval_data/preds/ --gts tutorial/eval_data/gts.json
{'model_def_path': None, 'data_dir': None, 'backbone': 'resnet18', 'preds': 'tutorial/eval_data/preds/', 'gts': 'tutorial/eval_data/gts.json', 'snapshot': None, 'gpu': -1}
top 1 accuracy: 1.0
Label Precision Recall F1 score
0 1.000 1.000 1.000
1 1.000 1.000 1.000
2 1.000 1.000 1.000
The evaluation statistics will be saved to eval_results.txt.


