update files and add new datasets

README.md (updated):
# YOLOv5 Training and Deployment Workflow

Object detection with the YOLOv5 model. This document explains the arguments of each script. A tutorial for finetuning a pretrained model on the COCO128 dataset is provided under the `tutorial` folder (`tutorial/README.md`), and an IPython notebook version (`tutorial/tutorial.ipynb`) can be uploaded to and run on Google Colab.

## Environment Setup

The commands below are run from the Windows command prompt (CMD). Create a conda environment that can run YOLOv5; Python 3.8 or above is required. Then install the dependencies:

```bash
pip install -U pip
pip install -r requirements.txt
```

## Dataset Preparation

1. Download the dataset in **YOLOv8 format** from Roboflow and place it in the project directory (e.g. under `data/`).
2. Edit the dataset's `data.yaml` so that the paths follow this format:

```yaml
path: C:/Users/rd_de/yolov5git/data/your-dataset
train: train/images
val: valid/images
test: test/images

nc: 3  # number of classes
names: ['class1', 'class2', 'class3']
```

A ready-made yaml file for the COCO dataset is provided in `./data/coco.yaml`.
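A quick consistency check for a config like the one above can be sketched in plain Python (a hypothetical helper, not part of YOLOv5; the config is shown as a dict to avoid a YAML-parser dependency):

```python
# Hypothetical sanity check for a data.yaml-style config: the number of
# class names must match `nc`, and the split paths must be present.
def check_data_config(cfg):
    required = ["path", "train", "val", "nc", "names"]
    missing = [k for k in required if k not in cfg]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if cfg["nc"] != len(cfg["names"]):
        raise ValueError(f"nc={cfg['nc']} but {len(cfg['names'])} names listed")
    return True

cfg = {
    "path": "C:/Users/rd_de/yolov5git/data/your-dataset",
    "train": "train/images",
    "val": "valid/images",
    "nc": 3,
    "names": ["class1", "class2", "class3"],
}
check_data_config(cfg)  # passes; raises ValueError on a mismatch
```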
### MS COCO

The training script accepts the MS COCO dataset. You can download it here:

- [2017 MS COCO Dataset](https://cocodataset.org/#download)

### Custom Datasets

You can also train the model on a custom dataset.

#### Annotations Format

After using a tool like [CVAT](https://github.com/openvinotoolkit/cvat), [makesense.ai](https://www.makesense.ai) or [Labelbox](https://labelbox.com) to label your images, export your labels to YOLO format, with one `*.txt` file per image (if an image contains no objects, no `*.txt` file is required). The `*.txt` file specifications are:

- One row per object.
- Each row is in `class x_center y_center width height` format.
- Box coordinates must be in normalized xywh format (from 0 to 1). If your boxes are in pixels, divide `x_center` and `width` by the image width, and `y_center` and `height` by the image height.
- Class numbers are zero-indexed (they start from 0).

<div align="center">
<img src="./tutorial/screenshots/readme_img.jpg" width="50%" />
</div>

The label file corresponding to the above image contains 2 persons (class 0) and a tie (class 27):

<div align="center">
<img src="./tutorial/screenshots/readme_img2.png" width="40%" />
</div>
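The label-row specification above can be checked mechanically; a minimal sketch (hypothetical helper, not part of YOLOv5):

```python
# Hypothetical validator for one YOLO-format label line:
# "class x_center y_center width height", coordinates normalized to [0, 1].
def check_label_line(line):
    parts = line.split()
    if len(parts) != 5:
        return False
    try:
        cls = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return False
    # class ids are zero-indexed; coordinates must be normalized
    return cls >= 0 and all(0.0 <= c <= 1.0 for c in coords)
```

For example, `check_label_line("0 0.48 0.63 0.69 0.71")` accepts a normalized row, while `check_label_line("0 523 212 770 642")` rejects pixel coordinates that were not divided by the image size.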
#### Directory Organization

Your own datasets are expected to have the following structure. We assume `/dataset` sits next to the `/yolov5` directory. YOLOv5 locates the labels for each image automatically by replacing the last instance of `/images/` in the image path with `/labels/`.

```
- Dataset name
  - images
    - train
      - img001.jpg
      - ...
    - val
      - img002.jpg
      - ...
  - labels
    - train
      - img001.txt
      - ...
    - val
      - img002.txt
      - ...
- yolov5
- generate_npy
- exporting
```

## Training the Model

First `cd` into the `yolov5/` directory, then run:

```bash
python train.py \
    --data C:/Users/rd_de/yolov5git/data/10-02+10-01+10-038class/data.yaml \
    --weights for720best.pt \
    --img 640 \
    --batch-size 8 \
    --epochs 300 \
    --device 0
```
After training finishes, the results and weight files are located at:

```
runs/train/expX/weights/best.pt
```
## Inference Testing

```bash
python detect.py \
    --weights runs/train/exp9/weights/best.pt \
    --source test14data/test/images \
    --img 640 \
    --conf 0.25 \
    --device 0
```
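`--conf 0.25` filters out low-scoring boxes before non-maximum suppression (NMS) merges overlapping detections. The core of NMS can be sketched in plain Python (an illustrative sketch, not YOLOv5's actual implementation):

```python
# Illustrative IoU + greedy NMS over [x1, y1, x2, y2, score] boxes.
def iou(a, b):
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, iou_thres=0.45):
    # keep the highest-scoring box, drop boxes overlapping it too much, repeat
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    for b in boxes:
        if all(iou(b, k) <= iou_thres for k in kept):
            kept.append(b)
    return kept
```

For example, `nms([[0, 0, 10, 10, 0.9], [1, 1, 10, 10, 0.8], [20, 20, 30, 30, 0.7]])` keeps the first and third boxes: the second overlaps the first with IoU 0.81.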
## Training on MS COCO

For training on MS COCO, execute commands in the folder `yolov5`:

```shell
CUDA_VISIBLE_DEVICES='0' python train.py --data coco.yaml --cfg yolov5s-noupsample.yaml --weights '' --batch-size 64
```

- `CUDA_VISIBLE_DEVICES='0'` indicates the GPU ids.
- `--data` the yaml file (located under `./data/`).
- `--cfg` the model configuration (located under `./model/`): `yolov5s-noupsample.yaml` for the 520, `yolov5s.yaml` for the 720.
- `--hyp` the path to the hyperparameters file (located under `./data/`).
- `--weights` the path to the pretrained model weights (`''` to train from scratch).
- `--epochs` the number of epochs to train. (Default: 300)
- `--batch-size` batch size. (Default: 16)
- `--img-size` the input size of the model. (Default: (640, 640))
- `--workers` the maximum number of dataloader workers. (Default: 8)

By default, trained models are saved under `./runs/train/`.
## Generating .npy Files for Different Model Inputs

We can generate `.npy` files for different model input sizes using `yolov5_generate_npy.py`. Execute commands in the folder `generate_npy`:

```shell
python yolov5_generate_npy.py --input-h 640 --input-w 640
```

- `--input-h` the input height. (Default: 640)
- `--input-w` the input width. (Default: 640)

This produces the `*.npy` grid files.
## Configuring the Paths YAML File

You are expected to create a yaml file that stores all the paths related to the trained models; it is used in the following sections. You can check and modify `pretrained_paths_520.yaml` and `pretrained_paths_720.yaml` under `/yolov5/data/`. The yaml file is expected to contain the following information:

```yaml
grid_dir: path_to_npy_file_directory
grid20_path: path_to_grid20_npy_file
grid40_path: path_to_grid40_npy_file
grid80_path: path_to_grid80_npy_file

yolov5_dir: path_to_yolov5_directory
path: path_to_pretrained_yolov5_model_weights_pt_file
yaml_path: path_to_the_model_configuration_yaml_file
pt_path: path_to_export_yolov5_model_weights_kneron_supported_file
onnx_export_file: path_to_export_yolov5_onnx_model_file

input_w: model_input_width
input_h: model_input_height

nc: number_of_classes
names: list_of_class_names
```
## Saving and Converting to ONNX

This section describes how to save the trained model in a PyTorch 1.4-supported format and convert it to ONNX.

### Exporting the ONNX model in the PyTorch 1.7 environment

We can convert the model to ONNX using `yolov5_export.py`. Execute commands in the folder `yolov5`:

```shell
python ../exporting/yolov5_export.py --data path_to_pretrained_path_yaml_file
```

- `--data` the path to the pretrained model paths yaml file. (Default: ../yolov5/data/pretrained_paths_520.yaml)

This produces the ONNX model.

### Converting the ONNX model with the toolchain

Pull the latest [ONNX converter](https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts) from GitHub; its documentation describes the conversion in detail. Execute commands in the folder `ONNX_Convertor/optimizer_scripts`:

```shell
python -m onnxsim input_onnx_model output_onnx_model

python pytorch2onnx.py input.pth output.onnx
```

This produces the converted ONNX model.
## Inference

Before model inference, we assume the model has been converted to ONNX as in the previous section (even when only running inference on the `.pth` model). Create a yaml file containing the path information. For model inference on a single image, execute commands in the folder `yolov5`:

```shell
python inference.py --data path_to_pretrained_path_yaml_file --img-path path_to_image --save-path path_to_saved_image
```

- `--img-path` the path to the image.
- `--save-path` the path where the image is saved with bounding boxes drawn on it.
- `--data` the path to the pretrained model paths yaml file. (Default: data/pretrained_paths_520.yaml)
- `--conf_thres` the score threshold for bounding boxes. (Default: 0.3)
- `--iou_thres` the IoU threshold for NMS. (Default: 0.3)
- `--onnx` whether to run inference on the ONNX model.

You can find the preprocessing and postprocessing steps under the folder `exporting/yolov5/`.
## Evaluation

### Evaluation Metric

We use mean Average Precision (mAP) for evaluation. The script for computing mAP is `test.py`.

`mAP`: mAP is the average of the Average Precision (AP). AP summarizes a precision-recall curve as the weighted mean of the precisions achieved at each threshold, with the increase in recall from the previous threshold used as the weight:

<img src="https://latex.codecogs.com/svg.image?AP&space;=&space;\sum_n&space;(R_n-R_{n-1})P_n&space;" title="AP = \sum_n (R_n-R_{n-1})P_n " />

where <img src="https://latex.codecogs.com/svg.image?P_n" title="P_n" /> and <img src="https://latex.codecogs.com/svg.image?R_n" title="R_n" /> are the precision and recall at the nth threshold. The mAP compares the ground-truth bounding boxes with the detected boxes and returns a score; the higher the score, the more accurate the model's detections.
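The AP sum above can be computed directly from paired precision/recall points (an illustrative sketch; real mAP code also matches boxes by IoU and interpolates the curve):

```python
# AP = sum_n (R_n - R_{n-1}) * P_n over a precision-recall curve,
# with recall points in increasing order and R_{-1} = 0.
def average_precision(precisions, recalls):
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_r) * p
        prev_r = r
    return ap
```

For example, `average_precision([1.0, 0.5], [0.5, 1.0])` gives 0.5 * 1.0 + 0.5 * 0.5 = 0.75.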
### Evaluation on a Dataset

To evaluate the trained model on a dataset:

```shell
python test.py --weights path_to_pth_model_weight --data path_to_data_yaml_file
```

- `--weights` the path to the pretrained model weights. (Default: best.pt)
- `--data` the path to the data yaml file. (Default: data/coco128.yaml)
- `--img-size` the input shape of the model. (Default: (640, 640))
- `--conf-thres` the object confidence threshold. (Default: 0.001)
- `--device` CUDA device, i.e. 0 or 0,1,2,3, or cpu. (Default: cpu)
- `--verbose` whether to report mAP per class.
### End-to-End Evaluation

To run an end-to-end test on an image dataset, use `inference_e2e.py` under the directory `yolov5` to obtain the prediction results. You have to prepare an initial-parameter yaml file for the inference runner; see `utils/init_params.yaml` for the format.

```shell
python inference_e2e.py --img-path path_to_dataset_folder --params path_to_init_params_file --save-path path_to_save_json_file
```

- `--img-path` path to the dataset directory.
- `--params` path to the initial-parameter yaml file for the inference runner.
- `--save-path` path to save the predictions as a json file.
- `--gpu` GPU id (-1 for CPU). (Default: -1)

The predictions will be saved into a json file that has the following structure:
```
[
  {"img_path": image_path_1,
   "bbox": [[l, t, w, h, score, class_id], [l, t, w, h, score, class_id]]},
  {"img_path": image_path_2,
   "bbox": [[l, t, w, h, score, class_id], [l, t, w, h, score, class_id]]},
  ...
]
```

## Converting to ONNX

Export the trained weights to ONNX:

```bash
python exporting/yolov5_export.py --data data/mepretrained_paths_720.yaml
```
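A small script can summarize a predictions file with the structure shown above (an illustrative sketch; the field layout follows the documented format, the sample data is made up):

```python
import json

# Count detections per class in an inference_e2e.py-style predictions list,
# where each bbox entry is [l, t, w, h, score, class_id].
def count_detections(predictions):
    counts = {}
    for entry in predictions:
        for *_, score, class_id in entry["bbox"]:
            counts[class_id] = counts.get(class_id, 0) + 1
    return counts

preds = json.loads(
    '[{"img_path": "a.jpg", "bbox": [[0, 0, 5, 5, 0.9, 0], [1, 1, 4, 4, 0.8, 27]]},'
    ' {"img_path": "b.jpg", "bbox": [[2, 2, 6, 6, 0.7, 0]]}]'
)
print(count_detections(preds))  # {0: 2, 27: 1}
```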
## Model

Backbone | Input Size | FPS on 520 | FPS on 720 | Model Size | mAP
--- | --- |:---:|:---:|:---:|:---:
[YOLOv5s (no upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s-noupsample) | 640x640 | 4.91429 | - | 13.1M | 40.4%
[YOLOv5s (with upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s) | 640x640 | - | 24.4114 | 14.6M | 50.9%

[YOLOv5s (no upsample)](https://github.com/kneron/Model_Zoo/tree/main/detection/yolov5/yolov5s-noupsample) is the YOLOv5s backbone without the upsampling operation, since the 520 hardware does not support upsampling.

Simplify the ONNX model:

```bash
python -m onnxsim \
    runs/train/exp24/weights/best.onnx \
    runs/train/exp24/weights/best_simplified.onnx
```
## Kneron Toolchain (Docker)

Start the Kneron Toolchain container (run inside WSL):

```bash
docker run --rm -it \
    -v $(wslpath -u 'C:\Users\rd_de\golfaceyolov5\yolov5'):/workspace/yolov5 \
    kneron/toolchain:latest
```

Copy the compiled `.nef` model from the container back to the host:

```bash
docker cp <container_id>:/data1/kneron_flow/runs/train/exp6/weights/models_630.nef \
    C:\Users\rd_de\golfaceyolov5\yolov5\runs\train\exp6\weights
```
ai_training/.gitmodules (new file):

```
[submodule "evaluation/kneron_eval/utils/kneron_globalconstant"]
	path = evaluation/kneron_eval/utils/kneron_globalconstant
	url = git@59.125.118.185:jenna/kneron_globalconstant.git
[submodule "mmdetection"]
	path = mmdetection
	url = git@59.125.118.185:eric_wu/mmdetection.git
```
ai_training/README.md (new, empty file)

ai_training/ai_training_hash.txt (new file):

```
d840d94c7201cd6a7596bb8f5dc54d7866cd16c3
```

ai_training/classification/README.md (new file):
<h1 align="center"> Image Classification </h1>

This tutorial explores the basics of the image classification task. This document contains the explanations of the arguments of each script.

You can find the tutorial for finetuning a pretrained model on a custom dataset under the `tutorial` folder, `tutorial/README.md`. An IPython notebook version is also provided as `tutorial/tutorial.ipynb`; you may upload and run it on Google Colab.

Image classification is a fundamental task that attempts to classify an image by assigning it a specific label. Our AI training platform provides the training script to train a model for the image classification task.

# Prerequisites

First of all, we have to install the libraries. Python 3.6 or above is required; for the other libraries, check the `requirements.txt` file. Installing these packages is simple:

```
pip install -r requirements.txt
```
# Dataset & Preparation

Next, we need a dataset to train the model on.

## Custom Datasets

You can train the model on a custom dataset. Your own datasets are expected to have the following structure:

```shell
- Dataset name
  -- train
    --- Class1
    --- Class2
  -- val
    --- Class1
    --- Class2
```
## Example

Let's go through a toy example of preparing a custom dataset. Suppose we are going to classify bees and ants.

<div align="center">
<img src="./image_data/train/ants/0013035.jpg" width="33%" /> <img src="./image_data/train/bees/1092977343_cb42b38d62.jpg" width="33%" />
</div>

First of all, we split the images of bees and ants into a train and a validation set (an 8:2 split is recommended). Then we move the images into different folders named after their classes. The dataset folder will have the following structure:

```shell
- image data
  -- train
    --- ants
    --- bees
  -- val
    --- ants
    --- bees
```

Now we have finished preparing the dataset.
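The split-and-move step above can be sketched in plain Python (illustrative only; the directory layout follows the example above, and the 8:2 ratio is the recommendation from the text):

```python
import os
import random
import shutil

# Illustrative 8:2 train/val split: copy each class's images from a flat
# source layout (src/<class>/*.jpg) into dst/train/<class> and dst/val/<class>.
def split_dataset(src, dst, val_ratio=0.2, seed=0):
    rng = random.Random(seed)
    for cls in sorted(os.listdir(src)):
        files = sorted(os.listdir(os.path.join(src, cls)))
        rng.shuffle(files)
        n_val = int(len(files) * val_ratio)
        for split, names in (("val", files[:n_val]), ("train", files[n_val:])):
            out = os.path.join(dst, split, cls)
            os.makedirs(out, exist_ok=True)
            for name in names:
                shutil.copy(os.path.join(src, cls, name), os.path.join(out, name))
```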
# Train

Let's look at how to train or finetune a model. There are several backbone models and arguments to choose from; the FPS results of these backbones evaluated on the 520 and 720 are listed in the [Models](#models) section.

For training on a custom dataset, run:

```shell
python train.py --gpu -1 --backbone backbone_name --model-def-path path_to_model_definition_folder --snapshot path_to_pretrained_model_weights path_to_dataset_folder
```

- `--gpu` which GPU to run on. (-1 for CPU)
- `--workers` the number of dataloader workers. (Default: 1)
- `--backbone` which backbone model to use. Options: see [Models](#models).
- `--freeze-backbone` whether to freeze the backbone when a pretrained model is used. (Default: 0)
- `--early-stop` whether to stop early when the validation accuracy stops improving. (Default: 1)
- `--patience` patience for early stopping. (Default: 7)
- `--model-name` the name of your model.
- `--lr` learning rate. (Default: 1e-3)
- `--model-def-path` path to the pretrained model definition folder. (Default: './models/')
- `--snapshot` path to the pretrained model. (Default: None)
- `--epochs` number of epochs to train. (Default: 100)
- `--batch-size` size of the batches. (Default: 64)
- `--snapshot-path` path to store model snapshots during training. (Default: 'snapshots/{}'.format(today))
- `--optimizer` optimizer for training. Options: SGD, ASGD, ADAM. (Default: SGD)
- `--loss` loss function. Options: cross_entropy. (Default: cross_entropy)
# Converting to ONNX

You may check the [Toolchain manual](http://doc.kneron.com/docs/#toolchain/manual/) for converting a PyTorch model to ONNX. Let's go through an example of converting the FP_classifier PyTorch model to an ONNX model.

Execute commands in the folder `classification`:

```shell
python pytorch2onnx.py --backbone backbone_name --num_classes the_number_of_classes --snapshot pytorch_model_path --save-path onnx_model_path
```

- `--save-path` path to save the ONNX model.
- `--backbone` which backbone model to use. Options: see [Models](#models).
- `--num_classes` the number of classes.
- `--model-def-path` path to the pretrained model definition folder.
- `--snapshot` path to the pretrained model.

This produces the ONNX model.

Then, execute commands in the folder `ONNX_Convertor/optimizer_scripts` (reference: https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts):

```shell
python pytorch2onnx.py onnx_model_path onnx_model_convert_path
```

This produces the converted ONNX model.
# Inference

In this section, we will go through using a trained network for inference: `inference.py` takes an image and predicts its class label, returning the top K most likely classes along with their probabilities.

For inference on an image, run:

```shell
python inference.py --gpu -1 --backbone backbone_name --model-def-path path_to_model_definition_folder --snapshot path_to_pretrained_model_weights --img-path path_to_image
```

- `--gpu` which GPU to run on. (-1 for CPU)
- `--backbone` which backbone model to use. Options: see [Models](#models).
- `--model-def-path` path to the pretrained model definition folder. (Default: './models/')
- `--snapshot` path to the pretrained model. (Default: None)
- `--img-path` path to the image.
- `--class_id_path` path to the class-id mapping file. (Default: './eval_utils/class_id.json')
- `--save-path` path to save the classification result. (Default: 'inference_result.json')
- `--onnx` whether to run inference on the ONNX model.

You can find the preprocessing and postprocessing steps in `inference.py`.
# Evaluation

## Evaluation Metric

We consider `top-K score`, `precision`, `recall` and `F1 score` for evaluating our model. You can find the script for computing these metrics in `eval_utils/eval.py`.

`top-K score`: this metric counts the number of times the correct label is among the top k labels predicted (ranked by predicted score). Note that the multilabel case isn't covered here.

`precision`: the ratio `tp / (tp + fp)`, where `tp` is the number of true positives and `fp` the number of false positives. Precision is intuitively the ability of the classifier not to label a negative sample as positive. The best value is 1 and the worst is 0.

`recall`: the ratio `tp / (tp + fn)`, where `tp` is the number of true positives and `fn` the number of false negatives. Recall is intuitively the ability of the classifier to find all the positive samples. The best value is 1 and the worst is 0.

`F1 score`: the F1 score can be interpreted as a weighted average of precision and recall, reaching its best value at 1 and its worst at 0. Precision and recall contribute equally to the F1 score:

`F1 = 2 * (precision * recall) / (precision + recall)`
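The three binary metrics above can be computed directly from prediction/label pairs (a plain-Python illustration of the same formulas):

```python
# precision = tp/(tp+fp), recall = tp/(tp+fn), F1 = harmonic mean of the two
def binary_metrics(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, `binary_metrics([1, 1, 0, 0], [1, 0, 1, 0])` gives one true positive, one false positive and one false negative, so all three metrics equal 0.5.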
## Evaluation on a dataset

In this section, we will go through evaluating a trained network on a dataset; here, a pretrained model on the validation set of the custom dataset. `./eval_utils/eval.py` reports the top-K score, precision, recall and F1 score of the model on a test dataset. The evaluation statistics will be saved to `eval_results.txt`.

```shell
python eval_utils/eval.py --gpu -1 --backbone backbone_name --snapshot path_to_pretrained_model_weights --model-def-path path_to_model_definition_folder --data-dir path_to_dataset_folder
```

- `--gpu` which GPU to run on. (-1 for CPU)
- `--backbone` which backbone model to use. Options: see [Models](#models).
- `--model-def-path` path to the pretrained model definition folder. (Default: './models/')
- `--snapshot` path to the pretrained model weights. (Default: None)
- `--data-dir` path to the dataset folder. (Default: None)
## End-to-End Evaluation

For end-to-end testing, the prediction results are expected to be saved into json files, one per image, with the following format:

```bash
{"img_path": image_path,
 "0_0": [[score, label], [score, label], ...]
}
```

The prediction json files for all images are expected to be saved under the same folder. The ground-truth json file is expected to have the following format:

```bash
{image1_path: label,
 image2_path: label,
 ...
}
```

To compute the evaluation statistics, execute commands in the folder `classification`:

```shell
python eval_utils/eval.py --preds path_to_predicted_results --gts path_to_ground_truth
```

- `--preds` path to the predicted results. (e2e eval)
- `--gts` path to the ground truth. (e2e eval)

The evaluation statistics will be saved to `eval_results.txt`.
# Models

Model | Input Size | FPS on 520 | FPS on 720 | Model Size
--- | :---: |:---:|:---:|:---:
[FP_classifier](https://github.com/kneron/Model_Zoo/tree/main/classification/FP_classifier)| 56x32 | 323.471 | 3370.47 | 5.1M
[mobilenetv2](https://github.com/kneron/Model_Zoo/tree/main/classification/MobileNetV2)| 224x224 | 58.9418 | 620.677 | 14M
[resnet18](https://github.com/kneron/Model_Zoo/tree/main/classification/ResNet18)| 224x224 | 20.4376 | 141.371 | 46.9M
[resnet50](https://github.com/kneron/Model_Zoo/tree/main/classification/ResNet50)| 224x224 | 6.32576 | 49.0828 | 102.9M
efficientnet-b0| 224x224 | 42.3118 | 157.482 | 18.6M
efficientnet-b1| 224x224 | 28.0051 | 110.907 | 26.7M
efficientnet-b2| 224x224 | 24.164 | 101.598 | 31.1M
efficientnet-b3| 224x224 | 18.4925 | 71.9006 | 41.4M
efficientnet-b4| 224x224 | 12.1506 | 52.3374 | 64.7M
efficientnet-b5| 224x224 | 7.7483 | 35.4869 | 100.7M
efficientnet-b6| 224x224 | 4.96453 | 26.5797 | 141.9M
efficientnet-b7| 224x224 | 3.35853 | 17.9795 | 217.4M

Note that for EfficientNet, the Squeeze-and-Excitation layers are removed and the Swish activation is replaced by ReLU.

FP_classifier is a pretrained model for classifying person and background images. Its class-id label mapping file is saved as `./eval_utils/person_class_id.json`.

\ | FP_classifier | mobilenetv2 | resnet18 | resnet50
--- | :---: | :---: | :---: | :---:
Rank 1 | 94.13% | 69.82% | 66.46% | 72.80%
Rank 5 | - | 89.29% | 87.09% | 90.91%

ResNet50 is currently under training for Kneron preprocessing.
ai_training/classification/early_stopping.py (new file):

```python
import numpy as np
import torch
import os


class EarlyStopping:
    """Early stops the training if validation loss doesn't improve after a given patience."""

    def __init__(self, model_name='model_ft', patience=7, verbose=False, delta=0, path='./snapshots/'):
        """
        Args:
            model_name (str): Prefix for the saved checkpoint files.
                Default: 'model_ft'
            patience (int): How long to wait after the last validation-loss improvement.
                Default: 7
            verbose (bool): If True, prints a message for each validation loss improvement.
                Default: False
            delta (float): Minimum change in the monitored quantity to qualify as an improvement.
                Default: 0
            path (str): Directory the checkpoints are saved to.
                Default: './snapshots/'
        """
        self.model_name = model_name
        self.patience = patience
        self.verbose = verbose
        self.counter = 0
        self.best_score = None
        self.early_stop = False
        self.val_loss_min = np.Inf
        self.delta = delta
        self.path = path

    def __call__(self, val_loss, model, epoch_label):
        score = -val_loss

        if self.best_score is None:
            self.best_score = score
            self.save_checkpoint(val_loss, model, epoch_label)
        elif score < self.best_score + self.delta:
            self.counter += 1
            print(f'EarlyStopping counter: {self.counter} out of {self.patience}')
            if self.counter >= self.patience:
                self.early_stop = True
        else:
            self.best_score = score
            self.save_checkpoint(val_loss, model, epoch_label)
            self.counter = 0

    def save_checkpoint(self, val_loss, model, epoch_label):
        '''Saves the model when the validation loss decreases.'''
        if self.verbose:
            print(f'Validation loss decreased ({self.val_loss_min:.6f} --> {val_loss:.6f}). Saving model ...')
        save_filename = self.model_name + '_%s.pth' % epoch_label
        save_path = os.path.join(self.path, save_filename)
        if not os.path.isdir(self.path):
            os.makedirs(self.path)
        torch.save(model.state_dict(), save_path)
        self.val_loss_min = val_loss
```
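The stopping rule this class implements can be demonstrated without PyTorch (an illustrative sketch of the same counter logic, without checkpoint saving):

```python
# Stop after `patience` consecutive epochs whose validation loss does not
# improve on the best score by at least `delta`; return the stopping epoch.
def stopping_epoch(val_losses, patience=7, delta=0.0):
    best, counter = None, 0
    for epoch, loss in enumerate(val_losses):
        score = -loss
        if best is None or score >= best + delta:
            best, counter = score, 0   # improvement: reset the counter
        else:
            counter += 1               # no improvement this epoch
            if counter >= patience:
                return epoch
    return len(val_losses) - 1         # never triggered: ran all epochs
```

For example, `stopping_epoch([1.0, 0.9, 0.95, 0.96, 0.97], patience=3)` stops at epoch 4, after three epochs in a row fail to beat the 0.9 loss from epoch 1.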
ai_training/classification/eval_results.txt (new file):

```
top 1 accuracy: 1.0

Label  Precision  Recall  F1 score
0      1.000      1.000   1.000
1      1.000      1.000   1.000
2      1.000      1.000   1.000
```

ai_training/classification/eval_utils/__init__.py (new, empty file)

ai_training/classification/eval_utils/class_id.json (new file):

```json
{"0": "ants", "1": "bees"}
```
245
ai_training/classification/eval_utils/eval.py
Normal file
@@ -0,0 +1,245 @@
import argparse
import os
import sys
import json

sys.path.append(os.getcwd())
import numpy as np
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, models, transforms
from load_model import initialize_model
from sklearn.metrics import f1_score, recall_score, precision_score


def accuracy(output, target, topk=(1,), e2e=False):
    """
    Compute the accuracy over the top-k predictions for the specified values of k.
    In top-5 accuracy you give yourself credit for having the right answer
    if the right answer appears among your top five guesses.

    ref:
    - https://pytorch.org/docs/stable/generated/torch.topk.html
    - https://discuss.pytorch.org/t/imagenet-example-accuracy-calculation/7840
    - https://gist.github.com/weiaicunzai/2a5ae6eac6712c70bde0630f3e76b77b
    - https://discuss.pytorch.org/t/top-k-error-calculation/48815/2
    - https://stackoverflow.com/questions/59474987/how-to-get-top-k-accuracy-in-semantic-segmentation-using-pytorch

    :param output: the model's predictions, e.g. scores or logits (raw y_pred before
        normalization or class selection); if e2e is True, already-ranked label indices
    :param target: the ground-truth labels
    :param topk: tuple of k values to compute, e.g. (1, 2, 5) computes top-1, top-2 and top-5.
        For top-2 you get +1 if the true label is among the model's two highest-scoring
        predictions: if the model predicts cat, dog (0, 1) and the true label is bird (3)
        you get zero, but if it is either cat or dog you accumulate +1 for that example.
    :return: array of top-k accuracies [top1st, top2nd, ...] matching the topk input
    """
    with torch.no_grad():
        # ---- get the topk most likely labels according to the model
        # the largest k in [n_classes], i.e. how many of the most likely predictions we consider
        maxk = max(topk)
        batch_size = target.size(0)

        # get the top-maxk indices corresponding to the most likely scores
        # (the _ discards the actual scores; we only need their indices/labels)
        if e2e:
            y_pred = output
        else:
            _, y_pred = output.topk(k=maxk, dim=1)  # _, [B, n_classes] -> [B, maxk]
        y_pred = y_pred.t()  # [B, maxk] -> [maxk, B]; t() expects a <= 2-D tensor and swaps dims 0 and 1

        # ---- give credit for each example whose truth is among the top maxk predictions (main crux)
        # for each example we compare the model's ranked predictions with the truth; a match yields 1.
        # Note: any example can contribute at most one match, so accuracy is never overestimated (< 1).
        target_reshaped = target.view(1, -1).expand_as(y_pred)  # [B] -> [1, B] -> [maxk, B]
        # compare every top-k prediction with the ground truth
        correct = (y_pred == target_reshaped)  # [maxk, B]; marks which top-k prediction matched the truth
        # original: correct = pred.eq(target.view(1, -1).expand_as(pred))

        # ---- get top-k accuracy
        list_topk_accs = []  # idx is topk1, topk2, ... etc
        for k in topk:
            # tensor marking which top-k answer was right
            ind_which_topk_matched_truth = correct[:k]  # [maxk, B] -> [k, B]
            # flatten to count correctness across the batch
            flattened_indicator_which_topk_matched_truth = ind_which_topk_matched_truth.reshape(-1).float()  # [k, B] -> [kB]
            # count examples whose truth appeared within their top-k predictions
            tot_correct_topk = flattened_indicator_which_topk_matched_truth.sum(dim=0, keepdim=True)  # [kB] -> [1]
            # top-k accuracy: the model's ability to get it right within its top k guesses
            topk_acc = tot_correct_topk / batch_size  # top-k accuracy for the entire batch
            list_topk_accs.append(topk_acc.cpu().numpy()[0])
        return np.array(list_topk_accs)  # top-k accuracies for the entire batch [topk1, topk2, ...]
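The tensor gymnastics above reduce to one idea: rank the class scores, then check whether the true label appears among the first k. A minimal NumPy sketch of the same computation (independent of the PyTorch code; all names and values here are illustrative):

```python
import numpy as np

def topk_accuracy(scores, target, topk=(1,)):
    # scores: [B, n_classes]; target: [B]
    # rank class indices by descending score
    ranked = np.argsort(-scores, axis=1)  # [B, n_classes]
    accs = []
    for k in topk:
        # credit an example if its true label is among the top-k ranked classes
        hits = (ranked[:, :k] == target[:, None]).any(axis=1)
        accs.append(hits.mean())
    return accs

scores = np.array([[0.6, 0.3, 0.1],   # top-1 is class 0, truth 0 -> top-1 hit
                   [0.2, 0.5, 0.3],   # top-1 is class 1, truth 2 -> only a top-2 hit
                   [0.1, 0.7, 0.2]])  # top-1 is class 1, truth 0 -> miss at both k
target = np.array([0, 2, 0])
print(topk_accuracy(scores, target, topk=(1, 2)))  # top-1 = 1/3, top-2 = 2/3
```

As in the PyTorch version, each example contributes at most one hit per k, so the estimate can never exceed 1.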


def evaluate(data_dir, backbone, model_def_path, pretrained_path, device, topk=(1,)):
    num_classes = len([f for f in os.listdir(data_dir) if not f.startswith('.')])
    # drop any requested k larger than the number of classes
    if max(topk) > num_classes:
        topk = np.array(topk)
        topk = topk[topk <= num_classes].tolist()

    model_structure, input_size = initialize_model(backbone, num_classes, False, model_def_path)
    model_structure.load_state_dict(torch.load(pretrained_path))
    model = model_structure.eval()
    model = model.to(device)

    data_transforms = transforms.Compose([
        transforms.Resize(input_size),
        transforms.ToTensor(),
        # rescale ToTensor's [0, 1] output back to [0, 255] ...
        transforms.Normalize([0, 0, 0], [1 / 255.0, 1 / 255.0, 1 / 255.0]),
        # ... then map to roughly [-0.5, 0.5] via (x - 128) / 256
        transforms.Normalize([0.5 * 256, 0.5 * 256, 0.5 * 256], [256.0, 256.0, 256.0])
        # transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    image_datasets = datasets.ImageFolder(data_dir, data_transforms)
    batch_size = 32
    dataloaders = torch.utils.data.DataLoader(image_datasets, shuffle=False, batch_size=batch_size, num_workers=4)

    list_topk_accs = np.zeros(len(topk))
    y_preds = []
    y_labels = []
    for inputs, labels in dataloaders:
        with torch.no_grad():
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)

            list_topk_accs += accuracy(outputs, labels, topk) * len(labels)
            _, y_pred = outputs.topk(k=1)
            y_preds += y_pred.cpu().numpy().tolist()
            y_labels += labels.cpu().numpy().tolist()

    print()
    list_topk_accs = list_topk_accs / len(image_datasets)
    with open('eval_results.txt', 'w') as writefile:
        for i, k in enumerate(topk):
            if k is None:
                break
            acc_str = 'top ' + str(k) + ' accuracy: ' + str(list_topk_accs[i])
            print(acc_str)
            writefile.write(acc_str + '\n')
        print()
        writefile.write('\n')
        class_id = image_datasets.class_to_idx
        class_id = dict([(value, key) for key, value in class_id.items()])
        f1 = f1_score(y_labels, y_preds, average=None)
        recall = recall_score(y_labels, y_preds, average=None)
        precision = precision_score(y_labels, y_preds, average=None)
        header = 'Label      Precision   Recall   F1 score'
        itn_line = '{:10} {:8.3f} {:8.3f} {:8.3f}'
        writefile.write(header + '\n')
        print(header)
        for i, score in enumerate(f1):
            res_str = itn_line.format(class_id[i], precision[i], recall[i], score)
            print(res_str)
            writefile.write(res_str + '\n')
    return list_topk_accs, f1, recall, precision


def evaluate_e2e(gt_path, classification_path, topk=[1, 5, 10]):
    preds = {}
    for file in os.listdir(classification_path):
        if file.split('.')[-1] != 'json':
            continue

        full_filename = os.path.join(classification_path, file)
        with open(full_filename, 'r') as fi:
            dic = json.load(fi)
            preds[dic['img_path']] = dic["0_0"]  # {img_id: [[score1, label1], [score2, label2]]}
            preds[dic['img_path']].sort(reverse=True)
    with open(gt_path, 'r') as json_file2:
        gts = json.load(json_file2)  # {img_id: label}

    pred_scores = []
    pred_labels = []
    pred_labels_ = []
    y_true = []

    for img_name in preds:
        res = preds[img_name]
        res0 = list(zip(*res))
        pred_scores.append(list(res0[0]))
        pred_labels.append(res0[1][0])
        pred_labels_.append(res0[1])
        y_true.append(gts[img_name])

    nc = len(set(y_true))

    # drop any requested k larger than the number of classes
    if max(topk) > nc:
        topk = np.array(topk)
        topk = topk[topk <= nc].tolist()

    list_topk_accs = accuracy(torch.FloatTensor(pred_labels_), torch.FloatTensor(y_true), topk=topk, e2e=True)
    print()
    with open('eval_results.txt', 'w') as writefile:
        for i, k in enumerate(topk):
            if k is None:
                break
            acc_str = 'top ' + str(k) + ' accuracy: ' + str(list_topk_accs[i])
            print(acc_str)
            writefile.write(acc_str + '\n')
        print()
        writefile.write('\n')

        f1 = f1_score(y_true, pred_labels, average=None)
        recall = recall_score(y_true, pred_labels, average=None)
        precision = precision_score(y_true, pred_labels, average=None)

        header = 'Label      Precision   Recall   F1 score'
        itn_line = '{:10} {:8.3f} {:8.3f} {:8.3f}'
        writefile.write(header + '\n')
        print(header)
        for i, score in enumerate(f1):
            res_str = itn_line.format(str(i), precision[i], recall[i], score)
            print(res_str)
            writefile.write(res_str + '\n')

    return list_topk_accs, f1, recall, precision
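For reference, `evaluate_e2e` consumes one small JSON per image (the format `inference.py` writes) plus a single ground-truth JSON. A minimal sketch of the two formats and the top-1 extraction it performs (the path and scores here are illustrative):

```python
import json

# one prediction file per image, shaped like the output of inference.py
pred = {
    "img_path": "/data/val/bees/img001.jpg",  # illustrative path
    "0_0": [[0.835, 1], [0.165, 0]],          # [score, label] pairs
}
# single ground-truth file: {img_path: true_label}
gts = {"/data/val/bees/img001.jpg": 1}

pred["0_0"].sort(reverse=True)     # highest score first, as evaluate_e2e does
top1_label = pred["0_0"][0][1]     # label of the best-scoring pair
print(top1_label == gts[pred["img_path"]])  # True: top-1 prediction matches
```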

def check_args(parsed_args):
    """Check for inherent contradictions within the parsed arguments.

    Args
        parsed_args: parser.parse_args()
    Returns
        parsed_args
    """
    if parsed_args.gpu >= 0 and not torch.cuda.is_available():
        raise ValueError("No GPU is available")
    return parsed_args


def parse_args(args):
    """
    Parse the arguments.
    """
    parser = argparse.ArgumentParser(description='Simple evaluation script for an image classification network.')
    parser.add_argument('--data-dir', type=str, help='Path to the image directory')
    parser.add_argument('--model-def-path', type=str, help='Path to the pretrained model definition', default=None)
    parser.add_argument('--backbone', help='Backbone model.', default='resnet18', type=str)
    parser.add_argument('--snapshot', help='Path to the pretrained models.', default=None)
    parser.add_argument('--gpu', help='Id of the GPU to use (as reported by nvidia-smi); -1 for CPU.', type=int, default=-1)
    parser.add_argument('--preds', help='Path to the predicted results', type=str, default=None)
    parser.add_argument('--gts', help='Path to the ground truth', type=str, default=None)

    print(vars(parser.parse_args(args)))
    return check_args(parser.parse_args(args))


def main(args=None):
    # parse arguments
    if args is None:
        args = sys.argv[1:]

    args = parse_args(args)
    device = "cuda:" + str(args.gpu) if args.gpu >= 0 else "cpu"
    if args.preds is not None:
        list_topk_accs, f1, recall, precision = evaluate_e2e(args.gts, args.preds)
    else:
        list_topk_accs, f1, recall, precision = evaluate(args.data_dir, args.backbone, args.model_def_path, args.snapshot, device, [1, 5, 10])


if __name__ == '__main__':
    main()

@@ -0,0 +1 @@
{"0": "background", "1": "person"}
160
ai_training/classification/inference.py
Normal file
@@ -0,0 +1,160 @@
import os
import sys
import json
import argparse
from datetime import date

import numpy as np
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, models, transforms
from PIL import Image
import onnxruntime

from load_model import initialize_model


def softmax(A):
    # softmax over the last axis; both call sites below pass 1-D score vectors
    e = np.exp(A)
    return e / np.sum(e, axis=-1, keepdims=True)


def preprocess(image_path, input_size):
    data_transforms = transforms.Compose([
        transforms.Resize(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0, 0, 0], [1 / 255.0, 1 / 255.0, 1 / 255.0]),
        transforms.Normalize([0.5 * 256, 0.5 * 256, 0.5 * 256], [256.0, 256.0, 256.0])
    ])
    with torch.no_grad():
        img_data_pytorch = data_transforms(Image.open(image_path))
        img_data_pytorch = img_data_pytorch.unsqueeze(0)
        return img_data_pytorch.numpy()


def postprocess(pre_output):
    score = softmax(pre_output)
    labels = list(range(len(pre_output)))
    score_labels = list(zip(score, labels))
    score_labels.sort(reverse=True)
    score, labels = list(zip(*score_labels))

    return score, labels


def onnx_runner(image_path, model_path, class_id):
    sess = onnxruntime.InferenceSession(model_path)
    onnx_img_size_h = sess.get_inputs()[0].shape[2]
    onnx_img_size_w = sess.get_inputs()[0].shape[3]
    input_name = sess.get_inputs()[0].name
    input_size = (onnx_img_size_h, onnx_img_size_w)
    np_images = preprocess(image_path, input_size)
    np_images = np_images.astype(np.float32)
    pred_onnx = sess.run(None, {input_name: np_images})[0][0]

    score, labels = postprocess(pred_onnx)

    header = 'Label      Probability'
    itn_line = '{:10} {:8.3f}'
    print(header)
    for i in range(len(score)):
        # print(itn_line.format(class_id[str(labels[i])], score[i]))
        print(itn_line.format(str(labels[i]), score[i]))

    return score, labels


def inference(backbone, image_path, class_id, device, model_def_path, pretrained_path, topk=None):
    num_classes = len(class_id)
    model_structure, input_size = initialize_model(backbone, num_classes, False, model_def_path)

    model_structure.load_state_dict(torch.load(pretrained_path))
    model = model_structure.eval()
    model = model.to(device)

    data_transforms = transforms.Compose([
        transforms.Resize(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0, 0, 0], [1 / 255.0, 1 / 255.0, 1 / 255.0]),
        transforms.Normalize([0.5 * 256, 0.5 * 256, 0.5 * 256], [256.0, 256.0, 256.0])
    ])

    img_data_pytorch = data_transforms(Image.open(image_path))
    img_data_pytorch = img_data_pytorch.to(device)
    with torch.no_grad():
        if topk is None or topk > num_classes:
            topk = num_classes
        outputs = model(img_data_pytorch[None, ...]).topk(topk)
        scores = outputs[0].cpu().numpy()[0]
        probs = softmax(scores)
        preds = outputs[1].cpu().numpy()[0]

    header = 'Label      Probability'
    itn_line = '{:10} {:8.3f}'
    print(header)
    for i in range(len(preds)):
        print(itn_line.format(class_id[str(preds[i])], probs[i]))

    return probs, preds


def parse_args(args):
    """
    Parse the arguments.
    """
    today = str(date.today())

    parser = argparse.ArgumentParser(description='Simple inference script for an image classification network.')
    parser.add_argument('--img-path', type=str, help='Path to the image.')
    parser.add_argument('--backbone', help='Backbone model.', default='resnet18', type=str)
    parser.add_argument('--class_id_path', help='Path to the class id mapping file.', default='./eval_utils/class_id.json')
    parser.add_argument('--gpu', help='Id of the GPU to use (as reported by nvidia-smi); -1 for CPU.', type=int, default=-1)
    parser.add_argument('--model-def-path', type=str, help='Path to the pretrained model definition', default=None)
    parser.add_argument('--snapshot', help='Path to the pretrained models.')
    parser.add_argument('--save-path', help='Path to the classification result.', default='inference_result.json')
    parser.add_argument('--onnx', help='Run inference with an ONNX model', action='store_true')

    print(vars(parser.parse_args(args)))
    return check_args(parser.parse_args(args))


def check_args(parsed_args):
    """Check for inherent contradictions within the parsed arguments.

    Args
        parsed_args: parser.parse_args()
    Returns
        parsed_args
    """
    if parsed_args.gpu >= 0 and not torch.cuda.is_available():
        raise ValueError("No GPU is available")
    return parsed_args


def main(args=None):
    # parse arguments
    if args is None:
        args = sys.argv[1:]

    args = parse_args(args)
    device = "cuda:" + str(args.gpu) if args.gpu >= 0 else "cpu"
    with open(args.class_id_path, 'r') as fp:
        class_id = json.load(fp)

    # Inference
    if args.onnx:
        probs, preds = onnx_runner(args.img_path, args.snapshot, class_id)
    else:
        probs, preds = inference(args.backbone, args.img_path, class_id, device, args.model_def_path, args.snapshot)
    res = {}
    res['img_path'] = os.path.abspath(args.img_path)
    res['0_0'] = []

    for i in range(len(probs)):
        res['0_0'].append([float(probs[i]), int(preds[i])])

    with open(args.save_path, 'w') as fp:
        json.dump(res, fp)


if __name__ == '__main__':
    main()
1
ai_training/classification/inference_result.json
Normal file
@@ -0,0 +1 @@
{"img_path": "/home/ziyan/ai_training/classification/tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg", "0_0": [[0.8352504968643188, 1], [0.16474944353103638, 0]]}
41
ai_training/classification/load_data.py
Normal file
@@ -0,0 +1,41 @@
import os
import json

import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, models, transforms


def load_data(data_dir, batch_size, input_size, worker):
    transform_train_list = [
        transforms.RandomResizedCrop(input_size),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0, 0, 0], [1 / 255.0, 1 / 255.0, 1 / 255.0]),
        transforms.Normalize([0.5 * 256, 0.5 * 256, 0.5 * 256], [256.0, 256.0, 256.0])
    ]
    transform_val_list = [
        transforms.Resize(input_size),
        transforms.ToTensor(),
        transforms.Normalize([0, 0, 0], [1 / 255.0, 1 / 255.0, 1 / 255.0]),
        transforms.Normalize([0.5 * 256, 0.5 * 256, 0.5 * 256], [256.0, 256.0, 256.0])
    ]

    data_transforms = {
        'train': transforms.Compose(transform_train_list),
        'val': transforms.Compose(transform_val_list)
    }

    print("Initializing Datasets and Dataloaders...")
    # Create training and validation datasets
    image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
    # Create training and validation dataloaders
    dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=worker, pin_memory=True) for x in ['train', 'val']}
    print('-------------Label mapping to Idx:--------------')
    class_id = image_datasets['train'].class_to_idx
    class_id = dict([(value, key) for key, value in class_id.items()])
    print(class_id)
    print('------------------------------------------------')
    with open("./eval_utils/class_id.json", "w") as outfile:
        json.dump(class_id, outfile)

    return dataloaders
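The two stacked `Normalize` calls used throughout these scripts look odd but compose into one affine map: the first rescales `ToTensor`'s [0, 1] output back to [0, 255], the second maps that to roughly [-0.5, 0.5] via (x − 128) / 256. A NumPy check of the composition (sample values are illustrative):

```python
import numpy as np

x = np.array([0.0, 0.5, 1.0])        # pixel values after ToTensor, in [0, 1]

# Normalize(mean, std) computes (x - mean) / std, channel-wise
step1 = (x - 0.0) / (1 / 255.0)      # back to [0, 255]
step2 = (step1 - 0.5 * 256) / 256.0  # to roughly [-0.5, 0.5]

# equivalent single affine map
direct = (x * 255.0 - 128.0) / 256.0
print(np.allclose(step2, direct))    # True
```

This matters when exporting to ONNX or reimplementing preprocessing elsewhere: any pipeline reproducing `(x * 255 - 128) / 256` on the raw [0, 1] tensor matches the training-time input distribution.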
10
ai_training/classification/load_lr_scheduler.py
Normal file
@@ -0,0 +1,10 @@
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np


def load_lr_scheduler(optimizer_ft, mode='max', patience=5):
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer_ft, mode=mode, patience=patience)
    return scheduler
139
ai_training/classification/load_model.py
Normal file
@@ -0,0 +1,139 @@
import sys
import os
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, models, transforms


def set_parameter_requires_grad(model, feature_extracting):
    if feature_extracting:
        for param in model.parameters():
            param.requires_grad = False


def intersect_dicts(da, db, exclude=()):
    # Dictionary intersection of matching keys and shapes, omitting 'exclude' keys, using da values
    return {k: v for k, v in da.items() if k in db and not any(x in k for x in exclude) and v.shape == db[k].shape}


def initialize_weights(model_ft, pretrained=''):
    state_dict = torch.load(pretrained)  # load checkpoint
    state_dict = intersect_dicts(state_dict, model_ft.state_dict())  # intersect
    model_ft.load_state_dict(state_dict, strict=False)  # load
    print('Transferred %g/%g items from %s' % (len(state_dict), len(model_ft.state_dict()), pretrained))  # report


def initialize_model(model_name, num_classes, feature_extract, model_def_path=None, use_pretrained=None):
    # Initialize these variables, which are set in the if/elif chain below.
    # Each of them is model specific.
    model_ft = None
    input_size = 0
    current_path = os.getcwd()

    if model_name == 'FP_classifier':
        if num_classes != 2:
            print("Number of classes should be two, exiting...")
            exit()
        if model_def_path is None:
            model_def_path = './models/FP_classifier/'
        sys.path.append(model_def_path)
        from Mobilenet_v2_small import mobile_net_v2
        if use_pretrained:
            model_ft = mobile_net_v2(num_classes)
            model_ft.load_state_dict(torch.load(use_pretrained))
            set_parameter_requires_grad(model_ft, feature_extract)
            if feature_extract:
                for param in model_ft.model.classifier[1].parameters():
                    param.requires_grad = True
        else:
            model_ft = mobile_net_v2(num_classes)

        input_size = (56, 32)

    elif model_name == 'mobilenetv2':
        """ MobileNetV2
        """
        if model_def_path is None:
            model_def_path = './models/MobileNetV2/'
        sys.path.append(model_def_path)
        from Mobilenet_v2 import mobilenet_v2
        if use_pretrained is not None and len(use_pretrained) > 0:
            model_ft = mobilenet_v2(num_classes)
            initialize_weights(model_ft, use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            if feature_extract:
                for param in model_ft.model.classifier[1].parameters():
                    param.requires_grad = True
        else:
            model_ft = mobilenet_v2(num_classes)

        input_size = (224, 224)

    elif model_name == 'resnet18':
        """ ResNet18
        """
        if model_def_path is None:
            model_def_path = './models/ResNet18/'
        sys.path.append(model_def_path)
        from ResNet18 import resnet18
        if use_pretrained is not None and len(use_pretrained) > 0:
            model_ft = resnet18(num_classes)
            initialize_weights(model_ft, use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            if feature_extract:
                for param in model_ft.model.fc.parameters():
                    param.requires_grad = True
        else:
            model_ft = resnet18(num_classes)

        input_size = (224, 224)

    elif model_name == 'resnet50':
        """ ResNet50
        """
        if model_def_path is None:
            model_def_path = './models/ResNet50/'
        sys.path.append(model_def_path)
        from ResNet50 import resnet50
        if use_pretrained is not None and len(use_pretrained) > 0:
            model_ft = resnet50(num_classes)
            initialize_weights(model_ft, use_pretrained)
            set_parameter_requires_grad(model_ft, feature_extract)
            if feature_extract:
                for param in model_ft.model.fc.parameters():
                    param.requires_grad = True
        else:
            model_ft = resnet50(num_classes)

        input_size = (224, 224)

    elif model_name in ['efficientnet-b0', 'efficientnet-b1', 'efficientnet-b2', 'efficientnet-b3', 'efficientnet-b4', 'efficientnet-b5', 'efficientnet-b6', 'efficientnet-b7']:
        """ EfficientNet
        """
        if model_def_path is None:
            model_def_path = './models/EfficientNet/'
        sys.path.append(model_def_path)  # was sys.path.append(sys.path.append(...)), which appended None
        from EfficientNet_520 import EfficientNet
        if use_pretrained is not None and len(use_pretrained) > 0:
            model_ft = EfficientNet.from_name(model_name)
            model_ft.set_swish(memory_efficient=False)
            model_ft.load_state_dict(torch.load(use_pretrained))
            set_parameter_requires_grad(model_ft, feature_extract)
            # original code tested an undefined `imagenet` flag here; the intent is to
            # replace the checkpoint's head (e.g. ImageNet's 1000 classes) when it does
            # not match num_classes
            if model_ft._fc.out_features != num_classes:
                num_ftrs = model_ft._fc.in_features
                model_ft._fc = nn.Linear(num_ftrs, num_classes, bias=True)
        else:
            model_ft = EfficientNet.from_name(model_name, num_classes=num_classes)

        input_size = (224, 224)

    else:
        print("Invalid model name, exiting...")
        exit()

    return model_ft, input_size


if __name__ == '__main__':
    model_ft, input_size = initialize_model('resnet18', 1000, False, model_def_path=None, use_pretrained='ResNet18.pth')
    print(model_ft)
    # from save_model import save_model
    # save_model(model_ft, 'mobilenetv2', 'exp/', 0, 'cpu')
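`intersect_dicts` is what lets a checkpoint with a different head load partially: only keys present in both state dicts with matching shapes survive, so the backbone transfers while a mismatched classifier head is silently skipped. A standalone sketch with NumPy arrays standing in for tensors (layer names and shapes are illustrative):

```python
import numpy as np

def intersect_dicts(da, db, exclude=()):
    # keep da's entries whose key exists in db with the same shape
    return {k: v for k, v in da.items()
            if k in db and not any(x in k for x in exclude) and v.shape == db[k].shape}

checkpoint = {"conv1.weight": np.zeros((64, 3, 7, 7)),
              "fc.weight": np.zeros((1000, 512))}  # 1000-class head
model = {"conv1.weight": np.zeros((64, 3, 7, 7)),
         "fc.weight": np.zeros((2, 512))}          # 2-class head

kept = intersect_dicts(checkpoint, model)
print(sorted(kept))  # only the backbone weight transfers
```

Loading the result with `strict=False`, as `initialize_weights` does, leaves the unmatched head at its freshly initialized values.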
33
ai_training/classification/load_optimizer.py
Normal file
@@ -0,0 +1,33 @@
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import torchvision
from torchvision import datasets, models, transforms


def load_optimizer(model_ft, lr=0.001, momentum=0.9, freeze_backbone=True, op_type='SGD'):
    params_to_update = model_ft.parameters()
    print("Params to learn:")
    if freeze_backbone:
        params_to_update = []
        for name, param in model_ft.named_parameters():
            if param.requires_grad:
                params_to_update.append(param)
                print("\t", name)
    else:
        for name, param in model_ft.named_parameters():
            if param.requires_grad:
                print("\t", name)

    if op_type == 'SGD':
        optimizer_ft = optim.SGD(params_to_update, lr=lr, momentum=momentum)
    elif op_type == 'ASGD':
        optimizer_ft = optim.ASGD(params_to_update, lr=lr)
    elif op_type == 'ADAM':
        optimizer_ft = optim.Adam(params_to_update, lr=lr)  # was missing the assignment, causing a NameError on return
    else:
        print("Invalid optimizer name, exiting...")
        exit()

    return optimizer_ft
10
ai_training/classification/loss_functions.py
Normal file
@@ -0,0 +1,10 @@
import torch
import torch.nn as nn


def load_loss_functions(loss_func='cross_entropy'):
    if loss_func == 'cross_entropy':
        criterion = nn.CrossEntropyLoss()
    else:
        print("Invalid loss function name, exiting...")
        exit()
    return criterion
@@ -0,0 +1,966 @@
import re
import math
import collections
from functools import partial
import torch
from torch import nn
from torch.nn import functional as F
from torch.utils import model_zoo


# Parameters for the entire model (stem, all blocks, and head)
GlobalParams = collections.namedtuple('GlobalParams', [
    'width_coefficient', 'depth_coefficient', 'image_size', 'dropout_rate',
    'num_classes', 'batch_norm_momentum', 'batch_norm_epsilon',
    'drop_connect_rate', 'depth_divisor', 'min_depth', 'include_top'])

# Parameters for an individual model block
BlockArgs = collections.namedtuple('BlockArgs', [
    'num_repeat', 'kernel_size', 'stride', 'expand_ratio',
    'input_filters', 'output_filters', 'se_ratio', 'id_skip'])

# Set GlobalParams and BlockArgs's defaults
GlobalParams.__new__.__defaults__ = (None,) * len(GlobalParams._fields)
BlockArgs.__new__.__defaults__ = (None,) * len(BlockArgs._fields)


# An ordinary implementation of the Swish function
# (note: this variant deliberately returns ReLU(x) instead of x * sigmoid(x))
class Swish(nn.Module):
    def forward(self, x):
        self.relu = nn.ReLU(inplace=False)
        # return x * torch.sigmoid(x)
        return self.relu(x)


# A memory-efficient implementation of the Swish function
class SwishImplementation(torch.autograd.Function):
    @staticmethod
    def forward(ctx, i):
        result = i * torch.sigmoid(i)
        ctx.save_for_backward(i)
        return result

    @staticmethod
    def backward(ctx, grad_output):
        i = ctx.saved_tensors[0]
        sigmoid_i = torch.sigmoid(i)
        return grad_output * (sigmoid_i * (1 + i * (1 - sigmoid_i)))


class MemoryEfficientSwish(nn.Module):
    def forward(self, x):
        return SwishImplementation.apply(x)


def round_filters(filters, global_params):
    """Calculate and round the number of filters based on the width multiplier.
    Uses width_coefficient, depth_divisor and min_depth of global_params.

    Args:
        filters (int): Number of filters to be calculated.
        global_params (namedtuple): Global params of the model.

    Returns:
        new_filters: New number of filters after calculating.
    """
    multiplier = global_params.width_coefficient
    if not multiplier:
        return filters
    # TODO: modify the param names.
    # Maybe the names (width_divisor, min_width)
    # are more suitable than (depth_divisor, min_depth).
    divisor = global_params.depth_divisor
    min_depth = global_params.min_depth
    filters *= multiplier
    min_depth = min_depth or divisor  # pay attention to this line when using min_depth
    # follow the formula transferred from the official TensorFlow implementation
    new_filters = max(min_depth, int(filters + divisor / 2) // divisor * divisor)
    if new_filters < 0.9 * filters:  # prevent rounding down by more than 10%
        new_filters += divisor
    return int(new_filters)
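The rounding rule above snaps the scaled filter count to the nearest multiple of the divisor, then bumps it up if that rounded down by more than 10%. A standalone numeric check of the same formula with the namedtuple fields as plain arguments, assuming the usual divisor of 8:

```python
def round_filters(filters, width_coefficient, depth_divisor=8, min_depth=None):
    # same formula as in the EfficientNet utilities above
    if not width_coefficient:
        return filters
    filters *= width_coefficient
    min_depth = min_depth or depth_divisor
    new_filters = max(min_depth, int(filters + depth_divisor / 2) // depth_divisor * depth_divisor)
    if new_filters < 0.9 * filters:  # never round down by more than 10%
        new_filters += depth_divisor
    return int(new_filters)

print(round_filters(32, 1.0))  # -> 32 (no scaling)
print(round_filters(32, 1.2))  # 38.4 snaps to the nearest multiple of 8 -> 40
```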
def round_repeats(repeats, global_params):
    """Calculate module's repeat number of a block based on depth multiplier.
    Use depth_coefficient of global_params.

    Args:
        repeats (int): num_repeat to be calculated.
        global_params (namedtuple): Global params of the model.

    Returns:
        new repeat: New repeat number after calculating.
    """
    multiplier = global_params.depth_coefficient
    if not multiplier:
        return repeats
    # follow the formula transferred from official TensorFlow implementation
    return int(math.ceil(multiplier * repeats))


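As a hedged illustration, the width/depth rounding logic above can be reproduced standalone. The helper names and the example coefficients below are hypothetical stand-ins, not part of this file; the formulas mirror the two functions above.

```python
import math

def round_filters_sketch(filters, width_coefficient, depth_divisor=8, min_depth=None):
    # Snap channel counts to multiples of depth_divisor after width scaling.
    if not width_coefficient:
        return filters
    filters *= width_coefficient
    min_depth = min_depth or depth_divisor
    new_filters = max(min_depth, int(filters + depth_divisor / 2) // depth_divisor * depth_divisor)
    if new_filters < 0.9 * filters:  # never round down by more than 10%
        new_filters += depth_divisor
    return int(new_filters)

def round_repeats_sketch(repeats, depth_coefficient):
    # Depth scaling always rounds up, so every block survives scaling.
    return int(math.ceil(depth_coefficient * repeats)) if depth_coefficient else repeats

print(round_filters_sketch(32, 1.8))  # stem channels under a b6-like width multiplier
print(round_repeats_sketch(3, 2.6))   # block repeats under a b6-like depth multiplier
```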
def drop_connect(inputs, p, training):
    """Drop connect.

    Args:
        inputs (tensor: BCWH): Input of this structure.
        p (float: 0.0~1.0): Probability of drop connection.
        training (bool): The running mode.

    Returns:
        output: Output after drop connection.
    """
    assert 0 <= p <= 1, 'p must be in range of [0,1]'

    if not training:
        return inputs

    batch_size = inputs.shape[0]
    keep_prob = 1 - p

    # generate binary_tensor mask according to probability (p for 0, 1-p for 1)
    random_tensor = keep_prob
    random_tensor += torch.rand([batch_size, 1, 1, 1], dtype=inputs.dtype, device=inputs.device)
    binary_tensor = torch.floor(random_tensor)

    output = inputs / keep_prob * binary_tensor
    return output


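A minimal pure-Python sketch of the per-sample stochastic-depth rule above, using a scalar instead of a BCWH tensor (the function name is invented for illustration). Dividing by keep_prob rescales the surviving samples so the expected value is unchanged:

```python
import random

def drop_connect_sketch(value, p, training=True):
    # Keep the sample with probability 1 - p (rescaled), otherwise zero it out.
    if not training:
        return value
    keep_prob = 1 - p
    kept = random.random() < keep_prob
    return value / keep_prob if kept else 0.0

random.seed(0)
samples = [drop_connect_sketch(1.0, p=0.2) for _ in range(20000)]
print(sum(samples) / len(samples))  # close to 1.0: expectation is preserved
```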
def get_width_and_height_from_size(x):
    """Obtain height and width from x.

    Args:
        x (int, tuple or list): Data size.

    Returns:
        size: A tuple or list (H, W).
    """
    if isinstance(x, int):
        return x, x
    if isinstance(x, (list, tuple)):
        return x
    else:
        raise TypeError()


def calculate_output_image_size(input_image_size, stride):
    """Calculates the output image size when using Conv2dSamePadding with a stride.
    Necessary for static padding. Thanks to mannatsingh for pointing this out.

    Args:
        input_image_size (int, tuple or list): Size of input image.
        stride (int, tuple or list): Conv2d operation's stride.

    Returns:
        output_image_size: A list [H, W].
    """
    if input_image_size is None:
        return None
    image_height, image_width = get_width_and_height_from_size(input_image_size)
    stride = stride if isinstance(stride, int) else stride[0]
    image_height = int(math.ceil(image_height / stride))
    image_width = int(math.ceil(image_width / stride))
    return [image_height, image_width]


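A quick standalone check of the ceil-division rule above (the helper name is invented for the sketch):

```python
import math

def output_hw(input_hw, stride):
    # 'SAME' convolutions shrink each spatial dim to ceil(size / stride).
    h, w = (input_hw, input_hw) if isinstance(input_hw, int) else input_hw
    return [math.ceil(h / stride), math.ceil(w / stride)]

print(output_hw(224, 2))         # stem stride 2 halves both dims
print(output_hw((225, 224), 2))  # odd sizes round up
```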
# Note:
# The following 'SamePadding' functions make output size equal ceil(input size/stride).
# Only when stride equals 1, can the output size be the same as input size.
# Don't be confused by their function names ! ! !


def get_same_padding_conv2d(image_size=None):
    """Chooses static padding if you have specified an image size, and dynamic padding otherwise.
    Static padding is necessary for ONNX exporting of models.

    Args:
        image_size (int or tuple): Size of the image.

    Returns:
        Conv2dDynamicSamePadding or Conv2dStaticSamePadding.
    """
    if image_size is None:
        return Conv2dDynamicSamePadding
    else:
        return partial(Conv2dStaticSamePadding, image_size=image_size)


class Conv2dDynamicSamePadding(nn.Conv2d):
    """2D Convolutions like TensorFlow, for a dynamic image size.
    The padding is operated in forward function by calculating dynamically.
    """

    # Tips for 'SAME' mode padding.
    #     Given the following:
    #         i: width or height
    #         s: stride
    #         k: kernel size
    #         d: dilation
    #         p: padding
    #     Output after Conv2d:
    #         o = floor((i + p - ((k - 1) * d + 1)) / s + 1)
    # If o equals i, i = floor((i + p - ((k - 1) * d + 1)) / s + 1),
    # => p = (i - 1) * s + ((k - 1) * d + 1) - i

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1, groups=1, bias=True):
        super().__init__(in_channels, out_channels, kernel_size, stride, 0, dilation, groups, bias)
        self.stride = self.stride if len(self.stride) == 2 else [self.stride[0]] * 2

    def forward(self, x):
        ih, iw = x.size()[-2:]
        kh, kw = self.weight.size()[-2:]
        sh, sw = self.stride
        oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)  # change the output size according to stride ! ! !
        pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
        pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
        if pad_h > 0 or pad_w > 0:
            x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2])
        return F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)


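The padding formula in the tips above, p = (o - 1) * s + ((k - 1) * d + 1) - i, can be sanity-checked with plain arithmetic (the helper name below is illustrative only, not part of this file):

```python
import math

def same_pad(i, k, s, d=1):
    # Total padding needed along one dim so that output = ceil(i / s).
    o = math.ceil(i / s)
    return max((o - 1) * s + (k - 1) * d + 1 - i, 0)

print(same_pad(224, k=3, s=1))  # stride 1 keeps size: pad of 2 (1 per side)
print(same_pad(224, k=3, s=2))  # stride 2 halves size: pad of 1 (split 0/1)
```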
class Conv2dStaticSamePadding(nn.Conv2d):
    """2D Convolutions like TensorFlow's 'SAME' mode, with the given input image size.
    The padding module is calculated in construction function, then used in forward.
    """

    # With the same calculation as Conv2dDynamicSamePadding

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, image_size=None, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, stride, **kwargs)
        self.stride = self.stride if len(self.stride) == 2 else [self.stride[0]] * 2

        # Calculate padding based on image size and save it
        assert image_size is not None
        ih, iw = (image_size, image_size) if isinstance(image_size, int) else image_size
        kh, kw = self.weight.size()[-2:]
        sh, sw = self.stride
        oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
        pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
        pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
        if pad_h > 0 or pad_w > 0:
            self.static_padding = nn.ZeroPad2d((pad_w // 2, pad_w - pad_w // 2,
                                                pad_h // 2, pad_h - pad_h // 2))
        else:
            self.static_padding = nn.Identity()

    def forward(self, x):
        x = self.static_padding(x)
        x = F.conv2d(x, self.weight, self.bias, self.stride, self.padding, self.dilation, self.groups)
        return x


def get_same_padding_maxPool2d(image_size=None):
    """Chooses static padding if you have specified an image size, and dynamic padding otherwise.
    Static padding is necessary for ONNX exporting of models.

    Args:
        image_size (int or tuple): Size of the image.

    Returns:
        MaxPool2dDynamicSamePadding or MaxPool2dStaticSamePadding.
    """
    if image_size is None:
        return MaxPool2dDynamicSamePadding
    else:
        return partial(MaxPool2dStaticSamePadding, image_size=image_size)


class MaxPool2dDynamicSamePadding(nn.MaxPool2d):
    """2D MaxPooling like TensorFlow's 'SAME' mode, with a dynamic image size.
    The padding is operated in forward function by calculating dynamically.
    """

    def __init__(self, kernel_size, stride, padding=0, dilation=1, return_indices=False, ceil_mode=False):
        super().__init__(kernel_size, stride, padding, dilation, return_indices, ceil_mode)
        self.stride = [self.stride] * 2 if isinstance(self.stride, int) else self.stride
        self.kernel_size = [self.kernel_size] * 2 if isinstance(self.kernel_size, int) else self.kernel_size
        self.dilation = [self.dilation] * 2 if isinstance(self.dilation, int) else self.dilation

    def forward(self, x):
        ih, iw = x.size()[-2:]
        kh, kw = self.kernel_size
        sh, sw = self.stride
        oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
        pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
        pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
        if pad_h > 0 or pad_w > 0:
            x = F.pad(x, [pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2])
        return F.max_pool2d(x, self.kernel_size, self.stride, self.padding,
                            self.dilation, self.ceil_mode, self.return_indices)


class MaxPool2dStaticSamePadding(nn.MaxPool2d):
    """2D MaxPooling like TensorFlow's 'SAME' mode, with the given input image size.
    The padding module is calculated in construction function, then used in forward.
    """

    def __init__(self, kernel_size, stride, image_size=None, **kwargs):
        super().__init__(kernel_size, stride, **kwargs)
        self.stride = [self.stride] * 2 if isinstance(self.stride, int) else self.stride
        self.kernel_size = [self.kernel_size] * 2 if isinstance(self.kernel_size, int) else self.kernel_size
        self.dilation = [self.dilation] * 2 if isinstance(self.dilation, int) else self.dilation

        # Calculate padding based on image size and save it
        assert image_size is not None
        ih, iw = (image_size, image_size) if isinstance(image_size, int) else image_size
        kh, kw = self.kernel_size
        sh, sw = self.stride
        oh, ow = math.ceil(ih / sh), math.ceil(iw / sw)
        pad_h = max((oh - 1) * self.stride[0] + (kh - 1) * self.dilation[0] + 1 - ih, 0)
        pad_w = max((ow - 1) * self.stride[1] + (kw - 1) * self.dilation[1] + 1 - iw, 0)
        if pad_h > 0 or pad_w > 0:
            self.static_padding = nn.ZeroPad2d((pad_w // 2, pad_w - pad_w // 2, pad_h // 2, pad_h - pad_h // 2))
        else:
            self.static_padding = nn.Identity()

    def forward(self, x):
        x = self.static_padding(x)
        x = F.max_pool2d(x, self.kernel_size, self.stride, self.padding,
                         self.dilation, self.ceil_mode, self.return_indices)
        return x


class BlockDecoder(object):
    """Block Decoder for readability,
    straight from the official TensorFlow repository.
    """

    @staticmethod
    def _decode_block_string(block_string):
        """Get a block through a string notation of arguments.

        Args:
            block_string (str): A string notation of arguments.
                Examples: 'r1_k3_s11_e1_i32_o16_se0.25_noskip'.

        Returns:
            BlockArgs: The namedtuple defined at the top of this file.
        """
        assert isinstance(block_string, str)

        ops = block_string.split('_')
        options = {}
        for op in ops:
            splits = re.split(r'(\d.*)', op)
            if len(splits) >= 2:
                key, value = splits[:2]
                options[key] = value

        # Check stride
        assert (('s' in options and len(options['s']) == 1) or
                (len(options['s']) == 2 and options['s'][0] == options['s'][1]))

        return BlockArgs(
            num_repeat=int(options['r']),
            kernel_size=int(options['k']),
            stride=[int(options['s'][0])],
            expand_ratio=int(options['e']),
            input_filters=int(options['i']),
            output_filters=int(options['o']),
            se_ratio=float(options['se']) if 'se' in options else None,
            id_skip=('noskip' not in block_string))

    @staticmethod
    def _encode_block_string(block):
        """Encode a block to a string.

        Args:
            block (namedtuple): A BlockArgs type argument.

        Returns:
            block_string: A String form of BlockArgs.
        """
        args = [
            'r%d' % block.num_repeat,
            'k%d' % block.kernel_size,
            's%d%d' % (block.stride[0], block.stride[0]),  # stride is stored as a one-element list
            'e%s' % block.expand_ratio,
            'i%d' % block.input_filters,
            'o%d' % block.output_filters
        ]
        if 0 < block.se_ratio <= 1:
            args.append('se%s' % block.se_ratio)
        if block.id_skip is False:
            args.append('noskip')
        return '_'.join(args)

    @staticmethod
    def decode(string_list):
        """Decode a list of string notations to specify blocks inside the network.

        Args:
            string_list (list[str]): A list of strings, each string is a notation of block.

        Returns:
            blocks_args: A list of BlockArgs namedtuples of block args.
        """
        assert isinstance(string_list, list)
        blocks_args = []
        for block_string in string_list:
            blocks_args.append(BlockDecoder._decode_block_string(block_string))
        return blocks_args

    @staticmethod
    def encode(blocks_args):
        """Encode a list of BlockArgs to a list of strings.

        Args:
            blocks_args (list[namedtuples]): A list of BlockArgs namedtuples of block args.

        Returns:
            block_strings: A list of strings, each string is a notation of block.
        """
        block_strings = []
        for block in blocks_args:
            block_strings.append(BlockDecoder._encode_block_string(block))
        return block_strings


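The block-string grammar used above ('r1_k3_s11_e1_i32_o16_se0.25') can be illustrated with a self-contained parser sketch; it returns a plain dict rather than the file's BlockArgs namedtuple, and the function name is invented for the sketch:

```python
import re

def parse_block_string(block_string):
    # Split 'r1_k3_s11_e1_i32_o16_se0.25' into {key: value} pairs:
    # each '_'-separated token is a letter prefix followed by its value,
    # where the value starts at the first digit (so 'se0.25' -> 'se': '0.25').
    options = {}
    for op in block_string.split('_'):
        splits = re.split(r'(\d.*)', op)
        if len(splits) >= 2:
            key, value = splits[:2]
            options[key] = value
    return options

opts = parse_block_string('r1_k3_s11_e1_i32_o16_se0.25')
print(opts['r'], opts['k'], opts['s'], opts['se'])
```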
def efficientnet_params(model_name):
    """Map EfficientNet model name to parameter coefficients.

    Args:
        model_name (str): Model name to be queried.

    Returns:
        params_dict[model_name]: A (width,depth,res,dropout) tuple.
    """
    params_dict = {
        # Coefficients:   width,depth,res,dropout
        'efficientnet-b0': (1.0, 1.0, 224, 0.2),
        'efficientnet-b1': (1.0, 1.1, 240, 0.2),
        'efficientnet-b2': (1.1, 1.2, 260, 0.3),
        'efficientnet-b3': (1.2, 1.4, 300, 0.3),
        'efficientnet-b4': (1.4, 1.8, 380, 0.4),
        'efficientnet-b5': (1.6, 2.2, 456, 0.4),
        'efficientnet-b6': (1.8, 2.6, 528, 0.5),
        'efficientnet-b7': (2.0, 3.1, 600, 0.5),
        'efficientnet-b8': (2.2, 3.6, 672, 0.5),
        'efficientnet-l2': (4.3, 5.3, 800, 0.5),
    }
    return params_dict[model_name]


def efficientnet(width_coefficient=None, depth_coefficient=None, image_size=None,
                 dropout_rate=0.2, drop_connect_rate=0.2, num_classes=1000, include_top=True):
    """Create BlockArgs and GlobalParams for efficientnet model.

    Args:
        width_coefficient (float)
        depth_coefficient (float)
        image_size (int)
        dropout_rate (float)
        drop_connect_rate (float)
        num_classes (int)

        Meaning as the name suggests.

    Returns:
        blocks_args, global_params.
    """

    # Blocks args for the whole model (efficientnet-b0 by default)
    # It will be modified in the construction of EfficientNet Class according to model
    blocks_args = [
        'r1_k3_s11_e1_i32_o16_se0.25',
        'r2_k3_s22_e6_i16_o24_se0.25',
        'r2_k5_s22_e6_i24_o40_se0.25',
        'r3_k3_s22_e6_i40_o80_se0.25',
        'r3_k5_s11_e6_i80_o112_se0.25',
        'r4_k5_s22_e6_i112_o192_se0.25',
        'r1_k3_s11_e6_i192_o320_se0.25',
    ]
    blocks_args = BlockDecoder.decode(blocks_args)

    global_params = GlobalParams(
        width_coefficient=width_coefficient,
        depth_coefficient=depth_coefficient,
        image_size=image_size,
        dropout_rate=dropout_rate,
        num_classes=num_classes,
        batch_norm_momentum=0.99,
        batch_norm_epsilon=1e-3,
        drop_connect_rate=drop_connect_rate,
        depth_divisor=8,
        min_depth=None,
        include_top=include_top,
    )

    return blocks_args, global_params


def get_model_params(model_name, override_params):
    """Get the block args and global params for a given model name.

    Args:
        model_name (str): Model's name.
        override_params (dict): A dict to modify global_params.

    Returns:
        blocks_args, global_params
    """
    if model_name.startswith('efficientnet'):
        w, d, s, p = efficientnet_params(model_name)
        # note: all models have drop connect rate = 0.2
        blocks_args, global_params = efficientnet(
            width_coefficient=w, depth_coefficient=d, dropout_rate=p, image_size=s)
    else:
        raise NotImplementedError('model name is not pre-defined: {}'.format(model_name))
    if override_params:
        # ValueError will be raised here if override_params has fields not included in global_params.
        global_params = global_params._replace(**override_params)
    return blocks_args, global_params


# train with Standard methods
# check more details in paper(EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks)
url_map = {
    'efficientnet-b0': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b0-355c32eb.pth',
    'efficientnet-b1': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b1-f1951068.pth',
    'efficientnet-b2': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b2-8bb594d6.pth',
    'efficientnet-b3': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b3-5fb5a3c3.pth',
    'efficientnet-b4': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b4-6ed6700e.pth',
    'efficientnet-b5': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b5-b6417697.pth',
    'efficientnet-b6': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b6-c76e70fd.pth',
    'efficientnet-b7': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b7-dcc49843.pth',
}

# train with Adversarial Examples (AdvProp)
# check more details in paper(Adversarial Examples Improve Image Recognition)
url_map_advprop = {
    'efficientnet-b0': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b0-b64d5a18.pth',
    'efficientnet-b1': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b1-0f3ce85a.pth',
    'efficientnet-b2': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b2-6e9d97e5.pth',
    'efficientnet-b3': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b3-cdd7c0f4.pth',
    'efficientnet-b4': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b4-44fb3a87.pth',
    'efficientnet-b5': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b5-86493f6b.pth',
    'efficientnet-b6': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b6-ac80338e.pth',
    'efficientnet-b7': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b7-4652b6dd.pth',
    'efficientnet-b8': 'https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/adv-efficientnet-b8-22a8fe65.pth',
}

# TODO: add the pretrained weights url map of 'efficientnet-l2'


def load_pretrained_weights(model, model_name, weights_path=None, load_fc=True, advprop=False):
    """Loads pretrained weights from weights path or download using url.

    Args:
        model (Module): The whole model of efficientnet.
        model_name (str): Model name of efficientnet.
        weights_path (None or str):
            str: path to pretrained weights file on the local disk.
            None: use pretrained weights downloaded from the Internet.
        load_fc (bool): Whether to load pretrained weights for fc layer at the end of the model.
        advprop (bool): Whether to load pretrained weights
                        trained with advprop (valid when weights_path is None).
    """
    if isinstance(weights_path, str):
        state_dict = torch.load(weights_path)
    else:
        # AutoAugment or Advprop (different preprocessing)
        url_map_ = url_map_advprop if advprop else url_map
        state_dict = model_zoo.load_url(url_map_[model_name])

    if load_fc:
        ret = model.load_state_dict(state_dict, strict=False)
        assert not ret.missing_keys, 'Missing keys when loading pretrained weights: {}'.format(ret.missing_keys)
    else:
        state_dict.pop('_fc.weight')
        state_dict.pop('_fc.bias')
        ret = model.load_state_dict(state_dict, strict=False)
        assert set(ret.missing_keys) == set(
            ['_fc.weight', '_fc.bias']), 'Missing keys when loading pretrained weights: {}'.format(ret.missing_keys)
    assert not ret.unexpected_keys, 'Unexpected keys when loading pretrained weights: {}'.format(ret.unexpected_keys)

    print('Loaded pretrained weights for {}'.format(model_name))


VALID_MODELS = (
    'efficientnet-b0', 'efficientnet-b1', 'efficientnet-b2', 'efficientnet-b3',
    'efficientnet-b4', 'efficientnet-b5', 'efficientnet-b6', 'efficientnet-b7',
    'efficientnet-b8',

    # Support the construction of 'efficientnet-l2' without pretrained weights
    'efficientnet-l2'
)


class MBConvBlock(nn.Module):
    """Mobile Inverted Residual Bottleneck Block.

    Args:
        block_args (namedtuple): BlockArgs, defined in utils.py.
        global_params (namedtuple): GlobalParam, defined in utils.py.
        image_size (tuple or list): [image_height, image_width].

    References:
        [1] https://arxiv.org/abs/1704.04861 (MobileNet v1)
        [2] https://arxiv.org/abs/1801.04381 (MobileNet v2)
        [3] https://arxiv.org/abs/1905.02244 (MobileNet v3)
    """

    def __init__(self, block_args, global_params, image_size=None):
        super().__init__()
        self._block_args = block_args
        self._bn_mom = 1 - global_params.batch_norm_momentum  # pytorch's difference from tensorflow
        self._bn_eps = global_params.batch_norm_epsilon
        # self.has_se = (self._block_args.se_ratio is not None) and (0 < self._block_args.se_ratio <= 1)
        self.has_se = False
        self.id_skip = block_args.id_skip  # whether to use skip connection and drop connect

        # Expansion phase (Inverted Bottleneck)
        inp = self._block_args.input_filters  # number of input channels
        oup = self._block_args.input_filters * self._block_args.expand_ratio  # number of output channels
        if self._block_args.expand_ratio != 1:
            Conv2d = get_same_padding_conv2d(image_size=image_size)
            self._expand_conv = Conv2d(in_channels=inp, out_channels=oup, kernel_size=1, bias=False)
            self._bn0 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps)
            # image_size = calculate_output_image_size(image_size, 1) <-- this wouldn't modify image_size

        # Depthwise convolution phase
        k = self._block_args.kernel_size
        s = self._block_args.stride
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._depthwise_conv = Conv2d(
            in_channels=oup, out_channels=oup, groups=oup,  # groups makes it depthwise
            kernel_size=k, stride=s, bias=False)
        self._bn1 = nn.BatchNorm2d(num_features=oup, momentum=self._bn_mom, eps=self._bn_eps)
        image_size = calculate_output_image_size(image_size, s)

        # Squeeze and Excitation layer, if desired
        if self.has_se:
            Conv2d = get_same_padding_conv2d(image_size=(1, 1))
            num_squeezed_channels = max(1, int(self._block_args.input_filters * self._block_args.se_ratio))
            self._se_reduce = Conv2d(in_channels=oup, out_channels=num_squeezed_channels, kernel_size=1)
            self._se_expand = Conv2d(in_channels=num_squeezed_channels, out_channels=oup, kernel_size=1)

        # Pointwise convolution phase
        final_oup = self._block_args.output_filters
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._project_conv = Conv2d(in_channels=oup, out_channels=final_oup, kernel_size=1, bias=False)
        self._bn2 = nn.BatchNorm2d(num_features=final_oup, momentum=self._bn_mom, eps=self._bn_eps)
        self._swish = MemoryEfficientSwish()

    def forward(self, inputs, drop_connect_rate=None):
        """MBConvBlock's forward function.

        Args:
            inputs (tensor): Input tensor.
            drop_connect_rate (float): Drop connect rate (float, between 0 and 1).

        Returns:
            Output of this block after processing.
        """

        # Expansion and Depthwise Convolution
        x = inputs
        if self._block_args.expand_ratio != 1:
            x = self._expand_conv(inputs)
            x = self._bn0(x)
            x = self._swish(x)

        x = self._depthwise_conv(x)
        x = self._bn1(x)
        x = self._swish(x)

        # Squeeze and Excitation
        if self.has_se:
            x_squeezed = F.adaptive_avg_pool2d(x, 1)
            x_squeezed = self._se_reduce(x_squeezed)
            x_squeezed = self._swish(x_squeezed)
            x_squeezed = self._se_expand(x_squeezed)
            x = torch.sigmoid(x_squeezed) * x

        # Pointwise Convolution
        x = self._project_conv(x)
        x = self._bn2(x)

        # Skip connection and drop connect
        input_filters, output_filters = self._block_args.input_filters, self._block_args.output_filters
        if self.id_skip and self._block_args.stride == 1 and input_filters == output_filters:
            # The combination of skip connection and drop connect brings about stochastic depth.
            if drop_connect_rate:
                x = drop_connect(x, p=drop_connect_rate, training=self.training)
            x = x + inputs  # skip connection
        return x

    def set_swish(self, memory_efficient=True):
        """Sets swish function as memory efficient (for training) or standard (for export).

        Args:
            memory_efficient (bool): Whether to use memory-efficient version of swish.
        """
        self._swish = MemoryEfficientSwish() if memory_efficient else Swish()


class EfficientNet(nn.Module):
    """EfficientNet model.
    Most easily loaded with the .from_name or .from_pretrained methods.

    Args:
        blocks_args (list[namedtuple]): A list of BlockArgs to construct blocks.
        global_params (namedtuple): A set of GlobalParams shared between blocks.

    References:
        [1] https://arxiv.org/abs/1905.11946 (EfficientNet)

    Example:
        >>> import torch
        >>> from efficientnet.model import EfficientNet
        >>> inputs = torch.rand(1, 3, 224, 224)
        >>> model = EfficientNet.from_pretrained('efficientnet-b0')
        >>> model.eval()
        >>> outputs = model(inputs)
    """

    def __init__(self, blocks_args=None, global_params=None):
        super().__init__()
        assert isinstance(blocks_args, list), 'blocks_args should be a list'
        assert len(blocks_args) > 0, 'block args must be greater than 0'
        self._global_params = global_params
        self._blocks_args = blocks_args

        # Batch norm parameters
        bn_mom = 1 - self._global_params.batch_norm_momentum
        bn_eps = self._global_params.batch_norm_epsilon

        # Get stem static or dynamic convolution depending on image size
        image_size = global_params.image_size
        Conv2d = get_same_padding_conv2d(image_size=image_size)

        # Stem
        in_channels = 3  # rgb
        out_channels = round_filters(32, self._global_params)  # number of output channels
        self._conv_stem = Conv2d(in_channels, out_channels, kernel_size=3, stride=2, bias=False)
        self._bn0 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps)
        image_size = calculate_output_image_size(image_size, 2)

        # Build blocks
        self._blocks = nn.ModuleList([])
        for block_args in self._blocks_args:

            # Update block input and output filters based on depth multiplier.
            block_args = block_args._replace(
                input_filters=round_filters(block_args.input_filters, self._global_params),
                output_filters=round_filters(block_args.output_filters, self._global_params),
                num_repeat=round_repeats(block_args.num_repeat, self._global_params)
            )

            # The first block needs to take care of stride and filter size increase.
            self._blocks.append(MBConvBlock(block_args, self._global_params, image_size=image_size))
            image_size = calculate_output_image_size(image_size, block_args.stride)
            if block_args.num_repeat > 1:  # modify block_args to keep same output size
                block_args = block_args._replace(input_filters=block_args.output_filters, stride=1)
            for _ in range(block_args.num_repeat - 1):
                self._blocks.append(MBConvBlock(block_args, self._global_params, image_size=image_size))
                # image_size = calculate_output_image_size(image_size, block_args.stride)  # stride = 1

        # Head
        in_channels = block_args.output_filters  # output of final block
        out_channels = round_filters(1280, self._global_params)
        Conv2d = get_same_padding_conv2d(image_size=image_size)
        self._conv_head = Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
        self._bn1 = nn.BatchNorm2d(num_features=out_channels, momentum=bn_mom, eps=bn_eps)

        # Final linear layer
        self._avg_pooling = nn.AdaptiveAvgPool2d(1)
        self._dropout = nn.Dropout(self._global_params.dropout_rate)
        self._fc = nn.Linear(out_channels, self._global_params.num_classes)
        # self._swish = MemoryEfficientSwish()
        self._swish = Swish()

    def set_swish(self, memory_efficient=True):
        """Sets swish function as memory efficient (for training) or standard (for export).

        Args:
            memory_efficient (bool): Whether to use memory-efficient version of swish.
        """
        self._swish = MemoryEfficientSwish() if memory_efficient else Swish()
        for block in self._blocks:
            block.set_swish(memory_efficient)

    def extract_endpoints(self, inputs):
        """Use convolution layer to extract features
        from reduction levels i in [1, 2, 3, 4, 5].

        Args:
            inputs (tensor): Input tensor.

        Returns:
            Dictionary of last intermediate features
            with reduction levels i in [1, 2, 3, 4, 5].
            Example:
                >>> import torch
                >>> from efficientnet.model import EfficientNet
                >>> inputs = torch.rand(1, 3, 224, 224)
                >>> model = EfficientNet.from_pretrained('efficientnet-b0')
                >>> endpoints = model.extract_endpoints(inputs)
                >>> print(endpoints['reduction_1'].shape)  # torch.Size([1, 16, 112, 112])
                >>> print(endpoints['reduction_2'].shape)  # torch.Size([1, 24, 56, 56])
                >>> print(endpoints['reduction_3'].shape)  # torch.Size([1, 40, 28, 28])
                >>> print(endpoints['reduction_4'].shape)  # torch.Size([1, 112, 14, 14])
                >>> print(endpoints['reduction_5'].shape)  # torch.Size([1, 1280, 7, 7])
        """
        endpoints = dict()

        # Stem
        x = self._swish(self._bn0(self._conv_stem(inputs)))
        prev_x = x
|
|
||||||
|
# Blocks
|
||||||
|
for idx, block in enumerate(self._blocks):
|
||||||
|
drop_connect_rate = self._global_params.drop_connect_rate
|
||||||
|
if drop_connect_rate:
|
||||||
|
drop_connect_rate *= float(idx) / len(self._blocks) # scale drop connect_rate
|
||||||
|
x = block(x, drop_connect_rate=drop_connect_rate)
|
||||||
|
if prev_x.size(2) > x.size(2):
|
||||||
|
endpoints['reduction_{}'.format(len(endpoints)+1)] = prev_x
|
||||||
|
prev_x = x
|
||||||
|
|
||||||
|
# Head
|
||||||
|
x = self._swish(self._bn1(self._conv_head(x)))
|
||||||
|
endpoints['reduction_{}'.format(len(endpoints)+1)] = x
|
||||||
|
|
||||||
|
return endpoints
|
||||||
|
|
||||||
|
def extract_features(self, inputs):
|
||||||
|
"""use convolution layer to extract feature .
|
||||||
|
|
||||||
|
Args:
|
||||||
|
inputs (tensor): Input tensor.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Output of the final convolution
|
||||||
|
layer in the efficientnet model.
|
||||||
|
"""
|
||||||
|
# Stem
|
||||||
|
x = self._swish(self._bn0(self._conv_stem(inputs)))
|
||||||
|
|
||||||
|
# Blocks
|
||||||
|
for idx, block in enumerate(self._blocks):
|
||||||
|
drop_connect_rate = self._global_params.drop_connect_rate
|
||||||
|
if drop_connect_rate:
|
||||||
|
drop_connect_rate *= float(idx) / len(self._blocks) # scale drop connect_rate
|
||||||
|
x = block(x, drop_connect_rate=drop_connect_rate)
|
||||||
|
|
||||||
|
# Head
|
||||||
|
x = self._swish(self._bn1(self._conv_head(x)))
|
||||||
|
|
||||||
|
return x
|
||||||
|
|
||||||
|
def forward(self, inputs):
|
||||||
|
"""EfficientNet's forward function.
|
||||||
|
Calls extract_features to extract features, applies final linear layer, and returns logits.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
inputs (tensor): Input tensor.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Output of this model after processing.
|
||||||
|
"""
|
||||||
|
# Convolution layers
|
||||||
|
x = self.extract_features(inputs)
|
||||||
|
# Pooling and final linear layer
|
||||||
|
x = self._avg_pooling(x)
|
||||||
|
if self._global_params.include_top:
|
||||||
|
x = x.flatten(start_dim=1)
|
||||||
|
x = self._dropout(x)
|
||||||
|
x = self._fc(x)
|
||||||
|
return x
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def from_name(cls, model_name, in_channels=3, **override_params):
|
||||||
|
"""create an efficientnet model according to name.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model_name (str): Name for efficientnet.
|
||||||
|
in_channels (int): Input data's channel number.
|
||||||
|
override_params (other key word params):
|
||||||
|
Params to override model's global_params.
|
||||||
|
Optional key:
|
||||||
|
'width_coefficient', 'depth_coefficient',
|
||||||
|
'image_size', 'dropout_rate',
|
||||||
|
'num_classes', 'batch_norm_momentum',
|
||||||
|
'batch_norm_epsilon', 'drop_connect_rate',
|
||||||
|
'depth_divisor', 'min_depth'
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
An efficientnet model.
|
||||||
|
"""
|
||||||
|
cls._check_model_name_is_valid(model_name)
|
||||||
|
blocks_args, global_params = get_model_params(model_name, override_params)
|
||||||
|
model = cls(blocks_args, global_params)
|
||||||
|
model._change_in_channels(in_channels)
|
||||||
|
return model
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def from_pretrained(cls, model_name, weights_path=None, advprop=False,
|
||||||
|
in_channels=3, num_classes=1000, **override_params):
|
||||||
|
"""create an efficientnet model according to name.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model_name (str): Name for efficientnet.
|
||||||
|
weights_path (None or str):
|
||||||
|
str: path to pretrained weights file on the local disk.
|
||||||
|
None: use pretrained weights downloaded from the Internet.
|
||||||
|
advprop (bool):
|
||||||
|
Whether to load pretrained weights
|
||||||
|
trained with advprop (valid when weights_path is None).
|
||||||
|
in_channels (int): Input data's channel number.
|
||||||
|
num_classes (int):
|
||||||
|
Number of categories for classification.
|
||||||
|
It controls the output size for final linear layer.
|
||||||
|
override_params (other key word params):
|
||||||
|
Params to override model's global_params.
|
||||||
|
Optional key:
|
||||||
|
'width_coefficient', 'depth_coefficient',
|
||||||
|
'image_size', 'dropout_rate',
|
||||||
|
'batch_norm_momentum',
|
||||||
|
'batch_norm_epsilon', 'drop_connect_rate',
|
||||||
|
'depth_divisor', 'min_depth'
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
A pretrained efficientnet model.
|
||||||
|
"""
|
||||||
|
model = cls.from_name(model_name, num_classes=num_classes, **override_params)
|
||||||
|
load_pretrained_weights(model, model_name, weights_path=weights_path, load_fc=(num_classes == 1000), advprop=advprop)
|
||||||
|
model._change_in_channels(in_channels)
|
||||||
|
return model
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def get_image_size(cls, model_name):
|
||||||
|
"""Get the input image size for a given efficientnet model.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model_name (str): Name for efficientnet.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Input image size (resolution).
|
||||||
|
"""
|
||||||
|
cls._check_model_name_is_valid(model_name)
|
||||||
|
_, _, res, _ = efficientnet_params(model_name)
|
||||||
|
return res
|
||||||
|
|
||||||
|
@classmethod
|
||||||
|
def _check_model_name_is_valid(cls, model_name):
|
||||||
|
"""Validates model name.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
model_name (str): Name for efficientnet.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
bool: Is a valid name or not.
|
||||||
|
"""
|
||||||
|
if model_name not in VALID_MODELS:
|
||||||
|
raise ValueError('model_name should be one of: ' + ', '.join(VALID_MODELS))
|
||||||
|
|
||||||
|
def _change_in_channels(self, in_channels):
|
||||||
|
"""Adjust model's first convolution layer to in_channels, if in_channels not equals 3.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
in_channels (int): Input data's channel number.
|
||||||
|
"""
|
||||||
|
if in_channels != 3:
|
||||||
|
Conv2d = get_same_padding_conv2d(image_size=self._global_params.image_size)
|
||||||
|
out_channels = round_filters(32, self._global_params)
|
||||||
|
self._conv_stem = Conv2d(in_channels, out_channels, kernel_size=3, stride=2, bias=False)
|
||||||
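The constructor above sizes every layer with `round_filters`, which in the reference EfficientNet implementation scales a base channel count by the width coefficient and snaps it to a multiple of `depth_divisor`. A standalone sketch of that rounding rule (simplified signature; the model's version reads these values from `global_params` and also honors an optional `min_depth`):

```python
def round_filters(filters, width_coefficient, depth_divisor=8):
    """Scale `filters` by the width multiplier, rounding to a multiple of
    `depth_divisor` and never dropping more than 10% below the target."""
    filters *= width_coefficient
    new_filters = max(depth_divisor,
                      int(filters + depth_divisor / 2) // depth_divisor * depth_divisor)
    if new_filters < 0.9 * filters:  # guard against rounding down too far
        new_filters += depth_divisor
    return int(new_filters)

# B0 (width 1.0) keeps the 32-channel stem; a 1.4 width multiplier (B4)
# scales the stem from 32 to 48 channels.
print(round_filters(32, 1.0))  # 32
print(round_filters(32, 1.4))  # 48
```

This is why the stem and head calls pass fixed bases (32 and 1280): each variant derives its actual widths from the same bases through this rule.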
@@ -0,0 +1,143 @@
import torch
import torch.nn as nn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms


class mobile_net_v2(nn.Module):
    """MobileNetV2 with every channel count scaled to 3/4 width (the n//4*3 expressions)."""

    def __init__(self, num_classes=2):
        super(mobile_net_v2, self).__init__()
        self.model = models.mobilenet_v2(pretrained=False)
        # Replace the last FC layer with one sized for our number of classes (at 3/4 width).
        # num_ftrs = self.mobile_model.classifier.in_features
        num_ftrs = self.model.classifier[-1].in_features
        # self.mobile_model.reset_classifier(0)
        self.model.classifier[1] = nn.Linear(num_ftrs//4*3, num_classes, bias=True)

        self.model.features[0][0] = nn.Conv2d(3, 32//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        self.model.features[0][1] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[1].conv[0][0] = nn.Conv2d(32//4*3, 32//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32//4*3, bias=False)
        self.model.features[1].conv[0][1] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[1].conv[1] = nn.Conv2d(32//4*3, 16//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[1].conv[2] = nn.BatchNorm2d(16//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[2].conv[0][0] = nn.Conv2d(16//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[2].conv[0][1] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[2].conv[1][0] = nn.Conv2d(96//4*3, 96//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96//4*3, bias=False)
        self.model.features[2].conv[1][1] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[2].conv[2] = nn.Conv2d(96//4*3, 24//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[2].conv[3] = nn.BatchNorm2d(24//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[3].conv[0][0] = nn.Conv2d(24//4*3, 128//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[3].conv[0][1] = nn.BatchNorm2d(128//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[3].conv[1][0] = nn.Conv2d(128//4*3, 128//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128//4*3, bias=False)
        self.model.features[3].conv[1][1] = nn.BatchNorm2d(128//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[3].conv[2] = nn.Conv2d(128//4*3, 24//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[3].conv[3] = nn.BatchNorm2d(24//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[4].conv[0][0] = nn.Conv2d(24//4*3, 144//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[4].conv[0][1] = nn.BatchNorm2d(144//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[4].conv[1][0] = nn.Conv2d(144//4*3, 144//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144//4*3, bias=False)
        self.model.features[4].conv[1][1] = nn.BatchNorm2d(144//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[4].conv[2] = nn.Conv2d(144//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[4].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[5].conv[0][0] = nn.Conv2d(32//4*3, 176//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[5].conv[0][1] = nn.BatchNorm2d(176//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[5].conv[1][0] = nn.Conv2d(176//4*3, 176//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=176//4*3, bias=False)
        self.model.features[5].conv[1][1] = nn.BatchNorm2d(176//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[5].conv[2] = nn.Conv2d(176//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[5].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[6].conv[0][0] = nn.Conv2d(32//4*3, 192//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[6].conv[0][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[6].conv[1][0] = nn.Conv2d(192//4*3, 192//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192//4*3, bias=False)
        self.model.features[6].conv[1][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[6].conv[2] = nn.Conv2d(192//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[6].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[7].conv[0][0] = nn.Conv2d(32//4*3, 192//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[7].conv[0][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[7].conv[1][0] = nn.Conv2d(192//4*3, 192//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192//4*3, bias=False)
        self.model.features[7].conv[1][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[7].conv[2] = nn.Conv2d(192//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[7].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[8].conv[0][0] = nn.Conv2d(64//4*3, 368//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[8].conv[0][1] = nn.BatchNorm2d(368//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[8].conv[1][0] = nn.Conv2d(368//4*3, 368//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=368//4*3, bias=False)
        self.model.features[8].conv[1][1] = nn.BatchNorm2d(368//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[8].conv[2] = nn.Conv2d(368//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[8].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[9].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[9].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[9].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[9].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[9].conv[2] = nn.Conv2d(384//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[9].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[10].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[10].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[10].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[10].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[10].conv[2] = nn.Conv2d(384//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[10].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[11].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[11].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[11].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[11].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[11].conv[2] = nn.Conv2d(384//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[11].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[12].conv[0][0] = nn.Conv2d(96//4*3, 560//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[12].conv[0][1] = nn.BatchNorm2d(560//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[12].conv[1][0] = nn.Conv2d(560//4*3, 560//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=560//4*3, bias=False)
        self.model.features[12].conv[1][1] = nn.BatchNorm2d(560//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[12].conv[2] = nn.Conv2d(560//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[12].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[13].conv[0][0] = nn.Conv2d(96//4*3, 576//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[13].conv[0][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[13].conv[1][0] = nn.Conv2d(576//4*3, 576//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576//4*3, bias=False)
        self.model.features[13].conv[1][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[13].conv[2] = nn.Conv2d(576//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[13].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[14].conv[0][0] = nn.Conv2d(96//4*3, 576//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[14].conv[0][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[14].conv[1][0] = nn.Conv2d(576//4*3, 576//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576//4*3, bias=False)
        self.model.features[14].conv[1][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[14].conv[2] = nn.Conv2d(576//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[14].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[15].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[15].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[15].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
        self.model.features[15].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[15].conv[2] = nn.Conv2d(960//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[15].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[16].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[16].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[16].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
        self.model.features[16].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[16].conv[2] = nn.Conv2d(960//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[16].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[17].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[17].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[17].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
        self.model.features[17].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[17].conv[2] = nn.Conv2d(960//4*3, 320//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[17].conv[3] = nn.BatchNorm2d(320//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[18][0] = nn.Conv2d(320//4*3, 1280//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[18][1] = nn.BatchNorm2d(1280//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

    def forward(self, x):
        f = self.model(x)
        # y = self.classifier(f)
        return f
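Every channel count in the class above is written as `n//4*3`: the stock MobileNetV2 widths cut to three quarters, with the floor division done first so the results stay exact integers usable as `groups` for the depthwise convolutions. A quick arithmetic check of the pattern on the widths used above:

```python
# Stock MobileNetV2 channel widths and their 3/4-scaled counterparts,
# exactly as produced by the n//4*3 expressions in the class.
stock = [32, 16, 24, 96, 144, 192, 320, 1280]
scaled = [n // 4 * 3 for n in stock]
print(scaled)  # [24, 12, 18, 72, 108, 144, 240, 960]
```

Note that because the depthwise `Conv2d` layers set `groups` equal to their (scaled) channel count, the same expression must be used for both arguments, which is why it is repeated verbatim throughout.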
ai_training/classification/models/FP_classifier/__init__.py
@@ -0,0 +1,145 @@
import torch
import torch.nn as nn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib
import time


class mobile_net_v2(nn.Module):
    """MobileNetV2 with every channel count scaled to 3/4 width (the n//4*3 expressions)."""

    def __init__(self, num_classes=2):
        super(mobile_net_v2, self).__init__()
        self.model = models.mobilenet_v2(pretrained=False)
        # Replace the last FC layer with one sized for our number of classes (at 3/4 width).
        # num_ftrs = self.mobile_model.classifier.in_features
        num_ftrs = self.model.classifier[-1].in_features
        # self.mobile_model.reset_classifier(0)
        self.model.classifier[1] = nn.Linear(num_ftrs//4*3, num_classes, bias=True)

        self.model.features[0][0] = nn.Conv2d(3, 32//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        self.model.features[0][1] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[1].conv[0][0] = nn.Conv2d(32//4*3, 32//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32//4*3, bias=False)
        self.model.features[1].conv[0][1] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[1].conv[1] = nn.Conv2d(32//4*3, 16//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[1].conv[2] = nn.BatchNorm2d(16//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[2].conv[0][0] = nn.Conv2d(16//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[2].conv[0][1] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[2].conv[1][0] = nn.Conv2d(96//4*3, 96//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96//4*3, bias=False)
        self.model.features[2].conv[1][1] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[2].conv[2] = nn.Conv2d(96//4*3, 24//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[2].conv[3] = nn.BatchNorm2d(24//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[3].conv[0][0] = nn.Conv2d(24//4*3, 128//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[3].conv[0][1] = nn.BatchNorm2d(128//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[3].conv[1][0] = nn.Conv2d(128//4*3, 128//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128//4*3, bias=False)
        self.model.features[3].conv[1][1] = nn.BatchNorm2d(128//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[3].conv[2] = nn.Conv2d(128//4*3, 24//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[3].conv[3] = nn.BatchNorm2d(24//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[4].conv[0][0] = nn.Conv2d(24//4*3, 144//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[4].conv[0][1] = nn.BatchNorm2d(144//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[4].conv[1][0] = nn.Conv2d(144//4*3, 144//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144//4*3, bias=False)
        self.model.features[4].conv[1][1] = nn.BatchNorm2d(144//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[4].conv[2] = nn.Conv2d(144//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[4].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[5].conv[0][0] = nn.Conv2d(32//4*3, 176//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[5].conv[0][1] = nn.BatchNorm2d(176//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[5].conv[1][0] = nn.Conv2d(176//4*3, 176//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=176//4*3, bias=False)
        self.model.features[5].conv[1][1] = nn.BatchNorm2d(176//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[5].conv[2] = nn.Conv2d(176//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[5].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[6].conv[0][0] = nn.Conv2d(32//4*3, 192//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[6].conv[0][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[6].conv[1][0] = nn.Conv2d(192//4*3, 192//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192//4*3, bias=False)
        self.model.features[6].conv[1][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[6].conv[2] = nn.Conv2d(192//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[6].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[7].conv[0][0] = nn.Conv2d(32//4*3, 192//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[7].conv[0][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[7].conv[1][0] = nn.Conv2d(192//4*3, 192//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192//4*3, bias=False)
        self.model.features[7].conv[1][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[7].conv[2] = nn.Conv2d(192//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[7].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[8].conv[0][0] = nn.Conv2d(64//4*3, 368//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[8].conv[0][1] = nn.BatchNorm2d(368//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[8].conv[1][0] = nn.Conv2d(368//4*3, 368//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=368//4*3, bias=False)
        self.model.features[8].conv[1][1] = nn.BatchNorm2d(368//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[8].conv[2] = nn.Conv2d(368//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[8].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[9].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[9].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[9].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[9].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[9].conv[2] = nn.Conv2d(384//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[9].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[10].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[10].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[10].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[10].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[10].conv[2] = nn.Conv2d(384//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[10].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[11].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[11].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[11].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[11].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[11].conv[2] = nn.Conv2d(384//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[11].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[12].conv[0][0] = nn.Conv2d(96//4*3, 560//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[12].conv[0][1] = nn.BatchNorm2d(560//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[12].conv[1][0] = nn.Conv2d(560//4*3, 560//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=560//4*3, bias=False)
        self.model.features[12].conv[1][1] = nn.BatchNorm2d(560//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[12].conv[2] = nn.Conv2d(560//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[12].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[13].conv[0][0] = nn.Conv2d(96//4*3, 576//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[13].conv[0][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[13].conv[1][0] = nn.Conv2d(576//4*3, 576//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576//4*3, bias=False)
|
||||||
|
self.model.features[13].conv[1][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[13].conv[2] = nn.Conv2d(576//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[13].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
|
||||||
|
self.model.features[14].conv[0][0] = nn.Conv2d(96//4*3, 576//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[14].conv[0][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[14].conv[1][0] = nn.Conv2d(576//4*3, 576//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576//4*3, bias=False)
|
||||||
|
self.model.features[14].conv[1][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[14].conv[2] = nn.Conv2d(576//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[14].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
|
||||||
|
self.model.features[15].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[15].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[15].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
|
||||||
|
self.model.features[15].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[15].conv[2] = nn.Conv2d(960//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[15].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
|
||||||
|
self.model.features[16].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[16].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[16].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
|
||||||
|
self.model.features[16].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[16].conv[2] = nn.Conv2d(960//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[16].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
|
||||||
|
self.model.features[17].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[17].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[17].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
|
||||||
|
self.model.features[17].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
self.model.features[17].conv[2] = nn.Conv2d(960//4*3, 320//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
|
||||||
|
self.model.features[17].conv[3] = nn.BatchNorm2d(320//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
|
||||||
|
self.model.features[18][0] = nn.Conv2d(320//4*3, 1280//4*3, kernel_size=(1, 1), stride=(1, 1),bias=False)
|
||||||
|
self.model.features[18][1] = nn.BatchNorm2d(1280//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
|
||||||
|
|
||||||
|
def forward(self, x):
|
||||||
|
f = self.model(x)
|
||||||
|
#y = self.classifier(f)
|
||||||
|
return f
|
||||||
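Every channel count in the modified layers above is written as `c//4*3`, i.e. the stock MobileNetV2 width scaled by a 0.75 width multiplier with floor division. A minimal sketch of that arithmetic (the helper name is ours, not from the repo):

```python
def scale_channels(c, num=3, den=4):
    # Integer width scaling used throughout the narrowed model:
    # c // 4 * 3 keeps 3/4 of the original channel count, rounding down.
    return c // den * num

# A few of the widths that appear in the layer definitions above:
scaled = {c: scale_channels(c) for c in (32, 96, 368, 576, 960, 1280)}
print(scaled[1280])  # the final feature width 1280 becomes 960
```

Because the scaling is applied consistently to every Conv2d/BatchNorm2d pair, the channel counts still line up between consecutive layers.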
@@ -0,0 +1,19 @@

```python
import torch
import torch.nn as nn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms


class mobilenet_v2(nn.Module):
    def __init__(self, num_classes):
        super(mobilenet_v2, self).__init__()
        self.model = models.mobilenet_v2(pretrained=False)
        # replace the last FC layer by a FC layer for our model
        num_ftrs = self.model.classifier[-1].in_features
        self.model.classifier[1] = nn.Linear(num_ftrs, num_classes, bias=True)
        nn.init.xavier_uniform_(self.model.classifier[1].weight)
        self.model.classifier[1].bias.data.fill_(0.01)

    def forward(self, x):
        f = self.model(x)
        return f
```
145
ai_training/classification/models/MobileNetV2/__init__.py
Normal file
@@ -0,0 +1,145 @@

```python
import torch
import torch.nn as nn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms
import matplotlib
import time


class mobile_net_v2(nn.Module):
    def __init__(self, num_classes=2):
        super(mobile_net_v2, self).__init__()
        self.model = models.mobilenet_v2(pretrained=False)
        # replace the last FC layer by a FC layer for our model
        #num_ftrs = self.mobile_model.classifier.in_features
        num_ftrs = self.model.classifier[-1].in_features
        #self.mobile_model.reset_classifier(0)
        self.model.classifier[1] = nn.Linear(num_ftrs//4*3, num_classes, bias=True)

        self.model.features[0][0] = nn.Conv2d(3, 32//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        self.model.features[0][1] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[1].conv[0][0] = nn.Conv2d(32//4*3, 32//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32//4*3, bias=False)
        self.model.features[1].conv[0][1] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[1].conv[1] = nn.Conv2d(32//4*3, 16//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[1].conv[2] = nn.BatchNorm2d(16//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[2].conv[0][0] = nn.Conv2d(16//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[2].conv[0][1] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[2].conv[1][0] = nn.Conv2d(96//4*3, 96//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=96//4*3, bias=False)
        self.model.features[2].conv[1][1] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[2].conv[2] = nn.Conv2d(96//4*3, 24//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[2].conv[3] = nn.BatchNorm2d(24//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[3].conv[0][0] = nn.Conv2d(24//4*3, 128//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[3].conv[0][1] = nn.BatchNorm2d(128//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[3].conv[1][0] = nn.Conv2d(128//4*3, 128//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=128//4*3, bias=False)
        self.model.features[3].conv[1][1] = nn.BatchNorm2d(128//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[3].conv[2] = nn.Conv2d(128//4*3, 24//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[3].conv[3] = nn.BatchNorm2d(24//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[4].conv[0][0] = nn.Conv2d(24//4*3, 144//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[4].conv[0][1] = nn.BatchNorm2d(144//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[4].conv[1][0] = nn.Conv2d(144//4*3, 144//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=144//4*3, bias=False)
        self.model.features[4].conv[1][1] = nn.BatchNorm2d(144//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[4].conv[2] = nn.Conv2d(144//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[4].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[5].conv[0][0] = nn.Conv2d(32//4*3, 176//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[5].conv[0][1] = nn.BatchNorm2d(176//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[5].conv[1][0] = nn.Conv2d(176//4*3, 176//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=176//4*3, bias=False)
        self.model.features[5].conv[1][1] = nn.BatchNorm2d(176//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[5].conv[2] = nn.Conv2d(176//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[5].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[6].conv[0][0] = nn.Conv2d(32//4*3, 192//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[6].conv[0][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[6].conv[1][0] = nn.Conv2d(192//4*3, 192//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192//4*3, bias=False)
        self.model.features[6].conv[1][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[6].conv[2] = nn.Conv2d(192//4*3, 32//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[6].conv[3] = nn.BatchNorm2d(32//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[7].conv[0][0] = nn.Conv2d(32//4*3, 192//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[7].conv[0][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[7].conv[1][0] = nn.Conv2d(192//4*3, 192//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=192//4*3, bias=False)
        self.model.features[7].conv[1][1] = nn.BatchNorm2d(192//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[7].conv[2] = nn.Conv2d(192//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[7].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[8].conv[0][0] = nn.Conv2d(64//4*3, 368//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[8].conv[0][1] = nn.BatchNorm2d(368//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[8].conv[1][0] = nn.Conv2d(368//4*3, 368//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=368//4*3, bias=False)
        self.model.features[8].conv[1][1] = nn.BatchNorm2d(368//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[8].conv[2] = nn.Conv2d(368//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[8].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[9].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[9].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[9].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[9].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[9].conv[2] = nn.Conv2d(384//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[9].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[10].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[10].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[10].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[10].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[10].conv[2] = nn.Conv2d(384//4*3, 64//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[10].conv[3] = nn.BatchNorm2d(64//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[11].conv[0][0] = nn.Conv2d(64//4*3, 384//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[11].conv[0][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[11].conv[1][0] = nn.Conv2d(384//4*3, 384//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384//4*3, bias=False)
        self.model.features[11].conv[1][1] = nn.BatchNorm2d(384//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[11].conv[2] = nn.Conv2d(384//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[11].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[12].conv[0][0] = nn.Conv2d(96//4*3, 560//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[12].conv[0][1] = nn.BatchNorm2d(560//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[12].conv[1][0] = nn.Conv2d(560//4*3, 560//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=560//4*3, bias=False)
        self.model.features[12].conv[1][1] = nn.BatchNorm2d(560//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[12].conv[2] = nn.Conv2d(560//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[12].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[13].conv[0][0] = nn.Conv2d(96//4*3, 576//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[13].conv[0][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[13].conv[1][0] = nn.Conv2d(576//4*3, 576//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=576//4*3, bias=False)
        self.model.features[13].conv[1][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[13].conv[2] = nn.Conv2d(576//4*3, 96//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[13].conv[3] = nn.BatchNorm2d(96//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[14].conv[0][0] = nn.Conv2d(96//4*3, 576//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[14].conv[0][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[14].conv[1][0] = nn.Conv2d(576//4*3, 576//4*3, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=576//4*3, bias=False)
        self.model.features[14].conv[1][1] = nn.BatchNorm2d(576//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[14].conv[2] = nn.Conv2d(576//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[14].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[15].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[15].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[15].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
        self.model.features[15].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[15].conv[2] = nn.Conv2d(960//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[15].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[16].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[16].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[16].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
        self.model.features[16].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[16].conv[2] = nn.Conv2d(960//4*3, 160//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[16].conv[3] = nn.BatchNorm2d(160//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[17].conv[0][0] = nn.Conv2d(160//4*3, 960//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[17].conv[0][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[17].conv[1][0] = nn.Conv2d(960//4*3, 960//4*3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=960//4*3, bias=False)
        self.model.features[17].conv[1][1] = nn.BatchNorm2d(960//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        self.model.features[17].conv[2] = nn.Conv2d(960//4*3, 320//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[17].conv[3] = nn.BatchNorm2d(320//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

        self.model.features[18][0] = nn.Conv2d(320//4*3, 1280//4*3, kernel_size=(1, 1), stride=(1, 1), bias=False)
        self.model.features[18][1] = nn.BatchNorm2d(1280//4*3, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)

    def forward(self, x):
        f = self.model(x)
        #y = self.classifier(f)
        return f
```
19
ai_training/classification/models/ResNet18/ResNet18.py
Normal file
@@ -0,0 +1,19 @@

```python
import torch
import torch.nn as nn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms


class resnet18(nn.Module):
    def __init__(self, num_classes):
        super(resnet18, self).__init__()
        self.model = models.resnet18(pretrained=False)
        # replace the last FC layer by a FC layer for our model
        num_ftrs = self.model.fc.in_features
        self.model.fc = nn.Linear(num_ftrs, num_classes, bias=True)
        nn.init.xavier_uniform_(self.model.fc.weight)
        self.model.fc.bias.data.fill_(0.01)

    def forward(self, x):
        f = self.model(x)
        return f
```
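Each classifier wrapper replaces the backbone's final FC layer and initializes it with Xavier-uniform weights. `nn.init.xavier_uniform_` samples from a uniform distribution whose bound depends only on the layer's fan-in and fan-out; a framework-free sketch of that bound (helper name is ours):

```python
import math

def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    # nn.init.xavier_uniform_ draws weights from U(-a, a), where
    # a = gain * sqrt(6 / (fan_in + fan_out)).
    return gain * math.sqrt(6.0 / (fan_in + fan_out))

# ResNet-18's final FC layer has 512 input features; e.g. a 2-class head:
bound = xavier_uniform_bound(512, 2)
```

Keeping the bound small relative to the fan-in preserves the activation variance through the new head, which is why the same init is reused for every backbone here.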
20
ai_training/classification/models/ResNet50/ResNet50.py
Normal file
@@ -0,0 +1,20 @@

```python
import torch
import torch.nn as nn
import numpy as np
import torchvision
from torchvision import datasets, models, transforms


class resnet50(nn.Module):
    def __init__(self, num_classes):
        super(resnet50, self).__init__()
        self.model = models.resnet50(pretrained=False)
        # replace the last FC layer by a FC layer for our model
        num_ftrs = self.model.fc.in_features
        self.model.fc = nn.Linear(num_ftrs, num_classes, bias=True)
        nn.init.xavier_uniform_(self.model.fc.weight)
        self.model.fc.bias.data.fill_(0.01)

    def forward(self, x):
        f = self.model(x)
        return f
```
39
ai_training/classification/pytorch2onnx.py
Normal file
@@ -0,0 +1,39 @@

```python
import os
import torch
import torch.nn as nn
import torchvision
from torchvision import datasets, models, transforms
import numpy as np
from load_model import initialize_model
import argparse
import sys
import scipy.io
import torch.onnx


def main(args=None):
    parser = argparse.ArgumentParser(description='converter.')
    parser.add_argument('--save-path', type=str, help='Path to the onnx model.', default=None)
    parser.add_argument('--backbone', help='Backbone model.', default='resnet18', type=str)
    parser.add_argument('--num_classes', help='the number of classes.', type=int, default=0)
    parser.add_argument('--model-def-path', type=str, help='Path to pretrained model definition', default=None)
    parser.add_argument('--snapshot', help='Path to the pretrained models.')
    print(vars(parser.parse_args()))
    args = parser.parse_args()

    model_structure, input_size = initialize_model(args.backbone, args.num_classes, False, args.model_def_path)

    model_structure.load_state_dict(torch.load(args.snapshot))
    model = model_structure.eval()

    dummy_input = torch.randn(1, 3, input_size[0], input_size[1])
    save_path = args.save_path
    if args.save_path is None:
        save_path = args.backbone + '.onnx'
    torch.onnx.export(model, dummy_input, save_path, keep_initializers_as_inputs=True, opset_version=11)


if __name__ == '__main__':
    main()
```
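When `--save-path` is omitted, the converter falls back to naming the exported file after the backbone. A hypothetical helper mirroring that fallback (not part of the repo):

```python
def default_onnx_path(save_path, backbone):
    # Mirrors the fallback in pytorch2onnx.py: when --save-path is omitted,
    # the exported file is named after the backbone, e.g. resnet18.onnx.
    return save_path if save_path is not None else backbone + '.onnx'
```

So `python pytorch2onnx.py --backbone resnet18 --num_classes 2 --snapshot <path>` would, under this scheme, write `resnet18.onnx` to the working directory.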
6
ai_training/classification/requirements.txt
Normal file
@@ -0,0 +1,6 @@

```
numpy>=1.18.5
torch>=1.4.0
torchvision>=0.5.0
scikit-learn
onnx==1.6.0
onnxruntime
```
12
ai_training/classification/save_model.py
Normal file
@@ -0,0 +1,12 @@

```python
import torch
import os


def save_model(network, model_name, snapshot_path, epoch_label, device):
    save_filename = model_name + '_%s.pth' % epoch_label
    save_path = os.path.join(snapshot_path, save_filename)
    if not os.path.isdir(snapshot_path):
        os.makedirs(snapshot_path)
    print('saving model ', save_path)
    torch.save(network.cpu().state_dict(), save_path)
    network = network.to(device)
    return network
```
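`save_model` builds the checkpoint path as `<model_name>_<epoch_label>.pth` inside a snapshot directory that it creates on demand. A framework-free sketch of just that naming logic (helper name is ours, not from the repo):

```python
import os
import tempfile

def snapshot_file_for(snapshot_path, model_name, epoch_label):
    # Same naming scheme as save_model(): <model_name>_<epoch_label>.pth,
    # creating the snapshot directory on demand.
    os.makedirs(snapshot_path, exist_ok=True)
    return os.path.join(snapshot_path, '%s_%s.pth' % (model_name, epoch_label))

with tempfile.TemporaryDirectory() as d:
    path = snapshot_file_for(os.path.join(d, 'snapshots'), 'model_ft', 'best')
    name = os.path.basename(path)  # 'model_ft_best.pth'
```

Note that `save_model` also moves the network to CPU before `torch.save` and back to `device` afterwards, so the saved state dict is device-independent.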
87
ai_training/classification/train.py
Normal file
@@ -0,0 +1,87 @@

```python
import argparse
import os
import sys
from datetime import date

import torch
from load_data import load_data
from loss_functions import load_loss_functions
from load_optimizer import load_optimizer
from load_lr_scheduler import load_lr_scheduler
from train_model import train_model
from load_model import initialize_model
from save_model import save_model


def makedirs(path):
    # Intended behavior: try to create the directory,
    # pass if the directory exists already, fail otherwise.
    try:
        os.makedirs(path)
    except OSError:
        if not os.path.isdir(path):
            raise


def check_args(parsed_args):
    """ Function to check for inherent contradictions within parsed arguments.

    Args
        parsed_args: parser.parse_args()

    Returns
        parsed_args
    """
    if parsed_args.gpu >= 0 and not torch.cuda.is_available():
        raise ValueError("No gpu is available")
    return parsed_args


def parse_args(args):
    """
    Parse the arguments.
    """
    today = str(date.today())

    parser = argparse.ArgumentParser(description='Simple training script for training an image classification network.')
    parser.add_argument('data_dir', type=str, help='Path to your dataset')
    parser.add_argument('--model-name', type=str, help='Name of your model', default='model_ft')
    parser.add_argument('--model-def-path', type=str, help='Path to pretrained model definition', default=None)
    parser.add_argument('--lr', type=float, help='Learning rate', default=5e-3)
    parser.add_argument('--backbone', help='Backbone model.', default='resnet18', type=str)
    parser.add_argument('--gpu', help='Id of the GPU to use (as reported by nvidia-smi). (-1 for cpu)', type=int, default=-1)
    parser.add_argument('--workers', help='The number of dataloader workers', type=int, default=1)
    parser.add_argument('--epochs', help='Number of epochs to train.', type=int, default=100)
    parser.add_argument('--freeze-backbone', help='Freeze training of backbone layers.', type=int, default=0)
    parser.add_argument('--batch-size', help='Size of the batches.', default=128, type=int)
    parser.add_argument('--snapshot', help='Path to the pretrained models.')
    parser.add_argument('--snapshot-path', help='Path to store snapshots of models during training (defaults to \'snapshots\')', default='./snapshots/{}'.format(today))
    parser.add_argument('--optimizer', help='Choose an optimizer from SGD, ASGD and ADAM', type=str, default='SGD')
    parser.add_argument('--loss', help='Choose a loss function', type=str, default='cross_entropy')
    parser.add_argument('--early-stop', help='Choose if early stopping', type=int, default=1)
    parser.add_argument('--patience', help='Choose patience for early stopping', type=int, default=7)

    print(vars(parser.parse_args(args)))
    return check_args(parser.parse_args(args))


def main(args=None):
    # parse arguments
    if args is None:
        args = sys.argv[1:]

    args = parse_args(args)
    device = "cuda:" + str(args.gpu) if args.gpu >= 0 else "cpu"
    num_classes = len([f for f in os.listdir(os.path.join(args.data_dir, 'train')) if not f.startswith('.')])
    model_ft, input_size = initialize_model(args.backbone, num_classes, args.freeze_backbone, model_def_path=args.model_def_path, use_pretrained=args.snapshot)
    dataloaders_dict = load_data(args.data_dir, args.batch_size, input_size, args.workers)
    optimizer_ft = load_optimizer(model_ft, lr=args.lr, freeze_backbone=args.freeze_backbone, op_type=args.optimizer)
    lr_scheduler_ft = load_lr_scheduler(optimizer_ft)
    criterion = load_loss_functions(loss_func=args.loss)

    # Train
    model_ft, _ = train_model(model_ft, dataloaders_dict, criterion, optimizer_ft, lr_scheduler_ft, device, args.snapshot_path, model_name=args.model_name, num_epochs=args.epochs, early_stop=args.early_stop, patience=args.patience)

    save_model(model_ft, args.model_name, args.snapshot_path, 'best', device)
    return model_ft


if __name__ == '__main__':
    main()
```
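`train.py` never asks for the number of classes on the command line: it infers it from the sub-directories of `<data_dir>/train`, skipping hidden entries. A stdlib-only sketch of that inference:

```python
import os
import tempfile

def count_classes(data_dir):
    # train.py derives num_classes from the sub-directories of data_dir/train,
    # skipping hidden entries such as .DS_Store.
    train_dir = os.path.join(data_dir, 'train')
    return len([f for f in os.listdir(train_dir) if not f.startswith('.')])

# Build a throwaway dataset layout with two real classes and one hidden entry:
with tempfile.TemporaryDirectory() as d:
    for cls in ('cat', 'dog', '.DS_Store'):
        os.makedirs(os.path.join(d, 'train', cls))
    n = count_classes(d)  # counts 'cat' and 'dog' only
```

This is why the dataset must follow the `train/<class_name>/` layout: any stray visible file or folder under `train/` would inflate the class count.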
92
ai_training/classification/train_model.py
Normal file
@ -0,0 +1,92 @@
```python
import torch
import torch.nn as nn
import torch.optim as optim
import time
import copy

from save_model import save_model
from early_stopping import EarlyStopping


def train_model(model, dataloaders, criterion, optimizer, lr_scheduler, device, snapshot_path,
                model_name='model_ft', num_epochs=25, early_stop=False, patience=7):
    since = time.time()

    val_acc_history = []
    model = model.to(device)
    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    # initialize the early_stopping object
    early_stopping = EarlyStopping(model_name, patience=patience, verbose=True, path=snapshot_path)

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history only if in train
                with torch.set_grad_enabled(phase == 'train'):
                    # Get model outputs and calculate loss
                    outputs = model(inputs)
                    loss = criterion(outputs, labels)
                    _, preds = torch.max(outputs, 1)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            epoch_loss = running_loss / len(dataloaders[phase].dataset)
            epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc >= best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())
            if phase == 'val':
                val_acc_history.append(epoch_acc)
                lr_scheduler.step(epoch_acc)

        print()

        if early_stop:
            early_stopping(epoch_loss, model, epoch)
            if early_stopping.early_stop:
                print("Early stopping")
                break
        elif epoch % 10 == 9:
            save_model(model, model_name, snapshot_path, epoch, device)

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model, val_acc_history
```
`ai_training/classification/tutorial/README.md` (new file, 186 lines):
<h1 align="center"> Image Classification </h1>

This tutorial explores the basics of the image classification task. In this document, we will go through a concrete example of how to train an image classification model via our AI training platform. A dataset containing bees and ants is provided.

Image Classification is a fundamental task that attempts to classify an image by assigning it a specific label. Our AI training platform provides the training script to train a model for the image classification task.

# Prerequisites

First of all, we have to install the required libraries. Python 3.6 or above is required; the other dependencies are listed in the `requirements.txt` file. Installing these packages is simple. You can install them by running:

```
pip install -r requirements.txt
```
# Dataset & Preparation

Next, we need a dataset for training the model.

## Custom Datasets

You can train the model on a custom dataset. Your own datasets are expected to have the following structure:

```shell
- Dataset name
-- train
--- Class1
--- Class2

-- val
--- Class1
--- Class2
```
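The sub-folders under `train/` double as the class list: the training script's `main` counts them to set `num_classes`. A minimal sketch, using hypothetical paths, that creates and inspects this layout:

```python
from pathlib import Path

# Hypothetical dataset root; replace with your own path.
root = Path("image_data")

# Create the expected layout: <root>/{train,val}/<class_name>/
for split in ("train", "val"):
    for class_name in ("ants", "bees"):
        (root / split / class_name).mkdir(parents=True, exist_ok=True)

# The class list (and hence the number of classes) is derived from the
# sub-folder names under train/, skipping hidden entries.
classes = sorted(d.name for d in (root / "train").iterdir()
                 if d.is_dir() and not d.name.startswith("."))
print(classes)       # e.g. ['ants', 'bees']
print(len(classes))  # number of classes
```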
## Example

Let's go through a toy example of preparing a custom dataset. Suppose we are going to classify bees and ants.

<div align="center">
<img src="./image_data/train/ants/0013035.jpg" width="33%" /> <img src="./image_data/train/bees/1092977343_cb42b38d62.jpg" width="33%" />
</div>

First of all, we have to split the images of bees and ants into a training set and a validation set (an 8:2 split is recommended). Then, we move the images into different folders named after their classes. The dataset folder will have the following structure:

```shell
- image data
-- train
--- ants
--- bees

-- val
--- ants
--- bees
```

Now we have finished preparing the dataset.
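The 8:2 split can be scripted. Below is a minimal sketch; the `split_dataset` helper and its paths are hypothetical, not part of this repository:

```python
import random
import shutil
from pathlib import Path

def split_dataset(src_dir, dst_dir, val_ratio=0.2, seed=0):
    """Copy images from src_dir/<class>/ into dst_dir/{train,val}/<class>/."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    src, dst = Path(src_dir), Path(dst_dir)
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        images = sorted(class_dir.iterdir())
        rng.shuffle(images)
        n_val = int(len(images) * val_ratio)
        for split, files in (("val", images[:n_val]), ("train", images[n_val:])):
            out = dst / split / class_dir.name
            out.mkdir(parents=True, exist_ok=True)
            for f in files:
                shutil.copy(f, out / f.name)
```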
# Train

Following the example in the previous section, let's finetune a pretrained model on our custom dataset. The pretrained model we use here is a MobileNet model, which we download from [Model_Zoo](https://github.com/kneron/Model_Zoo/tree/main/classification/MobileNetV2):

```shell
wget https://raw.githubusercontent.com/kneron/Model_Zoo/main/classification/MobileNetV2/MobileNetV2.pth
```

Since our dataset is quite small, we choose to freeze the backbone and only finetune the last layer. Following the instructions above, run:

```shell
python train.py --gpu -1 --freeze-backbone 1 --backbone mobilenetv2 --early-stop 1 --snapshot MobileNetV2.pth --snapshot-path snapshots/exp/ ./tutorial/image_data
```

The following training messages will be printed:

```shell
{'data_dir': './tutorial/image_data', 'model_name': 'model_ft', 'model_def_path': None, 'lr': 0.001, 'backbone': 'mobilenetv2', 'gpu': -1, 'epochs': 100, 'freeze_backbone': 1, 'batch_size': 64, 'snapshot': 'MobileNetV2.pth', 'snapshot_path': 'snapshots/exp/', 'optimizer': 'SGD', 'loss': 'cross_entropy', 'early_stop': 1, 'patience': 7}

Initializing Datasets and Dataloaders...
-------------Label mapping to Idx:--------------
{0: 'ants', 1: 'bees'}
------------------------------------------------
Params to learn:
model.classifier.1.weight
model.classifier.1.bias
Epoch 0/99
----------
train Loss: 0.7786 Acc: 0.4303
val Loss: 0.6739 Acc: 0.6056

Validation loss decreased (inf --> 0.673929). Saving model ...

...
```

When the validation loss stops decreasing for 7 epochs, early stopping is triggered and the training process terminates. The trained model is saved under the `./snapshots/exp` folder. In addition, the class-label-to-idx mapping is printed and automatically saved to `./eval_utils/class_id.json`.
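The early stopping above watches the validation loss with a patience of 7 epochs. The core of that logic can be sketched as follows; the `PatienceCounter` class is a hypothetical simplification (the repository's `EarlyStopping` helper also saves checkpoints):

```python
class PatienceCounter:
    """Stop when the monitored validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=7, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.counter = 0
        self.early_stop = False

    def step(self, val_loss):
        if val_loss < self.best_loss - self.min_delta:
            # Improvement: reset the counter (a real helper would also checkpoint here).
            self.best_loss = val_loss
            self.counter = 0
        else:
            self.counter += 1
            if self.counter >= self.patience:
                self.early_stop = True
        return self.early_stop
```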
# Converting to ONNX

You may check the [Toolchain manual](http://doc.kneron.com/docs/#toolchain/manual/) for converting a PyTorch model to an ONNX model. Let's go through an example of converting the FP_classifier PyTorch model to ONNX.

Execute the command in the folder `classification`:

```shell
python pytorch2onnx.py --backbone mobilenetv2 --num_classes 2 --snapshot snapshots/exp/model_ft_best.pth --save-path snapshots/exp/model_ft_best.onnx
```

This produces `model_ft_best.onnx`.

Next, clone [ONNX_Convertor](https://github.com/kneron/ONNX_Convertor/tree/master/optimizer_scripts):

```shell
git clone https://github.com/kneron/ONNX_Convertor.git
```

and run the optimizer script from the folder `ONNX_Convertor/optimizer_scripts`:

```shell
python ONNX_Convertor/optimizer_scripts/pytorch2onnx.py snapshots/exp/model_ft_best.onnx snapshots/exp/model_ft_best_convert.onnx
```

This produces `model_ft_best_convert.onnx`.
# Inference

In this section, we will go through an example of using the trained network for inference. The script `inference.py` takes an image and predicts its class label, returning the top $K$ most likely classes along with their probabilities. Let's run our network on the following image, a bee image from our custom dataset:

<div align="center">
<img src="./image_data/val/bees/10870992_eebeeb3a12.jpg" width="30%" />
</div>

```shell
python inference.py --gpu -1 --backbone mobilenetv2 --snapshot snapshots/exp/model_ft_best.pth --model-def-path models/MobileNetV2/ --class_id_path eval_utils/class_id.json --img-path tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg

{'img_path': 'tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg', 'backbone': 'mobilenetv2', 'class_id_path': 'eval_utils/class_id.json', 'gpu': -1, 'model_def_path': 'models/MobileNetV2/', 'snapshot': 'snapshots/exp/model_ft_best.pth', 'save_path': 'inference_result.json', 'onnx': False}

Label Probability
bees 0.836
ants 0.164
```

Note that the class ID mapping file `eval_utils/class_id.json` was created during the training process. After inference, we get `inference_result.json`, which contains the following information:

```bash
{"img_path": "/home/ziyan/git_repo/ai_training/ai_training/classification/tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg", "0_0": [[0.8359974026679993, 1], [0.16400262713432312, 0]]}
```

For ONNX inference, add the `--onnx` argument when executing `inference.py`:

```shell
python inference.py --img-path tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg --snapshot snapshots/exp/model_ft_best_convert.onnx --onnx

{'img_path': 'tutorial/image_data/val/bees/10870992_eebeeb3a12.jpg', 'backbone': 'resnet18', 'class_id_path': './eval_utils/class_id.json', 'gpu': -1, 'model_def_path': None, 'snapshot': 'snapshots/exp/model_ft_best_convert.onnx', 'save_path': 'inference_result.json', 'onnx': True}

Label Probability
bees 0.836
ants 0.164
```
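The probabilities above come from a softmax over the model's output logits. A minimal sketch of the top-$K$ step in pure Python; the logit values here are hypothetical, chosen only for illustration:

```python
import math

def top_k(logits, id_to_label, k=2):
    """Softmax over raw logits, then return the k most likely (label, prob) pairs."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(enumerate(probs), key=lambda ip: ip[1], reverse=True)
    return [(id_to_label[i], p) for i, p in ranked[:k]]

# Hypothetical logits for a 2-class (ants/bees) model.
print(top_k([-0.4, 1.23], {0: "ants", 1: "bees"}))
# roughly [('bees', 0.836), ('ants', 0.164)]
```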
# Evaluation

## Evaluation on a dataset

In this section, we will go through an example of evaluating a trained network on a dataset. Here, we are going to evaluate the trained model on the validation set of our custom dataset. The script `./eval_utils/eval.py` reports the top-K accuracy and F1 score of the model on a test dataset. The evaluation statistics will be saved to `eval_results.txt`.

```shell
python eval_utils/eval.py --gpu -1 --backbone mobilenetv2 --snapshot snapshots/exp/model_ft_best.pth --data-dir ./tutorial/image_data/val/

{'data_dir': './tutorial/image_data/val/', 'model_def_path': None, 'backbone': 'mobilenetv2', 'snapshot': 'snapshots/exp/model_ft_best.pth', 'gpu': -1, 'preds': None, 'gts': None}

top 1 accuracy: 0.9225352112676056

Label Precision Recall F1 score
ants 0.887 0.932 0.909
bees 0.950 0.916 0.933
```

## End-to-End Evaluation

For end-to-end testing, we expect the prediction results to be saved as JSON files, one per image, in the following format:

```bash
{"img_path": image_path,
 "0_0": [[score, label], [score, label], ...]
}
```

The prediction JSON files for all images are expected to be saved under the same folder. The ground truth JSON file is expected to have the following format:

```bash
{image1_path: label,
 image2_path: label,
 ...
}
```

For this tutorial, we generated some random prediction data saved under the folder `tutorial/eval_data/preds/`, and the ground truth is saved in `tutorial/eval_data/gts.json`. You may check these files for the format. To compute the evaluation statistics, execute the command in the folder `classification`:

```shell
python eval_utils/eval.py --preds tutorial/eval_data/preds/ --gts tutorial/eval_data/gts.json

{'model_def_path': None, 'data_dir': None, 'backbone': 'resnet18', 'preds': 'tutorial/eval_data/preds/', 'gts': 'tutorial/eval_data/gts.json', 'snapshot': None, 'gpu': -1}

top 1 accuracy: 1.0

Label Precision Recall F1 score
0 1.000 1.000 1.000
1 1.000 1.000 1.000
2 1.000 1.000 1.000
```

The evaluation statistics will be saved to `eval_results.txt`.
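Given the two formats above, the top-1 accuracy can be computed directly. A minimal sketch in pure Python, with `preds` simplified to a path-to-scores dict rather than one `"0_0"`-keyed file per image:

```python
def top1_accuracy(preds, gts):
    """preds: {img_path: [[score, label], ...]}; gts: {img_path: label}."""
    correct = 0
    for img_path, true_label in gts.items():
        # The predicted label is the one with the highest score.
        best_label = max(preds[img_path], key=lambda sl: sl[0])[1]
        correct += int(best_label == true_label)
    return correct / len(gts)

# The tutorial's random example data:
gts = {"1.jpg": 1, "2.jpg": 0, "3.jpg": 2}
preds = {
    "1.jpg": [[0.8, 1], [0.1, 0], [0.1, 2]],
    "2.jpg": [[0.8, 0], [0.1, 1], [0.1, 2]],
    "3.jpg": [[0.8, 2], [0.1, 1], [0.1, 0]],
}
print(top1_accuracy(preds, gts))  # 1.0
```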
`ai_training/classification/tutorial/eval_data/gts.json` (new file, 1 line):

```
{"1.jpg": 1, "2.jpg": 0, "3.jpg": 2}
```
Prediction files under `ai_training/classification/tutorial/eval_data/preds/` (new files, one per image):

```
{"img_path": "1.jpg",
"0_0":[[0.8, 1], [0.1, 0], [0.1, 2]]
}
```

```
{"img_path": "2.jpg",
"0_0": [[0.8, 0], [0.1, 1], [0.1, 2]]}
```

```
{"img_path": "3.jpg", "0_0": [[0.8, 2], [0.1, 1], [0.1, 0]]}
```
(61 binary image files added under the tutorial's `image_data/` folder; the diff viewer shows only Width/Height/Size for binary files.)