feat: add golf dataset, kneron configs, and tools
Some checks failed
deploy / build-n-publish (push) Has been cancelled
lint / lint (push) Has been cancelled
build / build_cpu (3.7, 1.5.1, torch1.5, 0.6.1) (push) Has been cancelled
build / build_cpu (3.7, 1.6.0, torch1.6, 0.7.0) (push) Has been cancelled
build / build_cpu (3.7, 1.7.0, torch1.7, 0.8.1) (push) Has been cancelled
build / build_cpu (3.7, 1.8.0, torch1.8, 0.9.0) (push) Has been cancelled
build / build_cpu (3.7, 1.9.0, torch1.9, 0.10.0) (push) Has been cancelled
build / build_cuda101 (3.7, 1.5.1+cu101, torch1.5, 0.6.1+cu101) (push) Has been cancelled
build / build_cuda101 (3.7, 1.6.0+cu101, torch1.6, 0.7.0+cu101) (push) Has been cancelled
build / build_cuda101 (3.7, 1.7.0+cu101, torch1.7, 0.8.1+cu101) (push) Has been cancelled
build / build_cuda101 (3.7, 1.8.0+cu101, torch1.8, 0.9.0+cu101) (push) Has been cancelled
build / build_cuda102 (3.6, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / build_cuda102 (3.7, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / build_cuda102 (3.8, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / build_cuda102 (3.9, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / test_windows (windows-2022, cpu, 3.8) (push) Has been cancelled
build / test_windows (windows-2022, cu111, 3.8) (push) Has been cancelled
- Add golf1/2/4/7/8 dataset classes for semantic segmentation
- Add kneron-specific configs (meconfig series, kn_stdc1_golf4class)
- Organize scripts into tools/check/ and tools/kneron/
- Add kneron_preprocessing module
- Update README with quick-start guide
- Update .gitignore to exclude data dirs, onnx, nef outputs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
parent 793c3a5bb0
commit 7716a0060f
17 .gitignore vendored
@@ -117,3 +117,20 @@ mmseg/.mim
 
 # Pytorch
 *.pth
+
+# ONNX / NEF compiled outputs
+*.onnx
+*.nef
+batch_compile_out/
+conbinenef/
+
+# Local data directories
+data4/
+data50/
+data512/
+data724362/
+testdata/
+
+# Misc
+envs.txt
+.claude/
98 README.md
@@ -1,70 +1,62 @@
-# Kneron AI Training/Deployment Platform (mmsegmentation-based)
+# STDC GolfAce — Semantic Segmentation on Kneron
+
+## Quick Start
-
-## Introduction
+### Environment Setup
-
-[kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation) is a platform built upon the well-known [mmsegmentation](https://github.com/open-mmlab/mmsegmentation). If you are looking for the original mmsegmentation documentation, please visit [mmsegmentation docs](https://mmsegmentation.readthedocs.io/en/latest/) for detailed mmsegmentation usage.
+```bash
+# Create and activate the conda environment
+conda create -n stdc_golface python=3.8 -y
+conda activate stdc_golface
-
-In this repository, we provide an end-to-end training/deployment flow to realize on Kneron's AI accelerators:
+
+# Install PyTorch with CUDA 11.3
+conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch -y
-
-1. **Training/Evaluation:**
-   - Modified model configuration files, verified for the Kneron hardware platform
-   - Please see [Overview of Benchmark and Model Zoo](#Overview-of-Benchmark-and-Model-Zoo) for the Kneron-verified model list
-2. **Converting to ONNX:**
-   - tools/pytorch2onnx_kneron.py (beta)
-   - Export *optimized* and *Kneron-toolchain supported* onnx
-   - Automatically modify the model for arbitrary data normalization preprocessing
-3. **Evaluation**
-   - tools/test_kneron.py (beta)
-   - Evaluate the model with *pytorch checkpoint, onnx, and kneron-nef*
-4. **Testing**
-   - inference_kn (beta)
-   - Verify the converted [NEF](http://doc.kneron.com/docs/#toolchain/manual/#5-nef-workflow) model on a Kneron USB accelerator with this API
-5. **Converting Kneron-NEF:** (toolchain feature)
-   - Convert the trained pytorch model to a [Kneron-NEF](http://doc.kneron.com/docs/#toolchain/manual/#5-nef-workflow) model, which can be used on the Kneron hardware platform.
+
+# Install mmcv-full
+pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html
-
-## License
+
+# Install this project
+pip install -e .
-
-This project is released under the [Apache 2.0 license](LICENSE).
+
+# Install utility packages
+pip install opencv-python tqdm matplotlib cityscapesscripts yapf==0.31.0
+```
-
-## Changelog
+### Data Preparation
-
-N/A
+
+1. Export the dataset from **Roboflow**, choosing the `Semantic Segmentation Masks` format
+2. Convert the Roboflow export to Cityscapes format with `seg2city.py`
+3. Place the converted data under `data/cityscapes/`
-
-## Overview of Benchmark and Kneron Model Zoo
+### Training and Testing
-
-| Backbone | Crop Size | Mem (GB) | mIoU | Config | Download |
-|:--------:|:---------:|:--------:|:----:|:------:|:--------:|
-| STDC 1 | 512x1024 | 7.15 | 69.29|[config](https://github.com/kneron/kneron-mmsegmentation/tree/master/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py)|[model](https://github.com/kneron/Model_Zoo/blob/main/mmsegmentation/stdc_1/latest.zip)
+```bash
+# Train
+python tools/train.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py
-
-NOTE: The performance may slightly differ from the original implementation since the input size is smaller.
+
+# Test (and write visualized results)
+python tools/test.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py \
+    work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth \
+    --show-dir work_dirs/vis_results
+```
-
-## Installation
-- Please refer to Step 1 of [docs_kneron/stdc_step_by_step.md#step-1-environment](docs_kneron/stdc_step_by_step.md) for installation.
-- Please refer to [Kneron PLUS - Python: Installation](http://doc.kneron.com/docs/#plus_python/introduction/install_dependency/) for the environment setup for the Kneron USB accelerator.
+### Converting to ONNX / NEF (Kneron Toolchain)
-
-## Getting Started
-### Tutorial - Kneron Edition
-- [STDC-Seg: Step-By-Step](docs_kneron/stdc_step_by_step.md): A tutorial for users to get started easily. To see detailed documents, please see below.
+```bash
+# Launch Docker (WSL environment)
+docker run --rm -it \
+    -v $(wslpath -u 'C:\Users\rd_de\stdc_git'):/workspace/stdc_git \
+    kneron/toolchain:latest
-
-### Documents - Kneron Edition
-- [Kneron ONNX Export] (under development)
-- [Kneron Inference] (under development)
-- [Kneron Toolchain Step-By-Step (YOLOv3)](http://doc.kneron.com/docs/#toolchain/yolo_example/)
-- [Kneron Toolchain Manual](http://doc.kneron.com/docs/#toolchain/manual/#0-overview)
+
+# Convert to ONNX
+python tools/pytorch2onnx_kneron.py \
+    configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py \
+    --checkpoint work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth \
+    --output-file work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.onnx \
+    --verify
-
-### Original mmsegmentation Documents
-- [Original mmsegmentation getting started](https://github.com/open-mmlab/mmsegmentation#getting-started): It is recommended to read the original mmsegmentation getting-started documents for other mmsegmentation operations.
-- [Original mmsegmentation readthedoc](https://mmsegmentation.readthedocs.io/en/latest/): Original mmsegmentation documents.
+
+# Copy the NEF back to the host
+docker cp <container_id>:/data1/kneron_flow/models_630.nef \
+    "C:\Users\rd_de\stdc_git\work_dirs\nef\models_630.nef"
+```
-
-## Contributing
-[kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation) is a platform built upon [OpenMMLab-mmsegmentation](https://github.com/open-mmlab/mmsegmentation)
-
-- For issues regarding the original [mmsegmentation](https://github.com/open-mmlab/mmsegmentation):
-  We appreciate all contributions to improve [OpenMMLab-mmsegmentation](https://github.com/open-mmlab/mmsegmentation). Ongoing projects can be found in our [GitHub Projects](https://github.com/open-mmlab/mmsegmentation/projects). We welcome community users to participate in these projects. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
-
-- For issues regarding this repository [kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation): Welcome to leave comments or submit pull requests here to improve kneron-mmsegmentation
-
-## Related Projects
-- [kneron-mmdetection](https://github.com/kneron/kneron-mmdetection): Kneron training/deployment platform on the [OpenMMLab - mmdetection](https://github.com/open-mmlab/mmdetection) object detection toolbox
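The data-preparation step above converts a Roboflow `Semantic Segmentation Masks` export into the Cityscapes directory layout via `seg2city.py`. As a rough illustration only (the real script lives in this repo and its naming rules may differ; the function name, the `_mask.png` suffix, and the Cityscapes-style filename suffixes are all assumptions), the core of such a conversion might look like:

```python
import os
import shutil

def roboflow_to_cityscapes(src_dir, dst_root, split='train'):
    """Copy a Roboflow 'Semantic Segmentation Masks' export into the
    Cityscapes layout that mmsegmentation's dataset configs expect.

    Assumes each image `foo.jpg` has a sibling mask `foo_mask.png`
    (Roboflow naming conventions vary between export versions).
    """
    img_dst = os.path.join(dst_root, 'leftImg8bit', split)
    ann_dst = os.path.join(dst_root, 'gtFine', split)
    os.makedirs(img_dst, exist_ok=True)
    os.makedirs(ann_dst, exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        stem, ext = os.path.splitext(name)
        if stem.endswith('_mask'):
            continue  # masks are copied together with their image below
        mask = os.path.join(src_dir, stem + '_mask.png')
        if not os.path.exists(mask):
            continue  # skip images without an annotation
        # Cityscapes-style suffixes keep mmseg's default file matching happy
        shutil.copy(os.path.join(src_dir, name),
                    os.path.join(img_dst, stem + '_leftImg8bit' + ext))
        shutil.copy(mask,
                    os.path.join(ann_dst, stem + '_gtFine_labelTrainIds.png'))
```

After this, `data/cityscapes/leftImg8bit/train` and `data/cityscapes/gtFine/train` line up the way the dataset configs in this commit expect.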
@@ -1,5 +1,6 @@
 # dataset settings
-dataset_type = 'CityscapesDataset'
+#dataset_type = 'CityscapesDataset'
+dataset_type = 'GolfDataset'
 data_root = 'data/cityscapes/'
 img_norm_cfg = dict(
     mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
70 configs/_base_/datasets/kn_cityscapes1.py Normal file
@@ -0,0 +1,70 @@
# dataset settings
dataset_type = 'GolfDataset'
data_root = 'data/cityscapes0/'  # ✅ your data root directory

img_norm_cfg = dict(
    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (512, 1024)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline
    ),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline
    ),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline
    )
)

# ✅ classes and matching palette (not passed to the dataset; used for
# plotting / inference visualization)
classes = ('car', 'grass', 'people', 'road')
palette = [
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
]
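The `classes`/`palette` pair in the dataset config is meant for visualization: a predicted label map of per-pixel class indices can be turned into a colour image by indexing the palette array. A minimal numpy sketch (the 2x2 `pred` array is made up purely for illustration):

```python
import numpy as np

# Palette from the config: index i in the label map -> RGB colour i.
palette = np.array([
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
], dtype=np.uint8)

# A tiny fake prediction: per-pixel class indices of shape (H, W).
pred = np.array([[0, 1],
                 [2, 3]], dtype=np.uint8)

# Fancy indexing turns the (H, W) index map into an (H, W, 3) colour image.
color = palette[pred]
print(color.shape)  # (2, 2, 3)
print(color[0, 0])  # palette[0], i.e. the 'car' colour
```

The same trick scales to full-resolution predictions; this is essentially what mmsegmentation does internally when `--show-dir` writes visualized results.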
71 configs/_base_/datasets/kn_cityscapes2.py Normal file
@@ -0,0 +1,71 @@
# dataset settings
dataset_type = 'GolfDataset'
data_root = 'data/cityscapes0/'  # ✅ your data root directory

img_norm_cfg = dict(
    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline
    ),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline
    ),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline
    )
)

# ✅ classes and matching palette (not passed to the dataset; used for
# plotting / inference visualization)
classes = ('car', 'grass', 'people', 'road')
palette = [
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
]
22 configs/_base_/schedules/schedule_2k.py Normal file
@@ -0,0 +1,22 @@
# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)

# optimizer config
optimizer_config = dict()

# learning policy
lr_config = dict(
    policy='poly',
    power=0.9,
    min_lr=1e-4,
    by_epoch=False
)

# runtime settings
runner = dict(type='IterBasedRunner', max_iters=2000)

# save a checkpoint every 2000 iterations (i.e. only the final one)
checkpoint_config = dict(by_epoch=False, interval=2000)

# evaluation settings: run an mIoU evaluation every 2000 iterations
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
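The `poly` learning-rate policy in this schedule decays the LR from `lr` down to `min_lr` over `max_iters` iterations. A small sketch of the decay curve, following my reading of mmcv's `PolyLrUpdaterHook` (treat the exact formula as an assumption; check the installed mmcv version):

```python
def poly_lr(iter_idx, base_lr=0.01, min_lr=1e-4, power=0.9, max_iters=2000):
    """Polynomial LR decay from base_lr down to min_lr over max_iters,
    matching (to my reading) mmcv's PolyLrUpdaterHook formula."""
    coeff = (1 - iter_idx / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr

# The curve starts at base_lr, falls steeply late in training
# (power < 1), and bottoms out exactly at min_lr.
for i in (0, 500, 1000, 1500, 2000):
    print(i, poly_lr(i))
```

With `power=0.9` the decay is nearly linear; `min_lr=1e-4` keeps the last iterations from stalling at a vanishing step size.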
193 configs/stdc/kn_stdc1_golf4class.py Normal file
@@ -0,0 +1,193 @@
# Copyright (c) OpenMMLab. All rights reserved.

# ---------------- model settings ---------------- #
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False,
            init_cfg=dict(
                type='Pretrained',
                checkpoint='https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'
            )
        ),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)
    ),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=4,  # ✅ four classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)
    ),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)
        ),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)
        ),
        dict(
            type='STDCHead',
            in_channels=256,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅ most important
            boundary_threshold=0.1,
            in_index=0,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=True,
            loss_decode=[
                dict(
                    type='CrossEntropyLoss',
                    loss_name='loss_ce',
                    use_sigmoid=True,
                    loss_weight=1.0),
                dict(
                    type='DiceLoss',
                    loss_name='loss_dice',
                    loss_weight=1.0)
            ]
        )
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

# ---------------- dataset settings ---------------- #
dataset_type = 'GolfDataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(
    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (512, 1024)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline
    ),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline
    ),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline
    )
)

# ---------------- extra settings ---------------- #
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
checkpoint_config = dict(by_epoch=False, interval=1000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(
    policy='poly',
    power=0.9,
    min_lr=0.0001,
    by_epoch=False,
    warmup='linear',
    warmup_iters=1000)
runner = dict(type='IterBasedRunner', max_iters=20000)
cudnn_benchmark = True
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = './work_dirs/kn_stdc1_golf4class'
gpu_ids = [0]

# ✅ optional: for visualization or post-processing only; not passed to the dataset
classes = ('car', 'grass', 'people', 'road')
palette = [
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
]
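Every config in this commit normalizes with `mean=[128, 128, 128]` and `std=[256, 256, 256]`, which maps 8-bit pixels into roughly `[-0.5, 0.5)`, presumably a convenient range for Kneron's fixed-point preprocessing (my assumption; the diff does not state the motivation). A quick numpy check of the extreme values:

```python
import numpy as np

mean = np.array([128., 128., 128.])
std = np.array([256., 256., 256.])

# Extreme 8-bit pixel values before and after the Normalize transform.
pixels = np.array([[0., 0., 0.], [255., 255., 255.]])
normed = (pixels - mean) / std
print(normed[0])  # [-0.5 -0.5 -0.5]
print(normed[1])  # [0.49609375 0.49609375 0.49609375]
```

Because 256 is a power of two, the division is an exact bit shift, unlike the ImageNet mean/std used by stock mmsegmentation configs.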
@@ -1,14 +1,17 @@
 checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'  # noqa
 _base_ = [
-    '../_base_/models/stdc.py', '../_base_/datasets/kn_cityscapes.py',
-    '../_base_/default_runtime.py', '../_base_/schedules/schedule_80k.py'
+    '../_base_/models/stdc.py', '../_base_/datasets/kn_cityscapes2.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_2k.py'
 ]
 lr_config = dict(warmup='linear', warmup_iters=1000)
 data = dict(
-    samples_per_gpu=12,
-    workers_per_gpu=4,
+    samples_per_gpu=2,
+    workers_per_gpu=2,
 )
 model = dict(
     backbone=dict(
         backbone_cfg=dict(
             init_cfg=dict(type='Pretrained', checkpoint=checkpoint))))
+
+
+
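The hunk above works because mmcv-style configs merge `_base_` files recursively, with the child config's keys winning: `samples_per_gpu` is overridden while untouched keys such as `data.train` survive from the base. A simplified sketch of that merge rule (real mmcv `Config` handles more cases, e.g. the `_delete_` key; this is an illustration, not mmcv's code):

```python
def merge(base, child):
    # Recursively merge `child` over `base`: dicts merge key-by-key,
    # anything else in the child simply replaces the base value.
    out = dict(base)
    for k, v in child.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = merge(out[k], v)
        else:
            out[k] = v
    return out

base = dict(data=dict(samples_per_gpu=12, workers_per_gpu=4,
                      train=dict(type='GolfDataset')))
child = dict(data=dict(samples_per_gpu=2, workers_per_gpu=2))
print(merge(base, child)['data'])
# {'samples_per_gpu': 2, 'workers_per_gpu': 2, 'train': {'type': 'GolfDataset'}}
```

This is why the config only needs to restate the two batch-size keys rather than the whole `data` dict.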
137 configs/stdc/meconfig.py Normal file
@@ -0,0 +1,137 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=4,
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'GolfDataset'
data_root = 'data/cityscapes0/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
146 configs/stdc/meconfig1.py Normal file
@ -0,0 +1,146 @@
|
||||
norm_cfg = dict(type='BN', requires_grad=True)
|
||||
|
||||
checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'
|
||||
|
||||
model = dict(
|
||||
type='EncoderDecoder',
|
||||
pretrained=None,
|
||||
backbone=dict(
|
||||
type='STDCContextPathNet',
|
||||
backbone_cfg=dict(
|
||||
type='STDCNet',
|
||||
stdc_type='STDCNet1',
|
||||
in_channels=3,
|
||||
channels=(32, 64, 256, 512, 1024),
|
||||
bottleneck_type='cat',
|
||||
num_convs=4,
|
||||
norm_cfg=norm_cfg,
|
||||
act_cfg=dict(type='ReLU'),
|
||||
with_final_conv=False),
|
||||
last_in_channels=(1024, 512),
|
||||
out_channels=128,
|
||||
ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
|
||||
decode_head=dict(
|
||||
type='FCNHead',
|
||||
in_channels=256,
|
||||
channels=256,
|
||||
num_convs=1,
|
||||
num_classes=1, # ✅ 只分類 grass
|
||||
in_index=3,
|
||||
concat_input=False,
|
||||
dropout_ratio=0.1,
|
||||
norm_cfg=norm_cfg,
|
||||
align_corners=True,
|
||||
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
|
||||
loss_decode=dict(
|
||||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
|
||||
auxiliary_head=[
|
||||
dict(
|
||||
type='FCNHead',
|
||||
in_channels=128,
|
||||
channels=64,
|
||||
num_convs=1,
|
||||
num_classes=1,
|
||||
in_index=2,
|
||||
norm_cfg=norm_cfg,
|
||||
concat_input=False,
|
||||
align_corners=False,
|
||||
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
|
||||
loss_decode=dict(
|
||||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
|
||||
dict(
|
||||
type='FCNHead',
|
||||
in_channels=128,
|
||||
channels=64,
|
||||
num_convs=1,
|
||||
num_classes=1,
|
||||
in_index=1,
|
||||
norm_cfg=norm_cfg,
|
||||
concat_input=False,
|
||||
align_corners=False,
|
||||
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
|
||||
loss_decode=dict(
|
||||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
|
||||
],
|
||||
train_cfg=dict(),
|
||||
test_cfg=dict(mode='whole')
|
||||
)
|
||||
|
||||
# ✅ 更新為你新的 dataset 類別
|
||||
dataset_type = 'GrassOnlyDataset'
|
||||
data_root = 'data/cityscapes/'
|
||||
|
||||
# ✅ 加入 classes 與 palette 定義
|
||||
classes = ('grass',)
|
||||
palette = [[0, 128, 0]]
|
||||
|
||||
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
|
||||
crop_size = (360, 720)
|
||||
|
||||
train_pipeline = [
|
||||
dict(type='LoadImageFromFile'),
|
||||
dict(type='LoadAnnotations'),
|
||||
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
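The `lr_config` above uses the `poly` policy. As a rough sketch (assuming mmcv's usual formula, where the learning rate decays from `lr` toward `min_lr` along a power curve over `max_iters`):

```python
def poly_lr(iter_idx, base_lr=0.01, max_iters=80000, power=0.9, min_lr=1e-4):
    # Sketch of the 'poly' schedule named in lr_config above, assumed to
    # follow (base_lr - min_lr) * (1 - t/T)^power + min_lr.
    coeff = (1 - iter_idx / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr

print(poly_lr(0))        # starts at base_lr
print(poly_lr(80000))    # decays to min_lr at max_iters
```

With `by_epoch=False`, the decay is applied per iteration, so the schedule is smooth across the whole 80k-iteration run.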
configs/stdc/meconfig2.py (new file, 149 lines)
@@ -0,0 +1,149 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=2,  # ✅ grass + road
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=2,
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=2,
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

# ✅ Use Golf2Dataset (grass and road)
dataset_type = 'Golf2Dataset'
data_root = 'data/cityscapes/'

# ✅ Classes and their corresponding colors
classes = ('grass', 'road')
palette = [
    [0, 255, 0],    # grass
    [255, 165, 0],  # road
]

img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
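The `classes`/`palette` pair above maps label ids to display colors. A minimal sketch of how such a palette is typically applied to a predicted label map (the 2×3 label map here is hypothetical; numpy fancy indexing does the lookup):

```python
import numpy as np

# Palette from the config above: index 0 = grass, index 1 = road.
palette = np.array([[0, 255, 0], [255, 165, 0]], dtype=np.uint8)

# Hypothetical (H, W) label map with class ids.
seg = np.array([[0, 1, 1],
                [1, 0, 0]])

color_seg = palette[seg]  # fancy indexing: (H, W) ids -> (H, W, 3) colors
print(color_seg.shape)    # (2, 3, 3)
```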
configs/stdc/meconfig4.py (new file, 151 lines)
@@ -0,0 +1,151 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=4,  # ✅ changed to 4 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅ changed to 4 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅ changed to 4 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

# ✅ New dataset class
dataset_type = 'Golf4Dataset'
data_root = 'data/cityscapes/'

# ✅ Classes and palette
classes = ('car', 'grass', 'people', 'road')
palette = [
    [0, 0, 128],    # car
    [0, 255, 0],    # grass
    [255, 0, 0],    # people
    [255, 165, 0],  # road
]

img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
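With `mean=[128.]*3` and `std=[256.]*3`, the `Normalize` step maps 8-bit pixel values into roughly [-0.5, 0.5). A quick check of the arithmetic:

```python
import numpy as np

mean, std = 128.0, 256.0
pixels = np.array([0.0, 128.0, 255.0])   # darkest, mid-gray, brightest 8-bit values
normalized = (pixels - mean) / std
print(normalized)  # [-0.5, 0.0, 0.49609375]
```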
configs/stdc/meconfig7.py (new file, 137 lines)
@@ -0,0 +1,137 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=7,
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=7,
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=7,
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=320000)
checkpoint_config = dict(by_epoch=False, interval=32000)
evaluation = dict(interval=32000, metric='mIoU', pre_eval=True)
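Every head in these configs uses `OHEMPixelSampler` with `thresh=0.7` and `min_kept=10000`. A toy sketch of the selection idea (an assumption about the concept behind OHEM, not mmseg's actual implementation, which operates on predicted probabilities inside the loss): pixels whose predicted ground-truth probability falls below the threshold count as hard, and at least `min_kept` of the hardest pixels are always retained.

```python
import numpy as np

def ohem_mask(gt_prob, thresh=0.7, min_kept=2):
    # Toy OHEM-style selection: low predicted ground-truth probability = hard.
    flat = gt_prob.ravel()
    order = np.argsort(flat)              # ascending: hardest pixels first
    n_hard = int((flat < thresh).sum())
    keep = order[:max(n_hard, min_kept)]  # keep at least min_kept pixels
    mask = np.zeros(flat.shape, dtype=bool)
    mask[keep] = True
    return mask.reshape(gt_prob.shape)

probs = np.array([[0.9, 0.2],
                  [0.8, 0.95]])
print(ohem_mask(probs))  # only one pixel is below 0.7, so min_kept pads to 2
```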
configs/stdc/meconfig8.py (new file, 137 lines)
@@ -0,0 +1,137 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=8,  # ✅ changed to 8 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ changed to 8 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ changed to 8 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'  # ✅ use Golf8Dataset
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=320000)
checkpoint_config = dict(by_epoch=False, interval=32000)
evaluation = dict(interval=32000, metric='mIoU', pre_eval=True)
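`IterBasedRunner` counts training length in iterations rather than epochs. A rough conversion for these settings (the dataset size and GPU count below are assumptions for illustration only; substitute the real values):

```python
samples_per_gpu = 2
num_gpus = 1          # assumed single-GPU training
dataset_size = 2975   # e.g. the Cityscapes train split; replace with the real size

iters_per_epoch = dataset_size / (samples_per_gpu * num_gpus)
epochs = 320000 / iters_per_epoch
print(round(epochs, 1))  # roughly how many passes over the data 320k iters is
```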
configs/stdc/meconfig8_finetune.py (new file, 147 lines)
@@ -0,0 +1,147 @@
norm_cfg = dict(type='BN', requires_grad=True)

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=8,  # ✅ 8 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)]
)

dist_params = dict(backend='nccl')
log_level = 'INFO'

# ✅ Fine-tune settings
load_from = 'C:/Users/rd_de/kneronstdc/work_dirs/meconfig8/0619/latest.pth'
resume_from = None

workflow = [('train', 1)]
cudnn_benchmark = True

# ✅ Recommended fine-tune learning rate
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()

lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=160000)

checkpoint_config = dict(by_epoch=False, interval=16000)
evaluation = dict(interval=16000, metric='mIoU', pre_eval=True)
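The fine-tune config sets `load_from` rather than `resume_from`. Conceptually (a toy sketch, not mmcv's actual checkpoint code): `load_from` restores only the model weights and restarts the schedule at iteration 0, while `resume_from` also restores the optimizer state and iteration counter so training continues where it left off.

```python
def start_state(checkpoint, mode):
    # Toy illustration of the load_from / resume_from distinction.
    state = {'weights': checkpoint['weights'], 'iter': 0, 'optimizer': None}
    if mode == 'resume_from':
        state['optimizer'] = checkpoint['optimizer']
        state['iter'] = checkpoint['iter']
    return state

ckpt = {'weights': 'w', 'optimizer': 'opt', 'iter': 320000}
print(start_state(ckpt, 'load_from')['iter'])    # fresh schedule from iter 0
print(start_state(ckpt, 'resume_from')['iter'])  # continues at the saved iter
```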
configs/stdc/test.py (new file, 147 lines)
@@ -0,0 +1,147 @@
norm_cfg = dict(type='BN', requires_grad=True)

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=8,  # ✅ 8 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)]
)

dist_params = dict(backend='nccl')
log_level = 'INFO'

# ✅ Fine-tune settings
load_from = 'C:/Users/rd_de/kneronstdc/work_dirs/meconfig8/0619/latest.pth'
resume_from = None

workflow = [('train', 1)]
cudnn_benchmark = True

# ✅ Recommended fine-tune learning rate
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()

lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=160)

checkpoint_config = dict(by_epoch=False, interval=16)
evaluation = dict(interval=16, metric='mIoU', pre_eval=True)
kneron_preprocessing/API.py (new file, 684 lines)
@@ -0,0 +1,684 @@
# -*- coding: utf-8 -*-

import numpy as np
import os
from .funcs.utils import str2int, str2bool
from . import Flow

flow = Flow()
flow.set_numerical_type('floating')
flow_520 = Flow()
flow_520.set_numerical_type('520')
flow_720 = Flow()
flow_720.set_numerical_type('720')

DEFAULT = None
default = {
    'crop': {
        'align_w_to_4': False
    },
    'resize': {
        'type': 'bilinear',
        'calculate_ratio_using_CSim': False
    }
}

def set_default_as_520():
    """
    Set some default parameters to the 520 settings:

        crop.align_w_to_4 = True
        crop.pad_square_to_4 = True
        resize.type = 'fixed_520'
        resize.calculate_ratio_using_CSim = True
    """
    global default
    default['crop']['align_w_to_4'] = True
    default['resize']['type'] = 'fixed_520'
    default['resize']['calculate_ratio_using_CSim'] = True

def set_default_as_floating():
    """
    Set some default parameters to the floating settings:

        crop.align_w_to_4 = False
        crop.pad_square_to_4 = False
        resize.type = 'bilinear'
        resize.calculate_ratio_using_CSim = False
    """
    global default
    default['crop']['align_w_to_4'] = False
    default['resize']['type'] = 'bilinear'
    default['resize']['calculate_ratio_using_CSim'] = False
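The two setters above just mutate the module-level `default` dict. A standalone re-creation of that behavior, showing the effect of switching to the 520 defaults:

```python
# Standalone re-creation of the module-level default toggling above.
default = {
    'crop': {'align_w_to_4': False},
    'resize': {'type': 'bilinear', 'calculate_ratio_using_CSim': False},
}

def set_default_as_520():
    default['crop']['align_w_to_4'] = True
    default['resize']['type'] = 'fixed_520'
    default['resize']['calculate_ratio_using_CSim'] = True

set_default_as_520()
print(default['resize']['type'])  # fixed_520
```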
def print_info_on():
    """
    Turn print information on.
    """
    flow.set_print_info(True)
    flow_520.set_print_info(True)

def print_info_off():
    """
    Turn print information off.
    """
    flow.set_print_info(False)
    flow_520.set_print_info(False)

def load_image(image):
    """
    load_image function
    Load an image and output it as an rgb888-format np.array.

    Args:
        image: [np.array/str], can be an np.array or an image file path

    Returns:
        out: [np.array], rgb888 format
    """
    image = flow.load_image(image, is_raw=False)
    return image

def load_bin(image, fmt=None, size=None):
    """
    load_bin function
    Load a bin file and output it as an rgb888-format np.array.

    Args:
        image: [str], bin file path
        fmt: [str], "rgb888" / "rgb565" / "nir"
        size: [tuple], (image_w, image_h)

    Returns:
        out: [np.array], rgb888 format

    Examples:
        >>> image_data = kneron_preprocessing.API.load_bin(image, 'rgb565', (raw_w, raw_h))
    """
    assert isinstance(size, tuple)
    assert isinstance(fmt, str)
    # assert (fmt.lower() in ['rgb888', "rgb565", "nir", 'RGB888', "RGB565", "NIR", 'NIR888', 'nir888'])

    image = flow.load_image(image, is_raw=True, raw_img_type='bin', raw_img_fmt=fmt, img_in_width=size[0], img_in_height=size[1])
    flow.set_color_conversion(source_format=fmt, out_format='rgb888')
    image, _ = flow.funcs['color'](image)
    return image
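`load_bin` accepts packed pixel formats such as rgb565. For reference, a standalone sketch of rgb565 → rgb888 expansion using the common convention of replicating each channel's high bits into its low bits (this illustrates the format, not necessarily the exact conversion Flow performs):

```python
def rgb565_to_rgb888(pixel16):
    # Unpack a 16-bit rgb565 pixel: 5 bits red, 6 bits green, 5 bits blue.
    r5 = (pixel16 >> 11) & 0x1F
    g6 = (pixel16 >> 5) & 0x3F
    b5 = pixel16 & 0x1F
    # Expand to 8 bits per channel by replicating the high bits.
    return ((r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2))

print(rgb565_to_rgb888(0xFFFF))  # (255, 255, 255)
print(rgb565_to_rgb888(0x0000))  # (0, 0, 0)
```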
def load_hex(file, fmt=None, size=None):
|
||||
"""
|
||||
load_hex function
|
||||
load hex file and output as rgb888 format np.array
|
||||
|
||||
Args:
|
||||
image: [str], hex file path
|
||||
fmt: [str], "rgb888" / "yuv444" / "ycbcr444" / "yuv422" / "ycbcr422" / "rgb565"
|
||||
size: [tuble], (image_w, image_h)
|
||||
|
||||
Returns:
|
||||
out: [np.array], rgb888 format
|
||||
|
||||
Examples:
|
||||
>>> image_data = kneron_preprocessing.API.load_hex(image,'rgb565',(raw_w,raw_h))
|
||||
"""
|
||||
assert isinstance(size, tuple)
|
||||
assert isinstance(fmt, str)
|
||||
assert (fmt.lower() in ['rgb888',"yuv444" , "ycbcr444" , "yuv422" , "ycbcr422" , "rgb565"])
|
||||
|
||||
image = flow.load_image(file, is_raw = True, raw_img_type='hex', raw_img_fmt = fmt, img_in_width = size[0], img_in_height = size[1])
|
||||
flow.set_color_conversion(source_format=fmt, out_format = 'rgb888')
|
||||
image,_ = flow.funcs['color'](image)
|
||||
return image


def dump_image(image, output=None, file_fmt='txt', image_fmt='rgb888', order=0):
    """
    dump_image function

    Dump as txt, bin or hex; default is txt.
    The image format is one of: RGB888, RGBA8888, RGB565, NIR, YUV444, YCbCr444, YUV422, YCbCr422; default is RGB888.

    Args:
        image: [np.array/str], can be np.array or image file path
        output: [str], dump file path
        file_fmt: [str], "bin" / "txt" / "hex", dump file format, default is txt
        image_fmt: [str], RGB888 / RGBA8888 / RGB565 / NIR / YUV444 / YCbCr444 / YUV422 / YCbCr422, default is RGB888
        order: [int], default 0

    Examples:
        >>> kneron_preprocessing.API.dump_image(image_data,out_path,file_fmt='bin')
    """
    if isinstance(image, str):
        image = load_image(image)

    assert isinstance(image, np.ndarray)
    if output is None:
        return

    flow.set_output_setting(is_dump=False, dump_format=file_fmt, image_format=image_fmt, output_file=output)
    flow.dump_image(image)
    return


def convert(image, out_fmt='RGB888', source_fmt='RGB888'):
    """
    color convert

    Args:
        image: [np.array], input
        out_fmt: [str], "rgb888" / "rgba8888" / "rgb565" / "yuv" / "ycbcr" / "yuv422" / "ycbcr422"
        source_fmt: [str], "rgb888" / "rgba8888" / "rgb565" / "yuv" / "ycbcr" / "yuv422" / "ycbcr422"

    Returns:
        out: [np.array]

    Examples:
    """
    flow.set_color_conversion(source_format=source_fmt, out_format=out_fmt, simulation=False)
    image, _ = flow.funcs['color'](image)
    return image
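The 'rgb565' format mentioned above packs a pixel into 16 bits. A hypothetical standalone sketch (not taken from the library; `pack_rgb565` / `unpack_rgb565` are names introduced here) of the standard 5-6-5 bit layout this implies:

```python
# 5 bits red, 6 bits green, 5 bits blue, assuming the standard RGB565 layout.
def pack_rgb565(r, g, b):
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def unpack_rgb565(v):
    # expand back toward 8 bits by shifting into the high bits
    r = (v >> 11) & 0x1F
    g = (v >> 5) & 0x3F
    b = v & 0x1F
    return (r << 3, g << 2, b << 3)
```

Note the round trip is only exact for values whose low bits are already zero, which is why converting through rgb565 loses precision.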


def get_crop_range(box, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0):
    """
    Get the exact crop box according to the different settings.

    Args:
        box: [tuple], (x1, y1, x2, y2)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding

    Returns:
        out: [tuple,4], (crop_x1, crop_y1, crop_x2, crop_y2)

    Examples:
        >>> kneron_preprocessing.API.get_crop_range((272,145,461,341), align_w_to_4=True, pad_square_to_4=True)
        (272, 145, 460, 341)
    """
    if box is None:
        return (0, 0, 0, 0)
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='specific', start_x=box[0], start_y=box[1], end_x=box[2], end_y=box[3], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image = np.zeros((1, 1, 3)).astype('uint8')
    _, info = flow.funcs['crop'](image)

    return info['box']


def crop(image, box=None, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0, info_out=None):
    """
    crop function

    Crop to the specific range given by box.

    Args:
        image: [np.array], input
        box: [tuple], (x1, y1, x2, y2)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding
        info_out: [dict], the final crop box is saved into info_out['box']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop(image_data,(272,145,461,341), align_w_to_4=True, info_out=info)
        >>> info['box']
        (272, 145, 460, 341)

        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop(image_data,(272,145,461,341), pad_square_to_4=True, info_out=info)
        >>> info['box']
        (268, 145, 464, 341)
    """
    assert isinstance(image, np.ndarray)
    if box is None:
        return image
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='specific', start_x=box[0], start_y=box[1], end_x=box[2], end_y=box[3], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image, info = flow.funcs['crop'](image)

    if info_out is not None:
        info_out['box'] = info['box']
    return image


def crop_center(image, range=None, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0, info_out=None):
    """
    crop function

    Center crop by range.

    Args:
        image: [np.array], input
        range: [tuple], (crop_w, crop_h)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding
        info_out: [dict], the final crop box is saved into info_out['box']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_center(image_data,(102,40), align_w_to_4=True,info_out=info)
        >>> info['box']
        (268, 220, 372, 260)

        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_center(image_data,(102,40), pad_square_to_4=True, info_out=info)
        >>> info['box']
        (269, 192, 371, 294)
    """
    assert isinstance(image, np.ndarray)
    if range is None:
        return image
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='center', crop_w=range[0], crop_h=range[1], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image, info = flow.funcs['crop'](image)

    if info_out is not None:
        info_out['box'] = info['box']
    return image


def crop_corner(image, range=None, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0, info_out=None):
    """
    crop function

    Corner crop by range.

    Args:
        image: [np.array], input
        range: [tuple], (crop_w, crop_h)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding
        info_out: [dict], the final crop box is saved into info_out['box']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_corner(image_data,(102,40), align_w_to_4=True,info_out=info)
        >>> info['box']
        (0, 0, 104, 40)

        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_corner(image_data,(102,40), pad_square_to_4=True,info_out=info)
        >>> info['box']
        (0, -28, 102, 74)
    """
    assert isinstance(image, np.ndarray)
    if range is None:
        return image
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='corner', crop_w=range[0], crop_h=range[1], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image, info = flow.funcs['crop'](image)

    if info_out is not None:
        info_out['box'] = info['box']
    return image


def resize(image, size=None, keep_ratio=True, zoom=True, type=DEFAULT, calculate_ratio_using_CSim=DEFAULT, info_out=None):
    """
    resize function

    The resize type can be 'bilinear' or 'bilicubic' (floating point), or 'fixed' / 'fixed_520' / 'fixed_720' (fixed point).
    The 'fixed_520' / 'fixed_720' types add extra handling to simulate known 520/720 hardware bugs.

    Args:
        image: [np.array], input
        size: [tuple], (input_w, input_h)
        keep_ratio: [bool], keep the aspect ratio or not, default True
        zoom: [bool], allow resize to enlarge the image or not, default True
        type: [str], "bilinear" / "bilicubic" / "cv2" / "fixed" / "fixed_520" / "fixed_720"
        calculate_ratio_using_CSim: [bool], calculate the ratio and scale using the CSim function and C float, default False
        info_out: [dict], the final scaled size (w, h) is saved into info_out['size']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.resize(image_data,size=(56,56),type='fixed',info_out=info)
        >>> info['size']
        (54,56)
    """
    assert isinstance(image, np.ndarray)
    if size is None:
        return image
    if type is None:
        type = default['resize']['type']
    if calculate_ratio_using_CSim is None:
        calculate_ratio_using_CSim = default['resize']['calculate_ratio_using_CSim']

    flow.set_resize(resize_w=size[0], resize_h=size[1], type=type, keep_ratio=keep_ratio, zoom=zoom, calculate_ratio_using_CSim=calculate_ratio_using_CSim)
    image, info = flow.funcs['resize'](image)
    if info_out is not None:
        info_out['size'] = info['size']

    return image
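The actual scaled size is computed inside `flow.funcs['resize']`, but the keep-ratio behaviour the docstring example shows (a source image fitted into a 56x56 box, ending up 54x56) can be sketched as follows. This is a hypothetical illustration, not the library implementation; `keep_ratio_size` is a name introduced here:

```python
def keep_ratio_size(src_w, src_h, dst_w, dst_h, zoom=True):
    """Return (w, h) that fits inside (dst_w, dst_h) while preserving aspect ratio."""
    ratio = min(dst_w / src_w, dst_h / src_h)
    if not zoom:
        # without zoom, never enlarge the image
        ratio = min(ratio, 1.0)
    return max(1, round(src_w * ratio)), max(1, round(src_h * ratio))
```

For a 100x104 input and a 56x56 model size this gives (54, 56), matching the shape of the docstring example above.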


def pad(image, pad_l=0, pad_r=0, pad_t=0, pad_b=0, pad_val=0):
    """
    pad function

    Specify left, right, top and bottom pad sizes.

    Args:
        image: [np.array], input
        pad_l: [int], pad size from left, default 0
        pad_r: [int], pad size from right, default 0
        pad_t: [int], pad size from top, default 0
        pad_b: [int], pad size from bottom, default 0
        pad_val: [float], the pad value, default 0

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.pad(image_data,20,40,20,40,-0.5)
    """
    assert isinstance(image, np.ndarray)

    flow.set_padding(type='specific', pad_l=pad_l, pad_r=pad_r, pad_t=pad_t, pad_b=pad_b, pad_val=pad_val)
    image, _ = flow.funcs['padding'](image)
    return image


def pad_center(image, size=None, pad_val=0):
    """
    pad function

    Center pad to the given size.

    Args:
        image: [np.array], input
        size: [tuple], (padded_size_w, padded_size_h)
        pad_val: [float], the pad value, default 0

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.pad_center(image_data,size=(56,56),pad_val=-0.5)
    """
    assert isinstance(image, np.ndarray)
    if size is None:
        return image
    assert ((image.shape[0] <= size[1]) & (image.shape[1] <= size[0]))

    flow.set_padding(type='center', padded_w=size[0], padded_h=size[1], pad_val=pad_val)
    image, _ = flow.funcs['padding'](image)
    return image


def pad_corner(image, size=None, pad_val=0):
    """
    pad function

    Corner pad to the given size.

    Args:
        image: [np.array], input
        size: [tuple], (padded_size_w, padded_size_h)
        pad_val: [float], the pad value, default 0

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.pad_corner(image_data,size=(56,56),pad_val=-0.5)
    """
    assert isinstance(image, np.ndarray)
    if size is None:
        return image
    assert ((image.shape[0] <= size[1]) & (image.shape[1] <= size[0]))

    flow.set_padding(type='corner', padded_w=size[0], padded_h=size[1], pad_val=pad_val)
    image, _ = flow.funcs['padding'](image)
    return image
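Center padding has to split the total pad between two sides. A minimal sketch of that split (an illustration with an introduced name, `center_pads`; which side receives the odd extra pixel is an assumption here, not taken from the library):

```python
def center_pads(src, dst):
    """Split the total pad (dst - src) between two sides; the second side
    receives the extra pixel when the total is odd (an assumption)."""
    total = dst - src
    return total // 2, total - total // 2
```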


def norm(image, scale=256., bias=-0.5, mean=None, std=None):
    """
    norm function

    x = x / scale + bias
    x[0,1,2] = x - mean[0,1,2]
    x[0,1,2] = x / std[0,1,2]

    Args:
        image: [np.array], input
        scale: [float], default = 256
        bias: [float], default = -0.5
        mean: [tuple,3], default = None
        std: [tuple,3], default = None

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.norm(image_data)
        >>> image_data = kneron_preprocessing.API.norm(image_data,mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    """
    assert isinstance(image, np.ndarray)

    flow.set_normalize(type='specific', scale=scale, bias=bias, mean=mean, std=std)
    image, _ = flow.funcs['normalize'](image)
    return image
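The real work happens in `flow.funcs['normalize']`; a hypothetical sketch of the arithmetic (`norm_sketch` is a name introduced here), assuming the common Kneron convention that maps pixel 128 to 0, i.e. x/256 - 0.5, which is x/scale + bias with the default bias = -0.5:

```python
import numpy as np

def norm_sketch(image, scale=256.0, bias=-0.5, mean=None, std=None):
    # x = x / scale + bias, then optional per-channel mean/std adjustment
    x = image.astype(np.float64) / scale + bias
    if mean is not None:
        x -= np.asarray(mean)
    if std is not None:
        x /= np.asarray(std)
    return x
```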


def inproc_520(image, raw_fmt='rgb565', raw_size=None, npu_size=None, crop_box=None, pad_mode=0, norm='kneron', gray=False, rotate=0, radix=8, bit_width=8, round_w_to_16=True, NUM_BANK_LINE=32, BANK_ENTRY_CNT=512, MAX_IMG_PREPROC_ROW_NUM=511, MAX_IMG_PREPROC_COL_NUM=256):
    """
    inproc_520

    Args:
        image: [np.array], input
        crop_box: [tuple], (x1, y1, x2, y2); if None, crop is skipped
        pad_mode: [int], 0: pad 2 sides, 1: pad 1 side, 2: no pad, default = 0
        norm: [str], default = 'kneron'
        rotate: [int], 0 / 1 / 2, default = 0
        radix: [int], default = 8
        bit_width: [int], default = 8
        round_w_to_16: [bool], default = True
        gray: [bool], default = False

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.inproc_520(image_data,npu_size=(56,56),crop_box=(272,145,460,341),pad_mode=1)
    """
    if not isinstance(image, np.ndarray):
        flow_520.set_raw_img(is_raw_img='yes', raw_img_type='bin', raw_img_fmt=raw_fmt, img_in_width=raw_size[0], img_in_height=raw_size[1])
    else:
        flow_520.set_raw_img(is_raw_img='no')
        flow_520.set_color_conversion(source_format='rgb888')

    if npu_size is None:
        return image

    flow_520.set_model_size(w=npu_size[0], h=npu_size[1])

    ## Crop
    if crop_box is not None:
        flow_520.set_crop(start_x=crop_box[0], start_y=crop_box[1], end_x=crop_box[2], end_y=crop_box[3])
        crop_fisrt = True
    else:
        crop_fisrt = False

    ## Color
    if gray:
        flow_520.set_color_conversion(out_format='l', simulation='no')
    else:
        flow_520.set_color_conversion(out_format='rgb888', simulation='no')

    ## Resize & Pad
    pad_mode = str2int(pad_mode)
    if pad_mode == 0:
        pad_type = 'center'
        resize_keep_ratio = 'yes'
    elif pad_mode == 1:
        pad_type = 'corner'
        resize_keep_ratio = 'yes'
    else:
        pad_type = 'center'
        resize_keep_ratio = 'no'

    flow_520.set_resize(keep_ratio=resize_keep_ratio)
    flow_520.set_padding(type=pad_type)

    ## Norm
    flow_520.set_normalize(type=norm)

    ## 520 inproc
    flow_520.set_520_setting(radix=radix, bit_width=bit_width, rotate=rotate, crop_fisrt=crop_fisrt, round_w_to_16=round_w_to_16, NUM_BANK_LINE=NUM_BANK_LINE, BANK_ENTRY_CNT=BANK_ENTRY_CNT, MAX_IMG_PREPROC_ROW_NUM=MAX_IMG_PREPROC_ROW_NUM, MAX_IMG_PREPROC_COL_NUM=MAX_IMG_PREPROC_COL_NUM)
    image_data, _ = flow_520.run_whole_process(image)

    return image_data


def inproc_720(image, raw_fmt='rgb565', raw_size=None, npu_size=None, crop_box=None, pad_mode=0, norm='kneron', gray=False):
    """
    inproc_720

    Args:
        image: [np.array], input
        crop_box: [tuple], (x1, y1, x2, y2); if None, crop is skipped
        pad_mode: [int], 0: pad 2 sides, 1: pad 1 side, 2: no pad, default = 0
        norm: [str], default = 'kneron'
        gray: [bool], default = False

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.inproc_720(image_data,npu_size=(56,56),crop_box=(272,145,460,341),pad_mode=1)
    """
    if not isinstance(image, np.ndarray):
        flow_720.set_raw_img(is_raw_img='yes', raw_img_type='bin', raw_img_fmt=raw_fmt, img_in_width=raw_size[0], img_in_height=raw_size[1])
    else:
        flow_720.set_raw_img(is_raw_img='no')
        flow_720.set_color_conversion(source_format='rgb888')

    if npu_size is None:
        return image

    flow_720.set_model_size(w=npu_size[0], h=npu_size[1])

    ## Crop
    if crop_box is not None:
        flow_720.set_crop(start_x=crop_box[0], start_y=crop_box[1], end_x=crop_box[2], end_y=crop_box[3])
        crop_fisrt = True
    else:
        crop_fisrt = False

    ## Color
    if gray:
        flow_720.set_color_conversion(out_format='l', simulation='no')
    else:
        flow_720.set_color_conversion(out_format='rgb888', simulation='no')

    ## Resize & Pad
    pad_mode = str2int(pad_mode)
    if pad_mode == 0:
        pad_type = 'center'
        resize_keep_ratio = 'yes'
    elif pad_mode == 1:
        pad_type = 'corner'
        resize_keep_ratio = 'yes'
    else:
        pad_type = 'center'
        resize_keep_ratio = 'no'

    flow_720.set_resize(keep_ratio=resize_keep_ratio)
    flow_720.set_padding(type=pad_type)

    ## 720 inproc
    # flow_720.set_720_setting(radix=radix,bit_width=bit_width,rotate=rotate,crop_fisrt=crop_fisrt,round_w_to_16=round_w_to_16,NUM_BANK_LINE=NUM_BANK_LINE,BANK_ENTRY_CNT=BANK_ENTRY_CNT,MAX_IMG_PREPROC_ROW_NUM=MAX_IMG_PREPROC_ROW_NUM,MAX_IMG_PREPROC_COL_NUM=MAX_IMG_PREPROC_COL_NUM)
    image_data, _ = flow_720.run_whole_process(image)

    return image_data
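Both inproc functions map `pad_mode` to the same (pad type, keep-ratio) pair via an if/elif chain. For illustration, the mapping rewritten as a lookup (`PAD_MODES` / `pad_settings` are names introduced here, not library API):

```python
PAD_MODES = {
    0: ('center', True),   # pad 2 sides, keep aspect ratio
    1: ('corner', True),   # pad 1 side, keep aspect ratio
}

def pad_settings(pad_mode):
    # any other mode: no ratio-preserving pad; stretch to the model size
    return PAD_MODES.get(pad_mode, ('center', False))
```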


def bit_match(data1, data2):
    """
    bit_match function

    Check whether data1 is equal to data2.

    Args:
        data1: [np.array / str], can be an array or a txt/bin file
        data2: [np.array / str], can be an array or a txt/bin file

    Returns:
        out1: [bool], match or not
        out2: [np.array], if not matched, the positions of the mismatched data

    Examples:
        >>> result, mismatched = kneron_preprocessing.API.bit_match(data1,data2)
    """
    if isinstance(data1, str):
        if os.path.splitext(data1)[1] == '.bin':
            data1 = np.fromfile(data1, dtype='uint8')
        elif os.path.splitext(data1)[1] == '.txt':
            data1 = np.loadtxt(data1)

    assert isinstance(data1, np.ndarray)

    if isinstance(data2, str):
        if os.path.splitext(data2)[1] == '.bin':
            data2 = np.fromfile(data2, dtype='uint8')
        elif os.path.splitext(data2)[1] == '.txt':
            data2 = np.loadtxt(data2)

    assert isinstance(data2, np.ndarray)

    data1 = data1.reshape((-1, 1))
    data2 = data2.reshape((-1, 1))

    if len(data1) != len(data2):
        print('error len')
        return False, np.zeros((1))
    else:
        # cast before subtracting so uint8 differences cannot wrap around,
        # and compare against zero so mismatches in either direction are caught
        ans = data2.astype(np.int64) - data1.astype(np.int64)
        if len(np.where(ans != 0)[0]) > 0:
            print('error', np.where(ans != 0)[0])
            return False, np.where(ans != 0)[0]
        else:
            print('pass')
            return True, np.zeros((1))
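The same check, stripped of the file loading, can be written as a small standalone helper (a sketch with an introduced name, `positions_of_mismatch`; note the signed cast, which keeps uint8 differences from wrapping around):

```python
import numpy as np

def positions_of_mismatch(a, b):
    """Return indices where the flattened arrays differ, or None if the
    lengths differ and the comparison is undefined."""
    a = np.asarray(a).reshape(-1).astype(np.int64)
    b = np.asarray(b).reshape(-1).astype(np.int64)
    if a.size != b.size:
        return None
    return np.flatnonzero(a != b)
```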


def cpr_to_crp(x_start, x_end, y_start, y_end, pad_l, pad_r, pad_t, pad_b, rx_start, rx_end, ry_start, ry_end):
    """
    Convert the parameters of a crop->pad->resize flow to the HW crop->resize->pad flow.

    Args:

    Returns:

    Examples:

    """
    # compute both ratios from the original values before updating any pad,
    # so the later roundings do not see an already-modified denominator
    ratio_x = (rx_end - rx_start) / (x_end - x_start + pad_l + pad_r)
    ratio_y = (ry_end - ry_start) / (y_end - y_start + pad_t + pad_b)
    pad_l = round(pad_l * ratio_x)
    pad_r = round(pad_r * ratio_x)
    pad_t = round(pad_t * ratio_y)
    pad_b = round(pad_b * ratio_y)

    rx_start += pad_l
    rx_end -= pad_r
    ry_start += pad_t
    ry_end -= pad_b

    return x_start, x_end, y_start, y_end, pad_l, pad_r, pad_t, pad_b, rx_start, rx_end, ry_start, ry_end
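The horizontal half of that rescaling, isolated as a worked example (an illustration with an introduced name, `scale_pads`; it assumes both pads are scaled by the ratio computed from the original, pre-update extents):

```python
def scale_pads(x_start, x_end, pad_l, pad_r, rx_start, rx_end):
    # resize ratio of the padded crop onto the resized span
    ratio = (rx_end - rx_start) / (x_end - x_start + pad_l + pad_r)
    return round(pad_l * ratio), round(pad_r * ratio)
```

For example, an 80-wide crop with 10 pixels of pad on each side resized onto a 50-wide span has ratio 50/100 = 0.5, so each pad becomes 5.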

kneron_preprocessing/Cflow.py (new file, 172 lines)
@@ -0,0 +1,172 @@
import numpy as np
import argparse
import kneron_preprocessing


def main_(args):
    image = args.input_file
    filefmt = args.file_fmt
    if filefmt == 'bin':
        raw_format = args.raw_format
        raw_w = args.input_width
        raw_h = args.input_height

        image_data = kneron_preprocessing.API.load_bin(image, raw_format, (raw_w, raw_h))
    else:
        image_data = kneron_preprocessing.API.load_image(image)

    npu_w = args.width
    npu_h = args.height

    crop_first = (args.crop_first == "True")
    if crop_first:
        x1 = args.x_pos
        y1 = args.y_pos
        x2 = args.crop_w + x1
        y2 = args.crop_h + y1
        crop_box = [x1, y1, x2, y2]
    else:
        crop_box = None

    pad_mode = args.pad_mode
    norm_mode = args.norm_mode
    bitwidth = args.bitwidth
    radix = args.radix
    rotate = args.rotate_mode

    ##
    image_data = kneron_preprocessing.API.inproc_520(image_data, npu_size=(npu_w, npu_h), crop_box=crop_box, pad_mode=pad_mode, norm=norm_mode, rotate=rotate, radix=radix, bit_width=bitwidth)

    output_file = args.output_file
    kneron_preprocessing.API.dump_image(image_data, output_file, 'bin', 'rgba')

    return


if __name__ == "__main__":
    argparser = argparse.ArgumentParser(
        description="preprocessing"
    )

    argparser.add_argument(
        '-i',
        '--input_file',
        help="input file name"
    )

    argparser.add_argument(
        '-ff',
        '--file_fmt',
        help="input file format, jpg or bin"
    )

    argparser.add_argument(
        '-rf',
        '--raw_format',
        help="input file image format, rgb or rgb565 or nir"
    )

    argparser.add_argument(
        '-i_w',
        '--input_width',
        type=int,
        help="input image width"
    )

    argparser.add_argument(
        '-i_h',
        '--input_height',
        type=int,
        help="input image height"
    )

    argparser.add_argument(
        '-o',
        '--output_file',
        help="output file name"
    )

    argparser.add_argument(
        '-s_w',
        '--width',
        type=int,
        help="output width for npu input",
    )

    argparser.add_argument(
        '-s_h',
        '--height',
        type=int,
        help="output height for npu input",
    )

    argparser.add_argument(
        '-c_f',
        '--crop_first',
        help="crop first, True or False",
    )

    argparser.add_argument(
        '-x',
        '--x_pos',
        type=int,
        help="left-up coordinate x",
    )

    argparser.add_argument(
        '-y',
        '--y_pos',
        type=int,
        help="left-up coordinate y",
    )

    argparser.add_argument(
        '-c_w',
        '--crop_w',
        type=int,
        help="crop width",
    )

    argparser.add_argument(
        '-c_h',
        '--crop_h',
        type=int,
        help="crop height",
    )

    argparser.add_argument(
        '-p_m',
        '--pad_mode',
        type=int,
        help="0: pad 2 sides, 1: pad 1 side, 2: no pad",
    )

    argparser.add_argument(
        '-n_m',
        '--norm_mode',
        help="normalization mode: yolo, kneron, tf"
    )

    argparser.add_argument(
        '-r_m',
        '--rotate_mode',
        type=int,
        help="rotate mode: 0, 1, 2"
    )

    argparser.add_argument(
        '-bw',
        '--bitwidth',
        type=int,
        help="int for bitwidth"
    )

    argparser.add_argument(
        '-r',
        '--radix',
        type=int,
        help="int for radix"
    )

    args = argparser.parse_args()
    main_(args)

kneron_preprocessing/Flow.py (new file, 1226 lines)
File diff suppressed because it is too large

kneron_preprocessing/__init__.py (new file)
@@ -0,0 +1,2 @@
from .Flow import *
from .API import *

kneron_preprocessing/funcs/ColorConversion.py (new file, 285 lines)
@@ -0,0 +1,285 @@
import numpy as np
from PIL import Image
from .utils import signed_rounding, clip, str2bool

format_bit = 10
c00_yuv = 1
c02_yuv = 1436
c10_yuv = 1
c11_yuv = -354
c12_yuv = -732
c20_yuv = 1
c21_yuv = 1814
c00_ycbcr = 1192
c02_ycbcr = 1634
c10_ycbcr = 1192
c11_ycbcr = -401
c12_ycbcr = -833
c20_ycbcr = 1192
c21_ycbcr = 2065

Matrix_ycbcr_to_rgb888 = np.array(
    [[1.16438356e+00, 1.16438356e+00, 1.16438356e+00],
     [2.99747219e-07, -3.91762529e-01, 2.01723263e+00],
     [1.59602686e+00, -8.12968294e-01, 3.04059479e-06]])

Matrix_rgb888_to_ycbcr = np.array(
    [[0.25678824, -0.14822353, 0.43921569],
     [0.50412941, -0.29099216, -0.36778824],
     [0.09790588, 0.43921569, -0.07142745]])

Matrix_rgb888_to_yuv = np.array(
    [[0.29899106, -0.16877996, 0.49988381],
     [0.5865453, -0.33110385, -0.41826072],
     [0.11446364, 0.49988381, -0.08162309]])

# Matrix_rgb888_to_yuv = np.array(
#     [[0.299, -0.147, 0.615],
#      [0.587, -0.289, -0.515],
#      [0.114, 0.436, -0.100]])

# Matrix_yuv_to_rgb888 = np.array(
#     [[1.000, 1.000, 1.000],
#      [0.000, -0.394, 2.032],
#      [1.140, -0.581, 0.000]])

class runner(object):
    def __init__(self):
        self.set = {
            'print_info': 'no',
            'model_size': [0, 0],
            'numerical_type': 'floating',
            "source_format": "rgb888",
            "out_format": "rgb888",
            "options": {
                "simulation": "no",
                "simulation_format": "rgb888"
            }
        }

    def update(self, **kwargs):
        self.set.update(kwargs)

        ## simulation
        self.funs = []
        if str2bool(self.set['options']['simulation']) and self.set['source_format'].lower() in ('rgb888', 'rgb'):
            if self.set['options']['simulation_format'].lower() in ('yuv422', 'yuv'):
                self.funs.append(self._ColorConversion_RGB888_to_YUV422)
                self.set['source_format'] = 'YUV422'
            elif self.set['options']['simulation_format'].lower() in ('ycbcr422', 'ycbcr'):
                self.funs.append(self._ColorConversion_RGB888_to_YCbCr422)
                self.set['source_format'] = 'YCbCr422'
            elif self.set['options']['simulation_format'].lower() == 'rgb565':
                self.funs.append(self._ColorConversion_RGB888_to_RGB565)
                self.set['source_format'] = 'RGB565'

        ## to rgb888
        if self.set['source_format'].lower() in ('yuv444', 'yuv422', 'yuv'):
            self.funs.append(self._ColorConversion_YUV_to_RGB888)
        elif self.set['source_format'].lower() in ('ycbcr444', 'ycbcr422', 'ycbcr'):
            self.funs.append(self._ColorConversion_YCbCr_to_RGB888)
        elif self.set['source_format'].lower() == 'rgb565':
            self.funs.append(self._ColorConversion_RGB565_to_RGB888)
        elif self.set['source_format'].lower() in ('l', 'nir'):
            self.funs.append(self._ColorConversion_L_to_RGB888)
        elif self.set['source_format'].lower() in ('rgba8888', 'rgba'):
            self.funs.append(self._ColorConversion_RGBA8888_to_RGB888)

        ## output format
        if self.set['out_format'].lower() == 'l':
            self.funs.append(self._ColorConversion_RGB888_to_L)
        elif self.set['out_format'].lower() == 'rgb565':
            self.funs.append(self._ColorConversion_RGB888_to_RGB565)
        elif self.set['out_format'].lower() in ('rgba', 'rgba8888'):
            self.funs.append(self._ColorConversion_RGB888_to_RGBA8888)
        elif self.set['out_format'].lower() in ('yuv', 'yuv444'):
            self.funs.append(self._ColorConversion_RGB888_to_YUV444)
        elif self.set['out_format'].lower() == 'yuv422':
            self.funs.append(self._ColorConversion_RGB888_to_YUV422)
        elif self.set['out_format'].lower() in ('ycbcr', 'ycbcr444'):
            self.funs.append(self._ColorConversion_RGB888_to_YCbCr444)
        elif self.set['out_format'].lower() == 'ycbcr422':
            self.funs.append(self._ColorConversion_RGB888_to_YCbCr422)

    def print_info(self):
        print("<colorConversion>",
              "source_format:", self.set['source_format'],
              ', out_format:', self.set['out_format'],
              ', simulation:', self.set['options']['simulation'],
              ', simulation_format:', self.set['options']['simulation_format'])

    def run(self, image_data):
        assert isinstance(image_data, np.ndarray)
        # print info
        if str2bool(self.set['print_info']):
            self.print_info()

        # color
        for f in self.funs:
            image_data = f(image_data)

        # output
        info = {}
        return image_data, info

    def _ColorConversion_RGB888_to_YUV444(self, image):
        ## floating
        image = image.astype('float')
        image = (image @ Matrix_rgb888_to_yuv + 0.5).astype('uint8')
        return image

    def _ColorConversion_RGB888_to_YUV422(self, image):
        # rgb888 to yuv444
        image = self._ColorConversion_RGB888_to_YUV444(image)

        # yuv444 to yuv422: take U from even columns and V from odd columns,
        # then repeat each kept sample so neighbouring column pairs share chroma
        u2 = image[:, 0::2, 1]
        u4 = np.repeat(u2, 2, axis=1)
        v2 = image[:, 1::2, 2]
        v4 = np.repeat(v2, 2, axis=1)
        image[..., 1] = u4
        image[..., 2] = v4
        return image
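The 4:2:2 subsampling above can be demonstrated in isolation: after the repeat, each pair of adjacent columns carries the same U (from the even column) and the same V (from the odd column). A small numpy demo of exactly that slicing:

```python
import numpy as np

# a tiny 2x4 "YUV" image with distinct values per element
yuv = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
u = np.repeat(yuv[:, 0::2, 1], 2, axis=1)  # even-column U, duplicated
v = np.repeat(yuv[:, 1::2, 2], 2, axis=1)  # odd-column V, duplicated
```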
|
||||

    def _ColorConversion_YUV_to_RGB888(self, image):
        ## fixed
        h, w, c = image.shape
        image_f = image.reshape((h * w, c))
        image_rgb_f = np.zeros(image_f.shape, dtype=np.uint8)

        for i in range(h * w):
            image_y = image_f[i, 0] * 1024
            if image_f[i, 1] > 127:
                image_u = -((~(image_f[i, 1] - 1)) & 0xFF)
            else:
                image_u = image_f[i, 1]
            if image_f[i, 2] > 127:
                image_v = -((~(image_f[i, 2] - 1)) & 0xFF)
            else:
                image_v = image_f[i, 2]

            image_r = c00_yuv * image_y + c02_yuv * image_v
            image_g = c10_yuv * image_y + c11_yuv * image_u + c12_yuv * image_v
            image_b = c20_yuv * image_y + c21_yuv * image_u

            image_r = signed_rounding(image_r, format_bit)
            image_g = signed_rounding(image_g, format_bit)
            image_b = signed_rounding(image_b, format_bit)

            image_r = image_r >> format_bit
            image_g = image_g >> format_bit
            image_b = image_b >> format_bit

            image_rgb_f[i, 0] = clip(image_r, 0, 255)
            image_rgb_f[i, 1] = clip(image_g, 0, 255)
            image_rgb_f[i, 2] = clip(image_b, 0, 255)

        image_rgb = image_rgb_f.reshape((h, w, c))
        return image_rgb

    def _ColorConversion_RGB888_to_YCbCr444(self, image):
        ## floating
        image = image.astype('float')
        image = (image @ Matrix_rgb888_to_ycbcr + 0.5).astype('uint8')
        image[:, :, 0] += 16
        image[:, :, 1] += 128
        image[:, :, 2] += 128

        return image

    def _ColorConversion_RGB888_to_YCbCr422(self, image):
        # rgb888 to ycbcr444
        image = self._ColorConversion_RGB888_to_YCbCr444(image)

        # ycbcr444 to ycbcr422
        cb2 = image[:, 0::2, 1]
        cb4 = np.repeat(cb2, 2, axis=1)
        cr2 = image[:, 1::2, 2]
        cr4 = np.repeat(cr2, 2, axis=1)
        image[..., 1] = cb4
        image[..., 2] = cr4
        return image

    def _ColorConversion_YCbCr_to_RGB888(self, image):
        ## floating
        if self.set['numerical_type'] == 'floating':
            image = image.astype('float')
            image[:, :, 0] -= 16
            image[:, :, 1] -= 128
            image[:, :, 2] -= 128
            image = ((image @ Matrix_ycbcr_to_rgb888) + 0.5).astype('uint8')
            return image

        ## fixed
        h, w, c = image.shape
        image_f = image.reshape((h * w, c))
        image_rgb_f = np.zeros(image_f.shape, dtype=np.uint8)

        for i in range(h * w):
            image_y = (image_f[i, 0] - 16) * c00_ycbcr
            image_cb = image_f[i, 1] - 128
            image_cr = image_f[i, 2] - 128

            image_r = image_y + c02_ycbcr * image_cr
            image_g = image_y + c11_ycbcr * image_cb + c12_ycbcr * image_cr
            image_b = image_y + c21_ycbcr * image_cb

            image_r = signed_rounding(image_r, format_bit)
            image_g = signed_rounding(image_g, format_bit)
            image_b = signed_rounding(image_b, format_bit)

            image_r = image_r >> format_bit
            image_g = image_g >> format_bit
            image_b = image_b >> format_bit

            image_rgb_f[i, 0] = clip(image_r, 0, 255)
            image_rgb_f[i, 1] = clip(image_g, 0, 255)
            image_rgb_f[i, 2] = clip(image_b, 0, 255)

        image_rgb = image_rgb_f.reshape((h, w, c))
        return image_rgb

    def _ColorConversion_RGB888_to_RGB565(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] >= 3

        image_rgb565 = np.zeros(image.shape, dtype=np.uint8)
        image_rgb = image.astype('uint8')
        image_rgb565[:, :, 0] = image_rgb[:, :, 0] >> 3
        image_rgb565[:, :, 1] = image_rgb[:, :, 1] >> 2
        image_rgb565[:, :, 2] = image_rgb[:, :, 2] >> 3
        return image_rgb565

    def _ColorConversion_RGB565_to_RGB888(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] == 3

        image_rgb = np.zeros(image.shape, dtype=np.uint8)
        image_rgb[:, :, 0] = image[:, :, 0] << 3
        image_rgb[:, :, 1] = image[:, :, 1] << 2
        image_rgb[:, :, 2] = image[:, :, 2] << 3
        return image_rgb

    def _ColorConversion_L_to_RGB888(self, image):
        image_L = image.astype('uint8')
        img = Image.fromarray(image_L).convert('RGB')
        image_data = np.array(img).astype('uint8')
        return image_data

    def _ColorConversion_RGB888_to_L(self, image):
        image_rgb = image.astype('uint8')
        img = Image.fromarray(image_rgb).convert('L')
        image_data = np.array(img).astype('uint8')
        return image_data

    def _ColorConversion_RGBA8888_to_RGB888(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] == 4
        return image[:, :, :3]

    def _ColorConversion_RGB888_to_RGBA8888(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] == 3
        imageA = np.concatenate((image, np.zeros((image.shape[0], image.shape[1], 1), dtype=np.uint8)), axis=2)
        return imageA
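The RGB565 pair above quantizes each channel to 5/6/5 bits, so a round trip loses the low bits. A standalone sketch of that precision loss (function names are mine, not the library's):

```python
import numpy as np

def rgb888_to_rgb565(img):
    out = img.astype(np.uint8).copy()
    out[..., 0] >>= 3   # R: keep top 5 bits
    out[..., 1] >>= 2   # G: keep top 6 bits
    out[..., 2] >>= 3   # B: keep top 5 bits
    return out

def rgb565_to_rgb888(img):
    out = img.astype(np.uint8).copy()
    out[..., 0] <<= 3
    out[..., 1] <<= 2
    out[..., 2] <<= 3
    return out

px = np.array([[[255, 255, 255]]], dtype=np.uint8)
rt = rgb565_to_rgb888(rgb888_to_rgb565(px))
# 255 -> 31 -> 248 for the 5-bit channels, 255 -> 63 -> 252 for the 6-bit green
```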
145  kneron_preprocessing/funcs/Crop.py  (new file)
@@ -0,0 +1,145 @@
import numpy as np
from PIL import Image
from .utils import str2int, str2float, str2bool, pad_square_to_4
from .utils_520 import round_up_n
from .Runner_base import Runner_base, Param_base

class General(Param_base):
    type = 'center'
    align_w_to_4 = False
    pad_square_to_4 = False
    rounding_type = 0
    crop_w = 0
    crop_h = 0
    start_x = 0.
    start_y = 0.
    end_x = 0.
    end_y = 0.

    def update(self, **dic):
        self.type = dic['type']
        self.align_w_to_4 = str2bool(dic['align_w_to_4'])
        self.rounding_type = str2int(dic['rounding_type'])
        self.crop_w = str2int(dic['crop_w'])
        self.crop_h = str2int(dic['crop_h'])
        self.start_x = str2float(dic['start_x'])
        self.start_y = str2float(dic['start_y'])
        self.end_x = str2float(dic['end_x'])
        self.end_y = str2float(dic['end_y'])

    def __str__(self):
        str_out = [
            ', type:', str(self.type),
            ', align_w_to_4:', str(self.align_w_to_4),
            ', pad_square_to_4:', str(self.pad_square_to_4),
            ', crop_w:', str(self.crop_w),
            ', crop_h:', str(self.crop_h),
            ', start_x:', str(self.start_x),
            ', start_y:', str(self.start_y),
            ', end_x:', str(self.end_x),
            ', end_y:', str(self.end_y)]
        return ' '.join(str_out)

class runner(Runner_base):
    ## overwrite the class in Runner_base
    general = General()

    def __str__(self):
        return '<Crop>'

    def update(self, **kwargs):
        ##
        super().update(**kwargs)

        ## an explicit start/end range switches the crop type to 'specific'
        if (self.general.start_x != self.general.end_x) and (self.general.start_y != self.general.end_y):
            self.general.type = 'specific'
        elif self.general.type != 'specific':
            if self.general.crop_w == 0 or self.general.crop_h == 0:
                self.general.crop_w = self.common.model_size[0]
                self.general.crop_h = self.common.model_size[1]
            assert self.general.crop_w > 0
            assert self.general.crop_h > 0
            assert self.general.type.lower() in ['center', 'corner']
        else:
            assert self.general.type == 'specific'
    def run(self, image_data):
        ## init
        img = Image.fromarray(image_data)
        w, h = img.size

        ## get range
        if self.general.type.lower() == 'center':
            x1, y1, x2, y2 = self._calcuate_xy_center(w, h)
        elif self.general.type.lower() == 'corner':
            x1, y1, x2, y2 = self._calcuate_xy_corner(w, h)
        else:
            x1 = self.general.start_x
            y1 = self.general.start_y
            x2 = self.general.end_x
            y2 = self.general.end_y
            assert (x1 != x2) and (y1 != y2)

        ## rounding
        if self.general.rounding_type == 0:
            x1 = int(np.floor(x1))
            y1 = int(np.floor(y1))
            x2 = int(np.ceil(x2))
            y2 = int(np.ceil(y2))
        else:
            x1 = int(round(x1))
            y1 = int(round(y1))
            x2 = int(round(x2))
            y2 = int(round(y2))

        ## align both x edges of the crop window to multiples of 4
        if self.general.align_w_to_4:
            # x1 = (x1+1) & (~3)  # //+2
            # x2 = (x2+2) & (~3)  # //+1
            x1 = (x1 + 3) & (~3)  # //+2
            left = w - x2
            left = (left + 3) & (~3)
            x2 = w - left

        ## pad_square_to_4
        if str2bool(self.general.pad_square_to_4):
            x1, x2, y1, y2 = pad_square_to_4(x1, x2, y1, y2)

        # do crop
        box = (x1, y1, x2, y2)
        img = img.crop(box)

        # print info
        if str2bool(self.common.print_info):
            self.general.start_x = x1
            self.general.start_y = y1
            self.general.end_x = x2
            self.general.end_y = y2
            self.general.crop_w = x2 - x1
            self.general.crop_h = y2 - y1
            self.print_info()

        # output
        image_data = np.array(img)
        info = {}
        info['box'] = box

        return image_data, info

    ## protect fun
    def _calcuate_xy_center(self, w, h):
        x1 = w / 2 - self.general.crop_w / 2
        y1 = h / 2 - self.general.crop_h / 2
        x2 = w / 2 + self.general.crop_w / 2
        y2 = h / 2 + self.general.crop_h / 2
        return x1, y1, x2, y2

    def _calcuate_xy_corner(self, _1, _2):
        x1 = 0
        y1 = 0
        x2 = self.general.crop_w
        y2 = self.general.crop_h
        return x1, y1, x2, y2

    def do_crop(self, image_data, startW, startH, endW, endH):
        return image_data[startH:endH, startW:endW, :]
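The center-crop box math in `_calcuate_xy_center` can be sketched standalone; the function name here is illustrative, not the library API:

```python
def center_crop_box(w, h, crop_w, crop_h):
    # symmetric box around the image center; may be fractional before rounding
    x1 = w / 2 - crop_w / 2
    y1 = h / 2 - crop_h / 2
    x2 = w / 2 + crop_w / 2
    y2 = h / 2 + crop_h / 2
    return x1, y1, x2, y2

# a 100x80 image center-cropped to 60x40 gives the box (20, 20, 80, 60)
box = center_crop_box(100, 80, 60, 40)
```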
186  kneron_preprocessing/funcs/Normalize.py  (new file)
@@ -0,0 +1,186 @@
import numpy as np
from .utils import str2bool, str2int, str2float, clip_ary

class runner(object):
    def __init__(self):
        self.set = {
            'general': {
                'print_info': 'no',
                'model_size': [0, 0],
                'numerical_type': 'floating',
                'type': 'kneron'
            },
            'floating': {
                'scale': 1,
                'bias': 0,
                'mean': '',
                'std': '',
            },
            'hw': {
                'radix': 8,
                'shift': '',
                'sub': ''
            }
        }

    def update(self, **kwargs):
        #
        self.set.update(kwargs)

        # pick the normalize function and its shift/sub or scale/bias/mean/std parameters
        if self.set['general']['numerical_type'] == '520':
            if self.set['general']['type'].lower() == 'tf':
                self.fun_normalize = self._chen_520
                self.shift = 7 - self.set['hw']['radix']
                self.sub = 128
            elif self.set['general']['type'].lower() == 'yolo':
                self.fun_normalize = self._chen_520
                self.shift = 8 - self.set['hw']['radix']
                self.sub = 0
            elif self.set['general']['type'].lower() == 'kneron':
                self.fun_normalize = self._chen_520
                self.shift = 8 - self.set['hw']['radix']
                self.sub = 128
            else:
                self.fun_normalize = self._chen_520
                self.shift = 0
                self.sub = 0
        elif self.set['general']['numerical_type'] == '720':
            self.fun_normalize = self._chen_720
            self.shift = 0
            self.sub = 0
        else:
            if self.set['general']['type'].lower() == 'torch':
                self.fun_normalize = self._normalize_torch
                self.set['floating']['scale'] = 255.
                self.set['floating']['mean'] = [0.485, 0.456, 0.406]
                self.set['floating']['std'] = [0.229, 0.224, 0.225]
            elif self.set['general']['type'].lower() == 'tf':
                self.fun_normalize = self._normalize_tf
                self.set['floating']['scale'] = 127.5
                self.set['floating']['bias'] = -1.
            elif self.set['general']['type'].lower() == 'caffe':
                self.fun_normalize = self._normalize_caffe
                self.set['floating']['mean'] = [103.939, 116.779, 123.68]
            elif self.set['general']['type'].lower() == 'yolo':
                self.fun_normalize = self._normalize_yolo
                self.set['floating']['scale'] = 255.
            elif self.set['general']['type'].lower() == 'kneron':
                self.fun_normalize = self._normalize_kneron
                self.set['floating']['scale'] = 256.
                self.set['floating']['bias'] = -0.5
            else:
                self.fun_normalize = self._normalize_customized
                self.set['floating']['scale'] = str2float(self.set['floating']['scale'])
                self.set['floating']['bias'] = str2float(self.set['floating']['bias'])
                if self.set['floating']['mean'] is not None:
                    if len(self.set['floating']['mean']) != 3:
                        self.set['floating']['mean'] = None
                if self.set['floating']['std'] is not None:
                    if len(self.set['floating']['std']) != 3:
                        self.set['floating']['std'] = None
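The 'torch' preset selected above scales to [0, 1] and then standardizes with the ImageNet per-channel mean and std. A standalone sketch (the function name is mine; the constants are the ones set in the table above):

```python
import numpy as np

mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_torch(x):
    # scale 8-bit values to [0, 1], then standardize each channel
    x = x.astype(float) / 255.0
    return (x - np.array(mean)) / np.array(std)

x = np.full((1, 1, 3), 255, dtype=np.uint8)
out = normalize_torch(x)
```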
    def print_info(self):
        if self.set['general']['numerical_type'] == '520':
            print("<normalize>",
                  'numerical_type:', self.set['general']['numerical_type'],
                  ', type:', self.set['general']['type'],
                  ', shift:', self.shift,
                  ', sub:', self.sub)
        else:
            print("<normalize>",
                  'numerical_type:', self.set['general']['numerical_type'],
                  ', type:', self.set['general']['type'],
                  ', scale:', self.set['floating']['scale'],
                  ', bias:', self.set['floating']['bias'],
                  ', mean:', self.set['floating']['mean'],
                  ', std:', self.set['floating']['std'])

    def run(self, image_data):
        # print info
        if str2bool(self.set['general']['print_info']):
            self.print_info()

        # norm
        image_data = self.fun_normalize(image_data)

        # output
        info = {}
        return image_data, info

    def _normalize_torch(self, x):
        if len(x.shape) != 3:
            return x
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        x[..., 0] -= self.set['floating']['mean'][0]
        x[..., 1] -= self.set['floating']['mean'][1]
        x[..., 2] -= self.set['floating']['mean'][2]
        x[..., 0] /= self.set['floating']['std'][0]
        x[..., 1] /= self.set['floating']['std'][1]
        x[..., 2] /= self.set['floating']['std'][2]
        return x

    def _normalize_tf(self, x):
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        x = x + self.set['floating']['bias']
        return x

    def _normalize_caffe(self, x):
        if len(x.shape) != 3:
            return x
        x = x.astype('float')
        x = x[..., ::-1]
        x[..., 0] -= self.set['floating']['mean'][0]
        x[..., 1] -= self.set['floating']['mean'][1]
        x[..., 2] -= self.set['floating']['mean'][2]
        return x

    def _normalize_yolo(self, x):
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        return x

    def _normalize_kneron(self, x):
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        x = x + self.set['floating']['bias']
        return x

    def _normalize_customized(self, x):
        x = x.astype('float')
        if self.set['floating']['scale'] != 0:
            x = x / self.set['floating']['scale']
        x = x + self.set['floating']['bias']
        if self.set['floating']['mean'] is not None:
            x[..., 0] -= self.set['floating']['mean'][0]
            x[..., 1] -= self.set['floating']['mean'][1]
            x[..., 2] -= self.set['floating']['mean'][2]
        if self.set['floating']['std'] is not None:
            x[..., 0] /= self.set['floating']['std'][0]
            x[..., 1] /= self.set['floating']['std'][1]
            x[..., 2] /= self.set['floating']['std'][2]

        return x

    def _chen_520(self, x):
        # subtract in wrapping uint8 arithmetic, then right-shift by `shift`
        x = (x - self.sub).astype('uint8')
        x = np.right_shift(x, self.shift)
        x = x.astype('uint8')
        return x

    def _chen_720(self, x):
        # note: both branches of the original `if self.shift == 1` check applied
        # the same offset, so the branch is collapsed here
        x = x + np.array([[self.sub], [self.sub], [self.sub]])
        return x
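The 520 fixed path in `_chen_520` subtracts `sub` with uint8 wrap-around, then right-shifts. A standalone sketch (the function name is mine; the int16 intermediate just makes the wrap explicit and version-independent):

```python
import numpy as np

def chen_520(x, sub, shift):
    # subtract, wrapping like uint8 arithmetic, then right-shift
    x = (x.astype(np.int16) - sub).astype(np.uint8)
    return np.right_shift(x, shift).astype(np.uint8)

x = np.array([0, 128, 255], dtype=np.uint8)
out = chen_520(x, 128, 0)
# 0 - 128 wraps to 128, 128 - 128 -> 0, 255 - 128 -> 127
```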
187  kneron_preprocessing/funcs/Padding.py  (new file)
@@ -0,0 +1,187 @@
import numpy as np
from PIL import Image
from .utils import str2bool, str2int, str2float
from .Runner_base import Runner_base, Param_base

class General(Param_base):
    type = ''
    pad_val = ''
    padded_w = ''
    padded_h = ''
    pad_l = ''
    pad_r = ''
    pad_t = ''
    pad_b = ''
    padding_ch = 3
    padding_ch_type = 'RGB'

    def update(self, **dic):
        self.type = dic['type']
        self.pad_val = dic['pad_val']
        self.padded_w = str2int(dic['padded_w'])
        self.padded_h = str2int(dic['padded_h'])
        self.pad_l = str2int(dic['pad_l'])
        self.pad_r = str2int(dic['pad_r'])
        self.pad_t = str2int(dic['pad_t'])
        self.pad_b = str2int(dic['pad_b'])

    def __str__(self):
        str_out = [
            ', type:', str(self.type),
            ', pad_val:', str(self.pad_val),
            ', pad_l:', str(self.pad_l),
            ', pad_r:', str(self.pad_r),
            ', pad_t:', str(self.pad_t),
            ', pad_b:', str(self.pad_b),
            ', padding_ch:', str(self.padding_ch)]
        return ' '.join(str_out)

class Hw(Param_base):
    radix = 8
    normalize_type = 'floating'

    def update(self, **dic):
        self.radix = dic['radix']
        self.normalize_type = dic['normalize_type']

    def __str__(self):
        str_out = [
            ', radix:', str(self.radix),
            ', normalize_type:', str(self.normalize_type)]
        return ' '.join(str_out)

class runner(Runner_base):
    ## overwrite the class in Runner_base
    general = General()
    hw = Hw()

    def __str__(self):
        return '<Padding>'

    def update(self, **kwargs):
        super().update(**kwargs)

        ## update pad type & pad length
        if (self.general.pad_l != 0) or (self.general.pad_r != 0) or (self.general.pad_t != 0) or (self.general.pad_b != 0):
            self.general.type = 'specific'
            assert self.general.pad_l >= 0
            assert self.general.pad_r >= 0
            assert self.general.pad_t >= 0
            assert self.general.pad_b >= 0
        elif self.general.type != 'specific':
            if self.general.padded_w == 0 or self.general.padded_h == 0:
                self.general.padded_w = self.common.model_size[0]
                self.general.padded_h = self.common.model_size[1]
            assert self.general.padded_w > 0
            assert self.general.padded_h > 0
            assert self.general.type.lower() in ['center', 'corner']
        else:
            assert self.general.type == 'specific'

        ## decide pad_val & padding ch
        # if numerical_type is floating
        if self.common.numerical_type == 'floating':
            if self.general.pad_val != 'edge':
                self.general.pad_val = str2float(self.general.pad_val)
            self.general.padding_ch = 3
            self.general.padding_ch_type = 'RGB'
        # if numerical_type is 520 or 720
        else:
            if self.general.pad_val == '':
                if self.hw.normalize_type.lower() == 'tf':
                    self.general.pad_val = np.uint8(-128 >> (7 - self.hw.radix))
                elif self.hw.normalize_type.lower() == 'yolo':
                    self.general.pad_val = np.uint8(0 >> (8 - self.hw.radix))
                elif self.hw.normalize_type.lower() == 'kneron':
                    self.general.pad_val = np.uint8(-128 >> (8 - self.hw.radix))
                else:
                    self.general.pad_val = np.uint8(0 >> (8 - self.hw.radix))
            else:
                self.general.pad_val = str2int(self.general.pad_val)
            self.general.padding_ch = 4
            self.general.padding_ch_type = 'RGBA'

    def run(self, image_data):
        # init
        shape = image_data.shape
        w = shape[1]
        h = shape[0]
        if len(shape) < 3:
            self.general.padding_ch = 1
            self.general.padding_ch_type = 'L'
        else:
            if shape[2] == 3 and self.general.padding_ch == 4:
                image_data = np.concatenate((image_data, np.zeros((h, w, 1), dtype=np.uint8)), axis=2)

        ## padding
        if self.general.type.lower() == 'center':
            img_pad = self._padding_center(image_data, w, h)
        elif self.general.type.lower() == 'corner':
            img_pad = self._padding_corner(image_data, w, h)
        else:
            img_pad = self._padding_sp(image_data, w, h)

        # print info
        if str2bool(self.common.print_info):
            self.print_info()

        # output
        info = {}
        return img_pad, info
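The symmetric split used by `_padding_center` below divides the total padding in two, sending the odd pixel to the bottom/right edge. A one-function sketch (the helper name is mine):

```python
def split_pad(total):
    # (near-side, far-side): the extra pixel of an odd total goes to the far side
    return total // 2, total // 2 + total % 2

# splitting 5 pixels of vertical padding gives (2, 3): 2 on top, 3 on bottom
```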
    ## protect fun
    def _padding_center(self, img, ori_w, ori_h):
        padH = self.general.padded_h - ori_h
        padW = self.general.padded_w - ori_w
        self.general.pad_t = padH // 2
        self.general.pad_b = (padH // 2) + (padH % 2)
        self.general.pad_l = padW // 2
        self.general.pad_r = (padW // 2) + (padW % 2)
        if self.general.pad_l < 0 or self.general.pad_r < 0 or self.general.pad_t < 0 or self.general.pad_b < 0:
            return img
        img_pad = self._padding_sp(img, ori_w, ori_h)
        return img_pad

    def _padding_corner(self, img, ori_w, ori_h):
        self.general.pad_l = 0
        self.general.pad_r = self.general.padded_w - ori_w
        self.general.pad_t = 0
        self.general.pad_b = self.general.padded_h - ori_h
        if self.general.pad_l < 0 or self.general.pad_r < 0 or self.general.pad_t < 0 or self.general.pad_b < 0:
            return img
        img_pad = self._padding_sp(img, ori_w, ori_h)
        return img_pad

    def _padding_sp(self, img, ori_w, ori_h):
        if self.general.padding_ch == 1:
            pad_range = ((self.general.pad_t, self.general.pad_b), (self.general.pad_l, self.general.pad_r))
        else:
            pad_range = ((self.general.pad_t, self.general.pad_b), (self.general.pad_l, self.general.pad_r), (0, 0))

        if isinstance(self.general.pad_val, str):
            if self.general.pad_val == 'edge':
                padded_image = np.pad(img, pad_range, mode="edge")
            else:
                padded_image = np.pad(img, pad_range, mode="constant", constant_values=0)
        else:
            padded_image = np.pad(img, pad_range, mode="constant", constant_values=self.general.pad_val)

        return padded_image
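The `np.pad` call at the core of `_padding_sp` can be sketched standalone: pad an HxWxC image by `(t, b)` rows and `(l, r)` columns with a constant value (function name and sizes here are illustrative):

```python
import numpy as np

def pad_image(img, t, b, l, r, val):
    # grayscale images get a 2-axis pad spec; color images add a no-op channel axis
    if img.ndim == 2:
        pad_range = ((t, b), (l, r))
    else:
        pad_range = ((t, b), (l, r), (0, 0))
    return np.pad(img, pad_range, mode="constant", constant_values=val)

img = np.ones((2, 2, 3), dtype=np.uint8)
out = pad_image(img, 1, 1, 2, 2, 0)
# shape grows from (2, 2, 3) to (4, 6, 3)
```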
237  kneron_preprocessing/funcs/Resize.py  (new file)
@@ -0,0 +1,237 @@
import numpy as np
import cv2
from PIL import Image
from .utils import str2bool, str2int
from ctypes import c_float
from .Runner_base import Runner_base, Param_base

class General(Param_base):
    type = 'bilinear'
    keep_ratio = True
    zoom = True
    calculate_ratio_using_CSim = True
    resize_w = 0
    resize_h = 0
    resized_w = 0
    resized_h = 0

    def update(self, **dic):
        self.type = dic['type']
        self.keep_ratio = str2bool(dic['keep_ratio'])
        self.zoom = str2bool(dic['zoom'])
        self.calculate_ratio_using_CSim = str2bool(dic['calculate_ratio_using_CSim'])
        self.resize_w = str2int(dic['resize_w'])
        self.resize_h = str2int(dic['resize_h'])

    def __str__(self):
        str_out = [
            ', type:', str(self.type),
            ', keep_ratio:', str(self.keep_ratio),
            ', zoom:', str(self.zoom),
            ', calculate_ratio_using_CSim:', str(self.calculate_ratio_using_CSim),
            ', resize_w:', str(self.resize_w),
            ', resize_h:', str(self.resize_h),
            ', resized_w:', str(self.resized_w),
            ', resized_h:', str(self.resized_h)]
        return ' '.join(str_out)

class Hw(Param_base):
    resize_bit = 12

    def update(self, **dic):
        pass

    def __str__(self):
        str_out = [
            ', resize_bit:', str(self.resize_bit)]
        return ' '.join(str_out)

class runner(Runner_base):
    ## overwrite the class in Runner_base
    general = General()
    hw = Hw()

    def __str__(self):
        return '<Resize>'

    def update(self, **kwargs):
        super().update(**kwargs)

        ## if the resize size has not been assigned, take the model size as the resize size
        if self.general.resize_w == 0 or self.general.resize_h == 0:
            self.general.resize_w = self.common.model_size[0]
            self.general.resize_h = self.common.model_size[1]
        assert self.general.resize_w > 0
        assert self.general.resize_h > 0

        ## hardware targets force the matching fixed-point resize
        if self.common.numerical_type == '520':
            self.general.type = 'fixed_520'
        elif self.common.numerical_type == '720':
            self.general.type = 'fixed_720'
        assert self.general.type.lower() in ['bilinear', 'bicubic', 'fixed', 'fixed_520', 'fixed_720', 'cv', 'opencv', 'cv2']
    def run(self, image_data):
        ## init
        ori_w = image_data.shape[1]
        ori_h = image_data.shape[0]
        info = {}

        ##
        if self.general.keep_ratio:
            self.general.resized_w, self.general.resized_h = self.calcuate_scale_keep_ratio(self.general.resize_w, self.general.resize_h, ori_w, ori_h, self.general.calculate_ratio_using_CSim)
        else:
            self.general.resized_w = int(self.general.resize_w)
            self.general.resized_h = int(self.general.resize_h)
        assert self.general.resized_w > 0
        assert self.general.resized_h > 0

        ## skip upscaling when zoom is disabled
        if (self.general.resized_w > ori_w) or (self.general.resized_h > ori_h):
            if not self.general.zoom:
                info['size'] = (ori_w, ori_h)
                if str2bool(self.common.print_info):
                    print('no resize')
                    self.print_info()
                return image_data, info

        ## resize
        if self.general.type.lower() == 'bilinear':
            image_data = self.do_resize_bilinear(image_data, self.general.resized_w, self.general.resized_h)
        elif self.general.type.lower() == 'bicubic':
            image_data = self.do_resize_bicubic(image_data, self.general.resized_w, self.general.resized_h)
        elif self.general.type.lower() in ['cv', 'opencv', 'cv2']:
            image_data = self.do_resize_cv2(image_data, self.general.resized_w, self.general.resized_h)
        elif self.general.type.lower() in ['fixed', 'fixed_520', 'fixed_720']:
            image_data = self.do_resize_fixed(image_data, self.general.resized_w, self.general.resized_h, self.hw.resize_bit, self.general.type)

        # output
        info['size'] = (self.general.resized_w, self.general.resized_h)

        # print info
        if str2bool(self.common.print_info):
            self.print_info()

        return image_data, info

    def calcuate_scale_keep_ratio(self, tar_w, tar_h, ori_w, ori_h, calculate_ratio_using_CSim):
        if not calculate_ratio_using_CSim:
            scale_w = tar_w * 1.0 / (ori_w * 1.0)
            scale_h = tar_h * 1.0 / (ori_h * 1.0)
            scale = scale_w if scale_w < scale_h else scale_h
            new_w = int(round(ori_w * scale))
            new_h = int(round(ori_h * scale))
            return new_w, new_h

        ## calculate_ratio_using_CSim: use c_float so the ratio matches the
        ## C simulator's single-precision arithmetic
        scale_w = c_float(tar_w * 1.0 / (ori_w * 1.0)).value
        scale_h = c_float(tar_h * 1.0 / (ori_h * 1.0)).value
        scale_ratio = 0.0
        scale_target_w = 0
        scale_target_h = 0
        padH = 0
        padW = 0

        bScaleW = True if scale_w < scale_h else False
        if bScaleW:
            scale_ratio = scale_w
            scale_target_w = int(c_float(scale_ratio * ori_w + 0.5).value)
            scale_target_h = int(c_float(scale_ratio * ori_h + 0.5).value)
            assert abs(scale_target_w - tar_w) <= 1, "Error: scale down width cannot meet expectation\n"
            padH = tar_h - scale_target_h
            padW = 0
            assert padH >= 0, "Error: padH shouldn't be less than zero\n"
        else:
            scale_ratio = scale_h
            scale_target_w = int(c_float(scale_ratio * ori_w + 0.5).value)
            scale_target_h = int(c_float(scale_ratio * ori_h + 0.5).value)
            assert abs(scale_target_h - tar_h) <= 1, "Error: scale down height cannot meet expectation\n"
            padW = tar_w - scale_target_w
            padH = 0
            assert padW >= 0, "Error: padW shouldn't be less than zero\n"
        new_w = tar_w - padW
        new_h = tar_h - padH
        return new_w, new_h
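The simple (non-CSim) keep-ratio branch above picks the smaller of the two scale factors and rounds the scaled size. A standalone sketch (function name is mine):

```python
def keep_ratio_size(tar_w, tar_h, ori_w, ori_h):
    # the smaller scale guarantees the result fits inside the target box
    scale = min(tar_w / ori_w, tar_h / ori_h)
    return int(round(ori_w * scale)), int(round(ori_h * scale))

# fitting a 640x480 image into a 224x224 box keeps the 4:3 ratio: (224, 168)
size = keep_ratio_size(224, 224, 640, 480)
```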
def do_resize_bilinear(self, image_data, resized_w, resized_h):
|
||||
img = Image.fromarray(image_data)
|
||||
img = img.resize((resized_w, resized_h), Image.BILINEAR)
|
||||
image_data = np.array(img).astype('uint8')
|
||||
        return image_data

    def do_resize_bicubic(self, image_data, resized_w, resized_h):
        img = Image.fromarray(image_data)
        img = img.resize((resized_w, resized_h), Image.BICUBIC)
        image_data = np.array(img).astype('uint8')
        return image_data

    def do_resize_cv2(self, image_data, resized_w, resized_h):
        image_data = cv2.resize(image_data, (resized_w, resized_h))
        image_data = np.array(image_data)
        return image_data

    def do_resize_fixed(self, image_data, resized_w, resized_h, resize_bit, type):
        if len(image_data.shape) < 3:
            # expand a grayscale image into a 3-channel layout
            m, n = image_data.shape
            tmp = np.zeros((m, n, 3), dtype=np.uint8)
            tmp[:, :, 0] = image_data
            image_data = tmp
            c = 3
            gray = True
        else:
            m, n, c = image_data.shape
            gray = False

        resolution = 1 << resize_bit

        # Width pass: fixed-point linear interpolation
        ratio = int(((n - 1) << resize_bit) / (resized_w - 1))
        ratio_cnt = 0
        src_x = 0
        resized_image_w = np.zeros((m, resized_w, c), dtype=np.uint8)

        for dst_x in range(resized_w):
            while ratio_cnt > resolution:
                ratio_cnt = ratio_cnt - resolution
                src_x = src_x + 1
            mul1 = np.ones((m, c)) * (resolution - ratio_cnt)
            mul2 = np.ones((m, c)) * ratio_cnt
            resized_image_w[:, dst_x, :] = np.multiply(np.multiply(
                image_data[:, src_x, :], mul1) + np.multiply(image_data[:, src_x + 1, :], mul2), 1 / resolution)
            ratio_cnt = ratio_cnt + ratio

        # Height pass
        ratio = int(((m - 1) << resize_bit) / (resized_h - 1))
        ## NPU HW special case 2, only on 520
        # note: compare the lowered string against a lowercase literal;
        # `type.lower() in ['FIXED_520', ...]` could never match the uppercase entries
        if type.lower() == 'fixed_520':
            if ((ratio * (resized_h - 1)) % 4096 == 0) and ratio != 4096:
                ratio -= 1

        ratio_cnt = 0
        src_x = 0
        resized_image = np.zeros((resized_h, resized_w, c), dtype=np.uint8)
        for dst_x in range(resized_h):
            while ratio_cnt > resolution:
                ratio_cnt = ratio_cnt - resolution
                src_x = src_x + 1

            mul1 = np.ones((resized_w, c)) * (resolution - ratio_cnt)
            mul2 = np.ones((resized_w, c)) * ratio_cnt

            ## NPU HW special case 1, both on 520 / 720
            if (dst_x > 0) and ratio_cnt == resolution and ratio != resolution \
                    and type.lower() in ('fixed_520', 'fixed_720'):
                resized_image[dst_x, :, :] = np.multiply(np.multiply(
                    resized_image_w[src_x + 1, :, :], mul1) + np.multiply(resized_image_w[src_x + 2, :, :], mul2), 1 / resolution)
            else:
                resized_image[dst_x, :, :] = np.multiply(np.multiply(
                    resized_image_w[src_x, :, :], mul1) + np.multiply(resized_image_w[src_x + 1, :, :], mul2), 1 / resolution)

            ratio_cnt = ratio_cnt + ratio

        if gray:
            resized_image = resized_image[:, :, 0]

        return resized_image
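The fixed-point width pass above walks a `ratio` accumulator against `resolution = 1 << resize_bit` to pick a source index and blend weight per destination pixel. A minimal 1-D sketch of that accumulator (`fixed_point_weights` is a hypothetical helper, not part of the module) makes the mapping explicit:

```python
def fixed_point_weights(n, resized_w, resize_bit=14):
    """Mirror the width-pass accumulator of do_resize_fixed in 1-D.

    Returns (src_x, w) per destination pixel, where
    dst = (1 - w) * src[src_x] + w * src[src_x + 1].
    """
    resolution = 1 << resize_bit
    ratio = int(((n - 1) << resize_bit) / (resized_w - 1))
    ratio_cnt, src_x, out = 0, 0, []
    for _dst_x in range(resized_w):
        # carry the accumulator over into the next source pixel
        while ratio_cnt > resolution:
            ratio_cnt -= resolution
            src_x += 1
        out.append((src_x, ratio_cnt / resolution))
        ratio_cnt += ratio
    return out
```

For `n=4, resized_w=7` the first destination pixel lands exactly on source 0 and the last blends fully into source 3, as expected for an endpoint-aligned resize.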
45  kneron_preprocessing/funcs/Rotate.py  Normal file
@@ -0,0 +1,45 @@
import numpy as np
from .utils import str2bool, str2int


class runner(object):
    def __init__(self, *args, **kwargs):
        self.set = {
            'operator': '',
            'rotate_direction': 0,
            'b_print': False,  # default so update() does not KeyError on self.set['b_print']
        }
        self.update(*args, **kwargs)

    def update(self, *args, **kwargs):
        self.set.update(kwargs)
        self.rotate_direction = str2int(self.set['rotate_direction'])

        # print info
        if str2bool(self.set['b_print']):
            self.print_info()

    def print_info(self):
        print("<rotate>",
              'rotate_direction', self.rotate_direction)

    def run(self, image_data):
        image_data = self._rotate(image_data)
        return image_data

    def _rotate(self, img):
        if self.rotate_direction in (1, 2):
            col, row, unit = img.shape
            pInBuf = img.reshape((-1, 1))
            pOutBufTemp = np.zeros((col * row * unit))
            for r in range(row):
                for c in range(col):
                    for u in range(unit):
                        if self.rotate_direction == 1:
                            pOutBufTemp[unit * (c * row + (row - r - 1)) + u] = pInBuf[unit * (r * col + c) + u]
                        elif self.rotate_direction == 2:
                            pOutBufTemp[unit * (row * (col - c - 1) + r) + u] = pInBuf[unit * (r * col + c) + u]

            img = pOutBufTemp.reshape((col, row, unit))

        return img
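Assuming directions 1 and 2 are intended as 90° clockwise / counter-clockwise rotations of an H x W x C buffer, the intended mapping can be sketched with `np.rot90`. This illustrates the geometry only; it is not a drop-in replacement for the element-wise buffer loop above:

```python
import numpy as np

# toy H x W x C image (H=2, W=3, C=2)
img = np.arange(12).reshape(2, 3, 2)
cw = np.rot90(img, k=-1, axes=(0, 1))   # 90 degrees clockwise
ccw = np.rot90(img, k=1, axes=(0, 1))   # 90 degrees counter-clockwise
```

Both results have shape (3, 2, 2): height and width swap while the channel axis is untouched.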
59  kneron_preprocessing/funcs/Runner_base.py  Normal file
@@ -0,0 +1,59 @@
from abc import ABCMeta, abstractmethod


class Param_base(object):
    @abstractmethod
    def update(self, **dic):
        raise NotImplementedError("Must override")

    def load_dic(self, key, **dic):
        # rebind the attribute itself; assigning to a local obtained
        # from eval('self.' + key) would be silently lost
        if key in dic:
            setattr(self, key, dic[key])

    def __str__(self):
        str_out = []
        return ' '.join(str_out)


class Common(Param_base):
    print_info = False
    model_size = [0, 0]
    numerical_type = 'floating'

    def update(self, **dic):
        self.print_info = dic['print_info']
        self.model_size = dic['model_size']
        self.numerical_type = dic['numerical_type']

    def __str__(self):
        str_out = ['numerical_type:', str(self.numerical_type)]
        return ' '.join(str_out)


class Runner_base(metaclass=ABCMeta):
    common = Common()
    general = Param_base()
    floating = Param_base()
    hw = Param_base()

    def update(self, **kwargs):
        ## update param
        self.common.update(**kwargs['common'])
        self.general.update(**kwargs['general'])
        assert self.common.numerical_type.lower() in ['floating', '520', '720']
        if self.common.numerical_type == 'floating':
            if self.floating.__class__.__name__ != 'Param_base':
                self.floating.update(**kwargs['floating'])
        else:
            if self.hw.__class__.__name__ != 'Param_base':
                self.hw.update(**kwargs['hw'])

    def print_info(self):
        if self.common.numerical_type == 'floating':
            print(self, self.common, self.general, self.floating)
        else:
            print(self, self.common, self.general, self.hw)
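A minimal sketch of the `update(**dic)` pattern used by `Param_base` subclasses (`ResizeParam` and its fields are hypothetical, for illustration only). `setattr` is what makes the assignment stick on the instance, which is why `load_dic` cannot rely on rebinding a local obtained from `eval`:

```python
class ResizeParam:
    """Hypothetical parameter holder following the Param_base update() pattern."""
    scale = 1
    mode = 'bilinear'

    def update(self, **dic):
        # copy only known keys; setattr persists on the instance
        for key in ('scale', 'mode'):
            if key in dic:
                setattr(self, key, dic[key])


p = ResizeParam()
p.update(scale=2)  # mode keeps its class-level default
```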
2  kneron_preprocessing/funcs/__init__.py  Normal file
@@ -0,0 +1,2 @@
from . import ColorConversion, Padding, Resize, Crop, Normalize, Rotate
372  kneron_preprocessing/funcs/utils.py  Normal file
@@ -0,0 +1,372 @@
import numpy as np
from PIL import Image
import struct


def pad_square_to_4(x_start, x_end, y_start, y_end):
    w_int = x_end - x_start
    h_int = y_end - y_start
    pad = w_int - h_int
    if pad > 0:
        pad_s = (pad >> 1) & (~3)
        pad_e = pad - pad_s
        y_start -= pad_s
        y_end += pad_e
    else:  # pad <= 0
        pad_s = -((pad >> 1) & (~3))
        pad_e = (-pad) - pad_s
        x_start -= pad_s
        x_end += pad_e
    return x_start, x_end, y_start, y_end


def str_fill(value):
    if len(value) == 1:
        value = "0" + value
    elif len(value) == 0:
        value = "00"

    return value


def clip_ary(value):
    list_v = []
    for i in range(len(value)):
        v = value[i] % 256
        list_v.append(v)

    return list_v


def str2bool(v):
    if isinstance(v, bool):
        return v
    # only lowercase entries are reachable after v.lower()
    return v.lower() in ('true', '1', 't', 'y', 'yes')


def str2int(s):
    if s == "":
        s = 0
    s = int(s)
    return s


def str2float(s):
    if s == "":
        s = 0
    s = float(s)
    return s


def clip(value, mini, maxi):
    if value < mini:
        result = mini
    elif value > maxi:
        result = maxi
    else:
        result = value

    return result


def signed_rounding(value, bit):
    if value < 0:
        value = value - (1 << (bit - 1))
    else:
        value = value + (1 << (bit - 1))

    return value
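`signed_rounding` biases a fixed-point value by half an LSB toward its sign; truncating the biased value toward zero then yields round-half-away-from-zero. A small sketch (`fixed_round` is a hypothetical wrapper, not part of the module):

```python
def signed_rounding(value, bit):
    # bias by half an LSB toward the sign before truncation
    if value < 0:
        return value - (1 << (bit - 1))
    return value + (1 << (bit - 1))


def fixed_round(value, bit):
    # truncate toward zero after biasing: round-half-away-from-zero
    return int(signed_rounding(value, bit) / (1 << bit))
```

With `bit=2` (scale 4): 6/4 = 1.5 rounds to 2, 5/4 = 1.25 rounds to 1, and -6/4 = -1.5 rounds to -2.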
def hex_loader(data_folder, **kwargs):
    format_mode = kwargs['raw_img_fmt']
    src_h = kwargs['img_in_height']
    src_w = kwargs['img_in_width']

    if format_mode in ['YUV444', 'yuv444', 'YCBCR444', 'YCbCr444', 'ycbcr444']:
        output = hex_yuv444(data_folder, src_h, src_w)
    elif format_mode in ['RGB565', 'rgb565']:
        output = hex_rgb565(data_folder, src_h, src_w)
    elif format_mode in ['YUV422', 'yuv422', 'YCBCR422', 'YCbCr422', 'ycbcr422']:
        output = hex_yuv422(data_folder, src_h, src_w)

    return output


def hex_rgb565(hex_folder, src_h, src_w):
    pix_per_line = 8
    byte_per_line = 16

    f = open(hex_folder)
    pixel_r = []
    pixel_g = []
    pixel_b = []

    # Ignore the first line
    f.readline()
    input_line = int((src_h * src_w) / pix_per_line)
    for i in range(input_line):
        readline = f.readline()
        for j in range(int(byte_per_line / 2) - 1, -1, -1):
            data1 = int(readline[(j * 4 + 0):(j * 4 + 2)], 16)
            data0 = int(readline[(j * 4 + 2):(j * 4 + 4)], 16)
            r = (data1 & 0xf8) >> 3
            g = ((data0 & 0xe0) >> 5) + ((data1 & 0x7) << 3)
            b = (data0 & 0x1f)
            pixel_r.append(r)
            pixel_g.append(g)
            pixel_b.append(b)

    ary_r = np.array(pixel_r, dtype=np.uint8)
    ary_g = np.array(pixel_g, dtype=np.uint8)
    ary_b = np.array(pixel_b, dtype=np.uint8)
    output = np.concatenate((ary_r[:, None], ary_g[:, None], ary_b[:, None]), axis=1)
    output = output.reshape((src_h, src_w, 3))

    return output
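The RGB565 unpacking in `hex_rgb565` splits a 16-bit pixel into 5/6/5 bits across the two bytes it reads (`data1` carries RRRRRGGG, `data0` carries GGGBBBBB). A standalone sketch of the same bit arithmetic (`unpack_rgb565` is a hypothetical helper name):

```python
def unpack_rgb565(data1, data0):
    """Split an RGB565 pixel: data1 = high byte (RRRRRGGG), data0 = low byte (GGGBBBBB)."""
    r = (data1 & 0xf8) >> 3                                # top 5 bits of high byte
    g = ((data0 & 0xe0) >> 5) + ((data1 & 0x07) << 3)      # 3 + 3 bits spanning both bytes
    b = data0 & 0x1f                                       # low 5 bits of low byte
    return r, g, b
```

An all-ones pixel decodes to the channel maxima (31, 63, 31), confirming the 5/6/5 split.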
def hex_yuv444(hex_folder, src_h, src_w):
    pix_per_line = 4
    byte_per_line = 16

    f = open(hex_folder)
    byte0 = []
    byte1 = []
    byte2 = []
    byte3 = []

    # Ignore the first line
    f.readline()
    input_line = int((src_h * src_w) / pix_per_line)
    for i in range(input_line):
        readline = f.readline()
        for j in range(byte_per_line - 1, -1, -1):
            data = int(readline[(j * 2):(j * 2 + 2)], 16)
            if (j + 1) % 4 == 0:
                byte0.append(data)
            elif (j + 2) % 4 == 0:
                byte1.append(data)
            elif (j + 3) % 4 == 0:
                byte2.append(data)
            elif (j + 4) % 4 == 0:
                byte3.append(data)
    # ary_a = np.array(byte0, dtype=np.uint8)
    ary_v = np.array(byte1, dtype=np.uint8)
    ary_u = np.array(byte2, dtype=np.uint8)
    ary_y = np.array(byte3, dtype=np.uint8)
    output = np.concatenate((ary_y[:, None], ary_u[:, None], ary_v[:, None]), axis=1)
    output = output.reshape((src_h, src_w, 3))

    return output


def hex_yuv422(hex_folder, src_h, src_w):
    pix_per_line = 8
    byte_per_line = 16
    f = open(hex_folder)
    pixel_y = []
    pixel_u = []
    pixel_v = []

    # Ignore the first line
    f.readline()
    input_line = int((src_h * src_w) / pix_per_line)
    for i in range(input_line):
        readline = f.readline()
        for j in range(int(byte_per_line / 4) - 1, -1, -1):
            data3 = int(readline[(j * 8 + 0):(j * 8 + 2)], 16)
            data2 = int(readline[(j * 8 + 2):(j * 8 + 4)], 16)
            data1 = int(readline[(j * 8 + 4):(j * 8 + 6)], 16)
            data0 = int(readline[(j * 8 + 6):(j * 8 + 8)], 16)
            # shared U/V are duplicated for both Y samples
            pixel_y.append(data3)
            pixel_y.append(data1)
            pixel_u.append(data2)
            pixel_u.append(data2)
            pixel_v.append(data0)
            pixel_v.append(data0)

    ary_y = np.array(pixel_y, dtype=np.uint8)
    ary_u = np.array(pixel_u, dtype=np.uint8)
    ary_v = np.array(pixel_v, dtype=np.uint8)
    output = np.concatenate((ary_y[:, None], ary_u[:, None], ary_v[:, None]), axis=1)
    output = output.reshape((src_h, src_w, 3))

    return output
def bin_loader(data_folder, **kwargs):
    format_mode = kwargs['raw_img_fmt']
    src_h = kwargs['img_in_height']
    src_w = kwargs['img_in_width']
    if format_mode in ['YUV', 'yuv', 'YUV444', 'yuv444', 'YCBCR', 'YCbCr', 'ycbcr', 'YCBCR444', 'YCbCr444', 'ycbcr444']:
        output = bin_yuv444(data_folder, src_h, src_w)
    elif format_mode in ['RGB565', 'rgb565']:
        output = bin_rgb565(data_folder, src_h, src_w)
    elif format_mode in ['NIR', 'nir', 'NIR888', 'nir888']:
        output = bin_nir(data_folder, src_h, src_w)
    elif format_mode in ['YUV422', 'yuv422', 'YCBCR422', 'YCbCr422', 'ycbcr422']:
        output = bin_yuv422(data_folder, src_h, src_w)
    elif format_mode in ['RGB888', 'rgb888']:
        output = np.fromfile(data_folder, dtype='uint8')
        output = output.reshape(src_h, src_w, 3)
    elif format_mode in ['RGBA8888', 'rgba8888', 'RGBA', 'rgba']:
        output_temp = np.fromfile(data_folder, dtype='uint8')
        output_temp = output_temp.reshape(src_h, src_w, 4)
        output = output_temp[:, :, 0:3]

    return output


def bin_yuv444(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    row = src_h
    col = src_w
    pixels = row * col

    raw = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            raw.append(s[0])

    raw = raw[:pixels * 4]

    output = np.zeros((pixels * 3), dtype=np.uint8)
    cnt = 0
    for i in range(0, pixels * 4, 4):
        # Y
        output[cnt] = raw[i + 3]
        # U
        cnt += 1
        output[cnt] = raw[i + 2]
        # V
        cnt += 1
        output[cnt] = raw[i + 1]

        cnt += 1

    output = output.reshape((src_h, src_w, 3))
    return output
def bin_yuv422(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    row = src_h
    col = src_w
    pixels = row * col

    raw = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            raw.append(s[0])

    raw = raw[:pixels * 2]

    output = np.zeros((pixels * 3), dtype=np.uint8)
    cnt = 0
    for i in range(0, pixels * 2, 4):
        # Y0
        output[cnt] = raw[i + 3]
        # U0
        cnt += 1
        output[cnt] = raw[i + 2]
        # V0
        cnt += 1
        output[cnt] = raw[i]
        # Y1
        cnt += 1
        output[cnt] = raw[i + 1]
        # U1
        cnt += 1
        output[cnt] = raw[i + 2]
        # V1
        cnt += 1
        output[cnt] = raw[i]

        cnt += 1

    output = output.reshape((src_h, src_w, 3))
    return output


def bin_rgb565(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    row = src_h
    col = src_w
    pixels = row * col

    rgba565 = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            rgba565.append(s[0])

    rgba565 = rgba565[:pixels * 2]

    # rgb565_bin to numpy_array
    output = np.zeros((pixels * 3), dtype=np.uint8)
    cnt = 0
    for i in range(0, pixels * 2, 2):
        temp = rgba565[i]
        temp2 = rgba565[i + 1]
        # R-5
        output[cnt] = (temp2 >> 3)

        # G-6
        cnt += 1
        output[cnt] = ((temp & 0xe0) >> 5) + ((temp2 & 0x07) << 3)

        # B-5
        cnt += 1
        output[cnt] = (temp & 0x1f)

        cnt += 1

    output = output.reshape((src_h, src_w, 3))
    return output


def bin_nir(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    nir = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            nir.append(s[0])

    nir = nir[:src_h * src_w]
    pixels = len(nir)
    # nir_bin to numpy_array
    output = np.zeros((pixels * 3), dtype=np.uint8)
    for i in range(0, pixels):
        output[i * 3] = nir[i]
        output[i * 3 + 1] = nir[i]
        output[i * 3 + 2] = nir[i]

    output = output.reshape((src_h, src_w, 3))
    return output
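`bin_nir` replicates a single NIR channel into three identical channels pixel by pixel. In vectorized NumPy the same expansion is one `np.repeat`, sketched here for illustration:

```python
import numpy as np

# toy H x W single-channel NIR frame
nir = np.array([[10, 20],
                [30, 40]], dtype=np.uint8)

# replicate the channel axis three times -> H x W x 3
rgb = np.repeat(nir[:, :, None], 3, axis=2)
```

All three output channels carry the same values, matching the per-pixel loop in `bin_nir`.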
50  kneron_preprocessing/funcs/utils_520.py  Normal file
@@ -0,0 +1,50 @@
import math


def round_up_16(num):
    return (num + (16 - 1)) & ~(16 - 1)


def round_up_n(num, n):
    if num > 0:
        temp = float(num) / n
        return math.ceil(temp) * n
    else:
        return -math.ceil(float(-num) / n) * n


def cal_img_row_offset(crop_num, pad_num, start_row, out_row, orig_row):
    scaled_img_row = int(out_row - (pad_num[1] + pad_num[3]))
    if (start_row - pad_num[1]) > 0:
        img_str_row = int(start_row - pad_num[1])
    else:
        img_str_row = 0
    valid_row = int(orig_row - (crop_num[1] + crop_num[3]))
    img_str_row = int(valid_row * img_str_row / scaled_img_row)
    return int(img_str_row + crop_num[1])


def get_pad_num(pad_num_orig, left, up, right, bottom):
    pad_num = [0] * 4
    for i in range(0, 4):
        pad_num[i] = pad_num_orig[i]

    if not left:
        pad_num[0] = 0
    if not up:
        pad_num[1] = 0
    if not right:
        pad_num[2] = 0
    if not bottom:
        pad_num[3] = 0

    return pad_num


def get_byte_per_pixel(raw_fmt):
    # raw_fmt is lowered first, so only lowercase names are comparable;
    # the original mixed-case lists could never match their uppercase entries
    if raw_fmt.lower() in ['rgb888', 'rgb']:
        return 4
    elif raw_fmt.lower() in ['yuv', 'yuv422']:
        return 2
    elif raw_fmt.lower() in ['rgb565']:
        return 2
    elif raw_fmt.lower() in ['nir888', 'nir']:
        return 1
    else:
        return -1
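`round_up_16` rounds up to the next multiple of 16 with a standard bit trick: add 15, then clear the low four bits with the mask `~15`.

```python
def round_up_16(num):
    # add 15, then clear the low 4 bits
    return (num + 15) & ~15
```

Values already on a 16-byte boundary pass through unchanged; anything else is bumped to the next boundary.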
42  kneron_preprocessing/funcs/utils_720.py  Normal file
@@ -0,0 +1,42 @@
import numpy as np
from PIL import Image


def twos_complement(value):
    value = int(value)
    # msb = (value & 0x8000) * (1/np.power(2, 15))
    msb = (value & 0x8000) >> 15
    if msb == 1:
        if (((~value) & 0xFFFF) + 1) >= 0xFFFF:
            result = (~value) & 0xFFFF
        else:
            result = ((~value) & 0xFFFF) + 1
        result = result * (-1)
    else:
        result = value

    return result


def twos_complement_pix(value):
    h, _ = value.shape
    for i in range(h):
        value[i, 0] = twos_complement(value[i, 0])

    return value


def clip(value, mini, maxi):
    if value < mini:
        result = mini
    elif value > maxi:
        result = maxi
    else:
        result = value

    return result


def clip_pix(value, mini, maxi):
    h, _ = value.shape
    for i in range(h):
        value[i, 0] = clip(value[i, 0], mini, maxi)

    return value
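`twos_complement` recovers a signed value from a 16-bit word (for inputs with the MSB set, the `>= 0xFFFF` branch is unreachable, so it reduces to invert-and-add-one). The standard equivalent, sketched as a hypothetical helper `s16`:

```python
def s16(value):
    """Interpret the low 16 bits of value as a signed two's-complement integer."""
    value &= 0xFFFF
    # if the sign bit is set, subtract 2^16 to recover the negative value
    return value - 0x10000 if value & 0x8000 else value
```

For example, 0xFFFF decodes to -1 and 0x8000 to -32768, the two ends of the signed 16-bit range.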
@@ -18,6 +18,12 @@ from .pascal_context import PascalContextDataset, PascalContextDataset59
from .potsdam import PotsdamDataset
from .stare import STAREDataset
from .voc import PascalVOCDataset
from .golf_dataset import GolfDataset
from .golf7_dataset import Golf7Dataset
from .golf1_dataset import GrassOnlyDataset
from .golf4_dataset import Golf4Dataset
from .golf2_dataset import Golf2Dataset
from .golf8_dataset import Golf8Dataset

__all__ = [
    'CustomDataset', 'build_dataloader', 'ConcatDataset', 'RepeatDataset',
80  mmseg/datasets/golf1_dataset.py  Normal file
@@ -0,0 +1,80 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GrassOnlyDataset(CustomDataset):
    """GrassOnlyDataset for semantic segmentation with only one valid class: grass."""

    CLASSES = ('grass',)

    PALETTE = [
        [0, 128, 0],  # grass - green
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GrassOnlyDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [GrassOnlyDataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""
        print("🧪 [GrassOnlyDataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GrassOnlyDataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
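The palette written by `results2img` maps label id to RGB color. Colorizing a label map with the same palette is a single NumPy fancy-indexing step, sketched here with the grass-only palette (illustration only, independent of the PIL path above):

```python
import numpy as np

PALETTE = np.array([[0, 128, 0]], dtype=np.uint8)  # label 0: grass - green
labels = np.zeros((2, 2), dtype=np.uint8)          # every pixel predicted as class 0

# index the palette by label id: H x W -> H x W x 3 color mask
color_mask = PALETTE[labels]
```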
84  mmseg/datasets/golf2_dataset.py  Normal file
@@ -0,0 +1,84 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf2Dataset(CustomDataset):
    """Golf2Dataset for semantic segmentation with 2 valid classes (ignore background)."""

    CLASSES = (
        'grass', 'road'
    )

    PALETTE = [
        [0, 255, 0],    # grass - green
        [255, 165, 0],  # road - orange
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf2Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf2Dataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf2Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf2Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
86  mmseg/datasets/golf4_dataset.py  Normal file
@@ -0,0 +1,86 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf4Dataset(CustomDataset):
    """Golf4Dataset for semantic segmentation with 4 valid classes (ignore background)."""

    CLASSES = (
        'car', 'grass', 'people', 'road'
    )

    PALETTE = [
        [0, 0, 128],    # car - dark blue
        [0, 255, 0],    # grass - green
        [255, 0, 0],    # people - red
        [255, 165, 0],  # road - orange
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf4Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf4Dataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf4Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf4Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
90  mmseg/datasets/golf7_dataset.py  Normal file
@@ -0,0 +1,90 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf7Dataset(CustomDataset):
    """Golf7Dataset for semantic segmentation with 7 valid classes (ignore background)."""

    CLASSES = (
        'bunker', 'car', 'grass',
        'greenery', 'person', 'road', 'tree'
    )

    PALETTE = [
        [128, 0, 0],    # bunker - dark red
        [0, 0, 128],    # car - dark blue
        [0, 128, 0],    # grass - green
        [0, 255, 0],    # greenery - light green
        [255, 0, 0],    # person - red
        [255, 165, 0],  # road - orange
        [0, 255, 255],  # tree - cyan
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf7Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf7Dataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf7Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf7Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
92
mmseg/datasets/golf8_dataset.py
Normal file
92
mmseg/datasets/golf8_dataset.py
Normal file
@ -0,0 +1,92 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf8Dataset(CustomDataset):
    """Golf8Dataset for semantic segmentation with 8 valid classes (background ignored)."""

    CLASSES = (
        'bunker', 'car', 'grass',
        'greenery', 'person', 'pond',
        'road', 'tree'
    )

    PALETTE = [
        [128, 0, 0],    # bunker - dark red
        [0, 0, 128],    # car - dark blue
        [0, 128, 0],    # grass - green
        [0, 255, 0],    # greenery - light green
        [255, 0, 0],    # person - red
        [0, 255, 255],  # pond - cyan
        [255, 165, 0],  # road - orange
        [0, 128, 128],  # tree - dark cyan
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf8Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf8Dataset] initialized")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf8Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf8Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned metric keys: {list(eval_results.keys())}")
        return eval_results
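The per-label loop that fills `palette` in `results2img` can be collapsed into a single `np.array` call; the two constructions are equivalent for any list of RGB triples. A small sketch using the first three Golf8 colors:

```python
import numpy as np

PALETTE = [[128, 0, 0], [0, 0, 128], [0, 128, 0]]  # first three golf8 colors

# Loop version, as in results2img above.
palette_loop = np.zeros((len(PALETTE), 3), dtype=np.uint8)
for label_id, color in enumerate(PALETTE):
    palette_loop[label_id] = color

# Vectorized equivalent.
palette_vec = np.array(PALETTE, dtype=np.uint8)
```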
mmseg/datasets/golf_dataset.py (new file, 96 lines)
@@ -0,0 +1,96 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GolfDataset(CustomDataset):
    """GolfDataset for semantic segmentation with four classes: car, grass, people, and road."""

    # ✅ Fixed classes and palette (not taken from the config)
    CLASSES = ('car', 'grass', 'people', 'road')
    PALETTE = [
        [246, 14, 135],   # car
        [233, 81, 78],    # grass
        [220, 148, 21],   # people
        [207, 215, 220],  # road
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GolfDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        # ✅ DEBUG: print CLASSES and PALETTE at init time
        print("✅ [GolfDataset] initialized")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            result = result.astype(np.uint8)

            # ✅ Map all invalid class ids to 255 (treated as background)
            result[result >= len(self.PALETTE)] = 255

            output = Image.fromarray(result).convert('P')

            # ✅ Build the palette; background class 255 is rendered black
            palette = np.zeros((256, 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            palette[255] = [0, 0, 0]  # black background

            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        # ✅ DEBUG: report how CLASSES is being used during evaluation
        print("🧪 [GolfDataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GolfDataset, self).evaluate(results, metrics, logger)

        # ✅ DEBUG: print the final eval_results keys
        print(f"  ➤ returned metric keys: {list(eval_results.keys())}")
        return eval_results
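The key difference between this dataset and Golf8Dataset is the 255-masking step in `results2img`: any predicted id outside the 4-class range is collapsed to 255 and rendered black via a full 256-entry palette. The masking and palette setup can be checked in isolation:

```python
import numpy as np

NUM_CLASSES = 4  # car, grass, people, road

# Fake prediction containing ids outside the valid range.
result = np.array([[0, 3, 7], [255, 1, 4]], dtype=np.uint8)

# Same masking as results2img: everything >= NUM_CLASSES becomes 255.
result[result >= NUM_CLASSES] = 255

# 256-entry palette with black reserved for the 255 "background" index.
palette = np.zeros((256, 3), dtype=np.uint8)
palette[255] = [0, 0, 0]  # black background
```

Using a 256-entry palette (rather than one sized to `len(PALETTE)`) is what makes index 255 addressable at all in the palettised PNG.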
mmseg/datasets/golf_dataset1.py (new file, 66 lines)
@@ -0,0 +1,66 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GolfDataset(CustomDataset):
    """GolfDataset for custom semantic segmentation with two classes: road and grass."""

    CLASSES = ('road', 'grass')

    PALETTE = [[128, 64, 128],  # road
               [0, 255, 0]]     # grass

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GolfDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        result_files = self.results2img(results, imgfile_prefix, indices)
        return result_files

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""
        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GolfDataset, self).evaluate(results, metrics, logger)
        return eval_results
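All of these dataset classes rely on `CustomDataset` pairing images and annotations purely by suffix: an annotation file name is the image name with `img_suffix` swapped for `seg_map_suffix`. A sketch of that mapping (the concrete file name is made up for illustration):

```python
img_suffix = '_leftImg8bit.png'
seg_map_suffix = '_gtFine_labelIds.png'

img_name = 'hole03_0001_leftImg8bit.png'  # hypothetical image file

# Derive the annotation file name the way CustomDataset pairs them:
# strip the image suffix, append the segmentation-map suffix.
assert img_name.endswith(img_suffix)
seg_name = img_name[:-len(img_suffix)] + seg_map_suffix
```

If the annotation files on disk do not follow this naming convention, the dataset will silently come up empty, which is what the `dataset length` debug print above helps catch.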
mmseg/datasets/golf_datasetcanuse.py (new file, 87 lines)
@@ -0,0 +1,87 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GolfDataset(CustomDataset):
    """GolfDataset for semantic segmentation with four classes: car, grass, people, and road."""

    # ✅ Fixed classes and palette (not taken from the config)
    CLASSES = ('car', 'grass', 'people', 'road')
    PALETTE = [
        [246, 14, 135],   # car
        [233, 81, 78],    # grass
        [220, 148, 21],   # people
        [207, 215, 220],  # road
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GolfDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        # ✅ DEBUG: print CLASSES and PALETTE at init time
        print("✅ [GolfDataset] initialized")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        # ✅ DEBUG: report how CLASSES is being used during evaluation
        print("🧪 [GolfDataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GolfDataset, self).evaluate(results, metrics, logger)

        # ✅ DEBUG: print the final eval_results keys
        print(f"  ➤ returned metric keys: {list(eval_results.keys())}")
        return eval_results
tools/check/check_lane_offset.py (new file, 70 lines)
@@ -0,0 +1,70 @@
import cv2
import numpy as np

# === 1. File and parameter settings ===
img_path = r'C:\Users\rd_de\kneronstdc\work_dirs\vis_results\good\pic_0441_jpg.rf.6e56eb8c0bed7f773fb447b9e217f779_leftImg8bit.png'

# Color-to-label-ID mapping (RGB)
CLASS_RGB_TO_ID = {
    (128, 64, 128): 3,  # road (grey)
    (0, 255, 0): 1,     # grass (green)
    (255, 0, 255): 9,   # background or sky (purple), can be ignored
}

ROAD_ID = 3
GRASS_ID = 1

# === 2. Read the image and convert it to a label mask ===
bgr_img = cv2.imread(img_path)
rgb_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB)
height, width, _ = rgb_img.shape

label_mask = np.zeros((height, width), dtype=np.uint8)
for rgb, label in CLASS_RGB_TO_ID.items():
    match = np.all(rgb_img == rgb, axis=-1)
    label_mask[match] = label

# === 3. Analyse the lower-centre region of the frame ===
y_start = int(height * 0.6)
x_start = int(width * 0.4)
x_end = int(width * 0.6)
roi = label_mask[y_start:, x_start:x_end]

total_pixels = roi.size
road_pixels = np.sum(roi == ROAD_ID)
grass_pixels = np.sum(roi == GRASS_ID)

road_ratio = road_pixels / total_pixels
grass_ratio = grass_pixels / total_pixels

# === 4. Centroid offset analysis ===
road_mask = (label_mask == ROAD_ID).astype(np.uint8)
M = cv2.moments(road_mask)
center_x = width // 2
offset = 0
cx = center_x
if M["m00"] > 0:
    cx = int(M["m10"] / M["m00"])
    offset = cx - center_x

# === 5. Report results ===
print(f"🔍 central ROI - road ratio: {road_ratio:.2f}, grass ratio: {grass_ratio:.2f}")
if road_ratio < 0.5:
    print("⚠️ Off the road (road ratio in the ROI is too low)")
if grass_ratio > 0.3:
    print("❗ Vehicle is on the grass!")
if abs(offset) > 40:
    print(f"⚠️ Road centroid offset: {offset} px")
else:
    print("✅ Road centroid is centred")

# === 6. Visualization ===
vis_img = bgr_img.copy()
cv2.rectangle(vis_img, (x_start, y_start), (x_end, height), (0, 255, 255), 2)  # yellow ROI box
cv2.line(vis_img, (center_x, 0), (center_x, height), (255, 0, 0), 2)  # blue centre line
cv2.circle(vis_img, (cx, height // 2), 6, (0, 0, 255), -1)  # red centroid dot

# Save the output image
save_path = r'C:\Users\rd_de\kneronstdc\work_dirs\vis_results\good\visual_check.png'
cv2.imwrite(save_path, vis_img)
print(f"✅ Analysis image saved: {save_path}")
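The moment-based centroid used above has a simple numpy reading: for a binary mask, `m10 / m00` is just the mean x coordinate of the nonzero pixels. A cv2-free sketch of the same offset check on a synthetic mask:

```python
import numpy as np

height, width = 10, 20
road_mask = np.zeros((height, width), dtype=np.uint8)
road_mask[:, 14:20] = 1  # road pixels pushed to the right side

center_x = width // 2
ys, xs = np.nonzero(road_mask)

# Equivalent of cv2.moments: m10 / m00 == mean x of nonzero pixels.
cx = int(xs.mean()) if xs.size else center_x
offset = cx - center_x
```

Here the road occupies columns 14..19, so the centroid lands at x = 16 and the offset is +6 px, i.e. the road's mass sits right of the frame centre.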
tools/check/checklatest.py (new file, 33 lines)
@@ -0,0 +1,33 @@
import torch


def check_pth_num_classes(pth_path):
    checkpoint = torch.load(pth_path, map_location='cpu')

    if 'state_dict' not in checkpoint:
        print("❌ No state_dict found; this is probably not an MMSegmentation checkpoint")
        return

    state_dict = checkpoint['state_dict']

    # Find the weight tensor of the decode head's final classifier layer
    num_classes = None
    for k in state_dict.keys():
        if 'decode_head.classifier' in k and 'weight' in k:
            weight_tensor = state_dict[k]
            num_classes = weight_tensor.shape[0]
            print(f"✅ Detected number of classes: {num_classes}")
            break

    if num_classes is None:
        print("⚠️ Could not determine the number of classes; the model architecture may be non-standard")
    else:
        if num_classes == 19:
            print("⚠️ This is the default Cityscapes model (19 classes)")
        elif num_classes == 4:
            print("✅ This is the custom GolfDataset model (4 classes)")
        else:
            print("❓ Unexpected class count; check that the training data and config are consistent")


if __name__ == '__main__':
    pth_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.pth'
    check_pth_num_classes(pth_path)
tools/check/checkonnx.py (new file, 32 lines)
@@ -0,0 +1,32 @@
import onnx


def check_onnx_num_classes(onnx_path):
    model = onnx.load(onnx_path)
    graph = model.graph

    print(f"📂 Model path: {onnx_path}")
    print(f"📦 Total output nodes: {len(graph.output)}")

    for output in graph.output:
        name = output.name
        shape = []
        for dim in output.type.tensor_type.shape.dim:
            if dim.dim_param:
                shape.append(dim.dim_param)
            else:
                shape.append(dim.dim_value)
        print(f"🔎 Output node name: {name}")
        print(f"   Output shape: {shape}")
        if len(shape) == 4:
            num_classes = shape[1]
            print(f"✅ Detected number of classes: {num_classes}")
            if num_classes == 19:
                print("⚠️ This is the default Cityscapes model (19 classes)")
            elif num_classes == 4:
                print("✅ This is your trained GolfDataset model (4 classes)")
            else:
                print("❓ Unknown class count; check that the model was trained/converted correctly")


if __name__ == '__main__':
    onnx_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.onnx'
    check_onnx_num_classes(onnx_path)
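ONNX shape dims carry either a symbolic name (`dim_param`) or a concrete integer (`dim_value`), which is why the loop above branches on `dim.dim_param`. The branching logic can be exercised without the onnx package by standing in fake dim objects (the dims below imitate an N×4×H×W segmentation output):

```python
from types import SimpleNamespace


def dim_to_value(dim):
    """Mirror the shape logic above: symbolic dims keep their name,
    static dims contribute their integer value."""
    return dim.dim_param if dim.dim_param else dim.dim_value


# Fake output dims for a (batch, 4, 362, 724) tensor.
dims = [
    SimpleNamespace(dim_param='batch', dim_value=0),
    SimpleNamespace(dim_param='', dim_value=4),
    SimpleNamespace(dim_param='', dim_value=362),
    SimpleNamespace(dim_param='', dim_value=724),
]

shape = [dim_to_value(d) for d in dims]
num_classes = shape[1] if len(shape) == 4 else None
```

Note that when the class axis itself is symbolic, `shape[1]` would be a string, so a robust checker should also verify the value is an int before comparing it to 19 or 4.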
tools/check/list_pth_keys.py (new file, 29 lines)
@@ -0,0 +1,29 @@
import torch


def check_num_classes_from_pth(pth_path):
    checkpoint = torch.load(pth_path, map_location='cpu')

    if 'state_dict' not in checkpoint:
        print("❌ No state_dict found")
        return

    state_dict = checkpoint['state_dict']
    weight_key = 'decode_head.conv_seg.weight'

    if weight_key in state_dict:
        weight = state_dict[weight_key]
        num_classes = weight.shape[0]
        print(f"✅ Number of classes: {num_classes}")

        if num_classes == 19:
            print("⚠️ This is a Cityscapes model (19 classes)")
        elif num_classes == 4:
            print("✅ This is a GolfDataset model (4 classes)")
        else:
            print("❓ Unusual class count; verify your data and config yourself")
    else:
        print(f"❌ Classifier layer not found: {weight_key}")


if __name__ == '__main__':
    pth_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.pth'
    check_num_classes_from_pth(pth_path)
tools/custom_infer.py (new file, 36 lines)
@@ -0,0 +1,36 @@
import os
import torch
from mmseg.apis import inference_segmentor, init_segmentor


def main():
    # Paths
    config_file = 'configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py'
    checkpoint_file = 'work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth'
    img_dir = 'data/cityscapes/leftImg8bit/val'
    out_dir = 'work_dirs/vis_results'

    # Initialize the model
    model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
    print('CLASSES:', model.CLASSES)
    print('PALETTE:', model.PALETTE)

    # Create the output directory
    os.makedirs(out_dir, exist_ok=True)

    # Collect all image files
    img_list = []
    for root, _, files in os.walk(img_dir):
        for f in files:
            if f.endswith('.png') or f.endswith('.jpg'):
                img_list.append(os.path.join(root, f))

    # Run inference on each image
    for img_path in img_list:
        result = inference_segmentor(model, img_path)
        filename = os.path.basename(img_path)
        out_path = os.path.join(out_dir, filename)
        model.show_result(img_path, result, out_file=out_path, opacity=0.5)

    print(f'✅ Inference done: processed {len(img_list)} images, results written to {out_dir}')


if __name__ == '__main__':
    main()
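The `os.walk` collection step in `main()` can be pulled out and tested on a throwaway directory tree, which is useful when an empty `img_list` is the symptom (wrong `img_dir` or wrong extensions). A self-contained sketch:

```python
import os
import tempfile


def collect_images(img_dir, exts=('.png', '.jpg')):
    """Recursively gather image paths, as in main() above."""
    img_list = []
    for root, _, files in os.walk(img_dir):
        for f in files:
            if f.lower().endswith(exts):
                img_list.append(os.path.join(root, f))
    return img_list


# Exercise it on a temporary directory tree.
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'sub'))
for name in ('a.png', os.path.join('sub', 'b.jpg'), 'notes.txt'):
    open(os.path.join(tmp, name), 'w').close()

found = sorted(os.path.basename(p) for p in collect_images(tmp))
```

Lower-casing before the suffix check also picks up `.PNG`/`.JPG` files, which the original `f.endswith` test would silently skip.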
tools/kneron/e2eonnx.py (new file, 61 lines)
@@ -0,0 +1,61 @@
import numpy as np
import ktc
import cv2
from PIL import Image


# === 1. Preprocessing + inference ===
def run_e2e_simulation(img_path, onnx_path):
    # Image preprocessing (724x362)
    image = Image.open(img_path).convert("RGB")
    image = image.resize((724, 362), Image.BILINEAR)
    img_data = np.array(image) / 255.0
    img_data = np.transpose(img_data, (2, 0, 1))  # HWC -> CHW
    img_data = np.expand_dims(img_data, 0)        # -> NCHW (1, 3, 362, 724)

    input_data = [img_data]
    inf_results = ktc.kneron_inference(
        input_data,
        onnx_file=onnx_path,
        input_names=["input"]
    )

    return inf_results


# === 2. Run inference ===
image_path = "test.png"
onnx_path = "work_dirs/meconfig8/latest_optimized.onnx"
result = run_e2e_simulation(image_path, onnx_path)

print("Inference result shape:", np.array(result).shape)  # (1, 1, 7, 46, 91)

# === 3. Extract and post-process the output ===
output_tensor = np.array(result)[0][0]        # shape: (7, 46, 91)
pred_mask = np.argmax(output_tensor, axis=0)  # shape: (46, 91)

print("Predicted segmentation mask:")
print(pred_mask)

# === 4. Upsample back to 724x362 ===
upsampled_mask = cv2.resize(pred_mask.astype(np.uint8), (724, 362), interpolation=cv2.INTER_NEAREST)

# === 5. Colorize (simple fixed palette) ===
# Define the colors for your 7 classes as needed (BGR)
colors = np.array([
    [0, 0, 0],      # 0: background
    [0, 255, 0],    # 1: grass
    [255, 0, 0],    # 2: car
    [0, 0, 255],    # 3: person
    [255, 255, 0],  # 4: road
    [255, 0, 255],  # 5: tree
    [0, 255, 255],  # 6: other
], dtype=np.uint8)

colored_mask = colors[upsampled_mask]  # shape: (362, 724, 3)
colored_mask = np.asarray(colored_mask, dtype=np.uint8)

# === 6. Sanity check and save ===
if colored_mask.shape != (362, 724, 3):
    raise ValueError(f"❌ Unexpected mask shape: {colored_mask.shape}")

cv2.imwrite("pred_mask_resized.png", colored_mask)
print("✅ Semantic mask saved: pred_mask_resized.png")
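Steps 3 and 4 above (argmax over the class axis, then nearest-neighbour upsampling) can be verified with a tiny synthetic tensor and no cv2: for integer scale factors, nearest-neighbour resizing is just repeating each row and column. A sketch with 3 classes on a 2×2 grid:

```python
import numpy as np

# Fake logits: 3 classes on a 2x2 grid.
logits = np.zeros((3, 2, 2), dtype=np.float32)
logits[0, 0, 0] = 1.0  # class 0 wins top-left
logits[1, 0, 1] = 1.0  # class 1 wins top-right
logits[2, 1, :] = 1.0  # class 2 wins the bottom row

pred_mask = np.argmax(logits, axis=0)  # (2, 2)

# Nearest-neighbour upsample by an integer factor, cv2-free.
scale = 2
upsampled = pred_mask.repeat(scale, axis=0).repeat(scale, axis=1)
```

Nearest-neighbour is the right choice for label maps: bilinear interpolation would blend neighbouring class ids into new, meaningless ids.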
tools/kneron/onnx2nef720.py (new file, 96 lines)
@@ -0,0 +1,96 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'  # directory holding your ONNX model
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
data_path = "data724362"  # folder of test images
imgsz_w, imgsz_h = 724, 362  # input size; must match what the ONNX model expects

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Load and optimize the ONNX model ===
print("🔄 Loading and optimizing ONNX...")
m = onnx.load(onnx_path)
m = ktc.onnx_optimizer.onnx2onnx_flow(m)
opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx')
onnx.save(m, opt_onnx_path)

# === 4. Verify the ONNX input shape ===
input_tensor = m.graph.input[0]
input_shape = [dim.dim_value for dim in input_tensor.type.tensor_type.shape.dim]
print(f"📏 ONNX Input Shape: {input_shape}")

expected_shape = [1, 3, imgsz_h, imgsz_w]  # (N, C, H, W)

if input_shape != expected_shape:
    raise ValueError(f"❌ Error: ONNX input shape {input_shape} does not match expected {expected_shape}.")

# === 5. Configure the Kneron model compilation ===
print("📐 Configuring model for KL720...")
km = ktc.ModelConfig(20008, "0001", "720", onnx_model=m)

# (Optional) model performance evaluation
eval_result = km.evaluate()
print("\n📊 NPU Performance Evaluation:\n" + str(eval_result))

# === 6. Prepare the image data ===
print("🖼️ Preparing image data...")
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]

if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}!")

print(f"✅ Found {len(files_found)} images in {data_path}")

input_name = input_tensor.name
img_list = []

for root, _, files in os.walk(data_path):
    for f in files:
        fullpath = os.path.join(root, f)
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB -> BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32).copy()
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC -> CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW -> NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ Error: No valid images were processed!")

# === 7. BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Error: BIE model was not generated!")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Error: NEF model was not generated!")

print("✅ NEF compile done!")
print("📁 NEF file saved to:", nef_save_path)
tools/kneron/onnx2nefSTDC630.py
Normal file
103
tools/kneron/onnx2nefSTDC630.py
Normal file
@ -0,0 +1,103 @@
|
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image
import kneronnxopt

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = "data724362"
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via the kneronnxopt API) ===
print("⚙️ Optimizing ONNX with kneronnxopt...")
try:
    model = onnx.load(onnx_path)
    input_tensor = model.graph.input[0]
    input_name = input_tensor.name
    input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
    print(f"📌 Actual model input name: {input_name}")

    model = kneronnxopt.optimize(
        model,
        duplicate_shared_weights=1,
        skip_check=False,
        skip_fuse_qkv=True
    )
    onnx.save(model, optimized_path)
except Exception as e:
    print(f"❌ Optimization failed: {e}")
    exit(1)

# === 4. Load the optimized model ===
print("🔄 Loading the optimized ONNX...")
m = onnx.load(optimized_path)

# === 5. Configure the Kneron model compilation ===
print("📐 Configuring model...")
km = ktc.ModelConfig(20008, "0001", "630", onnx_model=m)

# (Optional) model performance evaluation
eval_result = km.evaluate()
print("\n📊 NPU performance evaluation:\n" + str(eval_result))

# === 6. Process the input images ===
print("🖼️ Processing input images...")
input_name = m.graph.input[0].name
img_list = []

files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]

if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}!")

for root, _, files in os.walk(data_path):
    for f in files:
        fullpath = os.path.join(root, f)
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB -> BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC -> CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW -> NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ Error: no valid images were processed!")

# === 7. BIE analysis (quantization) ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ BIE model was not generated")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ NEF model was not generated")

print("✅ NEF compile done")
print("📁 NEF file saved to:", nef_save_path)
tools/kneron/onnx2nefSTDC630_2.py (new file, 64 lines)
@@ -0,0 +1,64 @@
import os
import numpy as np
import onnx
import shutil
import cv2
import ktc

onnx_dir = 'work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
data_path = "data512"
imgsz = (512, 512)

os.makedirs(onnx_dir, exist_ok=True)

print("🔄 Loading and optimizing ONNX...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx')
onnx.save(model, opt_onnx_path)

print("📐 Configuring model...")
km = ktc.ModelConfig(20008, "0001", "630", onnx_model=model)

# Optional: performance check
print("\n📊 Evaluating model...")
print(km.evaluate())

input_name = model.graph.input[0].name
print("📥 ONNX input name:", input_name)

img_list = []
print("🖼️ Preprocessing images...")
for root, _, files in os.walk(data_path):
    for fname in files:
        if fname.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp')):
            path = os.path.join(root, fname)
            img = cv2.imread(path)
            img = cv2.resize(img, imgsz)
            img = img.astype(np.float32) / 256.0 - 0.5
            img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
            img = np.expand_dims(img, axis=0)   # add batch dim
            img_list.append(img)
            print("✅", path)

if not img_list:
    raise RuntimeError("❌ No images processed!")

print("📦 Quantizing (BIE)...")
bie_path = km.analysis({input_name: img_list})
bie_save = os.path.join(onnx_dir, os.path.basename(bie_path))
shutil.copy(bie_path, bie_save)

if not os.path.exists(bie_save):
    raise RuntimeError("❌ BIE model not saved!")

print("⚙️ Compiling NEF...")
nef_path = ktc.compile([km])
nef_save = os.path.join(onnx_dir, os.path.basename(nef_path))
shutil.copy(nef_path, nef_save)

if not os.path.exists(nef_save):
    raise RuntimeError("❌ NEF model not saved!")

print("✅ Compile finished. NEF at:", nef_save)
tools/kneron/onnx2nefSTDC630canuse.py (new file, 86 lines)
@@ -0,0 +1,86 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
data_path = "data724362"
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Load and optimize the ONNX model ===
print("🔄 Loading and optimizing ONNX...")
m = onnx.load(onnx_path)
m = ktc.onnx_optimizer.onnx2onnx_flow(m)
opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx')
onnx.save(m, opt_onnx_path)

# === 4. Configure the Kneron model compiler ===
print("📐 Configuring model...")
km = ktc.ModelConfig(20008, "0001", "630", onnx_model=m)

# (Optional) model performance evaluation
eval_result = km.evaluate()
print("\n📊 NPU Performance Evaluation:\n" + str(eval_result))

# === 5. Prepare image data ===
print("🖼️ Preparing image data...")
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]

if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}!")

print(f"✅ Found {len(files_found)} images in {data_path}")

input_name = m.graph.input[0].name
img_list = []

for root, _, files in os.walk(data_path):
    for f in files:
        fullpath = os.path.join(root, f)
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW (add batch dim)
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ Error: No valid images were processed!")

# === 6. BIE quantization analysis ===
print("📦 Running fixed-point analysis...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Error: BIE model was not generated!")

print("✅ BIE model saved to:", bie_save_path)

# === 7. Compile the NEF model ===
print("⚙️ Compiling NEF model...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Error: NEF model was not generated!")

print("✅ NEF compile done!")
print("📁 NEF file saved to:", nef_save_path)
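The per-image calibration preprocessing above (RGB➝BGR, resize, `/256.0 - 0.5`, HWC➝NCHW) can be factored into a standalone helper for reuse and testing. A minimal numpy-only sketch; the function name `preprocess_for_kneron` is our own, and nearest-neighbor resizing stands in for the script's PIL bilinear resize to keep the sketch dependency-free:

```python
import numpy as np

def preprocess_for_kneron(img_hwc: np.ndarray, w: int = 724, h: int = 362) -> np.ndarray:
    """Mirror the calibration preprocessing: RGB->BGR, resize, /256-0.5, NCHW.

    The script uses PIL bilinear resizing; nearest-neighbor indexing is used
    here only to keep the sketch numpy-only.
    """
    bgr = img_hwc[..., ::-1]                                  # RGB -> BGR
    ys = (np.arange(h) * img_hwc.shape[0] / h).astype(int)    # nearest rows
    xs = (np.arange(w) * img_hwc.shape[1] / w).astype(int)    # nearest cols
    resized = bgr[ys][:, xs]
    x = resized.astype(np.float32) / 256.0 - 0.5              # roughly [-0.5, 0.496]
    x = np.transpose(x, (2, 0, 1))                            # HWC -> CHW
    return x[None, ...]                                       # add batch dim -> NCHW

# Example on a synthetic all-black image
dummy = np.zeros((100, 200, 3), dtype=np.uint8)
batch = preprocess_for_kneron(dummy)
print(batch.shape)  # (1, 3, 362, 724)
```

Each element of `img_list` built by the script should have exactly this shape and value range.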
92  tools/kneron/onnx2nef_stdc630_safe.py  Normal file
@@ -0,0 +1,92 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = 'data724362'
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
onnx.save(model, optimized_path)

# === 4. Validate the input shape ===
print("📏 Validating ONNX input shape...")
input_tensor = model.graph.input[0]
input_name = input_tensor.name
input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
expected_shape = [1, 3, imgsz_h, imgsz_w]
print(f"📌 input_name: {input_name}")
print(f"📌 input_shape: {input_shape}")
if input_shape != expected_shape:
    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")

# === 5. Initialize the model compiler (for KL630) ===
print("📐 Configuring model for KL630...")
km = ktc.ModelConfig(32769, "0001", "630", onnx_model=model)

# (Optional) performance analysis
eval_result = km.evaluate()
print("\n📊 NPU performance analysis:\n" + str(eval_result))

# === 6. Image preprocessing ===
print("🖼️ Processing input images...")
img_list = []
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}")

for root, _, files in os.walk(data_path):
    for f in files:
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        fullpath = os.path.join(root, f)
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ No images were processed successfully!")

# === 7. Run BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Failed to generate the BIE model")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model for KL630...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Failed to generate the NEF model")

print("✅ NEF compile finished")
print("📁 NEF file saved to:", nef_save_path)
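The shape check in step 4 compares the graph's static dims against the NCHW shape the compiler expects. It can be isolated as a small helper; `check_input_shape` is our own name, and it also rejects dynamic dims, since in ONNX a dim carried as `dim_param` leaves `dim_value` at 0:

```python
def check_input_shape(actual, expected):
    """Raise if an ONNX graph's static input shape differs from the expected
    NCHW shape. A dim_value of 0 indicates a dynamic (symbolic) dimension."""
    if any(d == 0 for d in actual):
        raise ValueError(f"dynamic dims not supported: {actual}")
    if list(actual) != list(expected):
        raise ValueError(f"shape mismatch: {actual} != {expected}")
    return True

print(check_input_shape([1, 3, 362, 724], [1, 3, 362, 724]))  # True
```

Failing fast here is what makes this variant "safe": a dynamic or mismatched input surfaces before the slow quantization step rather than inside it.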
92  tools/kneron/onnx2nef_stdc830_safe.py  Normal file
@@ -0,0 +1,92 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = 'data724362'
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
onnx.save(model, optimized_path)

# === 4. Validate the input shape ===
print("📏 Validating ONNX input shape...")
input_tensor = model.graph.input[0]
input_name = input_tensor.name
input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
expected_shape = [1, 3, imgsz_h, imgsz_w]
print(f"📌 input_name: {input_name}")
print(f"📌 input_shape: {input_shape}")
if input_shape != expected_shape:
    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")

# === 5. Initialize the model compiler (target platform '730') ===
print("📐 Configuring model (platform '730')...")
km = ktc.ModelConfig(40000, "0001", "730", onnx_model=model)

# (Optional) performance analysis
eval_result = km.evaluate()
print("\n📊 NPU performance analysis:\n" + str(eval_result))

# === 6. Image preprocessing ===
print("🖼️ Processing input images...")
img_list = []
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}")

for root, _, files in os.walk(data_path):
    for f in files:
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        fullpath = os.path.join(root, f)
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ No images were processed successfully!")

# === 7. Run BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Failed to generate the BIE model")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model (platform '730')...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Failed to generate the NEF model")

print("✅ NEF compile finished")
print("📁 NEF file saved to:", nef_save_path)
47  tools/kneron/onnxe2e.py  Normal file
@@ -0,0 +1,47 @@
import onnxruntime as ort
import numpy as np
from PIL import Image
import cv2

# === 1. Load the ONNX model ===
onnx_path = "work_dirs/meconfig8/latest.onnx"
session = ort.InferenceSession(onnx_path, providers=['CPUExecutionProvider'])

# === 2. Preprocess the input image (724x362) ===
def preprocess(img_path):
    image = Image.open(img_path).convert("RGB")
    image = image.resize((724, 362), Image.BILINEAR)
    img = np.array(image) / 255.0
    img = np.transpose(img, (2, 0, 1))  # HWC → CHW
    img = np.expand_dims(img, 0).astype(np.float32)  # (1, 3, 362, 724)
    return img

img_path = "test.png"
input_tensor = preprocess(img_path)

# === 3. Run inference ===
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_tensor})  # list of np.ndarray

# === 4. Post-process: predicted mask ===
output_tensor = output[0][0]  # shape: (num_classes, H, W)
pred_mask = np.argmax(output_tensor, axis=0).astype(np.uint8)  # (H, W)

# === 5. Visualize the result ===
colors = [
    [128, 0, 0],    # 0: bunker
    [0, 0, 128],    # 1: car
    [0, 128, 0],    # 2: grass
    [0, 255, 0],    # 3: greenery
    [255, 0, 0],    # 4: person
    [255, 165, 0],  # 5: road
    [0, 255, 255],  # 6: tree
]

color_mask = np.zeros((pred_mask.shape[0], pred_mask.shape[1], 3), dtype=np.uint8)
for cls_id, color in enumerate(colors):
    color_mask[pred_mask == cls_id] = color

# Save the visualization
cv2.imwrite("onnx_pred_mask.png", color_mask)
print("✅ Prediction saved to: onnx_pred_mask.png")
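The per-class loop that paints `color_mask` can be replaced with a single fancy-indexing lookup: treating the palette as an array indexed by class id gives the same result in one vectorized step. A sketch with the same seven-class palette; `colorize` is our own helper name:

```python
import numpy as np

# Same palette as the script: bunker, car, grass, greenery, person, road, tree
PALETTE = np.array([
    [128, 0, 0], [0, 0, 128], [0, 128, 0], [0, 255, 0],
    [255, 0, 0], [255, 165, 0], [0, 255, 255],
], dtype=np.uint8)

def colorize(pred_mask: np.ndarray, palette: np.ndarray = PALETTE) -> np.ndarray:
    """Vectorized equivalent of the per-class loop: index the palette by class id."""
    return palette[pred_mask]  # (H, W) -> (H, W, 3)

mask = np.array([[0, 1], [2, 6]], dtype=np.uint8)
out = colorize(mask)  # class 0 maps to [128, 0, 0], class 6 to [0, 255, 255]
```

This avoids one full-image comparison per class, which matters once the mask is at the full 362x724 resolution.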
92  tools/kneron/test.py  Normal file
@@ -0,0 +1,92 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = 'data724362'
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
onnx.save(model, optimized_path)

# === 4. Validate the input shape ===
print("📏 Validating ONNX input shape...")
input_tensor = model.graph.input[0]
input_name = input_tensor.name
input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
expected_shape = [1, 3, imgsz_h, imgsz_w]
print(f"📌 input_name: {input_name}")
print(f"📌 input_shape: {input_shape}")
if input_shape != expected_shape:
    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")

# === 5. Initialize the model compiler (for KL630) ===
print("📐 Configuring model for KL630...")
km = ktc.ModelConfig(32769, "0001", "630", onnx_model=model)

# (Optional) performance analysis
eval_result = km.evaluate()
print("\n📊 NPU performance analysis:\n" + str(eval_result))

# === 6. Image preprocessing ===
print("🖼️ Processing input images...")
img_list = []
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}")

for root, _, files in os.walk(data_path):
    for f in files:
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        fullpath = os.path.join(root, f)
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ No images were processed successfully!")

# === 7. Run BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Failed to generate the BIE model")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model for KL630...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Failed to generate the NEF model")

print("✅ NEF compile finished")
print("📁 NEF file saved to:", nef_save_path)
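Several of these scripts repeat the same `os.walk` plus suffix-filter pattern to collect calibration images. It can be pulled into one helper; `find_images` and `IMG_EXTS` are our own names, and sorting makes the calibration set order reproducible across runs:

```python
import os
import tempfile

IMG_EXTS = (".jpg", ".jpeg", ".png", ".bmp")

def find_images(data_path):
    """Recursively collect image paths, mirroring the os.walk filter above."""
    return sorted(
        os.path.join(root, f)
        for root, _, files in os.walk(data_path)
        for f in files
        if f.lower().endswith(IMG_EXTS)
    )

# Quick demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as d:
    os.makedirs(os.path.join(d, "sub"))
    for name in ("a.jpg", "b.txt", os.path.join("sub", "c.PNG")):
        open(os.path.join(d, name), "w").close()
    found = find_images(d)
    print([os.path.basename(p) for p in found])  # ['a.jpg', 'c.PNG']
```

Case-insensitive matching (`f.lower()`) is what lets `c.PNG` through while `b.txt` is skipped.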
24  tools/kneron/test_onnx_dummy.py  Normal file
@@ -0,0 +1,24 @@
import onnxruntime as ort
import numpy as np

# ✅ Model path (as specified)
onnx_path = r"C:\Users\rd_de\kneron-mmsegmentation\work_dirs\kn_stdc1_in1k-pre_512x1024_80k_cityscapes\latest.onnx"

# Create the ONNX Runtime session
session = ort.InferenceSession(onnx_path)

# Print model input information
input_name = session.get_inputs()[0].name
input_shape = session.get_inputs()[0].shape
print(f"✅ Input name: {input_name}")
print(f"✅ Input shape: {input_shape}")

# Create a dummy input (float32, shape = [1, 3, 512, 1024])
dummy_input = np.random.rand(1, 3, 512, 1024).astype(np.float32)

# Run inference
outputs = session.run(None, {input_name: dummy_input})

# Print model output information
for i, output in enumerate(outputs):
    print(f"✅ Output {i}: shape = {output.shape}, dtype = {output.dtype}")
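For a smoke test like the one above, a seeded generator makes the dummy input reproducible, so repeated runs exercise the model with identical data. A small sketch; `make_dummy_input` is our own helper name:

```python
import numpy as np

def make_dummy_input(shape=(1, 3, 512, 1024), seed=0):
    """Deterministic float32 dummy input for ONNX Runtime smoke tests."""
    rng = np.random.default_rng(seed)
    return rng.random(shape, dtype=np.float32)

x1 = make_dummy_input(seed=0)
x2 = make_dummy_input(seed=0)
print(x1.shape, x1.dtype)  # (1, 3, 512, 1024) float32
```

Unlike `np.random.rand(...)`, the seeded `default_rng` call returns the same tensor on every run, which helps when comparing outputs across toolchain versions.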
43  tools/optimize_onnx_kneron.py  Normal file
@@ -0,0 +1,43 @@
import os
import sys
import onnx

# === Add the optimizer_scripts module path dynamically ===
current_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(current_dir, 'tools'))

from optimizer_scripts.pytorch_exported_onnx_preprocess import torch_exported_onnx_flow


def main():
    # === Paths ===
    onnx_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig8\latest.onnx'
    optimized_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig8\latest_optimized.onnx'

    if not os.path.exists(onnx_path):
        print(f'❌ ONNX file not found: {onnx_path}')
        return

    # === Load the ONNX model ===
    print(f'🔄 Loading ONNX: {onnx_path}')
    m = onnx.load(onnx_path)

    # === Patch ir_version (avoids errors with opset 11) ===
    if m.ir_version == 7:
        print('⚠️ Downgrading ir_version 7 → 6 (compatibility fix)')
        m.ir_version = 6

    # === Run the Kneron optimization flow ===
    print('⚙️ Running the Kneron optimization flow...')
    try:
        m = torch_exported_onnx_flow(m, disable_fuse_bn=False)
    except Exception as e:
        print(f'❌ Optimization failed: {type(e).__name__} → {e}')
        return

    # === Save the result ===
    os.makedirs(os.path.dirname(optimized_path), exist_ok=True)
    onnx.save(m, optimized_path)
    print(f'✅ Optimized ONNX saved: {optimized_path}')


if __name__ == '__main__':
    main()
@@ -328,6 +328,15 @@ def topological_sort(g):
            if in_degree[node_name] == 0:
                to_add.append(node_name)
                del in_degree[node_name]
    # deal with initializers (weights/biases)
    for initializer in g.initializer:
        init_name = initializer.name
        for node_name in output_nodes[init_name]:
            if node_name in in_degree:
                in_degree[node_name] -= 1
                if in_degree[node_name] == 0:
                    to_add.append(node_name)
                    del in_degree[node_name]
    # main sort loop
    sorted_nodes = []
    while to_add:
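The `in_degree`/`to_add` bookkeeping in this hunk is the setup phase of Kahn's algorithm: initializers (weights/biases) are treated as already-satisfied inputs, so they decrement the in-degree of their consumers before the main loop starts. A self-contained sketch of the same algorithm on a plain adjacency map; `topo_sort` and its argument names are our own:

```python
from collections import deque

def topo_sort(nodes, edges):
    """Kahn's algorithm, mirroring the in_degree/to_add bookkeeping above.

    `edges` maps a node to the nodes that consume its output.
    """
    in_degree = {n: 0 for n in nodes}
    for src in edges:
        for dst in edges[src]:
            in_degree[dst] += 1
    to_add = deque(n for n in nodes if in_degree[n] == 0)  # ready to emit
    order = []
    while to_add:
        n = to_add.popleft()
        order.append(n)
        for dst in edges.get(n, ()):
            in_degree[dst] -= 1
            if in_degree[dst] == 0:
                to_add.append(dst)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order

print(topo_sort(["a", "b", "c"], {"a": ["b"], "b": ["c"]}))  # ['a', 'b', 'c']
```

In the ONNX version, a node whose in-degree never reaches zero usually means a dangling input; the patch fixes exactly that for initializer-fed nodes.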
242  tools/pytorch2onnx_kneron13.py  Normal file
@@ -0,0 +1,242 @@
# All modifications made by Kneron Corp.: Copyright (c) 2022 Kneron Corp.
# Copyright (c) OpenMMLab. All rights reserved.
import argparse
import warnings
import os

import onnx
import mmcv
import numpy as np
import onnxruntime as rt
import torch
from mmcv import DictAction
from mmcv.onnx import register_extra_symbolics
from mmcv.runner import load_checkpoint
from torch import nn

from mmseg.apis import show_result_pyplot
from mmseg.apis.inference import LoadImage
from mmseg.datasets.pipelines import Compose
from mmseg.models import build_segmentor

from optimizer_scripts.tools import other
from optimizer_scripts.pytorch_exported_onnx_preprocess import torch_exported_onnx_flow

torch.manual_seed(3)


def _parse_normalize_cfg(test_pipeline):
    transforms = None
    for pipeline in test_pipeline:
        if 'transforms' in pipeline:
            transforms = pipeline['transforms']
            break
    assert transforms is not None, 'Failed to find `transforms`'
    norm_config_li = [_ for _ in transforms if _['type'] == 'Normalize']
    assert len(norm_config_li) == 1, '`norm_config` should only have one'
    return norm_config_li[0]


def _convert_batchnorm(module):
    module_output = module
    if isinstance(module, torch.nn.SyncBatchNorm):
        module_output = torch.nn.BatchNorm2d(
            module.num_features, module.eps,
            module.momentum, module.affine, module.track_running_stats)
        if module.affine:
            module_output.weight.data = module.weight.data.clone().detach()
            module_output.bias.data = module.bias.data.clone().detach()
            module_output.weight.requires_grad = module.weight.requires_grad
            module_output.bias.requires_grad = module.bias.requires_grad
        module_output.running_mean = module.running_mean
        module_output.running_var = module.running_var
        module_output.num_batches_tracked = module.num_batches_tracked
    for name, child in module.named_children():
        module_output.add_module(name, _convert_batchnorm(child))
    del module
    return module_output


def _demo_mm_inputs(input_shape):
    (N, C, H, W) = input_shape
    rng = np.random.RandomState(0)
    img = torch.FloatTensor(rng.rand(*input_shape))
    return img


def _prepare_input_img(img_path, test_pipeline, shape=None):
    if shape is not None:
        test_pipeline[1]['img_scale'] = (shape[1], shape[0])
        test_pipeline[1]['transforms'][0]['keep_ratio'] = False
    test_pipeline = [LoadImage()] + test_pipeline[1:]
    test_pipeline = Compose(test_pipeline)
    data = dict(img=img_path)
    data = test_pipeline(data)
    img = torch.FloatTensor(data['img']).unsqueeze_(0)
    return img


def pytorch2onnx(model, img, norm_cfg=None, opset_version=13,
                 show=False, output_file='tmp.onnx', verify=False):
    model.cpu().eval()

    if isinstance(model.decode_head, nn.ModuleList):
        num_classes = model.decode_head[-1].num_classes
    else:
        num_classes = model.decode_head.num_classes

    # Save the real forward BEFORE patching it, so it can be restored later.
    # (The original order saved forward_dummy, making the restore a no-op.)
    origin_forward = model.forward
    model.forward = model.forward_dummy

    register_extra_symbolics(opset_version)
    with torch.no_grad():
        torch.onnx.export(
            model, img, output_file,
            input_names=['input'],
            output_names=['output'],
            export_params=True,
            keep_initializers_as_inputs=False,
            verbose=show,
            opset_version=opset_version,
            dynamic_axes=None)
        print(f'Successfully exported ONNX model: {output_file} '
              f'(opset_version={opset_version})')

    # NOTE: optimize onnx
    m = onnx.load(output_file)
    if opset_version == 11:
        m.ir_version = 6
    m = torch_exported_onnx_flow(m, disable_fuse_bn=False)
    onnx.save(m, output_file)
    print(f'{output_file} optimized by KNERON successfully.')

    if verify:
        onnx_model = onnx.load(output_file)
        onnx.checker.check_model(onnx_model)

        with torch.no_grad():
            pytorch_result = model(img).numpy()  # still forward_dummy here

        input_all = [node.name for node in onnx_model.graph.input]
        input_initializer = [node.name for node in onnx_model.graph.initializer]
        net_feed_input = list(set(input_all) - set(input_initializer))
        assert len(net_feed_input) == 1
        sess = rt.InferenceSession(output_file, providers=['CPUExecutionProvider'])
        onnx_result = sess.run(None, {net_feed_input[0]: img.detach().numpy()})[0]

        if show:
            import cv2
            img_show = img[0][:3, ...].permute(1, 2, 0) * 255
            img_show = img_show.detach().numpy().astype(np.uint8)
            ori_shape = img_show.shape[:2]

            onnx_result_ = onnx_result[0].argmax(0)
            onnx_result_ = cv2.resize(onnx_result_.astype(np.uint8),
                                      (ori_shape[1], ori_shape[0]))
            show_result_pyplot(model, img_show, (onnx_result_, ),
                               palette=model.PALETTE, block=False,
                               title='ONNXRuntime', opacity=0.5)

            pytorch_result_ = pytorch_result.squeeze().argmax(0)
            pytorch_result_ = cv2.resize(pytorch_result_.astype(np.uint8),
                                         (ori_shape[1], ori_shape[0]))
            show_result_pyplot(model, img_show, (pytorch_result_, ),
                               title='PyTorch', palette=model.PALETTE,
                               opacity=0.5)

        np.testing.assert_allclose(
            pytorch_result.astype(np.float32) / num_classes,
            onnx_result.astype(np.float32) / num_classes,
            rtol=1e-5,
            atol=1e-5,
            err_msg='The outputs are different between Pytorch and ONNX')
        print('The outputs are same between Pytorch and ONNX.')

    # Restore the real forward after verification (which also uses forward_dummy).
    model.forward = origin_forward

    if norm_cfg is not None:
        print('Prepending BatchNorm layer to ONNX as data normalization...')
        mean = norm_cfg['mean']
        std = norm_cfg['std']
        i_n = m.graph.input[0]
        if (i_n.type.tensor_type.shape.dim[1].dim_value != len(mean) or
                i_n.type.tensor_type.shape.dim[1].dim_value != len(std)):
            raise ValueError(
                f'--pixel-bias-value ({mean}) and --pixel-scale-value ({std}) '
                'should match the input channel dimension.')
        norm_bn_bias = [-1 * cm / cs + 128. / cs for cm, cs in zip(mean, std)]
        norm_bn_scale = [1 / cs for cs in std]
        other.add_bias_scale_bn_after(m.graph, i_n.name, norm_bn_bias, norm_bn_scale)
        m = other.polish_model(m)
        bn_outf = os.path.splitext(output_file)[0] + '_bn_prepended.onnx'
        onnx.save(m, bn_outf)
        print(f'BN-Prepended ONNX saved to {bn_outf}')

    return


def parse_args():
    parser = argparse.ArgumentParser(description='Convert MMSeg to ONNX')
    parser.add_argument('config', help='test config file path')
    parser.add_argument('--checkpoint', help='checkpoint file', default=None)
    parser.add_argument('--input-img', type=str, help='Images for input', default=None)
    parser.add_argument('--show', action='store_true',
                        help='show onnx graph and segmentation results')
    parser.add_argument('--verify', action='store_true', help='verify the onnx model')
    parser.add_argument('--output-file', type=str, default='tmp.onnx')
    parser.add_argument('--opset-version', type=int, default=13)  # default opset=13
    parser.add_argument('--shape', type=int, nargs='+', default=None,
                        help='input image height and width.')
    parser.add_argument('--cfg-options', nargs='+', action=DictAction,
                        help='Override config options.')
    parser.add_argument('--normalization-in-onnx', action='store_true',
                        help='Prepend BN for normalization.')
    args = parser.parse_args()
    return args


if __name__ == '__main__':
    args = parse_args()

    if args.opset_version < 11:
        raise ValueError(
            f'Only opset_version >= 11 is supported (got {args.opset_version}).')

    cfg = mmcv.Config.fromfile(args.config)
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)
    cfg.model.pretrained = None

    test_mode = cfg.model.test_cfg.mode
    if args.shape is None:
        if test_mode == 'slide':
            crop_size = cfg.model.test_cfg['crop_size']
            input_shape = (1, 3, crop_size[1], crop_size[0])
        else:
            img_scale = cfg.test_pipeline[1]['img_scale']
            input_shape = (1, 3, img_scale[1], img_scale[0])
    else:
        if test_mode == 'slide':
            warnings.warn('Shape assignment for slide-mode models may cause '
                          'unexpected results.')
        if len(args.shape) == 1:
            input_shape = (1, 3, args.shape[0], args.shape[0])
        elif len(args.shape) == 2:
            input_shape = (1, 3) + tuple(args.shape)
        else:
            raise ValueError('Invalid input shape')

    cfg.model.train_cfg = None
    segmentor = build_segmentor(cfg.model, train_cfg=None, test_cfg=cfg.get('test_cfg'))
    segmentor = _convert_batchnorm(segmentor)

    if args.checkpoint:
        checkpoint = load_checkpoint(segmentor, args.checkpoint, map_location='cpu')
        segmentor.CLASSES = checkpoint['meta']['CLASSES']
        segmentor.PALETTE = checkpoint['meta']['PALETTE']

    if args.input_img is not None:
        preprocess_shape = (input_shape[2], input_shape[3])
        img = _prepare_input_img(args.input_img, cfg.data.test.pipeline,
                                 shape=preprocess_shape)
    else:
        img = _demo_mm_inputs(input_shape)

    if args.normalization_in_onnx:
        norm_cfg = _parse_normalize_cfg(cfg.test_pipeline)
    else:
        norm_cfg = None

    pytorch2onnx(
        segmentor,
        img,
        norm_cfg=norm_cfg,
        opset_version=args.opset_version,
        show=args.show,
        output_file=args.output_file,
        verify=args.verify,
    )
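The BN-prepend step folds the pipeline's `Normalize` into a BatchNorm with `scale = 1/std` and `bias = -mean/std + 128/std`, so the exported graph computes `(x - mean + 128)/std` directly from raw pixels (the `+128` offset presumably re-centers inputs that the NPU runtime delivers shifted by 128). A numpy check of that algebra; the example mean/std values are the common ImageNet ones and are only an assumption here:

```python
import numpy as np

# Example per-channel stats (assumption: standard ImageNet values)
mean = np.array([123.675, 116.28, 103.53])
std = np.array([58.395, 57.12, 57.375])

# Coefficients exactly as computed in pytorch2onnx() above
scale = 1.0 / std
bias = -mean / std + 128.0 / std

x = np.random.default_rng(0).uniform(0, 255, size=(4, 3))  # fake pixel rows
folded = x * scale + bias            # what the prepended BN computes
reference = (x - mean + 128.0) / std  # normalization with a +128 offset
```

`folded` and `reference` agree elementwise, confirming the bias/scale folding is an exact rewrite of the normalization, not an approximation.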
161  tools/yolov5_preprocess.py  Normal file
@@ -0,0 +1,161 @@
|
||||
# coding: utf-8
|
||||
import torch
|
||||
import cv2
|
||||
import numpy as np
|
||||
import math
|
||||
import time
|
||||
import kneron_preprocessing
|
||||
|
||||
kneron_preprocessing.API.set_default_as_520()
|
||||
torch.backends.cudnn.deterministic = True
|
||||
img_formats = ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.tiff', '.dng']
|
||||
def make_divisible(x, divisor):
|
||||
# Returns x evenly divisble by divisor
|
||||
return math.ceil(x / divisor) * divisor
|
||||
|
||||
def check_img_size(img_size, s=32):
|
||||
# Verify img_size is a multiple of stride s
|
||||
new_size = make_divisible(img_size, int(s)) # ceil gs-multiple
|
||||
if new_size != img_size:
|
||||
print('WARNING: --img-size %g must be multiple of max stride %g, updating to %g' % (img_size, s, new_size))
|
||||
return new_size
|
||||
|
||||
def letterbox_ori(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True):
|
||||
# Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
|
||||
shape = img.shape[:2] # current shape [height, width]
|
||||
if isinstance(new_shape, int):
|
||||
new_shape = (new_shape, new_shape)
|
||||
|
||||
# Scale ratio (new / old)
|
||||
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
|
||||
if not scaleup: # only scale down, do not scale up (for better test mAP)
|
||||
r = min(r, 1.0)
|
||||
|
||||
# Compute padding
|
||||
ratio = r, r # width, height ratios
|
||||
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) # width, height
|
||||
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
|
||||
|
||||
dw /= 2 # divide padding into 2 sides
|
||||
dh /= 2
|
||||
|
||||
if shape[::-1] != new_unpad: # resize
|
||||
img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
|
||||
#img = kneron_preprocessing.API.resize(img,size=new_unpad, keep_ratio = False)
|
||||
|
||||
top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
|
||||
left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
|
||||
# top, bottom = int(0), int(round(dh + 0.1))
|
||||
# left, right = int(0), int(round(dw + 0.1))
|
||||
img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
|
||||
#img = kneron_preprocessing.API.pad(img, left, right, top, bottom, 0)
|
||||
|
||||
return img, ratio, (dw, dh)
|
||||
|
||||
def letterbox(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True):
|
||||
# Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
|
||||
shape = img.shape[:2] # current shape [height, width]
|
||||
if isinstance(new_shape, int):
|
||||
new_shape = (new_shape, new_shape)
|
||||
|
||||
# Scale ratio (new / old)
|
||||
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
|
||||
if not scaleup: # only scale down, do not scale up (for better test mAP)
|
||||
r = min(r, 1.0)
|
||||
|
||||
# Compute padding
|
||||
ratio = r, r # width, height ratios
|
||||
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) # width, height
|
||||
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
|
||||
|
||||
# dw /= 2 # divide padding into 2 sides
|
||||
# dh /= 2
|
||||
|
||||
if shape[::-1] != new_unpad: # resize
|
||||
#img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
|
||||
img = kneron_preprocessing.API.resize(img,size=new_unpad, keep_ratio = False)
|
||||
|
||||
# top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
|
||||
# left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
|
||||
top, bottom = int(0), int(round(dh + 0.1))
|
||||
left, right = int(0), int(round(dw + 0.1))
|
||||
#img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
|
||||
img = kneron_preprocessing.API.pad(img, left, right, top, bottom, 0)
|
||||
|
||||
return img, ratio, (dw, dh)
|
||||
|
||||
def letterbox_test(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True):
    # Fixed-size resize for testing: no aspect-ratio padding, so ratio and offsets are identity
    ratio = 1.0, 1.0
    dw, dh = 0, 0
    img = kneron_preprocessing.API.resize(img, size=(480, 256), keep_ratio=False, type='bilinear')
    return img, ratio, (dw, dh)

def LoadImages(path, img_size):  # _rgb  # for inference
    if isinstance(path, str):
        img0 = cv2.imread(path)  # BGR
    else:
        img0 = path  # BGR

    # Padded resize
    img = letterbox(img0, new_shape=img_size)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    return img, img0

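The `img[:, :, ::-1].transpose(2, 0, 1)` idiom in `LoadImages` does two things at once: it reverses the channel axis (BGR to RGB) and moves channels first (HWC to CHW), which is the layout PyTorch models expect. A tiny NumPy check:

```python
import numpy as np

# 2x2 BGR image with distinct per-channel values: B=0, G=1, R=2
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 1] = 1
bgr[..., 2] = 2

# BGR -> RGB (reverse channel axis), then HWC -> CHW
chw = np.ascontiguousarray(bgr[:, :, ::-1].transpose(2, 0, 1))

# channel 0 is now R (all 2s), channel 2 is B (all 0s)
```

`np.ascontiguousarray` matters because the reversed/transposed view is not C-contiguous, and `torch.from_numpy` downstream requires a contiguous buffer.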
def LoadImages_yyy(path, img_size):  # _yyy  # for inference
    if isinstance(path, str):
        img0 = cv2.imread(path)  # BGR
    else:
        img0 = path  # BGR

    yvu = cv2.cvtColor(img0, cv2.COLOR_BGR2YCrCb)
    y, v, u = cv2.split(yvu)  # Y, Cr, Cb
    img0 = np.stack((y,) * 3, axis=-1)  # replicate luma into 3 identical channels

    # Padded resize
    img = letterbox(img0, new_shape=img_size)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    return img, img0

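The `np.stack((y,) * 3, axis=-1)` call in `LoadImages_yyy` turns the single luma plane into a 3-channel image whose channels are all identical, so a model trained on 3-channel input can consume a grayscale (Y-only) signal. A small check:

```python
import numpy as np

y = np.array([[10, 20], [30, 40]], dtype=np.uint8)  # luma plane (H, W)
img3 = np.stack((y,) * 3, axis=-1)  # (H, W, 3), every channel equal to y
```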
def LoadImages_yuv420(path, img_size):  # _yuv420  # for inference
    if isinstance(path, str):
        img0 = cv2.imread(path)  # BGR
    else:
        img0 = path  # BGR
    img_h, img_w = img0.shape[:2]
    img_h = (img_h // 2) * 2  # I420 needs even height/width
    img_w = (img_w // 2) * 2
    img = img0[:img_h, :img_w, :]
    yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV_I420)
    img0 = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)  # round-trip through YUV420

    # Padded resize
    img = letterbox(img0, new_shape=img_size)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    return img, img0

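The `(h // 2) * 2` cropping above floors each dimension to an even value, which the I420 format requires because chroma is subsampled 2x2 (one U/V sample per 2x2 block of Y). As a standalone helper (`crop_even` is a name introduced here for illustration):

```python
def crop_even(h, w):
    """Floor (h, w) to even values, as YUV420 (I420) chroma subsampling requires."""
    return (h // 2) * 2, (w // 2) * 2

# odd dimensions lose at most one row/column; even ones pass through unchanged
```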
def Yolov5_preprocess(image_path, device, imgsz_h, imgsz_w):
    model_stride_max = 32
    imgsz_h = check_img_size(imgsz_h, s=model_stride_max)  # check img_size
    imgsz_w = check_img_size(imgsz_w, s=model_stride_max)  # check img_size
    img, im0 = LoadImages(image_path, img_size=(imgsz_h, imgsz_w))
    img = kneron_preprocessing.API.norm(img)  # path1
    # print('img', img.shape)
    img = torch.from_numpy(img).to(device)  # path1, path2
    # img = img.float()  # uint8 to fp16/32  # path2
    # img /= 255.0  # or: img / 256.0 - 0.5, mapping 0-255 to -0.5-0.5  # path2

    if img.ndimension() == 3:
        img = img.unsqueeze(0)  # add batch dimension: CHW -> 1xCHW

    return img, im0

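`check_img_size` is defined elsewhere in this repo and not shown in this diff. As a hypothetical stand-in, the usual behavior is to round the requested size up to the nearest multiple of the model's maximum stride, so every feature-map downsampling divides evenly (the real implementation may differ):

```python
import math

def check_img_size_sketch(x, s=32):
    """Hypothetical stand-in: round size x up to the nearest multiple of stride s."""
    return int(math.ceil(x / s) * s)

# 480 is already a multiple of 32; 500 rounds up to 512
```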
使用手冊.txt (new file, 57 lines)
@ -0,0 +1,57 @@
Environment setup:
# Create and activate the conda environment
conda create -n stdc_golface python=3.8 -y
conda activate stdc_golface

# Install PyTorch with the matching CUDA 11.3 build
conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch -y

# Install the matching mmcv-full version
pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html

# Install the kneronstdc project
cd kneronstdc
pip install -e .

# Install common utility packages
pip install opencv-python tqdm matplotlib cityscapesscripts

# Install the yapf formatter (pinned version)
pip install yapf==0.31.0
--------------------------------------------------------------------------------------
data:
When exporting the dataset from Roboflow, choose the format:

Semantic Segmentation Masks

Use the seg2city.py script to convert the Roboflow format to the Cityscapes format.

The Cityscapes sample data can serve as a reference.

Place the converted data in the data/cityscapes folder.

(cityscapes is the default dataset name for training.)
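seg2city.py itself is not shown in this diff. As a hedged sketch only, converting a Roboflow-style RGB semantic mask to a Cityscapes single-channel trainId map typically comes down to a palette lookup like the following (PALETTE and the two class colors are illustrative assumptions, not the script's actual mapping):

```python
import numpy as np

# hypothetical 2-class palette: background = black, target class = white
PALETTE = {(0, 0, 0): 0, (255, 255, 255): 1}

def mask_to_trainid(rgb_mask):
    """Map an RGB semantic mask (H, W, 3) to a single-channel trainId map (H, W)."""
    label = np.full(rgb_mask.shape[:2], 255, dtype=np.uint8)  # 255 = ignore label
    for color, train_id in PALETTE.items():
        match = np.all(rgb_mask == np.array(color, dtype=np.uint8), axis=-1)
        label[match] = train_id
    return label

# one white pixel on a black background
mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0] = (255, 255, 255)
```

Any pixel whose color is not in the palette keeps the ignore label (255), which Cityscapes-style training losses skip.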
--------------------------------------------------------------------------------------
Training:
Activate the newly created env, open a terminal, and cd into kneronstdc.
Train command:
python tools/train.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py

Test command:
python tools/test.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth --show-dir work_dirs/vis_results
------------------------------------------------------------------------------------
Mount the project folder into the toolchain container:
docker run --rm -it -v $(wslpath -u 'C:\Users\rd_de\kneronstdc'):/workspace/kneronstdc kneron/toolchain:latest

Convert to ONNX:
python tools/pytorch2onnx_kneron.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py --checkpoint work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth --output-file work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.onnx --verify

Copy the generated .nef file out of the container to the host:
docker cp f78594411e1b:/data1/kneron_flow/models_630.nef "C:\Users\rd_de\kneronstdc\work_dirs\nef\models_630.nef"
---------------------------------------------------------------------------------------
pip install opencv-python
RUN apt update && apt install -y libgl1