diff --git a/.gitignore b/.gitignore
index 2c1ffb5..9a37a5b 100644
--- a/.gitignore
+++ b/.gitignore
@@ -117,3 +117,20 @@ mmseg/.mim
 # Pytorch
 *.pth
+
+# ONNX / NEF compiled outputs
+*.onnx
+*.nef
+batch_compile_out/
+conbinenef/
+
+# Local data directories
+data4/
+data50/
+data512/
+data724362/
+testdata/
+
+# Misc
+envs.txt
+.claude/
diff --git a/README.md b/README.md
index 8b59bda..808ccb9 100644
--- a/README.md
+++ b/README.md
@@ -1,70 +1,62 @@
-# Kneron AI Training/Deployment Platform (mmsegmentation-based)
+# STDC GolfAce — Semantic Segmentation on Kneron
+## Quick Start
-## Introduction
+### Environment Setup
- [kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation) is a platform built upon the well-known [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) for mmsegmentation. If you are looking for original mmsegmentation document, please visit [mmsegmentation docs](https://mmsegmentation.readthedocs.io/en/latest/) for detailed mmsegmentation usage.
+```bash
+# Create and activate the conda environment
+conda create -n stdc_golface python=3.8 -y
+conda activate stdc_golface
- In this repository, we provide an end-to-end training/deployment flow to realize on Kneron's AI accelerators:
+# Install PyTorch with CUDA 11.3
+conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch -y
- 1. **Training/Evalulation:**
-   - Modified model configuration file and verified for Kneron hardware platform
-   - Please see [Overview of Benchmark and Model Zoo](#Overview-of-Benchmark-and-Model-Zoo) for Kneron-Verified model list
- 2. **Converting to ONNX:**
-   - tools/pytorch2onnx_kneron.py (beta)
-   - Export *optimized* and *Kneron-toolchain supported* onnx
-   - Automatically modify model for arbitrary data normalization preprocess
+# Install mmcv-full
+pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html
- 3. **Evaluation**
-   - tools/test_kneron.py (beta)
-   - Evaluate the model with *pytorch checkpoint, onnx, and kneron-nef*
- 4. **Testing**
-   - inference_kn (beta)
-   - Verify the converted [NEF](http://doc.kneron.com/docs/#toolchain/manual/#5-nef-workflow) model on Kneron USB accelerator with this API
- 5. **Converting Kneron-NEF:** (toolchain feature)
-   - Convert the trained pytorch model to [Kneron-NEF](http://doc.kneron.com/docs/#toolchain/manual/#5-nef-workflow) model, which could be used on Kneron hardware platform.
+# Install this project in editable mode
+pip install -e .
-## License
-This project is released under the [Apache 2.0 license](LICENSE).
+# Install utility packages
+pip install opencv-python tqdm matplotlib cityscapesscripts yapf==0.31.0
+```
-## Changelog
+### Data Preparation
-N/A
+1. Export the dataset from **Roboflow**, choosing the `Semantic Segmentation Masks` format
+2. Convert the Roboflow export to the Cityscapes layout with `seg2city.py`
+3. Place the converted data under `data/cityscapes/`
-## Overview of Benchmark and Kneron Model Zoo
+### Training and Testing
-| Backbone | Crop Size | Mem (GB) | mIoU | Config | Download |
-|:--------:|:---------:|:--------:|:----:|:------:|:--------:|
-| STDC 1 | 512x1024 | 7.15 | 69.29|[config](https://github.com/kneron/kneron-mmsegmentation/tree/master/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py)|[model](https://github.com/kneron/Model_Zoo/blob/main/mmsegmentation/stdc_1/latest.zip)
+```bash
+# Train
+python tools/train.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py
-NOTE: The performance may slightly differ from the original implementation since the input size is smaller.
+# Test (writes visualized results)
+python tools/test.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py \
+    work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth \
+    --show-dir work_dirs/vis_results
+```
-## Installation
-- Please refer to the Step 1 of [docs_kneron/stdc_step_by_step.md#step-1-environment](docs_kneron/stdc_step_by_step.md) for installation.
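The data-preparation step above is essentially a file relayout. Below is a minimal sketch of what a `seg2city.py`-style conversion does, assuming Roboflow exports `<name>.jpg` plus `<name>_mask.png` pairs; the helper name, filename patterns, and suffixes here are illustrative assumptions, not the actual script:

```python
import shutil
from pathlib import Path

def seg2city(src_dir: str, dst_root: str, split: str = "train") -> None:
    """Relayout a Roboflow 'Semantic Segmentation Masks' export into the
    Cityscapes-style directory structure these configs expect.
    Illustrative only: the real seg2city.py may use different suffixes."""
    src, dst = Path(src_dir), Path(dst_root)
    img_out = dst / "leftImg8bit" / split
    ann_out = dst / "gtFine" / split
    img_out.mkdir(parents=True, exist_ok=True)
    ann_out.mkdir(parents=True, exist_ok=True)
    for mask in sorted(src.glob("*_mask.png")):
        stem = mask.name[: -len("_mask.png")]
        image = src / f"{stem}.jpg"  # assumed Roboflow image/mask pairing
        # Copy image and mask under Cityscapes-style names.
        shutil.copy(image, img_out / f"{stem}_leftImg8bit{image.suffix}")
        shutil.copy(mask, ann_out / f"{stem}_gtFine_labelTrainIds.png")
```

The key point is only the target layout: images under `leftImg8bit/<split>/` and label maps under `gtFine/<split>/`, mirroring each other's basenames.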
-- Please refer to [Kneron PLUS - Python: Installation](http://doc.kneron.com/docs/#plus_python/introduction/install_dependency/) for the environment setup for Kneron USB accelerator.
+### ONNX / NEF Conversion (Kneron Toolchain)
-## Getting Started
-### Tutorial - Kneron Edition
-- [STDC-Seg: Step-By-Step](docs_kneron/stdc_step_by_step.md): A tutorial for users to get started easily. To see detailed documents, please see below.
+```bash
+# Launch the toolchain Docker container (WSL environment)
+docker run --rm -it \
+    -v $(wslpath -u 'C:\Users\rd_de\stdc_git'):/workspace/stdc_git \
+    kneron/toolchain:latest
-### Documents - Kneron Edition
-- [Kneron ONNX Export] (under development)
-- [Kneron Inference] (under development)
-- [Kneron Toolchain Step-By-Step (YOLOv3)](http://doc.kneron.com/docs/#toolchain/yolo_example/)
-- [Kneron Toolchain Manual](http://doc.kneron.com/docs/#toolchain/manual/#0-overview)
+# Convert to ONNX
+python tools/pytorch2onnx_kneron.py \
+    configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py \
+    --checkpoint work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth \
+    --output-file work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.onnx \
+    --verify
-### Original mmsegmentation Documents
-- [Original mmsegmentation getting started](https://github.com/open-mmlab/mmsegmentation#getting-started): It is recommended to read the original mmsegmentation getting started documents for other mmsegmentation operations.
-- [Original mmsegmentation readthedoc](https://mmsegmentation.readthedocs.io/en/latest/): Original mmsegmentation documents.
+# Copy the NEF back to the host (replace <container_id> with the toolchain container's ID)
+docker cp <container_id>:/data1/kneron_flow/models_630.nef \
+    "C:\Users\rd_de\stdc_git\work_dirs\nef\models_630.nef"
+```
-## Contributing
-[kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation) a platform built upon [OpenMMLab-mmsegmentation](https://github.com/open-mmlab/mmsegmentation)
-
-- For issues regarding to the original [mmsegmentation](https://github.com/open-mmlab/mmsegmentation):
-We appreciate all contributions to improve [OpenMMLab-mmsegmentation](https://github.com/open-mmlab/mmsegmentation). Ongoing projects can be found in out [GitHub Projects](https://github.com/open-mmlab/mmsegmentation/projects). Welcome community users to participate in these projects. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
-
-- For issues regarding to this repository [kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation): Welcome to leave the comment or submit pull requests here to improve kneron-mmsegmentation
-
-
-## Related Projects
-- [kneron-mmdetection](https://github.com/kneron/kneron-mmdetection): Kneron training/deployment platform on [OpenMMLab - mmdetection](https://github.com/open-mmlab/mmdetection) object detection toolbox
diff --git a/configs/_base_/datasets/kn_cityscapes.py b/configs/_base_/datasets/kn_cityscapes.py
index e15ad34..996db1b 100644
--- a/configs/_base_/datasets/kn_cityscapes.py
+++ b/configs/_base_/datasets/kn_cityscapes.py
@@ -1,5 +1,6 @@
 # dataset settings
-dataset_type = 'CityscapesDataset'
+#dataset_type = 'CityscapesDataset'
+dataset_type = 'GolfDataset'
 data_root = 'data/cityscapes/'
 img_norm_cfg = dict(
     mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
diff --git a/configs/_base_/datasets/kn_cityscapes1.py b/configs/_base_/datasets/kn_cityscapes1.py
new file mode 100644
index 0000000..dd64a37
--- /dev/null
+++ b/configs/_base_/datasets/kn_cityscapes1.py
@@ -0,0 +1,70 @@
+# dataset settings
+dataset_type = 'GolfDataset'
+data_root = 'data/cityscapes0/'  # dataset root directory
+
+img_norm_cfg = dict(
+    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
+crop_size = (512, 1024)
+
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations'),
+    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
+    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
+    dict(type='RandomFlip', prob=0.5),
+    dict(type='PhotoMetricDistortion'),
+    dict(type='Normalize', **img_norm_cfg),
+    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
+    dict(type='DefaultFormatBundle'),
+    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
+]
+
+test_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(
+        type='MultiScaleFlipAug',
+        img_scale=(724, 362),
+        flip=False,
+        transforms=[
+            dict(type='Resize', keep_ratio=True),
+            dict(type='RandomFlip'),
+            dict(type='Normalize', **img_norm_cfg),
+            dict(type='ImageToTensor', keys=['img']),
+            dict(type='Collect', keys=['img']),
+        ])
+]
+
+data = dict(
+    samples_per_gpu=2,
+    workers_per_gpu=2,
+    train=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='leftImg8bit/train',
+        ann_dir='gtFine/train',
+        pipeline=train_pipeline
+    ),
+    val=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='leftImg8bit/val',
+        ann_dir='gtFine/val',
+        pipeline=test_pipeline
+    ),
+    test=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='leftImg8bit/test',
+        ann_dir='gtFine/test',
+        pipeline=test_pipeline
+    )
+)
+
+# Classes and matching palette (not passed to the dataset; used for plotting / inference visualization)
+classes = ('car', 'grass', 'people', 'road')
+palette = [
+    [246, 14, 135],   # car
+    [233, 81, 78],    # grass
+    [220, 148, 21],   # people
+    [207, 215, 220],  # road
+]
diff --git a/configs/_base_/datasets/kn_cityscapes2.py b/configs/_base_/datasets/kn_cityscapes2.py
new file mode 100644
index 0000000..d735e8c
--- /dev/null
+++ b/configs/_base_/datasets/kn_cityscapes2.py
@@ -0,0 +1,71 @@
+# dataset settings
+dataset_type = 'GolfDataset'
+data_root = 'data/cityscapes0/'  # dataset root directory
+
+img_norm_cfg = dict(
+    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
+crop_size = (360, 720)
+
+train_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(type='LoadAnnotations'),
+    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
+    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
+    dict(type='RandomFlip', prob=0.5),
+    dict(type='PhotoMetricDistortion'),
+    dict(type='Normalize', **img_norm_cfg),
+    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
+    dict(type='DefaultFormatBundle'),
+    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
+]
+
+test_pipeline = [
+    dict(type='LoadImageFromFile'),
+    dict(
+        type='MultiScaleFlipAug',
+        img_scale=(724, 362),
+        flip=False,
+        transforms=[
+            dict(type='Resize', keep_ratio=True),
+            dict(type='RandomFlip'),
+            dict(type='Normalize', **img_norm_cfg),
+            dict(type='ImageToTensor', keys=['img']),
+            dict(type='Collect', keys=['img']),
+        ])
+]
+
+data = dict(
+    samples_per_gpu=2,
+    workers_per_gpu=2,
+    train=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='leftImg8bit/train',
+        ann_dir='gtFine/train',
+        pipeline=train_pipeline
+    ),
+    val=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='leftImg8bit/val',
+        ann_dir='gtFine/val',
+        pipeline=test_pipeline
+    ),
+    test=dict(
+        type=dataset_type,
+        data_root=data_root,
+        img_dir='leftImg8bit/test',
+        ann_dir='gtFine/test',
+        pipeline=test_pipeline
+    )
+)
+
+# Classes and matching palette (not passed to the dataset; used for plotting / inference visualization)
+classes = ('car', 'grass', 'people', 'road')
+palette = [
+    [246, 14, 135],   # car
+    [233, 81, 78],    # grass
+    [220, 148, 21],   # people
+    [207, 215, 220],  # road
+]
+
diff --git a/configs/_base_/schedules/schedule_2k.py b/configs/_base_/schedules/schedule_2k.py
new file mode 100644
index 0000000..b7f5d10
--- /dev/null
+++ b/configs/_base_/schedules/schedule_2k.py
@@ -0,0 +1,22 @@
+# optimizer
+optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
+
+# optimizer config
+optimizer_config = dict() + +# learning policy +lr_config = dict( + policy='poly', + power=0.9, + min_lr=1e-4, + by_epoch=False +) + +# runtime settings +runner = dict(type='IterBasedRunner', max_iters=2000) + +# checkpoint 每 2000 次儲存一次(最後一次) +checkpoint_config = dict(by_epoch=False, interval=2000) + +# 評估設定,每 2000 次執行一次 mIoU 評估 +evaluation = dict(interval=2000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/kn_stdc1_golf4class.py b/configs/stdc/kn_stdc1_golf4class.py new file mode 100644 index 0000000..65220fb --- /dev/null +++ b/configs/stdc/kn_stdc1_golf4class.py @@ -0,0 +1,193 @@ +# Copyright (c) OpenMMLab. All rights reserved. + +# ---------------- 模型設定 ---------------- # +norm_cfg = dict(type='BN', requires_grad=True) +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False, + init_cfg=dict( + type='Pretrained', + checkpoint='https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + ) + ), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4) + ), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=4, # ✅ 四類 + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0) + ), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=4, # ✅ + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + 
loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0) + ), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=4, # ✅ + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0) + ), + dict( + type='STDCHead', + in_channels=256, + channels=64, + num_convs=1, + num_classes=4, # ✅ 最重要 + boundary_threshold=0.1, + in_index=0, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=True, + loss_decode=[ + dict( + type='CrossEntropyLoss', + loss_name='loss_ce', + use_sigmoid=True, + loss_weight=1.0), + dict( + type='DiceLoss', + loss_name='loss_dice', + loss_weight=1.0) + ] + ) + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +# ---------------- 資料集設定 ---------------- # +dataset_type = 'GolfDataset' +data_root = 'data/cityscapes/' +img_norm_cfg = dict( + mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (512, 1024) + +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']), +] + +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(1024, 512), + flip=False, + transforms=[ + dict(type='Resize', keep_ratio=True), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']), + ]) +] + +data = dict( + samples_per_gpu=2, + 
workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + pipeline=train_pipeline + ), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline + ), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline + ) +) + +# ---------------- 額外設定 ---------------- # +log_config = dict( + interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)]) +checkpoint_config = dict(by_epoch=False, interval=1000) +evaluation = dict(interval=2000, metric='mIoU', pre_eval=True) +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict( + policy='poly', + power=0.9, + min_lr=0.0001, + by_epoch=False, + warmup='linear', + warmup_iters=1000) +runner = dict(type='IterBasedRunner', max_iters=20000) +cudnn_benchmark = True +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] +work_dir = './work_dirs/kn_stdc1_golf4class' +gpu_ids = [0] + +# ✅ 可選:僅供視覺化或 post-processing 用,不會傳給 dataset +classes = ('car', 'grass', 'people', 'road') +palette = [ + [246, 14, 135], # car + [233, 81, 78], # grass + [220, 148, 21], # people + [207, 215, 220], # road +] diff --git a/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py b/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py index 9d12e27..e71b87a 100644 --- a/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py +++ b/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py @@ -1,14 +1,17 @@ checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' # noqa _base_ = [ - '../_base_/models/stdc.py', '../_base_/datasets/kn_cityscapes.py', - '../_base_/default_runtime.py', '../_base_/schedules/schedule_80k.py' + 
'../_base_/models/stdc.py', '../_base_/datasets/kn_cityscapes2.py', + '../_base_/default_runtime.py', '../_base_/schedules/schedule_2k.py' ] lr_config = dict(warmup='linear', warmup_iters=1000) data = dict( - samples_per_gpu=12, - workers_per_gpu=4, + samples_per_gpu=2, + workers_per_gpu=2, ) model = dict( backbone=dict( backbone_cfg=dict( init_cfg=dict(type='Pretrained', checkpoint=checkpoint)))) + + + diff --git a/configs/stdc/meconfig.py b/configs/stdc/meconfig.py new file mode 100644 index 0000000..9db2d37 --- /dev/null +++ b/configs/stdc/meconfig.py @@ -0,0 +1,137 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=4, + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=4, + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=4, + 
in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +dataset_type = 'GolfDataset' +data_root = 'data/cityscapes0/' +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', keep_ratio=True), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)]) +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] 
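The `img_norm_cfg` shared by all of these configs (`mean=[128, 128, 128]`, `std=[256, 256, 256]`) maps 8-bit pixel values into roughly [-0.5, 0.5); keeping inputs in a small symmetric range like this is a common choice for fixed-point deployment toolchains, though the exact motivation here is an assumption on my part. A quick check of the arithmetic:

```python
import numpy as np

# Normalize 8-bit pixels the way the Normalize transform does with
# mean=[128, 128, 128] and std=[256, 256, 256].
pixels = np.array([0.0, 128.0, 255.0])
normalized = (pixels - 128.0) / 256.0
# Values: -0.5, 0.0, 0.49609375
print(normalized)
```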
+cudnn_benchmark = True + +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=80000) +checkpoint_config = dict(by_epoch=False, interval=2000) +evaluation = dict(interval=2000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/meconfig1.py b/configs/stdc/meconfig1.py new file mode 100644 index 0000000..100e51a --- /dev/null +++ b/configs/stdc/meconfig1.py @@ -0,0 +1,146 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=1, # ✅ 只分類 grass + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=1, + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=1, + in_index=1, + 
norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +# ✅ 更新為你新的 dataset 類別 +dataset_type = 'GrassOnlyDataset' +data_root = 'data/cityscapes/' + +# ✅ 加入 classes 與 palette 定義 +classes = ('grass',) +palette = [[0, 128, 0]] + +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) + +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] + +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] + +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)]) 
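All of these configs use MMCV's `poly` LR policy (`power=0.9`, `min_lr=1e-4`), which decays the learning rate as a power of the remaining training progress. A standalone sketch of the resulting curve, with the formula reproduced from my understanding of MMCV's `PolyLrUpdaterHook` (verify against your mmcv version):

```python
def poly_lr(it: int, max_iters: int = 80000, base_lr: float = 0.01,
            min_lr: float = 1e-4, power: float = 0.9) -> float:
    """Polynomial decay from base_lr down to min_lr over max_iters."""
    coeff = (1 - it / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr

print(poly_lr(0))      # base_lr at the first iteration
print(poly_lr(40000))  # partway decayed
print(poly_lr(80000))  # min_lr at the last iteration
```

With `warmup='linear', warmup_iters=1000` (as in the 2k-iteration schedule's consumers), MMCV additionally ramps the LR up over the first 1000 iterations before this curve takes over.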
+dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] +cudnn_benchmark = True + +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=80000) +checkpoint_config = dict(by_epoch=False, interval=2000) +evaluation = dict(interval=2000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/meconfig2.py b/configs/stdc/meconfig2.py new file mode 100644 index 0000000..ab6e970 --- /dev/null +++ b/configs/stdc/meconfig2.py @@ -0,0 +1,149 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=2, # ✅ grass + road + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=2, + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, 
loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=2, + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +# ✅ 使用 Golf2Dataset (草地與道路) +dataset_type = 'Golf2Dataset' +data_root = 'data/cityscapes/' + +# ✅ 類別與對應顏色 +classes = ('grass', 'road') +palette = [ + [0, 255, 0], # grass + [255, 165, 0], # road +] + +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) + +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] + +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] + +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + 
img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)]) +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] +cudnn_benchmark = True + +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=80000) +checkpoint_config = dict(by_epoch=False, interval=2000) +evaluation = dict(interval=2000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/meconfig4.py b/configs/stdc/meconfig4.py new file mode 100644 index 0000000..1b15ce0 --- /dev/null +++ b/configs/stdc/meconfig4.py @@ -0,0 +1,151 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=4, # ✅ 改為 4 類 + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=4, # ✅ 改為 4 類 + in_index=2, + norm_cfg=norm_cfg, + 
concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=4, # ✅ 改為 4 類 + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +# ✅ 新 dataset 類別 +dataset_type = 'Golf4Dataset' +data_root = 'data/cityscapes/' + +# ✅ 類別與配色 +classes = ('car', 'grass', 'people', 'road') +palette = [ + [0, 0, 128], # car + [0, 255, 0], # grass + [255, 0, 0], # people + [255, 165, 0], # road +] + +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) + +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] + +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] + +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + 
pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)]) +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] +cudnn_benchmark = True + +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=80000) +checkpoint_config = dict(by_epoch=False, interval=2000) +evaluation = dict(interval=2000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/meconfig7.py b/configs/stdc/meconfig7.py new file mode 100644 index 0000000..8d22ef5 --- /dev/null +++ b/configs/stdc/meconfig7.py @@ -0,0 +1,137 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=7, + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + 
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=7, + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=7, + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +dataset_type = 'Golf8Dataset' +data_root = 'data/cityscapes/' +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + 
pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)]) +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] +cudnn_benchmark = True + +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=320000) +checkpoint_config = dict(by_epoch=False, interval=32000) +evaluation = dict(interval=32000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/meconfig8.py b/configs/stdc/meconfig8.py new file mode 100644 index 0000000..09866bc --- /dev/null +++ b/configs/stdc/meconfig8.py @@ -0,0 +1,137 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth' + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=8, # ✅ changed to 8 classes + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + 
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=8, # ✅ changed to 8 classes + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=8, # ✅ changed to 8 classes + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +dataset_type = 'Golf8Dataset' # ✅ use Golf8Dataset +data_root = 'data/cityscapes/' +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + 
img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)]) +dist_params = dict(backend='nccl') +log_level = 'INFO' +load_from = None +resume_from = None +workflow = [('train', 1)] +cudnn_benchmark = True + +optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=320000) +checkpoint_config = dict(by_epoch=False, interval=32000) +evaluation = dict(interval=32000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/meconfig8_finetune.py b/configs/stdc/meconfig8_finetune.py new file mode 100644 index 0000000..e7d7384 --- /dev/null +++ b/configs/stdc/meconfig8_finetune.py @@ -0,0 +1,147 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=8, # ✅ 8 classes + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', 
use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=8, # ✅ 8 classes + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=8, # ✅ 8 classes + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +dataset_type = 'Golf8Dataset' +data_root = 'data/cityscapes/' +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) + +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] + +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] + +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + ann_dir='gtFine/train', + 
pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)] +) + +dist_params = dict(backend='nccl') +log_level = 'INFO' + +# ✅ settings for fine-tuning +load_from = 'C:/Users/rd_de/kneronstdc/work_dirs/meconfig8/0619/latest.pth' +resume_from = None + + +workflow = [('train', 1)] +cudnn_benchmark = True + +# ✅ recommended learning rate for fine-tuning +optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() + +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=160000) + +checkpoint_config = dict(by_epoch=False, interval=16000) +evaluation = dict(interval=16000, metric='mIoU', pre_eval=True) diff --git a/configs/stdc/test.py b/configs/stdc/test.py new file mode 100644 index 0000000..3fbcbb9 --- /dev/null +++ b/configs/stdc/test.py @@ -0,0 +1,147 @@ +norm_cfg = dict(type='BN', requires_grad=True) + +model = dict( + type='EncoderDecoder', + pretrained=None, + backbone=dict( + type='STDCContextPathNet', + backbone_cfg=dict( + type='STDCNet', + stdc_type='STDCNet1', + in_channels=3, + channels=(32, 64, 256, 512, 1024), + bottleneck_type='cat', + num_convs=4, + norm_cfg=norm_cfg, + act_cfg=dict(type='ReLU'), + with_final_conv=False), + last_in_channels=(1024, 512), + out_channels=128, + ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)), + decode_head=dict( + type='FCNHead', + in_channels=256, + channels=256, + num_convs=1, + num_classes=8, # ✅ 8 classes + in_index=3, + concat_input=False, + dropout_ratio=0.1, + norm_cfg=norm_cfg, + align_corners=True, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + 
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + auxiliary_head=[ + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=8, # ✅ 8 classes + in_index=2, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + dict( + type='FCNHead', + in_channels=128, + channels=64, + num_convs=1, + num_classes=8, # ✅ 8 classes + in_index=1, + norm_cfg=norm_cfg, + concat_input=False, + align_corners=False, + sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000), + loss_decode=dict( + type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)), + ], + train_cfg=dict(), + test_cfg=dict(mode='whole') +) + +dataset_type = 'Golf8Dataset' +data_root = 'data/cityscapes/' +img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True) +crop_size = (360, 720) + +train_pipeline = [ + dict(type='LoadImageFromFile'), + dict(type='LoadAnnotations'), + dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)), + dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75), + dict(type='RandomFlip', prob=0.5), + dict(type='PhotoMetricDistortion'), + dict(type='Normalize', **img_norm_cfg), + dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255), + dict(type='DefaultFormatBundle'), + dict(type='Collect', keys=['img', 'gt_semantic_seg']) +] + +test_pipeline = [ + dict(type='LoadImageFromFile'), + dict( + type='MultiScaleFlipAug', + img_scale=(724, 362), + flip=False, + transforms=[ + dict(type='Resize', img_scale=(724, 362), keep_ratio=False), + dict(type='RandomFlip'), + dict(type='Normalize', **img_norm_cfg), + dict(type='ImageToTensor', keys=['img']), + dict(type='Collect', keys=['img']) + ]) +] + +data = dict( + samples_per_gpu=2, + workers_per_gpu=2, + train=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/train', + 
ann_dir='gtFine/train', + pipeline=train_pipeline), + val=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/val', + ann_dir='gtFine/val', + pipeline=test_pipeline), + test=dict( + type=dataset_type, + data_root=data_root, + img_dir='leftImg8bit/test', + ann_dir='gtFine/test', + pipeline=test_pipeline) +) + +log_config = dict( + interval=50, + hooks=[dict(type='TextLoggerHook', by_epoch=False)] +) + +dist_params = dict(backend='nccl') +log_level = 'INFO' + +# ✅ settings for fine-tuning
+load_from = 'C:/Users/rd_de/kneronstdc/work_dirs/meconfig8/0619/latest.pth' +resume_from = None + + +workflow = [('train', 1)] +cudnn_benchmark = True + +# ✅ recommended learning rate for fine-tuning
+optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005) +optimizer_config = dict() + +lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False) +runner = dict(type='IterBasedRunner', max_iters=160) + +checkpoint_config = dict(by_epoch=False, interval=16) +evaluation = dict(interval=16, metric='mIoU', pre_eval=True) diff --git a/kneron_preprocessing/API.py b/kneron_preprocessing/API.py new file mode 100644 index 0000000..3630caa --- /dev/null +++ b/kneron_preprocessing/API.py @@ -0,0 +1,684 @@ +# -*- coding: utf-8 -*- + +import numpy as np +import os +from .funcs.utils import str2int, str2bool +from . 
import Flow + +flow = Flow() +flow.set_numerical_type('floating') +flow_520 = Flow() +flow_520.set_numerical_type('520') +flow_720 = Flow() +flow_720.set_numerical_type('720') + +DEFAULT = None +default = { + 'crop':{ + 'align_w_to_4':False + }, + 'resize':{ + 'type':'bilinear', + 'calculate_ratio_using_CSim':False + } +} + +def set_default_as_520(): + """ + Set the default parameters to the 520 settings: + + crop.align_w_to_4 = True + resize.type = 'fixed_520' + resize.calculate_ratio_using_CSim = True + """ + global default + default['crop']['align_w_to_4'] = True + default['resize']['type'] = 'fixed_520' + default['resize']['calculate_ratio_using_CSim'] = True + return + +def set_default_as_floating(): + """ + Set the default parameters to the floating settings: + + crop.align_w_to_4 = False + resize.type = 'bilinear' + resize.calculate_ratio_using_CSim = False + """ + global default + default['crop']['align_w_to_4'] = False + default['resize']['type'] = 'bilinear' + default['resize']['calculate_ratio_using_CSim'] = False + return + +def print_info_on(): + """ + Turn printing of information on. + """ + flow.set_print_info(True) + flow_520.set_print_info(True) + +def print_info_off(): + """ + Turn printing of information off. 
+ """ + flow.set_print_info(False) + flow_520.set_print_info(False) + +def load_image(image): + """ + load_image function + load load_image and output as rgb888 format np.array + + Args: + image: [np.array/str], can be np.array or image file path + + Returns: + out: [np.array], rgb888 format + + Examples: + """ + image = flow.load_image(image, is_raw = False) + return image + +def load_bin(image, fmt=None, size=None): + """ + load_bin function + load bin file and output as rgb888 format np.array + + Args: + image: [str], bin file path + fmt: [str], "rgb888" / "rgb565" / "nir" + size: [tuble], (image_w, image_h) + + Returns: + out: [np.array], rgb888 format + + Examples: + >>> image_data = kneron_preprocessing.API.load_bin(image,'rgb565',(raw_w,raw_h)) + """ + assert isinstance(size, tuple) + assert isinstance(fmt, str) + # assert (fmt.lower() in ['rgb888', "rgb565" , "nir",'RGB888', "RGB565" , "NIR", 'NIR888', 'nir888']) + + image = flow.load_image(image, is_raw = True, raw_img_type='bin', raw_img_fmt = fmt, img_in_width = size[0], img_in_height = size[1]) + flow.set_color_conversion(source_format=fmt, out_format = 'rgb888') + image,_ = flow.funcs['color'](image) + return image + +def load_hex(file, fmt=None, size=None): + """ + load_hex function + load hex file and output as rgb888 format np.array + + Args: + image: [str], hex file path + fmt: [str], "rgb888" / "yuv444" / "ycbcr444" / "yuv422" / "ycbcr422" / "rgb565" + size: [tuble], (image_w, image_h) + + Returns: + out: [np.array], rgb888 format + + Examples: + >>> image_data = kneron_preprocessing.API.load_hex(image,'rgb565',(raw_w,raw_h)) + """ + assert isinstance(size, tuple) + assert isinstance(fmt, str) + assert (fmt.lower() in ['rgb888',"yuv444" , "ycbcr444" , "yuv422" , "ycbcr422" , "rgb565"]) + + image = flow.load_image(file, is_raw = True, raw_img_type='hex', raw_img_fmt = fmt, img_in_width = size[0], img_in_height = size[1]) + flow.set_color_conversion(source_format=fmt, out_format = 'rgb888') + 
image,_ = flow.funcs['color'](image) + return image + +def dump_image(image, output=None, file_fmt='txt',image_fmt='rgb888',order=0): + """ + dump_image function + + dump to txt, bin or hex; default is txt + image format can be one of: RGB888, RGBA8888, RGB565, NIR, YUV444, YCbCr444, YUV422, YCbCr422, default is RGB888 + + Args: + image: [np.array/str], can be np.array or image file path + output: [str], dump file path + file_fmt: [str], "bin" / "txt" / "hex", set dump file format, default is txt + image_fmt: [str], RGB888 / RGBA8888 / RGB565 / NIR / YUV444 / YCbCr444 / YUV422 / YCbCr422, default is RGB888 + + Examples: + >>> kneron_preprocessing.API.dump_image(image_data,out_path,file_fmt='bin') + """ + if isinstance(image, str): + image = load_image(image) + + assert isinstance(image, np.ndarray) + if output is None: + return + + flow.set_output_setting(is_dump=False, dump_format=file_fmt, image_format=image_fmt ,output_file=output) + flow.dump_image(image) + return + +def convert(image, out_fmt = 'RGB888', source_fmt = 'RGB888'): + """ + color convert + + Args: + image: [np.array], input + out_fmt: [str], "rgb888" / "rgba8888" / "rgb565" / "yuv" / "ycbcr" / "yuv422" / "ycbcr422" + source_fmt: [str], "rgb888" / "rgba8888" / "rgb565" / "yuv" / "ycbcr" / "yuv422" / "ycbcr422" + + Returns: + out: [np.array] + + Examples: + + """ + flow.set_color_conversion(source_format = source_fmt, out_format=out_fmt, simulation=False) + image,_ = flow.funcs['color'](image) + return image + +def get_crop_range(box,align_w_to_4=DEFAULT, pad_square_to_4=False,rounding_type=0): + """ + get the exact crop box according to the given settings + + Args: + box: [tuple], (x1, y1, x2, y2) + align_w_to_4: [bool], crop length in w direction align to 4 or not, default False + pad_square_to_4: [bool], pad to square(align 4) or not, default False + rounding_type: [int], 0-> x1,y1 take floor, x2,y2 take ceil; 1->all take rounding + + Returns: + out: [tuple, 4], (crop_x1, crop_y1, crop_x2, crop_y2) + + 
Examples: + >>> image_data = kneron_preprocessing.API.get_crop_range((272,145,461,341), align_w_to_4=True, pad_square_to_4=True) + (272, 145, 460, 341) + """ + if box is None: + return (0,0,0,0) + if align_w_to_4 is None: + align_w_to_4 = default['crop']['align_w_to_4'] + + flow.set_crop(type='specific', start_x=box[0],start_y=box[1],end_x=box[2],end_y=box[3], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4,rounding_type=rounding_type) + image = np.zeros((1,1,3)).astype('uint8') + _,info = flow.funcs['crop'](image) + + return info['box'] + +def crop(image, box=None, align_w_to_4=DEFAULT, pad_square_to_4=False,rounding_type=0 ,info_out = {}): + """ + crop function + + specific crop range by box + + Args: + image: [np.array], input + box: [tuple], (x1, y1, x2, y2) + align_w_to_4: [bool], crop length in w direction align to 4 or not, default False + pad_square_to_4: [bool], pad to square(align 4) or not, default False + rounding_type: [int], 0-> x1,y1 take floor, x2,y2 take ceil; 1->all take rounding + info_out: [dic], save the final crop box into info_out['box'] + + Returns: + out: [np.array] + + Examples: + >>> info = {} + >>> image_data = kneron_preprocessing.API.crop(image_data,(272,145,461,341), align_w_to_4=True, info_out=info) + >>> info['box'] + (272, 145, 460, 341) + + >>> info = {} + >>> image_data = kneron_preprocessing.API.crop(image_data,(272,145,461,341), pad_square_to_4=True, info_out=info) + >>> info['box'] + (268, 145, 464, 341) + """ + assert isinstance(image, np.ndarray) + if box is None: + return image + if align_w_to_4 is None: + align_w_to_4 = default['crop']['align_w_to_4'] + + flow.set_crop(type='specific', start_x=box[0],start_y=box[1],end_x=box[2],end_y=box[3], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4,rounding_type=rounding_type) + image,info = flow.funcs['crop'](image) + + info_out['box'] = info['box'] + return image + +def crop_center(image, range=None, align_w_to_4=DEFAULT, pad_square_to_4=False,rounding_type=0
,info_out = {}): + """ + crop function + + center crop by range + + Args: + image: [np.array], input + range: [tuple], (crop_w, crop_h) + align_w_to_4: [bool], crop length in w direction align to 4 or not, default False + pad_square_to_4: [bool], pad to square(align 4) or not, default False + rounding_type: [int], 0-> x1,y1 take floor, x2,y2 take ceil; 1->all take rounding + info_out: [dic], save the final crop box into info_out['box'] + + Returns: + out: [np.array] + + Examples: + >>> info = {} + >>> image_data = kneron_preprocessing.API.crop_center(image_data,(102,40), align_w_to_4=True,info_out=info) + >>> info['box'] + (268, 220, 372, 260) + + >>> info = {} + >>> image_data = kneron_preprocessing.API.crop_center(image_data,(102,40), pad_square_to_4=True, info_out=info) + >>> info['box'] + (269, 192, 371, 294) + """ + assert isinstance(image, np.ndarray) + if range is None: + return image + if align_w_to_4 is None: + align_w_to_4 = default['crop']['align_w_to_4'] + + flow.set_crop(type='center', crop_w=range[0],crop_h=range[1], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4,rounding_type=rounding_type) + image,info = flow.funcs['crop'](image) + + info_out['box'] = info['box'] + return image + +def crop_corner(image, range=None, align_w_to_4=DEFAULT,pad_square_to_4=False,rounding_type=0 ,info_out = {}): + """ + crop function + + corner crop by range + + Args: + image: [np.array], input + range: [tuple], (crop_w, crop_h) + align_w_to_4: [bool], crop length in w direction align to 4 or not, default False + pad_square_to_4: [bool], pad to square(align 4) or not, default False + rounding_type: [int], 0-> x1,y1 take floor, x2,y2 take ceil; 1->all take rounding + info_out: [dic], save the final crop box into info_out['box'] + + Returns: + out: [np.array] + + Examples: + >>> info = {} + >>> image_data = kneron_preprocessing.API.crop_corner(image_data,(102,40), align_w_to_4=True,info_out=info) + >>> info['box'] + (0, 0, 104, 40) + + >>> info = {} + >>> 
image_data = kneron_preprocessing.API.crop_corner(image_data,(102,40), pad_square_to_4=True,info_out=info) + >>> info['box'] + (0, -28, 102, 74) + """ + assert isinstance(image, np.ndarray) + if range is None: + return image + if align_w_to_4 is None: + align_w_to_4 = default['crop']['align_w_to_4'] + + # pass rounding_type through, so the documented parameter is not silently ignored + flow.set_crop(type='corner', crop_w=range[0],crop_h=range[1], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4,rounding_type=rounding_type) + image, info = flow.funcs['crop'](image) + + info_out['box'] = info['box'] + return image + +def resize(image, size=None, keep_ratio = True, zoom = True, type=DEFAULT, calculate_ratio_using_CSim = DEFAULT, info_out = {}): + """ + resize function + + The resize type can be 'bilinear' or 'bilicubic' (floating types), or 'fixed' / 'fixed_520' / 'fixed_720' (fixed types). + The 'fixed_520'/'fixed_720' types add some logic to simulate 520/720 hardware bugs. + + Args: + image: [np.array], input + size: [tuple], (input_w, input_h) + keep_ratio: [bool], keep_ratio or not, default True + zoom: [bool], allow resize to zoom the image or not, default True + type: [str], "bilinear" / "bilicubic" / "cv2" / "fixed" / "fixed_520" / "fixed_720" + calculate_ratio_using_CSim: [bool], calculate the ratio and scale using Csim function and C float, default False + info_out: [dic], save the final scale size(w,h) into info_out['size'] + + Returns: + out: [np.array] + + Examples: + >>> info = {} + >>> image_data = kneron_preprocessing.API.resize(image_data,size=(56,56),type='fixed',info_out=info) + >>> info['size'] + (54,56) + """ + assert isinstance(image, np.ndarray) + if size is None: + return image + if type is None: + type = default['resize']['type'] + if calculate_ratio_using_CSim is None: + calculate_ratio_using_CSim = default['resize']['calculate_ratio_using_CSim'] + + flow.set_resize(resize_w = size[0], resize_h = size[1], type=type, keep_ratio=keep_ratio,zoom=zoom, calculate_ratio_using_CSim=calculate_ratio_using_CSim) + image, info = flow.funcs['resize'](image) + info_out['size'] = 
info['size'] + + return image + +def pad(image, pad_l=0, pad_r=0, pad_t=0, pad_b=0, pad_val=0): + """ + pad function + + specific left, right, top and bottom pad size. + + Args: + image[np.array]: input + pad_l: [int], pad size from left, default 0 + pad_r: [int], pad size from right, default 0 + pad_t: [int], pad size from top, default 0 + pad_b: [int], pad size from bottom, default 0 + pad_val: [float], the value of pad, default 0 + + Returns: + out: [np.array] + + Examples: + >>> image_data = kneron_preprocessing.API.pad(image_data,20,40,20,40,-0.5) + """ + assert isinstance(image, np.ndarray) + + flow.set_padding(type='specific',pad_l=pad_l,pad_r=pad_r,pad_t=pad_t,pad_b=pad_b,pad_val=pad_val) + image, _ = flow.funcs['padding'](image) + return image + +def pad_center(image,size=None, pad_val=0): + """ + pad function + + center pad with pad size. + + Args: + image[np.array]: input + size: [tuple], (padded_size_w, padded_size_h) + pad_val: [float], the value of pad, default 0 + + Returns: + out: [np.array] + + Examples: + >>> image_data = kneron_preprocessing.API.pad_center(image_data,size=(56,56),pad_val=-0.5) + """ + assert isinstance(image, np.ndarray) + if size is None: + return image + assert ( (image.shape[0] <= size[1]) & (image.shape[1] <= size[0]) ) + + flow.set_padding(type='center',padded_w=size[0],padded_h=size[1],pad_val=pad_val) + image, _ = flow.funcs['padding'](image) + return image + +def pad_corner(image,size=None, pad_val=0): + """ + pad function + + corner pad with pad size. 
+ + Args: + image[np.array]: input + size: [tuple], (padded_size_w, padded_size_h) + pad_val: [float], the value of pad, default 0 + + Returns: + out: [np.array] + + Examples: + >>> image_data = kneron_preprocessing.API.pad_corner(image_data,size=(56,56),pad_val=-0.5) + """ + assert isinstance(image, np.ndarray) + if size is None: + return image + assert ( (image.shape[0] <= size[1]) & (image.shape[1] <= size[0]) ) + + flow.set_padding(type='corner',padded_w=size[0],padded_h=size[1],pad_val=pad_val) + image, _ = flow.funcs['padding'](image) + return image + +def norm(image,scale=256.,bias=-0.5, mean=None, std=None): + """ + norm function + + x = (x/scale - bias) + x[0,1,2] = x - mean[0,1,2] + x[0,1,2] = x / std[0,1,2] + + Args: + image: [np.array], input + scale: [float], default = 256 + bias: [float], default = -0.5 + mean: [tuple, 3], default = None + std: [tuple, 3], default = None + + Returns: + out: [np.array] + + Examples: + >>> image_data = kneron_preprocessing.API.norm(image_data) + >>> image_data = kneron_preprocessing.API.norm(image_data,mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) + """ + assert isinstance(image, np.ndarray) + + flow.set_normalize(type='specific',scale=scale, bias=bias, mean=mean, std =std) + image, _ = flow.funcs['normalize'](image) + return image + +def inproc_520(image,raw_fmt='rgb565',raw_size=None,npu_size=None, crop_box=None, pad_mode=0, norm='kneron', gray=False, rotate=0, radix=8, bit_width=8, round_w_to_16=True, NUM_BANK_LINE=32,BANK_ENTRY_CNT=512,MAX_IMG_PREPROC_ROW_NUM=511,MAX_IMG_PREPROC_COL_NUM=256): + """ + inproc_520 + + Args: + image: [np.array], input + crop_box: [tuple], (x1, y1, x2, y2), if None will skip crop + pad_mode: [int], 0: pad 2 sides, 1: pad 1 side, 2: no pad. 
default = 0 + norm: [str], default = 'kneron' + rotate: [int], 0 / 1 / 2, default = 0 + radix: [int], default = 8 + bit_width: [int], default = 8 + round_w_to_16: [bool], default = True + gray: [bool], default = False + + Returns: + out: [np.array] + + Examples: + >>> image_data = kneron_preprocessing.API.inproc_520(image_data,npu_size=(56,56),crop_box=(272,145,460,341),pad_mode=1) + """ + # assert isinstance(image, np.ndarray) + + if (not isinstance(image, np.ndarray)): + flow_520.set_raw_img(is_raw_img='yes',raw_img_type = 'bin',raw_img_fmt=raw_fmt, img_in_width=raw_size[0], img_in_height=raw_size[1]) + else: + flow_520.set_raw_img(is_raw_img='no') + flow_520.set_color_conversion(source_format='rgb888') + + if npu_size is None: + return image + + flow_520.set_model_size(w=npu_size[0],h=npu_size[1]) + + ## Crop + if crop_box is not None: + flow_520.set_crop(start_x=crop_box[0],start_y=crop_box[1],end_x=crop_box[2],end_y=crop_box[3]) + crop_fisrt = True + else: + crop_fisrt = False + + ## Color + if gray: + flow_520.set_color_conversion(out_format='l',simulation='no') + else: + flow_520.set_color_conversion(out_format='rgb888',simulation='no') + + ## Resize & Pad + pad_mode = str2int(pad_mode) + if (pad_mode == 0): + pad_type = 'center' + resize_keep_ratio = 'yes' + elif (pad_mode == 1): + pad_type = 'corner' + resize_keep_ratio = 'yes' + else: + pad_type = 'center' + resize_keep_ratio = 'no' + + flow_520.set_resize(keep_ratio=resize_keep_ratio) + flow_520.set_padding(type=pad_type) + + ## Norm + flow_520.set_normalize(type=norm) + + ## 520 inproc + flow_520.set_520_setting(radix=radix,bit_width=bit_width,rotate=rotate,crop_fisrt=crop_fisrt,round_w_to_16=round_w_to_16,NUM_BANK_LINE=NUM_BANK_LINE,BANK_ENTRY_CNT=BANK_ENTRY_CNT,MAX_IMG_PREPROC_ROW_NUM=MAX_IMG_PREPROC_ROW_NUM,MAX_IMG_PREPROC_COL_NUM=MAX_IMG_PREPROC_COL_NUM) + image_data, _ = flow_520.run_whole_process(image) + + return image_data + +def inproc_720(image,raw_fmt='rgb565',raw_size=None,npu_size=None, 
crop_box=None, pad_mode=0, norm='kneron', gray=False): + """ + inproc_720 + + Args: + image: [np.array], input + crop_box: [tuble], (x1, y1, x2, y2), if None will skip crop + pad_mode: [int], 0: pad 2 sides, 1: pad 1 side, 2: no pad. default = 0 + norm: [str], default = 'kneron' + rotate: [int], 0 / 1 / 2 ,default = 0 + radix: [int], default = 8 + bit_width: [int], default = 8 + round_w_to_16: [bool], default = True + gray: [bool], default = False + + Returns: + out: [np.array] + + Examples: + >>> image_data = kneron_preprocessing.API.inproc_520(image_data,npu_size=(56,56),crop_box=(272,145,460,341),pad_mode=1) + """ + # assert isinstance(image, np.ndarray) + + if (not isinstance(image, np.ndarray)): + flow_720.set_raw_img(is_raw_img='yes',raw_img_type = 'bin',raw_img_fmt=raw_fmt, img_in_width=raw_size[0], img_in_height=raw_size[1]) + else: + flow_720.set_raw_img(is_raw_img='no') + flow_720.set_color_conversion(source_format='rgb888') + + if npu_size is None: + return image + + flow_720.set_model_size(w=npu_size[0],h=npu_size[1]) + + ## Crop + if crop_box != None: + flow_720.set_crop(start_x=crop_box[0],start_y=crop_box[1],end_x=crop_box[2],end_y=crop_box[3]) + crop_fisrt = True + else: + crop_fisrt = False + + ## Color + if gray: + flow_720.set_color_conversion(out_format='l',simulation='no') + else: + flow_720.set_color_conversion(out_format='rgb888',simulation='no') + + ## Resize & Pad + pad_mode = str2int(pad_mode) + if (pad_mode == 0): + pad_type = 'center' + resize_keep_ratio = 'yes' + elif (pad_mode == 1): + pad_type = 'corner' + resize_keep_ratio = 'yes' + else: + pad_type = 'center' + resize_keep_ratio = 'no' + + flow_720.set_resize(keep_ratio=resize_keep_ratio) + flow_720.set_padding(type=pad_type) + + ## 720 inproc + # 
flow_720.set_720_setting(radix=radix,bit_width=bit_width,rotate=rotate,crop_fisrt=crop_fisrt,round_w_to_16=round_w_to_16,NUM_BANK_LINE=NUM_BANK_LINE,BANK_ENTRY_CNT=BANK_ENTRY_CNT,MAX_IMG_PREPROC_ROW_NUM=MAX_IMG_PREPROC_ROW_NUM,MAX_IMG_PREPROC_COL_NUM=MAX_IMG_PREPROC_COL_NUM)
+    image_data, _ = flow_720.run_whole_process(image)
+
+    return image_data
+
+def bit_match(data1, data2):
+    """
+    bit_match function
+
+    Check whether data1 and data2 match bit-for-bit.
+
+    Args:
+        data1: [np.array / str], can be an array or a txt/bin file
+        data2: [np.array / str], can be an array or a txt/bin file
+
+    Returns:
+        out1: [bool], True if the data match
+        out2: [np.array], indices of the mismatched elements
+
+    Examples:
+        >>> result, mismatched = kneron_preprocessing.API.bit_match(data1,data2)
+    """
+    if isinstance(data1, str):
+        if os.path.splitext(data1)[1] == '.bin':
+            data1 = np.fromfile(data1, dtype='uint8')
+        elif os.path.splitext(data1)[1] == '.txt':
+            data1 = np.loadtxt(data1)
+
+    assert isinstance(data1, np.ndarray)
+
+    if isinstance(data2, str):
+        if os.path.splitext(data2)[1] == '.bin':
+            data2 = np.fromfile(data2, dtype='uint8')
+        elif os.path.splitext(data2)[1] == '.txt':
+            data2 = np.loadtxt(data2)
+
+    assert isinstance(data2, np.ndarray)
+
+    data1 = data1.reshape((-1,1))
+    data2 = data2.reshape((-1,1))
+
+    if not(len(data1) == len(data2)):
+        print('error: length mismatch')
+        return False, np.zeros((1))
+    else:
+        ans = data2 - data1
+        # compare with != so negative differences are detected as well
+        if len(np.where(ans != 0)[0]) > 0:
+            print('error',np.where(ans != 0)[0])
+            return False, np.where(ans != 0)[0]
+        else:
+            print('pass')
+            return True, np.zeros((1))
+
+def cpr_to_crp(x_start, x_end, y_start, y_end, pad_l, pad_r, pad_t, pad_b, rx_start, rx_end, ry_start, ry_end):
+    """
+    Convert the parameters of a crop -> pad -> resize flow into the equivalent HW crop -> resize -> pad flow.
+
+    Args:
+
+    Returns:
+
+    Examples:
+
+    """
+    pad_l = round(pad_l * (rx_end-rx_start) / (x_end - x_start + pad_l + pad_r))
+    pad_r = round(pad_r * (rx_end-rx_start) / (x_end - x_start + pad_l + 
pad_r)) + pad_t = round(pad_t * (ry_end-ry_start) / (y_end - y_start + pad_t + pad_b)) + pad_b = round(pad_b * (ry_end-ry_start) / (y_end - y_start + pad_t + pad_b)) + + rx_start +=pad_l + rx_end -=pad_r + ry_start +=pad_t + ry_end -=pad_b + + return x_start, x_end, y_start, y_end, pad_l, pad_r, pad_t, pad_b, rx_start, rx_end, ry_start, ry_end \ No newline at end of file diff --git a/kneron_preprocessing/Cflow.py b/kneron_preprocessing/Cflow.py new file mode 100644 index 0000000..02fffe1 --- /dev/null +++ b/kneron_preprocessing/Cflow.py @@ -0,0 +1,172 @@ +import numpy as np +import argparse +import kneron_preprocessing + +def main_(args): + image = args.input_file + filefmt = args.file_fmt + if filefmt == 'bin': + raw_format = args.raw_format + raw_w = args.input_width + raw_h = args.input_height + + image_data = kneron_preprocessing.API.load_bin(image,raw_format,(raw_w,raw_h)) + else: + image_data = kneron_preprocessing.API.load_image(image) + + + npu_w = args.width + npu_h = args.height + + crop_first = True if args.crop_first == "True" else False + if crop_first: + x1 = args.x_pos + y1 = args.y_pos + x2 = args.crop_w + x1 + y2 = args.crop_h + y1 + crop_box = [x1,y1,x2,y2] + else: + crop_box = None + + pad_mode = args.pad_mode + norm_mode = args.norm_mode + bitwidth = args.bitwidth + radix = args.radix + rotate = args.rotate_mode + + ## + image_data = kneron_preprocessing.API.inproc_520(image_data,npu_size=(npu_w,npu_h),crop_box=crop_box,pad_mode=pad_mode,norm=norm_mode,rotate=rotate,radix=radix,bit_width=bitwidth) + + output_file = args.output_file + kneron_preprocessing.API.dump_image(image_data,output_file,'bin','rgba') + + return + + +if __name__ == "__main__": + argparser = argparse.ArgumentParser( + description="preprocessing" + ) + + argparser.add_argument( + '-i', + '--input_file', + help="input file name" + ) + + argparser.add_argument( + '-ff', + '--file_fmt', + help="input file format, jpg or bin" + ) + + argparser.add_argument( + '-rf', + 
'--raw_format', + help="input file image format, rgb or rgb565 or nir" + ) + + argparser.add_argument( + '-i_w', + '--input_width', + type=int, + help="input image width" + ) + + argparser.add_argument( + '-i_h', + '--input_height', + type=int, + help="input image height" + ) + + argparser.add_argument( + '-o', + '--output_file', + help="output file name" + ) + + argparser.add_argument( + '-s_w', + '--width', + type=int, + help="output width for npu input", + ) + + argparser.add_argument( + '-s_h', + '--height', + type=int, + help="output height for npu input", + ) + + argparser.add_argument( + '-c_f', + '--crop_first', + help="crop first True or False", + ) + + argparser.add_argument( + '-x', + '--x_pos', + type=int, + help="left up coordinate x", + ) + + argparser.add_argument( + '-y', + '--y_pos', + type=int, + help="left up coordinate y", + ) + + argparser.add_argument( + '-c_w', + '--crop_w', + type=int, + help="crop width", + ) + + argparser.add_argument( + '-c_h', + '--crop_h', + type=int, + help="crop height", + ) + + argparser.add_argument( + '-p_m', + '--pad_mode', + type=int, + help=" 0: pad 2 sides, 1: pad 1 side, 2: no pad.", + ) + + argparser.add_argument( + '-n_m', + '--norm_mode', + help="normalizaton mode: yolo, kneron, tf." 
+ ) + + argparser.add_argument( + '-r_m', + '--rotate_mode', + type=int, + help="rotate mode:0,1,2" + ) + + argparser.add_argument( + '-bw', + '--bitwidth', + type=int, + help="Int for bitwidth" + ) + + argparser.add_argument( + '-r', + '--radix', + type=int, + help="Int for radix" + ) + + args = argparser.parse_args() + main_(args) \ No newline at end of file diff --git a/kneron_preprocessing/Flow.py b/kneron_preprocessing/Flow.py new file mode 100644 index 0000000..bab0041 --- /dev/null +++ b/kneron_preprocessing/Flow.py @@ -0,0 +1,1226 @@ +import numpy as np +from PIL import Image +import json +import math +import sys +from .funcs import * +from .funcs.utils import str2bool, bin_loader, hex_loader, str_fill, clip_ary +from .funcs.utils_520 import round_up_16, round_up_n, cal_img_row_offset, get_pad_num, get_byte_per_pixel +from .funcs.utils_720 import twos_complement_pix, clip_pix +from ctypes import c_float + + +class Flow(object): + # class function + def __init__(self, config_path = ''): + ''' + @brief: + Class name: Flow + Constructor with config_path + + @param: + config_path[str]: json file path or empty, init this class with json file. If empty, will use default setting. 
+ ''' + # init config + self.__init_config() + + # update config with joson file + try: + with open(config_path, encoding='utf-8') as f: + self.config = json.load(f) + except IOError: + pass + + # print info + if str2bool(self.config['print_info']): + print("pre-processing type:", self.config['type_name'],", model_size:",self.config['model_size'],", numerical_type",self.config['numerical_type']) + + # init funcs + self.error_state = 0 + self.subclass = {} + self.subclass['color'] = ColorConversion.runner() + self.subclass['resize'] = Resize.runner() + self.subclass['crop'] = Crop.runner() + self.subclass['padding'] = Padding.runner() + self.subclass['normalize'] = Normalize.runner() + + self.funcs = {} + self.funcs['crop'] = self.run_crop + self.funcs['color'] = self.run_color_conversion + self.funcs['resize'] = self.run_resize + self.funcs['normalize'] = self.run_normalize + self.funcs['padding'] = self.run_padding + + return + + def __init_config(self): + ''' + private function + ''' + self.config = { + "_comment": "PreProcessing", + "type_name": "default", + "numerical_type": "floating", + "print_info":"no", + "model_size": [ + 56, + 56 + ], + "raw_img":{ + "is_raw_img": "no", + "raw_img_type": "bin", + "raw_img_fmt": "rgb565", + "img_in_width": 640, + "img_in_height": 480 + }, + "output_setting":{ + "is_dump": "no", + "dump_format":"bin", + "output_file":"default.bin", + "image_format":"RGB888" + }, + "520_setting":{ + "radix": 8, + "bit_width": 8, + "rotate": 0, + "crop_fisrt": "no", + "NUM_BANK_LINE": 32, + "BANK_ENTRY_CNT": 512, + "MAX_IMG_PREPROC_ROW_NUM": 511, + "MAX_IMG_PREPROC_COL_NUM": 256, + "round_w_to_16": "no" + }, + "720_setting":{ + "radix": 8, + "shift":0, + "sub":0, + "bit_width": 8, + "rotate": 0, + "crop_fisrt": "no", + "matrix_c00": 1, + "matrix_c01": 0, + "matrix_c02": 0, + "matrix_c10": 0, + "matrix_c11": 1, + "matrix_c12": 0, + "matrix_c20": 0, + "matrix_c21": 0, + "matrix_c22": 1, + "vector_b00": 0, + "vector_b01": 0, + "vector_b02": 0 + 
}, + "floating_setting":{ + "job_list":[ + "color", + "crop", + "resize", + "padding", + "normalize", + ] + }, + "function_setting": { + "color": { + "out_format": "rgb888", + "options": { + "simulation": "no", + "simulation_format": "" + } + }, + "crop": { + "type": "corner", + "align_w_to_4":"no", + "pad_square_to_4":"no", + "rounding_type":0, + "crop_w": "", + "crop_h": "", + "start_x": "", + "start_y": "", + "end_x": "", + "end_y": "" + }, + "resize": { + "type": "fixed", + "keep_ratio": "yes", + "calculate_ratio_using_CSim": "yes", + "zoom": "yes", + "resize_w": "", + "resize_h": "", + }, + "padding": { + "type": "corner", + "pad_val": "", + "padded_w": "", + "padded_h": "", + "pad_l": "", + "pad_r": "", + "pad_t": "", + "pad_b": "" + }, + "normalize": { + "type": "kneron", + "scale": "", + "bias": "", + "mean": "", + "std": "" + } + } + } + return + + def __update_color(self): + ''' + private function + ''' + # + dic = self.config['function_setting']['color'] + dic['model_size'] = self.config['model_size'] + dic['print_info'] = self.config['print_info'] + self.subclass['color'].update(**dic) + + return + + def __update_crop(self): + ''' + private function + ''' + dic = {} + # common + dic['common'] = {} + dic['common']['print_info'] = self.config['print_info'] + dic['common']['model_size'] = self.config['model_size'] + dic['common']['numerical_type'] = self.config['numerical_type'] + + # general + dic['general'] = {} + dic['general']['type'] = self.config['function_setting']['crop']['type'] + dic['general']['align_w_to_4'] = self.config['function_setting']['crop']['align_w_to_4'] + dic['general']['pad_square_to_4'] = self.config['function_setting']['crop']['pad_square_to_4'] + dic['general']['rounding_type'] = self.config['function_setting']['crop']['rounding_type'] + dic['general']['crop_w'] = self.config['function_setting']['crop']['crop_w'] + dic['general']['crop_h'] = self.config['function_setting']['crop']['crop_h'] + dic['general']['start_x'] = 
self.config['function_setting']['crop']['start_x'] + dic['general']['start_y'] = self.config['function_setting']['crop']['start_y'] + dic['general']['end_x'] = self.config['function_setting']['crop']['end_x'] + dic['general']['end_y'] = self.config['function_setting']['crop']['end_y'] + + # floating + dic['floating'] = {} + + # hw + dic['hw'] = {} + + + self.subclass['crop'].update(**dic) + return + + def __update_resize(self): + ''' + private function + ''' + dic = {} + # common + dic['common'] = {} + dic['common']['print_info'] = self.config['print_info'] + dic['common']['model_size'] = self.config['model_size'] + dic['common']['numerical_type'] = self.config['numerical_type'] + + # general + dic['general'] = {} + dic['general']['type'] = self.config['function_setting']['resize']['type'] + dic['general']['keep_ratio'] = self.config['function_setting']['resize']['keep_ratio'] + dic['general']['zoom'] = self.config['function_setting']['resize']['zoom'] + dic['general']['calculate_ratio_using_CSim'] = self.config['function_setting']['resize']['calculate_ratio_using_CSim'] + dic['general']['resize_w'] = self.config['function_setting']['resize']['resize_w'] + dic['general']['resize_h'] = self.config['function_setting']['resize']['resize_h'] + + # floating + dic['floating'] = {} + + # hw + dic['hw'] = {} + + self.subclass['resize'].update(**dic) + return + + def __update_normalize(self): + ''' + private function + ''' + dic = {} + # general + dic['general'] = {} + dic['general']['print_info'] = self.config['print_info'] + dic['general']['model_size'] = self.config['model_size'] + dic['general']['numerical_type'] = self.config['numerical_type'] + dic['general']['type'] = self.config['function_setting']['normalize']['type'] + + # floating + dic['floating'] = {} + dic['floating']['scale'] = self.config['function_setting']['normalize']['scale'] + dic['floating']['bias'] = self.config['function_setting']['normalize']['bias'] + dic['floating']['mean'] = 
self.config['function_setting']['normalize']['mean'] + dic['floating']['std'] = self.config['function_setting']['normalize']['std'] + + # hw + dic['hw'] = {} + if self.config['numerical_type'] == '520': + dic['hw']['radix'] = self.config['520_setting']['radix'] + if self.config['numerical_type'] == '720': + dic['hw']['radix'] = self.config['720_setting']['radix'] + + self.subclass['normalize'].update(**dic) + return + + def __update_padding(self): + ''' + private function + ''' + dic = {} + # common + dic['common'] = {} + dic['common']['print_info'] = self.config['print_info'] + dic['common']['model_size'] = self.config['model_size'] + dic['common']['numerical_type'] = self.config['numerical_type'] + + # general + dic['general'] = {} + dic['general']['type'] = self.config['function_setting']['padding']['type'] + dic['general']['pad_val'] = self.config['function_setting']['padding']['pad_val'] + dic['general']['padded_w'] = self.config['function_setting']['padding']['padded_w'] + dic['general']['padded_h'] = self.config['function_setting']['padding']['padded_h'] + dic['general']['pad_l'] = self.config['function_setting']['padding']['pad_l'] + dic['general']['pad_r'] = self.config['function_setting']['padding']['pad_r'] + dic['general']['pad_t'] = self.config['function_setting']['padding']['pad_t'] + dic['general']['pad_b'] = self.config['function_setting']['padding']['pad_b'] + + # floating + dic['floating'] = {} + + # hw + dic['hw'] = {} + if self.config['numerical_type'] == '520': + dic['hw']['radix'] = self.config['520_setting']['radix'] + dic['hw']['normalize_type'] = self.config['function_setting']['normalize']['type'] + elif self.config['numerical_type'] == '720': + dic['hw']['radix'] = self.config['720_setting']['radix'] + dic['hw']['normalize_type'] = self.config['function_setting']['normalize']['type'] + + self.subclass['padding'].update(**dic) + return + + def set_numerical_type(self, type = ''): + ''' + set_numerical_type + + set the preprocess type, now 
support floating, 520 and 720 + + Args: + type: [str], "520" / "720" / "floating" + ''' + if not (type.lower() in ['520', '720', 'floating']): + type = 'floating' + self.config['numerical_type'] = type + return + + def set_print_info(self, print_info = ''): + ''' + turn print infomation on or off. + + Args: + print_info: [str], "yes" / "no" + ''' + self.config['print_info'] = print_info + return + + def set_model_size(self, w, h): + ''' + set_model_size, set out image size, or npu size + + Args: + w: [int] + h: [int] + ''' + if w <= 0 or h <= 0: + return + self.config['model_size'][0] = w + self.config['model_size'][1] = h + + return + + def set_raw_img(self, is_raw_img='', raw_img_type = '', raw_img_fmt='', img_in_width='',img_in_height=''): + ''' + set if input is raw file + + now support for rgb888,rgb565,nir,yuv and ycbcr + + Args: + is_raw_img: [str], "yes" / "no", is raw file or not + raw_img_type: [str], "bin" / "hex", set the raw file format, now support bin and hex file. + raw_img_fmt: [str], "rgb888" / "rgb565" / "nir" / "ycbcr422" / "ycbcr444" / "yuv422" / "yuv444", set the raw image format. + img_in_width: [int] + img_in_height: [int] + ''' + if not(is_raw_img==''): + self.config['raw_img']['is_raw_img'] = is_raw_img + if not(raw_img_type==''): + self.config['raw_img']['raw_img_type'] = raw_img_type + if not(raw_img_fmt==''): + self.config['raw_img']['raw_img_fmt'] = raw_img_fmt + if not(img_in_width==''): + self.config['raw_img']['img_in_width'] = img_in_width + if not(img_in_height==''): + self.config['raw_img']['img_in_height'] = img_in_height + return + + def set_output_setting(self, is_dump='', dump_format='',image_format='', output_file=''): + ''' + set_output_setting, dump output or not, dump format can be bin , hex or txt + + Args: + is_dump: [str], "yes" / "no", open dump function or not + dump_format: [str], "bin" / "txt" / "hex", set dump file format. 
+ image_format: [str], RGB888 / RGBA8888 / RGB565 / NIR / YUV444 / YCbCr444 / YUV422 / YCbCr422 + output_file: [str], dump file path + ''' + if not(is_dump==''): + self.config['output_setting']['is_dump'] = is_dump + if not(dump_format==''): + self.config['output_setting']['dump_format'] = dump_format + if not(image_format==''): + self.config['output_setting']['image_format'] = image_format + if not(output_file==''): + self.config['output_setting']['output_file'] = output_file + return + + def set_520_setting(self, radix='', bit_width='', rotate='',crop_fisrt='', round_w_to_16 ='',NUM_BANK_LINE='',BANK_ENTRY_CNT='',MAX_IMG_PREPROC_ROW_NUM='',MAX_IMG_PREPROC_COL_NUM=''): + ''' + setting about 520 inproc + + Args: + radix: [int], default 8 + bit_width: [int], default 8 + rotate: [int], 0 / 1 / 2, set rotate type + crop_fisrt: [str], "yes" / "no", crop before inproc or not + round_w_to_16: [str], "yes" / "no", round w align to 16 or not + NUM_BANK_LINE: [int], default 32 + BANK_ENTRY_CNT: [int], default 512 + MAX_IMG_PREPROC_ROW_NUM: [int], default 511 + MAX_IMG_PREPROC_COL_NUM: [int], default 256 + ''' + if not(radix==''): + self.config['520_setting']['radix'] = radix + if not(bit_width==''): + self.config['520_setting']['bit_width'] = bit_width + if not(rotate==''): + self.config['520_setting']['rotate'] = rotate + if not(crop_fisrt==''): + self.config['520_setting']['crop_fisrt'] = crop_fisrt + if not(round_w_to_16==''): + self.config['520_setting']['round_w_to_16'] = round_w_to_16 + if not(NUM_BANK_LINE==''): + self.config['520_setting']['NUM_BANK_LINE'] = NUM_BANK_LINE + if not(BANK_ENTRY_CNT==''): + self.config['520_setting']['BANK_ENTRY_CNT'] = BANK_ENTRY_CNT + if not(MAX_IMG_PREPROC_ROW_NUM==''): + self.config['520_setting']['MAX_IMG_PREPROC_ROW_NUM'] = MAX_IMG_PREPROC_ROW_NUM + if not(MAX_IMG_PREPROC_COL_NUM==''): + self.config['520_setting']['MAX_IMG_PREPROC_COL_NUM'] = MAX_IMG_PREPROC_COL_NUM + return + + def set_720_setting(self, radix='', bit_width='', 
rotate='',crop_fisrt='', matrix='',vector=''): + ''' + setting about 720 inproc + + Args: + radix: [int], default 8 + bit_width: [int], default 8 + rotate: [int], 0 / 1 / 2, set rotate type + crop_fisrt: [str], "yes" / "no", crop before inproc or not + matrix: [list] + vector: [list] + ''' + if not(radix==''): + self.config['720_setting']['radix'] = radix + if not(bit_width==''): + self.config['720_setting']['bit_width'] = bit_width + if not(rotate==''): + self.config['720_setting']['rotate'] = rotate + if not(crop_fisrt==''): + self.config['720_setting']['crop_fisrt'] = crop_fisrt + return + + def set_floating_setting(self, job_list = []): + ''' + set_floating_setting, set floating pre-processing job list and order, can be combination of color, crop, resize, padding, normalize + + Args: + job_list: [list], combination of "color" / "crop" / "resize" / "padding" / "normalize" + ''' + if not(job_list==[]): + self.config['floating_setting']['job_list'] = job_list + return + + def set_color_conversion(self, source_format = '', out_format='', simulation='', simulation_format=''): + ''' + set_color_conversion + + setting about corlor conversion and inproc format unit. + Turn simulation on can simulate rgb image to other image type. 
+ + Args: + source_format: [str], "rgb888" / "rgb565" / "yuv" / "ycbcr" + out_format: [str], "rgb888" / "l" + simulation: [str], "yes" / "no" + simulation_format: [str], "rgb565" / "yuv" / "ycbcr" + ''' + if not(source_format==''): + self.config['function_setting']['color']['source_format'] = source_format + if not(out_format==''): + self.config['function_setting']['color']['out_format'] = out_format + if not(simulation==''): + self.config['function_setting']['color']['options']['simulation'] = simulation + if not(simulation_format==''): + self.config['function_setting']['color']['options']['simulation_format'] = simulation_format + + return + + def set_resize(self, type='', keep_ratio='', calculate_ratio_using_CSim='',zoom='', resize_w='', resize_h = ''): + ''' + set_resize, setting about resize and inproc resize unit. + + resize type can be bilinear or bilicubic as floating type, fixed or fixed_520 as fixed type. + fixed_520 type has add some function to simulate 520 bug. + + Args: + type[str]: "bilinear" / "bilicubic" / "cv2" / "fixed" / "fixed_520" + keep_ratio[str]: "yes" / "no" + calculate_ratio_using_CSim[str]: "yes" / "no" , calculate the ratio and scale using Csim function and C float + zoom[str]: "yes" / "no", enable resize can zoom image or not + resize_w[int]: if empty, then default will be model_size[0] + resize_h[int]: if empty, then default will be model_size[0] + ''' + if not(type==''): + self.config['function_setting']['resize']['type'] = type + if not(keep_ratio==''): + self.config['function_setting']['resize']['keep_ratio'] = keep_ratio + if not(calculate_ratio_using_CSim==''): + self.config['function_setting']['resize']['calculate_ratio_using_CSim'] = calculate_ratio_using_CSim + if not(zoom==''): + self.config['function_setting']['resize']['zoom'] = zoom + if not(resize_w==''): + self.config['function_setting']['resize']['resize_w'] = resize_w + if not(resize_h==''): + self.config['function_setting']['resize']['resize_h'] = resize_h + + return 
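`set_resize` with `keep_ratio='yes'` combined with the padding unit implements the usual letterbox behaviour: one scale factor for both axes, with the leftover area padded out to the model size. A minimal standalone sketch of that arithmetic (plain Python, not the library's fixed-point implementation; the helper name is illustrative):

```python
def letterbox_dims(src_w, src_h, model_w, model_h, keep_ratio=True):
    """Compute (resize_w, resize_h, pad_w, pad_h) for a corner-padded letterbox.

    With keep_ratio=False the image is stretched to the model size and no
    padding is needed, mirroring pad_mode=2 in the inproc flows.
    """
    if not keep_ratio:
        return model_w, model_h, 0, 0
    scale = min(model_w / src_w, model_h / src_h)  # one factor for both axes
    resize_w = round(src_w * scale)
    resize_h = round(src_h * scale)
    return resize_w, resize_h, model_w - resize_w, model_h - resize_h
```

For a 640x480 frame and a 56x56 model input this gives a 56x42 resize plus 14 rows of padding.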
+ + def set_crop(self, type='', crop_w='', crop_h='', start_x='', start_y='', end_x='', end_y='',align_w_to_4="",pad_square_to_4="",rounding_type=""): + ''' + set_crop, setting about crop and rdma crop unit. + + crop type can be corner,center or specific. + + if type = corner and center, need to set crop_w and crop_h(or keep empty to set as model_size) + + if type = specific, need to set start_x, start_y, end_x and end_y + + if start_x, start_y, end_x and end_y all are not empty, then the type will turn to specific automatically + + Args: + type: [str], "corner" / "center" / "specific" + crop_w: [int], if empty, then default will be model_size[0] + crop_h: [int], if empty, then default will be model_size[0] + start_x: [int] + start_y: [int] + end_x: [int] + end_y: [int] + align_w_to_4: [str], crop length in w direction align to 4 or not + pad_square_to_4: [str], pad to square(align 4) or not + rounding_type: [int], 0-> x1,y1 take floor, x2,y2 take ceil; 1->all take rounding + ''' + if not(type==''): + self.config['function_setting']['crop']['type'] = type + if not(align_w_to_4==''): + self.config['function_setting']['crop']['align_w_to_4'] = align_w_to_4 + if not(pad_square_to_4==''): + self.config['function_setting']['crop']['pad_square_to_4'] = pad_square_to_4 + if not(rounding_type==''): + self.config['function_setting']['crop']['rounding_type'] = rounding_type + if not(crop_w==''): + self.config['function_setting']['crop']['crop_w'] = crop_w + if not(crop_h==''): + self.config['function_setting']['crop']['crop_h'] = crop_h + if not(start_x==''): + self.config['function_setting']['crop']['start_x'] = start_x + if not(start_y==''): + self.config['function_setting']['crop']['start_y'] = start_y + if not(end_x==''): + self.config['function_setting']['crop']['end_x'] = end_x + if not(end_y==''): + self.config['function_setting']['crop']['end_y'] = end_y + return + + def set_padding(self, type='', pad_val='', padded_w='', padded_h='', pad_l='', pad_r='', pad_t='', 
pad_b=''): + ''' + set_padding, setting about padding and inproc padding unit. + + crop type can be corner,center or specific. + + if type = corner and center, need to set out_w and out_h(or keep empty to set as model_size) + + if type = specific, need to set pad_l, pad_r, pad_t and pad_b + + if pad_l, pad_r, pad_t and pad_b all are not empty, then the type will turn to specific automatically + + if numerical type = 520 or 720, then the pad_val will adjust according radix automatically + + Args: + type: [str], "corner" / "center" / "specific" + pad_val: [float] + out_w: [int] + out_h: [int] + pad_l: [int] + pad_r: [int] + pad_t: [int] + pad_b: [int] + ''' + if not(type==''): + self.config['function_setting']['padding']['type'] = type + if not(pad_val==''): + self.config['function_setting']['padding']['pad_val'] = pad_val + if not(padded_w==''): + self.config['function_setting']['padding']['padded_w'] = padded_w + if not(padded_h==''): + self.config['function_setting']['padding']['padded_h'] = padded_h + if not(pad_l==''): + self.config['function_setting']['padding']['pad_l'] = pad_l + if not(pad_r==''): + self.config['function_setting']['padding']['pad_r'] = pad_r + if not(pad_t==''): + self.config['function_setting']['padding']['pad_t'] = pad_t + if not(pad_b==''): + self.config['function_setting']['padding']['pad_b'] = pad_b + return + + def set_normalize(self, type='', scale='', bias='', mean='', std =''): + ''' + set_normalize, setting about normalize and inproc chen unit. 
+ + if numerical type = floating: + normalize type can be customized, torch, tf, caffe, yolo or kneron + if type = customized, need to set scale, bias, mean and std + + if numerical type = 520 or 720: + normalize type can be tf, yolo or kneron + + Args: + type: [str], "customized" / "torch" / "tf" / "caffe" / "yolo" / "kneron" + scale: [float] + bias: [float] + mean: [list,3] + std: [list,3] + ''' + if not(type==''): + self.config['function_setting']['normalize']['type'] = type + if not(scale==''): + self.config['function_setting']['normalize']['scale'] = scale + if not(bias==''): + self.config['function_setting']['normalize']['bias'] = bias + if not(mean==''): + self.config['function_setting']['normalize']['mean'] = mean + if not(std==''): + self.config['function_setting']['normalize']['std'] = std + return + + def load_image(self, image, is_raw = False , raw_img_type = '', raw_img_fmt = '', img_in_height = 0, img_in_width = 0): + ''' + load_image function + + Args: + image: [np.array/str], can be np.array or file path(bin/hex/jpg) + is_raw: [bool], is raw image or not (bin or hex) + raw_img_type: [str], "bin" / "hex" + raw_img_fmt: [str], "yuv444" / "ycbcr444" / "yuv422" / "ycbcr422" / "rgb565" / "nir" + img_in_width: [int] + img_in_height: [int] + + Returns: + out: [np.array], not include color convert + ''' + if isinstance(image, np.ndarray): + return image + if str2bool(is_raw): + dic ={} + dic['raw_img_fmt'] = raw_img_fmt + dic['img_in_height'] = img_in_height + dic['img_in_width'] = img_in_width + if raw_img_type.lower() in ['bin','BIN']: + image_data = bin_loader(image,**dic) + elif raw_img_type.lower() in ['hex','HEX']: + image_data = hex_loader(image,**dic) + elif isinstance(image, str): + image = Image.open(image).convert("RGB") + image_data = np.array(image).astype('uint8') + + assert isinstance(image_data, np.ndarray) + return image_data + + def dump_image(self,image_data): + ''' + dump_image function, according config setting to dump image, txt, bin 
or hex + + Args: + image: [np.array] + ''' + assert isinstance(image_data, np.ndarray) + assert (len(image_data.shape) >= 2) + + if (len(image_data.shape) == 2): + source_format = 'L' + if (image_data.shape[2] == 4): + source_format = 'RGBA8888' + else: + source_format = 'RGB888' + + convert = ColorConversion.runner() + if (source_format == 'L') & (self.config['output_setting']['image_format'].lower() not in ['L', 'l', 'NIR', 'nir']): + convert.update(**{"source_format": "L","out_format": "RGB888"}) + image_data, _ = convert.run(image_data) + source_format = 'RGB888' + + if (source_format == 'RGBA8888') & (self.config['output_setting']['image_format'].lower() not in ['RGBA8888', 'rgba8888','RGBA','rgba']): + convert.update(**{"source_format": "RGBA8888","out_format": "RGB888"}) + image_data, _ = convert.run(image_data) + source_format = 'RGB888' + + + if (self.config['output_setting']['image_format'].lower() in ['RGB565', 'rgb565']): + convert.update(**{"source_format": source_format,"out_format": "RGB565"}) + image_data_565, _ = convert.run(image_data) + image_data = np.zeros((image_data_565.shape[0],image_data_565.shape[1],2), dtype=np.uint8) + image_data[:,:,1] = ( image_data_565[:,:,0] << 3 ) + ( image_data_565[:,:,1] >> 3 ) + image_data[:,:,0] = ( (image_data_565[:,:,1] & 0x07) << 5 ) + image_data_565[:,:,2] + elif (self.config['output_setting']['image_format'].lower() in ['RGBA8888', 'rgba8888','RGBA','rgba']) & (source_format != 'RGBA8888'): + convert.update(**{"source_format": source_format,"out_format": "rgba"}) + image_data, _ = convert.run(image_data) + elif (self.config['output_setting']['image_format'].lower() in ['L', 'l', 'NIR', 'nir']): + convert.update(**{"source_format": source_format,"out_format": "L"}) + image_data, _ = convert.run(image_data) + elif (self.config['output_setting']['image_format'].lower() in['YUV', 'YUV444','yuv','yuv444']): + convert.update(**{"source_format": source_format,"out_format": "YUV444"}) + image_data_YUV, _ = 
convert.run(image_data) + image_data = np.zeros((image_data_YUV.shape[0],image_data_YUV.shape[1],4), dtype=np.uint8) + image_data[:,:,3] = image_data_YUV[:,:,0] + image_data[:,:,2] = image_data_YUV[:,:,1] + image_data[:,:,1] = image_data_YUV[:,:,2] + elif (self.config['output_setting']['image_format'].lower() in['YUV422','yuv422']): + convert.update(**{"source_format": source_format,"out_format": "YUV444"}) + image_data_YUV, _ = convert.run(image_data) + pixels = image_data_YUV.shape[0] * image_data_YUV.shape[1] + image_data = np.zeros((pixels*2,1), dtype=np.uint8) + image_data_YUV = image_data_YUV.reshape((-1,1)) + for i in range(0,image_data.shape[0],4): + j = i//2 #source index + image_data[i+3,0] = image_data_YUV[j*3,0] + image_data[i+2,0] = image_data_YUV[j*3+1,0] + image_data[i+1,0] = image_data_YUV[j*3+3,0] + image_data[i,0] = image_data_YUV[j*3+5,0] + elif (self.config['output_setting']['image_format'].lower() in['YCBCR', 'YCBCR444','YCbCr','YCbCr444','ycbcr','ycbcr444']): + convert.update(**{"source_format": source_format,"out_format": "YCBCR444"}) + image_data_YCBCR, _ = convert.run(image_data) + image_data = np.zeros((image_data_YCBCR.shape[0],image_data_YCBCR.shape[1],4), dtype=np.uint8) + image_data[:,:,3] = image_data_YCBCR[:,:,0] + image_data[:,:,2] = image_data_YCBCR[:,:,1] + image_data[:,:,1] = image_data_YCBCR[:,:,2] + elif (self.config['output_setting']['image_format'].lower() in['YCBCR422','YCbCr422','ycbcr422']): + convert.update(**{"source_format": source_format,"out_format": "YCBCR422"}) + image_data_YCBCR, _ = convert.run(image_data) + image_data = np.zeros((image_data_YCBCR.shape[0],image_data_YCBCR.shape[1],2), dtype=np.uint8) + pixels = image_data_YCBCR.shape[0] * image_data_YCBCR.shape[1] + image_data = np.zeros((pixels*2,1), dtype=np.uint8) + image_data_YCBCR = image_data_YCBCR.reshape((-1,1)) + for i in range(0,image_data.shape[0],4): + j = i//2 #source index + image_data[i+3,0] = image_data_YCBCR[j*3,0] + image_data[i+2,0] = 
image_data_YCBCR[j*3+1,0] + image_data[i+1,0] = image_data_YCBCR[j*3+3,0] + image_data[i,0] = image_data_YCBCR[j*3+5,0] + + if self.config['output_setting']['dump_format'].lower() in ['txt', 'TXT']: + np.savetxt(self.config['output_setting']['output_file'],image_data.reshape((-1,1)),fmt="%.8f") + elif self.config['output_setting']['dump_format'].lower() in ['bin', 'BIN']: + image_data.reshape((-1,1)).astype("uint8").tofile(self.config['output_setting']['output_file']) + elif self.config['output_setting']['dump_format'].lower() in ['hex', 'HEX']: + height, width, c = image_data.shape + output_line = math.floor((height * width) / 4) + image_f = image_data.reshape((height * width, c)) + f = open(self.config['output_setting']['output_file'], "w") + for i in range(output_line): + pixels = "" + for j in range(min((i+1)*4-1, image_f.shape[0]-1), i*4-1, -1): + pixels = pixels + str_fill(hex(image_f[j, 3]).lstrip("0x")) + pixels = pixels + str_fill(hex(image_f[j, 2]).lstrip("0x")) + pixels = pixels + str_fill(hex(image_f[j, 1]).lstrip("0x")) + pixels = pixels + str_fill(hex(image_f[j, 0]).lstrip("0x")) + f.write(pixels + "\n") + return + + def run_whole_process(self, image): + ''' + run_whole_process, according config setting to run all pre-processing + + Args: + image: [np.array/str], can be np.array or file path(bin/jpg) + + Returns: + out: [np.array] + ''' + assert (self.error_state == 0) + + image_data = self.load_image( + image, + is_raw = self.config['raw_img']["is_raw_img"], + raw_img_type = self.config['raw_img']["raw_img_type"], + raw_img_fmt = self.config['raw_img']["raw_img_fmt"], + img_in_height= self.config['raw_img']["img_in_height"], + img_in_width=self.config['raw_img']["img_in_width"]) + + if str2bool(self.config['raw_img']["is_raw_img"]): + self.set_color_conversion(source_format=self.config['raw_img']["raw_img_fmt"]) + elif isinstance(image, str): + self.set_color_conversion(source_format='RGB888') + + h_ori = image_data.shape[0] + w_ori = 
image_data.shape[1] + + if self.config['numerical_type'] == 'floating': + image_data = self.__run_whole_process_floating(image_data) + elif self.config['numerical_type'] == '520': + image_data = self.__run_whole_process_520(image_data) + elif self.config['numerical_type'] == '720': + image_data = self.__run_whole_process_720(image_data) + + if str2bool(self.config['output_setting']['is_dump']): + self.dump_image(image_data) + + scale = max(1.0*w_ori / image_data.shape[1], 1.0*h_ori / image_data.shape[0]) + out = {'h_ori': h_ori, 'w_ori': w_ori, "scale": scale} + return image_data, out + + def __run_whole_process_floating(self,image_data): + ''' + private function + ''' + for job in self.config['floating_setting']['job_list']: + if job.lower() in ['crop','color','resize','normalize','padding']: + image_data, _ = self.funcs[job](image_data) + + return image_data + + def __run_whole_process_520(self,image_data): + ''' + private function + ''' + # init from config + originH, originW, _ = image_data.shape + npu_img_w = self.config['model_size'][0] + npu_img_h = self.config['model_size'][1] + + if self.config['function_setting']['padding']['type'].lower() in ['center','CENTER','Center','0',0]: + pad_mode = 0 + elif self.config['function_setting']['padding']['type'].lower() in ['corner','CORNER','Corner','1',1]: + pad_mode = 1 + else: + pad_mode = 2 + + if not str2bool(self.config['function_setting']['resize']['keep_ratio']): + pad_mode = 2 + + NUM_BANK_LINE = self.config['520_setting']['NUM_BANK_LINE'] + BANK_ENTRY_CNT = self.config['520_setting']['BANK_ENTRY_CNT'] + MAX_IMG_PREPROC_ROW_NUM = self.config['520_setting']['MAX_IMG_PREPROC_ROW_NUM'] + MAX_IMG_PREPROC_COL_NUM = self.config['520_setting']['MAX_IMG_PREPROC_COL_NUM'] + + raw_fmt = self.config['function_setting']['color']['source_format'] + crop_fisrt = str2bool(self.config['520_setting']['crop_fisrt']) + keep_ratio = str2bool(self.config['function_setting']['resize']['keep_ratio']) + + # init crop + if 
crop_fisrt: + startW = self.config['function_setting']['crop']['start_x'] + startH = self.config['function_setting']['crop']['start_y'] + cropW = self.config['function_setting']['crop']['end_x'] - self.config['function_setting']['crop']['start_x'] + cropH = self.config['function_setting']['crop']['end_y'] - self.config['function_setting']['crop']['start_y'] + else: + startW = 0 + startH = 0 + cropW = originW + cropH = originH + + crop_num = [0] * 4 + crop_num[0] = startW #left + crop_num[1] = startH #top + crop_num[2] = originW - (startW + cropW) #right + crop_num[3] = originH - (startH + cropH) #bottom + + # calculate scaleW scaleH padW padH + if keep_ratio: + out_w = npu_img_w + out_h = npu_img_h + orig_w = cropW + orig_h = cropH + + w_ratio = c_float(out_w * 1.0 / (orig_w * 1.0)).value + h_ratio = c_float(out_h * 1.0 / (orig_h * 1.0)).value + scale_ratio = 0.0 + scale_target_w = 0 + scale_target_h = 0 + padH = 0 + padW = 0 + + bScaleW = True if w_ratio < h_ratio else False + if bScaleW: + scale_ratio = w_ratio + scale_target_w = int(c_float(scale_ratio * orig_w + 0.5).value) + scale_target_h = int(c_float(scale_ratio * orig_h + 0.5).value) + assert (abs(scale_target_w - out_w) <= 1), "Error: scale down width cannot meet expectation\n" + padH = out_h - scale_target_h + padW = 0 + assert (padH >= 0), "Error: padH shouldn't be less than zero\n" + else: + scale_ratio = h_ratio + scale_target_w = int(c_float(scale_ratio * orig_w + 0.5).value) + scale_target_h = int(c_float(scale_ratio * orig_h + 0.5).value) + assert (abs(scale_target_h - out_h) <= 1), "Error: scale down height cannot meet expectation\n" + padW = out_w - scale_target_w + padH = 0 + assert (padW >= 0), "Error: padW shouldn't be less than zero\n" + + scaleW = out_w - padW + scaleH = out_h - padH + else: + scaleW = npu_img_w + scaleH = npu_img_h + padW = 0 + padH = 0 + + # calculate pad_top pad_bottom pad_left pad_right + if (pad_mode == 0): + # pad on both side + pad_top = padH // 2 + pad_bottom = (padH 
// 2) + (padH % 2) + pad_left = padW // 2 + pad_right = (padW // 2) + (padW % 2) + elif (pad_mode == 1): + # only pad right and bottom + pad_top = 0 + pad_bottom = padH + pad_left = 0 + pad_right = padW + else: + pad_top = 0 + pad_bottom = 0 + pad_left = 0 + pad_right = 0 + + if (pad_right > 127 or pad_bottom > 127): + print("Pad value larger than 127 is not supported\n") + + orig_pad_num = [0] * 4 + orig_pad_num[0] = pad_left + orig_pad_num[1] = pad_top + orig_pad_num[2] = pad_right + orig_pad_num[3] = pad_bottom + + valid_in_row = cropH + valid_in_col = cropW + out_row = scaleH + padH + out_col = scaleW + padW + + # calculate cut_total + max_row = int(math.floor(BANK_ENTRY_CNT * NUM_BANK_LINE / (out_col / 4))) + max_row = min(max_row, MAX_IMG_PREPROC_ROW_NUM) + + if (pad_mode == 0): + big_pad_row = (out_row % max_row) < (pad_bottom + 4) + if (big_pad_row): + last_row = int(pad_bottom + 4) + cut_total = int(math.ceil( float(out_row - last_row) / max_row) + 1) + else: + cut_total = int(math.ceil( float(out_row) / max_row)) + elif (pad_mode == 1): + big_pad_row = (out_row % max_row) < (pad_bottom + 4) + last_row = max_row + if (big_pad_row): + cut_total = int(math.ceil( float(out_row - last_row) / max_row) + 1) + else: + cut_total = int(math.ceil( float(out_row) / max_row)) + else: + big_pad_row = False + cut_total = int(math.ceil( float(out_row) / max_row)) + + # calculate seg_cnt + max_col = MAX_IMG_PREPROC_COL_NUM + last_col = 0 + if (out_col % max_col): + if (pad_mode == 0): + big_pad_col = (out_col % max_col) < (pad_right + 4) + if (big_pad_col): + last_col = round_up_n(pad_right + 4, 4) + seg_cnt = math.ceil( float(out_col - last_col) / max_col) + 1 + else: + seg_cnt = math.ceil( float(out_col) / max_col) + elif (pad_mode == 1): + big_pad_col = (out_col % max_col) < (pad_right + 4) + last_col = max_col + if (big_pad_col): + seg_cnt = math.ceil( float(out_col - last_col) / max_col) + 1 + else: + seg_cnt = math.ceil( float(out_col) / max_col) + else: + 
big_pad_col = False + seg_cnt = math.ceil( float(out_col) / max_col) + else: + big_pad_col = False + seg_cnt = math.ceil( float(out_col) / max_col) + + # start loop + if (big_pad_row): + remain_row = out_row - last_row + else: + remain_row = out_row + start_row = 0 + row_num = 0 + for r in range(0, cut_total): + start_row += row_num + block_start_row = cal_img_row_offset(crop_num, orig_pad_num, start_row, out_row, originH) + if (big_pad_row) and (r == (cut_total - 1)): + row_num = last_row + else: + row_num = min(max_row, remain_row) + + # due to HW only support max col = 256, we may need to process data in segments */ + if(big_pad_col): + remain_col = (out_col - last_col) + else: + remain_col = out_col + start_col = 0 + col_num = 0 + block_start_col = crop_num[0] + block_col = 0 + for c in range(0,seg_cnt): + start_col += col_num + block_start_col += block_col + if (big_pad_col) and (c == (seg_cnt - 1)): + col_num = last_col + else: + col_num = min(remain_col, MAX_IMG_PREPROC_COL_NUM) + + pad_num = get_pad_num(orig_pad_num, (c == 0), (r == 0), (c == seg_cnt - 1), (r == cut_total - 1)) + block_row = int(valid_in_row * (row_num - pad_num[1] - pad_num[3]) / (out_row - orig_pad_num[1] - orig_pad_num[3])) + block_col = int(valid_in_col * (col_num - pad_num[0] - pad_num[2]) / (out_col - orig_pad_num[0] - orig_pad_num[2])) + #/* (src_w * byte_per_pixel) should align to multiple of 4-byte and 2 cols */ + byte_per_pixel = get_byte_per_pixel(raw_fmt) + new_block_col = round_up_n(round_up_n(block_col, (4 / byte_per_pixel)), 2) + + if (new_block_col > block_col): + if byte_per_pixel == 1: + block_col = new_block_col - 4 + elif byte_per_pixel == 4: + block_col = new_block_col - 2 + else: + block_col = new_block_col - 2 + + ## + # crop + self.set_crop(start_x=block_start_col, start_y=block_start_row, end_x=block_start_col+block_col,end_y=block_start_row+block_row,align_w_to_4=False) + image_temp, _ = self.funcs['crop'](image_data) + + # color + image_temp, _ = 
self.funcs['color'](image_temp) + + # resize + self.set_resize(type='fixed_520',keep_ratio='no',calculate_ratio_using_CSim = 'yes', resize_w=(col_num - pad_num[0] - pad_num[2]),resize_h=(row_num - pad_num[1] - pad_num[3])) + image_temp, _ = self.funcs['resize'](image_temp) + + # normalize + image_temp, _ = self.funcs['normalize'](image_temp) + + # padding + self.set_padding(type='specific',pad_l=pad_num[0],pad_t=pad_num[1],pad_r=pad_num[2],pad_b=pad_num[3]) + image_temp, _ = self.funcs['padding'](image_temp) + + ## + remain_col -= col_num + if c == 0: + image_temp_H = image_temp + else: + image_temp_H = np.concatenate((image_temp_H, image_temp), axis=1) + + ## + remain_row -= row_num + if r == 0: + image_temp_V = image_temp_H + else: + image_temp_V = np.concatenate((image_temp_V, image_temp_H), axis=0) + + ## + image_data = image_temp_V + + # # round_w_to_16 + if str2bool(self.config['520_setting']['round_w_to_16']): + out_w_16 = round_up_n(out_col,16) + image = np.ones((out_row,out_w_16 - out_col,4)) *128 + image_data = np.concatenate((image_data, image), axis=1) + + # rotate + rotate = self.config['520_setting']['rotate'] + if not (rotate == 0): + dic = {} + dic['rotate_direction'] = rotate + rotate = Rotate.runner(**dic, b_print = str2bool(self.config['print_info'])) + image_data = rotate.run(image_data) + + return image_data + + def __run_whole_process_720(self,image_data): + ''' + private function + ''' + # init from config + crop_fisrt = str2bool(self.config['720_setting']['crop_fisrt']) + matrix_c00 = self.config['720_setting']['matrix_c00'] + matrix_c01 = self.config['720_setting']['matrix_c01'] + matrix_c02 = self.config['720_setting']['matrix_c02'] + matrix_c10 = self.config['720_setting']['matrix_c10'] + matrix_c11 = self.config['720_setting']['matrix_c11'] + matrix_c12 = self.config['720_setting']['matrix_c12'] + matrix_c20 = self.config['720_setting']['matrix_c20'] + matrix_c21 = self.config['720_setting']['matrix_c21'] + matrix_c22 = 
self.config['720_setting']['matrix_c22'] + vector_b00 = self.config['720_setting']['vector_b00'] + vector_b01 = self.config['720_setting']['vector_b01'] + vector_b02 = self.config['720_setting']['vector_b02'] + shiftvalue = self.config['720_setting']['shift'] + subvalue = self.config['720_setting']['sub'] + + #crop + if crop_fisrt: + image_data, _ = self.funcs['crop'](image_data) + + #color + image_data, _ = self.funcs['color'](image_data) + + #resize + self.set_resize(type='fixed_720',calculate_ratio_using_CSim = 'yes') + image_data, _ = self.funcs['resize'](image_data) + + #matrix + h, w, c = image_data.shape + image_f = image_data.reshape((h * w, c)) + matrix_c = np.array([[matrix_c00, matrix_c01, matrix_c02], + [matrix_c10, matrix_c11, matrix_c12], + [matrix_c20, matrix_c21, matrix_c22]]) + b = np.array([[vector_b00], [vector_b01], [vector_b02]]) + calculated_image_f = np.zeros(image_f.shape, dtype=np.uint8) + for i in range(h*w): + pt = np.swapaxes(image_f[np.newaxis, i, :], 0, 1) + matrix_pt = np.floor(np.multiply((matrix_c @ pt), 1/np.power(2, 1))) + matrix_pt.astype(int) + result = np.floor(np.multiply(np.add(matrix_pt, b), 1/np.power(2, 7))) + result.astype(int) + + result = twos_complement_pix(result) + + if shiftvalue == 1: + result = clip_pix(np.add(result, -128 * np.ones(result.shape)), -128, 127) + else: + result = clip_pix(result, 0, 255) + + result = result + np.array([[subvalue], [subvalue], [subvalue]]) + calculated_image_f[i, :] = clip_ary(np.squeeze(result)) + + image_data = calculated_image_f.reshape(image_data[:, :, 0:3].shape) + + #padding + image_data, _ = self.funcs['padding'](image_data) + + return image_data + + def run_crop(self, image_data): + ''' + @brief + run_crop, according config setting to run crop + + @param + image[np.array] : only can be np.array + + @return + np.array + ''' + self.__update_crop() + image_data, info = self.subclass['crop'].run(image_data) + return image_data, info + + def run_color_conversion(self, image_data): 
+ ''' + @brief + run_color_conversion, according config setting to run color conversion + + @param + image[np.array] : only can be np.array + + @return + np.array + ''' + self.__update_color() + image_data, info = self.subclass['color'].run(image_data) + return image_data,info + + def run_resize(self, image_data): + ''' + @brief + run_resize, according config setting to run resize + + @param + image[np.array] : only can be np.array + + @return + np.array + ''' + self.__update_resize() + image_data,info = self.subclass['resize'].run(image_data) + return image_data,info + + def run_normalize(self, image_data): + ''' + @brief + run_normalize, according config setting to run normalize + + @param + image[np.array] : only can be np.array + + @return + np.array + ''' + self.__update_normalize() + image_data,info = self.subclass['normalize'].run(image_data) + return image_data,info + + def run_padding(self, image_data): + ''' + @brief + run_padding, according config setting to run padding + + @param + image[np.array] : only can be np.array + + @return + np.array + ''' + self.__update_padding() + image_data,info = self.subclass['padding'].run(image_data) + return image_data,info + + diff --git a/kneron_preprocessing/__init__.py b/kneron_preprocessing/__init__.py new file mode 100644 index 0000000..0a40017 --- /dev/null +++ b/kneron_preprocessing/__init__.py @@ -0,0 +1,2 @@ +from .Flow import * +from .API import * diff --git a/kneron_preprocessing/funcs/ColorConversion.py b/kneron_preprocessing/funcs/ColorConversion.py new file mode 100644 index 0000000..8bfea7b --- /dev/null +++ b/kneron_preprocessing/funcs/ColorConversion.py @@ -0,0 +1,285 @@ +import numpy as np +from PIL import Image +from .utils import signed_rounding, clip, str2bool + +format_bit = 10 +c00_yuv = 1 +c02_yuv = 1436 +c10_yuv = 1 +c11_yuv = -354 +c12_yuv = -732 +c20_yuv = 1 +c21_yuv = 1814 +c00_ycbcr = 1192 +c02_ycbcr = 1634 +c10_ycbcr = 1192 +c11_ycbcr = -401 +c12_ycbcr = -833 +c20_ycbcr = 1192 +c21_ycbcr 
= 2065 + +Matrix_ycbcr_to_rgb888 = np.array( + [[1.16438356e+00, 1.16438356e+00, 1.16438356e+00], + [2.99747219e-07, - 3.91762529e-01, 2.01723263e+00], + [1.59602686e+00, - 8.12968294e-01, 3.04059479e-06]]) + +Matrix_rgb888_to_ycbcr = np.array( + [[0.25678824, - 0.14822353, 0.43921569], + [0.50412941, - 0.29099216, - 0.36778824], + [0.09790588, 0.43921569, - 0.07142745]]) + +Matrix_rgb888_to_yuv = np.array( + [[ 0.29899106, -0.16877996, 0.49988381], + [ 0.5865453, -0.33110385, -0.41826072], + [ 0.11446364, 0.49988381, -0.08162309]]) + +# Matrix_rgb888_to_yuv = np.array( +# [[0.299, - 0.147, 0.615], +# [0.587, - 0.289, - 0.515], +# [0.114, 0.436, - 0.100]]) + +# Matrix_yuv_to_rgb888 = np.array( +# [[1.000, 1.000, 1.000], +# [0.000, - 0.394, 2.032], +# [1.140, - 0.581, 0.000]]) + +class runner(object): + def __init__(self): + self.set = { + 'print_info':'no', + 'model_size':[0,0], + 'numerical_type':'floating', + "source_format": "rgb888", + "out_format": "rgb888", + "options": { + "simulation": "no", + "simulation_format": "rgb888" + } + } + + def update(self, **kwargs): + # + self.set.update(kwargs) + + ## simulation + self.funs = [] + if str2bool(self.set['options']['simulation']) and self.set['source_format'].lower() in ['RGB888', 'rgb888', 'RGB', 'rgb']: + if self.set['options']['simulation_format'].lower() in ['YUV422', 'yuv422', 'YUV', 'yuv']: + self.funs.append(self._ColorConversion_RGB888_to_YUV422) + self.set['source_format'] = 'YUV422' + elif self.set['options']['simulation_format'].lower() in ['YCBCR422', 'YCbCr422', 'ycbcr422', 'YCBCR', 'YCbCr', 'ycbcr']: + self.funs.append(self._ColorConversion_RGB888_to_YCbCr422) + self.set['source_format'] = 'YCbCr422' + elif self.set['options']['simulation_format'].lower() in['RGB565', 'rgb565']: + self.funs.append(self._ColorConversion_RGB888_to_RGB565) + self.set['source_format'] = 'RGB565' + + ## to rgb888 + if self.set['source_format'].lower() in ['YUV444', 'yuv444','YUV422', 'yuv422', 'YUV', 'yuv']: + 
self.funs.append(self._ColorConversion_YUV_to_RGB888) + elif self.set['source_format'].lower() in ['YCBCR444', 'YCbCr444', 'ycbcr444','YCBCR422', 'YCbCr422', 'ycbcr422', 'YCBCR', 'YCbCr', 'ycbcr']: + self.funs.append(self._ColorConversion_YCbCr_to_RGB888) + elif self.set['source_format'].lower() in ['RGB565', 'rgb565']: + self.funs.append(self._ColorConversion_RGB565_to_RGB888) + elif self.set['source_format'].lower() in ['l', 'L' , 'nir', 'NIR']: + self.funs.append(self._ColorConversion_L_to_RGB888) + elif self.set['source_format'].lower() in ['RGBA8888', 'rgba8888' , 'RGBA', 'rgba']: + self.funs.append(self._ColorConversion_RGBA8888_to_RGB888) + + ## output format + if self.set['out_format'].lower() in ['L', 'l']: + self.funs.append(self._ColorConversion_RGB888_to_L) + elif self.set['out_format'].lower() in['RGB565', 'rgb565']: + self.funs.append(self._ColorConversion_RGB888_to_RGB565) + elif self.set['out_format'].lower() in['RGBA', 'RGBA8888','rgba','rgba8888']: + self.funs.append(self._ColorConversion_RGB888_to_RGBA8888) + elif self.set['out_format'].lower() in['YUV', 'YUV444','yuv','yuv444']: + self.funs.append(self._ColorConversion_RGB888_to_YUV444) + elif self.set['out_format'].lower() in['YUV422','yuv422']: + self.funs.append(self._ColorConversion_RGB888_to_YUV422) + elif self.set['out_format'].lower() in['YCBCR', 'YCBCR444','YCbCr','YCbCr444','ycbcr','ycbcr444']: + self.funs.append(self._ColorConversion_RGB888_to_YCbCr444) + elif self.set['out_format'].lower() in['YCBCR422','YCbCr422','ycbcr422']: + self.funs.append(self._ColorConversion_RGB888_to_YCbCr422) + + def print_info(self): + print("", + "source_format:", self.set['source_format'], + ', out_format:', self.set['out_format'], + ', simulation:', self.set['options']['simulation'], + ', simulation_format:', self.set['options']['simulation_format']) + + def run(self, image_data): + assert isinstance(image_data, np.ndarray) + # print info + if str2bool(self.set['print_info']): + self.print_info() + + # 
color + for _, f in enumerate(self.funs): + image_data = f(image_data) + + # output + info = {} + return image_data, info + + def _ColorConversion_RGB888_to_YUV444(self, image): + ## floating + image = image.astype('float') + image = (image @ Matrix_rgb888_to_yuv + 0.5).astype('uint8') + return image + + def _ColorConversion_RGB888_to_YUV422(self, image): + # rgb888 to yuv444 + image = self._ColorConversion_RGB888_to_YUV444(image) + + # yuv444 to yuv422 + u2 = image[:, 0::2, 1] + u4 = np.repeat(u2, 2, axis=1) + v2 = image[:, 1::2, 2] + v4 = np.repeat(v2, 2, axis=1) + image[..., 1] = u4 + image[..., 2] = v4 + return image + + def _ColorConversion_YUV_to_RGB888(self, image): + ## fixed + h, w, c = image.shape + image_f = image.reshape((h * w, c)) + image_rgb_f = np.zeros(image_f.shape, dtype=np.uint8) + + for i in range(h * w): + image_y = image_f[i, 0] *1024 + if image_f[i, 1] > 127: + image_u = -((~(image_f[i, 1] - 1)) & 0xFF) + else: + image_u = image_f[i, 1] + if image_f[i, 2] > 127: + image_v = -((~(image_f[i, 2] - 1)) & 0xFF) + else: + image_v = image_f[i, 2] + + image_r = c00_yuv * image_y + c02_yuv * image_v + image_g = c10_yuv * image_y + c11_yuv * image_u + c12_yuv * image_v + image_b = c20_yuv * image_y + c21_yuv * image_u + + image_r = signed_rounding(image_r, format_bit) + image_g = signed_rounding(image_g, format_bit) + image_b = signed_rounding(image_b, format_bit) + + image_r = image_r >> format_bit + image_g = image_g >> format_bit + image_b = image_b >> format_bit + + image_rgb_f[i, 0] = clip(image_r, 0, 255) + image_rgb_f[i, 1] = clip(image_g, 0, 255) + image_rgb_f[i, 2] = clip(image_b, 0, 255) + + image_rgb = image_rgb_f.reshape((h, w, c)) + return image_rgb + + def _ColorConversion_RGB888_to_YCbCr444(self, image): + ## floating + image = image.astype('float') + image = (image @ Matrix_rgb888_to_ycbcr + 0.5).astype('uint8') + image[:, :, 0] += 16 + image[:, :, 1] += 128 + image[:, :, 2] += 128 + + return image + + def 
_ColorConversion_RGB888_to_YCbCr422(self, image): + # rgb888 to ycbcr444 + image = self._ColorConversion_RGB888_to_YCbCr444(image) + + # ycbcr444 to ycbcr422 + cb2 = image[:, 0::2, 1] + cb4 = np.repeat(cb2, 2, axis=1) + cr2 = image[:, 1::2, 2] + cr4 = np.repeat(cr2, 2, axis=1) + image[..., 1] = cb4 + image[..., 2] = cr4 + return image + + def _ColorConversion_YCbCr_to_RGB888(self, image): + ## floating + if (self.set['numerical_type'] == 'floating'): + image = image.astype('float') + image[:, :, 0] -= 16 + image[:, :, 1] -= 128 + image[:, :, 2] -= 128 + image = ((image @ Matrix_ycbcr_to_rgb888) + 0.5).astype('uint8') + return image + + ## fixed + h, w, c = image.shape + image_f = image.reshape((h * w, c)) + image_rgb_f = np.zeros(image_f.shape, dtype=np.uint8) + + for i in range(h * w): + image_y = (image_f[i, 0] - 16) * c00_ycbcr + image_cb = image_f[i, 1] - 128 + image_cr = image_f[i, 2] - 128 + + image_r = image_y + c02_ycbcr * image_cr + image_g = image_y + c11_ycbcr * image_cb + c12_ycbcr * image_cr + image_b = image_y + c21_ycbcr * image_cb + + image_r = signed_rounding(image_r, format_bit) + image_g = signed_rounding(image_g, format_bit) + image_b = signed_rounding(image_b, format_bit) + + image_r = image_r >> format_bit + image_g = image_g >> format_bit + image_b = image_b >> format_bit + + image_rgb_f[i, 0] = clip(image_r, 0, 255) + image_rgb_f[i, 1] = clip(image_g, 0, 255) + image_rgb_f[i, 2] = clip(image_b, 0, 255) + + image_rgb = image_rgb_f.reshape((h, w, c)) + return image_rgb + + def _ColorConversion_RGB888_to_RGB565(self, image): + assert (len(image.shape)==3) + assert (image.shape[2]>=3) + + image_rgb565 = np.zeros(image.shape, dtype=np.uint8) + image_rgb = image.astype('uint8') + image_rgb565[:, :, 0] = image_rgb[:, :, 0] >> 3 + image_rgb565[:, :, 1] = image_rgb[:, :, 1] >> 2 + image_rgb565[:, :, 2] = image_rgb[:, :, 2] >> 3 + return image_rgb565 + + def _ColorConversion_RGB565_to_RGB888(self, image): + assert (len(image.shape)==3) + assert 
(image.shape[2]==3) + + image_rgb = np.zeros(image.shape, dtype=np.uint8) + image_rgb[:, :, 0] = image[:, :, 0] << 3 + image_rgb[:, :, 1] = image[:, :, 1] << 2 + image_rgb[:, :, 2] = image[:, :, 2] << 3 + return image_rgb + + def _ColorConversion_L_to_RGB888(self, image): + image_L = image.astype('uint8') + img = Image.fromarray(image_L).convert('RGB') + image_data = np.array(img).astype('uint8') + return image_data + + def _ColorConversion_RGB888_to_L(self, image): + image_rgb = image.astype('uint8') + img = Image.fromarray(image_rgb).convert('L') + image_data = np.array(img).astype('uint8') + return image_data + + def _ColorConversion_RGBA8888_to_RGB888(self, image): + assert (len(image.shape)==3) + assert (image.shape[2]==4) + return image[:,:,:3] + + def _ColorConversion_RGB888_to_RGBA8888(self, image): + assert (len(image.shape)==3) + assert (image.shape[2]==3) + imageA = np.concatenate((image, np.zeros((image.shape[0], image.shape[1], 1), dtype=np.uint8) ), axis=2) + return imageA diff --git a/kneron_preprocessing/funcs/Crop.py b/kneron_preprocessing/funcs/Crop.py new file mode 100644 index 0000000..3dcdb71 --- /dev/null +++ b/kneron_preprocessing/funcs/Crop.py @@ -0,0 +1,145 @@ +import numpy as np +from PIL import Image +from .utils import str2int, str2float, str2bool, pad_square_to_4 +from .utils_520 import round_up_n +from .Runner_base import Runner_base, Param_base + +class General(Param_base): + type = 'center' + align_w_to_4 = False + pad_square_to_4 = False + rounding_type = 0 + crop_w = 0 + crop_h = 0 + start_x = 0. + start_y = 0. + end_x = 0. + end_y = 0. 
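The RGB565 helpers above (here and in `dump_image` earlier in this diff) keep only the 5/6/5 most significant bits per channel, so the round trip back to RGB888 is lossy. A standalone NumPy sketch, independent of the `runner` class, illustrates the quantization:

```python
import numpy as np

def rgb888_to_rgb565(image):
    # Keep the 5/6/5 most significant bits per channel,
    # mirroring _ColorConversion_RGB888_to_RGB565.
    out = np.zeros_like(image)
    out[..., 0] = image[..., 0] >> 3
    out[..., 1] = image[..., 1] >> 2
    out[..., 2] = image[..., 2] >> 3
    return out

def rgb565_to_rgb888(image):
    # Inverse shift; the discarded low bits are gone for good.
    out = np.zeros_like(image)
    out[..., 0] = image[..., 0] << 3
    out[..., 1] = image[..., 1] << 2
    out[..., 2] = image[..., 2] << 3
    return out

pixel = np.array([[[255, 130, 7]]], dtype=np.uint8)
roundtrip = rgb565_to_rgb888(rgb888_to_rgb565(pixel))
print(roundtrip[0, 0])  # [248 128   0] -- low bits lost
```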
+ def update(self, **dic): + self.type = dic['type'] + self.align_w_to_4 = str2bool(dic['align_w_to_4']) + self.rounding_type = str2int(dic['rounding_type']) + self.crop_w = str2int(dic['crop_w']) + self.crop_h = str2int(dic['crop_h']) + self.start_x = str2float(dic['start_x']) + self.start_y = str2float(dic['start_y']) + self.end_x = str2float(dic['end_x']) + self.end_y = str2float(dic['end_y']) + + def __str__(self): + str_out = [ + ', type:',str(self.type), + ', align_w_to_4:',str(self.align_w_to_4), + ', pad_square_to_4:',str(self.pad_square_to_4), + ', crop_w:',str(self.crop_w), + ', crop_h:',str(self.crop_h), + ', start_x:',str(self.start_x), + ', start_y:',str(self.start_y), + ', end_x:',str(self.end_x), + ', end_y:',str(self.end_y)] + return(' '.join(str_out)) + +class runner(Runner_base): + ## overwrite the class in Runner_base + general = General() + + def __str__(self): + return('') + + def update(self, **kwargs): + ## + super().update(**kwargs) + + ## + if (self.general.start_x != self.general.end_x) and (self.general.start_y != self.general.end_y): + self.general.type = 'specific' + elif(self.general.type != 'specific'): + if self.general.crop_w == 0 or self.general.crop_h == 0: + self.general.crop_w = self.common.model_size[0] + self.general.crop_h = self.common.model_size[1] + assert(self.general.crop_w > 0) + assert(self.general.crop_h > 0) + assert(self.general.type.lower() in ['CENTER', 'Center', 'center', 'CORNER', 'Corner', 'corner']) + else: + assert(self.general.type == 'specific') + + def run(self, image_data): + ## init + img = Image.fromarray(image_data) + w, h = img.size + + ## get range + if self.general.type.lower() in ['CENTER', 'Center', 'center']: + x1, y1, x2, y2 = self._calcuate_xy_center(w, h) + elif self.general.type.lower() in ['CORNER', 'Corner', 'corner']: + x1, y1, x2, y2 = self._calcuate_xy_corner(w, h) + else: + x1 = self.general.start_x + y1 = self.general.start_y + x2 = self.general.end_x + y2 = self.general.end_y + 
assert( ((x1 != x2) and (y1 != y2)) ) + + ## rounding + if self.general.rounding_type == 0: + x1 = int(np.floor(x1)) + y1 = int(np.floor(y1)) + x2 = int(np.ceil(x2)) + y2 = int(np.ceil(y2)) + else: + x1 = int(round(x1)) + y1 = int(round(y1)) + x2 = int(round(x2)) + y2 = int(round(y2)) + + if self.general.align_w_to_4: + # x1 = (x1+1) &(~3) #//+2 + # x2 = (x2+2) &(~3) #//+1 + x1 = (x1+3) &(~3) #//+2 + left = w - x2 + left = (left+3) &(~3) + x2 = w - left + + ## pad_square_to_4 + if str2bool(self.general.pad_square_to_4): + x1,x2,y1,y2 = pad_square_to_4(x1,x2,y1,y2) + + # do crop + box = (x1,y1,x2,y2) + img = img.crop(box) + + # print info + if str2bool(self.common.print_info): + self.general.start_x = x1 + self.general.start_y = y1 + self.general.end_x = x2 + self.general.end_y = y2 + self.general.crop_w = x2 - x1 + self.general.crop_h = y2 - y1 + self.print_info() + + # output + image_data = np.array(img) + info = {} + info['box'] = box + + return image_data, info + + + ## protect fun + def _calcuate_xy_center(self, w, h): + x1 = w/2 - self.general.crop_w / 2 + y1 = h/2 - self.general.crop_h / 2 + x2 = w/2 + self.general.crop_w / 2 + y2 = h/2 + self.general.crop_h / 2 + return x1, y1, x2, y2 + + def _calcuate_xy_corner(self, _1, _2): + x1 = 0 + y1 = 0 + x2 = self.general.crop_w + y2 = self.general.crop_h + return x1, y1, x2, y2 + + def do_crop(self, image_data, startW, startH, endW, endH): + return image_data[startH:endH, startW:endW, :] diff --git a/kneron_preprocessing/funcs/Normalize.py b/kneron_preprocessing/funcs/Normalize.py new file mode 100644 index 0000000..0760fba --- /dev/null +++ b/kneron_preprocessing/funcs/Normalize.py @@ -0,0 +1,186 @@ +import numpy as np +from .utils import str2bool, str2int, str2float, clip_ary + +class runner(object): + def __init__(self): + self.set = { + 'general': { + 'print_info':'no', + 'model_size':[0,0], + 'numerical_type':'floating', + 'type': 'kneron' + }, + 'floating':{ + "scale": 1, + "bias": 0, + "mean": "", + "std": 
"", + }, + 'hw':{ + "radix":8, + "shift":"", + "sub":"" + } + } + return + + def update(self, **kwargs): + # + self.set.update(kwargs) + + # + if self.set['general']['numerical_type'] == '520': + if self.set['general']['type'].lower() in ['TF', 'Tf', 'tf']: + self.fun_normalize = self._chen_520 + self.shift = 7 - self.set['hw']['radix'] + self.sub = 128 + elif self.set['general']['type'].lower() in ['YOLO', 'Yolo', 'yolo']: + self.fun_normalize = self._chen_520 + self.shift = 8 - self.set['hw']['radix'] + self.sub = 0 + elif self.set['general']['type'].lower() in ['KNERON', 'Kneron', 'kneron']: + self.fun_normalize = self._chen_520 + self.shift = 8 - self.set['hw']['radix'] + self.sub = 128 + else: + self.fun_normalize = self._chen_520 + self.shift = 0 + self.sub = 0 + elif self.set['general']['numerical_type'] == '720': + self.fun_normalize = self._chen_720 + self.shift = 0 + self.sub = 0 + else: + if self.set['general']['type'].lower() in ['TORCH', 'Torch', 'torch']: + self.fun_normalize = self._normalize_torch + self.set['floating']['scale'] = 255. + self.set['floating']['mean'] = [0.485, 0.456, 0.406] + self.set['floating']['std'] = [0.229, 0.224, 0.225] + elif self.set['general']['type'].lower() in ['TF', 'Tf', 'tf']: + self.fun_normalize = self._normalize_tf + self.set['floating']['scale'] = 127.5 + self.set['floating']['bias'] = -1. + elif self.set['general']['type'].lower() in ['CAFFE', 'Caffe', 'caffe']: + self.fun_normalize = self._normalize_caffe + self.set['floating']['mean'] = [103.939, 116.779, 123.68] + elif self.set['general']['type'].lower() in ['YOLO', 'Yolo', 'yolo']: + self.fun_normalize = self._normalize_yolo + self.set['floating']['scale'] = 255. + elif self.set['general']['type'].lower() in ['KNERON', 'Kneron', 'kneron']: + self.fun_normalize = self._normalize_kneron + self.set['floating']['scale'] = 256. 
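For reference, the floating 'kneron' branch below maps [0, 255] to roughly [-0.5, 0.5] via x / 256 - 0.5, while the 520 branch approximates the same range in fixed point with a uint8 subtract and right shift. A minimal standalone sketch (assuming radix = 8, hence shift = 0; not the library's own API):

```python
import numpy as np

def normalize_kneron_float(x):
    # floating 'kneron' mode: scale = 256, bias = -0.5
    return x.astype('float') / 256.0 - 0.5

def normalize_kneron_520(x, radix=8):
    # 520 mode: sub = 128, shift = 8 - radix; the uint8 arithmetic wraps,
    # so negative values appear as two's-complement byte values
    x = (x.astype(np.int16) - 128).astype(np.uint8)
    return np.right_shift(x, 8 - radix)

x = np.array([0, 128, 255], dtype=np.uint8)
print(normalize_kneron_float(x))  # -0.5, 0.0, 0.49609375
print(normalize_kneron_520(x))    # 128, 0, 127 (i.e. -128, 0, 127 as signed bytes)
```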
+ self.set['floating']['bias'] = -0.5 + else: + self.fun_normalize = self._normalize_customized + self.set['floating']['scale'] = str2float(self.set['floating']['scale']) + self.set['floating']['bias'] = str2float(self.set['floating']['bias']) + if self.set['floating']['mean'] != None: + if len(self.set['floating']['mean']) != 3: + self.set['floating']['mean'] = None + if self.set['floating']['std'] != None: + if len(self.set['floating']['std']) != 3: + self.set['floating']['std'] = None + + + def print_info(self): + if self.set['general']['numerical_type'] == '520': + print("", + 'numerical_type', self.set['general']['numerical_type'], + ", type:", self.set['general']['type'], + ', shift:',self.shift, + ', sub:', self.sub) + else: + print("", + 'numerical_type', self.set['general']['numerical_type'], + ", type:", self.set['general']['type'], + ', scale:',self.set['floating']['scale'], + ', bias:', self.set['floating']['bias'], + ', mean:', self.set['floating']['mean'], + ', std:',self.set['floating']['std']) + + def run(self, image_data): + # print info + if str2bool(self.set['general']['print_info']): + self.print_info() + + # norm + image_data = self.fun_normalize(image_data) + + # output + info = {} + return image_data, info + + def _normalize_torch(self, x): + if len(x.shape) != 3: + return x + x = x.astype('float') + x = x / self.set['floating']['scale'] + x[..., 0] -= self.set['floating']['mean'][0] + x[..., 1] -= self.set['floating']['mean'][1] + x[..., 2] -= self.set['floating']['mean'][2] + x[..., 0] /= self.set['floating']['std'][0] + x[..., 1] /= self.set['floating']['std'][1] + x[..., 2] /= self.set['floating']['std'][2] + return x + + def _normalize_tf(self, x): + # print('_normalize_tf') + x = x.astype('float') + x = x / self.set['floating']['scale'] + x = x + self.set['floating']['bias'] + return x + + def _normalize_caffe(self, x): + if len(x.shape) != 3: + return x + x = x.astype('float') + x = x[..., ::-1] + x[..., 0] -= 
self.set['floating']['mean'][0] + x[..., 1] -= self.set['floating']['mean'][1] + x[..., 2] -= self.set['floating']['mean'][2] + return x + + def _normalize_yolo(self, x): + x = x.astype('float') + x = x / self.set['floating']['scale'] + return x + + def _normalize_kneron(self, x): + x = x.astype('float') + x = x / self.set['floating']['scale'] + x = x + self.set['floating']['bias'] + return x + + def _normalize_customized(self, x): + x = x.astype('float') + if self.set['floating']['scale'] != 0: + x = x / self.set['floating']['scale'] + x = x + self.set['floating']['bias'] + if self.set['floating']['mean'] is not None: + x[..., 0] -= self.set['floating']['mean'][0] + x[..., 1] -= self.set['floating']['mean'][1] + x[..., 2] -= self.set['floating']['mean'][2] + if self.set['floating']['std'] is not None: + x[..., 0] /= self.set['floating']['std'][0] + x[..., 1] /= self.set['floating']['std'][1] + x[..., 2] /= self.set['floating']['std'][2] + return x + + def _chen_520(self, x): + x = (x - self.sub).astype('uint8') + x = np.right_shift(x, self.shift).astype('uint8') + return x + + def _chen_720(self, x): + # both branches of the original if/else here were identical; since sub + # is always 0 on the 720 path, this is effectively a pass-through + x = x + np.array([[self.sub], [self.sub], [self.sub]]) + return x \ No newline at end of file diff --git a/kneron_preprocessing/funcs/Padding.py b/kneron_preprocessing/funcs/Padding.py new file mode 100644 index 0000000..e1af1c5 --- /dev/null +++ b/kneron_preprocessing/funcs/Padding.py @@ -0,0 +1,187 @@ +import numpy as np +from PIL import Image +from .utils import str2bool, str2int, str2float +from .Runner_base import Runner_base, Param_base + +class General(Param_base): + type = '' + pad_val = '' + padded_w = '' + padded_h = '' + pad_l = '' + pad_r = '' + pad_t = '' + pad_b = '' + padding_ch = 3 + padding_ch_type = 'RGB' + def update(self, **dic): + self.type = dic['type'] + self.pad_val = dic['pad_val'] + self.padded_w = str2int(dic['padded_w']) + self.padded_h = str2int(dic['padded_h']) + self.pad_l = str2int(dic['pad_l']) + self.pad_r = str2int(dic['pad_r']) + self.pad_t = str2int(dic['pad_t']) + self.pad_b = str2int(dic['pad_b']) + + def __str__(self): + str_out = [ + ', type:',str(self.type), + ', pad_val:',str(self.pad_val), + ', pad_l:',str(self.pad_l), + ', pad_r:',str(self.pad_r), + ', pad_t:',str(self.pad_t), + ', pad_b:',str(self.pad_b), + ', padding_ch:',str(self.padding_ch)] + return(' '.join(str_out)) + +class Hw(Param_base): + radix = 8 + normalize_type = 'floating' + def update(self, **dic): + self.radix = dic['radix'] + self.normalize_type = dic['normalize_type'] + + def __str__(self): + str_out = [ + ', radix:', str(self.radix), + ', normalize_type:',str(self.normalize_type)] + return(' '.join(str_out)) + + +class runner(Runner_base): + ## overwrite the class in Runner_base + general = General() + hw = Hw() + + def __str__(self): + return('') + + def update(self, **kwargs): + super().update(**kwargs) + + ## update pad type & pad length + if (self.general.pad_l != 0) or (self.general.pad_r != 0) or (self.general.pad_t != 0) or (self.general.pad_b != 0): + self.general.type = 'specific' + assert(self.general.pad_l >= 0) + assert(self.general.pad_r >= 0) + assert(self.general.pad_t >= 0) + assert(self.general.pad_b >= 0) + elif(self.general.type != 'specific'): + if self.general.padded_w == 0 or self.general.padded_h == 0: + self.general.padded_w = self.common.model_size[0] + self.general.padded_h = self.common.model_size[1] + assert(self.general.padded_w > 0) + assert(self.general.padded_h > 0) + assert(self.general.type.lower() in ['center', 'corner']) + else: + assert(self.general.type == 'specific') + + ## decide pad_val & padding ch + # if numerical_type is floating + if (self.common.numerical_type == 'floating'): + if 
self.general.pad_val != 'edge': + self.general.pad_val = str2float(self.general.pad_val) + self.general.padding_ch = 3 + self.general.padding_ch_type = 'RGB' + # if numerical_type is 520 or 720 + else: + if self.general.pad_val == '': + if self.hw.normalize_type.lower() in ['TF', 'Tf', 'tf']: + self.general.pad_val = np.uint8(-128 >> (7 - self.hw.radix)) + elif self.hw.normalize_type.lower() in ['YOLO', 'Yolo', 'yolo']: + self.general.pad_val = np.uint8(0 >> (8 - self.hw.radix)) + elif self.hw.normalize_type.lower() in ['KNERON', 'Kneron', 'kneron']: + self.general.pad_val = np.uint8(-128 >> (8 - self.hw.radix)) + else: + self.general.pad_val = np.uint8(0 >> (8 - self.hw.radix)) + else: + self.general.pad_val = str2int(self.general.pad_val) + self.general.padding_ch = 4 + self.general.padding_ch_type = 'RGBA' + + def run(self, image_data): + # init + shape = image_data.shape + w = shape[1] + h = shape[0] + if len(shape) < 3: + self.general.padding_ch = 1 + self.general.padding_ch_type = 'L' + else: + if shape[2] == 3 and self.general.padding_ch == 4: + image_data = np.concatenate((image_data, np.zeros((h, w, 1), dtype=np.uint8) ), axis=2) + + ## padding + if self.general.type.lower() in ['CENTER', 'Center', 'center']: + img_pad = self._padding_center(image_data, w, h) + elif self.general.type.lower() in ['CORNER', 'Corner', 'corner']: + img_pad = self._padding_corner(image_data, w, h) + else: + img_pad = self._padding_sp(image_data, w, h) + + # print info + if str2bool(self.common.print_info): + self.print_info() + + # output + info = {} + return img_pad, info + + ## protect fun + def _padding_center(self, img, ori_w, ori_h): + # img_pad = Image.new(self.general.padding_ch_type, (self.general.padded_w, self.general.padded_h), int(self.general.pad_val[0])) + # img = Image.fromarray(img) + # img_pad.paste(img, ((self.general.padded_w-ori_w)//2, (self.general.padded_h-ori_h)//2)) + # return img_pad + padH = self.general.padded_h - ori_h + padW = self.general.padded_w 
- ori_w + self.general.pad_t = padH // 2 + self.general.pad_b = (padH // 2) + (padH % 2) + self.general.pad_l = padW // 2 + self.general.pad_r = (padW // 2) + (padW % 2) + if self.general.pad_l < 0 or self.general.pad_r <0 or self.general.pad_t <0 or self.general.pad_b<0: + return img + img_pad = self._padding_sp(img,ori_w,ori_h) + return img_pad + + def _padding_corner(self, img, ori_w, ori_h): + # img_pad = Image.new(self.general.padding_ch_type, (self.general.padded_w, self.general.padded_h), self.general.pad_val) + # img_pad.paste(img, (0, 0)) + self.general.pad_l = 0 + self.general.pad_r = self.general.padded_w - ori_w + self.general.pad_t = 0 + self.general.pad_b = self.general.padded_h - ori_h + if self.general.pad_l < 0 or self.general.pad_r <0 or self.general.pad_t <0 or self.general.pad_b<0: + return img + img_pad = self._padding_sp(img,ori_w,ori_h) + return img_pad + + def _padding_sp(self, img, ori_w, ori_h): + # block_t = np.zeros((self.general.pad_t, self.general.pad_l + self.general.pad_r + ori_w, self.general.padding_ch), dtype=np.float) + # block_l = np.zeros((ori_h, self.general.pad_l, self.general.padding_ch), dtype=np.float) + # block_r = np.zeros((ori_h, self.general.pad_r, self.general.padding_ch), dtype=np.float) + # block_b = np.zeros((self.general.pad_b, self.general.pad_l + self.general.pad_r + ori_w, self.general.padding_ch), dtype=np.float) + # for i in range(self.general.padding_ch): + # block_t[:, :, i] = np.ones(block_t[:, :, i].shape, dtype=np.float) * self.general.pad_val + # block_l[:, :, i] = np.ones(block_l[:, :, i].shape, dtype=np.float) * self.general.pad_val + # block_r[:, :, i] = np.ones(block_r[:, :, i].shape, dtype=np.float) * self.general.pad_val + # block_b[:, :, i] = np.ones(block_b[:, :, i].shape, dtype=np.float) * self.general.pad_val + # padded_image_hor = np.concatenate((block_l, img, block_r), axis=1) + # padded_image = np.concatenate((block_t, padded_image_hor, block_b), axis=0) + # return padded_image + if 
self.general.padding_ch == 1: + pad_range = ( (self.general.pad_t, self.general.pad_b),(self.general.pad_l, self.general.pad_r) ) + else: + pad_range = ((self.general.pad_t, self.general.pad_b),(self.general.pad_l, self.general.pad_r),(0,0)) + + if isinstance(self.general.pad_val, str): + if self.general.pad_val == 'edge': + padded_image = np.pad(img, pad_range, mode="edge") + else: + padded_image = np.pad(img, pad_range, mode="constant",constant_values=0) + else: + padded_image = np.pad(img, pad_range, mode="constant",constant_values=self.general.pad_val) + + return padded_image + diff --git a/kneron_preprocessing/funcs/Resize.py b/kneron_preprocessing/funcs/Resize.py new file mode 100644 index 0000000..8e948b9 --- /dev/null +++ b/kneron_preprocessing/funcs/Resize.py @@ -0,0 +1,237 @@ +import numpy as np +import cv2 +from PIL import Image +from .utils import str2bool, str2int +from ctypes import c_float +from .Runner_base import Runner_base, Param_base + +class General(Param_base): + type = 'bilinear' + keep_ratio = True + zoom = True + calculate_ratio_using_CSim = True + resize_w = 0 + resize_h = 0 + resized_w = 0 + resized_h = 0 + def update(self, **dic): + self.type = dic['type'] + self.keep_ratio = str2bool(dic['keep_ratio']) + self.zoom = str2bool(dic['zoom']) + self.calculate_ratio_using_CSim = str2bool(dic['calculate_ratio_using_CSim']) + self.resize_w = str2int(dic['resize_w']) + self.resize_h = str2int(dic['resize_h']) + + def __str__(self): + str_out = [ + ', type:',str(self.type), + ', keep_ratio:',str(self.keep_ratio), + ', zoom:',str(self.zoom), + ', calculate_ratio_using_CSim:',str(self.calculate_ratio_using_CSim), + ', resize_w:',str(self.resize_w), + ', resize_h:',str(self.resize_h), + ', resized_w:',str(self.resized_w), + ', resized_h:',str(self.resized_h)] + return(' '.join(str_out)) + +class Hw(Param_base): + resize_bit = 12 + def update(self, **dic): + pass + + def __str__(self): + str_out = [ + ', resize_bit:',str(self.resize_bit)] + return(' 
'.join(str_out)) + +class runner(Runner_base): + ## overwrite the class in Runner_base + general = General() + hw = Hw() + + def __str__(self): + return('') + + def update(self, **kwargs): + super().update(**kwargs) + + ## if resize size has not been assigned, then it will take model size as resize size + if self.general.resize_w == 0 or self.general.resize_h == 0: + self.general.resize_w = self.common.model_size[0] + self.general.resize_h = self.common.model_size[1] + assert(self.general.resize_w > 0) + assert(self.general.resize_h > 0) + + ## + if self.common.numerical_type == '520': + self.general.type = 'fixed_520' + elif self.common.numerical_type == '720': + self.general.type = 'fixed_720' + assert(self.general.type.lower() in ['BILINEAR', 'Bilinear', 'bilinear', 'BICUBIC', 'Bicubic', 'bicubic', 'FIXED', 'Fixed', 'fixed', 'FIXED_520', 'Fixed_520', 'fixed_520', 'FIXED_720', 'Fixed_720', 'fixed_720','CV', 'cv', 'opencv', 'OpenCV', 'CV2', 'cv2']) + + + def run(self, image_data): + ## init + ori_w = image_data.shape[1] + ori_h = image_data.shape[0] + info = {} + + ## + if self.general.keep_ratio: + self.general.resized_w, self.general.resized_h = self.calcuate_scale_keep_ratio(self.general.resize_w,self.general.resize_h, ori_w, ori_h, self.general.calculate_ratio_using_CSim) + else: + self.general.resized_w = int(self.general.resize_w) + self.general.resized_h = int(self.general.resize_h) + assert(self.general.resized_w > 0) + assert(self.general.resized_h > 0) + + ## + if (self.general.resized_w > ori_w) or (self.general.resized_h > ori_h): + if not self.general.zoom: + info['size'] = (ori_w,ori_h) + if str2bool(self.common.print_info): + print('no resize') + self.print_info() + return image_data, info + + ## resize + if self.general.type.lower() in ['BILINEAR', 'Bilinear', 'bilinear']: + image_data = self.do_resize_bilinear(image_data, self.general.resized_w, self.general.resized_h) + elif self.general.type.lower() in ['BICUBIC', 'Bicubic', 'bicubic']: + 
image_data = self.do_resize_bicubic(image_data, self.general.resized_w, self.general.resized_h) + elif self.general.type.lower() in ['CV', 'cv', 'opencv', 'OpenCV', 'CV2', 'cv2']: + image_data = self.do_resize_cv2(image_data, self.general.resized_w, self.general.resized_h) + elif self.general.type.lower() in ['FIXED', 'Fixed', 'fixed', 'FIXED_520', 'Fixed_520', 'fixed_520', 'FIXED_720', 'Fixed_720', 'fixed_720']: + image_data = self.do_resize_fixed(image_data, self.general.resized_w, self.general.resized_h, self.hw.resize_bit, self.general.type) + + + # output + info['size'] = (self.general.resized_w, self.general.resized_h) + + # print info + if str2bool(self.common.print_info): + self.print_info() + + return image_data, info + + def calcuate_scale_keep_ratio(self, tar_w, tar_h, ori_w, ori_h, calculate_ratio_using_CSim): + if not calculate_ratio_using_CSim: + scale_w = tar_w * 1.0 / ori_w*1.0 + scale_h = tar_h * 1.0 / ori_h*1.0 + scale = scale_w if scale_w < scale_h else scale_h + new_w = int(round(ori_w * scale)) + new_h = int(round(ori_h * scale)) + return new_w, new_h + + ## calculate_ratio_using_CSim + scale_w = c_float(tar_w * 1.0 / (ori_w * 1.0)).value + scale_h = c_float(tar_h * 1.0 / (ori_h * 1.0)).value + scale_ratio = 0.0 + scale_target_w = 0 + scale_target_h = 0 + padH = 0 + padW = 0 + + bScaleW = True if scale_w < scale_h else False + if bScaleW: + scale_ratio = scale_w + scale_target_w = int(c_float(scale_ratio * ori_w + 0.5).value) + scale_target_h = int(c_float(scale_ratio * ori_h + 0.5).value) + assert (abs(scale_target_w - tar_w) <= 1), "Error: scale down width cannot meet expectation\n" + padH = tar_h - scale_target_h + padW = 0 + assert (padH >= 0), "Error: padH shouldn't be less than zero\n" + else: + scale_ratio = scale_h + scale_target_w = int(c_float(scale_ratio * ori_w + 0.5).value) + scale_target_h = int(c_float(scale_ratio * ori_h + 0.5).value) + assert (abs(scale_target_h - tar_h) <= 1), "Error: scale down height cannot meet 
expectation\n" + padW = tar_w - scale_target_w + padH = 0 + assert (padW >= 0), "Error: padW shouldn't be less than zero\n" + new_w = tar_w - padW + new_h = tar_h - padH + return new_w, new_h + + def do_resize_bilinear(self, image_data, resized_w, resized_h): + img = Image.fromarray(image_data) + img = img.resize((resized_w, resized_h), Image.BILINEAR) + image_data = np.array(img).astype('uint8') + return image_data + + def do_resize_bicubic(self, image_data, resized_w, resized_h): + img = Image.fromarray(image_data) + img = img.resize((resized_w, resized_h), Image.BICUBIC) + image_data = np.array(img).astype('uint8') + return image_data + + def do_resize_cv2(self, image_data, resized_w, resized_h): + image_data = cv2.resize(image_data, (resized_w, resized_h)) + image_data = np.array(image_data) + # image_data = np.array(image_data).astype('uint8') + return image_data + + def do_resize_fixed(self, image_data, resized_w, resized_h, resize_bit, type): + if len(image_data.shape) < 3: + m, n = image_data.shape + tmp = np.zeros((m,n,3), dtype=np.uint8) + tmp[:,:,0] = image_data + image_data = tmp + c = 3 + gray = True + else: + m, n, c = image_data.shape + gray = False + + resolution = 1 << resize_bit + + # Width + ratio = int(((n - 1) << resize_bit) / (resized_w - 1)) + ratio_cnt = 0 + src_x = 0 + resized_image_w = np.zeros((m, resized_w, c), dtype=np.uint8) + + for dst_x in range(resized_w): + while ratio_cnt > resolution: + ratio_cnt = ratio_cnt - resolution + src_x = src_x + 1 + mul1 = np.ones((m, c)) * (resolution - ratio_cnt) + mul2 = np.ones((m, c)) * ratio_cnt + resized_image_w[:, dst_x, :] = np.multiply(np.multiply( + image_data[:, src_x, :], mul1) + np.multiply(image_data[:, src_x + 1, :], mul2), 1/resolution) + ratio_cnt = ratio_cnt + ratio + + # Height + ratio = int(((m - 1) << resize_bit) / (resized_h - 1)) + ## NPU HW special case 2 , only on 520 + if type.lower() in ['FIXED_520', 'Fixed_520', 'fixed_520']: + if (((ratio * (resized_h - 1)) % 4096 == 0) and 
ratio != 4096): + ratio -= 1 + + ratio_cnt = 0 + src_x = 0 + resized_image = np.zeros( + (resized_h, resized_w, c), dtype=np.uint8) + for dst_x in range(resized_h): + while ratio_cnt > resolution: + ratio_cnt = ratio_cnt - resolution + src_x = src_x + 1 + + mul1 = np.ones((resized_w, c)) * (resolution - ratio_cnt) + mul2 = np.ones((resized_w, c)) * ratio_cnt + + ## NPU HW special case 1 , both on 520 / 720 + if (((dst_x > 0) and ratio_cnt == resolution) and (ratio != resolution)): + if type.lower() in ['FIXED_520', 'Fixed_520', 'fixed_520','FIXED_720', 'Fixed_720', 'fixed_720' ]: + resized_image[dst_x, :, :] = np.multiply(np.multiply( + resized_image_w[src_x+1, :, :], mul1) + np.multiply(resized_image_w[src_x + 2, :, :], mul2), 1/resolution) + else: + resized_image[dst_x, :, :] = np.multiply(np.multiply( + resized_image_w[src_x, :, :], mul1) + np.multiply(resized_image_w[src_x + 1, :, :], mul2), 1/resolution) + + ratio_cnt = ratio_cnt + ratio + + if gray: + resized_image = resized_image[:,:,0] + + return resized_image diff --git a/kneron_preprocessing/funcs/Rotate.py b/kneron_preprocessing/funcs/Rotate.py new file mode 100644 index 0000000..63f882f --- /dev/null +++ b/kneron_preprocessing/funcs/Rotate.py @@ -0,0 +1,45 @@ +import numpy as np +from .utils import str2bool, str2int + +class runner(object): + def __init__(self, *args, **kwargs): + self.set = { + 'operator': '', + "rotate_direction": 0, + + } + self.update(*args, **kwargs) + + def update(self, *args, **kwargs): + self.set.update(kwargs) + self.rotate_direction = str2int(self.set['rotate_direction']) + + # print info + if str2bool(self.set['b_print']): + self.print_info() + + def print_info(self): + print("", + 'rotate_direction', self.rotate_direction,) + + + def run(self, image_data): + image_data = self._rotate(image_data) + return image_data + + def _rotate(self,img): + if self.rotate_direction == 1 or self.rotate_direction == 2: + col, row, unit = img.shape + pInBuf = img.reshape((-1,1)) + 
# NOTE (fix): the original per-element triple loop indexed the flattened + # buffer with mixed-up row/column strides and was only correct for square + # images; np.rot90 performs the intended 90-degree rotation for any shape + # (the pInBuf buffer above is therefore no longer needed). + if self.rotate_direction == 1: + img = np.rot90(img, k=-1) # 90 degrees clockwise + else: + img = np.rot90(img, k=1) # 90 degrees counter-clockwise + + return img diff --git a/kneron_preprocessing/funcs/Runner_base.py b/kneron_preprocessing/funcs/Runner_base.py new file mode 100644 index 0000000..7bedbcf --- /dev/null +++ b/kneron_preprocessing/funcs/Runner_base.py @@ -0,0 +1,59 @@ +from abc import ABCMeta, abstractmethod + +class Param_base(object): + @abstractmethod + def update(self,**dic): + raise NotImplementedError("Must override") + + def load_dic(self, key, **dic): + if key in dic: + # actually store the value; the previous eval()-and-rebind version + # only rebound a local variable and had no effect + setattr(self, key, dic[key]) + + def __str__(self): + str_out = [] + return(' '.join(str_out)) + + +class Common(Param_base): + print_info = False + model_size = [0,0] + numerical_type = 'floating' + + def update(self, **dic): + self.print_info = dic['print_info'] + self.model_size = dic['model_size'] + self.numerical_type = dic['numerical_type'] + + def __str__(self): + str_out = ['numerical_type:',str(self.numerical_type)] + return(' '.join(str_out)) + +class Runner_base(metaclass=ABCMeta): + common = Common() + general = Param_base() + floating = Param_base() + hw = Param_base() + + def update(self, **kwargs): + ## update param + self.common.update(**kwargs['common']) + self.general.update(**kwargs['general']) + assert(self.common.numerical_type.lower() in ['floating', '520', '720']) + if (self.common.numerical_type == 'floating'): + if (self.floating.__class__.__name__ != 'Param_base'): + self.floating.update(**kwargs['floating']) + else: + if (self.hw.__class__.__name__ != 'Param_base'): + self.hw.update(**kwargs['hw']) + + def print_info(self): + if (self.common.numerical_type == 'floating'): + print(self, 
self.common, self.general, self.floating) + else: + print(self, self.common, self.general, self.hw) + + + + + diff --git a/kneron_preprocessing/funcs/__init__.py b/kneron_preprocessing/funcs/__init__.py new file mode 100644 index 0000000..0b46298 --- /dev/null +++ b/kneron_preprocessing/funcs/__init__.py @@ -0,0 +1,2 @@ +from . import ColorConversion, Padding, Resize, Crop, Normalize, Rotate + diff --git a/kneron_preprocessing/funcs/utils.py b/kneron_preprocessing/funcs/utils.py new file mode 100644 index 0000000..a1e509a --- /dev/null +++ b/kneron_preprocessing/funcs/utils.py @@ -0,0 +1,372 @@ +import numpy as np +from PIL import Image +import struct + +def pad_square_to_4(x_start, x_end, y_start, y_end): + w_int = x_end - x_start + h_int = y_end - y_start + pad = w_int - h_int + if pad > 0: + pad_s = (pad >> 1) &(~3) + pad_e = pad - pad_s + y_start -= pad_s + y_end += pad_e + else:#//pad <=0 + pad_s = -(((pad) >> 1) &(~3)) + pad_e = (-pad) - pad_s + x_start -= pad_s + x_end += pad_e + return x_start, x_end, y_start, y_end + +def str_fill(value): + if len(value) == 1: + value = "0" + value + elif len(value) == 0: + value = "00" + + return value + +def clip_ary(value): + list_v = [] + for i in range(len(value)): + v = value[i] % 256 + list_v.append(v) + + return list_v + +def str2bool(v): + if isinstance(v,bool): + return v + return v.lower() in ('TRUE', 'True', 'true', '1', 'T', 't', 'Y', 'YES', 'y', 'yes') + + +def str2int(s): + if s == "": + s = 0 + s = int(s) + return s + +def str2float(s): + if s == "": + s = 0 + s = float(s) + return s + +def clip(value, mini, maxi): + if value < mini: + result = mini + elif value > maxi: + result = maxi + else: + result = value + + return result + + +def clip_ary(value): + list_v = [] + for i in range(len(value)): + v = value[i] % 256 + list_v.append(v) + + return list_v + + +def signed_rounding(value, bit): + if value < 0: + value = value - (1 << (bit - 1)) + else: + value = value + (1 << (bit - 1)) + + return value + +def 
hex_loader(data_folder,**kwargs): + format_mode = kwargs['raw_img_fmt'] + src_h = kwargs['img_in_height'] + src_w = kwargs['img_in_width'] + + if format_mode in ['YUV444', 'yuv444', 'YCBCR444', 'YCbCr444', 'ycbcr444']: + output = hex_yuv444(data_folder,src_h,src_w) + elif format_mode in ['RGB565', 'rgb565']: + output = hex_rgb565(data_folder,src_h,src_w) + elif format_mode in ['YUV422', 'yuv422', 'YCBCR422', 'YCbCr422', 'ycbcr422']: + output = hex_yuv422(data_folder,src_h,src_w) + + return output + +def hex_rgb565(hex_folder,src_h,src_w): + pix_per_line = 8 + byte_per_line = 16 + + f = open(hex_folder) + pixel_r = [] + pixel_g = [] + pixel_b = [] + + # Ignore the first line + f.readline() + input_line = int((src_h * src_w)/pix_per_line) + for i in range(input_line): + readline = f.readline() + for j in range(int(byte_per_line/2)-1, -1, -1): + data1 = int(readline[(j * 4 + 0):(j * 4 + 2)], 16) + data0 = int(readline[(j * 4 + 2):(j * 4 + 4)], 16) + r = ((data1 & 0xf8) >> 3) + g = (((data0 & 0xe0) >> 5) + ((data1 & 0x7) << 3)) + b = (data0 & 0x1f) + pixel_r.append(r) + pixel_g.append(g) + pixel_b.append(b) + + ary_r = np.array(pixel_r, dtype=np.uint8) + ary_g = np.array(pixel_g, dtype=np.uint8) + ary_b = np.array(pixel_b, dtype=np.uint8) + output = np.concatenate((ary_r[:, None], ary_g[:, None], ary_b[:, None]), axis=1) + output = output.reshape((src_h, src_w, 3)) + + return output + +def hex_yuv444(hex_folder,src_h,src_w): + pix_per_line = 4 + byte_per_line = 16 + + f = open(hex_folder) + byte0 = [] + byte1 = [] + byte2 = [] + byte3 = [] + + # Ignore the first line + f.readline() + input_line = int((src_h * src_w)/pix_per_line) + for i in range(input_line): + readline = f.readline() + for j in range(byte_per_line-1, -1, -1): + data = int(readline[(j*2):(j*2+2)], 16) + if (j+1) % 4 == 0: + byte0.append(data) + elif (j+2) % 4 == 0: + byte1.append(data) + elif (j+3) % 4 == 0: + byte2.append(data) + elif (j+4) % 4 == 0: + byte3.append(data) + # ary_a = np.array(byte0, 
dtype=np.uint8) + ary_v = np.array(byte1, dtype=np.uint8) + ary_u = np.array(byte2, dtype=np.uint8) + ary_y = np.array(byte3, dtype=np.uint8) + output = np.concatenate((ary_y[:, None], ary_u[:, None], ary_v[:, None]), axis=1) + output = output.reshape((src_h, src_w, 3)) + + return output + +def hex_yuv422(hex_folder,src_h,src_w): + pix_per_line = 8 + byte_per_line = 16 + f = open(hex_folder) + pixel_y = [] + pixel_u = [] + pixel_v = [] + + # Ignore the first line + f.readline() + input_line = int((src_h * src_w)/pix_per_line) + for i in range(input_line): + readline = f.readline() + for j in range(int(byte_per_line/4)-1, -1, -1): + data3 = int(readline[(j * 8 + 0):(j * 8 + 2)], 16) + data2 = int(readline[(j * 8 + 2):(j * 8 + 4)], 16) + data1 = int(readline[(j * 8 + 4):(j * 8 + 6)], 16) + data0 = int(readline[(j * 8 + 6):(j * 8 + 8)], 16) + pixel_y.append(data3) + pixel_y.append(data1) + pixel_u.append(data2) + pixel_u.append(data2) + pixel_v.append(data0) + pixel_v.append(data0) + + ary_y = np.array(pixel_y, dtype=np.uint8) + ary_u = np.array(pixel_u, dtype=np.uint8) + ary_v = np.array(pixel_v, dtype=np.uint8) + output = np.concatenate((ary_y[:, None], ary_u[:, None], ary_v[:, None]), axis=1) + output = output.reshape((src_h, src_w, 3)) + + return output + +def bin_loader(data_folder,**kwargs): + format_mode = kwargs['raw_img_fmt'] + src_h = kwargs['img_in_height'] + src_w = kwargs['img_in_width'] + if format_mode in ['YUV','yuv','YUV444', 'yuv444', 'YCBCR','YCbCr','ycbcr','YCBCR444', 'YCbCr444', 'ycbcr444']: + output = bin_yuv444(data_folder,src_h,src_w) + elif format_mode in ['RGB565', 'rgb565']: + output = bin_rgb565(data_folder,src_h,src_w) + elif format_mode in ['NIR', 'nir','NIR888', 'nir888']: + output = bin_nir(data_folder,src_h,src_w) + elif format_mode in ['YUV422', 'yuv422', 'YCBCR422', 'YCbCr422', 'ycbcr422']: + output = bin_yuv422(data_folder,src_h,src_w) + elif format_mode in ['RGB888','rgb888']: + output = np.fromfile(data_folder, dtype='uint8') + 
output = output.reshape(src_h,src_w,3) + elif format_mode in ['RGBA8888','rgba8888', 'RGBA' , 'rgba']: + output_temp = np.fromfile(data_folder, dtype='uint8') + output_temp = output_temp.reshape(src_h,src_w,4) + output = output_temp[:,:,0:3] + + return output + +def bin_yuv444(in_img_path,src_h,src_w): + # load bin + struct_fmt = '1B' + struct_len = struct.calcsize(struct_fmt) + struct_unpack = struct.Struct(struct_fmt).unpack_from + + row = src_h + col = src_w + pixels = row*col + + raw = [] + with open(in_img_path, "rb") as f: + while True: + data = f.read(struct_len) + if not data: break + s = struct_unpack(data) + raw.append(s[0]) + + + raw = raw[:pixels*4] + + # + output = np.zeros((pixels * 3), dtype=np.uint8) + cnt = 0 + for i in range(0, pixels*4, 4): + #Y + output[cnt] = raw[i+3] + #U + cnt += 1 + output[cnt] = raw[i+2] + #V + cnt += 1 + output[cnt] = raw[i+1] + + cnt += 1 + + output = output.reshape((src_h,src_w,3)) + return output + +def bin_yuv422(in_img_path,src_h,src_w): + # load bin + struct_fmt = '1B' + struct_len = struct.calcsize(struct_fmt) + struct_unpack = struct.Struct(struct_fmt).unpack_from + + row = src_h + col = src_w + pixels = row*col + + raw = [] + with open(in_img_path, "rb") as f: + while True: + data = f.read(struct_len) + if not data: break + s = struct_unpack(data) + raw.append(s[0]) + + + raw = raw[:pixels*2] + + # + output = np.zeros((pixels * 3), dtype=np.uint8) + cnt = 0 + for i in range(0, pixels*2, 4): + #Y0 + output[cnt] = raw[i+3] + #U0 + cnt += 1 + output[cnt] = raw[i+2] + #V0 + cnt += 1 + output[cnt] = raw[i] + #Y1 + cnt += 1 + output[cnt] = raw[i+1] + #U1 + cnt += 1 + output[cnt] = raw[i+2] + #V1 + cnt += 1 + output[cnt] = raw[i] + + cnt += 1 + + output = output.reshape((src_h,src_w,3)) + return output + +def bin_rgb565(in_img_path,src_h,src_w): + # load bin + struct_fmt = '1B' + struct_len = struct.calcsize(struct_fmt) + struct_unpack = struct.Struct(struct_fmt).unpack_from + + row = src_h + col = src_w + pixels = 
row*col + + rgba565 = [] + with open(in_img_path, "rb") as f: + while True: + data = f.read(struct_len) + if not data: break + s = struct_unpack(data) + rgba565.append(s[0]) + + + rgba565 = rgba565[:pixels*2] + + # rgb565_bin to numpy_array + output = np.zeros((pixels * 3), dtype=np.uint8) + cnt = 0 + for i in range(0, pixels*2, 2): + temp = rgba565[i] + temp2 = rgba565[i+1] + #R-5 + output[cnt] = (temp2 >>3) + + #G-6 + cnt += 1 + output[cnt] = ((temp & 0xe0) >> 5) + ((temp2 & 0x07) << 3) + + #B-5 + cnt += 1 + output[cnt] = (temp & 0x1f) + + cnt += 1 + + output = output.reshape((src_h,src_w,3)) + return output + +def bin_nir(in_img_path,src_h,src_w): + # load bin + struct_fmt = '1B' + struct_len = struct.calcsize(struct_fmt) + struct_unpack = struct.Struct(struct_fmt).unpack_from + + nir = [] + with open(in_img_path, "rb") as f: + while True: + data = f.read(struct_len) + if not data: break + s = struct_unpack(data) + nir.append(s[0]) + + nir = nir[:src_h*src_w] + pixels = len(nir) + # nir_bin to numpy_array + output = np.zeros((len(nir) * 3), dtype=np.uint8) + for i in range(0, pixels): + output[i*3]=nir[i] + output[i*3+1]=nir[i] + output[i*3+2]=nir[i] + + output = output.reshape((src_h,src_w,3)) + return output diff --git a/kneron_preprocessing/funcs/utils_520.py b/kneron_preprocessing/funcs/utils_520.py new file mode 100644 index 0000000..27bd860 --- /dev/null +++ b/kneron_preprocessing/funcs/utils_520.py @@ -0,0 +1,50 @@ +import math + +def round_up_16(num): + return ((num + (16 - 1)) & ~(16 - 1)) + +def round_up_n(num, n): + if (num > 0): + temp = float(num) / n + return math.ceil(temp) * n + else: + return -math.ceil(float(-num) / n) * n + +def cal_img_row_offset(crop_num, pad_num, start_row, out_row, orig_row): + + scaled_img_row = int(out_row - (pad_num[1] + pad_num[3])) + if ((start_row - pad_num[1]) > 0): + img_str_row = int((start_row - pad_num[1])) + else: + img_str_row = 0 + valid_row = int(orig_row - (crop_num[1] + crop_num[3])) + img_str_row = 
int(valid_row * img_str_row / scaled_img_row) + return int(img_str_row + crop_num[1]) + +def get_pad_num(pad_num_orig, left, up, right, bottom): + pad_num = [0]*4 + for i in range(0,4): + pad_num[i] = pad_num_orig[i] + + if not (left): + pad_num[0] = 0 + if not (up): + pad_num[1] = 0 + if not (right): + pad_num[2] = 0 + if not (bottom): + pad_num[3] = 0 + + return pad_num + +def get_byte_per_pixel(raw_fmt): + # compare against lowercase names only; the original mixed-case lists + # could never match after .lower() (e.g. 'RGB' always fell through to -1) + fmt = raw_fmt.lower() + if fmt in ['rgb888', 'rgb']: + return 4 + elif fmt in ['yuv', 'yuv422']: + return 2 + elif fmt == 'rgb565': + return 2 + elif fmt in ['nir888', 'nir']: + return 1 + else: + return -1 \ No newline at end of file diff --git a/kneron_preprocessing/funcs/utils_720.py b/kneron_preprocessing/funcs/utils_720.py new file mode 100644 index 0000000..8d1a046 --- /dev/null +++ b/kneron_preprocessing/funcs/utils_720.py @@ -0,0 +1,42 @@ +import numpy as np +from PIL import Image + +def twos_complement(value): + # interpret a 16-bit word as a signed integer; this is equivalent to the + # original invert-and-add-one branches (whose >= 0xFFFF guard was dead code) + value = int(value) & 0xFFFF + if value & 0x8000: + value -= 0x10000 + return value + + +def twos_complement_pix(value): + h, _ = value.shape + for i in range(h): + value[i, 0] = twos_complement(value[i, 0]) + + return value + +def clip(value, mini, maxi): + if value < mini: + result = mini + elif value > maxi: + result = maxi + else: + result = value + + return result + +def clip_pix(value, mini, maxi): + h, _ = value.shape + for i in range(h): + value[i, 0] = clip(value[i, 0], mini, maxi) + + return value \ No newline at end of file diff --git a/mmseg/datasets/__init__.py b/mmseg/datasets/__init__.py index 5d42a11..fb8dbb2 100644 --- a/mmseg/datasets/__init__.py +++ b/mmseg/datasets/__init__.py @@ -18,6 +18,12 @@ from .pascal_context import 
PascalContextDataset, PascalContextDataset59 from .potsdam import PotsdamDataset from .stare import STAREDataset from .voc import PascalVOCDataset +from .golf_dataset import GolfDataset +from .golf7_dataset import Golf7Dataset +from .golf1_dataset import GrassOnlyDataset +from .golf4_dataset import Golf4Dataset +from .golf2_dataset import Golf2Dataset +from .golf8_dataset import Golf8Dataset __all__ = [ 'CustomDataset', 'build_dataloader', 'ConcatDataset', 'RepeatDataset', diff --git a/mmseg/datasets/golf1_dataset.py b/mmseg/datasets/golf1_dataset.py new file mode 100644 index 0000000..27d9597 --- /dev/null +++ b/mmseg/datasets/golf1_dataset.py @@ -0,0 +1,80 @@ +# Copyright (c) OpenMMLab. All rights reserved. +import os.path as osp +import mmcv +import numpy as np +from mmcv.utils import print_log +from PIL import Image + +from .builder import DATASETS +from .custom import CustomDataset + +@DATASETS.register_module() +class GrassOnlyDataset(CustomDataset): + """GrassOnlyDataset for semantic segmentation with only one valid class: grass.""" + + CLASSES = ('grass',) + + PALETTE = [ + [0, 128, 0], # grass - green + ] + + def __init__(self, + img_suffix='_leftImg8bit.png', + seg_map_suffix='_gtFine_labelIds.png', + **kwargs): + super(GrassOnlyDataset, self).__init__( + img_suffix=img_suffix, + seg_map_suffix=seg_map_suffix, + **kwargs) + + print("✅ [GrassOnlyDataset] initialization complete") + print(f" ➤ CLASSES: {self.CLASSES}") + print(f" ➤ PALETTE: {self.PALETTE}") + print(f" ➤ img_suffix: {img_suffix}") + print(f" ➤ seg_map_suffix: {seg_map_suffix}") + print(f" ➤ img_dir: {self.img_dir}") + print(f" ➤ ann_dir: {self.ann_dir}") + print(f" ➤ dataset length: {len(self)}") + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + output = Image.fromarray(result.astype(np.uint8)).convert('P') + palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + output.putpalette(palette) + output.save(png_filename) + result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into a directory (for evaluation or visualization).""" + return self.results2img(results, imgfile_prefix, indices) + + def evaluate(self, + results, + metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + print("🧪 [GrassOnlyDataset.evaluate] called") + print(f" ➤ current CLASSES: {self.CLASSES}") + print(f" ➤ evaluation metric: {metric}") + print(f" ➤ number of results: {len(results)}") + + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(GrassOnlyDataset, self).evaluate(results, metrics, logger) + + print(f" ➤ returned evaluation metrics: {list(eval_results.keys())}") + return eval_results diff --git a/mmseg/datasets/golf2_dataset.py b/mmseg/datasets/golf2_dataset.py new file mode 100644 index 0000000..a267074 --- /dev/null +++ b/mmseg/datasets/golf2_dataset.py @@ -0,0 +1,84 @@ +# Copyright (c) OpenMMLab. All rights reserved. +import os.path as osp +import mmcv +import numpy as np +from mmcv.utils import print_log +from PIL import Image + +from .builder import DATASETS +from .custom import CustomDataset + +@DATASETS.register_module() +class Golf2Dataset(CustomDataset): + """Golf2Dataset for semantic segmentation with 2 valid classes (ignore background).""" + + CLASSES = ( + 'grass', 'road' + ) + + PALETTE = [ + [0, 255, 0], # grass - green + [255, 165, 0], # road - orange + ] + + def __init__(self, + img_suffix='_leftImg8bit.png', + seg_map_suffix='_gtFine_labelIds.png', + **kwargs): + super(Golf2Dataset, self).__init__( + img_suffix=img_suffix, + seg_map_suffix=seg_map_suffix, + **kwargs) + + print("✅ [Golf2Dataset] initialization complete") + print(f" ➤ CLASSES: {self.CLASSES}") + print(f" ➤ PALETTE: {self.PALETTE}") + print(f" ➤ img_suffix: {img_suffix}") + print(f" ➤ seg_map_suffix: {seg_map_suffix}") + print(f" ➤ img_dir: {self.img_dir}") + print(f" ➤ ann_dir: {self.ann_dir}") + print(f" ➤ dataset length: {len(self)}") + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + output = Image.fromarray(result.astype(np.uint8)).convert('P') + palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + output.putpalette(palette) + output.save(png_filename) + result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into a directory (for evaluation or visualization).""" + return self.results2img(results, imgfile_prefix, indices) + + def evaluate(self, + results, + 
metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + + print("🧪 [Golf2Dataset.evaluate] 被呼叫") + print(f" ➤ 當前 CLASSES: {self.CLASSES}") + print(f" ➤ 評估 metric: {metric}") + print(f" ➤ 結果數量: {len(results)}") + + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(Golf2Dataset, self).evaluate(results, metrics, logger) + + print(f" ➤ 返回評估指標: {list(eval_results.keys())}") + return eval_results diff --git a/mmseg/datasets/golf4_dataset.py b/mmseg/datasets/golf4_dataset.py new file mode 100644 index 0000000..c68f689 --- /dev/null +++ b/mmseg/datasets/golf4_dataset.py @@ -0,0 +1,86 @@ +# Copyright (c) OpenMMLab. All rights reserved. +import os.path as osp +import mmcv +import numpy as np +from mmcv.utils import print_log +from PIL import Image + +from .builder import DATASETS +from .custom import CustomDataset + +@DATASETS.register_module() +class Golf4Dataset(CustomDataset): + """Golf4Dataset for semantic segmentation with 4 valid classes (ignore background).""" + + CLASSES = ( + 'car', 'grass', 'people', 'road' + ) + + PALETTE = [ + [0, 0, 128], # car - dark blue + [0, 255, 0], # grass - green + [255, 0, 0], # people - red + [255, 165, 0], # road - orange + ] + + def __init__(self, + img_suffix='_leftImg8bit.png', + seg_map_suffix='_gtFine_labelIds.png', + **kwargs): + super(Golf4Dataset, self).__init__( + img_suffix=img_suffix, + seg_map_suffix=seg_map_suffix, + **kwargs) + + print("✅ [Golf4Dataset] 初始化完成") + print(f" ➤ CLASSES: {self.CLASSES}") + print(f" ➤ PALETTE: {self.PALETTE}") + print(f" ➤ img_suffix: {img_suffix}") + print(f" ➤ seg_map_suffix: {seg_map_suffix}") + print(f" ➤ img_dir: {self.img_dir}") + print(f" ➤ ann_dir: {self.ann_dir}") + print(f" ➤ dataset length: {len(self)}") + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + 
mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + output = Image.fromarray(result.astype(np.uint8)).convert('P') + palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + output.putpalette(palette) + output.save(png_filename) + result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into dir (for evaluation or visualization).""" + return self.results2img(results, imgfile_prefix, indices) + + def evaluate(self, + results, + metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + + print("🧪 [Golf4Dataset.evaluate] 被呼叫") + print(f" ➤ 當前 CLASSES: {self.CLASSES}") + print(f" ➤ 評估 metric: {metric}") + print(f" ➤ 結果數量: {len(results)}") + + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(Golf4Dataset, self).evaluate(results, metrics, logger) + + print(f" ➤ 返回評估指標: {list(eval_results.keys())}") + return eval_results diff --git a/mmseg/datasets/golf7_dataset.py b/mmseg/datasets/golf7_dataset.py new file mode 100644 index 0000000..617e695 --- /dev/null +++ b/mmseg/datasets/golf7_dataset.py @@ -0,0 +1,90 @@ +# Copyright (c) OpenMMLab. All rights reserved. 
+import os.path as osp
+import mmcv
+import numpy as np
+from mmcv.utils import print_log
+from PIL import Image
+
+from .builder import DATASETS
+from .custom import CustomDataset
+
+@DATASETS.register_module()
+class Golf7Dataset(CustomDataset):
+    """Golf7Dataset for semantic segmentation with 7 valid classes (ignore background)."""
+
+    CLASSES = (
+        'bunker', 'car', 'grass',
+        'greenery', 'person', 'road', 'tree'
+    )
+
+    PALETTE = [
+        [128, 0, 0],    # bunker - dark red
+        [0, 0, 128],    # car - dark blue
+        [0, 128, 0],    # grass - green
+        [0, 255, 0],    # greenery - light green
+        [255, 0, 0],    # person - red
+        [255, 165, 0],  # road - orange
+        [0, 255, 255],  # tree - cyan
+    ]
+
+    def __init__(self,
+                 img_suffix='_leftImg8bit.png',
+                 seg_map_suffix='_gtFine_labelIds.png',
+                 **kwargs):
+        super(Golf7Dataset, self).__init__(
+            img_suffix=img_suffix,
+            seg_map_suffix=seg_map_suffix,
+            **kwargs)
+
+        print("✅ [Golf7Dataset] 初始化完成")
+        print(f" ➤ CLASSES: {self.CLASSES}")
+        print(f" ➤ PALETTE: {self.PALETTE}")
+        print(f" ➤ img_suffix: {img_suffix}")
+        print(f" ➤ seg_map_suffix: {seg_map_suffix}")
+        print(f" ➤ img_dir: {self.img_dir}")
+        print(f" ➤ ann_dir: {self.ann_dir}")
+        print(f" ➤ dataset length: {len(self)}")
+
+    def results2img(self, results, imgfile_prefix, indices=None):
+        """Write the segmentation results to images."""
+        if indices is None:
+            indices = list(range(len(self)))
+
+        mmcv.mkdir_or_exist(imgfile_prefix)
+        result_files = []
+        for result, idx in zip(results, indices):
+            filename = self.img_infos[idx]['filename']
+            basename = osp.splitext(osp.basename(filename))[0]
+            png_filename = osp.join(imgfile_prefix, f'{basename}.png')
+
+            output = Image.fromarray(result.astype(np.uint8)).convert('P')
+            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
+            for label_id, color in enumerate(self.PALETTE):
+                palette[label_id] = color
+            output.putpalette(palette)
+            output.save(png_filename)
+            result_files.append(png_filename)
+
+        return result_files
+
+    def
format_results(self, results, imgfile_prefix, indices=None):
+        """Format the results into dir (for evaluation or visualization)."""
+        return self.results2img(results, imgfile_prefix, indices)
+
+    def evaluate(self,
+                 results,
+                 metric='mIoU',
+                 logger=None,
+                 imgfile_prefix=None):
+        """Evaluate the results with the given metric."""
+
+        print("🧪 [Golf7Dataset.evaluate] 被呼叫")
+        print(f" ➤ 當前 CLASSES: {self.CLASSES}")
+        print(f" ➤ 評估 metric: {metric}")
+        print(f" ➤ 結果數量: {len(results)}")
+
+        metrics = metric if isinstance(metric, list) else [metric]
+        eval_results = super(Golf7Dataset, self).evaluate(results, metrics, logger)
+
+        print(f" ➤ 返回評估指標: {list(eval_results.keys())}")
+        return eval_results
diff --git a/mmseg/datasets/golf8_dataset.py b/mmseg/datasets/golf8_dataset.py
new file mode 100644
index 0000000..4d8cdf0
--- /dev/null
+++ b/mmseg/datasets/golf8_dataset.py
@@ -0,0 +1,92 @@
+# Copyright (c) OpenMMLab. All rights reserved.
+import os.path as osp
+import mmcv
+import numpy as np
+from mmcv.utils import print_log
+from PIL import Image
+
+from .builder import DATASETS
+from .custom import CustomDataset
+
+@DATASETS.register_module()
+class Golf8Dataset(CustomDataset):
+    """Golf8Dataset for semantic segmentation with 8 valid classes (ignore background)."""
+
+    CLASSES = (
+        'bunker', 'car', 'grass',
+        'greenery', 'person', 'pond',
+        'road', 'tree'
+    )
+
+    PALETTE = [
+        [128, 0, 0],    # bunker - dark red
+        [0, 0, 128],    # car - dark blue
+        [0, 128, 0],    # grass - green
+        [0, 255, 0],    # greenery - light green
+        [255, 0, 0],    # person - red
+        [0, 255, 255],  # pond - cyan
+        [255, 165, 0],  # road - orange
+        [0, 128, 128],  # tree - dark cyan
+    ]
+
+    def __init__(self,
+                 img_suffix='_leftImg8bit.png',
+                 seg_map_suffix='_gtFine_labelIds.png',
+                 **kwargs):
+        super(Golf8Dataset, self).__init__(
+            img_suffix=img_suffix,
+            seg_map_suffix=seg_map_suffix,
+            **kwargs)
+
+        print("✅ [Golf8Dataset] 初始化完成")
+        print(f" ➤ CLASSES: {self.CLASSES}")
+        print(f" ➤ PALETTE: 
{self.PALETTE}") + print(f" ➤ img_suffix: {img_suffix}") + print(f" ➤ seg_map_suffix: {seg_map_suffix}") + print(f" ➤ img_dir: {self.img_dir}") + print(f" ➤ ann_dir: {self.ann_dir}") + print(f" ➤ dataset length: {len(self)}") + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + output = Image.fromarray(result.astype(np.uint8)).convert('P') + palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + output.putpalette(palette) + output.save(png_filename) + result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into dir (for evaluation or visualization).""" + return self.results2img(results, imgfile_prefix, indices) + + def evaluate(self, + results, + metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + + print("🧪 [Golf8Dataset.evaluate] 被呼叫") + print(f" ➤ 當前 CLASSES: {self.CLASSES}") + print(f" ➤ 評估 metric: {metric}") + print(f" ➤ 結果數量: {len(results)}") + + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(Golf8Dataset, self).evaluate(results, metrics, logger) + + print(f" ➤ 返回評估指標: {list(eval_results.keys())}") + return eval_results diff --git a/mmseg/datasets/golf_dataset.py b/mmseg/datasets/golf_dataset.py new file mode 100644 index 0000000..b141663 --- /dev/null +++ b/mmseg/datasets/golf_dataset.py @@ -0,0 +1,96 @@ +# Copyright (c) OpenMMLab. All rights reserved. 
+import os.path as osp +import mmcv +import numpy as np +from mmcv.utils import print_log +from PIL import Image + +from .builder import DATASETS +from .custom import CustomDataset + +@DATASETS.register_module() +class GolfDataset(CustomDataset): + """GolfDataset for semantic segmentation with four classes: car, grass, people, and road.""" + + # ✅ 固定的類別與調色盤(不從 config 接收) + CLASSES = ('car', 'grass', 'people', 'road') + PALETTE = [ + [246, 14, 135], # car + [233, 81, 78], # grass + [220, 148, 21], # people + [207, 215, 220], # road + ] + + def __init__(self, + img_suffix='_leftImg8bit.png', + seg_map_suffix='_gtFine_labelIds.png', + **kwargs): + super(GolfDataset, self).__init__( + img_suffix=img_suffix, + seg_map_suffix=seg_map_suffix, + **kwargs) + + # ✅ DEBUG:初始化時印出 CLASSES 與 PALETTE + print("✅ [GolfDataset] 初始化完成") + print(f" ➤ CLASSES: {self.CLASSES}") + print(f" ➤ PALETTE: {self.PALETTE}") + print(f" ➤ img_suffix: {img_suffix}") + print(f" ➤ seg_map_suffix: {seg_map_suffix}") + print(f" ➤ img_dir: {self.img_dir}") + print(f" ➤ ann_dir: {self.ann_dir}") + print(f" ➤ dataset length: {len(self)}") + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + result = result.astype(np.uint8) + + # ✅ 把所有無效類別設為 255(當作背景處理) + result[result >= len(self.PALETTE)] = 255 + + output = Image.fromarray(result).convert('P') + + # ✅ 建立 palette,支援背景 class 255 為黑色 + palette = np.zeros((256, 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + palette[255] = [0, 0, 0] # 黑色背景 + + output.putpalette(palette) + output.save(png_filename) + 
result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into dir (for evaluation or visualization).""" + return self.results2img(results, imgfile_prefix, indices) + + def evaluate(self, + results, + metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + + # ✅ DEBUG:評估時印出目前 CLASSES 使用狀況 + print("🧪 [GolfDataset.evaluate] 被呼叫") + print(f" ➤ 當前 CLASSES: {self.CLASSES}") + print(f" ➤ 評估 metric: {metric}") + print(f" ➤ 結果數量: {len(results)}") + + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(GolfDataset, self).evaluate(results, metrics, logger) + + # ✅ DEBUG:印出最終的 eval_results keys + print(f" ➤ 返回評估指標: {list(eval_results.keys())}") + return eval_results diff --git a/mmseg/datasets/golf_dataset1.py b/mmseg/datasets/golf_dataset1.py new file mode 100644 index 0000000..4edbe8e --- /dev/null +++ b/mmseg/datasets/golf_dataset1.py @@ -0,0 +1,66 @@ +# Copyright (c) OpenMMLab. All rights reserved. 
+import os.path as osp + +import mmcv +import numpy as np +from mmcv.utils import print_log +from PIL import Image + +from .builder import DATASETS +from .custom import CustomDataset + + +@DATASETS.register_module() +class GolfDataset(CustomDataset): + """GolfDataset for custom semantic segmentation with two classes: road and grass.""" + + CLASSES = ('road', 'grass') + + PALETTE = [[128, 64, 128], # road + [0, 255, 0]] # grass + + def __init__(self, + img_suffix='_leftImg8bit.png', + seg_map_suffix='_gtFine_labelIds.png', + **kwargs): + super(GolfDataset, self).__init__( + img_suffix=img_suffix, + seg_map_suffix=seg_map_suffix, + **kwargs) + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + output = Image.fromarray(result.astype(np.uint8)).convert('P') + palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + output.putpalette(palette) + output.save(png_filename) + result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into dir (for evaluation or visualization).""" + result_files = self.results2img(results, imgfile_prefix, indices) + return result_files + + def evaluate(self, + results, + metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(GolfDataset, self).evaluate(results, metrics, logger) + return eval_results diff --git a/mmseg/datasets/golf_datasetcanuse.py 
b/mmseg/datasets/golf_datasetcanuse.py new file mode 100644 index 0000000..ebf2b36 --- /dev/null +++ b/mmseg/datasets/golf_datasetcanuse.py @@ -0,0 +1,87 @@ +# Copyright (c) OpenMMLab. All rights reserved. +import os.path as osp +import mmcv +import numpy as np +from mmcv.utils import print_log +from PIL import Image + +from .builder import DATASETS +from .custom import CustomDataset + +@DATASETS.register_module() +class GolfDataset(CustomDataset): + """GolfDataset for semantic segmentation with four classes: car, grass, people, and road.""" + + # ✅ 固定的類別與調色盤(不從 config 接收) + CLASSES = ('car', 'grass', 'people', 'road') + PALETTE = [ + [246, 14, 135], # car + [233, 81, 78], # grass + [220, 148, 21], # people + [207, 215, 220], # road + ] + + def __init__(self, + img_suffix='_leftImg8bit.png', + seg_map_suffix='_gtFine_labelIds.png', + **kwargs): + super(GolfDataset, self).__init__( + img_suffix=img_suffix, + seg_map_suffix=seg_map_suffix, + **kwargs) + + # ✅ DEBUG:初始化時印出 CLASSES 與 PALETTE + print("✅ [GolfDataset] 初始化完成") + print(f" ➤ CLASSES: {self.CLASSES}") + print(f" ➤ PALETTE: {self.PALETTE}") + print(f" ➤ img_suffix: {img_suffix}") + print(f" ➤ seg_map_suffix: {seg_map_suffix}") + print(f" ➤ img_dir: {self.img_dir}") + print(f" ➤ ann_dir: {self.ann_dir}") + print(f" ➤ dataset length: {len(self)}") + + def results2img(self, results, imgfile_prefix, indices=None): + """Write the segmentation results to images.""" + if indices is None: + indices = list(range(len(self))) + + mmcv.mkdir_or_exist(imgfile_prefix) + result_files = [] + for result, idx in zip(results, indices): + filename = self.img_infos[idx]['filename'] + basename = osp.splitext(osp.basename(filename))[0] + png_filename = osp.join(imgfile_prefix, f'{basename}.png') + + output = Image.fromarray(result.astype(np.uint8)).convert('P') + palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8) + for label_id, color in enumerate(self.PALETTE): + palette[label_id] = color + output.putpalette(palette) + 
output.save(png_filename) + result_files.append(png_filename) + + return result_files + + def format_results(self, results, imgfile_prefix, indices=None): + """Format the results into dir (for evaluation or visualization).""" + return self.results2img(results, imgfile_prefix, indices) + + def evaluate(self, + results, + metric='mIoU', + logger=None, + imgfile_prefix=None): + """Evaluate the results with the given metric.""" + + # ✅ DEBUG:評估時印出目前 CLASSES 使用狀況 + print("🧪 [GolfDataset.evaluate] 被呼叫") + print(f" ➤ 當前 CLASSES: {self.CLASSES}") + print(f" ➤ 評估 metric: {metric}") + print(f" ➤ 結果數量: {len(results)}") + + metrics = metric if isinstance(metric, list) else [metric] + eval_results = super(GolfDataset, self).evaluate(results, metrics, logger) + + # ✅ DEBUG:印出最終的 eval_results keys + print(f" ➤ 返回評估指標: {list(eval_results.keys())}") + return eval_results diff --git a/tools/check/check_lane_offset.py b/tools/check/check_lane_offset.py new file mode 100644 index 0000000..bbfeceb --- /dev/null +++ b/tools/check/check_lane_offset.py @@ -0,0 +1,70 @@ +import cv2 +import numpy as np + +# === 1. 檔案與參數設定 === +img_path = r'C:\Users\rd_de\kneronstdc\work_dirs\vis_results\good\pic_0441_jpg.rf.6e56eb8c0bed7f773fb447b9e217f779_leftImg8bit.png' + +# 色彩轉 label ID(RGB) +CLASS_RGB_TO_ID = { + (128, 64, 128): 3, # road(灰) + (0, 255, 0): 1, # grass(綠) + (255, 0, 255): 9, # background or sky(紫)可忽略 +} + +ROAD_ID = 3 +GRASS_ID = 1 + +# === 2. 讀圖並轉為 label mask === +bgr_img = cv2.imread(img_path) +rgb_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB) +height, width, _ = rgb_img.shape + +label_mask = np.zeros((height, width), dtype=np.uint8) +for rgb, label in CLASS_RGB_TO_ID.items(): + match = np.all(rgb_img == rgb, axis=-1) + label_mask[match] = label + +# === 3. 
分析畫面中下區域 ===
+y_start = int(height * 0.6)
+x_start = int(width * 0.4)
+x_end = int(width * 0.6)
+roi = label_mask[y_start:, x_start:x_end]
+
+total_pixels = roi.size
+road_pixels = np.sum(roi == ROAD_ID)
+grass_pixels = np.sum(roi == GRASS_ID)
+
+road_ratio = road_pixels / total_pixels
+grass_ratio = grass_pixels / total_pixels
+
+# === 4. 重心偏移分析 ===
+road_mask = (label_mask == ROAD_ID).astype(np.uint8)
+M = cv2.moments(road_mask)
+center_x = width // 2
+offset = 0
+cx = center_x
+if M["m00"] > 0:
+    cx = int(M["m10"] / M["m00"])
+    offset = cx - center_x
+
+# === 5. 結果輸出 ===
+print(f"🔍 中央 ROI - road比例: {road_ratio:.2f}, grass比例: {grass_ratio:.2f}")
+if road_ratio < 0.5:
+    print("⚠️ 偏離道路(ROI 中道路比例過少)")
+if grass_ratio > 0.3:
+    print("❗ 車輛壓到草地!")
+if abs(offset) > 40:
+    print(f"⚠️ 道路重心偏移:{offset} px")
+else:
+    print("✅ 道路重心正常")
+
+# === 6. 可視化 ===
+vis_img = bgr_img.copy()
+cv2.rectangle(vis_img, (x_start, y_start), (x_end, height), (0, 255, 255), 2)  # 黃色框 ROI
+cv2.line(vis_img, (center_x, 0), (center_x, height), (255, 0, 0), 2)  # 藍色中心線
+cv2.circle(vis_img, (cx, height // 2), 6, (0, 0, 255), -1)  # 紅色重心點
+
+# 輸出圖片
+save_path = r'C:\Users\rd_de\kneronstdc\work_dirs\vis_results\good\visual_check.png'
+cv2.imwrite(save_path, vis_img)
+print(f"✅ 分析圖儲存成功:{save_path}")
diff --git a/tools/check/checklatest.py b/tools/check/checklatest.py
new file mode 100644
index 0000000..5c71477
--- /dev/null
+++ b/tools/check/checklatest.py
@@ -0,0 +1,33 @@
+import torch
+
+def check_pth_num_classes(pth_path):
+    checkpoint = torch.load(pth_path, map_location='cpu')
+
+    if 'state_dict' not in checkpoint:
+        print("❌ 找不到 state_dict,這可能不是 MMSegmentation 的模型檔")
+        return
+
+    state_dict = checkpoint['state_dict']
+
+    # 找出 decode head 最後一層分類器的 weight tensor
+    # STDC/FCN 系列的分類層命名為 decode_head.conv_seg,部分 head 則用 decode_head.classifier,兩者皆接受
+    num_classes = None
+    for k in state_dict.keys():
+        if 'decode_head' in k and (k.endswith('conv_seg.weight') or k.endswith('classifier.weight')):
+            weight_tensor = state_dict[k]
+            num_classes = weight_tensor.shape[0]
+            print(f"✅ 檢查到類別數: {num_classes}")
+            break
+
+ if num_classes is None: + print("⚠️ 無法判斷類別數,可能模型架構非標準格式") + else: + if num_classes == 19: + print("⚠️ 這是 Cityscapes 預設模型 (19 類)") + elif num_classes == 4: + print("✅ 這是 GolfDataset 自訂模型 (4 類)") + else: + print("❓ 類別數異常,請確認訓練資料與 config 設定是否一致") + +if __name__ == '__main__': + pth_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.pth' + check_pth_num_classes(pth_path) diff --git a/tools/check/checkonnx.py b/tools/check/checkonnx.py new file mode 100644 index 0000000..8f4dee3 --- /dev/null +++ b/tools/check/checkonnx.py @@ -0,0 +1,32 @@ +import onnx + +def check_onnx_num_classes(onnx_path): + model = onnx.load(onnx_path) + graph = model.graph + + print(f"📂 模型路徑: {onnx_path}") + print(f"📦 輸出節點總數: {len(graph.output)}") + + for output in graph.output: + name = output.name + shape = [] + for dim in output.type.tensor_type.shape.dim: + if dim.dim_param: + shape.append(dim.dim_param) + else: + shape.append(dim.dim_value) + print(f"🔎 輸出節點名稱: {name}") + print(f" 輸出形狀: {shape}") + if len(shape) == 4: + num_classes = shape[1] + print(f"✅ 偵測到類別數: {num_classes}") + if num_classes == 19: + print("⚠️ 這是 Cityscapes 預設模型 (19 類)") + elif num_classes == 4: + print("✅ 這是你訓練的 GolfDataset 模型 (4 類)") + else: + print("❓ 類別數未知,請確認是否正確訓練/轉換模型") + +if __name__ == '__main__': + onnx_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.onnx' + check_onnx_num_classes(onnx_path) diff --git a/tools/check/list_pth_keys.py b/tools/check/list_pth_keys.py new file mode 100644 index 0000000..582e3c4 --- /dev/null +++ b/tools/check/list_pth_keys.py @@ -0,0 +1,29 @@ +import torch + +def check_num_classes_from_pth(pth_path): + checkpoint = torch.load(pth_path, map_location='cpu') + + if 'state_dict' not in checkpoint: + print("❌ 找不到 state_dict") + return + + state_dict = checkpoint['state_dict'] + weight_key = 'decode_head.conv_seg.weight' + + if weight_key in state_dict: + weight = state_dict[weight_key] + num_classes = weight.shape[0] + print(f"✅ 類別數: {num_classes}") + + if 
num_classes == 19: + print("⚠️ 這是 Cityscapes 模型 (19 類)") + elif num_classes == 4: + print("✅ 這是 GolfDataset 模型 (4 類)") + else: + print("❓ 非常規類別數,請自行確認資料與 config") + else: + print(f"❌ 找不到分類層: {weight_key}") + +if __name__ == '__main__': + pth_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.pth' + check_num_classes_from_pth(pth_path) diff --git a/tools/custom_infer.py b/tools/custom_infer.py new file mode 100644 index 0000000..7924e07 --- /dev/null +++ b/tools/custom_infer.py @@ -0,0 +1,36 @@ +import os +import torch +from mmseg.apis import inference_segmentor, init_segmentor + +def main(): + # 設定路徑 + config_file = 'configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py' + checkpoint_file = 'work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth' + img_dir = 'data/cityscapes/leftImg8bit/val' + out_dir = 'work_dirs/vis_results' + + # 初始化模型 + model = init_segmentor(config_file, checkpoint_file, device='cuda:0') + print('CLASSES:', model.CLASSES) + print('PALETTE:', model.PALETTE) + # 建立輸出資料夾 + os.makedirs(out_dir, exist_ok=True) + + # 找出所有圖片檔 + img_list = [] + for root, _, files in os.walk(img_dir): + for f in files: + if f.endswith('.png') or f.endswith('.jpg'): + img_list.append(os.path.join(root, f)) + + # 推論每一張圖片 + for img_path in img_list: + result = inference_segmentor(model, img_path) + filename = os.path.basename(img_path) + out_path = os.path.join(out_dir, filename) + model.show_result(img_path, result, out_file=out_path, opacity=0.5) + + print(f'✅ 推論完成,共處理 {len(img_list)} 張圖片,結果已輸出至:{out_dir}') + +if __name__ == '__main__': + main() diff --git a/tools/kneron/e2eonnx.py b/tools/kneron/e2eonnx.py new file mode 100644 index 0000000..defac88 --- /dev/null +++ b/tools/kneron/e2eonnx.py @@ -0,0 +1,61 @@ +import numpy as np +import ktc +import cv2 +from PIL import Image + +# === 1. 
前處理 + 推論 === +def run_e2e_simulation(img_path, onnx_path): + # 圖片前處理(724x362) + image = Image.open(img_path).convert("RGB") + image = image.resize((724, 362), Image.BILINEAR) + img_data = np.array(image) / 255.0 + img_data = np.transpose(img_data, (2, 0, 1)) # HWC → CHW + img_data = np.expand_dims(img_data, 0) # → NCHW (1,3,362,724) + + input_data = [img_data] + inf_results = ktc.kneron_inference( + input_data, + onnx_file=onnx_path, + input_names=["input"] + ) + + return inf_results + +# === 2. 呼叫推論 === +image_path = "test.png" +onnx_path = "work_dirs/meconfig8/latest_optimized.onnx" +result = run_e2e_simulation(image_path, onnx_path) + +print("推論結果 shape:", np.array(result).shape) # (1, 1, 7, 46, 91) + +# === 3. 提取與處理輸出 === +output_tensor = np.array(result)[0][0] # shape: (7, 46, 91) +pred_mask = np.argmax(output_tensor, axis=0) # shape: (46, 91) + +print("預測的 segmentation mask:") +print(pred_mask) + +# === 4. 上採樣回 724x362 === +upsampled_mask = cv2.resize(pred_mask.astype(np.uint8), (724, 362), interpolation=cv2.INTER_NEAREST) + +# === 5. 上色(簡單使用固定 palette)=== +# 根據你的 7 類別自行定義顏色 (BGR) +colors = np.array([ + [0, 0, 0], # 0: 背景 + [0, 255, 0], # 1: 草地 + [255, 0, 0], # 2: 車子 + [0, 0, 255], # 3: 人 + [255, 255, 0], # 4: 道路 + [255, 0, 255], # 5: 樹 + [0, 255, 255], # 6: 其他 +], dtype=np.uint8) + +colored_mask = colors[upsampled_mask] # shape: (362, 724, 3) +colored_mask = np.asarray(colored_mask, dtype=np.uint8) + +# === 6. 檢查並儲存 === +if colored_mask.shape != (362, 724, 3): + raise ValueError(f"❌ mask shape 不對: {colored_mask.shape}") + +cv2.imwrite("pred_mask_resized.png", colored_mask) +print("✅ 已儲存語意遮罩圖:pred_mask_resized.png") diff --git a/tools/kneron/onnx2nef720.py b/tools/kneron/onnx2nef720.py new file mode 100644 index 0000000..4c3512a --- /dev/null +++ b/tools/kneron/onnx2nef720.py @@ -0,0 +1,96 @@ +import ktc +import numpy as np +import os +import onnx +import shutil +from PIL import Image + +# === 1. 
設定路徑與參數 === +onnx_dir = 'work_dirs/meconfig8/' # 你的 onnx存放路徑 +onnx_path = os.path.join(onnx_dir, 'latest.onnx') +data_path = "data724362" # 測試圖片資料夾 +imgsz_w, imgsz_h = 724, 362 # 輸入圖片尺寸,跟ONNX模型要求一致 + +# === 2. 建立輸出資料夾 === +os.makedirs(onnx_dir, exist_ok=True) + +# === 3. 載入並優化 ONNX 模型 === +print("🔄 Loading and optimizing ONNX...") +m = onnx.load(onnx_path) +m = ktc.onnx_optimizer.onnx2onnx_flow(m) +opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx') +onnx.save(m, opt_onnx_path) + +# === 4. 檢查 ONNX 輸入尺寸是否符合要求 === +input_tensor = m.graph.input[0] +input_shape = [dim.dim_value for dim in input_tensor.type.tensor_type.shape.dim] +print(f"📏 ONNX Input Shape: {input_shape}") + +expected_shape = [1, 3, imgsz_h, imgsz_w] # (N, C, H, W) + +if input_shape != expected_shape: + raise ValueError(f"❌ Error: ONNX input shape {input_shape} does not match expected {expected_shape}.") + +# === 5. 設定 Kneron 模型編譯參數 === +print("📐 Configuring model for KL720...") +km = ktc.ModelConfig(20008, "0001", "720", onnx_model=m) + +# (可選)模型效能評估 +eval_result = km.evaluate() +print("\n📊 NPU Performance Evaluation:\n" + str(eval_result)) + +# === 6. 
準備圖片資料 === +print("🖼️ Preparing image data...") +files_found = [f for _, _, files in os.walk(data_path) + for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))] + +if not files_found: + raise FileNotFoundError(f"❌ No images found in {data_path}!") + +print(f"✅ Found {len(files_found)} images in {data_path}") + +input_name = input_tensor.name +img_list = [] + +for root, _, files in os.walk(data_path): + for f in files: + fullpath = os.path.join(root, f) + if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")): + continue + try: + img = Image.open(fullpath).convert("RGB") + img = Image.fromarray(np.array(img)[..., ::-1]) # RGB ➔ BGR + img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32).copy() + img_np = img_np / 256.0 - 0.5 + img_np = np.transpose(img_np, (2, 0, 1)) # HWC ➔ CHW + img_np = np.expand_dims(img_np, axis=0) # CHW ➔ NCHW + img_list.append(img_np) + print(f"✅ Processed: {fullpath}") + except Exception as e: + print(f"❌ Failed to process {fullpath}: {e}") + +if not img_list: + raise RuntimeError("❌ Error: No valid images were processed!") + +# === 7. BIE 量化分析 === +print("📦 Running fixed-point analysis (BIE)...") +bie_model_path = km.analysis({input_name: img_list}) +bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path)) +shutil.copy(bie_model_path, bie_save_path) + +if not os.path.exists(bie_save_path): + raise RuntimeError("❌ Error: BIE model was not generated!") + +print("✅ BIE model saved to:", bie_save_path) + +# === 8. 
編譯 NEF 模型 === +print("⚙️ Compiling NEF model...") +nef_model_path = ktc.compile([km]) +nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path)) +shutil.copy(nef_model_path, nef_save_path) + +if not os.path.exists(nef_save_path): + raise RuntimeError("❌ Error: NEF model was not generated!") + +print("✅ NEF compile done!") +print("📁 NEF file saved to:", nef_save_path) diff --git a/tools/kneron/onnx2nefSTDC630.py b/tools/kneron/onnx2nefSTDC630.py new file mode 100644 index 0000000..d228372 --- /dev/null +++ b/tools/kneron/onnx2nefSTDC630.py @@ -0,0 +1,103 @@ +import ktc +import numpy as np +import os +import onnx +import shutil +from PIL import Image +import kneronnxopt + +# === 1. 設定路徑與參數 === +onnx_dir = 'work_dirs/meconfig8/' +onnx_path = os.path.join(onnx_dir, 'latest.onnx') +optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx') +data_path = "data724362" +imgsz_w, imgsz_h = 724, 362 # STDC 預設解析度 + +# === 2. 建立輸出資料夾 === +os.makedirs(onnx_dir, exist_ok=True) + +# === 3. 優化 ONNX 模型(使用 kneronnxopt API)=== +print("⚙️ 使用 kneronnxopt 優化 ONNX...") +try: + model = onnx.load(onnx_path) + input_tensor = model.graph.input[0] + input_name = input_tensor.name + input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim] + print(f"📌 模型實際的 input name 是: {input_name}") + + model = kneronnxopt.optimize( + model, + duplicate_shared_weights=1, + skip_check=False, + skip_fuse_qkv=True + ) + onnx.save(model, optimized_path) +except Exception as e: + print(f"❌ 優化失敗: {e}") + exit(1) + +# === 4. 載入優化後的模型 === +print("🔄 載入優化後的 ONNX...") +m = onnx.load(optimized_path) + +# === 5. 設定 Kneron 模型編譯參數 === +print("📐 配置模型...") +km = ktc.ModelConfig(20008, "0001", "630", onnx_model=m) + +# (可選)模型效能評估 +eval_result = km.evaluate() +print("\n📊 NPU 效能評估:\n" + str(eval_result)) + +# === 6. 
處理輸入圖片 === +print("🖼️ 處理輸入圖片...") +input_name = m.graph.input[0].name +img_list = [] + +files_found = [f for _, _, files in os.walk(data_path) + for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))] + +if not files_found: + raise FileNotFoundError(f"❌ 找不到圖片於 {data_path}!") + +for root, _, files in os.walk(data_path): + for f in files: + fullpath = os.path.join(root, f) + if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")): + continue + try: + img = Image.open(fullpath).convert("RGB") + img = Image.fromarray(np.array(img)[..., ::-1]) # RGB ➝ BGR + img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32) + img_np = img_np / 256.0 - 0.5 + img_np = np.transpose(img_np, (2, 0, 1)) # HWC ➝ CHW + img_np = np.expand_dims(img_np, axis=0) # CHW ➝ NCHW + img_list.append(img_np) + print(f"✅ 處理成功: {fullpath}") + except Exception as e: + print(f"❌ 圖片處理失敗 {fullpath}: {e}") + +if not img_list: + raise RuntimeError("❌ 錯誤:沒有有效圖片被處理!") + +# === 7. BIE 分析(量化)=== +print("📦 執行固定點分析 BIE...") +bie_model_path = km.analysis({input_name: img_list}) +bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path)) +shutil.copy(bie_model_path, bie_save_path) + +if not os.path.exists(bie_save_path): + raise RuntimeError("❌ 無法產生 BIE 模型") + +print("✅ BIE 模型儲存於:", bie_save_path) + +# === 8. 
編譯 NEF 模型 === +print("⚙️ 編譯 NEF 模型...") +nef_model_path = ktc.compile([km]) +nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path)) +shutil.copy(nef_model_path, nef_save_path) + +if not os.path.exists(nef_save_path): + raise RuntimeError("❌ 無法產生 NEF 模型") + +print("✅ NEF 編譯完成") +print("📁 NEF 檔案儲存於:", nef_save_path) diff --git a/tools/kneron/onnx2nefSTDC630_2.py b/tools/kneron/onnx2nefSTDC630_2.py new file mode 100644 index 0000000..5ed3d45 --- /dev/null +++ b/tools/kneron/onnx2nefSTDC630_2.py @@ -0,0 +1,64 @@ +import os +import numpy as np +import onnx +import shutil +import cv2 +import ktc + +onnx_dir = 'work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/' +onnx_path = os.path.join(onnx_dir, 'latest.onnx') +data_path = "data512" +imgsz = (512, 512) + +os.makedirs(onnx_dir, exist_ok=True) + +print("🔄 Loading and optimizing ONNX...") +model = onnx.load(onnx_path) +model = ktc.onnx_optimizer.onnx2onnx_flow(model) +opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx') +onnx.save(model, opt_onnx_path) + +print("📐 Configuring model...") +km = ktc.ModelConfig(20008, "0001", "630", onnx_model=model) + +# Optional: performance check +print("\n📊 Evaluating model...") +print(km.evaluate()) + +input_name = model.graph.input[0].name +print("📥 ONNX input name:", input_name) + +img_list = [] +print("🖼️ Preprocessing images...") +for root, _, files in os.walk(data_path): + for fname in files: + if fname.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp')): + path = os.path.join(root, fname) + img = cv2.imread(path) + img = cv2.resize(img, imgsz) + img = img.astype(np.float32) / 256.0 - 0.5 + img = np.transpose(img, (2, 0, 1)) # HWC ➝ CHW + img = np.expand_dims(img, axis=0) # Add batch dim + img_list.append(img) + print("✅", path) + +if not img_list: + raise RuntimeError("❌ No images processed!") + +print("📦 Quantizing (BIE)...") +bie_path = km.analysis({input_name: img_list}) +bie_save = os.path.join(onnx_dir, os.path.basename(bie_path)) 
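Every compile script in this patch repeats the same preprocessing recipe before `km.analysis()`: resize to the model input size, keep BGR channel order, scale with x/256 − 0.5, and reorder HWC → NCHW. A dependency-light sketch of that transform (the nearest-neighbour index resize is a stand-in for `cv2.resize`/PIL so the example runs with NumPy alone; shapes match the scripts above):

```python
import numpy as np

def kneron_preprocess(img_bgr: np.ndarray, w: int, h: int) -> np.ndarray:
    """Map an HxWx3 uint8 BGR image to the (1, 3, h, w) float32 tensor
    fed to km.analysis() in the scripts above."""
    # Nearest-neighbour resize via index sampling (cv2.resize stand-in).
    ys = (np.arange(h) * img_bgr.shape[0] / h).astype(int)
    xs = (np.arange(w) * img_bgr.shape[1] / w).astype(int)
    resized = img_bgr[ys][:, xs]
    x = resized.astype(np.float32) / 256.0 - 0.5  # same scaling as the scripts
    x = np.transpose(x, (2, 0, 1))                # HWC -> CHW
    return np.expand_dims(x, axis=0)              # CHW -> NCHW

dummy = np.random.randint(0, 256, (362, 724, 3), dtype=np.uint8)
tensor = kneron_preprocess(dummy, 512, 512)
print(tensor.shape)  # (1, 3, 512, 512)
```

The 1/256 scale (rather than 1/255) matches Kneron's fixed-point-friendly convention used throughout these scripts, so calibration inputs land in [−0.5, 0.496].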
+shutil.copy(bie_path, bie_save) + +if not os.path.exists(bie_save): + raise RuntimeError("❌ BIE model not saved!") + +print("⚙️ Compiling NEF...") +nef_path = ktc.compile([km]) +nef_save = os.path.join(onnx_dir, os.path.basename(nef_path)) +shutil.copy(nef_path, nef_save) + +if not os.path.exists(nef_save): + raise RuntimeError("❌ NEF model not saved!") + +print("✅ Compile finished. NEF at:", nef_save) diff --git a/tools/kneron/onnx2nefSTDC630canuse.py b/tools/kneron/onnx2nefSTDC630canuse.py new file mode 100644 index 0000000..fd0a5da --- /dev/null +++ b/tools/kneron/onnx2nefSTDC630canuse.py @@ -0,0 +1,86 @@ +import ktc +import numpy as np +import os +import onnx +import shutil +from PIL import Image + +# === 1. 設定路徑與參數 === +onnx_dir = 'work_dirs/meconfig8/' +onnx_path = os.path.join(onnx_dir, 'latest.onnx') +data_path = "data724362" +imgsz_w, imgsz_h = 724, 362 # STDC 預設解析度 + +# === 2. 建立輸出資料夾 === +os.makedirs(onnx_dir, exist_ok=True) + +# === 3. 載入並優化 ONNX 模型 === +print("🔄 Loading and optimizing ONNX...") +m = onnx.load(onnx_path) +m = ktc.onnx_optimizer.onnx2onnx_flow(m) +opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx') +onnx.save(m, opt_onnx_path) + +# === 4. 設定 Kneron 模型編譯參數 === +print("📐 Configuring model...") +km = ktc.ModelConfig(20008, "0001", "630", onnx_model=m) + +# (可選)模型效能評估 +eval_result = km.evaluate() +print("\n📊 NPU Performance Evaluation:\n" + str(eval_result)) + +# === 5. 
準備圖片資料 === +print("🖼️ Preparing image data...") +files_found = [f for _, _, files in os.walk(data_path) + for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))] + +if not files_found: + raise FileNotFoundError(f"❌ No images found in {data_path}!") + +print(f"✅ Found {len(files_found)} images in {data_path}") + +input_name = m.graph.input[0].name +img_list = [] + +for root, _, files in os.walk(data_path): + for f in files: + fullpath = os.path.join(root, f) + if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")): + continue + try: + img = Image.open(fullpath).convert("RGB") + img = Image.fromarray(np.array(img)[..., ::-1]) # RGB ➝ BGR + img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32) + img_np = img_np / 256.0 - 0.5 + img_np = np.transpose(img_np, (2, 0, 1)) # HWC ➝ CHW + img_np = np.expand_dims(img_np, axis=0) # CHW ➝ NCHW (加上 batch 維度) + img_list.append(img_np) + print(f"✅ Processed: {fullpath}") + except Exception as e: + print(f"❌ Failed to process {fullpath}: {e}") + +if not img_list: + raise RuntimeError("❌ Error: No valid images were processed!") + +# === 6. BIE 量化分析 === +print("📦 Running fixed-point analysis...") +bie_model_path = km.analysis({input_name: img_list}) +bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path)) +shutil.copy(bie_model_path, bie_save_path) + +if not os.path.exists(bie_save_path): + raise RuntimeError("❌ Error: BIE model was not generated!") + +print("✅ BIE model saved to:", bie_save_path) + +# === 7. 
編譯 NEF 模型 === +print("⚙️ Compiling NEF model...") +nef_model_path = ktc.compile([km]) +nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path)) +shutil.copy(nef_model_path, nef_save_path) + +if not os.path.exists(nef_save_path): + raise RuntimeError("❌ Error: NEF model was not generated!") + +print("✅ NEF compile done!") +print("📁 NEF file saved to:", nef_save_path) diff --git a/tools/kneron/onnx2nef_stdc630_safe.py b/tools/kneron/onnx2nef_stdc630_safe.py new file mode 100644 index 0000000..9b3539f --- /dev/null +++ b/tools/kneron/onnx2nef_stdc630_safe.py @@ -0,0 +1,92 @@ +import ktc +import numpy as np +import os +import onnx +import shutil +from PIL import Image + +# === 1. 設定路徑與參數 === +onnx_dir = 'work_dirs/meconfig8/' +onnx_path = os.path.join(onnx_dir, 'latest.onnx') +optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx') +data_path = 'data724362' +imgsz_w, imgsz_h = 724, 362 # STDC 預設解析度 + +# === 2. 建立輸出資料夾 === +os.makedirs(onnx_dir, exist_ok=True) + +# === 3. 優化 ONNX 模型(使用 onnx2onnx_flow)=== +print("⚙️ 使用 onnx2onnx_flow 優化 ONNX...") +model = onnx.load(onnx_path) +model = ktc.onnx_optimizer.onnx2onnx_flow(model) +onnx.save(model, optimized_path) + +# === 4. 驗證輸入 Shape 是否正確 === +print("📏 驗證 ONNX Input Shape...") +input_tensor = model.graph.input[0] +input_name = input_tensor.name +input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim] +expected_shape = [1, 3, imgsz_h, imgsz_w] +print(f"📌 input_name: {input_name}") +print(f"📌 input_shape: {input_shape}") +if input_shape != expected_shape: + raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}") + +# === 5. 初始化模型編譯器 (for KL630) === +print("📐 配置模型 for KL630...") +km = ktc.ModelConfig(32769, "0001", "630", onnx_model=model) + +# (可選)效能分析 +eval_result = km.evaluate() +print("\n📊 NPU 效能分析:\n" + str(eval_result)) + +# === 6. 
圖片預處理 === +print("🖼️ 處理輸入圖片...") +img_list = [] +files_found = [f for _, _, files in os.walk(data_path) + for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))] +if not files_found: + raise FileNotFoundError(f"❌ 找不到圖片於 {data_path}") + +for root, _, files in os.walk(data_path): + for f in files: + if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")): + continue + fullpath = os.path.join(root, f) + try: + img = Image.open(fullpath).convert("RGB") + img = Image.fromarray(np.array(img)[..., ::-1]) # RGB ➝ BGR + img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32) + img_np = img_np / 256.0 - 0.5 + img_np = np.transpose(img_np, (2, 0, 1)) # HWC ➝ CHW + img_np = np.expand_dims(img_np, axis=0) # CHW ➝ NCHW + img_list.append(img_np) + print(f"✅ 處理成功: {fullpath}") + except Exception as e: + print(f"❌ 處理失敗 {fullpath}: {e}") + +if not img_list: + raise RuntimeError("❌ 沒有成功處理任何圖片!") + +# === 7. 執行 BIE 量化分析 === +print("📦 執行固定點分析 (BIE)...") +bie_model_path = km.analysis({input_name: img_list}) +bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path)) +shutil.copy(bie_model_path, bie_save_path) + +if not os.path.exists(bie_save_path): + raise RuntimeError("❌ 無法產生 BIE 模型") + +print("✅ BIE 模型儲存於:", bie_save_path) + +# === 8. 編譯 NEF 模型 === +print("⚙️ 編譯 NEF 模型 for KL630...") +nef_model_path = ktc.compile([km]) +nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path)) +shutil.copy(nef_model_path, nef_save_path) + +if not os.path.exists(nef_save_path): + raise RuntimeError("❌ 無法產生 NEF 模型") + +print("✅ NEF 編譯完成") +print("📁 NEF 檔案儲存於:", nef_save_path) diff --git a/tools/kneron/onnx2nef_stdc830_safe.py b/tools/kneron/onnx2nef_stdc830_safe.py new file mode 100644 index 0000000..c7b8b5d --- /dev/null +++ b/tools/kneron/onnx2nef_stdc830_safe.py @@ -0,0 +1,92 @@ +import ktc +import numpy as np +import os +import onnx +import shutil +from PIL import Image + +# === 1. 
Paths and parameters ===
+onnx_dir = 'work_dirs/meconfig8/'
+onnx_path = os.path.join(onnx_dir, 'latest.onnx')
+optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
+data_path = 'data724362'
+imgsz_w, imgsz_h = 724, 362  # STDC default resolution
+
+# === 2. Create the output directory ===
+os.makedirs(onnx_dir, exist_ok=True)
+
+# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
+print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
+model = onnx.load(onnx_path)
+model = ktc.onnx_optimizer.onnx2onnx_flow(model)
+onnx.save(model, optimized_path)
+
+# === 4. Verify the input shape ===
+print("📏 Checking ONNX input shape...")
+input_tensor = model.graph.input[0]
+input_name = input_tensor.name
+input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
+expected_shape = [1, 3, imgsz_h, imgsz_w]
+print(f"📌 input_name: {input_name}")
+print(f"📌 input_shape: {input_shape}")
+if input_shape != expected_shape:
+    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")
+
+# === 5. Initialize the model compiler (for KL730) ===
+print("📐 Configuring model for KL730...")
+km = ktc.ModelConfig(40000, "0001", "730", onnx_model=model)
+
+# (Optional) performance analysis
+eval_result = km.evaluate()
+print("\n📊 NPU performance analysis:\n" + str(eval_result))
+
+# === 6.
Image preprocessing ===
+print("🖼️ Preprocessing input images...")
+img_list = []
+files_found = [f for _, _, files in os.walk(data_path)
+               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
+if not files_found:
+    raise FileNotFoundError(f"❌ No images found in {data_path}")
+
+for root, _, files in os.walk(data_path):
+    for f in files:
+        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
+            continue
+        fullpath = os.path.join(root, f)
+        try:
+            img = Image.open(fullpath).convert("RGB")
+            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
+            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
+            img_np = img_np / 256.0 - 0.5
+            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
+            img_np = np.expand_dims(img_np, axis=0)  # CHW ➝ NCHW
+            img_list.append(img_np)
+            print(f"✅ Processed: {fullpath}")
+        except Exception as e:
+            print(f"❌ Failed to process {fullpath}: {e}")
+
+if not img_list:
+    raise RuntimeError("❌ No valid images were processed!")
+
+# === 7. Run BIE quantization analysis ===
+print("📦 Running fixed-point analysis (BIE)...")
+bie_model_path = km.analysis({input_name: img_list})
+bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
+shutil.copy(bie_model_path, bie_save_path)
+
+if not os.path.exists(bie_save_path):
+    raise RuntimeError("❌ BIE model was not generated!")
+
+print("✅ BIE model saved to:", bie_save_path)
+
+# === 8. Compile the NEF model ===
+print("⚙️ Compiling NEF model for KL730...")
+nef_model_path = ktc.compile([km])
+nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
+shutil.copy(nef_model_path, nef_save_path)
+
+if not os.path.exists(nef_save_path):
+    raise RuntimeError("❌ NEF model was not generated!")
+
+print("✅ NEF compile done")
+print("📁 NEF file saved to:", nef_save_path)
diff --git a/tools/kneron/onnxe2e.py b/tools/kneron/onnxe2e.py
new file mode 100644
index 0000000..bb1538f
--- /dev/null
+++ b/tools/kneron/onnxe2e.py
@@ -0,0 +1,47 @@
+import onnxruntime as ort
+import numpy as np
+from PIL import Image
+import cv2
+
+# === 1.
載入 ONNX 模型 === +onnx_path = "work_dirs/meconfig8/latest.onnx" +session = ort.InferenceSession(onnx_path, providers=['CPUExecutionProvider']) + +# === 2. 前處理輸入圖像(724x362) === +def preprocess(img_path): + image = Image.open(img_path).convert("RGB") + image = image.resize((724, 362), Image.BILINEAR) + img = np.array(image) / 255.0 + img = np.transpose(img, (2, 0, 1)) # HWC → CHW + img = np.expand_dims(img, 0).astype(np.float32) # (1, 3, 362, 724) + return img + +img_path = "test.png" +input_tensor = preprocess(img_path) + +# === 3. 執行推論 === +input_name = session.get_inputs()[0].name +output = session.run(None, {input_name: input_tensor}) # list of np.array + +# === 4. 後處理 + 預測 Mask === +output_tensor = output[0][0] # shape: (num_classes, H, W) +pred_mask = np.argmax(output_tensor, axis=0).astype(np.uint8) # (H, W) + +# === 5. 可視化結果 === +colors = [ + [128, 0, 0], # 0: bunker + [0, 0, 128], # 1: car + [0, 128, 0], # 2: grass + [0, 255, 0], # 3: greenery + [255, 0, 0], # 4: person + [255, 165, 0], # 5: road + [0, 255, 255], # 6: tree +] + +color_mask = np.zeros((pred_mask.shape[0], pred_mask.shape[1], 3), dtype=np.uint8) +for cls_id, color in enumerate(colors): + color_mask[pred_mask == cls_id] = color + +# 儲存可視化圖片 +cv2.imwrite("onnx_pred_mask.png", color_mask) +print("✅ 預測結果已儲存為:onnx_pred_mask.png") diff --git a/tools/kneron/test.py b/tools/kneron/test.py new file mode 100644 index 0000000..9b3539f --- /dev/null +++ b/tools/kneron/test.py @@ -0,0 +1,92 @@ +import ktc +import numpy as np +import os +import onnx +import shutil +from PIL import Image + +# === 1. 設定路徑與參數 === +onnx_dir = 'work_dirs/meconfig8/' +onnx_path = os.path.join(onnx_dir, 'latest.onnx') +optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx') +data_path = 'data724362' +imgsz_w, imgsz_h = 724, 362 # STDC 預設解析度 + +# === 2. 建立輸出資料夾 === +os.makedirs(onnx_dir, exist_ok=True) + +# === 3. 
優化 ONNX 模型(使用 onnx2onnx_flow)=== +print("⚙️ 使用 onnx2onnx_flow 優化 ONNX...") +model = onnx.load(onnx_path) +model = ktc.onnx_optimizer.onnx2onnx_flow(model) +onnx.save(model, optimized_path) + +# === 4. 驗證輸入 Shape 是否正確 === +print("📏 驗證 ONNX Input Shape...") +input_tensor = model.graph.input[0] +input_name = input_tensor.name +input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim] +expected_shape = [1, 3, imgsz_h, imgsz_w] +print(f"📌 input_name: {input_name}") +print(f"📌 input_shape: {input_shape}") +if input_shape != expected_shape: + raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}") + +# === 5. 初始化模型編譯器 (for KL630) === +print("📐 配置模型 for KL630...") +km = ktc.ModelConfig(32769, "0001", "630", onnx_model=model) + +# (可選)效能分析 +eval_result = km.evaluate() +print("\n📊 NPU 效能分析:\n" + str(eval_result)) + +# === 6. 圖片預處理 === +print("🖼️ 處理輸入圖片...") +img_list = [] +files_found = [f for _, _, files in os.walk(data_path) + for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))] +if not files_found: + raise FileNotFoundError(f"❌ 找不到圖片於 {data_path}") + +for root, _, files in os.walk(data_path): + for f in files: + if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")): + continue + fullpath = os.path.join(root, f) + try: + img = Image.open(fullpath).convert("RGB") + img = Image.fromarray(np.array(img)[..., ::-1]) # RGB ➝ BGR + img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32) + img_np = img_np / 256.0 - 0.5 + img_np = np.transpose(img_np, (2, 0, 1)) # HWC ➝ CHW + img_np = np.expand_dims(img_np, axis=0) # CHW ➝ NCHW + img_list.append(img_np) + print(f"✅ 處理成功: {fullpath}") + except Exception as e: + print(f"❌ 處理失敗 {fullpath}: {e}") + +if not img_list: + raise RuntimeError("❌ 沒有成功處理任何圖片!") + +# === 7. 
Run BIE quantization analysis ===
+print("📦 Running fixed-point analysis (BIE)...")
+bie_model_path = km.analysis({input_name: img_list})
+bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
+shutil.copy(bie_model_path, bie_save_path)
+
+if not os.path.exists(bie_save_path):
+    raise RuntimeError("❌ BIE model was not generated")
+
+print("✅ BIE model saved to:", bie_save_path)
+
+# === 8. Compile the NEF model ===
+print("⚙️ Compiling NEF model for KL630...")
+nef_model_path = ktc.compile([km])
+nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
+shutil.copy(nef_model_path, nef_save_path)
+
+if not os.path.exists(nef_save_path):
+    raise RuntimeError("❌ NEF model was not generated")
+
+print("✅ NEF compile done")
+print("📁 NEF file saved to:", nef_save_path)
diff --git a/tools/kneron/test_onnx_dummy.py b/tools/kneron/test_onnx_dummy.py
new file mode 100644
index 0000000..88ee9f6
--- /dev/null
+++ b/tools/kneron/test_onnx_dummy.py
@@ -0,0 +1,24 @@
+import onnxruntime as ort
+import numpy as np
+
+# ✅ Model path
+onnx_path = r"C:\Users\rd_de\kneron-mmsegmentation\work_dirs\kn_stdc1_in1k-pre_512x1024_80k_cityscapes\latest.onnx"
+
+# Create the ONNX Runtime session
+session = ort.InferenceSession(onnx_path)
+
+# Print model input information
+input_name = session.get_inputs()[0].name
+input_shape = session.get_inputs()[0].shape
+print(f"✅ Input name: {input_name}")
+print(f"✅ Input shape: {input_shape}")
+
+# Build a dummy input (float32, shape = [1, 3, 512, 1024])
+dummy_input = np.random.rand(1, 3, 512, 1024).astype(np.float32)
+
+# Run inference
+outputs = session.run(None, {input_name: dummy_input})
+
+# Show model output information
+for i, output in enumerate(outputs):
+    print(f"✅ Output {i}: shape = {output.shape}, dtype = {output.dtype}")
diff --git a/tools/optimize_onnx_kneron.py b/tools/optimize_onnx_kneron.py
new file mode 100644
index 0000000..351e697
--- /dev/null
+++ b/tools/optimize_onnx_kneron.py
@@ -0,0 +1,43 @@
+import os
+import sys
+import onnx
+
+# === Add the optimizer_scripts module path (this file lives in tools/) ===
+current_dir = os.path.dirname(os.path.abspath(__file__))
+sys.path.insert(0, current_dir)  # optimizer_scripts/ sits next to this script
+ +from optimizer_scripts.pytorch_exported_onnx_preprocess import torch_exported_onnx_flow + +def main(): + # === 設定路徑 === + onnx_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig8\latest.onnx' + optimized_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig8\latest_optimized.onnx' + + if not os.path.exists(onnx_path): + print(f'❌ 找不到 ONNX 檔案: {onnx_path}') + return + + # === 載入 ONNX 模型 === + print(f'🔄 載入 ONNX: {onnx_path}') + m = onnx.load(onnx_path) + + # === 修正 ir_version(避免 opset11 時報錯)=== + if m.ir_version == 7: + print(f'⚠️ 調整 ir_version 7 → 6(相容性修正)') + m.ir_version = 6 + + # === 執行 Kneron 優化流程 === + print('⚙️ 執行 Kneron 優化 flow...') + try: + m = torch_exported_onnx_flow(m, disable_fuse_bn=False) + except Exception as e: + print(f'❌ 優化失敗: {type(e).__name__} → {e}') + return + + # === 儲存結果 === + os.makedirs(os.path.dirname(optimized_path), exist_ok=True) + onnx.save(m, optimized_path) + print(f'✅ 已儲存最佳化 ONNX: {optimized_path}') + +if __name__ == '__main__': + main() diff --git a/tools/optimizer_scripts/tools/other.py b/tools/optimizer_scripts/tools/other.py index b003fbb..da08a94 100644 --- a/tools/optimizer_scripts/tools/other.py +++ b/tools/optimizer_scripts/tools/other.py @@ -328,6 +328,15 @@ def topological_sort(g): if in_degree[node_name] == 0: to_add.append(node_name) del in_degree[node_name] + # deal with initializers (weights/biases) + for initializer in g.initializer: + init_name = initializer.name + for node_name in output_nodes[init_name]: + if node_name in in_degree: + in_degree[node_name] -= 1 + if in_degree[node_name] == 0: + to_add.append(node_name) + del in_degree[node_name] # main sort loop sorted_nodes = [] while to_add: diff --git a/tools/pytorch2onnx_kneron13.py b/tools/pytorch2onnx_kneron13.py new file mode 100644 index 0000000..da523f1 --- /dev/null +++ b/tools/pytorch2onnx_kneron13.py @@ -0,0 +1,242 @@ +# All modification made by Kneron Corp.: Copyright (c) 2022 Kneron Corp. +# Copyright (c) OpenMMLab. All rights reserved. 
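The `topological_sort` patch in `tools/other.py` above exists because initializers (weights/biases) feed nodes just like graph inputs do; if their edges never decrement the in-degree counters, consumers of constant tensors are never released into the ready queue. A dependency-free sketch of the same Kahn-style sort (function and tensor names here are illustrative, not the optimizer's API):

```python
from collections import deque

def topo_sort(nodes, graph_inputs, initializers):
    """nodes: list of (name, [input_tensor_names], [output_tensor_names])."""
    in_degree = {}
    consumers = {}  # tensor name -> names of nodes consuming it
    for name, ins, outs in nodes:
        in_degree[name] = len(ins)
        for t in ins:
            consumers.setdefault(t, []).append(name)
    ready = deque()
    # Edges from graph inputs AND initializers have no producing node,
    # so they must be discounted up front -- the patch adds the
    # initializer half of this loop.
    for t in list(graph_inputs) + list(initializers):
        for n in consumers.get(t, []):
            in_degree[n] -= 1
            if in_degree[n] == 0:
                ready.append(n)
    outputs_of = {name: outs for name, _, outs in nodes}
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for t in outputs_of[n]:
            for m in consumers.get(t, []):
                in_degree[m] -= 1
                if in_degree[m] == 0:
                    ready.append(m)
    return order

# Conv consumes the graph input 'x' and the initializer 'w';
# without discounting 'w', Conv would never become ready.
nodes = [("Conv", ["x", "w"], ["y"]), ("Relu", ["y"], ["z"])]
print(topo_sort(nodes, ["x"], ["w"]))  # ['Conv', 'Relu']
```

Without the initializer loop, `in_degree["Conv"]` would stay at 1 forever and the sort would return an incomplete node list, which is exactly the failure mode the patch fixes.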
+import argparse +import warnings +import os +import onnx +import mmcv +import numpy as np +import onnxruntime as rt +import torch +from mmcv import DictAction +from mmcv.onnx import register_extra_symbolics +from mmcv.runner import load_checkpoint +from torch import nn + +from mmseg.apis import show_result_pyplot +from mmseg.apis.inference import LoadImage +from mmseg.datasets.pipelines import Compose +from mmseg.models import build_segmentor + +from optimizer_scripts.tools import other +from optimizer_scripts.pytorch_exported_onnx_preprocess import torch_exported_onnx_flow + +torch.manual_seed(3) + + +def _parse_normalize_cfg(test_pipeline): + transforms = None + for pipeline in test_pipeline: + if 'transforms' in pipeline: + transforms = pipeline['transforms'] + break + assert transforms is not None, 'Failed to find `transforms`' + norm_config_li = [_ for _ in transforms if _['type'] == 'Normalize'] + assert len(norm_config_li) == 1, '`norm_config` should only have one' + return norm_config_li[0] + + +def _convert_batchnorm(module): + module_output = module + if isinstance(module, torch.nn.SyncBatchNorm): + module_output = torch.nn.BatchNorm2d( + module.num_features, module.eps, + module.momentum, module.affine, module.track_running_stats) + if module.affine: + module_output.weight.data = module.weight.data.clone().detach() + module_output.bias.data = module.bias.data.clone().detach() + module_output.weight.requires_grad = module.weight.requires_grad + module_output.bias.requires_grad = module.bias.requires_grad + module_output.running_mean = module.running_mean + module_output.running_var = module.running_var + module_output.num_batches_tracked = module.num_batches_tracked + for name, child in module.named_children(): + module_output.add_module(name, _convert_batchnorm(child)) + del module + return module_output + + +def _demo_mm_inputs(input_shape): + (N, C, H, W) = input_shape + rng = np.random.RandomState(0) + img = torch.FloatTensor(rng.rand(*input_shape)) 
+ return img + + +def _prepare_input_img(img_path, test_pipeline, shape=None): + if shape is not None: + test_pipeline[1]['img_scale'] = (shape[1], shape[0]) + test_pipeline[1]['transforms'][0]['keep_ratio'] = False + test_pipeline = [LoadImage()] + test_pipeline[1:] + test_pipeline = Compose(test_pipeline) + data = dict(img=img_path) + data = test_pipeline(data) + img = torch.FloatTensor(data['img']).unsqueeze_(0) + return img + + +def pytorch2onnx(model, img, norm_cfg=None, opset_version=13, show=False, output_file='tmp.onnx', verify=False): + model.cpu().eval() + + if isinstance(model.decode_head, nn.ModuleList): + num_classes = model.decode_head[-1].num_classes + else: + num_classes = model.decode_head.num_classes + + model.forward = model.forward_dummy + origin_forward = model.forward + + register_extra_symbolics(opset_version) + with torch.no_grad(): + torch.onnx.export( + model, img, output_file, + input_names=['input'], + output_names=['output'], + export_params=True, + keep_initializers_as_inputs=False, + verbose=show, + opset_version=opset_version, + dynamic_axes=None) + print(f'Successfully exported ONNX model: {output_file} (opset_version={opset_version})') + + model.forward = origin_forward + + # NOTE: optimize onnx + m = onnx.load(output_file) + if opset_version == 11: + m.ir_version = 6 + m = torch_exported_onnx_flow(m, disable_fuse_bn=False) + onnx.save(m, output_file) + print(f'{output_file} optimized by KNERON successfully.') + + if verify: + onnx_model = onnx.load(output_file) + onnx.checker.check_model(onnx_model) + + with torch.no_grad(): + pytorch_result = model(img).numpy() + + input_all = [node.name for node in onnx_model.graph.input] + input_initializer = [node.name for node in onnx_model.graph.initializer] + net_feed_input = list(set(input_all) - set(input_initializer)) + assert len(net_feed_input) == 1 + sess = rt.InferenceSession(output_file, providers=['CPUExecutionProvider']) + onnx_result = sess.run(None, {net_feed_input[0]: 
img.detach().numpy()})[0] + + if show: + import cv2 + img_show = img[0][:3, ...].permute(1, 2, 0) * 255 + img_show = img_show.detach().numpy().astype(np.uint8) + ori_shape = img_show.shape[:2] + + onnx_result_ = onnx_result[0].argmax(0) + onnx_result_ = cv2.resize(onnx_result_.astype(np.uint8), (ori_shape[1], ori_shape[0])) + show_result_pyplot(model, img_show, (onnx_result_, ), palette=model.PALETTE, + block=False, title='ONNXRuntime', opacity=0.5) + + pytorch_result_ = pytorch_result.squeeze().argmax(0) + pytorch_result_ = cv2.resize(pytorch_result_.astype(np.uint8), (ori_shape[1], ori_shape[0])) + show_result_pyplot(model, img_show, (pytorch_result_, ), title='PyTorch', + palette=model.PALETTE, opacity=0.5) + + np.testing.assert_allclose( + pytorch_result.astype(np.float32) / num_classes, + onnx_result.astype(np.float32) / num_classes, + rtol=1e-5, + atol=1e-5, + err_msg='The outputs are different between Pytorch and ONNX') + print('The outputs are same between Pytorch and ONNX.') + + if norm_cfg is not None: + print("Prepending BatchNorm layer to ONNX as data normalization...") + mean = norm_cfg['mean'] + std = norm_cfg['std'] + i_n = m.graph.input[0] + if (i_n.type.tensor_type.shape.dim[1].dim_value != len(mean) or + i_n.type.tensor_type.shape.dim[1].dim_value != len(std)): + raise ValueError(f"--pixel-bias-value ({mean}) and --pixel-scale-value ({std}) should match input dimension.") + norm_bn_bias = [-1 * cm / cs + 128. 
/ cs for cm, cs in zip(mean, std)] + norm_bn_scale = [1 / cs for cs in std] + other.add_bias_scale_bn_after(m.graph, i_n.name, norm_bn_bias, norm_bn_scale) + m = other.polish_model(m) + bn_outf = os.path.splitext(output_file)[0] + "_bn_prepended.onnx" + onnx.save(m, bn_outf) + print(f"BN-Prepended ONNX saved to {bn_outf}") + + return + + +def parse_args(): + parser = argparse.ArgumentParser(description='Convert MMSeg to ONNX') + parser.add_argument('config', help='test config file path') + parser.add_argument('--checkpoint', help='checkpoint file', default=None) + parser.add_argument('--input-img', type=str, help='Images for input', default=None) + parser.add_argument('--show', action='store_true', help='show onnx graph and segmentation results') + parser.add_argument('--verify', action='store_true', help='verify the onnx model') + parser.add_argument('--output-file', type=str, default='tmp.onnx') + parser.add_argument('--opset-version', type=int, default=13) # default opset=13 + parser.add_argument('--shape', type=int, nargs='+', default=None, help='input image height and width.') + parser.add_argument('--cfg-options', nargs='+', action=DictAction, help='Override config options.') + parser.add_argument('--normalization-in-onnx', action='store_true', help='Prepend BN for normalization.') + args = parser.parse_args() + return args + + +if __name__ == '__main__': + args = parse_args() + + if args.opset_version < 11: + raise ValueError(f"Only opset_version >=11 is supported (got {args.opset_version}).") + + cfg = mmcv.Config.fromfile(args.config) + if args.cfg_options is not None: + cfg.merge_from_dict(args.cfg_options) + cfg.model.pretrained = None + + test_mode = cfg.model.test_cfg.mode + if args.shape is None: + if test_mode == 'slide': + crop_size = cfg.model.test_cfg['crop_size'] + input_shape = (1, 3, crop_size[1], crop_size[0]) + else: + img_scale = cfg.test_pipeline[1]['img_scale'] + input_shape = (1, 3, img_scale[1], img_scale[0]) + else: + if test_mode == 
'slide': + warnings.warn("Shape assignment for slide-mode models may cause unexpected results.") + if len(args.shape) == 1: + input_shape = (1, 3, args.shape[0], args.shape[0]) + elif len(args.shape) == 2: + input_shape = (1, 3) + tuple(args.shape) + else: + raise ValueError('Invalid input shape') + + cfg.model.train_cfg = None + segmentor = build_segmentor(cfg.model, train_cfg=None, test_cfg=cfg.get('test_cfg')) + segmentor = _convert_batchnorm(segmentor) + + if args.checkpoint: + checkpoint = load_checkpoint(segmentor, args.checkpoint, map_location='cpu') + segmentor.CLASSES = checkpoint['meta']['CLASSES'] + segmentor.PALETTE = checkpoint['meta']['PALETTE'] + + if args.input_img is not None: + preprocess_shape = (input_shape[2], input_shape[3]) + img = _prepare_input_img(args.input_img, cfg.data.test.pipeline, shape=preprocess_shape) + else: + img = _demo_mm_inputs(input_shape) + + if args.normalization_in_onnx: + norm_cfg = _parse_normalize_cfg(cfg.test_pipeline) + else: + norm_cfg = None + + pytorch2onnx( + segmentor, + img, + norm_cfg=norm_cfg, + opset_version=args.opset_version, + show=args.show, + output_file=args.output_file, + verify=args.verify, + ) diff --git a/tools/yolov5_preprocess.py b/tools/yolov5_preprocess.py new file mode 100644 index 0000000..ca6800e --- /dev/null +++ b/tools/yolov5_preprocess.py @@ -0,0 +1,161 @@ +# coding: utf-8 +import torch +import cv2 +import numpy as np +import math +import time +import kneron_preprocessing + +kneron_preprocessing.API.set_default_as_520() +torch.backends.cudnn.deterministic = True +img_formats = ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.tiff', '.dng'] +def make_divisible(x, divisor): + # Returns x evenly divisble by divisor + return math.ceil(x / divisor) * divisor + +def check_img_size(img_size, s=32): + # Verify img_size is a multiple of stride s + new_size = make_divisible(img_size, int(s)) # ceil gs-multiple + if new_size != img_size: + print('WARNING: --img-size %g must be multiple of max stride %g, 
updating to %g' % (img_size, s, new_size)) + return new_size + +def letterbox_ori(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True): + # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232 + shape = img.shape[:2] # current shape [height, width] + if isinstance(new_shape, int): + new_shape = (new_shape, new_shape) + + # Scale ratio (new / old) + r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) + if not scaleup: # only scale down, do not scale up (for better test mAP) + r = min(r, 1.0) + + # Compute padding + ratio = r, r # width, height ratios + new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) # width, height + dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding + + dw /= 2 # divide padding into 2 sides + dh /= 2 + + if shape[::-1] != new_unpad: # resize + img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR) + #img = kneron_preprocessing.API.resize(img,size=new_unpad, keep_ratio = False) + + top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) + left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) + # top, bottom = int(0), int(round(dh + 0.1)) + # left, right = int(0), int(round(dw + 0.1)) + img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border + #img = kneron_preprocessing.API.pad(img, left, right, top, bottom, 0) + + return img, ratio, (dw, dh) + +def letterbox(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True): + # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232 + shape = img.shape[:2] # current shape [height, width] + if isinstance(new_shape, int): + new_shape = (new_shape, new_shape) + + # Scale ratio (new / old) + r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) + if not scaleup: # only scale down, do not scale up (for better test mAP) + r = min(r, 1.0) + + # Compute padding + 
ratio = r, r # width, height ratios + new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) # width, height + dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding + + # dw /= 2 # divide padding into 2 sides + # dh /= 2 + + if shape[::-1] != new_unpad: # resize + #img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR) + img = kneron_preprocessing.API.resize(img,size=new_unpad, keep_ratio = False) + + # top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) + # left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) + top, bottom = int(0), int(round(dh + 0.1)) + left, right = int(0), int(round(dw + 0.1)) + #img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border + img = kneron_preprocessing.API.pad(img, left, right, top, bottom, 0) + + return img, ratio, (dw, dh) + +def letterbox_test(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True): + + ratio = 1.0, 1.0 + dw, dh = 0, 0 + img = kneron_preprocessing.API.resize(img, size=(480, 256), keep_ratio=False, type='bilinear') + return img, ratio, (dw, dh) + +def LoadImages(path,img_size): #_rgb # for inference + if isinstance(path, str): + img0 = cv2.imread(path) # BGR + else: + img0 = path # BGR + + # Padded resize + img = letterbox(img0, new_shape=img_size)[0] + # Convert + img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 + img = np.ascontiguousarray(img) + return img, img0 + +def LoadImages_yyy(path,img_size): #_yyy # for inference + if isinstance(path, str): + img0 = cv2.imread(path) # BGR + else: + img0 = path # BGR + + yvu = cv2.cvtColor(img0, cv2.COLOR_BGR2YCrCb) + y, v, u = cv2.split(yvu) + img0 = np.stack((y,)*3, axis=-1) + + # Padded resize + img = letterbox(img0, new_shape=img_size)[0] + + # Convert + img = img[:, :, ::-1].transpose(2, 0, 1) # BGR to RGB, to 3x416x416 + img = np.ascontiguousarray(img) + return img, img0 + +def LoadImages_yuv420(path,img_size): #_yuv420 
# for inference
+    if isinstance(path, str):
+        img0 = cv2.imread(path)  # BGR
+    else:
+        img0 = path  # BGR
+    img_h, img_w = img0.shape[:2]
+    img_h = (img_h // 2) * 2
+    img_w = (img_w // 2) * 2
+    img = img0[:img_h, :img_w, :]
+    yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV_I420)
+    img0 = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)  # yuv420
+
+
+    # Padded resize
+    img = letterbox(img0, new_shape=img_size)[0]
+
+    # Convert
+    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, to 3x416x416
+    img = np.ascontiguousarray(img)
+    return img, img0
+
+def Yolov5_preprocess(image_path, device, imgsz_h, imgsz_w):
+    model_stride_max = 32
+    imgsz_h = check_img_size(imgsz_h, s=model_stride_max)  # check img_size
+    imgsz_w = check_img_size(imgsz_w, s=model_stride_max)  # check img_size
+    img, im0 = LoadImages(image_path, img_size=(imgsz_h, imgsz_w))
+    img = kneron_preprocessing.API.norm(img)  # path1
+    # print('img', img.shape)
+    img = torch.from_numpy(img).to(device)  # path1, path2
+    # img = img.float()  # uint8 to fp16/32  # path2
+    # img /= 255.0  # (alt: /256.0 - 0.5) 0 - 255 to -0.5 - 0.5  # path2
+
+    if img.ndimension() == 3:
+        img = img.unsqueeze(0)
+
+    return img, im0
diff --git a/使用手冊.txt b/使用手冊.txt
new file mode 100644
index 0000000..6310ab2
--- /dev/null
+++ b/使用手冊.txt
@@ -0,0 +1,57 @@
+Environment setup:
+# Create and activate the conda environment
+conda create -n stdc_golface python=3.8 -y
+conda activate stdc_golface
+
+# Install PyTorch with the matching CUDA 11.3 build
+conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch -y
+
+# Install the matching mmcv-full version
+pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html
+
+# Install the kneronstdc project
+cd kneronstdc
+pip install -e .
+
+# Install common utility packages
+pip install opencv-python tqdm matplotlib cityscapesscripts
+
+# Install the yapf formatter (pinned version)
+pip install yapf==0.31.0
+--------------------------------------------------------------------------------------
+Data:
+When exporting the dataset from Roboflow, select the format:
+
+Semantic Segmentation Masks
+
+Use the seg2city.py script to convert the Roboflow export into Cityscapes format
+
+The Cityscapes sample data can be used as a reference
+
+Place the converted data in the data/cityscapes folder
+
+(cityscapes is the default dataset name for training)
+--------------------------------------------------------------------------------------
+Training the model:
+Activate the freshly installed env, open a cmd shell, and cd into the kneronstdc directory
+Training command:
+python tools/train.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py
+
+Test command:
+python tools/test.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth --show-dir work_dirs/vis_results
+------------------------------------------------------------------------------------
+Mount the project folder into the toolchain container:
+docker run --rm -it -v $(wslpath -u 'C:\Users\rd_de\kneronstdc'):/workspace/kneronstdc kneron/toolchain:latest
+
+ONNX conversion command:
+python tools/pytorch2onnx_kneron.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py --checkpoint work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth --output-file work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.onnx --verify
+
+Copy the NEF file out of the container to the host:
+docker cp f78594411e1b:/data1/kneron_flow/models_630.nef "C:\Users\rd_de\kneronstdc\work_dirs\nef\models_630.nef"
+---------------------------------------------------------------------------------------
+# Inside the container (opencv-python needs libgl1):
+pip install opencv-python
+RUN apt update && apt install -y libgl1
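For reference, the stride-alignment rule that `check_img_size` in tools/yolov5_preprocess.py enforces can be reproduced with plain arithmetic. This is a standalone sketch mirroring `make_divisible` from that file, not repository code:

```python
import math

def make_divisible(x, divisor):
    # Round x up to the nearest multiple of divisor,
    # mirroring make_divisible() in tools/yolov5_preprocess.py.
    return math.ceil(x / divisor) * divisor

# The model's maximum stride is 32, so each input side is rounded up:
print(make_divisible(500, 32))  # -> 512
print(make_divisible(512, 32))  # -> 512 (already aligned)
```

Passing a side length that is already a multiple of 32 is returned unchanged; anything else is rounded up, which is why `check_img_size` prints a warning with the adjusted size.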
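Similarly, the padded-resize geometry that `letterbox()` computes (scale by the limiting ratio, then pad the remainder — to the right/bottom in this repo's variant) can be checked without cv2. `letterbox_geometry` below is an illustrative helper under those assumptions, not part of the repository:

```python
def letterbox_geometry(shape_hw, new_shape_hw, scaleup=True):
    # Scale ratio (new / old): the limiting side decides.
    r = min(new_shape_hw[0] / shape_hw[0], new_shape_hw[1] / shape_hw[1])
    if not scaleup:  # only scale down, as in the original
        r = min(r, 1.0)
    # Resized (unpadded) size as (width, height), then the wh padding.
    new_unpad = (int(round(shape_hw[1] * r)), int(round(shape_hw[0] * r)))
    dw = new_shape_hw[1] - new_unpad[0]
    dh = new_shape_hw[0] - new_unpad[1]
    return r, new_unpad, (dw, dh)

# A 720x1280 (h, w) frame letterboxed into a 256x480 model input:
r, new_unpad, (dw, dh) = letterbox_geometry((720, 1280), (256, 480))
print(new_unpad, (dw, dh))  # -> (455, 256) (25, 0): 25 px of width padding
```

Note that the repo's `letterbox` keeps `ratio` as `(r, r)` and applies all of `dw`/`dh` as right/bottom padding via `kneron_preprocessing.API.pad`, whereas `letterbox_ori` splits the padding evenly between both sides.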