feat: add golf dataset, kneron configs, and tools
Some checks failed
deploy / build-n-publish (push) Has been cancelled
lint / lint (push) Has been cancelled
build / build_cpu (3.7, 1.5.1, torch1.5, 0.6.1) (push) Has been cancelled
build / build_cpu (3.7, 1.6.0, torch1.6, 0.7.0) (push) Has been cancelled
build / build_cpu (3.7, 1.7.0, torch1.7, 0.8.1) (push) Has been cancelled
build / build_cpu (3.7, 1.8.0, torch1.8, 0.9.0) (push) Has been cancelled
build / build_cpu (3.7, 1.9.0, torch1.9, 0.10.0) (push) Has been cancelled
build / build_cuda101 (3.7, 1.5.1+cu101, torch1.5, 0.6.1+cu101) (push) Has been cancelled
build / build_cuda101 (3.7, 1.6.0+cu101, torch1.6, 0.7.0+cu101) (push) Has been cancelled
build / build_cuda101 (3.7, 1.7.0+cu101, torch1.7, 0.8.1+cu101) (push) Has been cancelled
build / build_cuda101 (3.7, 1.8.0+cu101, torch1.8, 0.9.0+cu101) (push) Has been cancelled
build / build_cuda102 (3.6, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / build_cuda102 (3.7, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / build_cuda102 (3.8, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / build_cuda102 (3.9, 1.9.0+cu102, torch1.9, 0.10.0+cu102) (push) Has been cancelled
build / test_windows (windows-2022, cpu, 3.8) (push) Has been cancelled
build / test_windows (windows-2022, cu111, 3.8) (push) Has been cancelled
- Add golf1/2/4/7/8 dataset classes for semantic segmentation
- Add kneron-specific configs (meconfig series, kn_stdc1_golf4class)
- Organize scripts into tools/check/ and tools/kneron/
- Add kneron_preprocessing module
- Update README with quick-start guide
- Update .gitignore to exclude data dirs, onnx, nef outputs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
parent 793c3a5bb0
commit 7716a0060f
17 .gitignore vendored
@@ -117,3 +117,20 @@ mmseg/.mim
 
 # Pytorch
 *.pth
+
+# ONNX / NEF compiled outputs
+*.onnx
+*.nef
+batch_compile_out/
+conbinenef/
+
+# Local data directories
+data4/
+data50/
+data512/
+data724362/
+testdata/
+
+# Misc
+envs.txt
+.claude/
98 README.md
@@ -1,70 +1,62 @@
-# Kneron AI Training/Deployment Platform (mmsegmentation-based)
+# STDC GolfAce — Semantic Segmentation on Kneron
+
+## Quick Start
-
-## Introduction
+### Environment Setup
-
-[kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation) is a platform built upon the well-known [mmsegmentation](https://github.com/open-mmlab/mmsegmentation). If you are looking for the original mmsegmentation documentation, please visit [mmsegmentation docs](https://mmsegmentation.readthedocs.io/en/latest/) for detailed mmsegmentation usage.
+```bash
+# Create and activate the conda environment
+conda create -n stdc_golface python=3.8 -y
+conda activate stdc_golface
-
-In this repository, we provide an end-to-end training/deployment flow to realize on Kneron's AI accelerators:
+
+# Install PyTorch with CUDA 11.3
+conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch -y
-
-1. **Training/Evaluation:**
-   - Modified model configuration files, verified for the Kneron hardware platform
-   - Please see [Overview of Benchmark and Model Zoo](#Overview-of-Benchmark-and-Model-Zoo) for the Kneron-verified model list
-2. **Converting to ONNX:**
-   - tools/pytorch2onnx_kneron.py (beta)
-   - Export *optimized* and *Kneron-toolchain supported* onnx
-   - Automatically modify the model for arbitrary data normalization preprocessing
-3. **Evaluation**
-   - tools/test_kneron.py (beta)
-   - Evaluate the model with *pytorch checkpoint, onnx, and kneron-nef*
-4. **Testing**
-   - inference_kn (beta)
-   - Verify the converted [NEF](http://doc.kneron.com/docs/#toolchain/manual/#5-nef-workflow) model on a Kneron USB accelerator with this API
-5. **Converting Kneron-NEF:** (toolchain feature)
-   - Convert the trained pytorch model to a [Kneron-NEF](http://doc.kneron.com/docs/#toolchain/manual/#5-nef-workflow) model, which can be used on the Kneron hardware platform.
+
+# Install mmcv-full
+pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html
-
-## License
+
+# Install this project
+pip install -e .
-
-This project is released under the [Apache 2.0 license](LICENSE).
+
+# Install utility packages
+pip install opencv-python tqdm matplotlib cityscapesscripts yapf==0.31.0
+```
-
-## Changelog
+### Data Preparation
-
-N/A
+
+1. Export the dataset from **Roboflow**, choosing the `Semantic Segmentation Masks` format
+2. Convert the Roboflow export to Cityscapes format with `seg2city.py`
+3. Place the converted data under `data/cityscapes/`
-
-## Overview of Benchmark and Kneron Model Zoo
+### Training and Testing
-
-| Backbone | Crop Size | Mem (GB) | mIoU | Config | Download |
-|:--------:|:---------:|:--------:|:----:|:------:|:--------:|
-| STDC 1 | 512x1024 | 7.15 | 69.29|[config](https://github.com/kneron/kneron-mmsegmentation/tree/master/configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py)|[model](https://github.com/kneron/Model_Zoo/blob/main/mmsegmentation/stdc_1/latest.zip)
+```bash
+# Train
+python tools/train.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py
-
-NOTE: The performance may slightly differ from the original implementation since the input size is smaller.
+
+# Test (and write visualized results)
+python tools/test.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py \
+    work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth \
+    --show-dir work_dirs/vis_results
+```
-
-## Installation
-- Please refer to Step 1 of [docs_kneron/stdc_step_by_step.md#step-1-environment](docs_kneron/stdc_step_by_step.md) for installation.
-- Please refer to [Kneron PLUS - Python: Installation](http://doc.kneron.com/docs/#plus_python/introduction/install_dependency/) for the environment setup for the Kneron USB accelerator.
+### Converting to ONNX / NEF (Kneron Toolchain)
-
-## Getting Started
-### Tutorial - Kneron Edition
-- [STDC-Seg: Step-By-Step](docs_kneron/stdc_step_by_step.md): A tutorial for users to get started easily. To see detailed documents, please see below.
+```bash
+# Launch Docker (WSL environment)
+docker run --rm -it \
+    -v $(wslpath -u 'C:\Users\rd_de\stdc_git'):/workspace/stdc_git \
+    kneron/toolchain:latest
-
-### Documents - Kneron Edition
-- [Kneron ONNX Export] (under development)
-- [Kneron Inference] (under development)
-- [Kneron Toolchain Step-By-Step (YOLOv3)](http://doc.kneron.com/docs/#toolchain/yolo_example/)
-- [Kneron Toolchain Manual](http://doc.kneron.com/docs/#toolchain/manual/#0-overview)
+
+# Convert to ONNX
+python tools/pytorch2onnx_kneron.py \
+    configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py \
+    --checkpoint work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth \
+    --output-file work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.onnx \
+    --verify
-
-### Original mmsegmentation Documents
-- [Original mmsegmentation getting started](https://github.com/open-mmlab/mmsegmentation#getting-started): It is recommended to read the original mmsegmentation getting-started documents for other mmsegmentation operations.
-- [Original mmsegmentation readthedoc](https://mmsegmentation.readthedocs.io/en/latest/): Original mmsegmentation documents.
+
+# Copy the NEF back to the host
+docker cp <container_id>:/data1/kneron_flow/models_630.nef \
+    "C:\Users\rd_de\stdc_git\work_dirs\nef\models_630.nef"
+```
-
-## Contributing
-[kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation) is a platform built upon [OpenMMLab-mmsegmentation](https://github.com/open-mmlab/mmsegmentation)
-
-- For issues regarding the original [mmsegmentation](https://github.com/open-mmlab/mmsegmentation):
-  We appreciate all contributions to improve [OpenMMLab-mmsegmentation](https://github.com/open-mmlab/mmsegmentation). Ongoing projects can be found in our [GitHub Projects](https://github.com/open-mmlab/mmsegmentation/projects). We welcome community users to participate in these projects. Please refer to [CONTRIBUTING.md](.github/CONTRIBUTING.md) for the contributing guideline.
-
-- For issues regarding this repository [kneron-mmsegmentation](https://github.com/kneron/kneron-mmsegmentation): Welcome to leave comments or submit pull requests here to improve kneron-mmsegmentation
-
-## Related Projects
-- [kneron-mmdetection](https://github.com/kneron/kneron-mmdetection): Kneron training/deployment platform on the [OpenMMLab - mmdetection](https://github.com/open-mmlab/mmdetection) object detection toolbox
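The data-preparation step above converts a Roboflow `Semantic Segmentation Masks` export into the Cityscapes directory layout via `seg2city.py`. As a rough illustration only (the real script lives in this repo and its naming rules may differ; the function name, the `_mask.png` suffix, and the Cityscapes-style filename suffixes are all assumptions), the core of such a conversion might look like:

```python
import os
import shutil

def roboflow_to_cityscapes(src_dir, dst_root, split='train'):
    """Copy a Roboflow 'Semantic Segmentation Masks' export into the
    Cityscapes layout that mmsegmentation's dataset configs expect.

    Assumes each image `foo.jpg` has a sibling mask `foo_mask.png`
    (Roboflow naming conventions vary between export versions).
    """
    img_dst = os.path.join(dst_root, 'leftImg8bit', split)
    ann_dst = os.path.join(dst_root, 'gtFine', split)
    os.makedirs(img_dst, exist_ok=True)
    os.makedirs(ann_dst, exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        stem, ext = os.path.splitext(name)
        if stem.endswith('_mask'):
            continue  # masks are copied together with their image below
        mask = os.path.join(src_dir, stem + '_mask.png')
        if not os.path.exists(mask):
            continue  # skip images without an annotation
        # Cityscapes-style suffixes keep mmseg's default file matching happy
        shutil.copy(os.path.join(src_dir, name),
                    os.path.join(img_dst, stem + '_leftImg8bit' + ext))
        shutil.copy(mask,
                    os.path.join(ann_dst, stem + '_gtFine_labelTrainIds.png'))
```

After this, `data/cityscapes/leftImg8bit/train` and `data/cityscapes/gtFine/train` line up the way the dataset configs in this commit expect.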
@@ -1,5 +1,6 @@
 # dataset settings
-dataset_type = 'CityscapesDataset'
+#dataset_type = 'CityscapesDataset'
+dataset_type = 'GolfDataset'
 data_root = 'data/cityscapes/'
 img_norm_cfg = dict(
     mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
70 configs/_base_/datasets/kn_cityscapes1.py Normal file
@@ -0,0 +1,70 @@
# dataset settings
dataset_type = 'GolfDataset'
data_root = 'data/cityscapes0/'  # ✅ your data root directory

img_norm_cfg = dict(
    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (512, 1024)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline
    ),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline
    ),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline
    )
)

# ✅ classes and matching palette (not passed to the dataset; used for
# plotting / inference visualization)
classes = ('car', 'grass', 'people', 'road')
palette = [
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
]
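The `classes`/`palette` pair in the dataset config is meant for visualization: a predicted label map of per-pixel class indices can be turned into a colour image by indexing the palette array. A minimal numpy sketch (the 2x2 `pred` array is made up purely for illustration):

```python
import numpy as np

# Palette from the config: index i in the label map -> RGB colour i.
palette = np.array([
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
], dtype=np.uint8)

# A tiny fake prediction: per-pixel class indices of shape (H, W).
pred = np.array([[0, 1],
                 [2, 3]], dtype=np.uint8)

# Fancy indexing turns the (H, W) index map into an (H, W, 3) colour image.
color = palette[pred]
print(color.shape)  # (2, 2, 3)
print(color[0, 0])  # palette[0], i.e. the 'car' colour
```

The same trick scales to full-resolution predictions; this is essentially what mmsegmentation does internally when `--show-dir` writes visualized results.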
71 configs/_base_/datasets/kn_cityscapes2.py Normal file
@@ -0,0 +1,71 @@
# dataset settings
dataset_type = 'GolfDataset'
data_root = 'data/cityscapes0/'  # ✅ your data root directory

img_norm_cfg = dict(
    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline
    ),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline
    ),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline
    )
)

# ✅ classes and matching palette (not passed to the dataset; used for
# plotting / inference visualization)
classes = ('car', 'grass', 'people', 'road')
palette = [
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
]
22 configs/_base_/schedules/schedule_2k.py Normal file
@@ -0,0 +1,22 @@
# optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)

# optimizer config
optimizer_config = dict()

# learning policy
lr_config = dict(
    policy='poly',
    power=0.9,
    min_lr=1e-4,
    by_epoch=False
)

# runtime settings
runner = dict(type='IterBasedRunner', max_iters=2000)

# save a checkpoint every 2000 iterations (i.e. only the final one)
checkpoint_config = dict(by_epoch=False, interval=2000)

# evaluation settings: run an mIoU evaluation every 2000 iterations
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
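The `poly` learning-rate policy in this schedule decays the LR from `lr` down to `min_lr` over `max_iters` iterations. A small sketch of the decay curve, following my reading of mmcv's `PolyLrUpdaterHook` (treat the exact formula as an assumption; check the installed mmcv version):

```python
def poly_lr(iter_idx, base_lr=0.01, min_lr=1e-4, power=0.9, max_iters=2000):
    """Polynomial LR decay from base_lr down to min_lr over max_iters,
    matching (to my reading) mmcv's PolyLrUpdaterHook formula."""
    coeff = (1 - iter_idx / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr

# The curve starts at base_lr, falls steeply late in training
# (power < 1), and bottoms out exactly at min_lr.
for i in (0, 500, 1000, 1500, 2000):
    print(i, poly_lr(i))
```

With `power=0.9` the decay is nearly linear; `min_lr=1e-4` keeps the last iterations from stalling at a vanishing step size.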
193 configs/stdc/kn_stdc1_golf4class.py Normal file
@@ -0,0 +1,193 @@
# Copyright (c) OpenMMLab. All rights reserved.

# ---------------- model settings ---------------- #
norm_cfg = dict(type='BN', requires_grad=True)
model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False,
            init_cfg=dict(
                type='Pretrained',
                checkpoint='https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'
            )
        ),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)
    ),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=4,  # ✅ four classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)
    ),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)
        ),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)
        ),
        dict(
            type='STDCHead',
            in_channels=256,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅ most important
            boundary_threshold=0.1,
            in_index=0,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=True,
            loss_decode=[
                dict(
                    type='CrossEntropyLoss',
                    loss_name='loss_ce',
                    use_sigmoid=True,
                    loss_weight=1.0),
                dict(
                    type='DiceLoss',
                    loss_name='loss_dice',
                    loss_weight=1.0)
            ]
        )
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

# ---------------- dataset settings ---------------- #
dataset_type = 'GolfDataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(
    mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (512, 1024)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(2048, 1024), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg']),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 512),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline
    ),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline
    ),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline
    )
)

# ---------------- extra settings ---------------- #
log_config = dict(
    interval=50, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
checkpoint_config = dict(by_epoch=False, interval=1000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(
    policy='poly',
    power=0.9,
    min_lr=0.0001,
    by_epoch=False,
    warmup='linear',
    warmup_iters=1000)
runner = dict(type='IterBasedRunner', max_iters=20000)
cudnn_benchmark = True
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
work_dir = './work_dirs/kn_stdc1_golf4class'
gpu_ids = [0]

# ✅ optional: for visualization or post-processing only; not passed to the dataset
classes = ('car', 'grass', 'people', 'road')
palette = [
    [246, 14, 135],   # car
    [233, 81, 78],    # grass
    [220, 148, 21],   # people
    [207, 215, 220],  # road
]
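Every config in this commit normalizes with `mean=[128, 128, 128]` and `std=[256, 256, 256]`, which maps 8-bit pixels into roughly `[-0.5, 0.5)`, presumably a convenient range for Kneron's fixed-point preprocessing (my assumption; the diff does not state the motivation). A quick numpy check of the extreme values:

```python
import numpy as np

mean = np.array([128., 128., 128.])
std = np.array([256., 256., 256.])

# Extreme 8-bit pixel values before and after the Normalize transform.
pixels = np.array([[0., 0., 0.], [255., 255., 255.]])
normed = (pixels - mean) / std
print(normed[0])  # [-0.5 -0.5 -0.5]
print(normed[1])  # [0.49609375 0.49609375 0.49609375]
```

Because 256 is a power of two, the division is an exact bit shift, unlike the ImageNet mean/std used by stock mmsegmentation configs.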
@@ -1,14 +1,17 @@
 checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'  # noqa
 _base_ = [
-    '../_base_/models/stdc.py', '../_base_/datasets/kn_cityscapes.py',
-    '../_base_/default_runtime.py', '../_base_/schedules/schedule_80k.py'
+    '../_base_/models/stdc.py', '../_base_/datasets/kn_cityscapes2.py',
+    '../_base_/default_runtime.py', '../_base_/schedules/schedule_2k.py'
 ]
 lr_config = dict(warmup='linear', warmup_iters=1000)
 data = dict(
-    samples_per_gpu=12,
-    workers_per_gpu=4,
+    samples_per_gpu=2,
+    workers_per_gpu=2,
 )
 model = dict(
     backbone=dict(
         backbone_cfg=dict(
             init_cfg=dict(type='Pretrained', checkpoint=checkpoint))))
+
+
+
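The hunk above works because mmcv-style configs merge `_base_` files recursively, with the child config's keys winning: `samples_per_gpu` is overridden while untouched keys such as `data.train` survive from the base. A simplified sketch of that merge rule (real mmcv `Config` handles more cases, e.g. the `_delete_` key; this is an illustration, not mmcv's code):

```python
def merge(base, child):
    # Recursively merge `child` over `base`: dicts merge key-by-key,
    # anything else in the child simply replaces the base value.
    out = dict(base)
    for k, v in child.items():
        if isinstance(v, dict) and isinstance(out.get(k), dict):
            out[k] = merge(out[k], v)
        else:
            out[k] = v
    return out

base = dict(data=dict(samples_per_gpu=12, workers_per_gpu=4,
                      train=dict(type='GolfDataset')))
child = dict(data=dict(samples_per_gpu=2, workers_per_gpu=2))
print(merge(base, child)['data'])
# {'samples_per_gpu': 2, 'workers_per_gpu': 2, 'train': {'type': 'GolfDataset'}}
```

This is why the config only needs to restate the two batch-size keys rather than the whole `data` dict.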
137 configs/stdc/meconfig.py Normal file
@@ -0,0 +1,137 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=4,
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'GolfDataset'
data_root = 'data/cityscapes0/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
146 configs/stdc/meconfig1.py Normal file
@ -0,0 +1,146 @@
|
||||
norm_cfg = dict(type='BN', requires_grad=True)
|
||||
|
||||
checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'
|
||||
|
||||
model = dict(
|
||||
type='EncoderDecoder',
|
||||
pretrained=None,
|
||||
backbone=dict(
|
||||
type='STDCContextPathNet',
|
||||
backbone_cfg=dict(
|
||||
type='STDCNet',
|
||||
stdc_type='STDCNet1',
|
||||
in_channels=3,
|
||||
channels=(32, 64, 256, 512, 1024),
|
||||
bottleneck_type='cat',
|
||||
num_convs=4,
|
||||
norm_cfg=norm_cfg,
|
||||
act_cfg=dict(type='ReLU'),
|
||||
with_final_conv=False),
|
||||
last_in_channels=(1024, 512),
|
||||
out_channels=128,
|
||||
ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
|
||||
decode_head=dict(
|
||||
type='FCNHead',
|
||||
in_channels=256,
|
||||
channels=256,
|
||||
num_convs=1,
|
||||
num_classes=1, # ✅ 只分類 grass
|
||||
in_index=3,
|
||||
concat_input=False,
|
||||
dropout_ratio=0.1,
|
||||
norm_cfg=norm_cfg,
|
||||
align_corners=True,
|
||||
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
|
||||
loss_decode=dict(
|
||||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
|
||||
auxiliary_head=[
|
||||
dict(
|
||||
type='FCNHead',
|
||||
in_channels=128,
|
||||
channels=64,
|
||||
num_convs=1,
|
||||
num_classes=1,
|
||||
in_index=2,
|
||||
norm_cfg=norm_cfg,
|
||||
concat_input=False,
|
||||
align_corners=False,
|
||||
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
|
||||
loss_decode=dict(
|
||||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
|
||||
dict(
|
||||
type='FCNHead',
|
||||
in_channels=128,
|
||||
channels=64,
|
||||
num_convs=1,
|
||||
num_classes=1,
|
||||
in_index=1,
|
||||
norm_cfg=norm_cfg,
|
||||
concat_input=False,
|
||||
align_corners=False,
|
||||
sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
|
||||
loss_decode=dict(
|
||||
type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
|
||||
],
|
||||
train_cfg=dict(),
|
||||
test_cfg=dict(mode='whole')
|
||||
)
|
||||
|
||||
# ✅ 更新為你新的 dataset 類別
|
||||
dataset_type = 'GrassOnlyDataset'
|
||||
data_root = 'data/cityscapes/'
|
||||
|
||||
# ✅ 加入 classes 與 palette 定義
|
||||
classes = ('grass',)
|
||||
palette = [[0, 128, 0]]
|
||||
|
||||
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
|
||||
crop_size = (360, 720)
|
||||
|
||||
train_pipeline = [
|
||||
dict(type='LoadImageFromFile'),
|
||||
dict(type='LoadAnnotations'),
|
||||
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
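The `lr_config` above uses the `poly` policy. As a rough sketch (assuming mmcv's usual formula, where the learning rate decays from `lr` toward `min_lr` along a power curve over `max_iters`):

```python
def poly_lr(iter_idx, base_lr=0.01, max_iters=80000, power=0.9, min_lr=1e-4):
    # Sketch of the 'poly' schedule named in lr_config above, assumed to
    # follow (base_lr - min_lr) * (1 - t/T)^power + min_lr.
    coeff = (1 - iter_idx / max_iters) ** power
    return (base_lr - min_lr) * coeff + min_lr

print(poly_lr(0))        # starts at base_lr
print(poly_lr(80000))    # decays to min_lr at max_iters
```

With `by_epoch=False`, the decay is applied per iteration, so the schedule is smooth across the whole 80k-iteration run.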
configs/stdc/meconfig2.py (new file, 149 lines)
@@ -0,0 +1,149 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=2,  # ✅ grass + road
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=2,
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=2,
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

# ✅ Use Golf2Dataset (grass and road)
dataset_type = 'Golf2Dataset'
data_root = 'data/cityscapes/'

# ✅ Classes and their corresponding colors
classes = ('grass', 'road')
palette = [
    [0, 255, 0],    # grass
    [255, 165, 0],  # road
]

img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
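The `classes`/`palette` pair above maps label ids to display colors. A minimal sketch of how such a palette is typically applied to a predicted label map (the 2×3 label map here is hypothetical; numpy fancy indexing does the lookup):

```python
import numpy as np

# Palette from the config above: index 0 = grass, index 1 = road.
palette = np.array([[0, 255, 0], [255, 165, 0]], dtype=np.uint8)

# Hypothetical (H, W) label map with class ids.
seg = np.array([[0, 1, 1],
                [1, 0, 0]])

color_seg = palette[seg]  # fancy indexing: (H, W) ids -> (H, W, 3) colors
print(color_seg.shape)    # (2, 3, 3)
```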
configs/stdc/meconfig4.py (new file, 151 lines)
@@ -0,0 +1,151 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=4,  # ✅ changed to 4 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅ changed to 4 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=4,  # ✅ changed to 4 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

# ✅ New dataset class
dataset_type = 'Golf4Dataset'
data_root = 'data/cityscapes/'

# ✅ Classes and palette
classes = ('car', 'grass', 'people', 'road')
palette = [
    [0, 0, 128],    # car
    [0, 255, 0],    # grass
    [255, 0, 0],    # people
    [255, 165, 0],  # road
]

img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=80000)
checkpoint_config = dict(by_epoch=False, interval=2000)
evaluation = dict(interval=2000, metric='mIoU', pre_eval=True)
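With `mean=[128.]*3` and `std=[256.]*3`, the `Normalize` step maps 8-bit pixel values into roughly [-0.5, 0.5). A quick check of the arithmetic:

```python
import numpy as np

mean, std = 128.0, 256.0
pixels = np.array([0.0, 128.0, 255.0])   # darkest, mid-gray, brightest 8-bit values
normalized = (pixels - mean) / std
print(normalized)  # [-0.5, 0.0, 0.49609375]
```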
configs/stdc/meconfig7.py (new file, 137 lines)
@@ -0,0 +1,137 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=7,
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=7,
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=7,
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=320000)
checkpoint_config = dict(by_epoch=False, interval=32000)
evaluation = dict(interval=32000, metric='mIoU', pre_eval=True)
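Every head in these configs uses `OHEMPixelSampler` with `thresh=0.7` and `min_kept=10000`. A toy sketch of the selection idea (an assumption about the concept behind OHEM, not mmseg's actual implementation, which operates on predicted probabilities inside the loss): pixels whose predicted ground-truth probability falls below the threshold count as hard, and at least `min_kept` of the hardest pixels are always retained.

```python
import numpy as np

def ohem_mask(gt_prob, thresh=0.7, min_kept=2):
    # Toy OHEM-style selection: low predicted ground-truth probability = hard.
    flat = gt_prob.ravel()
    order = np.argsort(flat)              # ascending: hardest pixels first
    n_hard = int((flat < thresh).sum())
    keep = order[:max(n_hard, min_kept)]  # keep at least min_kept pixels
    mask = np.zeros(flat.shape, dtype=bool)
    mask[keep] = True
    return mask.reshape(gt_prob.shape)

probs = np.array([[0.9, 0.2],
                  [0.8, 0.95]])
print(ohem_mask(probs))  # only one pixel is below 0.7, so min_kept pads to 2
```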
configs/stdc/meconfig8.py (new file, 137 lines)
@@ -0,0 +1,137 @@
norm_cfg = dict(type='BN', requires_grad=True)

checkpoint = 'https://download.openmmlab.com/mmsegmentation/v0.5/pretrain/stdc/stdc1_20220308-5368626c.pth'

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=8,  # ✅ changed to 8 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ changed to 8 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ changed to 8 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'  # ✅ use Golf8Dataset
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
cudnn_benchmark = True

optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()
lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=320000)
checkpoint_config = dict(by_epoch=False, interval=32000)
evaluation = dict(interval=32000, metric='mIoU', pre_eval=True)
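`IterBasedRunner` counts training length in iterations rather than epochs. A rough conversion for these settings (the dataset size and GPU count below are assumptions for illustration only; substitute the real values):

```python
samples_per_gpu = 2
num_gpus = 1          # assumed single-GPU training
dataset_size = 2975   # e.g. the Cityscapes train split; replace with the real size

iters_per_epoch = dataset_size / (samples_per_gpu * num_gpus)
epochs = 320000 / iters_per_epoch
print(round(epochs, 1))  # roughly how many passes over the data 320k iters is
```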
configs/stdc/meconfig8_finetune.py (new file, 147 lines)
@@ -0,0 +1,147 @@
norm_cfg = dict(type='BN', requires_grad=True)

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=8,  # ✅ 8 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)]
)

dist_params = dict(backend='nccl')
log_level = 'INFO'

# ✅ Fine-tune settings
load_from = 'C:/Users/rd_de/kneronstdc/work_dirs/meconfig8/0619/latest.pth'
resume_from = None

workflow = [('train', 1)]
cudnn_benchmark = True

# ✅ Recommended fine-tune learning rate
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()

lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=160000)

checkpoint_config = dict(by_epoch=False, interval=16000)
evaluation = dict(interval=16000, metric='mIoU', pre_eval=True)
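The fine-tune config sets `load_from` rather than `resume_from`. Conceptually (a toy sketch, not mmcv's actual checkpoint code): `load_from` restores only the model weights and restarts the schedule at iteration 0, while `resume_from` also restores the optimizer state and iteration counter so training continues where it left off.

```python
def start_state(checkpoint, mode):
    # Toy illustration of the load_from / resume_from distinction.
    state = {'weights': checkpoint['weights'], 'iter': 0, 'optimizer': None}
    if mode == 'resume_from':
        state['optimizer'] = checkpoint['optimizer']
        state['iter'] = checkpoint['iter']
    return state

ckpt = {'weights': 'w', 'optimizer': 'opt', 'iter': 320000}
print(start_state(ckpt, 'load_from')['iter'])    # fresh schedule from iter 0
print(start_state(ckpt, 'resume_from')['iter'])  # continues at the saved iter
```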
configs/stdc/test.py (new file, 147 lines)
@@ -0,0 +1,147 @@
norm_cfg = dict(type='BN', requires_grad=True)

model = dict(
    type='EncoderDecoder',
    pretrained=None,
    backbone=dict(
        type='STDCContextPathNet',
        backbone_cfg=dict(
            type='STDCNet',
            stdc_type='STDCNet1',
            in_channels=3,
            channels=(32, 64, 256, 512, 1024),
            bottleneck_type='cat',
            num_convs=4,
            norm_cfg=norm_cfg,
            act_cfg=dict(type='ReLU'),
            with_final_conv=False),
        last_in_channels=(1024, 512),
        out_channels=128,
        ffm_cfg=dict(in_channels=384, out_channels=256, scale_factor=4)),
    decode_head=dict(
        type='FCNHead',
        in_channels=256,
        channels=256,
        num_convs=1,
        num_classes=8,  # ✅ 8 classes
        in_index=3,
        concat_input=False,
        dropout_ratio=0.1,
        norm_cfg=norm_cfg,
        align_corners=True,
        sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
        loss_decode=dict(
            type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    auxiliary_head=[
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=2,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
        dict(
            type='FCNHead',
            in_channels=128,
            channels=64,
            num_convs=1,
            num_classes=8,  # ✅ 8 classes
            in_index=1,
            norm_cfg=norm_cfg,
            concat_input=False,
            align_corners=False,
            sampler=dict(type='OHEMPixelSampler', thresh=0.7, min_kept=10000),
            loss_decode=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
    ],
    train_cfg=dict(),
    test_cfg=dict(mode='whole')
)

dataset_type = 'Golf8Dataset'
data_root = 'data/cityscapes/'
img_norm_cfg = dict(mean=[128., 128., 128.], std=[256., 256., 256.], to_rgb=True)
crop_size = (360, 720)

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Resize', img_scale=(724, 362), ratio_range=(0.5, 2.0)),
    dict(type='RandomCrop', crop_size=crop_size, cat_max_ratio=0.75),
    dict(type='RandomFlip', prob=0.5),
    dict(type='PhotoMetricDistortion'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size=crop_size, pad_val=0, seg_pad_val=255),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_semantic_seg'])
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(724, 362),
        flip=False,
        transforms=[
            dict(type='Resize', img_scale=(724, 362), keep_ratio=False),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img'])
        ])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/train',
        ann_dir='gtFine/train',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/val',
        ann_dir='gtFine/val',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        img_dir='leftImg8bit/test',
        ann_dir='gtFine/test',
        pipeline=test_pipeline)
)

log_config = dict(
    interval=50,
    hooks=[dict(type='TextLoggerHook', by_epoch=False)]
)

dist_params = dict(backend='nccl')
log_level = 'INFO'

# ✅ Fine-tune settings
load_from = 'C:/Users/rd_de/kneronstdc/work_dirs/meconfig8/0619/latest.pth'
resume_from = None

workflow = [('train', 1)]
cudnn_benchmark = True

# ✅ Recommended fine-tune learning rate
optimizer = dict(type='SGD', lr=0.001, momentum=0.9, weight_decay=0.0005)
optimizer_config = dict()

lr_config = dict(policy='poly', power=0.9, min_lr=1e-4, by_epoch=False)
runner = dict(type='IterBasedRunner', max_iters=160)

checkpoint_config = dict(by_epoch=False, interval=16)
evaluation = dict(interval=16, metric='mIoU', pre_eval=True)
kneron_preprocessing/API.py (new file, 684 lines)
@@ -0,0 +1,684 @@
# -*- coding: utf-8 -*-

import numpy as np
import os
from .funcs.utils import str2int, str2bool
from . import Flow

flow = Flow()
flow.set_numerical_type('floating')
flow_520 = Flow()
flow_520.set_numerical_type('520')
flow_720 = Flow()
flow_720.set_numerical_type('720')

DEFAULT = None
default = {
    'crop': {
        'align_w_to_4': False
    },
    'resize': {
        'type': 'bilinear',
        'calculate_ratio_using_CSim': False
    }
}

def set_default_as_520():
    """
    Set some default parameters to the 520 settings:

        crop.align_w_to_4 = True
        crop.pad_square_to_4 = True
        resize.type = 'fixed_520'
        resize.calculate_ratio_using_CSim = True
    """
    global default
    default['crop']['align_w_to_4'] = True
    default['resize']['type'] = 'fixed_520'
    default['resize']['calculate_ratio_using_CSim'] = True

def set_default_as_floating():
    """
    Set some default parameters to the floating settings:

        crop.align_w_to_4 = False
        crop.pad_square_to_4 = False
        resize.type = 'bilinear'
        resize.calculate_ratio_using_CSim = False
    """
    global default
    default['crop']['align_w_to_4'] = False
    default['resize']['type'] = 'bilinear'
    default['resize']['calculate_ratio_using_CSim'] = False
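The two setters above just mutate the module-level `default` dict. A standalone re-creation of that behavior, showing the effect of switching to the 520 defaults:

```python
# Standalone re-creation of the module-level default toggling above.
default = {
    'crop': {'align_w_to_4': False},
    'resize': {'type': 'bilinear', 'calculate_ratio_using_CSim': False},
}

def set_default_as_520():
    default['crop']['align_w_to_4'] = True
    default['resize']['type'] = 'fixed_520'
    default['resize']['calculate_ratio_using_CSim'] = True

set_default_as_520()
print(default['resize']['type'])  # fixed_520
```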
def print_info_on():
    """
    Turn print information on.
    """
    flow.set_print_info(True)
    flow_520.set_print_info(True)

def print_info_off():
    """
    Turn print information off.
    """
    flow.set_print_info(False)
    flow_520.set_print_info(False)

def load_image(image):
    """
    load_image function
    Load an image and output it as an rgb888-format np.array.

    Args:
        image: [np.array/str], can be an np.array or an image file path

    Returns:
        out: [np.array], rgb888 format
    """
    image = flow.load_image(image, is_raw=False)
    return image

def load_bin(image, fmt=None, size=None):
    """
    load_bin function
    Load a bin file and output it as an rgb888-format np.array.

    Args:
        image: [str], bin file path
        fmt: [str], "rgb888" / "rgb565" / "nir"
        size: [tuple], (image_w, image_h)

    Returns:
        out: [np.array], rgb888 format

    Examples:
        >>> image_data = kneron_preprocessing.API.load_bin(image, 'rgb565', (raw_w, raw_h))
    """
    assert isinstance(size, tuple)
    assert isinstance(fmt, str)
    # assert (fmt.lower() in ['rgb888', "rgb565", "nir", 'RGB888', "RGB565", "NIR", 'NIR888', 'nir888'])

    image = flow.load_image(image, is_raw=True, raw_img_type='bin', raw_img_fmt=fmt, img_in_width=size[0], img_in_height=size[1])
    flow.set_color_conversion(source_format=fmt, out_format='rgb888')
    image, _ = flow.funcs['color'](image)
    return image
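`load_bin` accepts packed pixel formats such as rgb565. For reference, a standalone sketch of rgb565 → rgb888 expansion using the common convention of replicating each channel's high bits into its low bits (this illustrates the format, not necessarily the exact conversion Flow performs):

```python
def rgb565_to_rgb888(pixel16):
    # Unpack a 16-bit rgb565 pixel: 5 bits red, 6 bits green, 5 bits blue.
    r5 = (pixel16 >> 11) & 0x1F
    g6 = (pixel16 >> 5) & 0x3F
    b5 = pixel16 & 0x1F
    # Expand to 8 bits per channel by replicating the high bits.
    return ((r5 << 3) | (r5 >> 2),
            (g6 << 2) | (g6 >> 4),
            (b5 << 3) | (b5 >> 2))

print(rgb565_to_rgb888(0xFFFF))  # (255, 255, 255)
print(rgb565_to_rgb888(0x0000))  # (0, 0, 0)
```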
def load_hex(file, fmt=None, size=None):
|
||||
"""
|
||||
load_hex function
|
||||
load hex file and output as rgb888 format np.array
|
||||
|
||||
Args:
|
||||
image: [str], hex file path
|
||||
fmt: [str], "rgb888" / "yuv444" / "ycbcr444" / "yuv422" / "ycbcr422" / "rgb565"
|
||||
size: [tuble], (image_w, image_h)
|
||||
|
||||
Returns:
|
||||
out: [np.array], rgb888 format
|
||||
|
||||
Examples:
|
||||
>>> image_data = kneron_preprocessing.API.load_hex(image,'rgb565',(raw_w,raw_h))
|
||||
"""
|
||||
assert isinstance(size, tuple)
|
||||
assert isinstance(fmt, str)
|
||||
assert (fmt.lower() in ['rgb888',"yuv444" , "ycbcr444" , "yuv422" , "ycbcr422" , "rgb565"])
|
||||
|
||||
image = flow.load_image(file, is_raw = True, raw_img_type='hex', raw_img_fmt = fmt, img_in_width = size[0], img_in_height = size[1])
|
||||
flow.set_color_conversion(source_format=fmt, out_format = 'rgb888')
|
||||
image,_ = flow.funcs['color'](image)
|
||||
return image


def dump_image(image, output=None, file_fmt='txt', image_fmt='rgb888', order=0):
    """
    dump_image function

    Dump as txt, bin or hex; default is txt.
    The image format is one of: RGB888, RGBA8888, RGB565, NIR, YUV444, YCbCr444, YUV422, YCbCr422; default is RGB888.

    Args:
        image: [np.array/str], can be np.array or image file path
        output: [str], dump file path
        file_fmt: [str], "bin" / "txt" / "hex", dump file format, default is txt
        image_fmt: [str], RGB888 / RGBA8888 / RGB565 / NIR / YUV444 / YCbCr444 / YUV422 / YCbCr422, default is RGB888
        order: [int], default 0

    Examples:
        >>> kneron_preprocessing.API.dump_image(image_data,out_path,file_fmt='bin')
    """
    if isinstance(image, str):
        image = load_image(image)

    assert isinstance(image, np.ndarray)
    if output is None:
        return

    flow.set_output_setting(is_dump=False, dump_format=file_fmt, image_format=image_fmt, output_file=output)
    flow.dump_image(image)
    return


def convert(image, out_fmt='RGB888', source_fmt='RGB888'):
    """
    color convert

    Args:
        image: [np.array], input
        out_fmt: [str], "rgb888" / "rgba8888" / "rgb565" / "yuv" / "ycbcr" / "yuv422" / "ycbcr422"
        source_fmt: [str], "rgb888" / "rgba8888" / "rgb565" / "yuv" / "ycbcr" / "yuv422" / "ycbcr422"

    Returns:
        out: [np.array]

    Examples:
    """
    flow.set_color_conversion(source_format=source_fmt, out_format=out_fmt, simulation=False)
    image, _ = flow.funcs['color'](image)
    return image
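The 'rgb565' format mentioned above packs a pixel into 16 bits. A hypothetical standalone sketch (not taken from the library; `pack_rgb565` / `unpack_rgb565` are names introduced here) of the standard 5-6-5 bit layout this implies:

```python
# 5 bits red, 6 bits green, 5 bits blue, assuming the standard RGB565 layout.
def pack_rgb565(r, g, b):
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def unpack_rgb565(v):
    # expand back toward 8 bits by shifting into the high bits
    r = (v >> 11) & 0x1F
    g = (v >> 5) & 0x3F
    b = v & 0x1F
    return (r << 3, g << 2, b << 3)
```

Note the round trip is only exact for values whose low bits are already zero, which is why converting through rgb565 loses precision.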


def get_crop_range(box, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0):
    """
    Get the exact crop box according to the different settings.

    Args:
        box: [tuple], (x1, y1, x2, y2)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding

    Returns:
        out: [tuple,4], (crop_x1, crop_y1, crop_x2, crop_y2)

    Examples:
        >>> kneron_preprocessing.API.get_crop_range((272,145,461,341), align_w_to_4=True, pad_square_to_4=True)
        (272, 145, 460, 341)
    """
    if box is None:
        return (0, 0, 0, 0)
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='specific', start_x=box[0], start_y=box[1], end_x=box[2], end_y=box[3], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image = np.zeros((1, 1, 3)).astype('uint8')
    _, info = flow.funcs['crop'](image)

    return info['box']


def crop(image, box=None, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0, info_out=None):
    """
    crop function

    Crop to the specific range given by box.

    Args:
        image: [np.array], input
        box: [tuple], (x1, y1, x2, y2)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding
        info_out: [dict], the final crop box is saved into info_out['box']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop(image_data,(272,145,461,341), align_w_to_4=True, info_out=info)
        >>> info['box']
        (272, 145, 460, 341)

        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop(image_data,(272,145,461,341), pad_square_to_4=True, info_out=info)
        >>> info['box']
        (268, 145, 464, 341)
    """
    assert isinstance(image, np.ndarray)
    if box is None:
        return image
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='specific', start_x=box[0], start_y=box[1], end_x=box[2], end_y=box[3], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image, info = flow.funcs['crop'](image)

    if info_out is not None:
        info_out['box'] = info['box']
    return image


def crop_center(image, range=None, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0, info_out=None):
    """
    crop function

    Center crop by range.

    Args:
        image: [np.array], input
        range: [tuple], (crop_w, crop_h)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding
        info_out: [dict], the final crop box is saved into info_out['box']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_center(image_data,(102,40), align_w_to_4=True,info_out=info)
        >>> info['box']
        (268, 220, 372, 260)

        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_center(image_data,(102,40), pad_square_to_4=True, info_out=info)
        >>> info['box']
        (269, 192, 371, 294)
    """
    assert isinstance(image, np.ndarray)
    if range is None:
        return image
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='center', crop_w=range[0], crop_h=range[1], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image, info = flow.funcs['crop'](image)

    if info_out is not None:
        info_out['box'] = info['box']
    return image


def crop_corner(image, range=None, align_w_to_4=DEFAULT, pad_square_to_4=False, rounding_type=0, info_out=None):
    """
    crop function

    Corner crop by range.

    Args:
        image: [np.array], input
        range: [tuple], (crop_w, crop_h)
        align_w_to_4: [bool], align the crop length in the w direction to 4 or not, default False
        pad_square_to_4: [bool], pad to square (aligned to 4) or not, default False
        rounding_type: [int], 0 -> x1,y1 take floor, x2,y2 take ceil; 1 -> all take rounding
        info_out: [dict], the final crop box is saved into info_out['box']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_corner(image_data,(102,40), align_w_to_4=True,info_out=info)
        >>> info['box']
        (0, 0, 104, 40)

        >>> info = {}
        >>> image_data = kneron_preprocessing.API.crop_corner(image_data,(102,40), pad_square_to_4=True,info_out=info)
        >>> info['box']
        (0, -28, 102, 74)
    """
    assert isinstance(image, np.ndarray)
    if range is None:
        return image
    if align_w_to_4 is None:
        align_w_to_4 = default['crop']['align_w_to_4']

    flow.set_crop(type='corner', crop_w=range[0], crop_h=range[1], align_w_to_4=align_w_to_4, pad_square_to_4=pad_square_to_4, rounding_type=rounding_type)
    image, info = flow.funcs['crop'](image)

    if info_out is not None:
        info_out['box'] = info['box']
    return image


def resize(image, size=None, keep_ratio=True, zoom=True, type=DEFAULT, calculate_ratio_using_CSim=DEFAULT, info_out=None):
    """
    resize function

    The resize type can be 'bilinear' or 'bilicubic' (floating point), or 'fixed' / 'fixed_520' / 'fixed_720' (fixed point).
    The 'fixed_520' / 'fixed_720' types add extra handling to simulate known 520/720 hardware bugs.

    Args:
        image: [np.array], input
        size: [tuple], (input_w, input_h)
        keep_ratio: [bool], keep the aspect ratio or not, default True
        zoom: [bool], allow resize to enlarge the image or not, default True
        type: [str], "bilinear" / "bilicubic" / "cv2" / "fixed" / "fixed_520" / "fixed_720"
        calculate_ratio_using_CSim: [bool], calculate the ratio and scale using the CSim function and C float, default False
        info_out: [dict], the final scaled size (w, h) is saved into info_out['size']

    Returns:
        out: [np.array]

    Examples:
        >>> info = {}
        >>> image_data = kneron_preprocessing.API.resize(image_data,size=(56,56),type='fixed',info_out=info)
        >>> info['size']
        (54,56)
    """
    assert isinstance(image, np.ndarray)
    if size is None:
        return image
    if type is None:
        type = default['resize']['type']
    if calculate_ratio_using_CSim is None:
        calculate_ratio_using_CSim = default['resize']['calculate_ratio_using_CSim']

    flow.set_resize(resize_w=size[0], resize_h=size[1], type=type, keep_ratio=keep_ratio, zoom=zoom, calculate_ratio_using_CSim=calculate_ratio_using_CSim)
    image, info = flow.funcs['resize'](image)
    if info_out is not None:
        info_out['size'] = info['size']

    return image
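The actual scaled size is computed inside `flow.funcs['resize']`, but the keep-ratio behaviour the docstring example shows (a source image fitted into a 56x56 box, ending up 54x56) can be sketched as follows. This is a hypothetical illustration, not the library implementation; `keep_ratio_size` is a name introduced here:

```python
def keep_ratio_size(src_w, src_h, dst_w, dst_h, zoom=True):
    """Return (w, h) that fits inside (dst_w, dst_h) while preserving aspect ratio."""
    ratio = min(dst_w / src_w, dst_h / src_h)
    if not zoom:
        # without zoom, never enlarge the image
        ratio = min(ratio, 1.0)
    return max(1, round(src_w * ratio)), max(1, round(src_h * ratio))
```

For a 100x104 input and a 56x56 model size this gives (54, 56), matching the shape of the docstring example above.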


def pad(image, pad_l=0, pad_r=0, pad_t=0, pad_b=0, pad_val=0):
    """
    pad function

    Specify left, right, top and bottom pad sizes.

    Args:
        image: [np.array], input
        pad_l: [int], pad size from left, default 0
        pad_r: [int], pad size from right, default 0
        pad_t: [int], pad size from top, default 0
        pad_b: [int], pad size from bottom, default 0
        pad_val: [float], the pad value, default 0

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.pad(image_data,20,40,20,40,-0.5)
    """
    assert isinstance(image, np.ndarray)

    flow.set_padding(type='specific', pad_l=pad_l, pad_r=pad_r, pad_t=pad_t, pad_b=pad_b, pad_val=pad_val)
    image, _ = flow.funcs['padding'](image)
    return image


def pad_center(image, size=None, pad_val=0):
    """
    pad function

    Center pad to the given size.

    Args:
        image: [np.array], input
        size: [tuple], (padded_size_w, padded_size_h)
        pad_val: [float], the pad value, default 0

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.pad_center(image_data,size=(56,56),pad_val=-0.5)
    """
    assert isinstance(image, np.ndarray)
    if size is None:
        return image
    assert ((image.shape[0] <= size[1]) & (image.shape[1] <= size[0]))

    flow.set_padding(type='center', padded_w=size[0], padded_h=size[1], pad_val=pad_val)
    image, _ = flow.funcs['padding'](image)
    return image


def pad_corner(image, size=None, pad_val=0):
    """
    pad function

    Corner pad to the given size.

    Args:
        image: [np.array], input
        size: [tuple], (padded_size_w, padded_size_h)
        pad_val: [float], the pad value, default 0

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.pad_corner(image_data,size=(56,56),pad_val=-0.5)
    """
    assert isinstance(image, np.ndarray)
    if size is None:
        return image
    assert ((image.shape[0] <= size[1]) & (image.shape[1] <= size[0]))

    flow.set_padding(type='corner', padded_w=size[0], padded_h=size[1], pad_val=pad_val)
    image, _ = flow.funcs['padding'](image)
    return image
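Center padding has to split the total pad between two sides. A minimal sketch of that split (an illustration with an introduced name, `center_pads`; which side receives the odd extra pixel is an assumption here, not taken from the library):

```python
def center_pads(src, dst):
    """Split the total pad (dst - src) between two sides; the second side
    receives the extra pixel when the total is odd (an assumption)."""
    total = dst - src
    return total // 2, total - total // 2
```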


def norm(image, scale=256., bias=-0.5, mean=None, std=None):
    """
    norm function

    x = x / scale + bias
    x[0,1,2] = x - mean[0,1,2]
    x[0,1,2] = x / std[0,1,2]

    Args:
        image: [np.array], input
        scale: [float], default = 256
        bias: [float], default = -0.5
        mean: [tuple,3], default = None
        std: [tuple,3], default = None

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.norm(image_data)
        >>> image_data = kneron_preprocessing.API.norm(image_data,mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    """
    assert isinstance(image, np.ndarray)

    flow.set_normalize(type='specific', scale=scale, bias=bias, mean=mean, std=std)
    image, _ = flow.funcs['normalize'](image)
    return image
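The real work happens in `flow.funcs['normalize']`; a hypothetical sketch of the arithmetic (`norm_sketch` is a name introduced here), assuming the common Kneron convention that maps pixel 128 to 0, i.e. x/256 - 0.5, which is x/scale + bias with the default bias = -0.5:

```python
import numpy as np

def norm_sketch(image, scale=256.0, bias=-0.5, mean=None, std=None):
    # x = x / scale + bias, then optional per-channel mean/std adjustment
    x = image.astype(np.float64) / scale + bias
    if mean is not None:
        x -= np.asarray(mean)
    if std is not None:
        x /= np.asarray(std)
    return x
```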


def inproc_520(image, raw_fmt='rgb565', raw_size=None, npu_size=None, crop_box=None, pad_mode=0, norm='kneron', gray=False, rotate=0, radix=8, bit_width=8, round_w_to_16=True, NUM_BANK_LINE=32, BANK_ENTRY_CNT=512, MAX_IMG_PREPROC_ROW_NUM=511, MAX_IMG_PREPROC_COL_NUM=256):
    """
    inproc_520

    Args:
        image: [np.array], input
        crop_box: [tuple], (x1, y1, x2, y2); if None, crop is skipped
        pad_mode: [int], 0: pad 2 sides, 1: pad 1 side, 2: no pad, default = 0
        norm: [str], default = 'kneron'
        rotate: [int], 0 / 1 / 2, default = 0
        radix: [int], default = 8
        bit_width: [int], default = 8
        round_w_to_16: [bool], default = True
        gray: [bool], default = False

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.inproc_520(image_data,npu_size=(56,56),crop_box=(272,145,460,341),pad_mode=1)
    """
    if not isinstance(image, np.ndarray):
        flow_520.set_raw_img(is_raw_img='yes', raw_img_type='bin', raw_img_fmt=raw_fmt, img_in_width=raw_size[0], img_in_height=raw_size[1])
    else:
        flow_520.set_raw_img(is_raw_img='no')
        flow_520.set_color_conversion(source_format='rgb888')

    if npu_size is None:
        return image

    flow_520.set_model_size(w=npu_size[0], h=npu_size[1])

    ## Crop
    if crop_box is not None:
        flow_520.set_crop(start_x=crop_box[0], start_y=crop_box[1], end_x=crop_box[2], end_y=crop_box[3])
        crop_fisrt = True
    else:
        crop_fisrt = False

    ## Color
    if gray:
        flow_520.set_color_conversion(out_format='l', simulation='no')
    else:
        flow_520.set_color_conversion(out_format='rgb888', simulation='no')

    ## Resize & Pad
    pad_mode = str2int(pad_mode)
    if pad_mode == 0:
        pad_type = 'center'
        resize_keep_ratio = 'yes'
    elif pad_mode == 1:
        pad_type = 'corner'
        resize_keep_ratio = 'yes'
    else:
        pad_type = 'center'
        resize_keep_ratio = 'no'

    flow_520.set_resize(keep_ratio=resize_keep_ratio)
    flow_520.set_padding(type=pad_type)

    ## Norm
    flow_520.set_normalize(type=norm)

    ## 520 inproc
    flow_520.set_520_setting(radix=radix, bit_width=bit_width, rotate=rotate, crop_fisrt=crop_fisrt, round_w_to_16=round_w_to_16, NUM_BANK_LINE=NUM_BANK_LINE, BANK_ENTRY_CNT=BANK_ENTRY_CNT, MAX_IMG_PREPROC_ROW_NUM=MAX_IMG_PREPROC_ROW_NUM, MAX_IMG_PREPROC_COL_NUM=MAX_IMG_PREPROC_COL_NUM)
    image_data, _ = flow_520.run_whole_process(image)

    return image_data


def inproc_720(image, raw_fmt='rgb565', raw_size=None, npu_size=None, crop_box=None, pad_mode=0, norm='kneron', gray=False):
    """
    inproc_720

    Args:
        image: [np.array], input
        crop_box: [tuple], (x1, y1, x2, y2); if None, crop is skipped
        pad_mode: [int], 0: pad 2 sides, 1: pad 1 side, 2: no pad, default = 0
        norm: [str], default = 'kneron'
        gray: [bool], default = False

    Returns:
        out: [np.array]

    Examples:
        >>> image_data = kneron_preprocessing.API.inproc_720(image_data,npu_size=(56,56),crop_box=(272,145,460,341),pad_mode=1)
    """
    if not isinstance(image, np.ndarray):
        flow_720.set_raw_img(is_raw_img='yes', raw_img_type='bin', raw_img_fmt=raw_fmt, img_in_width=raw_size[0], img_in_height=raw_size[1])
    else:
        flow_720.set_raw_img(is_raw_img='no')
        flow_720.set_color_conversion(source_format='rgb888')

    if npu_size is None:
        return image

    flow_720.set_model_size(w=npu_size[0], h=npu_size[1])

    ## Crop
    if crop_box is not None:
        flow_720.set_crop(start_x=crop_box[0], start_y=crop_box[1], end_x=crop_box[2], end_y=crop_box[3])
        crop_fisrt = True
    else:
        crop_fisrt = False

    ## Color
    if gray:
        flow_720.set_color_conversion(out_format='l', simulation='no')
    else:
        flow_720.set_color_conversion(out_format='rgb888', simulation='no')

    ## Resize & Pad
    pad_mode = str2int(pad_mode)
    if pad_mode == 0:
        pad_type = 'center'
        resize_keep_ratio = 'yes'
    elif pad_mode == 1:
        pad_type = 'corner'
        resize_keep_ratio = 'yes'
    else:
        pad_type = 'center'
        resize_keep_ratio = 'no'

    flow_720.set_resize(keep_ratio=resize_keep_ratio)
    flow_720.set_padding(type=pad_type)

    ## 720 inproc
    # flow_720.set_720_setting(radix=radix,bit_width=bit_width,rotate=rotate,crop_fisrt=crop_fisrt,round_w_to_16=round_w_to_16,NUM_BANK_LINE=NUM_BANK_LINE,BANK_ENTRY_CNT=BANK_ENTRY_CNT,MAX_IMG_PREPROC_ROW_NUM=MAX_IMG_PREPROC_ROW_NUM,MAX_IMG_PREPROC_COL_NUM=MAX_IMG_PREPROC_COL_NUM)
    image_data, _ = flow_720.run_whole_process(image)

    return image_data
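Both inproc functions map `pad_mode` to the same (pad type, keep-ratio) pair via an if/elif chain. For illustration, the mapping rewritten as a lookup (`PAD_MODES` / `pad_settings` are names introduced here, not library API):

```python
PAD_MODES = {
    0: ('center', True),   # pad 2 sides, keep aspect ratio
    1: ('corner', True),   # pad 1 side, keep aspect ratio
}

def pad_settings(pad_mode):
    # any other mode: no ratio-preserving pad; stretch to the model size
    return PAD_MODES.get(pad_mode, ('center', False))
```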


def bit_match(data1, data2):
    """
    bit_match function

    Check whether data1 is equal to data2.

    Args:
        data1: [np.array / str], can be an array or a txt/bin file
        data2: [np.array / str], can be an array or a txt/bin file

    Returns:
        out1: [bool], match or not
        out2: [np.array], if not matched, the positions of the mismatched data

    Examples:
        >>> result, mismatched = kneron_preprocessing.API.bit_match(data1,data2)
    """
    if isinstance(data1, str):
        if os.path.splitext(data1)[1] == '.bin':
            data1 = np.fromfile(data1, dtype='uint8')
        elif os.path.splitext(data1)[1] == '.txt':
            data1 = np.loadtxt(data1)

    assert isinstance(data1, np.ndarray)

    if isinstance(data2, str):
        if os.path.splitext(data2)[1] == '.bin':
            data2 = np.fromfile(data2, dtype='uint8')
        elif os.path.splitext(data2)[1] == '.txt':
            data2 = np.loadtxt(data2)

    assert isinstance(data2, np.ndarray)

    data1 = data1.reshape((-1, 1))
    data2 = data2.reshape((-1, 1))

    if len(data1) != len(data2):
        print('error len')
        return False, np.zeros((1))
    else:
        # cast before subtracting so uint8 differences cannot wrap around,
        # and compare against zero so mismatches in either direction are caught
        ans = data2.astype(np.int64) - data1.astype(np.int64)
        if len(np.where(ans != 0)[0]) > 0:
            print('error', np.where(ans != 0)[0])
            return False, np.where(ans != 0)[0]
        else:
            print('pass')
            return True, np.zeros((1))
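The same check, stripped of the file loading, can be written as a small standalone helper (a sketch with an introduced name, `positions_of_mismatch`; note the signed cast, which keeps uint8 differences from wrapping around):

```python
import numpy as np

def positions_of_mismatch(a, b):
    """Return indices where the flattened arrays differ, or None if the
    lengths differ and the comparison is undefined."""
    a = np.asarray(a).reshape(-1).astype(np.int64)
    b = np.asarray(b).reshape(-1).astype(np.int64)
    if a.size != b.size:
        return None
    return np.flatnonzero(a != b)
```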


def cpr_to_crp(x_start, x_end, y_start, y_end, pad_l, pad_r, pad_t, pad_b, rx_start, rx_end, ry_start, ry_end):
    """
    Convert the parameters of a crop->pad->resize flow to the HW crop->resize->pad flow.

    Args:

    Returns:

    Examples:

    """
    # compute both ratios from the original values before updating any pad,
    # so the later roundings do not see an already-modified denominator
    ratio_x = (rx_end - rx_start) / (x_end - x_start + pad_l + pad_r)
    ratio_y = (ry_end - ry_start) / (y_end - y_start + pad_t + pad_b)
    pad_l = round(pad_l * ratio_x)
    pad_r = round(pad_r * ratio_x)
    pad_t = round(pad_t * ratio_y)
    pad_b = round(pad_b * ratio_y)

    rx_start += pad_l
    rx_end -= pad_r
    ry_start += pad_t
    ry_end -= pad_b

    return x_start, x_end, y_start, y_end, pad_l, pad_r, pad_t, pad_b, rx_start, rx_end, ry_start, ry_end
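The horizontal half of that rescaling, isolated as a worked example (an illustration with an introduced name, `scale_pads`; it assumes both pads are scaled by the ratio computed from the original, pre-update extents):

```python
def scale_pads(x_start, x_end, pad_l, pad_r, rx_start, rx_end):
    # resize ratio of the padded crop onto the resized span
    ratio = (rx_end - rx_start) / (x_end - x_start + pad_l + pad_r)
    return round(pad_l * ratio), round(pad_r * ratio)
```

For example, an 80-wide crop with 10 pixels of pad on each side resized onto a 50-wide span has ratio 50/100 = 0.5, so each pad becomes 5.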

kneron_preprocessing/Cflow.py (new file, 172 lines)
@@ -0,0 +1,172 @@
import numpy as np
import argparse
import kneron_preprocessing


def main_(args):
    image = args.input_file
    filefmt = args.file_fmt
    if filefmt == 'bin':
        raw_format = args.raw_format
        raw_w = args.input_width
        raw_h = args.input_height

        image_data = kneron_preprocessing.API.load_bin(image, raw_format, (raw_w, raw_h))
    else:
        image_data = kneron_preprocessing.API.load_image(image)

    npu_w = args.width
    npu_h = args.height

    crop_first = (args.crop_first == "True")
    if crop_first:
        x1 = args.x_pos
        y1 = args.y_pos
        x2 = args.crop_w + x1
        y2 = args.crop_h + y1
        crop_box = [x1, y1, x2, y2]
    else:
        crop_box = None

    pad_mode = args.pad_mode
    norm_mode = args.norm_mode
    bitwidth = args.bitwidth
    radix = args.radix
    rotate = args.rotate_mode

    ##
    image_data = kneron_preprocessing.API.inproc_520(image_data, npu_size=(npu_w, npu_h), crop_box=crop_box, pad_mode=pad_mode, norm=norm_mode, rotate=rotate, radix=radix, bit_width=bitwidth)

    output_file = args.output_file
    kneron_preprocessing.API.dump_image(image_data, output_file, 'bin', 'rgba')

    return


if __name__ == "__main__":
    argparser = argparse.ArgumentParser(
        description="preprocessing"
    )

    argparser.add_argument(
        '-i',
        '--input_file',
        help="input file name"
    )

    argparser.add_argument(
        '-ff',
        '--file_fmt',
        help="input file format, jpg or bin"
    )

    argparser.add_argument(
        '-rf',
        '--raw_format',
        help="input file image format, rgb or rgb565 or nir"
    )

    argparser.add_argument(
        '-i_w',
        '--input_width',
        type=int,
        help="input image width"
    )

    argparser.add_argument(
        '-i_h',
        '--input_height',
        type=int,
        help="input image height"
    )

    argparser.add_argument(
        '-o',
        '--output_file',
        help="output file name"
    )

    argparser.add_argument(
        '-s_w',
        '--width',
        type=int,
        help="output width for npu input",
    )

    argparser.add_argument(
        '-s_h',
        '--height',
        type=int,
        help="output height for npu input",
    )

    argparser.add_argument(
        '-c_f',
        '--crop_first',
        help="crop first, True or False",
    )

    argparser.add_argument(
        '-x',
        '--x_pos',
        type=int,
        help="left-up coordinate x",
    )

    argparser.add_argument(
        '-y',
        '--y_pos',
        type=int,
        help="left-up coordinate y",
    )

    argparser.add_argument(
        '-c_w',
        '--crop_w',
        type=int,
        help="crop width",
    )

    argparser.add_argument(
        '-c_h',
        '--crop_h',
        type=int,
        help="crop height",
    )

    argparser.add_argument(
        '-p_m',
        '--pad_mode',
        type=int,
        help="0: pad 2 sides, 1: pad 1 side, 2: no pad",
    )

    argparser.add_argument(
        '-n_m',
        '--norm_mode',
        help="normalization mode: yolo, kneron, tf"
    )

    argparser.add_argument(
        '-r_m',
        '--rotate_mode',
        type=int,
        help="rotate mode: 0, 1, 2"
    )

    argparser.add_argument(
        '-bw',
        '--bitwidth',
        type=int,
        help="int for bitwidth"
    )

    argparser.add_argument(
        '-r',
        '--radix',
        type=int,
        help="int for radix"
    )

    args = argparser.parse_args()
    main_(args)

kneron_preprocessing/Flow.py (new file, 1226 lines)
File diff suppressed because it is too large

kneron_preprocessing/__init__.py (new file)
@@ -0,0 +1,2 @@
from .Flow import *
from .API import *

kneron_preprocessing/funcs/ColorConversion.py (new file, 285 lines)
@@ -0,0 +1,285 @@
import numpy as np
from PIL import Image
from .utils import signed_rounding, clip, str2bool

format_bit = 10
c00_yuv = 1
c02_yuv = 1436
c10_yuv = 1
c11_yuv = -354
c12_yuv = -732
c20_yuv = 1
c21_yuv = 1814
c00_ycbcr = 1192
c02_ycbcr = 1634
c10_ycbcr = 1192
c11_ycbcr = -401
c12_ycbcr = -833
c20_ycbcr = 1192
c21_ycbcr = 2065

Matrix_ycbcr_to_rgb888 = np.array(
    [[1.16438356e+00, 1.16438356e+00, 1.16438356e+00],
     [2.99747219e-07, -3.91762529e-01, 2.01723263e+00],
     [1.59602686e+00, -8.12968294e-01, 3.04059479e-06]])

Matrix_rgb888_to_ycbcr = np.array(
    [[0.25678824, -0.14822353, 0.43921569],
     [0.50412941, -0.29099216, -0.36778824],
     [0.09790588, 0.43921569, -0.07142745]])

Matrix_rgb888_to_yuv = np.array(
    [[0.29899106, -0.16877996, 0.49988381],
     [0.5865453, -0.33110385, -0.41826072],
     [0.11446364, 0.49988381, -0.08162309]])

# Matrix_rgb888_to_yuv = np.array(
#     [[0.299, -0.147, 0.615],
#      [0.587, -0.289, -0.515],
#      [0.114, 0.436, -0.100]])

# Matrix_yuv_to_rgb888 = np.array(
#     [[1.000, 1.000, 1.000],
#      [0.000, -0.394, 2.032],
#      [1.140, -0.581, 0.000]])

class runner(object):
    def __init__(self):
        self.set = {
            'print_info': 'no',
            'model_size': [0, 0],
            'numerical_type': 'floating',
            "source_format": "rgb888",
            "out_format": "rgb888",
            "options": {
                "simulation": "no",
                "simulation_format": "rgb888"
            }
        }

    def update(self, **kwargs):
        self.set.update(kwargs)

        ## simulation
        self.funs = []
        if str2bool(self.set['options']['simulation']) and self.set['source_format'].lower() in ('rgb888', 'rgb'):
            if self.set['options']['simulation_format'].lower() in ('yuv422', 'yuv'):
                self.funs.append(self._ColorConversion_RGB888_to_YUV422)
                self.set['source_format'] = 'YUV422'
            elif self.set['options']['simulation_format'].lower() in ('ycbcr422', 'ycbcr'):
                self.funs.append(self._ColorConversion_RGB888_to_YCbCr422)
                self.set['source_format'] = 'YCbCr422'
            elif self.set['options']['simulation_format'].lower() == 'rgb565':
                self.funs.append(self._ColorConversion_RGB888_to_RGB565)
                self.set['source_format'] = 'RGB565'

        ## to rgb888
        if self.set['source_format'].lower() in ('yuv444', 'yuv422', 'yuv'):
            self.funs.append(self._ColorConversion_YUV_to_RGB888)
        elif self.set['source_format'].lower() in ('ycbcr444', 'ycbcr422', 'ycbcr'):
            self.funs.append(self._ColorConversion_YCbCr_to_RGB888)
        elif self.set['source_format'].lower() == 'rgb565':
            self.funs.append(self._ColorConversion_RGB565_to_RGB888)
        elif self.set['source_format'].lower() in ('l', 'nir'):
            self.funs.append(self._ColorConversion_L_to_RGB888)
        elif self.set['source_format'].lower() in ('rgba8888', 'rgba'):
            self.funs.append(self._ColorConversion_RGBA8888_to_RGB888)

        ## output format
        if self.set['out_format'].lower() == 'l':
            self.funs.append(self._ColorConversion_RGB888_to_L)
        elif self.set['out_format'].lower() == 'rgb565':
            self.funs.append(self._ColorConversion_RGB888_to_RGB565)
        elif self.set['out_format'].lower() in ('rgba', 'rgba8888'):
            self.funs.append(self._ColorConversion_RGB888_to_RGBA8888)
        elif self.set['out_format'].lower() in ('yuv', 'yuv444'):
            self.funs.append(self._ColorConversion_RGB888_to_YUV444)
        elif self.set['out_format'].lower() == 'yuv422':
            self.funs.append(self._ColorConversion_RGB888_to_YUV422)
        elif self.set['out_format'].lower() in ('ycbcr', 'ycbcr444'):
            self.funs.append(self._ColorConversion_RGB888_to_YCbCr444)
        elif self.set['out_format'].lower() == 'ycbcr422':
            self.funs.append(self._ColorConversion_RGB888_to_YCbCr422)

    def print_info(self):
        print("<colorConversion>",
              "source_format:", self.set['source_format'],
              ', out_format:', self.set['out_format'],
              ', simulation:', self.set['options']['simulation'],
              ', simulation_format:', self.set['options']['simulation_format'])

    def run(self, image_data):
        assert isinstance(image_data, np.ndarray)
        # print info
        if str2bool(self.set['print_info']):
            self.print_info()

        # color
        for f in self.funs:
            image_data = f(image_data)

        # output
        info = {}
        return image_data, info

    def _ColorConversion_RGB888_to_YUV444(self, image):
        ## floating
        image = image.astype('float')
        image = (image @ Matrix_rgb888_to_yuv + 0.5).astype('uint8')
        return image

    def _ColorConversion_RGB888_to_YUV422(self, image):
        # rgb888 to yuv444
        image = self._ColorConversion_RGB888_to_YUV444(image)

        # yuv444 to yuv422: take U from even columns and V from odd columns,
        # then repeat each kept sample so neighbouring column pairs share chroma
        u2 = image[:, 0::2, 1]
        u4 = np.repeat(u2, 2, axis=1)
        v2 = image[:, 1::2, 2]
        v4 = np.repeat(v2, 2, axis=1)
        image[..., 1] = u4
        image[..., 2] = v4
        return image
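The 4:2:2 subsampling above can be demonstrated in isolation: after the repeat, each pair of adjacent columns carries the same U (from the even column) and the same V (from the odd column). A small numpy demo of exactly that slicing:

```python
import numpy as np

# a tiny 2x4 "YUV" image with distinct values per element
yuv = np.arange(2 * 4 * 3, dtype=np.uint8).reshape(2, 4, 3)
u = np.repeat(yuv[:, 0::2, 1], 2, axis=1)  # even-column U, duplicated
v = np.repeat(yuv[:, 1::2, 2], 2, axis=1)  # odd-column V, duplicated
```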
|
||||

    def _ColorConversion_YUV_to_RGB888(self, image):
        ## fixed
        h, w, c = image.shape
        image_f = image.reshape((h * w, c))
        image_rgb_f = np.zeros(image_f.shape, dtype=np.uint8)

        for i in range(h * w):
            image_y = image_f[i, 0] * 1024
            if image_f[i, 1] > 127:
                image_u = -((~(image_f[i, 1] - 1)) & 0xFF)
            else:
                image_u = image_f[i, 1]
            if image_f[i, 2] > 127:
                image_v = -((~(image_f[i, 2] - 1)) & 0xFF)
            else:
                image_v = image_f[i, 2]

            image_r = c00_yuv * image_y + c02_yuv * image_v
            image_g = c10_yuv * image_y + c11_yuv * image_u + c12_yuv * image_v
            image_b = c20_yuv * image_y + c21_yuv * image_u

            image_r = signed_rounding(image_r, format_bit)
            image_g = signed_rounding(image_g, format_bit)
            image_b = signed_rounding(image_b, format_bit)

            image_r = image_r >> format_bit
            image_g = image_g >> format_bit
            image_b = image_b >> format_bit

            image_rgb_f[i, 0] = clip(image_r, 0, 255)
            image_rgb_f[i, 1] = clip(image_g, 0, 255)
            image_rgb_f[i, 2] = clip(image_b, 0, 255)

        image_rgb = image_rgb_f.reshape((h, w, c))
        return image_rgb

    def _ColorConversion_RGB888_to_YCbCr444(self, image):
        ## floating
        image = image.astype('float')
        image = (image @ Matrix_rgb888_to_ycbcr + 0.5).astype('uint8')
        image[:, :, 0] += 16
        image[:, :, 1] += 128
        image[:, :, 2] += 128

        return image

    def _ColorConversion_RGB888_to_YCbCr422(self, image):
        # rgb888 to ycbcr444
        image = self._ColorConversion_RGB888_to_YCbCr444(image)

        # ycbcr444 to ycbcr422
        cb2 = image[:, 0::2, 1]
        cb4 = np.repeat(cb2, 2, axis=1)
        cr2 = image[:, 1::2, 2]
        cr4 = np.repeat(cr2, 2, axis=1)
        image[..., 1] = cb4
        image[..., 2] = cr4
        return image

    def _ColorConversion_YCbCr_to_RGB888(self, image):
        ## floating
        if self.set['numerical_type'] == 'floating':
            image = image.astype('float')
            image[:, :, 0] -= 16
            image[:, :, 1] -= 128
            image[:, :, 2] -= 128
            image = ((image @ Matrix_ycbcr_to_rgb888) + 0.5).astype('uint8')
            return image

        ## fixed
        h, w, c = image.shape
        image_f = image.reshape((h * w, c))
        image_rgb_f = np.zeros(image_f.shape, dtype=np.uint8)

        for i in range(h * w):
            image_y = (image_f[i, 0] - 16) * c00_ycbcr
            image_cb = image_f[i, 1] - 128
            image_cr = image_f[i, 2] - 128

            image_r = image_y + c02_ycbcr * image_cr
            image_g = image_y + c11_ycbcr * image_cb + c12_ycbcr * image_cr
            image_b = image_y + c21_ycbcr * image_cb

            image_r = signed_rounding(image_r, format_bit)
            image_g = signed_rounding(image_g, format_bit)
            image_b = signed_rounding(image_b, format_bit)

            image_r = image_r >> format_bit
            image_g = image_g >> format_bit
            image_b = image_b >> format_bit

            image_rgb_f[i, 0] = clip(image_r, 0, 255)
            image_rgb_f[i, 1] = clip(image_g, 0, 255)
            image_rgb_f[i, 2] = clip(image_b, 0, 255)

        image_rgb = image_rgb_f.reshape((h, w, c))
        return image_rgb

    def _ColorConversion_RGB888_to_RGB565(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] >= 3

        image_rgb565 = np.zeros(image.shape, dtype=np.uint8)
        image_rgb = image.astype('uint8')
        image_rgb565[:, :, 0] = image_rgb[:, :, 0] >> 3
        image_rgb565[:, :, 1] = image_rgb[:, :, 1] >> 2
        image_rgb565[:, :, 2] = image_rgb[:, :, 2] >> 3
        return image_rgb565

    def _ColorConversion_RGB565_to_RGB888(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] == 3

        image_rgb = np.zeros(image.shape, dtype=np.uint8)
        image_rgb[:, :, 0] = image[:, :, 0] << 3
        image_rgb[:, :, 1] = image[:, :, 1] << 2
        image_rgb[:, :, 2] = image[:, :, 2] << 3
        return image_rgb

    def _ColorConversion_L_to_RGB888(self, image):
        image_L = image.astype('uint8')
        img = Image.fromarray(image_L).convert('RGB')
        image_data = np.array(img).astype('uint8')
        return image_data

    def _ColorConversion_RGB888_to_L(self, image):
        image_rgb = image.astype('uint8')
        img = Image.fromarray(image_rgb).convert('L')
        image_data = np.array(img).astype('uint8')
        return image_data

    def _ColorConversion_RGBA8888_to_RGB888(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] == 4
        return image[:, :, :3]

    def _ColorConversion_RGB888_to_RGBA8888(self, image):
        assert len(image.shape) == 3
        assert image.shape[2] == 3
        imageA = np.concatenate((image, np.zeros((image.shape[0], image.shape[1], 1), dtype=np.uint8)), axis=2)
        return imageA
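The RGB565 pair above quantizes each channel to 5/6/5 bits, so a round trip loses the low bits. A standalone sketch of that precision loss (function names are mine, not the library's):

```python
import numpy as np

def rgb888_to_rgb565(img):
    out = img.astype(np.uint8).copy()
    out[..., 0] >>= 3   # R: keep top 5 bits
    out[..., 1] >>= 2   # G: keep top 6 bits
    out[..., 2] >>= 3   # B: keep top 5 bits
    return out

def rgb565_to_rgb888(img):
    out = img.astype(np.uint8).copy()
    out[..., 0] <<= 3
    out[..., 1] <<= 2
    out[..., 2] <<= 3
    return out

px = np.array([[[255, 255, 255]]], dtype=np.uint8)
rt = rgb565_to_rgb888(rgb888_to_rgb565(px))
# 255 -> 31 -> 248 for the 5-bit channels, 255 -> 63 -> 252 for the 6-bit green
```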
145  kneron_preprocessing/funcs/Crop.py  (new file)
@@ -0,0 +1,145 @@
import numpy as np
from PIL import Image
from .utils import str2int, str2float, str2bool, pad_square_to_4
from .utils_520 import round_up_n
from .Runner_base import Runner_base, Param_base

class General(Param_base):
    type = 'center'
    align_w_to_4 = False
    pad_square_to_4 = False
    rounding_type = 0
    crop_w = 0
    crop_h = 0
    start_x = 0.
    start_y = 0.
    end_x = 0.
    end_y = 0.

    def update(self, **dic):
        self.type = dic['type']
        self.align_w_to_4 = str2bool(dic['align_w_to_4'])
        self.rounding_type = str2int(dic['rounding_type'])
        self.crop_w = str2int(dic['crop_w'])
        self.crop_h = str2int(dic['crop_h'])
        self.start_x = str2float(dic['start_x'])
        self.start_y = str2float(dic['start_y'])
        self.end_x = str2float(dic['end_x'])
        self.end_y = str2float(dic['end_y'])

    def __str__(self):
        str_out = [
            ', type:', str(self.type),
            ', align_w_to_4:', str(self.align_w_to_4),
            ', pad_square_to_4:', str(self.pad_square_to_4),
            ', crop_w:', str(self.crop_w),
            ', crop_h:', str(self.crop_h),
            ', start_x:', str(self.start_x),
            ', start_y:', str(self.start_y),
            ', end_x:', str(self.end_x),
            ', end_y:', str(self.end_y)]
        return ' '.join(str_out)

class runner(Runner_base):
    ## overwrite the class in Runner_base
    general = General()

    def __str__(self):
        return '<Crop>'

    def update(self, **kwargs):
        ##
        super().update(**kwargs)

        ## an explicit start/end range switches the crop type to 'specific'
        if (self.general.start_x != self.general.end_x) and (self.general.start_y != self.general.end_y):
            self.general.type = 'specific'
        elif self.general.type != 'specific':
            if self.general.crop_w == 0 or self.general.crop_h == 0:
                self.general.crop_w = self.common.model_size[0]
                self.general.crop_h = self.common.model_size[1]
            assert self.general.crop_w > 0
            assert self.general.crop_h > 0
            assert self.general.type.lower() in ['center', 'corner']
        else:
            assert self.general.type == 'specific'
    def run(self, image_data):
        ## init
        img = Image.fromarray(image_data)
        w, h = img.size

        ## get range
        if self.general.type.lower() == 'center':
            x1, y1, x2, y2 = self._calcuate_xy_center(w, h)
        elif self.general.type.lower() == 'corner':
            x1, y1, x2, y2 = self._calcuate_xy_corner(w, h)
        else:
            x1 = self.general.start_x
            y1 = self.general.start_y
            x2 = self.general.end_x
            y2 = self.general.end_y
            assert (x1 != x2) and (y1 != y2)

        ## rounding
        if self.general.rounding_type == 0:
            x1 = int(np.floor(x1))
            y1 = int(np.floor(y1))
            x2 = int(np.ceil(x2))
            y2 = int(np.ceil(y2))
        else:
            x1 = int(round(x1))
            y1 = int(round(y1))
            x2 = int(round(x2))
            y2 = int(round(y2))

        ## align both x edges of the crop window to multiples of 4
        if self.general.align_w_to_4:
            # x1 = (x1+1) & (~3)  # //+2
            # x2 = (x2+2) & (~3)  # //+1
            x1 = (x1 + 3) & (~3)  # //+2
            left = w - x2
            left = (left + 3) & (~3)
            x2 = w - left

        ## pad_square_to_4
        if str2bool(self.general.pad_square_to_4):
            x1, x2, y1, y2 = pad_square_to_4(x1, x2, y1, y2)

        # do crop
        box = (x1, y1, x2, y2)
        img = img.crop(box)

        # print info
        if str2bool(self.common.print_info):
            self.general.start_x = x1
            self.general.start_y = y1
            self.general.end_x = x2
            self.general.end_y = y2
            self.general.crop_w = x2 - x1
            self.general.crop_h = y2 - y1
            self.print_info()

        # output
        image_data = np.array(img)
        info = {}
        info['box'] = box

        return image_data, info

    ## protect fun
    def _calcuate_xy_center(self, w, h):
        x1 = w / 2 - self.general.crop_w / 2
        y1 = h / 2 - self.general.crop_h / 2
        x2 = w / 2 + self.general.crop_w / 2
        y2 = h / 2 + self.general.crop_h / 2
        return x1, y1, x2, y2

    def _calcuate_xy_corner(self, _1, _2):
        x1 = 0
        y1 = 0
        x2 = self.general.crop_w
        y2 = self.general.crop_h
        return x1, y1, x2, y2

    def do_crop(self, image_data, startW, startH, endW, endH):
        return image_data[startH:endH, startW:endW, :]
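The center-crop box math in `_calcuate_xy_center` can be sketched standalone; the function name here is illustrative, not the library API:

```python
def center_crop_box(w, h, crop_w, crop_h):
    # symmetric box around the image center; may be fractional before rounding
    x1 = w / 2 - crop_w / 2
    y1 = h / 2 - crop_h / 2
    x2 = w / 2 + crop_w / 2
    y2 = h / 2 + crop_h / 2
    return x1, y1, x2, y2

# a 100x80 image center-cropped to 60x40 gives the box (20, 20, 80, 60)
box = center_crop_box(100, 80, 60, 40)
```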
186  kneron_preprocessing/funcs/Normalize.py  (new file)
@@ -0,0 +1,186 @@
import numpy as np
from .utils import str2bool, str2int, str2float, clip_ary

class runner(object):
    def __init__(self):
        self.set = {
            'general': {
                'print_info': 'no',
                'model_size': [0, 0],
                'numerical_type': 'floating',
                'type': 'kneron'
            },
            'floating': {
                'scale': 1,
                'bias': 0,
                'mean': '',
                'std': '',
            },
            'hw': {
                'radix': 8,
                'shift': '',
                'sub': ''
            }
        }

    def update(self, **kwargs):
        #
        self.set.update(kwargs)

        # pick the normalize function and its shift/sub or scale/bias/mean/std parameters
        if self.set['general']['numerical_type'] == '520':
            if self.set['general']['type'].lower() == 'tf':
                self.fun_normalize = self._chen_520
                self.shift = 7 - self.set['hw']['radix']
                self.sub = 128
            elif self.set['general']['type'].lower() == 'yolo':
                self.fun_normalize = self._chen_520
                self.shift = 8 - self.set['hw']['radix']
                self.sub = 0
            elif self.set['general']['type'].lower() == 'kneron':
                self.fun_normalize = self._chen_520
                self.shift = 8 - self.set['hw']['radix']
                self.sub = 128
            else:
                self.fun_normalize = self._chen_520
                self.shift = 0
                self.sub = 0
        elif self.set['general']['numerical_type'] == '720':
            self.fun_normalize = self._chen_720
            self.shift = 0
            self.sub = 0
        else:
            if self.set['general']['type'].lower() == 'torch':
                self.fun_normalize = self._normalize_torch
                self.set['floating']['scale'] = 255.
                self.set['floating']['mean'] = [0.485, 0.456, 0.406]
                self.set['floating']['std'] = [0.229, 0.224, 0.225]
            elif self.set['general']['type'].lower() == 'tf':
                self.fun_normalize = self._normalize_tf
                self.set['floating']['scale'] = 127.5
                self.set['floating']['bias'] = -1.
            elif self.set['general']['type'].lower() == 'caffe':
                self.fun_normalize = self._normalize_caffe
                self.set['floating']['mean'] = [103.939, 116.779, 123.68]
            elif self.set['general']['type'].lower() == 'yolo':
                self.fun_normalize = self._normalize_yolo
                self.set['floating']['scale'] = 255.
            elif self.set['general']['type'].lower() == 'kneron':
                self.fun_normalize = self._normalize_kneron
                self.set['floating']['scale'] = 256.
                self.set['floating']['bias'] = -0.5
            else:
                self.fun_normalize = self._normalize_customized
                self.set['floating']['scale'] = str2float(self.set['floating']['scale'])
                self.set['floating']['bias'] = str2float(self.set['floating']['bias'])
                if self.set['floating']['mean'] is not None:
                    if len(self.set['floating']['mean']) != 3:
                        self.set['floating']['mean'] = None
                if self.set['floating']['std'] is not None:
                    if len(self.set['floating']['std']) != 3:
                        self.set['floating']['std'] = None
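The 'torch' preset selected above scales to [0, 1] and then standardizes with the ImageNet per-channel mean and std. A standalone sketch (the function name is mine; the constants are the ones set in the table above):

```python
import numpy as np

mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_torch(x):
    # scale 8-bit values to [0, 1], then standardize each channel
    x = x.astype(float) / 255.0
    return (x - np.array(mean)) / np.array(std)

x = np.full((1, 1, 3), 255, dtype=np.uint8)
out = normalize_torch(x)
```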
    def print_info(self):
        if self.set['general']['numerical_type'] == '520':
            print("<normalize>",
                  'numerical_type:', self.set['general']['numerical_type'],
                  ', type:', self.set['general']['type'],
                  ', shift:', self.shift,
                  ', sub:', self.sub)
        else:
            print("<normalize>",
                  'numerical_type:', self.set['general']['numerical_type'],
                  ', type:', self.set['general']['type'],
                  ', scale:', self.set['floating']['scale'],
                  ', bias:', self.set['floating']['bias'],
                  ', mean:', self.set['floating']['mean'],
                  ', std:', self.set['floating']['std'])

    def run(self, image_data):
        # print info
        if str2bool(self.set['general']['print_info']):
            self.print_info()

        # norm
        image_data = self.fun_normalize(image_data)

        # output
        info = {}
        return image_data, info

    def _normalize_torch(self, x):
        if len(x.shape) != 3:
            return x
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        x[..., 0] -= self.set['floating']['mean'][0]
        x[..., 1] -= self.set['floating']['mean'][1]
        x[..., 2] -= self.set['floating']['mean'][2]
        x[..., 0] /= self.set['floating']['std'][0]
        x[..., 1] /= self.set['floating']['std'][1]
        x[..., 2] /= self.set['floating']['std'][2]
        return x

    def _normalize_tf(self, x):
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        x = x + self.set['floating']['bias']
        return x

    def _normalize_caffe(self, x):
        if len(x.shape) != 3:
            return x
        x = x.astype('float')
        x = x[..., ::-1]
        x[..., 0] -= self.set['floating']['mean'][0]
        x[..., 1] -= self.set['floating']['mean'][1]
        x[..., 2] -= self.set['floating']['mean'][2]
        return x

    def _normalize_yolo(self, x):
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        return x

    def _normalize_kneron(self, x):
        x = x.astype('float')
        x = x / self.set['floating']['scale']
        x = x + self.set['floating']['bias']
        return x

    def _normalize_customized(self, x):
        x = x.astype('float')
        if self.set['floating']['scale'] != 0:
            x = x / self.set['floating']['scale']
        x = x + self.set['floating']['bias']
        if self.set['floating']['mean'] is not None:
            x[..., 0] -= self.set['floating']['mean'][0]
            x[..., 1] -= self.set['floating']['mean'][1]
            x[..., 2] -= self.set['floating']['mean'][2]
        if self.set['floating']['std'] is not None:
            x[..., 0] /= self.set['floating']['std'][0]
            x[..., 1] /= self.set['floating']['std'][1]
            x[..., 2] /= self.set['floating']['std'][2]

        return x

    def _chen_520(self, x):
        # subtract in wrapping uint8 arithmetic, then right-shift by `shift`
        x = (x - self.sub).astype('uint8')
        x = np.right_shift(x, self.shift)
        x = x.astype('uint8')
        return x

    def _chen_720(self, x):
        # note: both branches of the original `if self.shift == 1` check applied
        # the same offset, so the branch is collapsed here
        x = x + np.array([[self.sub], [self.sub], [self.sub]])
        return x
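The 520 fixed path in `_chen_520` subtracts `sub` with uint8 wrap-around, then right-shifts. A standalone sketch (the function name is mine; the int16 intermediate just makes the wrap explicit and version-independent):

```python
import numpy as np

def chen_520(x, sub, shift):
    # subtract, wrapping like uint8 arithmetic, then right-shift
    x = (x.astype(np.int16) - sub).astype(np.uint8)
    return np.right_shift(x, shift).astype(np.uint8)

x = np.array([0, 128, 255], dtype=np.uint8)
out = chen_520(x, 128, 0)
# 0 - 128 wraps to 128, 128 - 128 -> 0, 255 - 128 -> 127
```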
187  kneron_preprocessing/funcs/Padding.py  (new file)
@@ -0,0 +1,187 @@
import numpy as np
from PIL import Image
from .utils import str2bool, str2int, str2float
from .Runner_base import Runner_base, Param_base

class General(Param_base):
    type = ''
    pad_val = ''
    padded_w = ''
    padded_h = ''
    pad_l = ''
    pad_r = ''
    pad_t = ''
    pad_b = ''
    padding_ch = 3
    padding_ch_type = 'RGB'

    def update(self, **dic):
        self.type = dic['type']
        self.pad_val = dic['pad_val']
        self.padded_w = str2int(dic['padded_w'])
        self.padded_h = str2int(dic['padded_h'])
        self.pad_l = str2int(dic['pad_l'])
        self.pad_r = str2int(dic['pad_r'])
        self.pad_t = str2int(dic['pad_t'])
        self.pad_b = str2int(dic['pad_b'])

    def __str__(self):
        str_out = [
            ', type:', str(self.type),
            ', pad_val:', str(self.pad_val),
            ', pad_l:', str(self.pad_l),
            ', pad_r:', str(self.pad_r),
            ', pad_t:', str(self.pad_t),
            ', pad_b:', str(self.pad_b),
            ', padding_ch:', str(self.padding_ch)]
        return ' '.join(str_out)

class Hw(Param_base):
    radix = 8
    normalize_type = 'floating'

    def update(self, **dic):
        self.radix = dic['radix']
        self.normalize_type = dic['normalize_type']

    def __str__(self):
        str_out = [
            ', radix:', str(self.radix),
            ', normalize_type:', str(self.normalize_type)]
        return ' '.join(str_out)

class runner(Runner_base):
    ## overwrite the class in Runner_base
    general = General()
    hw = Hw()

    def __str__(self):
        return '<Padding>'

    def update(self, **kwargs):
        super().update(**kwargs)

        ## update pad type & pad length
        if (self.general.pad_l != 0) or (self.general.pad_r != 0) or (self.general.pad_t != 0) or (self.general.pad_b != 0):
            self.general.type = 'specific'
            assert self.general.pad_l >= 0
            assert self.general.pad_r >= 0
            assert self.general.pad_t >= 0
            assert self.general.pad_b >= 0
        elif self.general.type != 'specific':
            if self.general.padded_w == 0 or self.general.padded_h == 0:
                self.general.padded_w = self.common.model_size[0]
                self.general.padded_h = self.common.model_size[1]
            assert self.general.padded_w > 0
            assert self.general.padded_h > 0
            assert self.general.type.lower() in ['center', 'corner']
        else:
            assert self.general.type == 'specific'

        ## decide pad_val & padding ch
        # if numerical_type is floating
        if self.common.numerical_type == 'floating':
            if self.general.pad_val != 'edge':
                self.general.pad_val = str2float(self.general.pad_val)
            self.general.padding_ch = 3
            self.general.padding_ch_type = 'RGB'
        # if numerical_type is 520 or 720
        else:
            if self.general.pad_val == '':
                if self.hw.normalize_type.lower() == 'tf':
                    self.general.pad_val = np.uint8(-128 >> (7 - self.hw.radix))
                elif self.hw.normalize_type.lower() == 'yolo':
                    self.general.pad_val = np.uint8(0 >> (8 - self.hw.radix))
                elif self.hw.normalize_type.lower() == 'kneron':
                    self.general.pad_val = np.uint8(-128 >> (8 - self.hw.radix))
                else:
                    self.general.pad_val = np.uint8(0 >> (8 - self.hw.radix))
            else:
                self.general.pad_val = str2int(self.general.pad_val)
            self.general.padding_ch = 4
            self.general.padding_ch_type = 'RGBA'

    def run(self, image_data):
        # init
        shape = image_data.shape
        w = shape[1]
        h = shape[0]
        if len(shape) < 3:
            self.general.padding_ch = 1
            self.general.padding_ch_type = 'L'
        else:
            if shape[2] == 3 and self.general.padding_ch == 4:
                image_data = np.concatenate((image_data, np.zeros((h, w, 1), dtype=np.uint8)), axis=2)

        ## padding
        if self.general.type.lower() == 'center':
            img_pad = self._padding_center(image_data, w, h)
        elif self.general.type.lower() == 'corner':
            img_pad = self._padding_corner(image_data, w, h)
        else:
            img_pad = self._padding_sp(image_data, w, h)

        # print info
        if str2bool(self.common.print_info):
            self.print_info()

        # output
        info = {}
        return img_pad, info
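The symmetric split used by `_padding_center` below divides the total padding in two, sending the odd pixel to the bottom/right edge. A one-function sketch (the helper name is mine):

```python
def split_pad(total):
    # (near-side, far-side): the extra pixel of an odd total goes to the far side
    return total // 2, total // 2 + total % 2

# splitting 5 pixels of vertical padding gives (2, 3): 2 on top, 3 on bottom
```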
    ## protect fun
    def _padding_center(self, img, ori_w, ori_h):
        padH = self.general.padded_h - ori_h
        padW = self.general.padded_w - ori_w
        self.general.pad_t = padH // 2
        self.general.pad_b = (padH // 2) + (padH % 2)
        self.general.pad_l = padW // 2
        self.general.pad_r = (padW // 2) + (padW % 2)
        if self.general.pad_l < 0 or self.general.pad_r < 0 or self.general.pad_t < 0 or self.general.pad_b < 0:
            return img
        img_pad = self._padding_sp(img, ori_w, ori_h)
        return img_pad

    def _padding_corner(self, img, ori_w, ori_h):
        self.general.pad_l = 0
        self.general.pad_r = self.general.padded_w - ori_w
        self.general.pad_t = 0
        self.general.pad_b = self.general.padded_h - ori_h
        if self.general.pad_l < 0 or self.general.pad_r < 0 or self.general.pad_t < 0 or self.general.pad_b < 0:
            return img
        img_pad = self._padding_sp(img, ori_w, ori_h)
        return img_pad

    def _padding_sp(self, img, ori_w, ori_h):
        if self.general.padding_ch == 1:
            pad_range = ((self.general.pad_t, self.general.pad_b), (self.general.pad_l, self.general.pad_r))
        else:
            pad_range = ((self.general.pad_t, self.general.pad_b), (self.general.pad_l, self.general.pad_r), (0, 0))

        if isinstance(self.general.pad_val, str):
            if self.general.pad_val == 'edge':
                padded_image = np.pad(img, pad_range, mode="edge")
            else:
                padded_image = np.pad(img, pad_range, mode="constant", constant_values=0)
        else:
            padded_image = np.pad(img, pad_range, mode="constant", constant_values=self.general.pad_val)

        return padded_image
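The `np.pad` call at the core of `_padding_sp` can be sketched standalone: pad an HxWxC image by `(t, b)` rows and `(l, r)` columns with a constant value (function name and sizes here are illustrative):

```python
import numpy as np

def pad_image(img, t, b, l, r, val):
    # grayscale images get a 2-axis pad spec; color images add a no-op channel axis
    if img.ndim == 2:
        pad_range = ((t, b), (l, r))
    else:
        pad_range = ((t, b), (l, r), (0, 0))
    return np.pad(img, pad_range, mode="constant", constant_values=val)

img = np.ones((2, 2, 3), dtype=np.uint8)
out = pad_image(img, 1, 1, 2, 2, 0)
# shape grows from (2, 2, 3) to (4, 6, 3)
```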
237  kneron_preprocessing/funcs/Resize.py  (new file)
@@ -0,0 +1,237 @@
import numpy as np
import cv2
from PIL import Image
from .utils import str2bool, str2int
from ctypes import c_float
from .Runner_base import Runner_base, Param_base

class General(Param_base):
    type = 'bilinear'
    keep_ratio = True
    zoom = True
    calculate_ratio_using_CSim = True
    resize_w = 0
    resize_h = 0
    resized_w = 0
    resized_h = 0

    def update(self, **dic):
        self.type = dic['type']
        self.keep_ratio = str2bool(dic['keep_ratio'])
        self.zoom = str2bool(dic['zoom'])
        self.calculate_ratio_using_CSim = str2bool(dic['calculate_ratio_using_CSim'])
        self.resize_w = str2int(dic['resize_w'])
        self.resize_h = str2int(dic['resize_h'])

    def __str__(self):
        str_out = [
            ', type:', str(self.type),
            ', keep_ratio:', str(self.keep_ratio),
            ', zoom:', str(self.zoom),
            ', calculate_ratio_using_CSim:', str(self.calculate_ratio_using_CSim),
            ', resize_w:', str(self.resize_w),
            ', resize_h:', str(self.resize_h),
            ', resized_w:', str(self.resized_w),
            ', resized_h:', str(self.resized_h)]
        return ' '.join(str_out)

class Hw(Param_base):
    resize_bit = 12

    def update(self, **dic):
        pass

    def __str__(self):
        str_out = [
            ', resize_bit:', str(self.resize_bit)]
        return ' '.join(str_out)

class runner(Runner_base):
    ## overwrite the class in Runner_base
    general = General()
    hw = Hw()

    def __str__(self):
        return '<Resize>'

    def update(self, **kwargs):
        super().update(**kwargs)

        ## if the resize size has not been assigned, take the model size as the resize size
        if self.general.resize_w == 0 or self.general.resize_h == 0:
            self.general.resize_w = self.common.model_size[0]
            self.general.resize_h = self.common.model_size[1]
        assert self.general.resize_w > 0
        assert self.general.resize_h > 0

        ## hardware targets force the matching fixed-point resize
        if self.common.numerical_type == '520':
            self.general.type = 'fixed_520'
        elif self.common.numerical_type == '720':
            self.general.type = 'fixed_720'
        assert self.general.type.lower() in ['bilinear', 'bicubic', 'fixed', 'fixed_520', 'fixed_720', 'cv', 'opencv', 'cv2']
    def run(self, image_data):
        ## init
        ori_w = image_data.shape[1]
        ori_h = image_data.shape[0]
        info = {}

        ##
        if self.general.keep_ratio:
            self.general.resized_w, self.general.resized_h = self.calcuate_scale_keep_ratio(self.general.resize_w, self.general.resize_h, ori_w, ori_h, self.general.calculate_ratio_using_CSim)
        else:
            self.general.resized_w = int(self.general.resize_w)
            self.general.resized_h = int(self.general.resize_h)
        assert self.general.resized_w > 0
        assert self.general.resized_h > 0

        ## skip upscaling when zoom is disabled
        if (self.general.resized_w > ori_w) or (self.general.resized_h > ori_h):
            if not self.general.zoom:
                info['size'] = (ori_w, ori_h)
                if str2bool(self.common.print_info):
                    print('no resize')
                    self.print_info()
                return image_data, info

        ## resize
        if self.general.type.lower() == 'bilinear':
            image_data = self.do_resize_bilinear(image_data, self.general.resized_w, self.general.resized_h)
        elif self.general.type.lower() == 'bicubic':
            image_data = self.do_resize_bicubic(image_data, self.general.resized_w, self.general.resized_h)
        elif self.general.type.lower() in ['cv', 'opencv', 'cv2']:
            image_data = self.do_resize_cv2(image_data, self.general.resized_w, self.general.resized_h)
        elif self.general.type.lower() in ['fixed', 'fixed_520', 'fixed_720']:
            image_data = self.do_resize_fixed(image_data, self.general.resized_w, self.general.resized_h, self.hw.resize_bit, self.general.type)

        # output
        info['size'] = (self.general.resized_w, self.general.resized_h)

        # print info
        if str2bool(self.common.print_info):
            self.print_info()

        return image_data, info

    def calcuate_scale_keep_ratio(self, tar_w, tar_h, ori_w, ori_h, calculate_ratio_using_CSim):
        if not calculate_ratio_using_CSim:
            scale_w = tar_w * 1.0 / (ori_w * 1.0)
            scale_h = tar_h * 1.0 / (ori_h * 1.0)
            scale = scale_w if scale_w < scale_h else scale_h
            new_w = int(round(ori_w * scale))
            new_h = int(round(ori_h * scale))
            return new_w, new_h

        ## calculate_ratio_using_CSim: use c_float so the ratio matches the
        ## C simulator's single-precision arithmetic
        scale_w = c_float(tar_w * 1.0 / (ori_w * 1.0)).value
        scale_h = c_float(tar_h * 1.0 / (ori_h * 1.0)).value
        scale_ratio = 0.0
        scale_target_w = 0
        scale_target_h = 0
        padH = 0
        padW = 0

        bScaleW = True if scale_w < scale_h else False
        if bScaleW:
            scale_ratio = scale_w
            scale_target_w = int(c_float(scale_ratio * ori_w + 0.5).value)
            scale_target_h = int(c_float(scale_ratio * ori_h + 0.5).value)
            assert abs(scale_target_w - tar_w) <= 1, "Error: scale down width cannot meet expectation\n"
            padH = tar_h - scale_target_h
            padW = 0
            assert padH >= 0, "Error: padH shouldn't be less than zero\n"
        else:
            scale_ratio = scale_h
            scale_target_w = int(c_float(scale_ratio * ori_w + 0.5).value)
            scale_target_h = int(c_float(scale_ratio * ori_h + 0.5).value)
            assert abs(scale_target_h - tar_h) <= 1, "Error: scale down height cannot meet expectation\n"
            padW = tar_w - scale_target_w
            padH = 0
            assert padW >= 0, "Error: padW shouldn't be less than zero\n"
        new_w = tar_w - padW
        new_h = tar_h - padH
        return new_w, new_h
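The simple (non-CSim) keep-ratio branch above picks the smaller of the two scale factors and rounds the scaled size. A standalone sketch (function name is mine):

```python
def keep_ratio_size(tar_w, tar_h, ori_w, ori_h):
    # the smaller scale guarantees the result fits inside the target box
    scale = min(tar_w / ori_w, tar_h / ori_h)
    return int(round(ori_w * scale)), int(round(ori_h * scale))

# fitting a 640x480 image into a 224x224 box keeps the 4:3 ratio: (224, 168)
size = keep_ratio_size(224, 224, 640, 480)
```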
def do_resize_bilinear(self, image_data, resized_w, resized_h):
|
||||
img = Image.fromarray(image_data)
|
||||
img = img.resize((resized_w, resized_h), Image.BILINEAR)
|
||||
image_data = np.array(img).astype('uint8')
|
||||
        return image_data

    def do_resize_bicubic(self, image_data, resized_w, resized_h):
        img = Image.fromarray(image_data)
        img = img.resize((resized_w, resized_h), Image.BICUBIC)
        image_data = np.array(img).astype('uint8')
        return image_data

    def do_resize_cv2(self, image_data, resized_w, resized_h):
        image_data = cv2.resize(image_data, (resized_w, resized_h))
        image_data = np.array(image_data)
        return image_data

    def do_resize_fixed(self, image_data, resized_w, resized_h, resize_bit, type):
        if len(image_data.shape) < 3:
            # expand a grayscale image into a 3-channel layout
            m, n = image_data.shape
            tmp = np.zeros((m, n, 3), dtype=np.uint8)
            tmp[:, :, 0] = image_data
            image_data = tmp
            c = 3
            gray = True
        else:
            m, n, c = image_data.shape
            gray = False

        resolution = 1 << resize_bit

        # Width pass: fixed-point linear interpolation
        ratio = int(((n - 1) << resize_bit) / (resized_w - 1))
        ratio_cnt = 0
        src_x = 0
        resized_image_w = np.zeros((m, resized_w, c), dtype=np.uint8)

        for dst_x in range(resized_w):
            while ratio_cnt > resolution:
                ratio_cnt = ratio_cnt - resolution
                src_x = src_x + 1
            mul1 = np.ones((m, c)) * (resolution - ratio_cnt)
            mul2 = np.ones((m, c)) * ratio_cnt
            resized_image_w[:, dst_x, :] = np.multiply(np.multiply(
                image_data[:, src_x, :], mul1) + np.multiply(image_data[:, src_x + 1, :], mul2), 1 / resolution)
            ratio_cnt = ratio_cnt + ratio

        # Height pass
        ratio = int(((m - 1) << resize_bit) / (resized_h - 1))
        ## NPU HW special case 2, only on 520
        # note: compare the lowered string against a lowercase literal;
        # `type.lower() in ['FIXED_520', ...]` could never match the uppercase entries
        if type.lower() == 'fixed_520':
            if ((ratio * (resized_h - 1)) % 4096 == 0) and ratio != 4096:
                ratio -= 1

        ratio_cnt = 0
        src_x = 0
        resized_image = np.zeros((resized_h, resized_w, c), dtype=np.uint8)
        for dst_x in range(resized_h):
            while ratio_cnt > resolution:
                ratio_cnt = ratio_cnt - resolution
                src_x = src_x + 1

            mul1 = np.ones((resized_w, c)) * (resolution - ratio_cnt)
            mul2 = np.ones((resized_w, c)) * ratio_cnt

            ## NPU HW special case 1, both on 520 / 720
            if (dst_x > 0) and ratio_cnt == resolution and ratio != resolution \
                    and type.lower() in ('fixed_520', 'fixed_720'):
                resized_image[dst_x, :, :] = np.multiply(np.multiply(
                    resized_image_w[src_x + 1, :, :], mul1) + np.multiply(resized_image_w[src_x + 2, :, :], mul2), 1 / resolution)
            else:
                resized_image[dst_x, :, :] = np.multiply(np.multiply(
                    resized_image_w[src_x, :, :], mul1) + np.multiply(resized_image_w[src_x + 1, :, :], mul2), 1 / resolution)

            ratio_cnt = ratio_cnt + ratio

        if gray:
            resized_image = resized_image[:, :, 0]

        return resized_image
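The fixed-point width pass above walks a `ratio` accumulator against `resolution = 1 << resize_bit` to pick a source index and blend weight per destination pixel. A minimal 1-D sketch of that accumulator (`fixed_point_weights` is a hypothetical helper, not part of the module) makes the mapping explicit:

```python
def fixed_point_weights(n, resized_w, resize_bit=14):
    """Mirror the width-pass accumulator of do_resize_fixed in 1-D.

    Returns (src_x, w) per destination pixel, where
    dst = (1 - w) * src[src_x] + w * src[src_x + 1].
    """
    resolution = 1 << resize_bit
    ratio = int(((n - 1) << resize_bit) / (resized_w - 1))
    ratio_cnt, src_x, out = 0, 0, []
    for _dst_x in range(resized_w):
        # carry the accumulator over into the next source pixel
        while ratio_cnt > resolution:
            ratio_cnt -= resolution
            src_x += 1
        out.append((src_x, ratio_cnt / resolution))
        ratio_cnt += ratio
    return out
```

For `n=4, resized_w=7` the first destination pixel lands exactly on source 0 and the last blends fully into source 3, as expected for an endpoint-aligned resize.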
45  kneron_preprocessing/funcs/Rotate.py  Normal file
@@ -0,0 +1,45 @@
import numpy as np
from .utils import str2bool, str2int


class runner(object):
    def __init__(self, *args, **kwargs):
        self.set = {
            'operator': '',
            'rotate_direction': 0,
            'b_print': False,  # default so update() does not KeyError on self.set['b_print']
        }
        self.update(*args, **kwargs)

    def update(self, *args, **kwargs):
        self.set.update(kwargs)
        self.rotate_direction = str2int(self.set['rotate_direction'])

        # print info
        if str2bool(self.set['b_print']):
            self.print_info()

    def print_info(self):
        print("<rotate>",
              'rotate_direction', self.rotate_direction)

    def run(self, image_data):
        image_data = self._rotate(image_data)
        return image_data

    def _rotate(self, img):
        if self.rotate_direction in (1, 2):
            col, row, unit = img.shape
            pInBuf = img.reshape((-1, 1))
            pOutBufTemp = np.zeros((col * row * unit))
            for r in range(row):
                for c in range(col):
                    for u in range(unit):
                        if self.rotate_direction == 1:
                            pOutBufTemp[unit * (c * row + (row - r - 1)) + u] = pInBuf[unit * (r * col + c) + u]
                        elif self.rotate_direction == 2:
                            pOutBufTemp[unit * (row * (col - c - 1) + r) + u] = pInBuf[unit * (r * col + c) + u]

            img = pOutBufTemp.reshape((col, row, unit))

        return img
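Assuming directions 1 and 2 are intended as 90° clockwise / counter-clockwise rotations of an H x W x C buffer, the intended mapping can be sketched with `np.rot90`. This illustrates the geometry only; it is not a drop-in replacement for the element-wise buffer loop above:

```python
import numpy as np

# toy H x W x C image (H=2, W=3, C=2)
img = np.arange(12).reshape(2, 3, 2)
cw = np.rot90(img, k=-1, axes=(0, 1))   # 90 degrees clockwise
ccw = np.rot90(img, k=1, axes=(0, 1))   # 90 degrees counter-clockwise
```

Both results have shape (3, 2, 2): height and width swap while the channel axis is untouched.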
59  kneron_preprocessing/funcs/Runner_base.py  Normal file
@@ -0,0 +1,59 @@
from abc import ABCMeta, abstractmethod


class Param_base(object):
    @abstractmethod
    def update(self, **dic):
        raise NotImplementedError("Must override")

    def load_dic(self, key, **dic):
        # rebind the attribute itself; assigning to a local obtained
        # from eval('self.' + key) would be silently lost
        if key in dic:
            setattr(self, key, dic[key])

    def __str__(self):
        str_out = []
        return ' '.join(str_out)


class Common(Param_base):
    print_info = False
    model_size = [0, 0]
    numerical_type = 'floating'

    def update(self, **dic):
        self.print_info = dic['print_info']
        self.model_size = dic['model_size']
        self.numerical_type = dic['numerical_type']

    def __str__(self):
        str_out = ['numerical_type:', str(self.numerical_type)]
        return ' '.join(str_out)


class Runner_base(metaclass=ABCMeta):
    common = Common()
    general = Param_base()
    floating = Param_base()
    hw = Param_base()

    def update(self, **kwargs):
        ## update param
        self.common.update(**kwargs['common'])
        self.general.update(**kwargs['general'])
        assert self.common.numerical_type.lower() in ['floating', '520', '720']
        if self.common.numerical_type == 'floating':
            if self.floating.__class__.__name__ != 'Param_base':
                self.floating.update(**kwargs['floating'])
        else:
            if self.hw.__class__.__name__ != 'Param_base':
                self.hw.update(**kwargs['hw'])

    def print_info(self):
        if self.common.numerical_type == 'floating':
            print(self, self.common, self.general, self.floating)
        else:
            print(self, self.common, self.general, self.hw)
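A minimal sketch of the `update(**dic)` pattern used by `Param_base` subclasses (`ResizeParam` and its fields are hypothetical, for illustration only). `setattr` is what makes the assignment stick on the instance, which is why `load_dic` cannot rely on rebinding a local obtained from `eval`:

```python
class ResizeParam:
    """Hypothetical parameter holder following the Param_base update() pattern."""
    scale = 1
    mode = 'bilinear'

    def update(self, **dic):
        # copy only known keys; setattr persists on the instance
        for key in ('scale', 'mode'):
            if key in dic:
                setattr(self, key, dic[key])


p = ResizeParam()
p.update(scale=2)  # mode keeps its class-level default
```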
2  kneron_preprocessing/funcs/__init__.py  Normal file
@@ -0,0 +1,2 @@
from . import ColorConversion, Padding, Resize, Crop, Normalize, Rotate
372  kneron_preprocessing/funcs/utils.py  Normal file
@@ -0,0 +1,372 @@
import numpy as np
from PIL import Image
import struct


def pad_square_to_4(x_start, x_end, y_start, y_end):
    w_int = x_end - x_start
    h_int = y_end - y_start
    pad = w_int - h_int
    if pad > 0:
        pad_s = (pad >> 1) & (~3)
        pad_e = pad - pad_s
        y_start -= pad_s
        y_end += pad_e
    else:  # pad <= 0
        pad_s = -((pad >> 1) & (~3))
        pad_e = (-pad) - pad_s
        x_start -= pad_s
        x_end += pad_e
    return x_start, x_end, y_start, y_end


def str_fill(value):
    if len(value) == 1:
        value = "0" + value
    elif len(value) == 0:
        value = "00"

    return value


def clip_ary(value):
    list_v = []
    for i in range(len(value)):
        v = value[i] % 256
        list_v.append(v)

    return list_v


def str2bool(v):
    if isinstance(v, bool):
        return v
    # only lowercase entries are reachable after v.lower()
    return v.lower() in ('true', '1', 't', 'y', 'yes')


def str2int(s):
    if s == "":
        s = 0
    s = int(s)
    return s


def str2float(s):
    if s == "":
        s = 0
    s = float(s)
    return s


def clip(value, mini, maxi):
    if value < mini:
        result = mini
    elif value > maxi:
        result = maxi
    else:
        result = value

    return result


def signed_rounding(value, bit):
    if value < 0:
        value = value - (1 << (bit - 1))
    else:
        value = value + (1 << (bit - 1))

    return value
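`signed_rounding` biases a fixed-point value by half an LSB toward its sign; truncating the biased value toward zero then yields round-half-away-from-zero. A small sketch (`fixed_round` is a hypothetical wrapper, not part of the module):

```python
def signed_rounding(value, bit):
    # bias by half an LSB toward the sign before truncation
    if value < 0:
        return value - (1 << (bit - 1))
    return value + (1 << (bit - 1))


def fixed_round(value, bit):
    # truncate toward zero after biasing: round-half-away-from-zero
    return int(signed_rounding(value, bit) / (1 << bit))
```

With `bit=2` (scale 4): 6/4 = 1.5 rounds to 2, 5/4 = 1.25 rounds to 1, and -6/4 = -1.5 rounds to -2.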
def hex_loader(data_folder, **kwargs):
    format_mode = kwargs['raw_img_fmt']
    src_h = kwargs['img_in_height']
    src_w = kwargs['img_in_width']

    if format_mode in ['YUV444', 'yuv444', 'YCBCR444', 'YCbCr444', 'ycbcr444']:
        output = hex_yuv444(data_folder, src_h, src_w)
    elif format_mode in ['RGB565', 'rgb565']:
        output = hex_rgb565(data_folder, src_h, src_w)
    elif format_mode in ['YUV422', 'yuv422', 'YCBCR422', 'YCbCr422', 'ycbcr422']:
        output = hex_yuv422(data_folder, src_h, src_w)

    return output


def hex_rgb565(hex_folder, src_h, src_w):
    pix_per_line = 8
    byte_per_line = 16

    f = open(hex_folder)
    pixel_r = []
    pixel_g = []
    pixel_b = []

    # Ignore the first line
    f.readline()
    input_line = int((src_h * src_w) / pix_per_line)
    for i in range(input_line):
        readline = f.readline()
        for j in range(int(byte_per_line / 2) - 1, -1, -1):
            data1 = int(readline[(j * 4 + 0):(j * 4 + 2)], 16)
            data0 = int(readline[(j * 4 + 2):(j * 4 + 4)], 16)
            r = (data1 & 0xf8) >> 3
            g = ((data0 & 0xe0) >> 5) + ((data1 & 0x7) << 3)
            b = (data0 & 0x1f)
            pixel_r.append(r)
            pixel_g.append(g)
            pixel_b.append(b)

    ary_r = np.array(pixel_r, dtype=np.uint8)
    ary_g = np.array(pixel_g, dtype=np.uint8)
    ary_b = np.array(pixel_b, dtype=np.uint8)
    output = np.concatenate((ary_r[:, None], ary_g[:, None], ary_b[:, None]), axis=1)
    output = output.reshape((src_h, src_w, 3))

    return output
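The RGB565 unpacking in `hex_rgb565` splits a 16-bit pixel into 5/6/5 bits across the two bytes it reads (`data1` carries RRRRRGGG, `data0` carries GGGBBBBB). A standalone sketch of the same bit arithmetic (`unpack_rgb565` is a hypothetical helper name):

```python
def unpack_rgb565(data1, data0):
    """Split an RGB565 pixel: data1 = high byte (RRRRRGGG), data0 = low byte (GGGBBBBB)."""
    r = (data1 & 0xf8) >> 3                                # top 5 bits of high byte
    g = ((data0 & 0xe0) >> 5) + ((data1 & 0x07) << 3)      # 3 + 3 bits spanning both bytes
    b = data0 & 0x1f                                       # low 5 bits of low byte
    return r, g, b
```

An all-ones pixel decodes to the channel maxima (31, 63, 31), confirming the 5/6/5 split.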
def hex_yuv444(hex_folder, src_h, src_w):
    pix_per_line = 4
    byte_per_line = 16

    f = open(hex_folder)
    byte0 = []
    byte1 = []
    byte2 = []
    byte3 = []

    # Ignore the first line
    f.readline()
    input_line = int((src_h * src_w) / pix_per_line)
    for i in range(input_line):
        readline = f.readline()
        for j in range(byte_per_line - 1, -1, -1):
            data = int(readline[(j * 2):(j * 2 + 2)], 16)
            if (j + 1) % 4 == 0:
                byte0.append(data)
            elif (j + 2) % 4 == 0:
                byte1.append(data)
            elif (j + 3) % 4 == 0:
                byte2.append(data)
            elif (j + 4) % 4 == 0:
                byte3.append(data)
    # ary_a = np.array(byte0, dtype=np.uint8)
    ary_v = np.array(byte1, dtype=np.uint8)
    ary_u = np.array(byte2, dtype=np.uint8)
    ary_y = np.array(byte3, dtype=np.uint8)
    output = np.concatenate((ary_y[:, None], ary_u[:, None], ary_v[:, None]), axis=1)
    output = output.reshape((src_h, src_w, 3))

    return output


def hex_yuv422(hex_folder, src_h, src_w):
    pix_per_line = 8
    byte_per_line = 16
    f = open(hex_folder)
    pixel_y = []
    pixel_u = []
    pixel_v = []

    # Ignore the first line
    f.readline()
    input_line = int((src_h * src_w) / pix_per_line)
    for i in range(input_line):
        readline = f.readline()
        for j in range(int(byte_per_line / 4) - 1, -1, -1):
            data3 = int(readline[(j * 8 + 0):(j * 8 + 2)], 16)
            data2 = int(readline[(j * 8 + 2):(j * 8 + 4)], 16)
            data1 = int(readline[(j * 8 + 4):(j * 8 + 6)], 16)
            data0 = int(readline[(j * 8 + 6):(j * 8 + 8)], 16)
            # shared U/V are duplicated for both Y samples
            pixel_y.append(data3)
            pixel_y.append(data1)
            pixel_u.append(data2)
            pixel_u.append(data2)
            pixel_v.append(data0)
            pixel_v.append(data0)

    ary_y = np.array(pixel_y, dtype=np.uint8)
    ary_u = np.array(pixel_u, dtype=np.uint8)
    ary_v = np.array(pixel_v, dtype=np.uint8)
    output = np.concatenate((ary_y[:, None], ary_u[:, None], ary_v[:, None]), axis=1)
    output = output.reshape((src_h, src_w, 3))

    return output
def bin_loader(data_folder, **kwargs):
    format_mode = kwargs['raw_img_fmt']
    src_h = kwargs['img_in_height']
    src_w = kwargs['img_in_width']
    if format_mode in ['YUV', 'yuv', 'YUV444', 'yuv444', 'YCBCR', 'YCbCr', 'ycbcr', 'YCBCR444', 'YCbCr444', 'ycbcr444']:
        output = bin_yuv444(data_folder, src_h, src_w)
    elif format_mode in ['RGB565', 'rgb565']:
        output = bin_rgb565(data_folder, src_h, src_w)
    elif format_mode in ['NIR', 'nir', 'NIR888', 'nir888']:
        output = bin_nir(data_folder, src_h, src_w)
    elif format_mode in ['YUV422', 'yuv422', 'YCBCR422', 'YCbCr422', 'ycbcr422']:
        output = bin_yuv422(data_folder, src_h, src_w)
    elif format_mode in ['RGB888', 'rgb888']:
        output = np.fromfile(data_folder, dtype='uint8')
        output = output.reshape(src_h, src_w, 3)
    elif format_mode in ['RGBA8888', 'rgba8888', 'RGBA', 'rgba']:
        output_temp = np.fromfile(data_folder, dtype='uint8')
        output_temp = output_temp.reshape(src_h, src_w, 4)
        output = output_temp[:, :, 0:3]

    return output


def bin_yuv444(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    row = src_h
    col = src_w
    pixels = row * col

    raw = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            raw.append(s[0])

    raw = raw[:pixels * 4]

    output = np.zeros((pixels * 3), dtype=np.uint8)
    cnt = 0
    for i in range(0, pixels * 4, 4):
        # Y
        output[cnt] = raw[i + 3]
        # U
        cnt += 1
        output[cnt] = raw[i + 2]
        # V
        cnt += 1
        output[cnt] = raw[i + 1]

        cnt += 1

    output = output.reshape((src_h, src_w, 3))
    return output
def bin_yuv422(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    row = src_h
    col = src_w
    pixels = row * col

    raw = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            raw.append(s[0])

    raw = raw[:pixels * 2]

    output = np.zeros((pixels * 3), dtype=np.uint8)
    cnt = 0
    for i in range(0, pixels * 2, 4):
        # Y0
        output[cnt] = raw[i + 3]
        # U0
        cnt += 1
        output[cnt] = raw[i + 2]
        # V0
        cnt += 1
        output[cnt] = raw[i]
        # Y1
        cnt += 1
        output[cnt] = raw[i + 1]
        # U1
        cnt += 1
        output[cnt] = raw[i + 2]
        # V1
        cnt += 1
        output[cnt] = raw[i]

        cnt += 1

    output = output.reshape((src_h, src_w, 3))
    return output


def bin_rgb565(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    row = src_h
    col = src_w
    pixels = row * col

    rgba565 = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            rgba565.append(s[0])

    rgba565 = rgba565[:pixels * 2]

    # rgb565_bin to numpy_array
    output = np.zeros((pixels * 3), dtype=np.uint8)
    cnt = 0
    for i in range(0, pixels * 2, 2):
        temp = rgba565[i]
        temp2 = rgba565[i + 1]
        # R-5
        output[cnt] = (temp2 >> 3)

        # G-6
        cnt += 1
        output[cnt] = ((temp & 0xe0) >> 5) + ((temp2 & 0x07) << 3)

        # B-5
        cnt += 1
        output[cnt] = (temp & 0x1f)

        cnt += 1

    output = output.reshape((src_h, src_w, 3))
    return output


def bin_nir(in_img_path, src_h, src_w):
    # load bin
    struct_fmt = '1B'
    struct_len = struct.calcsize(struct_fmt)
    struct_unpack = struct.Struct(struct_fmt).unpack_from

    nir = []
    with open(in_img_path, "rb") as f:
        while True:
            data = f.read(struct_len)
            if not data:
                break
            s = struct_unpack(data)
            nir.append(s[0])

    nir = nir[:src_h * src_w]
    pixels = len(nir)
    # nir_bin to numpy_array
    output = np.zeros((pixels * 3), dtype=np.uint8)
    for i in range(0, pixels):
        output[i * 3] = nir[i]
        output[i * 3 + 1] = nir[i]
        output[i * 3 + 2] = nir[i]

    output = output.reshape((src_h, src_w, 3))
    return output
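`bin_nir` replicates a single NIR channel into three identical channels pixel by pixel. In vectorized NumPy the same expansion is one `np.repeat`, sketched here for illustration:

```python
import numpy as np

# toy H x W single-channel NIR frame
nir = np.array([[10, 20],
                [30, 40]], dtype=np.uint8)

# replicate the channel axis three times -> H x W x 3
rgb = np.repeat(nir[:, :, None], 3, axis=2)
```

All three output channels carry the same values, matching the per-pixel loop in `bin_nir`.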
50  kneron_preprocessing/funcs/utils_520.py  Normal file
@@ -0,0 +1,50 @@
import math


def round_up_16(num):
    return (num + (16 - 1)) & ~(16 - 1)


def round_up_n(num, n):
    if num > 0:
        temp = float(num) / n
        return math.ceil(temp) * n
    else:
        return -math.ceil(float(-num) / n) * n


def cal_img_row_offset(crop_num, pad_num, start_row, out_row, orig_row):
    scaled_img_row = int(out_row - (pad_num[1] + pad_num[3]))
    if (start_row - pad_num[1]) > 0:
        img_str_row = int(start_row - pad_num[1])
    else:
        img_str_row = 0
    valid_row = int(orig_row - (crop_num[1] + crop_num[3]))
    img_str_row = int(valid_row * img_str_row / scaled_img_row)
    return int(img_str_row + crop_num[1])


def get_pad_num(pad_num_orig, left, up, right, bottom):
    pad_num = [0] * 4
    for i in range(0, 4):
        pad_num[i] = pad_num_orig[i]

    if not left:
        pad_num[0] = 0
    if not up:
        pad_num[1] = 0
    if not right:
        pad_num[2] = 0
    if not bottom:
        pad_num[3] = 0

    return pad_num


def get_byte_per_pixel(raw_fmt):
    # raw_fmt is lowered first, so only lowercase names are comparable;
    # the original mixed-case lists could never match their uppercase entries
    if raw_fmt.lower() in ['rgb888', 'rgb']:
        return 4
    elif raw_fmt.lower() in ['yuv', 'yuv422']:
        return 2
    elif raw_fmt.lower() in ['rgb565']:
        return 2
    elif raw_fmt.lower() in ['nir888', 'nir']:
        return 1
    else:
        return -1
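`round_up_16` rounds up to the next multiple of 16 with a standard bit trick: add 15, then clear the low four bits with the mask `~15`.

```python
def round_up_16(num):
    # add 15, then clear the low 4 bits
    return (num + 15) & ~15
```

Values already on a 16-byte boundary pass through unchanged; anything else is bumped to the next boundary.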
42  kneron_preprocessing/funcs/utils_720.py  Normal file
@@ -0,0 +1,42 @@
import numpy as np
from PIL import Image


def twos_complement(value):
    value = int(value)
    # msb = (value & 0x8000) * (1/np.power(2, 15))
    msb = (value & 0x8000) >> 15
    if msb == 1:
        if (((~value) & 0xFFFF) + 1) >= 0xFFFF:
            result = (~value) & 0xFFFF
        else:
            result = ((~value) & 0xFFFF) + 1
        result = result * (-1)
    else:
        result = value

    return result


def twos_complement_pix(value):
    h, _ = value.shape
    for i in range(h):
        value[i, 0] = twos_complement(value[i, 0])

    return value


def clip(value, mini, maxi):
    if value < mini:
        result = mini
    elif value > maxi:
        result = maxi
    else:
        result = value

    return result


def clip_pix(value, mini, maxi):
    h, _ = value.shape
    for i in range(h):
        value[i, 0] = clip(value[i, 0], mini, maxi)

    return value
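`twos_complement` recovers a signed value from a 16-bit word (for inputs with the MSB set, the `>= 0xFFFF` branch is unreachable, so it reduces to invert-and-add-one). The standard equivalent, sketched as a hypothetical helper `s16`:

```python
def s16(value):
    """Interpret the low 16 bits of value as a signed two's-complement integer."""
    value &= 0xFFFF
    # if the sign bit is set, subtract 2^16 to recover the negative value
    return value - 0x10000 if value & 0x8000 else value
```

For example, 0xFFFF decodes to -1 and 0x8000 to -32768, the two ends of the signed 16-bit range.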
@@ -18,6 +18,12 @@ from .pascal_context import PascalContextDataset, PascalContextDataset59
from .potsdam import PotsdamDataset
from .stare import STAREDataset
from .voc import PascalVOCDataset
from .golf_dataset import GolfDataset
from .golf7_dataset import Golf7Dataset
from .golf1_dataset import GrassOnlyDataset
from .golf4_dataset import Golf4Dataset
from .golf2_dataset import Golf2Dataset
from .golf8_dataset import Golf8Dataset

__all__ = [
    'CustomDataset', 'build_dataloader', 'ConcatDataset', 'RepeatDataset',
80  mmseg/datasets/golf1_dataset.py  Normal file
@@ -0,0 +1,80 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GrassOnlyDataset(CustomDataset):
    """GrassOnlyDataset for semantic segmentation with only one valid class: grass."""

    CLASSES = ('grass',)

    PALETTE = [
        [0, 128, 0],  # grass - green
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GrassOnlyDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [GrassOnlyDataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""
        print("🧪 [GrassOnlyDataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GrassOnlyDataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
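The palette written by `results2img` maps label id to RGB color. Colorizing a label map with the same palette is a single NumPy fancy-indexing step, sketched here with the grass-only palette (illustration only, independent of the PIL path above):

```python
import numpy as np

PALETTE = np.array([[0, 128, 0]], dtype=np.uint8)  # label 0: grass - green
labels = np.zeros((2, 2), dtype=np.uint8)          # every pixel predicted as class 0

# index the palette by label id: H x W -> H x W x 3 color mask
color_mask = PALETTE[labels]
```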
84  mmseg/datasets/golf2_dataset.py  Normal file
@@ -0,0 +1,84 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf2Dataset(CustomDataset):
    """Golf2Dataset for semantic segmentation with 2 valid classes (ignore background)."""

    CLASSES = (
        'grass', 'road'
    )

    PALETTE = [
        [0, 255, 0],    # grass - green
        [255, 165, 0],  # road - orange
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf2Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf2Dataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf2Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf2Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
86  mmseg/datasets/golf4_dataset.py  Normal file
@@ -0,0 +1,86 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf4Dataset(CustomDataset):
    """Golf4Dataset for semantic segmentation with 4 valid classes (ignore background)."""

    CLASSES = (
        'car', 'grass', 'people', 'road'
    )

    PALETTE = [
        [0, 0, 128],    # car - dark blue
        [0, 255, 0],    # grass - green
        [255, 0, 0],    # people - red
        [255, 165, 0],  # road - orange
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf4Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf4Dataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf4Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf4Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
90  mmseg/datasets/golf7_dataset.py  Normal file
@@ -0,0 +1,90 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp
import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf7Dataset(CustomDataset):
    """Golf7Dataset for semantic segmentation with 7 valid classes (ignore background)."""

    CLASSES = (
        'bunker', 'car', 'grass',
        'greenery', 'person', 'road', 'tree'
    )

    PALETTE = [
        [128, 0, 0],    # bunker - dark red
        [0, 0, 128],    # car - dark blue
        [0, 128, 0],    # grass - green
        [0, 255, 0],    # greenery - light green
        [255, 0, 0],    # person - red
        [255, 165, 0],  # road - orange
        [0, 255, 255],  # tree - cyan
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf7Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf7Dataset] initialization complete")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf7Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf7Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned evaluation metrics: {list(eval_results.keys())}")
        return eval_results
92
mmseg/datasets/golf8_dataset.py
Normal file
92
mmseg/datasets/golf8_dataset.py
Normal file
@ -0,0 +1,92 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class Golf8Dataset(CustomDataset):
    """Golf8Dataset for semantic segmentation with 8 valid classes (background ignored)."""

    CLASSES = (
        'bunker', 'car', 'grass',
        'greenery', 'person', 'pond',
        'road', 'tree'
    )

    PALETTE = [
        [128, 0, 0],    # bunker - dark red
        [0, 0, 128],    # car - dark blue
        [0, 128, 0],    # grass - green
        [0, 255, 0],    # greenery - light green
        [255, 0, 0],    # person - red
        [0, 255, 255],  # pond - cyan
        [255, 165, 0],  # road - orange
        [0, 128, 128],  # tree - dark cyan
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(Golf8Dataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        print("✅ [Golf8Dataset] initialized")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        print("🧪 [Golf8Dataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(Golf8Dataset, self).evaluate(results, metrics, logger)

        print(f"  ➤ returned metric keys: {list(eval_results.keys())}")
        return eval_results
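The per-label loop that fills `palette` in `results2img` can be collapsed into a single `np.array` call; the two constructions are equivalent for any list of RGB triples. A small sketch using the first three Golf8 colors:

```python
import numpy as np

PALETTE = [[128, 0, 0], [0, 0, 128], [0, 128, 0]]  # first three golf8 colors

# Loop version, as in results2img above.
palette_loop = np.zeros((len(PALETTE), 3), dtype=np.uint8)
for label_id, color in enumerate(PALETTE):
    palette_loop[label_id] = color

# Vectorized equivalent.
palette_vec = np.array(PALETTE, dtype=np.uint8)
```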
mmseg/datasets/golf_dataset.py (new file, 96 lines)
@@ -0,0 +1,96 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GolfDataset(CustomDataset):
    """GolfDataset for semantic segmentation with four classes: car, grass, people, and road."""

    # ✅ Fixed classes and palette (not taken from the config)
    CLASSES = ('car', 'grass', 'people', 'road')
    PALETTE = [
        [246, 14, 135],   # car
        [233, 81, 78],    # grass
        [220, 148, 21],   # people
        [207, 215, 220],  # road
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GolfDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        # ✅ DEBUG: print CLASSES and PALETTE at init time
        print("✅ [GolfDataset] initialized")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            result = result.astype(np.uint8)

            # ✅ Map all invalid class ids to 255 (treated as background)
            result[result >= len(self.PALETTE)] = 255

            output = Image.fromarray(result).convert('P')

            # ✅ Build the palette; background class 255 is rendered black
            palette = np.zeros((256, 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            palette[255] = [0, 0, 0]  # black background

            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        # ✅ DEBUG: report how CLASSES is being used during evaluation
        print("🧪 [GolfDataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GolfDataset, self).evaluate(results, metrics, logger)

        # ✅ DEBUG: print the final eval_results keys
        print(f"  ➤ returned metric keys: {list(eval_results.keys())}")
        return eval_results
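The key difference between this dataset and Golf8Dataset is the 255-masking step in `results2img`: any predicted id outside the 4-class range is collapsed to 255 and rendered black via a full 256-entry palette. The masking and palette setup can be checked in isolation:

```python
import numpy as np

NUM_CLASSES = 4  # car, grass, people, road

# Fake prediction containing ids outside the valid range.
result = np.array([[0, 3, 7], [255, 1, 4]], dtype=np.uint8)

# Same masking as results2img: everything >= NUM_CLASSES becomes 255.
result[result >= NUM_CLASSES] = 255

# 256-entry palette with black reserved for the 255 "background" index.
palette = np.zeros((256, 3), dtype=np.uint8)
palette[255] = [0, 0, 0]  # black background
```

Using a 256-entry palette (rather than one sized to `len(PALETTE)`) is what makes index 255 addressable at all in the palettised PNG.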
mmseg/datasets/golf_dataset1.py (new file, 66 lines)
@@ -0,0 +1,66 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GolfDataset(CustomDataset):
    """GolfDataset for custom semantic segmentation with two classes: road and grass."""

    CLASSES = ('road', 'grass')

    PALETTE = [[128, 64, 128],  # road
               [0, 255, 0]]     # grass

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GolfDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        result_files = self.results2img(results, imgfile_prefix, indices)
        return result_files

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""
        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GolfDataset, self).evaluate(results, metrics, logger)
        return eval_results
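All of these dataset classes rely on `CustomDataset` pairing images and annotations purely by suffix: an annotation file name is the image name with `img_suffix` swapped for `seg_map_suffix`. A sketch of that mapping (the concrete file name is made up for illustration):

```python
img_suffix = '_leftImg8bit.png'
seg_map_suffix = '_gtFine_labelIds.png'

img_name = 'hole03_0001_leftImg8bit.png'  # hypothetical image file

# Derive the annotation file name the way CustomDataset pairs them:
# strip the image suffix, append the segmentation-map suffix.
assert img_name.endswith(img_suffix)
seg_name = img_name[:-len(img_suffix)] + seg_map_suffix
```

If the annotation files on disk do not follow this naming convention, the dataset will silently come up empty, which is what the `dataset length` debug print above helps catch.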
mmseg/datasets/golf_datasetcanuse.py (new file, 87 lines)
@@ -0,0 +1,87 @@
# Copyright (c) OpenMMLab. All rights reserved.
import os.path as osp

import mmcv
import numpy as np
from mmcv.utils import print_log
from PIL import Image

from .builder import DATASETS
from .custom import CustomDataset


@DATASETS.register_module()
class GolfDataset(CustomDataset):
    """GolfDataset for semantic segmentation with four classes: car, grass, people, and road."""

    # ✅ Fixed classes and palette (not taken from the config)
    CLASSES = ('car', 'grass', 'people', 'road')
    PALETTE = [
        [246, 14, 135],   # car
        [233, 81, 78],    # grass
        [220, 148, 21],   # people
        [207, 215, 220],  # road
    ]

    def __init__(self,
                 img_suffix='_leftImg8bit.png',
                 seg_map_suffix='_gtFine_labelIds.png',
                 **kwargs):
        super(GolfDataset, self).__init__(
            img_suffix=img_suffix,
            seg_map_suffix=seg_map_suffix,
            **kwargs)

        # ✅ DEBUG: print CLASSES and PALETTE at init time
        print("✅ [GolfDataset] initialized")
        print(f"  ➤ CLASSES: {self.CLASSES}")
        print(f"  ➤ PALETTE: {self.PALETTE}")
        print(f"  ➤ img_suffix: {img_suffix}")
        print(f"  ➤ seg_map_suffix: {seg_map_suffix}")
        print(f"  ➤ img_dir: {self.img_dir}")
        print(f"  ➤ ann_dir: {self.ann_dir}")
        print(f"  ➤ dataset length: {len(self)}")

    def results2img(self, results, imgfile_prefix, indices=None):
        """Write the segmentation results to images."""
        if indices is None:
            indices = list(range(len(self)))

        mmcv.mkdir_or_exist(imgfile_prefix)
        result_files = []
        for result, idx in zip(results, indices):
            filename = self.img_infos[idx]['filename']
            basename = osp.splitext(osp.basename(filename))[0]
            png_filename = osp.join(imgfile_prefix, f'{basename}.png')

            output = Image.fromarray(result.astype(np.uint8)).convert('P')
            palette = np.zeros((len(self.PALETTE), 3), dtype=np.uint8)
            for label_id, color in enumerate(self.PALETTE):
                palette[label_id] = color
            output.putpalette(palette)
            output.save(png_filename)
            result_files.append(png_filename)

        return result_files

    def format_results(self, results, imgfile_prefix, indices=None):
        """Format the results into dir (for evaluation or visualization)."""
        return self.results2img(results, imgfile_prefix, indices)

    def evaluate(self,
                 results,
                 metric='mIoU',
                 logger=None,
                 imgfile_prefix=None):
        """Evaluate the results with the given metric."""

        # ✅ DEBUG: report how CLASSES is being used during evaluation
        print("🧪 [GolfDataset.evaluate] called")
        print(f"  ➤ current CLASSES: {self.CLASSES}")
        print(f"  ➤ evaluation metric: {metric}")
        print(f"  ➤ number of results: {len(results)}")

        metrics = metric if isinstance(metric, list) else [metric]
        eval_results = super(GolfDataset, self).evaluate(results, metrics, logger)

        # ✅ DEBUG: print the final eval_results keys
        print(f"  ➤ returned metric keys: {list(eval_results.keys())}")
        return eval_results
tools/check/check_lane_offset.py (new file, 70 lines)
@@ -0,0 +1,70 @@
import cv2
import numpy as np

# === 1. File and parameter settings ===
img_path = r'C:\Users\rd_de\kneronstdc\work_dirs\vis_results\good\pic_0441_jpg.rf.6e56eb8c0bed7f773fb447b9e217f779_leftImg8bit.png'

# Color-to-label-ID mapping (RGB)
CLASS_RGB_TO_ID = {
    (128, 64, 128): 3,  # road (grey)
    (0, 255, 0): 1,     # grass (green)
    (255, 0, 255): 9,   # background or sky (purple), can be ignored
}

ROAD_ID = 3
GRASS_ID = 1

# === 2. Read the image and convert it to a label mask ===
bgr_img = cv2.imread(img_path)
rgb_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB)
height, width, _ = rgb_img.shape

label_mask = np.zeros((height, width), dtype=np.uint8)
for rgb, label in CLASS_RGB_TO_ID.items():
    match = np.all(rgb_img == rgb, axis=-1)
    label_mask[match] = label

# === 3. Analyse the lower-centre region of the frame ===
y_start = int(height * 0.6)
x_start = int(width * 0.4)
x_end = int(width * 0.6)
roi = label_mask[y_start:, x_start:x_end]

total_pixels = roi.size
road_pixels = np.sum(roi == ROAD_ID)
grass_pixels = np.sum(roi == GRASS_ID)

road_ratio = road_pixels / total_pixels
grass_ratio = grass_pixels / total_pixels

# === 4. Centroid offset analysis ===
road_mask = (label_mask == ROAD_ID).astype(np.uint8)
M = cv2.moments(road_mask)
center_x = width // 2
offset = 0
cx = center_x
if M["m00"] > 0:
    cx = int(M["m10"] / M["m00"])
    offset = cx - center_x

# === 5. Report results ===
print(f"🔍 central ROI - road ratio: {road_ratio:.2f}, grass ratio: {grass_ratio:.2f}")
if road_ratio < 0.5:
    print("⚠️ Off the road (road ratio in the ROI is too low)")
if grass_ratio > 0.3:
    print("❗ Vehicle is on the grass!")
if abs(offset) > 40:
    print(f"⚠️ Road centroid offset: {offset} px")
else:
    print("✅ Road centroid is centred")

# === 6. Visualization ===
vis_img = bgr_img.copy()
cv2.rectangle(vis_img, (x_start, y_start), (x_end, height), (0, 255, 255), 2)  # yellow ROI box
cv2.line(vis_img, (center_x, 0), (center_x, height), (255, 0, 0), 2)  # blue centre line
cv2.circle(vis_img, (cx, height // 2), 6, (0, 0, 255), -1)  # red centroid dot

# Save the output image
save_path = r'C:\Users\rd_de\kneronstdc\work_dirs\vis_results\good\visual_check.png'
cv2.imwrite(save_path, vis_img)
print(f"✅ Analysis image saved: {save_path}")
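The moment-based centroid used above has a simple numpy reading: for a binary mask, `m10 / m00` is just the mean x coordinate of the nonzero pixels. A cv2-free sketch of the same offset check on a synthetic mask:

```python
import numpy as np

height, width = 10, 20
road_mask = np.zeros((height, width), dtype=np.uint8)
road_mask[:, 14:20] = 1  # road pixels pushed to the right side

center_x = width // 2
ys, xs = np.nonzero(road_mask)

# Equivalent of cv2.moments: m10 / m00 == mean x of nonzero pixels.
cx = int(xs.mean()) if xs.size else center_x
offset = cx - center_x
```

Here the road occupies columns 14..19, so the centroid lands at x = 16 and the offset is +6 px, i.e. the road's mass sits right of the frame centre.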
tools/check/checklatest.py (new file, 33 lines)
@@ -0,0 +1,33 @@
import torch


def check_pth_num_classes(pth_path):
    checkpoint = torch.load(pth_path, map_location='cpu')

    if 'state_dict' not in checkpoint:
        print("❌ No state_dict found; this is probably not an MMSegmentation checkpoint")
        return

    state_dict = checkpoint['state_dict']

    # Find the weight tensor of the decode head's final classifier layer
    num_classes = None
    for k in state_dict.keys():
        if 'decode_head.classifier' in k and 'weight' in k:
            weight_tensor = state_dict[k]
            num_classes = weight_tensor.shape[0]
            print(f"✅ Detected number of classes: {num_classes}")
            break

    if num_classes is None:
        print("⚠️ Could not determine the number of classes; the model architecture may be non-standard")
    else:
        if num_classes == 19:
            print("⚠️ This is the default Cityscapes model (19 classes)")
        elif num_classes == 4:
            print("✅ This is the custom GolfDataset model (4 classes)")
        else:
            print("❓ Unexpected class count; check that the training data and config are consistent")


if __name__ == '__main__':
    pth_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.pth'
    check_pth_num_classes(pth_path)
tools/check/checkonnx.py (new file, 32 lines)
@@ -0,0 +1,32 @@
import onnx


def check_onnx_num_classes(onnx_path):
    model = onnx.load(onnx_path)
    graph = model.graph

    print(f"📂 Model path: {onnx_path}")
    print(f"📦 Total output nodes: {len(graph.output)}")

    for output in graph.output:
        name = output.name
        shape = []
        for dim in output.type.tensor_type.shape.dim:
            if dim.dim_param:
                shape.append(dim.dim_param)
            else:
                shape.append(dim.dim_value)
        print(f"🔎 Output node name: {name}")
        print(f"   Output shape: {shape}")
        if len(shape) == 4:
            num_classes = shape[1]
            print(f"✅ Detected number of classes: {num_classes}")
            if num_classes == 19:
                print("⚠️ This is the default Cityscapes model (19 classes)")
            elif num_classes == 4:
                print("✅ This is your trained GolfDataset model (4 classes)")
            else:
                print("❓ Unknown class count; check that the model was trained/converted correctly")


if __name__ == '__main__':
    onnx_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.onnx'
    check_onnx_num_classes(onnx_path)
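ONNX shape dims carry either a symbolic name (`dim_param`) or a concrete integer (`dim_value`), which is why the loop above branches on `dim.dim_param`. The branching logic can be exercised without the onnx package by standing in fake dim objects (the dims below imitate an N×4×H×W segmentation output):

```python
from types import SimpleNamespace


def dim_to_value(dim):
    """Mirror the shape logic above: symbolic dims keep their name,
    static dims contribute their integer value."""
    return dim.dim_param if dim.dim_param else dim.dim_value


# Fake output dims for a (batch, 4, 362, 724) tensor.
dims = [
    SimpleNamespace(dim_param='batch', dim_value=0),
    SimpleNamespace(dim_param='', dim_value=4),
    SimpleNamespace(dim_param='', dim_value=362),
    SimpleNamespace(dim_param='', dim_value=724),
]

shape = [dim_to_value(d) for d in dims]
num_classes = shape[1] if len(shape) == 4 else None
```

Note that when the class axis itself is symbolic, `shape[1]` would be a string, so a robust checker should also verify the value is an int before comparing it to 19 or 4.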
tools/check/list_pth_keys.py (new file, 29 lines)
@@ -0,0 +1,29 @@
import torch


def check_num_classes_from_pth(pth_path):
    checkpoint = torch.load(pth_path, map_location='cpu')

    if 'state_dict' not in checkpoint:
        print("❌ No state_dict found")
        return

    state_dict = checkpoint['state_dict']
    weight_key = 'decode_head.conv_seg.weight'

    if weight_key in state_dict:
        weight = state_dict[weight_key]
        num_classes = weight.shape[0]
        print(f"✅ Number of classes: {num_classes}")

        if num_classes == 19:
            print("⚠️ This is a Cityscapes model (19 classes)")
        elif num_classes == 4:
            print("✅ This is a GolfDataset model (4 classes)")
        else:
            print("❓ Unusual class count; verify your data and config yourself")
    else:
        print(f"❌ Classifier layer not found: {weight_key}")


if __name__ == '__main__':
    pth_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig\latest.pth'
    check_num_classes_from_pth(pth_path)
tools/custom_infer.py (new file, 36 lines)
@@ -0,0 +1,36 @@
import os
import torch
from mmseg.apis import inference_segmentor, init_segmentor


def main():
    # Paths
    config_file = 'configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py'
    checkpoint_file = 'work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth'
    img_dir = 'data/cityscapes/leftImg8bit/val'
    out_dir = 'work_dirs/vis_results'

    # Initialize the model
    model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
    print('CLASSES:', model.CLASSES)
    print('PALETTE:', model.PALETTE)

    # Create the output directory
    os.makedirs(out_dir, exist_ok=True)

    # Collect all image files
    img_list = []
    for root, _, files in os.walk(img_dir):
        for f in files:
            if f.endswith('.png') or f.endswith('.jpg'):
                img_list.append(os.path.join(root, f))

    # Run inference on each image
    for img_path in img_list:
        result = inference_segmentor(model, img_path)
        filename = os.path.basename(img_path)
        out_path = os.path.join(out_dir, filename)
        model.show_result(img_path, result, out_file=out_path, opacity=0.5)

    print(f'✅ Inference done: processed {len(img_list)} images, results written to {out_dir}')


if __name__ == '__main__':
    main()
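The `os.walk` collection step in `main()` can be pulled out and tested on a throwaway directory tree, which is useful when an empty `img_list` is the symptom (wrong `img_dir` or wrong extensions). A self-contained sketch:

```python
import os
import tempfile


def collect_images(img_dir, exts=('.png', '.jpg')):
    """Recursively gather image paths, as in main() above."""
    img_list = []
    for root, _, files in os.walk(img_dir):
        for f in files:
            if f.lower().endswith(exts):
                img_list.append(os.path.join(root, f))
    return img_list


# Exercise it on a temporary directory tree.
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'sub'))
for name in ('a.png', os.path.join('sub', 'b.jpg'), 'notes.txt'):
    open(os.path.join(tmp, name), 'w').close()

found = sorted(os.path.basename(p) for p in collect_images(tmp))
```

Lower-casing before the suffix check also picks up `.PNG`/`.JPG` files, which the original `f.endswith` test would silently skip.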
tools/kneron/e2eonnx.py (new file, 61 lines)
@@ -0,0 +1,61 @@
import numpy as np
import ktc
import cv2
from PIL import Image


# === 1. Preprocessing + inference ===
def run_e2e_simulation(img_path, onnx_path):
    # Image preprocessing (724x362)
    image = Image.open(img_path).convert("RGB")
    image = image.resize((724, 362), Image.BILINEAR)
    img_data = np.array(image) / 255.0
    img_data = np.transpose(img_data, (2, 0, 1))  # HWC -> CHW
    img_data = np.expand_dims(img_data, 0)        # -> NCHW (1, 3, 362, 724)

    input_data = [img_data]
    inf_results = ktc.kneron_inference(
        input_data,
        onnx_file=onnx_path,
        input_names=["input"]
    )

    return inf_results


# === 2. Run inference ===
image_path = "test.png"
onnx_path = "work_dirs/meconfig8/latest_optimized.onnx"
result = run_e2e_simulation(image_path, onnx_path)

print("Inference result shape:", np.array(result).shape)  # (1, 1, 7, 46, 91)

# === 3. Extract and post-process the output ===
output_tensor = np.array(result)[0][0]        # shape: (7, 46, 91)
pred_mask = np.argmax(output_tensor, axis=0)  # shape: (46, 91)

print("Predicted segmentation mask:")
print(pred_mask)

# === 4. Upsample back to 724x362 ===
upsampled_mask = cv2.resize(pred_mask.astype(np.uint8), (724, 362), interpolation=cv2.INTER_NEAREST)

# === 5. Colorize (simple fixed palette) ===
# Define the colors for your 7 classes as needed (BGR)
colors = np.array([
    [0, 0, 0],      # 0: background
    [0, 255, 0],    # 1: grass
    [255, 0, 0],    # 2: car
    [0, 0, 255],    # 3: person
    [255, 255, 0],  # 4: road
    [255, 0, 255],  # 5: tree
    [0, 255, 255],  # 6: other
], dtype=np.uint8)

colored_mask = colors[upsampled_mask]  # shape: (362, 724, 3)
colored_mask = np.asarray(colored_mask, dtype=np.uint8)

# === 6. Sanity check and save ===
if colored_mask.shape != (362, 724, 3):
    raise ValueError(f"❌ Unexpected mask shape: {colored_mask.shape}")

cv2.imwrite("pred_mask_resized.png", colored_mask)
print("✅ Semantic mask saved: pred_mask_resized.png")
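Steps 3 and 4 above (argmax over the class axis, then nearest-neighbour upsampling) can be verified with a tiny synthetic tensor and no cv2: for integer scale factors, nearest-neighbour resizing is just repeating each row and column. A sketch with 3 classes on a 2×2 grid:

```python
import numpy as np

# Fake logits: 3 classes on a 2x2 grid.
logits = np.zeros((3, 2, 2), dtype=np.float32)
logits[0, 0, 0] = 1.0  # class 0 wins top-left
logits[1, 0, 1] = 1.0  # class 1 wins top-right
logits[2, 1, :] = 1.0  # class 2 wins the bottom row

pred_mask = np.argmax(logits, axis=0)  # (2, 2)

# Nearest-neighbour upsample by an integer factor, cv2-free.
scale = 2
upsampled = pred_mask.repeat(scale, axis=0).repeat(scale, axis=1)
```

Nearest-neighbour is the right choice for label maps: bilinear interpolation would blend neighbouring class ids into new, meaningless ids.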
tools/kneron/onnx2nef720.py (new file, 96 lines)
@@ -0,0 +1,96 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'  # directory holding your ONNX model
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
data_path = "data724362"  # folder of test images
imgsz_w, imgsz_h = 724, 362  # input size; must match what the ONNX model expects

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Load and optimize the ONNX model ===
print("🔄 Loading and optimizing ONNX...")
m = onnx.load(onnx_path)
m = ktc.onnx_optimizer.onnx2onnx_flow(m)
opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx')
onnx.save(m, opt_onnx_path)

# === 4. Verify the ONNX input shape ===
input_tensor = m.graph.input[0]
input_shape = [dim.dim_value for dim in input_tensor.type.tensor_type.shape.dim]
print(f"📏 ONNX Input Shape: {input_shape}")

expected_shape = [1, 3, imgsz_h, imgsz_w]  # (N, C, H, W)

if input_shape != expected_shape:
    raise ValueError(f"❌ Error: ONNX input shape {input_shape} does not match expected {expected_shape}.")

# === 5. Configure the Kneron model compilation ===
print("📐 Configuring model for KL720...")
km = ktc.ModelConfig(20008, "0001", "720", onnx_model=m)

# (Optional) model performance evaluation
eval_result = km.evaluate()
print("\n📊 NPU Performance Evaluation:\n" + str(eval_result))

# === 6. Prepare the image data ===
print("🖼️ Preparing image data...")
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]

if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}!")

print(f"✅ Found {len(files_found)} images in {data_path}")

input_name = input_tensor.name
img_list = []

for root, _, files in os.walk(data_path):
    for f in files:
        fullpath = os.path.join(root, f)
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB -> BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32).copy()
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC -> CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW -> NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ Error: No valid images were processed!")

# === 7. BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Error: BIE model was not generated!")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Error: NEF model was not generated!")

print("✅ NEF compile done!")
print("📁 NEF file saved to:", nef_save_path)
tools/kneron/onnx2nefSTDC630.py
Normal file
103
tools/kneron/onnx2nefSTDC630.py
Normal file
@ -0,0 +1,103 @@
|
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image
import kneronnxopt

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = "data724362"
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via the kneronnxopt API) ===
print("⚙️ Optimizing ONNX with kneronnxopt...")
try:
    model = onnx.load(onnx_path)
    input_tensor = model.graph.input[0]
    input_name = input_tensor.name
    input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
    print(f"📌 Actual model input name: {input_name}")

    model = kneronnxopt.optimize(
        model,
        duplicate_shared_weights=1,
        skip_check=False,
        skip_fuse_qkv=True
    )
    onnx.save(model, optimized_path)
except Exception as e:
    print(f"❌ Optimization failed: {e}")
    exit(1)

# === 4. Load the optimized model ===
print("🔄 Loading the optimized ONNX...")
m = onnx.load(optimized_path)

# === 5. Configure the Kneron model compilation ===
print("📐 Configuring model...")
km = ktc.ModelConfig(20008, "0001", "630", onnx_model=m)

# (Optional) model performance evaluation
eval_result = km.evaluate()
print("\n📊 NPU performance evaluation:\n" + str(eval_result))

# === 6. Process the input images ===
print("🖼️ Processing input images...")
input_name = m.graph.input[0].name
img_list = []

files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]

if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}!")

for root, _, files in os.walk(data_path):
    for f in files:
        fullpath = os.path.join(root, f)
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB -> BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC -> CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW -> NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ Error: no valid images were processed!")

# === 7. BIE analysis (quantization) ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ BIE model was not generated")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ NEF model was not generated")

print("✅ NEF compile done")
print("📁 NEF file saved to:", nef_save_path)
tools/kneron/onnx2nefSTDC630_2.py (new file, 64 lines)
@@ -0,0 +1,64 @@
import os
import numpy as np
import onnx
import shutil
import cv2
import ktc

onnx_dir = 'work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
data_path = "data512"
imgsz = (512, 512)

os.makedirs(onnx_dir, exist_ok=True)

print("🔄 Loading and optimizing ONNX...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx')
onnx.save(model, opt_onnx_path)

print("📐 Configuring model...")
km = ktc.ModelConfig(20008, "0001", "630", onnx_model=model)

# Optional: performance check
print("\n📊 Evaluating model...")
print(km.evaluate())

input_name = model.graph.input[0].name
print("📥 ONNX input name:", input_name)

img_list = []
print("🖼️ Preprocessing images...")
for root, _, files in os.walk(data_path):
    for fname in files:
        if fname.lower().endswith(('.jpg', '.jpeg', '.png', '.bmp')):
            path = os.path.join(root, fname)
            img = cv2.imread(path)
            img = cv2.resize(img, imgsz)
            img = img.astype(np.float32) / 256.0 - 0.5
            img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
            img = np.expand_dims(img, axis=0)   # add batch dim
            img_list.append(img)
            print("✅", path)

if not img_list:
    raise RuntimeError("❌ No images processed!")

print("📦 Quantizing (BIE)...")
bie_path = km.analysis({input_name: img_list})
bie_save = os.path.join(onnx_dir, os.path.basename(bie_path))
shutil.copy(bie_path, bie_save)

if not os.path.exists(bie_save):
    raise RuntimeError("❌ BIE model not saved!")

print("⚙️ Compiling NEF...")
nef_path = ktc.compile([km])
nef_save = os.path.join(onnx_dir, os.path.basename(nef_path))
shutil.copy(nef_path, nef_save)

if not os.path.exists(nef_save):
    raise RuntimeError("❌ NEF model not saved!")

print("✅ Compile finished. NEF at:", nef_save)
tools/kneron/onnx2nefSTDC630canuse.py (new file, 86 lines)
@@ -0,0 +1,86 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
data_path = "data724362"
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Load and optimize the ONNX model ===
print("🔄 Loading and optimizing ONNX...")
m = onnx.load(onnx_path)
m = ktc.onnx_optimizer.onnx2onnx_flow(m)
opt_onnx_path = os.path.join(onnx_dir, 'latest.opt.onnx')
onnx.save(m, opt_onnx_path)

# === 4. Configure the Kneron model compiler ===
print("📐 Configuring model...")
km = ktc.ModelConfig(20008, "0001", "630", onnx_model=m)

# (Optional) model performance evaluation
eval_result = km.evaluate()
print("\n📊 NPU Performance Evaluation:\n" + str(eval_result))

# === 5. Prepare image data ===
print("🖼️ Preparing image data...")
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]

if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}!")

print(f"✅ Found {len(files_found)} images in {data_path}")

input_name = m.graph.input[0].name
img_list = []

for root, _, files in os.walk(data_path):
    for f in files:
        fullpath = os.path.join(root, f)
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW (add batch dim)
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ Error: No valid images were processed!")

# === 6. BIE quantization analysis ===
print("📦 Running fixed-point analysis...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Error: BIE model was not generated!")

print("✅ BIE model saved to:", bie_save_path)

# === 7. Compile the NEF model ===
print("⚙️ Compiling NEF model...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Error: NEF model was not generated!")

print("✅ NEF compile done!")
print("📁 NEF file saved to:", nef_save_path)
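The per-image calibration preprocessing above (RGB➝BGR, resize, `/256.0 - 0.5`, HWC➝NCHW) can be factored into a standalone helper for reuse and testing. A minimal numpy-only sketch; the function name `preprocess_for_kneron` is our own, and nearest-neighbor resizing stands in for the script's PIL bilinear resize to keep the sketch dependency-free:

```python
import numpy as np

def preprocess_for_kneron(img_hwc: np.ndarray, w: int = 724, h: int = 362) -> np.ndarray:
    """Mirror the calibration preprocessing: RGB->BGR, resize, /256-0.5, NCHW.

    The script uses PIL bilinear resizing; nearest-neighbor indexing is used
    here only to keep the sketch numpy-only.
    """
    bgr = img_hwc[..., ::-1]                                  # RGB -> BGR
    ys = (np.arange(h) * img_hwc.shape[0] / h).astype(int)    # nearest rows
    xs = (np.arange(w) * img_hwc.shape[1] / w).astype(int)    # nearest cols
    resized = bgr[ys][:, xs]
    x = resized.astype(np.float32) / 256.0 - 0.5              # roughly [-0.5, 0.496]
    x = np.transpose(x, (2, 0, 1))                            # HWC -> CHW
    return x[None, ...]                                       # add batch dim -> NCHW

# Example on a synthetic all-black image
dummy = np.zeros((100, 200, 3), dtype=np.uint8)
batch = preprocess_for_kneron(dummy)
print(batch.shape)  # (1, 3, 362, 724)
```

Each element of `img_list` built by the script should have exactly this shape and value range.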
92  tools/kneron/onnx2nef_stdc630_safe.py  Normal file
@@ -0,0 +1,92 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = 'data724362'
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
onnx.save(model, optimized_path)

# === 4. Validate the input shape ===
print("📏 Validating ONNX input shape...")
input_tensor = model.graph.input[0]
input_name = input_tensor.name
input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
expected_shape = [1, 3, imgsz_h, imgsz_w]
print(f"📌 input_name: {input_name}")
print(f"📌 input_shape: {input_shape}")
if input_shape != expected_shape:
    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")

# === 5. Initialize the model compiler (for KL630) ===
print("📐 Configuring model for KL630...")
km = ktc.ModelConfig(32769, "0001", "630", onnx_model=model)

# (Optional) performance analysis
eval_result = km.evaluate()
print("\n📊 NPU performance analysis:\n" + str(eval_result))

# === 6. Image preprocessing ===
print("🖼️ Processing input images...")
img_list = []
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}")

for root, _, files in os.walk(data_path):
    for f in files:
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        fullpath = os.path.join(root, f)
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ No images were processed successfully!")

# === 7. Run BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Failed to generate the BIE model")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model for KL630...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Failed to generate the NEF model")

print("✅ NEF compile finished")
print("📁 NEF file saved to:", nef_save_path)
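The shape check in step 4 compares the graph's static dims against the NCHW shape the compiler expects. It can be isolated as a small helper; `check_input_shape` is our own name, and it also rejects dynamic dims, since in ONNX a dim carried as `dim_param` leaves `dim_value` at 0:

```python
def check_input_shape(actual, expected):
    """Raise if an ONNX graph's static input shape differs from the expected
    NCHW shape. A dim_value of 0 indicates a dynamic (symbolic) dimension."""
    if any(d == 0 for d in actual):
        raise ValueError(f"dynamic dims not supported: {actual}")
    if list(actual) != list(expected):
        raise ValueError(f"shape mismatch: {actual} != {expected}")
    return True

print(check_input_shape([1, 3, 362, 724], [1, 3, 362, 724]))  # True
```

Failing fast here is what makes this variant "safe": a dynamic or mismatched input surfaces before the slow quantization step rather than inside it.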
92  tools/kneron/onnx2nef_stdc830_safe.py  Normal file
@@ -0,0 +1,92 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = 'data724362'
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
onnx.save(model, optimized_path)

# === 4. Validate the input shape ===
print("📏 Validating ONNX input shape...")
input_tensor = model.graph.input[0]
input_name = input_tensor.name
input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
expected_shape = [1, 3, imgsz_h, imgsz_w]
print(f"📌 input_name: {input_name}")
print(f"📌 input_shape: {input_shape}")
if input_shape != expected_shape:
    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")

# === 5. Initialize the model compiler (target platform '730') ===
print("📐 Configuring model (platform '730')...")
km = ktc.ModelConfig(40000, "0001", "730", onnx_model=model)

# (Optional) performance analysis
eval_result = km.evaluate()
print("\n📊 NPU performance analysis:\n" + str(eval_result))

# === 6. Image preprocessing ===
print("🖼️ Processing input images...")
img_list = []
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}")

for root, _, files in os.walk(data_path):
    for f in files:
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        fullpath = os.path.join(root, f)
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ No images were processed successfully!")

# === 7. Run BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Failed to generate the BIE model")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model (platform '730')...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Failed to generate the NEF model")

print("✅ NEF compile finished")
print("📁 NEF file saved to:", nef_save_path)
47  tools/kneron/onnxe2e.py  Normal file
@@ -0,0 +1,47 @@
import onnxruntime as ort
import numpy as np
from PIL import Image
import cv2

# === 1. Load the ONNX model ===
onnx_path = "work_dirs/meconfig8/latest.onnx"
session = ort.InferenceSession(onnx_path, providers=['CPUExecutionProvider'])

# === 2. Preprocess the input image (724x362) ===
def preprocess(img_path):
    image = Image.open(img_path).convert("RGB")
    image = image.resize((724, 362), Image.BILINEAR)
    img = np.array(image) / 255.0
    img = np.transpose(img, (2, 0, 1))  # HWC → CHW
    img = np.expand_dims(img, 0).astype(np.float32)  # (1, 3, 362, 724)
    return img

img_path = "test.png"
input_tensor = preprocess(img_path)

# === 3. Run inference ===
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_tensor})  # list of np.ndarray

# === 4. Post-process: predicted mask ===
output_tensor = output[0][0]  # shape: (num_classes, H, W)
pred_mask = np.argmax(output_tensor, axis=0).astype(np.uint8)  # (H, W)

# === 5. Visualize the result ===
colors = [
    [128, 0, 0],    # 0: bunker
    [0, 0, 128],    # 1: car
    [0, 128, 0],    # 2: grass
    [0, 255, 0],    # 3: greenery
    [255, 0, 0],    # 4: person
    [255, 165, 0],  # 5: road
    [0, 255, 255],  # 6: tree
]

color_mask = np.zeros((pred_mask.shape[0], pred_mask.shape[1], 3), dtype=np.uint8)
for cls_id, color in enumerate(colors):
    color_mask[pred_mask == cls_id] = color

# Save the visualization
cv2.imwrite("onnx_pred_mask.png", color_mask)
print("✅ Prediction saved to: onnx_pred_mask.png")
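The per-class loop that paints `color_mask` can be replaced with a single fancy-indexing lookup: treating the palette as an array indexed by class id gives the same result in one vectorized step. A sketch with the same seven-class palette; `colorize` is our own helper name:

```python
import numpy as np

# Same palette as the script: bunker, car, grass, greenery, person, road, tree
PALETTE = np.array([
    [128, 0, 0], [0, 0, 128], [0, 128, 0], [0, 255, 0],
    [255, 0, 0], [255, 165, 0], [0, 255, 255],
], dtype=np.uint8)

def colorize(pred_mask: np.ndarray, palette: np.ndarray = PALETTE) -> np.ndarray:
    """Vectorized equivalent of the per-class loop: index the palette by class id."""
    return palette[pred_mask]  # (H, W) -> (H, W, 3)

mask = np.array([[0, 1], [2, 6]], dtype=np.uint8)
out = colorize(mask)  # class 0 maps to [128, 0, 0], class 6 to [0, 255, 255]
```

This avoids one full-image comparison per class, which matters once the mask is at the full 362x724 resolution.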
92  tools/kneron/test.py  Normal file
@@ -0,0 +1,92 @@
import ktc
import numpy as np
import os
import onnx
import shutil
from PIL import Image

# === 1. Paths and parameters ===
onnx_dir = 'work_dirs/meconfig8/'
onnx_path = os.path.join(onnx_dir, 'latest.onnx')
optimized_path = os.path.join(onnx_dir, 'latest_optimized.onnx')
data_path = 'data724362'
imgsz_w, imgsz_h = 724, 362  # default STDC resolution

# === 2. Create the output directory ===
os.makedirs(onnx_dir, exist_ok=True)

# === 3. Optimize the ONNX model (via onnx2onnx_flow) ===
print("⚙️ Optimizing ONNX with onnx2onnx_flow...")
model = onnx.load(onnx_path)
model = ktc.onnx_optimizer.onnx2onnx_flow(model)
onnx.save(model, optimized_path)

# === 4. Validate the input shape ===
print("📏 Validating ONNX input shape...")
input_tensor = model.graph.input[0]
input_name = input_tensor.name
input_shape = [d.dim_value for d in input_tensor.type.tensor_type.shape.dim]
expected_shape = [1, 3, imgsz_h, imgsz_w]
print(f"📌 input_name: {input_name}")
print(f"📌 input_shape: {input_shape}")
if input_shape != expected_shape:
    raise ValueError(f"❌ Shape mismatch: {input_shape} ≠ {expected_shape}")

# === 5. Initialize the model compiler (for KL630) ===
print("📐 Configuring model for KL630...")
km = ktc.ModelConfig(32769, "0001", "630", onnx_model=model)

# (Optional) performance analysis
eval_result = km.evaluate()
print("\n📊 NPU performance analysis:\n" + str(eval_result))

# === 6. Image preprocessing ===
print("🖼️ Processing input images...")
img_list = []
files_found = [f for _, _, files in os.walk(data_path)
               for f in files if f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp"))]
if not files_found:
    raise FileNotFoundError(f"❌ No images found in {data_path}")

for root, _, files in os.walk(data_path):
    for f in files:
        if not f.lower().endswith((".jpg", ".jpeg", ".png", ".bmp")):
            continue
        fullpath = os.path.join(root, f)
        try:
            img = Image.open(fullpath).convert("RGB")
            img = Image.fromarray(np.array(img)[..., ::-1])  # RGB ➝ BGR
            img_np = np.array(img.resize((imgsz_w, imgsz_h), Image.BILINEAR)).astype(np.float32)
            img_np = img_np / 256.0 - 0.5
            img_np = np.transpose(img_np, (2, 0, 1))  # HWC ➝ CHW
            img_np = np.expand_dims(img_np, axis=0)   # CHW ➝ NCHW
            img_list.append(img_np)
            print(f"✅ Processed: {fullpath}")
        except Exception as e:
            print(f"❌ Failed to process {fullpath}: {e}")

if not img_list:
    raise RuntimeError("❌ No images were processed successfully!")

# === 7. Run BIE quantization analysis ===
print("📦 Running fixed-point analysis (BIE)...")
bie_model_path = km.analysis({input_name: img_list})
bie_save_path = os.path.join(onnx_dir, os.path.basename(bie_model_path))
shutil.copy(bie_model_path, bie_save_path)

if not os.path.exists(bie_save_path):
    raise RuntimeError("❌ Failed to generate the BIE model")

print("✅ BIE model saved to:", bie_save_path)

# === 8. Compile the NEF model ===
print("⚙️ Compiling NEF model for KL630...")
nef_model_path = ktc.compile([km])
nef_save_path = os.path.join(onnx_dir, os.path.basename(nef_model_path))
shutil.copy(nef_model_path, nef_save_path)

if not os.path.exists(nef_save_path):
    raise RuntimeError("❌ Failed to generate the NEF model")

print("✅ NEF compile finished")
print("📁 NEF file saved to:", nef_save_path)
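Several of these scripts repeat the same `os.walk` plus suffix-filter pattern to collect calibration images. It can be pulled into one helper; `find_images` and `IMG_EXTS` are our own names, and sorting makes the calibration set order reproducible across runs:

```python
import os
import tempfile

IMG_EXTS = (".jpg", ".jpeg", ".png", ".bmp")

def find_images(data_path):
    """Recursively collect image paths, mirroring the os.walk filter above."""
    return sorted(
        os.path.join(root, f)
        for root, _, files in os.walk(data_path)
        for f in files
        if f.lower().endswith(IMG_EXTS)
    )

# Quick demonstration on a throwaway directory tree
with tempfile.TemporaryDirectory() as d:
    os.makedirs(os.path.join(d, "sub"))
    for name in ("a.jpg", "b.txt", os.path.join("sub", "c.PNG")):
        open(os.path.join(d, name), "w").close()
    found = find_images(d)
    print([os.path.basename(p) for p in found])  # ['a.jpg', 'c.PNG']
```

Case-insensitive matching (`f.lower()`) is what lets `c.PNG` through while `b.txt` is skipped.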
24  tools/kneron/test_onnx_dummy.py  Normal file
@@ -0,0 +1,24 @@
import onnxruntime as ort
import numpy as np

# ✅ Model path (as specified)
onnx_path = r"C:\Users\rd_de\kneron-mmsegmentation\work_dirs\kn_stdc1_in1k-pre_512x1024_80k_cityscapes\latest.onnx"

# Create the ONNX Runtime session
session = ort.InferenceSession(onnx_path)

# Print model input information
input_name = session.get_inputs()[0].name
input_shape = session.get_inputs()[0].shape
print(f"✅ Input name: {input_name}")
print(f"✅ Input shape: {input_shape}")

# Create a dummy input (float32, shape = [1, 3, 512, 1024])
dummy_input = np.random.rand(1, 3, 512, 1024).astype(np.float32)

# Run inference
outputs = session.run(None, {input_name: dummy_input})

# Print model output information
for i, output in enumerate(outputs):
    print(f"✅ Output {i}: shape = {output.shape}, dtype = {output.dtype}")
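For a smoke test like the one above, a seeded generator makes the dummy input reproducible, so repeated runs exercise the model with identical data. A small sketch; `make_dummy_input` is our own helper name:

```python
import numpy as np

def make_dummy_input(shape=(1, 3, 512, 1024), seed=0):
    """Deterministic float32 dummy input for ONNX Runtime smoke tests."""
    rng = np.random.default_rng(seed)
    return rng.random(shape, dtype=np.float32)

x1 = make_dummy_input(seed=0)
x2 = make_dummy_input(seed=0)
print(x1.shape, x1.dtype)  # (1, 3, 512, 1024) float32
```

Unlike `np.random.rand(...)`, the seeded `default_rng` call returns the same tensor on every run, which helps when comparing outputs across toolchain versions.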
43  tools/optimize_onnx_kneron.py  Normal file
@@ -0,0 +1,43 @@
import os
import sys
import onnx

# === Add the optimizer_scripts module path dynamically ===
current_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(current_dir, 'tools'))

from optimizer_scripts.pytorch_exported_onnx_preprocess import torch_exported_onnx_flow


def main():
    # === Paths ===
    onnx_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig8\latest.onnx'
    optimized_path = r'C:\Users\rd_de\kneronstdc\work_dirs\meconfig8\latest_optimized.onnx'

    if not os.path.exists(onnx_path):
        print(f'❌ ONNX file not found: {onnx_path}')
        return

    # === Load the ONNX model ===
    print(f'🔄 Loading ONNX: {onnx_path}')
    m = onnx.load(onnx_path)

    # === Patch ir_version (avoids errors with opset 11) ===
    if m.ir_version == 7:
        print('⚠️ Downgrading ir_version 7 → 6 (compatibility fix)')
        m.ir_version = 6

    # === Run the Kneron optimization flow ===
    print('⚙️ Running the Kneron optimization flow...')
    try:
        m = torch_exported_onnx_flow(m, disable_fuse_bn=False)
    except Exception as e:
        print(f'❌ Optimization failed: {type(e).__name__} → {e}')
        return

    # === Save the result ===
    os.makedirs(os.path.dirname(optimized_path), exist_ok=True)
    onnx.save(m, optimized_path)
    print(f'✅ Optimized ONNX saved: {optimized_path}')


if __name__ == '__main__':
    main()
@@ -328,6 +328,15 @@ def topological_sort(g):
            if in_degree[node_name] == 0:
                to_add.append(node_name)
                del in_degree[node_name]
    # deal with initializers (weights/biases)
    for initializer in g.initializer:
        init_name = initializer.name
        for node_name in output_nodes[init_name]:
            if node_name in in_degree:
                in_degree[node_name] -= 1
                if in_degree[node_name] == 0:
                    to_add.append(node_name)
                    del in_degree[node_name]
    # main sort loop
    sorted_nodes = []
    while to_add:
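The `in_degree`/`to_add` bookkeeping in this hunk is the setup phase of Kahn's algorithm: initializers (weights/biases) are treated as already-satisfied inputs, so they decrement the in-degree of their consumers before the main loop starts. A self-contained sketch of the same algorithm on a plain adjacency map; `topo_sort` and its argument names are our own:

```python
from collections import deque

def topo_sort(nodes, edges):
    """Kahn's algorithm, mirroring the in_degree/to_add bookkeeping above.

    `edges` maps a node to the nodes that consume its output.
    """
    in_degree = {n: 0 for n in nodes}
    for src in edges:
        for dst in edges[src]:
            in_degree[dst] += 1
    to_add = deque(n for n in nodes if in_degree[n] == 0)  # ready to emit
    order = []
    while to_add:
        n = to_add.popleft()
        order.append(n)
        for dst in edges.get(n, ()):
            in_degree[dst] -= 1
            if in_degree[dst] == 0:
                to_add.append(dst)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order

print(topo_sort(["a", "b", "c"], {"a": ["b"], "b": ["c"]}))  # ['a', 'b', 'c']
```

In the ONNX version, a node whose in-degree never reaches zero usually means a dangling input; the patch fixes exactly that for initializer-fed nodes.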
242  tools/pytorch2onnx_kneron13.py  Normal file
@@ -0,0 +1,242 @@
# All modifications made by Kneron Corp.: Copyright (c) 2022 Kneron Corp.
# Copyright (c) OpenMMLab. All rights reserved.
import argparse
import warnings
import os

import onnx
import mmcv
import numpy as np
import onnxruntime as rt
import torch
from mmcv import DictAction
from mmcv.onnx import register_extra_symbolics
from mmcv.runner import load_checkpoint
from torch import nn

from mmseg.apis import show_result_pyplot
from mmseg.apis.inference import LoadImage
from mmseg.datasets.pipelines import Compose
from mmseg.models import build_segmentor

from optimizer_scripts.tools import other
from optimizer_scripts.pytorch_exported_onnx_preprocess import torch_exported_onnx_flow

torch.manual_seed(3)


def _parse_normalize_cfg(test_pipeline):
    transforms = None
    for pipeline in test_pipeline:
        if 'transforms' in pipeline:
            transforms = pipeline['transforms']
            break
    assert transforms is not None, 'Failed to find `transforms`'
    norm_config_li = [_ for _ in transforms if _['type'] == 'Normalize']
    assert len(norm_config_li) == 1, '`norm_config` should only have one'
    return norm_config_li[0]


def _convert_batchnorm(module):
    module_output = module
    if isinstance(module, torch.nn.SyncBatchNorm):
        module_output = torch.nn.BatchNorm2d(
            module.num_features, module.eps,
            module.momentum, module.affine, module.track_running_stats)
        if module.affine:
            module_output.weight.data = module.weight.data.clone().detach()
            module_output.bias.data = module.bias.data.clone().detach()
            module_output.weight.requires_grad = module.weight.requires_grad
            module_output.bias.requires_grad = module.bias.requires_grad
        module_output.running_mean = module.running_mean
        module_output.running_var = module.running_var
        module_output.num_batches_tracked = module.num_batches_tracked
    for name, child in module.named_children():
        module_output.add_module(name, _convert_batchnorm(child))
    del module
    return module_output


def _demo_mm_inputs(input_shape):
    (N, C, H, W) = input_shape
    rng = np.random.RandomState(0)
    img = torch.FloatTensor(rng.rand(*input_shape))
    return img


def _prepare_input_img(img_path, test_pipeline, shape=None):
    if shape is not None:
        test_pipeline[1]['img_scale'] = (shape[1], shape[0])
        test_pipeline[1]['transforms'][0]['keep_ratio'] = False
    test_pipeline = [LoadImage()] + test_pipeline[1:]
    test_pipeline = Compose(test_pipeline)
    data = dict(img=img_path)
    data = test_pipeline(data)
    img = torch.FloatTensor(data['img']).unsqueeze_(0)
    return img


def pytorch2onnx(model, img, norm_cfg=None, opset_version=13,
                 show=False, output_file='tmp.onnx', verify=False):
    model.cpu().eval()

    if isinstance(model.decode_head, nn.ModuleList):
        num_classes = model.decode_head[-1].num_classes
    else:
        num_classes = model.decode_head.num_classes

    # Save the real forward BEFORE patching it, so it can be restored later.
    # (The original order saved forward_dummy, making the restore a no-op.)
    origin_forward = model.forward
    model.forward = model.forward_dummy

    register_extra_symbolics(opset_version)
    with torch.no_grad():
        torch.onnx.export(
            model, img, output_file,
            input_names=['input'],
            output_names=['output'],
            export_params=True,
            keep_initializers_as_inputs=False,
            verbose=show,
            opset_version=opset_version,
            dynamic_axes=None)
        print(f'Successfully exported ONNX model: {output_file} '
              f'(opset_version={opset_version})')

    # NOTE: optimize onnx
    m = onnx.load(output_file)
    if opset_version == 11:
        m.ir_version = 6
    m = torch_exported_onnx_flow(m, disable_fuse_bn=False)
    onnx.save(m, output_file)
    print(f'{output_file} optimized by KNERON successfully.')

    if verify:
        onnx_model = onnx.load(output_file)
        onnx.checker.check_model(onnx_model)

        with torch.no_grad():
            pytorch_result = model(img).numpy()  # still forward_dummy here

        input_all = [node.name for node in onnx_model.graph.input]
        input_initializer = [node.name for node in onnx_model.graph.initializer]
        net_feed_input = list(set(input_all) - set(input_initializer))
        assert len(net_feed_input) == 1
        sess = rt.InferenceSession(output_file, providers=['CPUExecutionProvider'])
        onnx_result = sess.run(None, {net_feed_input[0]: img.detach().numpy()})[0]

        if show:
            import cv2
            img_show = img[0][:3, ...].permute(1, 2, 0) * 255
            img_show = img_show.detach().numpy().astype(np.uint8)
            ori_shape = img_show.shape[:2]

            onnx_result_ = onnx_result[0].argmax(0)
            onnx_result_ = cv2.resize(onnx_result_.astype(np.uint8),
                                      (ori_shape[1], ori_shape[0]))
            show_result_pyplot(model, img_show, (onnx_result_, ),
                               palette=model.PALETTE, block=False,
                               title='ONNXRuntime', opacity=0.5)

            pytorch_result_ = pytorch_result.squeeze().argmax(0)
            pytorch_result_ = cv2.resize(pytorch_result_.astype(np.uint8),
                                         (ori_shape[1], ori_shape[0]))
            show_result_pyplot(model, img_show, (pytorch_result_, ),
                               title='PyTorch', palette=model.PALETTE,
                               opacity=0.5)

        np.testing.assert_allclose(
            pytorch_result.astype(np.float32) / num_classes,
            onnx_result.astype(np.float32) / num_classes,
            rtol=1e-5,
            atol=1e-5,
            err_msg='The outputs are different between Pytorch and ONNX')
        print('The outputs are same between Pytorch and ONNX.')

    # Restore the real forward after verification (which also uses forward_dummy).
    model.forward = origin_forward

    if norm_cfg is not None:
        print('Prepending BatchNorm layer to ONNX as data normalization...')
        mean = norm_cfg['mean']
        std = norm_cfg['std']
        i_n = m.graph.input[0]
        if (i_n.type.tensor_type.shape.dim[1].dim_value != len(mean) or
                i_n.type.tensor_type.shape.dim[1].dim_value != len(std)):
            raise ValueError(
                f'--pixel-bias-value ({mean}) and --pixel-scale-value ({std}) '
                'should match the input channel dimension.')
        norm_bn_bias = [-1 * cm / cs + 128. / cs for cm, cs in zip(mean, std)]
        norm_bn_scale = [1 / cs for cs in std]
        other.add_bias_scale_bn_after(m.graph, i_n.name, norm_bn_bias, norm_bn_scale)
        m = other.polish_model(m)
        bn_outf = os.path.splitext(output_file)[0] + '_bn_prepended.onnx'
        onnx.save(m, bn_outf)
        print(f'BN-Prepended ONNX saved to {bn_outf}')

    return


def parse_args():
    parser = argparse.ArgumentParser(description='Convert MMSeg to ONNX')
    parser.add_argument('config', help='test config file path')
    parser.add_argument('--checkpoint', help='checkpoint file', default=None)
    parser.add_argument('--input-img', type=str, help='Images for input', default=None)
    parser.add_argument('--show', action='store_true',
                        help='show onnx graph and segmentation results')
    parser.add_argument('--verify', action='store_true', help='verify the onnx model')
    parser.add_argument('--output-file', type=str, default='tmp.onnx')
    parser.add_argument('--opset-version', type=int, default=13)  # default opset=13
    parser.add_argument('--shape', type=int, nargs='+', default=None,
                        help='input image height and width.')
    parser.add_argument('--cfg-options', nargs='+', action=DictAction,
                        help='Override config options.')
    parser.add_argument('--normalization-in-onnx', action='store_true',
                        help='Prepend BN for normalization.')
    args = parser.parse_args()
    return args


if __name__ == '__main__':
    args = parse_args()

    if args.opset_version < 11:
        raise ValueError(
            f'Only opset_version >= 11 is supported (got {args.opset_version}).')

    cfg = mmcv.Config.fromfile(args.config)
    if args.cfg_options is not None:
        cfg.merge_from_dict(args.cfg_options)
    cfg.model.pretrained = None

    test_mode = cfg.model.test_cfg.mode
    if args.shape is None:
        if test_mode == 'slide':
            crop_size = cfg.model.test_cfg['crop_size']
            input_shape = (1, 3, crop_size[1], crop_size[0])
        else:
            img_scale = cfg.test_pipeline[1]['img_scale']
            input_shape = (1, 3, img_scale[1], img_scale[0])
    else:
        if test_mode == 'slide':
            warnings.warn('Shape assignment for slide-mode models may cause '
                          'unexpected results.')
        if len(args.shape) == 1:
            input_shape = (1, 3, args.shape[0], args.shape[0])
        elif len(args.shape) == 2:
            input_shape = (1, 3) + tuple(args.shape)
        else:
            raise ValueError('Invalid input shape')

    cfg.model.train_cfg = None
    segmentor = build_segmentor(cfg.model, train_cfg=None, test_cfg=cfg.get('test_cfg'))
    segmentor = _convert_batchnorm(segmentor)

    if args.checkpoint:
        checkpoint = load_checkpoint(segmentor, args.checkpoint, map_location='cpu')
        segmentor.CLASSES = checkpoint['meta']['CLASSES']
        segmentor.PALETTE = checkpoint['meta']['PALETTE']

    if args.input_img is not None:
        preprocess_shape = (input_shape[2], input_shape[3])
        img = _prepare_input_img(args.input_img, cfg.data.test.pipeline,
                                 shape=preprocess_shape)
    else:
        img = _demo_mm_inputs(input_shape)

    if args.normalization_in_onnx:
        norm_cfg = _parse_normalize_cfg(cfg.test_pipeline)
    else:
        norm_cfg = None

    pytorch2onnx(
        segmentor,
        img,
        norm_cfg=norm_cfg,
        opset_version=args.opset_version,
        show=args.show,
        output_file=args.output_file,
        verify=args.verify,
    )
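The BN-prepend step folds the pipeline's `Normalize` into a BatchNorm with `scale = 1/std` and `bias = -mean/std + 128/std`, so the exported graph computes `(x - mean + 128)/std` directly from raw pixels (the `+128` offset presumably re-centers inputs that the NPU runtime delivers shifted by 128). A numpy check of that algebra; the example mean/std values are the common ImageNet ones and are only an assumption here:

```python
import numpy as np

# Example per-channel stats (assumption: standard ImageNet values)
mean = np.array([123.675, 116.28, 103.53])
std = np.array([58.395, 57.12, 57.375])

# Coefficients exactly as computed in pytorch2onnx() above
scale = 1.0 / std
bias = -mean / std + 128.0 / std

x = np.random.default_rng(0).uniform(0, 255, size=(4, 3))  # fake pixel rows
folded = x * scale + bias            # what the prepended BN computes
reference = (x - mean + 128.0) / std  # normalization with a +128 offset
```

`folded` and `reference` agree elementwise, confirming the bias/scale folding is an exact rewrite of the normalization, not an approximation.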
161  tools/yolov5_preprocess.py  Normal file
@@ -0,0 +1,161 @@
|
||||
# coding: utf-8
|
||||
import torch
|
||||
import cv2
|
||||
import numpy as np
|
||||
import math
|
||||
import time
|
||||
import kneron_preprocessing
|
||||
|
||||
kneron_preprocessing.API.set_default_as_520()
|
||||
torch.backends.cudnn.deterministic = True
|
||||
img_formats = ['.bmp', '.jpg', '.jpeg', '.png', '.tif', '.tiff', '.dng']
|
||||
def make_divisible(x, divisor):
|
||||
# Returns x evenly divisble by divisor
|
||||
return math.ceil(x / divisor) * divisor
|
||||
|
||||
def check_img_size(img_size, s=32):
|
||||
# Verify img_size is a multiple of stride s
|
||||
new_size = make_divisible(img_size, int(s)) # ceil gs-multiple
|
||||
if new_size != img_size:
|
||||
print('WARNING: --img-size %g must be multiple of max stride %g, updating to %g' % (img_size, s, new_size))
|
||||
return new_size
|
||||
|
||||
def letterbox_ori(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True):
|
||||
# Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
|
||||
shape = img.shape[:2] # current shape [height, width]
|
||||
if isinstance(new_shape, int):
|
||||
new_shape = (new_shape, new_shape)
|
||||
|
||||
# Scale ratio (new / old)
|
||||
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
|
||||
if not scaleup: # only scale down, do not scale up (for better test mAP)
|
||||
r = min(r, 1.0)
|
||||
|
||||
# Compute padding
|
||||
ratio = r, r # width, height ratios
|
||||
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) # width, height
|
||||
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
|
||||
|
||||
dw /= 2 # divide padding into 2 sides
|
||||
dh /= 2
|
||||
|
||||
if shape[::-1] != new_unpad: # resize
|
||||
img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
|
||||
#img = kneron_preprocessing.API.resize(img,size=new_unpad, keep_ratio = False)
|
||||
|
||||
top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
|
||||
left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
|
||||
# top, bottom = int(0), int(round(dh + 0.1))
|
||||
# left, right = int(0), int(round(dw + 0.1))
|
||||
img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
|
||||
#img = kneron_preprocessing.API.pad(img, left, right, top, bottom, 0)
|
||||
|
||||
return img, ratio, (dw, dh)
|
||||
|
||||
def letterbox(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True):
|
||||
# Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
|
||||
shape = img.shape[:2] # current shape [height, width]
|
||||
if isinstance(new_shape, int):
|
||||
new_shape = (new_shape, new_shape)
|
||||
|
||||
# Scale ratio (new / old)
|
||||
r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
|
||||
if not scaleup: # only scale down, do not scale up (for better test mAP)
|
||||
r = min(r, 1.0)
|
||||
|
||||
# Compute padding
|
||||
ratio = r, r # width, height ratios
|
||||
new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) # width, height
|
||||
dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding
|
||||
|
||||
# dw /= 2 # divide padding into 2 sides
|
||||
# dh /= 2
|
||||
|
||||
if shape[::-1] != new_unpad: # resize
|
||||
#img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
|
||||
img = kneron_preprocessing.API.resize(img,size=new_unpad, keep_ratio = False)
|
||||
|
||||
# top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
|
||||
# left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
|
||||
top, bottom = int(0), int(round(dh + 0.1))
|
||||
left, right = int(0), int(round(dw + 0.1))
|
||||
#img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color) # add border
|
||||
img = kneron_preprocessing.API.pad(img, left, right, top, bottom, 0)
|
||||
|
||||
return img, ratio, (dw, dh)
|
||||
|
||||
def letterbox_test(img, new_shape=(640, 640), color=(0, 0, 0), auto=True, scaleFill=False, scaleup=True):
    # Fixed-size resize for testing: no aspect-ratio padding, so ratio and offsets are identity
    ratio = 1.0, 1.0
    dw, dh = 0, 0
    img = kneron_preprocessing.API.resize(img, size=(480, 256), keep_ratio=False, type='bilinear')
    return img, ratio, (dw, dh)

def LoadImages(path, img_size):  # _rgb  # for inference
    if isinstance(path, str):
        img0 = cv2.imread(path)  # BGR
    else:
        img0 = path  # BGR

    # Padded resize
    img = letterbox(img0, new_shape=img_size)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    return img, img0

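The `img[:, :, ::-1].transpose(2, 0, 1)` idiom in `LoadImages` does two things at once: it reverses the channel axis (BGR to RGB) and moves channels first (HWC to CHW), which is the layout PyTorch models expect. A tiny NumPy check:

```python
import numpy as np

# 2x2 BGR image with distinct per-channel values: B=0, G=1, R=2
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 1] = 1
bgr[..., 2] = 2

# BGR -> RGB (reverse channel axis), then HWC -> CHW
chw = np.ascontiguousarray(bgr[:, :, ::-1].transpose(2, 0, 1))

# channel 0 is now R (all 2s), channel 2 is B (all 0s)
```

`np.ascontiguousarray` matters because the reversed/transposed view is not C-contiguous, and `torch.from_numpy` downstream requires a contiguous buffer.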
def LoadImages_yyy(path, img_size):  # _yyy  # for inference
    if isinstance(path, str):
        img0 = cv2.imread(path)  # BGR
    else:
        img0 = path  # BGR

    yvu = cv2.cvtColor(img0, cv2.COLOR_BGR2YCrCb)
    y, v, u = cv2.split(yvu)  # Y, Cr, Cb
    img0 = np.stack((y,) * 3, axis=-1)  # replicate luma into 3 identical channels

    # Padded resize
    img = letterbox(img0, new_shape=img_size)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    return img, img0

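The `np.stack((y,) * 3, axis=-1)` call in `LoadImages_yyy` turns the single luma plane into a 3-channel image whose channels are all identical, so a model trained on 3-channel input can consume a grayscale (Y-only) signal. A small check:

```python
import numpy as np

y = np.array([[10, 20], [30, 40]], dtype=np.uint8)  # luma plane (H, W)
img3 = np.stack((y,) * 3, axis=-1)  # (H, W, 3), every channel equal to y
```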
def LoadImages_yuv420(path, img_size):  # _yuv420  # for inference
    if isinstance(path, str):
        img0 = cv2.imread(path)  # BGR
    else:
        img0 = path  # BGR
    img_h, img_w = img0.shape[:2]
    img_h = (img_h // 2) * 2  # I420 needs even height/width
    img_w = (img_w // 2) * 2
    img = img0[:img_h, :img_w, :]
    yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV_I420)
    img0 = cv2.cvtColor(yuv, cv2.COLOR_YUV2BGR_I420)  # round-trip through YUV420

    # Padded resize
    img = letterbox(img0, new_shape=img_size)[0]

    # Convert
    img = img[:, :, ::-1].transpose(2, 0, 1)  # BGR to RGB, HWC to CHW
    img = np.ascontiguousarray(img)
    return img, img0

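The `(h // 2) * 2` cropping above floors each dimension to an even value, which the I420 format requires because chroma is subsampled 2x2 (one U/V sample per 2x2 block of Y). As a standalone helper (`crop_even` is a name introduced here for illustration):

```python
def crop_even(h, w):
    """Floor (h, w) to even values, as YUV420 (I420) chroma subsampling requires."""
    return (h // 2) * 2, (w // 2) * 2

# odd dimensions lose at most one row/column; even ones pass through unchanged
```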
def Yolov5_preprocess(image_path, device, imgsz_h, imgsz_w):
    model_stride_max = 32
    imgsz_h = check_img_size(imgsz_h, s=model_stride_max)  # check img_size
    imgsz_w = check_img_size(imgsz_w, s=model_stride_max)  # check img_size
    img, im0 = LoadImages(image_path, img_size=(imgsz_h, imgsz_w))
    img = kneron_preprocessing.API.norm(img)  # path1
    # print('img', img.shape)
    img = torch.from_numpy(img).to(device)  # path1, path2
    # img = img.float()  # uint8 to fp16/32  # path2
    # img /= 255.0  # or: img / 256.0 - 0.5, mapping 0-255 to -0.5-0.5  # path2

    if img.ndimension() == 3:
        img = img.unsqueeze(0)  # add batch dimension: CHW -> 1xCHW

    return img, im0

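`check_img_size` is defined elsewhere in this repo and not shown in this diff. As a hypothetical stand-in, the usual behavior is to round the requested size up to the nearest multiple of the model's maximum stride, so every feature-map downsampling divides evenly (the real implementation may differ):

```python
import math

def check_img_size_sketch(x, s=32):
    """Hypothetical stand-in: round size x up to the nearest multiple of stride s."""
    return int(math.ceil(x / s) * s)

# 480 is already a multiple of 32; 500 rounds up to 512
```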
使用手冊.txt (new file, 57 lines)
@ -0,0 +1,57 @@
Environment setup:
# Create and activate the conda environment
conda create -n stdc_golface python=3.8 -y
conda activate stdc_golface

# Install PyTorch with the matching CUDA 11.3 build
conda install pytorch=1.11.0 torchvision=0.12.0 torchaudio cudatoolkit=11.3 -c pytorch -y

# Install the matching mmcv-full version
pip install mmcv-full==1.5.0 -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html

# Install the kneronstdc project
cd kneronstdc
pip install -e .

# Install common utility packages
pip install opencv-python tqdm matplotlib cityscapesscripts

# Install the yapf formatter (pinned version)
pip install yapf==0.31.0
--------------------------------------------------------------------------------------
data:
When exporting the dataset from Roboflow, choose the format:

Semantic Segmentation Masks

Use the seg2city.py script to convert the Roboflow format to the Cityscapes format.

The Cityscapes sample data can serve as a reference.

Place the converted data in the data/cityscapes folder.

(cityscapes is the default dataset name for training.)
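seg2city.py itself is not shown in this diff. As a hedged sketch only, converting a Roboflow-style RGB semantic mask to a Cityscapes single-channel trainId map typically comes down to a palette lookup like the following (PALETTE and the two class colors are illustrative assumptions, not the script's actual mapping):

```python
import numpy as np

# hypothetical 2-class palette: background = black, target class = white
PALETTE = {(0, 0, 0): 0, (255, 255, 255): 1}

def mask_to_trainid(rgb_mask):
    """Map an RGB semantic mask (H, W, 3) to a single-channel trainId map (H, W)."""
    label = np.full(rgb_mask.shape[:2], 255, dtype=np.uint8)  # 255 = ignore label
    for color, train_id in PALETTE.items():
        match = np.all(rgb_mask == np.array(color, dtype=np.uint8), axis=-1)
        label[match] = train_id
    return label

# one white pixel on a black background
mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0] = (255, 255, 255)
```

Any pixel whose color is not in the palette keeps the ignore label (255), which Cityscapes-style training losses skip.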
--------------------------------------------------------------------------------------
Training:
Activate the newly created env, open a terminal, and cd into kneronstdc.
Train command:
python tools/train.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py

Test command:
python tools/test.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth --show-dir work_dirs/vis_results
------------------------------------------------------------------------------------
Mount the project folder into the toolchain container:
docker run --rm -it -v $(wslpath -u 'C:\Users\rd_de\kneronstdc'):/workspace/kneronstdc kneron/toolchain:latest

Convert to ONNX:
python tools/pytorch2onnx_kneron.py configs/stdc/kn_stdc1_in1k-pre_512x1024_80k_cityscapes.py --checkpoint work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.pth --output-file work_dirs/kn_stdc1_in1k-pre_512x1024_80k_cityscapes/latest.onnx --verify

Copy the generated .nef file out of the container to the host:
docker cp f78594411e1b:/data1/kneron_flow/models_630.nef "C:\Users\rd_de\kneronstdc\work_dirs\nef\models_630.nef"
---------------------------------------------------------------------------------------
pip install opencv-python
RUN apt update && apt install -y libgl1