Design Doc: KNEO Academy (Innovedus AI Playground) v2.0

Author: Architect Agent
Status: Draft (reverse-engineered from the existing code)
Date: 2026-04-04
Product version: v2.0

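1. Background and Goals

1.1 Product Overview

KNEO Academy (external name: Innovedus AI Playground) is a Windows desktop application that lets users who own a Kneron NPU USB dongle run Edge AI inference locally, with no cloud service and no code to write.

Core use cases:

Sales demos: plug and play, showing AI inference out of the box
R&D validation: quickly test how well the Kneron NPU performs on a specific task
Customer models: upload a custom .nef model and run inference tests on it

1.2 Design Goals

Plug and play: inference can start within seconds of connecting a Kneron dongle
Plugin architecture: new models are added through a directory layout plus config.json and script.py, without changing the main program
Thread isolation: the UI thread and the inference thread are fully separated so the display does not stall
PyInstaller compatibility: the application can be packaged as a single executable for easy distribution

1.3 Non-Goals (Out of Scope)

Cloud inference or API services
Simultaneous inference on multiple devices
Mobile platform support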

2. Architecture Overview

2.1 Design Pattern

The application follows an MVC (Model-View-Controller) architecture and uses Qt's Signal/Slot mechanism for cross-thread communication.

┌─────────────────────────────────────────────────────────────────┐
│                   Views (presentation layer)                    │
│  SelectionScreen  LoginScreen  MainWindow  UtilitiesScreen      │
│      QWidget        QWidget     QWidget        QWidget          │
└────────────────────────────┬────────────────────────────────────┘
                             │ Signal/Slot
┌────────────────────────────▼────────────────────────────────────┐
│                     AppController (main.py)                     │
│                QStackedWidget: page routing hub                 │
└───────┬────────────────────┬──────────────────────┬─────────────┘
        │                    │                      │
┌───────▼───────┐  ┌─────────▼─────────┐  ┌─────────▼────────┐
│ Device        │  │ Inference         │  │ Media            │
│ Controller    │  │ Controller        │  │ Controller       │
│ (device mgmt) │  │ (inference mgmt)  │  │ (camera/media)   │
└───────┬───────┘  └─────────┬─────────┘  └─────────┬────────┘
        │                    │                      │
┌───────▼───────┐  ┌─────────▼─────────┐  ┌─────────▼────────┐
│ device_       │  │ InferenceWorker   │  │ VideoThread      │
│ service.py    │  │ Thread            │  │ (QThread)        │
│ (kp SDK)      │  │ (QThread)         │  │ (OpenCV capture) │
└───────────────┘  └─────────┬─────────┘  └──────────────────┘
                             │ dynamic load
                   ┌─────────▼─────────┐
                   │ script.py         │
                   │ (plugin inference)│
                   └───────────────────┘

2.2 Page Navigation

AppController uses a QStackedWidget as its root container. All pages are created once at startup and shown by calling setCurrentWidget(), so no page object is ever rebuilt.

AppController
└── QStackedWidget (stack)
    ├── [index 0] SelectionScreen    ← shown by default
    ├── [index 1] LoginScreen
    ├── [index 2] UtilitiesScreen
    └── [index 3] MainWindow

Signal connections (page switching):

Emitter	Signal	Receiver (Slot)	Effect
SelectionScreen	`open_utilities`	`AppController.show_login_screen`	Go to the login page
SelectionScreen	`open_demo_app`	`AppController.show_demo_app`	Go to the main window
LoginScreen	`login_success`	`AppController.show_utilities_screen`	Login succeeded, enter the utilities page
LoginScreen	`back_to_selection`	`AppController.show_selection_screen`	Return to the home page
UtilitiesScreen	`back_to_selection`	`AppController.show_selection_screen`	Return to the home page
2.3 MainWindow Internals

MainWindow is the core container for AI demo inference. It owns three controllers that work together:

MainWindow (QWidget)
├── DeviceController    - manages the kp SDK device connection
├── InferenceController - manages the inference worker thread and queue
│   └── inference_queue (queue.Queue, maxsize=5)
└── MediaController     - manages camera capture and frame updates
    └── VideoThread (QThread)

3. Module Dependency Graph

main.py (AppController)
├── views/selection_screen.py (SelectionScreen)
│   └── config.py
├── views/login_screen.py (LoginScreen)
│   └── config.py
├── views/utilities_screen.py (UtilitiesScreen)
│   ├── config.py
│   ├── controllers/device_controller.py (DeviceController)
│   └── services/device_service.py (check_available_device)
└── views/mainWindows.py (MainWindow)
    ├── config.py
    ├── controllers/device_controller.py (DeviceController)
    │   ├── config.py
    │   └── services/device_service.py
    ├── controllers/inference_controller.py (InferenceController)
    │   ├── config.py
    │   ├── models/inference_worker.py (InferenceWorkerThread)
    │   │   ├── config.py
    │   │   └── [dynamically loaded] utils/{mode}/{model}/script.py
    │   └── models/custom_inference_worker.py (CustomInferenceWorkerThread)
    ├── controllers/media_controller.py (MediaController)
    │   ├── models/video_thread.py (VideoThread)
    │   └── utils/image_utils.py
    ├── services/file_service.py (FileService)
    └── utils/config_utils.py (ConfigUtils)

Note: UtilitiesScreen creates its own DeviceController instance (a different object from the one in MainWindow); the two do not share device state.

4. Data Flow

4.1 Real-Time Video Inference

Camera hardware
    │ every frame (~30 fps)
    ▼
VideoThread.run()
    │ QImage
    │ change_pixmap_signal.emit(qt_image)
    ▼
MediaController.update_image(qt_image)
    ├── 1. Draw bounding boxes → canvas_label.setPixmap(pixmap)
    └── 2. qimage_to_numpy(qt_image) → frame_np
            │
            ▼
        InferenceController.add_frame_to_queue(frame_np)
            │ if the queue is not full (maxsize=5)
            ▼
        inference_queue.put(frame_np)
            │
            ▼
        InferenceWorkerThread.run()
            ├── 1. MSE comparison with the previous frame → if the difference is small, emit the cached result
            ├── 2. Time interval check (min_interval = 2 s)
            └── 3. script.inference(frame, params) → result
                    │
                    │ inference_result_signal.emit(result)
                    ▼
                MainWindow.handle_inference_result(result)
                    ├── If "bounding box" / "bounding boxes" is present
                    │   → update current_bounding_boxes (drawn on the next frame)
                    └── If there is no bounding box → show the result in a QMessageBox
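
The three checks inside InferenceWorkerThread.run() can be sketched without Qt or the kp SDK. This is only an illustration: the class name `FrameGate`, the method `decide`, and the returned strings are hypothetical, and because the flow above does not say what happens when the interval check fails, the sketch assumes the frame is simply skipped. The defaults (MSE threshold 500, min_interval 2 s) are the values listed in Section 10.2.

```python
class FrameGate:
    """Decides what to do with one frame (hypothetical helper, not the project's code)."""

    def __init__(self, mse_threshold=500.0, min_interval=2.0):
        self.mse_threshold = mse_threshold
        self.min_interval = min_interval
        self.last_inference_time = None  # time of the last real inference

    def decide(self, mse, now):
        """Return 'cached', 'skip', or 'infer'.

        mse: mean squared error against the previous frame (None for the first frame).
        now: current time in seconds, e.g. time.monotonic().
        """
        # 1. The frame is similar to the previous one: reuse the cached result.
        if mse is not None and mse < self.mse_threshold:
            return "cached"
        # 2. Too soon after the last inference (assumption: the frame is skipped).
        if (self.last_inference_time is not None
                and now - self.last_inference_time < self.min_interval):
            return "skip"
        # 3. Otherwise script.inference() would run; remember when.
        self.last_inference_time = now
        return "infer"
```

Keeping the decision separate from the thread loop would make the threshold and interval easy to unit test.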

4.2 Image Inference

User clicks "Upload"
    │
    ▼
FileService.upload_file()
    ├── 1. Pause the camera signal (disconnect change_pixmap_signal)
    ├── 2. Pick a file with QFileDialog
    ├── 3. shutil.copy2() → %LOCALAPPDATA%/Kneron_Academy/uploads/
    ├── 4. Show the image on canvas_label
    └── 5. InferenceController.process_uploaded_image(file_path)
                │
                ▼
            _clear_inference_queue()
            inference_queue.put(img)
                │
                ▼
            InferenceWorkerThread (once_mode=True)
                │ processes one frame, then stops
                ▼
            script.inference(frame, params) → result
                │
                │ inference_result_signal.emit(result)
                ▼
            MainWindow.handle_inference_result(result)

4.3 Custom Model Inference

The user provides:
  - custom_model_path (.nef file)
  - custom_scpu_path (fw_scpu.bin)
  - custom_ncpu_path (fw_ncpu.bin)
  - custom_labels (optional)
    │
    ▼
InferenceController.select_custom_tool(tool_config)
    │
    ▼
CustomInferenceWorkerThread (QThread)
    │
    ├── initialize_device() (on first run)
    │   ├── kp.core.connect_devices([port_id])
    │   ├── kp.core.load_firmware_from_file(scpu, ncpu)
    │   └── kp.core.load_model_from_file(model_path)
    │
    └── run_single_inference(frame)
        ├── preprocess_frame() (resize to 640, BGR → BGR565)
        ├── kp.GenericImageInferenceDescriptor
        ├── kp.inference.generic_image_inference_send()
        ├── kp.inference.generic_image_inference_receive()
        ├── kp.inference.generic_inference_retrieve_float_node()
        └── post_process_yolo_v5() → ExampleYoloResult
                │
                │ inference_result_signal.emit(result_dict)
                ▼
            MainWindow.handle_inference_result()

Note: CustomInferenceWorkerThread connects to and resets the device by itself inside the worker thread. That connection is separate from the device_group managed by DeviceController, which creates a double-connection problem (see Section 9.1).

5. Thread Architecture

5.1 Thread Relationships

Qt Main Thread (UI thread)
├── AppController (manages the QStackedWidget)
├── MainWindow (UI event handling)
│   ├── handle_inference_result()    ← invoked by a signal (runs on the main thread)
│   └── update_image() via MediaController  ← invoked by a signal (runs on the main thread)
│
├── VideoThread (QThread #1)
│   └── Role: camera capture, QImage conversion, emit change_pixmap_signal
│   └── Internals: threading.Thread (implements the camera-open timeout)
│
├── InferenceWorkerThread (QThread #2)
│   └── Role: take frames from the queue, compare MSE, call script.inference(), emit results
│
└── CustomInferenceWorkerThread (QThread #3, used instead of InferenceWorkerThread)
    └── Role: device init, kp inference, YOLOv5 post-processing, emit results

5.2 Threads Used for Device Scanning

check_available_device() runs kp.core.scan_devices() in a threading.Thread (not a QThread) and waits at most 5 seconds for it:

check_available_device() in Main Thread
    └── threading.Thread (daemon=True)
            └── kp.core.scan_devices() (blocking SDK call)
        thread.join(timeout=5.0)

Likewise, VideoThread._open_camera_with_timeout() opens the camera in a threading.Thread with a 5-second timeout.
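
The timeout pattern shared by both call sites can be sketched as a small helper. The function name `call_with_timeout` and its `default` parameter are hypothetical and not taken from the code base; the sketch only shows the daemon-thread-plus-join idea described above.

```python
import threading


def call_with_timeout(func, timeout=5.0, default=None):
    """Run func() in a daemon thread and stop waiting after `timeout` seconds."""
    result = {}

    def target():
        try:
            result["value"] = func()
        except Exception as exc:  # keep the error so the caller could inspect it
            result["error"] = exc

    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout=timeout)
    if worker.is_alive() or "error" in result:
        # Timed out (func is still running in the background) or raised.
        return default
    return result["value"]
```

A call site might then read `devices = call_with_timeout(kp.core.scan_devices, timeout=5.0)`. Note that a timed-out call is not cancelled; it keeps running in the daemon thread, which is the caveat raised in Section 9.7.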

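5.3 Cross-Thread Communication

Worker threads report back to the UI only through Qt's Signal/Slot mechanism. Because sender and receiver live in different threads, Qt queues each emission and runs the slot in the receiving thread's event loop. (In the other direction, frames reach the worker through the thread-safe inference_queue shown in Section 4.1.)

Signal	Emitting thread	Receiving thread (Slot)
`VideoThread.change_pixmap_signal`	VideoThread	Main Thread (`MediaController.update_image`)
`InferenceWorkerThread.inference_result_signal`	InferenceWorkerThread	Main Thread (`MainWindow.handle_inference_result`)
`CustomInferenceWorkerThread.inference_result_signal`	CustomInferenceWorkerThread	Main Thread (`MainWindow.handle_inference_result`)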

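6. Plugin System Design

6.1 Concept

The plugin system lets Kneron or third parties add AI tools by dropping in a directory and its configuration files, with no change to the main program.

6.2 Directory Layout

%LOCALAPPDATA%/Kneron_Academy/utils/
├── config.json                     ← global plugin index (auto-generated)
├── {mode_name}/                    ← inference mode directory (e.g. object_detection)
│   └── {model_name}/               ← model directory (e.g. yolov5_person)
│       ├── config.json             ← model configuration
│       ├── script.py               ← inference script (the plugin itself)
│       └── {model_name}.nef        ← Kneron model file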
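6.3 Plugin Loading Flow

Application starts
    │
    ▼
ConfigUtils.generate_global_config()
    ├── Scan every mode directory under utils/ (skipping directories that start with _)
    ├── Scan every model directory under each mode
    ├── Read each model/config.json
    └── Write utils/config.json (the plugin index)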
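
The scan performed at startup could look roughly like the sketch below. The function name `build_plugin_index` and the shape of the returned dictionary (mode name → model name → that model's config) are assumptions; the actual schema of the generated utils/config.json is not described in this document.

```python
import json
from pathlib import Path


def build_plugin_index(utils_dir):
    """Collect every {mode}/{model}/config.json under utils_dir into one dict."""
    index = {}
    for mode_dir in sorted(Path(utils_dir).iterdir()):
        # Skip plain files (such as the global config.json) and "_"-prefixed directories.
        if not mode_dir.is_dir() or mode_dir.name.startswith("_"):
            continue
        for model_dir in sorted(mode_dir.iterdir()):
            config_file = model_dir / "config.json"
            if model_dir.is_dir() and config_file.is_file():
                with config_file.open(encoding="utf-8") as fh:
                    index.setdefault(mode_dir.name, {})[model_dir.name] = json.load(fh)
    return index
```

Writing the result with `json.dump(index, fh, indent=2)` would then produce the index file.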
    
User selects a tool
    │
    ▼
InferenceController.select_tool(tool_config)
    │
    ▼
InferenceWorkerThread.__init__()
    └── load_inference_module(mode, model_name)
            └── importlib.util.spec_from_file_location()
                → dynamically imports utils/{mode}/{model}/script.py
6.4 `script.py` Interface Specification

Every plugin's script.py must implement the following interface:

import numpy as np


def inference(frame: np.ndarray, params: dict) -> dict | None:
    """
    Args:
        frame: image frame as a numpy array of shape (H, W, 3), RGB format
        params: dictionary of inference parameters (see the config.json schema)
    
    Returns:
        dict or None (None means this frame is skipped)
    
    Supported return formats (handled in handle_inference_result):
    
    Format A: single bounding box
    {
        "bounding box": [x1, y1, x2, y2],
        "result": "class_label"
    }
    
    Format B: multiple bounding boxes (recommended)
    {
        "bounding boxes": [[x1, y1, x2, y2], ...],
        "results": ["label1", "label2", ...]
    }
    
    Format C: arbitrary classification result (shown in a QMessageBox)
    {
        "key": "value",
        ...
    }
    """
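
To make the three return formats concrete, the sketch below maps them onto one common shape. It is an illustration only, not the logic of handle_inference_result(); the function name `normalize_result` and the returned tuple layout are hypothetical.

```python
def normalize_result(result):
    """Map the documented return formats onto (boxes, labels, extra).

    boxes:  list of [x1, y1, x2, y2]
    labels: list of label strings
    extra:  dict to show in a message box for format C, otherwise None
    """
    if result is None:
        return [], [], None              # frame skipped
    if "bounding boxes" in result:       # format B
        boxes = [list(box) for box in result["bounding boxes"]]
        labels = list(result.get("results", []))
        return boxes, labels, None
    if "bounding box" in result:         # format A
        label = result.get("result")
        labels = [label] if label is not None else []
        return [list(result["bounding box"])], labels, None
    return [], [], dict(result)          # format C
```

For example, `normalize_result({"bounding box": [1, 2, 3, 4], "result": "cat"})` yields one box and one label, so a caller only has to handle the multi-box case.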

6.5 Contents of the `params` Dictionary

Before creating the InferenceWorkerThread, InferenceController.select_tool() injects the following entries into input_params:

Key	Type	Source
`device_group`	kp.DeviceGroup	DeviceController
`usb_port_id`	int	The connected device
`scpu_path`	str	Firmware path
`ncpu_path`	str	Firmware path
`model`	str	Full path to the model .nef file
`model_descriptor`	kp.ModelDescriptor	Descriptor of the uploaded model
`file_path`	str	Path of the uploaded file in image/audio mode
(others)	any	`input_parameters` from the model's config.json

7. Data Storage Design

7.1 Runtime Data Directory

Everything is stored under %LOCALAPPDATA%\Kneron_Academy\ on Windows:

%LOCALAPPDATA%\Kneron_Academy\
├── utils\
│   ├── config.json                 ← global plugin index (generated at startup)
│   ├── {mode}\
│   │   └── {model}\
│   │       ├── config.json         ← model configuration
│   │       ├── script.py           ← inference script
│   │       └── *.nef               ← model file
│   └── ...
├── uploads\                        ← files uploaded by the user (never cleaned up automatically)
│   └── *.jpg / *.png / *.wav / ...
└── firmware\
    ├── KL520\
    │   ├── fw_scpu.bin
    │   └── fw_ncpu.bin
    └── KL720\
        ├── fw_scpu.bin
        └── fw_ncpu.bin

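7.2 Static UI Resources

These are bundled with the application (the uxui/ directory). Their path is derived from the PROJECT_ROOT constant:

import os

PROJECT_ROOT = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
UXUI_ASSETS = os.path.join(PROJECT_ROOT, "uxui", "")

PyInstaller note: the location of __file__ changes after packaging, so verify that UXUI_ASSETS still resolves correctly in the packaged build (see Section 8.2).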
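
One common way to keep such paths valid in both environments is to fall back on `sys._MEIPASS`, the attribute PyInstaller's bootloader sets to the directory that holds bundled data files. The helper name `resource_path` is hypothetical, and the sketch assumes the uxui/ directory is listed under `datas` in the .spec file.

```python
import os
import sys


def resource_path(relative_path, project_root):
    """Resolve a bundled resource when run from source or from a PyInstaller build."""
    # sys._MEIPASS exists only inside a PyInstaller bundle; otherwise use the source tree.
    base_dir = getattr(sys, "_MEIPASS", project_root)
    return os.path.join(base_dir, relative_path)
```

With this helper, UXUI_ASSETS could be computed as `resource_path(os.path.join("uxui", ""), PROJECT_ROOT)`.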

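8. Packaging Architecture (PyInstaller)

8.1 Packaging Toolchain

PyInstaller 6.12.0: packages the Python application as a Windows .exe
Inno Setup (dist/test.iss): builds the Windows installer (.exe installer)
PyArmor (planned): obfuscates/encrypts the Python source code

8.2 Packaging Notes

Item	Issue	Approach
`kp` SDK	kp is a C extension; it must be confirmed that PyInstaller packages it correctly	Configure `hiddenimports` or `binaries`
Dynamic import	After packaging, `importlib.util.spec_from_file_location()` has to load from an external path	`script.py` must live in `%LOCALAPPDATA%` and cannot be bundled into the exe
`UXUI_ASSETS` path	After packaging, `__file__` points to a temporary directory	Declare `datas` in the `.spec` file and resolve paths through `sys._MEIPASS`
OpenCV	OpenCV needs its DLLs included	PyInstaller usually detects them automatically

8.3 Directory Layout (After Packaging)

Install directory/
├── Innovedus AI Playground.exe   ← main executable
├── uxui/                         ← static resources (must be installed alongside the exe)
└── ...

%LOCALAPPDATA%\Kneron_Academy\    ← user data (created at install time)
├── utils/
├── uploads/
└── firmware/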

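9. Known Technical Issues / Technical Debt

9.1 Double Device Connection (⚠️ Severe)

Problem: CustomInferenceWorkerThread calls kp.core.connect_devices() inside the worker thread, but DeviceController may already hold a connection to the same usb_port_id (established in the MainWindow flow).

Impact: the kp SDK may report a "device already connected" error, or behavior may be undefined.

Recommendation: CustomInferenceWorkerThread should accept a device_group passed in from outside instead of connecting on its own.

9.2 Isolated DeviceController in UtilitiesScreen (⚠️ Medium)

Problem: UtilitiesScreen creates its own DeviceController(self) instance, fully independent of the one in MainWindow. Each manages its own device_group.

Impact: after the user connects a device in UtilitiesScreen and switches to MainWindow, MainWindow does not know the device is connected, and vice versa.

Recommendation: lift DeviceController up to the AppController level as a shared singleton.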

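9.3 Inference Queue Drops Frames Silently (⚠️ Medium)

Problem: inference_queue has maxsize=5. When the queue is full, add_frame_to_queue() drops the frame silently (it only calls print and does not notify the UI).

Impact: when inference latency is high, the user cannot tell that frames are being dropped and may assume inference is still running in real time.

Recommendation: add a UI indicator of inference queue pressure (for example a frame-rate readout or a lag indicator).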
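
As a starting point for such an indicator, the queue could count what it drops. The wrapper below is a sketch under assumptions: the class name `FrameQueue` and its methods are hypothetical and do not exist in the code base; only `queue.Queue(maxsize=5)` comes from the design above.

```python
import queue


class FrameQueue:
    """Bounded frame queue that records how many frames it drops (hypothetical helper)."""

    def __init__(self, maxsize=5):
        self._queue = queue.Queue(maxsize=maxsize)
        self.accepted = 0
        self.dropped = 0

    def add_frame(self, frame):
        """Enqueue without blocking; return False if the frame was dropped."""
        try:
            self._queue.put_nowait(frame)
        except queue.Full:
            self.dropped += 1
            return False
        self.accepted += 1
        return True

    def get_frame(self, timeout=None):
        """Blocking get; raises queue.Empty after `timeout` seconds."""
        return self._queue.get(timeout=timeout)

    def drop_ratio(self):
        """Fraction of offered frames that were dropped so far."""
        total = self.accepted + self.dropped
        return self.dropped / total if total else 0.0
```

The UI could poll `drop_ratio()` on a timer, or the controller could emit a signal whenever `add_frame()` returns False.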

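9.4 LoginScreen Authentication Not Implemented (⚠️ Medium)

Problem: the server-side verification in LoginScreen.attempt_login() is not implemented. Any non-empty username and password currently logs in successfully.

# Current implementation (insecure)
if not username or not password:
    self.show_error("Please enter both username and password")
    return
self.login_success.emit()  # always succeeds

Recommendation: add the call to the server-side authentication API.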

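9.5 Debug print Statements Scattered Throughout (Low Priority)

Problem: the controllers and threads contain many print() calls used as debug output. They still run after packaging (the output is discarded, but the calls still cost time).

Recommendation: switch to Python's logging module and set an appropriate log level.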
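
A minimal sketch of what that migration might look like is given below. The logger names "kneo" and "kneo.inference" and both function names are assumptions chosen for illustration.

```python
import logging


def configure_logging(debug=False):
    """Configure the (hypothetical) 'kneo' logger hierarchy once at startup."""
    logger = logging.getLogger("kneo")
    logger.setLevel(logging.DEBUG if debug else logging.WARNING)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    return logger


# In a module such as the inference controller:
log = logging.getLogger("kneo.inference")


def report_dropped_frame(queue_size):
    # Lazy %-style arguments: the message is only formatted if DEBUG is enabled.
    log.debug("inference queue full (%d frames), dropping frame", queue_size)
```

In a release build, `configure_logging(debug=False)` leaves only warnings and errors active, so debug calls return almost immediately.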

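9.6 `kp` Referenced at Module Level in `custom_inference_worker.py` (⚠️ Medium)

Problem: the type annotations of _boxes_scale() and post_process_yolo_v5() refer directly to kp.HwPreProcInfo and kp.InferenceFloatNodeOutput (for example, def _boxes_scale(boxes, hardware_preproc_info: kp.HwPreProcInfo)), but kp is not imported at the top of the module. The actual import of kp is deferred and happens inside run_single_inference().

Impact: unless the module uses from __future__ import annotations, Python evaluates these annotations when each def statement runs, that is, when the module is loaded, which can raise NameError.

Recommendation: use string annotations such as "kp.HwPreProcInfo" (or from __future__ import annotations), and add if TYPE_CHECKING: import kp at the top of the module so type checkers can still resolve the name.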

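9.7 Lingering `threading.Thread` in VideoThread (Resource-Leak Risk, Low Priority)

Problem: _open_camera_with_timeout() starts a threading.Thread with daemon=True and waits up to 5 seconds. If the thread is still alive when the timeout expires, it keeps trying to open the camera and may end up holding the camera resource improperly.

Recommendation: use a non-blocking approach in cv2 or set a camera timeout parameter, to avoid the unpredictable behavior of the daemon thread.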

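9.8 MSE Computation Cost (Low Priority)

Problem: the MSE computation in InferenceWorkerThread and CustomInferenceWorkerThread converts the entire frame to float32:

mse = np.mean((frame.astype(np.float32) - self.last_frame.astype(np.float32)) ** 2)

For a 640x480 image with 3 channels, each computation processes 640 × 480 × 3 = 921,600 floating-point values.

Recommendation: compute the MSE on a downscaled frame, or use a faster comparison such as a histogram.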
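
One way to reduce the cost is to subsample the frame before converting it. The sketch below assumes numpy is available (as in the snippet above); the function name `fast_mse` and the `step` parameter are hypothetical. With step=4, a 640x480x3 frame shrinks to 160 × 120 × 3 = 57,600 values, one sixteenth of the original.

```python
import numpy as np


def fast_mse(frame, last_frame, step=4):
    """MSE over every `step`-th pixel in both spatial dimensions."""
    # Slice first so that astype() only copies the subsampled pixels.
    a = frame[::step, ::step].astype(np.float32)
    b = last_frame[::step, ::step].astype(np.float32)
    return float(np.mean((a - b) ** 2))
```

The subsampled value only approximates the full-frame MSE, so the threshold of 500 may need to be re-tuned after such a change.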

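10. Capacity and Performance Estimates

10.1 System Requirements (Desktop Application)

Resource	Requirement	Notes
CPU	Dual-core or better	Mainly used for image conversion and post-processing
RAM	2 GB or more	kp SDK + OpenCV + PyQt5
USB	USB 3.0	KL720 requires USB 3.0
GPU	Not required	Inference runs on the NPU
Disk	500 MB or more	Installer plus model files

10.2 Inference Speed Characteristics

Queue maxsize: 5 frames
VideoThread output: ~30 fps (640x480)
InferenceWorkerThread min_interval: 2 s (standard mode) / 0.5 s (custom mode)
MSE threshold: 500 (frames below this value count as similar and reuse the cached result)
Camera-open timeout: 5 s × up to 3 attempts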

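11. Security Design

11.1 Current Status

Item	Status	Notes
Server login verification	❌ Not implemented	`attempt_login()` always succeeds
Code protection	⚠️ Planned	PyArmor is on the roadmap
Custom model validation	❌ None	Any .nef file can be uploaded
Network encryption	❌ Unknown	No TLS configuration found for the server authentication endpoint

11.2 Plugin Security Risk

load_inference_module() uses importlib to execute script.py dynamically, which is equivalent to running arbitrary Python code. If a script.py under %LOCALAPPDATA% is replaced with a malicious one, the attacker controls not only the inference behavior but can run any code with the user's privileges.

Recommendation: consider verifying a signature on script.py, or restricting it to a sandboxed execution environment.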
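
As a simpler stand-in for full signature verification, a loader could at least compare each script against a list of known-good SHA-256 digests before importing it. This is a digest allowlist, not a digital signature, and it only helps if the allowlist itself is stored somewhere an attacker cannot modify. Both function names below are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted_script(path, trusted_digests):
    """True if the file's digest matches one of the known-good digests."""
    actual = sha256_of_file(path)
    # compare_digest performs a constant-time comparison of the two strings.
    return any(hmac.compare_digest(actual, expected) for expected in trusted_digests)
```

A loader would call `is_trusted_script()` first and refuse to import any script.py that fails the check.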
