Refactor workers to use backend interfaces for quantization, compilation, and evaluation; add optional flags for simulation in request schemas and update documentation accordingly.
parent bc98456d74
commit fdebf4db5d
22
README.md
@@ -27,6 +27,28 @@ ONNX → BIE → NEF. The system uses the Scheduler as its control plane, paired with a Worker Pool and
 7) NEF Worker finishes execution
 8) Scheduler marks the task COMPLETED

+## Worker API Flags (optional)
+These flags control the evaluator / simulator steps. Each has a default value, so omitting them leaves the existing flow unchanged.
+
+- ONNX `/api/onnx/process`
+  - `enable_evaluate` (default: `false`): whether to run the IP evaluator (OFF in the original Web GUI flow)
+  - `enable_sim_fp` (default: `false`): whether to run floating-point E2E simulation (not yet wired)
+- BIE `/api/bie/process`
+  - `enable_sim_fixed` (default: `false`): whether to run fixed-point E2E simulation (not yet wired)
+- NEF `/api/nef/process`
+  - `enable_sim_hw` (default: `false`): whether to run hardware E2E simulation (not yet wired)
+
+## Default Step Toggles (original Web GUI vs. current Workers)
+| Step | Original Web GUI default | Current Workers default | Flag |
+|---|---|---|---|
+| ONNX conversion/optimization | ON | ON | none |
+| IP Evaluator | OFF | OFF | `enable_evaluate` |
+| FP E2E simulation | OFF | OFF | `enable_sim_fp` |
+| BIE quantization | ON | ON | none |
+| Fixed-Point E2E simulation | OFF | OFF | `enable_sim_fixed` |
+| NEF Compile | ON | ON | none |
+| HW E2E simulation | OFF | OFF | `enable_sim_hw` |
+
 ## Non-goals
 - No task persistence
 - No resume after a crash
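As a rough illustration of the optional flags documented in this README hunk, the sketch below assembles a request body for `/api/onnx/process`. The field names and defaults come from the README text and the request schema in this commit; the `build_onnx_request` helper and the sample `file_id` value are hypothetical, and no HTTP call is made.

```python
import json

# Defaults as documented in the README hunk: every optional flag is off.
ONNX_FLAG_DEFAULTS = {"enable_evaluate": False, "enable_sim_fp": False}


def build_onnx_request(file_id, model_id, version, platform, **flags):
    """Hypothetical helper: assemble a JSON-serializable body for /api/onnx/process."""
    unknown = set(flags) - set(ONNX_FLAG_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    body = {
        "file_id": file_id,
        "model_id": model_id,
        "version": version,
        "platform": platform,
    }
    # Start from the documented defaults, then apply explicit overrides.
    body.update(ONNX_FLAG_DEFAULTS)
    body.update({name: bool(value) for name, value in flags.items()})
    return body


if __name__ == "__main__":
    request = build_onnx_request("abc123", 10, "0001", "520", enable_evaluate=True)
    print(json.dumps(request, indent=2))
```

Omitting the flags yields the same body an older client would effectively get, which matches the README's statement that leaving them out does not change the existing flow.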
@@ -165,16 +165,34 @@ error:
 - Input: the only file in the working directory (no assumption about file name / extension)
 - Output: `out.onnx`
 - Output location: the same working directory
+- Optional flags:
+  - `enable_evaluate` (default: `false`): whether to run the IP evaluator (OFF in the original Web GUI flow)
+  - `enable_sim_fp` (default: `false`): whether to run floating-point E2E simulation (not yet wired)

 ### 4.1.3 BIE Worker
 - Input: `out.onnx` + `ref_images/*`
 - Output: `out.bie`
 - Output location: the same working directory
+- Optional flags:
+  - `enable_sim_fixed` (default: `false`): whether to run fixed-point E2E simulation (not yet wired)

 ### 4.1.4 NEF Worker
 - Input: `out.bie`
 - Output: `out.nef`
 - Output location: the same working directory
+- Optional flags:
+  - `enable_sim_hw` (default: `false`): whether to run hardware E2E simulation (not yet wired)

+### 4.1.6 Default Step Toggles (original Web GUI vs. current Workers)
+| Step | Original Web GUI default | Current Workers default | Flag |
+|---|---|---|---|
+| ONNX conversion/optimization | ON | ON | none |
+| IP Evaluator | OFF | OFF | `enable_evaluate` |
+| FP E2E simulation | OFF | OFF | `enable_sim_fp` |
+| BIE quantization | ON | ON | none |
+| Fixed-Point E2E simulation | OFF | OFF | `enable_sim_fixed` |
+| NEF Compile | ON | ON | none |
+| HW E2E simulation | OFF | OFF | `enable_sim_hw` |
+
 ### 4.1.5 Core / Toolchain Path Consistency
 - The Worker must pass the working-directory path to core
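The toggle table in this hunk can be read as a small lookup: mandatory steps always run, optional steps run only when their flag is set. The sketch below models just that documented mapping. The `planned_steps` helper is hypothetical, and since the commit notes that the simulator flags are not yet wired, this describes intended behavior rather than what the workers currently execute.

```python
# Step order and toggles as listed in the table; None means the step always runs.
PIPELINE_STEPS = [
    ("ONNX conversion/optimization", None),
    ("IP Evaluator", "enable_evaluate"),
    ("FP E2E simulation", "enable_sim_fp"),
    ("BIE quantization", None),
    ("Fixed-Point E2E simulation", "enable_sim_fixed"),
    ("NEF Compile", None),
    ("HW E2E simulation", "enable_sim_hw"),
]


def planned_steps(flags=None):
    """Return the step names that would run for the given flags (hypothetical helper)."""
    flags = flags or {}
    return [
        name
        for name, flag in PIPELINE_STEPS
        if flag is None or flags.get(flag, False)
    ]


if __name__ == "__main__":
    # With no flags, only the three mandatory steps remain.
    print(planned_steps())
    print(planned_steps({"enable_evaluate": True}))
```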
@@ -245,7 +245,41 @@ If a file currently mixes multiple module responsibilities:
 - Move high-risk dependencies first (prebuilt calls, sys_flow usage).
 - After each phase, re-check boundaries and adjust.

-## 6) Minimum Viable API Proposal
+## 7) Current Structure and Replacement Strategy (As-Is)
+Based on the refactor just completed, the effective call chain is:
+
+```
+workers (ONNX/BIE/NEF)
+  -> backends (interfaces + Kneron implementations)
+    -> ktc (toolchain python API)
+      -> vendor sys_flow / libs / libs_V2 / prebuilt binaries
+```
+
+### 7.1 What this means today
+- Workers only depend on **backend interfaces**. They no longer call `ktc.ModelConfig` directly.
+- Kneron specifics are concentrated in backend implementations.
+- `ktc` still wraps the Kneron toolchain and binaries; that dependency remains, but it is **now isolated**.
+
+### 7.2 How to replace later
+1) **Replace backend implementations** (lowest-risk)
+   - Keep backend interfaces stable.
+   - Swap `Kneron*Backend` for `Your*Backend` without touching workers.
+
+2) **Keep backend layer, but replace `ktc` calls**
+   - Modify `Kneron*Backend` to call your own library instead of `ktc`.
+   - Workers stay unchanged; only backend code moves.
+
+3) **Introduce multiple backends**
+   - Add `get_*_backend(name=...)` selection based on config/env.
+   - Allows mixed runs: Kneron for NEF, OSS for ONNX, etc.
+
+### 7.3 Where to implement replacements
+- `services/backends/quantization.py`
+- `services/backends/compiler.py`
+- `services/backends/evaluator.py`
+- `services/backends/simulator.py`
+
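Option 3 in section 7.2 (name-based backend selection) could look roughly like the sketch below. It is only one possible shape: `FakeCompilerBackend`, the `_COMPILER_BACKENDS` registry, and the `COMPILER_BACKEND` environment variable are assumptions made for illustration, and the real `KneronCompilerBackend` is left out so the example runs without the Kneron toolchain.

```python
import os
from typing import Callable, Dict, Optional, Protocol


class CompilerBackend(Protocol):
    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str: ...


class FakeCompilerBackend:
    """Hypothetical stand-in that needs no Kneron binaries (for CI or dev use)."""

    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
        # Pretend a NEF was produced and return where it would live.
        return os.path.join(output_dir or ".", "out.nef")


# Registry of factories; a "kneron" entry would point at KneronCompilerBackend.
_COMPILER_BACKENDS: Dict[str, Callable[[], CompilerBackend]] = {
    "fake": FakeCompilerBackend,
}


def get_compiler_backend(name: Optional[str] = None) -> CompilerBackend:
    # COMPILER_BACKEND is an assumed environment variable, not part of the commit.
    chosen = name or os.environ.get("COMPILER_BACKEND", "fake")
    try:
        return _COMPILER_BACKENDS[chosen]()
    except KeyError:
        raise ValueError(f"unknown compiler backend: {chosen!r}") from None


if __name__ == "__main__":
    backend = get_compiler_backend("fake")
    print(backend.compile("out.bie", "workdir"))
```

Because `CompilerBackend` is a `Protocol`, the fake class satisfies it structurally without inheriting from anything, which is what lets workers stay unchanged when an implementation is swapped.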
|
## 8) Minimum Viable API Proposal
|
||||||
Keep it minimal to avoid churn:
|
Keep it minimal to avoid churn:
|
||||||
|
|
||||||
```python
|
```python
|
||||||
@@ -260,6 +294,24 @@ class CompilerBackend:

 Then the pipeline is just a pure composition of these two + ONNX ops.

+## 9) What This Enables
+- Replace ONNX converters / optimizers without touching quantization.
+- Run ONNX flow in pure OSS environments (CI, dev) without Kneron binaries.
+- Swap in future Kneron versions only inside backend adapters.
+- Experiment with alternative quantization or compiler backends.
+
+---
+
+## 10) Next Steps (if you want)
+I can draft the following next:
+1) A small refactor plan with concrete file edits and minimal API changes.
+2) A diagram (Mermaid) of the new modular flow.
+3) A compatibility matrix (current vs target dependencies per module).
+
+Tell me which one you want, and I’ll prepare it.
+
+Then the pipeline is just a pure composition of these two + ONNX ops.
+
 ## 7) What This Enables
 - Replace ONNX converters / optimizers without touching quantization.
 - Run ONNX flow in pure OSS environments (CI, dev) without Kneron binaries.
45
docs/refactor_progress.md
Normal file
@@ -0,0 +1,45 @@
+# Refactor Progress Log
+
+## 2026-02-05
+- Started modularization refactor per `docs/flow_modularization_notes.md`.
+- Goal: introduce backend interfaces, decouple ONNX evaluation, keep behavior stable.
+
+### Planned Steps
+1) Create backend interfaces (quantization/compiler, optional evaluator/simulator).
+2) Update ONNX/BIE/NEF workers to use backends and make eval optional.
+3) Review boundaries and document issues.
+
+### Issues / Risks
+- None yet.
+
+## 2026-02-05 Update
+- Added backend interfaces under `services/backends`.
+- ONNX worker now makes IP evaluation optional via `parameters.enable_evaluate`.
+- BIE/NEF workers now call backend interfaces instead of direct `ModelConfig` usage.
+
+### Issues / Risks
+- `services/workers/onnx/core.py` now sets `eval_report` to empty string when disabled; check callers if they rely on non-empty.
+- Quantization backend supports optional `onnx_model` to avoid duplicate optimization.
+
+## 2026-02-05 Update 2
+- Added explicit request flags for evaluator/simulator toggles in worker schemas:
+  - ONNX: `enable_evaluate`, `enable_sim_fp`
+  - BIE: `enable_sim_fixed`
+  - NEF: `enable_sim_hw`
+
+### Issues / Risks
+- Simulator flags are defined but not yet wired to execution paths.
+
+## 2026-02-05 Update 3
+- Documented worker API flags in `README.md` and `docs/Design.md`.
+
+## 2026-02-05 Update 4
+- Set `enable_evaluate` default to `false` to match original Web GUI flow.
+- Documented original Web GUI ON/OFF expectations in `README.md` and `docs/Design.md`.
+
+## 2026-02-05 Update 5
+- Added ON/OFF comparison table for original Web GUI vs current workers in `README.md` and `docs/Design.md`.
+
+## 2026-02-05 Update 6
+- Default `enable_evaluate` in `process_onnx_core` set to `False` to match Web GUI defaults.
+- Full worker test set passed (onnx/bie/nef/e2e/e2e-tflite).
1
services/backends/__init__.py
Normal file
@@ -0,0 +1 @@
+"""Backend interfaces and implementations."""
26
services/backends/compiler.py
Normal file
@@ -0,0 +1,26 @@
+from __future__ import annotations
+
+from typing import Protocol
+
+
+class CompilerBackend(Protocol):
+    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
+        """Compile BIE into NEF and return the generated NEF path."""
+
+
+class KneronCompilerBackend:
+    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
+        import ktc
+
+        km = ktc.ModelConfig(
+            kwargs["model_id"],
+            kwargs["version"],
+            kwargs["platform"],
+            bie_path=bie_path,
+        )
+        return ktc.compile([km], output_dir=output_dir or None)
+
+
+def get_compiler_backend(name: str | None = None) -> CompilerBackend:
+    _ = name
+    return KneronCompilerBackend()
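The Kneron backend reads `kwargs["model_id"]`, `kwargs["version"]`, and `kwargs["platform"]` directly, so a missing key would surface as a bare `KeyError`. One possible way to make that contract explicit is a small value object that validates the three keys up front. The `ModelIdentity` class below is hypothetical and not part of the commit; it is a sketch of the idea only.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ModelIdentity:
    """Hypothetical container for the three values the backends read from kwargs."""

    model_id: int
    version: str
    platform: str

    @classmethod
    def from_kwargs(cls, kwargs: Mapping[str, Any]) -> "ModelIdentity":
        # Report every missing key at once instead of failing on the first lookup.
        missing = [key for key in ("model_id", "version", "platform") if key not in kwargs]
        if missing:
            raise ValueError(f"missing required backend arguments: {', '.join(missing)}")
        return cls(int(kwargs["model_id"]), str(kwargs["version"]), str(kwargs["platform"]))


if __name__ == "__main__":
    print(ModelIdentity.from_kwargs({"model_id": 10, "version": "0001", "platform": "520"}))
```

A backend could call `ModelIdentity.from_kwargs(kwargs)` as its first statement and pass the resulting fields on, keeping the `**kwargs` signature of the interface unchanged.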
26
services/backends/evaluator.py
Normal file
@@ -0,0 +1,26 @@
+from __future__ import annotations
+
+from typing import Protocol
+
+
+class EvaluatorBackend(Protocol):
+    def evaluate(self, onnx_path: str, **kwargs) -> str:
+        """Run IP evaluation and return a report string."""
+
+
+class KneronEvaluatorBackend:
+    def evaluate(self, onnx_path: str, **kwargs) -> str:
+        import ktc
+
+        km = ktc.ModelConfig(
+            kwargs["model_id"],
+            kwargs["version"],
+            kwargs["platform"],
+            onnx_path=onnx_path,
+        )
+        return km.evaluate()
+
+
+def get_evaluator_backend(name: str | None = None) -> EvaluatorBackend:
+    _ = name
+    return KneronEvaluatorBackend()
46
services/backends/quantization.py
Normal file
@@ -0,0 +1,46 @@
+from __future__ import annotations
+
+from typing import Dict, Protocol
+
+
+class QuantizationBackend(Protocol):
+    def analyze(
+        self,
+        onnx_path: str,
+        input_mapping: Dict,
+        output_dir: str,
+        **kwargs,
+    ) -> str:
+        """Run quantization and return the generated BIE path."""
+
+
+class KneronQuantizationBackend:
+    def analyze(
+        self,
+        onnx_path: str,
+        input_mapping: Dict,
+        output_dir: str,
+        **kwargs,
+    ) -> str:
+        import ktc
+
+        model = kwargs.get("onnx_model")
+        if model is None:
+            import onnx
+
+            model = onnx.load(onnx_path)
+            model = ktc.onnx_optimizer.onnx2onnx_flow(model, eliminate_tail=True, opt_matmul=True)
+
+        km = ktc.ModelConfig(
+            kwargs["model_id"],
+            kwargs["version"],
+            kwargs["platform"],
+            onnx_model=model,
+        )
+        return km.analysis(input_mapping, output_dir=output_dir)
+
+
+def get_quantization_backend(name: str | None = None) -> QuantizationBackend:
+    # Placeholder for future backend selection logic.
+    _ = name
+    return KneronQuantizationBackend()
28
services/backends/simulator.py
Normal file
@@ -0,0 +1,28 @@
+from __future__ import annotations
+
+from typing import Protocol, Sequence
+
+
+class SimulatorBackend(Protocol):
+    def simulate(self, input_data: Sequence, **kwargs):
+        """Run E2E simulation and return results."""
+
+
+class KneronSimulatorBackend:
+    def simulate(self, input_data: Sequence, **kwargs):
+        import ktc
+
+        return ktc.kneron_inference(
+            input_data,
+            onnx_file=kwargs.get("onnx_file"),
+            bie_file=kwargs.get("bie_file"),
+            nef_file=kwargs.get("nef_file"),
+            input_names=kwargs.get("input_names"),
+            platform=kwargs.get("platform"),
+            model_id=kwargs.get("model_id"),
+        )
+
+
+def get_simulator_backend(name: str | None = None) -> SimulatorBackend:
+    _ = name
+    return KneronSimulatorBackend()
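The commit states that the simulator flags are defined but not yet wired. One plausible wiring, sketched here under stated assumptions, maps each request flag to the artifact keyword that `simulate()` already accepts: `enable_sim_fp` to `onnx_file`, `enable_sim_fixed` to `bie_file`, and `enable_sim_hw` to `nef_file`. That mapping, the `maybe_simulate` helper, and the `RecordingSimulator` test double are all hypothetical; nothing here calls the real toolchain.

```python
from typing import Any, Dict, Optional, Sequence

# Assumed mapping from request flag to the artifact keyword that simulate() accepts.
FLAG_TO_ARTIFACT_KWARG = {
    "enable_sim_fp": "onnx_file",
    "enable_sim_fixed": "bie_file",
    "enable_sim_hw": "nef_file",
}


class RecordingSimulator:
    """Hypothetical test double with the same simulate() shape as SimulatorBackend."""

    def __init__(self) -> None:
        self.calls = []

    def simulate(self, input_data: Sequence, **kwargs):
        # Record the keyword arguments instead of running any inference.
        self.calls.append(kwargs)
        return []


def maybe_simulate(
    simulator,
    flag: str,
    parameters: Dict[str, Any],
    artifact_path: str,
    input_data: Sequence,
) -> Optional[Any]:
    """Run the simulator only when `flag` is truthy in the request parameters."""
    if not parameters.get(flag, False):
        return None
    return simulator.simulate(
        input_data,
        platform=parameters.get("platform"),
        model_id=parameters.get("model_id"),
        **{FLAG_TO_ARTIFACT_KWARG[flag]: artifact_path},
    )
```

Keeping the flag check in one helper would let the ONNX, BIE, and NEF workers share the same gating logic while each passes its own output artifact.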
@@ -46,14 +46,7 @@ def process_bie_core(
     input_node_height = input_node.type.tensor_type.shape.dim[2].dim_value
     input_node_width = input_node.type.tensor_type.shape.dim[3].dim_value

-    km = ktc.ModelConfig(
-        parameters["model_id"],
-        parameters["version"],
-        parameters["platform"],
-        onnx_model=model,
-    )
-
-    img_list = []
+    img_list = []
     for dir_path, _, file_names in os.walk(data_dir):
         for file_name in file_names:
             fullpath = os.path.join(dir_path, file_name)
@@ -66,7 +59,18 @@ def process_bie_core(
             )
             img_list.append(img_data)

-    bie_model_path = km.analysis({input_node_name: img_list}, output_dir=output_dir or ".")
+    from services.backends.quantization import get_quantization_backend
+
+    backend = get_quantization_backend()
+    bie_model_path = backend.analyze(
+        onnx_file_path,
+        {input_node_name: img_list},
+        output_dir or ".",
+        onnx_model=model,
+        model_id=parameters["model_id"],
+        version=parameters["version"],
+        platform=parameters["platform"],
+    )

     if os.path.abspath(bie_model_path) != os.path.abspath(output_path):
         # Move to avoid keeping duplicate large binaries on disk.
@@ -58,12 +58,16 @@ class HealthResponse(BaseModel):
     timestamp: str
     active_tasks: int

 class BIEProcessRequest(BaseModel):
     onnx_file_id: str
     model_id: int = Field(..., ge=1, le=65535)
     version: str = Field(..., regex=r'^[0-9a-fA-F]{4}$')
     platform: str = Field(..., regex=r'^(520|720|530|630|730)$')
     data_dir: str = Field(..., min_length=1)
+    enable_sim_fixed: bool = Field(
+        False,
+        description="Run fixed-point E2E simulation after quantization (not yet wired).",
+    )

 class TaskStatusResponse(BaseModel):
     task_id: str
@@ -23,16 +23,16 @@ def process_nef_core(
     os.environ.setdefault("KTC_WORKDIR", work_dir)
     os.environ.setdefault("KTC_SCRIPT_RES", res_dir)

-    import ktc
+    from services.backends.compiler import get_compiler_backend

-    km = ktc.ModelConfig(
-        parameters["model_id"],
-        parameters["version"],
-        parameters["platform"],
-        bie_path=bie_file_path,
-    )
-
-    nef_model_path = ktc.compile([km], output_dir=output_dir or None)
+    backend = get_compiler_backend()
+    nef_model_path = backend.compile(
+        bie_file_path,
+        output_dir or None,
+        model_id=parameters["model_id"],
+        version=parameters["version"],
+        platform=parameters["platform"],
+    )
     if os.path.abspath(nef_model_path) != os.path.abspath(output_path):
         # Move to avoid keeping duplicate large binaries on disk.
         shutil.move(str(nef_model_path), output_path)
@@ -58,11 +58,15 @@ class HealthResponse(BaseModel):
     timestamp: str
     active_tasks: int

 class NEFProcessRequest(BaseModel):
     bie_file_id: str
     model_id: int = Field(..., ge=1, le=65535)
     version: str = Field(..., regex=r'^[0-9a-fA-F]{4}$')
     platform: str = Field(..., regex=r'^(520|720|530|630|730)$')
+    enable_sim_hw: bool = Field(
+        False,
+        description="Run hardware E2E simulation after compilation (not yet wired).",
+    )

 class TaskStatusResponse(BaseModel):
     task_id: str
@@ -4,11 +4,11 @@ from typing import Dict, Any
 import onnx


 def process_onnx_core(
     input_paths: Dict[str, str],
     output_path: str,
     parameters: Dict[str, Any],
 ) -> Dict[str, Any]:
     file_path = input_paths["file_path"]
     if not os.path.exists(file_path):
         raise FileNotFoundError(f"Input file not found: {file_path}")
@@ -36,16 +36,20 @@ def process_onnx_core(
     model = ktc.onnx_optimizer.onnx2onnx_flow(model, eliminate_tail=True, opt_matmul=True)
     onnx.save(model, output_path)

-    km = ktc.ModelConfig(
-        int(parameters["model_id"]),
-        str(parameters["version"]),
-        str(parameters["platform"]),
-        onnx_model=model,
-    )
-    evaluate_result = km.evaluate()
-    eval_result = evaluate_result.split(",")[0]
+    eval_result = ""
+    if parameters.get("enable_evaluate", False):
+        from services.backends.evaluator import get_evaluator_backend
+
+        evaluator = get_evaluator_backend()
+        evaluate_result = evaluator.evaluate(
+            output_path,
+            model_id=int(parameters["model_id"]),
+            version=str(parameters["version"]),
+            platform=str(parameters["platform"]),
+        )
+        eval_result = evaluate_result.split(",")[0]

     return {
         "file_path": output_path,
         "file_size": os.path.getsize(output_path),
         "eval_report": eval_result,
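The gating in this hunk means `eval_report` is an empty string whenever `enable_evaluate` is off, which the progress log flags as a risk for callers. The sketch below reproduces the gating with a stub evaluator and shows one way a caller might treat the empty value. `StubEvaluator`, its made-up report text, and the `describe` helper are hypothetical; only the gating logic mirrors the diff.

```python
from typing import Any, Dict


class StubEvaluator:
    """Hypothetical evaluator; the report text is invented for illustration."""

    def evaluate(self, onnx_path: str, **kwargs) -> str:
        return "summary line,details"


def eval_report_for(parameters: Dict[str, Any], onnx_path: str, evaluator) -> str:
    """Mirror the hunk's gating: empty report unless enable_evaluate is set."""
    if not parameters.get("enable_evaluate", False):
        return ""
    report = evaluator.evaluate(
        onnx_path,
        model_id=int(parameters["model_id"]),
        version=str(parameters["version"]),
        platform=str(parameters["platform"]),
    )
    # As in the diff, only the part before the first comma is kept.
    return report.split(",")[0]


def describe(result: Dict[str, Any]) -> str:
    """Caller-side handling: treat an empty eval_report as 'evaluation skipped'."""
    return result["eval_report"] or "(evaluation skipped)"
```

A caller that previously assumed a non-empty report can branch on the empty string as `describe` does, rather than parsing it unconditionally.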
@@ -67,11 +67,19 @@ class FileUploadResponse(BaseModel):
     file_path: str
     message: str

 class ONNXProcessRequest(BaseModel):
     file_id: str
     model_id: int = Field(..., ge=1, le=65535)
     version: str = Field(..., regex=r'^[0-9a-fA-F]{4}$')
     platform: str = Field(..., regex=r'^(520|720|530|630|730)$')
+    enable_evaluate: bool = Field(
+        False,
+        description="Run IP evaluator (toolchain) after ONNX optimization.",
+    )
+    enable_sim_fp: bool = Field(
+        False,
+        description="Run floating-point E2E simulation (not yet wired).",
+    )

 class TaskStatusResponse(BaseModel):
     task_id: str
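The request schema constrains `model_id`, `version`, and `platform` through `Field` arguments. To make those constraints concrete without depending on Pydantic, the following standard-library sketch applies the same bounds and patterns. The `validate_model_fields` helper and its error messages are hypothetical; the numeric range and the two regular expressions are copied from the schema.

```python
import re

# Patterns copied from the request schema in this commit.
VERSION_RE = re.compile(r"^[0-9a-fA-F]{4}$")
PLATFORM_RE = re.compile(r"^(520|720|530|630|730)$")


def validate_model_fields(model_id: int, version: str, platform: str) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if not 1 <= model_id <= 65535:
        errors.append("model_id must be in 1..65535")
    if not VERSION_RE.fullmatch(version):
        errors.append("version must be exactly 4 hex digits")
    if not PLATFORM_RE.fullmatch(platform):
        errors.append("platform must be one of 520, 720, 530, 630, 730")
    return errors


if __name__ == "__main__":
    print(validate_model_fields(10, "0001", "520"))
    print(validate_model_fields(10, "e2e", "520"))
```

A value such as `"e2e"` fails the version pattern, so code that calls the core functions directly with such values is bypassing the API-level validation.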
@@ -43,7 +43,13 @@ def test_worker_flow_e2e_uses_single_workdir():
     work_input_file = work_inputs[0]

     onnx_output = work_dir / "out.onnx"
-    onnx_params = {"model_id": 10, "version": "e2e", "platform": "520", "work_dir": str(work_dir)}
+    onnx_params = {
+        "model_id": 10,
+        "version": "e2e",
+        "platform": "520",
+        "work_dir": str(work_dir),
+        "enable_evaluate": False,
+    }
     onnx_result = process_onnx_core(
         {"file_path": str(work_input_file)},
         str(onnx_output),
@@ -43,7 +43,13 @@ def test_worker_flow_e2e_tflite_uses_single_workdir():
     work_input_file = work_inputs[0]

     onnx_output = work_dir / "out.onnx"
-    onnx_params = {"model_id": 20, "version": "e2e-tflite", "platform": "520", "work_dir": str(work_dir)}
+    onnx_params = {
+        "model_id": 20,
+        "version": "e2e-tflite",
+        "platform": "520",
+        "work_dir": str(work_dir),
+        "enable_evaluate": False,
+    }
     onnx_result = process_onnx_core(
         {"file_path": str(work_input_file)},
         str(onnx_output),