Add initial draft of Toolchain Flow Modularization Notes
parent e93a1a5996
commit bc98456d74

docs/flow_modularization_notes.md (Normal file, 277 lines)
@ -0,0 +1,277 @@

# Toolchain Flow Modularization Notes (Draft)

Based on `docs/manual_v0.17.2.pdf` and the current repo structure, this note outlines a modular decomposition plan, its risks, and a staged engineering approach that reduces coupling to Kneron prebuilt binaries while preserving the end-to-end flow.

## 1) Project Plan Review (Your Proposal)

Your plan is sound and low-risk:

1) **Split the steps into modules and get the complete flow working first** → keeps the toolchain usable and regression-testable.
2) **Review each module in turn to decide whether it can be rewritten or rebuilt** → focuses effort on the highest-risk dependency points.

This sequencing avoids a “big bang rewrite.” It also lets you replace a single module without breaking downstream steps.

## 2) Manual-Driven Flow Stages (v0.17.2)

From the manual, the official workflow breaks down cleanly into these steps:

**ONNX Workflow**
- A. Model conversion to ONNX (Keras / PyTorch / Caffe / TFLite)
- B. ONNX optimization (general optimizer)
- C. Opset upgrade (if needed)
- D. IP evaluation (performance / support check)
- E. E2E simulator check (floating point)

**BIE Workflow**
- F. Quantization (analysis → produce BIE)
- G. E2E simulator check (fixed point)

**NEF Workflow**
- H. Batch compile (BIE → NEF)
- I. E2E simulator check (hardware)
- J. NEF combine (optional)

This mapping is consistent with the current repo services:
- `services/workers/onnx/core.py`: A/B (plus D, currently via `evaluate`)
- `services/workers/bie/core.py`: F (plus G, if you add it)
- `services/workers/nef/core.py`: H (plus I, if you add it)

## 2.1 Flow Diagram (Mermaid)

```mermaid
flowchart TD
  A[Format Model<br/>Keras/TFLite/Caffe/PyTorch] --> B[Convert to ONNX]
  B --> C[ONNX Optimize / Opset Upgrade / Graph Edits]
  C --> D{IP Evaluation?}
  D -- optional --> E[IP Evaluator Report]
  C --> F{E2E FP Sim?}
  F -- optional --> G[Float E2E Simulator Check]
  C --> H[Quantization / Analysis]
  H --> I[BIE Output]
  I --> J{E2E Fixed-Point Sim?}
  J -- optional --> K[Fixed-Point E2E Check]
  I --> L[Compile]
  L --> M[NEF Output]
  M --> N{E2E Hardware Sim?}
  N -- optional --> O[Hardware E2E Check]
  M --> P{NEF Combine?}
  P -- optional --> Q[Combined NEF]

  subgraph OSS-Friendly
    A
    B
    C
  end

  subgraph Kneron-Dependent
    D
    E
    F
    G
    H
    I
    J
    K
    L
    M
    N
    O
    P
    Q
  end
```

## 3) Recommended Module Split (Initial)

A clean split with minimal coupling and clear replacement points:

### 3.1 Format & Graph Layer (OSS-Friendly)
1. **FormatConverters**
   - Keras→ONNX, TFLite→ONNX, Caffe→ONNX, PyTorch-exported ONNX
   - Pure Python; uses `libs/ONNX_Convertor` + `libs/kneronnxopt`

2. **OnnxGraphOps**
   - Optimize (onnx2onnx), opset upgrade, graph editing, shape fixes
   - Pure Python, independent of the toolchain binaries

### 3.2 Validation & Simulation Layer (Kneron-Dependent)
3. **IPEvaluator**
   - `ModelConfig.evaluate()` (toolchain evaluator)
   - Coupled to `sys_flow` + prebuilt binaries
   - Should be an optional plug-in, not a hard dependency of ONNX conversion

4. **E2ESimulator**
   - Float / fixed-point / hardware validation (kneron_inference)
   - Coupled to Kneron libs; keep as a plug-in backend
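A minimal sketch of what "optional plug-in" could mean in practice: the evaluator is looked up at runtime and the step is skipped when it is not installed. The helper names `load_optional_plugin` and `evaluate_if_available` are hypothetical, and the evaluator is modeled as a plain callable rather than the real `ModelConfig.evaluate()` API.

```python
import importlib
import importlib.util
from typing import Any, Optional


def load_optional_plugin(module_name: str, attr: str) -> Optional[Any]:
    """Return `module_name.attr` if the module can be imported, else None."""
    try:
        if importlib.util.find_spec(module_name) is None:
            return None
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        # Missing parent package, broken install, or a module without a spec.
        return None
    return getattr(module, attr, None)


def evaluate_if_available(onnx_path: str, evaluator: Optional[Any]) -> Optional[Any]:
    """Run the evaluator plug-in when present; otherwise skip the step."""
    if evaluator is None:
        return None
    return evaluator(onnx_path)
```

With this shape, ONNX conversion never imports the Kneron evaluator at module load time, so it still runs on machines where the prebuilt binaries are absent.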

### 3.3 Quantization & Compile Layer (Kneron-Dependent)
5. **QuantizationBackend** (BIE)
   - Current: `ModelConfig.analysis()` → sys_flow binaries
   - Define a backend interface with a Kneron implementation

6. **CompilerBackend** (NEF)
   - Current: `ktc.compile()` → prebuilt compiler
   - Same backend interface style; Kneron implementation for now

### 3.4 Packaging/Orchestration Layer
7. **Pipeline Orchestrator**
   - Defines the sequence and exchange formats (ONNX, BIE, NEF)
   - Should not import Kneron libs directly; only through backend interfaces
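One way to keep the "no direct Kneron imports" rule from eroding is a small static check that can run in CI. The sketch below parses a module's source with the standard `ast` module and reports forbidden imports. The set of forbidden module roots is an assumption based on the names used in this note; adjust it to the real import names.

```python
import ast
from typing import Iterable, List

# Assumed import roots for Kneron-coupled code; not verified against the repo.
FORBIDDEN_ROOTS = frozenset({"ktc", "sys_flow", "kneron_inference"})


def forbidden_imports(source: str, forbidden: Iterable[str] = FORBIDDEN_ROOTS) -> List[str]:
    """Return the sorted, de-duplicated forbidden modules imported by `source`."""
    roots = set(forbidden)
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # `from . import x` has module=None; treat it as not forbidden.
            names = [node.module or ""]
        else:
            continue
        found.update(name for name in names if name.split(".")[0] in roots)
    return sorted(found)
```

A test could read the orchestrator's source file and assert that `forbidden_imports(...)` returns an empty list.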

## 3.5 Module Dependency / Replaceability Matrix

| Module | Inputs | Outputs | Current Dependency | Replaceability | Notes |
|---|---|---|---|---|---|
| FormatConverters | model files | ONNX | `libs/ONNX_Convertor` (OSS) | High | Already OSS; keep isolated. |
| OnnxGraphOps | ONNX | ONNX | `libs/kneronnxopt`, onnx | High | Pure Python, safe to refactor. |
| IPEvaluator | ONNX/BIE | report | `sys_flow` + prebuilt bins | Low | Optional plug-in; avoid a hard dependency in the ONNX flow. |
| E2ESimulator FP | ONNX | results | Kneron inference libs | Low | Optional; keep as plug-in backend. |
| QuantizationBackend | ONNX + inputs | BIE | `sys_flow` + prebuilt bins | Low | Core Kneron dependency. |
| E2ESimulator Fixed | BIE | results | Kneron inference libs | Low | Optional; can be skipped in web flow. |
| CompilerBackend | BIE | NEF | `compiler/*` prebuilt | Low | Core Kneron dependency. |
| E2ESimulator HW | NEF | results | Kneron inference libs | Low | Optional; likely external toolchain use. |
| NEFCombine | NEF list | NEF | Kneron utils | Medium | Small wrapper; keep separate. |
| Pipeline Orchestrator | modules | end-to-end | None (pure) | High | Ownable; should be OSS-only. |

## 4) Key Risks & Coupling Points (Observed in Repo)
- `ktc.toolchain` calls `sys_flow` / `sys_flow_v2` (hard dependency on prebuilt binaries).
- `ktc.ModelConfig.evaluate/analysis/compile` are all Kneron-specific.
- `services/workers/onnx/core.py` calls `evaluate()` by default, which ties the ONNX flow to Kneron.

## 5) Suggested Refactor Sequence (Low Disruption)

**Phase 1: Interface Extraction**
- Introduce two small interfaces:
  - `QuantizationBackend` (BIE)
  - `CompilerBackend` (NEF)
- Wrap the existing Kneron calls as the default implementations.

**Phase 2: ONNX Flow Decoupling**
- Make `IPEvaluator` optional in the ONNX flow.
- Keep the current behavior by default, but allow a bypass.

**Phase 3: Modular Pipeline Assembly**
- Build a pipeline that composes:
  - conversion → optimization → (optional evaluator)
  - quantization backend
  - compiler backend

**Phase 4: Replaceability Audit**
- For each module, decide whether it:
  - can be OSS (conversion/optimization)
  - must remain a Kneron backend (quantization/compile)
  - can be partially replaced (simulation/eval)

## 5.1 Concrete Refactor Plan (Minimal Interface Changes)

Goal: preserve current behavior but make evaluation/simulation optional and enable backend swapping.

### Step 1: Introduce backend interfaces (no behavior change)
Create simple interfaces and wrappers.
- New files:
  - `services/backends/quantization.py`
  - `services/backends/compiler.py`
  - (optional) `services/backends/evaluator.py`
  - (optional) `services/backends/simulator.py`

Minimal interface (example):
```python
class QuantizationBackend:
    def analyze(self, onnx_path: str, input_mapping: dict, output_dir: str, **kwargs) -> str: ...

class CompilerBackend:
    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str: ...
```

Implement Kneron-backed versions wrapping existing calls:
- `KneronQuantizationBackend` → `ktc.ModelConfig(...).analysis(...)`
- `KneronCompilerBackend` → `ktc.compile(...)`
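The wrappers might look roughly like the sketch below. Because this note does not spell out the arguments of `ktc.ModelConfig(...)` or `ktc.compile(...)`, the sketch deliberately avoids guessing them: the caller injects a `make_model_config` factory and a `compile_fn` callable (both hypothetical names), and the adapter only assumes that `analysis(...)` and the compile callable return the path of the artifact they produced. The `_collect` helper, which moves the artifact into `output_dir`, is likewise an illustration of the "output file moving rules" that belong in the backend.

```python
import shutil
from pathlib import Path
from typing import Any, Callable


def _collect(produced: str, output_dir: str) -> str:
    """Move a toolchain artifact into output_dir and return its new path."""
    dest_dir = Path(output_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(produced).name
    shutil.move(str(produced), str(dest))
    return str(dest)


class KneronQuantizationBackend:
    """Adapter around an injected ModelConfig factory (assumed to wrap ktc)."""

    def __init__(self, make_model_config: Callable[[str], Any]) -> None:
        self._make_model_config = make_model_config

    def analyze(self, onnx_path: str, input_mapping: dict, output_dir: str, **kwargs) -> str:
        km = self._make_model_config(onnx_path)
        # Assumption: analysis() returns the path of the generated BIE file.
        return _collect(km.analysis(input_mapping, **kwargs), output_dir)


class KneronCompilerBackend:
    """Adapter around an injected compile callable (assumed to wrap ktc.compile)."""

    def __init__(self, compile_fn: Callable[..., str]) -> None:
        self._compile_fn = compile_fn

    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
        # Assumption: the callable returns the path of the generated NEF file.
        return _collect(self._compile_fn(bie_path, **kwargs), output_dir)
```

Injecting the Kneron calls keeps these adapters importable and testable on machines without the toolchain; only the code that constructs them needs `ktc`.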

### Step 2: Decouple ONNX flow from evaluator (optional switch)
Modify `services/workers/onnx/core.py`:
- Add a parameter `enable_evaluate` (default `True`, to preserve current behavior).
- Guard `km.evaluate()` behind the flag.
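The guard itself is small. A sketch, assuming `km` is the model-config object the worker already holds and that `evaluate()` takes no arguments (the function name `run_evaluation_step` is hypothetical):

```python
from typing import Any, Optional


def run_evaluation_step(km: Any, enable_evaluate: bool = True) -> Optional[Any]:
    """Call km.evaluate() only when the flag is set; return None when skipped."""
    if not enable_evaluate:
        return None
    return km.evaluate()
```

Callers that omit the flag get today's behavior, while an OSS-only environment passes `enable_evaluate=False` and never touches the Kneron evaluator.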

### Step 3: Replace direct calls in BIE/NEF workers
Modify:
- `services/workers/bie/core.py` to use `QuantizationBackend`.
- `services/workers/nef/core.py` to use `CompilerBackend`.

### Step 4: Optional simulator integration
Add optional steps to the workers:
- `enable_sim_fp` in the ONNX flow.
- `enable_sim_fixed` in the BIE flow.
- `enable_sim_hw` in the NEF flow.

Each flag should call a simulator backend and default to off.
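As a sketch, the three flags can live in one small settings object, with a helper that calls the simulator only when its flag is on. The flag names come from the list above; `SimulationFlags`, `maybe_simulate`, and the simulator signature (artifact path in, results dict out) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical simulator contract: artifact path in, results dict out.
Simulator = Callable[[str], dict]


@dataclass(frozen=True)
class SimulationFlags:
    enable_sim_fp: bool = False      # ONNX flow
    enable_sim_fixed: bool = False   # BIE flow
    enable_sim_hw: bool = False      # NEF flow


def maybe_simulate(enabled: bool, simulator: Optional[Simulator], artifact_path: str) -> Optional[dict]:
    """Run the simulator only when enabled; fail loudly if none is configured."""
    if not enabled:
        return None
    if simulator is None:
        raise RuntimeError("simulation requested but no simulator backend is configured")
    return simulator(artifact_path)
```

Raising when a flag is on but no backend is wired up avoids silently skipping a check that the user explicitly asked for.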

### Step 5: Pipeline orchestrator (optional)
Add a thin orchestrator module to compose stages:
- `services/pipeline/toolchain_pipeline.py`
- Allows swapping backends from config/env.
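Swapping from config/env could be as simple as a name-to-factory registry plus one lookup. In the sketch below, `select_backend` and the environment variable name `TOOLCHAIN_QUANT_BACKEND` are hypothetical; the factories are zero-argument callables so that Kneron code is only constructed when that backend is actually selected.

```python
import os
from typing import Callable, Mapping, TypeVar

T = TypeVar("T")


def select_backend(
    registry: Mapping[str, Callable[[], T]],
    env_var: str,
    default: str,
    environ: Mapping[str, str] = os.environ,
) -> T:
    """Build the backend named by `env_var`, falling back to `default`."""
    name = environ.get(env_var, default)
    if name not in registry:
        raise ValueError(f"unknown backend {name!r}; available: {sorted(registry)}")
    return registry[name]()
```

Usage might look like `select_backend(QUANT_BACKENDS, "TOOLCHAIN_QUANT_BACKEND", default="kneron")`, where `QUANT_BACKENDS` is a dict the orchestrator owns.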

### File Touch List (Minimal)
1) `services/workers/onnx/core.py` (optional eval toggle)
2) `services/workers/bie/core.py` (use QuantizationBackend)
3) `services/workers/nef/core.py` (use CompilerBackend)
4) `services/backends/quantization.py` (new)
5) `services/backends/compiler.py` (new)
6) (optional) `services/backends/evaluator.py` (new)
7) (optional) `services/backends/simulator.py` (new)

## 6) Coupling Rules / Extraction Guidelines

Goal: keep module boundaries stable so that swapping a backend later does not cascade changes into other modules.

### 6.1 Horizontal vs Vertical Coupling (Rule of Thumb)
- **Horizontal coupling = avoid** (core A directly importing core B).
- **Vertical references = allowed** (shared types/config/IO schemas used by all).

Keep shared references narrowly scoped to:
- Interface contracts (Protocol / abstract base classes)
- Common data structures (DTOs / results / error types)
- Configuration schemas and environment keys
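A sketch of what such a shared module might contain, using `typing.Protocol` for the contracts and a frozen dataclass for the result DTO. The field names of `PipelineResult` and the `BackendError` type are assumptions; only the two method signatures come from this note.

```python
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@runtime_checkable
class QuantizationBackend(Protocol):
    def analyze(self, onnx_path: str, input_mapping: dict, output_dir: str, **kwargs) -> str:
        """Return the path of the produced BIE file."""
        ...


@runtime_checkable
class CompilerBackend(Protocol):
    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
        """Return the path of the produced NEF file."""
        ...


class BackendError(RuntimeError):
    """Hypothetical shared error type raised by any backend implementation."""


@dataclass(frozen=True)
class PipelineResult:
    # Assumed fields: one path per exchange format, filled in as stages complete.
    onnx_path: str
    bie_path: Optional[str] = None
    nef_path: Optional[str] = None
```

Because the contracts are structural, a backend does not need to inherit from them, so the Kneron adapters and any OSS replacement depend only on this shared module and never on each other. Note that `isinstance` checks against a `runtime_checkable` protocol verify only that the method exists, not its signature.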

### 6.2 What Belongs in Shared vs Module
**Put code in shared only if it remains stable across backend swaps.**
- If replacing a backend would require changing a piece of code, that code does *not* belong in shared.

**Examples**
- Shared: `QuantizationBackend` interface, `CompilerBackend` interface, `PipelineResult`
- Module: Kneron-specific env setup, sys_flow invocation, output file moving rules

### 6.3 Extracting Logic from Combined Files
If a file currently mixes multiple module responsibilities:
- Extract module-specific logic into the owning module.
- Keep the shared file limited to interfaces and cross-cutting types.

**Decision test**
- *Would this code change if we swapped the Kneron backend for an OSS backend?*
  - Yes → belongs to the module backend
  - No → can live in shared

### 6.4 Incremental Refactor Guidance
- Don’t attempt perfect separation in one pass.
- Move high-risk dependencies first (prebuilt calls, sys_flow usage).
- After each phase, re-check the boundaries and adjust.

## 7) Minimum Viable API Proposal

Keep it minimal to avoid churn:

```python
class QuantizationBackend:
    def analyze(self, onnx_path: str, input_mapping: dict, output_dir: str, **kwargs) -> str:
        """Return BIE path"""

class CompilerBackend:
    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
        """Return NEF path"""
```

The pipeline is then a pure composition of these two backends plus the ONNX ops.

## 8) What This Enables
- Replace ONNX converters / optimizers without touching quantization.
- Run the ONNX flow in pure OSS environments (CI, dev) without Kneron binaries.
- Contain future Kneron version upgrades inside the backend adapters.
- Experiment with alternative quantization or compiler backends.
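To make the composition and the swapping concrete, here is a minimal sketch in which the pipeline is driven by fake backends that only fabricate output paths. `run_pipeline` and the two fake classes are hypothetical; a real run would pass Kneron-backed implementations of the same two methods.

```python
from typing import Any, Dict


class FakeQuantizationBackend:
    """Stand-in for CI: pretends to quantize and returns a BIE path."""

    def analyze(self, onnx_path: str, input_mapping: dict, output_dir: str, **kwargs) -> str:
        return f"{output_dir}/model.bie"


class FakeCompilerBackend:
    """Stand-in for CI: pretends to compile and returns a NEF path."""

    def compile(self, bie_path: str, output_dir: str, **kwargs) -> str:
        return f"{output_dir}/model.nef"


def run_pipeline(
    onnx_path: str,
    input_mapping: dict,
    output_dir: str,
    quantizer: Any,
    compiler: Any,
) -> Dict[str, str]:
    """Compose the two backends; the orchestrator never imports Kneron code."""
    bie_path = quantizer.analyze(onnx_path, input_mapping, output_dir)
    nef_path = compiler.compile(bie_path, output_dir)
    return {"onnx": onnx_path, "bie": bie_path, "nef": nef_path}
```

Calling `run_pipeline("model.onnx", {}, "out", FakeQuantizationBackend(), FakeCompilerBackend())` exercises the orchestration logic without any prebuilt binaries, which is what allows the flow to be regression-tested in a plain CI environment.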

---

## Next Steps (if you want)
I can draft the following next:
1) A small refactor plan with concrete file edits and minimal API changes.
2) A diagram (Mermaid) of the new modular flow.
3) A compatibility matrix (current vs target dependencies per module).

Tell me which one you want, and I’ll prepare it.