
# Cluster4NPU Pipeline TODO
## Current Status
- **Pipeline Core**: Multi-stage pipeline with device auto-detection working
- **Hardware Integration**: Kneron NPU dongles connecting and initializing successfully
- **Auto-resize Preprocessing**: Model input shape detection and automatic preprocessing implemented
- **Data Input Sources**: Camera and video file inputs implemented
- **Result Persistence**: Result saving to file implemented
- **End-to-End Workflow**: UI configuration now connects to core pipeline execution
- **Bug Fixes**: Addressed file path and data processing issues
- **Real-time Viewer**: Implemented a live view for real-time inference visualization
---
## Priority 1: Essential Components for Complete Inference Workflow
### 1. Data Source Implementation
**Status**: 🔴 Critical Missing Components
**Location**: Need to create new classes in `core/functions/` or extend existing ones
#### 1.1 Camera Input Source
- **File**: `core/functions/camera_source.py` (new)
- **Class**: `CameraSource`
- **Purpose**: Wrapper around cv2.VideoCapture for camera input
- **Integration**: Connect to InferencePipeline.put_data()
- **Features**:
  - Multiple camera index support
  - Resolution and FPS configuration
  - Format conversion (BGR → model input format)
  - Error handling for camera disconnection
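
The wrapper above could be sketched as follows. `CameraSource` and its constructor parameters come from this TODO; the reconnect loop and defaults are illustrative assumptions, and OpenCV is imported lazily so the class can be defined without it:

```python
import time

class CameraSource:
    """Minimal sketch of the cv2.VideoCapture wrapper described above."""

    def __init__(self, index=0, width=640, height=480, fps=30):
        self.index = index
        self.width = width
        self.height = height
        self.fps = fps
        self._cap = None

    def open(self):
        import cv2  # lazy import: class is usable for config before OpenCV loads
        self._cap = cv2.VideoCapture(self.index)
        self._cap.set(cv2.CAP_PROP_FRAME_WIDTH, self.width)
        self._cap.set(cv2.CAP_PROP_FRAME_HEIGHT, self.height)
        self._cap.set(cv2.CAP_PROP_FPS, self.fps)
        return self._cap.isOpened()

    def frames(self, max_retries=3):
        """Yield BGR frames; try to reopen the camera when reads fail."""
        retries = 0
        while retries <= max_retries:
            ok, frame = self._cap.read()
            if ok:
                retries = 0
                yield frame
            else:
                retries += 1          # camera disconnected: back off and reopen
                time.sleep(0.5)
                self.open()

    def close(self):
        if self._cap is not None:
            self._cap.release()
            self._cap = None
```

Frames yielded here would be handed to `InferencePipeline.put_data()` by the orchestration layer.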
#### 1.2 Video File Input Source
- **File**: `core/functions/video_source.py` (new)
- **Class**: `VideoFileSource`
- **Purpose**: Process video files frame by frame
- **Integration**: Feed frames to InferencePipeline
- **Features**:
  - Support common video formats (MP4, AVI, MOV)
  - Frame rate control and seeking
  - Batch processing capabilities
  - Progress tracking
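
A minimal sketch of the video-file source, assuming a `frame_stride` parameter for frame-rate control and seeking via `CAP_PROP_POS_FRAMES` (both illustrative, not existing APIs of this project):

```python
class VideoFileSource:
    """Sketch of the frame-by-frame video reader described above."""

    def __init__(self, path, frame_stride=1):
        self.path = path
        self.frame_stride = max(1, frame_stride)  # e.g. 2 = process every 2nd frame

    @staticmethod
    def progress(frame_idx, total_frames):
        """Fraction of the file processed, clamped to [0, 1]."""
        if total_frames <= 0:
            return 0.0
        return min(1.0, frame_idx / total_frames)

    def frames(self, start_frame=0):
        import cv2  # lazy import, as in the other source sketches
        cap = cv2.VideoCapture(self.path)
        total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
        if start_frame:
            cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)  # seek support
        idx = start_frame
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if (idx - start_frame) % self.frame_stride == 0:
                yield idx, self.progress(idx, total), frame
            idx += 1
        cap.release()
```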
#### 1.3 Image File Input Source
- **File**: `core/functions/image_source.py` (new)
- **Class**: `ImageFileSource`
- **Purpose**: Process single images or image directories
- **Integration**: Single-shot inference through pipeline
- **Features**:
  - Support common image formats (JPG, PNG, BMP)
  - Batch directory processing
  - Image validation and error handling
#### 1.4 RTSP/HTTP Stream Source
- **File**: `core/functions/stream_source.py` (new)
- **Class**: `RTSPSource`, `HTTPStreamSource`
- **Purpose**: Process live video streams
- **Integration**: Real-time streaming to pipeline
- **Features**:
  - Stream connection management
  - Reconnection on failure
  - Buffer management and frame dropping
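
The reconnect-on-failure behavior might use capped exponential backoff, as in this sketch (the schedule and cap are assumptions):

```python
class RTSPSource:
    """Sketch of the live-stream source described above."""

    def __init__(self, url, max_backoff=30.0):
        self.url = url
        self.max_backoff = max_backoff

    def backoff_delays(self, attempts):
        """Exponential reconnect schedule in seconds, capped at max_backoff."""
        return [min(2 ** i, self.max_backoff) for i in range(attempts)]

    def frames(self, max_attempts=5):
        import cv2, time  # lazy import
        for delay in [0] + self.backoff_delays(max_attempts - 1):
            time.sleep(delay)
            cap = cv2.VideoCapture(self.url)
            while cap.isOpened():
                ok, frame = cap.read()
                if not ok:
                    break  # stream dropped: fall through and reconnect
                yield frame
            cap.release()
```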
### 2. Result Persistence System
**Status**: 🔴 Critical Missing Components
**Location**: `core/functions/result_handler.py` (new)
#### 2.1 Result Serialization
- **Class**: `ResultSerializer`
- **Purpose**: Convert inference results to standard formats
- **Features**:
  - JSON export with timestamps
  - CSV export for analytics
  - Binary format for performance
  - Configurable fields and formatting
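
The JSON and CSV exports could be sketched with the standard library alone; the result-dict schema (`label`, `score`, etc.) is illustrative:

```python
import csv, io, json, time

class ResultSerializer:
    """Sketch of the serializer described above."""

    @staticmethod
    def to_json(result, timestamp=None):
        """Wrap one result dict with a timestamp and render it as a JSON line."""
        payload = {"timestamp": time.time() if timestamp is None else timestamp}
        payload.update(result)
        return json.dumps(payload)

    @staticmethod
    def to_csv(results, fields):
        """Render result dicts as CSV, keeping only the configured fields."""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(results)
        return buf.getvalue()
```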
#### 2.2 File Output Manager
- **Class**: `FileOutputManager`
- **Purpose**: Handle result file writing and organization
- **Features**:
  - Timestamped file naming
  - Directory organization by date/pipeline
  - File rotation and cleanup
  - Output format configuration
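
The naming and directory layout could follow a `<root>/<pipeline>/<YYYY-MM-DD>/<HHMMSS_us>.<ext>` convention, as in this sketch (the convention itself is an assumption):

```python
from datetime import datetime
from pathlib import Path

class FileOutputManager:
    """Sketch of the output manager described above."""

    def __init__(self, root, pipeline_name):
        self.root = Path(root)
        self.pipeline_name = pipeline_name

    def output_path(self, ext="json", now=None):
        """Build a timestamped path organized by pipeline and date."""
        now = now or datetime.now()
        day_dir = self.root / self.pipeline_name / now.strftime("%Y-%m-%d")
        return day_dir / (now.strftime("%H%M%S_%f") + f".{ext}")

    def write(self, text, ext="json"):
        path = self.output_path(ext)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text)
        return path
```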
#### 2.3 Real-time Result Streaming
- **Class**: `ResultStreamer`
- **Purpose**: Stream results to external systems
- **Features**:
  - WebSocket result broadcasting
  - REST API endpoints
  - Message queue integration (Redis, RabbitMQ)
  - Custom callback system
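
The custom callback system is the piece the other backends can build on: WebSocket, REST, and queue publishers would each register as a callback. A minimal sketch, with error isolation so one failing subscriber cannot block the rest:

```python
import traceback

class ResultStreamer:
    """Sketch of the callback-based streamer described above."""

    def __init__(self):
        self._callbacks = []

    def register(self, callback):
        self._callbacks.append(callback)

    def publish(self, result):
        """Fan one result out to all subscribers; return the delivery count."""
        delivered = 0
        for cb in self._callbacks:
            try:
                cb(result)
                delivered += 1
            except Exception:
                traceback.print_exc()  # isolate a failing subscriber
        return delivered
```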
### 3. Input/Output Integration Bridge
**Status**: 🔴 Critical Missing Components
**Location**: `core/functions/pipeline_manager.py` (new)
#### 3.1 Pipeline Configuration Manager
- **Class**: `PipelineConfigManager`
- **Purpose**: Convert UI configurations to executable pipelines
- **Integration**: Bridge between UI and core pipeline
- **Features**:
  - Parse UI node configurations
  - Instantiate appropriate data sources
  - Configure result handlers
  - Manage pipeline lifecycle
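
The parsing step could be sketched as below. The node schema (`{"type": ..., "params": ...}`) is an assumption, not the real UI format:

```python
class PipelineConfigManager:
    """Sketch of the UI-to-pipeline bridge described above."""

    SOURCE_TYPES = {"camera", "video", "image", "stream"}
    OUTPUT_TYPES = {"file", "callback"}

    def parse(self, nodes):
        """Split UI node configs into source and output specs, rejecting unknowns."""
        sources, outputs = [], []
        for node in nodes:
            kind = node.get("type")
            if kind in self.SOURCE_TYPES:
                sources.append(node)
            elif kind in self.OUTPUT_TYPES:
                outputs.append(node)
            else:
                raise ValueError(f"unknown node type: {kind!r}")
        if not sources:
            raise ValueError("pipeline needs at least one input source")
        return sources, outputs
```

Instantiation would then map each spec to the matching source class (`CameraSource`, `VideoFileSource`, ...) and result handler.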
#### 3.2 Unified Workflow Orchestrator
- **Class**: `WorkflowOrchestrator`
- **Purpose**: Coordinate complete data flow from input to output
- **Features**:
  - Input source management
  - Pipeline execution control
  - Result handling and persistence
  - Error recovery and logging
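
At its core the orchestrator is a loop from source to inference to sinks; this sketch stubs inference as a plain callable, whereas the real flow would go through `InferencePipeline.put_data()`:

```python
class WorkflowOrchestrator:
    """Sketch of the input-to-output coordinator described above."""

    def __init__(self, source, infer, sinks):
        self.source = source  # iterable of frames
        self.infer = infer    # callable: frame -> result
        self.sinks = sinks    # callables: result -> None (file writer, streamer, ...)

    def run(self):
        """Drain the source; skip frames whose inference fails; fan results out."""
        processed = 0
        for frame in self.source:
            try:
                result = self.infer(frame)
            except Exception as exc:
                print("inference error, skipping frame:", exc)
                continue
            for sink in self.sinks:
                sink(result)
            processed += 1
        return processed
```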
---
## Priority 2: Enhanced Preprocessing and Auto-resize
### 4. Enhanced Preprocessing System
**Status**: 🟡 Partially Implemented
**Location**: `core/functions/Multidongle.py` (existing) + new preprocessing modules
#### 4.1 Current Auto-resize Implementation
- **Location**: `Multidongle.py:354-371` (preprocess_frame method)
- **Features**: ✅ Already implemented
  - Automatic model input shape detection
  - Dynamic resizing based on model requirements
  - Format conversion (BGR565, RGB8888, YUYV, RAW8)
  - Aspect ratio handling
#### 4.2 Enhanced Preprocessing Pipeline
- **File**: `core/functions/preprocessor.py` (new)
- **Class**: `AdvancedPreprocessor`
- **Purpose**: Extended preprocessing capabilities
- **Features**:
  - **Smart cropping**: Maintain aspect ratio with intelligent cropping
  - **Normalization**: Configurable pixel value normalization
  - **Augmentation**: Real-time data augmentation for training
  - **Multi-model support**: Different preprocessing for different models
  - **Caching**: Preprocessed frame caching for performance
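
The smart-crop and normalization steps could look like this sketch, assuming NumPy frames as produced by cv2 (a center crop is one simple "intelligent cropping" policy; saliency-aware cropping would slot in the same place):

```python
import numpy as np

class AdvancedPreprocessor:
    """Sketch of the crop and normalization steps described above."""

    @staticmethod
    def center_crop_to_aspect(frame, target_w, target_h):
        """Crop the largest centered region matching the target aspect ratio,
        so the later resize does not distort the image."""
        h, w = frame.shape[:2]
        target_ratio = target_w / target_h
        if w / h > target_ratio:       # too wide: trim left/right
            new_w = int(h * target_ratio)
            x0 = (w - new_w) // 2
            return frame[:, x0:x0 + new_w]
        new_h = int(w / target_ratio)  # too tall: trim top/bottom
        y0 = (h - new_h) // 2
        return frame[y0:y0 + new_h, :]

    @staticmethod
    def normalize(frame, mean=0.0, scale=1.0 / 255.0):
        """Map uint8 pixels into a configurable float range."""
        return (frame.astype(np.float32) - mean) * scale
```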
#### 4.3 Model-Aware Preprocessing
- **Enhancement**: Extend existing `Multidongle` class
- **Location**: `core/functions/Multidongle.py:188-199` (model_input_shape detection)
- **Features**:
  - **Dynamic preprocessing**: Adjust preprocessing based on model metadata
  - **Model-specific optimization**: Tailored preprocessing for different model types
  - **Preprocessing profiles**: Saved preprocessing configurations per model
---
## Priority 3: UI Integration and User Experience
### 5. Dashboard Integration
**Status**: 🟡 Partially Implemented
**Location**: `ui/windows/dashboard.py` (existing)
#### 5.1 Real-time Pipeline Monitoring
- **Enhancement**: Extend existing Dashboard class
- **Features**:
  - Live inference statistics
  - Real-time result visualization
  - Performance metrics dashboard
  - Error monitoring and alerts
#### 5.2 Input Source Configuration
- **Integration**: Connect UI input nodes to actual data sources
- **Features**:
  - Camera selection and preview
  - File browser integration
  - Stream URL validation
  - Input source testing
### 6. Result Visualization
**Status**: 🔴 Not Implemented
**Location**: `ui/widgets/result_viewer.py` (new)
#### 6.1 Result Display Widget
- **Class**: `ResultViewer`
- **Purpose**: Display inference results in UI
- **Features**:
  - Real-time result streaming
  - Result history and filtering
  - Export capabilities
  - Customizable display formats
---
## Priority 4: Advanced Features and Optimization
### 7. Performance Optimization
**Status**: 🟡 Basic Implementation
**Location**: Multiple files
#### 7.1 Memory Management
- **Enhancement**: Optimize existing queue systems
- **Files**: `InferencePipeline.py`, `Multidongle.py`
- **Features**:
  - Smart queue sizing based on available memory
  - Frame dropping under load
  - Memory leak detection and prevention
  - Garbage collection optimization
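
Frame dropping under load usually means a bounded queue that discards the oldest frame rather than blocking the producer. A minimal thread-safe sketch (the drop-oldest policy is an assumption; drop-newest is equally valid):

```python
from collections import deque
from threading import Lock

class DroppingQueue:
    """Bounded frame queue that drops the oldest item when full."""

    def __init__(self, maxsize):
        self._dq = deque(maxlen=maxsize)
        self._lock = Lock()
        self.dropped = 0  # drop counter for the monitoring dashboard

    def put(self, item):
        with self._lock:
            if len(self._dq) == self._dq.maxlen:
                self.dropped += 1  # deque with maxlen evicts the oldest on append
            self._dq.append(item)

    def get(self):
        with self._lock:
            return self._dq.popleft() if self._dq else None
```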
#### 7.2 Multi-device Load Balancing
- **Enhancement**: Extend existing multi-dongle support
- **Location**: `core/functions/Multidongle.py` (existing auto-detection)
- **Features**:
  - Intelligent device allocation
  - Load balancing across devices
  - Device health monitoring
  - Automatic failover
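
One simple allocation policy is least-pending-work with unhealthy devices excluded, sketched below (`DevicePool` and its bookkeeping are illustrative, not an existing class in this codebase):

```python
class DevicePool:
    """Sketch of least-loaded device allocation with failover."""

    def __init__(self, device_ids):
        self.pending = {d: 0 for d in device_ids}   # in-flight jobs per device
        self.healthy = {d: True for d in device_ids}

    def acquire(self):
        """Pick the healthy device with the fewest pending jobs."""
        candidates = [d for d in self.pending if self.healthy[d]]
        if not candidates:
            raise RuntimeError("no healthy devices available")
        dev = min(candidates, key=lambda d: self.pending[d])
        self.pending[dev] += 1
        return dev

    def release(self, dev):
        self.pending[dev] -= 1

    def mark_failed(self, dev):
        self.healthy[dev] = False  # failover: exclude from future allocation
```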
### 8. Error Handling and Recovery
**Status**: 🟡 Basic Implementation
**Location**: Throughout codebase
#### 8.1 Comprehensive Error Recovery
- **Enhancement**: Extend existing error handling
- **Features**:
  - Automatic device reconnection
  - Pipeline restart on critical errors
  - Input source recovery
  - Result persistence on failure
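
Reconnection and restart logic can share one retry helper with exponential backoff, sketched here as a plain function (the name and defaults are assumptions):

```python
import time

def with_retries(fn, attempts=3, base_delay=0.5):
    """Call fn, retrying with exponential backoff; re-raise after the last attempt."""
    last_exc = None
    for i in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last_exc = exc
            time.sleep(base_delay * (2 ** i))  # 0.5s, 1s, 2s, ...
    raise last_exc
```

Device reconnection, input-source recovery, and pipeline restart would each pass their own setup callable to this helper.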
---
## Implementation Roadmap
### Phase 1: Core Data Flow (Weeks 1-2)
1. ✅ **Complete**: Pipeline deployment and device initialization
2. 🔄 **In Progress**: Auto-resize preprocessing (mostly implemented)
3. **Next**: Implement basic camera input source
4. **Next**: Add simple result file output
5. **Next**: Create basic pipeline manager
### Phase 2: Complete Workflow (Weeks 3-4)
1. Add video file input support
2. Implement comprehensive result persistence
3. Create UI integration bridge
4. Add real-time monitoring
### Phase 3: Advanced Features (Weeks 5-6)
1. Enhanced preprocessing pipeline
2. Performance optimization
3. Advanced error handling
4. Result visualization
### Phase 4: Production Features (Weeks 7-8)
1. Multi-device load balancing
2. Advanced stream input support
3. Analytics and reporting
4. Configuration management
---
## Key Code Locations for Current Auto-resize Implementation
### Model Input Shape Detection
- **File**: `core/functions/Multidongle.py`
- **Lines**: 188-199 (model_input_shape property)
- **Status**: ✅ Working - detects model input dimensions from NEF files
### Automatic Preprocessing
- **File**: `core/functions/Multidongle.py`
- **Lines**: 354-371 (preprocess_frame method)
- **Status**: ✅ Working - auto-resizes based on model input shape
- **Features**: Format conversion, aspect ratio handling
### Pipeline Data Processing
- **File**: `core/functions/InferencePipeline.py`
- **Lines**: 165-240 (_process_data method)
- **Status**: ✅ Working - integrates preprocessing with inference
- **Features**: Inter-stage processing, result accumulation
### Format Conversion
- **File**: `core/functions/Multidongle.py`
- **Lines**: 382-396 (_convert_format method)
- **Status**: ✅ Working - supports BGR565, RGB8888, YUYV, RAW8
---
## Notes for Development
1. **Auto-resize is already implemented** ✅ - The system automatically detects model input shape and resizes accordingly
2. **Priority should be on input sources** - Camera and file input are the critical missing pieces
3. **Result persistence is essential** - Current system only provides callbacks, need file output
4. **UI integration gap** - UI configuration doesn't connect to core pipeline execution
5. **Performance is good** - Multi-threading and device management are solid foundations

The core pipeline and preprocessing are working well; the focus should be on completing the input/output ecosystem around the existing robust inference engine.