# Cluster4NPU Pipeline TODO

## Current Status

- ✅ **Pipeline Core**: Multi-stage pipeline with device auto-detection working
- ✅ **Hardware Integration**: Kneron NPU dongles connecting and initializing successfully
- ✅ **Auto-resize Preprocessing**: Model input shape detection and automatic preprocessing implemented
- ❌ **Data Input Sources**: Missing camera and file input implementations
- ❌ **Result Persistence**: No result saving or output mechanisms
- ❌ **End-to-End Workflow**: Gaps between UI configuration and core pipeline execution

---

## Priority 1: Essential Components for a Complete Inference Workflow

### 1. Data Source Implementation

**Status**: 🔴 Critical Missing Components
**Location**: New classes in `core/functions/`, or extensions of existing ones

#### 1.1 Camera Input Source
- **File**: `core/functions/camera_source.py` (new)
- **Class**: `CameraSource`
- **Purpose**: Wrapper around `cv2.VideoCapture` for camera input
- **Integration**: Connect to `InferencePipeline.put_data()`
- **Features**:
  - Multiple camera index support
  - Resolution and FPS configuration
  - Format conversion (BGR → model input format)
  - Error handling for camera disconnection

#### 1.2 Video File Input Source
- **File**: `core/functions/video_source.py` (new)
- **Class**: `VideoFileSource`
- **Purpose**: Process video files frame by frame
- **Integration**: Feed frames to `InferencePipeline`
- **Features**:
  - Support for common video formats (MP4, AVI, MOV)
  - Frame-rate control and seeking
  - Batch processing capabilities
  - Progress tracking

#### 1.3 Image File Input Source
- **File**: `core/functions/image_source.py` (new)
- **Class**: `ImageFileSource`
- **Purpose**: Process single images or image directories
- **Integration**: Single-shot inference through the pipeline
- **Features**:
  - Support for common image formats (JPG, PNG, BMP)
  - Batch directory processing
  - Image validation and error handling
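The input sources above all share one job: pull frames from a device or file and hand them to the pipeline. As a rough sketch of 1.1's `CameraSource` (assuming `InferencePipeline.put_data()` accepts raw BGR frames as noted; everything else here is illustrative, not the project's actual API):

```python
import time


class CameraSource:
    """Sketch of the CameraSource wrapper from 1.1.

    Assumes the pipeline object exposes put_data(frame); the constructor
    parameters and method names are illustrative.
    """

    def __init__(self, camera_index=0, resolution=(640, 480), fps=30):
        self.camera_index = camera_index
        self.resolution = resolution
        self.fps = fps
        self._cap = None
        self._running = False

    def start(self, pipeline):
        # cv2 is imported lazily so the class can be configured without OpenCV installed
        import cv2

        self._cap = cv2.VideoCapture(self.camera_index)
        if not self._cap.isOpened():
            raise RuntimeError(f"camera {self.camera_index} failed to open")
        self._cap.set(cv2.CAP_PROP_FRAME_WIDTH, self.resolution[0])
        self._cap.set(cv2.CAP_PROP_FRAME_HEIGHT, self.resolution[1])
        self._cap.set(cv2.CAP_PROP_FPS, self.fps)

        self._running = True
        interval = 1.0 / self.fps
        while self._running:
            ok, frame = self._cap.read()
            if not ok:  # a read failure usually means the camera disconnected
                self.stop()
                raise RuntimeError("camera read failed; device may be disconnected")
            pipeline.put_data(frame)  # OpenCV delivers frames in BGR order
            time.sleep(interval)

    def stop(self):
        self._running = False
        if self._cap is not None:
            self._cap.release()
            self._cap = None
```

A real implementation would run `start()` on a worker thread and add the format conversion listed above; this sketch only shows the capture-and-feed loop.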
#### 1.4 RTSP/HTTP Stream Source
- **File**: `core/functions/stream_source.py` (new)
- **Class**: `RTSPSource`, `HTTPStreamSource`
- **Purpose**: Process live video streams
- **Integration**: Real-time streaming into the pipeline
- **Features**:
  - Stream connection management
  - Reconnection on failure
  - Buffer management and frame dropping

### 2. Result Persistence System

**Status**: 🔴 Critical Missing Components
**Location**: `core/functions/result_handler.py` (new)

#### 2.1 Result Serialization
- **Class**: `ResultSerializer`
- **Purpose**: Convert inference results to standard formats
- **Features**:
  - JSON export with timestamps
  - CSV export for analytics
  - Binary format for performance
  - Configurable fields and formatting

#### 2.2 File Output Manager
- **Class**: `FileOutputManager`
- **Purpose**: Handle result file writing and organization
- **Features**:
  - Timestamped file naming
  - Directory organization by date/pipeline
  - File rotation and cleanup
  - Output format configuration

#### 2.3 Real-time Result Streaming
- **Class**: `ResultStreamer`
- **Purpose**: Stream results to external systems
- **Features**:
  - WebSocket result broadcasting
  - REST API endpoints
  - Message queue integration (Redis, RabbitMQ)
  - Custom callback system

### 3. Input/Output Integration Bridge

**Status**: 🔴 Critical Missing Components
**Location**: `core/functions/pipeline_manager.py` (new)

#### 3.1 Pipeline Configuration Manager
- **Class**: `PipelineConfigManager`
- **Purpose**: Convert UI configurations into executable pipelines
- **Integration**: Bridge between the UI and the core pipeline
- **Features**:
  - Parse UI node configurations
  - Instantiate appropriate data sources
  - Configure result handlers
  - Manage the pipeline lifecycle

#### 3.2 Unified Workflow Orchestrator
- **Class**: `WorkflowOrchestrator`
- **Purpose**: Coordinate the complete data flow from input to output
- **Features**:
  - Input source management
  - Pipeline execution control
  - Result handling and persistence
  - Error recovery and logging
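One way the configuration manager in 3.1 could map UI node configs onto the input sources from section 1 is a small factory registry. This is a sketch only; the `"type"`/`"params"` keys are assumptions about the UI node schema, not the project's actual format:

```python
class PipelineConfigManager:
    """Sketch of 3.1: turn UI node configurations into input-source objects."""

    _source_factories = {}  # node type -> factory callable

    @classmethod
    def register_source(cls, node_type):
        """Decorator that registers a source class for a UI node type."""
        def wrap(factory):
            cls._source_factories[node_type] = factory
            return factory
        return wrap

    @classmethod
    def build_source(cls, node_config):
        """Instantiate the data source described by one UI input node."""
        node_type = node_config["type"]
        if node_type not in cls._source_factories:
            raise ValueError(f"unknown input node type: {node_type!r}")
        return cls._source_factories[node_type](**node_config.get("params", {}))


@PipelineConfigManager.register_source("camera")
class CameraSourceStub:
    """Stand-in for the real CameraSource, registered under the 'camera' node type."""

    def __init__(self, camera_index=0):
        self.camera_index = camera_index
```

With this shape, `PipelineConfigManager.build_source({"type": "camera", "params": {"camera_index": 1}})` returns a configured source, and new source types only need a `register_source` decorator rather than edits to the manager itself.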
---

## Priority 2: Enhanced Preprocessing and Auto-resize

### 4. Enhanced Preprocessing System

**Status**: 🟡 Partially Implemented
**Location**: `core/functions/Multidongle.py` (existing) + new preprocessing modules

#### 4.1 Current Auto-resize Implementation
- **Location**: `Multidongle.py:354-371` (`preprocess_frame` method)
- **Features**: ✅ Already implemented
  - Automatic model input shape detection
  - Dynamic resizing based on model requirements
  - Format conversion (BGR565, RGB8888, YUYV, RAW8)
  - Aspect ratio handling

#### 4.2 Enhanced Preprocessing Pipeline
- **File**: `core/functions/preprocessor.py` (new)
- **Class**: `AdvancedPreprocessor`
- **Purpose**: Extended preprocessing capabilities
- **Features**:
  - **Smart cropping**: Maintain aspect ratio with intelligent cropping
  - **Normalization**: Configurable pixel-value normalization
  - **Augmentation**: Real-time data augmentation for training
  - **Multi-model support**: Different preprocessing for different models
  - **Caching**: Preprocessed-frame caching for performance

#### 4.3 Model-Aware Preprocessing
- **Enhancement**: Extend the existing `Multidongle` class
- **Location**: `core/functions/Multidongle.py:188-199` (`model_input_shape` detection)
- **Features**:
  - **Dynamic preprocessing**: Adjust preprocessing based on model metadata
  - **Model-specific optimization**: Tailored preprocessing for different model types
  - **Preprocessing profiles**: Saved preprocessing configurations per model
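The aspect-ratio handling mentioned in 4.1 and 4.2 usually reduces to one geometry computation: how much to scale the frame and where to pad it so it fits the model input without distortion. A minimal, dependency-free helper for that step (a sketch, not the `preprocess_frame` implementation itself):

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Compute the resize-and-pad geometry that fits a (src_w, src_h) frame
    into a (dst_w, dst_h) model input while preserving aspect ratio.

    Returns (scale, (new_w, new_h), (pad_left, pad_top)).
    """
    scale = min(dst_w / src_w, dst_h / src_h)   # uniform scale, no distortion
    new_w = int(round(src_w * scale))
    new_h = int(round(src_h * scale))
    pad_left = (dst_w - new_w) // 2             # center the resized frame
    pad_top = (dst_h - new_h) // 2
    return scale, (new_w, new_h), (pad_left, pad_top)
```

For example, a 1920x1080 frame into a 224x224 input scales to 224x126 and pads 49 pixels above and below; the caller then resizes with cv2 and copies into a zero-filled canvas at the returned offset.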
---

## Priority 3: UI Integration and User Experience

### 5. Dashboard Integration

**Status**: 🟡 Partially Implemented
**Location**: `ui/windows/dashboard.py` (existing)

#### 5.1 Real-time Pipeline Monitoring
- **Enhancement**: Extend the existing `Dashboard` class
- **Features**:
  - Live inference statistics
  - Real-time result visualization
  - Performance metrics dashboard
  - Error monitoring and alerts

#### 5.2 Input Source Configuration
- **Integration**: Connect UI input nodes to actual data sources
- **Features**:
  - Camera selection and preview
  - File browser integration
  - Stream URL validation
  - Input source testing

### 6. Result Visualization

**Status**: 🔴 Not Implemented
**Location**: `ui/widgets/result_viewer.py` (new)

#### 6.1 Result Display Widget
- **Class**: `ResultViewer`
- **Purpose**: Display inference results in the UI
- **Features**:
  - Real-time result streaming
  - Result history and filtering
  - Export capabilities
  - Customizable display formats

---

## Priority 4: Advanced Features and Optimization

### 7. Performance Optimization

**Status**: 🟡 Basic Implementation
**Location**: Multiple files

#### 7.1 Memory Management
- **Enhancement**: Optimize the existing queue systems
- **Files**: `InferencePipeline.py`, `Multidongle.py`
- **Features**:
  - Smart queue sizing based on available memory
  - Frame dropping under load
  - Memory-leak detection and prevention
  - Garbage collection optimization

#### 7.2 Multi-device Load Balancing
- **Enhancement**: Extend the existing multi-dongle support
- **Location**: `core/functions/Multidongle.py` (existing auto-detection)
- **Features**:
  - Intelligent device allocation
  - Load balancing across devices
  - Device health monitoring
  - Automatic failover
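The device allocation and failover behavior described in 7.2 could start from a least-outstanding-work policy. A sketch, with opaque device handles standing in for whatever objects `Multidongle`'s auto-detection actually enumerates:

```python
class DeviceBalancer:
    """Sketch of 7.2: least-outstanding-work allocation across NPU dongles."""

    def __init__(self, devices):
        # device handle -> number of in-flight jobs
        self._outstanding = {dev: 0 for dev in devices}

    def acquire(self):
        """Pick the device with the fewest in-flight jobs."""
        dev = min(self._outstanding, key=self._outstanding.get)
        self._outstanding[dev] += 1
        return dev

    def release(self, dev):
        """Mark one job on `dev` as finished."""
        self._outstanding[dev] -= 1

    def drop(self, dev):
        """Remove a failed device so acquire() never returns it (failover)."""
        self._outstanding.pop(dev, None)
```

A production version would guard the counters with a lock, since dispatch and completion run on different threads in the existing pipeline.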
### 8. Error Handling and Recovery

**Status**: 🟡 Basic Implementation
**Location**: Throughout the codebase

#### 8.1 Comprehensive Error Recovery
- **Enhancement**: Extend the existing error handling
- **Features**:
  - Automatic device reconnection
  - Pipeline restart on critical errors
  - Input source recovery
  - Result persistence on failure

---

## Implementation Roadmap

### Phase 1: Core Data Flow (Weeks 1-2)
1. ✅ **Complete**: Pipeline deployment and device initialization
2. 🔄 **In Progress**: Auto-resize preprocessing (mostly implemented)
3. **Next**: Implement a basic camera input source
4. **Next**: Add simple result file output
5. **Next**: Create a basic pipeline manager

### Phase 2: Complete Workflow (Weeks 3-4)
1. Add video file input support
2. Implement comprehensive result persistence
3. Create the UI integration bridge
4. Add real-time monitoring

### Phase 3: Advanced Features (Weeks 5-6)
1. Enhanced preprocessing pipeline
2. Performance optimization
3. Advanced error handling
4. Result visualization

### Phase 4: Production Features (Weeks 7-8)
1. Multi-device load balancing
2. Advanced stream input support
3. Analytics and reporting
4. Configuration management

---

## Key Code Locations for the Current Auto-resize Implementation

### Model Input Shape Detection
- **File**: `core/functions/Multidongle.py`
- **Lines**: 188-199 (`model_input_shape` property)
- **Status**: ✅ Working - detects model input dimensions from NEF files

### Automatic Preprocessing
- **File**: `core/functions/Multidongle.py`
- **Lines**: 354-371 (`preprocess_frame` method)
- **Status**: ✅ Working - auto-resizes based on the model input shape
- **Features**: Format conversion, aspect ratio handling

### Pipeline Data Processing
- **File**: `core/functions/InferencePipeline.py`
- **Lines**: 165-240 (`_process_data` method)
- **Status**: ✅ Working - integrates preprocessing with inference
- **Features**: Inter-stage processing, result accumulation

### Format Conversion
- **File**: `core/functions/Multidongle.py`
- **Lines**: 382-396 (`_convert_format` method)
- **Status**: ✅ Working - supports BGR565, RGB8888, YUYV, RAW8

---

## Notes for Development

1. **Auto-resize is already implemented** ✅ - the system automatically detects the model input shape and resizes accordingly
2. **Priority should be on input sources** - camera and file input are the critical missing pieces
3. **Result persistence is essential** - the current system only provides callbacks; file output is needed
4. **UI integration gap** - UI configuration does not yet connect to core pipeline execution
5. **Performance is good** - multi-threading and device management are solid foundations

The core pipeline and preprocessing are working well; the focus should be on completing the input/output ecosystem around the existing, robust inference engine.
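As a concrete starting point for the result-persistence gap flagged in the notes, the JSON/CSV serialization from 2.1 could begin as something like the following. The result-dict layout here is an assumption for illustration, not the pipeline's actual output schema:

```python
import csv
import io
import json
import time


class ResultSerializer:
    """Sketch of 2.1: convert inference results to JSON and CSV."""

    def to_json(self, results, timestamp=None):
        """Wrap a list of result dicts in a timestamped JSON document."""
        return json.dumps({
            "timestamp": timestamp if timestamp is not None else time.time(),
            "results": results,
        })

    def to_csv(self, results, fields):
        """Flatten result dicts into CSV rows, keeping only `fields`."""
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(results)
        return buf.getvalue()
```

The binary format and configurable formatting listed in 2.1 would layer on top of this; `FileOutputManager` would then own where these strings are written and how the files rotate.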