36 Commits

Author SHA1 Message Date
cde1aac908 debug: Add comprehensive logging to diagnose pipeline hanging issue
- Add pipeline activity logging every 10 results to track processing
- Add queue size monitoring in InferencePipeline coordinator
- Add camera frame capture logging every 100 frames
- Add MultiDongle send/receive thread logging every 100 operations
- Add error handling for repeated callback failures in camera source

This will help identify where the pipeline stops processing:
- Camera capture stopping
- MultiDongle threads blocking
- Pipeline coordinator hanging
- Queue capacity issues
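The periodic-logging approach this commit describes can be sketched as follows; the `PipelineMonitor` class and its log format are illustrative stand-ins, not the project's actual code:

```python
class PipelineMonitor:
    """Periodic activity logging: emit one line every N operations
    (e.g. every 10 results or every 100 frames) instead of spamming every event."""

    def __init__(self, log_every=10):
        self.log_every = log_every
        self.count = 0
        self.lines = []  # collected log lines; stands in for print()/logging

    def record(self, queue_size):
        self.count += 1
        if self.count % self.log_every == 0:
            self.lines.append(
                f"[pipeline] processed {self.count} results, queue size={queue_size}"
            )

monitor = PipelineMonitor(log_every=10)
for i in range(25):
    monitor.record(queue_size=i % 4)
print(len(monitor.lines))  # 2 (logged at results 10 and 20)
```

A stalled pipeline then shows up as the counter in these lines stopping while other components keep logging.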

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 19:49:00 +08:00
ab802e60cf fix: Remove dongle USB timeout setting to prevent camera connection crashes
- Remove kp.core.set_timeout() call that causes crashes when camera is connected
- Add explanatory message indicating timeout is skipped for stability
- This prevents the system crash that occurs during camera initialization
- Trade-off: Removes USB timeout but ensures stable camera operation

The timeout setting was conflicting with camera connection process,
causing the entire system to crash during device initialization.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 19:29:26 +08:00
24d5726ee2 fix: Restore USB timeout setting and improve terminal display reliability
- Re-enable kp.core.set_timeout() which is required for proper device communication
- Fix GUI terminal truncation issue by using append() instead of setPlainText()
- Remove aggressive line limiting that was causing log display to stop midway
- Implement gentler memory management (trim only after 1000+ lines)
- This should resolve pipeline timeout issues and complete log display

The previous USB timeout disable was causing stage timeouts without inference results.
The terminal display issue was due to frequent text replacement causing display corruption.
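The gentler trimming strategy can be sketched independently of Qt; the `TerminalBuffer` class and the `keep_lines` value are assumptions for illustration, with only the 1000-line threshold taken from the commit message:

```python
class TerminalBuffer:
    """Append-only log view with gentle memory management:
    trim only once the buffer exceeds max_lines, in one slice,
    rather than rewriting the whole text on every update."""

    def __init__(self, max_lines=1000, keep_lines=800):  # keep_lines is assumed
        self.max_lines = max_lines
        self.keep_lines = keep_lines
        self.lines = []

    def append(self, text):
        self.lines.append(text)
        if len(self.lines) > self.max_lines:
            # drop the oldest lines in bulk; recent output is never disturbed
            self.lines = self.lines[-self.keep_lines:]

buf = TerminalBuffer()
for i in range(1205):
    buf.append(f"log line {i}")
print(buf.lines[-1])  # log line 1204
```

This mirrors the append()-based fix: recent lines are only ever appended, so the display cannot be truncated mid-stream by a full-text replacement.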

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 19:25:02 +08:00
f41d9ae5c8 feat: Implement output queue based FPS calculation for accurate throughput measurement
- Add time-window based FPS calculation using output queue timestamps
- Replace misleading "Theoretical FPS" (based on processing time) with real "Pipeline FPS"
- Track actual inference output generation rate over 10-second sliding window
- Add thread-safe FPS calculation with proper timestamp management
- Display realistic FPS values (4-9 FPS) instead of inflated values (90+ FPS)

Key improvements:
- _record_output_timestamp(): Records when each output is generated
- get_current_fps(): Calculates FPS based on actual throughput over time window
- Thread-safe implementation with fps_lock for concurrent access
- Automatic cleanup of old timestamps outside the time window
- Integration with GUI display to show meaningful FPS metrics

This provides users with accurate inference throughput measurements that reflect
real-world performance, especially important for multi-dongle setups where
understanding actual scaling is crucial.
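The sliding-window calculation described above can be sketched as below. Method names mirror the commit message; the class takes timestamps as arguments so the example is deterministic, whereas real code would call `time.time()`:

```python
import threading
from collections import deque

class OutputFPSTracker:
    """Pipeline FPS from output-queue timestamps over a sliding time window."""

    def __init__(self, window_seconds=10.0):
        self.window = window_seconds
        self.timestamps = deque()
        self.fps_lock = threading.Lock()  # thread-safe access from worker threads

    def _record_output_timestamp(self, now):
        with self.fps_lock:
            self.timestamps.append(now)
            self._trim(now)

    def _trim(self, now):
        # automatic cleanup: drop timestamps that fell outside the window
        while self.timestamps and now - self.timestamps[0] > self.window:
            self.timestamps.popleft()

    def get_current_fps(self, now):
        with self.fps_lock:
            self._trim(now)
            if len(self.timestamps) < 2:
                return 0.0
            span = self.timestamps[-1] - self.timestamps[0]
            return (len(self.timestamps) - 1) / span if span > 0 else 0.0

tracker = OutputFPSTracker(window_seconds=10.0)
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:  # one output every 250 ms
    tracker._record_output_timestamp(t)
print(tracker.get_current_fps(now=1.0))  # 4.0
```

Because the rate is derived from when outputs actually appear, it cannot be inflated by fast per-frame processing that produces no inference result.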

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 19:17:18 +08:00
2ba0f4ae27 fix: Remove duplicate inference result logging to prevent terminal spam
- Comment out print() statements in InferencePipeline that duplicate GUI callback output
- Prevents each inference result from appearing multiple times in terminal
- Keeps logging system clean while maintaining GUI formatted display
- This was causing terminal output to show each result 2-3 times due to:
  1. InferencePipeline print() statements captured by StdoutCapture
  2. Same results formatted and sent via terminal_output callback

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 19:10:37 +08:00
83906c87e3 fix: Implement stdout/stderr capture for complete logging in deployment UI
- Add StdoutCapture context manager to capture all print() statements
- Connect captured output to GUI terminal display via stdout_captured signal
- Fix logging issue where pipeline initialization and operation logs were not shown in app
- Prevent infinite recursion with _emitting flag in TeeWriter
- Ensure both console and GUI receive all log messages during deployment
- Comment out USB timeout setting that was causing device timeout issues

This resolves the issue where logs would stop showing partially in the app,
ensuring complete visibility of MultiDongle and InferencePipeline operations.
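The tee-with-recursion-guard idea can be sketched as follows; this is a minimal stand-in for the project's `TeeWriter`, with the Qt signal replaced by a plain callback:

```python
import io
import sys

class TeeWriter:
    """Duplicate writes to the original stream and a capture callback.
    The _emitting flag prevents infinite recursion if the callback
    itself prints (which would re-enter write())."""

    def __init__(self, original, callback):
        self.original = original
        self.callback = callback
        self._emitting = False

    def write(self, text):
        self.original.write(text)
        if not self._emitting:
            self._emitting = True
            try:
                self.callback(text)  # e.g. emit stdout_captured toward the GUI
            finally:
                self._emitting = False

    def flush(self):
        self.original.flush()

# route print() through the tee so both console and capture list see it
captured = []
tee = TeeWriter(io.StringIO(), captured.append)
sys.stdout, saved = tee, sys.stdout
print("pipeline initialized")
sys.stdout = saved
```

Installing this as `sys.stdout` for the duration of a deployment is what lets plain `print()` statements from MultiDongle and InferencePipeline reach the GUI terminal.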

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 12:52:35 +08:00
23d8c4ff61 fix: Replace undefined 'processed_result' with 'inference_result'
Fixed NameError where 'processed_result' was referenced but not defined.
Should use 'inference_result' which contains the actual inference output
from MultiDongle.get_latest_inference_result().

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 12:11:44 +08:00
18f1426cbc Merge branch 'main' of github.com:HuangMason320/cluster4npu 2025-07-24 11:56:40 +08:00
bab06b9fa4 Merge branch 'main' of github.com:HuangMason320/cluster4npu 2025-07-24 11:56:32 +08:00
f902659017 fix: Remove incompatible parameters to match standalone MultiDongle API
Key fixes:
1. Remove 'block' parameter from put_input() call - not supported in standalone code
2. Remove 'timeout' parameter from get_latest_inference_result() call
3. Improve _has_inference_result() logic to properly detect real inference results
   - Don't count "Processing" or "async" status as valid results
   - Only count actual tuple (prob, result_str) or meaningful dict results
   - Match standalone code behavior for FPS calculation

This should resolve the "unexpected keyword argument" errors and
provide accurate FPS counting like the standalone baseline.
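The result-validity check described in point 3 might look like the sketch below; the exact result shapes are assumptions based on the commit text:

```python
def has_inference_result(result):
    """Count only real inference outputs toward FPS (a sketch).
    Transient "Processing"/"async" statuses are not results."""
    if result is None:
        return False
    # a (prob, result_str) tuple is a concrete classification result
    if isinstance(result, tuple) and len(result) == 2:
        return True
    # dict results count only when they carry payload, not a transient status
    if isinstance(result, dict):
        if result.get("status", "") in ("Processing", "async"):
            return False
        return bool(result)
    return False

print(has_inference_result((0.93, "OK")))             # True
print(has_inference_result({"status": "Processing"}))  # False
```

Only when this returns True does the stage increment its processed count, so FPS measures inferences per second rather than polling iterations.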

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 11:56:01 +08:00
80275bc774 fix: Correct FPS calculation to count actual inference results only
Key changes:
1. FPS Calculation: Only count when stage receives actual inference results
   - Add _has_inference_result() method to check for valid results
   - Only increment processed_count when real inference result is available
   - This measures "inferences per second" not "frames per second"

2. Reduced Log Spam: Remove excessive preprocessing debug logs
   - Remove shape/dtype logs for every frame
   - Only log successful inference results
   - Keep essential error logs

3. Maintain Async Pattern: Keep non-blocking processing
   - Still use timeout=0.001 for get_latest_inference_result
   - Still use block=False for put_input
   - No blocking while loops

Expected result: ~4 FPS (1 dongle) vs ~9 FPS (2 dongles)
matching standalone code behavior.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 11:30:13 +08:00
273ae71846 fix: Correct FPS calculation and reduce log spam
Key fixes:
1. FPS Calculation: Only count actual inference results, not frame processing
   - Previous: counted every frame processed (~90 FPS, incorrect)
   - Now: only counts when actual inference results are received (~9 FPS, correct)
   - Return None from _process_data when no inference result available
   - Skip FPS counting for iterations without real results

2. Log Reduction: Significantly reduced verbose logging
   - Removed excessive debug prints for preprocessing steps
   - Removed "No inference result" spam messages
   - Only log actual successful inference results

3. Async Processing: Maintain proper async pattern
   - Still use non-blocking get_latest_inference_result(timeout=0.001)
   - Still use non-blocking put_input(block=False)
   - But only count real inference throughput for FPS

This should now match standalone code behavior: ~4 FPS (1 dongle) vs ~9 FPS (2 dongles)

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 11:12:42 +08:00
67a1031009 fix: Remove blocking while loop that prevented multi-dongle scaling
The key issue was in InferencePipeline._process_data() where a 5-second
while loop was blocking waiting for inference results. This completely
serialized processing and prevented multiple dongles from working in parallel.

Changes:
- Replace blocking while loop with single non-blocking call
- Use timeout=0.001 for get_latest_inference_result (async pattern)
- Use block=False for put_input to prevent queue blocking
- Increase worker queue timeout from 0.1s to 1.0s
- Handle async processing status properly

This matches the pattern from the standalone code that achieved
4.xx FPS (1 dongle) vs 9.xx FPS (2 dongles) scaling.
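The non-blocking pattern the changes describe can be sketched with a plain `queue.Queue`; `get_latest_result` below is a stand-in for `MultiDongle.get_latest_inference_result`:

```python
import queue

def pump_frame(input_q, get_latest_result, frame):
    """Async pattern: never block on the input queue or on results."""
    try:
        input_q.put(frame, block=False)  # drop the frame if the queue is full
    except queue.Full:
        pass
    # poll for whatever result is ready instead of spinning in a 5 s while loop
    return get_latest_result(timeout=0.001)

q = queue.Queue(maxsize=2)
result = pump_frame(q, lambda timeout: None, frame="frame-0")
print(q.qsize(), result)  # 1 None
```

Because the caller returns immediately either way, frames keep flowing to all dongles and the devices can genuinely work in parallel instead of being serialized behind one pending result.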

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 10:52:58 +08:00
bc92761a83 fix: Optimize multi-dongle inference for proper parallel processing
- Enable USB timeout (5000ms) for stable communication
- Fix send thread timeout from 0.01s to 1.0s for better blocking
- Update WebcamInferenceRunner to use async pattern (non-blocking)
- Add non-blocking put_input option to prevent frame drops
- Improve thread stopping mechanism with better cleanup

These changes follow Kneron official example pattern and should
enable proper parallel processing across multiple dongles.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 10:39:20 +08:00
cb9dff10a9 fix: Correct device scanning to access device_descriptor_list properly
Fixed DeviceDescriptorList object attribute error by properly accessing
the device_descriptor_list attribute instead of treating the result as
a direct list of devices.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-24 10:13:17 +08:00
f45c56d529 update debug_deployment 2025-07-24 10:05:39 +08:00
183b5659b7 feat: Integrate dongle model detection and refactor scan_devices
This commit integrates the dongle model detection logic into MultiDongle.
It refactors the scan_devices() method to:
- Handle device descriptors in list or object format.
- Extract usb_port_id and product_id for each device.
- Use product_id mapping to identify dongle models.
- Return a more detailed device information structure.

The previously deleted files were moved to a separate directory.
2025-07-24 10:01:56 +08:00
0e8d75c85c cleanup: Remove debug output after successful fix verification
- Remove all debug print statements from deployment dialog
- Remove debug output from workflow orchestrator and inference pipeline
- Remove test signal emissions and unused imports
- Code is now clean and production-ready
- Results are successfully flowing from inference to GUI display
2025-07-23 22:50:34 +08:00
18ec31738a fix: Ensure result callback is always set on pipeline regardless of result handler
- Remove dependency on result_handler for setting pipeline result callback
- Always call result_callback when handle_result is triggered
- This fixes the issue where GUI callbacks weren't being called because
  output type 'display' wasn't supported, causing result_handler to be None
- Add more debug output to trace callback flow
2025-07-23 22:43:42 +08:00
2dec66edad debug: Add callback chain debugging to InferencePipeline and WorkflowOrchestrator
- Add debug output in InferencePipeline result callback to see if it's called
- Add debug output in WorkflowOrchestrator handle_result to trace callback flow
- This will help identify exactly where the callback chain is breaking
- Previous test showed GUI can receive signals but callbacks aren't triggered
2025-07-23 22:43:06 +08:00
be44e6214a update debug for deployment 2025-07-17 12:05:10 +08:00
0e3295a780 feat: Add comprehensive terminal result printing for dongle deployments
- Enhanced deployment workflow to print detailed inference results to terminal in real-time
- Added rich formatting with emojis, confidence indicators, and performance metrics
- Combined GUI and terminal callbacks for dual output during module deployment
- Improved workflow orchestrator startup/shutdown feedback
- Added demonstration script showing terminal output examples
- Supports multi-stage pipelines with individual stage result display
- Includes processing time, FPS calculations, and metadata visualization

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-17 10:39:08 +08:00
e6c9817a98 feat: Add real-time inference results display to deployment UI
- Add result callback mechanism to WorkflowOrchestrator
- Implement result_updated signal in DeploymentWorker
- Create detailed inference results display with timestamps and formatted output
- Support both tuple and dict result formats
- Add auto-scrolling results panel with history management
- Connect pipeline results to Live View tab for real-time monitoring

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-17 10:22:48 +08:00
e97fd7a025 fix: Resolve remaining numpy array comparison errors in MultiDongle
- Fix ambiguous truth value error in get_latest_inference_result method
- Fix ambiguous truth value error in postprocess function
- Replace direct array evaluation with explicit length checks
- Use proper None checks instead of truthy evaluation on numpy arrays
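The pattern behind these fixes can be shown without numpy itself: truth-testing a multi-element array raises, so result checks must use explicit `None` and length tests. `ArrayLike` below is a minimal stand-in that reproduces numpy's behavior:

```python
class ArrayLike:
    """Stand-in for a numpy array: truth-testing more than one element raises."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def __bool__(self):
        if len(self.data) == 1:
            return bool(self.data[0])
        raise ValueError(
            "The truth value of an array with more than one element is ambiguous."
        )

def latest_result_ready(result):
    # explicit None / length checks instead of `if result:` on an array
    return result is not None and len(result) > 0

arr = ArrayLike([0.1, 0.9])
print(latest_result_ready(arr))  # True; `if arr:` would raise ValueError
```

The same explicit-check style applies anywhere an inference result might be an array: `result is not None`, `len(result) > 0`, never a bare truthiness test.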

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-17 10:11:38 +08:00
0a70df4098 fix: Complete array comparison fix and improve stop button functionality
- Fix remaining array comparison error in inference result validation
- Update PyQt signal signature for proper numpy array handling
- Improve DeploymentWorker to keep running after deployment
- Enhance stop button with non-blocking UI updates and better error handling

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-17 10:03:59 +08:00
183300472e fix: Resolve array comparison error and add inference stop functionality
- Fix ambiguous truth value error in InferencePipeline result handling
- Add stop inference button to deployment dialog with proper UI state management
- Improve error handling for tuple vs dict result types

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-17 09:46:31 +08:00
c94eb5ee30 fix import path problem in deployment.py 2025-07-17 09:25:07 +08:00
af9adc8e82 fix: Address file path and data processing bugs, add real-time viewer 2025-07-17 09:18:27 +08:00
7e173c42de feat: Implement essential components for complete inference workflow 2025-07-16 23:32:36 +08:00
ee4d1a3e4a Add comprehensive TODO planning and new camera/video source implementations
- Add detailed TODO.md with complete project roadmap and implementation priorities
- Implement CameraSource class with multi-camera support and real-time capture
- Add VideoFileSource class with batch processing and frame control capabilities
- Create foundation for complete input/output data flow integration
- Document current auto-resize preprocessing implementation status
- Establish clear development phases and key missing components

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-16 23:19:00 +08:00
049dedf2f7 Fix firmware path initialization and upload logic in MultiDongle
- Always store firmware paths (scpu_fw_path, ncpu_fw_path) when provided, not just when upload_fw=True
- Restore firmware upload condition to only run when upload_fw=True
- Fix 'MultiDongle' object has no attribute 'scpu_fw_path' error during pipeline initialization
- Ensure firmware paths are available for both upload and non-upload scenarios

This resolves the pipeline deployment error where firmware paths were missing
even when provided to the constructor, causing initialization failures.
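The shape of the fix can be sketched as below; `MultiDongleSketch` is an illustrative stand-in, not the real class:

```python
class MultiDongleSketch:
    """Always store firmware paths; gate only the upload on upload_fw."""

    def __init__(self, scpu_fw_path=None, ncpu_fw_path=None, upload_fw=False):
        # store paths unconditionally so later pipeline code can read them
        self.scpu_fw_path = scpu_fw_path
        self.ncpu_fw_path = ncpu_fw_path
        self.uploaded = False
        if upload_fw and scpu_fw_path and ncpu_fw_path:
            self._upload_firmware()

    def _upload_firmware(self):
        self.uploaded = True  # placeholder for the real firmware upload

d = MultiDongleSketch(scpu_fw_path="fw_scpu.bin", ncpu_fw_path="fw_ncpu.bin")
print(d.scpu_fw_path, d.uploaded)  # fw_scpu.bin False
```

The original bug was storing the paths inside the `upload_fw` branch, so non-upload instances simply had no `scpu_fw_path` attribute at all.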

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-16 22:11:42 +08:00
e34cdfb856 Add TODO comment and device log 2025-07-16 21:53:31 +08:00
e0169cd845 Fix device detection format and pipeline deployment compatibility
Device Detection Updates:
- Update device series detection to use product_id mapping (0x100 -> KL520, 0x720 -> KL720)
- Handle JSON dict format from kp.core.scan_devices() properly
- Extract usb_port_id correctly from device descriptors
- Support multiple device descriptor formats (dict, list, object)
- Enhanced debug output shows Product ID for verification

Pipeline Deployment Fixes:
- Remove invalid preprocessor/postprocessor parameters from MultiDongle constructor
- Add max_queue_size parameter support to MultiDongle
- Fix pipeline stage initialization to match MultiDongle constructor
- Add auto_detect parameter support for pipeline stages
- Store stage processors as instance variables for future use

Example Updates:
- Update device_detection_example.py to show Product ID in output
- Enhanced error handling and format detection

Resolves pipeline deployment error: "MultiDongle.__init__() got an unexpected keyword argument 'preprocessor'"
Now properly handles real device descriptors with correct product_id to series mapping.
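The product_id mapping can be sketched as below; only the 0x100 and 0x720 entries are stated in the commit, and the dict/object handling mirrors the "multiple descriptor formats" point:

```python
# mapping from the commit message; other series would extend this table
PRODUCT_ID_TO_SERIES = {0x100: "KL520", 0x720: "KL720"}

def detect_series(descriptor):
    """Map a device descriptor's product_id to a dongle series,
    accepting either a dict (JSON form) or an object with attributes."""
    if isinstance(descriptor, dict):
        product_id = descriptor.get("product_id")
    else:
        product_id = getattr(descriptor, "product_id", None)
    return PRODUCT_ID_TO_SERIES.get(product_id, "unknown")

print(detect_series({"product_id": 0x100, "usb_port_id": 3}))  # KL520
```

Keying on `product_id` rather than on string parsing is what makes the detection robust across the dict, list, and object descriptor formats.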

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-16 21:45:14 +08:00
9020be5e7a Add Kneron device auto-detection and connection features
- Add scan_devices() method using kp.core.scan_devices() for device discovery
- Add connect_auto_detected_devices() for automatic device connection
- Add device series detection (KL520, KL720, KL630, KL730, KL540, etc.)
- Add auto_detect parameter to MultiDongle constructor
- Add get_device_info() and print_device_info() methods to display port IDs and series
- Update connection logic to use kp.core.connect_devices() per official docs
- Add device_detection_example.py with usage examples
- Maintain backward compatibility with manual port specification

Features display dongle series and port ID as requested for better device management.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-16 21:13:33 +08:00
f5e017b099 Fix DataProcessor missing class error in pipeline deployment
- Add DataProcessor abstract base class with process method
- Add PostProcessor class for handling inference output data
- Fix PreProcessor inheritance from DataProcessor
- Resolves "name 'DataProcessor' is not defined" error during pipeline deployment

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-16 20:53:41 +08:00
080eb5b887 Add intelligent pipeline topology analysis and comprehensive UI framework
Major Features:
• Advanced topological sorting algorithm with cycle detection and resolution
• Intelligent pipeline optimization with parallelization analysis
• Critical path analysis and performance metrics calculation
• Comprehensive .mflow file converter for seamless UI-to-API integration
• Complete modular UI framework with node-based pipeline editor
• Enhanced model node properties (scpu_fw_path, ncpu_fw_path)
• Professional output formatting without emoji decorations

Technical Improvements:
• Graph theory algorithms (DFS, BFS, topological sort)
• Automatic dependency resolution and conflict prevention
• Multi-criteria pipeline optimization
• Real-time stage count calculation and validation
• Comprehensive configuration validation and error handling
• Modular architecture with clean separation of concerns

New Components:
• MFlow converter with topology analysis (core/functions/mflow_converter.py)
• Complete node system with exact property matching
• Pipeline editor with visual node connections
• Performance estimation and dongle management panels
• Comprehensive test suite and demonstration scripts
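The topological sorting with cycle detection that this commit describes can be sketched with Kahn's algorithm (a BFS variant); this is an illustrative version, not the project's converter code:

```python
from collections import deque

def topological_sort(nodes, edges):
    """Kahn's algorithm: return a valid stage order, or None on a cycle."""
    indegree = {n: 0 for n in nodes}
    adj = {n: [] for n in nodes}
    for src, dst in edges:
        adj[src].append(dst)
        indegree[dst] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in adj[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    # nodes left with nonzero indegree are stuck on a cycle
    return order if len(order) == len(nodes) else None

print(topological_sort(["cam", "detect", "classify"],
                       [("cam", "detect"), ("detect", "classify")]))
# ['cam', 'detect', 'classify']
```

Nodes whose indegree hits zero at the same time are independent, which is also where the parallelization analysis mentioned above finds stages that can run concurrently.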

🤖 Generated with Claude Code (https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-10 12:58:47 +08:00