## Dynasty

The release version of Dynasty. We provide code examples in the [example](example/) folder.

## How to use

See [CMakeLists](CMakeLists.txt), the [CPU inferencer code](example/cpu_example.cpp), and the [CUDA inferencer code](example/cuda_example.cpp) for detailed examples.

### Mystery Node

A Mystery Node is a customized ONNX node that carries two attributes, in hierarchy:

1. opid: int. Defines a kind of customized node.
2. type: string. Defines a type within a certain kind of customized node. This becomes handy when the node takes different parameters. For example, one could define a node that computes y = kx + b and use a different type for each pair of (k, b).

In the [example](example/) folder, mystery.c implements 3 different mystery nodes, and op0 also makes use of the 'type' attribute of the mystery node.

The Dynasty CPU inference library has built-in support for mystery nodes, with the interface defined in mystery.h in the [include](include/) folder. Users who don't have any mystery nodes can use mystery.c from the [example](example/) folder as the default implementation of that interface; otherwise, the linker will complain that it can't find mystery_op0 etc. Users who do have mystery nodes need to update mystery.h/mystery.c accordingly. Users who want to hand Kneron an ONNX model containing mystery nodes also need to provide Kneron with a library (built from mystery.c) and a mystery.h. See the `# build mystery lib` section in CMakeLists.txt.

## Contribution and Change Lists

### 20. Dynasty 3.3.1

#### Changes

1. Include ReduceSum in CPU, CUDA;
2. Merge conv CUDNN implementation into Dynasty CUDA;

### 19. Dynasty 3.3.0

#### Changes

1. Include submodule common_header_lib;
2. Support Tensor as input for inferencer;

### 18. Dynasty 3.2.1

#### Changes

1. Update CUDA inferencer :: Concat operator to speed it up;

### 17. Dynasty 3.2.0

#### Changes

1. Update PianoInferencer interface: GetInputDimensions/GetOutputDimensions;
2. Update Floating CUDA Inferencer/Operator: not all operators allocate space for their output tensors. For example, if a node has only one child, in most cases it uses shared space from InitialiedData rather than allocating its own space.

### 16. Dynasty 2.5.1

#### Changes

1. Fix Microsoft ONNX-Runtime inferencer header bugs;
2. Add Microsoft ONNX-Runtime inferencer examples;

### 15. Dynasty 2.5.0

#### Changes

1. Add Microsoft ONNX-Runtime inferencer;

#### Contribution:
Nan Zhou
Manager: Kidd Su

### 14. Dynasty 2.4.0

#### Changes

1. Add MKL-DNN inferencer;

#### Contribution:
Nan Zhou
Manager: Kidd Su

### 13. Dynasty 2.3.0

#### Changes

1. Add BIE support;
2. Combine CUDA and CUDNN;
3. Refactor Inferencer: use chain of responsibilities and template method pattern;

#### Contribution:
Bo Xie: BIE support;
Nan Zhou: others;
Manager: Kidd Su

### 12. Dynasty 2.2.0

#### Changes

1. Combine CUDA and CUDNN;
2. Refactor Inferencer: use chain of responsibilities and template method pattern;

#### Contribution:
Nan Zhou
Manager: Kidd Su

### 11. Dynasty 2.1.0

#### Changes

1. Add CUDA pad;
2. Add corresponding tests;

#### Contribution:
Shanshan Xiao
Manager: Kidd Su

### 10. Dynasty 2.0.1

#### Changes

1. Fix logger mutex;

#### Contribution:
Nan Zhou
Manager: Kidd Su

### 9. Dynasty 2.0.0

#### Changes

1. Add BIE support (change interfaces);
2. Add corresponding tests (correctness of CPU inference using BIE, memcheck of CPU inference using BIE);

#### Contribution:
Bo Xie, Nan Zhou
Manager: Kidd Su

### 8. Dynasty 1.2.1

#### Changes

1. Fix dummy CUDNNInferencer bugs on platforms without CUDNN;
2. Print types as well for unsupported operations;

#### Contribution:
Nan Zhou
Manager: Kidd Su

### 7. Dynasty 1.2.0

#### Features

1. Add CUDNNInferencer;

#### Contribution:
CUDNNInferencer: Nan Zhou
Manager: Kidd Su

### 6. Dynasty 1.1.0

#### Changes

1. Add all new operations from Renaissance to DynastyCpuVisitor.
2.
Add tests to verify correctness of all new operations.
3. Add support for reading inputs not in bhwc format.

#### Contribution:
Ryan Han, Nan Zhou
Manager: Kidd Su

### 5. Dynasty 1.0.4

#### Changes

1. Refactor AttributeParser.
2. Fix CPUInferencer out-of-boundary bugs.

#### Contribution:
Nan Zhou, Ryan Han
Manager: Kidd Su

### 4. Dynasty 1.0.3

#### Changes

1. Add graceful checking on dimensions of input pixels.
2. Fix Flatten bugs in CPUInferencer.
3. Remove graph optimizations in PianoInferencer;

#### Contribution:
Nan Zhou
Manager: Kidd Su

### 3. Dynasty 1.0.2

#### Changes

1. Clean up floating-point `blas`, `gemm`, and `im2col`, using C++ and namespaces instead of pure C.

#### Contribution:
Nan Zhou
Manager: Jenna Wu

### 2. Dynasty 1.0.1

#### Changes

1. Install headers including "JsonIO.h", "ONNXIO.h", and "UnixFileManager.h".

#### Contribution:
Nan Zhou
Manager: Jenna Wu

### 1. Dynasty 1.0.0

#### Changes

1. Selected the design patterns and wrote the code from scratch;
2. Established the testing framework;
3. Built Docker images for both CPU and GPU;
4.
Finished parts of the operations in CPUInferencer, CUDAInferencer, and SNPEInferencer;

#### Contribution:
Design Pattern: Yao Zou & Nan Zhou
CPUInferencer: Yunhan Ma & Nan Zhou
CUDAInferencer: Nan Zhou
SNPEInferencer: Yao Zou
Test & CI & Docker: Nan Zhou
CMake: Nan Zhou
Manager: Jenna Wu

## Latest Supported Operations

|    | Operations            | CPU | CUDA              | CUDNN | MKL |
|----|-----------------------|-----|-------------------|-------|-----|
| 1  | ConvNode              | Y   | Y (square kernel) | Y     | Y   |
| 2  | BNNode                | Y   | Y                 | Y     | Y   |
| 3  | LeakyReluNode         | Y   | Y                 | Y     | Y   |
| 4  | GemmNode              | Y   | Y                 | Y     | Y   |
| 5  | EmptyNode             | N   | N                 | N     | N   |
| 6  | InputNode             | Y   | Y                 | Y     | Y   |
| 7  | ReluNode              | Y   | Y                 | Y     | Y   |
| 8  | MaxPoolNode           | Y   | Y                 | Y     | Y   |
| 9  | AveragePoolNode       | Y   | Y                 | Y     | Y   |
| 10 | FlattenNode           | Y   | Y                 | Y     | Y   |
| 11 | AddNode               | Y   | Y                 | Y     | Y   |
| 12 | MulNode               | Y   | N                 | N     | Y   |
| 13 | ConcatNode            | Y   | Y                 | Y     | Y   |
| 14 | ClipNode              | Y   | Y                 | Y     | Y   |
| 15 | SliceNode             | Y   | N                 | N     | Y   |
| 16 | SliceHeaderNode       | N   | N                 | N     | N   |
| 17 | SliceTailNode         | N   | N                 | N     | N   |
| 18 | BeethovenNode         | N   | N                 | N     | N   |
| 19 | PadNode               | Y   | Y                 | Y     | Y   |
| 20 | ConvTransposeNode     | N   | N                 | N     | N   |
| 21 | UpsampleNode          | Y   | Y (scale == 2)    | Y     | Y   |
| 22 | TanhNode              | Y   | Y                 | Y     | Y   |
| 23 | SigmoidNode           | Y   | Y                 | Y     | Y   |
| 24 | ReshapeNode           | Y   | N                 | N     | Y   |
| 25 | PReluNode             | Y   | Y                 | Y     | Y   |
| 26 | GlobalAveragePoolNode | Y   | Y                 | Y     | Y   |
| 27 | GlobalMaxPoolNode     | Y   | Y                 | Y     | Y   |
| 28 | SoftmaxNode           | Y   | Y                 | Y     | Y   |
| 29 | FloorNode             | Y   | N                 | N     | Y   |
| 30 | DropoutNode           | Y   | N                 | N     | Y   |
| 31 | MysteryNode           | N   | N                 | N     | N   |
| 32 | ConstantNode          | N   | N                 | N     | N   |
| 33 | BitShiftNode          | N   | N                 | N     | N   |
| 34 | CastNode              | N   | N                 | N     | N   |
| 35 | DepthToSpaceNode      | Y   | N                 | N     | Y   |
| 36 | DivNode               | Y   | N                 | N     | Y   |
| 37 | EluNode               | Y   | N                 | N     | Y   |
| 38 | ExpNode               | Y   | N                 | N     | Y   |
| 39 | ExpandNode            | Y   | N                 | N     | Y   |
| 40 | GatherNode            | Y   | N                 | N     | Y   |
| 41 | GRUNode               | Y   | N                 | N     | Y   |
| 42 | LpNormalizationNode   | Y   | N                 | N     | Y   |
| 43 | LRNNode               | Y   | N                 | N     | Y   |
| 44 | LSTMNode              | Y   | N                 | N     | Y   |
| 45 | MatMulNode            | Y   | N                 | N     | Y   |
| 46 | MaxRoiPoolNode        | Y   | N                 | N     | Y   |
| 47 | MaxUnpoolNode         | Y   | N                 | N     | Y   |
| 48 | MeanNode              | Y   | N                 | N     | Y   |
| 49 | MinNode               | Y   | N                 | N     | Y   |
| 50 | ModNode               | N   | N                 | N     | N   |
| 51 | MultinomialNode       | Y   | N                 | N     | Y   |
| 52 | NegNode               | Y   | N                 | N     | Y   |
| 53 | NonMaxSuppressionNode | N   | N                 | N     | N   |
| 54 | NonZeroNode           | Y   | N                 | N     | Y   |
| 55 | NotNode               | Y   | N                 | N     | Y   |
| 56 | OneHotNode            | Y   | N                 | N     | Y   |
| 57 | OrNode                | Y   | N                 | N     | Y   |
| 58 | RandomUniformLikeNode | Y   | N                 | N     | Y   |
| 59 | SqueezeNode           | N   | N                 | N     | N   |
| 60 | SubNode               | Y   | N                 | N     | Y   |
| 61 | TransposeNode         | N   | N                 | N     | N   |
| 62 | UnsqueezeNode         | Y   | N                 | N     | Y   |
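The Mystery Node mechanism described in the "Mystery Node" section can be sketched in C. Only the function name `mystery_op0` comes from the text; the parameter list below, the type names `"scale2"`/`"shift1"`, and the (k, b) pairs are illustrative assumptions for this sketch — the real prototype lives in mystery.h in the [include](include/) folder.

```c
#include <string.h>

/* Hypothetical implementation of one mystery op (opid 0).
 * The real signature is declared in include/mystery.h; this one is an
 * assumption made for illustration only. It shows the y = k*x + b example
 * from the text, where the node's 'type' attribute selects the (k, b) pair. */
void mystery_op0(const char *type, const float *input, float *output, int count)
{
    float k = 1.0f, b = 0.0f;                 /* default (k, b) pair */
    if (strcmp(type, "scale2") == 0) {        /* hypothetical type name */
        k = 2.0f; b = 0.0f;
    } else if (strcmp(type, "shift1") == 0) { /* hypothetical type name */
        k = 1.0f; b = 1.0f;
    }

    for (int i = 0; i < count; ++i)
        output[i] = k * input[i] + b;
}
```

An implementation along these lines would be compiled into the mystery library (see the `# build mystery lib` section in CMakeLists.txt) so that the linker can resolve `mystery_op0` when linking against the Dynasty CPU inference library.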