## Dynasty
The release version of Dynasty. We provide code examples in the [example](example/) folder.

## How to use
See [CMakeLists](CMakeLists.txt), the [CPU inferencer code](example/cpu_example.cpp), and the [CUDA inferencer code](example/cuda_example.cpp) for detailed examples.

### Mystery Node
Mystery Node is a customized ONNX node with two attributes, in hierarchical order:

1. opid: int. This defines a kind of customized node.
2. type: string. This defines a type within a certain kind of customized node. It comes in handy if the node has different parameters. For example, one could define a node that performs y = kx + b and use a different type for each pair of (k, b).

In the [example](example/) folder, mystery.c implements 3 different mystery nodes, and op0 also makes use of the 'type' attribute of the mystery node.

The Dynasty CPU inference library has built-in support for mystery nodes, with the interface defined in mystery.h in the [include](include/) folder. If the user does not have any mystery nodes, they can use mystery.c from the [example](example/) folder as the default implementation of that interface; otherwise, the linker will complain that it cannot find mystery_op0 etc. If the user does have mystery nodes, they will need to update mystery.h/mystery.c accordingly.
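
For illustration only, the sketch below shows what one such implementation might look like for the y = kx + b example above. The function name mystery_op0 is borrowed from the linker error mentioned here, but its parameter list, the type strings, and the (k, b) pairs are all assumptions; the actual prototypes are whatever mystery.h declares.

```c
/* Hypothetical sketch only: the real prototype of mystery_op0 is whatever
 * include/mystery.h declares. The parameter list, the type strings, and the
 * (k, b) pairs below are illustrative assumptions. */
#include <string.h>

/* Mystery node with opid == 0: computes y = k*x + b elementwise, with the
 * (k, b) pair selected by the node's 'type' attribute. */
void mystery_op0(const char *type, const float *x, float *y, int len)
{
    float k = 1.0f, b = 0.0f;                                /* default pair */
    if (strcmp(type, "k2_b1") == 0)      { k = 2.0f; b = 1.0f; }
    else if (strcmp(type, "k3_b0") == 0) { k = 3.0f; b = 0.0f; }

    for (int i = 0; i < len; ++i)
        y[i] = k * x[i] + b;
}
```

The remaining opids would get their own functions in the same file; a user with their own mystery nodes fills in such functions and updates mystery.h/mystery.c accordingly, as described above.
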
If the user wants to provide Kneron with an ONNX model that contains mystery nodes, they will also need to provide a library (built from mystery.c) and a mystery.h to Kneron. See the "# build mystery lib" section in CMakeLists.txt.

## Contribution and Change Lists

### 20. Dynasty 3.3.1

1. Include ReduceSum in CPU, CUDA
2. Merge conv CUDNN implementation into Dynasty CUDA

### 19. Dynasty 3.3.0

1. Include submodule common_header_lib
2. Support Tensor as input for inferencer

### 18. Dynasty 3.2.1

#### Changes

1. Update CUDA Inferencer :: Concat Operator to speed it up

### 17. Dynasty 3.2.0

#### Changes

1. Update PianoInferencer interface: GetInputDimensions/GetOutputDimensions
2. Update Floating CUDA Inferencer/Operator: not every operator allocates space for its output tensor; for example, if a node has only one child, in most cases it uses shared space from InitialiedData rather than allocating its own space.

### 16. Dynasty 2.5.1

#### Changes

1. Fix Microsoft ONNX-Runtime inferencer header bugs;
2. Add Microsoft ONNX-Runtime inferencer examples;

### 15. Dynasty 2.5.0

#### Changes

1. Add Microsoft ONNX-Runtime inferencer;

#### Contribution:

Nan Zhou

Manager: Kidd Su

### 14. Dynasty 2.4.0

#### Changes

1. Add MKL-DNN inferencer;

#### Contribution:

Nan Zhou

Manager: Kidd Su

### 13. Dynasty 2.3.0

#### Changes

1. Add BIE support;
2. Combine CUDA and CUDNN;
3. Refactor Inferencer: use chain of responsibilities and template method pattern;

#### Contribution:

Bo Xie: BIE support;

Nan Zhou: others;

Manager: Kidd Su

### 12. Dynasty 2.2.0

#### Changes

1. Combine CUDA and CUDNN;
2. Refactor Inferencer: use chain of responsibilities and template method pattern;

#### Contribution:

Nan Zhou

Manager: Kidd Su

### 11. Dynasty 2.1.0

#### Changes

1. Add CUDA pad;
2. Add corresponding tests;

#### Contribution:

Shanshan Xiao

Manager: Kidd Su

### 10. Dynasty 2.0.1

#### Changes

1. Fix logger mutex;

#### Contribution:

Nan Zhou

Manager: Kidd Su

### 9. Dynasty 2.0.0

#### Changes

1. Add BIE support (change interfaces);
2. Add corresponding tests (correctness of CPU inference using BIE, memcheck of CPU inference using BIE);

#### Contribution:

Bo Xie, Nan Zhou

Manager: Kidd Su

### 8. Dynasty 1.2.1

#### Changes

1. Fix dummy CUDNNInferencer bugs on platforms without CUDNN;
2. Print types as well for unsupported operations;

#### Contribution:

Nan Zhou

Manager: Kidd Su

### 7. Dynasty 1.2.0

#### Features

1. Add CUDNNInferencer;

#### Contribution:

CUDNNInferencer: Nan Zhou

Manager: Kidd Su

### 6. Dynasty 1.1.0

#### Changes

1. Add all new operations from Renaissance to DynastyCpuVisitor.
2. Add tests to verify correctness of all new operations.
3. Add support for reading inputs not in bhwc format.

#### Contribution:

Ryan Han, Nan Zhou

Manager: Kidd Su

### 5. Dynasty 1.0.4

#### Changes

1. Refactor AttributeParser.
2. Fix CPUInferencer out-of-boundary bugs.

#### Contribution:

Nan Zhou, Ryan Han

Manager: Kidd Su

### 4. Dynasty 1.0.3

#### Changes

1. Add graceful checking on dimensions of input pixels.
2. Fix Flatten bugs in CPUInferencer.
3. Remove Graph optimizations in PianoInferencer;

#### Contribution:

Nan Zhou

Manager: Kidd Su

### 3. Dynasty 1.0.2

#### Changes

1. Clean floating `blas`, `gemm`, and `im2col`, using C++ and namespaces instead of pure C.

#### Contribution:

Nan Zhou

Manager: Jenna Wu

### 2. Dynasty 1.0.1

#### Changes

1. Install headers including "JsonIO.h", "ONNXIO.h", and "UnixFileManager.h".

#### Contribution:

Nan Zhou

Manager: Jenna Wu

### 1. Dynasty 1.0.0

#### Changes

1. Selected the design patterns and wrote code from scratch;
2. Established the testing framework;
3. Built Docker images for both CPU and GPU;
4. Finished parts of the operations in CPUInferencer, CUDAInferencer, and SNPEInferencer;

#### Contribution:

Design Pattern: Yao Zou & Nan Zhou

CPUInferencer: Yunhan Ma & Nan Zhou

CUDAInferencer: Nan Zhou

SNPEInferencer: Yao Zou

Test & CI & Docker: Nan Zhou

CMake: Nan Zhou

Manager: Jenna Wu

## Latest Supported Operations:

| #  | Operations            | CPU | CUDA              | CUDNN | MKL |
|----|-----------------------|-----|-------------------|-------|-----|
| 1  | ConvNode              | Y   | Y (square kernel) | Y     | Y   |
| 2  | BNNode                | Y   | Y                 | Y     | Y   |
| 3  | LeakyReluNode         | Y   | Y                 | Y     | Y   |
| 4  | GemmNode              | Y   | Y                 | Y     | Y   |
| 5  | EmptyNode             | N   | N                 | N     | N   |
| 6  | InputNode             | Y   | Y                 | Y     | Y   |
| 7  | ReluNode              | Y   | Y                 | Y     | Y   |
| 8  | MaxPoolNode           | Y   | Y                 | Y     | Y   |
| 9  | AveragePoolNode       | Y   | Y                 | Y     | Y   |
| 10 | FlattenNode           | Y   | Y                 | Y     | Y   |
| 11 | AddNode               | Y   | Y                 | Y     | Y   |
| 12 | MulNode               | Y   | N                 | N     | Y   |
| 13 | ConcatNode            | Y   | Y                 | Y     | Y   |
| 14 | ClipNode              | Y   | Y                 | Y     | Y   |
| 15 | SliceNode             | Y   | N                 | N     | Y   |
| 16 | SliceHeaderNode       | N   | N                 | N     | N   |
| 17 | SliceTailNode         | N   | N                 | N     | N   |
| 18 | BeethovenNode         | N   | N                 | N     | N   |
| 19 | PadNode               | Y   | Y                 | Y     | Y   |
| 20 | ConvTransposeNode     | N   | N                 | N     | N   |
| 21 | UpsampleNode          | Y   | Y (scale == 2)    | Y     | Y   |
| 22 | TanhNode              | Y   | Y                 | Y     | Y   |
| 23 | SigmoidNode           | Y   | Y                 | Y     | Y   |
| 24 | ReshapeNode           | Y   | N                 | N     | Y   |
| 25 | PReluNode             | Y   | Y                 | Y     | Y   |
| 26 | GlobalAveragePoolNode | Y   | Y                 | Y     | Y   |
| 27 | GlobalMaxPoolNode     | Y   | Y                 | Y     | Y   |
| 28 | SoftmaxNode           | Y   | Y                 | Y     | Y   |
| 29 | FloorNode             | Y   | N                 | N     | Y   |
| 30 | DropoutNode           | Y   | N                 | N     | Y   |
| 31 | MysteryNode           | N   | N                 | N     | N   |
| 32 | ConstantNode          | N   | N                 | N     | N   |
| 33 | BitShiftNode          | N   | N                 | N     | N   |
| 34 | CastNode              | N   | N                 | N     | N   |
| 35 | DepthToSpaceNode      | Y   | N                 | N     | Y   |
| 36 | DivNode               | Y   | N                 | N     | Y   |
| 37 | EluNode               | Y   | N                 | N     | Y   |
| 38 | ExpNode               | Y   | N                 | N     | Y   |
| 39 | ExpandNode            | Y   | N                 | N     | Y   |
| 40 | GatherNode            | Y   | N                 | N     | Y   |
| 41 | GRUNode               | Y   | N                 | N     | Y   |
| 42 | LpNormalizationNode   | Y   | N                 | N     | Y   |
| 43 | LRNNode               | Y   | N                 | N     | Y   |
| 44 | LSTMNode              | Y   | N                 | N     | Y   |
| 45 | MatMulNode            | Y   | N                 | N     | Y   |
| 46 | MaxRoiPoolNode        | Y   | N                 | N     | Y   |
| 47 | MaxUnpoolNode         | Y   | N                 | N     | Y   |
| 48 | MeanNode              | Y   | N                 | N     | Y   |
| 49 | MinNode               | Y   | N                 | N     | Y   |
| 50 | ModNode               | N   | N                 | N     | N   |
| 51 | MultinomialNode       | Y   | N                 | N     | Y   |
| 52 | NegNode               | Y   | N                 | N     | Y   |
| 53 | NonMaxSuppressionNode | N   | N                 | N     | N   |
| 54 | NonZeroNode           | Y   | N                 | N     | Y   |
| 55 | NotNode               | Y   | N                 | N     | Y   |
| 56 | OneHotNode            | Y   | N                 | N     | Y   |
| 57 | OrNode                | Y   | N                 | N     | Y   |
| 58 | RandomUniformLikeNode | Y   | N                 | N     | Y   |
| 59 | SqueezeNode           | N   | N                 | N     | N   |
| 60 | SubNode               | Y   | N                 | N     | Y   |
| 61 | TransposeNode         | N   | N                 | N     | N   |
| 62 | UnsqueezeNode         | Y   | N                 | N     | Y   |