ORT_CPP add CUDA FP16 inference (#4320)

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
DennisJ 2023-08-12 23:27:52 +08:00 committed by GitHub
parent 02d4f5200d
commit 1c753cbce6
5 changed files with 57 additions and 24 deletions

@@ -6,8 +6,7 @@ This example demonstrates how to perform inference using YOLOv8 in C++ with ONNX
 - Friendly for deployment in the industrial sector.
 - Faster than OpenCV's DNN inference on both CPU and GPU.
-- Supports CUDA acceleration.
-- Easy to add FP16 inference (using template functions).
+- Supports FP32 and FP16 CUDA acceleration.

 ## Exporting YOLOv8 Models
@@ -47,13 +46,12 @@ Note: The dependency on C++17 is due to the usage of the C++17 filesystem featur
 DCSP_INIT_PARAM params{ model_path, YOLO_ORIGIN_V8, {imgsz_w, imgsz_h}, 0.1, 0.5, false};
 // GPU inference
 DCSP_INIT_PARAM params{ model_path, YOLO_ORIGIN_V8, {imgsz_w, imgsz_h}, 0.1, 0.5, true};
 // Load your image
 cv::Mat img = cv::imread(img_path);
 // Init Inference Session
-char* ret = p1->CreateSession(params);
+char* ret = yoloDetector->CreateSession(params);
-ret = p->RunSession(img, res);
+ret = yoloDetector->RunSession(img, res);
 ```
 This repository should also work for YOLOv5, which needs a permute operator for the output of the YOLOv5 model, but this has not been implemented yet.
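Taken together, the updated README snippet maps to roughly the following end-to-end usage. This is a minimal sketch rather than code from the commit: it assumes the example's `DCSP_CORE` class and a `DCSP_RESULT` struct with `classId`/`confidence`/`box` fields declared in the example's `inference.h`; the model path, image path, input size, and thresholds are placeholders. How the FP16 path is selected is handled by the example's headers, which this hunk does not show.

```c++
#include <iostream>
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>

#include "inference.h"  // assumed: declares DCSP_CORE, DCSP_INIT_PARAM, DCSP_RESULT, YOLO_ORIGIN_V8

int main() {
    // Params as in the README snippet: model path, model type, input size,
    // rect-confidence threshold, IoU threshold, CUDA on/off.
    std::string model_path = "yolov8n.onnx";  // placeholder path
    DCSP_INIT_PARAM params{ model_path, YOLO_ORIGIN_V8, {640, 640}, 0.1, 0.5, true };  // true = GPU inference

    // Init inference session.
    DCSP_CORE* yoloDetector = new DCSP_CORE;
    char* ret = yoloDetector->CreateSession(params);
    std::cout << ret << std::endl;  // status string returned by the example API

    // Load an image and run a single inference pass.
    cv::Mat img = cv::imread("bus.jpg");  // placeholder image
    std::vector<DCSP_RESULT> res;
    ret = yoloDetector->RunSession(img, res);
    std::cout << ret << std::endl;

    // Print each detection (assumed DCSP_RESULT fields).
    for (const DCSP_RESULT& d : res) {
        std::cout << "class " << d.classId
                  << " conf " << d.confidence
                  << " box [" << d.box.x << ", " << d.box.y << ", "
                  << d.box.width << ", " << d.box.height << "]" << std::endl;
    }

    delete yoloDetector;
    return 0;
}
```

Toggling the last field of `DCSP_INIT_PARAM` between `false` and `true` switches between CPU and CUDA execution, mirroring the CPU/GPU lines shown in the diff above.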