ORT_CPP add CUDA FP16 inference (#4320)

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
DennisJ 2023-08-12 23:27:52 +08:00 committed by GitHub
parent 02d4f5200d
commit 1c753cbce6
5 changed files with 57 additions and 24 deletions

@@ -6,8 +6,7 @@ This example demonstrates how to perform inference using YOLOv8 in C++ with ONNX
 - Friendly for deployment in the industrial sector.
 - Faster than OpenCV's DNN inference on both CPU and GPU.
-- Supports CUDA acceleration.
-- Easy to add FP16 inference (using template functions).
+- Supports FP32 and FP16 CUDA acceleration.

 ## Exporting YOLOv8 Models
@@ -47,13 +46,12 @@ Note: The dependency on C++17 is due to the usage of the C++17 filesystem featur
 DCSP_INIT_PARAM params{ model_path, YOLO_ORIGIN_V8, {imgsz_w, imgsz_h}, 0.1, 0.5, false};
 // GPU inference
 DCSP_INIT_PARAM params{ model_path, YOLO_ORIGIN_V8, {imgsz_w, imgsz_h}, 0.1, 0.5, true};
 // Load your image
 cv::Mat img = cv::imread(img_path);
 // Init Inference Session
-char* ret = p1->CreateSession(params);
+char* ret = yoloDetector->CreateSession(params);
-ret = p->RunSession(img, res);
+ret = yoloDetector->RunSession(img, res);
 ```
 This repository should also work for YOLOv5, which needs a permute operator for the output of the YOLOv5 model, but this has not been implemented yet.
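Taken together, the updated README snippet maps to roughly the following end-to-end usage. This is a minimal sketch rather than code from the commit: it assumes the example's `DCSP_CORE` class and a `DCSP_RESULT` struct with `classId`/`confidence`/`box` fields declared in the example's `inference.h`; the model path, image path, input size, and thresholds are placeholders. How the FP16 path is selected is handled by the example's headers, which this hunk does not show.

```c++
#include <iostream>
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>

#include "inference.h"  // assumed: declares DCSP_CORE, DCSP_INIT_PARAM, DCSP_RESULT, YOLO_ORIGIN_V8

int main() {
    // Params as in the README snippet: model path, model type, input size,
    // rect-confidence threshold, IoU threshold, CUDA on/off.
    std::string model_path = "yolov8n.onnx";  // placeholder path
    DCSP_INIT_PARAM params{ model_path, YOLO_ORIGIN_V8, {640, 640}, 0.1, 0.5, true };  // true = GPU inference

    // Init inference session.
    DCSP_CORE* yoloDetector = new DCSP_CORE;
    char* ret = yoloDetector->CreateSession(params);
    std::cout << ret << std::endl;  // status string returned by the example API

    // Load an image and run a single inference pass.
    cv::Mat img = cv::imread("bus.jpg");  // placeholder image
    std::vector<DCSP_RESULT> res;
    ret = yoloDetector->RunSession(img, res);
    std::cout << ret << std::endl;

    // Print each detection (assumed DCSP_RESULT fields).
    for (const DCSP_RESULT& d : res) {
        std::cout << "class " << d.classId
                  << " conf " << d.confidence
                  << " box [" << d.box.x << ", " << d.box.y << ", "
                  << d.box.width << ", " << d.box.height << "]" << std::endl;
    }

    delete yoloDetector;
    return 0;
}
```

Toggling the last field of `DCSP_INIT_PARAM` between `false` and `true` switches between CPU and CUDA execution, mirroring the CPU/GPU lines shown in the diff above.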