From 1d33db5fa9c34717d508e01c93dfec6e3563563b Mon Sep 17 00:00:00 2001
From: Lakshantha Dissanayake
Date: Mon, 30 Dec 2024 06:40:00 -0800
Subject: [PATCH] Update DeepStream Doc with YOLO11 and DeepStream 7.1 (#18443)

Co-authored-by: UltralyticsAssistant
Co-authored-by: Glenn Jocher
---
 docs/en/guides/deepstream-nvidia-jetson.md | 87 +++++++++++++++-------
 1 file changed, 60 insertions(+), 27 deletions(-)

diff --git a/docs/en/guides/deepstream-nvidia-jetson.md b/docs/en/guides/deepstream-nvidia-jetson.md
index 90c361cb..678d1b11 100644
--- a/docs/en/guides/deepstream-nvidia-jetson.md
+++ b/docs/en/guides/deepstream-nvidia-jetson.md
@@ -23,7 +23,8 @@ This comprehensive guide provides a detailed walkthrough for deploying Ultralyti
 
 !!! note
 
-    This guide has been tested with both [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html) which is based on NVIDIA Jetson Orin NX 16GB running JetPack release of [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513) and [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html) which is based on NVIDIA Jetson Nano 4GB running JetPack release of [JP4.6.4](https://developer.nvidia.com/jetpack-sdk-464). It is expected to work across all the NVIDIA Jetson hardware lineup including latest and legacy.
+    This guide has been tested with the [NVIDIA Jetson Orin Nano Super Developer Kit](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit) running the latest stable JetPack release of [JP6.1](https://developer.nvidia.com/embedded/jetpack-sdk-61),
+    [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), which is based on the NVIDIA Jetson Orin NX 16GB running JetPack release of [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513), and [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html), which is based on the NVIDIA Jetson Nano 4GB running JetPack release of [JP4.6.4](https://developer.nvidia.com/jetpack-sdk-464). It is expected to work across the entire NVIDIA Jetson hardware lineup, including the latest and legacy devices.
 
 ## What is NVIDIA DeepStream?
 
@@ -38,6 +39,7 @@ Before you start to follow this guide:
 
     - For JetPack 4.6.4, install [DeepStream 6.0.1](https://docs.nvidia.com/metropolis/deepstream/6.0.1/dev-guide/text/DS_Quickstart.html)
    - For JetPack 5.1.3, install [DeepStream 6.3](https://docs.nvidia.com/metropolis/deepstream/6.3/dev-guide/text/DS_Quickstart.html)
+    - For JetPack 6.1, install [DeepStream 7.1](https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_Installation.html)
 
 !!! tip
 
@@ -47,34 +49,48 @@
 
 Here we are using [marcoslucianops/DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) GitHub repository which includes NVIDIA DeepStream SDK support for YOLO models. We appreciate the efforts of marcoslucianops for his contributions!
 
-1. Install dependencies
+1. Install Ultralytics with the necessary dependencies
 
     ```bash
-    pip install cmake
-    pip install onnxsim
+    cd ~
+    pip install -U pip
+    git clone https://github.com/ultralytics/ultralytics
+    cd ultralytics
+    pip install -e ".[export]" onnxslim
     ```
 
-2. Clone the following repository
+2. Clone the DeepStream-Yolo repository
 
     ```bash
+    cd ~
     git clone https://github.com/marcoslucianops/DeepStream-Yolo
-    cd DeepStream-Yolo
     ```
 
-3. Download Ultralytics YOLO11 detection model (.pt) of your choice from [YOLO11 releases](https://github.com/ultralytics/assets/releases). Here we use [yolov8s.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt).
+3. Copy the `export_yoloV8.py` file from the `DeepStream-Yolo/utils` directory to the `ultralytics` folder
 
     ```bash
-    wget https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt
+    cp ~/DeepStream-Yolo/utils/export_yoloV8.py ~/ultralytics
+    cd ultralytics
+    ```
+
+    !!! note
+
+        `export_yoloV8.py` works for both YOLOv8 and YOLO11 models.
+
+4. Download an Ultralytics YOLO11 detection model (.pt) of your choice from [YOLO11 releases](https://github.com/ultralytics/assets/releases). Here we use [yolo11s.pt](https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11s.pt).
+
+    ```bash
+    wget https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11s.pt
     ```
 
     !!! note
 
        You can also use a [custom trained YOLO11 model](https://docs.ultralytics.com/modes/train/).
 
-4. Convert model to ONNX
+5. Convert the model to ONNX
 
     ```bash
-    python3 utils/export_yoloV8.py -w yolov8s.pt
+    python3 export_yoloV8.py -w yolo11s.pt
     ```
 
     !!! note "Pass the below arguments to the above command"
 
@@ -120,7 +136,14 @@ Here we are using [marcoslucianops/DeepStream-Yolo](https://github.com/marcosluc
         --batch 4
        ```
 
-5. Set the CUDA version according to the JetPack version installed
+6. Copy the generated `.onnx` model file and the `labels.txt` file to the `DeepStream-Yolo` folder
+
+    ```bash
+    cp yolo11s.pt.onnx labels.txt ~/DeepStream-Yolo
+    cd ~/DeepStream-Yolo
+    ```
+
+7. Set the CUDA version according to the installed JetPack version
 
     For JetPack 4.6.4:
 
     ```bash
     export CUDA_VER=10.2
     ```
 
@@ -134,24 +157,30 @@ Here we are using [marcoslucianops/DeepStream-Yolo](https://github.com/marcosluc
     export CUDA_VER=11.4
     ```
 
-6. Compile the library
+    For JetPack 6.1:
+
+    ```bash
+    export CUDA_VER=12.6
+    ```
+
+8. Compile the library
 
     ```bash
     make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
     ```
 
-7. Edit the `config_infer_primary_yoloV8.txt` file according to your model (for YOLOv8s with 80 classes)
+9. Edit the `config_infer_primary_yoloV8.txt` file according to your model (for YOLO11s with 80 classes)
 
     ```bash
     [property]
     ...
-    onnx-file=yolov8s.onnx
+    onnx-file=yolo11s.pt.onnx
     ...
     num-detected-classes=80
     ...
     ```
 
-8. Edit the `deepstream_app_config` file
+10. Edit the `deepstream_app_config` file
 
     ```bash
     ...
     [primary-gie]
     ...
@@ -160,7 +189,7 @@ Here we are using [marcoslucianops/DeepStream-Yolo](https://github.com/marcosluc
     config-file=config_infer_primary_yoloV8.txt
     ```
 
-9. You can also change the video source in `deepstream_app_config` file. Here a default video file is loaded
+11. You can also change the video source in the `deepstream_app_config` file. Here, a default video file is loaded
 
     ```bash
     ...
@@ -183,12 +212,16 @@ deepstream-app -c deepstream_app_config.txt
 
 !!! tip
 
-    If you want to convert the model to FP16 [precision](https://www.ultralytics.com/glossary/precision), simply set `model-engine-file=model_b1_gpu0_fp16.engine` and `network-mode=2` inside `config_infer_primary_yoloV8.txt`
+    If you want to convert the model to FP16 precision, simply set `model-engine-file=model_b1_gpu0_fp16.engine` and `network-mode=2` inside `config_infer_primary_yoloV8.txt`
 
 ## INT8 Calibration
 
 If you want to use INT8 precision for inference, you need to follow the steps below
 
+!!! note
+
+    Currently, INT8 does not work with TensorRT 10.x. This section of the guide has been tested with TensorRT 8.x, which is expected to work.
+
 1. Set `OPENCV` environment variable
 
     ```bash
@@ -303,13 +336,13 @@ deepstream-app -c deepstream_app_config.txt
 
 ## Benchmark Results
 
-The following table summarizes how YOLOv8s models perform at different TensorRT precision levels with an input size of 640x640 on NVIDIA Jetson Orin NX 16GB.
+The following table summarizes how YOLO11s models perform at different TensorRT precision levels with an input size of 640x640 on the NVIDIA Jetson Orin NX 16GB.
 
-| Model Name | Precision | Inference Time (ms/im) | FPS |
-| ---------- | --------- | ---------------------- | --- |
-| YOLOv8s    | FP32      | 15.63                  | 64  |
-|            | FP16      | 7.94                   | 126 |
-|            | INT8      | 5.53                   | 181 |
+| Model Name | Precision | Inference Time (ms/im) | FPS  |
+| ---------- | --------- | ---------------------- | ---- |
+| YOLO11s    | FP32      | 14.6                   | 68.5 |
+|            | FP16      | 7.94                   | 126  |
+|            | INT8      | 5.95                   | 168  |
 
 ### Acknowledgements
 
@@ -336,17 +369,17 @@ To convert a YOLO11 model to ONNX format for deployment with DeepStream, use the
 
 Here's an example command:
 
 ```bash
-python3 utils/export_yoloV8.py -w yolov8s.pt --opset 12 --simplify
+python3 utils/export_yoloV8.py -w yolo11s.pt --opset 12 --simplify
 ```
 
 For more details on model conversion, check out our [model export section](../modes/export.md).
 
 ### What are the performance benchmarks for YOLO on NVIDIA Jetson Orin NX?
 
-The performance of YOLO11 models on NVIDIA Jetson Orin NX 16GB varies based on TensorRT precision levels. For example, YOLOv8s models achieve:
+The performance of YOLO11 models on the NVIDIA Jetson Orin NX 16GB varies based on TensorRT precision levels. For example, YOLO11s models achieve:
 
-- **FP32 Precision**: 15.63 ms/im, 64 FPS
+- **FP32 Precision**: 14.6 ms/im, 68.5 FPS
 - **FP16 Precision**: 7.94 ms/im, 126 FPS
-- **INT8 Precision**: 5.53 ms/im, 181 FPS
+- **INT8 Precision**: 5.95 ms/im, 168 FPS
 
 These benchmarks underscore the efficiency and capability of using TensorRT-optimized YOLO11 models on NVIDIA Jetson hardware. For further details, see our [Benchmark Results](#benchmark-results) section.