From 266861e880f936802c862ee096c0423d6e84865f Mon Sep 17 00:00:00 2001 From: Lakshantha Dissanayake Date: Mon, 1 Jul 2024 02:44:54 -0700 Subject: [PATCH] Update NVIDIA Jetson DeepStream Guide with YOLOv8 and Jetson Orin Support (#14059) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> --- docs/en/guides/deepstream-nvidia-jetson.md | 305 +++++++++++++++++ docs/en/guides/index.md | 3 +- docs/en/yolov5/index.md | 1 - .../tutorials/running_on_jetson_nano.md | 319 ------------------ mkdocs.yml | 11 +- 5 files changed, 313 insertions(+), 326 deletions(-) create mode 100644 docs/en/guides/deepstream-nvidia-jetson.md delete mode 100644 docs/en/yolov5/tutorials/running_on_jetson_nano.md diff --git a/docs/en/guides/deepstream-nvidia-jetson.md b/docs/en/guides/deepstream-nvidia-jetson.md new file mode 100644 index 00000000..4be2d080 --- /dev/null +++ b/docs/en/guides/deepstream-nvidia-jetson.md @@ -0,0 +1,305 @@ +--- +comments: true +description: Learn how to deploy Ultralytics YOLOv8 on NVIDIA Jetson devices using TensorRT and DeepStream SDK. Explore performance benchmarks and maximize AI capabilities. +keywords: Ultralytics, YOLOv8, NVIDIA Jetson, JetPack, AI deployment, embedded systems, deep learning, TensorRT, DeepStream SDK, computer vision +--- + +# Ultralytics YOLOv8 on NVIDIA Jetson using DeepStream SDK and TensorRT + +This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLOv8 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices using DeepStream SDK and TensorRT. Here we use TensorRT to maximize the inference performance on the Jetson platform. + +DeepStream on NVIDIA Jetson + +!!! 
Note + + This guide has been tested on two devices: the [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), which is based on NVIDIA Jetson Orin NX 16GB and runs JetPack [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513), and the [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html), which is based on NVIDIA Jetson Nano 4GB and runs JetPack [JP4.6.4](https://developer.nvidia.com/jetpack-sdk-464). It is expected to work across the entire NVIDIA Jetson hardware lineup, including both the latest and legacy devices. + +## What is NVIDIA DeepStream? + +[NVIDIA's DeepStream SDK](https://developer.nvidia.com/deepstream-sdk) is a complete streaming analytics toolkit based on GStreamer for AI-based multi-sensor processing, video, audio, and image understanding. It's ideal for vision AI developers, software partners, startups, and OEMs building IVA (Intelligent Video Analytics) apps and services. You can now create stream-processing pipelines that incorporate neural networks and other complex processing tasks like tracking, video encoding/decoding, and video rendering. These pipelines enable real-time analytics on video, image, and sensor data. DeepStream's multi-platform support gives you a faster, easier way to develop vision AI applications and services on-premise, at the edge, and in the cloud. 
+ +## Prerequisites + +Before you start to follow this guide: + +- Visit our documentation, [Quick Start Guide: NVIDIA Jetson with Ultralytics YOLOv8](nvidia-jetson.md) to set up your NVIDIA Jetson device with Ultralytics YOLOv8 +- Install [DeepStream SDK](https://developer.nvidia.com/deepstream-getting-started) according to the JetPack version + + - For JetPack 4.6.4, install [DeepStream 6.0.1](https://docs.nvidia.com/metropolis/deepstream/6.0.1/dev-guide/text/DS_Quickstart.html) + - For JetPack 5.1.3, install [DeepStream 6.3](https://docs.nvidia.com/metropolis/deepstream/6.3/dev-guide/text/DS_Quickstart.html) + +!!! Tip + + In this guide we have used the Debian package method of installing DeepStream SDK to the Jetson device. You can also visit the [DeepStream SDK on Jetson (Archived)](https://developer.nvidia.com/embedded/deepstream-on-jetson-downloads-archived) to access legacy versions of DeepStream. + +## DeepStream Configuration for YOLOv8 + +Here we are using [marcoslucianops/DeepStream-Yolo](https://github.com/marcoslucianops/DeepStream-Yolo) GitHub repository which includes NVIDIA DeepStream SDK support for YOLO models. We appreciate the efforts of marcoslucianops for his contributions! + +1. Install dependencies + + ```bash + pip install cmake + pip install onnxsim + ``` + +2. Clone the following repository + + ```bash + git clone https://github.com/marcoslucianops/DeepStream-Yolo + cd DeepStream-Yolo + ``` + +3. Download Ultralytics YOLOv8 detection model (.pt) of your choice from [YOLOv8 releases](https://github.com/ultralytics/assets/releases). Here we use [yolov8s.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt). + + ```bash + wget https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt + ``` + + !!! Note + + You can also use a [custom trained YOLOv8 model](https://docs.ultralytics.com/modes/train/). + +4. Convert model to ONNX + + ```bash + python3 utils/export_yoloV8.py -w yolov8s.pt + ``` + + !!! 
Note "Pass the below arguments to the above command" + + For DeepStream 6.0.1, use opset 12 or lower. The default opset is 16. + + ```bash + --opset 12 + ``` + + To change the inference size (default: 640) + + ```bash + -s SIZE + --size SIZE + -s HEIGHT WIDTH + --size HEIGHT WIDTH + ``` + + Example for 1280: + + ```bash + -s 1280 + or + -s 1280 1280 + ``` + + To simplify the ONNX model (DeepStream >= 6.0) + + ```bash + --simplify + ``` + + To use dynamic batch-size (DeepStream >= 6.1) + + ```bash + --dynamic + ``` + + To use static batch-size (example for batch-size = 4) + + ```bash + --batch 4 + ``` + +5. Set the CUDA version according to the JetPack version installed + + For JetPack 4.6.4: + + ```bash + export CUDA_VER=10.2 + ``` + + For JetPack 5.1.3: + + ```bash + export CUDA_VER=11.4 + ``` + +6. Compile the library + + ```bash + make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo + ``` + +7. Edit the `config_infer_primary_yoloV8.txt` file according to your model (for YOLOv8s with 80 classes) + + ```bash + [property] + ... + onnx-file=yolov8s.onnx + ... + num-detected-classes=80 + ... + ``` + +8. Edit the `deepstream_app_config` file + + ```bash + ... + [primary-gie] + ... + config-file=config_infer_primary_yoloV8.txt + ``` + +9. You can also change the video source in `deepstream_app_config` file. Here a default video file is loaded + + ```bash + ... + [source0] + ... + uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 + ``` + +### Run Inference + +```bash +deepstream-app -c deepstream_app_config.txt +``` + +!!! Note + + It will take a long time to generate the TensorRT engine file before starting the inference. So please be patient. + +
YOLOv8 with deepstream
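
Step 5 above picks the CUDA version by hand before compiling. As a minimal sketch, assuming only the two JetPack releases this guide covers, that mapping could be wrapped in a small shell function. `cuda_ver_for_jetpack` is a hypothetical helper name used for illustration. It is not part of DeepStream-Yolo or the DeepStream SDK.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of DeepStream-Yolo): map a JetPack release
# to the CUDA_VER value this guide uses for it.
cuda_ver_for_jetpack() {
    case "$1" in
        4.6.4) echo "10.2" ;;
        5.1.3) echo "11.4" ;;
        *)
            # Any other release is outside what this guide covers.
            echo "unsupported JetPack version: $1" >&2
            return 1
            ;;
    esac
}

# Example: export the value before compiling the library in step 6.
CUDA_VER="$(cuda_ver_for_jetpack 5.1.3)"
export CUDA_VER
echo "CUDA_VER=${CUDA_VER}"
```

The function fails for any release not listed here, so a build script that calls it stops early rather than compiling against a guessed CUDA version.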
+ +!!! Tip + + If you want to convert the model to FP16 precision, simply set `model-engine-file=model_b1_gpu0_fp16.engine` and `network-mode=2` inside `config_infer_primary_yoloV8.txt` + +## INT8 Calibration + +If you want to use INT8 precision for inference, follow the steps below: + +1. Set the `OPENCV` environment variable + + ```bash + export OPENCV=1 + ``` + +2. Compile the library + + ```bash + make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo + ``` + +3. For the COCO dataset, download [val2017](http://images.cocodataset.org/zips/val2017.zip), extract it, and move it to the `DeepStream-Yolo` folder + +4. Make a new directory for calibration images + + ```bash + mkdir calibration + ``` + +5. Run the following to select 1000 random images from the COCO dataset for calibration + + ```bash + for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \ + cp ${jpg} calibration/; \ + done + ``` + + !!! Note + + NVIDIA recommends at least 500 images to get good accuracy. In this example, 1000 images are chosen to get better accuracy (more images = more accuracy). You can change the number of images via **head -1000**. For example, for 2000 images, use **head -2000**. This process can take a long time. + +6. Create the `calibration.txt` file with all selected images + + ```bash + realpath calibration/*jpg > calibration.txt + ``` + +7. Set environment variables + + ```bash + export INT8_CALIB_IMG_PATH=calibration.txt + export INT8_CALIB_BATCH_SIZE=1 + ``` + + !!! Note + + Higher INT8_CALIB_BATCH_SIZE values will result in more accuracy and faster calibration speed. Set it according to your GPU memory. + +8. Update the `config_infer_primary_yoloV8.txt` file + + From + + ```bash + ... + model-engine-file=model_b1_gpu0_fp32.engine + #int8-calib-file=calib.table + ... + network-mode=0 + ... + ``` + + To + + ```bash + ... + model-engine-file=model_b1_gpu0_int8.engine + int8-calib-file=calib.table + ... + network-mode=1 + ... 
+ ``` + +### Run Inference + +```bash +deepstream-app -c deepstream_app_config.txt +``` + +## MultiStream Setup + +To set up multiple streams under a single DeepStream application, make the following changes to the `deepstream_app_config.txt` file: + +1. Change the rows and columns to build a grid display according to the number of streams you want to have. For example, for 4 streams, set 2 rows and 2 columns. + + ```bash + [tiled-display] + rows=2 + columns=2 + ``` + +2. Set `num-sources=4` and add the `uri` of each of the 4 streams + + ```bash + [source0] + enable=1 + type=3 + uri= + uri= + uri= + uri= + num-sources=4 + ``` + +### Run Inference + +```bash +deepstream-app -c deepstream_app_config.txt +``` +
Multistream setup
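
The grid sizing in step 1 of the MultiStream setup can be sketched as a small shell function. This is a minimal sketch under an assumption the guide does not state: that a near-square grid is wanted for any stream count. `tiled_display_for` is a hypothetical helper name. It only prints a `[tiled-display]` section and does not modify `deepstream_app_config.txt`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a [tiled-display] section sized for N streams.
# Assumption: a near-square grid is preferred; this is a layout choice,
# not a DeepStream requirement.
tiled_display_for() {
    n=$1
    cols=1
    # Find the smallest column count whose square covers n streams.
    while [ $((cols * cols)) -lt "$n" ]; do
        cols=$((cols + 1))
    done
    # Rows needed to hold n tiles in that many columns (ceiling division).
    rows=$(( (n + cols - 1) / cols ))
    printf '[tiled-display]\nrows=%d\ncolumns=%d\n' "$rows" "$cols"
}

# Example: the 4-stream case from the guide yields a 2x2 grid.
tiled_display_for 4
```

For 4 streams this reproduces the `rows=2` and `columns=2` values shown in the guide. Other counts, such as 6 streams in a 2x3 grid, follow from the same near-square assumption and should be checked against the display you actually want.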
+ +## Benchmark Results + +The following table summarizes how YOLOv8s models perform at different TensorRT precision levels with an input size of 640x640 on NVIDIA Jetson Orin NX 16GB. + +| Model Name | Precision | Inference Time (ms/im) | FPS | +| ---------- | --------- | ---------------------- | --- | +| YOLOv8s | FP32 | 15.63 | 64 | +| | FP16 | 7.94 | 126 | +| | INT8 | 5.53 | 181 | + +### Acknowledgements + +This guide was initially created by our friends at Seeed Studio, Lakshantha and Elaine. diff --git a/docs/en/guides/index.md b/docs/en/guides/index.md index 613f3c0a..6f3fe9a6 100644 --- a/docs/en/guides/index.md +++ b/docs/en/guides/index.md @@ -35,7 +35,8 @@ Here's a compilation of in-depth guides to help you master different aspects of - [Conda Quickstart](conda-quickstart.md) 🚀 NEW: Step-by-step guide to setting up a [Conda](https://anaconda.org/conda-forge/ultralytics) environment for Ultralytics. Learn how to install and start using the Ultralytics package efficiently with Conda. - [Docker Quickstart](docker-quickstart.md) 🚀 NEW: Complete guide to setting up and using Ultralytics YOLO models with [Docker](https://hub.docker.com/r/ultralytics/ultralytics). Learn how to install Docker, manage GPU support, and run YOLO models in isolated containers for consistent development and deployment. - [Raspberry Pi](raspberry-pi.md) 🚀 NEW: Quickstart tutorial to run YOLO models to the latest Raspberry Pi hardware. -- [NVIDIA-Jetson](nvidia-jetson.md) 🚀 NEW: Quickstart guide for deploying YOLO models on NVIDIA Jetson devices. +- [NVIDIA Jetson](nvidia-jetson.md) 🚀 NEW: Quickstart guide for deploying YOLO models on NVIDIA Jetson devices. +- [DeepStream on NVIDIA Jetson](deepstream-nvidia-jetson.md) 🚀 NEW: Quickstart guide for deploying YOLO models on NVIDIA Jetson devices using DeepStream and TensorRT. 
- [Triton Inference Server Integration](triton-inference-server.md) 🚀 NEW: Dive into the integration of Ultralytics YOLOv8 with NVIDIA's Triton Inference Server for scalable and efficient deep learning inference deployments. - [YOLO Thread-Safe Inference](yolo-thread-safe-inference.md) 🚀 NEW: Guidelines for performing inference with YOLO models in a thread-safe manner. Learn the importance of thread safety and best practices to prevent race conditions and ensure consistent predictions. - [Isolating Segmentation Objects](isolating-segmentation-objects.md) 🚀 NEW: Step-by-step recipe and explanation on how to extract and/or isolate objects from images using Ultralytics Segmentation. diff --git a/docs/en/yolov5/index.md b/docs/en/yolov5/index.md index 45f3e6a1..593d13b6 100644 --- a/docs/en/yolov5/index.md +++ b/docs/en/yolov5/index.md @@ -39,7 +39,6 @@ Here's a compilation of comprehensive tutorials that will guide you through diff - [Multi-GPU Training](tutorials/multi_gpu_training.md): Understand how to leverage multiple GPUs to expedite your training. - [PyTorch Hub](tutorials/pytorch_hub_model_loading.md) 🌟 NEW: Learn to load pre-trained models via PyTorch Hub. - [TFLite, ONNX, CoreML, TensorRT Export](tutorials/model_export.md) 🚀: Understand how to export your model to different formats. -- [NVIDIA Jetson platform Deployment](tutorials/running_on_jetson_nano.md) 🌟 NEW: Learn how to deploy your YOLOv5 model on NVIDIA Jetson platform. - [Test-Time Augmentation (TTA)](tutorials/test_time_augmentation.md): Explore how to use TTA to improve your model's prediction accuracy. - [Model Ensembling](tutorials/model_ensembling.md): Learn the strategy of combining multiple models for improved performance. - [Model Pruning/Sparsity](tutorials/model_pruning_and_sparsity.md): Understand pruning and sparsity concepts, and how to create a more efficient model. 
diff --git a/docs/en/yolov5/tutorials/running_on_jetson_nano.md b/docs/en/yolov5/tutorials/running_on_jetson_nano.md deleted file mode 100644 index cb3dd080..00000000 --- a/docs/en/yolov5/tutorials/running_on_jetson_nano.md +++ /dev/null @@ -1,319 +0,0 @@ ---- -comments: true -description: Learn how to deploy models on NVIDIA Jetson using TensorRT and DeepStream SDK. Follow our step-by-step guide for optimized AI inference. -keywords: NVIDIA Jetson, TensorRT, DeepStream SDK, AI deployment, Jetson Nano, Jetson Xavier NX, YOLOv5, AI inference, Ultralytics ---- - -# Deploy on NVIDIA Jetson using TensorRT and DeepStream SDK - -📚 This guide explains how to deploy a trained model into NVIDIA Jetson Platform and perform inference using TensorRT and DeepStream SDK. Here we use TensorRT to maximize the inference performance on the Jetson platform. - -## Hardware Verification - -We have tested and verified this guide on the following Jetson devices - -- [Seeed reComputer J1010 built with Jetson Nano module](https://www.seeedstudio.com/Jetson-10-1-A0-p-5336.html) -- [Seeed reComputer J2021 built with Jetson Xavier NX module](https://www.seeedstudio.com/reComputer-J2021-p-5438.html) - -## Before You Start - -Make sure you have properly installed **JetPack SDK** with all the **SDK Components** and **DeepStream SDK** on the Jetson device as this includes CUDA, TensorRT and DeepStream SDK which are needed for this guide. - -JetPack SDK provides a full development environment for hardware-accelerated AI-at-the-edge development. All Jetson modules and developer kits are supported by JetPack SDK. - -There are two major installation methods including, - -1. SD Card Image Method -2. NVIDIA SDK Manager Method - -You can find a very detailed installation guide from NVIDIA [official website](https://developer.nvidia.com/jetpack-sdk-461). 
You can also find guides corresponding to the above-mentioned [reComputer J1010](https://wiki.seeedstudio.com/reComputer_J1010_J101_Flash_Jetpack) and [reComputer J2021](https://wiki.seeedstudio.com/reComputer_J2021_J202_Flash_Jetpack). - -## Install Necessary Packages - -- **Step 1.** Access the terminal of Jetson device, install pip and upgrade it - -```sh -sudo apt update -sudo apt install -y python3-pip -pip3 install --upgrade pip -``` - -- **Step 2.** Clone the following repo - -```sh -git clone https://github.com/ultralytics/yolov5 -``` - -- **Step 3.** Open **requirements.txt** - -```sh -cd yolov5 -vi requirements.txt -``` - -- **Step 5.** Edit the following lines. Here you need to press **i** first to enter editing mode. Press **ESC**, then type **:wq** to save and quit - -```sh -# torch>=1.8.0 -# torchvision>=0.9.0 -``` - -**Note:** torch and torchvision are excluded for now because they will be installed later. - -- **Step 6.** install the below dependency - -```sh -sudo apt install -y libfreetype6-dev -``` - -- **Step 7.** Install the necessary packages - -```sh -pip3 install -r requirements.txt -``` - -## Install PyTorch and Torchvision - -We cannot install PyTorch and Torchvision from pip because they are not compatible to run on Jetson platform which is based on **ARM aarch64 architecture**. Therefore, we need to manually install pre-built PyTorch pip wheel and compile/ install Torchvision from source. - -Visit [this page](https://forums.developer.nvidia.com/t/pytorch-for-jetson) to access all the PyTorch and Torchvision links. - -Here are some of the versions supported by JetPack 4.6 and above. 
- -**PyTorch v1.10.0** - -Supported by JetPack 4.4 (L4T R32.4.3) / JetPack 4.4.1 (L4T R32.4.4) / JetPack 4.5 (L4T R32.5.0) / JetPack 4.5.1 (L4T R32.5.1) / JetPack 4.6 (L4T R32.6.1) with Python 3.6 - -- **file_name:** torch-1.10.0-cp36-cp36m-linux_aarch64.whl -- **URL:** [https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl](https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl) - -**PyTorch v1.12.0** - -Supported by JetPack 5.0 (L4T R34.1.0) / JetPack 5.0.1 (L4T R34.1.1) / JetPack 5.0.2 (L4T R35.1.0) with Python 3.8 - -- **file_name:** torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl -- **URL:** [https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl](https://developer.download.nvidia.com/compute/redist/jp/v50/pytorch/torch-1.12.0a0+2c916ef.nv22.3-cp38-cp38-linux_aarch64.whl) - -- **Step 1.** Install torch according to your JetPack version in the following format - -```sh -wget -O -pip3 install -``` - -For example, here we are running **JP4.6.1**, and therefore we choose **PyTorch v1.10.0** - -```sh -cd ~ -sudo apt-get install -y libopenblas-base libopenmpi-dev -wget https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl -O torch-1.10.0-cp36-cp36m-linux_aarch64.whl -pip3 install torch-1.10.0-cp36-cp36m-linux_aarch64.whl -``` - -- **Step 2.** Install torchvision depending on the version of PyTorch that you have installed. 
For example, we chose **PyTorch v1.10.0**, which means, we need to choose **Torchvision v0.11.1** - -```sh -sudo apt install -y libjpeg-dev zlib1g-dev -git clone --branch v0.11.1 https://github.com/pytorch/vision torchvision -cd torchvision -sudo python3 setup.py install -``` - -Here a list of the corresponding torchvision version that you need to install according to the PyTorch version: - -- PyTorch v1.10 - torchvision v0.11.1 -- PyTorch v1.12 - torchvision v0.13.0 - -## DeepStream Configuration for YOLOv5 - -- **Step 1.** Clone the following repo - -```sh -cd ~ -git clone https://github.com/marcoslucianops/DeepStream-Yolo -``` - -- **Step 2.** Copy **gen_wts_yoloV5.py** from **DeepStream-Yolo/utils** into **yolov5** directory - -```sh -cp DeepStream-Yolo/utils/gen_wts_yoloV5.py yolov5 -``` - -- **Step 3.** Inside the yolov5 repo, download **pt file** from YOLOv5 releases (example for YOLOv5s 6.1) - -```sh -cd yolov5 -wget https://github.com/ultralytics/yolov5/releases/download/v6.1/yolov5s.pt -``` - -- **Step 4.** Generate the **cfg** and **wts** files - -```sh -python3 gen_wts_yoloV5.py -w yolov5s.pt -``` - -**Note**: To change the inference size (default: 640) - -```sh --s SIZE ---size SIZE --s HEIGHT WIDTH ---size HEIGHT WIDTH - -Example for 1280: - --s 1280 -or --s 1280 1280 -``` - -- **Step 5.** Copy the generated **cfg** and **wts** files into the **DeepStream-Yolo** folder - -```sh -cp yolov5s.cfg ~/DeepStream-Yolo -cp yolov5s.wts ~/DeepStream-Yolo -``` - -- **Step 6.** Open the **DeepStream-Yolo** folder and compile the library - -```sh -cd ~/DeepStream-Yolo -CUDA_VER=11.4 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.1 -CUDA_VER=10.2 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.0.1 / 6.0 -``` - -- **Step 7.** Edit the **config_infer_primary_yoloV5.txt** file according to your model - -```sh -[property] -... -custom-network-config=yolov5s.cfg -model-file=yolov5s.wts -... 
-``` - -- **Step 8.** Edit the **deepstream_app_config** file - -```sh -... -[primary-gie] -... -config-file=config_infer_primary_yoloV5.txt -``` - -- **Step 9.** Change the video source in **deepstream_app_config** file. Here a default video file is loaded as you can see below - -```sh -... -[source0] -... -uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4 -``` - -## Run the Inference - -```sh -deepstream-app -c deepstream_app_config.txt -``` - -
YOLOv5 with deepstream FP32
- -The above result is running on **Jetson Xavier NX** with **FP32** and **YOLOv5s 640x640**. We can see that the **FPS** is around **30**. - -## INT8 Calibration - -If you want to use INT8 precision for inference, you need to follow the steps below - -- **Step 1.** Install OpenCV - -```sh -sudo apt-get install libopencv-dev -``` - -- **Step 2.** Compile/recompile the **nvdsinfer_custom_impl_Yolo** library with OpenCV support - -```sh -cd ~/DeepStream-Yolo -CUDA_VER=11.4 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.1 -CUDA_VER=10.2 OPENCV=1 make -C nvdsinfer_custom_impl_Yolo # for DeepStream 6.0.1 / 6.0 -``` - -- **Step 3.** For COCO dataset, download the [val2017](https://drive.google.com/file/d/1gbvfn7mcsGDRZ_luJwtITL-ru2kK99aK/view?usp=sharing), extract, and move to **DeepStream-Yolo** folder - -- **Step 4.** Make a new directory for calibration images - -```sh -mkdir calibration -``` - -- **Step 5.** Run the following to select 1000 random images from COCO dataset to run calibration - -```sh -for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \ - cp ${jpg} calibration/; \ -done -``` - -**Note:** NVIDIA recommends at least 500 images to get a good accuracy. On this example, 1000 images are chosen to get better accuracy (more images = more accuracy). Higher INT8_CALIB_BATCH_SIZE values will result in more accuracy and faster calibration speed. Set it according to you GPU memory. You can set it from **head -1000**. For example, for 2000 images, **head -2000**. This process can take a long time. - -- **Step 6.** Create the **calibration.txt** file with all selected images - -```sh -realpath calibration/*jpg > calibration.txt -``` - -- **Step 7.** Set environment variables - -```sh -export INT8_CALIB_IMG_PATH=calibration.txt -export INT8_CALIB_BATCH_SIZE=1 -``` - -- **Step 8.** Update the **config_infer_primary_yoloV5.txt** file - -From - -```sh -... -model-engine-file=model_b1_gpu0_fp32.engine -#int8-calib-file=calib.table -... 
-network-mode=0 -... -``` - -To - -```sh -... -model-engine-file=model_b1_gpu0_int8.engine -int8-calib-file=calib.table -... -network-mode=1 -... -``` - -- **Step 9.** Run the inference - -```sh -deepstream-app -c deepstream_app_config.txt -``` - -
YOLOv5 with deepstream INT8
- -The above result is running on **Jetson Xavier NX** with **INT8** and **YOLOv5s 640x640**. We can see that the **FPS** is around **60**. - -## Benchmark results - -The following table summarizes how different models perform on **Jetson Xavier NX**. - -| Model Name | Precision | Inference Size | Inference Time (ms) | FPS | -| ---------- | --------- | -------------- | ------------------- | --- | -| YOLOv5s | FP32 | 320x320 | 16.66 | 60 | -| | FP32 | 640x640 | 33.33 | 30 | -| | INT8 | 640x640 | 16.66 | 60 | -| YOLOv5n | FP32 | 640x640 | 16.66 | 60 | - -### Additional - -This tutorial is written by our friends at seeed @lakshanthad and Elaine diff --git a/mkdocs.yml b/mkdocs.yml index c0e41d03..76cd8e78 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -326,6 +326,7 @@ nav: - Docker Quickstart: guides/docker-quickstart.md - Raspberry Pi: guides/raspberry-pi.md - NVIDIA Jetson: guides/nvidia-jetson.md + - DeepStream on NVIDIA Jetson: guides/deepstream-nvidia-jetson.md - Triton Inference Server: guides/triton-inference-server.md - Isolating Segmentation Objects: guides/isolating-segmentation-objects.md - Edge TPU on Raspberry Pi: guides/coral-edge-tpu-on-raspberry-pi.md @@ -357,7 +358,6 @@ nav: - Multi-GPU Training: yolov5/tutorials/multi_gpu_training.md - PyTorch Hub: yolov5/tutorials/pytorch_hub_model_loading.md - TFLite, ONNX, CoreML, TensorRT Export: yolov5/tutorials/model_export.md - - NVIDIA Jetson Nano Deployment: yolov5/tutorials/running_on_jetson_nano.md - Test-Time Augmentation (TTA): yolov5/tutorials/test_time_augmentation.md - Model Ensembling: yolov5/tutorials/model_ensembling.md - Pruning/Sparsity Tutorial: yolov5/tutorials/model_pruning_and_sparsity.md @@ -657,7 +657,7 @@ plugins: tutorials/hyperparameter-evolution.md: yolov5/tutorials/hyperparameter_evolution.md tutorials/model-ensembling.md: yolov5/tutorials/model_ensembling.md tutorials/multi-gpu-training.md: yolov5/tutorials/multi_gpu_training.md - tutorials/nvidia-jetson.md: 
yolov5/tutorials/running_on_jetson_nano.md + tutorials/nvidia-jetson.md: guides/nvidia-jetson.md tutorials/pruning-sparsity.md: yolov5/tutorials/model_pruning_and_sparsity.md tutorials/pytorch-hub.md: yolov5/tutorials/pytorch_hub_model_loading.md tutorials/roboflow.md: yolov5/tutorials/roboflow_datasets_integration.md @@ -676,7 +676,7 @@ plugins: yolov5/tta.md: yolov5/tutorials/test_time_augmentation.md yolov5/multi_gpu_training.md: yolov5/tutorials/multi_gpu_training.md yolov5/ensemble.md: yolov5/tutorials/model_ensembling.md - yolov5/jetson_nano.md: yolov5/tutorials/running_on_jetson_nano.md + yolov5/jetson_nano.md: guides/nvidia-jetson.md yolov5/transfer_learn_frozen.md: yolov5/tutorials/transfer_learning_with_frozen_layers.md yolov5/neural_magic.md: yolov5/tutorials/neural_magic_pruning_quantization.md yolov5/train_custom_data.md: yolov5/tutorials/train_custom_data.md @@ -691,7 +691,7 @@ plugins: yolov5/tutorials/multi_gpu_training_tutorial.md: yolov5/tutorials/multi_gpu_training.md yolov5/tutorials/yolov5_pytorch_hub_tutorial.md: yolov5/tutorials/pytorch_hub_model_loading.md yolov5/tutorials/model_export_tutorial.md: yolov5/tutorials/model_export.md - yolov5/tutorials/jetson_nano_tutorial.md: yolov5/tutorials/running_on_jetson_nano.md + yolov5/tutorials/jetson_nano_tutorial.md: guides/nvidia-jetson.md yolov5/tutorials/yolov5_model_ensembling_tutorial.md: yolov5/tutorials/model_ensembling.md yolov5/tutorials/roboflow_integration.md: yolov5/tutorials/roboflow_datasets_integration.md yolov5/tutorials/pruning_and_sparsity_tutorial.md: yolov5/tutorials/model_pruning_and_sparsity.md @@ -703,7 +703,8 @@ plugins: yolov5/tutorials/yolov5_train_custom_data.md: yolov5/tutorials/train_custom_data.md yolov5/tutorials/comet_integration_tutorial.md: yolov5/tutorials/comet_logging_integration.md yolov5/tutorials/yolov5_pruning_and_sparsity_tutorial.md: yolov5/tutorials/model_pruning_and_sparsity.md - yolov5/tutorials/yolov5_jetson_nano_tutorial.md: 
yolov5/tutorials/running_on_jetson_nano.md + yolov5/tutorials/yolov5_jetson_nano_tutorial.md: guides/nvidia-jetson.md + yolov5/tutorials/running_on_jetson_nano.md: guides/nvidia-jetson.md yolov5/tutorials/yolov5_roboflow_integration.md: yolov5/tutorials/roboflow_datasets_integration.md yolov5/tutorials/hyperparameter_evolution_tutorial.md: yolov5/tutorials/hyperparameter_evolution.md yolov5/tutorials/yolov5_hyperparameter_evolution_tutorial.md: yolov5/tutorials/hyperparameter_evolution.md