Update Jetson doc with NVIDIA Jetson Orin Nano Super Developer Kit (#18289)

Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Laughing-q <1185102784@qq.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>

Parent: cdb36b6b66
Commit: 0504d3a8e0
1 changed file with 214 additions and 83 deletions
---
comments: true
description: Learn to deploy Ultralytics YOLO11 on NVIDIA Jetson devices with our detailed guide. Explore performance benchmarks and maximize AI capabilities.
keywords: Ultralytics, YOLO11, NVIDIA Jetson, JetPack, AI deployment, performance benchmarks, embedded systems, deep learning, TensorRT, computer vision
benchmark_version: 8.3.51
---

# Quick Start Guide: NVIDIA Jetson with Ultralytics YOLO11

This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices. Additionally, it showcases performance benchmarks to demonstrate the capabilities of YOLO11 on these small yet powerful devices.

!!! tip "New product support"

    We have updated this guide with the latest [NVIDIA Jetson Orin Nano Super Developer Kit](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit), which delivers up to 67 TOPS of AI performance, a 1.7X improvement over its predecessor, to seamlessly run the most popular AI models.

<p align="center">
  <br>
  <iframe loading="lazy" width="720" height="405" src="https://www.youtube.com/embed/mUybgOlSxxA"

!!! note

    This guide has been tested with the [NVIDIA Jetson Orin Nano Super Developer Kit](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit) running the latest stable JetPack release of [JP6.1](https://developer.nvidia.com/embedded/jetpack-sdk-61), the [Seeed Studio reComputer J4012](https://www.seeedstudio.com/reComputer-J4012-p-5586.html), which is based on the NVIDIA Jetson Orin NX 16GB running JetPack releases [JP6.0](https://developer.nvidia.com/embedded/jetpack-sdk-60) and [JP5.1.3](https://developer.nvidia.com/embedded/jetpack-sdk-513), and the [Seeed Studio reComputer J1020 v2](https://www.seeedstudio.com/reComputer-J1020-v2-p-5498.html), which is based on the NVIDIA Jetson Nano 4GB running JetPack release [JP4.6.1](https://developer.nvidia.com/embedded/jetpack-sdk-461). It is expected to work across the entire NVIDIA Jetson hardware lineup, including the latest and legacy devices.

## What is NVIDIA Jetson?

[Jetson Orin](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/) is the latest iteration of the NVIDIA Jetson family, based on the NVIDIA Ampere architecture, which brings drastically improved AI performance compared to previous generations. The table below compares a few of the Jetson devices in the ecosystem.

|                   | Jetson AGX Orin 64GB | Jetson Orin NX 16GB | Jetson Orin Nano Super | Jetson AGX Xavier | Jetson Xavier NX | Jetson Nano |
| ----------------- | -------------------- | ------------------- | ---------------------- | ----------------- | ---------------- | ----------- |
| AI Performance    | 275 TOPS | 100 TOPS | 67 TOPS | 32 TOPS | 21 TOPS | 472 GFLOPS |
| GPU               | 2048-core NVIDIA Ampere architecture GPU with 64 Tensor Cores | 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores | 1024-core NVIDIA Ampere architecture GPU with 32 Tensor Cores | 512-core NVIDIA Volta architecture GPU with 64 Tensor Cores | 384-core NVIDIA Volta™ architecture GPU with 48 Tensor Cores | 128-core NVIDIA Maxwell™ architecture GPU |
| GPU Max Frequency | 1.3 GHz | 918 MHz | 1020 MHz | 1377 MHz | 1100 MHz | 921 MHz |
| CPU               | 12-core NVIDIA Arm® Cortex A78AE v8.2 64-bit CPU 3MB L2 + 6MB L3 | 8-core NVIDIA Arm® Cortex A78AE v8.2 64-bit CPU 2MB L2 + 4MB L3 | 6-core Arm® Cortex®-A78AE v8.2 64-bit CPU 1.5MB L2 + 4MB L3 | 8-core NVIDIA Carmel Arm®v8.2 64-bit CPU 8MB L2 + 4MB L3 | 6-core NVIDIA Carmel Arm®v8.2 64-bit CPU 6MB L2 + 4MB L3 | Quad-Core Arm® Cortex®-A57 MPCore processor |
| CPU Max Frequency | 2.2 GHz | 2.0 GHz | 1.7 GHz | 2.2 GHz | 1.9 GHz | 1.43 GHz |
| Memory            | 64GB 256-bit LPDDR5 204.8GB/s | 16GB 128-bit LPDDR5 102.4GB/s | 8GB 128-bit LPDDR5 102 GB/s | 32GB 256-bit LPDDR4x 136.5GB/s | 8GB 128-bit LPDDR4x 59.7GB/s | 4GB 64-bit LPDDR4 25.6GB/s |

For a more detailed comparison table, please visit the **Technical Specifications** section of the [official NVIDIA Jetson page](https://developer.nvidia.com/embedded/jetson-modules).

For a native installation without Docker, please refer to the steps below.

### Run on JetPack 6.1

#### Install Ultralytics Package


The above `ultralytics` installation installs Torch and Torchvision. However, these two packages installed via pip are not compatible with the Jetson platform, which is based on the ARM64 architecture. Therefore, we need to manually install a pre-built PyTorch pip wheel and compile and install Torchvision from source.

Install `torch 2.5.0` and `torchvision 0.20` according to JP6.1:

```bash
sudo apt-get install libopenmpi-dev libopenblas-base libomp-dev -y
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-2.5.0a0+872d972e41.nv24.08-cp310-cp310-linux_aarch64.whl
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.20.0a0+afc54f7-cp310-cp310-linux_aarch64.whl
```

!!! note

    Visit the [PyTorch for Jetson page](https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048) to access all different versions of PyTorch for different JetPack versions. For a more detailed list of PyTorch and Torchvision compatibility, visit the [PyTorch and Torchvision compatibility page](https://github.com/pytorch/vision).

Install [`cuSPARSELt`](https://developer.nvidia.com/cusparselt-downloads?target_os=Linux&target_arch=aarch64-jetson&Compilation=Native&Distribution=Ubuntu&target_version=22.04&target_type=deb_network) to fix a dependency issue with `torch 2.5.0`:

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install libcusparselt0 libcusparselt-dev
```
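
The PyTorch and Torchvision wheels must match the installed JetPack release. As a rough reference, the pairings used in this guide can be captured in a small lookup; this is a sketch covering only the JetPack releases mentioned here (the JP6.0 pairing comes from an earlier revision of this guide), and other releases need their own wheels:

```python
# JetPack release -> (torch, torchvision) versions installed in this guide.
# Illustrative only: these are the pairs documented here, not an exhaustive list.
JETPACK_WHEELS = {
    "JP6.1": ("2.5.0", "0.20"),
    "JP6.0": ("2.3.0", "0.18"),
}


def wheels_for(jetpack: str):
    """Return the (torch, torchvision) pair for a JetPack release, or None if unknown."""
    return JETPACK_WHEELS.get(jetpack)


print(wheels_for("JP6.1"))  # ('2.5.0', '0.20')
```

For JetPack releases not in this lookup (for example JP4.6.1), consult the PyTorch for Jetson forum thread linked above.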

#### Install `onnxruntime-gpu`

The [onnxruntime-gpu](https://pypi.org/project/onnxruntime-gpu/) package hosted on PyPI does not have `aarch64` binaries for the Jetson, so we need to install this package manually. It is needed for some of the exports.

All the different `onnxruntime-gpu` packages corresponding to different JetPack and Python versions are listed [here](https://elinux.org/Jetson_Zoo#ONNX_Runtime). However, here we will download and install `onnxruntime-gpu 1.20.0` with `Python3.10` support.

```bash
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/onnxruntime_gpu-1.20.0-cp310-cp310-linux_aarch64.whl
```

!!! note

```bash
python3 setup.py install --user
```

!!! note

    Visit the [PyTorch for Jetson page](https://forums.developer.nvidia.com/t/pytorch-for-jetson/72048) to access all different versions of PyTorch for different JetPack versions. For a more detailed list of PyTorch and Torchvision compatibility, visit the [PyTorch and Torchvision compatibility page](https://github.com/pytorch/vision).

#### Install `onnxruntime-gpu`

## NVIDIA Jetson Orin YOLO11 Benchmarks

YOLO11 benchmarks were run by the Ultralytics team on 11 different model formats measuring speed and [accuracy](https://www.ultralytics.com/glossary/accuracy): PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, MNN, NCNN. Benchmarks were run on both the NVIDIA Jetson Orin Nano Super Developer Kit and the Seeed Studio reComputer J4012 powered by the Jetson Orin NX 16GB device at FP32 [precision](https://www.ultralytics.com/glossary/precision) with a default input image size of 640.
|
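
Each benchmark format corresponds to an export target in the Ultralytics tooling. As a reference sketch, the display names used in the tables can be mapped to `format=` argument values; these keys are taken from the Ultralytics export documentation and should be verified against your installed version:

```python
# Display name from the benchmark tables -> `format=` argument for Ultralytics export.
# Assumed mapping based on the Ultralytics export docs; verify against your version.
EXPORT_FORMATS = {
    "TorchScript": "torchscript",
    "ONNX": "onnx",
    "OpenVINO": "openvino",
    "TensorRT": "engine",
    "TF SavedModel": "saved_model",
    "TF GraphDef": "pb",
    "TF Lite": "tflite",
    "PaddlePaddle": "paddle",
    "MNN": "mnn",
    "NCNN": "ncnn",
}

print(EXPORT_FORMATS["TensorRT"])  # engine
```

PyTorch itself is not an export target, which is why it does not appear in the mapping.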

### Comparison Charts

Even though all model exports work with NVIDIA Jetson, we have only included **PyTorch, TorchScript, TensorRT** in the comparison charts below because they make use of the GPU on the Jetson and are guaranteed to produce the best results. All the other exports only utilize the CPU, and their performance is not as good as that of the above three. You can find benchmarks for all exports in the section after these charts.

#### NVIDIA Jetson Orin Nano Super Developer Kit

<figure style="text-align: center;">
    <img src="https://github.com/ultralytics/assets/releases/download/v0.0.0/jetson-orin-nano-super-benchmarks.avif" alt="Jetson Orin Nano Super Benchmarks">
    <figcaption style="font-style: italic; color: gray;">Benchmarked with Ultralytics {{ benchmark_version }}</figcaption>
</figure>

#### NVIDIA Jetson Orin NX 16GB

<figure style="text-align: center;">
    <img src="https://github.com/ultralytics/assets/releases/download/v0.0.0/jetson-orin-nx-16-benchmarks.avif" alt="Jetson Orin NX 16GB Benchmarks">
    <figcaption style="font-style: italic; color: gray;">Benchmarked with Ultralytics {{ benchmark_version }}</figcaption>
</figure>

### Detailed Comparison Tables

The tables below represent the benchmark results for five different models (YOLO11n, YOLO11s, YOLO11m, YOLO11l, YOLO11x) across 11 different formats (PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, MNN, NCNN), giving us the status, size, mAP50-95(B) metric, and inference time for each combination.

#### NVIDIA Jetson Orin Nano Super Developer Kit

!!! performance

    === "YOLO11n"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 5.4               | 0.6176      | 21.3                   |
        | TorchScript     | ✅     | 10.5              | 0.6100      | 13.40                  |
        | ONNX            | ✅     | 10.2              | 0.6100      | 7.94                   |
        | OpenVINO        | ✅     | 10.4              | 0.6091      | 57.36                  |
        | TensorRT (FP32) | ✅     | 11.9              | 0.6082      | 7.60                   |
        | TensorRT (FP16) | ✅     | 8.3               | 0.6096      | 4.91                   |
        | TensorRT (INT8) | ✅     | 5.6               | 0.3180      | 3.91                   |
        | TF SavedModel   | ✅     | 25.8              | 0.6082      | 223.98                 |
        | TF GraphDef     | ✅     | 10.3              | 0.6082      | 289.95                 |
        | TF Lite         | ✅     | 10.3              | 0.6082      | 328.29                 |
        | PaddlePaddle    | ✅     | 20.4              | 0.6082      | 530.46                 |
        | MNN             | ✅     | 10.1              | 0.6120      | 74.75                  |
        | NCNN            | ✅     | 10.2              | 0.6106      | 46.12                  |

    === "YOLO11s"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 18.4              | 0.7526      | 22.00                  |
        | TorchScript     | ✅     | 36.5              | 0.7400      | 21.35                  |
        | ONNX            | ✅     | 36.3              | 0.7400      | 13.91                  |
        | OpenVINO        | ✅     | 36.4              | 0.7391      | 126.95                 |
        | TensorRT (FP32) | ✅     | 38.0              | 0.7400      | 13.29                  |
        | TensorRT (FP16) | ✅     | 21.3              | 0.7431      | 7.30                   |
        | TensorRT (INT8) | ✅     | 12.2              | 0.3243      | 5.25                   |
        | TF SavedModel   | ✅     | 91.1              | 0.7400      | 406.73                 |
        | TF GraphDef     | ✅     | 36.4              | 0.7400      | 629.80                 |
        | TF Lite         | ✅     | 36.4              | 0.7400      | 953.98                 |
        | PaddlePaddle    | ✅     | 72.5              | 0.7400      | 1311.67                |
        | MNN             | ✅     | 36.2              | 0.7392      | 187.66                 |
        | NCNN            | ✅     | 36.2              | 0.7403      | 122.02                 |

    === "YOLO11m"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 38.8              | 0.7598      | 33.00                  |
        | TorchScript     | ✅     | 77.3              | 0.7643      | 48.17                  |
        | ONNX            | ✅     | 76.9              | 0.7641      | 29.31                  |
        | OpenVINO        | ✅     | 77.1              | 0.7642      | 313.49                 |
        | TensorRT (FP32) | ✅     | 78.7              | 0.7641      | 28.21                  |
        | TensorRT (FP16) | ✅     | 41.8              | 0.7653      | 13.99                  |
        | TensorRT (INT8) | ✅     | 23.2              | 0.4194      | 9.58                   |
        | TF SavedModel   | ✅     | 192.7             | 0.7643      | 802.30                 |
        | TF GraphDef     | ✅     | 77.0              | 0.7643      | 1335.42                |
        | TF Lite         | ✅     | 77.0              | 0.7643      | 2842.42                |
        | PaddlePaddle    | ✅     | 153.8             | 0.7643      | 3644.29                |
        | MNN             | ✅     | 76.8              | 0.7648      | 503.90                 |
        | NCNN            | ✅     | 76.8              | 0.7674      | 298.78                 |

    === "YOLO11l"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 49.0              | 0.7475      | 43.00                  |
        | TorchScript     | ✅     | 97.6              | 0.7250      | 62.94                  |
        | ONNX            | ✅     | 97.0              | 0.7250      | 36.33                  |
        | OpenVINO        | ✅     | 97.3              | 0.7226      | 387.72                 |
        | TensorRT (FP32) | ✅     | 99.1              | 0.7250      | 35.59                  |
        | TensorRT (FP16) | ✅     | 52.0              | 0.7265      | 17.57                  |
        | TensorRT (INT8) | ✅     | 31.0              | 0.4033      | 12.37                  |
        | TF SavedModel   | ✅     | 243.3             | 0.7250      | 1116.20                |
        | TF GraphDef     | ✅     | 97.2              | 0.7250      | 1603.32                |
        | TF Lite         | ✅     | 97.2              | 0.7250      | 3607.51                |
        | PaddlePaddle    | ✅     | 193.9             | 0.7250      | 4890.90                |
        | MNN             | ✅     | 96.9              | 0.7222      | 619.04                 |
        | NCNN            | ✅     | 96.9              | 0.7252      | 352.85                 |

    === "YOLO11x"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 109.3             | 0.8288      | 81.00                  |
        | TorchScript     | ✅     | 218.1             | 0.8308      | 113.49                 |
        | ONNX            | ✅     | 217.5             | 0.8308      | 75.20                  |
        | OpenVINO        | ✅     | 217.8             | 0.8285      | 508.12                 |
        | TensorRT (FP32) | ✅     | 219.5             | 0.8307      | 67.32                  |
        | TensorRT (FP16) | ✅     | 112.2             | 0.8248      | 32.94                  |
        | TensorRT (INT8) | ✅     | 61.7              | 0.4854      | 20.72                  |
        | TF SavedModel   | ✅     | 545.0             | 0.8308      | 1048.8                 |
        | TF GraphDef     | ✅     | 217.8             | 0.8308      | 2961.8                 |
        | TF Lite         | ✅     | 217.8             | 0.8308      | 7898.8                 |
        | PaddlePaddle    | ✅     | 434.8             | 0.8308      | 9903.68                |
        | MNN             | ✅     | 217.3             | 0.8308      | 1242.97                |
        | NCNN            | ✅     | 217.3             | 0.8304      | 850.05                 |

    Benchmarked with Ultralytics {{ benchmark_version }}

#### NVIDIA Jetson Orin NX 16GB

!!! performance

    === "YOLO11n"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 5.4               | 0.6176      | 19.50                  |
        | TorchScript     | ✅     | 10.5              | 0.6100      | 13.03                  |
        | ONNX            | ✅     | 10.2              | 0.6100      | 8.44                   |
        | OpenVINO        | ✅     | 10.4              | 0.6091      | 40.83                  |
        | TensorRT (FP32) | ✅     | 11.9              | 0.6100      | 8.05                   |
        | TensorRT (FP16) | ✅     | 8.2               | 0.6096      | 4.85                   |
        | TensorRT (INT8) | ✅     | 5.5               | 0.3180      | 4.37                   |
        | TF SavedModel   | ✅     | 25.8              | 0.6082      | 185.39                 |
        | TF GraphDef     | ✅     | 10.3              | 0.6082      | 244.85                 |
        | TF Lite         | ✅     | 10.3              | 0.6082      | 289.77                 |
        | PaddlePaddle    | ✅     | 20.4              | 0.6082      | 476.52                 |
        | MNN             | ✅     | 10.1              | 0.6120      | 53.37                  |
        | NCNN            | ✅     | 10.2              | 0.6106      | 33.55                  |

    === "YOLO11s"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 18.4              | 0.7526      | 19.00                  |
        | TorchScript     | ✅     | 36.5              | 0.7400      | 22.90                  |
        | ONNX            | ✅     | 36.3              | 0.7400      | 14.44                  |
        | OpenVINO        | ✅     | 36.4              | 0.7391      | 88.70                  |
        | TensorRT (FP32) | ✅     | 37.9              | 0.7400      | 14.13                  |
        | TensorRT (FP16) | ✅     | 21.6              | 0.7406      | 7.55                   |
        | TensorRT (INT8) | ✅     | 12.2              | 0.3243      | 5.63                   |
        | TF SavedModel   | ✅     | 91.1              | 0.7400      | 317.61                 |
        | TF GraphDef     | ✅     | 36.4              | 0.7400      | 515.99                 |
        | TF Lite         | ✅     | 36.4              | 0.7400      | 838.85                 |
        | PaddlePaddle    | ✅     | 72.5              | 0.7400      | 1170.07                |
        | MNN             | ✅     | 36.2              | 0.7413      | 125.23                 |
        | NCNN            | ✅     | 36.2              | 0.7403      | 68.13                  |

    === "YOLO11m"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 38.8              | 0.7598      | 36.50                  |
        | TorchScript     | ✅     | 77.3              | 0.7643      | 52.55                  |
        | ONNX            | ✅     | 76.9              | 0.7640      | 31.16                  |
        | OpenVINO        | ✅     | 77.1              | 0.7642      | 208.57                 |
        | TensorRT (FP32) | ✅     | 78.7              | 0.7640      | 30.72                  |
        | TensorRT (FP16) | ✅     | 41.5              | 0.7651      | 14.45                  |
        | TensorRT (INT8) | ✅     | 23.3              | 0.4194      | 10.19                  |
        | TF SavedModel   | ✅     | 192.7             | 0.7643      | 590.11                 |
        | TF GraphDef     | ✅     | 77.0              | 0.7643      | 998.57                 |
        | TF Lite         | ✅     | 77.0              | 0.7643      | 2486.11                |
        | PaddlePaddle    | ✅     | 153.8             | 0.7643      | 3236.09                |
        | MNN             | ✅     | 76.8              | 0.7661      | 335.78                 |
        | NCNN            | ✅     | 76.8              | 0.7674      | 188.43                 |

    === "YOLO11l"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 49.0              | 0.7475      | 46.6                   |
        | TorchScript     | ✅     | 97.6              | 0.7250      | 66.54                  |
        | ONNX            | ✅     | 97.0              | 0.7250      | 39.55                  |
        | OpenVINO        | ✅     | 97.3              | 0.7226      | 262.44                 |
        | TensorRT (FP32) | ✅     | 99.2              | 0.7250      | 38.68                  |
        | TensorRT (FP16) | ✅     | 51.9              | 0.7265      | 18.53                  |
        | TensorRT (INT8) | ✅     | 30.9              | 0.4033      | 13.36                  |
        | TF SavedModel   | ✅     | 243.3             | 0.7250      | 850.25                 |
        | TF GraphDef     | ✅     | 97.2              | 0.7250      | 1324.60                |
        | TF Lite         | ✅     | 97.2              | 0.7250      | 3191.24                |
        | PaddlePaddle    | ✅     | 193.9             | 0.7250      | 4204.97                |
        | MNN             | ✅     | 96.9              | 0.7225      | 414.41                 |
        | NCNN            | ✅     | 96.9              | 0.7252      | 237.74                 |

    === "YOLO11x"

        | Format          | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) |
        | --------------- | ------ | ----------------- | ----------- | ---------------------- |
        | PyTorch         | ✅     | 109.3             | 0.8288      | 86.00                  |
        | TorchScript     | ✅     | 218.1             | 0.8308      | 122.43                 |
        | ONNX            | ✅     | 217.5             | 0.8307      | 77.50                  |
        | OpenVINO        | ✅     | 217.8             | 0.8285      | 508.12                 |
        | TensorRT (FP32) | ✅     | 219.5             | 0.8307      | 76.44                  |
        | TensorRT (FP16) | ✅     | 112.0             | 0.8309      | 35.99                  |
        | TensorRT (INT8) | ✅     | 61.6              | 0.4854      | 22.32                  |
        | TF SavedModel   | ✅     | 545.0             | 0.8308      | 1470.06                |
        | TF GraphDef     | ✅     | 217.8             | 0.8308      | 2549.78                |
        | TF Lite         | ✅     | 217.8             | 0.8308      | 7025.44                |
        | PaddlePaddle    | ✅     | 434.8             | 0.8308      | 8364.89                |
        | MNN             | ✅     | 217.3             | 0.8289      | 827.13                 |
        | NCNN            | ✅     | 217.3             | 0.8304      | 490.29                 |

    Benchmarked with Ultralytics {{ benchmark_version }}

[Explore more benchmarking efforts by Seeed Studio](https://www.seeedstudio.com/blog/2023/03/30/yolov8-performance-benchmarks-on-nvidia-jetson-devices) running on different versions of NVIDIA Jetson hardware.
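
Inference times in the benchmark tables are reported in milliseconds per image. Converting a latency to an approximate frame rate is a one-liner; the sketch below ignores pre- and post-processing overhead, so treat the result as an upper bound:

```python
def fps(ms_per_image: float) -> float:
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


# Example: YOLO11n TensorRT (FP16) on the Orin Nano Super Developer Kit
# is listed at 4.91 ms/im, i.e. roughly 204 FPS.
print(round(fps(4.91)))  # 204
```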

### What performance benchmarks can I expect from YOLO11 models on NVIDIA Jetson devices?

YOLO11 models have been benchmarked on various NVIDIA Jetson devices, showing significant performance improvements. For example, the TensorRT format delivers the best inference performance. The tables in the [Detailed Comparison Tables](#detailed-comparison-tables) section provide a comprehensive view of performance metrics like mAP50-95 and inference time across different model formats.
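
If you want to pick a deployment format programmatically from numbers like those in the benchmark tables, a simple minimum over the latencies works. This sketch uses a few of the YOLO11n figures for the Orin Nano Super Developer Kit quoted above:

```python
# Inference times (ms/im) for YOLO11n on the Orin Nano Super Developer Kit,
# taken from the benchmark tables in this guide.
times_ms = {
    "PyTorch": 21.3,
    "TorchScript": 13.40,
    "ONNX": 7.94,
    "TensorRT (FP16)": 4.91,
    "NCNN": 46.12,
}

# The fastest format is simply the one with the lowest latency.
fastest = min(times_ms, key=times_ms.get)
print(fastest)  # TensorRT (FP16)
```

In practice you would also weigh accuracy (INT8 quantization trades mAP for speed) and export convenience, not latency alone.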

### Why should I use TensorRT for deploying YOLO11 on NVIDIA Jetson?