TensorRT10 with JetPack 6.0 Docs update (#11779)
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: Lakshantha Dissanayake <lakshanthad@yahoo.com>
parent 303579c35e
commit 10b3564a1b
1 changed file with 47 additions and 31 deletions
@@ -145,27 +145,43 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
!!! example

    === "Python"

        ```{ .py .annotate }
        from ultralytics import YOLO

        model = YOLO("yolov8n.pt")
        model.export(
            format="engine",
            dynamic=True,  # (1)!
            batch=8,  # (2)!
            workspace=4,  # (3)!
            int8=True,
            data="coco.yaml",  # (4)!
        )

        # Load the exported TensorRT INT8 model
        model = YOLO("yolov8n.engine", task="detect")

        # Run inference
        result = model.predict("https://ultralytics.com/images/bus.jpg")
        ```

        1. Exports with dynamic axes; this is enabled by default when exporting with `int8=True`, even when not explicitly set. See [export arguments](../modes/export.md#arguments) for additional information.
        2. Sets a max batch size of 8 for the exported model, which calibrates with `batch = 2 * 8` to avoid scaling errors during calibration.
        3. Allocates 4 GiB of memory instead of the entire device for the conversion process.
        4. Uses the [COCO dataset](../datasets/detect/coco.md) for calibration, specifically the images used for [validation](../modes/val.md) (5,000 total).

=== "CLI"
|
||||
|
||||
```bash
|
||||
# Export a YOLOv8n PyTorch model to TensorRT format with INT8 quantization
|
||||
yolo export model=yolov8n.pt format=engine batch=8 workspace=4 int8=True data=coco.yaml # creates 'yolov8n.engine''
|
||||
|
||||
# Run inference with the exported TensorRT quantized model
|
||||
yolo predict model=yolov8n.engine source='https://ultralytics.com/images/bus.jpg'
|
||||
```
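The `batch` annotation above implies simple arithmetic between the export batch size, the calibration batch size, and the number of calibration images. A quick sanity check in plain Python using the figures from the example; the variable names are illustrative, not part of the Ultralytics API:

```python
# Back-of-the-envelope check of INT8 calibration coverage (illustrative only)
export_batch = 8  # `batch` argument passed to export()
calib_batch = 2 * export_batch  # calibration runs at twice the export batch size
val_images = 5000  # COCO validation images used for calibration

num_batches = val_images // calib_batch
print(f"{num_batches} calibration batches of {calib_batch} images each")  # -> 312

# NVIDIA recommends at least 500 calibration images for representative statistics
assert val_images >= 500
```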
???+ warning "Calibration Cache"
@@ -240,12 +256,12 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
| Precision | Eval test        | mean<br>(ms) | min \| max<br>(ms) | top-1 | top-5 | `batch` | size<br><sup>(pixels) |
|-----------|------------------|--------------|--------------------|-------|-------|---------|-----------------------|
| FP32      | Predict          | 0.26         | 0.25 \| 0.28       |       |       | 8       | 640                   |
| FP32      | ImageNet<sup>val | 0.26         |                    | 0.35  | 0.61  | 1       | 640                   |
| FP16      | Predict          | 0.18         | 0.17 \| 0.19       |       |       | 8       | 640                   |
| FP16      | ImageNet<sup>val | 0.18         |                    | 0.35  | 0.61  | 1       | 640                   |
| INT8      | Predict          | 0.16         | 0.15 \| 0.57       |       |       | 8       | 640                   |
| INT8      | ImageNet<sup>val | 0.15         |                    | 0.32  | 0.59  | 1       | 640                   |
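The top-1 and top-5 columns come from full ImageNet validation rather than from the `Predict` runs. A minimal sketch of how they could be reproduced, assuming a classification engine exported beforehand as `yolov8n-cls.engine` (the path and the `imagenet` dataset key are assumptions for illustration):

```python
from ultralytics import YOLO

# Load an INT8 TensorRT classification engine (assumed exported previously)
model = YOLO("yolov8n-cls.engine", task="classify")

# Validate on ImageNet to reproduce the top-1 / top-5 columns
metrics = model.val(data="imagenet", imgsz=640, batch=1)
print(metrics.top1, metrics.top5)
```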
=== "Pose (COCO)"
@@ -338,19 +354,19 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
=== "Jetson Orin NX 16GB"
Tested with JetPack 6.0 (L4T 36.3) Ubuntu 22.04.4 LTS, `python 3.10.12`, `ultralytics==8.2.16`, `tensorrt==10.0.1`
!!! note

    Inference times shown for `mean`, `min` (fastest), and `max` (slowest) for each test using pre-trained weights `yolov8n.engine`.
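A minimal sketch of how such per-image timing statistics could be gathered; the local `bus.jpg` path and the run count of 100 are assumptions for illustration, and the measured values follow in the table below:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.engine", task="detect")

# Collect per-image TensorRT inference times in milliseconds over repeated runs
times = []
for _ in range(100):
    result = model.predict("bus.jpg", verbose=False)[0]
    times.append(result.speed["inference"])  # dict also has 'preprocess' and 'postprocess'

print(f"mean {sum(times) / len(times):.2f} | min {min(times):.2f} | max {max(times):.2f} ms")
```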
| Precision | Eval test    | mean<br>(ms) | min \| max<br>(ms) | mAP<sup>val<br>50(B) | mAP<sup>val<br>50-95(B) | `batch` | size<br><sup>(pixels) |
|-----------|--------------|--------------|--------------------|----------------------|-------------------------|---------|-----------------------|
| FP32      | Predict      | 6.11         | 6.10 \| 6.29       |                      |                         | 8       | 640                   |
| FP32      | COCO<sup>val | 6.17         |                    | 0.52                 | 0.37                    | 1       | 640                   |
| FP16      | Predict      | 3.18         | 3.18 \| 3.20       |                      |                         | 8       | 640                   |
| FP16      | COCO<sup>val | 3.19         |                    | 0.52                 | 0.37                    | 1       | 640                   |
| INT8      | Predict      | 2.30         | 2.29 \| 2.35       |                      |                         | 8       | 640                   |
| INT8      | COCO<sup>val | 2.32         |                    | 0.46                 | 0.32                    | 1       | 640                   |
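The mAP columns come from COCO validation rather than from the `Predict` runs. A minimal sketch of reproducing them with the exported engine, assuming `coco.yaml` resolves to the full COCO validation set as in the export example earlier:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.engine", task="detect")

# Validate on COCO to reproduce the mAP columns of the table above
metrics = model.val(data="coco.yaml", imgsz=640, batch=1)
print(metrics.box.map50, metrics.box.map)  # mAP@0.5 and mAP@0.5:0.95
```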
!!! info