TensorRT10 with JetPack 6.0 Docs update (#11779)
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: Lakshantha Dissanayake <lakshanthad@yahoo.com>
parent 303579c35e
commit 10b3564a1b
1 changed file with 47 additions and 31 deletions
@@ -145,27 +145,43 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
!!! example

    === "Python"

        ```{ .py .annotate }
        from ultralytics import YOLO

        model = YOLO("yolov8n.pt")
        model.export(
            format="engine",
            dynamic=True,  # (1)!
            batch=8,  # (2)!
            workspace=4,  # (3)!
            int8=True,
            data="coco.yaml",  # (4)!
        )

        # Load the exported TensorRT INT8 model
        model = YOLO("yolov8n.engine", task="detect")

        # Run inference
        result = model.predict("https://ultralytics.com/images/bus.jpg")
        ```

        1. Exports with dynamic axes; this is enabled by default when exporting with `int8=True`, even when not explicitly set. See [export arguments](../modes/export.md#arguments) for additional information.
        2. Sets a max batch size of 8 for the exported model, which calibrates with `batch = 2 * 8` to avoid scaling errors during calibration.
        3. Allocates 4 GiB of memory instead of the entire device for the conversion process.
        4. Uses the [COCO dataset](../datasets/detect/coco.md) for calibration, specifically the images used for [validation](../modes/val.md) (5,000 total).

=== "CLI"
|
||||
|
||||
```bash
|
||||
# Export a YOLOv8n PyTorch model to TensorRT format with INT8 quantization
|
||||
yolo export model=yolov8n.pt format=engine batch=8 workspace=4 int8=True data=coco.yaml # creates 'yolov8n.engine''
|
||||
|
||||
# Run inference with the exported TensorRT quantized model
|
||||
yolo predict model=yolov8n.engine source='https://ultralytics.com/images/bus.jpg'
|
||||
```
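The `batch` annotation above implies simple arithmetic between the export batch size, the calibration batch size, and the number of calibration images. A quick sanity check in plain Python using the figures from the example; the variable names are illustrative, not part of the Ultralytics API:

```python
# Back-of-the-envelope check of INT8 calibration coverage (illustrative only)
export_batch = 8  # `batch` argument passed to export()
calib_batch = 2 * export_batch  # calibration runs at twice the export batch size
val_images = 5000  # COCO validation images used for calibration

num_batches = val_images // calib_batch
print(f"{num_batches} calibration batches of {calib_batch} images each")  # -> 312

# NVIDIA recommends at least 500 calibration images for representative statistics
assert val_images >= 500
```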
???+ warning "Calibration Cache"
@@ -240,12 +256,12 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
| Precision | Eval test        | mean<br>(ms) | min \| max<br>(ms) | top-1 | top-5 | `batch` | size<br><sup>(pixels) |
|-----------|------------------|--------------|--------------------|-------|-------|---------|-----------------------|
| FP32      | Predict          | 0.26         | 0.25 \| 0.28       |       |       | 8       | 640                   |
| FP32      | ImageNet<sup>val | 0.26         |                    | 0.35  | 0.61  | 1       | 640                   |
| FP16      | Predict          | 0.18         | 0.17 \| 0.19       |       |       | 8       | 640                   |
| FP16      | ImageNet<sup>val | 0.18         |                    | 0.35  | 0.61  | 1       | 640                   |
| INT8      | Predict          | 0.16         | 0.15 \| 0.57       |       |       | 8       | 640                   |
| INT8      | ImageNet<sup>val | 0.15         |                    | 0.32  | 0.59  | 1       | 640                   |
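The top-1 and top-5 columns come from full ImageNet validation rather than from the `Predict` runs. A minimal sketch of how they could be reproduced, assuming a classification engine exported beforehand as `yolov8n-cls.engine` (the path and the `imagenet` dataset key are assumptions for illustration):

```python
from ultralytics import YOLO

# Load an INT8 TensorRT classification engine (assumed exported previously)
model = YOLO("yolov8n-cls.engine", task="classify")

# Validate on ImageNet to reproduce the top-1 / top-5 columns
metrics = model.val(data="imagenet", imgsz=640, batch=1)
print(metrics.top1, metrics.top5)
```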
=== "Pose (COCO)"
@@ -338,19 +354,19 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
=== "Jetson Orin NX 16GB"
Tested with JetPack 6.0 (L4T 36.3) Ubuntu 22.04.4 LTS, `python 3.10.12`, `ultralytics==8.2.16`, `tensorrt==10.0.1`
!!! note

    Inference times shown for `mean`, `min` (fastest), and `max` (slowest) for each test using pre-trained weights `yolov8n.engine`.
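A minimal sketch of how such per-image timing statistics could be gathered; the local `bus.jpg` path and the run count of 100 are assumptions for illustration, and the measured values follow in the table below:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.engine", task="detect")

# Collect per-image TensorRT inference times in milliseconds over repeated runs
times = []
for _ in range(100):
    result = model.predict("bus.jpg", verbose=False)[0]
    times.append(result.speed["inference"])  # dict also has 'preprocess' and 'postprocess'

print(f"mean {sum(times) / len(times):.2f} | min {min(times):.2f} | max {max(times):.2f} ms")
```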
| Precision | Eval test    | mean<br>(ms) | min \| max<br>(ms) | mAP<sup>val<br>50(B) | mAP<sup>val<br>50-95(B) | `batch` | size<br><sup>(pixels) |
|-----------|--------------|--------------|--------------------|----------------------|-------------------------|---------|-----------------------|
| FP32      | Predict      | 6.11         | 6.10 \| 6.29       |                      |                         | 8       | 640                   |
| FP32      | COCO<sup>val | 6.17         |                    | 0.52                 | 0.37                    | 1       | 640                   |
| FP16      | Predict      | 3.18         | 3.18 \| 3.20       |                      |                         | 8       | 640                   |
| FP16      | COCO<sup>val | 3.19         |                    | 0.52                 | 0.37                    | 1       | 640                   |
| INT8      | Predict      | 2.30         | 2.29 \| 2.35       |                      |                         | 8       | 640                   |
| INT8      | COCO<sup>val | 2.32         |                    | 0.46                 | 0.32                    | 1       | 640                   |
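The mAP columns come from COCO validation rather than from the `Predict` runs. A minimal sketch of reproducing them with the exported engine, assuming `coco.yaml` resolves to the full COCO validation set as in the export example earlier:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.engine", task="detect")

# Validate on COCO to reproduce the mAP columns of the table above
metrics = model.val(data="coco.yaml", imgsz=640, batch=1)
print(metrics.box.map50, metrics.box.map)  # mAP@0.5 and mAP@0.5:0.95
```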
!!! info