ultralytics 8.2.11 new TensorRT INT8 export feature (#10165)

Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
This commit is contained in:
Burhan 2024-05-08 10:05:57 -04:00 committed by GitHub
parent 1d9745182d
commit fcfc44ea9c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
15 changed files with 601 additions and 176 deletions

View file

@ -41,8 +41,8 @@ Here are some of the standout functionalities:
!!! Tip "Tip"
* Export to ONNX or OpenVINO for up to 3x CPU speedup.
* Export to TensorRT for up to 5x GPU speedup.
* Export to [ONNX](../integrations/onnx.md) or [OpenVINO](../integrations/openvino.md) for up to 3x CPU speedup.
* Export to [TensorRT](../integrations/tensorrt.md) for up to 5x GPU speedup.
## Usage Examples
@ -85,7 +85,7 @@ This table details the configurations and options available for exporting YOLO m
| `dynamic` | `bool` | `False` | Allows dynamic input sizes for ONNX and TensorRT exports, enhancing flexibility in handling varying image dimensions. |
| `simplify` | `bool` | `False` | Simplifies the model graph for ONNX exports, potentially improving performance and compatibility. |
| `opset` | `int` | `None` | Specifies the ONNX opset version for compatibility with different ONNX parsers and runtimes. If not set, uses the latest supported version. |
| `workspace` | `float` | `4.0` | Sets the maximum workspace size in GB for TensorRT optimizations, balancing memory usage and performance. |
| `workspace` | `float` | `4.0` | Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance. |
| `nms` | `bool` | `False` | Adds Non-Maximum Suppression (NMS) to the CoreML export, essential for accurate and efficient detection post-processing. |
| `batch` | `int` | `1` | Specifies export model batch inference size or the max number of images the exported model will process concurrently in `predict` mode. |
@ -95,18 +95,18 @@ Adjusting these parameters allows for customization of the export process to fit
Available YOLOv8 export formats are in the table below. You can export to any format using the `format` argument, i.e. `format='onnx'` or `format='engine'`. You can predict or validate directly on exported models, i.e. `yolo predict model=yolov8n.onnx`. Usage examples are shown for your model after export completes.
| Format | `format` Argument | Model | Metadata | Arguments |
|---------------------------------------------------|-------------------|---------------------------|----------|--------------------------------------------------------------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
| [TorchScript](../integrations/torchscript.md) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize`, `batch` |
| [ONNX](../integrations/onnx.md) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset`, `batch` |
| [OpenVINO](../integrations/openvino.md) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [TensorRT](../integrations/tensorrt.md) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace`, `batch` |
| [CoreML](../integrations/coreml.md) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms`, `batch` |
| [TF SavedModel](../integrations/tf-savedmodel.md) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras`, `int8`, `batch` |
| [TF GraphDef](../integrations/tf-graphdef.md) | `pb` | `yolov8n.pb` | ❌ | `imgsz`, `batch` |
| [TF Lite](../integrations/tflite.md) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [TF Edge TPU](../integrations/edge-tpu.md) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz`, `batch` |
| [TF.js](../integrations/tfjs.md) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [PaddlePaddle](../integrations/paddlepaddle.md) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz`, `batch` |
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
| Format | `format` Argument | Model | Metadata | Arguments |
|---------------------------------------------------|-------------------|---------------------------|----------|----------------------------------------------------------------------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
| [TorchScript](../integrations/torchscript.md) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize`, `batch` |
| [ONNX](../integrations/onnx.md) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset`, `batch` |
| [OpenVINO](../integrations/openvino.md) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [TensorRT](../integrations/tensorrt.md) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace`, `int8`, `batch` |
| [CoreML](../integrations/coreml.md) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms`, `batch` |
| [TF SavedModel](../integrations/tf-savedmodel.md) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras`, `int8`, `batch` |
| [TF GraphDef](../integrations/tf-graphdef.md) | `pb` | `yolov8n.pb` | ❌ | `imgsz`, `batch` |
| [TF Lite](../integrations/tflite.md) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [TF Edge TPU](../integrations/edge-tpu.md) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz`, `batch` |
| [TF.js](../integrations/tfjs.md) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [PaddlePaddle](../integrations/paddlepaddle.md) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz`, `batch` |
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |