Update TFLite Docs images (#8605)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Authored by Glenn Jocher on 2024-03-03 01:59:43 +01:00, committed by GitHub
parent 1146bb0582
commit 36408c974c
33 changed files with 112 additions and 107 deletions


@@ -16,7 +16,7 @@ By using the TensorRT export format, you can enhance your [Ultralytics YOLOv8](h
<img width="100%" src="https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-601/tensorrt-developer-guide/graphics/whatistrt2.png" alt="TensorRT Overview">
</p>
-[TensorRT](https://developer.nvidia.com/tensorrt#:~:text=NVIDIA%20TensorRT%2DLLM%20is%20an,knowledge%20of%20C%2B%2B%20or%20CUDA.), developed by NVIDIA, is an advanced software development kit (SDK) designed for high-speed deep learning inference. It's well-suited for real-time applications like object detection.
+[TensorRT](https://developer.nvidia.com/tensorrt), developed by NVIDIA, is an advanced software development kit (SDK) designed for high-speed deep learning inference. It's well-suited for real-time applications like object detection.
This toolkit optimizes deep learning models for NVIDIA GPUs and results in faster and more efficient operations. TensorRT models undergo TensorRT optimization, which includes techniques like layer fusion, precision calibration (INT8 and FP16), dynamic tensor memory management, and kernel auto-tuning. Converting deep learning models into the TensorRT format allows developers to realize the potential of NVIDIA GPUs fully.
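To make that conversion concrete, here is a minimal sketch using the Ultralytics Python API, assuming the `ultralytics` package is installed on a machine with an NVIDIA GPU and TensorRT available; the `half=True` flag requests FP16 precision, one of the calibration options mentioned above:

```python
from ultralytics import YOLO

# Load a pretrained YOLOv8 model (weights download automatically if missing)
model = YOLO("yolov8n.pt")

# Build a TensorRT engine from the model; half=True enables FP16 precision
model.export(format="engine", half=True)
```

Note that the resulting `.engine` file is specific to the GPU and TensorRT version it was built on, so engines are typically rebuilt for each deployment target.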
@@ -40,7 +40,7 @@ TensorRT models offer a range of key features that contribute to their efficienc
## Deployment Options in TensorRT
Before we look at the code for exporting YOLOv8 models to the TensorRT format, let's understand where TensorRT models are normally used.
TensorRT offers several deployment options, and each option balances ease of integration, performance optimization, and flexibility differently:
@@ -52,7 +52,7 @@ TensorRT offers several deployment options, and each option balances ease of int
- **Standalone TensorRT Runtime API**: Offers granular control, ideal for performance-critical applications. It's more complex but allows for custom implementation of unsupported operators.
-- **NVIDIA Triton Inference Server**: An option that supports models from various frameworks. Particularly suited for cloud or edge inferencing, it provides features like concurrent model execution and model analysis.
+- **NVIDIA Triton Inference Server**: An option that supports models from various frameworks. Particularly suited for cloud or edge inference, it provides features like concurrent model execution and model analysis.
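As a sketch of the **Standalone TensorRT Runtime API** path described in the list above, the following shows how an exported engine can be deserialized with TensorRT's Python bindings. It assumes an engine file such as the `yolov8n.engine` produced earlier; buffer allocation and the actual inference call are omitted because they vary across TensorRT versions:

```python
import tensorrt as trt

# A logger is required by the runtime; WARNING keeps output quiet
logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

# Deserialize an engine built for this specific GPU
with open("yolov8n.engine", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

# The execution context holds per-inference state; input/output buffer
# setup and the execute call are omitted here as they are version-dependent
context = engine.create_execution_context()
```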
## Exporting YOLOv8 Models to TensorRT