Fix mkdocs.yml raw image URLs (#14213)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com>
Glenn Jocher 2024-07-05 02:25:02 +02:00 committed by GitHub
parent d5db9c916f
commit 5d479c73c2
69 changed files with 4767 additions and 223 deletions


@@ -66,3 +66,63 @@ For more detailed technical information and the latest updates, refer to the [Op
---
Ensuring your models achieve optimal performance is not just about tweaking configurations; it's about understanding your application's needs and making informed decisions. Whether you're optimizing for real-time responses or maximizing throughput for large-scale processing, the combination of Ultralytics YOLO models and OpenVINO offers a powerful toolkit for developers to deploy high-performance AI solutions.
## FAQ
### How do I optimize Ultralytics YOLO models for low latency using OpenVINO?
Optimizing Ultralytics YOLO models for low latency involves several key strategies:
1. **Single Inference per Device:** Limit inferences to one at a time per device to minimize delays.
2. **Leveraging Sub-Devices:** Use devices such as multi-socket CPUs or multi-tile GPUs, which can serve multiple requests with minimal added latency.
3. **OpenVINO Performance Hints:** Use OpenVINO's `ov::hint::PerformanceMode::LATENCY` during model compilation for simplified, device-agnostic tuning.
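As a rough illustration of the performance-hint strategy, the sketch below requests the latency hint from Python. It assumes the string property name `PERFORMANCE_HINT` (the key behind `ov::hint::performance_mode`). The `model_path` and `device` arguments are placeholders, not values taken from this guide.

```python
def latency_config() -> dict:
    """Return a compile-time config requesting the LATENCY performance hint."""
    # "PERFORMANCE_HINT" is the string form of ov::hint::performance_mode.
    return {"PERFORMANCE_HINT": "LATENCY"}


def compile_for_latency(model_path: str, device: str = "CPU"):
    """Compile a model with the latency hint (model_path and device are placeholders)."""
    import openvino as ov  # imported lazily so latency_config() works without OpenVINO

    core = ov.Core()
    model = core.read_model(model_path)
    return core.compile_model(model, device, latency_config())
```

Keeping the config in its own helper makes it easy to log or compare the settings before compiling on the target device.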
For more practical tips on optimizing latency, check out the [Latency Optimization section](#optimizing-for-latency) of our guide.
### Why should I use OpenVINO for optimizing Ultralytics YOLO throughput?
OpenVINO improves Ultralytics YOLO throughput by keeping device resources fully utilized when many inference requests run concurrently. Key benefits include:
- **Performance Hints:** Simple, high-level performance tuning across devices.
- **Explicit Batching and Streams:** Fine-tuning for advanced performance.
- **Multi-Device Execution:** Automated inference load balancing, easing application-level management.
Example configuration:
```python
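import openvino as ov
import openvino.properties.hint as hints
core = ov.Core()
model = core.read_model("model.xml")  # placeholder path to an exported OpenVINO IR model
config = {hints.performance_mode: hints.PerformanceMode.THROUGHPUT}  # request throughput-oriented tuning
compiled_model = core.compile_model(model, "GPU", config)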
```
Learn more about throughput optimization in the [Throughput Optimization section](#optimizing-for-throughput) of our detailed guide.
### What is the best practice for reducing first-inference latency in OpenVINO?
To reduce first-inference latency, consider these practices:
1. **Model Caching:** Use model caching to decrease load and compile times.
2. **Model Mapping vs. Reading:** Use mapping (`ov::enable_mmap(true)`) by default but switch to reading (`ov::enable_mmap(false)`) if the model is on a removable or network drive.
3. **AUTO Device Selection:** Utilize AUTO mode to start with CPU inference and transition to an accelerator seamlessly.
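The three practices above could be combined roughly as follows. This is a minimal sketch, assuming the string property names `CACHE_DIR` and `ENABLE_MMAP` (the keys behind `ov::cache_dir` and `ov::enable_mmap`). The cache directory name, the `slow_storage` flag, and `model_path` are illustrative placeholders.

```python
def first_inference_core_properties(cache_dir: str = "ov_cache", slow_storage: bool = False) -> dict:
    """Build Core-level properties aimed at a faster first inference.

    slow_storage is a hypothetical flag: set it when the model lives on a
    removable or network drive, where reading is preferred over mapping.
    """
    return {
        "CACHE_DIR": cache_dir,           # enable model caching in this directory
        "ENABLE_MMAP": not slow_storage,  # map by default, read on slow storage
    }


def compile_with_fast_startup(model_path: str, slow_storage: bool = False):
    """Apply the properties above and compile on the AUTO device."""
    import openvino as ov  # imported lazily so the helper above needs no OpenVINO install

    core = ov.Core()
    core.set_property(first_inference_core_properties(slow_storage=slow_storage))
    model = core.read_model(model_path)
    return core.compile_model(model, "AUTO")
```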
For detailed strategies on managing first-inference latency, refer to the [Managing First-Inference Latency section](#managing-first-inference-latency).
### How do I balance optimizing for latency and throughput with Ultralytics YOLO and OpenVINO?
Balancing latency and throughput optimization requires understanding your application needs:
- **Latency Optimization:** Ideal for real-time applications requiring immediate responses (e.g., consumer-grade apps).
- **Throughput Optimization:** Best for scenarios with many concurrent inferences, maximizing resource use (e.g., large-scale deployments).
Using OpenVINO's high-level performance hints and multi-device modes can help strike the right balance. Choose the appropriate [OpenVINO Performance hints](https://docs.ultralytics.com/integrations/openvino#openvino-performance-hints) based on your specific requirements.
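One simple way to encode this choice is a helper that maps the application type to a hint. The function name and the `real_time` flag are hypothetical, and the sketch again assumes the string property name `PERFORMANCE_HINT`:

```python
def pick_performance_hint(real_time: bool) -> dict:
    """Choose a performance hint from the application's primary requirement."""
    # Real-time apps favor a fast single response; batch-style workloads favor total volume.
    hint = "LATENCY" if real_time else "THROUGHPUT"
    return {"PERFORMANCE_HINT": hint}
```

The returned dictionary can be passed as the config argument when compiling the model.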
### Can I use Ultralytics YOLO models with other AI frameworks besides OpenVINO?
Yes, Ultralytics YOLO models are highly versatile and can be integrated with various AI frameworks. Options include:
- **TensorRT:** For NVIDIA GPU optimization, follow the [TensorRT integration guide](https://docs.ultralytics.com/integrations/tensorrt).
- **CoreML:** For Apple devices, refer to our [CoreML export instructions](https://docs.ultralytics.com/integrations/coreml).
- **TensorFlow.js:** For web and Node.js apps, see the [TF.js conversion guide](https://docs.ultralytics.com/integrations/tfjs).
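For orientation, the sketch below maps each framework to the `format` string accepted by the Ultralytics `export()` method. The helper name and the `weights` argument (for example a `.pt` checkpoint) are placeholders, and some exports need extra dependencies or specific hardware.

```python
# Framework name -> Ultralytics export format string.
EXPORT_FORMATS = {
    "OpenVINO": "openvino",
    "TensorRT": "engine",
    "CoreML": "coreml",
    "TensorFlow.js": "tfjs",
}


def export_model(weights: str, framework: str):
    """Export a YOLO checkpoint for the given framework (weights is a placeholder path)."""
    from ultralytics import YOLO  # imported lazily so the mapping is usable on its own

    model = YOLO(weights)
    return model.export(format=EXPORT_FORMATS[framework])
```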
Explore more integrations on the [Ultralytics Integrations page](https://docs.ultralytics.com/integrations).