Fix mkdocs.yml raw image URLs (#14213)
Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com>
parent d5db9c916f
commit 5d479c73c2
69 changed files with 4767 additions and 223 deletions
@@ -66,3 +66,63 @@ For more detailed technical information and the latest updates, refer to the [Op
---
Ensuring your models achieve optimal performance is not just about tweaking configurations; it's about understanding your application's needs and making informed decisions. Whether you're optimizing for real-time responses or maximizing throughput for large-scale processing, the combination of Ultralytics YOLO models and OpenVINO offers a powerful toolkit for developers to deploy high-performance AI solutions.
## FAQ
### How do I optimize Ultralytics YOLO models for low latency using OpenVINO?
Optimizing Ultralytics YOLO models for low latency involves several key strategies:
1. **Single Inference per Device:** Run only one inference at a time per device so that requests do not queue behind each other.
2. **Leveraging Sub-Devices:** Use devices such as multi-socket CPUs or multi-tile GPUs, which can handle multiple requests with minimal latency increase.
3. **OpenVINO Performance Hints:** Use OpenVINO's `ov::hint::PerformanceMode::LATENCY` during model compilation for simplified, device-agnostic tuning.
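
As a rough illustration of the performance-hint strategy, the sketch below builds a latency-oriented config and passes it to `compile_model`. The helper names `latency_config` and `compile_for_latency` are hypothetical, and the string key `"PERFORMANCE_HINT"` is assumed here to be the Python-side name of the `ov::hint::performance_mode` property; treat it as a starting point rather than the guide's own method.

```python
def latency_config() -> dict:
    """Build a compile-time config requesting the latency-oriented preset."""
    # Assumption: "PERFORMANCE_HINT" is the string name of ov::hint::performance_mode.
    return {"PERFORMANCE_HINT": "LATENCY"}


def compile_for_latency(model_path: str, device: str = "CPU"):
    """Compile a model for low latency (hypothetical helper; needs the openvino package)."""
    # Imported lazily so latency_config() can be used without OpenVINO installed.
    import openvino as ov

    core = ov.Core()
    return core.compile_model(model_path, device, latency_config())
```

Calling `compile_for_latency("model.xml")`, where `model.xml` is a placeholder for your exported model, would return a compiled model tuned for single-request latency.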
For more practical tips on optimizing latency, check out the [Latency Optimization section](#optimizing-for-latency) of our guide.
### Why should I use OpenVINO for optimizing Ultralytics YOLO throughput?
OpenVINO raises Ultralytics YOLO throughput by keeping device resources fully utilized, typically by running multiple inference requests in parallel. Key benefits include:
- **Performance Hints:** Simple, high-level performance tuning across devices.
- **Explicit Batching and Streams:** Low-level control over batch size and the number of parallel streams for advanced tuning.
- **Multi-Device Execution:** Automated inference load balancing, easing application-level management.
Example configuration:
```python
import openvino as ov
import openvino.properties.hint as hints

core = ov.Core()
model = core.read_model("model.xml")  # placeholder path to your exported OpenVINO model
config = {hints.performance_mode: hints.PerformanceMode.THROUGHPUT}
compiled_model = core.compile_model(model, "GPU", config)
```
Learn more about throughput optimization in the [Throughput Optimization section](#optimizing-for-throughput) of our detailed guide.
### What is the best practice for reducing first-inference latency in OpenVINO?
To reduce first-inference latency, consider these practices:
1. **Model Caching:** Use model caching to decrease load and compile times.
2. **Model Mapping vs. Reading:** Use mapping (`ov::enable_mmap(true)`) by default but switch to reading (`ov::enable_mmap(false)`) if the model is on a removable or network drive.
3. **AUTO Device Selection:** Use AUTO mode, which starts inference on the CPU and switches to an accelerator once it is ready.
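
The three practices can be combined in a short sketch. The helper names are hypothetical, and the string keys `"CACHE_DIR"` and `"ENABLE_MMAP"` are assumed to be the Python-side names of the model-cache and `ov::enable_mmap` properties; check them against your OpenVINO version before relying on this.

```python
def first_inference_config(cache_dir: str, model_on_slow_storage: bool = False) -> dict:
    """Build Core-level properties aimed at a faster first inference."""
    return {
        # Model caching: compiled blobs are stored in and reloaded from this directory.
        "CACHE_DIR": cache_dir,
        # Mapping by default; plain reading for removable or network drives.
        "ENABLE_MMAP": not model_on_slow_storage,
    }


def load_with_fast_startup(model_path: str, cache_dir: str = "ov_cache",
                           model_on_slow_storage: bool = False):
    """Compile on AUTO with caching enabled (hypothetical helper; needs openvino)."""
    # Imported lazily so first_inference_config() works without OpenVINO installed.
    import openvino as ov

    core = ov.Core()
    core.set_property(first_inference_config(cache_dir, model_on_slow_storage))
    # AUTO picks the device and can begin on the CPU while an accelerator loads.
    return core.compile_model(model_path, "AUTO")
```

Here `ov_cache` is only an example directory name; any writable location should serve the same purpose.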
For detailed strategies on managing first-inference latency, refer to the [Managing First-Inference Latency section](#managing-first-inference-latency).
### How do I balance optimizing for latency and throughput with Ultralytics YOLO and OpenVINO?
Balancing latency and throughput optimization requires understanding your application needs:
- **Latency Optimization:** Ideal for real-time applications requiring immediate responses (e.g., consumer-grade apps).
- **Throughput Optimization:** Best for scenarios with many concurrent inferences, maximizing resource use (e.g., large-scale deployments).
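
One simple way to encode this choice is a helper that maps the application's needs to a performance hint. This is a minimal sketch: `choose_performance_hint` is a hypothetical function, and the `"PERFORMANCE_HINT"` key is assumed to be the string name of OpenVINO's performance-mode property.

```python
def choose_performance_hint(real_time: bool) -> dict:
    """Return a compile config matching the application's priority."""
    # Real-time apps favor per-request latency; batch workloads favor total throughput.
    mode = "LATENCY" if real_time else "THROUGHPUT"
    return {"PERFORMANCE_HINT": mode}


# Example: an interactive camera app versus an offline video-processing job.
interactive_config = choose_performance_hint(real_time=True)
batch_config = choose_performance_hint(real_time=False)
```

Either dictionary could then be passed as the config argument when compiling the model.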
Using OpenVINO's high-level performance hints and multi-device modes can help strike the right balance. Choose the appropriate [OpenVINO Performance hints](https://docs.ultralytics.com/integrations/openvino#openvino-performance-hints) based on your specific requirements.
### Can I use Ultralytics YOLO models with other AI frameworks besides OpenVINO?
Yes, Ultralytics YOLO models are highly versatile and can be integrated with various AI frameworks. Options include:
- **TensorRT:** For NVIDIA GPU optimization, follow the [TensorRT integration guide](https://docs.ultralytics.com/integrations/tensorrt).
- **CoreML:** For Apple devices, refer to our [CoreML export instructions](https://docs.ultralytics.com/integrations/coreml).
- **TensorFlow.js:** For web and Node.js apps, see the [TF.js conversion guide](https://docs.ultralytics.com/integrations/tfjs).
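
For orientation, exporting to these frameworks with the Ultralytics Python API might look like the sketch below. The `EXPORT_FORMATS` mapping and the `export_model` helper are hypothetical names introduced for illustration, and the weights filename in the usage note is a placeholder; the linked guides remain the authoritative reference for each format.

```python
# Framework names from the list above mapped to Ultralytics export format strings.
EXPORT_FORMATS = {
    "TensorRT": "engine",
    "CoreML": "coreml",
    "TensorFlow.js": "tfjs",
}


def export_model(weights: str, framework: str) -> str:
    """Export YOLO weights to the given framework (hypothetical helper; needs ultralytics)."""
    # Imported lazily so EXPORT_FORMATS can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    # export() returns the path of the exported model file or directory.
    return model.export(format=EXPORT_FORMATS[framework])
```

For example, `export_model("yolov8n.pt", "CoreML")` would be expected to produce a CoreML package, assuming the weights file is available and the export dependencies are installed.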
Explore more integrations on the [Ultralytics Integrations page](https://docs.ultralytics.com/integrations).