Fix mkdocs.yml raw image URLs (#14213)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com>
2024-07-05 02:25:02 +02:00
commit 5d479c73c2
parent d5db9c916f
69 changed files with 4767 additions and 223 deletions
--- a/docs/en/guides/optimizing-openvino-latency-vs-throughput-modes.md
+++ b/docs/en/guides/optimizing-openvino-latency-vs-throughput-modes.md
@@ -66,3 +66,63 @@ For more detailed technical information and the latest updates, refer to the [Op
 ---

 Ensuring your models achieve optimal performance is not just about tweaking configurations; it's about understanding your application's needs and making informed decisions. Whether you're optimizing for real-time responses or maximizing throughput for large-scale processing, the combination of Ultralytics YOLO models and OpenVINO offers a powerful toolkit for developers to deploy high-performance AI solutions.
+
+## FAQ
+
+### How do I optimize Ultralytics YOLO models for low latency using OpenVINO?
+
+Optimizing Ultralytics YOLO models for low latency involves several key strategies:
+
+1. **Single Inference per Device:** Limit inferences to one at a time per device to minimize delays.
+2. **Leveraging Sub-Devices:** Use devices such as multi-socket CPUs or multi-tile GPUs, which can handle multiple requests with minimal latency increase.
+3. **OpenVINO Performance Hints:** Use OpenVINO's `ov::hint::PerformanceMode::LATENCY` during model compilation for simplified, device-agnostic tuning.
+
+For more practical tips on optimizing latency, check out the [Latency Optimization section](#optimizing-for-latency) of our guide.
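+
+As a minimal sketch of the performance-hint strategy in item 3, the snippet below compiles a model with the LATENCY hint through the OpenVINO Python API. The helper names, the default `"CPU"` device, and the model path passed in are illustrative assumptions, not part of this guide; the string property `"PERFORMANCE_HINT"` is used here as the Python-side counterpart of `ov::hint::PerformanceMode::LATENCY`.
+
+```python
+def latency_config() -> dict:
+    """Return a compile-time config that requests the LATENCY performance hint."""
+    # String property names are accepted by compile_model alongside the typed
+    # constants in openvino.properties.hint.
+    return {"PERFORMANCE_HINT": "LATENCY"}
+
+
+def compile_for_latency(model_path: str, device: str = "CPU"):
+    """Compile the model at model_path (a hypothetical path) for low latency."""
+    import openvino as ov  # imported lazily so latency_config() works without OpenVINO
+
+    core = ov.Core()
+    model = core.read_model(model_path)
+    return core.compile_model(model, device, latency_config())
+```
+
+Calling `compile_for_latency("path/to/model.xml")` would then return a compiled model tuned for single-request latency on the chosen device.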
+
+### Why should I use OpenVINO for optimizing Ultralytics YOLO throughput?
+
+OpenVINO increases Ultralytics YOLO model throughput by keeping device resources fully utilized across many concurrent inference requests. Key benefits include:
+
+- **Performance Hints:** Simple, high-level performance tuning across devices.
+- **Explicit Batching and Streams:** Manual control over batch size and the number of parallel streams for advanced tuning.
+- **Multi-Device Execution:** Automated inference load balancing, easing application-level management.
+
+Example configuration:
+
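+```python
+import openvino as ov
+import openvino.properties.hint as hints
+
+core = ov.Core()
+model = core.read_model("model.xml")  # placeholder path to an exported OpenVINO model
+# Request the THROUGHPUT performance hint at compile time
+config = {hints.performance_mode: hints.PerformanceMode.THROUGHPUT}
+compiled_model = core.compile_model(model, "GPU", config)
+```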
+
+Learn more about throughput optimization in the [Throughput Optimization section](#optimizing-for-throughput) of our detailed guide.
+
+### What is the best practice for reducing first-inference latency in OpenVINO?
+
+To reduce first-inference latency, consider these practices:
+
+1. **Model Caching:** Use model caching to decrease load and compile times.
+2. **Model Mapping vs. Reading:** Use mapping (`ov::enable_mmap(true)`) by default but switch to reading (`ov::enable_mmap(false)`) if the model is on a removable or network drive.
+3. **AUTO Device Selection:** Utilize AUTO mode to start with CPU inference and transition to an accelerator seamlessly.
+
+For detailed strategies on managing first-inference latency, refer to the [Managing First-Inference Latency section](#managing-first-inference-latency).
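+
+The caching and AUTO practices above can be sketched roughly as follows. This is an illustration under assumptions: the helper names, the `ov_cache` directory, and the model path are hypothetical, and the memory-mapping switch from item 2 is left out.
+
+```python
+def startup_properties(cache_dir: str = "ov_cache") -> dict:
+    """Return Core-level properties that enable model caching."""
+    # With CACHE_DIR set, compiled blobs are stored on disk and reused on later
+    # runs, which shortens load and compile time on devices that support caching.
+    return {"CACHE_DIR": cache_dir}
+
+
+def compile_with_fast_startup(model_path: str, cache_dir: str = "ov_cache"):
+    """Compile the model at model_path (hypothetical) with caching on the AUTO device."""
+    import openvino as ov  # imported lazily so startup_properties() works without OpenVINO
+
+    core = ov.Core()
+    core.set_property(startup_properties(cache_dir))
+    model = core.read_model(model_path)
+    # AUTO can begin inference on the CPU while an accelerator finishes compiling.
+    return core.compile_model(model, "AUTO", {"PERFORMANCE_HINT": "LATENCY"})
+```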
+
+### How do I balance optimizing for latency and throughput with Ultralytics YOLO and OpenVINO?
+
+Balancing latency and throughput optimization requires understanding your application needs:
+
+- **Latency Optimization:** Ideal for real-time applications requiring immediate responses (e.g., consumer-grade apps).
+- **Throughput Optimization:** Best for scenarios with many concurrent inferences, maximizing resource use (e.g., large-scale deployments).
+
+OpenVINO's high-level performance hints and multi-device modes can help strike the right balance. Choose the appropriate [OpenVINO performance hint](https://docs.ultralytics.com/integrations/openvino#openvino-performance-hints) based on your specific requirements.
+
+### Can I use Ultralytics YOLO models with other AI frameworks besides OpenVINO?
+
+Yes. Ultralytics YOLO models can be exported to and deployed with several other frameworks and runtimes. Options include:
+
+- **TensorRT:** For NVIDIA GPU optimization, follow the [TensorRT integration guide](https://docs.ultralytics.com/integrations/tensorrt).
+- **CoreML:** For Apple devices, refer to our [CoreML export instructions](https://docs.ultralytics.com/integrations/coreml).
+- **TensorFlow.js:** For web and Node.js apps, see the [TF.js conversion guide](https://docs.ultralytics.com/integrations/tfjs).
+
+Explore more integrations on the [Ultralytics Integrations page](https://docs.ultralytics.com/integrations).