Add FAQ sections to Modes and Tasks (#14181)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Abirami Vina <abirami.vina@gmail.com>
Co-authored-by: RizwanMunawar <chr043416@gmail.com>
Co-authored-by: Muhammad Rizwan Munawar <muhammadrizwanmunawar123@gmail.com>
This commit is contained in:
Glenn Jocher 2024-07-04 17:16:16 +02:00 committed by GitHub
parent e285d3d1b2
commit 6c13bea7b8
39 changed files with 2247 additions and 481 deletions

View file

@ -10,7 +10,7 @@ keywords: Model Deployment, Machine Learning Model Deployment, ML Model Deployme
Model deployment is the [step in a computer vision project](./steps-of-a-cv-project.md) that brings a model from the development phase into a real-world application. There are various [model deployment options](./model-deployment-options.md): cloud deployment offers scalability and ease of access, edge deployment reduces latency by bringing the model closer to the data source, and local deployment ensures privacy and control. Choosing the right strategy depends on your application's needs, balancing speed, security, and scalability.
It’s also important to follow best practices when deploying a model because deployment can significantly impact the effectiveness and reliability of the model's performance. In this guide, we’ll focus on how to make sure that your model deployment is smooth, efficient, and secure.
It's also important to follow best practices when deploying a model because deployment can significantly impact the effectiveness and reliability of the model's performance. In this guide, we'll focus on how to make sure that your model deployment is smooth, efficient, and secure.
## Model Deployment Options
@ -113,7 +113,7 @@ It's essential to control who can access your model and its data to prevent unau
### Model Obfuscation
Protecting your model from being reverse-engineered or misused can be done through model obfuscation. It involves encrypting model parameters, such as weights and biases in neural networks, to make it difficult for unauthorized individuals to understand or alter the model. You can also obfuscate the model’s architecture by renaming layers and parameters or adding dummy layers, making it harder for attackers to reverse-engineer it. Serving the model in a secure environment, like a secure enclave or a trusted execution environment (TEE), can provide an extra layer of protection during inference.
Protecting your model from being reverse-engineered or misused can be done through model obfuscation. It involves encrypting model parameters, such as weights and biases in neural networks, to make it difficult for unauthorized individuals to understand or alter the model. You can also obfuscate the model's architecture by renaming layers and parameters or adding dummy layers, making it harder for attackers to reverse-engineer it. Serving the model in a secure environment, like a secure enclave or a trusted execution environment (TEE), can provide an extra layer of protection during inference.
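As a small illustration of the encryption idea (not a complete obfuscation scheme), a serialized weights file can be encrypted at rest before distribution and decrypted only inside a trusted environment. This sketch assumes the third-party `cryptography` package and a placeholder `model_weights.pt` file:

```python
from pathlib import Path

from cryptography.fernet import Fernet

# Generate a symmetric key (store it securely, e.g. in a secrets manager)
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt the serialized model weights before distribution (placeholder filename)
weights = Path("model_weights.pt").read_bytes()
Path("model_weights.pt.enc").write_bytes(cipher.encrypt(weights))

# At inference time, decrypt in memory inside your trusted environment
decrypted_weights = cipher.decrypt(Path("model_weights.pt.enc").read_bytes())
```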
## Share Ideas With Your Peers

View file

@ -10,11 +10,11 @@ keywords: Overfitting and Underfitting in Machine Learning, Model Testing, Data
After [training](./model-training-tips.md) and [evaluating](./model-evaluation-insights.md) your model, it's time to test it. Model testing involves assessing how well it performs in real-world scenarios. Testing considers factors like accuracy, reliability, fairness, and how easy it is to understand the model's decisions. The goal is to make sure the model performs as intended, delivers the expected results, and fits into the [overall objective of your application](./defining-project-goals.md) or project.
Model testing’s definition is quite similar to model evaluation, but they are two distinct [steps in a computer vision project](./steps-of-a-cv-project.md). Model evaluation involves metrics and plots to assess the model's accuracy. On the other hand, model testing checks if the model’s learned behavior is the same as expectations. In this guide, we’ll explore strategies for testing your computer vision models.
Model testing is defined similarly to model evaluation, but they are two distinct [steps in a computer vision project](./steps-of-a-cv-project.md). Model evaluation involves metrics and plots to assess the model's accuracy. Model testing, on the other hand, checks whether the model's learned behavior matches expectations. In this guide, we'll explore strategies for testing your computer vision models.
## Model Testing Vs. Model Evaluation
First, let’s understand the difference between model evaluation and testing with an example.
First, let's understand the difference between model evaluation and testing with an example.
Suppose you have trained a computer vision model to recognize cats and dogs, and you want to deploy this model at a pet store to monitor the animals. During the model evaluation phase, you use a labeled dataset to calculate metrics like accuracy, precision, recall, and F1 score. For instance, the model might have an accuracy of 98% in distinguishing between cats and dogs in a given dataset.
@ -26,7 +26,7 @@ Computer vision models learn from datasets by detecting patterns, making predict
Here are two points to keep in mind before testing your model:
- **Realistic Representation:** The previously unseen testing data should be similar to the data that the model will have to handle when deployed. This helps get a realistic understanding of the model’s capabilities.
- **Realistic Representation:** The previously unseen testing data should be similar to the data that the model will have to handle when deployed. This helps get a realistic understanding of the model's capabilities.
- **Sufficient Size:** The size of the testing dataset needs to be large enough to provide reliable insights into how well the model performs.
## Testing Your Computer Vision Model
@ -35,18 +35,18 @@ Here are the key steps to take to test your computer vision model and understand
- **Run Predictions:** Use the model to make predictions on the test dataset.
- **Compare Predictions:** Check how well the model's predictions match the actual labels (ground truth).
- **Calculate Performance Metrics:** [Compute metrics](./yolo-performance-metrics.md) like accuracy, precision, recall, and F1 score to understand the model’s strengths and weaknesses. Testing focuses on how these metrics reflect real-world performance.
- **Calculate Performance Metrics:** [Compute metrics](./yolo-performance-metrics.md) like accuracy, precision, recall, and F1 score to understand the model's strengths and weaknesses. Testing focuses on how these metrics reflect real-world performance.
- **Visualize Results:** Create visual aids like confusion matrices and ROC curves. These help you spot specific areas where the model might not be performing well in practical applications.
Next, the testing results can be analyzed (a short example sketch follows this list):
- **Misclassified Images:** Identify and review images that the model misclassified to understand where it is going wrong.
- **Error Analysis:** Perform a thorough error analysis to understand the types of errors (e.g., false positives vs. false negatives) and their potential causes.
- **Bias and Fairness:** Check for any biases in the model’s predictions. Ensure that the model performs equally well across different subsets of the data, especially if it includes sensitive attributes like race, gender, or age.
- **Bias and Fairness:** Check for any biases in the model's predictions. Ensure that the model performs equally well across different subsets of the data, especially if it includes sensitive attributes like race, gender, or age.
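Building on the comparison and error-analysis steps above, here is a minimal, framework-agnostic sketch that assumes you already have per-image ground-truth and predicted class labels (the label lists below are illustrative):

```python
from sklearn.metrics import classification_report, confusion_matrix

# Illustrative ground-truth and predicted labels for a small test set
y_true = ["cat", "dog", "dog", "cat", "dog", "cat"]
y_pred = ["cat", "dog", "cat", "cat", "dog", "dog"]

# The confusion matrix shows which classes are confused with each other
print(confusion_matrix(y_true, y_pred, labels=["cat", "dog"]))

# Per-class precision, recall, and F1 help locate weaknesses
print(classification_report(y_true, y_pred, labels=["cat", "dog"]))

# Indices of misclassified samples, useful for reviewing the images themselves
misclassified = [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]
print(misclassified)
```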
## Testing Your YOLOv8 Model
To test your YOLOv8 model, you can use the validation mode. It's a straightforward way to understand the model's strengths and areas that need improvement. Also, you’ll need to format your test dataset correctly for YOLOv8. For more details on how to use the validation mode, check out the [Model Validation](../modes/val.md) docs page.
To test your YOLOv8 model, you can use the validation mode. It's a straightforward way to understand the model's strengths and areas that need improvement. Also, you'll need to format your test dataset correctly for YOLOv8. For more details on how to use the validation mode, check out the [Model Validation](../modes/val.md) docs page.
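As a brief sketch, assuming a custom-trained checkpoint and a dataset YAML that defines a `test` split (both paths below are placeholders):

```python
from ultralytics import YOLO

# Load your trained model (placeholder path)
model = YOLO("path/to/best.pt")

# Evaluate on the held-out test split defined in your dataset YAML
metrics = model.val(data="path/to/data.yaml", split="test")

# Key detection metrics
print(metrics.box.map)  # mAP50-95
print(metrics.box.map50)  # mAP50
```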
## Using YOLOv8 to Predict on Multiple Test Images
@ -76,7 +76,7 @@ Overfitting happens when your model learns the training data too well, including
### Underfitting
Underfitting occurs when your model can’t capture the underlying patterns in the data. In computer vision, an underfitted model might not even recognize objects correctly in the training images.
Underfitting occurs when your model can't capture the underlying patterns in the data. In computer vision, an underfitted model might not even recognize objects correctly in the training images.
#### Signs of Underfitting
@ -85,10 +85,11 @@ Underfitting occurs when your model can’t capture the underlying patterns in t
### Balancing Overfitting and Underfitting
The key is to find a balance between overfitting and underfitting. Ideally, a model should perform well on both training and validation datasets. Regularly monitoring your model’s performance through metrics and visual inspections, along with applying the right strategies, can help you achieve the best results.
The key is to find a balance between overfitting and underfitting. Ideally, a model should perform well on both training and validation datasets. Regularly monitoring your model's performance through metrics and visual inspections, along with applying the right strategies, can help you achieve the best results.
<p align="center">
<img width="100%" src="https://viso.ai/wp-content/uploads/2022/07/overfitting-underfitting-appropriate-fitting.jpg" alt="Overfitting and Underfitting Overview">
</p>
## Data Leakage in Computer Vision and How to Avoid It
@ -100,7 +101,7 @@ Data leakage can be tricky to spot and often comes from hidden biases in the tra
- **Camera Bias:** Different angles, lighting, shadows, and camera movements can introduce unwanted patterns.
- **Overlay Bias:** Logos, timestamps, or other overlays in images can mislead the model.
- **Font and Object Bias:** Specific fonts or objects that frequently appear in certain classes can skew the model’s learning.
- **Font and Object Bias:** Specific fonts or objects that frequently appear in certain classes can skew the model's learning.
- **Spatial Bias:** Imbalances in foreground-background, bounding box distributions, and object locations can affect training.
- **Label and Domain Bias:** Incorrect labels or shifts in data types can lead to leakage.
@ -110,7 +111,7 @@ To find data leakage, you can:
- **Check Performance:** If the model's results are surprisingly good, it might be leaking.
- **Look at Feature Importance:** If one feature is much more important than others, it could indicate leakage.
- **Visual Inspection:** Double-check that the model’s decisions make sense intuitively.
- **Visual Inspection:** Double-check that the model's decisions make sense intuitively.
- **Verify Data Separation:** Make sure data was divided correctly before any processing.
### Avoiding Data Leakage
@ -138,4 +139,4 @@ These resources will help you navigate challenges and remain updated on the late
## In Summary
Building trustworthy computer vision models relies on rigorous model testing. By testing the model with previously unseen data, we can analyze it and spot weaknesses like overfitting and data leakage. Addressing these issues before deployment helps the model perform well in real-world applications. It’s important to remember that model testing is just as crucial as model evaluation in guaranteeing the model's long-term success and effectiveness.
Building trustworthy computer vision models relies on rigorous model testing. By testing the model with previously unseen data, we can analyze it and spot weaknesses like overfitting and data leakage. Addressing these issues before deployment helps the model perform well in real-world applications. It's important to remember that model testing is just as crucial as model evaluation in guaranteeing the model's long-term success and effectiveness.

View file

@ -241,57 +241,68 @@ The original FastSAM paper can be found on [arXiv](https://arxiv.org/abs/2306.12
## FAQ
### What is FastSAM and how does it work?
### What is FastSAM and how does it differ from SAM?
FastSAM, or Fast Segment Anything Model, is a real-time CNN-based solution designed to segment any object within an image. It decouples the segmentation task into two stages: all-instance segmentation and prompt-guided selection. The first stage uses [YOLOv8-seg](../tasks/segment.md) to produce segmentation masks for all instances in the image. The second stage outputs the region-of-interest based on user prompts. This approach significantly reduces computational demands while maintaining competitive performance, making it ideal for various vision tasks.
FastSAM, short for Fast Segment Anything Model, is a real-time convolutional neural network (CNN)-based solution designed to reduce computational demands while maintaining high performance in object segmentation tasks. Unlike the Segment Anything Model (SAM), which uses a heavier Transformer-based architecture, FastSAM leverages [Ultralytics YOLOv8-seg](../tasks/segment.md) for efficient instance segmentation in two stages: all-instance segmentation followed by prompt-guided selection.
### How does FastSAM compare to the Segment Anything Model (SAM)?
### How does FastSAM achieve real-time segmentation performance?
FastSAM addresses the limitations of SAM, which is a heavy Transformer model requiring substantial computational resources. FastSAM offers similar performance with significantly reduced computational demands by leveraging CNNs for real-time segmentation. It achieves competitive results on benchmarks like MS COCO with faster inference speeds using a single NVIDIA RTX 3090. This makes FastSAM a more efficient and practical solution for real-time industrial applications.
FastSAM achieves real-time segmentation by decoupling the segmentation task into all-instance segmentation with YOLOv8-seg and prompt-guided selection stages. By utilizing the computational efficiency of CNNs, FastSAM offers significant reductions in computational and resource demands while maintaining competitive performance. This dual-stage approach enables FastSAM to deliver fast and efficient segmentation suitable for applications requiring quick results.
### Can I use FastSAM for real-time segmentation and what are its practical applications?
### What are the practical applications of FastSAM?
Yes, FastSAM is designed for real-time segmentation tasks. Its efficiency and reduced computational demands make it suitable for various practical applications, including:
FastSAM is practical for a variety of computer vision tasks that require real-time segmentation performance. Applications include:
- Industrial automation where quick segmentation results are necessary.
- Real-time tracking in video streams ([tracking mode](../modes/track.md)).
- Real-time object detection and segmentation in autonomous systems.
- Security and surveillance systems requiring prompt object segmentation.
- Industrial automation for quality control and assurance
- Real-time video analysis for security and surveillance
- Autonomous vehicles for object detection and segmentation
- Medical imaging for precise and quick segmentation tasks
### How do I use FastSAM for inference in Python?
Its ability to handle various user interaction prompts makes FastSAM adaptable and flexible for diverse scenarios.
You can easily integrate FastSAM into your Python applications for inference. Here's an example:
### How do I use the FastSAM model for inference in Python?
To use FastSAM for inference in Python, you can follow the example below:
```python
from ultralytics import FastSAM
from ultralytics.models.fastsam import FastSAMPrompt
# Define an inference source
source = "path/to/image.jpg"
source = "path/to/bus.jpg"
# Create a FastSAM model
model = FastSAM("FastSAM-s.pt") # or FastSAM-x.pt
# Run inference on an image
results = model(source, device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
everything_results = model(source, device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
# Process the prompts
prompt_process = FastSAMPrompt(source, results, device="cpu")
annotations = prompt_process.everything_prompt()
prompt_process.plot(annotations=annotations, output="./")
# Prepare a Prompt Process object
prompt_process = FastSAMPrompt(source, everything_results, device="cpu")
# Everything prompt
ann = prompt_process.everything_prompt()
# Bounding box prompt
ann = prompt_process.box_prompt(bbox=[200, 200, 300, 300])
# Text prompt
ann = prompt_process.text_prompt(text="a photo of a dog")
# Point prompt
ann = prompt_process.point_prompt(points=[[200, 200]], pointlabel=[1])
prompt_process.plot(annotations=ann, output="./")
```
This snippet demonstrates the simplicity of loading a pre-trained model and running predictions. For more details, refer to the [predict mode](../modes/predict.md).
For more details on inference methods, check the [Predict Usage](#predict-usage) section of the documentation.
### What are the key features of FastSAM?
### What types of prompts does FastSAM support for segmentation tasks?
FastSAM offers several key features:
FastSAM supports multiple prompt types for guiding the segmentation tasks:
1. **Real-time solution**: Leveraging CNNs for immediate results.
2. **Efficiency and performance**: Comparable to SAM but with reduced computational resources.
3. **Prompt-guided segmentation**: Flexibility to segment objects based on various user interactions.
4. **Based on YOLOv8-seg**: Utilizes YOLOv8's capabilities for instance segmentation.
5. **Benchmark performance**: High scores on MS COCO with faster inference speeds.
6. **Model compression feasibility**: Demonstrates significant reduction in computational effort while maintaining performance.
- **Everything Prompt**: Generates segmentation for all visible objects.
- **Bounding Box (BBox) Prompt**: Segments objects within a specified bounding box.
- **Text Prompt**: Uses a descriptive text to segment objects matching the description.
- **Point Prompt**: Segments objects near specific user-defined points.
These features make FastSAM a powerful tool for a wide array of vision tasks. For a comprehensive list of features, visit the [FastSAM overview](#overview).
This flexibility allows FastSAM to adapt to a wide range of user interaction scenarios, enhancing its utility across different applications. For more information on using these prompts, refer to the [Key Features](#key-features) section.

View file

@ -98,52 +98,44 @@ For detailed steps, consult our [Contributing Guide](../help/contributing.md).
## FAQ
### What types of tasks can Ultralytics YOLO models handle?
### What are the key advantages of using Ultralytics YOLOv8 for object detection?
Ultralytics YOLO models support a range of tasks including [object detection](../tasks/detect.md), [instance segmentation](../tasks/segment.md), [image classification](../tasks/classify.md), [pose estimation](../tasks/pose.md), and [multi-object tracking](../modes/track.md). These models are designed to achieve high performance in different computer vision applications, making them versatile tools for various project needs.
Ultralytics YOLOv8 offers enhanced capabilities such as real-time object detection, instance segmentation, pose estimation, and classification. Its optimized architecture ensures high-speed performance without sacrificing accuracy, making it ideal for a variety of applications. YOLOv8 also includes built-in compatibility with popular datasets and models, as detailed on the [YOLOv8 documentation page](../models/yolov8.md).
### How do I train a YOLOv8 model for object detection?
### How can I train a YOLOv8 model on custom data?
To train a YOLOv8 model for object detection, you can either use the Python API or the Command Line Interface (CLI). Below is an example using Python:
Training a YOLOv8 model on custom data can be easily accomplished using Ultralytics' libraries. Here's a quick example:
```python
from ultralytics import YOLO
!!! Example
# Load a COCO-pretrained YOLOv8n model
model = YOLO("yolov8n.pt")
=== "Python"
# Display model information (optional)
model.info()
```python
from ultralytics import YOLO
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```
# Load a YOLOv8n model
model = YOLO("yolov8n.pt")
# Train the model on custom dataset
results = model.train(data="custom_data.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
yolo train model=yolov8n.pt data='custom_data.yaml' epochs=100 imgsz=640
```
For more detailed instructions, visit the [Train](../modes/train.md) documentation page.
### Can I contribute my own model to Ultralytics?
Yes, you can contribute your own model to Ultralytics. To do so, follow these steps:
1. **Fork the Repository**: Fork the [Ultralytics GitHub repository](https://github.com/ultralytics/ultralytics).
2. **Clone Your Fork**: Clone your fork to your local machine and create a new branch.
3. **Implement Your Model**: Add your model while following the coding standards in the [Contributing Guide](../help/contributing.md).
4. **Test Thoroughly**: Ensure your model passes all tests.
5. **Create a Pull Request**: Submit your work for review.
Visit the [Contributing Guide](../help/contributing.md) for detailed steps.
### Which YOLO versions are supported by Ultralytics?
Ultralytics supports a wide range of YOLO versions from [YOLOv3](yolov3.md) to the latest [YOLOv10](yolov10.md). Each version has unique features and improvements. For instance, YOLOv8 supports tasks such as instance segmentation and pose estimation, while YOLOv10 offers NMS-free training and efficiency-accuracy driven architecture.
Ultralytics supports a comprehensive range of YOLO (You Only Look Once) versions from YOLOv3 to YOLOv10, along with models like NAS, SAM, and RT-DETR. Each version is optimized for various tasks such as detection, segmentation, and classification. For detailed information on each model, refer to the [Models Supported by Ultralytics](../models/index.md) documentation.
### How can I run inference with a YOLOv8 model using the Command Line Interface (CLI)?
### Why should I use Ultralytics HUB for machine learning projects?
To run inference with a YOLOv8 model using the CLI, use the following command:
Ultralytics HUB provides a no-code, end-to-end platform for training, deploying, and managing YOLO models. It simplifies complex workflows, enabling users to focus on model performance and application. The HUB also offers cloud training capabilities, comprehensive dataset management, and user-friendly interfaces. Learn more about it on the [Ultralytics HUB](../hub/index.md) documentation page.
```bash
# Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image
yolo predict model=yolov8n.pt source=path/to/bus.jpg
```
### What types of tasks can YOLOv8 perform, and how does it compare to other YOLO versions?
For more information on using CLI commands, visit the [Predict](../modes/predict.md) documentation page.
YOLOv8 is a versatile model capable of performing tasks including object detection, instance segmentation, classification, and pose estimation. Compared to earlier versions like YOLOv3 and YOLOv4, YOLOv8 offers significant improvements in speed and accuracy due to its optimized architecture. For a deeper comparison, refer to the [YOLOv8 documentation](../models/yolov8.md) and the [Task pages](../tasks/index.md) for more details on specific tasks.
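As a short sketch, the same `YOLO` class loads task-specific checkpoints; the standard pretrained weight names are used here:

```python
from ultralytics import YOLO

# Task-specific pretrained checkpoints
detect = YOLO("yolov8n.pt")  # object detection
segment = YOLO("yolov8n-seg.pt")  # instance segmentation
classify = YOLO("yolov8n-cls.pt")  # image classification
pose = YOLO("yolov8n-pose.pt")  # pose estimation

# Each model exposes the same predict interface
results = detect("path/to/bus.jpg")
```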

View file

@ -120,9 +120,13 @@ If you find MobileSAM useful in your research or development work, please consid
## FAQ
### How do I use MobileSAM for image segmentation on a mobile application?
### What is MobileSAM and how does it differ from the original SAM model?
MobileSAM is specifically designed for lightweight and fast image segmentation on mobile applications. To get started, you can download the model weights [here](https://github.com/ChaoningZhang/MobileSAM/blob/master/weights/mobile_sam.pt) and use the following Python code snippet for inference:
MobileSAM is a lightweight, fast image segmentation model designed for mobile applications. It retains the same pipeline as the original SAM but replaces the heavyweight ViT-H encoder (632M parameters) with a smaller Tiny-ViT encoder (5M parameters). This change results in MobileSAM being approximately 5 times smaller and 7 times faster than the original SAM. For instance, MobileSAM operates at about 12ms per image, compared to the original SAM's 456ms. You can learn more about the MobileSAM implementation in various projects [here](https://github.com/ChaoningZhang/MobileSAM).
### How can I test MobileSAM using Ultralytics?
Testing MobileSAM in Ultralytics can be accomplished through straightforward methods. You can use Point and Box prompts to predict segments. Here's an example using a Point prompt:
```python
from ultralytics import SAM
@ -134,35 +138,22 @@ model = SAM("mobile_sam.pt")
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
```
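A Box prompt follows the same pattern; here is a brief sketch (the bounding-box coordinates are illustrative):

```python
from ultralytics import SAM

# Load the MobileSAM weights
model = SAM("mobile_sam.pt")

# Predict a segment based on a box prompt (xyxy pixel coordinates)
model.predict("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])
```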
For more detailed usage and various prompts, refer to the [SAM page](sam.md).
You can also refer to the [Testing MobileSAM](#testing-mobilesam-in-ultralytics) section for more details.
### What are the performance benefits of using MobileSAM over the original SAM?
### Why should I use MobileSAM for my mobile application?
MobileSAM offers significant improvements in both size and speed over the original SAM. Here is a detailed comparison:
MobileSAM is ideal for mobile applications due to its lightweight architecture and fast inference speed. Compared to the original SAM, MobileSAM is approximately 5 times smaller and 7 times faster, making it suitable for environments where computational resources are limited. This efficiency ensures that mobile devices can perform real-time image segmentation without significant latency. Additionally, MobileSAM's supported operations, such as [Inference](../modes/predict.md), are optimized for mobile performance.
- **Image Encoder**: MobileSAM uses a smaller Tiny-ViT (5M parameters) instead of the original heavyweight ViT-H (611M parameters), resulting in an 8ms encoding time versus 452ms with SAM.
- **Overall Pipeline**: MobileSAM's entire pipeline, including image encoding and mask decoding, operates at 12ms per image compared to SAM's 456ms, making it approximately 7 times faster.
In summary, MobileSAM is about 5 times smaller and 7 times faster than the original SAM, making it ideal for mobile applications.
### How was MobileSAM trained, and is the training code available?
### Why should developers adopt MobileSAM for mobile applications?
MobileSAM was trained on a single GPU with a 100k dataset, which is 1% of the original images, in less than a day. While the training code will be made available in the future, you can currently explore other aspects of MobileSAM in the [MobileSAM GitHub repository](https://github.com/ChaoningZhang/MobileSAM). This repository includes implementation details for various applications, and pre-trained weights can be downloaded from the [Ultralytics assets release](https://github.com/ultralytics/assets/releases/download/v8.2.0/mobile_sam.pt).
Developers should consider using MobileSAM for mobile applications due to its lightweight and fast performance, making it highly efficient for real-time image segmentation tasks.
### What are the primary use cases for MobileSAM?
- **Efficiency**: MobileSAM's Tiny-ViT encoder allows for rapid processing, achieving segmentation results in just 12ms.
- **Size**: The model size is significantly reduced, making it easier to deploy and run on mobile devices.
These advancements facilitate real-time applications, such as augmented reality, mobile games, and other interactive experiences.
MobileSAM is designed for fast and efficient image segmentation in mobile environments. Primary use cases include:
Learn more about the MobileSAM's performance on its [project page](https://github.com/ChaoningZhang/MobileSAM).
- **Real-time object detection and segmentation** for mobile applications.
- **Low-latency image processing** in devices with limited computational resources.
- **Integration in AI-driven mobile apps** for tasks such as augmented reality (AR) and real-time analytics.
### How easy is it to transition from the original SAM to MobileSAM?
Transitioning from the original SAM to MobileSAM is straightforward as MobileSAM retains the same pipeline, including pre-processing, post-processing, and interfaces. Only the image encoder has been changed to the more efficient Tiny-ViT. Users currently using SAM can switch to MobileSAM with minimal code modifications, benefiting from improved performance without the need for significant reconfiguration.
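As a sketch of how small the change typically is, only the weights file needs to be swapped while the interface stays the same:

```python
from ultralytics import SAM

# Previously: model = SAM("sam_b.pt")
# With MobileSAM, only the weights change
model = SAM("mobile_sam.pt")

# The same prompt-based predict call works unchanged
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
```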
### What tasks are supported by the MobileSAM model?
The MobileSAM model supports instance segmentation tasks. Currently, it is optimized for [Inference](../modes/predict.md) mode. Additional tasks like validation, training, and export are not supported at this time, as indicated in the mode compatibility table:
| Model Type | Tasks Supported | Inference | Validation | Training | Export |
| ---------- | -------------------------------------------- | --------- | ---------- | -------- | ------ |
| MobileSAM | [Instance Segmentation](../tasks/segment.md) | ✅ | ❌ | ❌ | ❌ |
For more information about supported tasks and operational modes, check the [tasks page](../tasks/segment.md) and the mode details like [Inference](../modes/predict.md), [Validation](../modes/val.md), and [Export](../modes/export.md).
For more detailed use cases and performance comparisons, see the section on [Adapting from SAM to MobileSAM](#adapting-from-sam-to-mobilesam).

View file

@ -102,62 +102,52 @@ We would like to acknowledge Baidu and the [PaddlePaddle](https://github.com/Pad
## FAQ
### What is Baidu's RT-DETR and how does it work?
### What is Baidu's RT-DETR model and how does it work?
Baidu's RT-DETR (Real-Time Detection Transformer) is an end-to-end vision transformer-based object detector designed for real-time performance without compromising accuracy. Unlike traditional object detectors, it employs a convolutional backbone with an efficient hybrid encoder that handles multiscale feature processing by decoupling intra-scale interaction and cross-scale fusion. The model also utilizes IoU-aware query selection for initializing object queries, which improves detection accuracy. For flexible applications, the inference speed can be adjusted using different decoder layers without retraining. For more details, you can check out the [original paper](https://arxiv.org/abs/2304.08069).
Baidu's RT-DETR (Real-Time Detection Transformer) is an advanced real-time object detector built upon the Vision Transformer architecture. It efficiently processes multiscale features by decoupling intra-scale interaction and cross-scale fusion through its efficient hybrid encoder. By employing IoU-aware query selection, the model focuses on the most relevant objects, enhancing detection accuracy. Its adaptable inference speed, achieved by adjusting decoder layers without retraining, makes RT-DETR suitable for various real-time object detection scenarios. Learn more about RT-DETR features [here](https://arxiv.org/pdf/2304.08069.pdf).
### How can I use a pre-trained RT-DETR model with Ultralytics?
### How can I use the pre-trained RT-DETR models provided by Ultralytics?
Using a pre-trained RT-DETR model with the Ultralytics Python API is straightforward. Here's an example:
You can leverage Ultralytics Python API to use pre-trained PaddlePaddle RT-DETR models. For instance, to load an RT-DETR-l model pre-trained on COCO val2017 and achieve high FPS on T4 GPU, you can utilize the following example:
```python
from ultralytics import RTDETR
!!! Example
# Load a COCO-pretrained RT-DETR-l model
model = RTDETR("rtdetr-l.pt")
=== "Python"
# Display model information (optional)
model.info()
```python
from ultralytics import RTDETR
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
# Load a COCO-pretrained RT-DETR-l model
model = RTDETR("rtdetr-l.pt")
# Run inference with the RT-DETR-l model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```
# Display model information (optional)
model.info()
You can find more details on specific modes like [Predict](../modes/predict.md), [Train](../modes/train.md), and [Export](../modes/export.md).
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
### What are the key features of RT-DETR that make it unique?
# Run inference with the RT-DETR-l model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```
The RT-DETR model has several key features that set it apart:
=== "CLI"
1. **Efficient Hybrid Encoder**: This design processes multiscale features by decoupling intra-scale interaction and cross-scale fusion, reducing computational costs.
2. **IoU-aware Query Selection**: Enhances object query initialization, focusing on the most relevant objects for higher detection accuracy.
3. **Adaptable Inference Speed**: The model supports flexible adjustments of inference speed by using different decoder layers without retraining, making it highly adaptable for various real-time object detection scenarios.
```bash
# Load a COCO-pretrained RT-DETR-l model and train it on the COCO8 example dataset for 100 epochs
yolo train model=rtdetr-l.pt data=coco8.yaml epochs=100 imgsz=640
### What performance can I expect from RT-DETR on different scales?
# Load a COCO-pretrained RT-DETR-l model and run inference on the 'bus.jpg' image
yolo predict model=rtdetr-l.pt source=path/to/bus.jpg
```
The Ultralytics Python API provides pre-trained PaddlePaddle RT-DETR models in different scales, offering notable performance metrics:
### Why should I choose Baidu's RT-DETR over other real-time object detectors?
- **RT-DETR-L**: Achieves 53.0% AP on COCO val2017 and runs at 114 FPS on a T4 GPU.
- **RT-DETR-X**: Achieves 54.8% AP on COCO val2017 and runs at 74 FPS on a T4 GPU.
Baidu's RT-DETR stands out due to its efficient hybrid encoder and IoU-aware query selection, which drastically reduce computational costs while maintaining high accuracy. Its unique ability to adjust inference speed by using different decoder layers without retraining adds significant flexibility. This makes it particularly advantageous for applications requiring real-time performance on accelerated backends like CUDA with TensorRT, outclassing many other real-time object detectors.
This makes the RT-DETR models highly efficient for real-time applications requiring both speed and accuracy.
### How does RT-DETR support adaptable inference speed for different real-time applications?
### How can I acknowledge Baidu's contribution if I use RT-DETR in my research?
Baidu's RT-DETR allows flexible adjustments of inference speed by using different decoder layers without requiring retraining. This adaptability is crucial for scaling performance across various real-time object detection tasks. Whether you need faster processing for lower precision needs or slower, more accurate detections, RT-DETR can be tailored to meet your specific requirements.
If you use Baidu's RT-DETR in your research or development work, you should cite the original paper. Here is the BibTeX entry for your reference:
### Can I use RT-DETR models with other Ultralytics modes, such as training, validation, and export?
```bibtex
@misc{lv2023detrs,
title={DETRs Beat YOLOs on Real-time Object Detection},
author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
year={2023},
eprint={2304.08069},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
Additionally, acknowledge Baidu and the [PaddlePaddle](https://github.com/PaddlePaddle/PaddleDetection) team for creating and maintaining this valuable resource for the computer vision community.
Yes, RT-DETR models are compatible with various Ultralytics modes including training, validation, prediction, and export. You can refer to the respective documentation for detailed instructions on how to utilize these modes: [Train](../modes/train.md), [Val](../modes/val.md), [Predict](../modes/predict.md), and [Export](../modes/export.md). This ensures a comprehensive workflow for developing and deploying your object detection solutions.
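A brief sketch of validation and export with a pre-trained checkpoint (ONNX is shown as one common export target):

```python
from ultralytics import RTDETR

# Load a COCO-pretrained RT-DETR-l model
model = RTDETR("rtdetr-l.pt")

# Validate on the COCO8 example dataset
metrics = model.val(data="coco8.yaml")

# Export the model for deployment, e.g. to ONNX
model.export(format="onnx")
```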

View file

@ -226,25 +226,42 @@ We would like to express our gratitude to Meta AI for creating and maintaining t
## FAQ
### What is the Segment Anything Model (SAM)?
### What is the Segment Anything Model (SAM) by Ultralytics?
The Segment Anything Model (SAM) is a cutting-edge image segmentation model designed for promptable segmentation, allowing it to generate segmentation masks based on spatial or text-based prompts. SAM is capable of zero-shot transfer, meaning it can adapt to new image distributions and tasks without prior knowledge. It's trained on the extensive [SA-1B dataset](https://ai.facebook.com/datasets/segment-anything/), which comprises over 1 billion masks on 11 million images. For more details, check the [Introduction to SAM](#introduction-to-sam-the-segment-anything-model).
The Segment Anything Model (SAM) by Ultralytics is a revolutionary image segmentation model designed for promptable segmentation tasks. It leverages advanced architecture, including image and prompt encoders combined with a lightweight mask decoder, to generate high-quality segmentation masks from various prompts such as spatial or text cues. Trained on the expansive [SA-1B dataset](https://ai.facebook.com/datasets/segment-anything/), SAM excels in zero-shot performance, adapting to new image distributions and tasks without prior knowledge. Learn more [here](#introduction-to-sam-the-segment-anything-model).
### How does SAM achieve zero-shot performance in image segmentation?
### How can I use the Segment Anything Model (SAM) for image segmentation?
SAM achieves zero-shot performance by leveraging its advanced architecture, which includes a robust image encoder, a prompt encoder, and a lightweight mask decoder. This configuration enables SAM to respond effectively to any given prompt and adapt to new tasks without additional training. Its training on the highly diverse SA-1B dataset further enhances its adaptability. Learn more about its architecture in the [Key Features of the Segment Anything Model](#key-features-of-the-segment-anything-model-sam).
You can use the Segment Anything Model (SAM) for image segmentation by running inference with various prompts such as bounding boxes or points. Here's an example using Python:
### Can I use SAM for tasks other than segmentation?
```python
from ultralytics import SAM
Yes, SAM can be employed for various downstream tasks beyond its primary segmentation role. These tasks include edge detection, object proposal generation, instance segmentation, and preliminary text-to-mask prediction. Through prompt engineering, SAM can adapt swiftly to new tasks and data distributions, offering flexible applications. For practical use cases and examples, refer to the [How to Use SAM](#how-to-use-sam-versatility-and-power-in-image-segmentation) section.
# Load a model
model = SAM("sam_b.pt")
### How does SAM compare to Ultralytics YOLOv8 models?
# Segment with bounding box prompt
model("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])
While SAM excels in automatic, promptable segmentation, Ultralytics YOLOv8 models are smaller, faster, and more efficient for object detection and instance segmentation tasks. For instance, the YOLOv8n-seg model is significantly smaller and faster than the SAM-b model, making it ideal for applications requiring high-speed processing with lower computational resources. See a detailed comparison in the [SAM comparison vs YOLOv8](#sam-comparison-vs-yolov8) section.
# Segment with points prompt
model("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
```
### How can I auto-annotate a segmentation dataset using SAM?
Alternatively, you can run inference with SAM in the command line interface (CLI):
To auto-annotate a segmentation dataset, you can use the `auto_annotate` function provided by the Ultralytics framework. This function allows you to automatically generate high-quality segmentation masks using a pre-trained detection model paired with the SAM segmentation model:
```bash
yolo predict model=sam_b.pt source=path/to/image.jpg
```
For more detailed usage instructions, visit the [Segmentation section](#sam-prediction-example).
### How do SAM and YOLOv8 compare in terms of performance?
Compared to YOLOv8, SAM models like SAM-b and FastSAM-s are larger and slower but offer unique capabilities for automatic segmentation. For instance, Ultralytics [YOLOv8n-seg](../tasks/segment.md) is 53.4 times smaller and 866 times faster than SAM-b. However, SAM's zero-shot performance makes it highly flexible and efficient in diverse, untrained tasks. Learn more about performance comparisons between SAM and YOLOv8 [here](#sam-comparison-vs-yolov8).
### How can I auto-annotate my dataset using SAM?
Ultralytics' SAM offers an auto-annotation feature that allows generating segmentation datasets using a pre-trained detection model. Here's an example in Python:
```python
from ultralytics.data.annotator import auto_annotate
@ -252,4 +269,12 @@ from ultralytics.data.annotator import auto_annotate
auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model="sam_b.pt")
```
This approach accelerates the annotation process by bypassing manual labeling, making it especially useful for large datasets. For step-by-step instructions, visit [Generate Your Segmentation Dataset Using a Detection Model](#generate-your-segmentation-dataset-using-a-detection-model).
This function takes the path to your images and optional arguments for pre-trained detection and SAM segmentation models, along with device and output directory specifications. For a complete guide, see [Auto-Annotation](#auto-annotation-a-quick-path-to-segmentation-datasets).
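As a sketch of passing those optional arguments (the device string and output directory below are placeholders):

```python
from ultralytics.data.annotator import auto_annotate

# Auto-annotate with explicit device and output directory (placeholder values)
auto_annotate(
    data="path/to/images",
    det_model="yolov8x.pt",
    sam_model="sam_b.pt",
    device="cpu",
    output_dir="path/to/output/labels",
)
```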
### What datasets are used to train the Segment Anything Model (SAM)?
SAM is trained on the extensive [SA-1B dataset](https://ai.facebook.com/datasets/segment-anything/) which comprises over 1 billion masks across 11 million images. SA-1B is the largest segmentation dataset to date, providing high-quality and diverse training data, ensuring impressive zero-shot performance in varied segmentation tasks. For more details, visit the [Dataset section](#key-features-of-the-segment-anything-model-sam).
---
This FAQ aims to address common questions related to the Segment Anything Model (SAM) from Ultralytics, enhancing user understanding and facilitating effective use of Ultralytics products. For additional information, explore the relevant sections linked throughout.

View file

@ -119,60 +119,46 @@ We express our gratitude to Deci AI's [SuperGradients](https://github.com/Deci-A
## FAQ
### What is YOLO-NAS and how does it differ from previous YOLO models?
### What is YOLO-NAS and how does it improve over previous YOLO models?
YOLO-NAS, developed by Deci AI, is an advanced object detection model built using Neural Architecture Search (NAS). It offers significant improvements over previous YOLO models, including:
YOLO-NAS, developed by Deci AI, is a state-of-the-art object detection model leveraging advanced Neural Architecture Search (NAS) technology. It addresses the limitations of previous YOLO models by introducing features like quantization-friendly basic blocks and sophisticated training schemes. This results in significant improvements in performance, particularly in environments with limited computational resources. YOLO-NAS also supports quantization, maintaining high accuracy even when converted to its INT8 version, enhancing its suitability for production environments. For more details, see the [Overview](#overview) section.
- **Quantization-Friendly Basic Block:** This helps reduce the precision drop when the model is converted to INT8 quantization.
- **Enhanced Training and Quantization:** YOLO-NAS utilizes sophisticated training schemes and post-training quantization techniques.
- **Pre-trained on Large Datasets:** Utilizes the COCO, Objects365, and Roboflow 100 datasets, making it highly robust for downstream tasks.
### How can I integrate YOLO-NAS models into my Python application?
For more details, refer to the [Overview of YOLO-NAS](#overview).
### How can I use YOLO-NAS models in my Python application?
Ultralytics makes it easy to integrate YOLO-NAS models into your Python applications via the `ultralytics` package. Here's a basic example:
You can easily integrate YOLO-NAS models into your Python application using the `ultralytics` package. Here's a simple example of how to load a pre-trained YOLO-NAS model and perform inference:
```python
from ultralytics import NAS
# Load a pre-trained YOLO-NAS-s model
# Load a COCO-pretrained YOLO-NAS-s model
model = NAS("yolo_nas_s.pt")
# Display model information
model.info()
# Validate the model on the COCO8 dataset
# Validate the model on the COCO8 example dataset
results = model.val(data="coco8.yaml")
# Run inference with the YOLO-NAS-s model on an image
results = model("path/to/image.jpg")
# Run inference with the YOLO-NAS-s model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```
For additional examples, see the [Usage Examples](#usage-examples) section of the documentation.
For more information, refer to the [Inference and Validation Examples](#inference-and-validation-examples).
### Why should I use YOLO-NAS for object detection tasks?
### What are the key features of YOLO-NAS and why should I consider using it?
YOLO-NAS offers several advantages that make it a compelling choice for object detection:
YOLO-NAS introduces several key features that make it a superior choice for object detection tasks:
- **High Performance:** Achieves a balance between accuracy and latency, crucial for real-time applications.
- **Pre-Trained on Diverse Datasets:** Provides robust models for various use cases with extensive pre-training on datasets like COCO and Objects365.
- **Quantization Efficiency:** For applications requiring low latency, the INT8 quantized versions show minimal precision drop, making them suitable for resource-constrained environments.
- **Quantization-Friendly Basic Block:** Enhanced architecture that improves model performance with minimal precision drop post quantization.
- **Sophisticated Training and Quantization:** Employs advanced training schemes and post-training quantization techniques.
- **AutoNAC Optimization and Pre-training:** Utilizes AutoNAC optimization and is pre-trained on prominent datasets like COCO, Objects365, and Roboflow 100.
These features contribute to its high accuracy, efficient performance, and suitability for deployment in production environments. Learn more in the [Key Features](#key-features) section.
For a detailed comparison of model variants, see [Pre-trained Models](#pre-trained-models).
### Which tasks and modes are supported by YOLO-NAS models?
### What are the supported tasks and modes for YOLO-NAS models?
YOLO-NAS models support various object detection tasks and modes such as inference, validation, and export. They do not support training. The supported models include YOLO-NAS-s, YOLO-NAS-m, and YOLO-NAS-l, each tailored to different computational capacities and performance needs. For a detailed overview, refer to the [Supported Tasks and Modes](#supported-tasks-and-modes) section.
YOLO-NAS models support several tasks and modes, including:
### Are there pre-trained YOLO-NAS models available and how do I access them?
- **Object Detection:** Suitable for identifying and localizing objects in images.
- **Inference and Validation:** Models can be used for both inference and validation to assess performance.
- **Export:** YOLO-NAS models can be exported to various formats for deployment.
Yes, Ultralytics provides pre-trained YOLO-NAS models that you can access directly. These models are pre-trained on datasets like COCO, ensuring high performance in terms of both speed and accuracy. You can download these models using the links provided in the [Pre-trained Models](#pre-trained-models) section. Here are some examples:
However, the YOLO-NAS implementation using the `ultralytics` package does not currently support training. For more information, visit the [Supported Tasks and Modes](#supported-tasks-and-modes) section.
### How does quantization impact the performance of YOLO-NAS models?
Quantization can significantly reduce the model size and improve inference speed with minimal impact on accuracy. YOLO-NAS introduces a quantization-friendly basic block, resulting in minimal precision loss when converted to INT8. This makes YOLO-NAS highly efficient for deployment in scenarios with resource constraints.
To understand the performance metrics of INT8 quantized models, refer to the [Pre-trained Models](#pre-trained-models) section.
- [YOLO-NAS-s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolo_nas_s.pt)
- [YOLO-NAS-m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolo_nas_m.pt)
- [YOLO-NAS-l](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolo_nas_l.pt)

View file

@ -338,41 +338,13 @@ For further reading, the original YOLO-World paper is available on [arXiv](https
## FAQ
### What is the YOLO-World Model and how does it improve open-vocabulary object detection?
### What is the YOLO-World model and how does it work?
The YOLO-World Model is an advanced, real-time object detection model based on Ultralytics YOLOv8, designed specifically for open-vocabulary detection tasks. It leverages vision-language modeling and pre-training on large datasets to detect a broad range of objects based on descriptive texts, significantly reducing computational demands while maintaining high performance. This makes it suitable for real-time applications across various industries needing immediate results.
The YOLO-World model is an advanced, real-time object detection approach based on the [Ultralytics YOLOv8](yolov8.md) framework. It excels in Open-Vocabulary Detection tasks by identifying objects within an image based on descriptive texts. Using vision-language modeling and pre-training on large datasets, YOLO-World achieves high efficiency and performance with significantly reduced computational demands, making it ideal for real-time applications across various industries.
### How do I train a custom YOLO-World Model using the Ultralytics API?
### How does YOLO-World handle inference with custom prompts?
Training a custom YOLO-World Model is straightforward with Ultralytics' API. You can use pretrained weights and configuration files to start training on your dataset. Here is an example of training with Python:
```python
from ultralytics import YOLOWorld
# Load a pretrained YOLOv8s-worldv2 model
model = YOLOWorld("yolov8s-worldv2.pt")
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
# Run inference with the YOLOv8s-worldv2 model on an image
results = model("path/to/image.jpg")
```
Refer to the [Training](../modes/train.md) page for more details.
### What are the main advantages of using YOLO-World for object detection?
YOLO-World offers several advantages:
- **Real-time detection**: Utilizes CNNs for high-speed inference.
- **Lower computational demand**: Efficiently processes images with minimal resources.
- **Open-vocabulary detection**: Detects objects without predefined categories, based on descriptive texts.
- **High performance**: Outperforms other models on standard benchmarks while running on a single NVIDIA V100 GPU.
### Can I customize the classes YOLO-World detects without retraining the model?
Yes, YOLO-World allows you to dynamically specify detection classes through custom prompts without retraining the model. Here's an example of how to set custom classes:
YOLO-World supports a "prompt-then-detect" strategy, which utilizes an offline vocabulary to enhance efficiency. Custom prompts like captions or specific object categories are pre-encoded and stored as offline vocabulary embeddings. This approach streamlines the detection process without the need for retraining. You can dynamically set these prompts within the model to tailor it to specific detection tasks, as shown below:
```python
from ultralytics import YOLOWorld
@ -385,11 +357,80 @@ model.set_classes(["person", "bus"])
# Execute prediction on an image
results = model.predict("path/to/image.jpg")
# Show results
results[0].show()
```
You can learn more about this feature on the [Predict Usage](#predict-usage) section.
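If you want to reuse a custom vocabulary later, the model can be saved after setting the classes; a short sketch follows (the output filename is a placeholder):

```python
from ultralytics import YOLOWorld

# Load a pretrained model and define a custom vocabulary
model = YOLOWorld("yolov8s-world.pt")
model.set_classes(["person", "bus"])

# Save the model with the embedded custom vocabulary for later reuse
model.save("custom_yolov8s.pt")
```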
### Why should I choose YOLO-World over traditional Open-Vocabulary detection models?
### What datasets are supported for training YOLO-World from scratch?
YOLO-World provides several advantages over traditional Open-Vocabulary detection models:
YOLO-World supports various datasets for training, including Objects365, GQA, and Flickr30k for detection and grounding tasks. For validation, it supports datasets like LVIS minival. Detailed information about preparing and using these datasets can be found in the [Zero-shot Transfer on COCO Dataset](#zero-shot-transfer-on-coco-dataset) section.
- **Real-Time Performance:** It leverages the computational speed of CNNs to offer quick, efficient detection.
- **Efficiency and Low Resource Requirement:** YOLO-World maintains high performance while significantly reducing computational and resource demands.
- **Customizable Prompts:** The model supports dynamic prompt setting, allowing users to specify custom detection classes without retraining.
- **Benchmark Excellence:** It outperforms other open-vocabulary detectors like MDETR and GLIP in both speed and efficiency on standard benchmarks.
### How do I train a YOLO-World model on my dataset?
Training a YOLO-World model on your dataset is straightforward through the provided Python API or CLI commands. Here's how to start training using Python:
```python
from ultralytics import YOLOWorld
# Load a pretrained YOLOv8s-worldv2 model
model = YOLOWorld("yolov8s-worldv2.pt")
# Train the model on the COCO8 dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```
Or using CLI:
```bash
yolo train model=yolov8s-worldv2.yaml data=coco8.yaml epochs=100 imgsz=640
```
### What are the available pre-trained YOLO-World models and their supported tasks?
Ultralytics offers multiple pre-trained YOLO-World models supporting various tasks and operating modes:
| Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
| --------------- | ------------------------------------------------------------------------------------------------------- | -------------------------------------- | --------- | ---------- | -------- | ------ |
| YOLOv8s-world | [yolov8s-world.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-world.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ❌ |
| YOLOv8s-worldv2 | [yolov8s-worldv2.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s-worldv2.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ✅ |
| YOLOv8m-world | [yolov8m-world.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-world.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ❌ |
| YOLOv8m-worldv2 | [yolov8m-worldv2.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m-worldv2.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ✅ |
| YOLOv8l-world | [yolov8l-world.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-world.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ❌ |
| YOLOv8l-worldv2 | [yolov8l-worldv2.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l-worldv2.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ✅ |
| YOLOv8x-world | [yolov8x-world.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-world.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ❌ |
| YOLOv8x-worldv2 | [yolov8x-worldv2.pt](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x-worldv2.pt) | [Object Detection](../tasks/detect.md) | ✅ | ✅ | ✅ | ✅ |
### How do I reproduce the official results of YOLO-World from scratch?
To reproduce the official results from scratch, you need to prepare the datasets and launch the training using the provided code. The training procedure involves creating a data dictionary and running the `train` method with a custom trainer:
```python
from ultralytics import YOLOWorld
from ultralytics.models.yolo.world.train_world import WorldTrainerFromScratch
data = {
"train": {
"yolo_data": ["Objects365.yaml"],
"grounding_data": [
{
"img_path": "../datasets/flickr30k/images",
"json_file": "../datasets/flickr30k/final_flickr_separateGT_train.json",
},
{
"img_path": "../datasets/GQA/images",
"json_file": "../datasets/GQA/final_mixed_train_no_coco.json",
},
],
},
"val": {"yolo_data": ["lvis.yaml"]},
}
model = YOLOWorld("yolov8s-worldv2.yaml")
model.train(data=data, batch=128, epochs=100, trainer=WorldTrainerFromScratch)
```

View file

@ -19,7 +19,7 @@ Real-time object detection aims to accurately predict object categories and posi
The architecture of YOLOv10 builds upon the strengths of previous YOLO models while introducing several key innovations. The model architecture consists of the following components:
1. **Backbone**: Responsible for feature extraction, the backbone in YOLOv10 uses an enhanced version of CSPNet (Cross Stage Partial Network) to improve gradient flow and reduce computational redundancy.
2. **Neck**: The neck aggregates features from different scales and passes them to the head. It includes PAN (Path Aggregation Network) layers for effective multi-scale feature fusion.
3. **One-to-Many Head**: Generates multiple predictions per object during training to provide rich supervisory signals and improve learning accuracy.
4. **One-to-One Head**: Generates a single best prediction per object during inference to eliminate the need for NMS, thereby reducing latency and improving efficiency.
@ -90,8 +90,8 @@ Compared to other state-of-the-art detectors:
Here is a detailed comparison of YOLOv10 variants with other state-of-the-art models:
| Model | Params<br><sup>(M) | FLOPs<br><sup>(G) | mAP<sup>val<br>50-95 | Latency<br><sup>(ms) | Latency-forward<br><sup>(ms) |
| ------------------ | ------------------ | ----------------- | -------------------- | -------------------- | ---------------------------- |
| YOLOv6-3.0-N | 4.7 | 11.4 | 37.0 | 2.69 | **1.76** |
| Gold-YOLO-N | 5.6 | 12.1 | **39.6** | 2.92 | 1.82 |
| YOLOv8-N | 3.2 | 8.7 | 37.3 | 6.16 | 1.77 |
@ -233,63 +233,56 @@ For detailed implementation, architectural innovations, and experimental results
## FAQ
### What is YOLOv10 and how does it differ from previous versions?
### What is YOLOv10 and how does it differ from previous YOLO versions?
YOLOv10 is the latest version of the YOLO (You Only Look Once) series for real-time object detection, optimized for both accuracy and efficiency. Unlike previous versions, YOLOv10 eliminates the need for non-maximum suppression (NMS) by utilizing consistent dual assignments, significantly reducing inference latency. It also introduces a holistic model design approach and various architectural enhancements, making it superior in performance and computational efficiency.
YOLOv10, developed by researchers at [Tsinghua University](https://www.tsinghua.edu.cn/en/), introduces several key innovations to real-time object detection. It eliminates the need for non-maximum suppression (NMS) by employing consistent dual assignments during training and optimized model components for superior performance with reduced computational overhead. For more details on its architecture and key features, check out the [YOLOv10 overview](#overview) section.
Learn more about YOLOv10's architecture in the [Architecture section](#architecture).
### How can I use YOLOv10 for real-time object detection in Python?

To use YOLOv10 for real-time object detection, you can use the Ultralytics YOLO Python library or the command line interface (CLI):

!!! Example

=== "Python"

```python
from ultralytics import YOLO

# Load a pre-trained YOLOv10n model
model = YOLO("yolov10n.pt")

# Perform object detection on an image
results = model("image.jpg")

# Display the results
results[0].show()
```

=== "CLI"

```bash
# Load a COCO-pretrained YOLOv10n model and run inference on the 'bus.jpg' image
yolo detect predict model=yolov10n.pt source=path/to/bus.jpg
```

For more usage examples, visit our [Usage Examples](#usage-examples) section.
### What are the key features that make YOLOv10 stand out?

YOLOv10 offers several innovative features:

- **NMS-Free Training**: Utilizes consistent dual assignments to eliminate NMS, reducing inference latency.
- **Holistic Model Design**: Comprehensive optimization of model components for both efficiency and accuracy, including lightweight classification heads and large-kernel convolutions.
- **Enhanced Model Capabilities**: Incorporates partial self-attention modules and other advanced techniques to boost performance without significant computational cost.

Dive into more details on these features in the [Key Features](#key-features) section.
### How does the NMS-free approach in YOLOv10 improve performance?

YOLOv10 eliminates the need for non-maximum suppression (NMS) during inference by employing consistent dual assignments for training. This approach reduces inference latency and enhances prediction efficiency. The architecture also includes a one-to-one head for inference, ensuring that each object gets a single best prediction. For a detailed explanation, see the [Consistent Dual Assignments for NMS-Free Training](#consistent-dual-assignments-for-nms-free-training) section.

### Which model variants are available in YOLOv10 and how do they differ?

YOLOv10 offers several variants tailored for different application needs:

- **YOLOv10-N**: Nano version for extremely resource-constrained environments.
- **YOLOv10-S**: Small version balancing speed and accuracy.
- **YOLOv10-M**: Medium version for general-purpose use.
- **YOLOv10-B**: Balanced version with increased width for higher accuracy.
- **YOLOv10-L**: Large version for higher accuracy at the cost of increased computational resources.
- **YOLOv10-X**: Extra-large version for maximum accuracy and performance.

Each variant offers a trade-off between computational efficiency and detection accuracy, suitable for various real-time applications. See the complete [Model Variants](#model-variants) for more information.

### Where can I find the export options for YOLOv10 models?

YOLOv10 supports several export formats, including TorchScript, ONNX, OpenVINO, and TensorRT. However, not all export formats provided by Ultralytics are currently supported for YOLOv10 due to its new operations. For details on the supported formats and instructions on exporting, visit the [Exporting YOLOv10](#exporting-yolov10) section.
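As a minimal sketch of that export workflow, the snippet below converts a COCO-pretrained YOLOv10n model to ONNX with the standard Ultralytics `export()` call; substitute another supported format string as needed:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv10n model
model = YOLO("yolov10n.pt")

# Export to ONNX, one of the formats listed as supported for YOLOv10
model.export(format="onnx")
```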
### How does YOLOv10's performance compare to other object detection models?

YOLOv10 outperforms previous YOLO versions and other state-of-the-art models in both accuracy and efficiency. For example, YOLOv10-S is 1.8x faster than RT-DETR-R18 with a similar AP on the COCO dataset, and YOLOv10-B has 46% less latency and 25% fewer parameters than YOLOv9-C while maintaining equivalent performance. Detailed benchmarks can be found in the [Comparisons](#comparisons) section.

View file

@ -99,52 +99,87 @@ Thank you to Joseph Redmon and Ali Farhadi for developing the original YOLOv3.
## FAQ
### What are the differences between YOLOv3, YOLOv3-Ultralytics, and YOLOv3u?

YOLOv3 is the third iteration of the YOLO (You Only Look Once) object detection algorithm developed by Joseph Redmon, known for its balance of accuracy and speed, utilizing three different scales (13x13, 26x26, and 52x52) for detections. YOLOv3-Ultralytics is Ultralytics' adaptation of YOLOv3 that adds support for more pre-trained models and facilitates easier model customization. YOLOv3u is an upgraded variant of YOLOv3-Ultralytics, integrating the anchor-free, objectness-free split head from YOLOv8, improving detection robustness and accuracy for various object sizes. For more details on the variants, refer to the [YOLOv3 series](https://github.com/ultralytics/yolov3).

### What is YOLOv3, and how does it improve object detection?

YOLOv3 is the third iteration of the _You Only Look Once (YOLO)_ object detection algorithm. It enhances object detection accuracy by utilizing three different sizes of detection kernels: 13x13, 26x26, and 52x52. This allows the model to detect objects at multiple scales, improving accuracy for objects of varying sizes. YOLOv3 also supports multi-label predictions for bounding boxes and includes a superior feature extractor network.

### Why should I use Ultralytics' implementation of YOLOv3?

Ultralytics' implementation of YOLOv3, known as YOLOv3-Ultralytics, retains the original model's architecture but adds significant enhancements. It offers more pre-trained models, additional training methods, and customization options, making it user-friendly and versatile for practical applications. This implementation enhances the usability and flexibility of YOLOv3 in real-world object detection tasks.

### How does YOLOv3u differ from YOLOv3 and YOLOv3-Ultralytics?

YOLOv3u is an updated version of YOLOv3-Ultralytics that incorporates the anchor-free, objectness-free split head used in YOLOv8 models. This update eliminates the need for pre-defined anchor boxes and objectness scores, making YOLOv3u more robust and accurate in detecting objects of varying sizes and shapes, without altering the backbone and neck architecture of YOLOv3.

### Can I use YOLOv3 models for multiple prediction tasks?

Yes, the YOLOv3 series, including YOLOv3, YOLOv3-Ultralytics, and YOLOv3u, are designed for object detection tasks. They support several modes such as [Inference](../modes/predict.md), [Validation](../modes/val.md), [Training](../modes/train.md), and [Export](../modes/export.md). This versatility ensures they can be used effectively across different stages of model deployment and development in various applications.
### How can I train a YOLOv3 model using Ultralytics?
Training a YOLOv3 model with Ultralytics is straightforward. You can train the model using either Python or CLI:

!!! Example

=== "Python"

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv3n model
model = YOLO("yolov3n.pt")

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```

=== "CLI"

```bash
# Load a COCO-pretrained YOLOv3n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov3n.pt data=coco8.yaml epochs=100 imgsz=640
```

For more comprehensive training options and guidelines, visit our [Train mode documentation](../modes/train.md).

### What makes YOLOv3u more accurate for object detection tasks?
YOLOv3u improves upon YOLOv3 and YOLOv3-Ultralytics by incorporating the anchor-free, objectness-free split head used in YOLOv8 models. This upgrade eliminates the need for pre-defined anchor boxes and objectness scores, enhancing its capability to detect objects of varying sizes and shapes more precisely. This makes YOLOv3u a better choice for complex and diverse object detection tasks. For more information, refer to the [Why YOLOv3u](#overview) section.
### How can I use YOLOv3 models for inference?
You can perform inference using YOLOv3 models by either Python scripts or CLI commands:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv3n model
model = YOLO("yolov3n.pt")
# Run inference with the YOLOv3n model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```
=== "CLI"
```bash
# Load a COCO-pretrained YOLOv3n model and run inference on the 'bus.jpg' image
yolo predict model=yolov3n.pt source=path/to/bus.jpg
```
Refer to the [Inference mode documentation](../modes/predict.md) for more details on running YOLO models.
### What tasks are supported by YOLOv3 and its variants?
YOLOv3, YOLOv3-Ultralytics, and YOLOv3u primarily support object detection tasks. These models can be used for various stages of model deployment and development, such as Inference, Validation, Training, and Export. For a comprehensive set of tasks supported and more in-depth details, visit our [Object Detection tasks documentation](../tasks/detect.md).
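To make those modes concrete, here is a minimal sketch that validates and then exports the COCO-pretrained `yolov3n.pt` weights; the dataset and export format choices are illustrative:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv3n model
model = YOLO("yolov3n.pt")

# Validate on the COCO8 example dataset
metrics = model.val(data="coco8.yaml")

# Export the model to ONNX format
model.export(format="onnx")
```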
### Where can I find resources to cite YOLOv3 in my research?
If you use YOLOv3 in your research, please cite the original YOLO papers and the Ultralytics YOLOv3 repository. Example BibTeX citation:
!!! Quote ""
=== "BibTeX"
```bibtex
@article{redmon2018yolov3,
title={YOLOv3: An Incremental Improvement},
author={Redmon, Joseph and Farhadi, Ali},
journal={arXiv preprint arXiv:1804.02767},
year={2018}
}
```
For more citation details, refer to the [Citations and Acknowledgements](#citations-and-acknowledgements) section.

View file

@ -71,41 +71,22 @@ The original YOLOv4 paper can be found on [arXiv](https://arxiv.org/abs/2004.109
## FAQ
### What is YOLOv4 and why should I use it for object detection?

YOLOv4, which stands for "You Only Look Once version 4," is a state-of-the-art real-time object detection model developed by Alexey Bochkovskiy in 2020. It achieves an optimal balance between speed and accuracy, making it highly suitable for real-time applications. YOLOv4's architecture incorporates several innovative features like Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), and Self-adversarial-training (SAT), among others, to achieve state-of-the-art results. If you're looking for a high-performance model that operates efficiently on conventional GPUs, YOLOv4 is an excellent choice.

### What are the key features of the YOLOv4 model?

YOLOv4 is designed with several innovative features that optimize its performance. Key features include:

- **Weighted-Residual-Connections (WRC)**
- **Cross-Stage-Partial-connections (CSP)**
- **Cross mini-Batch Normalization (CmBN)**
- **Self-adversarial training (SAT)**
- **Mish-activation**
- **Mosaic data augmentation**
- **DropBlock regularization**
- **CIoU loss**

These features collectively enhance YOLOv4's speed and accuracy, making it ideal for real-time object detection tasks. For more details on its architecture, you can visit the [YOLOv4 section](https://docs.ultralytics.com/models/yolov4).
### How does the architecture of YOLOv4 enhance its performance?

The architecture of YOLOv4 includes several key components: the backbone, the neck, and the head. The backbone, which can be models like VGG, ResNet, or CSPDarknet53, is pre-trained to predict classes and bounding boxes. The neck, utilizing PANet, connects feature maps from different stages for comprehensive data extraction. Finally, the head, which uses configurations from YOLOv3, makes the final object detections. YOLOv4 also employs "bag of freebies" techniques like mosaic data augmentation and DropBlock regularization, further optimizing its speed and accuracy.

### How does YOLOv4 compare to its predecessor, YOLOv3?

YOLOv4 introduces several improvements over YOLOv3, including advanced features such as Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), and Cross mini-Batch Normalization (CmBN). These enhancements contribute to better speed and accuracy in object detection:

- **Higher Accuracy:** YOLOv4 achieves state-of-the-art results in object detection benchmarks.
- **Improved Speed:** Despite its complex architecture, YOLOv4 maintains real-time performance.
- **Better Backbone and Neck:** YOLOv4 utilizes CSPDarknet53 as the backbone and PANet as the neck, which are more advanced than YOLOv3's components.

For more information, compare the features in the [YOLOv3](yolov3.md) and YOLOv4 documentation.
### Why is YOLOv4 considered suitable for real-time object detection on conventional GPUs?

YOLOv4 is designed to optimize both speed and accuracy, making it ideal for real-time object detection tasks that require quick and reliable performance. It operates efficiently on conventional GPUs, needing only one for both training and inference. This makes it accessible and practical for various applications ranging from recommendation systems to standalone process management, thereby reducing the need for extensive hardware setups and making it a cost-effective solution for real-time object detection.

### What is the "bag of freebies" in YOLOv4?

The "bag of freebies" in YOLOv4 refers to techniques that enhance model accuracy during training without increasing inference costs. These include:

- **Photometric Distortions:** Adjusting brightness, contrast, hue, saturation, and noise.
- **Geometric Distortions:** Applying random scaling, cropping, flipping, and rotating.

These techniques improve the model's robustness and ability to generalize across different image types. Learn more about these methods in the [YOLOv4 features and performance](#features-and-performance) section.

### Is YOLOv4 supported by Ultralytics?

As of the latest update, Ultralytics does not currently support YOLOv4 models. Users interested in utilizing YOLOv4 should refer to the original [YOLOv4 GitHub repository](https://github.com/AlexeyAB/darknet) for installation and usage instructions. Ultralytics intends to update their documentation and support once integration with YOLOv4 is implemented. For alternative models supported by Ultralytics, you can explore [Ultralytics YOLO models](https://docs.ultralytics.com/models/).

### How can I get started with YOLOv4 if Ultralytics does not currently support it?

To get started with YOLOv4, you should visit the official [YOLOv4 GitHub repository](https://github.com/AlexeyAB/darknet). Follow the installation instructions provided in the README file, which typically include cloning the repository, installing dependencies, and setting up environment variables. Once installed, you can train the model by preparing your dataset, configuring the model parameters, and following the usage instructions provided. Since Ultralytics does not currently support YOLOv4, it is recommended to refer directly to the YOLOv4 GitHub for the most up-to-date and detailed guidance.

View file

@ -115,69 +115,48 @@ Please note that YOLOv5 models are provided under [AGPL-3.0](https://github.com/
## FAQ
### What is YOLOv5u and how does it differ from YOLOv5?
### What is Ultralytics YOLOv5u and how does it differ from YOLOv5?
YOLOv5u is an advanced version of the YOLOv5 object detection model developed by Ultralytics. It introduces an anchor-free, objectness-free split head, a feature adopted from the YOLOv8 models. This architectural change enhances the model's accuracy-speed tradeoff, making it more efficient and flexible for various object detection tasks. Learn more about these features in the [YOLOv5 Overview](#overview).
Ultralytics YOLOv5u is an advanced version of YOLOv5, integrating the anchor-free, objectness-free split head that enhances the accuracy-speed tradeoff for real-time object detection tasks. Unlike the traditional YOLOv5, YOLOv5u adopts an anchor-free detection mechanism, making it more flexible and adaptive in diverse scenarios. For more detailed information on its features, you can refer to the [YOLOv5 Overview](#overview).
### Why should I use the anchor-free split head in YOLOv5u?

The anchor-free split head in YOLOv5u offers several advantages:

- **Flexibility:** It alleviates the need for predefined anchor boxes, making the model more adaptable to diverse object scales and shapes.
- **Simplicity:** Reducing dependencies on anchor boxes simplifies the model architecture, potentially decreasing the computational load.
- **Performance:** Empirical results show enhanced performance in terms of accuracy and speed, making it suitable for real-time applications.

For detailed information, see the [Anchor-free Split Ultralytics Head section](#key-features).

### How does the anchor-free Ultralytics head improve object detection performance in YOLOv5u?

The anchor-free Ultralytics head in YOLOv5u improves object detection performance by eliminating the dependency on predefined anchor boxes. This results in a more flexible and adaptive detection mechanism that can handle various object sizes and shapes with greater efficiency. This enhancement directly contributes to a balanced tradeoff between accuracy and speed, making YOLOv5u suitable for real-time applications. Learn more about its architecture in the [Key Features](#key-features) section.

### Can I use pre-trained YOLOv5u models for different tasks and modes?

Yes, you can use pre-trained YOLOv5u models for various tasks such as [Object Detection](../tasks/detect.md). These models support multiple modes, including [Inference](../modes/predict.md), [Validation](../modes/val.md), [Training](../modes/train.md), and [Export](../modes/export.md). This flexibility allows users to leverage the capabilities of YOLOv5u models across different operational requirements. For a detailed overview, check the [Supported Tasks and Modes](#supported-tasks-and-modes) section.
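As a quick illustration of the inference mode, here is a minimal sketch using the pre-trained `yolov5nu.pt` weights listed elsewhere on this page; the image path is a placeholder for your own file:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv5nu model
model = YOLO("yolov5nu.pt")

# Run inference on an image and display the result
results = model("path/to/image.jpg")
results[0].show()
```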
### How can I deploy the YOLOv5u model for real-time object detection?

Deploying YOLOv5u for real-time object detection involves several steps:

1. **Load the Model:**

```python
from ultralytics import YOLO

# Load a pre-trained YOLOv5u detection model (for example, the nano variant)
model = YOLO("yolov5nu.pt")
```

2. **Run Inference:**

```python
# Run inference on an image
results = model("path/to/image.jpg")
```

For a comprehensive guide, refer to the [Usage Examples](#usage-examples) section.

### How do the performance metrics of YOLOv5u models compare on different platforms?

The performance metrics of YOLOv5u models vary depending on the platform and hardware used. For example, the YOLOv5nu model achieves a 34.3 mAP on the COCO dataset with a speed of 73.6 ms on CPU (ONNX) and 1.06 ms on A100 TensorRT. Detailed performance metrics for different YOLOv5u models can be found in the [Performance Metrics](#performance-metrics) section, which provides a comprehensive comparison across various devices.

### What are the pre-trained model variants available for YOLOv5u?

YOLOv5u offers a variety of pre-trained models to cater to different needs:

- **YOLOv5nu**
- **YOLOv5su**
- **YOLOv5mu**
- **YOLOv5lu**
- **YOLOv5xu**
- **YOLOv5n6u**
- **YOLOv5s6u**
- **YOLOv5m6u**
- **YOLOv5l6u**
- **YOLOv5x6u**

These models support tasks like detection and offer various modes such as [Inference](../modes/predict.md), [Validation](../modes/val.md), [Training](../modes/train.md), and [Export](../modes/export.md). For detailed metrics, see the [Performance Metrics](#performance-metrics) section.

### How do YOLOv5u models perform on different hardware setups?

YOLOv5u models have been evaluated on both CPU and GPU hardware, demonstrating competitive performance metrics across various setups. For example:

- **YOLOv5nu.pt:**
    - **Speed (CPU ONNX):** 73.6 ms
    - **Speed (A100 TensorRT):** 1.06 ms
    - **mAP (50-95):** 34.3
- **YOLOv5lu.pt:**
    - **Speed (CPU ONNX):** 408.4 ms
    - **Speed (A100 TensorRT):** 2.50 ms
    - **mAP (50-95):** 52.2

For more detailed performance metrics, visit the [Performance Metrics](#performance-metrics) section.

### How can I train a YOLOv5u model using the Ultralytics Python API?

You can train a YOLOv5u model by loading a pre-trained model and running the training command with your dataset. Here's a quick example:

!!! Example

=== "Python"

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv5n model
model = YOLO("yolov5n.pt")

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```

=== "CLI"

```bash
# Load a COCO-pretrained YOLOv5n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov5n.pt data=coco8.yaml epochs=100 imgsz=640
```

For more detailed instructions, visit the [Usage Examples](#usage-examples) section.

View file

@ -107,53 +107,56 @@ The original YOLOv6 paper can be found on [arXiv](https://arxiv.org/abs/2301.055
## FAQ
### What is Meituan YOLOv6 and how does it differ from other YOLO models?
### What is Meituan YOLOv6 and what makes it unique?
Meituan YOLOv6 is a highly advanced object detection model that balances speed and accuracy, making it ideal for real-time applications. This model features unique enhancements such as the Bidirectional Concatenation (BiC) module, Anchor-Aided Training (AAT) strategy, and an improved backbone and neck design, providing state-of-the-art performance on the COCO dataset. Unlike prior YOLO models, YOLOv6 incorporates these innovative strategies to enhance both inference speed and detection accuracy.
Meituan YOLOv6 is a state-of-the-art object detector that balances speed and accuracy, ideal for real-time applications. It features notable architectural enhancements like the Bi-directional Concatenation (BiC) module and an Anchor-Aided Training (AAT) strategy. These innovations provide substantial performance gains with minimal speed degradation, making YOLOv6 a competitive choice for object detection tasks.
### How does the Bi-directional Concatenation (BiC) Module in YOLOv6 improve performance?

The Bi-directional Concatenation (BiC) module in YOLOv6 enhances localization signals in the detector's neck, delivering performance improvements with negligible speed impact. This module effectively combines different feature maps, increasing the model's ability to detect objects accurately. For more details on YOLOv6's features, refer to the [Key Features](#key-features) section.

### How do I use the YOLOv6 model in a Python script?

Using the YOLOv6 model in a Python script is straightforward. Here is a sample code snippet to get you started:

```python
from ultralytics import YOLO

# Build a YOLOv6n model from scratch
model = YOLO("yolov6n.yaml")

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Run inference with the YOLOv6n model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```

For more detailed examples and documentation, visit the [Train](../modes/train.md) and [Predict](../modes/predict.md) pages.

### How can I train a YOLOv6 model using Ultralytics?

You can train a YOLOv6 model using Ultralytics with simple Python or CLI commands. For instance:

!!! Example

=== "Python"

```python
from ultralytics import YOLO

# Build a YOLOv6n model from scratch
model = YOLO("yolov6n.yaml")

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```

=== "CLI"

```bash
yolo train model=yolov6n.yaml data=coco8.yaml epochs=100 imgsz=640
```

For more information, visit the [Train](../modes/train.md) page.
### What are the unique features of YOLOv6 that improve its performance?

YOLOv6 introduces several key features that enhance its performance:

- **Bidirectional Concatenation (BiC) Module**: Improves localization signals and offers performance gains with minimal speed degradation.
- **Anchor-Aided Training (AAT) Strategy**: Combines the benefits of anchor-based and anchor-free methods for better efficiency without sacrificing inference speed.
- **Enhanced Backbone and Neck Design**: Adds additional stages to the backbone and neck, achieving state-of-the-art results on high-resolution inputs.
- **Self-Distillation Strategy**: Boosts smaller model performance by refining the auxiliary regression branch during training and removing it during inference to maintain speed.

### What are the different versions of YOLOv6 and their performance metrics?

YOLOv6 offers multiple versions, each optimized for different performance requirements:

- YOLOv6-N: 37.5% AP at 1187 FPS
- YOLOv6-S: 45.0% AP at 484 FPS
- YOLOv6-M: 50.0% AP at 226 FPS
- YOLOv6-L: 52.8% AP at 116 FPS
- YOLOv6-L6: State-of-the-art accuracy in real-time scenarios

These models are evaluated on the COCO dataset using an NVIDIA Tesla T4 GPU. For more on performance metrics, see the [Performance Metrics](#performance-metrics) section.

### How does the Anchor-Aided Training (AAT) strategy benefit YOLOv6?

Anchor-Aided Training (AAT) in YOLOv6 combines elements of anchor-based and anchor-free approaches, enhancing the model's detection capabilities without compromising inference efficiency. This strategy leverages anchors during training to improve bounding box predictions, making YOLOv6 effective in diverse object detection tasks.

### Which operational modes are supported by YOLOv6 models in Ultralytics?

YOLOv6 supports various operational modes including Inference, Validation, Training, and Export. This flexibility allows users to fully exploit the model's capabilities in different scenarios. Check out the [Supported Tasks and Modes](#supported-tasks-and-modes) section for a detailed overview of each mode.

### How can YOLOv6 be used for mobile and embedded applications?

YOLOv6 supports quantized models for different precisions and models optimized for mobile platforms, making it suitable for applications requiring low-latency and energy-efficient computations. For deployment on mobile and edge devices, you can explore conversion to formats like TFLite and ONNX, as detailed in the [Export](../modes/export.md) documentation. Quantized models ensure high performance even on resource-constrained devices, enabling real-time object detection in mobile and IoT applications.
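For instance, a trained YOLOv6 checkpoint can be converted with the standard Ultralytics `export()` call. The sketch below assumes a hypothetical local checkpoint path; substitute your own trained weights:

```python
from ultralytics import YOLO

# Load a trained YOLOv6n checkpoint (illustrative path, replace with your own weights)
model = YOLO("path/to/yolov6n_trained.pt")

# Export to ONNX for broad runtime support
model.export(format="onnx")

# Export to TFLite for mobile and embedded deployment
model.export(format="tflite")
```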

View file

@ -115,22 +115,40 @@ The original YOLOv7 paper can be found on [arXiv](https://arxiv.org/pdf/2207.026
## FAQ
### What makes YOLOv7 the most accurate real-time object detector?
### What is YOLOv7 and why is it considered a breakthrough in real-time object detection?
YOLOv7 stands out due to its superior accuracy and speed. With an accuracy of 56.8% AP and the ability to process up to 161 FPS on a GPU V100, it surpasses all known real-time object detectors. Additionally, YOLOv7 introduces features like model re-parameterization, dynamic label assignment, and efficient parameter usage, enhancing both speed and accuracy. Check out the detailed comparison in the [source paper](https://arxiv.org/pdf/2207.02696.pdf).
YOLOv7 is a cutting-edge real-time object detection model that achieves unparalleled speed and accuracy. It surpasses other models, such as YOLOX, YOLOv5, and PPYOLOE, in both parameters usage and inference speed. YOLOv7's distinguishing features include its model re-parameterization and dynamic label assignment, which optimize its performance without increasing inference costs. For more technical details about its architecture and comparison metrics with other state-of-the-art object detectors, refer to the [YOLOv7 paper](https://arxiv.org/pdf/2207.02696.pdf).
### How does model re-parameterization work in YOLOv7?
### How does YOLOv7 improve on previous YOLO models like YOLOv4 and YOLOv5?
Model re-parameterization in YOLOv7 involves optimizing the gradient propagation path across various network layers. This strategy effectively recalibrates the training process, improving detection accuracy without increasing the inference cost. For more details, refer to the [Model Re-parameterization section](#key-features) in the documentation.
YOLOv7 introduces several innovations, including model re-parameterization and dynamic label assignment, which enhance the training process and improve inference accuracy. Compared to YOLOv5, YOLOv7 significantly boosts speed and accuracy. For instance, YOLOv7-X improves accuracy by 2.2% and reduces parameters by 22% compared to YOLOv5-X. Detailed comparisons can be found in the performance table [YOLOv7 comparison with SOTA object detectors](#comparison-of-sota-object-detectors).
### Why should I choose YOLOv7 over YOLOv5 or other object detectors?
### Can I use YOLOv7 with Ultralytics tools and platforms?
YOLOv7 outperforms YOLOv5 and other detectors like YOLOR, YOLOX, and PPYOLOE in both speed and accuracy. For instance, YOLOv7 achieves 127 FPS faster and 10.7% higher accuracy compared to YOLOv5-N. Furthermore, YOLOv7 effectively reduces parameters and computation while delivering higher AP scores, making it an optimal choice for real-time applications. Learn more in our [comparison table](#comparison-of-sota-object-detectors).
As of now, Ultralytics does not directly support YOLOv7 in its tools and platforms. Users interested in using YOLOv7 need to follow the installation and usage instructions provided in the [YOLOv7 GitHub repository](https://github.com/WongKinYiu/yolov7). For other state-of-the-art models, you can explore and train using Ultralytics tools like [Ultralytics HUB](../hub/quickstart.md).
### What datasets is YOLOv7 trained on, and how do they impact its performance?

YOLOv7 is trained exclusively on the MS COCO dataset without using additional datasets or pre-trained weights. This robust dataset provides a wide variety of images and annotations that contribute to YOLOv7's high accuracy and generalization capabilities. Explore more about dataset formats and usage in our [datasets section](https://docs.ultralytics.com/datasets/detect/coco/).

### How do I install and run YOLOv7 for a custom object detection project?

To install and run YOLOv7, follow these steps:

1. Clone the YOLOv7 repository:

```bash
git clone https://github.com/WongKinYiu/yolov7
```

2. Navigate to the cloned directory and install dependencies:

```bash
cd yolov7
pip install -r requirements.txt
```

3. Prepare your dataset and configure the model parameters according to the [usage instructions](https://github.com/WongKinYiu/yolov7) provided in the repository.

For further guidance, visit the YOLOv7 GitHub repository for the latest information and updates.

### Are there any practical YOLOv7 usage examples available?

Currently, Ultralytics does not directly support YOLOv7 models. However, you can find detailed installation and usage instructions on the YOLOv7 GitHub repository. These steps involve cloning the repository, installing dependencies, and setting up your environment to train and use the model. Follow the [YOLOv7 GitHub repository](https://github.com/WongKinYiu/yolov7) for the latest updates. For other examples, see our [usage examples](#usage-examples) section.
### What are the key features and optimizations introduced in YOLOv7?
YOLOv7 offers several key features that revolutionize real-time object detection:
- **Model Re-parameterization**: Enhances the model's performance by optimizing gradient propagation paths.
- **Dynamic Label Assignment**: Uses a coarse-to-fine lead guided method to assign dynamic targets for outputs across different branches, improving accuracy.
- **Extended and Compound Scaling**: Efficiently utilizes parameters and computation to scale the model for various real-time applications.
- **Efficiency**: Reduces parameter count by 40% and computation by 50% compared to other state-of-the-art models while achieving faster inference speeds.
For further details on these features, see the [YOLOv7 Overview](#overview) section.

View file

@ -187,31 +187,63 @@ Please note that the DOI is pending and will be added to the citation once it is
## FAQ
### What differentiates YOLOv8 from previous YOLO versions?
### What is YOLOv8 and how does it differ from previous YOLO versions?
YOLOv8 builds upon the advancements of its predecessors by incorporating state-of-the-art backbone and neck architectures for improved feature extraction and object detection performance. It utilizes an anchor-free split head for better accuracy and efficiency. With a focus on maintaining the optimal accuracy-speed tradeoff, YOLOv8 is suitable for real-time object detection across diverse applications. Explore more in [YOLOv8 Key Features](#key-features).
YOLOv8 is the latest iteration in the Ultralytics YOLO series, designed to improve real-time object detection performance with advanced features. Unlike earlier versions, YOLOv8 incorporates an **anchor-free split Ultralytics head**, state-of-the-art backbone and neck architectures, and offers optimized accuracy-speed tradeoff, making it ideal for diverse applications. For more details, check the [Overview](#overview) and [Key Features](#key-features) sections.
### How can I use YOLOv8 for different tasks like segmentation and pose estimation?
### How can I use YOLOv8 for different computer vision tasks?
YOLOv8 is versatile, offering specialized variants for various tasks such as object detection, instance segmentation, pose/keypoints detection, oriented object detection, and classification. These models come pre-trained and are optimized for high performance and accuracy. For more details, refer to the [Supported Tasks and Modes](#supported-tasks-and-modes).
YOLOv8 supports a wide range of computer vision tasks, including object detection, instance segmentation, pose/keypoints detection, oriented object detection, and classification. Each model variant is optimized for its specific task and compatible with various operational modes like [Inference](../modes/predict.md), [Validation](../modes/val.md), [Training](../modes/train.md), and [Export](../modes/export.md). Refer to the [Supported Tasks and Modes](#supported-tasks-and-modes) section for more information.
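For example, the task-specific variants can all be driven through the same `YOLO` interface. The sketch below assumes the pre-trained detection, segmentation, and pose checkpoints and a placeholder image path:

```python
from ultralytics import YOLO

# Task-specific YOLOv8 variants share the same API
det_model = YOLO("yolov8n.pt")  # object detection
seg_model = YOLO("yolov8n-seg.pt")  # instance segmentation
pose_model = YOLO("yolov8n-pose.pt")  # pose/keypoints detection

# Run the same image through each task-specific model
for model in (det_model, seg_model, pose_model):
    results = model("path/to/image.jpg")
    results[0].show()
```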
### What are the performance metrics for YOLOv8 models?

YOLOv8 models achieve state-of-the-art performance across various benchmarking datasets. For instance, the YOLOv8n model achieves a mAP (mean Average Precision) of 37.3 on the COCO dataset and a speed of 0.99 ms on A100 TensorRT. Detailed performance metrics for each model variant across different tasks and datasets can be found in the [Performance Metrics](#performance-metrics) section.

### How do I run inference using a YOLOv8 model in Python?

To run inference with a YOLOv8 model in Python, you can use the `YOLO` class from the Ultralytics package. Here's a basic example:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model("path/to/image.jpg")
```

For detailed examples, see the [Usage Examples](#usage-examples) section.

### How do I train a YOLOv8 model?

Training a YOLOv8 model can be done using either Python or CLI. Below are examples for training a model using a COCO-pretrained YOLOv8 model on the COCO8 dataset for 100 epochs:

!!! Example

=== "Python"

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv8n model
model = YOLO("yolov8n.pt")

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```

=== "CLI"

```bash
yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640
```

For further details, visit the [Training](../modes/train.md) documentation.

### What are the performance benchmarks for YOLOv8 models?

YOLOv8 models are benchmarked on datasets such as COCO and Open Images V7, showing significant improvements in mAP and speed across various hardware setups. Detailed performance metrics include parameters, FLOPs, and inference speeds on different devices. For comprehensive benchmark details, visit [Performance Metrics](#performance-metrics).

### How do I export a YOLOv8 model for deployment?

You can export YOLOv8 models to various formats like ONNX, TensorRT, and CoreML for seamless deployment across different platforms. The export process ensures maximum compatibility and performance optimization. Learn more about exporting models in the [Export](../modes/export.md) section.
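As a quick, minimal sketch of that export workflow, the `format` argument selects the target runtime and works the same way for the other supported formats:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv8n model
model = YOLO("yolov8n.pt")

# Export to ONNX; formats such as "engine" (TensorRT) or "coreml" follow the same pattern
model.export(format="onnx")
```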
### Can I benchmark YOLOv8 models for performance?
Yes, YOLOv8 models can be benchmarked for performance in terms of speed and accuracy across various export formats. You can use PyTorch, ONNX, TensorRT, and more for benchmarking. Below are example commands for benchmarking using Python and CLI:
!!! Example
=== "Python"
```python
from ultralytics.utils.benchmarks import benchmark
# Benchmark on GPU
benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
```
=== "CLI"
```bash
yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
```
For additional information, check the [Performance Metrics](#performance-metrics) section.

View file

@ -95,7 +95,7 @@ Comparatively, YOLOv9 exhibits remarkable gains:
- **Lightweight Models**: YOLOv9s surpasses the YOLO MS-S in parameter efficiency and computational load while achieving an improvement of 0.4-0.6% in AP.
- **Medium to Large Models**: YOLOv9m and YOLOv9e show notable advancements in balancing the trade-off between model complexity and detection performance, offering significant reductions in parameters and computations against the backdrop of improved accuracy.
The YOLOv9c model, in particular, highlights the effectiveness of the architecture's optimizations. It operates with 42% fewer parameters and 21% less computational demand than YOLOv7 AF, yet it achieves comparable accuracy, demonstrating YOLOv9's significant efficiency improvements. Furthermore, the YOLOv9e model sets a new standard for large models, with 15% fewer parameters and 25% less computational need than [YOLOv8x](yolov8.md), alongside an incremental 1.7% improvement in AP.
These results showcase YOLOv9's strategic advancements in model design, emphasizing its enhanced efficiency without compromising on the precision essential for real-time object detection tasks. The model not only pushes the boundaries of performance metrics but also emphasizes the importance of computational efficiency, making it a pivotal development in the field of computer vision.
@ -180,32 +180,38 @@ The original YOLOv9 paper can be found on [arXiv](https://arxiv.org/pdf/2402.136
## FAQ
### What is YOLOv9 and why should I use it for real-time object detection?
### What innovations does YOLOv9 introduce for real-time object detection?
YOLOv9 is the latest iteration of the YOLO (You Only Look Once) object detection family, featuring groundbreaking techniques like Programmable Gradient Information (PGI) and Generalized Efficient Layer Aggregation Network (GELAN). This model demonstrates significant improvements in efficiency, accuracy, and adaptability, setting new benchmarks on the [MS COCO](../datasets/detect/coco.md) dataset. The integration of PGI and GELAN ensures retention of crucial data throughout the detection process, making YOLOv9 an exceptional choice for real-time object detection.
YOLOv9 introduces groundbreaking techniques such as Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). These innovations address information loss challenges in deep neural networks, ensuring high efficiency, accuracy, and adaptability. PGI preserves essential data across network layers, while GELAN optimizes parameter utilization and computational efficiency. Learn more about [YOLOv9's core innovations](#core-innovations-of-yolov9) that set new benchmarks on the MS COCO dataset.
### How do Programmable Gradient Information (PGI) and GELAN improve YOLOv9's performance?
### How does YOLOv9 perform on the MS COCO dataset compared to other models?
Programmable Gradient Information (PGI) helps counteract information loss in deep neural networks by ensuring the preservation of essential data across network layers. This leads to more reliable gradient generation and better model convergence. The Generalized Efficient Layer Aggregation Network (GELAN) enhances parameter utilization and computational efficiency by allowing flexible integration of various computational blocks. These innovations collectively improve the accuracy and efficiency of YOLOv9. For more details, see the [Innovations section](#programmable-gradient-information-pgi).
YOLOv9 outperforms state-of-the-art real-time object detectors by achieving higher accuracy and efficiency. On the [COCO dataset](../datasets/detect/coco.md), YOLOv9 models exhibit superior mAP scores across various sizes while maintaining or reducing computational overhead. For instance, YOLOv9c achieves comparable accuracy with 42% fewer parameters and 21% less computational demand than YOLOv7 AF. Explore [performance comparisons](#performance-on-ms-coco-dataset) for detailed metrics.
### What makes YOLOv9 more efficient than previous YOLO versions?

YOLOv9 incorporates several innovations like PGI and GELAN that effectively address the information bottleneck and improve layer aggregation, respectively. This results in high parameter efficiency and reduced computational load. Models such as YOLOv9s and YOLOv9e demonstrate significant gains in performance compared to previous versions. For a detailed performance comparison, refer to the [Performance section on MS COCO](#performance-on-ms-coco-dataset).

### How can I train a YOLOv9 model using Python and CLI?

You can train a YOLOv9 model using both Python and CLI commands. For Python, instantiate a model using the `YOLO` class and call the `train` method:

```python
from ultralytics import YOLO

# Build a YOLOv9c model from pretrained weights and train
model = YOLO("yolov9c.pt")
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```

For CLI training, execute:

```bash
yolo train model=yolov9c.yaml data=coco8.yaml epochs=100 imgsz=640
```

Learn more about [usage examples](#usage-examples) for training and inference.

### Can I train YOLOv9 on my custom dataset using Ultralytics?

Yes, you can easily train YOLOv9 on your custom dataset using Ultralytics. The Ultralytics YOLO framework supports various modes including [Train](../modes/train.md), [Predict](../modes/predict.md), and [Val](../modes/val.md). For example, you can start training a YOLOv9c model with the following Python code snippet:

```python
from ultralytics import YOLO

# Load the YOLOv9c model configuration
model = YOLO("yolov9c.yaml")

# Train the model on your custom dataset
results = model.train(data="custom_dataset.yaml", epochs=100, imgsz=640)
```

### How does YOLOv9 compare to other state-of-the-art real-time object detectors?

YOLOv9 shows superior performance across different model sizes on the [COCO dataset](../datasets/detect/coco.md). It achieves higher mean Average Precision (mAP) with fewer parameters and computational resources compared to competitors. For instance, the YOLOv9c model operates with 42% fewer parameters and 21% less computational demand than YOLOv7 AF, while matching its accuracy. Detailed performance metrics can be found in the [Performance Comparison table](#performance-on-ms-coco-dataset).
### What are the advantages of using Ultralytics YOLOv9 for lightweight models?
YOLOv9 is designed to mitigate information loss, which is particularly important for lightweight models often prone to losing significant information. By integrating Programmable Gradient Information (PGI) and reversible functions, YOLOv9 ensures essential data retention, enhancing the model's accuracy and efficiency. This makes it highly suitable for applications requiring compact models with high performance. For more details, explore the section on [YOLOv9's impact on lightweight models](#impact-on-lightweight-models).
### What tasks and modes does YOLOv9 support?
YOLOv9 supports various tasks including object detection and instance segmentation. It is compatible with multiple operational modes such as inference, validation, training, and export. This versatility makes YOLOv9 adaptable to diverse real-time computer vision applications. Refer to the [supported tasks and modes](#supported-tasks-and-modes) section for more information.
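As a small illustration of the validation mode, the following sketch loads the COCO-pretrained YOLOv9c detection weights and evaluates them on the COCO8 example dataset:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv9c model
model = YOLO("yolov9c.pt")

# Validate on the COCO8 example dataset
metrics = model.val(data="coco8.yaml", imgsz=640)
print(metrics.box.map)  # mAP50-95
```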

View file

@ -104,3 +104,70 @@ Benchmarks will attempt to run automatically on all possible export formats belo
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
See full `export` details in the [Export](../modes/export.md) page.
## FAQ
### How do I benchmark my YOLOv8 model's performance using Ultralytics?
Ultralytics YOLOv8 offers a Benchmark mode to assess your model's performance across different export formats. This mode provides insights into key metrics such as mean Average Precision (mAP50-95), accuracy, and inference time in milliseconds. To run benchmarks, you can use either Python or CLI commands. For example, to benchmark on a GPU:
!!! Example
=== "Python"
```python
from ultralytics.utils.benchmarks import benchmark
# Benchmark on GPU
benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
```
=== "CLI"
```bash
yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
```
For more details on benchmark arguments, visit the [Arguments](#arguments) section.
### What are the benefits of exporting YOLOv8 models to different formats?
Exporting YOLOv8 models to different formats such as ONNX, TensorRT, and OpenVINO allows you to optimize performance based on your deployment environment. For instance:
- **ONNX:** Provides up to 3x CPU speedup.
- **TensorRT:** Offers up to 5x GPU speedup.
- **OpenVINO:** Specifically optimized for Intel hardware.
These formats enhance both the speed and accuracy of your models, making them more efficient for various real-world applications. Visit the [Export](../modes/export.md) page for complete details.
### Why is benchmarking crucial in evaluating YOLOv8 models?
Benchmarking your YOLOv8 models is essential for several reasons:
- **Informed Decisions:** Understand the trade-offs between speed and accuracy.
- **Resource Allocation:** Gauge the performance across different hardware options.
- **Optimization:** Determine which export format offers the best performance for specific use cases.
- **Cost Efficiency:** Optimize hardware usage based on benchmark results.
Key metrics such as mAP50-95, Top-5 accuracy, and inference time help in making these evaluations. Refer to the [Key Metrics](#key-metrics-in-benchmark-mode) section for more information.
### Which export formats are supported by YOLOv8, and what are their advantages?
YOLOv8 supports a variety of export formats, each tailored for specific hardware and use cases:
- **ONNX:** Best for CPU performance.
- **TensorRT:** Ideal for GPU efficiency.
- **OpenVINO:** Optimized for Intel hardware.
- **CoreML & TensorFlow:** Useful for iOS and general ML applications.
For a complete list of supported formats and their respective advantages, check out the [Supported Export Formats](#supported-export-formats) section.
### What arguments can I use to fine-tune my YOLOv8 benchmarks?
When running benchmarks, several arguments can be customized to suit specific needs:
- **model:** Path to the model file (e.g., "yolov8n.pt").
- **data:** Path to a YAML file defining the dataset (e.g., "coco8.yaml").
- **imgsz:** The input image size, either as a single integer or a tuple.
- **half:** Enable FP16 inference for better performance.
- **int8:** Activate INT8 quantization for edge devices.
- **device:** Specify the computation device (e.g., "cpu", "cuda:0").
- **verbose:** Control the level of logging detail.
For a full list of arguments, refer to the [Arguments](#arguments) section.
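Putting several of these arguments together, a customized benchmark run might look like the sketch below; the weights path is a placeholder for your own model:

```python
from ultralytics.utils.benchmarks import benchmark

# Benchmark a custom-trained model on CPU with INT8 quantization and reduced logging
benchmark(
    model="path/to/best.pt",  # placeholder path to your own weights
    data="coco8.yaml",
    imgsz=640,
    half=False,
    int8=True,
    device="cpu",
    verbose=False,
)
```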

View file

@ -110,3 +110,103 @@ Available YOLOv8 export formats are in the table below. You can export to any fo
| [TF.js](../integrations/tfjs.md) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [PaddlePaddle](../integrations/paddlepaddle.md) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz`, `batch` |
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
## FAQ
### How do I export a YOLOv8 model to ONNX format?
Exporting a YOLOv8 model to ONNX format is straightforward with Ultralytics. It provides both Python and CLI methods for exporting models.
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n.pt") # load an official model
model = YOLO("path/to/best.pt") # load a custom trained model
# Export the model
model.export(format="onnx")
```
=== "CLI"
```bash
yolo export model=yolov8n.pt format=onnx # export official model
yolo export model=path/to/best.pt format=onnx # export custom trained model
```
For more details on the process, including advanced options like handling different input sizes, refer to the [ONNX](../integrations/onnx.md) section.
### What are the benefits of using TensorRT for model export?
Using TensorRT for model export offers significant performance improvements. YOLOv8 models exported to TensorRT can achieve up to a 5x GPU speedup, making it ideal for real-time inference applications.
- **Versatility:** Optimize models for a specific hardware setup.
- **Speed:** Achieve faster inference through advanced optimizations.
- **Compatibility:** Integrate smoothly with NVIDIA hardware.
To learn more about integrating TensorRT, see the [TensorRT](../integrations/tensorrt.md) integration guide.
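As a minimal sketch (assuming an NVIDIA GPU with TensorRT installed), a TensorRT export uses the `engine` format, optionally with FP16 enabled:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Build a TensorRT engine; half=True enables FP16 for an extra speedup
model.export(format="engine", half=True)
```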
### How do I enable INT8 quantization when exporting my YOLOv8 model?
INT8 quantization is an excellent way to compress the model and speed up inference, especially on edge devices. Here's how you can enable INT8 quantization:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolov8n.pt") # Load a model
model.export(format="onnx", int8=True)
```
=== "CLI"
```bash
yolo export model=yolov8n.pt format=onnx int8=True # export model with INT8 quantization
```
INT8 quantization can be applied to various formats, such as TensorRT and CoreML. More details can be found in the [Export](../modes/export.md) section.
### Why is dynamic input size important when exporting models?
Dynamic input size allows the exported model to handle varying image dimensions, providing flexibility and optimizing processing efficiency for different use cases. When exporting to formats like ONNX or TensorRT, enabling dynamic input size ensures that the model can adapt to different input shapes seamlessly.
To enable this feature, use the `dynamic=True` flag during export:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.export(format="onnx", dynamic=True)
```
=== "CLI"
```bash
yolo export model=yolov8n.pt format=onnx dynamic=True
```
For additional context, refer to the [dynamic input size configuration](#arguments).
### What are the key export arguments to consider for optimizing model performance?
Understanding and configuring export arguments is crucial for optimizing model performance:
- **`format:`** The target format for the exported model (e.g., `onnx`, `torchscript`, `tensorflow`).
- **`imgsz:`** Desired image size for the model input (e.g., `640` or `(height, width)`).
- **`half:`** Enables FP16 quantization, reducing model size and potentially speeding up inference.
- **`optimize:`** Applies specific optimizations for mobile or constrained environments.
- **`int8:`** Enables INT8 quantization, highly beneficial for edge deployments.
For a detailed list and explanations of all the export arguments, visit the [Export Arguments](#arguments) section.

View file

@ -71,3 +71,130 @@ Track mode is used for tracking objects in real-time using a YOLOv8 model. In th
Benchmark mode is used to profile the speed and accuracy of various export formats for YOLOv8. The benchmarks provide information on the size of the exported format, its `mAP50-95` metrics (for object detection, segmentation and pose) or `accuracy_top5` metrics (for classification), and the inference time in milliseconds per image across various export formats like ONNX, OpenVINO, TensorRT and others. This information can help users choose the optimal export format for their specific use case based on their requirements for speed and accuracy.
[Benchmark Examples](benchmark.md){ .md-button }
## FAQ
### How do I train a custom object detection model with Ultralytics YOLOv8?
Training a custom object detection model with Ultralytics YOLOv8 uses the train mode. You need a dataset in YOLO format, containing images and corresponding annotation files. Use the following command to start the training process:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Train a custom model
model = YOLO("yolov8n.pt")
model.train(data="path/to/dataset.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
yolo train data=path/to/dataset.yaml epochs=100 imgsz=640
```
For more detailed instructions, you can refer to the [Ultralytics Train Guide](../modes/train.md).
### What metrics does Ultralytics YOLOv8 use to validate the model's performance?
Ultralytics YOLOv8 uses various metrics during the validation process to assess model performance. These include:
- **mAP (mean Average Precision)**: This evaluates the accuracy of object detection.
- **IOU (Intersection over Union)**: Measures the overlap between predicted and ground truth bounding boxes.
- **Precision and Recall**: Precision measures the ratio of true positive detections to the total detected positives, while recall measures the ratio of true positive detections to the total actual positives.
You can run the following command to start the validation:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Validate the model
model = YOLO("yolov8n.pt")
model.val(data="path/to/validation.yaml")
```
=== "CLI"
```bash
yolo val data=path/to/validation.yaml
```
Refer to the [Validation Guide](../modes/val.md) for further details.
### How can I export my YOLOv8 model for deployment?
Ultralytics YOLOv8 offers export functionality to convert your trained model into various deployment formats such as ONNX, TensorRT, CoreML, and more. Use the following example to export your model:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Export the model
model = YOLO("yolov8n.pt")
model.export(format="onnx")
```
=== "CLI"
```bash
yolo export model=yolov8n.pt format=onnx
```
Detailed steps for each export format can be found in the [Export Guide](../modes/export.md).
### What is the purpose of the benchmark mode in Ultralytics YOLOv8?
Benchmark mode in Ultralytics YOLOv8 is used to analyze the speed and accuracy of various export formats such as ONNX, TensorRT, and OpenVINO. It provides metrics like model size, `mAP50-95` for object detection, and inference time across different hardware setups, helping you choose the most suitable format for your deployment needs.
!!! Example
=== "Python"
```python
from ultralytics.utils.benchmarks import benchmark
# Benchmark on GPU
benchmark(model="yolov8n.pt", data="coco8.yaml", imgsz=640, half=False, device=0)
```
=== "CLI"
```bash
yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
```
For more details, refer to the [Benchmark Guide](../modes/benchmark.md).
### How can I perform real-time object tracking using Ultralytics YOLOv8?
Real-time object tracking can be achieved using the track mode in Ultralytics YOLOv8. This mode extends object detection capabilities to track objects across video frames or live feeds. Use the following example to enable tracking:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Track objects in a video
model = YOLO("yolov8n.pt")
model.track(source="path/to/video.mp4")
```
=== "CLI"
```bash
yolo track source=path/to/video.mp4
```
For in-depth instructions, visit the [Track Guide](../modes/track.md).

View file

@ -800,3 +800,25 @@ This script will run predictions on each frame of the video, visualize the resul
[car spare parts]: https://github.com/RizwanMunawar/ultralytics/assets/62513924/a0f802a8-0776-44cf-8f17-93974a4a28a1
[football player detect]: https://github.com/RizwanMunawar/ultralytics/assets/62513924/7d320e1f-fc57-4d7f-a691-78ee579c3442
[human fall detect]: https://github.com/RizwanMunawar/ultralytics/assets/62513924/86437c4a-3227-4eee-90ef-9efb697bdb43
## FAQ
### What is Ultralytics YOLOv8 and its predict mode for real-time inference?
Ultralytics YOLOv8 is a state-of-the-art model for real-time object detection, segmentation, and classification. Its **predict mode** allows users to perform high-speed inference on various data sources such as images, videos, and live streams. Designed for performance and versatility, it also offers batch processing and streaming modes. For more details on its features, check out the [Ultralytics YOLOv8 predict mode](#key-features-of-predict-mode).
### How can I run inference using Ultralytics YOLOv8 on different data sources?
Ultralytics YOLOv8 can process a wide range of data sources, including individual images, videos, directories, URLs, and streams. You can specify the data source in the `model.predict()` call. For example, use `'image.jpg'` for a local image or `'https://ultralytics.com/images/bus.jpg'` for a URL. Check out the detailed examples for various [inference sources](#inference-sources) in the documentation.
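A short sketch of the same call applied to different source types (the local paths are placeholders):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# The source argument accepts many input types
results = model.predict(source="image.jpg")                               # local image (placeholder path)
results = model.predict(source="https://ultralytics.com/images/bus.jpg")  # image URL
results = model.predict(source="path/to/video.mp4")                       # video file (placeholder path)
results = model.predict(source="path/to/images/")                         # directory of images (placeholder path)
```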
### How do I optimize YOLOv8 inference speed and memory usage?
To optimize inference speed and manage memory efficiently, you can use the streaming mode by setting `stream=True` in the predictor's call method. The streaming mode generates a memory-efficient generator of `Results` objects instead of loading all frames into memory. For processing long videos or large datasets, streaming mode is particularly useful. Learn more about [streaming mode](#key-features-of-predict-mode).
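A minimal sketch of streaming inference over a long video (the path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# stream=True yields Results one frame at a time, keeping memory usage low
for result in model.predict(source="path/to/long_video.mp4", stream=True):
    boxes = result.boxes  # process each frame's detections as it arrives
```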
### What inference arguments does Ultralytics YOLOv8 support?
The `model.predict()` method in YOLOv8 supports various arguments such as `conf`, `iou`, `imgsz`, `device`, and more. These arguments allow you to customize the inference process, setting parameters like confidence thresholds, image size, and the device used for computation. Detailed descriptions of these arguments can be found in the [inference arguments](#inference-arguments) section.
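For instance, a hedged example combining several common inference arguments (values are illustrative):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Customize the inference behaviour per call
results = model.predict(
    source="image.jpg",  # placeholder input path
    conf=0.5,            # confidence threshold
    iou=0.7,             # NMS IoU threshold
    imgsz=640,           # inference image size
    device="cpu",        # or 0 for the first GPU
)
```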
### How can I visualize and save the results of YOLOv8 predictions?
After running inference with YOLOv8, the `Results` objects contain methods for displaying and saving annotated images. You can use methods like `result.show()` and `result.save(filename="result.jpg")` to visualize and save the results. For a comprehensive list of these methods, refer to the [working with results](#working-with-results) section.
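A small sketch that displays and saves each annotated result (the input path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
results = model.predict(source="image.jpg")  # placeholder input path

for result in results:
    result.show()                       # display the annotated image
    result.save(filename="result.jpg")  # save the annotated image to disk
```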

View file

@ -367,3 +367,130 @@ Together, let's enhance the tracking capabilities of the Ultralytics YOLO ecosys
[fish track]: https://github.com/RizwanMunawar/ultralytics/assets/62513924/a5146d0f-bfa8-4e0a-b7df-3c1446cd8142
[people track]: https://github.com/RizwanMunawar/ultralytics/assets/62513924/93bb4ee2-77a0-4e4e-8eb6-eb8f527f0527
[vehicle track]: https://github.com/RizwanMunawar/ultralytics/assets/62513924/ee6e6038-383b-4f21-ac29-b2a1c7d386ab
## FAQ
### What is Multi-Object Tracking and how does Ultralytics YOLO support it?
Multi-object tracking in video analytics involves both identifying objects and maintaining a unique ID for each detected object across video frames. Ultralytics YOLO supports this by providing real-time tracking along with object IDs, facilitating tasks such as security surveillance and sports analytics. The system uses trackers like BoT-SORT and ByteTrack, which can be configured via YAML files.
### How do I configure a custom tracker for Ultralytics YOLO?
You can configure a custom tracker by copying an existing tracker configuration file from the [Ultralytics tracker configuration directory](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/trackers), saving it under a new name (e.g., `custom_tracker.yaml`), and modifying any parameters as needed, except for the `tracker_type`. Use this file in your tracking model like so:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
results = model.track(source="https://youtu.be/LNwODJXcvt4", tracker="custom_tracker.yaml")
```
=== "CLI"
```bash
yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" tracker='custom_tracker.yaml'
```
### How can I run object tracking on multiple video streams simultaneously?
To run object tracking on multiple video streams simultaneously, you can use Python's `threading` module. Each thread will handle a separate video stream. Here's an example of how you can set this up:
!!! Example "Multithreaded Tracking"
```python
import threading
import cv2
from ultralytics import YOLO
def run_tracker_in_thread(filename, model, file_index):
video = cv2.VideoCapture(filename)
while True:
ret, frame = video.read()
if not ret:
break
results = model.track(frame, persist=True)
res_plotted = results[0].plot()
cv2.imshow(f"Tracking_Stream_{file_index}", res_plotted)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
video.release()
model1 = YOLO("yolov8n.pt")
model2 = YOLO("yolov8n-seg.pt")
video_file1 = "path/to/video1.mp4"
video_file2 = 0 # Path to a second video file, or 0 for a webcam
tracker_thread1 = threading.Thread(target=run_tracker_in_thread, args=(video_file1, model1, 1), daemon=True)
tracker_thread2 = threading.Thread(target=run_tracker_in_thread, args=(video_file2, model2, 2), daemon=True)
tracker_thread1.start()
tracker_thread2.start()
tracker_thread1.join()
tracker_thread2.join()
cv2.destroyAllWindows()
```
### What are the real-world applications of multi-object tracking with Ultralytics YOLO?
Multi-object tracking with Ultralytics YOLO has numerous applications, including:
- **Transportation:** Vehicle tracking for traffic management and autonomous driving.
- **Retail:** People tracking for in-store analytics and security.
- **Aquaculture:** Fish tracking for monitoring aquatic environments.
These applications benefit from Ultralytics YOLO's ability to process high-frame-rate videos in real time.
### How can I visualize object tracks over multiple video frames with Ultralytics YOLO?
To visualize object tracks over multiple video frames, you can use the YOLO model's tracking features along with OpenCV to draw the paths of detected objects. Here's an example script that demonstrates this:
!!! Example "Plotting tracks over multiple video frames"
```python
from collections import defaultdict
import cv2
import numpy as np
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
video_path = "path/to/video.mp4"
cap = cv2.VideoCapture(video_path)
track_history = defaultdict(lambda: [])
while cap.isOpened():
success, frame = cap.read()
if success:
results = model.track(frame, persist=True)
boxes = results[0].boxes.xywh.cpu()
track_ids = results[0].boxes.id.int().cpu().tolist()
annotated_frame = results[0].plot()
for box, track_id in zip(boxes, track_ids):
x, y, w, h = box
track = track_history[track_id]
track.append((float(x), float(y)))
if len(track) > 30:
track.pop(0)
points = np.hstack(track).astype(np.int32).reshape((-1, 1, 2))
cv2.polylines(annotated_frame, [points], isClosed=False, color=(230, 230, 230), thickness=10)
cv2.imshow("YOLOv8 Tracking", annotated_frame)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
break
cap.release()
cv2.destroyAllWindows()
```
This script will plot the tracking lines showing the movement paths of the tracked objects over time.

View file

@ -336,3 +336,110 @@ To use TensorBoard locally run the below command and view results at http://loca
This will load TensorBoard and direct it to the directory where your training logs are saved.
After setting up your logger, you can then proceed with your model training. All training metrics will be automatically logged in your chosen platform, and you can access these logs to monitor your model's performance over time, compare different models, and identify areas for improvement.
## FAQ
### How do I train an object detection model using Ultralytics YOLOv8?
To train an object detection model using Ultralytics YOLOv8, you can either use the Python API or the CLI. Below is an example for both:
!!! Example "Single-GPU and CPU Training Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
```
For more details, refer to the [Train Settings](#train-settings) section.
### What are the key features of Ultralytics YOLOv8's Train mode?
The key features of Ultralytics YOLOv8's Train mode include:
- **Automatic Dataset Download:** Automatically downloads standard datasets like COCO, VOC, and ImageNet.
- **Multi-GPU Support:** Scale training across multiple GPUs for faster processing.
- **Hyperparameter Configuration:** Customize hyperparameters through YAML files or CLI arguments.
- **Visualization and Monitoring:** Real-time tracking of training metrics for better insights.
These features make training efficient and customizable to your needs. For more details, see the [Key Features of Train Mode](#key-features-of-train-mode) section.
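For example, multi-GPU training is requested through the `device` argument; a minimal sketch assuming two CUDA devices are available:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Train across two GPUs; standard datasets such as coco8.yaml are downloaded automatically
results = model.train(data="coco8.yaml", epochs=100, imgsz=640, device=[0, 1])
```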
### How do I resume training from an interrupted session in Ultralytics YOLOv8?
To resume training from an interrupted session, set the `resume` argument to `True` and specify the path to the last saved checkpoint.
!!! Example "Resume Training Example"
=== "Python"
```python
from ultralytics import YOLO
# Load the partially trained model
model = YOLO("path/to/last.pt")
# Resume training
results = model.train(resume=True)
```
=== "CLI"
```bash
yolo train resume model=path/to/last.pt
```
Check the section on [Resuming Interrupted Trainings](#resuming-interrupted-trainings) for more information.
### Can I train YOLOv8 models on Apple M1 and M2 chips?
Yes, Ultralytics YOLOv8 supports training on Apple M1 and M2 chips utilizing the Metal Performance Shaders (MPS) framework. Specify 'mps' as your training device.
!!! Example "MPS Training Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolov8n.pt")
# Train the model on M1/M2 chip
results = model.train(data="coco8.yaml", epochs=100, imgsz=640, device="mps")
```
=== "CLI"
```bash
yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640 device=mps
```
For more details, refer to the [Apple M1 and M2 MPS Training](#apple-m1-and-m2-mps-training) section.
### What are the common training settings, and how do I configure them?
Ultralytics YOLOv8 allows you to configure a variety of training settings such as batch size, learning rate, epochs, and more through arguments. Here's a brief overview:
| Argument | Default | Description |
| -------- | ------- | ---------------------------------------------------------------------- |
| `model` | `None` | Path to the model file for training. |
| `data` | `None` | Path to the dataset configuration file (e.g., `coco8.yaml`). |
| `epochs` | `100` | Total number of training epochs. |
| `batch`  | `16`    | Batch size, set as an integer or in auto mode.                         |
| `imgsz` | `640` | Target image size for training. |
| `device` | `None` | Computational device(s) for training like `cpu`, `0`, `0,1`, or `mps`. |
| `save` | `True` | Enables saving of training checkpoints and final model weights. |
For an in-depth guide on training settings, check the [Train Settings](#train-settings) section.
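Put together, a training call that sets these arguments explicitly might look like the following sketch:
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Configure the most common training settings explicitly
results = model.train(
    data="coco8.yaml",  # dataset configuration file
    epochs=100,         # total training epochs
    batch=16,           # batch size
    imgsz=640,          # training image size
    device=0,           # GPU index, "cpu", or "mps"
    save=True,          # save checkpoints and final weights
)
```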

View file

@ -121,3 +121,108 @@ The below examples showcase YOLO model validation with custom arguments in Pytho
```bash
yolo val model=yolov8n.pt data=coco8.yaml imgsz=640 batch=16 conf=0.25 iou=0.6 device=0
```
## FAQ
### How do I validate my YOLOv8 model with Ultralytics?
To validate your YOLOv8 model, you can use the Val mode provided by Ultralytics. For example, using the Python API, you can load a model and run validation with:
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n.pt")
# Validate the model
metrics = model.val()
print(metrics.box.map) # map50-95
```
Alternatively, you can use the command-line interface (CLI):
```bash
yolo val model=yolov8n.pt
```
For further customization, you can adjust various arguments like `imgsz`, `batch`, and `conf` in both Python and CLI modes. Check the [Arguments for YOLO Model Validation](#arguments-for-yolo-model-validation) section for the full list of parameters.
### What metrics can I get from YOLOv8 model validation?
YOLOv8 model validation provides several key metrics to assess model performance. These include:
- mAP50 (mean Average Precision at IoU threshold 0.5)
- mAP75 (mean Average Precision at IoU threshold 0.75)
- mAP50-95 (mean Average Precision across multiple IoU thresholds from 0.5 to 0.95)
Using the Python API, you can access these metrics as follows:
```python
metrics = model.val() # assumes `model` has been loaded
print(metrics.box.map) # mAP50-95
print(metrics.box.map50) # mAP50
print(metrics.box.map75) # mAP75
print(metrics.box.maps) # list of mAP50-95 for each category
```
For a complete performance evaluation, it's crucial to review all these metrics. For more details, refer to the [Key Features of Val Mode](#key-features-of-val-mode).
### What are the advantages of using Ultralytics YOLO for validation?
Using Ultralytics YOLO for validation provides several advantages:
- **Precision:** YOLOv8 offers accurate performance metrics including mAP50, mAP75, and mAP50-95.
- **Convenience:** The models remember their training settings, making validation straightforward.
- **Flexibility:** You can validate against the same or different datasets and image sizes.
- **Hyperparameter Tuning:** Validation metrics help in fine-tuning models for better performance.
These benefits ensure that your models are evaluated thoroughly and can be optimized for superior results. Learn more about these advantages in the [Why Validate with Ultralytics YOLO](#why-validate-with-ultralytics-yolo) section.
### Can I validate my YOLOv8 model using a custom dataset?
Yes, you can validate your YOLOv8 model using a custom dataset. Specify the `data` argument with the path to your dataset configuration file. This file should include paths to the validation data, class names, and other relevant details.
Example in Python:
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n.pt")
# Validate with a custom dataset
metrics = model.val(data="path/to/your/custom_dataset.yaml")
print(metrics.box.map) # map50-95
```
Example using CLI:
```bash
yolo val model=yolov8n.pt data=path/to/your/custom_dataset.yaml
```
For more customizable options during validation, see the [Example Validation with Arguments](#example-validation-with-arguments) section.
### How do I save validation results to a JSON file in YOLOv8?
To save the validation results to a JSON file, you can set the `save_json` argument to `True` when running validation. This can be done in both the Python API and CLI.
Example in Python:
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n.pt")
# Save validation results to JSON
metrics = model.val(save_json=True)
```
Example using CLI:
```bash
yolo val model=yolov8n.pt save_json=True
```
This functionality is particularly useful for further analysis or integration with other tools. Check the [Arguments for YOLO Model Validation](#arguments-for-yolo-model-validation) for more details.

View file

@ -337,3 +337,86 @@ The table below provides an overview of the settings available for adjustment wi
| `wandb` | `True` | `bool` | Whether to use Weights & Biases logging |
As you navigate through your projects or experiments, be sure to revisit these settings to ensure that they are optimally configured for your needs.
## FAQ
### How do I install Ultralytics YOLOv8 using pip?
To install Ultralytics YOLOv8 with pip, execute the following command:
```bash
pip install ultralytics
```
For the latest stable release, this will install the `ultralytics` package directly from the Python Package Index (PyPI). For more details, visit the [ultralytics package on PyPI](https://pypi.org/project/ultralytics/).
Alternatively, you can install the latest development version directly from GitHub:
```bash
pip install git+https://github.com/ultralytics/ultralytics.git
```
Make sure to have the Git command-line tool installed on your system.
### Can I install Ultralytics YOLOv8 using conda?
Yes, you can install Ultralytics YOLOv8 using conda by running:
```bash
conda install -c conda-forge ultralytics
```
This method is an excellent alternative to pip and ensures compatibility with other packages in your environment. For CUDA environments, it's best to install `ultralytics`, `pytorch`, and `pytorch-cuda` simultaneously to resolve any conflicts:
```bash
conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8 ultralytics
```
For more instructions, visit the [Conda quickstart guide](guides/conda-quickstart.md).
### What are the advantages of using Docker to run Ultralytics YOLOv8?
Using Docker to run Ultralytics YOLOv8 provides an isolated and consistent environment, ensuring smooth performance across different systems. It also eliminates the complexity of local installation. Official Docker images from Ultralytics are available on [Docker Hub](https://hub.docker.com/r/ultralytics/ultralytics), with different variants tailored for GPU, CPU, ARM64, NVIDIA Jetson, and Conda environments. Below are the commands to pull and run the latest image:
```bash
# Pull the latest ultralytics image from Docker Hub
sudo docker pull ultralytics/ultralytics:latest
# Run the ultralytics image in a container with GPU support
sudo docker run -it --ipc=host --gpus all ultralytics/ultralytics:latest
```
For more detailed Docker instructions, check out the [Docker quickstart guide](guides/docker-quickstart.md).
### How do I clone the Ultralytics repository for development?
To clone the Ultralytics repository and set up a development environment, use the following steps:
```bash
# Clone the ultralytics repository
git clone https://github.com/ultralytics/ultralytics
# Navigate to the cloned directory
cd ultralytics
# Install the package in editable mode for development
pip install -e .
```
This approach allows you to contribute to the project or experiment with the latest source code. For more details, visit the [Ultralytics GitHub repository](https://github.com/ultralytics/ultralytics).
### Why should I use Ultralytics YOLOv8 CLI?
The Ultralytics YOLOv8 command line interface (CLI) simplifies running object detection tasks without requiring Python code. You can execute single-line commands for tasks like training, validation, and prediction straight from your terminal. The basic syntax for `yolo` commands is:
```bash
yolo TASK MODE ARGS
```
For example, to train a detection model with specified parameters:
```bash
yolo train data=coco8.yaml model=yolov8n.pt epochs=10 lr0=0.01
```
Check out the full [CLI Guide](usage/cli.md) to explore more commands and usage examples.

View file

@ -179,3 +179,93 @@ Available YOLOv8-cls export formats are in the table below. You can export to an
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n-cls_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
See full `export` details in the [Export](../modes/export.md) page.
## FAQ
### What is the purpose of YOLOv8 in image classification?
YOLOv8 models, such as `yolov8n-cls.pt`, are designed for efficient image classification. They assign a single class label to an entire image along with a confidence score. This is particularly useful for applications where knowing the specific class of an image is sufficient, rather than identifying the location or shape of objects within the image.
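As a small illustration (the image path is a placeholder), a classification model returns per-class probabilities rather than boxes:
```python
from ultralytics import YOLO

model = YOLO("yolov8n-cls.pt")               # pretrained classification model
results = model.predict(source="image.jpg")  # placeholder image path

probs = results[0].probs           # classification probabilities
print(probs.top1, probs.top1conf)  # index and confidence of the top class
```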
### How do I train a YOLOv8 model for image classification?
To train a YOLOv8 model, you can use either Python or CLI commands. For example, to train a `yolov8n-cls` model on the MNIST160 dataset for 100 epochs at an image size of 64:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-cls.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="mnist160", epochs=100, imgsz=64)
```
=== "CLI"
```bash
yolo classify train data=mnist160 model=yolov8n-cls.pt epochs=100 imgsz=64
```
For more configuration options, visit the [Configuration](../usage/cfg.md) page.
### Where can I find pretrained YOLOv8 classification models?
Pretrained YOLOv8 classification models can be found in the [Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models/v8) section. Models like `yolov8n-cls.pt`, `yolov8s-cls.pt`, `yolov8m-cls.pt`, etc., are pretrained on the [ImageNet](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/ImageNet.yaml) dataset and can be easily downloaded and used for various image classification tasks.
### How can I export a trained YOLOv8 model to different formats?
You can export a trained YOLOv8 model to various formats using Python or CLI commands. For instance, to export a model to ONNX format:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-cls.pt") # load the trained model
# Export the model to ONNX
model.export(format="onnx")
```
=== "CLI"
```bash
yolo export model=yolov8n-cls.pt format=onnx # export the trained model to ONNX format
```
For detailed export options, refer to the [Export](../modes/export.md) page.
### How do I validate a trained YOLOv8 classification model?
To validate a trained model's accuracy on a dataset like MNIST160, you can use the following Python or CLI commands:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-cls.pt") # load the trained model
# Validate the model
metrics = model.val() # no arguments needed, uses the dataset and settings from training
metrics.top1 # top1 accuracy
metrics.top5 # top5 accuracy
```
=== "CLI"
```bash
yolo classify val model=yolov8n-cls.pt # validate the trained model
```
For more information, visit the [Validate](#val) section.

View file

@ -180,3 +180,111 @@ Available YOLOv8 export formats are in the table below. You can export to any fo
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
See full `export` details in the [Export](../modes/export.md) page.
## FAQ
### How do I train a YOLOv8 model on my custom dataset?
Training a YOLOv8 model on a custom dataset involves a few steps:
1. **Prepare the Dataset**: Ensure your dataset is in the YOLO format. For guidance, refer to our [Dataset Guide](../datasets/detect/index.md).
2. **Load the Model**: Use the Ultralytics YOLO library to load a pre-trained model or create a new model from a YAML file.
3. **Train the Model**: Execute the `train` method in Python or the `yolo detect train` command in CLI.
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolov8n.pt")
# Train the model on your custom dataset
model.train(data="my_custom_dataset.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
yolo detect train data=my_custom_dataset.yaml model=yolov8n.pt epochs=100 imgsz=640
```
For detailed configuration options, visit the [Configuration](../usage/cfg.md) page.
### What pretrained models are available in YOLOv8?
Ultralytics YOLOv8 offers various pretrained models for object detection, segmentation, and pose estimation. These models are pretrained on the COCO dataset or ImageNet for classification tasks. Here are some of the available models:
- [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n.pt)
- [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt)
- [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8m.pt)
- [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8l.pt)
- [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8x.pt)
For a detailed list and performance metrics, refer to the [Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models/v8) section.
### How can I validate the accuracy of my trained YOLOv8 model?
To validate the accuracy of your trained YOLOv8 model, you can use the `.val()` method in Python or the `yolo detect val` command in CLI. This will provide metrics like mAP50-95, mAP50, and more.
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load the model
model = YOLO("path/to/best.pt")
# Validate the model
metrics = model.val()
print(metrics.box.map) # mAP50-95
```
=== "CLI"
```bash
yolo detect val model=path/to/best.pt
```
For more validation details, visit the [Val](../modes/val.md) page.
### What formats can I export a YOLOv8 model to?
Ultralytics YOLOv8 allows exporting models to various formats such as ONNX, TensorRT, CoreML, and more to ensure compatibility across different platforms and devices.
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load the model
model = YOLO("yolov8n.pt")
# Export the model to ONNX format
model.export(format="onnx")
```
=== "CLI"
```bash
yolo export model=yolov8n.pt format=onnx
```
Check the full list of supported formats and instructions on the [Export](../modes/export.md) page.
### Why should I use Ultralytics YOLOv8 for object detection?
Ultralytics YOLOv8 is designed to offer state-of-the-art performance for object detection, segmentation, and pose estimation. Here are some key advantages:
1. **Pretrained Models**: Utilize models pretrained on popular datasets like COCO and ImageNet for faster development.
2. **High Accuracy**: Achieves impressive mAP scores, ensuring reliable object detection.
3. **Speed**: Optimized for real-time inference, making it ideal for applications requiring swift processing.
4. **Flexibility**: Export models to various formats like ONNX and TensorRT for deployment across multiple platforms.
Explore our [Blog](https://www.ultralytics.com/blog) for use cases and success stories showcasing YOLOv8 in action.

View file

@ -55,3 +55,68 @@ Oriented object detection goes a step further than regular object detection with
## Conclusion
YOLOv8 supports multiple tasks, including detection, segmentation, classification, oriented object detection and keypoints detection. Each of these tasks has different objectives and use cases. By understanding the differences between these tasks, you can choose the appropriate task for your computer vision application.
## FAQ
### What tasks can Ultralytics YOLOv8 perform?
Ultralytics YOLOv8 is a versatile AI framework capable of performing various computer vision tasks with high accuracy and speed. These tasks include:
- **[Detection](detect.md):** Identifying and localizing objects in images or video frames by drawing bounding boxes around them.
- **[Segmentation](segment.md):** Segmenting images into different regions based on their content, useful for applications like medical imaging.
- **[Classification](classify.md):** Categorizing entire images based on their content, leveraging variants of the EfficientNet architecture.
- **[Pose estimation](pose.md):** Detecting specific keypoints in an image or video frame to track movements or poses.
- **[Oriented Object Detection (OBB)](obb.md):** Detecting rotated objects with an added orientation angle for enhanced accuracy.
### How do I use Ultralytics YOLOv8 for object detection?
To use Ultralytics YOLOv8 for object detection, follow these steps:
1. Prepare your dataset in the appropriate format.
2. Train the YOLOv8 model using the detection task.
3. Use the model to make predictions by feeding in new images or video frames.
!!! Example
=== "Python"
```python
from ultralytics import YOLO
model = YOLO("yolov8n.pt") # Load pre-trained model
results = model.predict(source="image.jpg") # Perform object detection
results[0].show()
```
=== "CLI"
```bash
yolo detect model=yolov8n.pt source='image.jpg'
```
For more detailed instructions, check out our [detection examples](detect.md).
### What are the benefits of using YOLOv8 for segmentation tasks?
Using YOLOv8 for segmentation tasks provides several advantages:
1. **High Accuracy:** The segmentation task leverages a variant of the U-Net architecture to achieve precise segmentation.
2. **Speed:** YOLOv8 is optimized for real-time applications, offering quick processing even for high-resolution images.
3. **Multiple Applications:** It is ideal for medical imaging, autonomous driving, and other applications requiring detailed image segmentation.
Learn more about the benefits and use cases of YOLOv8 for segmentation in the [segmentation section](segment.md).
### Can Ultralytics YOLOv8 handle pose estimation and keypoint detection?
Yes, Ultralytics YOLOv8 can effectively perform pose estimation and keypoint detection with high accuracy and speed. This feature is particularly useful for tracking movements in sports analytics, healthcare, and human-computer interaction applications. YOLOv8 detects keypoints in an image or video frame, allowing for precise pose estimation.
For more details and implementation tips, visit our [pose estimation examples](pose.md).
### Why should I choose Ultralytics YOLOv8 for oriented object detection (OBB)?
Oriented Object Detection (OBB) with YOLOv8 provides enhanced precision by detecting objects with an additional angle parameter. This feature is beneficial for applications requiring accurate localization of rotated objects, such as aerial imagery analysis and warehouse automation.
- **Increased Precision:** The angle component reduces false positives for rotated objects.
- **Versatile Applications:** Useful for tasks in geospatial analysis, robotics, etc.
Check out the [Oriented Object Detection section](obb.md) for more details and examples.

View file

@ -201,3 +201,91 @@ Available YOLOv8-obb export formats are in the table below. You can export to an
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n-obb_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
See full `export` details in the [Export](../modes/export.md) page.
## FAQ
### What are Oriented Bounding Boxes (OBB) and how do they differ from regular bounding boxes?
Oriented Bounding Boxes (OBB) include an additional angle to enhance object localization accuracy in images. Unlike regular bounding boxes, which are axis-aligned rectangles, OBBs can rotate to fit the orientation of the object better. This is particularly useful for applications requiring precise object placement, such as aerial or satellite imagery ([Dataset Guide](../datasets/obb/index.md)).
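As a hedged sketch (the image path is a placeholder), OBB predictions expose the rotated boxes through the `obb` attribute of each result:
```python
from ultralytics import YOLO

model = YOLO("yolov8n-obb.pt")                # pretrained OBB model
results = model.predict(source="aerial.jpg")  # placeholder image path

obb = results[0].obb  # oriented bounding boxes
print(obb.xywhr)      # center x, center y, width, height, rotation angle
```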
### How do I train a YOLOv8n-obb model using a custom dataset?
To train a YOLOv8n-obb model with a custom dataset, follow the example below using Python or CLI:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolov8n-obb.pt")
# Train the model
results = model.train(data="path/to/custom_dataset.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
yolo obb train data=path/to/custom_dataset.yaml model=yolov8n-obb.pt epochs=100 imgsz=640
```
For more training arguments, check the [Configuration](../usage/cfg.md) section.
### What datasets can I use for training YOLOv8-OBB models?
YOLOv8-OBB models are pretrained on datasets like [DOTAv1](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/DOTAv1.yaml) but you can use any dataset formatted for OBB. Detailed information on OBB dataset formats can be found in the [Dataset Guide](../datasets/obb/index.md).
### How can I export a YOLOv8-OBB model to ONNX format?
Exporting a YOLOv8-OBB model to ONNX format is straightforward using either Python or CLI:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-obb.pt")
# Export the model
model.export(format="onnx")
```
=== "CLI"
```bash
yolo export model=yolov8n-obb.pt format=onnx
```
For more export formats and details, refer to the [Export](../modes/export.md) page.
### How do I validate the accuracy of a YOLOv8n-obb model?
To validate a YOLOv8n-obb model, you can use Python or CLI commands as shown below:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-obb.pt")
# Validate the model
metrics = model.val(data="dota8.yaml")
```
=== "CLI"
```bash
yolo obb val model=yolov8n-obb.pt data=dota8.yaml
```
See full validation details in the [Val](../modes/val.md) section.

View file

@ -195,3 +195,64 @@ Available YOLOv8-pose export formats are in the table below. You can export to a
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n-pose_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
See full `export` details in the [Export](../modes/export.md) page.
## FAQ
### What is Pose Estimation with Ultralytics YOLOv8 and how does it work?
Pose estimation with Ultralytics YOLOv8 involves identifying specific points, known as keypoints, in an image. These keypoints typically represent joints or other important features of the object. The output includes the `[x, y]` coordinates and confidence scores for each point. YOLOv8-pose models are specifically designed for this task and use the `-pose` suffix, such as `yolov8n-pose.pt`. These models are pre-trained on datasets like [COCO keypoints](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml) and can be used for various pose estimation tasks. For more information, visit the [Pose Estimation Page](#pose-estimation).
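A brief sketch of accessing the predicted keypoints (the image path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n-pose.pt")               # pretrained pose model
results = model.predict(source="person.jpg")  # placeholder image path

keypoints = results[0].keypoints  # detected keypoints for each person
print(keypoints.xy)               # [x, y] pixel coordinates per keypoint
print(keypoints.conf)             # per-keypoint confidence scores
```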
### How can I train a YOLOv8-pose model on a custom dataset?
Training a YOLOv8-pose model on a custom dataset involves loading a model, either a new model defined by a YAML file or a pre-trained model. You can then start the training process using your specified dataset and parameters.
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-pose.yaml") # build a new model from YAML
model = YOLO("yolov8n-pose.pt") # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="your-dataset.yaml", epochs=100, imgsz=640)
```
For comprehensive details on training, refer to the [Train Section](#train).
### How do I validate a trained YOLOv8-pose model?
Validation of a YOLOv8-pose model involves assessing its accuracy using the same dataset parameters retained during training. Here's an example:
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-pose.pt") # load an official model
model = YOLO("path/to/best.pt") # load a custom model
# Validate the model
metrics = model.val() # no arguments needed, dataset and settings remembered
```
For more information, visit the [Val Section](#val).
### Can I export a YOLOv8-pose model to other formats, and how?
Yes, you can export a YOLOv8-pose model to various formats like ONNX, CoreML, TensorRT, and more. This can be done using either Python or the Command Line Interface (CLI).
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolov8n-pose.pt") # load an official model
model = YOLO("path/to/best.pt") # load a custom trained model
# Export the model
model.export(format="onnx")
```
Refer to the [Export Section](#export) for more details.
### What are the available Ultralytics YOLOv8-pose models and their performance metrics?
Ultralytics YOLOv8 offers various pretrained pose models such as YOLOv8n-pose, YOLOv8s-pose, and YOLOv8m-pose, among others. These models differ in size, accuracy (mAP), and speed. For instance, the YOLOv8n-pose model achieves an mAP<sup>pose</sup>50-95 of 50.4 and an mAP<sup>pose</sup>50 of 80.1. For a complete list and performance details, visit the [Models Section](#models).

View file

@ -185,3 +185,93 @@ Available YOLOv8-seg export formats are in the table below. You can export to an
| [NCNN](../integrations/ncnn.md) | `ncnn` | `yolov8n-seg_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
See full `export` details in the [Export](../modes/export.md) page.
## FAQ
### How do I train a YOLOv8 segmentation model on a custom dataset?
To train a YOLOv8 segmentation model on a custom dataset, you first need to prepare your dataset in the YOLO segmentation format. You can use tools like [JSON2YOLO](https://github.com/ultralytics/JSON2YOLO) to convert datasets from other formats. Once your dataset is ready, you can train the model using Python or CLI commands:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8 segment model
model = YOLO("yolov8n-seg.pt")
# Train the model
results = model.train(data="path/to/your_dataset.yaml", epochs=100, imgsz=640)
```
=== "CLI"
```bash
yolo segment train data=path/to/your_dataset.yaml model=yolov8n-seg.pt epochs=100 imgsz=640
```
Check the [Configuration](../usage/cfg.md) page for more available arguments.
### What is the difference between object detection and instance segmentation in YOLOv8?
Object detection identifies and localizes objects within an image by drawing bounding boxes around them, whereas instance segmentation not only identifies the bounding boxes but also delineates the exact shape of each object. YOLOv8 instance segmentation models provide masks or contours that outline each detected object, which is particularly useful for tasks where knowing the precise shape of objects is important, such as medical imaging or autonomous driving.
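To make the difference concrete, a short sketch of what each result exposes (the image path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n-seg.pt")               # segmentation models also predict boxes
results = model.predict(source="image.jpg")  # placeholder image path

boxes = results[0].boxes  # axis-aligned bounding boxes, as in plain detection
masks = results[0].masks  # per-instance segmentation masks, unique to segmentation
```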
### Why use YOLOv8 for instance segmentation?
Ultralytics YOLOv8 is a state-of-the-art model recognized for its high accuracy and real-time performance, making it ideal for instance segmentation tasks. YOLOv8 Segment models come pretrained on the [COCO dataset](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml), ensuring robust performance across a variety of objects. Additionally, YOLOv8 supports training, validation, prediction, and export functionalities with seamless integration, making it highly versatile for both research and industry applications.
### How do I load and validate a pretrained YOLOv8 segmentation model?
Loading and validating a pretrained YOLOv8 segmentation model is straightforward. Here's how you can do it using both Python and CLI:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolov8n-seg.pt")
# Validate the model
metrics = model.val()
print("Mean Average Precision for boxes:", metrics.box.map)
print("Mean Average Precision for masks:", metrics.seg.map)
```
=== "CLI"
```bash
yolo segment val model=yolov8n-seg.pt
```
These steps will provide you with validation metrics like Mean Average Precision (mAP), crucial for assessing model performance.
### How can I export a YOLOv8 segmentation model to ONNX format?
Exporting a YOLOv8 segmentation model to ONNX format is simple and can be done using Python or CLI commands:
!!! Example
=== "Python"
```python
from ultralytics import YOLO
# Load a pretrained model
model = YOLO("yolov8n-seg.pt")
# Export the model to ONNX format
model.export(format="onnx")
```
=== "CLI"
```bash
yolo export model=yolov8n-seg.pt format=onnx
```
For more details on exporting to various formats, refer to the [Export](../modes/export.md) page.

View file

@ -99,3 +99,126 @@ Here are all supported callbacks. See callbacks [source code](https://github.com
| ----------------- | ---------------------------------------- |
| `on_export_start` | Triggered when the export process starts |
| `on_export_end` | Triggered when the export process ends |
## FAQ
### What are Ultralytics callbacks and how can I use them?
**Ultralytics callbacks** are specialized entry points triggered during key stages of model operations like training, validation, exporting, and prediction. These callbacks allow for custom functionality at specific points in the process, enabling enhancements and modifications to the workflow. Each callback accepts a `Trainer`, `Validator`, or `Predictor` object, depending on the operation type. For detailed properties of these objects, refer to the [Reference section](../reference/cfg/__init__.md).
To use a callback, you can define a function and then add it to the model with the `add_callback` method. Here's an example of how to return additional information during prediction:
```python
from ultralytics import YOLO
def on_predict_batch_end(predictor):
"""Handle prediction batch end by combining results with corresponding frames; modifies predictor results."""
_, image, _, _ = predictor.batch
image = image if isinstance(image, list) else [image]
predictor.results = zip(predictor.results, image)
model = YOLO("yolov8n.pt")
model.add_callback("on_predict_batch_end", on_predict_batch_end)
for result, frame in model.predict():
pass
```
### How can I customize Ultralytics training routine using callbacks?
To customize your Ultralytics training routine using callbacks, you can inject your logic at specific stages of the training process. Ultralytics YOLO provides a variety of training callbacks such as `on_train_start`, `on_train_end`, and `on_train_batch_end`. These allow you to add custom metrics, processing, or logging.
Here's an example of how to log additional metrics at the end of each training epoch:
```python
from ultralytics import YOLO
def on_train_epoch_end(trainer):
"""Custom logic for additional metrics logging at the end of each training epoch."""
additional_metric = compute_additional_metric(trainer)
trainer.log({"additional_metric": additional_metric})
model = YOLO("yolov8n.pt")
model.add_callback("on_train_epoch_end", on_train_epoch_end)
model.train(data="coco.yaml", epochs=10)
```
Refer to the [Training Guide](../modes/train.md) for more details on how to effectively use training callbacks.
### Why should I use callbacks during validation in Ultralytics YOLO?
Using **callbacks during validation** in Ultralytics YOLO can enhance model evaluation by allowing custom processing, logging, or metrics calculation. Callbacks such as `on_val_start`, `on_val_batch_end`, and `on_val_end` provide entry points to inject custom logic, ensuring detailed and comprehensive validation processes.
For instance, you might want to log additional validation metrics or save intermediate results for further analysis. Here's an example of how to log custom metrics at the end of validation:
```python
from ultralytics import YOLO
def on_val_end(validator):
"""Log custom metrics at end of validation."""
custom_metric = compute_custom_metric(validator)
validator.log({"custom_metric": custom_metric})
model = YOLO("yolov8n.pt")
model.add_callback("on_val_end", on_val_end)
model.val(data="coco.yaml")
```
Check out the [Validation Guide](../modes/val.md) for further insights on incorporating callbacks into your validation process.
### How do I attach a custom callback for the prediction mode in Ultralytics YOLO?
To attach a custom callback for the **prediction mode** in Ultralytics YOLO, you define a callback function and register it with the prediction process. Common prediction callbacks include `on_predict_start`, `on_predict_batch_end`, and `on_predict_end`. These allow for modification of prediction outputs and integration of additional functionalities like data logging or result transformation.
Here is an example where a custom callback is used to log predictions:
```python
from ultralytics import YOLO
def on_predict_end(predictor):
"""Log predictions at the end of prediction."""
for result in predictor.results:
log_prediction(result)
model = YOLO("yolov8n.pt")
model.add_callback("on_predict_end", on_predict_end)
results = model.predict(source="image.jpg")
```
For more comprehensive usage, refer to the [Prediction Guide](../modes/predict.md) which includes detailed instructions and additional customization options.
### What are some practical examples of using callbacks in Ultralytics YOLO?
Ultralytics YOLO supports various practical implementations of callbacks to enhance and customize different phases like training, validation, and prediction. Some practical examples include:
1. **Logging Custom Metrics**: Log additional metrics at different stages, such as the end of training or validation epochs.
2. **Data Augmentation**: Implement custom data transformations or augmentations during prediction or training batches.
3. **Intermediate Results**: Save intermediate results such as predictions or frames for further analysis or visualization.
Example: Combining frames with prediction results during prediction using `on_predict_batch_end`:
```python
from ultralytics import YOLO
def on_predict_batch_end(predictor):
"""Combine prediction results with frames."""
_, image, _, _ = predictor.batch
image = image if isinstance(image, list) else [image]
predictor.results = zip(predictor.results, image)
model = YOLO("yolov8n.pt")
model.add_callback("on_predict_batch_end", on_predict_batch_end)
for result, frame in model.predict():
pass
```
Explore the [Complete Callback Reference](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/utils/callbacks/base.py) to find more options and examples.

View file

@ -277,3 +277,38 @@ Effective logging, checkpointing, plotting, and file management can help you kee
| `exist_ok` | `False` | Determines whether to overwrite an existing experiment directory if one with the same name already exists. Setting this to `True` allows overwriting, while `False` prevents it. |
| `plots` | `False` | Controls the generation and saving of training and validation plots. Set to `True` to create plots such as loss curves, precision-recall curves, and sample predictions. Useful for visually tracking model performance over time. |
| `save` | `False` | Enables the saving of training checkpoints and final model weights. Set to `True` to periodically save model states, allowing training to be resumed from these checkpoints or models to be deployed. |
## FAQ
### How do I improve the performance of my YOLO model during training?
Improving YOLO model performance involves tuning hyperparameters like batch size, learning rate, momentum, and weight decay. Adjusting augmentation settings, selecting the right optimizer, and employing techniques like early stopping or mixed precision can also help. For detailed guidance on training settings, refer to the [Train Guide](../modes/train.md).
### What are the key hyperparameters to consider for YOLO model accuracy?
Key hyperparameters affecting YOLO model accuracy include:
- **Batch Size (`batch`)**: Larger batch sizes can stabilize training but may require more memory.
- **Learning Rate (`lr0`)**: Controls the step size for weight updates; smaller rates offer fine adjustments but slow convergence.
- **Momentum (`momentum`)**: Helps accelerate gradient vectors in the right directions, dampening oscillations.
- **Image Size (`imgsz`)**: Larger image sizes can improve accuracy but increase computational load.
Adjust these values based on your dataset and hardware capabilities. Explore more in the [Train Settings](#train-settings) section.
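A hedged sketch that sets these hyperparameters explicitly (values are illustrative, not recommendations):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Set the key accuracy-related hyperparameters explicitly
results = model.train(
    data="coco8.yaml",
    epochs=100,
    batch=32,        # larger batches stabilize training but need more memory
    lr0=0.01,        # initial learning rate
    momentum=0.937,  # optimizer momentum
    imgsz=640,       # training image size
)
```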
### How do I set the learning rate for training a YOLO model?
The learning rate (`lr0`) is crucial for optimization. A common starting point is `0.01` for SGD or `0.001` for Adam. It's essential to monitor training metrics and adjust if necessary. Use cosine learning rate schedulers (`cos_lr`) or warmup techniques (`warmup_epochs`, `warmup_momentum`) to dynamically modify the rate during training. Find more details in the [Train Guide](../modes/train.md).
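For example, a sketch enabling the cosine scheduler together with a short warmup (values are illustrative):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Cosine learning rate schedule with a brief warmup phase
results = model.train(
    data="coco8.yaml",
    epochs=100,
    lr0=0.01,             # initial learning rate
    cos_lr=True,          # cosine learning rate scheduler
    warmup_epochs=3,      # warmup duration in epochs
    warmup_momentum=0.8,  # momentum used during warmup
)
```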
### What are the default inference settings for YOLO models?
Default inference settings include:
- **Confidence Threshold (`conf=0.25`)**: Minimum confidence for detections.
- **IoU Threshold (`iou=0.7`)**: For Non-Maximum Suppression (NMS).
- **Image Size (`imgsz=640`)**: Resizes input images prior to inference.
- **Device (`device=None`)**: Selects CPU or GPU for inference.
For a comprehensive overview, visit the [Predict Settings](#predict-settings) section and the [Predict Guide](../modes/predict.md).
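A minimal sketch overriding these defaults at prediction time (the image path is a placeholder):
```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Override the default inference settings for a stricter, CPU-only pass
results = model.predict(source="image.jpg", conf=0.5, iou=0.5, imgsz=320, device="cpu")
```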
### Why should I use mixed precision training with YOLO models?
Mixed precision training, enabled with `amp=True`, helps reduce memory usage and can speed up training by utilizing the advantages of both FP16 and FP32. This is beneficial for modern GPUs, which support mixed precision natively, allowing more models to fit in memory and enable faster computations without significant loss in accuracy. Learn more about this in the [Train Guide](../modes/train.md).

View file

@ -230,3 +230,55 @@ This will create `default_copy.yaml`, which you can then pass as `cfg=default_co
yolo copy-cfg
yolo cfg=default_copy.yaml imgsz=320
```
## FAQ
### How do I use the Ultralytics YOLOv8 command line interface (CLI) for model training?
To train a YOLOv8 model using the CLI, you can execute a simple one-line command in the terminal. For example, to train a detection model for 10 epochs with a learning rate of 0.01, you would run:
```bash
yolo train data=coco8.yaml model=yolov8n.pt epochs=10 lr0=0.01
```
This command uses the `train` mode with specific arguments. Refer to the full list of available arguments in the [Configuration Guide](cfg.md).
### What tasks can I perform with the Ultralytics YOLOv8 CLI?
The Ultralytics YOLOv8 CLI supports a variety of tasks including detection, segmentation, classification, validation, prediction, export, and tracking. For instance:
- **Train a Model**: Run `yolo train data=<data.yaml> model=<model.pt> epochs=<num>`.
- **Run Predictions**: Use `yolo predict model=<model.pt> source=<data_source> imgsz=<image_size>`.
- **Export a Model**: Execute `yolo export model=<model.pt> format=<export_format>`.
Each task can be customized with various arguments. For detailed syntax and examples, see the respective sections like [Train](#train), [Predict](#predict), and [Export](#export).
### How can I validate the accuracy of a trained YOLOv8 model using the CLI?
To validate a YOLOv8 model's accuracy, use the `val` mode. For example, to validate a pretrained detection model with a batch size of 1 and image size of 640, run:
```bash
yolo val model=yolov8n.pt data=coco8.yaml batch=1 imgsz=640
```
This command evaluates the model on the specified dataset and provides performance metrics. For more details, refer to the [Val](#val) section.
### What formats can I export my YOLOv8 models to using the CLI?
YOLOv8 models can be exported to various formats such as ONNX, CoreML, TensorRT, and more. For instance, to export a model to ONNX format, run:
```bash
yolo export model=yolov8n.pt format=onnx
```
For complete details, visit the [Export](../modes/export.md) page.
### How do I customize YOLOv8 CLI commands to override default arguments?
To override default arguments in YOLOv8 CLI commands, pass them as `arg=value` pairs. For example, to train a model with custom arguments, use:
```bash
yolo train data=coco8.yaml model=yolov8n.pt epochs=10 lr0=0.01
```
For a full list of available arguments and their descriptions, refer to the [Configuration Guide](cfg.md). Ensure arguments are formatted correctly, as shown in the [Overriding default arguments](#overriding-default-arguments) section.

View file

@ -93,3 +93,87 @@ To know more about Callback triggering events and entry point, checkout our [Cal
## Other engine components
Other components, such as `Validators` and `Predictors`, can be customized in a similar way. See the Reference section for more information on these.
## FAQ
### How do I customize the Ultralytics YOLOv8 DetectionTrainer for specific tasks?
To customize the Ultralytics YOLOv8 `DetectionTrainer` for a specific task, you can override its methods to adapt to your custom model and dataloader. Start by inheriting from `DetectionTrainer` and then redefine methods like `get_model` to implement your custom functionalities. Here's an example:
```python
from ultralytics.models.yolo.detect import DetectionTrainer
class CustomTrainer(DetectionTrainer):
def get_model(self, cfg, weights):
"""Loads a custom detection model given configuration and weight files."""
...
trainer = CustomTrainer(overrides={...})
trainer.train()
trained_model = trainer.best  # path to the best saved model weights
```
For further customization, such as changing the loss function or adding a callback, see our [Callbacks Guide](../usage/callbacks.md).
### What are the key components of the BaseTrainer in Ultralytics YOLOv8?
The `BaseTrainer` in Ultralytics YOLOv8 serves as the foundation for training routines and can be customized for various tasks by overriding its generic methods. Key components include:
- `get_model(cfg, weights)` to build the model to be trained.
- `get_dataloader()` to build the dataloader.
For more details on the customization and source code, see the [`BaseTrainer` Reference](../reference/engine/trainer.md).
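As a hedged sketch (the method signatures follow the `DetectionTrainer` reference and may differ slightly between versions), overriding both components looks roughly like this:
```python
from ultralytics.models.yolo.detect import DetectionTrainer
class MyTrainer(DetectionTrainer):
    def get_model(self, cfg=None, weights=None, verbose=True):
        """Build the model to be trained; insert custom model logic here."""
        return super().get_model(cfg, weights, verbose)
    def get_dataloader(self, dataset_path, batch_size=16, rank=0, mode="train"):
        """Build the dataloader; swap in a custom dataset or sampler here."""
        return super().get_dataloader(dataset_path, batch_size, rank, mode)
```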
### How can I add a callback to the Ultralytics YOLOv8 DetectionTrainer?
You can add callbacks to monitor and modify the training process in Ultralytics YOLOv8 `DetectionTrainer`. For instance, here's how you can add a callback to log model weights after every training epoch:
```python
from ultralytics.models.yolo.detect import DetectionTrainer
# callback to upload model weights
def log_model(trainer):
"""Logs the path of the last model weight used by the trainer."""
last_weight_path = trainer.last
print(last_weight_path)
trainer = DetectionTrainer(overrides={...})
trainer.add_callback("on_train_epoch_end", log_model) # Adds to existing callbacks
trainer.train()
```
For further details on callback events and entry points, refer to our [Callbacks Guide](../usage/callbacks.md).
### Why should I use Ultralytics YOLOv8 for model training?
Ultralytics YOLOv8 offers a high-level abstraction over powerful engine executors, making it ideal for rapid development and customization. Key benefits include:
- **Ease of Use**: Both command-line and Python interfaces simplify complex tasks.
- **Performance**: Optimized for real-time object detection and various vision AI applications.
- **Customization**: Easily extendable for custom models, loss functions, and dataloaders.
Learn more about YOLOv8's capabilities by visiting [Ultralytics YOLO](https://www.ultralytics.com/yolo).
### Can I use the Ultralytics YOLOv8 DetectionTrainer for non-standard models?
Yes, Ultralytics YOLOv8 `DetectionTrainer` is highly flexible and can be customized for non-standard models. By inheriting from `DetectionTrainer`, you can override its methods to support your specific model's needs. Here's a simple example:
```python
from ultralytics.models.yolo.detect import DetectionTrainer
class CustomDetectionTrainer(DetectionTrainer):
def get_model(self, cfg, weights):
"""Loads a custom detection model."""
...
trainer = CustomDetectionTrainer(overrides={...})
trainer.train()
```
For more comprehensive instructions and examples, review the [DetectionTrainer](../reference/engine/trainer.md) documentation.

View file

@ -330,3 +330,89 @@ Explorer API can be used to explore datasets with advanced semantic, vector-simi
You can easily customize Trainers to support custom tasks or explore R&D ideas. Learn more about customizing `Trainers`, `Validators`, and `Predictors` to suit your project needs in the Customization Section.
[Customization tutorials](engine.md){ .md-button }
## FAQ
### How can I integrate YOLOv8 into my Python project for object detection?
Integrating Ultralytics YOLOv8 into your Python projects is simple. You can load a pre-trained model or train a new model from scratch. Here's how to get started:
```python
from ultralytics import YOLO
# Load a pretrained YOLO model
model = YOLO("yolov8n.pt")
# Perform object detection on an image
results = model("https://ultralytics.com/images/bus.jpg")
# Visualize the results
for result in results:
result.show()
```
See more detailed examples in our [Predict Mode](../modes/predict.md) section.
### What are the different modes available in YOLOv8?
Ultralytics YOLOv8 provides various modes to cater to different machine learning workflows. These include:
- **[Train](../modes/train.md)**: Train a model using custom datasets.
- **[Val](../modes/val.md)**: Validate model performance on a validation set.
- **[Predict](../modes/predict.md)**: Make predictions on new images or video streams.
- **[Export](../modes/export.md)**: Export models to various formats such as ONNX and TensorRT.
- **[Track](../modes/track.md)**: Real-time object tracking in video streams.
- **[Benchmark](../modes/benchmark.md)**: Benchmark model performance across different configurations.
Each mode is designed to provide comprehensive functionalities for different stages of model development and deployment.
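Most of these modes are illustrated in the answers below; as one additional hedged example, Track mode can be invoked from Python like this (the video path is a placeholder):
```python
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
# Track objects across the frames of a video; the source path is a placeholder
results = model.track(source="path/to/video.mp4", show=True)
```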
### How do I train a custom YOLOv8 model using my dataset?
To train a custom YOLOv8 model, you need to specify your dataset and other hyperparameters. Here's a quick example:
```python
from ultralytics import YOLO
# Load the YOLO model
model = YOLO("yolov8n.yaml")
# Train the model with custom dataset
model.train(data="path/to/your/dataset.yaml", epochs=10)
```
For more details on training and links to example usage, visit our [Train Mode](../modes/train.md) page.
### How do I export YOLOv8 models for deployment?
Exporting YOLOv8 models in a format suitable for deployment is straightforward with the `export` function. For example, you can export a model to ONNX format:
```python
from ultralytics import YOLO
# Load the YOLO model
model = YOLO("yolov8n.pt")
# Export the model to ONNX format
model.export(format="onnx")
```
For various export options, refer to the [Export Mode](../modes/export.md) documentation.
### Can I validate my YOLOv8 model on different datasets?
Yes, validating YOLOv8 models on different datasets is possible. After training, you can use the validation mode to evaluate the performance:
```python
from ultralytics import YOLO
# Load a YOLOv8 model
model = YOLO("yolov8n.yaml")
# Train the model
model.train(data="coco8.yaml", epochs=5)
# Validate the model on a different dataset
model.val(data="path/to/separate/data.yaml")
```
Check the [Val Mode](../modes/val.md) page for detailed examples and usage.

View file

@ -568,3 +568,64 @@ make_divisible(7, 3)
make_divisible(7, 2)
# >>> 8
```
## FAQ
### What utilities are included in the Ultralytics package to enhance machine learning workflows?
The Ultralytics package includes a variety of utilities designed to streamline and optimize machine learning workflows. Key utilities include [auto-annotation](../reference/data/annotator.md#ultralytics.data.annotator.auto_annotate) for labeling datasets, converting COCO to YOLO format with [convert_coco](../reference/data/converter.md#ultralytics.data.converter.convert_coco), compressing images, and dataset auto-splitting. These tools aim to reduce manual effort, ensure consistency, and enhance data processing efficiency.
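As a hedged sketch, assuming the `compress_one_image` and `autosplit` helpers in `ultralytics.data.utils` and placeholder paths:
```python
from ultralytics.data.utils import autosplit, compress_one_image
# Compress a single image in place while preserving its aspect ratio
compress_one_image("path/to/image.jpg")
# Auto-split an image folder into train/val/test lists (90/10/0 here)
autosplit(path="path/to/images", weights=(0.9, 0.1, 0.0), annotated_only=False)
```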
### How can I use Ultralytics to auto-label my dataset?
If you have a pre-trained Ultralytics YOLO object detection model, you can use it with the [SAM](../models/sam.md) model to auto-annotate your dataset in segmentation format. Here's an example:
```python
from ultralytics.data.annotator import auto_annotate
auto_annotate(
data="path/to/new/data",
det_model="yolov8n.pt",
sam_model="mobile_sam.pt",
device="cuda",
output_dir="path/to/save_labels",
)
```
For more details, check the [auto_annotate reference section](../reference/data/annotator.md#ultralytics.data.annotator.auto_annotate).
### How do I convert COCO dataset annotations to YOLO format in Ultralytics?
To convert COCO JSON annotations into YOLO format for object detection, you can use the `convert_coco` utility. Here's a sample code snippet:
```python
from ultralytics.data.converter import convert_coco
convert_coco(
"../datasets/coco/annotations/",
use_segments=False,
use_keypoints=False,
cls91to80=True,
)
```
For additional information, visit the [convert_coco reference page](../reference/data/converter.md#ultralytics.data.converter.convert_coco).
### What is the purpose of the YOLO Data Explorer in the Ultralytics package?
The [YOLO Explorer](../datasets/explorer/index.md) is a powerful tool introduced in the `8.1.0` update to enhance dataset understanding. It allows you to use text queries to find object instances in your dataset, making it easier to analyze and manage your data. This tool provides valuable insights into dataset composition and distribution, helping to improve model training and performance.
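A hedged sketch of a natural-language query with the Explorer API is shown below; the dataset, model, and query text are placeholders, and the optional Explorer dependencies are assumed to be installed.
```python
from ultralytics import Explorer
# Build an embeddings table for the dataset, then query it in natural language
explorer = Explorer(data="coco8.yaml", model="yolov8n.pt")
explorer.create_embeddings_table()
results = explorer.ask_ai("show me images containing people")
print(results)
```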
### How can I convert bounding boxes to segments in Ultralytics?
To convert existing bounding box data (in `x y w h` format) to segments, you can use the `yolo_bbox2segment` function. Ensure your files are organized with separate directories for images and labels.
```python
from ultralytics.data.converter import yolo_bbox2segment
yolo_bbox2segment(
im_dir="path/to/images",
save_dir=None, # saved to "labels-segment" in the images directory
sam_model="sam_b.pt",
)
```
For more information, visit the [yolo_bbox2segment reference page](../reference/data/converter.md#ultralytics.data.converter.yolo_bbox2segment).

View file

@ -647,6 +647,7 @@ plugins:
reference/ops.md: reference/utils/ops.md
reference/results.md: reference/engine/results.md
reference/base_val.md: index.md
reference/index.md: reference/cfg/__init__.md
tasks/classification.md: tasks/classify.md
tasks/detection.md: tasks/detect.md
tasks/segmentation.md: tasks/segment.md

View file

@ -505,7 +505,7 @@ class Model(nn.Module):
used to customize various aspects of the validation process.
Returns:
(dict): Validation metrics obtained from the validation process.
(ultralytics.utils.metrics.DetMetrics): Validation metrics obtained from the validation process.
Raises:
AssertionError: If the model is not a PyTorch model.