Add Chinese Modes and Tasks Docs (#6274)
Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

parent 795b95bdcb
commit e3a538bbde

293 changed files with 3681 additions and 736 deletions

docs/en/CNAME (Normal file, 1 line added)
@@ -0,0 +1 @@
docs.ultralytics.com

docs/en/datasets/classify/caltech101.md (Normal file, 81 lines added)
@@ -0,0 +1,81 @@
---
comments: true
description: Learn about the Caltech-101 dataset, its structure and uses in machine learning. Includes instructions to train a YOLO model using this dataset.
keywords: Caltech-101, dataset, YOLO training, machine learning, object recognition, ultralytics
---

# Caltech-101 Dataset

The [Caltech-101](https://data.caltech.edu/records/mzrjq-6wc02) dataset is a widely used dataset for object recognition tasks, containing around 9,000 images from 101 object categories. The categories were chosen to reflect a variety of real-world objects, and the images were carefully selected and annotated to provide a challenging benchmark for object recognition algorithms.

## Key Features

- The Caltech-101 dataset comprises around 9,000 color images divided into 101 categories.
- The categories encompass a wide variety of objects, including animals, vehicles, household items, and people.
- The number of images per category varies, from about 40 to 800.
- Images vary in size, with most at medium resolution (roughly 300x200 pixels).
- Caltech-101 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks.

## Dataset Structure

Unlike many other datasets, Caltech-101 is not formally split into training and testing sets, so users typically create their own splits based on their specific needs. A common practice is to use a random subset of images for training (e.g., 30 images per category) and the remaining images for testing.
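
The common split described above can be sketched in a few lines of Python. This is a minimal illustration and not part of the Ultralytics API: the function name `split_per_class`, the `{category: [paths]}` input shape, the fixed seed, and the file names in the demo are all assumptions made for the example.

```python
import random
from typing import Dict, List, Tuple


def split_per_class(
    images_by_class: Dict[str, List[str]],
    train_per_class: int = 30,
    seed: int = 0,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Randomly pick `train_per_class` images per category for training.

    Everything that is not picked goes to the test split. Categories with
    fewer images than `train_per_class` contribute all of them to training.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train: Dict[str, List[str]] = {}
    test: Dict[str, List[str]] = {}
    for name, paths in images_by_class.items():
        shuffled = list(paths)  # copy so the caller's list is not mutated
        rng.shuffle(shuffled)
        train[name] = shuffled[:train_per_class]
        test[name] = shuffled[train_per_class:]
    return train, test


# Hypothetical file names, only to show the call shape.
demo = {
    "airplanes": [f"airplanes/image_{i:04d}.jpg" for i in range(1, 801)],
    "ant": [f"ant/image_{i:04d}.jpg" for i in range(1, 43)],
}
train_split, test_split = split_per_class(demo)
print({k: (len(train_split[k]), len(test_split[k])) for k in demo})
# {'airplanes': (30, 770), 'ant': (30, 12)}
```

Because the number of images per category varies so much, a fixed per-class training count leaves very different test-set sizes per category; reporting per-class accuracy alongside the overall figure is one way to account for that.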

## Applications

The Caltech-101 dataset is extensively used for training and evaluating machine learning models on object recognition tasks, including Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other algorithms. Its wide variety of categories and high-quality images make it an excellent dataset for research and development in machine learning and computer vision.

## Usage

To train a YOLO classification model on the Caltech-101 dataset for 100 epochs with an image size of 416, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='caltech101', epochs=100, imgsz=416)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=caltech101 model=yolov8n-cls.pt epochs=100 imgsz=416
        ```

## Sample Images and Annotations

The Caltech-101 dataset contains high-quality color images of various objects, providing a well-structured dataset for object recognition tasks. Here are some examples of images from the dataset:

[Image: sample images from the Caltech-101 dataset]

The example showcases the variety and complexity of the objects in the Caltech-101 dataset, emphasizing the significance of a diverse dataset for training robust object recognition models.

## Citations and Acknowledgments

If you use the Caltech-101 dataset in your research or development work, please cite the following paper:

!!! note ""

    === "BibTeX"

        ```bibtex
        @article{fei2007learning,
          title={Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories},
          author={Fei-Fei, Li and Fergus, Rob and Perona, Pietro},
          journal={Computer Vision and Image Understanding},
          volume={106},
          number={1},
          pages={59--70},
          year={2007},
          publisher={Elsevier}
        }
        ```

We would like to acknowledge Li Fei-Fei, Rob Fergus, and Pietro Perona for creating and maintaining the Caltech-101 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the Caltech-101 dataset and its creators, visit the [Caltech-101 dataset website](https://data.caltech.edu/records/mzrjq-6wc02).

docs/en/datasets/classify/caltech256.md (Normal file, 78 lines added)
@@ -0,0 +1,78 @@
---
comments: true
description: Explore the Caltech-256 dataset, a diverse collection of images used for object recognition tasks in machine learning. Learn to train a YOLO model on the dataset.
keywords: Ultralytics, YOLO, Caltech-256, dataset, object recognition, machine learning, computer vision, deep learning
---

# Caltech-256 Dataset

The [Caltech-256](https://data.caltech.edu/records/nyy15-4j048) dataset is an extensive collection of images used for object classification tasks. It contains around 30,000 images divided into 257 categories (256 object categories and 1 background category). The images are carefully curated and annotated to provide a challenging and diverse benchmark for object recognition algorithms.

## Key Features

- The Caltech-256 dataset comprises around 30,000 color images divided into 257 categories.
- Each category contains a minimum of 80 images.
- The categories encompass a wide variety of real-world objects, including animals, vehicles, household items, and people.
- Images are of variable sizes and resolutions.
- Caltech-256 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks.

## Dataset Structure

Like Caltech-101, the Caltech-256 dataset does not have a formal split between training and testing sets. Users typically create their own splits according to their specific needs. A common practice is to use a random subset of images for training and the remaining images for testing.

## Applications

The Caltech-256 dataset is extensively used for training and evaluating machine learning models on object recognition tasks, including Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other algorithms. Its diverse set of categories and high-quality images make it an invaluable dataset for research and development in machine learning and computer vision.

## Usage

To train a YOLO classification model on the Caltech-256 dataset for 100 epochs with an image size of 416, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='caltech256', epochs=100, imgsz=416)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=caltech256 model=yolov8n-cls.pt epochs=100 imgsz=416
        ```

## Sample Images and Annotations

The Caltech-256 dataset contains high-quality color images of various objects, providing a comprehensive dataset for object recognition tasks. Here are some examples of images from the dataset ([credit](https://ml4a.github.io/demos/tsne_viewer.html)):

[Image: sample images from the Caltech-256 dataset]

The example showcases the diversity and complexity of the objects in the Caltech-256 dataset, emphasizing the importance of a varied dataset for training robust object recognition models.

## Citations and Acknowledgments

If you use the Caltech-256 dataset in your research or development work, please cite the following paper:

!!! note ""

    === "BibTeX"

        ```bibtex
        @article{griffin2007caltech,
          title={Caltech-256 object category dataset},
          author={Griffin, Gregory and Holub, Alex and Perona, Pietro},
          year={2007}
        }
        ```

We would like to acknowledge Gregory Griffin, Alex Holub, and Pietro Perona for creating and maintaining the Caltech-256 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the Caltech-256 dataset and its creators, visit the [Caltech-256 dataset website](https://data.caltech.edu/records/nyy15-4j048).

docs/en/datasets/classify/cifar10.md (Normal file, 80 lines added)
@@ -0,0 +1,80 @@
---
comments: true
description: Explore the CIFAR-10 dataset, widely used for training in machine learning and computer vision, and learn how to use it with Ultralytics YOLO.
keywords: CIFAR-10, dataset, machine learning, image classification, computer vision, YOLO, Ultralytics, training, testing, deep learning, Convolutional Neural Networks, Support Vector Machines
---

# CIFAR-10 Dataset

The [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) (Canadian Institute For Advanced Research) dataset is a collection of images widely used to train and evaluate machine learning and computer vision algorithms. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton and consists of 60,000 32x32 color images in 10 classes.

## Key Features

- The CIFAR-10 dataset consists of 60,000 images, divided into 10 classes.
- Each class contains 6,000 images, split into 5,000 for training and 1,000 for testing.
- The images are colored and of size 32x32 pixels.
- The 10 classes represent airplanes, automobiles, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
- CIFAR-10 is commonly used for training and testing in the field of machine learning and computer vision.
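
For reference, the class list above can be written as an index-to-name table. The sketch below assumes the label order of the official CIFAR-10 release (index 0 is `airplane`, index 9 is `truck`), where the car class is named `automobile`. The constant and helper names are made up for this example and are not part of any library.

```python
# Class names in the label order of the official CIFAR-10 release (assumed here).
CIFAR10_CLASSES = (
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
)

IMAGES_PER_CLASS = 6_000
TRAIN_PER_CLASS = 5_000
TEST_PER_CLASS = 1_000


def class_name(index: int) -> str:
    """Return the class name for an integer label (hypothetical helper)."""
    return CIFAR10_CLASSES[index]


# The per-class counts add up to the dataset totals quoted above.
assert TRAIN_PER_CLASS + TEST_PER_CLASS == IMAGES_PER_CLASS
assert len(CIFAR10_CLASSES) * IMAGES_PER_CLASS == 60_000

print(class_name(3))  # cat
```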

## Dataset Structure

The CIFAR-10 dataset is split into two subsets:

1. **Training Set**: This subset contains 50,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.

## Applications

The CIFAR-10 dataset is widely used for training and evaluating machine learning models on image classification tasks, including Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other algorithms. The diversity of its classes and the presence of color images make it a well-rounded dataset for research and development in machine learning and computer vision.

## Usage

To train a YOLO classification model on the CIFAR-10 dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='cifar10', epochs=100, imgsz=32)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=cifar10 model=yolov8n-cls.pt epochs=100 imgsz=32
        ```

## Sample Images and Annotations

The CIFAR-10 dataset contains color images of various objects, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:

[Image: sample images from the CIFAR-10 dataset]

The example showcases the variety and complexity of the objects in the CIFAR-10 dataset, highlighting the importance of a diverse dataset for training robust image classification models.

## Citations and Acknowledgments

If you use the CIFAR-10 dataset in your research or development work, please cite the following paper:

!!! note ""

    === "BibTeX"

        ```bibtex
        @TECHREPORT{Krizhevsky09learningmultiple,
          author={Alex Krizhevsky},
          title={Learning multiple layers of features from tiny images},
          institution={},
          year={2009}
        }
        ```

We would like to acknowledge Alex Krizhevsky for creating and maintaining the CIFAR-10 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the CIFAR-10 dataset and its creator, visit the [CIFAR-10 dataset website](https://www.cs.toronto.edu/~kriz/cifar.html).

docs/en/datasets/classify/cifar100.md (Normal file, 80 lines added)
@@ -0,0 +1,80 @@
---
comments: true
description: Discover how to leverage the CIFAR-100 dataset for machine learning and computer vision tasks with YOLO. Gain insights on its structure, use, and utilization for model training.
keywords: Ultralytics, YOLO, CIFAR-100 dataset, image classification, machine learning, computer vision, YOLO model training
---

# CIFAR-100 Dataset

The [CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html) (Canadian Institute For Advanced Research) dataset is a significant extension of the CIFAR-10 dataset, composed of 60,000 32x32 color images in 100 classes. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton and offers a more challenging benchmark for machine learning and computer vision tasks.

## Key Features

- The CIFAR-100 dataset consists of 60,000 images, divided into 100 classes.
- Each class contains 600 images, split into 500 for training and 100 for testing.
- The images are colored and of size 32x32 pixels.
- The 100 classes are grouped into 20 coarse categories for higher-level classification.
- CIFAR-100 is commonly used for training and testing in the field of machine learning and computer vision.

## Dataset Structure

The CIFAR-100 dataset is split into two subsets:

1. **Training Set**: This subset contains 50,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
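
The numbers above are easy to cross-check. The short sketch below derives the split sizes from the per-class counts and shows how many fine classes fall under each coarse category, assuming the 100 classes are spread evenly over the 20 coarse categories (five each, as in the official CIFAR-100 release). The constant names are illustrative only.

```python
NUM_FINE_CLASSES = 100
NUM_COARSE_CLASSES = 20
TRAIN_PER_CLASS = 500
TEST_PER_CLASS = 100

# Split sizes follow directly from the per-class counts.
train_images = NUM_FINE_CLASSES * TRAIN_PER_CLASS  # 50,000
test_images = NUM_FINE_CLASSES * TEST_PER_CLASS    # 10,000

# Fine classes per coarse category, and images per coarse category.
fine_per_coarse = NUM_FINE_CLASSES // NUM_COARSE_CLASSES
images_per_coarse = fine_per_coarse * (TRAIN_PER_CLASS + TEST_PER_CLASS)

print(train_images, test_images, train_images + test_images)  # 50000 10000 60000
print(fine_per_coarse, images_per_coarse)                     # 5 3000
```

With only 500 training images per fine class (versus 5,000 per class in CIFAR-10), each class is seen far less often during training, which is one reason CIFAR-100 is the harder benchmark of the two.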

## Applications

The CIFAR-100 dataset is extensively used for training and evaluating machine learning models on image classification tasks, including Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other algorithms. Its larger number of classes and its color images make it a more challenging and comprehensive dataset than CIFAR-10 for research and development in machine learning and computer vision.

## Usage

To train a YOLO classification model on the CIFAR-100 dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='cifar100', epochs=100, imgsz=32)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=cifar100 model=yolov8n-cls.pt epochs=100 imgsz=32
        ```

## Sample Images and Annotations

The CIFAR-100 dataset contains color images of various objects, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:

[Image: sample images from the CIFAR-100 dataset]

The example showcases the variety and complexity of the objects in the CIFAR-100 dataset, highlighting the importance of a diverse dataset for training robust image classification models.

## Citations and Acknowledgments

If you use the CIFAR-100 dataset in your research or development work, please cite the following paper:

!!! note ""

    === "BibTeX"

        ```bibtex
        @TECHREPORT{Krizhevsky09learningmultiple,
          author={Alex Krizhevsky},
          title={Learning multiple layers of features from tiny images},
          institution={},
          year={2009}
        }
        ```

We would like to acknowledge Alex Krizhevsky for creating and maintaining the CIFAR-100 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the CIFAR-100 dataset and its creator, visit the [CIFAR-100 dataset website](https://www.cs.toronto.edu/~kriz/cifar.html).

docs/en/datasets/classify/fashion-mnist.md (Normal file, 79 lines added)
@@ -0,0 +1,79 @@
---
comments: true
description: Learn how to use the Fashion-MNIST dataset for image classification with the Ultralytics YOLO model. Covers dataset structure, labels, applications, and usage.
keywords: Ultralytics, YOLO, Fashion-MNIST, dataset, image classification, machine learning, deep learning, neural networks, training, testing
---

# Fashion-MNIST Dataset

The [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset is a database of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image associated with a label from 10 classes. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms.

## Key Features

- Fashion-MNIST contains 60,000 training images and 10,000 testing images of Zalando's articles.
- The dataset comprises grayscale images of size 28x28 pixels.
- Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255.
- Fashion-MNIST is widely used for training and testing in the field of machine learning, especially for image classification tasks.

## Dataset Structure

The Fashion-MNIST dataset is split into two subsets:

1. **Training Set**: This subset contains 60,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.

## Labels

Each training and test example is assigned to one of the following labels:

0. T-shirt/top
1. Trouser
2. Pullover
3. Dress
4. Coat
5. Sandal
6. Shirt
7. Sneaker
8. Bag
9. Ankle boot
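
In code, the label table above is usually kept as a small lookup, and the 0 to 255 pixel values are often rescaled before training. The following is a minimal sketch under those assumptions. The names `FASHION_MNIST_LABELS` and `scale_pixels` are invented for this example, and most training pipelines perform the scaling step internally.

```python
from typing import Dict, List, Sequence

# Integer label -> class name, exactly as listed above.
FASHION_MNIST_LABELS: Dict[int, str] = {
    0: "T-shirt/top",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle boot",
}


def scale_pixels(pixels: Sequence[int]) -> List[float]:
    """Map 8-bit pixel values (0-255) to floats in the range [0.0, 1.0]."""
    return [p / 255.0 for p in pixels]


print(FASHION_MNIST_LABELS[9])  # Ankle boot
print(scale_pixels([0, 255]))   # [0.0, 1.0]
```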

## Applications

The Fashion-MNIST dataset is widely used for training and evaluating machine learning models on image classification tasks, including Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other algorithms. The dataset's simple and well-structured format makes it an essential resource for researchers and practitioners in machine learning and computer vision.

## Usage

To train a CNN model on the Fashion-MNIST dataset for 100 epochs with an image size of 28x28, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='fashion-mnist', epochs=100, imgsz=28)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=fashion-mnist model=yolov8n-cls.pt epochs=100 imgsz=28
        ```

## Sample Images and Annotations

The Fashion-MNIST dataset contains grayscale images of Zalando's articles, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:

[Image: sample images from the Fashion-MNIST dataset]

The example showcases the variety and complexity of the images in the Fashion-MNIST dataset, highlighting the importance of a diverse dataset for training robust image classification models.

## Acknowledgments

If you use the Fashion-MNIST dataset in your research or development work, please acknowledge the dataset by linking to the [GitHub repository](https://github.com/zalandoresearch/fashion-mnist). This dataset was made available by Zalando Research.

docs/en/datasets/classify/imagenet.md (Normal file, 83 lines added)
@@ -0,0 +1,83 @@
---
comments: true
description: Understand how to use ImageNet, an extensive annotated image dataset for object recognition research, with Ultralytics YOLO models. Learn about its structure, usage, and significance in computer vision.
keywords: Ultralytics, YOLO, ImageNet, dataset, object recognition, deep learning, computer vision, machine learning, dataset training, model training, image classification, object detection
---

# ImageNet Dataset

[ImageNet](https://www.image-net.org/) is a large-scale database of annotated images designed for use in visual object recognition research. It contains over 14 million images, with each image annotated using WordNet synsets, making it one of the most extensive resources available for training deep learning models in computer vision tasks.

## Key Features

- ImageNet contains over 14 million high-resolution images spanning thousands of object categories.
- The dataset is organized according to the WordNet hierarchy, with each synset representing a category.
- ImageNet is widely used for training and benchmarking in the field of computer vision, particularly for image classification and object detection tasks.
- The annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been instrumental in advancing computer vision research.

## Dataset Structure

The ImageNet dataset is organized using the WordNet hierarchy. Each node in the hierarchy represents a category, and each category is described by a synset (a collection of synonymous terms). The images in ImageNet are annotated with one or more synsets, providing a rich resource for training models to recognize various objects and their relationships.
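
To make the synset idea concrete, here is a small sketch of how a WordNet ID (wnid) can be mapped to the terms of its synset. ImageNet categories are conventionally identified by a wnid, an `n` followed by an eight-digit WordNet offset. The two-entry table and the helper names below are illustrative assumptions, not an official API.

```python
import re

# Tiny illustrative excerpt; the full dataset has thousands of synsets.
SYNSETS = {
    "n01440764": ("tench", "Tinca tinca"),
    "n01443537": ("goldfish", "Carassius auratus"),
}

# A noun wnid: the letter "n" followed by exactly eight digits.
WNID_PATTERN = re.compile(r"n\d{8}")


def is_wnid(text: str) -> bool:
    """Check whether `text` has the shape of a noun WordNet ID."""
    return WNID_PATTERN.fullmatch(text) is not None


def describe(wnid: str) -> str:
    """Return the synonymous terms of a synset as one comma-separated string."""
    if not is_wnid(wnid):
        raise ValueError(f"not a WordNet ID: {wnid!r}")
    terms = SYNSETS.get(wnid)
    return ", ".join(terms) if terms else "<unknown synset>"


print(describe("n01440764"))  # tench, Tinca tinca
```

Keeping the wnid as the key, rather than a human-readable name, avoids ambiguity when several synsets share a common word.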

## ImageNet Large Scale Visual Recognition Challenge (ILSVRC)

The annual [ImageNet Large Scale Visual Recognition Challenge (ILSVRC)](http://image-net.org/challenges/LSVRC/) has been an important event in the field of computer vision. It has provided a platform for researchers and developers to evaluate their algorithms and models on a large-scale dataset with standardized evaluation metrics. The ILSVRC has led to significant advancements in the development of deep learning models for image classification, object detection, and other computer vision tasks.

## Applications

The ImageNet dataset is widely used for training and evaluating deep learning models in various computer vision tasks, such as image classification, object detection, and object localization. Some popular deep learning architectures, such as AlexNet, VGG, and ResNet, were developed and benchmarked using the ImageNet dataset.

## Usage

To train a deep learning model on the ImageNet dataset for 100 epochs with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='imagenet', epochs=100, imgsz=224)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=imagenet model=yolov8n-cls.pt epochs=100 imgsz=224
        ```

## Sample Images and Annotations

The ImageNet dataset contains high-resolution images spanning thousands of object categories, providing a diverse and extensive dataset for training and evaluating computer vision models. Here are some examples of images from the dataset:

[Image: sample images from the ImageNet dataset]

The example showcases the variety and complexity of the images in the ImageNet dataset, highlighting the importance of a diverse dataset for training robust computer vision models.

## Citations and Acknowledgments

If you use the ImageNet dataset in your research or development work, please cite the following paper:

!!! note ""

    === "BibTeX"

        ```bibtex
        @article{ILSVRC15,
          author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
          title={ImageNet Large Scale Visual Recognition Challenge},
          year={2015},
          journal={International Journal of Computer Vision (IJCV)},
          volume={115},
          number={3},
          pages={211-252}
        }
        ```

We would like to acknowledge the ImageNet team, led by Olga Russakovsky, Jia Deng, and Li Fei-Fei, for creating and maintaining the ImageNet dataset as a valuable resource for the machine learning and computer vision research community. For more information about the ImageNet dataset and its creators, visit the [ImageNet website](https://www.image-net.org/).

docs/en/datasets/classify/imagenet10.md (Normal file, 78 lines added)
@@ -0,0 +1,78 @@
---
comments: true
description: Explore the compact ImageNet10 Dataset developed by Ultralytics. Ideal for fast testing of computer vision training pipelines and CV model sanity checks.
keywords: Ultralytics, YOLO, ImageNet10 Dataset, Image detection, Deep Learning, ImageNet, AI model testing, Computer vision, Machine learning
---

# ImageNet10 Dataset

The [ImageNet10](https://github.com/ultralytics/yolov5/releases/download/v1.0/imagenet10.zip) dataset is a small-scale subset of the [ImageNet](https://www.image-net.org/) database, developed by [Ultralytics](https://ultralytics.com) and designed for CI tests, sanity checks, and fast testing of training pipelines. This dataset is composed of the first image in the training set and the first image from the validation set of the first 10 classes in ImageNet. Although significantly smaller, it retains the structure and diversity of the original ImageNet dataset.

## Key Features

- ImageNet10 is a compact version of ImageNet, with 20 images representing the first 10 classes of the original dataset.
- The dataset is organized according to the WordNet hierarchy, mirroring the structure of the full ImageNet dataset.
- It is ideally suited for CI tests, sanity checks, and rapid testing of training pipelines in computer vision tasks.
- Although not designed for model benchmarking, it can provide a quick indication of a model's basic functionality and correctness.
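
The construction rule (the first image of each split for the first 10 classes) can be sketched as follows. This is not the script Ultralytics used to build ImageNet10. It is a minimal illustration that assumes a hypothetical `<root>/<split>/<class>/<image>` folder layout and takes sorted order to define "first"; the function name `first_images` is made up for the example.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def first_images(
    root: Path,
    splits: Sequence[str] = ("train", "val"),
    num_classes: int = 10,
) -> Dict[str, List[Path]]:
    """Pick the first image from each of the first `num_classes` class folders."""
    picked: Dict[str, List[Path]] = {}
    for split in splits:
        # Class folders in sorted order; only the first `num_classes` are used.
        class_dirs = sorted(d for d in (root / split).iterdir() if d.is_dir())
        picked[split] = []
        for class_dir in class_dirs[:num_classes]:
            images = sorted(p for p in class_dir.iterdir() if p.is_file())
            if images:  # skip empty class folders
                picked[split].append(images[0])
    return picked
```

With 10 classes and two splits, the function returns 20 paths in total, matching the 20 images mentioned above.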

## Dataset Structure

The ImageNet10 dataset, like the original ImageNet, is organized using the WordNet hierarchy. Each of the 10 classes in ImageNet10 is described by a synset (a collection of synonymous terms). The images in ImageNet10 are annotated with one or more synsets, providing a compact resource for testing models that recognize various objects and their relationships.

## Applications

The ImageNet10 dataset is useful for quickly testing and debugging computer vision models and pipelines. Its small size allows for rapid iteration, making it ideal for continuous integration tests and sanity checks. It can also be used for fast preliminary testing of new models or changes to existing models before moving on to full-scale testing with the complete ImageNet dataset.

## Usage

To test a training pipeline on the ImageNet10 dataset with an image size of 224x224, you can use the following code snippets, which run a short 5-epoch training job. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Test Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n-cls.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='imagenet10', epochs=5, imgsz=224)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo classify train data=imagenet10 model=yolov8n-cls.pt epochs=5 imgsz=224
        ```

## Sample Images and Annotations

The ImageNet10 dataset contains a subset of images from the original ImageNet dataset. These images are chosen to represent the first 10 classes in the dataset, providing a diverse yet compact dataset for quick testing and evaluation.

[Image: sample images from the ImageNet10 dataset]

The example showcases the variety and complexity of the images in the ImageNet10 dataset, highlighting its usefulness for sanity checks and quick testing of computer vision models.

## Citations and Acknowledgments

If you use the ImageNet10 dataset in your research or development work, please cite the original ImageNet paper:

!!! note ""

    === "BibTeX"

        ```bibtex
        @article{ILSVRC15,
          author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
          title={ImageNet Large Scale Visual Recognition Challenge},
          year={2015},
          journal={International Journal of Computer Vision (IJCV)},
          volume={115},
          number={3},
          pages={211-252}
        }
        ```

We would like to acknowledge the ImageNet team, led by Olga Russakovsky, Jia Deng, and Li Fei-Fei, for creating and maintaining the ImageNet dataset. The ImageNet10 dataset, while a compact subset, is a valuable resource for quick testing and debugging in the machine learning and computer vision research community. For more information about the ImageNet dataset and its creators, visit the [ImageNet website](https://www.image-net.org/).
113
docs/en/datasets/classify/imagenette.md
Normal file
|
|
@ -0,0 +1,113 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn about the ImageNette dataset and its usage in deep learning model training. Find code snippets for model training and explore ImageNette datatypes.
|
||||
keywords: ImageNette dataset, Ultralytics, YOLO, Image classification, Machine Learning, Deep learning, Training code snippets, CNN, ImageNette160, ImageNette320
|
||||
---
|
||||
|
||||
# ImageNette Dataset
|
||||
|
||||
The [ImageNette](https://github.com/fastai/imagenette) dataset is a subset of the larger [ImageNet](http://www.image-net.org/) dataset that includes only 10 easily distinguishable classes. It was created to provide a quicker, easier-to-use version of ImageNet for software development and education.
|
||||
|
||||
## Key Features
|
||||
|
||||
- ImageNette contains images from 10 different classes: tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, and parachute.
|
||||
- The dataset comprises colored images of varying dimensions.
|
||||
- ImageNette is widely used for training and testing in the field of machine learning, especially for image classification tasks.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The ImageNette dataset is split into two subsets:
|
||||
|
||||
1. **Training Set**: This subset contains several thousand images used for training machine learning models. The exact number varies per class.
|
||||
2. **Validation Set**: This subset consists of several hundred images used for validating and benchmarking the trained models. Again, the exact number varies per class.
|
||||
|
||||
## Applications
|
||||
|
||||
The ImageNette dataset is widely used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), and various other machine learning algorithms. The dataset's straightforward format and well-chosen classes make it a handy resource for both beginner and experienced practitioners in the field of machine learning and computer vision.
|
||||
|
||||
## Usage
|
||||
|
||||
To train a model on the ImageNette dataset for 100 epochs with a standard image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='imagenette', epochs=100, imgsz=224)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo classify train data=imagenette model=yolov8n-cls.pt epochs=100 imgsz=224
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
The ImageNette dataset contains colored images of various objects and scenes, providing a diverse dataset for image classification tasks. Here are some examples of images from the dataset:
|
||||
|
||||

|
||||
|
||||
The example showcases the variety and complexity of the images in the ImageNette dataset, highlighting the importance of a diverse dataset for training robust image classification models.
|
||||
|
||||
## ImageNette160 and ImageNette320
|
||||
|
||||
For faster prototyping and training, the ImageNette dataset is also available in two reduced sizes: ImageNette160 and ImageNette320. These datasets maintain the same classes and structure as the full ImageNette dataset, but the images are resized to a smaller dimension. As such, these versions of the dataset are particularly useful for preliminary model testing, or when computational resources are limited.
|
||||
|
||||
To use these datasets, simply replace 'imagenette' with 'imagenette160' or 'imagenette320' in the training command. The following code snippets illustrate this:
|
||||
|
||||
!!! example "Train Example with ImageNette160"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model with ImageNette160
|
||||
results = model.train(data='imagenette160', epochs=100, imgsz=160)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model with ImageNette160
|
||||
yolo classify train data=imagenette160 model=yolov8n-cls.pt epochs=100 imgsz=160
|
||||
```
|
||||
|
||||
!!! example "Train Example with ImageNette320"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model with ImageNette320
|
||||
results = model.train(data='imagenette320', epochs=100, imgsz=320)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model with ImageNette320
|
||||
yolo classify train data=imagenette320 model=yolov8n-cls.pt epochs=100 imgsz=320
|
||||
```
|
||||
|
||||
These smaller versions of the dataset allow for rapid iterations during the development process while still providing valuable and realistic image classification tasks.
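
The dataset-name swap described above can also be captured in a small helper. The following is a minimal sketch, not part of the Ultralytics API: the names `VARIANT_IMGSZ`, `train_kwargs`, and `train_variant` are hypothetical, and the image sizes simply mirror the values used in the examples in this section.

```python
# Hypothetical mapping from dataset name to the image size used in the examples above.
VARIANT_IMGSZ = {"imagenette": 224, "imagenette160": 160, "imagenette320": 320}


def train_kwargs(variant, epochs=100):
    """Build the keyword arguments for model.train() for one ImageNette variant."""
    if variant not in VARIANT_IMGSZ:
        raise ValueError(f"unknown ImageNette variant: {variant!r}")
    return {"data": variant, "epochs": epochs, "imgsz": VARIANT_IMGSZ[variant]}


def train_variant(variant, weights="yolov8n-cls.pt"):
    """Train a classification model on the chosen variant (requires ultralytics)."""
    # Imported lazily so train_kwargs() can be used without the package installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # load a pretrained classification model
    return model.train(**train_kwargs(variant))
```

Calling `train_variant('imagenette160')` would then be equivalent to the ImageNette160 Python example above.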
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the ImageNette dataset in your research or development work, please acknowledge it appropriately. For more information about the ImageNette dataset, visit the [ImageNette dataset GitHub page](https://github.com/fastai/imagenette).
|
||||
84
docs/en/datasets/classify/imagewoof.md
Normal file
|
|
@ -0,0 +1,84 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the ImageWoof dataset, designed for challenging dog breed classification. Train AI models with Ultralytics YOLO using this dataset.
|
||||
keywords: ImageWoof, image classification, dog breeds, machine learning, deep learning, Ultralytics, YOLO, dataset
|
||||
---
|
||||
|
||||
# ImageWoof Dataset
|
||||
|
||||
The [ImageWoof](https://github.com/fastai/imagenette) dataset is a subset of ImageNet consisting of 10 classes that are challenging to classify, since they are all dog breeds. It was created as a more difficult task for image classification algorithms, with the aim of encouraging the development of more advanced models.
|
||||
|
||||
## Key Features
|
||||
|
||||
- ImageWoof contains images of 10 different dog breeds: Australian terrier, Border terrier, Samoyed, Beagle, Shih-Tzu, English foxhound, Rhodesian ridgeback, Dingo, Golden retriever, and Old English sheepdog.
|
||||
- The dataset provides images at various resolutions (full size, 320px, 160px), accommodating different computational capabilities and research needs.
|
||||
- It also includes a version with noisy labels, providing a more realistic scenario where labels might not always be reliable.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The ImageWoof dataset structure is based on the dog breed classes, with each breed having its own directory of images.
|
||||
|
||||
## Applications
|
||||
|
||||
The ImageWoof dataset is widely used for training and evaluating deep learning models in image classification tasks, especially those involving complex and visually similar classes. The dataset's challenge lies in the subtle differences between the dog breeds, which push the limits of a model's performance and generalization.
|
||||
|
||||
## Usage
|
||||
|
||||
To train a CNN model on the ImageWoof dataset for 100 epochs with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='imagewoof', epochs=100, imgsz=224)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo classify train data=imagewoof model=yolov8n-cls.pt epochs=100 imgsz=224
|
||||
```
|
||||
|
||||
## Dataset Variants
|
||||
|
||||
The ImageWoof dataset comes in three different sizes to accommodate various research needs and computational capabilities:
|
||||
|
||||
1. **Full Size (imagewoof)**: This is the original version of the ImageWoof dataset. It contains full-sized images and is ideal for final training and performance benchmarking.
|
||||
|
||||
2. **Medium Size (imagewoof320)**: This version contains images resized so that their shortest side is 320 pixels, with the aspect ratio preserved. It's suitable for faster training without significantly sacrificing model performance.
|
||||
|
||||
3. **Small Size (imagewoof160)**: This version contains images resized so that their shortest side is 160 pixels, with the aspect ratio preserved. It's designed for rapid prototyping and experimentation where training speed is a priority.
|
||||
|
||||
To use these variants in your training, simply replace 'imagewoof' in the dataset argument with 'imagewoof320' or 'imagewoof160'. For example:
|
||||
|
||||
```python
|
||||
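from ultralytics import YOLO

model = YOLO('yolov8n-cls.pt')  # load a pretrained classification model

# For medium-sized dataset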
|
||||
model.train(data='imagewoof320', epochs=100, imgsz=224)
|
||||
|
||||
# For small-sized dataset
|
||||
model.train(data='imagewoof160', epochs=100, imgsz=224)
|
||||
```
|
||||
|
||||
It's important to note that using smaller images will likely yield lower performance in terms of classification accuracy. However, it's an excellent way to iterate quickly in the early stages of model development and prototyping.
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
The ImageWoof dataset contains colorful images of various dog breeds, providing a challenging dataset for image classification tasks. Here are some examples of images from the dataset:
|
||||
|
||||

|
||||
|
||||
The example showcases the subtle differences and similarities among the different dog breeds in the ImageWoof dataset, highlighting the complexity and difficulty of the classification task.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the ImageWoof dataset in your research or development work, please make sure to acknowledge the creators of the dataset by linking to the [official dataset repository](https://github.com/fastai/imagenette).
|
||||
|
||||
We would like to acknowledge the FastAI team for creating and maintaining the ImageWoof dataset as a valuable resource for the machine learning and computer vision research community. For more information about the ImageWoof dataset, visit the [ImageWoof dataset repository](https://github.com/fastai/imagenette).
|
||||
120
docs/en/datasets/classify/index.md
Normal file
|
|
@ -0,0 +1,120 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore image classification datasets supported by Ultralytics, learn the standard dataset format, and set up your own dataset for training models.
|
||||
keywords: Ultralytics, image classification, dataset, machine learning, CIFAR-10, ImageNet, MNIST, torchvision
|
||||
---
|
||||
|
||||
# Image Classification Datasets Overview
|
||||
|
||||
## Dataset format
|
||||
|
||||
The folder structure for classification datasets in torchvision typically follows a standard format:
|
||||
|
||||
```
|
||||
root/
|
||||
|-- class1/
|
||||
| |-- img1.jpg
|
||||
| |-- img2.jpg
|
||||
| |-- ...
|
||||
|
|
||||
|-- class2/
|
||||
| |-- img1.jpg
|
||||
| |-- img2.jpg
|
||||
| |-- ...
|
||||
|
|
||||
|-- class3/
|
||||
| |-- img1.jpg
|
||||
| |-- img2.jpg
|
||||
| |-- ...
|
||||
|
|
||||
|-- ...
|
||||
```
|
||||
|
||||
In this folder structure, the `root` directory contains one subdirectory for each class in the dataset. Each subdirectory is named after the corresponding class and contains all the images for that class. Each image file is named uniquely and is typically in a common image file format such as JPEG or PNG.
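
As a quick sanity check of the layout described above, the images in each class folder can be counted before training. This is a minimal sketch using only the Python standard library; the function name `count_images_per_class` and the set of accepted file extensions are assumptions for illustration, not part of Ultralytics or torchvision.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your dataset uses other formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_class(root):
    """Return {class_name: image_count} for a root/<class>/<image> layout."""
    counts = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # Count only regular files with a known image extension.
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

A class that reports zero images, or an unexpected class name, usually points to a misplaced folder or an unsupported file format.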
|
||||
|
||||
**Example**
|
||||
|
||||
For example, in the CIFAR10 dataset, the folder structure would look like this:
|
||||
|
||||
```
|
||||
cifar-10-/
|
||||
|
|
||||
|-- train/
|
||||
| |-- airplane/
|
||||
| | |-- 10008_airplane.png
|
||||
| | |-- 10009_airplane.png
|
||||
| | |-- ...
|
||||
| |
|
||||
| |-- automobile/
|
||||
| | |-- 1000_automobile.png
|
||||
| | |-- 1001_automobile.png
|
||||
| | |-- ...
|
||||
| |
|
||||
| |-- bird/
|
||||
| | |-- 10014_bird.png
|
||||
| | |-- 10015_bird.png
|
||||
| | |-- ...
|
||||
| |
|
||||
| |-- ...
|
||||
|
|
||||
|-- test/
|
||||
| |-- airplane/
|
||||
| | |-- 10_airplane.png
|
||||
| | |-- 11_airplane.png
|
||||
| | |-- ...
|
||||
| |
|
||||
| |-- automobile/
|
||||
| | |-- 100_automobile.png
|
||||
| | |-- 101_automobile.png
|
||||
| | |-- ...
|
||||
| |
|
||||
| |-- bird/
|
||||
| | |-- 1000_bird.png
|
||||
| | |-- 1001_bird.png
|
||||
| | |-- ...
|
||||
| |
|
||||
| |-- ...
|
||||
```
|
||||
|
||||
In this example, the `train` directory contains subdirectories for each class in the dataset, and each class subdirectory contains all the images for that class. The `test` directory has a similar structure. The `root` directory also contains other files that are part of the CIFAR10 dataset.
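
Because the `train` and `test` directories are expected to share the same class subdirectories, it can help to compare them before training. The sketch below uses only the standard library; `class_names` and `compare_splits` are hypothetical helper names, and the split names default to the ones used in the CIFAR10 example above.

```python
from pathlib import Path


def class_names(split_dir):
    """Return the set of class folder names inside one split directory."""
    return {p.name for p in Path(split_dir).iterdir() if p.is_dir()}


def compare_splits(root, splits=("train", "test")):
    """Report class folders that appear in one split but not in the other."""
    root = Path(root)
    first = class_names(root / splits[0])
    second = class_names(root / splits[1])
    return {
        f"only_in_{splits[0]}": sorted(first - second),
        f"only_in_{splits[1]}": sorted(second - first),
    }
```

Two empty lists in the result mean that both splits contain exactly the same classes.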
|
||||
|
||||
## Usage
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='path/to/dataset', epochs=100, imgsz=640)
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo classify train data=path/to/data model=yolov8n-cls.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Supported Datasets
|
||||
|
||||
Ultralytics supports the following datasets with automatic download:
|
||||
|
||||
* [Caltech 101](caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
|
||||
* [Caltech 256](caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
|
||||
* [CIFAR-10](cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
|
||||
* [CIFAR-100](cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
|
||||
* [Fashion-MNIST](fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
|
||||
* [ImageNet](imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
|
||||
* [ImageNet-10](imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
|
||||
* [Imagenette](imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
|
||||
* [Imagewoof](imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
|
||||
* [MNIST](mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
|
||||
|
||||
### Adding your own dataset
|
||||
|
||||
If you have your own dataset and would like to use it for training classification models with Ultralytics, ensure that it follows the format specified above under "Dataset format" and then point your `data` argument to the dataset directory.
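
If your images currently sit in a single folder per class without separate splits, a short script can copy them into a `train`/`test` layout like the CIFAR10 example above. This is a minimal sketch under stated assumptions: the function name `split_dataset`, the 20% test fraction, and the fixed random seed are illustrative choices, not Ultralytics defaults.

```python
import random
import shutil
from pathlib import Path


def split_dataset(source, dest, test_fraction=0.2, seed=0):
    """Copy source/<class>/<image> into dest/train/<class> and dest/test/<class>."""
    source, dest = Path(source), Path(dest)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    counts = {"train": 0, "test": 0}
    for class_dir in sorted(p for p in source.iterdir() if p.is_dir()):
        images = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(images)
        n_test = int(len(images) * test_fraction)
        for i, image in enumerate(images):
            # The first n_test shuffled images go to the test split.
            split = "test" if i < n_test else "train"
            target = dest / split / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy2(image, target / image.name)
            counts[split] += 1
    return counts
```

The resulting `dest` directory can then be passed as the `data` argument, as in the Usage example above.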
|
||||
86
docs/en/datasets/classify/mnist.md
Normal file
|
|
@ -0,0 +1,86 @@
|
|||
---
|
||||
comments: true
|
||||
description: Detailed guide on the MNIST Dataset, a benchmark in the machine learning community for image classification tasks. Learn about its structure, usage and application.
|
||||
keywords: MNIST dataset, Ultralytics, image classification, machine learning, computer vision, deep learning, AI, dataset guide
|
||||
---
|
||||
|
||||
# MNIST Dataset
|
||||
|
||||
The [MNIST](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) dataset is a large database of handwritten digits that is commonly used for training various image processing systems and machine learning models. It was created by "re-mixing" the samples from NIST's original datasets and has become a benchmark for evaluating the performance of image classification algorithms.
|
||||
|
||||
## Key Features
|
||||
|
||||
- MNIST contains 60,000 training images and 10,000 testing images of handwritten digits.
|
||||
- The dataset comprises grayscale images of size 28x28 pixels.
|
||||
- The images are normalized to fit into a 28x28 pixel bounding box and anti-aliased, introducing grayscale levels.
|
||||
- MNIST is widely used for training and testing in the field of machine learning, especially for image classification tasks.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The MNIST dataset is split into two subsets:
|
||||
|
||||
1. **Training Set**: This subset contains 60,000 images of handwritten digits used for training machine learning models.
|
||||
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
|
||||
|
||||
## Extended MNIST (EMNIST)
|
||||
|
||||
Extended MNIST (EMNIST) is a newer dataset developed and released by NIST to be the successor to MNIST. While MNIST included images only of handwritten digits, EMNIST includes all the images from NIST Special Database 19, which is a large database of handwritten uppercase and lowercase letters as well as digits. The images in EMNIST were converted into the same 28x28 pixel format, by the same process, as were the MNIST images. Accordingly, tools that work with the older, smaller MNIST dataset will likely work unmodified with EMNIST.
|
||||
|
||||
## Applications
|
||||
|
||||
The MNIST dataset is widely used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The dataset's simple and well-structured format makes it an essential resource for researchers and practitioners in the field of machine learning and computer vision.
|
||||
|
||||
## Usage
|
||||
|
||||
To train a CNN model on the MNIST dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='mnist', epochs=100, imgsz=32)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo classify train data=mnist model=yolov8n-cls.pt epochs=100 imgsz=32
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
The MNIST dataset contains grayscale images of handwritten digits, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
|
||||
|
||||

|
||||
|
||||
The example showcases the variety and complexity of the handwritten digits in the MNIST dataset, highlighting the importance of a diverse dataset for training robust image classification models.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the MNIST dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@article{lecun2010mnist,
|
||||
title={MNIST handwritten digit database},
|
||||
author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
|
||||
journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
|
||||
volume={2},
|
||||
year={2010}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge Yann LeCun, Corinna Cortes, and Christopher J.C. Burges for creating and maintaining the MNIST dataset as a valuable resource for the machine learning and computer vision research community. For more information about the MNIST dataset and its creators, visit the [MNIST dataset website](http://yann.lecun.com/exdb/mnist/).
|
||||
97
docs/en/datasets/detect/argoverse.md
Normal file
|
|
@ -0,0 +1,97 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore Argoverse, a comprehensive dataset for autonomous driving tasks including 3D tracking, motion forecasting and depth estimation used in YOLO.
|
||||
keywords: Argoverse dataset, autonomous driving, YOLO, 3D tracking, motion forecasting, LiDAR data, HD maps, ultralytics documentation
|
||||
---
|
||||
|
||||
# Argoverse Dataset
|
||||
|
||||
The [Argoverse](https://www.argoverse.org/) dataset is a collection of data designed to support research in autonomous driving tasks, such as 3D tracking, motion forecasting, and stereo depth estimation. Developed by Argo AI, the dataset provides a wide range of high-quality sensor data, including high-resolution images, LiDAR point clouds, and map data.
|
||||
|
||||
!!! note
|
||||
|
||||
The Argoverse dataset *.zip file required for training was removed from Amazon S3 after the shutdown of Argo AI by Ford, but we have made it available for manual download on [Google Drive](https://drive.google.com/file/d/1st9qW3BeIwQsnR0t8mRpvbsSWIo16ACi/view?usp=drive_link).
|
||||
|
||||
## Key Features
|
||||
|
||||
- Argoverse contains over 290K labeled 3D object tracks and 5 million object instances across 1,263 distinct scenes.
|
||||
- The dataset includes high-resolution camera images, LiDAR point clouds, and richly annotated HD maps.
|
||||
- Annotations include 3D bounding boxes for objects, object tracks, and trajectory information.
|
||||
- Argoverse provides multiple subsets for different tasks, such as 3D tracking, motion forecasting, and stereo depth estimation.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The Argoverse dataset is organized into three main subsets:
|
||||
|
||||
1. **Argoverse 3D Tracking**: This subset contains 113 scenes with over 290K labeled 3D object tracks, focusing on 3D object tracking tasks. It includes LiDAR point clouds, camera images, and sensor calibration information.
|
||||
2. **Argoverse Motion Forecasting**: This subset consists of 324K vehicle trajectories collected from 60 hours of driving data, suitable for motion forecasting tasks.
|
||||
3. **Argoverse Stereo Depth Estimation**: This subset is designed for stereo depth estimation tasks and includes over 10K stereo image pairs with corresponding LiDAR point clouds for ground truth depth estimation.
|
||||
|
||||
## Applications
|
||||
|
||||
The Argoverse dataset is widely used for training and evaluating deep learning models in autonomous driving tasks such as 3D object tracking, motion forecasting, and stereo depth estimation. The dataset's diverse set of sensor data, object annotations, and map information make it a valuable resource for researchers and practitioners in the field of autonomous driving.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the Argoverse dataset, the `Argoverse.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Argoverse.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Argoverse.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/Argoverse.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/Argoverse.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the Argoverse dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='Argoverse.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo detect train data=Argoverse.yaml model=yolov8n.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
The Argoverse dataset contains a diverse set of sensor data, including camera images, LiDAR point clouds, and HD map information, providing rich context for autonomous driving tasks. Here are some examples of data from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Argoverse 3D Tracking**: This image demonstrates an example of 3D object tracking, where objects are annotated with 3D bounding boxes. The dataset provides LiDAR point clouds and camera images to facilitate the development of models for this task.
|
||||
|
||||
The example showcases the variety and complexity of the data in the Argoverse dataset and highlights the importance of high-quality sensor data for autonomous driving tasks.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the Argoverse dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@inproceedings{chang2019argoverse,
|
||||
title={Argoverse: 3D Tracking and Forecasting with Rich Maps},
|
||||
author={Chang, Ming-Fang and Lambert, John and Sangkloy, Patsorn and Singh, Jagjeet and Bak, Slawomir and Hartnett, Andrew and Wang, Dequan and Carr, Peter and Lucey, Simon and Ramanan, Deva and others},
|
||||
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
|
||||
pages={8748--8757},
|
||||
year={2019}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge Argo AI for creating and maintaining the Argoverse dataset as a valuable resource for the autonomous driving research community. For more information about the Argoverse dataset and its creators, visit the [Argoverse dataset website](https://www.argoverse.org/).
|
||||
94
docs/en/datasets/detect/coco.md
Normal file
|
|
@ -0,0 +1,94 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how COCO, a leading dataset for object detection and segmentation, integrates with Ultralytics. Discover ways to use it for training YOLO models.
|
||||
keywords: Ultralytics, COCO dataset, object detection, YOLO, YOLO model training, image segmentation, computer vision, deep learning models
|
||||
---
|
||||
|
||||
# COCO Dataset
|
||||
|
||||
The [COCO](https://cocodataset.org/#home) (Common Objects in Context) dataset is a large-scale object detection, segmentation, and captioning dataset. It is designed to encourage research on a wide variety of object categories and is commonly used for benchmarking computer vision models. It is an essential dataset for researchers and developers working on object detection, segmentation, and pose estimation tasks.
|
||||
|
||||
## Key Features
|
||||
|
||||
- COCO contains 330K images, with 200K images having annotations for object detection, segmentation, and captioning tasks.
|
||||
- The dataset comprises 80 object categories, including common objects like cars, bicycles, and animals, as well as more specific categories such as umbrellas, handbags, and sports equipment.
|
||||
- Annotations include object bounding boxes, segmentation masks, and captions for each image.
|
||||
- COCO provides standardized evaluation metrics, such as mean Average Precision (mAP) and mean Average Recall (mAR), for object detection and segmentation tasks, making it suitable for comparing model performance.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The COCO dataset is split into three subsets:
|
||||
|
||||
1. **Train2017**: This subset contains 118K images for training object detection, segmentation, and captioning models.
|
||||
2. **Val2017**: This subset has 5K images used for validation purposes during model training.
|
||||
3. **Test2017**: This subset consists of 20K images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7384) for performance evaluation.
|
||||
|
||||
## Applications
|
||||
|
||||
The COCO dataset is widely used for training and evaluating deep learning models in object detection (such as YOLO, Faster R-CNN, and SSD), instance segmentation (such as Mask R-CNN), and keypoint detection (such as OpenPose). The dataset's diverse set of object categories, large number of annotated images, and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO dataset, the `coco.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/coco.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/coco.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the COCO dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='coco.yaml', epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=coco.yaml model=yolov8n.pt epochs=100 imgsz=640
        ```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
The COCO dataset contains a diverse set of images with various object categories and complex scenes. Here are some examples of images from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the COCO dataset and the benefits of using mosaicing during the training process.
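To make the label bookkeeping behind mosaicing concrete, here is a minimal sketch. It is not the Ultralytics implementation: it assumes a fixed 2x2 grid of four equally sized tiles with no random cropping or rescaling, and the helper name `mosaic_labels` is hypothetical. Each normalized box is halved in size and shifted into the quadrant its tile occupies.

```python
def mosaic_labels(tiles):
    """Remap normalized YOLO boxes from four equal tiles onto a 2x2 mosaic.

    tiles: list of four label lists, ordered top-left, top-right,
           bottom-left, bottom-right. Each label is
           (cls, x_center, y_center, width, height), normalized to its tile.
    """
    if len(tiles) != 4:
        raise ValueError("expected exactly four tiles")
    # (x, y) offset of each quadrant in normalized mosaic coordinates
    offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
    merged = []
    for labels, (ox, oy) in zip(tiles, offsets):
        for cls, xc, yc, w, h in labels:
            # Each tile covers half of the mosaic in both directions
            merged.append((cls, ox + xc / 2, oy + yc / 2, w / 2, h / 2))
    return merged


# Made-up labels for illustration
tiles = [
    [(0, 0.5, 0.5, 0.2, 0.4)],    # top-left tile: one box
    [],                           # top-right tile: no objects
    [(2, 0.5, 0.5, 1.0, 1.0)],    # bottom-left tile: box filling the tile
    [(1, 0.25, 0.75, 0.1, 0.1)],  # bottom-right tile
]
mosaic = mosaic_labels(tiles)
```

A real mosaic augmentation typically also randomizes the mosaic center and clips boxes that fall outside the final crop, which this sketch leaves out.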
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the COCO dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""

    === "BibTeX"

        ```bibtex
        @misc{lin2015microsoft,
              title={Microsoft COCO: Common Objects in Context},
              author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
              year={2015},
              eprint={1405.0312},
              archivePrefix={arXiv},
              primaryClass={cs.CV}
        }
        ```
|
||||
|
||||
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
|
||||
80
docs/en/datasets/detect/coco8.md
Normal file
|
|
@ -0,0 +1,80 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover the benefits of using the practical and diverse COCO8 dataset for object detection model testing. Learn to configure and use it via Ultralytics HUB and YOLOv8.
|
||||
keywords: Ultralytics, COCO8 dataset, object detection, model testing, dataset configuration, detection approaches, sanity check, training pipelines, YOLOv8
|
||||
---
|
||||
|
||||
# COCO8 Dataset
|
||||
|
||||
## Introduction
|
||||
|
||||
[Ultralytics](https://ultralytics.com) COCO8 is a small but versatile object detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training on larger datasets.
|
||||
|
||||
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file defines the dataset configuration, including the dataset's paths, classes, and other relevant settings. For the COCO8 dataset, the `coco8.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/coco8.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/coco8.yaml"
    ```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the COCO8 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
        ```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
Here are some examples of images from the COCO8 dataset, along with their corresponding annotations:
|
||||
|
||||
<img src="https://user-images.githubusercontent.com/26833433/236818348-e6260a3d-0454-436b-83a9-de366ba07235.jpg" alt="Dataset sample image" width="800">
|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the COCO8 dataset and the benefits of using mosaicing during the training process.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the COCO dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""

    === "BibTeX"

        ```bibtex
        @misc{lin2015microsoft,
              title={Microsoft COCO: Common Objects in Context},
              author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
              year={2015},
              eprint={1405.0312},
              archivePrefix={arXiv},
              primaryClass={cs.CV}
        }
        ```
|
||||
|
||||
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
|
||||
91
docs/en/datasets/detect/globalwheat2020.md
Normal file
|
|
@ -0,0 +1,91 @@
|
|||
---
|
||||
comments: true
|
||||
description: Understand how to utilize the vast Global Wheat Head Dataset for building wheat head detection models. Features, structure, applications, usage, sample data, and citation.
|
||||
keywords: Ultralytics, YOLO, Global Wheat Head Dataset, wheat head detection, plant phenotyping, crop management, deep learning, outdoor images, annotations, YAML configuration
|
||||
---
|
||||
|
||||
# Global Wheat Head Dataset
|
||||
|
||||
The [Global Wheat Head Dataset](http://www.global-wheat.com/) is a collection of images designed to support the development of accurate wheat head detection models for applications in wheat phenotyping and crop management. Wheat heads, also known as spikes, are the grain-bearing parts of the wheat plant. Accurate estimation of wheat head density and size is essential for assessing crop health, maturity, and yield potential. The dataset, created by a collaboration of nine research institutes from seven countries, covers multiple growing regions to ensure models generalize well across different environments.
|
||||
|
||||
## Key Features
|
||||
|
||||
- The dataset contains over 3,000 training images from Europe (France, UK, Switzerland) and North America (Canada).
|
||||
- It includes approximately 1,000 test images from Australia, Japan, and China.
|
||||
- Images are outdoor field images, capturing the natural variability in wheat head appearances.
|
||||
- Annotations include wheat head bounding boxes to support object detection tasks.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The Global Wheat Head Dataset is organized into two main subsets:
|
||||
|
||||
1. **Training Set**: This subset contains over 3,000 images from Europe and North America. The images are labeled with wheat head bounding boxes, providing ground truth for training object detection models.
|
||||
2. **Test Set**: This subset consists of approximately 1,000 images from Australia, Japan, and China. These images are used for evaluating the performance of trained models on unseen genotypes, environments, and observational conditions.
|
||||
|
||||
## Applications
|
||||
|
||||
The Global Wheat Head Dataset is widely used for training and evaluating deep learning models in wheat head detection tasks. The dataset's diverse set of images, capturing a wide range of appearances, environments, and conditions, makes it a valuable resource for researchers and practitioners in the field of plant phenotyping and crop management.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file defines the dataset configuration, including the dataset's paths, classes, and other relevant settings. For the Global Wheat Head Dataset, the `GlobalWheat2020.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/GlobalWheat2020.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/GlobalWheat2020.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/GlobalWheat2020.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/GlobalWheat2020.yaml"
    ```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the Global Wheat Head Dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='GlobalWheat2020.yaml', epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=GlobalWheat2020.yaml model=yolov8n.pt epochs=100 imgsz=640
        ```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
The Global Wheat Head Dataset contains a diverse set of outdoor field images, capturing the natural variability in wheat head appearances, environments, and conditions. Here are some examples of data from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Wheat Head Detection**: This image demonstrates an example of wheat head detection, where wheat heads are annotated with bounding boxes. The dataset provides a variety of images to facilitate the development of models for this task.
|
||||
|
||||
The example showcases the variety and complexity of the data in the Global Wheat Head Dataset and highlights the importance of accurate wheat head detection for applications in wheat phenotyping and crop management.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the Global Wheat Head Dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""

    === "BibTeX"

        ```bibtex
        @article{david2020global,
                 title={Global Wheat Head Detection (GWHD) Dataset: A Large and Diverse Dataset of High-Resolution RGB-Labelled Images to Develop and Benchmark Wheat Head Detection Methods},
                 author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul and others},
                 journal={arXiv preprint arXiv:2005.02162},
                 year={2020}
        }
        ```
|
||||
|
||||
We would like to acknowledge the researchers and institutions that contributed to the creation and maintenance of the Global Wheat Head Dataset as a valuable resource for the plant phenotyping and crop management research community. For more information about the dataset and its creators, visit the [Global Wheat Head Dataset website](http://www.global-wheat.com/).
|
||||
108
docs/en/datasets/detect/index.md
Normal file
|
|
@ -0,0 +1,108 @@
|
|||
---
|
||||
comments: true
|
||||
description: Navigate through supported dataset formats, methods to utilize them and how to add your own datasets. Get insights on porting or converting label formats.
|
||||
keywords: Ultralytics, YOLO, datasets, object detection, dataset formats, label formats, data conversion
|
||||
---
|
||||
|
||||
# Object Detection Datasets Overview
|
||||
|
||||
Training a robust and accurate object detection model requires a comprehensive dataset. This guide introduces various formats of datasets that are compatible with the Ultralytics YOLO model and provides insights into their structure, usage, and how to convert between different formats.
|
||||
|
||||
## Supported Dataset Formats
|
||||
|
||||
### Ultralytics YOLO format
|
||||
|
||||
The Ultralytics YOLO format is a dataset configuration format that allows you to define the dataset root directory, the relative paths to training/validation/testing image directories or *.txt files containing image paths, and a dictionary of class names. Here is an example:
|
||||
|
||||
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8  # dataset root dir
train: images/train  # train images (relative to 'path') 4 images
val: images/val  # val images (relative to 'path') 4 images
test:  # test images (optional)

# Classes (80 COCO classes)
names:
  0: person
  1: bicycle
  2: car
  ...
  77: teddy bear
  78: hair drier
  79: toothbrush
```
|
||||
|
||||
Labels for this format should be exported to YOLO format with one `*.txt` file per image. If there are no objects in an image, no `*.txt` file is required. The `*.txt` file should be formatted with one row per object in `class x_center y_center width height` format. Box coordinates must be in **normalized xywh** format (from 0 to 1). If your boxes are in pixels, you should divide `x_center` and `width` by image width, and `y_center` and `height` by image height. Class numbers should be zero-indexed (start with 0).
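The normalization described above can be sketched in a few lines. This is an illustrative helper, not part of the Ultralytics API: the name `to_yolo_row`, the corner-based input (`x1, y1, x2, y2` in pixels), and the six-decimal formatting are all assumptions made for the example.

```python
def to_yolo_row(cls, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel box given by its corners to one YOLO label row."""
    # Center and size in pixels, then divided by the image dimensions
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{cls} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A box covering the central quarter of a 640x480 image, class 0
row = to_yolo_row(0, 160, 120, 480, 360, img_w=640, img_h=480)
print(row)  # 0 0.500000 0.500000 0.500000 0.500000
```

Writing one such row per object into a `*.txt` file named after the image gives the label file this format expects.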
|
||||
|
||||
<p align="center"><img width="750" src="https://user-images.githubusercontent.com/26833433/91506361-c7965000-e886-11ea-8291-c72b98c25eec.jpg" alt="Example labelled image"></p>
|
||||
|
||||
The label file corresponding to the above image contains 2 persons (class `0`) and a tie (class `27`):
|
||||
|
||||
<p align="center"><img width="428" src="https://user-images.githubusercontent.com/26833433/112467037-d2568c00-8d66-11eb-8796-55402ac0d62f.png" alt="Example label file"></p>
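Reading a label file back is the inverse operation. The sketch below parses label text into pixel-space corner boxes; the function name `parse_labels` is hypothetical, and the sample values are made up for illustration rather than copied from the label file pictured above.

```python
def parse_labels(text, img_w, img_h):
    """Parse YOLO label text into (cls, x1, y1, x2, y2) pixel boxes."""
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
        cls = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        # Undo the normalization and convert center/size to corners
        boxes.append((
            cls,
            (xc - w / 2) * img_w,
            (yc - h / 2) * img_h,
            (xc + w / 2) * img_w,
            (yc + h / 2) * img_h,
        ))
    return boxes


# One person (class 0) and one tie (class 27), made-up coordinates
sample = "0 0.5 0.5 0.5 0.5\n27 0.25 0.75 0.125 0.25\n"
boxes = parse_labels(sample, img_w=640, img_h=480)
```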
|
||||
|
||||
When using the Ultralytics YOLO format, organize your training and validation images and labels as shown in the example below.
|
||||
|
||||
<p align="center"><img width="700" src="https://user-images.githubusercontent.com/26833433/134436012-65111ad1-9541-4853-81a6-f19a3468b75f.png" alt="Example dataset directory structure"></p>
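The pairing between images and label files can be expressed as a path rule. The sketch below assumes the layout pictured above uses a `labels` directory parallel to `images`, with each label file sharing its image's file stem; the helper `image_to_label_path` and the example path are illustrative, not taken from the library.

```python
from pathlib import PurePosixPath


def image_to_label_path(image_path):
    """Map .../images/<split>/<name>.jpg to .../labels/<split>/<name>.txt."""
    parts = list(PurePosixPath(image_path).parts)
    if "images" not in parts:
        raise ValueError(f"no 'images' directory in {image_path!r}")
    # Swap the last 'images' component for 'labels'
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return str(PurePosixPath(*parts).with_suffix(".txt"))


label = image_to_label_path("../datasets/coco8/images/train/000000000009.jpg")
print(label)  # ../datasets/coco8/labels/train/000000000009.txt
```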
|
||||
|
||||
## Usage
|
||||
|
||||
Here's how you can use these formats to train your model:
|
||||
|
||||
!!! example ""

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
        ```
|
||||
|
||||
## Supported Datasets
|
||||
|
||||
Here is a list of the supported datasets and a brief description for each:
|
||||
|
||||
- [**Argoverse**](./argoverse.md): A collection of sensor data collected from autonomous vehicles. It contains 3D tracking annotations for car objects.
|
||||
- [**COCO**](./coco.md): Common Objects in Context (COCO) is a large-scale object detection, segmentation, and captioning dataset with 80 object categories.
|
||||
- [**COCO8**](./coco8.md): A smaller subset of the COCO dataset, COCO8 is more lightweight and faster to train.
|
||||
- [**GlobalWheat2020**](./globalwheat2020.md): A dataset containing images of wheat heads for the Global Wheat Challenge 2020.
|
||||
- [**Objects365**](./objects365.md): A large-scale object detection dataset with 365 object categories and 600k images, aimed at advancing object detection research.
|
||||
- [**OpenImagesV7**](./open-images-v7.md): A comprehensive dataset by Google with 1.7M train images and 42k validation images.
|
||||
- [**SKU-110K**](./sku-110k.md): A dataset containing images of densely packed retail products, intended for retail environment object detection.
|
||||
- [**VisDrone**](./visdrone.md): A dataset focusing on drone-based images, containing various object categories like cars, pedestrians, and cyclists.
|
||||
- [**VOC**](./voc.md): PASCAL VOC is a popular object detection dataset with 20 object categories including vehicles, animals, and furniture.
|
||||
- [**xView**](./xview.md): A dataset containing high-resolution satellite imagery, designed for the detection of various object classes in overhead views.
|
||||
|
||||
### Adding your own dataset
|
||||
|
||||
If you have your own dataset and would like to use it for training detection models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
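As a starting point for a custom dataset, the configuration file can be generated from a list of class names. This is a minimal sketch under stated assumptions: `build_dataset_yaml` is a hypothetical helper, the dataset root and class name are placeholders, and the output only mirrors the keys shown in the example under "Ultralytics YOLO format". Class names containing characters that are special in YAML would need quoting, which the sketch does not handle.

```python
def build_dataset_yaml(root, train, val, names):
    """Render a dataset config with path, train, val, and a names mapping."""
    lines = [
        f"path: {root}  # dataset root dir",
        f"train: {train}  # train images (relative to 'path')",
        f"val: {val}  # val images (relative to 'path')",
        "",
        "names:",
    ]
    # Class ids are zero-indexed and follow the order of the list
    lines += [f"  {i}: {name}" for i, name in enumerate(names)]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml(
    "../datasets/my_dataset",  # placeholder root directory
    "images/train",
    "images/val",
    ["wheat_head"],
)
print(yaml_text)
```

Saving the returned text as, say, `my_dataset.yaml` and passing that file as the `data` argument follows the same pattern as the training examples on this page.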
|
||||
|
||||
## Port or Convert Label Formats
|
||||
|
||||
### COCO Dataset Format to YOLO Format
|
||||
|
||||
You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:
|
||||
|
||||
!!! example ""

    === "Python"

        ```python
        from ultralytics.data.converter import convert_coco

        convert_coco(labels_dir='path/to/coco/annotations/')
        ```
|
||||
|
||||
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
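For intuition about what such a conversion involves, here is a simplified sketch of the box arithmetic. It is not the `convert_coco` implementation. It relies on the COCO convention that `bbox` is `[x_min, y_min, width, height]` in pixels, and it assumes category ids are contiguous and 1-based so that subtracting 1 yields a zero-indexed class; real COCO ids have gaps and need an explicit mapping. The function name `coco_to_yolo_rows` is hypothetical.

```python
from collections import defaultdict


def coco_to_yolo_rows(coco):
    """Group COCO-style annotations into YOLO label rows keyed by file name."""
    images = {img["id"]: img for img in coco["images"]}
    rows = defaultdict(list)
    for ann in coco["annotations"]:
        img = images[ann["image_id"]]
        x_min, y_min, w, h = ann["bbox"]  # top-left corner plus size, in pixels
        if w <= 0 or h <= 0:
            continue  # skip degenerate boxes
        # Shift from top-left corner to box center, then normalize
        xc = (x_min + w / 2) / img["width"]
        yc = (y_min + h / 2) / img["height"]
        cls = ann["category_id"] - 1  # assumption: contiguous 1-based ids
        rows[img["file_name"]].append(
            f"{cls} {xc:.6f} {yc:.6f} {w / img['width']:.6f} {h / img['height']:.6f}"
        )
    return dict(rows)


# Tiny in-memory stand-in for a COCO annotation file
coco = {
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "annotations": [{"image_id": 1, "category_id": 1, "bbox": [160, 120, 320, 240]}],
}
rows = coco_to_yolo_rows(coco)
```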
|
||||
|
||||
Remember to double-check if the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful object detection models.
|
||||
92
docs/en/datasets/detect/objects365.md
Normal file
|
|
@ -0,0 +1,92 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover the Objects365 dataset, a wide-scale, high-quality resource for object detection research. Learn to use it with the Ultralytics YOLO model.
|
||||
keywords: Objects365, object detection, Ultralytics, dataset, YOLO, bounding boxes, annotations, computer vision, deep learning, training models
|
||||
---
|
||||
|
||||
# Objects365 Dataset
|
||||
|
||||
The [Objects365](https://www.objects365.org/) dataset is a large-scale, high-quality dataset designed to foster object detection research with a focus on diverse objects in the wild. Created by a team of [Megvii](https://en.megvii.com/) researchers, the dataset offers a wide range of high-resolution images with a comprehensive set of annotated bounding boxes covering 365 object categories.
|
||||
|
||||
## Key Features
|
||||
|
||||
- Objects365 contains 365 object categories, with 2 million images and over 30 million bounding boxes.
|
||||
- The dataset includes diverse objects in various scenarios, providing a rich and challenging benchmark for object detection tasks.
|
||||
- Annotations include bounding boxes for objects, making it suitable for training and evaluating object detection models.
|
||||
- Objects365 pre-trained models significantly outperform ImageNet pre-trained models, leading to better generalization on various tasks.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The Objects365 dataset is organized into a single set of images with corresponding annotations:
|
||||
|
||||
- **Images**: The dataset includes 2 million high-resolution images, each containing a variety of objects across 365 categories.
|
||||
- **Annotations**: The images are annotated with over 30 million bounding boxes, providing comprehensive ground truth information for object detection tasks.
|
||||
|
||||
## Applications
|
||||
|
||||
The Objects365 dataset is widely used for training and evaluating deep learning models in object detection tasks. The dataset's diverse set of object categories and high-quality annotations make it a valuable resource for researchers and practitioners in the field of computer vision.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file defines the dataset configuration, including the dataset's paths, classes, and other relevant settings. For the Objects365 dataset, the `Objects365.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Objects365.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Objects365.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/Objects365.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/Objects365.yaml"
    ```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the Objects365 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO('yolov8n.pt')  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data='Objects365.yaml', epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo detect train data=Objects365.yaml model=yolov8n.pt epochs=100 imgsz=640
        ```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
The Objects365 dataset contains a diverse set of high-resolution images with objects from 365 categories, providing rich context for object detection tasks. Here are some examples of the images in the dataset:
|
||||
|
||||

|
||||
|
||||
- **Objects365**: This image demonstrates an example of object detection, where objects are annotated with bounding boxes. The dataset provides a wide range of images to facilitate the development of models for this task.
|
||||
|
||||
The example showcases the variety and complexity of the data in the Objects365 dataset and highlights the importance of accurate object detection for computer vision applications.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the Objects365 dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""

    === "BibTeX"

        ```bibtex
        @inproceedings{shao2019objects365,
          title={Objects365: A Large-scale, High-quality Dataset for Object Detection},
          author={Shao, Shuai and Li, Zeming and Zhang, Tianyuan and Peng, Chao and Yu, Gang and Li, Jing and Zhang, Xiangyu and Sun, Jian},
          booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
          pages={8425--8434},
          year={2019}
        }
        ```
|
||||
|
||||
We would like to acknowledge the team of researchers who created and maintain the Objects365 dataset as a valuable resource for the computer vision research community. For more information about the Objects365 dataset and its creators, visit the [Objects365 dataset website](https://www.objects365.org/).
|
||||
110
docs/en/datasets/detect/open-images-v7.md
Normal file
|
|
@ -0,0 +1,110 @@
|
|||
---
|
||||
comments: true
|
||||
description: Dive into Google's Open Images V7, a comprehensive dataset offering a broad scope for computer vision research. Understand its usage with deep learning models.
|
||||
keywords: Open Images V7, object detection, segmentation masks, visual relationships, localized narratives, computer vision, deep learning, annotations, bounding boxes
|
||||
---
|
||||
|
||||
# Open Images V7 Dataset
|
||||
|
||||
[Open Images V7](https://storage.googleapis.com/openimages/web/index.html) is a versatile and expansive dataset championed by Google. Aimed at propelling research in the realm of computer vision, it boasts a vast collection of images annotated with a plethora of data, including image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- Encompasses ~9M images annotated in various ways to suit multiple computer vision tasks.
|
||||
- Houses a staggering 16M bounding boxes across 600 object classes in 1.9M images. These boxes are primarily hand-drawn by experts ensuring high precision.
|
||||
- Visual relationship annotations totaling 3.3M are available, detailing 1,466 unique relationship triplets, object properties, and human activities.
|
||||
- V5 introduced segmentation masks for 2.8M objects across 350 classes.
|
||||
- V6 introduced 675k localized narratives that amalgamate voice, text, and mouse traces highlighting described objects.
|
||||
- V7 introduced 66.4M point-level labels on 1.4M images, spanning 5,827 classes.
|
||||
- Encompasses 61.4M image-level labels across a diverse set of 20,638 classes.
|
||||
- Provides a unified platform for image classification, object detection, relationship detection, instance segmentation, and multimodal image descriptions.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
Open Images V7 is structured in multiple components catering to varied computer vision challenges:
|
||||
|
||||
- **Images**: About 9 million images, often showcasing intricate scenes with an average of 8.3 objects per image.
|
||||
- **Bounding Boxes**: Over 16 million boxes that demarcate objects across 600 categories.
|
||||
- **Segmentation Masks**: These detail the exact boundary of 2.8M objects across 350 classes.
|
||||
- **Visual Relationships**: 3.3M annotations indicating object relationships, properties, and actions.
|
||||
- **Localized Narratives**: 675k descriptions combining voice, text, and mouse traces.
|
||||
- **Point-Level Labels**: 66.4M labels across 1.4M images, suitable for zero/few-shot semantic segmentation.
|
||||
|
||||
## Applications
|
||||
|
||||
Open Images V7 is a cornerstone for training and evaluating state-of-the-art models in various computer vision tasks. The dataset's broad scope and high-quality annotations make it indispensable for researchers and developers specializing in computer vision.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
Datasets typically come with a YAML (YAML Ain't Markup Language) file that defines the dataset's configuration. For Open Images V7, the configuration is `open-images-v7.yaml`, the file passed as `data=open-images-v7.yaml` in the training commands in the Usage section; the snippet below is included from `ultralytics/cfg/datasets/open-images-v7.yaml`. For accurate paths and configurations, refer to the dataset's official repository or documentation.
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/open-images-v7.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/open-images-v7.yaml"
    ```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the Open Images V7 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! warning

    The complete Open Images V7 dataset comprises 1,743,042 training images and 41,620 validation images, requiring approximately **561 GB of storage space** upon download.

    Executing the commands provided below will trigger an automatic download of the full dataset if it's not already present locally. Before running the below example it's crucial to:

    - Verify that your device has enough storage capacity.
    - Ensure a robust and speedy internet connection.
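
The storage check can be automated before starting a download. The following is a small sketch using only the Python standard library; the 561 GB threshold is the figure quoted in the warning, while the function name `has_enough_space` and the choice of the current directory as the download location are assumptions to adapt to your setup.

```python
import shutil

REQUIRED_GB = 561  # approximate dataset size quoted in the warning


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    # disk_usage reports bytes; 1e9 converts to decimal gigabytes
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= required_gb


if not has_enough_space():
    print(f"Less than {REQUIRED_GB} GB free; the full download may not fit.")
```

Point `path` at the directory where your datasets are actually stored, since free space can differ between drives.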
|
||||
|
||||
!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a COCO-pretrained YOLOv8n model
        model = YOLO('yolov8n.pt')

        # Train the model on the Open Images V7 dataset
        results = model.train(data='open-images-v7.yaml', epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Train a COCO-pretrained YOLOv8n model on the Open Images V7 dataset
        yolo detect train data=open-images-v7.yaml model=yolov8n.pt epochs=100 imgsz=640
        ```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
Illustrations of the dataset help provide insights into its richness:
|
||||
|
||||

|
||||
|
||||
- **Open Images V7**: This image exemplifies the depth and detail of annotations available, including bounding boxes, relationships, and segmentation masks.
|
||||
|
||||
Researchers can gain invaluable insights into the array of computer vision challenges that the dataset addresses, from basic object detection to intricate relationship identification.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
For those employing Open Images V7 in their work, it's prudent to cite the relevant papers and acknowledge the creators:
|
||||
|
||||
!!! note ""

    === "BibTeX"

        ```bibtex
        @article{OpenImages,
          author = {Alina Kuznetsova and Hassan Rom and Neil Alldrin and Jasper Uijlings and Ivan Krasin and Jordi Pont-Tuset and Shahab Kamali and Stefan Popov and Matteo Malloci and Alexander Kolesnikov and Tom Duerig and Vittorio Ferrari},
          title = {The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale},
          year = {2020},
          journal = {IJCV}
        }
        ```
|
||||
|
||||
A heartfelt acknowledgment goes out to the Google AI team for creating and maintaining the Open Images V7 dataset. For a deep dive into the dataset and its offerings, navigate to the [official Open Images V7 website](https://storage.googleapis.com/openimages/web/index.html).
|
||||
93
docs/en/datasets/detect/sku-110k.md
Normal file
|
|
@ -0,0 +1,93 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the SKU-110k dataset of densely packed retail shelf images for object detection research. Learn how to use it with Ultralytics.
|
||||
keywords: SKU-110k dataset, object detection, retail shelf images, Ultralytics, YOLO, computer vision, deep learning models
|
||||
---
|
||||
|
||||
# SKU-110k Dataset
|
||||
|
||||
The [SKU-110k](https://github.com/eg4000/SKU110K_CVPR19) dataset is a collection of densely packed retail shelf images, designed to support research in object detection tasks. Developed by Eran Goldman et al., the dataset contains over 110,000 unique store keeping unit (SKU) categories with densely packed objects, often looking similar or even identical, positioned in close proximity.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- SKU-110k contains images of store shelves from around the world, featuring densely packed objects that pose challenges for state-of-the-art object detectors.
|
||||
- The dataset includes over 110,000 unique SKU categories, providing a diverse range of object appearances.
|
||||
- Annotations include bounding boxes for objects and SKU category labels.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The SKU-110k dataset is organized into three main subsets:
|
||||
|
||||
1. **Training set**: This subset contains images and annotations used for training object detection models.
|
||||
2. **Validation set**: This subset consists of images and annotations used for model validation during training.
|
||||
3. **Test set**: This subset is designed for the final evaluation of trained object detection models.
|
||||
|
||||
## Applications
|
||||
|
||||
The SKU-110k dataset is widely used for training and evaluating deep learning models in object detection tasks, especially in densely packed scenes such as retail shelf displays. The dataset's diverse set of SKU categories and densely packed object arrangements make it a valuable resource for researchers and practitioners in the field of computer vision.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. For the SKU-110K dataset, the `SKU-110K.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/SKU-110K.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/SKU-110K.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/SKU-110K.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/SKU-110K.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the SKU-110K dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='SKU-110K.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo detect train data=SKU-110K.yaml model=yolov8n.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
The SKU-110k dataset contains a diverse set of retail shelf images with densely packed objects, providing rich context for object detection tasks. Here are some examples of data from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Densely packed retail shelf image**: This image demonstrates an example of densely packed objects in a retail shelf setting. Objects are annotated with bounding boxes and SKU category labels.
|
||||
|
||||
The example showcases the variety and complexity of the data in the SKU-110k dataset and highlights the importance of high-quality data for object detection tasks.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the SKU-110k dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@inproceedings{goldman2019dense,
|
||||
author = {Eran Goldman and Roei Herzig and Aviv Eisenschtat and Jacob Goldberger and Tal Hassner},
|
||||
title = {Precise Detection in Densely Packed Scenes},
|
||||
booktitle = {Proc. Conf. Comput. Vision Pattern Recognition (CVPR)},
|
||||
year = {2019}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge Eran Goldman et al. for creating and maintaining the SKU-110k dataset as a valuable resource for the computer vision research community. For more information about the SKU-110k dataset and its creators, visit the [SKU-110k dataset GitHub repository](https://github.com/eg4000/SKU110K_CVPR19).
|
||||
92
docs/en/datasets/detect/visdrone.md
Normal file
|
|
@ -0,0 +1,92 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the VisDrone Dataset, a large-scale benchmark for drone-based image analysis, and learn how to train a YOLO model using it.
|
||||
keywords: VisDrone Dataset, Ultralytics, drone-based image analysis, YOLO model, object detection, object tracking, crowd counting
|
||||
---
|
||||
|
||||
# VisDrone Dataset
|
||||
|
||||
The [VisDrone Dataset](https://github.com/VisDrone/VisDrone-Dataset) is a large-scale benchmark created by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. It contains carefully annotated ground truth data for various computer vision tasks related to drone-based image and video analysis.
|
||||
|
||||
VisDrone is composed of 288 video clips with 261,908 frames and 10,209 static images, captured by various drone-mounted cameras. The dataset covers a wide range of aspects, including location (14 different cities across China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). The dataset was collected using various drone platforms under different scenarios and weather and lighting conditions. These frames are manually annotated with over 2.6 million bounding boxes of targets such as pedestrians, cars, bicycles, and tricycles. Attributes like scene visibility, object class, and occlusion are also provided for better data utilization.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The VisDrone dataset is organized into five main subsets, each focusing on a specific task:
|
||||
|
||||
1. **Task 1**: Object detection in images
|
||||
2. **Task 2**: Object detection in videos
|
||||
3. **Task 3**: Single-object tracking
|
||||
4. **Task 4**: Multi-object tracking
|
||||
5. **Task 5**: Crowd counting
|
||||
|
||||
## Applications
|
||||
|
||||
The VisDrone dataset is widely used for training and evaluating deep learning models in drone-based computer vision tasks such as object detection, object tracking, and crowd counting. The dataset's diverse set of sensor data, object annotations, and attributes make it a valuable resource for researchers and practitioners in the field of drone-based computer vision.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the VisDrone dataset, the `VisDrone.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/VisDrone.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/VisDrone.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the VisDrone dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='VisDrone.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo detect train data=VisDrone.yaml model=yolov8n.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
The VisDrone dataset contains a diverse set of images and videos captured by drone-mounted cameras. Here are some examples of data from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Task 1**: Object detection in images - This image demonstrates an example of object detection in images, where objects are annotated with bounding boxes. The dataset provides a wide variety of images taken from different locations, environments, and densities to facilitate the development of models for this task.
|
||||
|
||||
The example showcases the variety and complexity of the data in the VisDrone dataset and highlights the importance of high-quality sensor data for drone-based computer vision tasks.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the VisDrone dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@ARTICLE{9573394,
|
||||
author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
|
||||
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
|
||||
title={Detection and Tracking Meet Drones Challenge},
|
||||
year={2021},
|
||||
volume={},
|
||||
number={},
|
||||
pages={1-1},
|
||||
doi={10.1109/TPAMI.2021.3119563}}
|
||||
```
|
||||
|
||||
We would like to acknowledge the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China, for creating and maintaining the VisDrone dataset as a valuable resource for the drone-based computer vision research community. For more information about the VisDrone dataset and its creators, visit the [VisDrone Dataset GitHub repository](https://github.com/VisDrone/VisDrone-Dataset).
|
||||
95
docs/en/datasets/detect/voc.md
Normal file
|
|
@ -0,0 +1,95 @@
|
|||
---
|
||||
comments: true
|
||||
description: A complete guide to the PASCAL VOC dataset used for object detection, segmentation and classification tasks with relevance to YOLO model training.
|
||||
keywords: Ultralytics, PASCAL VOC dataset, object detection, segmentation, image classification, YOLO, model training, VOC.yaml, deep learning
|
||||
---
|
||||
|
||||
# VOC Dataset
|
||||
|
||||
The [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/) (Visual Object Classes) dataset is a well-known object detection, segmentation, and classification dataset. It is designed to encourage research on a wide variety of object categories and is commonly used for benchmarking computer vision models. It is an essential dataset for researchers and developers working on object detection, segmentation, and classification tasks.
|
||||
|
||||
## Key Features
|
||||
|
||||
- The VOC dataset includes two main challenges: VOC2007 and VOC2012.
|
||||
- The dataset comprises 20 object categories, including common objects like cars, bicycles, and animals, as well as more specific categories such as boats, sofas, and dining tables.
|
||||
- Annotations include object bounding boxes and class labels for object detection and classification tasks, and segmentation masks for the segmentation tasks.
|
||||
- VOC provides standardized evaluation metrics like mean Average Precision (mAP) for object detection and classification, making it suitable for comparing model performance.
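
The mean Average Precision mentioned above is the mean of a per-class Average Precision (AP). The following is a minimal sketch of an all-point interpolated AP for a single class, assuming each detection has already been matched against the ground truth (the matching step, typically an IoU threshold of 0.5, is omitted). The function name and arguments are illustrative and are not part of the VOC development kit or the Ultralytics API.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for one class (all-point interpolation)."""
    # Rank detections from most to least confident
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))

    # Make precision non-increasing as recall grows (the precision envelope)
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum rectangle areas between consecutive recall values
    ap, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


# Three detections for a class with two ground-truth objects
ap = average_precision([0.9, 0.8, 0.7], [True, False, True], num_ground_truth=2)
print(round(ap, 4))
```

Averaging this value over all 20 classes gives the mAP. Note that VOC2007 originally used an 11-point interpolation instead, so numbers computed this way are not directly comparable with VOC2007-style results.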
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The VOC dataset is split into three subsets:
|
||||
|
||||
1. **Train**: This subset contains images for training object detection, segmentation, and classification models.
|
||||
2. **Validation**: This subset has images used for validation purposes during model training.
|
||||
3. **Test**: This subset consists of images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [PASCAL VOC evaluation server](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php) for performance evaluation.
|
||||
|
||||
## Applications
|
||||
|
||||
The VOC dataset is widely used for training and evaluating deep learning models in object detection (such as YOLO, Faster R-CNN, and SSD), instance segmentation (such as Mask R-CNN), and image classification. The dataset's diverse set of object categories, large number of annotated images, and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the VOC dataset, the `VOC.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VOC.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VOC.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/VOC.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/VOC.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n model on the VOC dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='VOC.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo detect train data=VOC.yaml model=yolov8n.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
The VOC dataset contains a diverse set of images with various object categories and complex scenes. Here are some examples of images from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the VOC dataset and the benefits of using mosaicing during the training process.
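
To illustrate what mosaicing does to the labels, here is a deliberately simplified sketch that places four equally sized tiles in a fixed 2x2 grid and shifts their bounding boxes into the coordinate frame of the combined image. This is an assumption-laden illustration, not the Ultralytics implementation: real mosaic augmentation typically also randomizes the mosaic center, rescales, and crops the tiles. The function name and box layout are hypothetical.

```python
def mosaic_boxes(tiles, tile_size):
    """Shift the boxes of four tiles into a 2x2 mosaic of side 2 * tile_size.

    tiles: list of four lists of (class_id, x_min, y_min, x_max, y_max) in pixels,
           ordered top-left, top-right, bottom-left, bottom-right.
    """
    # Pixel offset of each tile's top-left corner inside the mosaic
    offsets = [(0, 0), (tile_size, 0), (0, tile_size), (tile_size, tile_size)]

    merged = []
    for boxes, (dx, dy) in zip(tiles, offsets):
        for class_id, x_min, y_min, x_max, y_max in boxes:
            merged.append((class_id, x_min + dx, y_min + dy, x_max + dx, y_max + dy))
    return merged


tiles = [
    [(0, 10, 10, 50, 50)],   # top-left tile
    [(1, 0, 0, 20, 20)],     # top-right tile
    [],                      # bottom-left tile has no objects
    [(2, 5, 5, 15, 15)],     # bottom-right tile
]
for box in mosaic_boxes(tiles, tile_size=320):
    print(box)
```

Because each training sample now contains objects from four source images, a single batch exposes the model to more object scales and contexts than the same number of unmodified images would.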
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the VOC dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{everingham2010pascal,
|
||||
title={The PASCAL Visual Object Classes (VOC) Challenge},
|
||||
author={Mark Everingham and Luc Van Gool and Christopher K. I. Williams and John Winn and Andrew Zisserman},
|
||||
year={2010},
|
||||
eprint={0909.5206},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge the PASCAL VOC Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the VOC dataset and its creators, visit the [PASCAL VOC dataset website](http://host.robots.ox.ac.uk/pascal/VOC/).
|
||||
97
docs/en/datasets/detect/xview.md
Normal file
|
|
@ -0,0 +1,97 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore xView, a large-scale, high resolution satellite imagery dataset for object detection. Dive into dataset structure, usage examples & its potential applications.
|
||||
keywords: Ultralytics, YOLO, computer vision, xView dataset, satellite imagery, object detection, overhead imagery, training, deep learning, dataset YAML
|
||||
---
|
||||
|
||||
# xView Dataset
|
||||
|
||||
The [xView](http://xviewdataset.org/) dataset is one of the largest publicly available datasets of overhead imagery, containing images from complex scenes around the world annotated using bounding boxes. The goal of the xView dataset is to accelerate progress in four computer vision frontiers:
|
||||
|
||||
1. Reduce minimum resolution for detection.
|
||||
2. Improve learning efficiency.
|
||||
3. Enable discovery of more object classes.
|
||||
4. Improve detection of fine-grained classes.
|
||||
|
||||
xView builds on the success of challenges like Common Objects in Context (COCO) and aims to leverage computer vision to analyze the growing amount of available imagery from space in order to understand the visual world in new ways and address a range of important applications.
|
||||
|
||||
## Key Features
|
||||
|
||||
- xView contains over 1 million object instances across 60 classes.
|
||||
- The dataset has a resolution of 0.3 meters, providing higher resolution imagery than most public satellite imagery datasets.
|
||||
- xView features a diverse collection of small, rare, fine-grained, and multi-type objects with bounding box annotation.
|
||||
- Comes with a pre-trained baseline model using the TensorFlow object detection API and an example for PyTorch.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The xView dataset is composed of satellite images collected from WorldView-3 satellites at a 0.3m ground sample distance. It contains over 1 million objects across 60 classes in over 1,400 km² of imagery.
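
To make the 0.3 m ground sample distance (GSD) concrete, the short sketch below converts pixel measurements into approximate ground distances. It assumes square pixels and a uniform GSD across the image, which is a simplification; the helper names are illustrative only.

```python
GSD_METERS = 0.3  # ground sample distance stated for xView (metres per pixel)


def tile_footprint_m(tile_pixels, gsd=GSD_METERS):
    """Approximate side length, in metres, of the ground area covered by a square tile."""
    return tile_pixels * gsd


def box_size_m(width_px, height_px, gsd=GSD_METERS):
    """Approximate ground width and height, in metres, of a bounding box given in pixels."""
    return width_px * gsd, height_px * gsd


# Ground coverage of a 640-pixel training tile, and of a 20 x 10 pixel object
print(tile_footprint_m(640))
print(box_size_m(20, 10))
```

Under these assumptions an object only a few metres long spans roughly ten to twenty pixels, which is one reason small-object detection is a central challenge on this dataset.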
|
||||
|
||||
## Applications
|
||||
|
||||
The xView dataset is widely used for training and evaluating deep learning models for object detection in overhead imagery. The dataset's diverse set of object classes and high-resolution imagery make it a valuable resource for researchers and practitioners in the field of computer vision, especially for satellite imagery analysis.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the xView dataset, the `xView.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/xView.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/xView.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/xView.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/xView.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a model on the xView dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='xView.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo detect train data=xView.yaml model=yolov8n.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
The xView dataset contains high-resolution satellite images with a diverse set of objects annotated using bounding boxes. Here are some examples of data from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Overhead Imagery**: This image demonstrates an example of object detection in overhead imagery, where objects are annotated with bounding boxes. The dataset provides high-resolution satellite images to facilitate the development of models for this task.
|
||||
|
||||
The example showcases the variety and complexity of the data in the xView dataset and highlights the importance of high-quality satellite imagery for object detection tasks.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the xView dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{lam2018xview,
|
||||
title={xView: Objects in Context in Overhead Imagery},
|
||||
author={Darius Lam and Richard Kuzma and Kevin McGee and Samuel Dooley and Michael Laielli and Matthew Klaric and Yaroslav Bulatov and Brendan McCord},
|
||||
year={2018},
|
||||
eprint={1802.07856},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge the [Defense Innovation Unit](https://www.diu.mil/) (DIU) and the creators of the xView dataset for their valuable contribution to the computer vision research community. For more information about the xView dataset and its creators, visit the [xView dataset website](http://xviewdataset.org/).
|
||||
123
docs/en/datasets/index.md
Normal file
|
|
@ -0,0 +1,123 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore various computer vision datasets supported by Ultralytics for object detection, segmentation, pose estimation, image classification, and multi-object tracking.
|
||||
keywords: computer vision, datasets, Ultralytics, YOLO, object detection, instance segmentation, pose estimation, image classification, multi-object tracking
|
||||
---
|
||||
|
||||
# Datasets Overview
|
||||
|
||||
Ultralytics provides support for various datasets to facilitate computer vision tasks such as detection, instance segmentation, pose estimation, classification, and multi-object tracking. Below is a list of the main Ultralytics datasets, followed by a summary of each computer vision task and the respective datasets.
|
||||
|
||||
## [Detection Datasets](detect/index.md)
|
||||
|
||||
Bounding box object detection is a computer vision technique that involves detecting and localizing objects in an image by drawing a bounding box around each object.
|
||||
|
||||
- [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations.
|
||||
- [COCO](detect/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning with over 200K labeled images.
|
||||
- [COCO8](detect/coco8.md): Contains the first 4 images from COCO train and COCO val, suitable for quick tests.
|
||||
- [Global Wheat 2020](detect/globalwheat2020.md): A dataset of wheat head images collected from around the world for object detection and localization tasks.
|
||||
- [Objects365](detect/objects365.md): A high-quality, large-scale dataset for object detection with 365 object categories and over 600K annotated images.
|
||||
- [OpenImagesV7](detect/open-images-v7.md): A comprehensive dataset by Google with 1.7M train images and 42k validation images.
|
||||
- [SKU-110K](detect/sku-110k.md): A dataset featuring dense object detection in retail environments with over 11K images and 1.7 million bounding boxes.
|
||||
- [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.
|
||||
- [VOC](detect/voc.md): The Pascal Visual Object Classes (VOC) dataset for object detection and segmentation with 20 object classes and over 11K images.
|
||||
- [xView](detect/xview.md): A dataset for object detection in overhead imagery with 60 object categories and over 1 million annotated objects.
|
||||
|
||||
## [Instance Segmentation Datasets](segment/index.md)
|
||||
|
||||
Instance segmentation is a computer vision technique that involves identifying and localizing objects in an image at the pixel level.
|
||||
|
||||
- [COCO](segment/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
|
||||
- [COCO8-seg](segment/coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.
|
||||
|
||||
## [Pose Estimation](pose/index.md)
|
||||
|
||||
Pose estimation is a technique used to determine the pose of the object relative to the camera or the world coordinate system.
|
||||
|
||||
- [COCO](pose/coco.md): A large-scale dataset with human pose annotations designed for pose estimation tasks.
|
||||
- [COCO8-pose](pose/coco8-pose.md): A smaller dataset for pose estimation tasks, containing a subset of 8 COCO images with human pose annotations.
|
||||
- [Tiger-pose](pose/tiger-pose.md): A compact dataset consisting of 263 images focused on tigers, annotated with 12 keypoints per tiger for pose estimation tasks.
|
||||
|
||||
## [Classification](classify/index.md)
|
||||
|
||||
Image classification is a computer vision task that involves categorizing an image into one or more predefined classes or categories based on its visual content.
|
||||
|
||||
- [Caltech 101](classify/caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
|
||||
- [Caltech 256](classify/caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
|
||||
- [CIFAR-10](classify/cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
|
||||
- [CIFAR-100](classify/cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
|
||||
- [Fashion-MNIST](classify/fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
|
||||
- [ImageNet](classify/imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
|
||||
- [ImageNet-10](classify/imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
|
||||
- [Imagenette](classify/imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
|
||||
- [Imagewoof](classify/imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
|
||||
- [MNIST](classify/mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
|
||||
|
||||
## [Oriented Bounding Boxes (OBB)](obb/index.md)
|
||||
|
||||
Oriented Bounding Boxes (OBB) is a method in computer vision for detecting angled objects in images using rotated bounding boxes, often applied to aerial and satellite imagery.
|
||||
|
||||
- [DOTAv2](obb/dota-v2.md): A popular OBB aerial imagery dataset with 1.7 million instances and 11,268 images.
|
||||
|
||||
## [Multi-Object Tracking](track/index.md)
|
||||
|
||||
Multi-object tracking is a computer vision technique that involves detecting and tracking multiple objects over time in a video sequence.
|
||||
|
||||
- [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations for multi-object tracking tasks.
|
||||
- [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.
|
||||
|
||||
## Contribute New Datasets
|
||||
|
||||
Contributing a new dataset involves several steps to ensure that it aligns well with the existing infrastructure. Below are the necessary steps:
|
||||
|
||||
### Steps to Contribute a New Dataset
|
||||
|
||||
1. **Collect Images**: Gather the images that belong to the dataset. These could be collected from various sources, such as public databases or your own collection.
|
||||
|
||||
2. **Annotate Images**: Annotate these images with bounding boxes, segments, or keypoints, depending on the task.
|
||||
|
||||
3. **Export Annotations**: Convert these annotations into the YOLO `*.txt` file format which Ultralytics supports.
|
||||
|
||||
4. **Organize Dataset**: Arrange your dataset into the correct folder structure. You should have `train/` and `val/` top-level directories, and within each, an `images/` and `labels/` sub-directory.
|
||||
|
||||
```
|
||||
dataset/
├── train/
│   ├── images/
│   └── labels/
└── val/
    ├── images/
    └── labels/
|
||||
```
|
||||
|
||||
5. **Create a `data.yaml` File**: In your dataset's root directory, create a `data.yaml` file that describes the dataset, classes, and other necessary information.
|
||||
|
||||
6. **Optimize Images (Optional)**: If you want to reduce the size of the dataset for more efficient processing, you can optimize the images using the code below. This is not required, but recommended for smaller dataset sizes and faster download speeds.
|
||||
|
||||
7. **Zip Dataset**: Compress the entire dataset folder into a zip file.
|
||||
|
||||
8. **Document and PR**: Create a documentation page describing your dataset and how it fits into the existing framework. After that, submit a Pull Request (PR). Refer to [Ultralytics Contribution Guidelines](https://docs.ultralytics.com/help/contributing) for more details on how to submit a PR.
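
For step 3 above, a YOLO detection label file holds one line per object in the form `class x_center y_center width height`, with coordinates normalized to the range 0 to 1 by the image width and height. The sketch below converts a pixel-space box into such a line. The function name and the `(x_min, y_min, x_max, y_max)` input layout are assumptions made for illustration; adapt them to whatever your annotation tool exports.

```python
def to_yolo_line(class_id, box_xyxy, image_width, image_height):
    """Convert a pixel-space (x_min, y_min, x_max, y_max) box into a YOLO label line."""
    x_min, y_min, x_max, y_max = box_xyxy

    # Box center and size, normalized by the image dimensions
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height

    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# One object of class 0 in a 640 x 480 image
line = to_yolo_line(0, (100, 200, 300, 400), image_width=640, image_height=480)
print(line)
```

Each image gets a label file with the same base name (for example `0001.jpg` and `0001.txt`), containing one such line per annotated object.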
|
||||
|
||||
### Example Code to Optimize and Zip a Dataset
|
||||
|
||||
!!! example "Optimize and Zip a Dataset"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from pathlib import Path
|
||||
from ultralytics.data.utils import compress_one_image
|
||||
from ultralytics.utils.downloads import zip_directory
|
||||
|
||||
# Define dataset directory
|
||||
path = Path('path/to/dataset')
|
||||
|
||||
# Optimize images in dataset (optional)
|
||||
for f in path.rglob('*.jpg'):
    compress_one_image(f)
|
||||
|
||||
# Zip dataset into 'path/to/dataset.zip'
|
||||
zip_directory(path)
|
||||
```
|
||||
|
||||
By following these steps, you can contribute a new dataset that integrates well with Ultralytics' existing structure.
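
As a complement to steps 4 and 5, the following sketch creates the folder layout described above and writes a minimal `data.yaml`. It assumes the commonly used `path`, `train`, `val`, and `names` keys; the helper name and the example class names are hypothetical, and a real dataset may need additional fields.

```python
import tempfile
from pathlib import Path


def scaffold_dataset(root, class_names):
    """Create train/val image and label folders and a minimal data.yaml under root."""
    root = Path(root)
    for split in ("train", "val"):
        for sub in ("images", "labels"):
            (root / split / sub).mkdir(parents=True, exist_ok=True)

    lines = [
        f"path: {root.as_posix()}  # dataset root directory",
        "train: train/images  # training images, relative to 'path'",
        "val: val/images  # validation images, relative to 'path'",
        "names:",
    ]
    # One 'index: name' entry per class
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]

    yaml_path = root / "data.yaml"
    yaml_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return yaml_path


# Demonstration in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    yaml_file = scaffold_dataset(Path(tmp) / "dataset", ["cat", "dog"])
    print(yaml_file.read_text(encoding="utf-8"))
```

After copying images and label files into the created folders, the resulting `data.yaml` can be passed as the `data` argument when training.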
|
||||
129
docs/en/datasets/obb/dota-v2.md
Normal file
|
|
@ -0,0 +1,129 @@
|
|||
---
|
||||
comments: true
|
||||
description: Delve into DOTA v2, an Oriented Bounding Box (OBB) aerial imagery dataset with 1.7 million instances and 11,268 images.
|
||||
keywords: DOTA v2, object detection, aerial images, computer vision, deep learning, annotations, oriented bounding boxes, OBB
|
||||
---
|
||||
|
||||
# DOTA v2 Dataset with OBB
|
||||
|
||||
[DOTA v2](https://captain-whu.github.io/DOTA/index.html) stands as a specialized dataset, emphasizing object detection in aerial images. Originating from the DOTA series of datasets, it offers annotated images capturing a diverse array of aerial scenes with Oriented Bounding Boxes (OBB).
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- Collection from various sensors and platforms, with image sizes ranging from 800 × 800 to 20,000 × 20,000 pixels.
|
||||
- Features more than 1.7M Oriented Bounding Boxes across 18 categories.
|
||||
- Encompasses multiscale object detection.
|
||||
- Instances are annotated by experts using arbitrary (8 d.o.f.) quadrilaterals, capturing objects of different scales, orientations, and shapes.
|
||||
|
||||
## Dataset Versions
|
||||
|
||||
### DOTA-v1.0
|
||||
|
||||
- Contains 15 common categories.
|
||||
- Comprises 2,806 images with 188,282 instances.
|
||||
- Split ratios: 1/2 for training, 1/6 for validation, and 1/3 for testing.
|
||||
|
||||
### DOTA-v1.5
|
||||
|
||||
- Incorporates the same images as DOTA-v1.0.
|
||||
- Very small instances (less than 10 pixels) are also annotated.
|
||||
- Addition of a new category: "container crane".
|
||||
- A total of 403,318 instances.
|
||||
- Released for the DOAI Challenge 2019 on Object Detection in Aerial Images.
|
||||
|
||||
### DOTA-v2.0
|
||||
|
||||
- Collections from Google Earth, GF-2 Satellite, and other aerial images.
|
||||
- Contains 18 common categories.
|
||||
- Comprises 11,268 images with a whopping 1,793,658 instances.
|
||||
- New categories introduced: "airport" and "helipad".
|
||||
- Image splits:
|
||||
- Training: 1,830 images with 268,627 instances.
|
||||
- Validation: 593 images with 81,048 instances.
|
||||
- Test-dev: 2,792 images with 353,346 instances.
|
||||
- Test-challenge: 6,053 images with 1,090,637 instances.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
DOTA v2 exhibits a structured layout tailored for OBB object detection challenges:
|
||||
|
||||
- **Images**: A vast collection of high-resolution aerial images capturing diverse terrains and structures.
|
||||
- **Oriented Bounding Boxes**: Annotations in the form of rotated rectangles encapsulating objects irrespective of their orientation, ideal for capturing objects like airplanes, ships, and buildings.
|
||||
|
||||
## Applications
|
||||
|
||||
DOTA v2 serves as a benchmark for training and evaluating models specifically tailored for aerial image analysis. With the inclusion of OBB annotations, it provides a unique challenge, enabling the development of specialized object detection models that cater to aerial imagery's nuances.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
Typically, datasets incorporate a YAML (Yet Another Markup Language) file detailing the dataset's configuration. For DOTA v2, this is the `DOTAv2.yaml` file included below. For the authoritative paths and configurations, consult the dataset's official repository or documentation.
|
||||
|
||||
!!! example "DOTAv2.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/DOTAv2.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a model on the DOTA v2 dataset, you can utilize the following code snippets. Always refer to your model's documentation for a thorough list of available arguments.
|
||||
|
||||
!!! warning
|
||||
|
||||
Please note that all images and associated annotations in the DOTAv2 dataset can be used for academic purposes, but commercial use is prohibited. Your understanding and respect for the dataset creators' wishes are greatly appreciated!
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Create a new YOLOv8n-OBB model from scratch
|
||||
model = YOLO('yolov8n-obb.yaml')
|
||||
|
||||
# Train the model on the DOTAv2 dataset
|
||||
results = model.train(data='DOTAv2.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Train a new YOLOv8n-OBB model on the DOTAv2 dataset
|
||||
yolo obb train data=DOTAv2.yaml model=yolov8n-obb.yaml epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Data and Annotations
|
||||
|
||||
A glance at the dataset illustrates its depth:
|
||||
|
||||

|
||||
|
||||
- **DOTA v2**: This snapshot underlines the complexity of aerial scenes and the significance of Oriented Bounding Box annotations, capturing objects in their natural orientation.
|
||||
|
||||
The dataset's richness offers invaluable insights into object detection challenges exclusive to aerial imagery.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
For those leveraging DOTA v2 in their endeavors, it's pertinent to cite the relevant research papers:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@article{9560031,
|
||||
author={Ding, Jian and Xue, Nan and Xia, Gui-Song and Bai, Xiang and Yang, Wen and Yang, Michael and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
|
||||
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
|
||||
title={Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges},
|
||||
year={2021},
|
||||
volume={},
|
||||
number={},
|
||||
pages={1-1},
|
||||
doi={10.1109/TPAMI.2021.3117983}
|
||||
}
|
||||
```
|
||||
|
||||
A special note of gratitude to the team behind DOTA v2 for their commendable effort in curating this dataset. For an exhaustive understanding of the dataset and its nuances, please visit the [official DOTA v2 website](https://captain-whu.github.io/DOTA/index.html).
|
||||
84
docs/en/datasets/obb/index.md
Normal file
|
|
@ -0,0 +1,84 @@
|
|||
---
|
||||
comments: true
|
||||
description: Dive deep into various oriented bounding box (OBB) dataset formats compatible with Ultralytics YOLO models. Grasp the nuances of using and converting datasets to this format.
|
||||
keywords: Ultralytics, YOLO, oriented bounding boxes, OBB, dataset formats, label formats, DOTA v2, data conversion
|
||||
---
|
||||
|
||||
# Oriented Bounding Box (OBB) Datasets Overview
|
||||
|
||||
Training a precise object detection model with oriented bounding boxes (OBB) requires a thorough dataset. This guide explains the various OBB dataset formats compatible with Ultralytics YOLO models, offering insights into their structure, application, and methods for format conversions.
|
||||
|
||||
## Supported OBB Dataset Formats
|
||||
|
||||
### YOLO OBB Format
|
||||
|
||||
The YOLO OBB format designates bounding boxes by their four corner points with coordinates normalized between 0 and 1. It follows this format:
|
||||
|
||||
```bash
|
||||
class_index x1 y1 x2 y2 x3 y3 x4 y4
|
||||
```
|
||||
|
||||
Internally, YOLO processes losses and outputs in the `xywhr` format, which represents the bounding box's center point (xy), width, height, and rotation.
|
||||
|
||||
<p align="center"><img width="800" src="https://user-images.githubusercontent.com/26833433/259471881-59020fe2-09a4-4dcc-acce-9b0f7cfa40ee.png" alt="OBB format examples"></p>
|
||||
|
||||
An example of a `*.txt` label file for the above image, which contains an object of class `0` in OBB format, could look like:
|
||||
|
||||
```bash
|
||||
0 0.780811 0.743961 0.782371 0.74686 0.777691 0.752174 0.776131 0.749758
|
||||
```
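As a rough illustration of how the four-corner label relates to `xywhr`, the sketch below parses one label row and derives a center, width, height, and rotation from the corners. It assumes the corners describe a true rectangle listed in order around the box. The helper names `parse_obb_label` and `corners_to_xywhr` are hypothetical and are not part of the Ultralytics API; the library's internal conversion may differ, for instance in its angle convention.

```python
import math


def parse_obb_label(line, img_w, img_h):
    """Parse one YOLO OBB row into (class_index, corners) with corners in pixels."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError("expected a class index followed by 8 coordinates")
    class_index = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # Undo the 0-1 normalization: x scales by image width, y by image height.
    corners = [(coords[i] * img_w, coords[i + 1] * img_h) for i in range(0, 8, 2)]
    return class_index, corners


def corners_to_xywhr(corners):
    """Derive (cx, cy, w, h, r) from four ordered corners (assumed rectangular)."""
    (x1, y1), (x2, y2), (x3, y3), _ = corners
    cx = sum(x for x, _ in corners) / 4
    cy = sum(y for _, y in corners) / 4
    w = math.hypot(x2 - x1, y2 - y1)  # length of the first edge
    h = math.hypot(x3 - x2, y3 - y2)  # length of the adjacent edge
    r = math.atan2(y2 - y1, x2 - x1)  # angle of the first edge, in radians
    return cx, cy, w, h, r


cls, corners = parse_obb_label("0 0.25 0.25 0.75 0.25 0.75 0.5 0.25 0.5", 100, 100)
print(cls, corners_to_xywhr(corners))
```

Which edge counts as "width" depends on the corner ordering, so treat the returned `w`, `h`, and `r` as one possible parameterization rather than a canonical one.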
|
||||
|
||||
## Usage
|
||||
|
||||
To train a model using these OBB formats:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Create a new YOLOv8n-OBB model from scratch
|
||||
model = YOLO('yolov8n-obb.yaml')
|
||||
|
||||
# Train the model on the DOTAv2 dataset
|
||||
results = model.train(data='DOTAv2.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Train a new YOLOv8n-OBB model on the DOTAv2 dataset
|
||||
yolo obb train data=DOTAv2.yaml model=yolov8n-obb.yaml epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Supported Datasets
|
||||
|
||||
Currently, the following datasets with Oriented Bounding Boxes are supported:
|
||||
|
||||
- [**DOTA v2**](./dota-v2.md): DOTA (A Large-scale Dataset for Object Detection in Aerial Images) version 2, emphasizes detection from aerial perspectives and contains oriented bounding boxes with 1.7 million instances and 11,268 images.
|
||||
|
||||
### Incorporating your own OBB dataset
|
||||
|
||||
For those looking to introduce their own datasets with oriented bounding boxes, ensure compatibility with the "YOLO OBB format" mentioned above. Convert your annotations to this required format and detail the paths, classes, and class names in a corresponding YAML configuration file.
|
||||
|
||||
## Convert Label Formats
|
||||
|
||||
### DOTA Dataset Format to YOLO OBB Format
|
||||
|
||||
Transitioning labels from the DOTA dataset format to the YOLO OBB format can be achieved with this script:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics.data.converter import convert_dota_to_yolo_obb
|
||||
|
||||
convert_dota_to_yolo_obb('path/to/DOTA')
|
||||
```
|
||||
|
||||
This conversion mechanism is instrumental for datasets in the DOTA format, ensuring alignment with the Ultralytics YOLO OBB format.
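To give a sense of what such a conversion involves, here is a minimal sketch that converts a single annotation row. It assumes the usual DOTA row layout of eight pixel coordinates followed by a category name and a difficulty flag. The function `dota_row_to_yolo_obb` and the `class_to_index` mapping are hypothetical names for this example, and this is not the implementation of `convert_dota_to_yolo_obb`.

```python
def dota_row_to_yolo_obb(row, img_w, img_h, class_to_index):
    """Convert one DOTA row 'x1 y1 ... x4 y4 category difficult' to a YOLO OBB row."""
    parts = row.split()
    if len(parts) < 9:
        raise ValueError("expected 8 coordinates followed by a category name")
    pixel_coords = [float(v) for v in parts[:8]]
    class_name = parts[8]
    # Even positions are x values (divide by width), odd positions are y values.
    normalized = [
        value / (img_w if i % 2 == 0 else img_h)
        for i, value in enumerate(pixel_coords)
    ]
    fields = [str(class_to_index[class_name])] + [f"{v:.6g}" for v in normalized]
    return " ".join(fields)


# Hypothetical one-class mapping and a 100 x 100 pixel image.
print(dota_row_to_yolo_obb("10 20 30 20 30 40 10 40 plane 0", 100, 100, {"plane": 0}))
```

The difficulty flag is simply dropped here; whether to keep or filter difficult instances is a choice left to the real converter.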
|
||||
|
||||
It's imperative to validate the compatibility of the dataset with your model and adhere to the necessary format conventions. Properly structured datasets are pivotal for training efficient object detection models with oriented bounding boxes.
|
||||
95
docs/en/datasets/pose/coco.md
Normal file
|
|
@ -0,0 +1,95 @@
|
|||
---
|
||||
comments: true
|
||||
description: Detailed guide on the special COCO-Pose Dataset in Ultralytics. Learn about its key features, structure, and usage in pose estimation tasks with YOLO.
|
||||
keywords: Ultralytics YOLO, COCO-Pose Dataset, Deep Learning, Pose Estimation, Training Models, Dataset YAML, openpose, YOLO
|
||||
---
|
||||
|
||||
# COCO-Pose Dataset
|
||||
|
||||
The [COCO-Pose](https://cocodataset.org/#keypoints-2017) dataset is a specialized version of the COCO (Common Objects in Context) dataset, designed for pose estimation tasks. It leverages the COCO Keypoints 2017 images and labels to enable the training of models like YOLO for pose estimation tasks.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- COCO-Pose builds upon the COCO Keypoints 2017 dataset which contains 200K images labeled with keypoints for pose estimation tasks.
|
||||
- The dataset supports 17 keypoints for human figures, facilitating detailed pose estimation.
|
||||
- Like COCO, it provides standardized evaluation metrics, including Object Keypoint Similarity (OKS) for pose estimation tasks, making it suitable for comparing model performance.
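For readers unfamiliar with OKS, the sketch below follows the formula published with the COCO keypoint evaluation: each labeled keypoint contributes `exp(-d^2 / (2 * s^2 * k^2))`, where `d` is the distance between prediction and ground truth, `s^2` is the object area, and `k` is a per-keypoint constant, and the contributions are averaged over labeled keypoints. The function name and the constants used in the example call are placeholders chosen for illustration, not the official COCO values, and the reference evaluation code handles edge cases that this sketch ignores.

```python
import math


def object_keypoint_similarity(pred, truth, visibility, area, kappas):
    """Average per-keypoint similarity over keypoints with visibility > 0."""
    total, labeled = 0.0, 0
    for (px, py), (tx, ty), v, k in zip(pred, truth, visibility, kappas):
        if v <= 0:
            continue  # unlabeled keypoints do not count toward the score
        d2 = (px - tx) ** 2 + (py - ty) ** 2
        total += math.exp(-d2 / (2 * area * k ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0


truth = [(10.0, 10.0), (20.0, 15.0), (30.0, 40.0)]
pred = [(11.0, 10.0), (20.0, 17.0), (0.0, 0.0)]
# The third keypoint is unlabeled (v = 0), so its large error is ignored.
print(object_keypoint_similarity(pred, truth, [2, 1, 0], area=900.0, kappas=[0.1, 0.1, 0.1]))
```

A score of 1.0 means every labeled keypoint matches exactly; the score decays toward 0 as predictions move away from the ground truth relative to the object's size.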
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The COCO-Pose dataset is split into three subsets:
|
||||
|
||||
1. **Train2017**: This subset contains a portion of the 118K images from the COCO dataset, annotated for training pose estimation models.
|
||||
2. **Val2017**: This subset has a selection of images used for validation purposes during model training.
|
||||
3. **Test2017**: This subset consists of images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7384) for performance evaluation.
|
||||
|
||||
## Applications
|
||||
|
||||
The COCO-Pose dataset is specifically used for training and evaluating deep learning models in keypoint detection and pose estimation tasks, such as OpenPose. The dataset's large number of annotated images and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners focused on pose estimation.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO-Pose dataset, the `coco-pose.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/coco-pose.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/coco-pose.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n-pose model on the COCO-Pose dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco-pose.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo pose train data=coco-pose.yaml model=yolov8n-pose.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
The COCO-Pose dataset contains a diverse set of images with human figures annotated with keypoints. Here are some examples of images from the dataset, along with their corresponding annotations:
|
||||
|
||||

|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the COCO-Pose dataset and the benefits of using mosaicing during the training process.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the COCO-Pose dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{lin2015microsoft,
|
||||
title={Microsoft COCO: Common Objects in Context},
|
||||
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
|
||||
year={2015},
|
||||
eprint={1405.0312},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO-Pose dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
|
||||
80
docs/en/datasets/pose/coco8-pose.md
Normal file
|
|
@ -0,0 +1,80 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover the versatile COCO8-Pose dataset, perfect for testing and debugging pose detection models. Learn how to get started with YOLOv8-pose model training.
|
||||
keywords: Ultralytics, YOLOv8, pose detection, COCO8-Pose dataset, dataset, model training, YAML
|
||||
---
|
||||
|
||||
# COCO8-Pose Dataset
|
||||
|
||||
## Introduction
|
||||
|
||||
[Ultralytics](https://ultralytics.com) COCO8-Pose is a small but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging pose estimation models, or for experimenting with new pose estimation approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training on larger datasets.
|
||||
|
||||
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8-Pose dataset, the `coco8-pose.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/coco8-pose.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/coco8-pose.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n-pose model on the COCO8-Pose dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco8-pose.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo pose train data=coco8-pose.yaml model=yolov8n-pose.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
Here are some examples of images from the COCO8-Pose dataset, along with their corresponding annotations:
|
||||
|
||||
<img src="https://user-images.githubusercontent.com/26833433/236818283-52eecb96-fc6a-420d-8a26-d488b352dd4c.jpg" alt="Dataset sample image" width="800">
|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the COCO8-Pose dataset and the benefits of using mosaicing during the training process.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the COCO dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{lin2015microsoft,
|
||||
title={Microsoft COCO: Common Objects in Context},
|
||||
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
|
||||
year={2015},
|
||||
eprint={1405.0312},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
|
||||
138
docs/en/datasets/pose/index.md
Normal file
|
|
@ -0,0 +1,138 @@
|
|||
---
|
||||
comments: true
|
||||
description: Understand the YOLO pose dataset format and learn to use Ultralytics datasets to train your pose estimation models effectively.
|
||||
keywords: Ultralytics, YOLO, pose estimation, datasets, training, YAML, keypoints, COCO-Pose, COCO8-Pose, data conversion
|
||||
---
|
||||
|
||||
# Pose Estimation Datasets Overview
|
||||
|
||||
## Supported Dataset Formats
|
||||
|
||||
### Ultralytics YOLO format
|
||||
|
||||
The dataset label format used for training YOLO pose models is as follows:
|
||||
|
||||
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
|
||||
2. One row per object: Each row in the text file corresponds to one object instance in the image.
|
||||
3. Object information per row: Each row contains the following information about the object instance:
|
||||
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
|
||||
- Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
|
||||
- Object width and height: The width and height of the object, normalized to be between 0 and 1.
|
||||
- Object keypoint coordinates: The keypoints of the object, normalized to be between 0 and 1.
|
||||
|
||||
Here is an example of the label format for pose estimation task:
|
||||
|
||||
Format with Dim = 2
|
||||
|
||||
```
|
||||
<class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn>
|
||||
```
|
||||
|
||||
Format with Dim = 3
|
||||
|
||||
```
|
||||
<class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> ... <pxn> <pyn> <pn-visibility>
|
||||
```
|
||||
|
||||
In this format, `<class-index>` is the index of the class for the object, `<x> <y> <width> <height>` are the center coordinates, width, and height of the bounding box, and `<px1> <py1> <px2> <py2> ... <pxn> <pyn>` are the coordinates of the keypoints, all normalized to be between 0 and 1. With Dim = 3, each keypoint also carries a visibility value. The fields are separated by spaces.
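The row layout above can be read back with a few lines of Python. The following is a minimal sketch, assuming a `kpt_shape` pair of (number of keypoints, number of dims) like the field of the same name in the dataset YAML; the function `parse_pose_label` is a hypothetical helper for illustration and not an Ultralytics API.

```python
def parse_pose_label(row, kpt_shape):
    """Split one pose label row into (class_index, bbox, keypoints)."""
    num_kpts, ndim = kpt_shape
    if ndim not in (2, 3):
        raise ValueError("keypoint dims must be 2 (x, y) or 3 (x, y, visibility)")
    values = row.split()
    expected = 5 + num_kpts * ndim  # class + 4 box values + keypoint values
    if len(values) != expected:
        raise ValueError(f"expected {expected} fields, got {len(values)}")
    class_index = int(values[0])
    bbox = tuple(float(v) for v in values[1:5])  # x, y, width, height
    flat = [float(v) for v in values[5:]]
    keypoints = [tuple(flat[i:i + ndim]) for i in range(0, len(flat), ndim)]
    return class_index, bbox, keypoints


# Two keypoints with visibility flags (Dim = 3).
print(parse_pose_label("0 0.5 0.5 0.2 0.4 0.45 0.3 2 0.55 0.3 1", (2, 3)))
```

Checking the field count against `kpt_shape` up front catches rows that mix Dim = 2 and Dim = 3 layouts, which is an easy mistake when merging label sources.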
|
||||
|
||||
### Dataset YAML format
|
||||
|
||||
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training pose models. Here is an example of the YAML format used for defining a pose dataset:
|
||||
|
||||
```yaml
|
||||
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
|
||||
path: ../datasets/coco8-pose # dataset root dir
|
||||
train: images/train # train images (relative to 'path') 4 images
|
||||
val: images/val # val images (relative to 'path') 4 images
|
||||
test: # test images (optional)
|
||||
|
||||
# Keypoints
|
||||
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
|
||||
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
|
||||
|
||||
# Classes dictionary
|
||||
names:
|
||||
0: person
|
||||
```
|
||||
|
||||
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
|
||||
|
||||
`names` is a dictionary of class names. The order of the names should match the order of the object class indices in the YOLO dataset files.
|
||||
|
||||
`flip_idx` is optional and is needed only when the keypoints are symmetric, such as the left and right sides of a human body or face. For example, given five facial landmarks [left eye, right eye, nose, left mouth, right mouth] with original indices [0, 1, 2, 3, 4], `flip_idx` is [1, 0, 2, 4, 3]: the left-right pairs 0-1 and 3-4 are exchanged, and unpaired points such as the nose keep their index.
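To see why the index swap matters, consider what a horizontal flip augmentation has to do: mirroring the image moves the left eye to the right side, so the keypoint slots must be reordered to keep their meaning. The sketch below is a simplified illustration on normalized (x, y) pairs; `flip_keypoints_horizontally` is a hypothetical helper, not the library's augmentation code.

```python
def flip_keypoints_horizontally(keypoints, flip_idx):
    """Mirror normalized keypoints left-right, then reorder them with flip_idx."""
    mirrored = [(1.0 - x, y) for x, y in keypoints]
    # Slot i of the result takes the mirrored point that now plays role i.
    return [mirrored[i] for i in flip_idx]


# Facial landmarks: left eye, right eye, nose, left mouth, right mouth.
landmarks = [(0.2, 0.4), (0.6, 0.4), (0.4, 0.5), (0.3, 0.7), (0.5, 0.7)]
print(flip_keypoints_horizontally(landmarks, [1, 0, 2, 4, 3]))
```

Without the reordering step, slot 0 would still be labeled "left eye" while holding a point on the opposite side of the face, and the model would be trained on contradictory labels.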
|
||||
|
||||
## Usage
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco8-pose.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo pose train data=coco8-pose.yaml model=yolov8n-pose.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Supported Datasets
|
||||
|
||||
This section outlines the datasets that are compatible with Ultralytics YOLO format and can be used for training pose estimation models:
|
||||
|
||||
### COCO-Pose
|
||||
|
||||
- **Description**: COCO-Pose is a large-scale object detection, segmentation, and pose estimation dataset. It is a subset of the popular COCO dataset and focuses on human pose estimation. COCO-Pose includes multiple keypoints for each human instance.
|
||||
- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
|
||||
- **Number of Classes**: 1 (Human).
|
||||
- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
|
||||
- **Usage**: Suitable for training human pose estimation models.
|
||||
- **Additional Notes**: The dataset is rich and diverse, containing over 200k labeled images.
|
||||
- [Read more about COCO-Pose](./coco.md)
|
||||
|
||||
### COCO8-Pose
|
||||
|
||||
- **Description**: [Ultralytics](https://ultralytics.com) COCO8-Pose is a small, but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation.
|
||||
- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
|
||||
- **Number of Classes**: 1 (Human).
|
||||
- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
|
||||
- **Usage**: Suitable for testing and debugging pose estimation models, or for experimenting with new pose estimation approaches.
|
||||
- **Additional Notes**: COCO8-Pose is ideal for sanity checks and CI checks.
|
||||
- [Read more about COCO8-Pose](./coco8-pose.md)
|
||||
|
||||
### Tiger-Pose
|
||||
|
||||
- **Description**: The [Ultralytics](https://ultralytics.com) Tiger-Pose animal pose dataset comprises 263 images sourced from a [YouTube Video](https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUbVGlnZXIgd2Fsa2luZyByZWZlcmVuY2UubXA0), with 210 images allocated for training and 53 for validation.
|
||||
- **Label Format**: Same as Ultralytics YOLO format as described above, with 12 keypoints for animal pose and no visible dimension.
|
||||
- **Number of Classes**: 1 (Tiger).
|
||||
- **Keypoints**: 12 keypoints.
|
||||
- **Usage**: Great for animal pose or any other pose that is not human-based.
|
||||
- [Read more about Tiger-Pose](./tiger-pose.md)
|
||||
|
||||
### Adding your own dataset
|
||||
|
||||
If you have your own dataset and would like to use it for training pose estimation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
|
||||
|
||||
### Conversion Tool
|
||||
|
||||
Ultralytics provides a convenient conversion tool to convert labels from the popular COCO dataset format to YOLO format:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics.data.converter import convert_coco
|
||||
|
||||
convert_coco(labels_dir='path/to/coco/annotations/', use_keypoints=True)
|
||||
```
|
||||
|
||||
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format. The `use_keypoints` parameter specifies whether to include keypoints (for pose estimation) in the converted labels.
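As an illustration of the arithmetic behind such a conversion, the sketch below converts a single COCO keypoint annotation into the fields of a YOLO pose row. It relies on the COCO conventions that `bbox` is `[x_min, y_min, width, height]` in pixels and `keypoints` is a flat list of `[x, y, v]` triplets. The function name and its parameters are hypothetical, and this is not the implementation of `convert_coco`.

```python
def coco_annotation_to_yolo_pose(class_index, bbox, keypoints, img_w, img_h):
    """Return YOLO pose fields: class, normalized box center/size, keypoints."""
    x_min, y_min, box_w, box_h = bbox
    fields = [
        class_index,
        (x_min + box_w / 2) / img_w,  # box center x, normalized
        (y_min + box_h / 2) / img_h,  # box center y, normalized
        box_w / img_w,
        box_h / img_h,
    ]
    for i in range(0, len(keypoints), 3):
        x, y, v = keypoints[i:i + 3]
        fields += [x / img_w, y / img_h, v]  # visibility flag is kept as-is
    return fields


# One keypoint in a 100 x 200 pixel image.
print(coco_annotation_to_yolo_pose(0, [10, 20, 40, 60], [30, 50, 2], 100, 200))
```

The main change of convention is the box: COCO anchors it at the top-left corner in pixels, while the YOLO format uses the normalized box center.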
|
||||
88
docs/en/datasets/pose/tiger-pose.md
Normal file
|
|
@ -0,0 +1,88 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover the versatile Tiger-Pose dataset, perfect for testing and debugging pose detection models. Learn how to get started with YOLOv8-pose model training.
|
||||
keywords: Ultralytics, YOLOv8, pose detection, Tiger-Pose dataset, dataset, model training, YAML
|
||||
---
|
||||
|
||||
# Tiger-Pose Dataset
|
||||
|
||||
## Introduction
|
||||
|
||||
[Ultralytics](https://ultralytics.com) introduces the Tiger-Pose dataset, a versatile collection designed for pose estimation tasks. This dataset comprises 263 images sourced from a [YouTube Video](https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUbVGlnZXIgd2Fsa2luZyByZWZlcmVuY2UubXA0), with 210 images allocated for training and 53 for validation. It serves as an excellent resource for testing and troubleshooting pose estimation algorithms.
|
||||
|
||||
Despite its manageable size of 263 images, the Tiger-Pose dataset offers diversity, making it suitable for assessing training pipelines, identifying potential errors, and serving as a valuable preliminary step before working with larger datasets for pose estimation.
|
||||
|
||||
This dataset is intended for use with [Ultralytics HUB](https://hub.ultralytics.com) and [YOLOv8](https://github.com/ultralytics/ultralytics).
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (Yet Another Markup Language) file serves as the means to specify the configuration details of a dataset. It encompasses crucial data such as file paths, class definitions, and other pertinent information. Specifically, for the `tiger-pose.yaml` file, you can check [Ultralytics Tiger-Pose Dataset Configuration File](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/tiger-pose.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/tiger-pose.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/tiger-pose.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n-pose model on the Tiger-Pose dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='tiger-pose.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo task=pose mode=train data=tiger-pose.yaml model=yolov8n-pose.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
Here are some examples of images from the Tiger-Pose dataset, along with their corresponding annotations:
|
||||
|
||||
<img src="https://user-images.githubusercontent.com/62513924/272491921-c963d2bf-505f-4a15-abd7-259de302cffa.jpg" alt="Dataset sample image" width="100%">
|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the Tiger-Pose dataset and the benefits of using mosaicing during the training process.
|
||||
|
||||
## Inference Example
|
||||
|
||||
!!! example "Inference Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('path/to/best.pt') # load a tiger-pose trained model
|
||||
|
||||
# Run inference
|
||||
results = model.predict(source="https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUYdGlnZXIgd2Fsa2luZyByZWZlcmVuY2Ug", show=True)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Run inference using a tiger-pose trained model
|
||||
yolo task=pose mode=predict source="https://www.youtube.com/watch?v=MIBAT6BGE6U&pp=ygUYdGlnZXIgd2Fsa2luZyByZWZlcmVuY2Ug" show=True model="path/to/best.pt"
|
||||
```
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
The dataset has been released under the [AGPL-3.0 License](https://github.com/ultralytics/ultralytics/blob/main/LICENSE).
|
||||
94
docs/en/datasets/segment/coco.md
Normal file
|
|
@ -0,0 +1,94 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the possibilities of the COCO-Seg dataset, designed for object instance segmentation and YOLO model training. Discover key features, dataset structure, applications, and usage.
|
||||
keywords: Ultralytics, YOLO, COCO-Seg, dataset, instance segmentation, model training, deep learning, computer vision
|
||||
---
|
||||
|
||||
# COCO-Seg Dataset
|
||||
|
||||
The [COCO-Seg](https://cocodataset.org/#home) dataset, an extension of the COCO (Common Objects in Context) dataset, is specially designed to aid research in object instance segmentation. It uses the same images as COCO but introduces more detailed segmentation annotations. This dataset is a crucial resource for researchers and developers working on instance segmentation tasks, especially for training YOLO models.
|
||||
|
||||
## Key Features
|
||||
|
||||
- COCO-Seg retains the original 330K images from COCO.
|
||||
- The dataset consists of the same 80 object categories found in the original COCO dataset.
|
||||
- Annotations now include more detailed instance segmentation masks for each object in the images.
|
||||
- COCO-Seg provides standardized evaluation metrics like mean Average Precision (mAP) for object detection, and mean Average Recall (mAR) for instance segmentation tasks, enabling effective comparison of model performance.
|
||||
|
||||
## Dataset Structure
|
||||
|
||||
The COCO-Seg dataset is partitioned into three subsets:
|
||||
|
||||
1. **Train2017**: This subset contains 118K images for training instance segmentation models.
|
||||
2. **Val2017**: This subset includes 5K images used for validation purposes during model training.
|
||||
3. **Test2017**: This subset encompasses 20K images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7383) for performance evaluation.
|
||||
|
||||
## Applications
|
||||
|
||||
COCO-Seg is widely used for training and evaluating deep learning models in instance segmentation, such as the YOLO models. The large number of annotated images, the diversity of object categories, and the standardized evaluation metrics make it an indispensable resource for computer vision researchers and practitioners.
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO-Seg dataset, the `coco.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/coco.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/coco.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n-seg model on the COCO-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco-seg.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo segment train data=coco-seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
COCO-Seg, like its predecessor COCO, contains a diverse set of images with various object categories and complex scenes. However, COCO-Seg introduces more detailed instance segmentation masks for each object in the images. Here are some examples of images from the dataset, along with their corresponding instance segmentation masks:
|
||||
|
||||

|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This aids the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the COCO-Seg dataset and the benefits of using mosaicing during the training process.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the COCO-Seg dataset in your research or development work, please cite the original COCO paper and acknowledge the extension to COCO-Seg:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{lin2015microsoft,
|
||||
title={Microsoft COCO: Common Objects in Context},
|
||||
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
|
||||
year={2015},
|
||||
eprint={1405.0312},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We extend our thanks to the COCO Consortium for creating and maintaining this invaluable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
|
||||
80
docs/en/datasets/segment/coco8-seg.md
Normal file
|
|
@ -0,0 +1,80 @@
|
|||
---
|
||||
comments: true
|
||||
description: 'Discover the COCO8-Seg: a compact but versatile instance segmentation dataset ideal for testing Ultralytics YOLOv8 detection approaches. Complete usage guide included.'
|
||||
keywords: COCO8-Seg dataset, Ultralytics, YOLOv8, instance segmentation, dataset configuration, YAML, YOLOv8n-seg model, mosaiced dataset images
|
||||
---
|
||||
|
||||
# COCO8-Seg Dataset
|
||||
|
||||
## Introduction
|
||||
|
||||
[Ultralytics](https://ultralytics.com) COCO8-Seg is a small but versatile instance segmentation dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging segmentation models, or for experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training on larger datasets.
|
||||
|
||||
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
|
||||
and [YOLOv8](https://github.com/ultralytics/ultralytics).
|
||||
|
||||
## Dataset YAML
|
||||
|
||||
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8-Seg dataset, the `coco8-seg.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml).
|
||||
|
||||
!!! example "ultralytics/cfg/datasets/coco8-seg.yaml"
|
||||
|
||||
```yaml
|
||||
--8<-- "ultralytics/cfg/datasets/coco8-seg.yaml"
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
To train a YOLOv8n-seg model on the COCO8-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
|
||||
|
||||
!!! example "Train Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco8-seg.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo segment train data=coco8-seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Sample Images and Annotations
|
||||
|
||||
Here are some examples of images from the COCO8-Seg dataset, along with their corresponding annotations:
|
||||
|
||||
<img src="https://user-images.githubusercontent.com/26833433/236818387-f7bde7df-caaa-46d1-8341-1f7504cd11a1.jpg" alt="Dataset sample image" width="800">
|
||||
|
||||
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
|
||||
|
||||
The example showcases the variety and complexity of the images in the COCO8-Seg dataset and the benefits of using mosaicing during the training process.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use the COCO dataset in your research or development work, please cite the following paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{lin2015microsoft,
|
||||
title={Microsoft COCO: Common Objects in Context},
|
||||
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
|
||||
year={2015},
|
||||
eprint={1405.0312},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).
|
||||
148
docs/en/datasets/segment/index.md
Normal file
|
|
@ -0,0 +1,148 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how Ultralytics YOLO supports various dataset formats for instance segmentation. This guide includes information on data conversions, auto-annotations, and dataset usage.
|
||||
keywords: Ultralytics, YOLO, Instance Segmentation, Dataset, YAML, COCO, Auto-Annotation, Image Segmentation
|
||||
---
|
||||
|
||||
# Instance Segmentation Datasets Overview
|
||||
|
||||
## Supported Dataset Formats
|
||||
|
||||
### Ultralytics YOLO format
|
||||
|
||||
The dataset label format used for training YOLO segmentation models is as follows:
|
||||
|
||||
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
|
||||
2. One row per object: Each row in the text file corresponds to one object instance in the image.
|
||||
3. Object information per row: Each row contains the following information about the object instance:
|
||||
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
|
||||
- Object bounding coordinates: The bounding coordinates around the mask area, normalized to be between 0 and 1.
|
||||
|
||||
The format for a single row in the segmentation dataset file is as follows:
|
||||
|
||||
```
|
||||
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
|
||||
```
|
||||
|
||||
In this format, `<class-index>` is the index of the class for the object, and `<x1> <y1> <x2> <y2> ... <xn> <yn>` are the bounding coordinates of the object's segmentation mask. The coordinates are separated by spaces.
|
||||
|
||||
Here is an example of the YOLO dataset format for a single image with two objects made up of a 3-point segment and a 5-point segment.
|
||||
|
||||
```
|
||||
0 0.681 0.485 0.670 0.487 0.676 0.487
|
||||
1 0.504 0.000 0.501 0.004 0.498 0.004 0.493 0.010 0.492 0.0104
|
||||
```
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
- The length of each row does **not** have to be equal.
|
||||
- Each segmentation label must have a **minimum of 3 xy points**: `<class-index> <x1> <y1> <x2> <y2> <x3> <y3>`
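
As a rough illustration of the row format described above, the following sketch parses a single label row and checks the constraints mentioned in this section (xy pairs, at least 3 points, values normalized to the range 0 to 1). The helper name `parse_segment_row` is hypothetical and is not part of the Ultralytics API.

```python
def parse_segment_row(row):
    """Parse one YOLO segmentation label row into (class_index, [(x, y), ...])."""
    parts = row.split()
    if not parts:
        raise ValueError("empty label row")
    class_index = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    if len(coords) % 2 != 0:
        raise ValueError("coordinates must come in x y pairs")
    points = list(zip(coords[0::2], coords[1::2]))
    if len(points) < 3:
        raise ValueError("a segment needs at least 3 xy points")
    if any(not (0.0 <= v <= 1.0) for v in coords):
        raise ValueError("coordinates must be normalized to [0, 1]")
    return class_index, points


# The two rows from the example above: a 3-point and a 5-point segment
rows = [
    "0 0.681 0.485 0.670 0.487 0.676 0.487",
    "1 0.504 0.000 0.501 0.004 0.498 0.004 0.493 0.010 0.492 0.0104",
]
for row in rows:
    class_index, points = parse_segment_row(row)
    print(class_index, len(points))
```

Rows of different lengths parse without issue, since each row is handled independently.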
|
||||
|
||||
### Dataset YAML format
|
||||
|
||||
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training segmentation models. Here is an example of the YAML format used for defining a segmentation dataset:
|
||||
|
||||
```yaml
|
||||
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
|
||||
path: ../datasets/coco8-seg # dataset root dir
|
||||
train: images/train # train images (relative to 'path') 4 images
|
||||
val: images/val # val images (relative to 'path') 4 images
|
||||
test: # test images (optional)
|
||||
|
||||
# Classes (80 COCO classes)
|
||||
names:
|
||||
0: person
|
||||
1: bicycle
|
||||
2: car
|
||||
...
|
||||
77: teddy bear
|
||||
78: hair drier
|
||||
79: toothbrush
|
||||
```
|
||||
|
||||
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
|
||||
|
||||
`names` is a dictionary of class names. The order of the names should match the order of the object class indices in the YOLO dataset files.
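
To make the relationship between `names` and the label files concrete, here is a minimal sketch that reports class indices used in label rows but absent from the `names` dictionary. The function `unknown_class_indices` and the sample data are illustrative assumptions, not an Ultralytics utility.

```python
def unknown_class_indices(names, label_rows):
    """Return the sorted class indices used in label rows that are missing from names."""
    used = {int(row.split()[0]) for row in label_rows if row.strip()}
    return sorted(used - set(names))


# Hypothetical example: a three-class names dictionary and two label rows
names = {0: "person", 1: "bicycle", 2: "car"}
label_rows = [
    "0 0.681 0.485 0.670 0.487 0.676 0.487",
    "5 0.100 0.100 0.200 0.100 0.200 0.200",
]
print(unknown_class_indices(names, label_rows))  # [5]
```

An empty result suggests that every class index in the labels has a matching entry in `names`.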
|
||||
|
||||
## Usage
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco128-seg.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo segment train data=coco128-seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
## Supported Datasets
|
||||
|
||||
* [COCO](coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
|
||||
* [COCO8-seg](coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.
|
||||
|
||||
### Adding your own dataset
|
||||
|
||||
If you have your own dataset and would like to use it for training segmentation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
|
||||
|
||||
## Port or Convert Label Formats
|
||||
|
||||
### COCO Dataset Format to YOLO Format
|
||||
|
||||
You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics.data.converter import convert_coco
|
||||
|
||||
convert_coco(labels_dir='path/to/coco/annotations/', use_segments=True)
|
||||
```
|
||||
|
||||
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
|
||||
|
||||
Remember to double-check that the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful instance segmentation models.
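
For intuition about what such a conversion involves, the sketch below turns one COCO-style polygon (a flat list of pixel coordinates) into a normalized YOLO segmentation row. This is a simplified illustration and not the implementation of `convert_coco`; the function name `coco_polygon_to_yolo_row` is hypothetical, and the class index is assumed to be already mapped to the zero-based index used in the dataset YAML.

```python
def coco_polygon_to_yolo_row(class_index, polygon, img_width, img_height):
    """Turn a flat COCO polygon [x1, y1, x2, y2, ...] in pixels into a YOLO segmentation row."""
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("polygon needs at least 3 xy points given as x y pairs")
    values = []
    for i in range(0, len(polygon), 2):
        values.append(polygon[i] / img_width)       # normalize x by image width
        values.append(polygon[i + 1] / img_height)  # normalize y by image height
    return " ".join([str(class_index)] + [f"{v:.6f}" for v in values])


# Hypothetical triangle in a 640x480 image
row = coco_polygon_to_yolo_row(0, [320, 240, 480, 240, 400, 360], img_width=640, img_height=480)
print(row)  # 0 0.500000 0.500000 0.750000 0.500000 0.625000 0.750000
```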
|
||||
|
||||
## Auto-Annotation
|
||||
|
||||
Auto-annotation is an essential feature that allows you to generate a segmentation dataset using a pre-trained detection model. It enables you to quickly and accurately annotate a large number of images without the need for manual labeling, saving time and effort.
|
||||
|
||||
### Generate Segmentation Dataset Using a Detection Model
|
||||
|
||||
To auto-annotate your dataset using the Ultralytics framework, you can use the `auto_annotate` function as shown below:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics.data.annotator import auto_annotate
|
||||
|
||||
auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model='sam_b.pt')
|
||||
```
|
||||
|
||||
The `auto_annotate` function accepts the following arguments:
|
||||
|
||||
| Argument | Type | Description | Default |
|
||||
|--------------|-------------------------|-------------------------------------------------------------------------------------------------------------|----------------|
|
||||
| `data` | `str` | Path to a folder containing images to be annotated. | `None` |
|
||||
| `det_model` | `str, optional` | Pre-trained YOLO detection model. Defaults to `'yolov8x.pt'`. | `'yolov8x.pt'` |
|
||||
| `sam_model` | `str, optional` | Pre-trained SAM segmentation model. Defaults to `'sam_b.pt'`. | `'sam_b.pt'` |
|
||||
| `device` | `str, optional` | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | `''` |
|
||||
| `output_dir` | `str or None, optional` | Directory to save the annotated results. Defaults to a `'labels'` folder in the same directory as `'data'`. | `None` |
|
||||
|
||||
The `auto_annotate` function takes the path to your images, along with optional arguments for specifying the pre-trained detection and [SAM segmentation models](https://docs.ultralytics.com/models/sam), the device to run the models on, and the output directory for saving the annotated results.
|
||||
|
||||
By leveraging the power of pre-trained models, auto-annotation can significantly reduce the time and effort required for creating high-quality segmentation datasets. This feature is particularly useful for researchers and developers working with large image collections, as it allows them to focus on model development and evaluation rather than manual annotation.
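
After auto-annotation, it can be useful to check which images ended up without a label file. The following is a minimal sketch using only the standard library; the helper `images_missing_labels` and the set of image suffixes are assumptions for illustration. Both directories are passed explicitly, so the sketch does not depend on where the labels were written.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_missing_labels(image_dir, label_dir):
    """Return sorted names of images in image_dir with no same-named .txt file in label_dir."""
    image_dir, label_dir = Path(image_dir), Path(label_dir)
    missing = []
    for image in image_dir.iterdir():
        is_image = image.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not (label_dir / f"{image.stem}.txt").exists():
            missing.append(image.name)
    return sorted(missing)
```

For example, `images_missing_labels("path/to/images", "path/to/labels")` returns the image file names that have no matching label. A missing label file does not necessarily indicate a failure, since an image may simply contain no detected objects.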
|
||||
29
docs/en/datasets/track/index.md
Normal file
|
|
@ -0,0 +1,29 @@
|
|||
---
|
||||
comments: true
|
||||
description: Understand multi-object tracking datasets, upcoming features and how to use them with YOLO in Python and CLI. Dive in now!
|
||||
keywords: Ultralytics, YOLO, multi-object tracking, datasets, detection, segmentation, pose models, Python, CLI
|
||||
---
|
||||
|
||||
# Multi-object Tracking Datasets Overview
|
||||
|
||||
## Dataset Format (Coming Soon)
|
||||
|
||||
The multi-object tracker doesn't need standalone training and directly supports pre-trained detection, segmentation, or pose models. Support for training trackers alone is coming soon.
|
||||
|
||||
## Usage
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
model = YOLO('yolov8n.pt')
|
||||
results = model.track(source="https://youtu.be/LNwODJXcvt4", conf=0.3, iou=0.5, show=True)
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" conf=0.3 iou=0.5 show
|
||||
```
|
||||
152
docs/en/guides/azureml-quickstart.md
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
---
|
||||
comments: true
|
||||
description: Step-by-step Quickstart Guide to Running YOLOv8 Object Detection Models on AzureML for Fast Prototyping and Testing
|
||||
keywords: Ultralytics, YOLOv8, Object Detection, Azure Machine Learning, Quickstart Guide, Prototype, Compute Instance, Terminal, Notebook, IPython Kernel, CLI, Python SDK
|
||||
---
|
||||
|
||||
# YOLOv8 🚀 on AzureML
|
||||
|
||||
## What is Azure?
|
||||
|
||||
[Azure](https://azure.microsoft.com/) is Microsoft's cloud computing platform, designed to help organizations move their workloads to the cloud from on-premises data centers. With the full spectrum of cloud services including those for computing, databases, analytics, machine learning, and networking, users can pick and choose from these services to develop and scale new applications, or run existing applications, in the public cloud.
|
||||
|
||||
## What is Azure Machine Learning (AzureML)?
|
||||
|
||||
Azure Machine Learning, commonly referred to as AzureML, is a fully managed cloud service that enables data scientists and developers to efficiently embed predictive analytics into their applications, helping organizations use massive data sets and bring all the benefits of the cloud to machine learning. AzureML offers a variety of services and capabilities aimed at making machine learning accessible, easy to use, and scalable. It provides capabilities like automated machine learning, drag-and-drop model training, as well as a robust Python SDK so that developers can make the most out of their machine learning models.
|
||||
|
||||
## How Does AzureML Benefit YOLO Users?
|
||||
|
||||
For users of YOLO (You Only Look Once), AzureML provides a robust, scalable, and efficient platform to both train and deploy machine learning models. Whether you are looking to run quick prototypes or scale up to handle more extensive data, AzureML's flexible and user-friendly environment offers various tools and services to fit your needs. You can leverage AzureML to:
|
||||
|
||||
- Easily manage large datasets and computational resources for training.
|
||||
- Utilize built-in tools for data preprocessing, feature selection, and model training.
|
||||
- Collaborate more efficiently with capabilities for MLOps (Machine Learning Operations), including but not limited to monitoring, auditing, and versioning of models and data.
|
||||
|
||||
In the subsequent sections, you will find a quickstart guide detailing how to run YOLOv8 object detection models using AzureML, either from a compute terminal or a notebook.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Before you can get started, make sure you have access to an AzureML workspace. If you don't have one, you can create a new [AzureML workspace](https://learn.microsoft.com/azure/machine-learning/concept-workspace?view=azureml-api-2) by following Azure's official documentation. This workspace acts as a centralized place to manage all AzureML resources.
|
||||
|
||||
## Create a compute instance
|
||||
|
||||
From your AzureML workspace, select Compute > Compute instances > New, and then select the instance with the resources you need.
|
||||
|
||||
<p align="center">
|
||||
<img width="1280" src="https://github.com/ouphi/ultralytics/assets/17216799/3e92fcc0-a08e-41a4-af81-d289cfe3b8f2" alt="Create Azure Compute Instance">
|
||||
</p>
|
||||
|
||||
## Quickstart from Terminal
|
||||
|
||||
Start your compute and open a Terminal:
|
||||
|
||||
<p align="center">
|
||||
<img width="480" src="https://github.com/ouphi/ultralytics/assets/17216799/635152f1-f4a3-4261-b111-d416cb5ef357" alt="Open Terminal">
|
||||
</p>
|
||||
|
||||
### Create virtualenv
|
||||
|
||||
Create your conda virtualenv and install pip in it:
|
||||
|
||||
```bash
|
||||
conda create --name yolov8env -y
|
||||
conda activate yolov8env
|
||||
conda install pip -y
|
||||
```
|
||||
|
||||
Install the required dependencies:
|
||||
|
||||
```bash
|
||||
cd ultralytics
|
||||
pip install -r requirements.txt
|
||||
pip install ultralytics
|
||||
pip install "onnx>=1.12.0"
|
||||
```
|
||||
|
||||
### Perform YOLOv8 tasks
|
||||
|
||||
Predict:
|
||||
|
||||
```bash
|
||||
yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'
|
||||
```
|
||||
|
||||
Train a detection model for 10 epochs with an initial learning rate (`lr0`) of 0.01:
|
||||
|
||||
```bash
|
||||
yolo train data=coco128.yaml model=yolov8n.pt epochs=10 lr0=0.01
|
||||
```
|
||||
|
||||
You can find more [instructions to use the Ultralytics CLI here](https://docs.ultralytics.com/quickstart/#use-ultralytics-with-cli).
|
||||
|
||||
## Quickstart from a Notebook
|
||||
|
||||
### Create a new IPython kernel
|
||||
|
||||
Open the compute Terminal.
|
||||
|
||||
<p align="center">
|
||||
<img width="480" src="https://github.com/ouphi/ultralytics/assets/17216799/635152f1-f4a3-4261-b111-d416cb5ef357" alt="Open Terminal">
|
||||
</p>
|
||||
|
||||
From your compute terminal, you need to create a new ipykernel that will be used by your notebook to manage your dependencies:
|
||||
|
||||
```bash
|
||||
conda create --name yolov8env -y
|
||||
conda activate yolov8env
|
||||
conda install pip -y
|
||||
conda install ipykernel -y
|
||||
python -m ipykernel install --user --name yolov8env --display-name "yolov8env"
|
||||
```
|
||||
|
||||
Close your terminal and create a new notebook. From your Notebook, you can select the new kernel.
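
To confirm that a notebook cell is really running in the new kernel, one option is to inspect the interpreter's environment prefix from Python. This is a small sketch under the assumption that the kernel was created from a Conda environment named `yolov8env`, as in the commands above; the helper `running_in_env` is hypothetical.

```python
import sys
from pathlib import Path


def running_in_env(env_name):
    """Return True if the active interpreter's environment directory is named env_name."""
    return Path(sys.prefix).name == env_name


print(sys.prefix)                   # location of the active environment
print(running_in_env("yolov8env"))  # True when the expected kernel is selected
```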
|
||||
|
||||
Then you can open a Notebook cell and install the required dependencies:
|
||||
|
||||
```bash
|
||||
%%bash
|
||||
source activate yolov8env
|
||||
cd ultralytics
|
||||
pip install -r requirements.txt
|
||||
pip install ultralytics
|
||||
pip install "onnx>=1.12.0"
|
||||
```
|
||||
|
||||
Note that we need to use `source activate yolov8env` in every `%%bash` cell to make sure that the cell uses the environment we want.
|
||||
|
||||
Run some predictions using the [Ultralytics CLI](https://docs.ultralytics.com/quickstart/#use-ultralytics-with-cli):
|
||||
|
||||
```bash
|
||||
%%bash
|
||||
source activate yolov8env
|
||||
yolo predict model=yolov8n.pt source='https://ultralytics.com/images/bus.jpg'
|
||||
```
|
||||
|
||||
Or with the [Ultralytics Python interface](https://docs.ultralytics.com/quickstart/#use-ultralytics-with-python), for example to train the model:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO("yolov8n.pt") # load an official YOLOv8n model
|
||||
|
||||
# Use the model
|
||||
model.train(data="coco128.yaml", epochs=3) # train the model
|
||||
metrics = model.val() # evaluate model performance on the validation set
|
||||
results = model("https://ultralytics.com/images/bus.jpg") # predict on an image
|
||||
path = model.export(format="onnx") # export the model to ONNX format
|
||||
```
|
||||
|
||||
You can use either the Ultralytics CLI or Python interface for running YOLOv8 tasks, as described in the terminal section above.
|
||||
|
||||
By following these steps, you should be able to get YOLOv8 running quickly on AzureML for quick trials. For more advanced uses, you may refer to the full AzureML documentation linked at the beginning of this guide.
|
||||
|
||||
## Explore More with AzureML
|
||||
|
||||
This guide serves as an introduction to get you up and running with YOLOv8 on AzureML. However, it only scratches the surface of what AzureML can offer. To delve deeper and unlock the full potential of AzureML for your machine learning projects, consider exploring the following resources:
|
||||
|
||||
- [Create a Data Asset](https://learn.microsoft.com/azure/machine-learning/how-to-create-data-assets): Learn how to set up and manage your data assets effectively within the AzureML environment.
|
||||
- [Initiate an AzureML Job](https://learn.microsoft.com/azure/machine-learning/how-to-train-model): Get a comprehensive understanding of how to kickstart your machine learning training jobs on AzureML.
|
||||
- [Register a Model](https://learn.microsoft.com/azure/machine-learning/how-to-manage-models): Familiarize yourself with model management practices including registration, versioning, and deployment.
|
||||
- [Train YOLOv8 with AzureML Python SDK](https://medium.com/@ouphi/how-to-train-the-yolov8-model-with-azure-machine-learning-python-sdk-8268696be8ba): Explore a step-by-step guide on using the AzureML Python SDK to train your YOLOv8 models.
|
||||
- [Train YOLOv8 with AzureML CLI](https://medium.com/@ouphi/how-to-train-the-yolov8-model-with-azureml-and-the-az-cli-73d3c870ba8e): Discover how to utilize the command-line interface for streamlined training and management of YOLOv8 models on AzureML.
|
||||
132
docs/en/guides/conda-quickstart.md
Normal file
|
|
@ -0,0 +1,132 @@
|
|||
---
|
||||
comments: true
|
||||
description: Comprehensive guide to setting up and using Ultralytics YOLO models in a Conda environment. Learn how to install the package, manage dependencies, and get started with object detection projects.
|
||||
keywords: Ultralytics, YOLO, Conda, environment setup, object detection, package installation, deep learning, machine learning, guide
|
||||
---
|
||||
|
||||
# Conda Quickstart Guide for Ultralytics
|
||||
|
||||
<p align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/266324397-32119e21-8c86-43e5-a00e-79827d303d10.png" alt="Ultralytics Conda Package Visual">
|
||||
</p>
|
||||
|
||||
This guide provides a comprehensive introduction to setting up a Conda environment for your Ultralytics projects. Conda is an open-source package and environment management system that offers an excellent alternative to pip for installing packages and dependencies. Its isolated environments make it particularly well-suited for data science and machine learning endeavors. For more details, visit the Ultralytics Conda package on [Anaconda](https://anaconda.org/conda-forge/ultralytics) and check out the Ultralytics feedstock repository for package updates on [GitHub](https://github.com/conda-forge/ultralytics-feedstock/).
|
||||
|
||||
[](https://anaconda.org/conda-forge/ultralytics) [](https://anaconda.org/conda-forge/ultralytics) [](https://anaconda.org/conda-forge/ultralytics) [](https://anaconda.org/conda-forge/ultralytics)
|
||||
|
||||
## What You Will Learn
|
||||
|
||||
- Setting up a Conda environment
|
||||
- Installing Ultralytics via Conda
|
||||
- Initializing Ultralytics in your environment
|
||||
- Using Ultralytics Docker images with Conda
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- You should have Anaconda or Miniconda installed on your system. If not, download and install it from [Anaconda](https://www.anaconda.com/) or [Miniconda](https://docs.conda.io/projects/miniconda/en/latest/).
|
||||
|
||||
---
|
||||
|
||||
## Setting up a Conda Environment
|
||||
|
||||
First, let's create a new Conda environment. Open your terminal and run the following command:
|
||||
|
||||
```bash
|
||||
conda create --name ultralytics-env python=3.8 -y
|
||||
```
|
||||
|
||||
Activate the new environment:
|
||||
|
||||
```bash
|
||||
conda activate ultralytics-env
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Installing Ultralytics
|
||||
|
||||
You can install the Ultralytics package from the conda-forge channel. Execute the following command:
|
||||
|
||||
```bash
|
||||
conda install -c conda-forge ultralytics
|
||||
```
|
||||
|
||||
### Note on CUDA Environment
|
||||
|
||||
If you're working in a CUDA-enabled environment, it's a good practice to install `ultralytics`, `pytorch`, and `pytorch-cuda` together to resolve any conflicts:
|
||||
|
||||
```bash
|
||||
conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8 ultralytics
|
||||
```
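
Once the installation finishes, a quick check from Python can indicate whether PyTorch sees a GPU. The sketch below is a minimal example; the function name `cuda_status` is an assumption for illustration, and it falls back gracefully if PyTorch is not installed.

```python
def cuda_status():
    """Describe whether PyTorch is importable and whether it can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return f"CUDA available: {torch.cuda.device_count()} device(s), PyTorch {torch.__version__}"
    return f"CUDA not available, PyTorch {torch.__version__} will run on CPU"


print(cuda_status())
```

If the message reports that CUDA is not available on a machine with a GPU, the installed `pytorch` build and the `pytorch-cuda` version are a reasonable first place to look.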
|
||||
|
||||
---
|
||||
|
||||
## Using Ultralytics
|
||||
|
||||
With Ultralytics installed, you can now start using its robust features for object detection, instance segmentation, and more. For example, to predict an image, you can run:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
model = YOLO('yolov8n.pt') # initialize model
|
||||
results = model('path/to/image.jpg') # perform inference
|
||||
results[0].show() # display results for the first image
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Ultralytics Conda Docker Image
|
||||
|
||||
If you prefer using Docker, Ultralytics offers Docker images with a Conda environment included. You can pull these images from [DockerHub](https://hub.docker.com/r/ultralytics/ultralytics).
|
||||
|
||||
Pull the latest Ultralytics image:
|
||||
|
||||
```bash
|
||||
# Set image name as a variable
|
||||
t=ultralytics/ultralytics:latest-conda
|
||||
|
||||
# Pull the latest Ultralytics image from Docker Hub
|
||||
sudo docker pull $t
|
||||
```
|
||||
|
||||
Run the image:
|
||||
|
||||
```bash
|
||||
# Run the Ultralytics image in a container with GPU support
|
||||
sudo docker run -it --ipc=host --gpus all $t # all GPUs
|
||||
sudo docker run -it --ipc=host --gpus '"device=2,3"' $t # specify GPUs
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
|
||||
## Speeding Up Installation with Libmamba
|
||||
|
||||
If you're looking to [speed up the package installation](https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community) process in Conda, you can opt to use `libmamba`, a fast, cross-platform dependency solver that serves as an alternative to Conda's classic solver.
|
||||
|
||||
### How to Enable Libmamba
|
||||
|
||||
To enable `libmamba` as the solver for Conda, you can perform the following steps:
|
||||
|
||||
1. First, install the `conda-libmamba-solver` package. This can be skipped if your Conda version is 23.10 or above, as `libmamba` is the default solver from that release onward.
|
||||
|
||||
```bash
|
||||
conda install conda-libmamba-solver
|
||||
```
|
||||
|
||||
2. Next, configure Conda to use `libmamba` as the solver:
|
||||
|
||||
```bash
|
||||
conda config --set solver libmamba
|
||||
```
|
||||
|
||||
And that's it! Your Conda installation will now use `libmamba` as the solver, which should result in a faster package installation process.
|
||||
|
||||
---
|
||||
|
||||
Congratulations! You have successfully set up a Conda environment, installed the Ultralytics package, and are now ready to explore its rich functionalities. Feel free to dive deeper into the [Ultralytics documentation](https://docs.ultralytics.com/) for more advanced tutorials and examples.
|
||||
119
docs/en/guides/docker-quickstart.md
Normal file
|
|
@ -0,0 +1,119 @@
|
|||
---
|
||||
comments: true
|
||||
description: Complete guide to setting up and using Ultralytics YOLO models with Docker. Learn how to install Docker, manage GPU support, and run YOLO models in isolated containers.
|
||||
keywords: Ultralytics, YOLO, Docker, GPU, containerization, object detection, package installation, deep learning, machine learning, guide
|
||||
---
|
||||
|
||||
# Docker Quickstart Guide for Ultralytics
|
||||
|
||||
<p align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/270173601-fc7011bd-e67c-452f-a31a-aa047dcd2771.png" alt="Ultralytics Docker Package Visual">
|
||||
</p>
|
||||
|
||||
This guide serves as a comprehensive introduction to setting up a Docker environment for your Ultralytics projects. [Docker](https://docker.com/) is a platform for developing, shipping, and running applications in containers. It is particularly beneficial for ensuring that the software will always run the same, regardless of where it's deployed. For more details, visit the Ultralytics Docker repository on [Docker Hub](https://hub.docker.com/r/ultralytics/ultralytics).
|
||||
|
||||
## What You Will Learn
|
||||
|
||||
- Setting up Docker with NVIDIA support
|
||||
- Installing Ultralytics Docker images
|
||||
- Running Ultralytics in a Docker container
|
||||
- Mounting local directories into the container
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Make sure Docker is installed on your system. If not, you can download and install it from [Docker's website](https://www.docker.com/products/docker-desktop).
|
||||
- Ensure that your system has an NVIDIA GPU and NVIDIA drivers are installed.
|
||||
|
||||
---
|
||||
|
||||
## Setting up Docker with NVIDIA Support
|
||||
|
||||
First, verify that the NVIDIA drivers are properly installed by running:
|
||||
|
||||
```bash
|
||||
nvidia-smi
|
||||
```
|
||||
|
||||
### Installing NVIDIA Docker Runtime
|
||||
|
||||
Now, let's install the NVIDIA Docker runtime to enable GPU support in Docker containers:
|
||||
|
||||
```bash
|
||||
# Add NVIDIA package repositories
|
||||
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
|
||||
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)  # e.g. ubuntu22.04, the form the repository URL expects
|
||||
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
|
||||
|
||||
# Install NVIDIA Docker runtime
|
||||
sudo apt-get update
|
||||
sudo apt-get install -y nvidia-docker2
|
||||
|
||||
# Restart Docker service to apply changes
|
||||
sudo systemctl restart docker
|
||||
```
|
||||
|
||||
### Verify NVIDIA Runtime with Docker
|
||||
|
||||
Run `docker info | grep -i runtime` to ensure that `nvidia` appears in the list of runtimes:
|
||||
|
||||
```bash
|
||||
docker info | grep -i runtime
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Installing Ultralytics Docker Images
|
||||
|
||||
Ultralytics offers several Docker images optimized for various platforms and use-cases:
|
||||
|
||||
- **Dockerfile:** GPU image, ideal for training.
|
||||
- **Dockerfile-arm64:** For ARM64 architecture, suitable for devices like [Raspberry Pi](raspberry-pi.md).
|
||||
- **Dockerfile-cpu:** CPU-only version for inference and non-GPU environments.
|
||||
- **Dockerfile-jetson:** Optimized for NVIDIA Jetson devices.
|
||||
- **Dockerfile-python:** Minimal Python environment for lightweight applications.
|
||||
- **Dockerfile-conda:** Includes [Miniconda3](https://docs.conda.io/projects/miniconda/en/latest/) and Ultralytics package installed via Conda.
|
||||
|
||||
To pull the latest image:
|
||||
|
||||
```bash
|
||||
# Set image name as a variable
|
||||
t=ultralytics/ultralytics:latest
|
||||
|
||||
# Pull the latest Ultralytics image from Docker Hub
|
||||
sudo docker pull $t
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Running Ultralytics in Docker Container
|
||||
|
||||
Here's how to execute the Ultralytics Docker container:
|
||||
|
||||
```bash
|
||||
# Run with all GPUs
|
||||
sudo docker run -it --ipc=host --gpus all $t
|
||||
|
||||
# Run specifying which GPUs to use
|
||||
sudo docker run -it --ipc=host --gpus '"device=2,3"' $t
|
||||
```
|
||||
|
||||
The `-it` flag assigns a pseudo-TTY and keeps stdin open, allowing you to interact with the container. The `--ipc=host` flag enables sharing of host's IPC namespace, essential for sharing memory between processes. The `--gpus` flag allows the container to access the host's GPUs.
|
||||
|
||||
### Note on File Accessibility
|
||||
|
||||
To work with files on your local machine within the container, you can use Docker volumes:
|
||||
|
||||
```bash
|
||||
# Mount a local directory into the container
|
||||
sudo docker run -it --ipc=host --gpus all -v /path/on/host:/path/in/container $t
|
||||
```
|
||||
|
||||
Replace `/path/on/host` with the directory path on your local machine and `/path/in/container` with the desired path inside the Docker container.
|
||||
|
||||
---
|
||||
|
||||
Congratulations! You're now set up to use Ultralytics with Docker and ready to take advantage of its powerful capabilities. For alternate installation methods, feel free to explore the [Ultralytics quickstart documentation](https://docs.ultralytics.com/quickstart/).
|
||||
206
docs/en/guides/hyperparameter-tuning.md
Normal file
|
|
@ -0,0 +1,206 @@
|
|||
---
|
||||
comments: true
|
||||
description: Dive into hyperparameter tuning in Ultralytics YOLO models. Learn how to optimize performance using the Tuner class and genetic evolution.
|
||||
keywords: Ultralytics, YOLO, Hyperparameter Tuning, Tuner Class, Genetic Evolution, Optimization
|
||||
---
|
||||
|
||||
# Ultralytics YOLO Hyperparameter Tuning Guide
|
||||
|
||||
## Introduction
|
||||
|
||||
Hyperparameter tuning is not just a one-time set-up but an iterative process aimed at optimizing the machine learning model's performance metrics, such as accuracy, precision, and recall. In the context of Ultralytics YOLO, these hyperparameters could range from learning rate to architectural details, such as the number of layers or types of activation functions used.
|
||||
|
||||
### What are Hyperparameters?
|
||||
|
||||
Hyperparameters are high-level, structural settings for the algorithm. They are set prior to the training phase and remain constant during it. Here are some commonly tuned hyperparameters in Ultralytics YOLO:
|
||||
|
||||
- **Learning Rate** `lr0`: Determines the step size at each iteration while moving towards a minimum in the loss function.
|
||||
- **Batch Size** `batch`: Number of images processed simultaneously in a forward pass.
|
||||
- **Number of Epochs** `epochs`: An epoch is one complete forward and backward pass of all the training examples.
|
||||
- **Architecture Specifics**: Such as channel counts, number of layers, types of activation functions, etc.
|
||||
|
||||
<p align="center">
|
||||
<img width="640" src="https://user-images.githubusercontent.com/26833433/263858934-4f109a2f-82d9-4d08-8bd6-6fd1ff520bcd.png" alt="Hyperparameter Tuning Visual">
|
||||
</p>
|
||||
|
||||
For a full list of augmentation hyperparameters used in YOLOv8, please refer to the [configurations page](https://docs.ultralytics.com/usage/cfg/#augmentation).
|
||||
|
||||
### Genetic Evolution and Mutation
|
||||
|
||||
Ultralytics YOLO uses genetic algorithms to optimize hyperparameters. Genetic algorithms are inspired by the mechanism of natural selection and genetics.
|
||||
|
||||
- **Mutation**: In the context of Ultralytics YOLO, mutation helps in locally searching the hyperparameter space by applying small, random changes to existing hyperparameters, producing new candidates for evaluation.
|
||||
- **Crossover**: Although crossover is a popular genetic algorithm technique, it is not currently used in Ultralytics YOLO for hyperparameter tuning. The focus is mainly on mutation for generating new hyperparameter sets.
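
The mutation step can be sketched in a few lines of Python. This is a minimal illustration, not the `Tuner` implementation: the `mutate` function, its `sigma` parameter, and the `bounds` dictionary are hypothetical names chosen for this example, and the multiplicative Gaussian perturbation is just one assumed mutation scheme.

```python
import random


def mutate(hyp, bounds, sigma=0.2, rng=None):
    """Return a mutated copy of `hyp`; each value is scaled by a random factor and clipped."""
    if rng is None:
        rng = random.Random()
    child = {}
    for name, value in hyp.items():
        low, high = bounds[name]
        factor = 1.0 + rng.gauss(0.0, sigma)  # small random change around 1.0
        child[name] = round(min(max(value * factor, low), high), 5)  # keep within bounds
    return child


# Hypothetical starting point and search bounds (illustrative values only)
parent = {"lr0": 0.01, "momentum": 0.937, "degrees": 0.0}
bounds = {"lr0": (1e-5, 1e-1), "momentum": (0.6, 0.98), "degrees": (0.0, 45.0)}

child = mutate(parent, bounds, rng=random.Random(0))
print(child)
```

Because the perturbation in this sketch is multiplicative, a hyperparameter that starts at 0 stays at 0 after mutation.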
|
||||
|
||||
## Preparing for Hyperparameter Tuning
|
||||
|
||||
Before you begin the tuning process, it's important to:
|
||||
|
||||
1. **Identify the Metrics**: Determine the metrics you will use to evaluate the model's performance. This could be AP50, F1-score, or others.
|
||||
2. **Set the Tuning Budget**: Define how much computational resources you're willing to allocate. Hyperparameter tuning can be computationally intensive.
|
||||
|
||||
## Steps Involved
|
||||
|
||||
### Initialize Hyperparameters
|
||||
|
||||
Start with a reasonable set of initial hyperparameters. This could either be the default hyperparameters set by Ultralytics YOLO or something based on your domain knowledge or previous experiments.
|
||||
|
||||
### Mutate Hyperparameters
|
||||
|
||||
Use the `_mutate` method to produce a new set of hyperparameters based on the existing set.
|
||||
|
||||
### Train Model
|
||||
|
||||
Training is performed using the mutated set of hyperparameters. The training performance is then assessed.
|
||||
|
||||
### Evaluate Model
|
||||
|
||||
Use metrics like AP50, F1-score, or custom metrics to evaluate the model's performance.
|
||||
|
||||
### Log Results
|
||||
|
||||
It's crucial to log both the performance metrics and the corresponding hyperparameters for future reference.
|
||||
|
||||
### Repeat
|
||||
|
||||
The process is repeated until either the set number of iterations is reached or the performance metric is satisfactory.
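
The steps above can be put together as a small loop. The sketch below is an assumption-laden toy, not the Ultralytics code: `toy_fitness` stands in for "train and evaluate a model", the `tune` function and its arguments are hypothetical, and each new candidate is mutated from the best set found so far.

```python
import random


def toy_fitness(hyp):
    """Stand-in for training plus evaluation: highest when lr0 is near 0.003."""
    return -((hyp["lr0"] - 0.003) ** 2)


def tune(initial, iterations=50, sigma=0.2, seed=0):
    """Mutate, evaluate, log, and repeat for a fixed number of iterations."""
    rng = random.Random(seed)
    best_hyp, best_fit = dict(initial), toy_fitness(initial)
    history = [(best_fit, dict(initial))]  # log of (fitness, hyperparameters)
    for _ in range(iterations):
        # Mutate: apply a small random scaling to every hyperparameter
        candidate = {k: v * (1.0 + rng.gauss(0.0, sigma)) for k, v in best_hyp.items()}
        # Train and evaluate (replaced here by the toy objective)
        fit = toy_fitness(candidate)
        # Log the result, then keep the candidate if it improves on the best so far
        history.append((fit, candidate))
        if fit > best_fit:
            best_hyp, best_fit = candidate, fit
    return best_hyp, best_fit, history


best_hyp, best_fit, history = tune({"lr0": 0.01}, iterations=50)
print(f"best lr0={best_hyp['lr0']:.5f} after {len(history) - 1} iterations")
```

In a real run, each call to the fitness function is a full training job, which is why the tuning budget matters.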
|
||||
|
||||
## Usage Example
|
||||
|
||||
Here's how to use the `model.tune()` method, which relies on the `Tuner` class, to tune YOLOv8n hyperparameters on COCO8 for 30 epochs per iteration with the AdamW optimizer. Plotting, checkpointing, and validation on all but the final epoch are skipped for faster tuning.
|
||||
|
||||
!!! example ""

    === "Python"

        ```python
        from ultralytics import YOLO

        # Initialize the YOLO model
        model = YOLO('yolov8n.pt')

        # Tune hyperparameters on COCO8 for 30 epochs
        model.tune(data='coco8.yaml', epochs=30, iterations=300, optimizer='AdamW', plots=False, save=False, val=False)
        ```
|
||||
|
||||
## Results
|
||||
|
||||
After you've successfully completed the hyperparameter tuning process, you will obtain several files and directories that encapsulate the results of the tuning. The following describes each:
|
||||
|
||||
### File Structure
|
||||
|
||||
Here's what the directory structure of the results will look like. Training directories like `train1/` contain individual tuning iterations, i.e. one model trained with one set of hyperparameters. The `tune/` directory contains tuning results from all the individual model trainings:
|
||||
|
||||
```plaintext
|
||||
runs/
└── detect/
    ├── train1/
    ├── train2/
    ├── ...
    └── tune/
        ├── best_hyperparameters.yaml
        ├── best_fitness.png
        ├── tune_results.csv
        ├── tune_scatter_plots.png
        └── weights/
            ├── last.pt
            └── best.pt
|
||||
```
|
||||
|
||||
### File Descriptions
|
||||
|
||||
#### best_hyperparameters.yaml
|
||||
|
||||
This YAML file contains the best-performing hyperparameters found during the tuning process. You can use this file to initialize future trainings with these optimized settings.
|
||||
|
||||
- **Format**: YAML
|
||||
- **Usage**: Hyperparameter results
|
||||
- **Example**:
|
||||
```yaml
|
||||
# 558/900 iterations complete ✅ (45536.81s)
|
||||
# Results saved to /usr/src/ultralytics/runs/detect/tune
|
||||
# Best fitness=0.64297 observed at iteration 498
|
||||
# Best fitness metrics are {'metrics/precision(B)': 0.87247, 'metrics/recall(B)': 0.71387, 'metrics/mAP50(B)': 0.79106, 'metrics/mAP50-95(B)': 0.62651, 'val/box_loss': 2.79884, 'val/cls_loss': 2.72386, 'val/dfl_loss': 0.68503, 'fitness': 0.64297}
|
||||
# Best fitness model is /usr/src/ultralytics/runs/detect/train498
|
||||
# Best fitness hyperparameters are printed below.
|
||||
|
||||
lr0: 0.00269
|
||||
lrf: 0.00288
|
||||
momentum: 0.73375
|
||||
weight_decay: 0.00015
|
||||
warmup_epochs: 1.22935
|
||||
warmup_momentum: 0.1525
|
||||
box: 18.27875
|
||||
cls: 1.32899
|
||||
dfl: 0.56016
|
||||
hsv_h: 0.01148
|
||||
hsv_s: 0.53554
|
||||
hsv_v: 0.13636
|
||||
degrees: 0.0
|
||||
translate: 0.12431
|
||||
scale: 0.07643
|
||||
shear: 0.0
|
||||
perspective: 0.0
|
||||
flipud: 0.0
|
||||
fliplr: 0.08631
|
||||
mosaic: 0.42551
|
||||
mixup: 0.0
|
||||
copy_paste: 0.0
|
||||
```
|
||||
|
||||
#### best_fitness.png
|
||||
|
||||
This is a plot displaying fitness (a single score summarizing validation performance; for detection, a weighted combination of mAP50 and mAP50-95) against the number of iterations. It helps you visualize how well the genetic algorithm performed over time.
|
||||
|
||||
- **Format**: PNG
|
||||
- **Usage**: Performance visualization
|
||||
|
||||
<p align="center">
|
||||
<img width="640" src="https://user-images.githubusercontent.com/26833433/266847423-9d0aea13-d5c4-4771-b06e-0b817a498260.png" alt="Hyperparameter Tuning Fitness vs Iteration">
|
||||
</p>
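
As a worked example of how a fitness score relates to the underlying metrics, the sketch below assumes the detection weighting of 0.1 × mAP50 + 0.9 × mAP50-95 (with precision and recall weighted at 0). The `detection_fitness` helper is a hypothetical name for this illustration; the inputs are the metric values from the `best_hyperparameters.yaml` example above, and the result agrees with the `fitness` value reported there.

```python
def detection_fitness(precision, recall, map50, map50_95, weights=(0.0, 0.0, 0.1, 0.9)):
    """Weighted sum of validation metrics (assumed weights: 0.1 * mAP50 + 0.9 * mAP50-95)."""
    values = (precision, recall, map50, map50_95)
    return sum(w * v for w, v in zip(weights, values))


# Metric values taken from the example YAML header
fitness = detection_fitness(precision=0.87247, recall=0.71387, map50=0.79106, map50_95=0.62651)
print(f"fitness = {fitness:.5f}")
```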
|
||||
|
||||
#### tune_results.csv
|
||||
|
||||
A CSV file containing detailed results of each iteration during the tuning. Each row in the file represents one iteration, and it includes metrics like fitness score, precision, recall, as well as the hyperparameters used.
|
||||
|
||||
- **Format**: CSV
|
||||
- **Usage**: Per-iteration results tracking.
|
||||
- **Example**:
|
||||
```csv
|
||||
fitness,lr0,lrf,momentum,weight_decay,warmup_epochs,warmup_momentum,box,cls,dfl,hsv_h,hsv_s,hsv_v,degrees,translate,scale,shear,perspective,flipud,fliplr,mosaic,mixup,copy_paste
|
||||
0.05021,0.01,0.01,0.937,0.0005,3.0,0.8,7.5,0.5,1.5,0.015,0.7,0.4,0.0,0.1,0.5,0.0,0.0,0.0,0.5,1.0,0.0,0.0
|
||||
0.07217,0.01003,0.00967,0.93897,0.00049,2.79757,0.81075,7.5,0.50746,1.44826,0.01503,0.72948,0.40658,0.0,0.0987,0.4922,0.0,0.0,0.0,0.49729,1.0,0.0,0.0
|
||||
0.06584,0.01003,0.00855,0.91009,0.00073,3.42176,0.95,8.64301,0.54594,1.72261,0.01503,0.59179,0.40658,0.0,0.0987,0.46955,0.0,0.0,0.0,0.49729,0.80187,0.0,0.0
|
||||
```
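
To find the best iteration from this file programmatically, a short script with Python's standard `csv` module is enough. This is a sketch under assumptions: `best_iteration` is a hypothetical helper, the sample text is a shortened version of the rows above, and header names are stripped of whitespace in case the real file pads its columns.

```python
import csv
import io


def best_iteration(csv_text):
    """Return (row_index, row) for the row with the highest fitness value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{key.strip(): float(value) for key, value in row.items()} for row in reader]
    index = max(range(len(rows)), key=lambda i: rows[i]["fitness"])
    return index, rows[index]


# Shortened sample with only three of the columns
sample = """fitness,lr0,lrf
0.05021,0.01,0.01
0.07217,0.01003,0.00967
0.06584,0.01003,0.00855
"""

index, row = best_iteration(sample)
print(index, row)
```

For a real run, read the file contents first, for example with `Path('runs/detect/tune/tune_results.csv').read_text()`, and pass the text to the helper.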
|
||||
|
||||
#### tune_scatter_plots.png
|
||||
|
||||
This file contains scatter plots generated from `tune_results.csv`, helping you visualize relationships between different hyperparameters and performance metrics. Note that hyperparameters initialized to 0 will not be tuned, such as `degrees` and `shear` below.
|
||||
|
||||
- **Format**: PNG
|
||||
- **Usage**: Exploratory data analysis
|
||||
|
||||
<p align="center">
|
||||
<img width="1000" src="https://user-images.githubusercontent.com/26833433/266847488-ec382f3d-79bc-4087-a0e0-42fb8b62cad2.png" alt="Hyperparameter Tuning Scatter Plots">
|
||||
</p>
|
||||
|
||||
#### weights/
|
||||
|
||||
This directory contains the saved PyTorch models from the tuning iteration that achieved the best fitness score.

- **`last.pt`**: The final-epoch weights from the iteration that achieved the best fitness score.
- **`best.pt`**: The best-epoch weights from the iteration that achieved the best fitness score.
|
||||
|
||||
Using these results, you can make more informed decisions for your future model trainings and analyses. Feel free to consult these artifacts to understand how well your model performed and how you might improve it further.
|
||||
|
||||
## Conclusion
|
||||
|
||||
The hyperparameter tuning process in Ultralytics YOLO is simplified yet powerful, thanks to its genetic algorithm-based approach focused on mutation. Following the steps outlined in this guide will assist you in systematically tuning your model to achieve better performance.
|
||||
|
||||
### Further Reading
|
||||
|
||||
1. [Hyperparameter Optimization on Wikipedia](https://en.wikipedia.org/wiki/Hyperparameter_optimization)
|
||||
2. [YOLOv5 Hyperparameter Evolution Guide](https://docs.ultralytics.com/yolov5/tutorials/hyperparameter_evolution/)
|
||||
3. [Efficient Hyperparameter Tuning with Ray Tune and YOLOv8](https://docs.ultralytics.com/integrations/ray-tune/)
|
||||
|
||||
For deeper insights, you can explore the `Tuner` class source code and accompanying documentation. Should you have any questions, feature requests, or need further assistance, feel free to reach out to us on [GitHub](https://github.com/ultralytics/ultralytics/issues/new/choose) or [Discord](https://ultralytics.com/discord).
|
||||
36
docs/en/guides/index.md
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
---
|
||||
comments: true
|
||||
description: In-depth exploration of Ultralytics' YOLO. Learn about the YOLO object detection model, how to train it on custom data, multi-GPU training, exporting, predicting, deploying, and more.
|
||||
keywords: Ultralytics, YOLO, Deep Learning, Object detection, PyTorch, Tutorial, Multi-GPU training, Custom data training, SAHI, Tiled Inference
|
||||
---
|
||||
|
||||
# Comprehensive Tutorials for Ultralytics YOLO
|
||||
|
||||
Welcome to the Ultralytics YOLO 🚀 Guides! Our comprehensive tutorials cover various aspects of the YOLO object detection model, ranging from training and prediction to deployment. Built on PyTorch, YOLO stands out for its exceptional speed and accuracy in real-time object detection tasks.
|
||||
|
||||
Whether you're a beginner or an expert in deep learning, our tutorials offer valuable insights into the implementation and optimization of YOLO for your computer vision projects. Let's dive in!
|
||||
|
||||
## Guides
|
||||
|
||||
Here's a compilation of in-depth guides to help you master different aspects of Ultralytics YOLO.
|
||||
|
||||
* [YOLO Common Issues](yolo-common-issues.md) ⭐ RECOMMENDED: Practical solutions and troubleshooting tips for the most frequently encountered issues when working with Ultralytics YOLO models.
|
||||
* [YOLO Performance Metrics](yolo-performance-metrics.md) ⭐ ESSENTIAL: Understand the key metrics like mAP, IoU, and F1 score used to evaluate the performance of your YOLO models. Includes practical examples and tips on how to improve detection accuracy and speed.
|
||||
* [Model Deployment Options](model-deployment-options.md): Overview of YOLO model deployment formats like ONNX, OpenVINO, and TensorRT, with pros and cons for each to inform your deployment strategy.
|
||||
* [K-Fold Cross Validation](kfold-cross-validation.md) 🚀 NEW: Learn how to improve model generalization using the K-Fold cross-validation technique.
|
||||
* [Hyperparameter Tuning](hyperparameter-tuning.md) 🚀 NEW: Discover how to optimize your YOLO models by fine-tuning hyperparameters using the Tuner class and genetic evolution algorithms.
|
||||
* [SAHI Tiled Inference](sahi-tiled-inference.md) 🚀 NEW: Comprehensive guide on leveraging SAHI's sliced inference capabilities with YOLOv8 for object detection in high-resolution images.
|
||||
* [AzureML Quickstart](azureml-quickstart.md) 🚀 NEW: Get up and running with Ultralytics YOLO models on Microsoft's Azure Machine Learning platform. Learn how to train, deploy, and scale your object detection projects in the cloud.
|
||||
* [Conda Quickstart](conda-quickstart.md) 🚀 NEW: Step-by-step guide to setting up a [Conda](https://anaconda.org/conda-forge/ultralytics) environment for Ultralytics. Learn how to install and start using the Ultralytics package efficiently with Conda.
|
||||
* [Docker Quickstart](docker-quickstart.md) 🚀 NEW: Complete guide to setting up and using Ultralytics YOLO models with [Docker](https://hub.docker.com/r/ultralytics/ultralytics). Learn how to install Docker, manage GPU support, and run YOLO models in isolated containers for consistent development and deployment.
|
||||
* [Raspberry Pi](raspberry-pi.md) 🚀 NEW: Quickstart tutorial to run YOLO models on the latest Raspberry Pi hardware.
|
||||
* [Triton Inference Server Integration](triton-inference-server.md) 🚀 NEW: Dive into the integration of Ultralytics YOLOv8 with NVIDIA's Triton Inference Server for scalable and efficient deep learning inference deployments.
|
||||
* [YOLO Thread-Safe Inference](yolo-thread-safe-inference.md) 🚀 NEW: Guidelines for performing inference with YOLO models in a thread-safe manner. Learn the importance of thread safety and best practices to prevent race conditions and ensure consistent predictions.
|
||||
|
||||
## Contribute to Our Guides
|
||||
|
||||
We welcome contributions from the community! If you've mastered a particular aspect of Ultralytics YOLO that's not yet covered in our guides, we encourage you to share your expertise. Writing a guide is a great way to give back to the community and help us make our documentation more comprehensive and user-friendly.
|
||||
|
||||
To get started, please read our [Contributing Guide](https://docs.ultralytics.com/help/contributing) for guidelines on how to open up a Pull Request (PR) 🛠️. We look forward to your contributions!
|
||||
|
||||
Let's work together to make the Ultralytics YOLO ecosystem more robust and versatile 🙏!
|
||||
279
docs/en/guides/kfold-cross-validation.md
Normal file
|
|
@ -0,0 +1,279 @@
|
|||
---
|
||||
comments: true
|
||||
description: An in-depth guide demonstrating the implementation of K-Fold Cross Validation with the Ultralytics ecosystem for object detection datasets, leveraging Python, YOLO, and sklearn.
|
||||
keywords: K-Fold cross validation, Ultralytics, YOLO detection format, Python, sklearn, object detection
|
||||
---
|
||||
|
||||
# K-Fold Cross Validation with Ultralytics
|
||||
|
||||
## Introduction
|
||||
|
||||
This comprehensive guide illustrates the implementation of K-Fold Cross Validation for object detection datasets within the Ultralytics ecosystem. We'll leverage the YOLO detection format and key Python libraries such as sklearn, pandas, and PyYaml to guide you through the necessary setup, the process of generating feature vectors, and the execution of a K-Fold dataset split.
|
||||
|
||||
<p align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/258589390-8d815058-ece8-48b9-a94e-0e1ab53ea0f6.png" alt="K-Fold Cross Validation Overview">
|
||||
</p>
|
||||
|
||||
Whether your project involves the Fruit Detection dataset or a custom data source, this tutorial aims to help you comprehend and apply K-Fold Cross Validation to bolster the reliability and robustness of your machine learning models. While we're applying `k=5` folds for this tutorial, keep in mind that the optimal number of folds can vary depending on your dataset and the specifics of your project.
|
||||
|
||||
Without further ado, let's dive in!
|
||||
|
||||
## Setup
|
||||
|
||||
- Your annotations should be in the [YOLO detection format](https://docs.ultralytics.com/datasets/detect/).
|
||||
|
||||
- This guide assumes that annotation files are locally available.
|
||||
|
||||
- For our demonstration, we use the [Fruit Detection](https://www.kaggle.com/datasets/lakshaytyagi01/fruit-detection/code) dataset.
|
||||
|
||||
    - This dataset contains a total of 8479 images.
    - It includes 6 class labels, each with its total instance counts listed below.
|
||||
|
||||
| Class Label | Instance Count |
|
||||
|:------------|:--------------:|
|
||||
| Apple | 7049 |
|
||||
| Grapes | 7202 |
|
||||
| Pineapple | 1613 |
|
||||
| Orange | 15549 |
|
||||
| Banana | 3536 |
|
||||
| Watermelon | 1976 |
|
||||
|
||||
- Necessary Python packages include:
|
||||
|
||||
    - `ultralytics`
    - `sklearn`
    - `pandas`
    - `pyyaml`
|
||||
|
||||
- This tutorial operates with `k=5` folds. However, you should determine the best number of folds for your specific dataset.
|
||||
|
||||
1. Initiate a new Python virtual environment (`venv`) for your project and activate it. Use `pip` (or your preferred package manager) to install:
|
||||
|
||||
    - The Ultralytics library: `pip install -U ultralytics`. Alternatively, you can clone the official [repo](https://github.com/ultralytics/ultralytics).
    - Scikit-learn, pandas, and PyYAML: `pip install -U scikit-learn pandas pyyaml`.
|
||||
|
||||
2. Verify that your annotations are in the [YOLO detection format](https://docs.ultralytics.com/datasets/detect/).
|
||||
|
||||
- For this tutorial, all annotation files are found in the `Fruit-Detection/labels` directory.
|
||||
|
||||
## Generating Feature Vectors for Object Detection Dataset
|
||||
|
||||
1. Start by creating a new Python file and import the required libraries.
|
||||
|
||||
```python
|
||||
import datetime
|
||||
import shutil
|
||||
from pathlib import Path
|
||||
from collections import Counter
|
||||
|
||||
import yaml
|
||||
import numpy as np
|
||||
import pandas as pd
|
||||
from ultralytics import YOLO
|
||||
from sklearn.model_selection import KFold
|
||||
```
|
||||
|
||||
2. Proceed to retrieve all label files for your dataset.
|
||||
|
||||
```python
|
||||
dataset_path = Path('./Fruit-detection') # replace with 'path/to/dataset' for your custom data
|
||||
labels = sorted(dataset_path.rglob("*labels/*.txt")) # all data in 'labels'
|
||||
```
|
||||
|
||||
3. Now, read the contents of the dataset YAML file and extract the indices of the class labels.
|
||||
|
||||
```python
|
||||
yaml_file = 'path/to/data.yaml'  # your data YAML with data directories and names dictionary
with open(yaml_file, 'r', encoding="utf8") as y:
    classes = yaml.safe_load(y)['names']
cls_idx = sorted(classes.keys())
|
||||
```
|
||||
|
||||
4. Initialize an empty `pandas` DataFrame.
|
||||
|
||||
```python
|
||||
indx = [l.stem for l in labels] # uses base filename as ID (no extension)
|
||||
labels_df = pd.DataFrame([], columns=cls_idx, index=indx)
|
||||
```
|
||||
|
||||
5. Count the instances of each class-label present in the annotation files.
|
||||
|
||||
```python
|
||||
for label in labels:
    lbl_counter = Counter()

    with open(label, 'r') as lf:
        lines = lf.readlines()

    for l in lines:
        # YOLO labels store the integer class index as the first field of each line
        lbl_counter[int(l.split(' ')[0])] += 1

    labels_df.loc[label.stem] = lbl_counter

labels_df = labels_df.fillna(0.0)  # replace `nan` values with `0.0`
|
||||
```
|
||||
|
||||
6. The following is a sample view of the populated DataFrame:
|
||||
|
||||
```pandas
|
||||
                                                          0    1    2    3    4    5
'0000a16e4b057580_jpg.rf.00ab48988370f64f5ca8ea4...'    0.0  0.0  0.0  0.0  0.0  7.0
'0000a16e4b057580_jpg.rf.7e6dce029fb67f01eb19aa7...'    0.0  0.0  0.0  0.0  0.0  7.0
'0000a16e4b057580_jpg.rf.bc4d31cdcbe229dd022957a...'    0.0  0.0  0.0  0.0  0.0  7.0
'00020ebf74c4881c_jpg.rf.508192a0a97aa6c4a3b6882...'    0.0  0.0  0.0  1.0  0.0  0.0
'00020ebf74c4881c_jpg.rf.5af192a2254c8ecc4188a25...'    0.0  0.0  0.0  1.0  0.0  0.0
...                                                     ...  ...  ...  ...  ...  ...
'ff4cd45896de38be_jpg.rf.c4b5e967ca10c7ced3b9e97...'    0.0  0.0  0.0  0.0  0.0  2.0
'ff4cd45896de38be_jpg.rf.ea4c1d37d2884b3e3cbce08...'    0.0  0.0  0.0  0.0  0.0  2.0
'ff5fd9c3c624b7dc_jpg.rf.bb519feaa36fc4bf630a033...'    1.0  0.0  0.0  0.0  0.0  0.0
'ff5fd9c3c624b7dc_jpg.rf.f0751c9c3aa4519ea3c9d6a...'    1.0  0.0  0.0  0.0  0.0  0.0
'fffe28b31f2a70d4_jpg.rf.7ea16bd637ba0711c53b540...'    0.0  6.0  0.0  0.0  0.0  0.0
|
||||
```
|
||||
|
||||
The rows index the label files, each corresponding to an image in your dataset, and the columns correspond to your class-label indices. Each row represents a pseudo feature-vector, with the count of each class-label present in your dataset. This data structure enables the application of K-Fold Cross Validation to an object detection dataset.
|
||||
|
||||
## K-Fold Dataset Split
|
||||
|
||||
1. Now we will use the `KFold` class from `sklearn.model_selection` to generate `k` splits of the dataset.
|
||||
|
||||
    - Important:
        - Setting `shuffle=True` ensures a randomized distribution of classes in your splits.
        - By setting `random_state=M` where `M` is a chosen integer, you can obtain repeatable results.
|
||||
|
||||
```python
|
||||
ksplit = 5
|
||||
kf = KFold(n_splits=ksplit, shuffle=True, random_state=20) # setting random_state for repeatable results
|
||||
|
||||
kfolds = list(kf.split(labels_df))
|
||||
```
|
||||
|
||||
2. The dataset has now been split into `k` folds, each having a list of `train` and `val` indices. We will construct a DataFrame to display these results more clearly.
|
||||
|
||||
```python
|
||||
folds = [f'split_{n}' for n in range(1, ksplit + 1)]
|
||||
folds_df = pd.DataFrame(index=indx, columns=folds)
|
||||
|
||||
for idx, (train, val) in enumerate(kfolds, start=1):
    # Use a single .loc call (rows, column) to avoid chained assignment
    folds_df.loc[labels_df.iloc[train].index, f'split_{idx}'] = 'train'
    folds_df.loc[labels_df.iloc[val].index, f'split_{idx}'] = 'val'
|
||||
```
|
||||
|
||||
3. Now we will calculate the distribution of class labels for each fold as a ratio of the classes present in `val` to those present in `train`.
|
||||
|
||||
```python
|
||||
fold_lbl_distrb = pd.DataFrame(index=folds, columns=cls_idx)
|
||||
|
||||
for n, (train_indices, val_indices) in enumerate(kfolds, start=1):
    train_totals = labels_df.iloc[train_indices].sum()
    val_totals = labels_df.iloc[val_indices].sum()

    # To avoid division by zero, we add a small value (1E-7) to the denominator
    ratio = val_totals / (train_totals + 1E-7)
    fold_lbl_distrb.loc[f'split_{n}'] = ratio
|
||||
```
|
||||
|
||||
The ideal scenario is for all class ratios to be reasonably similar for each split and across classes. This, however, will be subject to the specifics of your dataset.
|
||||
|
||||
4. Next, we create the directories and dataset YAML files for each split.
|
||||
|
||||
```python
|
||||
supported_extensions = ['.jpg', '.jpeg', '.png']
|
||||
|
||||
# Initialize an empty list to store image file paths
|
||||
images = []
|
||||
|
||||
# Loop through supported extensions and gather image files
|
||||
for ext in supported_extensions:
    images.extend(sorted((dataset_path / 'images').rglob(f"*{ext}")))

# Create the necessary directories and dataset YAML files
save_path = Path(dataset_path / f'{datetime.date.today().isoformat()}_{ksplit}-Fold_Cross-val')
save_path.mkdir(parents=True, exist_ok=True)
ds_yamls = []

for split in folds_df.columns:
    # Create directories
    split_dir = save_path / split
    split_dir.mkdir(parents=True, exist_ok=True)
    (split_dir / 'train' / 'images').mkdir(parents=True, exist_ok=True)
    (split_dir / 'train' / 'labels').mkdir(parents=True, exist_ok=True)
    (split_dir / 'val' / 'images').mkdir(parents=True, exist_ok=True)
    (split_dir / 'val' / 'labels').mkdir(parents=True, exist_ok=True)

    # Create dataset YAML files
    dataset_yaml = split_dir / f'{split}_dataset.yaml'
    ds_yamls.append(dataset_yaml)

    with open(dataset_yaml, 'w') as ds_y:
        yaml.safe_dump({
            'path': split_dir.as_posix(),
            'train': 'train',
            'val': 'val',
            'names': classes
        }, ds_y)
|
||||
```
|
||||
|
||||
5. Lastly, copy images and labels into the respective directory ('train' or 'val') for each split.
|
||||
|
||||
- __NOTE:__ The time required for this portion of the code will vary based on the size of your dataset and your system hardware.
|
||||
|
||||
```python
|
||||
for image, label in zip(images, labels):
    for split, k_split in folds_df.loc[image.stem].items():
        # Destination directory
        img_to_path = save_path / split / k_split / 'images'
        lbl_to_path = save_path / split / k_split / 'labels'

        # Copy image and label files to the new directory
        # (shutil.copy raises SameFileError if source and destination are the same file)
        shutil.copy(image, img_to_path / image.name)
        shutil.copy(label, lbl_to_path / label.name)
|
||||
```
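Once the copy finishes, a quick consistency check can catch missing files. The sketch below is not part of the original guide. It counts images and labels under each split's `train` and `val` directories using only `pathlib`, and it builds a tiny throwaway directory tree so that it runs on its own. The helper name `count_split_files` is hypothetical; in practice you would pass it the `save_path` created above.

```python
import tempfile
from pathlib import Path


def count_split_files(save_path):
    """Map each split directory to per-subset (n_images, n_labels) counts."""
    counts = {}
    for split_dir in sorted(p for p in Path(save_path).iterdir() if p.is_dir()):
        counts[split_dir.name] = {}
        for subset in ("train", "val"):
            n_images = sum(1 for f in (split_dir / subset / "images").glob("*") if f.is_file())
            n_labels = sum(1 for f in (split_dir / subset / "labels").glob("*") if f.is_file())
            counts[split_dir.name][subset] = (n_images, n_labels)
    return counts


# Build a throwaway layout that mimics the guide's output (illustration only)
demo_root = Path(tempfile.mkdtemp())
for subset, stems in (("train", ["a", "b", "c"]), ("val", ["d"])):
    for kind, ext in (("images", ".jpg"), ("labels", ".txt")):
        target = demo_root / "split_1" / subset / kind
        target.mkdir(parents=True, exist_ok=True)
        for stem in stems:
            (target / f"{stem}{ext}").touch()

counts = count_split_files(demo_root)
for split, subsets in counts.items():
    for subset, (n_images, n_labels) in subsets.items():
        status = "ok" if n_images == n_labels else "MISMATCH"
        print(f"{split}/{subset}: {n_images} images, {n_labels} labels [{status}]")
```

For every split, the `train` and `val` image counts should add up to the total number of images in the dataset.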
|
||||
|
||||
## Save Records (Optional)
|
||||
|
||||
Optionally, you can save the records of the K-Fold split and label distribution DataFrames as CSV files for future reference.
|
||||
|
||||
```python
|
||||
folds_df.to_csv(save_path / "kfold_datasplit.csv")
|
||||
fold_lbl_distrb.to_csv(save_path / "kfold_label_distribution.csv")
|
||||
```
|
||||
|
||||
## Train YOLO using K-Fold Data Splits
|
||||
|
||||
1. First, load the YOLO model.
|
||||
|
||||
```python
|
||||
weights_path = 'path/to/weights.pt'
|
||||
model = YOLO(weights_path, task='detect')
|
||||
```
|
||||
|
||||
2. Next, iterate over the dataset YAML files to run training. The results will be saved to a directory specified by the `project` and `name` arguments. If neither is set, results go to `runs/detect/train#` by default, where `#` is an integer index.
|
||||
|
||||
```python
|
||||
results = {}
|
||||
|
||||
# Define your additional arguments here
|
||||
batch = 16
|
||||
project = 'kfold_demo'
|
||||
epochs = 100
|
||||
|
||||
for k in range(ksplit):
    dataset_yaml = ds_yamls[k]
    model = YOLO(weights_path, task='detect')  # reload the initial weights so each fold starts fresh
    model.train(data=dataset_yaml, epochs=epochs, batch=batch, project=project)  # include any train arguments
    results[k] = model.metrics  # save output metrics for further analysis
|
||||
```
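To compare folds, it helps to reduce each fold's metrics to a single scalar and summarize the spread. The sketch below assumes you have already pulled one value per fold out of `results` (for instance an mAP score) into a plain dictionary. The numbers shown are invented placeholders, and how you extract the scalar from the metrics object depends on your Ultralytics version.

```python
from statistics import mean, stdev

# Hypothetical per-fold scores (placeholders, not real results)
fold_scores = {0: 0.61, 1: 0.58, 2: 0.64, 3: 0.60, 4: 0.57}

scores = list(fold_scores.values())
summary = {
    "mean": mean(scores),
    "std": stdev(scores),  # sample standard deviation across folds
    "min": min(scores),
    "max": max(scores),
}

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A large spread between folds suggests the score is sensitive to which images land in the validation set, which is worth knowing before you rely on any single run.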
|
||||
|
||||
## Conclusion
|
||||
|
||||
In this guide, we have explored the process of using K-Fold cross-validation for training the YOLO object detection model. We learned how to split our dataset into K partitions, ensuring a balanced class distribution across the different folds.
|
||||
|
||||
We also explored the procedure for creating report DataFrames to visualize the data splits and label distributions across these splits, providing us a clear insight into the structure of our training and validation sets.
|
||||
|
||||
Optionally, we saved our records for future reference, which could be particularly useful in large-scale projects or when troubleshooting model performance.
|
||||
|
||||
Finally, we implemented the actual model training using each split in a loop, saving our training results for further analysis and comparison.
|
||||
|
||||
K-Fold cross-validation is a robust way of making the most of your available data, and it helps you check that model performance is consistent across different data subsets. The result is a more trustworthy estimate of how well your model generalizes, and overfitting to one particular split becomes easier to spot.
|
||||
|
||||
Remember that although we used YOLO in this guide, these steps are mostly transferable to other machine learning models. Understanding these steps allows you to apply cross-validation effectively in your own machine learning projects. Happy coding!
|
||||
305
docs/en/guides/model-deployment-options.md
Normal file
|
|
@ -0,0 +1,305 @@
|
|||
---
|
||||
comments: true
|
||||
description: A guide to help determine which deployment option to choose for your YOLOv8 model, including essential considerations.
|
||||
keywords: YOLOv8, Deployment, PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, CoreML, TensorFlow, Export
|
||||
---
|
||||
|
||||
# Understanding YOLOv8’s Deployment Options
|
||||
|
||||
## Introduction
|
||||
|
||||
*Setting the Scene:* You've come a long way on your journey with YOLOv8. You've diligently collected data, meticulously annotated it, and put in the hours to train and rigorously evaluate your custom YOLOv8 model. Now, it’s time to put your model to work for your specific application, use case, or project. But there's a critical decision that stands before you: how to export and deploy your model effectively.
|
||||
|
||||
This guide walks you through YOLOv8’s deployment options and the essential factors to consider to choose the right option for your project.
|
||||
|
||||
## How to Select the Right Deployment Option for Your YOLOv8 Model
|
||||
|
||||
When it's time to deploy your YOLOv8 model, selecting a suitable export format is very important. As outlined in the [Ultralytics YOLOv8 Modes documentation](https://docs.ultralytics.com/modes/export/#usage-examples), the `model.export()` function converts your trained model into a variety of formats tailored to diverse environments and performance requirements.
|
||||
|
||||
The ideal format depends on your model's intended operational context, balancing speed, hardware constraints, and ease of integration. In the following section, we'll take a closer look at each export option, understanding when to choose each one.
|
||||
|
||||
### YOLOv8’s Deployment Options
|
||||
|
||||
Let’s walk through the different YOLOv8 deployment options. For a detailed walkthrough of the export process, visit the [Ultralytics documentation page on exporting](https://docs.ultralytics.com/modes/export/).
|
||||
|
||||
#### PyTorch
|
||||
|
||||
PyTorch is an open-source machine learning library widely used for applications in deep learning and artificial intelligence. It provides a high level of flexibility and speed, which has made it a favorite among researchers and developers.
|
||||
|
||||
- **Performance Benchmarks**: PyTorch is known for its ease of use and flexibility, which may result in a slight trade-off in raw performance when compared to other frameworks that are more specialized and optimized.
|
||||
|
||||
- **Compatibility and Integration**: Offers excellent compatibility with various data science and machine learning libraries in Python.
|
||||
|
||||
- **Community Support and Ecosystem**: One of the most vibrant communities, with extensive resources for learning and troubleshooting.
|
||||
|
||||
- **Case Studies**: Commonly used in research prototypes, many academic papers reference models deployed in PyTorch.
|
||||
|
||||
- **Maintenance and Updates**: Regular updates with active development and support for new features.
|
||||
|
||||
- **Security Considerations**: Regular patches for security issues, but security is largely dependent on the overall environment it’s deployed in.
|
||||
|
||||
- **Hardware Acceleration**: Supports CUDA for GPU acceleration, essential for speeding up model training and inference.
|
||||
|
||||
#### TorchScript
|
||||
|
||||
TorchScript extends PyTorch’s capabilities by allowing models to be exported and run in a C++ runtime environment. This makes it suitable for production environments where Python is unavailable.
|
||||
|
||||
- **Performance Benchmarks**: Can offer improved performance over native PyTorch, especially in production environments.
|
||||
|
||||
- **Compatibility and Integration**: Designed for seamless transition from PyTorch to C++ production environments, though some advanced features might not translate perfectly.
|
||||
|
||||
- **Community Support and Ecosystem**: Benefits from PyTorch’s large community but has a narrower scope of specialized developers.
|
||||
|
||||
- **Case Studies**: Widely used in industry settings where Python’s performance overhead is a bottleneck.
|
||||
|
||||
- **Maintenance and Updates**: Maintained alongside PyTorch with consistent updates.
|
||||
|
||||
- **Security Considerations**: Offers improved security by enabling the running of models in environments without full Python installations.
|
||||
|
||||
- **Hardware Acceleration**: Inherits PyTorch’s CUDA support, ensuring efficient GPU utilization.
|
||||
|
||||
#### ONNX
|
||||
|
||||
The Open Neural Network Exchange (ONNX) is a format that allows for model interoperability across different frameworks, which can be critical when deploying to various platforms.
|
||||
|
||||
- **Performance Benchmarks**: ONNX model performance varies depending on the specific runtime the model is deployed on.
|
||||
|
||||
- **Compatibility and Integration**: High interoperability across multiple platforms and hardware due to its framework-agnostic nature.
|
||||
|
||||
- **Community Support and Ecosystem**: Supported by many organizations, leading to a broad ecosystem and a variety of tools for optimization.
|
||||
|
||||
- **Case Studies**: Frequently used to move models between different machine learning frameworks, demonstrating its flexibility.
|
||||
|
||||
- **Maintenance and Updates**: As an open standard, ONNX is regularly updated to support new operations and models.
|
||||
|
||||
- **Security Considerations**: As with any cross-platform tool, it's essential to ensure secure practices in the conversion and deployment pipeline.
|
||||
|
||||
- **Hardware Acceleration**: With ONNX Runtime, models can leverage various hardware optimizations.
|
||||
|
||||
#### OpenVINO
|
||||
|
||||
OpenVINO is an Intel toolkit designed to facilitate the deployment of deep learning models across Intel hardware, enhancing performance and speed.
|
||||
|
||||
- **Performance Benchmarks**: Specifically optimized for Intel CPUs, GPUs, and VPUs, offering significant performance boosts on compatible hardware.
|
||||
|
||||
- **Compatibility and Integration**: Works best within the Intel ecosystem but also supports a range of other platforms.
|
||||
|
||||
- **Community Support and Ecosystem**: Backed by Intel, with a solid user base especially in the computer vision domain.
|
||||
|
||||
- **Case Studies**: Often utilized in IoT and edge computing scenarios where Intel hardware is prevalent.
|
||||
|
||||
- **Maintenance and Updates**: Intel regularly updates OpenVINO to support the latest deep learning models and Intel hardware.
|
||||
|
||||
- **Security Considerations**: Provides robust security features suitable for deployment in sensitive applications.
|
||||
|
||||
- **Hardware Acceleration**: Tailored for acceleration on Intel hardware, leveraging dedicated instruction sets and hardware features.
|
||||
|
||||
For more details on deployment using OpenVINO, refer to the Ultralytics Integration documentation: [Intel OpenVINO Export](https://docs.ultralytics.com/integrations/openvino/).
|
||||
|
||||
#### TensorRT
|
||||
|
||||
TensorRT is a high-performance deep learning inference optimizer and runtime from NVIDIA, ideal for applications needing speed and efficiency.
|
||||
|
||||
- **Performance Benchmarks**: Delivers top-tier performance on NVIDIA GPUs with support for high-speed inference.
|
||||
|
||||
- **Compatibility and Integration**: Best suited for NVIDIA hardware, with limited support outside this environment.
|
||||
|
||||
- **Community Support and Ecosystem**: Strong support network through NVIDIA’s developer forums and documentation.
|
||||
|
||||
- **Case Studies**: Widely adopted in industries requiring real-time inference on video and image data.
|
||||
|
||||
- **Maintenance and Updates**: NVIDIA maintains TensorRT with frequent updates to enhance performance and support new GPU architectures.
|
||||
|
||||
- **Security Considerations**: Like many NVIDIA products, it has a strong emphasis on security, but specifics depend on the deployment environment.
|
||||
|
||||
- **Hardware Acceleration**: Exclusively designed for NVIDIA GPUs, providing deep optimization and acceleration.
|
||||
|
||||
#### CoreML
|
||||
|
||||
CoreML is Apple’s machine learning framework, optimized for on-device performance in the Apple ecosystem, including iOS, macOS, watchOS, and tvOS.
|
||||
|
||||
- **Performance Benchmarks**: Optimized for on-device performance on Apple hardware with minimal battery usage.
|
||||
|
||||
- **Compatibility and Integration**: Exclusively for Apple's ecosystem, providing a streamlined workflow for iOS and macOS applications.
|
||||
|
||||
- **Community Support and Ecosystem**: Strong support from Apple and a dedicated developer community, with extensive documentation and tools.
|
||||
|
||||
- **Case Studies**: Commonly used in applications that require on-device machine learning capabilities on Apple products.
|
||||
|
||||
- **Maintenance and Updates**: Regularly updated by Apple to support the latest machine learning advancements and Apple hardware.
|
||||
|
||||
- **Security Considerations**: Benefits from Apple's focus on user privacy and data security.
|
||||
|
||||
- **Hardware Acceleration**: Takes full advantage of Apple's neural engine and GPU for accelerated machine learning tasks.
|
||||
|
||||
#### TF SavedModel
|
||||
|
||||
TF SavedModel is TensorFlow’s format for saving and serving machine learning models, particularly suited for scalable server environments.
|
||||
|
||||
- **Performance Benchmarks**: Offers scalable performance in server environments, especially when used with TensorFlow Serving.
|
||||
|
||||
- **Compatibility and Integration**: Wide compatibility across TensorFlow's ecosystem, including cloud and enterprise server deployments.
|
||||
|
||||
- **Community Support and Ecosystem**: Large community support due to TensorFlow's popularity, with a vast array of tools for deployment and optimization.
|
||||
|
||||
- **Case Studies**: Extensively used in production environments for serving deep learning models at scale.
|
||||
|
||||
- **Maintenance and Updates**: Supported by Google and the TensorFlow community, ensuring regular updates and new features.
|
||||
|
||||
- **Security Considerations**: Deployment using TensorFlow Serving includes robust security features for enterprise-grade applications.
|
||||
|
||||
- **Hardware Acceleration**: Supports various hardware accelerations through TensorFlow's backends.
|
||||
|
||||
#### TF GraphDef
|
||||
|
||||
TF GraphDef is a TensorFlow format that represents the model as a graph, which is beneficial for environments where a static computation graph is required.
|
||||
|
||||
- **Performance Benchmarks**: Provides stable performance for static computation graphs, with a focus on consistency and reliability.
|
||||
|
||||
- **Compatibility and Integration**: Easily integrates within TensorFlow's infrastructure but less flexible compared to SavedModel.
|
||||
|
||||
- **Community Support and Ecosystem**: Good support from TensorFlow's ecosystem, with many resources available for optimizing static graphs.
|
||||
|
||||
- **Case Studies**: Useful in scenarios where a static graph is necessary, such as in certain embedded systems.
|
||||
|
||||
- **Maintenance and Updates**: Regular updates alongside TensorFlow's core updates.
|
||||
|
||||
- **Security Considerations**: Ensures safe deployment with TensorFlow's established security practices.
|
||||
|
||||
- **Hardware Acceleration**: Can utilize TensorFlow's hardware acceleration options, though not as flexible as SavedModel.
|
||||
|
||||
#### TF Lite
|
||||
|
||||
TF Lite is TensorFlow’s solution for mobile and embedded device machine learning, providing a lightweight library for on-device inference.
|
||||
|
||||
- **Performance Benchmarks**: Designed for speed and efficiency on mobile and embedded devices.
|
||||
|
||||
- **Compatibility and Integration**: Can be used on a wide range of devices due to its lightweight nature.
|
||||
|
||||
- **Community Support and Ecosystem**: Backed by Google, it has a robust community and a growing number of resources for developers.
|
||||
|
||||
- **Case Studies**: Popular in mobile applications that require on-device inference with minimal footprint.
|
||||
|
||||
- **Maintenance and Updates**: Regularly updated to include the latest features and optimizations for mobile devices.
|
||||
|
||||
- **Security Considerations**: Provides a secure environment for running models on end-user devices.
|
||||
|
||||
- **Hardware Acceleration**: Supports a variety of hardware acceleration options, including GPU and DSP.
|
||||
|
||||
#### TF Edge TPU
|
||||
|
||||
TF Edge TPU is designed for high-speed, efficient computing on Google's Edge TPU hardware, perfect for IoT devices requiring real-time processing.
|
||||
|
||||
- **Performance Benchmarks**: Specifically optimized for high-speed, efficient computing on Google's Edge TPU hardware.
|
||||
|
||||
- **Compatibility and Integration**: Works exclusively with TensorFlow Lite models on Edge TPU devices.
|
||||
|
||||
- **Community Support and Ecosystem**: Growing support with resources provided by Google and third-party developers.
|
||||
|
||||
- **Case Studies**: Used in IoT devices and applications that require real-time processing with low latency.
|
||||
|
||||
- **Maintenance and Updates**: Continually improved upon to leverage the capabilities of new Edge TPU hardware releases.
|
||||
|
||||
- **Security Considerations**: Integrates with Google's robust security for IoT and edge devices.
|
||||
|
||||
- **Hardware Acceleration**: Custom-designed to take full advantage of Google Coral devices.
|
||||
|
||||
#### TF.js
|
||||
|
||||
TensorFlow.js (TF.js) is a library that brings machine learning capabilities directly to the browser, offering a new realm of possibilities for web developers and users alike. It allows for the integration of machine learning models in web applications without the need for back-end infrastructure.
|
||||
|
||||
- **Performance Benchmarks**: Enables machine learning directly in the browser with reasonable performance, depending on the client device.
|
||||
|
||||
- **Compatibility and Integration**: High compatibility with web technologies, allowing for easy integration into web applications.
|
||||
|
||||
- **Community Support and Ecosystem**: Support from a community of web and Node.js developers, with a variety of tools for deploying ML models in browsers.
|
||||
|
||||
- **Case Studies**: Ideal for interactive web applications that benefit from client-side machine learning without the need for server-side processing.
|
||||
|
||||
- **Maintenance and Updates**: Maintained by the TensorFlow team with contributions from the open-source community.
|
||||
|
||||
- **Security Considerations**: Runs within the browser's secure context, utilizing the security model of the web platform.
|
||||
|
||||
- **Hardware Acceleration**: Performance can be enhanced with web-based APIs that access hardware acceleration like WebGL.
|
||||
|
||||
#### PaddlePaddle
|
||||
|
||||
PaddlePaddle is an open-source deep learning framework developed by Baidu. It is designed to be both efficient for researchers and easy to use for developers. It's particularly popular in China and offers specialized support for Chinese language processing.
|
||||
|
||||
- **Performance Benchmarks**: Offers competitive performance with a focus on ease of use and scalability.
|
||||
|
||||
- **Compatibility and Integration**: Well-integrated within Baidu's ecosystem and supports a wide range of applications.
|
||||
|
||||
- **Community Support and Ecosystem**: While the community is smaller globally, it's rapidly growing, especially in China.
|
||||
|
||||
- **Case Studies**: Commonly used in Chinese markets and by developers looking for alternatives to other major frameworks.
|
||||
|
||||
- **Maintenance and Updates**: Regularly updated with a focus on serving Chinese language AI applications and services.
|
||||
|
||||
- **Security Considerations**: Emphasizes data privacy and security, catering to Chinese data governance standards.
|
||||
|
||||
- **Hardware Acceleration**: Supports various hardware accelerations, including Baidu's own Kunlun chips.
|
||||
|
||||
#### ncnn
|
||||
|
||||
ncnn is a high-performance neural network inference framework optimized for the mobile platform. It stands out for its lightweight nature and efficiency, making it particularly well-suited for mobile and embedded devices where resources are limited.
|
||||
|
||||
- **Performance Benchmarks**: Highly optimized for mobile platforms, offering efficient inference on ARM-based devices.
|
||||
|
||||
- **Compatibility and Integration**: Suitable for applications on mobile phones and embedded systems with ARM architecture.
|
||||
|
||||
- **Community Support and Ecosystem**: Supported by a niche but active community focused on mobile and embedded ML applications.
|
||||
|
||||
- **Case Studies**: Favored for mobile applications where efficiency and speed are critical on Android and other ARM-based systems.
|
||||
|
||||
- **Maintenance and Updates**: Continuously improved to maintain high performance on a range of ARM devices.
|
||||
|
||||
- **Security Considerations**: Focuses on running locally on the device, leveraging the inherent security of on-device processing.
|
||||
|
||||
- **Hardware Acceleration**: Tailored for ARM CPUs and GPUs, with specific optimizations for these architectures.
|
||||
|
||||
## Comparative Analysis of YOLOv8 Deployment Options
|
||||
|
||||
The following table provides a snapshot of the various deployment options available for YOLOv8 models, helping you to assess which may best fit your project needs based on several critical criteria. For an in-depth look at each deployment option's format, please see the [Ultralytics documentation page on export formats](https://docs.ultralytics.com/modes/export/#export-formats).
|
||||
|
||||
| Deployment Option | Performance Benchmarks | Compatibility and Integration | Community Support and Ecosystem | Case Studies | Maintenance and Updates | Security Considerations | Hardware Acceleration |
|--------------------|------------------------|-------------------------------|--------------------------------|--------------|------------------------|-------------------------|-----------------------|
| PyTorch | Good flexibility; may trade off raw performance | Excellent with Python libraries | Extensive resources and community | Research and prototypes | Regular, active development | Dependent on deployment environment | CUDA support for GPU acceleration |
| TorchScript | Better for production than PyTorch | Smooth transition from PyTorch to C++ | Specialized but narrower than PyTorch | Industry where Python is a bottleneck | Consistent updates with PyTorch | Improved security without full Python | Inherits CUDA support from PyTorch |
| ONNX | Variable depending on runtime | High across different frameworks | Broad ecosystem, supported by many orgs | Flexibility across ML frameworks | Regular updates for new operations | Ensure secure conversion and deployment practices | Various hardware optimizations |
| OpenVINO | Optimized for Intel hardware | Best within Intel ecosystem | Solid in computer vision domain | IoT and edge with Intel hardware | Regular updates for Intel hardware | Robust features for sensitive applications | Tailored for Intel hardware |
| TensorRT | Top-tier on NVIDIA GPUs | Best for NVIDIA hardware | Strong network through NVIDIA | Real-time video and image inference | Frequent updates for new GPUs | Emphasis on security | Designed for NVIDIA GPUs |
| CoreML | Optimized for on-device Apple hardware | Exclusive to Apple ecosystem | Strong Apple and developer support | On-device ML on Apple products | Regular Apple updates | Focus on privacy and security | Apple neural engine and GPU |
| TF SavedModel | Scalable in server environments | Wide compatibility in TensorFlow ecosystem | Large support due to TensorFlow popularity | Serving models at scale | Regular updates by Google and community | Robust features for enterprise | Various hardware accelerations |
| TF GraphDef | Stable for static computation graphs | Integrates well with TensorFlow infrastructure | Resources for optimizing static graphs | Scenarios requiring static graphs | Updates alongside TensorFlow core | Established TensorFlow security practices | TensorFlow acceleration options |
| TF Lite | Speed and efficiency on mobile/embedded | Wide range of device support | Robust community, Google backed | Mobile applications with minimal footprint | Latest features for mobile | Secure environment on end-user devices | GPU and DSP among others |
| TF Edge TPU | Optimized for Google's Edge TPU hardware | Exclusive to Edge TPU devices | Growing with Google and third-party resources | IoT devices requiring real-time processing | Improvements for new Edge TPU hardware | Google's robust IoT security | Custom-designed for Google Coral |
| TF.js | Reasonable in-browser performance | High with web technologies | Web and Node.js developers support | Interactive web applications | TensorFlow team and community contributions | Web platform security model | Enhanced with WebGL and other APIs |
| PaddlePaddle | Competitive, easy to use and scalable | Baidu ecosystem, wide application support | Rapidly growing, especially in China | Chinese market and language processing | Focus on Chinese AI applications | Emphasizes data privacy and security | Including Baidu's Kunlun chips |
| ncnn | Optimized for mobile ARM-based devices | Mobile and embedded ARM systems | Niche but active mobile/embedded ML community | Android and ARM systems efficiency | High performance maintenance on ARM | On-device security advantages | ARM CPUs and GPUs optimizations |
|
||||
|
||||
This comparative analysis gives you a high-level overview. For deployment, it's essential to consider the specific requirements and constraints of your project, and consult the detailed documentation and resources available for each option.
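To tie the table back to the export call, the sketch below maps each deployment option to the string that `model.export()` is understood to accept as its `format` argument. Treat the mapping as an assumption to verify against the export documentation for your installed version. The helper `export_format_for` is a hypothetical name introduced here for illustration.

```python
# Deployment option -> `format` argument for model.export().
# Assumed from the Ultralytics export documentation; verify for your version.
EXPORT_FORMATS = {
    "TorchScript": "torchscript",
    "ONNX": "onnx",
    "OpenVINO": "openvino",
    "TensorRT": "engine",
    "CoreML": "coreml",
    "TF SavedModel": "saved_model",
    "TF GraphDef": "pb",
    "TF Lite": "tflite",
    "TF Edge TPU": "edgetpu",
    "TF.js": "tfjs",
    "PaddlePaddle": "paddle",
    "ncnn": "ncnn",
}


def export_format_for(option):
    """Return the export format string for a deployment option name (case-insensitive)."""
    lookup = {name.lower(): fmt for name, fmt in EXPORT_FORMATS.items()}
    try:
        return lookup[option.lower()]
    except KeyError:
        raise ValueError(f"Unknown deployment option: {option!r}") from None


print(export_format_for("TensorRT"))

# Typical use (requires the ultralytics package and a trained model):
# from ultralytics import YOLO
# model = YOLO("path/to/best.pt")
# model.export(format=export_format_for("ONNX"))
```

PyTorch is absent from the mapping because the trained `.pt` weights are already in that format and need no export step.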
|
||||
|
||||
## Community and Support
|
||||
|
||||
When you're getting started with YOLOv8, having a helpful community and support can make a significant impact. Here's how to connect with others who share your interests and get the assistance you need.
|
||||
|
||||
### Engage with the Broader Community
|
||||
|
||||
- **GitHub Discussions:** The YOLOv8 repository on GitHub has a "Discussions" section where you can ask questions, report issues, and suggest improvements.
|
||||
|
||||
- **Ultralytics Discord Server:** Ultralytics has a [Discord server](https://ultralytics.com/discord/) where you can interact with other users and developers.
|
||||
|
||||
### Official Documentation and Resources
|
||||
|
||||
- **Ultralytics YOLOv8 Docs:** The [official documentation](https://docs.ultralytics.com/) provides a comprehensive overview of YOLOv8, along with guides on installation, usage, and troubleshooting.
|
||||
|
||||
These resources will help you tackle challenges and stay updated on the latest trends and best practices in the YOLOv8 community.
|
||||
|
||||
## Conclusion
|
||||
|
||||
In this guide, we've explored the different deployment options for YOLOv8. We've also discussed the important factors to consider when making your choice. These options allow you to customize your model for various environments and performance requirements, making it suitable for real-world applications.
|
||||
|
||||
Don't forget that the YOLOv8 and Ultralytics community is a valuable source of help. Connect with other developers and experts to learn unique tips and solutions you might not find in regular documentation. Keep seeking knowledge, exploring new ideas, and sharing your experiences.
|
||||
|
||||
Happy deploying!
|
||||
196
docs/en/guides/raspberry-pi.md
Normal file
|
|
@ -0,0 +1,196 @@
|
|||
---
|
||||
comments: true
|
||||
description: Quick start guide to setting up YOLO on a Raspberry Pi with a Pi Camera using the libcamera stack. Detailed comparison between Raspberry Pi 3, 4 and 5 models.
|
||||
keywords: Ultralytics, YOLO, Raspberry Pi, Pi Camera, libcamera, quick start guide, Raspberry Pi 4 vs Raspberry Pi 5, YOLO on Raspberry Pi, hardware setup, machine learning, AI
|
||||
---
|
||||
|
||||
# Quick Start Guide: Raspberry Pi and Pi Camera with YOLOv5 and YOLOv8
|
||||
|
||||
This comprehensive guide aims to expedite your journey with YOLO object detection models on a [Raspberry Pi](https://www.raspberrypi.com/) using a [Pi Camera](https://www.raspberrypi.com/products/camera-module-v2/). Whether you're a student, hobbyist, or a professional, this guide is designed to get you up and running in less than 30 minutes. The instructions here are rigorously tested to minimize setup issues, allowing you to focus on utilizing YOLO for your specific projects.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/yul4gq_LrOI"
|
||||
title="Introducing Raspberry Pi 5" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Raspberry Pi 5 updates and improvements.
|
||||
</p>
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- Raspberry Pi 3, 4 or 5
|
||||
- Pi Camera
|
||||
- 64-bit Raspberry Pi Operating System
|
||||
|
||||
Connect the Pi Camera to your Raspberry Pi via a CSI cable and install the 64-bit Raspberry Pi Operating System. Verify your camera with the following command:
|
||||
|
||||
```bash
|
||||
libcamera-hello
|
||||
```
|
||||
|
||||
You should see a video feed from your camera.
|
||||
|
||||
## Choose Your YOLO Version: YOLOv5 or YOLOv8
|
||||
|
||||
This guide offers you the flexibility to start with either [YOLOv5](https://github.com/ultralytics/yolov5) or [YOLOv8](https://github.com/ultralytics/ultralytics). Both versions have their unique advantages and use-cases. The choice is yours, but remember, the guide's aim is not just quick setup but also a robust foundation for your future work in object detection.
|
||||
|
||||
## Hardware Specifics: At a Glance
|
||||
|
||||
To assist you in making an informed hardware decision, we've summarized the key hardware specifics of Raspberry Pi 3, 4, and 5 in the table below:
|
||||
|
||||
| Feature | Raspberry Pi 3 | Raspberry Pi 4 | Raspberry Pi 5 |
|----------------------------|------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------|----------------------------------------------------------------------|
| **CPU** | 1.2GHz Quad-Core ARM Cortex-A53 | 1.5GHz Quad-core 64-bit ARM Cortex-A72 | 2.4GHz Quad-core 64-bit Arm Cortex-A76 |
| **RAM** | 1GB LPDDR2 | 2GB, 4GB or 8GB LPDDR4 | *Details not yet available* |
| **USB Ports** | 4 x USB 2.0 | 2 x USB 2.0, 2 x USB 3.0 | 2 x USB 3.0, 2 x USB 2.0 |
| **Network** | Ethernet & Wi-Fi 802.11n | Gigabit Ethernet & Wi-Fi 802.11ac | Gigabit Ethernet with PoE+ support, Dual-band 802.11ac Wi-Fi® |
| **Performance** | Slower, may require lighter YOLO models | Faster, can run complex YOLO models | *Details not yet available* |
| **Power Requirement** | 2.5A power supply | 3.0A USB-C power supply | *Details not yet available* |
| **Official Documentation** | [Link](https://www.raspberrypi.org/documentation/hardware/raspberrypi/bcm2837/README.md) | [Link](https://www.raspberrypi.org/documentation/hardware/raspberrypi/bcm2711/README.md) | [Link](https://www.raspberrypi.com/news/introducing-raspberry-pi-5/) |
|
||||
|
||||
Please make sure to follow the instructions specific to your Raspberry Pi model to ensure a smooth setup process.
|
||||
|
||||
## Quick Start with YOLOv5
|
||||
|
||||
This section outlines how to set up YOLOv5 on a Raspberry Pi with a Pi Camera. These steps are designed to be compatible with the libcamera camera stack introduced in Raspberry Pi OS Bullseye.
|
||||
|
||||
### Install Necessary Packages
|
||||
|
||||
1. Update the Raspberry Pi:
|
||||
|
||||
```bash
|
||||
sudo apt-get update
|
||||
sudo apt-get upgrade -y
|
||||
sudo apt-get autoremove -y
|
||||
```
|
||||
|
||||
2. Clone the YOLOv5 repository:
|
||||
|
||||
```bash
|
||||
cd ~
|
||||
git clone https://github.com/Ultralytics/yolov5.git
|
||||
```
|
||||
|
||||
3. Install the required dependencies:
|
||||
|
||||
```bash
|
||||
cd ~/yolov5
|
||||
pip3 install -r requirements.txt
|
||||
```
|
||||
|
||||
4. For Raspberry Pi 3, install compatible versions of PyTorch and Torchvision (skip for Raspberry Pi 4):
|
||||
|
||||
```bash
|
||||
pip3 uninstall torch torchvision
|
||||
pip3 install torch==1.11.0 torchvision==0.12.0
|
||||
```
|
||||
|
||||
### Modify `detect.py`
|
||||
|
||||
To enable TCP streams via SSH or the CLI, minor modifications are needed in `detect.py`.
|
||||
|
||||
1. Open `detect.py`:
|
||||
|
||||
```bash
|
||||
sudo nano ~/yolov5/detect.py
|
||||
```
|
||||
|
||||
2. Find and modify the `is_url` line to accept TCP streams:
|
||||
|
||||
```python
|
||||
is_url = source.lower().startswith(('rtsp://', 'rtmp://', 'http://', 'https://', 'tcp://'))
|
||||
```
|
||||
|
||||
3. Comment out the `view_img` line:
|
||||
|
||||
```python
|
||||
# view_img = check_imshow(warn=True)
|
||||
```
|
||||
|
||||
4. Save and exit:
|
||||
|
||||
```bash
|
||||
CTRL + O -> ENTER -> CTRL + X
|
||||
```
|
||||
|
||||
### Initiate TCP Stream with Libcamera
|
||||
|
||||
1. Start the TCP stream:
|
||||
|
||||
```bash
|
||||
libcamera-vid -n -t 0 --width 1280 --height 960 --framerate 1 --inline --listen -o tcp://127.0.0.1:8888
|
||||
```
|
||||
|
||||
Keep this terminal session running for the next steps.
|
||||
|
||||
### Perform YOLOv5 Inference
|
||||
|
||||
1. Run the YOLOv5 detection:
|
||||
|
||||
```bash
|
||||
cd ~/yolov5
|
||||
python3 detect.py --source=tcp://127.0.0.1:8888
|
||||
```
|
||||
|
||||
## Quick Start with YOLOv8
|
||||
|
||||
Follow this section if you are interested in setting up YOLOv8 instead. The steps are quite similar but are tailored for YOLOv8's specific needs.
|
||||
|
||||
### Install Necessary Packages
|
||||
|
||||
1. Update the Raspberry Pi:
|
||||
|
||||
```bash
|
||||
sudo apt-get update
|
||||
sudo apt-get upgrade -y
|
||||
sudo apt-get autoremove -y
|
||||
```
|
||||
|
||||
2. Install the `ultralytics` Python package:
|
||||
|
||||
```bash
|
||||
pip3 install ultralytics
|
||||
```
|
||||
|
||||
3. Reboot:
|
||||
|
||||
```bash
|
||||
sudo reboot
|
||||
```
|
||||
|
||||
### Initiate TCP Stream with Libcamera
|
||||
|
||||
1. Start the TCP stream:
|
||||
|
||||
```bash
|
||||
libcamera-vid -n -t 0 --width 1280 --height 960 --framerate 1 --inline --listen -o tcp://127.0.0.1:8888
|
||||
```
|
||||
|
||||
### Perform YOLOv8 Inference
|
||||
|
||||
To perform inference with YOLOv8, you can use the following Python code snippet:
|
||||
|
||||
```python
from ultralytics import YOLO

model = YOLO('yolov8n.pt')
results = model('tcp://127.0.0.1:8888', stream=True)  # generator of Results objects

# Iterating the generator runs inference frame by frame until the stream ends
for result in results:
    boxes = result.boxes  # detected bounding boxes
    probs = result.probs  # classification probabilities (None for detection models)
```
|
||||
|
||||
## Next Steps
|
||||
|
||||
Congratulations on successfully setting up YOLO on your Raspberry Pi! For further learning and support, visit [Ultralytics](https://ultralytics.com/) and [Kashmir World Foundation](https://www.kashmirworldfoundation.org/).
|
||||
|
||||
## Acknowledgements and Citations
|
||||
|
||||
This guide was initially created by Daan Eeltink for Kashmir World Foundation, an organization dedicated to the use of YOLO for the conservation of endangered species. We acknowledge their pioneering work and educational focus in the realm of object detection technologies.
|
||||
|
||||
For more information about Kashmir World Foundation's activities, you can visit their [website](https://www.kashmirworldfoundation.org/).
|
||||
185
docs/en/guides/sahi-tiled-inference.md
Normal file
|
|
@ -0,0 +1,185 @@
|
|||
---
|
||||
comments: true
|
||||
description: A comprehensive guide on how to use YOLOv8 with SAHI for standard and sliced inference in object detection tasks.
|
||||
keywords: YOLOv8, SAHI, Sliced Inference, Object Detection, Ultralytics, Large Scale Image Analysis, High-Resolution Imagery
|
||||
---
|
||||
|
||||
# Ultralytics Docs: Using YOLOv8 with SAHI for Sliced Inference
|
||||
|
||||
Welcome to the Ultralytics documentation on how to use YOLOv8 with [SAHI](https://github.com/obss/sahi) (Slicing Aided Hyper Inference). This guide covers what SAHI is, why sliced inference matters for large-scale applications, and how to integrate it with YOLOv8 for better object detection performance.
|
||||
|
||||
<p align="center">
|
||||
<img width="1024" src="https://raw.githubusercontent.com/obss/sahi/main/resources/sliced_inference.gif" alt="SAHI Sliced Inference Overview">
|
||||
</p>
|
||||
|
||||
## Introduction to SAHI
|
||||
|
||||
SAHI (Slicing Aided Hyper Inference) is an innovative library designed to optimize object detection algorithms for large-scale and high-resolution imagery. Its core functionality lies in partitioning images into manageable slices, running object detection on each slice, and then stitching the results back together. SAHI is compatible with a range of object detection models, including the YOLO series, thereby offering flexibility while ensuring optimized use of computational resources.
|
||||
|
||||
### Key Features of SAHI
|
||||
|
||||
- **Seamless Integration**: SAHI integrates effortlessly with YOLO models, meaning you can start slicing and detecting without a lot of code modification.
|
||||
- **Resource Efficiency**: By breaking down large images into smaller parts, SAHI optimizes the memory usage, allowing you to run high-quality detection on hardware with limited resources.
|
||||
- **High Accuracy**: SAHI maintains detection accuracy by merging overlapping detection boxes from neighboring slices during the stitching process.
|
||||
|
||||
## What is Sliced Inference?
|
||||
|
||||
Sliced Inference refers to the practice of subdividing a large or high-resolution image into smaller segments (slices), conducting object detection on these slices, and then recompiling the slices to reconstruct the object locations on the original image. This technique is invaluable in scenarios where computational resources are limited or when working with extremely high-resolution images that could otherwise lead to memory issues.
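To make the slicing step concrete, here is a minimal sketch of how overlapping slice windows could be laid out and how a box found inside a slice could be mapped back to the original image. The function names `slice_boxes` and `to_full_image` are hypothetical and written for illustration only; this is a simplified version of the idea, not SAHI's internal implementation.

```python
def slice_boxes(image_w, image_h, slice_w, slice_h, overlap_w_ratio=0.2, overlap_h_ratio=0.2):
    """Return (x1, y1, x2, y2) windows that cover the image with the given overlap."""
    step_x = max(1, int(slice_w * (1 - overlap_w_ratio)))
    step_y = max(1, int(slice_h * (1 - overlap_h_ratio)))
    boxes = []
    y = 0
    while True:
        y2 = min(y + slice_h, image_h)
        y1 = max(0, y2 - slice_h)  # shift the last row back so it stays full-size
        x = 0
        while True:
            x2 = min(x + slice_w, image_w)
            x1 = max(0, x2 - slice_w)  # same for the last column
            boxes.append((x1, y1, x2, y2))
            if x2 >= image_w:
                break
            x += step_x
        if y2 >= image_h:
            break
        y += step_y
    return boxes


def to_full_image(box, slice_box):
    """Shift a box detected inside a slice back into original-image coordinates."""
    offset_x, offset_y = slice_box[0], slice_box[1]
    x1, y1, x2, y2 = box
    return (x1 + offset_x, y1 + offset_y, x2 + offset_x, y2 + offset_y)


windows = slice_boxes(image_w=640, image_h=480, slice_w=256, slice_h=256)
print(len(windows), windows[0], windows[-1])
print(to_full_image((10, 20, 50, 60), windows[-1]))
```

With a 20% overlap, neighboring windows share a strip of pixels, so an object cut by one slice border is likely to appear whole in the adjacent slice.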
|
||||
|
||||
### Benefits of Sliced Inference
|
||||
|
||||
- **Reduced Computational Burden**: Smaller image slices are faster to process, and they consume less memory, enabling smoother operation on lower-end hardware.
|
||||
|
||||
- **Preserved Detection Quality**: Since each slice is treated independently, there is no reduction in the quality of object detection, provided the slices are large enough to capture the objects of interest.
|
||||
|
||||
- **Enhanced Scalability**: The technique allows for object detection to be more easily scaled across different sizes and resolutions of images, making it ideal for a wide range of applications from satellite imagery to medical diagnostics.
|
||||
|
||||
<table border="0">
|
||||
<tr>
|
||||
<th>YOLOv8 without SAHI</th>
|
||||
<th>YOLOv8 with SAHI</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><img src="https://user-images.githubusercontent.com/26833433/266123241-260a9740-5998-4e9a-ad04-b39b7767e731.png" alt="YOLOv8 without SAHI" width="640"></td>
|
||||
<td><img src="https://user-images.githubusercontent.com/26833433/266123245-55f696ad-ec74-4e71-9155-c211d693bb69.png" alt="YOLOv8 with SAHI" width="640"></td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
## Installation and Preparation
|
||||
|
||||
### Installation
|
||||
|
||||
To get started, install the latest versions of SAHI and Ultralytics:
|
||||
|
||||
```bash
|
||||
pip install -U ultralytics sahi
|
||||
```
|
||||
|
||||
### Import Modules and Download Resources
|
||||
|
||||
Here's how to import the necessary modules and download a YOLOv8 model and some test images:
|
||||
|
||||
```python
|
||||
from sahi.utils.yolov8 import download_yolov8s_model
|
||||
from sahi import AutoDetectionModel
|
||||
from sahi.utils.cv import read_image
|
||||
from sahi.utils.file import download_from_url
|
||||
from sahi.predict import get_prediction, get_sliced_prediction, predict
|
||||
from pathlib import Path
|
||||
from IPython.display import Image
|
||||
|
||||
# Download YOLOv8 model
|
||||
yolov8_model_path = "models/yolov8s.pt"
|
||||
download_yolov8s_model(yolov8_model_path)
|
||||
|
||||
# Download test images
|
||||
download_from_url('https://raw.githubusercontent.com/obss/sahi/main/demo/demo_data/small-vehicles1.jpeg', 'demo_data/small-vehicles1.jpeg')
|
||||
download_from_url('https://raw.githubusercontent.com/obss/sahi/main/demo/demo_data/terrain2.png', 'demo_data/terrain2.png')
|
||||
```
|
||||
|
||||
## Standard Inference with YOLOv8
|
||||
|
||||
### Instantiate the Model
|
||||
|
||||
You can instantiate a YOLOv8 model for object detection like this:
|
||||
|
||||
```python
|
||||
detection_model = AutoDetectionModel.from_pretrained(
|
||||
model_type='yolov8',
|
||||
model_path=yolov8_model_path,
|
||||
confidence_threshold=0.3,
|
||||
device="cpu", # or 'cuda:0'
|
||||
)
|
||||
```
|
||||
|
||||
### Perform Standard Prediction
|
||||
|
||||
Perform standard inference using an image path or a numpy image.
|
||||
|
||||
```python
|
||||
# With an image path
|
||||
result = get_prediction("demo_data/small-vehicles1.jpeg", detection_model)
|
||||
|
||||
# With a numpy image
|
||||
result = get_prediction(read_image("demo_data/small-vehicles1.jpeg"), detection_model)
|
||||
```
|
||||
|
||||
### Visualize Results
|
||||
|
||||
Export and visualize the predicted bounding boxes and masks:
|
||||
|
||||
```python
|
||||
result.export_visuals(export_dir="demo_data/")
|
||||
Image("demo_data/prediction_visual.png")
|
||||
```
|
||||
|
||||
## Sliced Inference with YOLOv8
|
||||
|
||||
Perform sliced inference by specifying the slice dimensions and overlap ratios:
|
||||
|
||||
```python
|
||||
result = get_sliced_prediction(
|
||||
"demo_data/small-vehicles1.jpeg",
|
||||
detection_model,
|
||||
slice_height=256,
|
||||
slice_width=256,
|
||||
overlap_height_ratio=0.2,
|
||||
overlap_width_ratio=0.2
|
||||
)
|
||||
```
|
||||
|
||||
## Handling Prediction Results
|
||||
|
||||
SAHI provides a `PredictionResult` object, which can be converted into various annotation formats:
|
||||
|
||||
```python
|
||||
# Access the object prediction list
|
||||
object_prediction_list = result.object_prediction_list
|
||||
|
||||
# Convert to COCO annotation, COCO prediction, imantics, and fiftyone formats
|
||||
result.to_coco_annotations()[:3]
|
||||
result.to_coco_predictions(image_id=1)[:3]
|
||||
result.to_imantics_annotations()[:3]
|
||||
result.to_fiftyone_detections()[:3]
|
||||
```
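For orientation, COCO-style predictions store each box as `[x_min, y_min, width, height]` together with an image ID, a category ID, and a score. The sketch below builds such a record by hand from a corner-format box; the helper name `to_coco_prediction` is hypothetical, and the dictionaries SAHI returns may contain additional fields.

```python
def to_coco_prediction(box_xyxy, score, category_id, image_id):
    """Build a COCO-style prediction dict from an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box_xyxy
    return {
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x1, y1, x2 - x1, y2 - y1],  # COCO uses top-left corner plus width/height
        "score": score,
    }


record = to_coco_prediction((100, 50, 180, 110), score=0.87, category_id=2, image_id=1)
print(record)
```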
|
||||
|
||||
## Batch Prediction
|
||||
|
||||
For batch prediction on a directory of images:
|
||||
|
||||
```python
|
||||
predict(
|
||||
model_type="yolov8",
|
||||
model_path="path/to/yolov8n.pt",
|
||||
model_device="cpu", # or 'cuda:0'
|
||||
model_confidence_threshold=0.4,
|
||||
source="path/to/dir",
|
||||
slice_height=256,
|
||||
slice_width=256,
|
||||
overlap_height_ratio=0.2,
|
||||
overlap_width_ratio=0.2,
|
||||
)
|
||||
```
|
||||
|
||||
That's it! Now you're equipped to use YOLOv8 with SAHI for both standard and sliced inference.
|
||||
|
||||
## Citations and Acknowledgments
|
||||
|
||||
If you use SAHI in your research or development work, please cite the original SAHI paper and acknowledge the authors:
|
||||
|
||||
!!! note ""

    === "BibTeX"

        ```bibtex
        @article{akyon2022sahi,
          title={Slicing Aided Hyper Inference and Fine-tuning for Small Object Detection},
          author={Akyon, Fatih Cagatay and Altinuc, Sinan Onur and Temizel, Alptekin},
          journal={2022 IEEE International Conference on Image Processing (ICIP)},
          doi={10.1109/ICIP46576.2022.9897990},
          pages={966-970},
          year={2022}
        }
        ```
|
||||
|
||||
We extend our thanks to the SAHI research group for creating and maintaining this invaluable resource for the computer vision community. For more information about SAHI and its creators, visit the [SAHI GitHub repository](https://github.com/obss/sahi).
|
||||
137
docs/en/guides/triton-inference-server.md
Normal file
|
|
@ -0,0 +1,137 @@
|
|||
---
|
||||
comments: true
|
||||
description: A step-by-step guide on integrating Ultralytics YOLOv8 with Triton Inference Server for scalable and high-performance deep learning inference deployments.
|
||||
keywords: YOLOv8, Triton Inference Server, ONNX, Deep Learning Deployment, Scalable Inference, Ultralytics, NVIDIA, Object Detection, Cloud Inferencing
|
||||
---
|
||||
|
||||
# Triton Inference Server with Ultralytics YOLOv8
|
||||
|
||||
The [Triton Inference Server](https://developer.nvidia.com/nvidia-triton-inference-server) (formerly known as TensorRT Inference Server) is an open-source software solution developed by NVIDIA. It provides a cloud inferencing solution optimized for NVIDIA GPUs. Triton simplifies the deployment of AI models at scale in production. Integrating Ultralytics YOLOv8 with Triton Inference Server allows you to deploy scalable, high-performance deep learning inference workloads. This guide provides steps to set up and test the integration.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/NQDtfSi5QF4"
|
||||
title="Getting Started with NVIDIA Triton Inference Server" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Getting Started with NVIDIA Triton Inference Server.
|
||||
</p>
|
||||
|
||||
## What is Triton Inference Server?
|
||||
|
||||
Triton Inference Server is designed to deploy a variety of AI models in production. It supports a wide range of deep learning and machine learning frameworks, including TensorFlow, PyTorch, ONNX Runtime, and many others. Its primary use cases are:
|
||||
|
||||
- Serving multiple models from a single server instance.
|
||||
- Dynamic model loading and unloading without server restart.
|
||||
- Ensemble inferencing, allowing multiple models to be used together to achieve results.
|
||||
- Model versioning for A/B testing and rolling updates.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
Ensure you have the following prerequisites before proceeding:
|
||||
|
||||
- Docker installed on your machine.
|
||||
- Install `tritonclient`:
|
||||
```bash
|
||||
pip install "tritonclient[all]"
|
||||
```
|
||||
|
||||
## Exporting YOLOv8 to ONNX Format
|
||||
|
||||
Before deploying the model on Triton, it must be exported to the ONNX format. ONNX (Open Neural Network Exchange) is a format that allows models to be transferred between different deep learning frameworks. Use the `export` function from the `YOLO` class:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load an official model
|
||||
|
||||
# Export the model
|
||||
onnx_file = model.export(format='onnx', dynamic=True)
|
||||
```
|
||||
|
||||
## Setting Up Triton Model Repository
|
||||
|
||||
The Triton Model Repository is a storage location where Triton can access and load models.
|
||||
|
||||
1. Create the necessary directory structure:
|
||||
|
||||
```python
|
||||
from pathlib import Path
|
||||
|
||||
# Define paths
|
||||
triton_repo_path = Path('tmp') / 'triton_repo'
|
||||
triton_model_path = triton_repo_path / 'yolo'
|
||||
|
||||
# Create directories
|
||||
(triton_model_path / '1').mkdir(parents=True, exist_ok=True)
|
||||
```
|
||||
|
||||
2. Move the exported ONNX model to the Triton repository:
|
||||
|
||||
```python
|
||||
from pathlib import Path
|
||||
|
||||
# Move ONNX model to Triton Model path
|
||||
Path(onnx_file).rename(triton_model_path / '1' / 'model.onnx')
|
||||
|
||||
# Create config file
|
||||
(triton_model_path / 'config.pbtxt').touch()
|
||||
```
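Before starting the server, it can help to confirm that the repository follows the `<repo>/<model>/<version>/model.onnx` layout used in this guide. The following is a small sketch of such a check; `check_triton_layout` is a hypothetical helper, and it runs against a temporary directory here so that it does not touch your real files.

```python
import tempfile
from pathlib import Path


def check_triton_layout(repo_path, model_name, version="1"):
    """Return a list of problems found in a <repo>/<model>/<version>/model.onnx layout."""
    model_dir = Path(repo_path) / model_name
    problems = []
    if not (model_dir / version / "model.onnx").is_file():
        problems.append(f"missing {model_name}/{version}/model.onnx")
    if not (model_dir / "config.pbtxt").is_file():
        problems.append(f"missing {model_name}/config.pbtxt")
    return problems


# Demonstrate on a throwaway directory that mirrors the structure above
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "triton_repo"
    (repo / "yolo" / "1").mkdir(parents=True)
    before = check_triton_layout(repo, "yolo")  # both files still missing
    (repo / "yolo" / "1" / "model.onnx").touch()
    (repo / "yolo" / "config.pbtxt").touch()
    after = check_triton_layout(repo, "yolo")  # layout complete

print(before)
print(after)
```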
|
||||
|
||||
## Running Triton Inference Server
|
||||
|
||||
Run the Triton Inference Server using Docker:
|
||||
|
||||
```python
import contextlib
import subprocess
import time

from tritonclient.http import InferenceServerClient

# Define image https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver
tag = 'nvcr.io/nvidia/tritonserver:23.09-py3'  # 6.4 GB
model_name = 'yolo'  # must match the model directory name in the repository

# Pull the image
subprocess.call(f'docker pull {tag}', shell=True)

# Run the Triton server and capture the container ID
# (Docker bind mounts need an absolute host path, hence .resolve())
container_id = subprocess.check_output(
    f'docker run -d --rm -v {triton_repo_path.resolve()}:/models -p 8000:8000 {tag} tritonserver --model-repository=/models',
    shell=True).decode('utf-8').strip()

# Create a client for the Triton server
triton_client = InferenceServerClient(url='localhost:8000', verbose=False, ssl=False)

# Wait until model is ready
for _ in range(10):
    with contextlib.suppress(Exception):
        assert triton_client.is_model_ready(model_name)
        break
    time.sleep(1)
```
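The readiness loop above continues silently even if the model never becomes ready. As one possible alternative, the sketch below wraps the polling in a small helper that reports success or failure. `wait_until` is a hypothetical helper, and `fake_probe` is a stand-in so the example runs without a server.

```python
import time


def wait_until(predicate, timeout=10.0, interval=1.0):
    """Poll predicate() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if predicate():
                return True
        except Exception:
            pass  # treat errors (e.g. connection refused) as "not ready yet"
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a readiness probe: succeeds on the third call
calls = {"n": 0}


def fake_probe():
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until(fake_probe, timeout=5.0, interval=0.01)
print(ready, calls["n"])
```

With a running server, the predicate could be something like `lambda: triton_client.is_model_ready('yolo')`, and the caller can stop early when `wait_until` returns `False`.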
|
||||
|
||||
Then run inference using the Triton Server model:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load the Triton Server model
|
||||
model = YOLO('http://localhost:8000/yolo', task='detect')
|
||||
|
||||
# Run inference on the server
|
||||
results = model('path/to/image.jpg')
|
||||
```
|
||||
|
||||
Clean up the container:
|
||||
|
||||
```python
|
||||
# Kill and remove the container at the end of the test
|
||||
subprocess.call(f'docker kill {container_id}', shell=True)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
By following the above steps, you can deploy and run Ultralytics YOLOv8 models efficiently on Triton Inference Server, providing a scalable and high-performance solution for deep learning inference tasks. If you face any issues or have further queries, refer to the [official Triton documentation](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/index.html) or reach out to the Ultralytics community for support.
|
||||
276
docs/en/guides/yolo-common-issues.md
Normal file
|
|
@ -0,0 +1,276 @@
|
|||
---
|
||||
comments: true
|
||||
description: A comprehensive guide to troubleshooting common issues encountered while working with YOLOv8 in the Ultralytics ecosystem.
|
||||
keywords: Troubleshooting, Ultralytics, YOLOv8, Installation Errors, Training Data, Model Performance, Hyperparameter Tuning, Deployment
|
||||
---
|
||||
|
||||
# Troubleshooting Common YOLO Issues
|
||||
|
||||
<p align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/273067258-7c1b9aee-b4e8-43b5-befd-588d4f0bd361.png" alt="YOLO Common Issues Image">
|
||||
</p>
|
||||
|
||||
## Introduction
|
||||
|
||||
This guide serves as a comprehensive aid for troubleshooting common issues encountered while working with YOLOv8 on your Ultralytics projects. Navigating through these issues can be a breeze with the right guidance, ensuring your projects remain on track without unnecessary delays.
|
||||
|
||||
## Common Issues
|
||||
|
||||
### Installation Errors
|
||||
|
||||
Installation errors can arise for various reasons, such as incompatible versions, missing dependencies, or incorrect environment setups. First, work through the following checklist:
|
||||
|
||||
- You're using Python 3.8 or later as recommended.
|
||||
|
||||
- Ensure that you have the correct version of PyTorch (1.8 or later) installed.
|
||||
|
||||
- Consider using virtual environments to avoid conflicts.
|
||||
|
||||
- Follow the [official installation guide](https://docs.ultralytics.com/quickstart/) step by step.
|
||||
|
||||
Additionally, here are some common installation issues users have encountered, along with their respective solutions:
|
||||
|
||||
- Import Errors or Dependency Issues - If you're getting errors during the import of YOLOv8, or you're having issues related to dependencies, consider the following troubleshooting steps:
|
||||
|
||||
- **Fresh Installation**: Sometimes, starting with a fresh installation can resolve unexpected issues, especially with libraries like Ultralytics, where updates might introduce changes to the file tree structure or functionalities.
|
||||
|
||||
- **Update Regularly**: Ensure you're using the latest version of the library. Older versions might not be compatible with recent updates, leading to potential conflicts or issues.
|
||||
|
||||
- **Check Dependencies**: Verify that all required dependencies are correctly installed and are of the compatible versions.
|
||||
|
||||
- **Review Changes**: If you initially cloned or installed an older version, be aware that significant updates might affect the library's structure or functionalities. Always refer to the official documentation or changelogs to understand any major changes.
|
||||
|
||||
- Remember, keeping your libraries and dependencies up-to-date is crucial for a smooth and error-free experience.
|
||||
|
||||
- Running YOLOv8 on GPU - If you're having trouble running YOLOv8 on GPU, consider the following troubleshooting steps:
|
||||
|
||||
- **Verify CUDA Compatibility and Installation**: Ensure your GPU is CUDA compatible and that CUDA is correctly installed. Use the `nvidia-smi` command to check the status of your NVIDIA GPU and CUDA version.
|
||||
|
||||
- **Check PyTorch and CUDA Integration**: Ensure PyTorch can utilize CUDA by running `import torch; print(torch.cuda.is_available())` in a Python terminal. If it returns 'True', PyTorch is set up to use CUDA.
|
||||
|
||||
- **Environment Activation**: Ensure you're in the correct environment where all necessary packages are installed.
|
||||
|
||||
- **Update Your Packages**: Outdated packages might not be compatible with your GPU. Keep them updated.
|
||||
|
||||
- **Program Configuration**: Check if the program or code specifies GPU usage. In YOLOv8, this might be in the settings or configuration.
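The GPU checks above can be combined into a short diagnostic. The sketch below reports which device PyTorch would use; `describe_device` is a hypothetical helper, and it is written to return a message rather than fail when PyTorch is not installed.

```python
def describe_device():
    """Report which device PyTorch would use, without failing if torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        count = torch.cuda.device_count()
        return f"cuda ({count} GPU(s), first: {torch.cuda.get_device_name(0)})"
    return "cpu"


device_report = describe_device()
print(device_report)
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build or the CUDA driver is the most likely place to look.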
|
||||
|
||||
### Model Training Issues
|
||||
|
||||
This section will address common issues faced while training and their respective explanations and solutions.
|
||||
|
||||
#### Verification of Configuration Settings
|
||||
|
||||
**Issue**: You are unsure whether the configuration settings in the `.yaml` file are being applied correctly during model training.
|
||||
|
||||
**Solution**: The configuration settings in the `.yaml` file should be applied when using the `model.train()` function. To ensure that these settings are correctly applied, follow these steps:
|
||||
|
||||
- Confirm that the path to your `.yaml` configuration file is correct.
|
||||
- Make sure you pass the path to your `.yaml` file as the `data` argument when calling `model.train()`, as shown below:
|
||||
|
||||
```python
|
||||
model.train(data='/path/to/your/data.yaml', batch=4)
|
||||
```
|
||||
|
||||
#### Accelerating Training with Multiple GPUs
|
||||
|
||||
**Issue**: Training is slow on a single GPU, and you want to speed up the process using multiple GPUs.
|
||||
|
||||
**Solution**: Increasing the batch size can accelerate training, but it's essential to consider GPU memory capacity. To speed up training with multiple GPUs, follow these steps:
|
||||
|
||||
- Ensure that you have multiple GPUs available.
|
||||
|
||||
- Select the GPUs to use with the `device` training argument, e.g., `device=[0, 1, 2, 3]` for four GPUs.
|
||||
|
||||
- Increase the batch size accordingly to fully utilize the multiple GPUs without exceeding memory limits.
|
||||
|
||||
- Modify your training command to utilize multiple GPUs:
|
||||
|
||||
```python
|
||||
# Adjust the batch size and other settings as needed to optimize training speed
|
||||
model.train(data='/path/to/your/data.yaml', batch=32, device=[0, 1])
|
||||
```
|
||||
|
||||
#### Continuous Monitoring Parameters
|
||||
|
||||
**Issue**: You want to know which parameters should be continuously monitored during training, apart from loss.
|
||||
|
||||
**Solution**: While loss is a crucial metric to monitor, it's also essential to track other metrics for model performance optimization. Some key metrics to monitor during training include:
|
||||
|
||||
- Precision
|
||||
- Recall
|
||||
- Mean Average Precision (mAP)
|
||||
|
||||
You can access these metrics from the training logs or by using tools like TensorBoard or wandb for visualization. Implementing early stopping based on these metrics can help you achieve better results.
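As a worked example of how precision and recall relate to raw counts, the sketch below computes both, plus the F1 score, from true positives, false positives, and false negatives. The counts are made up for illustration, and mAP is not covered here because it additionally requires ranking predictions by confidence.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts: 80 correct detections, 20 false alarms, 40 missed objects
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```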
|
||||
|
||||
#### Tools for Tracking Training Progress
|
||||
|
||||
**Issue**: You are looking for recommendations on tools to track training progress.
|
||||
|
||||
**Solution**: To track and visualize training progress, you can consider using the following tools:
|
||||
|
||||
- [TensorBoard](https://www.tensorflow.org/tensorboard): TensorBoard is a popular choice for visualizing training metrics, including loss, accuracy, and more. You can integrate it with your YOLOv8 training process.
|
||||
- [Comet](https://bit.ly/yolov8-readme-comet): Comet provides an extensive toolkit for experiment tracking and comparison. It allows you to track metrics, hyperparameters, and even model weights. Integration with YOLO models is also straightforward, providing you with a complete overview of your experiment cycle.
|
||||
- [Ultralytics HUB](https://hub.ultralytics.com): Ultralytics HUB offers a specialized environment for tracking YOLO models, giving you a one-stop platform to manage metrics, datasets, and even collaborate with your team. Given its tailored focus on YOLO, it offers more customized tracking options.
|
||||
|
||||
Each of these tools offers its own set of advantages, so you may want to consider the specific needs of your project when making a choice.
|
||||
|
||||
#### How to Check if Training is Happening on the GPU
|
||||
|
||||
**Issue**: The 'device' value in the training logs is 'null,' and you're unsure if training is happening on the GPU.
|
||||
|
||||
**Solution**: The 'device' value being 'null' typically means that the training process is set to automatically use an available GPU, which is the default behavior. To ensure training occurs on a specific GPU, you can manually set the 'device' value to the GPU index (e.g., '0' for the first GPU) in your .yaml configuration file:
|
||||
|
||||
```yaml
|
||||
device: 0
|
||||
```
|
||||
|
||||
This will explicitly assign the training process to the specified GPU. If you wish to train on the CPU, set 'device' to 'cpu'.
|
||||
|
||||
Keep an eye on the 'runs' folder for logs and metrics to monitor training progress effectively.
|
||||
|
||||
#### Key Considerations for Effective Model Training
|
||||
|
||||
Here are some things to keep in mind, if you are facing issues related to model training.
|
||||
|
||||
**Dataset Format and Labels**
|
||||
|
||||
- Importance: The foundation of any machine learning model lies in the quality and format of the data it is trained on.
|
||||
|
||||
- Recommendation: Ensure that your custom dataset and its associated labels adhere to the expected format. It's crucial to verify that annotations are accurate and of high quality. Incorrect or subpar annotations can derail the model's learning process, leading to unpredictable outcomes.
|
||||
|
||||
**Model Convergence**
|
||||
|
||||
- Importance: Achieving model convergence ensures that the model has sufficiently learned from the training data.
|
||||
|
||||
- Recommendation: When training a model 'from scratch', it's vital to ensure that the model reaches a satisfactory level of convergence. This might necessitate a longer training duration, with more epochs, compared to when you're fine-tuning an existing model.
|
||||
|
||||
**Learning Rate and Batch Size**
|
||||
|
||||
- Importance: These hyperparameters play a pivotal role in determining how the model updates its weights during training.
|
||||
|
||||
- Recommendation: Regularly evaluate if the chosen learning rate and batch size are optimal for your specific dataset. Parameters that are not in harmony with the dataset's characteristics can hinder the model's performance.
|
||||
|
||||
**Class Distribution**
|
||||
|
||||
- Importance: The distribution of classes in your dataset can influence the model's prediction tendencies.
|
||||
|
||||
- Recommendation: Regularly assess the distribution of classes within your dataset. If there's a class imbalance, there's a risk that the model will develop a bias towards the more prevalent class. This bias can be evident in the confusion matrix, where the model might predominantly predict the majority class.
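One way to assess class distribution is to count instances per class directly from the label files. The sketch below assumes YOLO-format `.txt` labels, where each line starts with the class index; `count_classes` is a hypothetical helper, and the label files are generated in a temporary directory purely for demonstration.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_classes(label_dir):
    """Count instances per class index across YOLO-format .txt label files."""
    counts = Counter()
    for label_file in Path(label_dir).glob("*.txt"):
        for line in label_file.read_text().splitlines():
            parts = line.split()
            if parts:  # skip blank lines
                counts[int(parts[0])] += 1
    return counts


# Two tiny label files: class 0 appears three times, class 1 once
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "img1.txt").write_text("0 0.5 0.5 0.2 0.2\n0 0.3 0.3 0.1 0.1\n")
    (Path(tmp) / "img2.txt").write_text("1 0.6 0.4 0.3 0.3\n0 0.2 0.8 0.1 0.2\n")
    counts = count_classes(tmp)

print(counts.most_common())
```

A strongly skewed result is a hint to collect more examples of the rare classes or to rebalance the dataset.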
|
||||
|
||||
**Cross-Check with Pretrained Weights**
|
||||
|
||||
- Importance: Leveraging pretrained weights can provide a solid starting point for model training, especially when data is limited.
|
||||
|
||||
- Recommendation: As a diagnostic step, consider training your model using the same data but initializing it with pretrained weights. If this approach yields a well-formed confusion matrix, it could suggest that the 'from scratch' model might require further training or adjustments.
|
||||
|
||||
### Issues Related to Model Predictions
|
||||
|
||||
This section will address common issues faced during model prediction.
|
||||
|
||||
#### Getting Bounding Box Predictions With Your YOLOv8 Custom Model
|
||||
|
||||
**Issue**: When running predictions with a custom YOLOv8 model, there are challenges with the format and visualization of the bounding box coordinates.
|
||||
|
||||
**Solution**:
|
||||
|
||||
- Coordinate Format: YOLOv8 provides bounding box coordinates in absolute pixel values. To convert these to relative coordinates (ranging from 0 to 1), you need to divide by the image dimensions. For example, let’s say your image size is 640x640. Then you would do the following:
|
||||
|
||||
```python
|
||||
# Convert absolute coordinates to relative coordinates
|
||||
x1 = x1 / 640 # Divide x-coordinates by image width
|
||||
x2 = x2 / 640
|
||||
y1 = y1 / 640 # Divide y-coordinates by image height
|
||||
y2 = y2 / 640
|
||||
```
|
||||
|
||||
- File Name: To obtain the file name of the image you're predicting on, access the image file path directly from the result object within your prediction loop.
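The coordinate conversion described above can be generalized to images of any size. This is a minimal sketch with a hypothetical helper, `normalize_box`; note that Ultralytics result objects also expose normalized boxes directly (for example via `boxes.xyxyn`), which may be simpler in practice.

```python
def normalize_box(x1, y1, x2, y2, image_width, image_height):
    """Convert absolute pixel coordinates to relative coordinates in [0, 1]."""
    return (
        x1 / image_width,
        y1 / image_height,
        x2 / image_width,
        y2 / image_height,
    )


# Example for a 1280x720 image
rel = normalize_box(320, 120, 960, 480, image_width=1280, image_height=720)
print(rel)
```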
|
||||
|
||||
#### Filtering Objects in YOLOv8 Predictions
|
||||
|
||||
**Issue**: You are unsure how to filter and display only specific objects in the prediction results when running YOLOv8 with the Ultralytics library.
|
||||
|
||||
**Solution**: To detect only specific classes, use the `classes` argument to list the class indices you want in the output. For instance, to detect only cars (assuming 'car' has class index 2):
|
||||
|
||||
```shell
|
||||
yolo task=segment mode=predict model=yolov8n-seg.pt source='path/to/car.mp4' show=True classes=2
|
||||
```
|
||||
|
||||
#### Understanding Precision Metrics in YOLOv8
|
||||
|
||||
**Issue**: Confusion regarding the difference between box precision, mask precision, and confusion matrix precision in YOLOv8.
|
||||
|
||||
**Solution**: Box precision measures the accuracy of predicted bounding boxes compared to the actual ground truth boxes using IoU (Intersection over Union) as the metric. Mask precision assesses the agreement between predicted segmentation masks and ground truth masks in pixel-wise object classification. Confusion matrix precision, on the other hand, focuses on overall classification accuracy across all classes and does not consider the geometric accuracy of predictions. It's important to note that a bounding box can be geometrically accurate (true positive) even if the class prediction is wrong, leading to differences between box precision and confusion matrix precision. These metrics evaluate distinct aspects of a model's performance, reflecting the need for different evaluation metrics in various tasks.
|
||||
|
||||
#### Extracting Object Dimensions in YOLOv8
|
||||
|
||||
**Issue**: Difficulty in retrieving the width and height of detected objects in YOLOv8, especially when multiple objects are detected in an image.
|
||||
|
||||
**Solution**: To retrieve the bounding box dimensions, first use the Ultralytics YOLOv8 model to predict objects in an image. Then, extract the width and height information of bounding boxes from the prediction results.
|
||||
|
||||
```python
from ultralytics import YOLO

# Load a pre-trained YOLOv8 model
model = YOLO('yolov8n.pt')

# Specify the source image
source = 'https://ultralytics.com/images/bus.jpg'

# Make predictions
results = model.predict(source, save=True, imgsz=320, conf=0.5)

# Extract bounding box dimensions (xywh = center x, center y, width, height)
boxes = results[0].boxes.xywh.cpu()
for box in boxes:
    x, y, w, h = box
    print(f"Width of Box: {w}, Height of Box: {h}")
```
|
||||
|
||||
### Deployment Challenges
|
||||
|
||||
#### GPU Deployment Issues
|
||||
|
||||
**Issue:** Deploying models in a multi-GPU environment can sometimes lead to unexpected behavior, such as abnormal memory usage or inconsistent results across GPUs.
|
||||
|
||||
**Solution:** Check for default GPU initialization. Some frameworks, like PyTorch, might initialize CUDA operations on a default GPU before transitioning to the designated GPUs. To bypass unexpected default initializations, specify the GPU directly during deployment and prediction. Then, use tools to monitor GPU utilization and memory usage to identify any anomalies in real-time. Also, ensure you're using the latest version of the framework or library.
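
One way to avoid work landing on an unintended default GPU is to restrict which devices the process can see before CUDA is initialized. The sketch below assumes a CUDA setup that honors the `CUDA_VISIBLE_DEVICES` environment variable. `pin_gpu` and `predict_on_pinned_gpu` are hypothetical helper names, and the GPU index is only an example.

```python
import os


def pin_gpu(index):
    """Expose only one physical GPU to this process.

    Call this before CUDA is initialized; the simplest way to
    guarantee that is to call it before importing torch or ultralytics.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


def predict_on_pinned_gpu(source, weights="yolov8n.pt"):
    """Run prediction on the single GPU left visible by pin_gpu()."""
    from ultralytics import YOLO  # imported late, after the GPU is pinned

    model = YOLO(weights)
    # With one visible GPU, it is always addressed as device 0
    return model.predict(source, device=0)


pin_gpu(1)  # example: use the second physical GPU
```

After pinning, calling `predict_on_pinned_gpu("path/to/image.jpg")` runs on that GPU only. A monitoring tool such as `nvidia-smi` can then confirm that memory is allocated on the expected device.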
|
||||
|
||||
#### Model Conversion/Exporting Issues
|
||||
|
||||
**Issue:** During the process of converting or exporting machine learning models to different formats or platforms, users might encounter errors or unexpected behaviors.
|
||||
|
||||
**Solution:**
|
||||
|
||||
- Compatibility Check: Ensure that you are using versions of libraries and frameworks that are compatible with each other. Mismatched versions can lead to unexpected errors during conversion.
|
||||
|
||||
- Environment Reset: If you're using an interactive environment like Jupyter or Colab, consider restarting your environment after making significant changes or installations. A fresh start can sometimes resolve underlying issues.
|
||||
|
||||
- Official Documentation: Always refer to the official documentation of the tool or library you are using for conversion. It often contains specific guidelines and best practices for model exporting.
|
||||
|
||||
- Community Support: Check the library or framework's official repository for similar issues reported by other users. The maintainers or community might have provided solutions or workarounds in discussion threads.
|
||||
|
||||
- Update Regularly: Ensure that you are using the latest version of the tool or library. Developers frequently release updates that fix known bugs or improve functionality.
|
||||
|
||||
- Test Incrementally: Before performing a full conversion, test the process with a smaller model or dataset to identify potential issues early on.
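
For the compatibility check above, it helps to record exactly which versions are installed before attempting an export. The following is a minimal sketch using only the Python standard library. The package names are examples, so substitute whatever your export path depends on.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


# Package names are examples; list whatever your export path depends on
for name, found in installed_versions(["ultralytics", "torch", "onnx"]).items():
    print(f"{name}: {found or 'not installed'}")
```

Including this report in a GitHub issue makes version mismatches much easier for others to spot.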
|
||||
|
||||
## Community and Support
|
||||
|
||||
Engaging with a community of like-minded individuals can significantly enhance your experience and success in working with YOLOv8. Below are some channels and resources you may find helpful.
|
||||
|
||||
### Forums and Channels for Getting Help
|
||||
|
||||
**GitHub Issues:** The YOLOv8 repository on GitHub has an [Issues tab](https://github.com/ultralytics/ultralytics/issues) where you can ask questions, report bugs, and suggest new features. The community and maintainers are active here, and it’s a great place to get help with specific problems.
|
||||
|
||||
**Ultralytics Discord Server:** Ultralytics has a [Discord server](https://ultralytics.com/discord/) where you can interact with other users and the developers.
|
||||
|
||||
### Official Documentation and Resources
|
||||
|
||||
**Ultralytics YOLOv8 Docs**: The [official documentation](https://docs.ultralytics.com/) provides a comprehensive overview of YOLOv8, along with guides on installation, usage, and troubleshooting.
|
||||
|
||||
These resources should provide a solid foundation for troubleshooting and improving your YOLOv8 projects, as well as connecting with others in the YOLOv8 community.
|
||||
|
||||
## Conclusion
|
||||
|
||||
Troubleshooting is an integral part of any development process, and being equipped with the right knowledge can significantly reduce the time and effort spent in resolving issues. This guide aimed to address the most common challenges faced by users of the YOLOv8 model within the Ultralytics ecosystem. By understanding and addressing these common issues, you can ensure smoother project progress and achieve better results with your computer vision tasks.
|
||||
|
||||
Remember, the Ultralytics community is a valuable resource. Engaging with fellow developers and experts can provide additional insights and solutions that might not be covered in standard documentation. Always keep learning, experimenting, and sharing your experiences to contribute to the collective knowledge of the community.
|
||||
|
||||
Happy troubleshooting!
|
||||
165
docs/en/guides/yolo-performance-metrics.md
Normal file
|
|
@ -0,0 +1,165 @@
|
|||
---
|
||||
comments: true
|
||||
description: A comprehensive guide on various performance metrics related to YOLOv8, their significance, and how to interpret them.
|
||||
keywords: YOLOv8, Performance metrics, Object detection, Intersection over Union (IoU), Average Precision (AP), Mean Average Precision (mAP), Precision, Recall, Validation mode, Ultralytics
|
||||
---
|
||||
|
||||
# Performance Metrics Deep Dive
|
||||
|
||||
## Introduction
|
||||
|
||||
Performance metrics are key tools to evaluate the accuracy and efficiency of object detection models. They shed light on how effectively a model can identify and localize objects within images. Additionally, they help in understanding the model's handling of false positives and false negatives. These insights are crucial for evaluating and enhancing the model's performance. In this guide, we will explore various performance metrics associated with YOLOv8, their significance, and how to interpret them.
|
||||
|
||||
## Object Detection Metrics
|
||||
|
||||
Let’s start by discussing some metrics that are not only important to YOLOv8 but are broadly applicable across different object detection models.
|
||||
|
||||
- **Intersection over Union (IoU):** IoU is a measure that quantifies the overlap between a predicted bounding box and a ground truth bounding box. It plays a fundamental role in evaluating the accuracy of object localization.
|
||||
|
||||
- **Average Precision (AP):** AP computes the area under the precision-recall curve, providing a single value that encapsulates the model's precision and recall performance.
|
||||
|
||||
- **Mean Average Precision (mAP):** mAP extends the concept of AP by calculating the average AP values across multiple object classes. This is useful in multi-class object detection scenarios to provide a comprehensive evaluation of the model's performance.
|
||||
|
||||
- **Precision and Recall:** Precision quantifies the proportion of true positives among all positive predictions, assessing the model's capability to avoid false positives. On the other hand, Recall calculates the proportion of true positives among all actual positives, measuring the model's ability to detect all instances of a class.
|
||||
|
||||
- **F1 Score:** The F1 Score is the harmonic mean of precision and recall, providing a balanced assessment of a model's performance while considering both false positives and false negatives.
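
The definitions above can be made concrete with a few lines of plain Python. This is an illustrative sketch, not the Ultralytics implementation: boxes are assumed to be `(x1, y1, x2, y2)` tuples, and the counts of true positives, false positives, and false negatives are assumed to be already known.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and their harmonic mean from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap area 1, union area 7
print(precision_recall_f1(tp=8, fp=2, fn=8))
```

In the second call, precision is high (8 of 10 detections are correct) while recall is low (8 of 16 objects found), and the F1 score falls between the two.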
|
||||
|
||||
## How to Calculate Metrics for YOLOv8 Model
|
||||
|
||||
Now, we can explore [YOLOv8's Validation mode](https://docs.ultralytics.com/modes/val/), which can be used to compute the evaluation metrics discussed above.
|
||||
|
||||
Using the validation mode is simple. Once you have a trained model, you can invoke the model.val() function. This function will then process the validation dataset and return a variety of performance metrics. But what do these metrics mean? And how should you interpret them?
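
As a rough sketch of that workflow, the snippet below wraps validation in a function and pulls out two headline numbers. The helper names `summarize` and `validate` are hypothetical, the weights file and the `coco8.yaml` dataset config are example choices, and the numbers in the stand-in object are made up purely to show the shape of the summary.

```python
from types import SimpleNamespace


def summarize(metrics):
    """Collect headline box metrics from a validation result object."""
    return {"mAP50": metrics.box.map50, "mAP50-95": metrics.box.map}


def validate(weights="yolov8n.pt", data="coco8.yaml"):
    """Run validation mode and return the summarized metrics."""
    from ultralytics import YOLO  # imported here so summarize() works without it

    model = YOLO(weights)
    return summarize(model.val(data=data))


# Stand-in object with made-up numbers, only to show the shape of the summary
fake = SimpleNamespace(box=SimpleNamespace(map50=0.61, map=0.45))
print(summarize(fake))
```

With Ultralytics installed, calling `validate()` runs the real validation pass and returns the same two keys.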
|
||||
|
||||
### Interpreting the Output
|
||||
|
||||
Let's break down the output of the model.val() function and understand each segment of the output.
|
||||
|
||||
#### Class-wise Metrics
|
||||
|
||||
One of the sections of the output is the class-wise breakdown of performance metrics. This granular information is useful when you are trying to understand how well the model is doing for each specific class, especially in datasets with a diverse range of object categories. For each class in the dataset the following is provided:
|
||||
|
||||
- **Class**: This denotes the name of the object class, such as "person", "car", or "dog".
|
||||
|
||||
- **Images**: This metric tells you the number of images in the validation set that contain the object class.
|
||||
|
||||
- **Instances**: This provides the count of how many times the class appears across all images in the validation set.
|
||||
|
||||
- **Box(P, R, mAP50, mAP50-95)**: This metric provides insights into the model's performance in detecting objects:

    - **P (Precision)**: The accuracy of the detected objects, indicating how many detections were correct.

    - **R (Recall)**: The ability of the model to identify all instances of objects in the images.

    - **mAP50**: Mean average precision calculated at an intersection over union (IoU) threshold of 0.50. It's a measure of the model's accuracy considering only the "easy" detections.

    - **mAP50-95**: The average of the mean average precision calculated at varying IoU thresholds, ranging from 0.50 to 0.95. It gives a comprehensive view of the model's performance across different levels of detection difficulty.
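
To illustrate how the 0.50 to 0.95 range is averaged, the sketch below assumes the common convention of ten IoU thresholds in steps of 0.05 and uses made-up AP values for a single class. It shows only the averaging step, not how each AP value is computed.

```python
# IoU thresholds 0.50, 0.55, ..., 0.95 (ten values)
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def average_over_thresholds(ap_per_threshold):
    """Mean of AP values, one value per IoU threshold."""
    if len(ap_per_threshold) != len(IOU_THRESHOLDS):
        raise ValueError("expected one AP value per IoU threshold")
    return sum(ap_per_threshold) / len(ap_per_threshold)


# Made-up AP values that fall as the IoU threshold gets stricter
ap_values = [0.80, 0.78, 0.75, 0.71, 0.66, 0.60, 0.52, 0.41, 0.27, 0.10]
print(f"AP50: {ap_values[0]:.2f}, AP50-95: {average_over_thresholds(ap_values):.2f}")
```

The averaged value is always at or below the value at 0.50, because stricter thresholds reject loosely localized boxes. Averaging these per-class results over all classes gives the reported mAP figures.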
|
||||
|
||||
#### Speed Metrics
|
||||
|
||||
The speed of inference can be as critical as accuracy, especially in real-time object detection scenarios. This section breaks down the time taken for various stages of the validation process, from preprocessing to post-processing.
|
||||
|
||||
#### COCO Metrics Evaluation
|
||||
|
||||
For users validating on the COCO dataset, additional metrics are calculated using the COCO evaluation script. These metrics give insights into precision and recall at different IoU thresholds and for objects of different sizes.
|
||||
|
||||
#### Visual Outputs
|
||||
|
||||
The model.val() function, apart from producing numeric metrics, also yields visual outputs that can provide a more intuitive understanding of the model's performance. Here's a breakdown of the visual outputs you can expect:
|
||||
|
||||
- **F1 Score Curve (`F1_curve.png`)**: This curve represents the F1 score across various thresholds. Interpreting this curve can offer insights into the model's balance between false positives and false negatives over different thresholds.
|
||||
|
||||
- **Precision-Recall Curve (`PR_curve.png`)**: An integral visualization for any classification problem, this curve showcases the trade-offs between precision and recall at varied thresholds. It becomes especially significant when dealing with imbalanced classes.
|
||||
|
||||
- **Precision Curve (`P_curve.png`)**: A graphical representation of precision values at different thresholds. This curve helps in understanding how precision varies as the threshold changes.
|
||||
|
||||
- **Recall Curve (`R_curve.png`)**: Correspondingly, this graph illustrates how the recall values change across different thresholds.
|
||||
|
||||
- **Confusion Matrix (`confusion_matrix.png`)**: The confusion matrix provides a detailed view of the outcomes, showcasing the counts of true positives, true negatives, false positives, and false negatives for each class.
|
||||
|
||||
- **Normalized Confusion Matrix (`confusion_matrix_normalized.png`)**: This visualization is a normalized version of the confusion matrix. It represents the data in proportions rather than raw counts. This format makes it simpler to compare the performance across classes.
|
||||
|
||||
- **Validation Batch Labels (`val_batchX_labels.jpg`)**: These images depict the ground truth labels for distinct batches from the validation dataset. They provide a clear picture of what the objects are and their respective locations as per the dataset.
|
||||
|
||||
- **Validation Batch Predictions (`val_batchX_pred.jpg`)**: Contrasting the label images, these visuals display the predictions made by the YOLOv8 model for the respective batches. By comparing these to the label images, you can easily assess how well the model detects and classifies objects visually.
|
||||
|
||||
#### Results Storage
|
||||
|
||||
For future reference, the results are saved to a directory, typically named runs/detect/val.
|
||||
|
||||
## Choosing the Right Metrics
|
||||
|
||||
Choosing the right metrics to evaluate often depends on the specific application.
|
||||
|
||||
- **mAP:** Suitable for a broad assessment of model performance.
|
||||
|
||||
- **IoU:** Essential when precise object location is crucial.
|
||||
|
||||
- **Precision:** Important when minimizing false detections is a priority.
|
||||
|
||||
- **Recall:** Vital when it's important to detect every instance of an object.
|
||||
|
||||
- **F1 Score:** Useful when a balance between precision and recall is needed.
|
||||
|
||||
For real-time applications, speed metrics like FPS (Frames Per Second) and latency are crucial to ensure timely results.
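
Latency and FPS can be estimated with a simple timing loop. The sketch below uses a `time.sleep` call as a stand-in for inference so it runs anywhere. `mean_latency` and `fps` are hypothetical helper names, and in practice the stand-in would be replaced with a call to your model on a representative frame.

```python
import time


def mean_latency(fn, runs=20, warmup=3):
    """Average wall-clock seconds per call of fn, after warm-up calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def fps(latency_seconds):
    """Frames per second implied by a per-frame latency."""
    return 1.0 / latency_seconds


def fake_inference():
    time.sleep(0.005)  # stand-in for model.predict(frame)


latency = mean_latency(fake_inference)
print(f"latency: {latency * 1000:.1f} ms, about {fps(latency):.0f} FPS")
```

The warm-up calls matter on GPUs, where the first few inferences often include one-time initialization costs that would otherwise skew the average.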
|
||||
|
||||
## Interpretation of Results
|
||||
|
||||
It’s important to understand the metrics. Here's what some of the commonly observed lower scores might suggest:
|
||||
|
||||
- **Low mAP:** Indicates the model may need general refinements.
|
||||
|
||||
- **Low IoU:** The model might be struggling to pinpoint objects accurately. Different bounding box methods could help.
|
||||
|
||||
- **Low Precision:** The model may be detecting too many non-existent objects. Adjusting confidence thresholds might reduce this.
|
||||
|
||||
- **Low Recall:** The model could be missing real objects. Improving feature extraction or using more data might help.
|
||||
|
||||
- **Imbalanced F1 Score:** There's a disparity between precision and recall.
|
||||
|
||||
- **Class-specific AP:** Low scores here can highlight classes the model struggles with.
|
||||
|
||||
## Case Studies
|
||||
|
||||
Real-world examples can help clarify how these metrics work in practice.
|
||||
|
||||
### Case 1
|
||||
|
||||
- **Situation:** mAP and F1 Score are suboptimal. Recall is good, but Precision isn't.
|
||||
|
||||
- **Interpretation & Action:** There might be too many incorrect detections. Tightening confidence thresholds could reduce these, though it might also slightly decrease recall.
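
The trade-off in Case 1 can be seen on a toy example. The detections below are made up: each pair holds a confidence score and whether the detection matched a real object. The function name `precision_recall` is hypothetical.

```python
def precision_recall(detections, num_ground_truth, conf):
    """Precision and recall using only detections with confidence >= conf.

    detections: list of (confidence, is_correct) pairs.
    """
    kept = [correct for score, correct in detections if score >= conf]
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / num_ground_truth if num_ground_truth else 0.0
    return precision, recall


# Made-up detections for an image set containing 4 real objects
detections = [(0.90, True), (0.80, True), (0.60, False),
              (0.55, True), (0.30, False), (0.20, False)]

for conf in (0.25, 0.70):
    p, r = precision_recall(detections, num_ground_truth=4, conf=conf)
    print(f"conf={conf:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Raising the threshold removes the low-confidence false positives, so precision rises, but it also discards one correct detection, so recall drops. In Ultralytics prediction calls, this threshold is the `conf` argument.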
|
||||
|
||||
### Case 2
|
||||
|
||||
- **Situation:** mAP and Recall are acceptable, but IoU is lacking.
|
||||
|
||||
- **Interpretation & Action:** The model detects objects well but might not be localizing them precisely. Refining bounding box predictions might help.
|
||||
|
||||
### Case 3
|
||||
|
||||
- **Situation:** Some classes have a much lower AP than others, even with a decent overall mAP.
|
||||
|
||||
- **Interpretation & Action:** These classes might be more challenging for the model. Using more data for these classes or adjusting class weights during training could be beneficial.
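
A quick way to spot the classes described in Case 3 is to compare each class's AP against the mean. This is a minimal sketch with made-up AP values. The `margin` cut-off and the name `weak_classes` are arbitrary choices for illustration.

```python
def weak_classes(ap_by_class, margin=0.15):
    """Classes whose AP is more than `margin` below the mean AP."""
    mean_ap = sum(ap_by_class.values()) / len(ap_by_class)
    return sorted(name for name, ap in ap_by_class.items() if ap < mean_ap - margin)


# Made-up per-class AP values
ap_by_class = {"person": 0.82, "car": 0.79, "dog": 0.74, "bicycle": 0.41}
print(weak_classes(ap_by_class))
```

Here the overall mean looks reasonable, yet one class sits far below it, which is the signal to collect more examples for that class.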
|
||||
|
||||
## Connect and Collaborate
|
||||
|
||||
Tapping into a community of enthusiasts and experts can amplify your journey with YOLOv8. Here are some avenues that can facilitate learning, troubleshooting, and networking.
|
||||
|
||||
### Engage with the Broader Community
|
||||
|
||||
- **GitHub Issues:** The YOLOv8 repository on GitHub has an [Issues tab](https://github.com/ultralytics/ultralytics/issues) where you can ask questions, report bugs, and suggest new features. The community and maintainers are active here, and it’s a great place to get help with specific problems.
|
||||
|
||||
- **Ultralytics Discord Server:** Ultralytics has a [Discord server](https://ultralytics.com/discord/) where you can interact with other users and the developers.
|
||||
|
||||
### Official Documentation and Resources
|
||||
|
||||
- **Ultralytics YOLOv8 Docs:** The [official documentation](https://docs.ultralytics.com/) provides a comprehensive overview of YOLOv8, along with guides on installation, usage, and troubleshooting.
|
||||
|
||||
Using these resources will not only guide you through any challenges but also keep you updated with the latest trends and best practices in the YOLOv8 community.
|
||||
|
||||
## Conclusion
|
||||
|
||||
In this guide, we've taken a close look at the essential performance metrics for YOLOv8. These metrics are key to understanding how well a model is performing and are vital for anyone aiming to fine-tune their models. They offer the necessary insights for improvements and to make sure the model works effectively in real-life situations.
|
||||
|
||||
Remember, the YOLOv8 and Ultralytics community is an invaluable asset. Engaging with fellow developers and experts can open doors to insights and solutions not found in standard documentation. As you journey through object detection, keep the spirit of learning alive, experiment with new strategies, and share your findings. By doing so, you contribute to the community's collective wisdom and ensure its growth.
|
||||
|
||||
Happy object detecting!
|
||||
108
docs/en/guides/yolo-thread-safe-inference.md
Normal file
|
|
@ -0,0 +1,108 @@
|
|||
---
|
||||
comments: true
|
||||
description: This guide provides best practices for performing thread-safe inference with YOLO models, ensuring reliable and concurrent predictions in multi-threaded applications.
|
||||
keywords: thread-safe, YOLO inference, multi-threading, concurrent predictions, YOLO models, Ultralytics, Python threading, safe YOLO usage, AI concurrency
|
||||
---
|
||||
|
||||
# Thread-Safe Inference with YOLO Models
|
||||
|
||||
Running YOLO models in a multi-threaded environment requires careful consideration to ensure thread safety. Python's `threading` module allows you to run several threads concurrently, but when it comes to using YOLO models across these threads, there are important safety issues to be aware of. This page will guide you through creating thread-safe YOLO model inference.
|
||||
|
||||
## Understanding Python Threading
|
||||
|
||||
Python threads are a form of concurrency that allows your program to make progress on multiple operations at once. However, Python's Global Interpreter Lock (GIL) means that only one thread can execute Python bytecode at a time.
|
||||
|
||||
<p align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/281418476-7f478570-fd77-4a40-bf3d-74b4db4d668c.png" alt="Single vs Multi-Thread Examples">
|
||||
</p>
|
||||
|
||||
While this sounds like a limitation, threads can still provide concurrency, especially for I/O-bound operations or when using operations that release the GIL, like those performed by YOLO's underlying C libraries.
|
||||
|
||||
## The Danger of Shared Model Instances
|
||||
|
||||
Instantiating a YOLO model outside your threads and sharing this instance across multiple threads can lead to race conditions, where the internal state of the model is inconsistently modified due to concurrent accesses. This is particularly problematic when the model or its components hold state that is not designed to be thread-safe.
|
||||
|
||||
### Non-Thread-Safe Example: Single Model Instance
|
||||
|
||||
When using threads in Python, it's important to recognize patterns that can lead to concurrency issues. Here is what you should avoid: sharing a single YOLO model instance across multiple threads.
|
||||
|
||||
```python
# Unsafe: Sharing a single model instance across threads
from ultralytics import YOLO
from threading import Thread

# Instantiate the model outside the thread
shared_model = YOLO("yolov8n.pt")


def predict(image_path):
    results = shared_model.predict(image_path)
    # Process results


# Starting threads that share the same model instance
Thread(target=predict, args=("image1.jpg",)).start()
Thread(target=predict, args=("image2.jpg",)).start()
```
|
||||
|
||||
In the example above, the `shared_model` is used by multiple threads, which can lead to unpredictable results because `predict` could be executed simultaneously by multiple threads.
|
||||
|
||||
### Non-Thread-Safe Example: Multiple Model Instances
|
||||
|
||||
Similarly, here is an unsafe pattern with multiple YOLO model instances:
|
||||
|
||||
```python
# Unsafe: Sharing multiple model instances across threads can still lead to issues
from ultralytics import YOLO
from threading import Thread

# Instantiate multiple models outside the thread
shared_model_1 = YOLO("yolov8n_1.pt")
shared_model_2 = YOLO("yolov8n_2.pt")


def predict(model, image_path):
    results = model.predict(image_path)
    # Process results


# Starting threads with individual model instances
Thread(target=predict, args=(shared_model_1, "image1.jpg")).start()
Thread(target=predict, args=(shared_model_2, "image2.jpg")).start()
```
|
||||
|
||||
Even though there are two separate model instances, the risk of concurrency issues still exists. If the internal implementation of `YOLO` is not thread-safe, using separate instances might not prevent race conditions, especially if these instances share any underlying resources or states that are not thread-local.
|
||||
|
||||
## Thread-Safe Inference
|
||||
|
||||
To perform thread-safe inference, you should instantiate a separate YOLO model within each thread. This ensures that each thread has its own isolated model instance, eliminating the risk of race conditions.
|
||||
|
||||
### Thread-Safe Example
|
||||
|
||||
Here's how to instantiate a YOLO model inside each thread for safe parallel inference:
|
||||
|
||||
```python
# Safe: Instantiating a single model inside each thread
from ultralytics import YOLO
from threading import Thread


def thread_safe_predict(image_path):
    # Instantiate a new model inside the thread
    local_model = YOLO("yolov8n.pt")
    results = local_model.predict(image_path)
    # Process results


# Starting threads that each have their own model instance
Thread(target=thread_safe_predict, args=("image1.jpg",)).start()
Thread(target=thread_safe_predict, args=("image2.jpg",)).start()
```
|
||||
|
||||
In this example, each thread creates its own `YOLO` instance. This prevents any thread from interfering with the model state of another, thus ensuring that each thread performs inference safely and without unexpected interactions with the other threads.
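
When a thread handles many images, reloading the weights on every call is wasteful. One possible variation, offered here as an assumption rather than an Ultralytics recommendation, is to cache one model per thread with `threading.local()`, so each thread still owns its instance but creates it only once. The helper names are hypothetical, and `object` stands in for a real factory such as `lambda: YOLO("yolov8n.pt")` so the sketch runs without any weights.

```python
import threading

_local = threading.local()


def get_thread_model(factory):
    """Return this thread's model, creating it on first use."""
    if not hasattr(_local, "model"):
        _local.model = factory()
    return _local.model


def worker(factory, image_paths, out):
    for path in image_paths:
        model = get_thread_model(factory)  # same object on every iteration
        out.append((path, model))  # replace with model.predict(path)


# `object` stands in for `lambda: YOLO("yolov8n.pt")` so the sketch runs anywhere
seen = []
threads = [
    threading.Thread(target=worker, args=(object, [f"image{i}.jpg"], seen))
    for i in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread ends up with a distinct model object, which preserves the isolation described above while avoiding repeated loading inside a single thread.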
|
||||
|
||||
## Conclusion
|
||||
|
||||
When using YOLO models with Python's `threading`, always instantiate your models within the thread that will use them to ensure thread safety. This practice avoids race conditions and makes sure that your inference tasks run reliably.
|
||||
|
||||
For more advanced scenarios and to further optimize your multi-threaded inference performance, consider using process-based parallelism with `multiprocessing` or leveraging a task queue with dedicated worker processes.
|
||||
61
docs/en/help/CI.md
Normal file
|
|
@ -0,0 +1,61 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how Ultralytics leverages Continuous Integration (CI) for maintaining high-quality code. Explore our CI tests and the status of these tests for our repositories.
|
||||
keywords: continuous integration, software development, CI tests, Ultralytics repositories, high-quality code, Docker Deployment, Broken Links, CodeQL, PyPi Publishing
|
||||
---
|
||||
|
||||
# Continuous Integration (CI)
|
||||
|
||||
Continuous Integration (CI) is an essential aspect of software development which involves integrating changes and testing them automatically. CI allows us to maintain high-quality code by catching issues early and often in the development process. At Ultralytics, we use various CI tests to ensure the quality and integrity of our codebase.
|
||||
|
||||
## CI Actions
|
||||
|
||||
Here's a brief description of our CI actions:
|
||||
|
||||
- **CI:** This is our primary CI test that involves running unit tests, linting checks, and sometimes more comprehensive tests depending on the repository.
|
||||
- **Docker Deployment:** This test checks the deployment of the project using Docker to ensure the Dockerfile and related scripts are working correctly.
|
||||
- **Broken Links:** This test scans the codebase for any broken or dead links in our markdown or HTML files.
|
||||
- **CodeQL:** CodeQL is a tool from GitHub that performs semantic analysis on our code, helping to find potential security vulnerabilities and maintain high-quality code.
|
||||
- **PyPI Publishing:** This test checks if the project can be packaged and published to PyPI without any errors.
|
||||
|
||||
### CI Results
|
||||
|
||||
Below is the table showing the status of these CI tests for our main repositories:
|
||||
|
||||
| Repository | CI | Docker Deployment | Broken Links | CodeQL | PyPI and Docs Publishing |
|
||||
|-----------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| [yolov3](https://github.com/ultralytics/yolov3) | [](https://github.com/ultralytics/yolov3/actions/workflows/ci-testing.yml) | [](https://github.com/ultralytics/yolov3/actions/workflows/docker.yml) | [](https://github.com/ultralytics/yolov3/actions/workflows/links.yml) | [](https://github.com/ultralytics/yolov3/actions/workflows/codeql-analysis.yml) | |
|
||||
| [yolov5](https://github.com/ultralytics/yolov5) | [](https://github.com/ultralytics/yolov5/actions/workflows/ci-testing.yml) | [](https://github.com/ultralytics/yolov5/actions/workflows/docker.yml) | [](https://github.com/ultralytics/yolov5/actions/workflows/links.yml) | [](https://github.com/ultralytics/yolov5/actions/workflows/codeql-analysis.yml) | |
|
||||
| [ultralytics](https://github.com/ultralytics/ultralytics) | [](https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml) | [](https://github.com/ultralytics/ultralytics/actions/workflows/docker.yaml) | [](https://github.com/ultralytics/ultralytics/actions/workflows/links.yml) | [](https://github.com/ultralytics/ultralytics/actions/workflows/codeql.yaml) | [](https://github.com/ultralytics/ultralytics/actions/workflows/publish.yml) |
|
||||
| [hub](https://github.com/ultralytics/hub) | [](https://github.com/ultralytics/hub/actions/workflows/ci.yaml) | | [](https://github.com/ultralytics/hub/actions/workflows/links.yml) | | |
|
||||
| [docs](https://github.com/ultralytics/docs) | | | [](https://github.com/ultralytics/docs/actions/workflows/links.yml) | | [](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment) |
|
||||
|
||||
Each badge shows the status of the last run of the corresponding CI test on the `main` branch of the respective repository. If a test fails, the badge will display a "failing" status, and if it passes, it will display a "passing" status.
|
||||
|
||||
If you notice a test failing, it would be a great help if you could report it through a GitHub issue in the respective repository.
|
||||
|
||||
Remember, a successful CI test does not mean that everything is perfect. It is always recommended to manually review the code before deployment or merging changes.
|
||||
|
||||
## Code Coverage
|
||||
|
||||
Code coverage is a metric that represents the percentage of your codebase that is executed when your tests run. It provides insight into how well your tests exercise your code and can be crucial in identifying untested parts of your application. A high code coverage percentage is often associated with a lower likelihood of bugs. However, it's essential to understand that code coverage doesn't guarantee the absence of defects. It merely indicates which parts of the code have been executed by the tests.
|
||||
|
||||
### Integration with [codecov.io](https://codecov.io/)
|
||||
|
||||
At Ultralytics, we have integrated our repositories with [codecov.io](https://codecov.io/), a popular online platform for measuring and visualizing code coverage. Codecov provides detailed insights, coverage comparisons between commits, and visual overlays directly on your code, indicating which lines were covered.
|
||||
|
||||
By integrating with Codecov, we aim to maintain and improve the quality of our code by focusing on areas that might be prone to errors or need further testing.
|
||||
|
||||
### Coverage Results
|
||||
|
||||
To give a quick view of the code coverage status of the `ultralytics` Python package, we have included a badge and a sunburst visual of the `ultralytics` coverage results. These images show the percentage of code covered by our tests, offering an at-a-glance metric of our testing efforts. For full details, please see https://codecov.io/github/ultralytics/ultralytics.
|
||||
|
||||
| Repository | Code Coverage |
|
||||
|-----------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||
| [ultralytics](https://github.com/ultralytics/ultralytics) | [](https://codecov.io/gh/ultralytics/ultralytics) |
|
||||
|
||||
In the sunburst graphic below, the innermost circle is the entire project; moving away from the center are folders and, finally, individual files. The size and color of each slice represent the number of statements and the coverage, respectively.
|
||||
|
||||
<a href="https://codecov.io/github/ultralytics/ultralytics">
    <img src="https://codecov.io/gh/ultralytics/ultralytics/branch/main/graphs/sunburst.svg?token=HHW7IIVFVY" alt="Ultralytics Codecov Image">
</a>
|
||||
31
docs/en/help/CLA.md
Normal file
|
|
@ -0,0 +1,31 @@
|
|||
---
|
||||
description: Understand terms governing contributions to Ultralytics projects including source code, bug fixes, documentation and more. Read our Contributor License Agreement.
|
||||
keywords: Ultralytics, Contributor License Agreement, Open Source Software, Contributions, Copyright License, Patent License, Moral Rights
|
||||
---
|
||||
|
||||
# Ultralytics Individual Contributor License Agreement
|
||||
|
||||
Thank you for your interest in contributing to open source software projects (“Projects”) made available by Ultralytics SE or its affiliates (“Ultralytics”). This Individual Contributor License Agreement (“Agreement”) sets out the terms governing any source code, object code, bug fixes, configuration changes, tools, specifications, documentation, data, materials, feedback, information or other works of authorship that you submit or have submitted, in any form and in any manner, to Ultralytics in respect of any of the Projects (collectively “Contributions”). If you have any questions respecting this Agreement, please contact hello@ultralytics.com.
|
||||
|
||||
You agree that the following terms apply to all of your past, present and future Contributions. Except for the licenses granted in this Agreement, you retain all of your right, title and interest in and to your Contributions.
|
||||
|
||||
**Copyright License.** You hereby grant, and agree to grant, to Ultralytics a non-exclusive, perpetual, irrevocable, worldwide, fully-paid, royalty-free, transferable copyright license to reproduce, prepare derivative works of, publicly display, publicly perform, and distribute your Contributions and such derivative works, with the right to sublicense the foregoing rights through multiple tiers of sublicensees.
|
||||
|
||||
**Patent License.** You hereby grant, and agree to grant, to Ultralytics a non-exclusive, perpetual, irrevocable, worldwide, fully-paid, royalty-free, transferable patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer your Contributions, where such license applies only to those patent claims licensable by you that are necessarily infringed by your Contributions alone or by combination of your Contributions with the Project to which such Contributions were submitted, with the right to sublicense the foregoing rights through multiple tiers of sublicensees.
|
||||
|
||||
**Moral Rights.** To the fullest extent permitted under applicable law, you hereby waive, and agree not to assert, all of your “moral rights” in or relating to your Contributions for the benefit of Ultralytics, its assigns, and their respective direct and indirect sublicensees.
|
||||
|
||||
**Third Party Content/Rights.** If your Contribution includes or is based on any source code, object code, bug fixes, configuration changes, tools, specifications, documentation, data, materials, feedback, information or other works of authorship that were not authored by you (“Third Party Content”) or if you are aware of any third party intellectual property or proprietary rights associated with your Contribution (“Third Party Rights”), then you agree to include with the submission of your Contribution full details respecting such Third Party Content and Third Party Rights, including, without limitation, identification of which aspects of your Contribution contain Third Party Content or are associated with Third Party Rights, the owner/author of the Third Party Content and Third Party Rights, where you obtained the Third Party Content, and any applicable third party license terms or restrictions respecting the Third Party Content and Third Party Rights. For greater certainty, the foregoing obligations respecting the identification of Third Party Content and Third Party Rights do not apply to any portion of a Project that is incorporated into your Contribution to that same Project.
|
||||
|
||||
**Representations.** You represent that, other than the Third Party Content and Third Party Rights identified by you in accordance with this Agreement, you are the sole author of your Contributions and are legally entitled to grant the foregoing licenses and waivers in respect of your Contributions. If your Contributions were created in the course of your employment with your past or present employer(s), you represent that such employer(s) has authorized you to make your Contributions on behalf of such employer(s) or such employer(s) has waived all of their right, title or interest in or to your Contributions.
|
||||
|
||||
**Disclaimer.** To the fullest extent permitted under applicable law, your Contributions are provided on an "as is" basis, without any warranties or conditions, express or implied, including, without limitation, any implied warranties or conditions of non-infringement, merchantability or fitness for a particular purpose. You are not required to provide support for your Contributions, except to the extent you desire to provide support.
|
||||
|
||||
**No Obligation.** You acknowledge that Ultralytics is under no obligation to use or incorporate your Contributions into any of the Projects. The decision to use or incorporate your Contributions into any of the Projects will be made at the sole discretion of Ultralytics or its authorized delegates.
|
||||
|
||||
**Disputes.** This Agreement shall be governed by and construed in accordance with the laws of the State of New York, United States of America, without giving effect to its principles or rules regarding conflicts of laws, other than such principles directing application of New York law. The parties hereby submit to venue in, and jurisdiction of the courts located in New York, New York for purposes relating to this Agreement. In the event that any of the provisions of this Agreement shall be held by a court or other tribunal of competent jurisdiction to be unenforceable, the remaining portions hereof shall remain in full force and effect.
|
||||
|
||||
**Assignment.** You agree that Ultralytics may assign this Agreement, and all of its rights, obligations and licenses hereunder.
|
||||
39
docs/en/help/FAQ.md
Normal file
|
|
@ -0,0 +1,39 @@
|
|||
---
|
||||
comments: true
|
||||
description: Find solutions to your common Ultralytics YOLO related queries. Learn about hardware requirements, fine-tuning YOLO models, conversion to ONNX/TensorFlow, and more.
|
||||
keywords: Ultralytics, YOLO, FAQ, hardware requirements, ONNX, TensorFlow, real-time detection, YOLO accuracy
|
||||
---
|
||||
|
||||
# Ultralytics YOLO Frequently Asked Questions (FAQ)
|
||||
|
||||
This FAQ section addresses some common questions and issues users might encounter while working with Ultralytics YOLO repositories.
|
||||
|
||||
## 1. What are the hardware requirements for running Ultralytics YOLO?
|
||||
|
||||
Ultralytics YOLO can be run on a variety of hardware configurations, including CPUs, GPUs, and even some edge devices. However, for optimal performance and faster training and inference, we recommend using a GPU with a minimum of 8GB of memory. NVIDIA GPUs with CUDA support are ideal for this purpose.
|
||||
|
||||
## 2. How do I fine-tune a pre-trained YOLO model on my custom dataset?
|
||||
|
||||
To fine-tune a pre-trained YOLO model on your custom dataset, you'll need to create a dataset configuration file (YAML) that defines the dataset's properties, such as the path to the images, the number of classes, and class names. Next, you'll need to modify the model configuration file to match the number of classes in your dataset. Finally, use the `train.py` script to start the training process with your custom dataset and the pre-trained model. You can find a detailed guide on fine-tuning YOLO in the Ultralytics documentation.
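
The dataset configuration step can be sketched with the standard library alone. The keys `path`, `train`, `val`, and `names` follow the layout commonly used by Ultralytics dataset files, but treat them, the directory names, and the helper `build_dataset_yaml` as assumptions and check the dataset documentation for the exact schema.

```python
def build_dataset_yaml(root: str, train: str, val: str, class_names: list[str]) -> str:
    """Return the text of a minimal dataset YAML file."""
    lines = [
        f"path: {root}  # dataset root directory",
        f"train: {train}  # training images, relative to path",
        f"val: {val}  # validation images, relative to path",
        "names:",
    ]
    # One entry per class; the index is the class ID used in the label files.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Hypothetical two-class dataset.
print(build_dataset_yaml("datasets/my_dataset", "images/train", "images/val", ["cat", "dog"]))
```

Saving the returned text to a file such as `my_dataset.yaml` gives a configuration whose number of classes is implied by the length of the `names` mapping.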
|
||||
|
||||
## 3. How do I convert a YOLO model to ONNX or TensorFlow format?
|
||||
|
||||
Ultralytics provides built-in support for converting YOLO models to ONNX format. You can use the `export.py` script to convert a saved model to ONNX format. If you need to convert the model to TensorFlow format, you can use the ONNX model as an intermediary and then use the ONNX-TensorFlow converter to convert the ONNX model to TensorFlow format.
|
||||
|
||||
## 4. Can I use Ultralytics YOLO for real-time object detection?
|
||||
|
||||
Yes, Ultralytics YOLO is designed to be efficient and fast, making it suitable for real-time object detection tasks. The actual performance will depend on your hardware configuration and the complexity of the model. Using a GPU and optimizing the model for your specific use case can help achieve real-time performance.
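
Whether a setup reaches real-time speed is easiest to judge by measuring it. The sketch below times any callable and reports calls per second; `fake_inference` is a stand-in for a real model call, and the 30 FPS threshold is an assumed target, not a figure from Ultralytics.

```python
import time
from typing import Callable


def measure_fps(run_once: Callable[[], object], warmup: int = 3, iterations: int = 20) -> float:
    """Return the average number of calls per second for ``run_once``."""
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against division by zero
    return iterations / elapsed


def fake_inference() -> None:
    """Stand-in for a model call; sleeps for about 5 ms."""
    time.sleep(0.005)


fps = measure_fps(fake_inference)
print(f"{fps:.1f} FPS, meets an assumed 30 FPS target: {fps >= 30}")
```

Replacing `fake_inference` with a function that runs your model on one frame gives a figure for your own hardware and model size.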
|
||||
|
||||
## 5. How can I improve the accuracy of my YOLO model?
|
||||
|
||||
Improving the accuracy of a YOLO model may involve several strategies, such as:
|
||||
|
||||
- Fine-tuning the model on more annotated data
- Data augmentation to increase the variety of training samples
- Using a larger or more complex model architecture
- Adjusting the learning rate, batch size, and other hyperparameters
- Using techniques like transfer learning or knowledge distillation
|
||||
|
||||
Remember that there's often a trade-off between accuracy and inference speed, so finding the right balance is crucial for your specific application.
|
||||
|
||||
If you have any more questions or need assistance, don't hesitate to consult the Ultralytics documentation or reach out to the community through GitHub Issues or the official discussion forum.
|
||||
88
docs/en/help/code_of_conduct.md
Normal file
|
|
@ -0,0 +1,88 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore Ultralytics community’s Code of Conduct, ensuring a supportive, inclusive environment for contributors & members at all levels. Find our guidelines on acceptable behavior & enforcement.
|
||||
keywords: Ultralytics, code of conduct, community, contribution, behavior guidelines, enforcement, open source contributions
|
||||
---
|
||||
|
||||
# Ultralytics Contributor Covenant Code of Conduct
|
||||
|
||||
## Our Pledge
|
||||
|
||||
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
|
||||
|
||||
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
|
||||
|
||||
## Our Standards
|
||||
|
||||
Examples of behavior that contributes to a positive environment for our community include:
|
||||
|
||||
- Demonstrating empathy and kindness toward other people
|
||||
- Being respectful of differing opinions, viewpoints, and experiences
|
||||
- Giving and gracefully accepting constructive feedback
|
||||
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
|
||||
- Focusing on what is best not just for us as individuals, but for the overall community
|
||||
|
||||
Examples of unacceptable behavior include:
|
||||
|
||||
- The use of sexualized language or imagery, and sexual attention or advances of any kind
|
||||
- Trolling, insulting or derogatory comments, and personal or political attacks
|
||||
- Public or private harassment
|
||||
- Publishing others' private information, such as a physical or email address, without their explicit permission
|
||||
- Other conduct which could reasonably be considered inappropriate in a professional setting
|
||||
|
||||
## Enforcement Responsibilities
|
||||
|
||||
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
|
||||
|
||||
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
|
||||
|
||||
## Scope
|
||||
|
||||
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
|
||||
|
||||
## Enforcement
|
||||
|
||||
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at hello@ultralytics.com. All complaints will be reviewed and investigated promptly and fairly.
|
||||
|
||||
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
|
||||
|
||||
## Enforcement Guidelines
|
||||
|
||||
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
|
||||
|
||||
### 1. Correction
|
||||
|
||||
**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
|
||||
|
||||
**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
|
||||
|
||||
### 2. Warning
|
||||
|
||||
**Community Impact**: A violation through a single incident or series of actions.
|
||||
|
||||
**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
|
||||
|
||||
### 3. Temporary Ban
|
||||
|
||||
**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior.
|
||||
|
||||
**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
|
||||
|
||||
### 4. Permanent Ban
|
||||
|
||||
**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
|
||||
|
||||
**Consequence**: A permanent ban from any sort of public interaction within the community.
|
||||
|
||||
## Attribution
|
||||
|
||||
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.0, available at https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
|
||||
|
||||
Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity).
|
||||
|
||||
For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
|
||||
|
||||
[homepage]: https://www.contributor-covenant.org
|
||||
76
docs/en/help/contributing.md
Normal file
|
|
@ -0,0 +1,76 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to contribute to Ultralytics YOLO projects – guidelines for pull requests, reporting bugs, code conduct and CLA signing.
|
||||
keywords: Ultralytics, YOLO, open-source, contribute, pull request, bug report, coding guidelines, CLA, code of conduct, GitHub
|
||||
---
|
||||
|
||||
# Contributing to Ultralytics Open-Source YOLO Repositories
|
||||
|
||||
First of all, thank you for your interest in contributing to Ultralytics open-source YOLO repositories! Your contributions will help improve the project and benefit the community. This document provides guidelines and best practices to get you started.
|
||||
|
||||
## Table of Contents
|
||||
|
||||
- [Code of Conduct](#code-of-conduct)
- [Pull Requests](#pull-requests)
    - [CLA Signing](#cla-signing)
    - [Google-Style Docstrings](#google-style-docstrings)
    - [GitHub Actions CI Tests](#github-actions-ci-tests)
- [Bug Reports](#bug-reports)
    - [Minimum Reproducible Example](#minimum-reproducible-example)
- [License and Copyright](#license-and-copyright)
|
||||
|
||||
## Code of Conduct
|
||||
|
||||
All contributors are expected to adhere to the [Code of Conduct](code_of_conduct.md) to ensure a welcoming and inclusive environment for everyone.
|
||||
|
||||
## Pull Requests
|
||||
|
||||
We welcome contributions in the form of pull requests. To make the review process smoother, please follow these guidelines:
|
||||
|
||||
1. **Fork the repository**: Fork the Ultralytics YOLO repository to your own GitHub account.
|
||||
|
||||
2. **Create a branch**: Create a new branch in your forked repository with a descriptive name for your changes.
|
||||
|
||||
3. **Make your changes**: Make the changes you want to contribute. Ensure that your changes follow the coding style of the project and do not introduce new errors or warnings.
|
||||
|
||||
4. **Test your changes**: Test your changes locally to ensure that they work as expected and do not introduce new issues.
|
||||
|
||||
5. **Commit your changes**: Commit your changes with a descriptive commit message. Make sure to include any relevant issue numbers in your commit message.
|
||||
|
||||
6. **Create a pull request**: Create a pull request from your forked repository to the main Ultralytics YOLO repository. In the pull request description, provide a clear explanation of your changes and how they improve the project.
|
||||
|
||||
### CLA Signing
|
||||
|
||||
Before we can accept your pull request, you need to sign a [Contributor License Agreement (CLA)](CLA.md). This is a legal document stating that you agree to the terms of contributing to the Ultralytics YOLO repositories. The CLA ensures that your contributions are properly licensed and that the project can continue to be distributed under the AGPL-3.0 license.
|
||||
|
||||
To sign the CLA, follow the instructions provided by the CLA bot after you submit your PR.
|
||||
|
||||
### Google-Style Docstrings
|
||||
|
||||
When adding new functions or classes, please include a [Google-style docstring](https://google.github.io/styleguide/pyguide.html) to provide clear and concise documentation for other developers. This will help ensure that your contributions are easy to understand and maintain.
|
||||
|
||||
Example Google-style docstring:
|
||||
|
||||
```python
def example_function(arg1: int, arg2: int) -> bool:
    """
    Example function that demonstrates Google-style docstrings.

    Args:
        arg1 (int): The first argument.
        arg2 (int): The second argument.

    Returns:
        (bool): True if successful, False otherwise.

    Examples:
        >>> result = example_function(1, 2)  # returns False
    """
    if arg1 == arg2:
        return True
    return False
```
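
If the `Examples` section shows the expected output under each call, the docstring can also be checked automatically with the standard `doctest` module. The sketch below assumes this is useful for your contribution; it is not a stated project requirement, and `is_equal` and `run_docstring_examples` are hypothetical names.

```python
import doctest


def is_equal(a: int, b: int) -> bool:
    """
    Check whether two integers are equal.

    Args:
        a (int): The first integer.
        b (int): The second integer.

    Returns:
        (bool): True if the values match, False otherwise.

    Examples:
        >>> is_equal(1, 2)
        False
        >>> is_equal(3, 3)
        True
    """
    return a == b


def run_docstring_examples(func) -> doctest.TestResults:
    """Run the doctest examples in ``func.__doc__`` and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    # Expose the function under its own name so the examples can call it.
    test = parser.get_doctest(func.__doc__, {func.__name__: func}, func.__name__, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring_examples(is_equal)
print(f"failed={results.failed}, attempted={results.attempted}")
```

Keeping examples executable this way means a docstring that drifts out of date shows up as a failed example rather than as misleading documentation.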
|
||||
|
||||
### GitHub Actions CI Tests
|
||||
|
||||
Before your pull request can be merged, all GitHub Actions Continuous Integration (CI) tests must pass. These tests include linting, unit tests, and other checks to ensure that your changes meet the quality standards of the project. Make sure to review the output of the GitHub Actions and fix any issues
|
||||
37
docs/en/help/environmental-health-safety.md
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
---
|
||||
comments: false
|
||||
description: Discover Ultralytics’ EHS policy principles and implementation measures. Committed to safety, environment, and continuous improvement for a sustainable future.
|
||||
keywords: Ultralytics policy, EHS, environment, health and safety, compliance, prevention, continuous improvement, risk management, emergency preparedness, resource allocation, communication
|
||||
---
|
||||
|
||||
# Ultralytics Environmental, Health and Safety (EHS) Policy
|
||||
|
||||
At Ultralytics, we recognize that the long-term success of our company relies not only on the products and services we offer, but also the manner in which we conduct our business. We are committed to ensuring the safety and well-being of our employees, stakeholders, and the environment, and we will continuously strive to mitigate our impact on the environment while promoting health and safety.
|
||||
|
||||
## Policy Principles
|
||||
|
||||
1. **Compliance**: We will comply with all applicable laws, regulations, and standards related to EHS, and we will strive to exceed these standards where possible.
|
||||
|
||||
2. **Prevention**: We will work to prevent accidents, injuries, and environmental harm by implementing risk management measures and ensuring all our operations and procedures are safe.
|
||||
|
||||
3. **Continuous Improvement**: We will continuously improve our EHS performance by setting measurable objectives, monitoring our performance, auditing our operations, and revising our policies and procedures as needed.
|
||||
|
||||
4. **Communication**: We will communicate openly about our EHS performance and will engage with stakeholders to understand and address their concerns and expectations.
|
||||
|
||||
5. **Education and Training**: We will educate and train our employees and contractors in appropriate EHS procedures and practices.
|
||||
|
||||
## Implementation Measures
|
||||
|
||||
1. **Responsibility and Accountability**: Every employee and contractor working at or with Ultralytics is responsible for adhering to this policy. Managers and supervisors are accountable for ensuring this policy is implemented within their areas of control.
|
||||
|
||||
2. **Risk Management**: We will identify, assess, and manage EHS risks associated with our operations and activities to prevent accidents, injuries, and environmental harm.
|
||||
|
||||
3. **Resource Allocation**: We will allocate the necessary resources to ensure the effective implementation of our EHS policy, including the necessary equipment, personnel, and training.
|
||||
|
||||
4. **Emergency Preparedness and Response**: We will develop, maintain, and test emergency preparedness and response plans to ensure we can respond effectively to EHS incidents.
|
||||
|
||||
5. **Monitoring and Review**: We will monitor and review our EHS performance regularly to identify opportunities for improvement and ensure we are meeting our objectives.
|
||||
|
||||
This policy reflects our commitment to minimizing our environmental footprint, ensuring the safety and well-being of our employees, and continuously improving our performance.
|
||||
|
||||
Please remember that the implementation of an effective EHS policy requires the involvement and commitment of everyone working at or with Ultralytics. We encourage you to take personal responsibility for your safety and the safety of others, and to take care of the environment in which we live and work.
|
||||
19
docs/en/help/index.md
Normal file
|
|
@ -0,0 +1,19 @@
|
|||
---
|
||||
comments: true
|
||||
description: Find comprehensive guides and documents on Ultralytics YOLO tasks. Includes FAQs, contributing guides, CI guide, CLA, MRE guide, code of conduct & more.
|
||||
keywords: Ultralytics, YOLO, guides, documents, FAQ, contributing, CI guide, CLA, MRE guide, code of conduct, EHS policy, security policy, privacy policy
|
||||
---
|
||||
|
||||
Welcome to the Ultralytics Help page! We are dedicated to providing you with detailed resources to enhance your experience with the Ultralytics YOLO models and repositories. This page serves as your portal to guides and documentation designed to assist you with various tasks and answer questions you may encounter while engaging with our repositories.
|
||||
|
||||
- [Frequently Asked Questions (FAQ)](FAQ.md): Find answers to common questions and issues encountered by the community of Ultralytics YOLO users and contributors.
|
||||
- [Contributing Guide](contributing.md): Discover the protocols for making contributions, including how to submit pull requests, report bugs, and more.
|
||||
- [Continuous Integration (CI) Guide](CI.md): Gain insights into the CI processes we employ, complete with status reports for each Ultralytics repository.
|
||||
- [Contributor License Agreement (CLA)](CLA.md): Review the CLA to understand the rights and responsibilities associated with contributing to Ultralytics projects.
|
||||
- [Minimum Reproducible Example (MRE) Guide](minimum_reproducible_example.md): Learn the process for creating an MRE, which is crucial for the timely and effective resolution of bug reports.
|
||||
- [Code of Conduct](code_of_conduct.md): Our community guidelines support a respectful and open atmosphere for all collaborators.
|
||||
- [Environmental, Health and Safety (EHS) Policy](environmental-health-safety.md): Delve into our commitment to sustainability and the well-being of all our stakeholders.
|
||||
- [Security Policy](security.md): Familiarize yourself with our security protocols and the procedure for reporting vulnerabilities.
|
||||
- [Privacy Policy](privacy.md): Read our privacy policy to understand how we protect your data and respect your privacy in all our services and operations.
|
||||
|
||||
We encourage you to review these resources for a seamless and productive experience. Our aim is to foster a helpful and friendly environment for everyone in the Ultralytics community. Should you require additional support, please feel free to reach out via GitHub Issues or our official discussion forums. Happy coding!
|
||||
78
docs/en/help/minimum_reproducible_example.md
Normal file
|
|
@ -0,0 +1,78 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to create minimum reproducible examples (MRE) for efficient bug reporting in Ultralytics YOLO repositories with this step-by-step guide.
|
||||
keywords: Ultralytics, YOLO, minimum reproducible example, MRE, bug reports, guide, dependencies, code, troubleshooting
|
||||
---
|
||||
|
||||
# Creating a Minimum Reproducible Example for Bug Reports in Ultralytics YOLO Repositories
|
||||
|
||||
When submitting a bug report for Ultralytics YOLO repositories, it's essential to provide a [minimum reproducible example](https://docs.ultralytics.com/help/minimum_reproducible_example/) (MRE). An MRE is a small, self-contained piece of code that demonstrates the problem you're experiencing. Providing an MRE helps maintainers and contributors understand the issue and work on a fix more efficiently. This guide explains how to create an MRE when submitting bug reports to Ultralytics YOLO repositories.
|
||||
|
||||
## 1. Isolate the Problem
|
||||
|
||||
The first step in creating an MRE is to isolate the problem. This means removing any unnecessary code or dependencies that are not directly related to the issue. Focus on the specific part of the code that is causing the problem and remove any irrelevant code.
|
||||
|
||||
## 2. Use Public Models and Datasets
|
||||
|
||||
When creating an MRE, use publicly available models and datasets to reproduce the issue. For example, use the 'yolov8n.pt' model and the 'coco8.yaml' dataset. This ensures that the maintainers and contributors can easily run your example and investigate the problem without needing access to proprietary data or custom models.
|
||||
|
||||
## 3. Include All Necessary Dependencies
|
||||
|
||||
Make sure to include all the necessary dependencies in your MRE. If your code relies on external libraries, specify the required packages and their versions. Ideally, provide a `requirements.txt` file or list the dependencies in your bug report.
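
One way to collect exact versions for a bug report is to query the installed package metadata. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `describe_environment` and the package list passed to it are illustrative assumptions.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def describe_environment(packages: list[str]) -> list[str]:
    """Return 'name==version' lines for the given packages, plus the Python version."""
    lines = [f"python=={platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Report missing packages explicitly instead of failing.
            lines.append(f"{name} (not installed)")
    return lines


print("\n".join(describe_environment(["torch", "ultralytics"])))
```

The printed lines can be pasted directly into the dependencies part of the report.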
|
||||
|
||||
## 4. Write a Clear Description of the Issue
|
||||
|
||||
Provide a clear and concise description of the issue you're experiencing. Explain the expected behavior and the actual behavior you're encountering. If applicable, include any relevant error messages or logs.
|
||||
|
||||
## 5. Format Your Code Properly
|
||||
|
||||
When submitting an MRE, format your code properly using code blocks in the issue description. This makes it easier for others to read and understand your code. In GitHub, you can create a code block by wrapping your code with triple backticks (\```) and specifying the language:
|
||||
|
||||
<pre>
```python
# Your Python code goes here
```
</pre>
|
||||
|
||||
## 6. Test Your MRE
|
||||
|
||||
Before submitting your MRE, test it to ensure that it accurately reproduces the issue. Make sure that others can run your example without any issues or modifications.
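
A simple way to check that an MRE does not depend on state in your current session is to run it in a fresh interpreter and capture what it prints. The sketch below does this with `subprocess`; the helper `run_in_fresh_interpreter` and the failing snippet are hypothetical and stand in for your own MRE.

```python
import subprocess
import sys


def run_in_fresh_interpreter(code: str, timeout: float = 60.0) -> tuple[int, str]:
    """Run ``code`` in a new Python process and return (exit code, combined output)."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stdout + result.stderr


# Hypothetical MRE: a snippet that is expected to fail with ZeroDivisionError.
mre = "print('start')\nprint(1 / 0)"
exit_code, output = run_in_fresh_interpreter(mre)
print(exit_code)
print(output)
```

If the captured traceback matches the error you are reporting, the example reproduces the issue on its own; if it fails differently or not at all, something from your local session is still missing from the MRE.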
|
||||
|
||||
## Example of an MRE
|
||||
|
||||
Here's an example of an MRE for a hypothetical bug report:
|
||||
|
||||
**Bug description:**
|
||||
|
||||
When running the `detect.py` script on the sample image from the 'coco8.yaml' dataset, I get an error related to the dimensions of the input tensor.
|
||||
|
||||
**MRE:**
|
||||
|
||||
```python
import torch
from ultralytics import YOLO

# Load the model
model = YOLO("yolov8n.pt")

# Load a 0-channel image
image = torch.rand(1, 0, 640, 640)

# Run the model
results = model(image)
```
|
||||
|
||||
**Error message:**
|
||||
|
||||
```
RuntimeError: Expected input[1, 0, 640, 640] to have 3 channels, but got 0 channels instead
```
|
||||
|
||||
**Dependencies:**
|
||||
|
||||
- torch==2.0.0
- ultralytics==8.0.90
|
||||
|
||||
In this example, the MRE demonstrates the issue with a minimal amount of code, uses a public model ('yolov8n.pt'), includes all necessary dependencies, and provides a clear description of the problem along with the error message.
|
||||
|
||||
By following these guidelines, you'll help the maintainers and contributors of Ultralytics YOLO repositories to understand and resolve your issue more efficiently.
|
||||
137
docs/en/help/privacy.md
Normal file
|
|
@ -0,0 +1,137 @@
|
|||
---
|
||||
description: Learn about how Ultralytics collects and uses data to improve user experience, ensure software stability, and address privacy concerns, with options to opt-out.
|
||||
keywords: Ultralytics, Data Collection, User Privacy, Google Analytics, Sentry, Crash Reporting, Anonymized Data, Privacy Settings, Opt-Out
|
||||
---
|
||||
|
||||
# Data Collection for Ultralytics Python Package
|
||||
|
||||
## Overview
|
||||
|
||||
[Ultralytics](https://ultralytics.com) is dedicated to the continuous enhancement of the user experience and the capabilities of our Python package, including the advanced YOLO models we develop. Our approach involves the gathering of anonymized usage statistics and crash reports, helping us identify opportunities for improvement and ensuring the reliability of our software. This transparency document outlines what data we collect, its purpose, and the choice you have regarding this data collection.
|
||||
|
||||
## Anonymized Google Analytics
|
||||
|
||||
[Google Analytics](https://developers.google.com/analytics) is a web analytics service offered by Google that tracks and reports website traffic. It allows us to collect data about how our Python package is used, which is crucial for making informed decisions about design and functionality.
|
||||
|
||||
### What We Collect
|
||||
|
||||
- **Usage Metrics**: These metrics help us understand how frequently and in what ways the package is utilized, what features are favored, and the typical command-line arguments that are used.
|
||||
- **System Information**: We collect general non-identifiable information about your computing environment to ensure our package performs well across various systems.
|
||||
- **Performance Data**: Understanding the performance of our models during training, validation, and inference helps us in identifying optimization opportunities.
|
||||
|
||||
For more information about Google Analytics and data privacy, visit [Google Analytics Privacy](https://support.google.com/analytics/answer/6004245).
|
||||
|
||||
### How We Use This Data
|
||||
|
||||
- **Feature Improvement**: Insights from usage metrics guide us in enhancing user satisfaction and interface design.
|
||||
- **Optimization**: Performance data assist us in fine-tuning our models for better efficiency and speed across diverse hardware and software configurations.
|
||||
- **Trend Analysis**: By studying usage trends, we can predict and respond to the evolving needs of our community.
|
||||
|
||||
### Privacy Considerations
|
||||
|
||||
We take several measures to ensure the privacy and security of the data you entrust to us:
|
||||
|
||||
- **Anonymization**: We configure Google Analytics to anonymize the data collected, which means no personally identifiable information (PII) is gathered. You can use our services with the assurance that your personal details remain private.
|
||||
- **Aggregation**: Data is analyzed only in aggregate form. This practice ensures that patterns can be observed without revealing any individual user's activity.
|
||||
- **No Image Data Collection**: Ultralytics does not collect, process, or view any training or inference images.
|
||||
|
||||
## Sentry Crash Reporting
|
||||
|
||||
[Sentry](https://sentry.io/) is a developer-centric error tracking software that aids in identifying, diagnosing, and resolving issues in real-time, ensuring the robustness and reliability of applications. Within our package, it plays a crucial role by providing insights through crash reporting, significantly contributing to the stability and ongoing refinement of our software.
|
||||
|
||||
!!! note
|
||||
|
||||
Crash reporting via Sentry is activated only if the `sentry-sdk` Python package is pre-installed on your system. This package isn't included in the `ultralytics` prerequisites and won't be installed automatically by Ultralytics.
|
||||
|
||||
### What We Collect
|
||||
|
||||
If the `sentry-sdk` Python package is pre-installed on your system, a crash event may send the following information:
|
||||
|
||||
- **Crash Logs**: Detailed reports on the application's condition at the time of a crash, which are vital for our debugging efforts.
|
||||
- **Error Messages**: We record error messages generated during the operation of our package to understand and resolve potential issues quickly.
|
||||
|
||||
To learn more about how Sentry handles data, please visit [Sentry's Privacy Policy](https://sentry.io/privacy/).
|
||||
|
||||
### How We Use This Data
|
||||
|
||||
- **Debugging**: Analyzing crash logs and error messages enables us to swiftly identify and correct software bugs.
|
||||
- **Stability Metrics**: By constantly monitoring for crashes, we aim to improve the stability and reliability of our package.
|
||||
|
||||
### Privacy Considerations
|
||||
|
||||
- **Sensitive Information**: We ensure that crash logs are scrubbed of any personally identifiable or sensitive user data, safeguarding the confidentiality of your information.
|
||||
- **Controlled Collection**: Our crash reporting mechanism is meticulously calibrated to gather only what is essential for troubleshooting while respecting user privacy.
|
||||
|
||||
We detail the tools used for data collection and link to their respective privacy pages so that you have a comprehensive view of our practices, in keeping with our commitment to transparency and respect for user privacy.
|
||||
|
||||
## Disabling Data Collection
|
||||
|
||||
We believe in providing our users with full control over their data. By default, our package is configured to collect analytics and crash reports to help improve the experience for all users. However, we respect that some users may prefer to opt out of this data collection.
|
||||
|
||||
To opt out of sending analytics and crash reports, you can simply set `sync=False` in your YOLO settings. This ensures that no data is transmitted from your machine to our analytics tools.
|
||||
|
||||
### Inspecting Settings
|
||||
|
||||
To gain insight into the current configuration of your settings, you can view them directly:
|
||||
|
||||
!!! example "View settings"
|
||||
|
||||
=== "Python"
|
||||
You can use Python to view your settings. Start by importing the `settings` object from the `ultralytics` module. Print and return settings using the following commands:
|
||||
```python
|
||||
from ultralytics import settings
|
||||
|
||||
# View all settings
|
||||
print(settings)
|
||||
|
||||
# Return analytics and crash reporting setting
|
||||
value = settings['sync']
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
Alternatively, the command-line interface allows you to check your settings with a simple command:
|
||||
```bash
|
||||
yolo settings
|
||||
```
|
||||
|
||||
### Modifying Settings
|
||||
|
||||
Ultralytics allows users to easily modify their settings. Changes can be performed in the following ways:
|
||||
|
||||
!!! example "Update settings"
|
||||
|
||||
=== "Python"
|
||||
Within the Python environment, call the `update` method on the `settings` object to change your settings:
|
||||
```python
|
||||
from ultralytics import settings
|
||||
|
||||
# Disable analytics and crash reporting
|
||||
settings.update({'sync': False})
|
||||
|
||||
# Reset settings to default values (this restores the default, which enables analytics and crash reporting)
|
||||
settings.reset()
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
If you prefer using the command-line interface, the following commands will allow you to modify your settings:
|
||||
```bash
|
||||
# Disable analytics and crash reporting
|
||||
yolo settings sync=False
|
||||
|
||||
# Reset settings to default values (this restores the default, which enables analytics and crash reporting)
|
||||
yolo settings reset
|
||||
```
|
||||
|
||||
The `sync=False` setting prevents any data from being sent to Google Analytics or Sentry. Your settings are saved to disk and apply to all future sessions that use the Ultralytics package.
|
||||
|
||||
## Commitment to Privacy
|
||||
|
||||
Ultralytics takes user privacy seriously. We design our data collection practices with the following principles:
|
||||
|
||||
- **Transparency**: We are open about the data we collect and how it is used.
|
||||
- **Control**: We give users full control over their data.
|
||||
- **Security**: We employ industry-standard security measures to protect the data we collect.
|
||||
|
||||
## Questions or Concerns
|
||||
|
||||
If you have any questions or concerns about our data collection practices, please reach out to us via our [contact form](https://ultralytics.com/contact) or via [support@ultralytics.com](mailto:support@ultralytics.com). We are dedicated to ensuring our users feel informed and confident in their privacy when using our package.
|
||||
36
docs/en/help/security.md
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
---
|
||||
description: Explore Ultralytics' comprehensive security strategies safeguarding user data and systems. Learn about our diverse security tools, including Snyk, GitHub CodeQL, and Dependabot Alerts.
|
||||
keywords: Ultralytics, Comprehensive Security, user data protection, Snyk, GitHub CodeQL, Dependabot, vulnerability management, coding security practices
|
||||
---
|
||||
|
||||
# Ultralytics Security Policy
|
||||
|
||||
At [Ultralytics](https://ultralytics.com), the security of our users' data and systems is of utmost importance. To ensure the safety and security of our [open-source projects](https://github.com/ultralytics), we have implemented several measures to detect and prevent security vulnerabilities.
|
||||
|
||||
## Snyk Scanning
|
||||
|
||||
We utilize [Snyk](https://snyk.io/advisor/python/ultralytics) to conduct comprehensive security scans on Ultralytics repositories. Snyk's robust scanning capabilities extend beyond dependency checks; it also examines our code and Dockerfiles for various vulnerabilities. By identifying and addressing these issues proactively, we ensure a higher level of security and reliability for our users.
|
||||
|
||||
|
||||
|
||||
## GitHub CodeQL Scanning
|
||||
|
||||
Our security strategy includes GitHub's [CodeQL](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/about-code-scanning-with-codeql) scanning. CodeQL delves deep into our codebase, identifying complex vulnerabilities like SQL injection and XSS by analyzing the code's semantic structure. This advanced level of analysis ensures early detection and resolution of potential security risks.
|
||||
|
||||
[CodeQL workflow runs](https://github.com/ultralytics/ultralytics/actions/workflows/codeql.yaml)
|
||||
|
||||
## GitHub Dependabot Alerts
|
||||
|
||||
[Dependabot](https://docs.github.com/en/code-security/dependabot) is integrated into our workflow to monitor dependencies for known vulnerabilities. When a vulnerability is identified in one of our dependencies, Dependabot alerts us, allowing for swift and informed remediation actions.
|
||||
|
||||
## GitHub Secret Scanning Alerts
|
||||
|
||||
We employ GitHub [secret scanning](https://docs.github.com/en/code-security/secret-scanning/managing-alerts-from-secret-scanning) alerts to detect sensitive data, such as credentials and private keys, accidentally pushed to our repositories. This early detection mechanism helps prevent potential security breaches and data exposures.
|
||||
|
||||
## Private Vulnerability Reporting
|
||||
|
||||
We enable private vulnerability reporting, allowing users to discreetly report potential security issues. This approach facilitates responsible disclosure, ensuring vulnerabilities are handled securely and efficiently.
|
||||
|
||||
If you suspect or discover a security vulnerability in any of our repositories, please let us know immediately. You can reach out to us directly via our [contact form](https://ultralytics.com/contact) or via [security@ultralytics.com](mailto:security@ultralytics.com). Our security team will investigate and respond as soon as possible.
|
||||
|
||||
We appreciate your help in keeping all Ultralytics open-source projects secure and safe for everyone 🙏.
|
||||
89
docs/en/hub/app/android.md
Normal file
|
|
@ -0,0 +1,89 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn about the Ultralytics Android App, enabling real-time object detection using YOLO models. Discover in-app features, quantization methods, and delegate options for optimal performance.
|
||||
keywords: Ultralytics, Android App, real-time object detection, YOLO models, TensorFlow Lite, FP16 quantization, INT8 quantization, CPU, GPU, Hexagon, NNAPI
|
||||
---
|
||||
|
||||
# Ultralytics Android App: Real-time Object Detection with YOLO Models
|
||||
|
||||
<a href="https://bit.ly/ultralytics_hub" target="_blank">
|
||||
<img width="100%" src="https://user-images.githubusercontent.com/26833433/281124469-6b3b0945-dbb1-44c8-80a9-ef6bc778b299.jpg" alt="Ultralytics HUB preview image"></a>
|
||||
<br>
|
||||
<div align="center">
|
||||
<a href="https://github.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="Ultralytics GitHub"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.linkedin.com/company/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="Ultralytics LinkedIn"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://twitter.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="3%" alt="Ultralytics Twitter"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://youtube.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="3%" alt="Ultralytics YouTube"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.tiktok.com/@ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="3%" alt="Ultralytics TikTok"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.instagram.com/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="Ultralytics Instagram"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://ultralytics.com/discord"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-discord.png" width="3%" alt="Ultralytics Discord"></a>
|
||||
<br>
|
||||
<br>
|
||||
<a href="https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app" style="text-decoration:none;">
|
||||
<img src="https://raw.githubusercontent.com/ultralytics/assets/master/app/google-play.svg" width="15%" alt="" /></a>
|
||||
</div>
|
||||
|
||||
The Ultralytics Android App is a powerful tool that allows you to run YOLO models directly on your Android device for real-time object detection. This app utilizes TensorFlow Lite for model optimization and various hardware delegates for acceleration, enabling fast and efficient object detection.
|
||||
|
||||
## Quantization and Acceleration
|
||||
|
||||
To achieve real-time performance on your Android device, YOLO models are quantized to either FP16 or INT8 precision. Quantization is a process that reduces the numerical precision of the model's weights and biases, thus reducing the model's size and the amount of computation required. This results in faster inference times without significantly affecting the model's accuracy.
|
||||
|
||||
### FP16 Quantization
|
||||
|
||||
FP16 (or half-precision) quantization converts the model's 32-bit floating-point numbers to 16-bit floating-point numbers. This reduces the model's size by half and speeds up the inference process, while maintaining a good balance between accuracy and performance.
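
The halving of storage and the small rounding error can be seen on a handful of numbers. The following is a minimal sketch using only Python's standard `struct` module; the weight values are made up for illustration, and this is not how the app or TensorFlow Lite converts a model.

```python
import struct


def fp16_round_trip(value: float) -> float:
    """Store a value as a 16-bit float and read it back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


# Hypothetical weight values, chosen only for illustration.
weights = [0.1, -2.5, 3.14159, 1000.123]

bytes_fp32 = struct.calcsize("<f") * len(weights)  # 4 bytes per value
bytes_fp16 = struct.calcsize("<e") * len(weights)  # 2 bytes per value

restored = [fp16_round_trip(w) for w in weights]
relative_errors = [abs(w - r) / abs(w) for w, r in zip(weights, restored)]

print(f"FP32: {bytes_fp32} bytes, FP16: {bytes_fp16} bytes")
print(f"largest relative error: {max(relative_errors):.2e}")
```

For values in FP16's normal range, the relative rounding error stays below 2^-11 (about 0.05%), which is why accuracy is usually close to that of the FP32 model.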
|
||||
|
||||
### INT8 Quantization
|
||||
|
||||
INT8 (or 8-bit integer) quantization further reduces the model's size and computation requirements by converting its 32-bit floating-point numbers to 8-bit integers. This quantization method can result in a significant speedup, but it may lead to a slight reduction in mean average precision (mAP) due to the lower numerical precision.
|
||||
|
||||
!!! tip "mAP Reduction in INT8 Models"
|
||||
|
||||
The reduced numerical precision in INT8 models can lead to some loss of information during the quantization process, which may result in a slight decrease in mAP. However, this trade-off is often acceptable considering the substantial performance gains offered by INT8 quantization.
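
To make the source of this information loss concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. The scheme, the function names, and the weight values are assumptions chosen for illustration; real converters such as TensorFlow Lite use calibration data and more elaborate schemes.

```python
from array import array


def quantize_int8(values):
    """Map floats to integers in [-127, 127] using a single scale factor."""
    scale = max(abs(v) for v in values) / 127.0 or 1.0  # fall back to 1.0 if all zeros
    quantized = array("b", (max(-127, min(127, round(v / scale))) for v in values))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical FP32 weights, chosen only for illustration.
weights = array("f", [0.5, -1.27, 0.003, 0.9, -0.25])

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"bytes per weight: FP32={weights.itemsize}, INT8={quantized.itemsize}")
print(f"scale={scale:.5f}, largest absolute error={max_error:.5f}")
```

Each weight is stored in one byte instead of four, and every restored value differs from the original by at most half of `scale`. Those small per-weight errors accumulate through the network, which is where the mAP reduction comes from.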
|
||||
|
||||
## Delegates and Performance Variability
|
||||
|
||||
Different delegates are available on Android devices to accelerate model inference. These delegates include CPU, [GPU](https://www.tensorflow.org/lite/android/delegates/gpu), [Hexagon](https://www.tensorflow.org/lite/android/delegates/hexagon) and [NNAPI](https://www.tensorflow.org/lite/android/delegates/nnapi). The performance of these delegates varies depending on the device's hardware vendor, product line, and specific chipsets used in the device.
|
||||
|
||||
1. **CPU**: The default option, with reasonable performance on most devices.
|
||||
2. **GPU**: Utilizes the device's GPU for faster inference. It can provide a significant performance boost on devices with powerful GPUs.
|
||||
3. **Hexagon**: Leverages Qualcomm's Hexagon DSP for faster and more efficient processing. This option is available on devices with Qualcomm Snapdragon processors.
|
||||
4. **NNAPI**: The Android Neural Networks API (NNAPI) serves as an abstraction layer for running ML models on Android devices. NNAPI can utilize various hardware accelerators, such as CPU, GPU, and dedicated AI chips (e.g., Google's Edge TPU, or the Pixel Neural Core).
|
||||
|
||||
Here's a table showing the primary vendors, their product lines, popular devices, and supported delegates:
|
||||
|
||||
| Vendor | Product Lines | Popular Devices | Delegates Supported |
|
||||
|-----------------------------------------|--------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|
|
||||
| [Qualcomm](https://www.qualcomm.com/) | [Snapdragon (e.g., 800 series)](https://www.qualcomm.com/snapdragon) | [Samsung Galaxy S21](https://www.samsung.com/global/galaxy/galaxy-s21-5g/), [OnePlus 9](https://www.oneplus.com/9) | CPU, GPU, Hexagon, NNAPI |
|
||||
| [Samsung](https://www.samsung.com/) | [Exynos (e.g., Exynos 2100)](https://www.samsung.com/semiconductor/minisite/exynos/) | [Samsung Galaxy S21 (Global version)](https://www.samsung.com/global/galaxy/galaxy-s21-5g/) | CPU, GPU, NNAPI |
|
||||
| [MediaTek](https://i.mediatek.com/) | [Dimensity (e.g., Dimensity 1200)](https://i.mediatek.com/dimensity-1200) | [Realme GT](https://www.realme.com/global/realme-gt), [Xiaomi Redmi Note](https://www.mi.com/en/phone/redmi/note-list) | CPU, GPU, NNAPI |
|
||||
| [HiSilicon](https://www.hisilicon.com/) | [Kirin (e.g., Kirin 990)](https://www.hisilicon.com/en/products/Kirin) | [Huawei P40 Pro](https://consumer.huawei.com/en/phones/p40-pro/), [Huawei Mate 30 Pro](https://consumer.huawei.com/en/phones/mate30-pro/) | CPU, GPU, NNAPI |
|
||||
| [NVIDIA](https://www.nvidia.com/) | [Tegra (e.g., Tegra X1)](https://developer.nvidia.com/content/tegra-x1) | [NVIDIA Shield TV](https://www.nvidia.com/en-us/shield/shield-tv/), [Nintendo Switch](https://www.nintendo.com/switch/) | CPU, GPU, NNAPI |
|
||||
|
||||
Please note that the list of devices mentioned is not exhaustive and may vary depending on the specific chipsets and device models. Always test your models on your target devices to ensure compatibility and optimal performance.
|
||||
|
||||
Keep in mind that the choice of delegate can affect performance and model compatibility. For example, some models may not work with certain delegates, or a delegate may not be available on a specific device. As such, it's essential to test your model and the chosen delegate on your target devices for the best results.
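
When planning tests across devices, the table above can be turned into a simple lookup of which delegates to try on a given chipset. The sketch below is only an illustration: the trial order and the function name `delegates_to_test` are assumptions, not part of the app or of TensorFlow Lite, and the fastest delegate for a model still has to be found by measuring on the device.

```python
# Delegate support per chipset vendor, copied from the table above.
DELEGATES_BY_VENDOR = {
    "Qualcomm": ["CPU", "GPU", "Hexagon", "NNAPI"],
    "Samsung": ["CPU", "GPU", "NNAPI"],
    "MediaTek": ["CPU", "GPU", "NNAPI"],
    "HiSilicon": ["CPU", "GPU", "NNAPI"],
    "NVIDIA": ["CPU", "GPU", "NNAPI"],
}

# Assumed order in which to try delegates; not a recommendation from Ultralytics.
TRIAL_ORDER = ("Hexagon", "GPU", "NNAPI", "CPU")


def delegates_to_test(vendor: str) -> list[str]:
    """Return the delegates worth benchmarking for a vendor, in trial order."""
    supported = DELEGATES_BY_VENDOR.get(vendor, ["CPU"])  # unknown vendor: CPU only
    return [delegate for delegate in TRIAL_ORDER if delegate in supported]


print(delegates_to_test("Qualcomm"))
print(delegates_to_test("MediaTek"))
```

Falling back to CPU for unknown vendors reflects the fact that CPU is the default option on every device.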
|
||||
|
||||
## Getting Started with the Ultralytics Android App
|
||||
|
||||
To get started with the Ultralytics Android App, follow these steps:
|
||||
|
||||
1. Download the Ultralytics App from the [Google Play Store](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app).
|
||||
|
||||
2. Launch the app on your Android device and sign in with your Ultralytics account. If you don't have an account yet, create one [here](https://hub.ultralytics.com/).
|
||||
|
||||
3. Once signed in, you will see a list of your trained YOLO models. Select a model to use for object detection.
|
||||
|
||||
4. Grant the app permission to access your device's camera.
|
||||
|
||||
5. Point your device's camera at objects you want to detect. The app will display bounding boxes and class labels in real-time as it detects objects.
|
||||
|
||||
6. Explore the app's settings to adjust the detection threshold, enable or disable specific object classes, and more.
|
||||
|
||||
With the Ultralytics Android App, you now have the power of real-time object detection using YOLO models right at your fingertips. Enjoy exploring the app's features and optimizing its settings to suit your specific use cases.
|
||||
48
docs/en/hub/app/index.md
Normal file
|
|
@ -0,0 +1,48 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the Ultralytics HUB App, offering the ability to run YOLOv5 and YOLOv8 models on your iOS and Android devices with optimized performance.
|
||||
keywords: Ultralytics, HUB App, YOLOv5, YOLOv8, mobile AI, real-time object detection, image recognition, mobile device, hardware acceleration, Apple Neural Engine, Android GPU, NNAPI, custom model training
|
||||
---
|
||||
|
||||
# Ultralytics HUB App
|
||||
|
||||
<a href="https://bit.ly/ultralytics_hub" target="_blank">
|
||||
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/im/ultralytics-hub.png" alt="Ultralytics HUB preview image"></a>
|
||||
<br>
|
||||
<div align="center">
|
||||
<a href="https://github.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="Ultralytics GitHub"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.linkedin.com/company/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="Ultralytics LinkedIn"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://twitter.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="3%" alt="Ultralytics Twitter"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://youtube.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="3%" alt="Ultralytics YouTube"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.tiktok.com/@ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="3%" alt="Ultralytics TikTok"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.instagram.com/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="Ultralytics Instagram"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://ultralytics.com/discord"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-discord.png" width="3%" alt="Ultralytics Discord"></a>
|
||||
<br>
|
||||
<br>
|
||||
<a href="https://apps.apple.com/xk/app/ultralytics/id1583935240" style="text-decoration:none;">
|
||||
<img src="https://raw.githubusercontent.com/ultralytics/assets/master/app/app-store.svg" width="15%" alt="" /></a>
|
||||
<a href="https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app" style="text-decoration:none;">
|
||||
<img src="https://raw.githubusercontent.com/ultralytics/assets/master/app/google-play.svg" width="15%" alt="" /></a>
|
||||
</div>
|
||||
|
||||
Welcome to the Ultralytics HUB App! We are excited to introduce this powerful mobile app that allows you to run YOLOv5 and YOLOv8 models directly on your [iOS](https://apps.apple.com/xk/app/ultralytics/id1583935240) and [Android](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app) devices. With the HUB App, you can utilize hardware acceleration features like Apple's Neural Engine (ANE) or Android GPU and Neural Networks API (NNAPI) delegates to achieve impressive performance on your mobile device.
|
||||
|
||||
## Features
|
||||
|
||||
- **Run YOLOv5 and YOLOv8 models**: Experience the power of YOLO models on your mobile device for real-time object detection and image recognition tasks.
|
||||
- **Hardware Acceleration**: Benefit from Apple ANE on iOS devices or Android GPU and NNAPI delegates for optimized performance.
|
||||
- **Custom Model Training**: Train custom models with the Ultralytics HUB platform and preview them live using the HUB App.
|
||||
- **Mobile Compatibility**: The HUB App supports both iOS and Android devices, bringing the power of YOLO models to a wide range of users.
|
||||
|
||||
## App Documentation
|
||||
|
||||
- [**iOS**](./ios.md): Learn about YOLO CoreML models accelerated on Apple's Neural Engine for iPhones and iPads.
|
||||
- [**Android**](./android.md): Explore TFLite acceleration on Android mobile devices.
|
||||
|
||||
Get started today by downloading the Ultralytics HUB App on your mobile device and unlock the potential of YOLOv5 and YOLOv8 models on the go. Don't forget to check out our comprehensive [HUB Docs](../index.md) for more information on training, deploying, and using your custom models with the Ultralytics HUB platform.
|
||||
79
docs/en/hub/app/ios.md
Normal file
|
|
@ -0,0 +1,79 @@
|
|||
---
|
||||
comments: true
|
||||
description: Execute object detection in real-time on your iOS devices utilizing YOLO models. Leverage the power of the Apple Neural Engine and Core ML for fast and efficient object detection.
|
||||
keywords: Ultralytics, iOS app, object detection, YOLO models, real time, Apple Neural Engine, Core ML, FP16, INT8, quantization
|
||||
---
|
||||
|
||||
# Ultralytics iOS App: Real-time Object Detection with YOLO Models
|
||||
|
||||
<a href="https://bit.ly/ultralytics_hub" target="_blank">
|
||||
<img width="100%" src="https://user-images.githubusercontent.com/26833433/281124469-6b3b0945-dbb1-44c8-80a9-ef6bc778b299.jpg" alt="Ultralytics HUB preview image"></a>
|
||||
<br>
|
||||
<div align="center">
|
||||
<a href="https://github.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="Ultralytics GitHub"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.linkedin.com/company/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="Ultralytics LinkedIn"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://twitter.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="3%" alt="Ultralytics Twitter"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://youtube.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="3%" alt="Ultralytics YouTube"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.tiktok.com/@ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="3%" alt="Ultralytics TikTok"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.instagram.com/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="Ultralytics Instagram"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://ultralytics.com/discord"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-discord.png" width="3%" alt="Ultralytics Discord"></a>
|
||||
<br>
|
||||
<br>
|
||||
<a href="https://apps.apple.com/xk/app/ultralytics/id1583935240" style="text-decoration:none;">
|
||||
<img src="https://raw.githubusercontent.com/ultralytics/assets/master/app/app-store.svg" width="15%" alt="" /></a>
|
||||
</div>
|
||||
|
||||
The Ultralytics iOS App is a powerful tool that allows you to run YOLO models directly on your iPhone or iPad for real-time object detection. This app utilizes the Apple Neural Engine and Core ML for model optimization and acceleration, enabling fast and efficient object detection.
|
||||
|
||||
## Quantization and Acceleration
|
||||
|
||||
To achieve real-time performance on your iOS device, YOLO models are quantized to either FP16 or INT8 precision. Quantization is a process that reduces the numerical precision of the model's weights and biases, thus reducing the model's size and the amount of computation required. This results in faster inference times without significantly affecting the model's accuracy.
|
||||
|
||||
### FP16 Quantization
|
||||
|
||||
FP16 (or half-precision) quantization converts the model's 32-bit floating-point numbers to 16-bit floating-point numbers. This reduces the model's size by half and speeds up the inference process, while maintaining a good balance between accuracy and performance.
|
||||
|
||||
### INT8 Quantization
|
||||
|
||||
INT8 (or 8-bit integer) quantization further reduces the model's size and computation requirements by converting its 32-bit floating-point numbers to 8-bit integers. This quantization method can result in a significant speedup, but it may lead to a slight reduction in accuracy.
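
As a rough worked example of the size reduction, the storage needed for a model's weights scales with the number of bytes per value. The sketch below assumes a hypothetical model with 3.2 million parameters and ignores file-format overhead and any layers kept at higher precision, so real Core ML file sizes will differ.

```python
# Bytes used to store one weight at each precision.
BYTES_PER_WEIGHT = {"FP32": 4, "FP16": 2, "INT8": 1}


def weight_storage_mb(num_parameters: int, precision: str) -> float:
    """Estimate weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return num_parameters * BYTES_PER_WEIGHT[precision] / 1_000_000


# Assumed parameter count for illustration only.
num_parameters = 3_200_000

for precision in BYTES_PER_WEIGHT:
    size_mb = weight_storage_mb(num_parameters, precision)
    print(f"{precision}: {size_mb:.1f} MB")
```

Under these assumptions, FP16 needs half the storage of FP32 and INT8 needs a quarter.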
|
||||
|
||||
## Apple Neural Engine
|
||||
|
||||
The Apple Neural Engine (ANE) is a dedicated hardware component integrated into Apple's A-series and M-series chips. It's designed to accelerate machine learning tasks, particularly for neural networks, allowing for faster and more efficient execution of your YOLO models.
|
||||
|
||||
By combining quantized YOLO models with the Apple Neural Engine, the Ultralytics iOS App achieves real-time object detection on your iOS device with little loss of accuracy.
|
||||
|
||||
| Release Year | iPhone Name | Chipset Name | Node Size | ANE TOPS |
|
||||
|--------------|------------------------------------------------------|-------------------------------------------------------|-----------|----------|
|
||||
| 2017 | [iPhone X](https://en.wikipedia.org/wiki/IPhone_X) | [A11 Bionic](https://en.wikipedia.org/wiki/Apple_A11) | 10 nm | 0.6 |
|
||||
| 2018 | [iPhone XS](https://en.wikipedia.org/wiki/IPhone_XS) | [A12 Bionic](https://en.wikipedia.org/wiki/Apple_A12) | 7 nm | 5 |
|
||||
| 2019 | [iPhone 11](https://en.wikipedia.org/wiki/IPhone_11) | [A13 Bionic](https://en.wikipedia.org/wiki/Apple_A13) | 7 nm | 6 |
|
||||
| 2020 | [iPhone 12](https://en.wikipedia.org/wiki/IPhone_12) | [A14 Bionic](https://en.wikipedia.org/wiki/Apple_A14) | 5 nm | 11 |
|
||||
| 2021 | [iPhone 13](https://en.wikipedia.org/wiki/IPhone_13) | [A15 Bionic](https://en.wikipedia.org/wiki/Apple_A15) | 5 nm | 15.8 |
|
||||
| 2022 | [iPhone 14 Pro](https://en.wikipedia.org/wiki/IPhone_14_Pro) | [A16 Bionic](https://en.wikipedia.org/wiki/Apple_A16) | 4 nm | 17.0 |
|
||||
|
||||
Please note that this list only includes iPhone models from 2017 onwards, and the ANE TOPS (trillions of operations per second) values are approximate.
|
||||
|
||||
## Getting Started with the Ultralytics iOS App
|
||||
|
||||
To get started with the Ultralytics iOS App, follow these steps:
|
||||
|
||||
1. Download the Ultralytics App from the [App Store](https://apps.apple.com/xk/app/ultralytics/id1583935240).
|
||||
|
||||
2. Launch the app on your iOS device and sign in with your Ultralytics account. If you don't have an account yet, create one [here](https://hub.ultralytics.com/).
|
||||
|
||||
3. Once signed in, you will see a list of your trained YOLO models. Select a model to use for object detection.
|
||||
|
||||
4. Grant the app permission to access your device's camera.
|
||||
|
||||
5. Point your device's camera at objects you want to detect. The app will display bounding boxes and class labels in real-time as it detects objects.
|
||||
|
||||
6. Explore the app's settings to adjust the detection threshold, enable or disable specific object classes, and more.
|
||||
|
||||
With the Ultralytics iOS App, you can now leverage the power of YOLO models for real-time object detection on your iPhone or iPad, powered by the Apple Neural Engine and optimized with FP16 or INT8 quantization.
|
||||
159
docs/en/hub/datasets.md
Normal file
|
|
@ -0,0 +1,159 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how Ultralytics HUB datasets streamline your ML workflow. Upload, format, validate, access, share, edit or delete datasets for Ultralytics YOLO model training.
|
||||
keywords: Ultralytics, HUB datasets, YOLO model training, upload datasets, dataset validation, ML workflow, share datasets
|
||||
---
|
||||
|
||||
# HUB Datasets
|
||||
|
||||
[Ultralytics HUB](https://hub.ultralytics.com/) datasets are a practical solution for managing and leveraging your custom datasets.
|
||||
|
||||
Once uploaded, datasets can be immediately utilized for model training. This integrated approach facilitates a seamless transition from dataset management to model training, significantly simplifying the entire process.
|
||||
|
||||
## Upload Dataset
|
||||
|
||||
Ultralytics HUB datasets are just like YOLOv5 and YOLOv8 🚀 datasets. They use the same structure and the same label formats to keep everything simple.
|
||||
|
||||
Before you upload a dataset to Ultralytics HUB, **place your dataset YAML file inside the dataset root directory** and make sure that **your dataset YAML, directory and ZIP all have the same name**, as shown in the example below. Then zip the dataset directory.
|
||||
|
||||
For example, if your dataset is called "coco8", like our [COCO8](https://docs.ultralytics.com/datasets/detect/coco8) example dataset, then you should have a `coco8.yaml` inside your `coco8/` directory, which will produce a `coco8.zip` when zipped:
|
||||
|
||||
```bash
zip -r coco8.zip coco8
```
|
||||
|
||||
You can download our [COCO8](https://github.com/ultralytics/hub/blob/main/example_datasets/coco8.zip) example dataset and unzip it to see exactly how to structure your dataset.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_1.jpg" alt="COCO8 Dataset Structure" width="80%" />
|
||||
</p>
|
||||
|
||||
The dataset YAML uses the same standard format as YOLOv5 and YOLOv8 dataset YAML files.
|
||||
|
||||
!!! example "coco8.yaml"
|
||||
|
||||
```yaml
--8<-- "ultralytics/cfg/datasets/coco8.yaml"
```
|
||||
|
||||
After zipping your dataset, you should validate it before uploading it to Ultralytics HUB. Ultralytics HUB conducts the dataset validation check post-upload, so by ensuring your dataset is correctly formatted and error-free ahead of time, you can forestall any setbacks due to dataset rejection.
|
||||
|
||||
```py
from ultralytics.hub import check_dataset

check_dataset('path/to/coco8.zip')
```
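
A quick local check of the naming convention described above can catch the most common packaging mistake before you upload. The sketch below uses only the Python standard library; `check_zip_layout` is a hypothetical helper written for this example, not part of the `ultralytics` package, and it only verifies that the ZIP, its top-level directory, and the YAML file share one name.

```python
import zipfile
from pathlib import Path


def check_zip_layout(zip_path: str) -> list[str]:
    """Return a list of naming problems found in a dataset ZIP (empty list means OK)."""
    name = Path(zip_path).stem  # e.g. "coco8" for "coco8.zip"
    with zipfile.ZipFile(zip_path) as archive:
        members = archive.namelist()

    problems = []
    # Every entry should live under a single top-level directory named like the ZIP.
    top_levels = {member.split("/", 1)[0] for member in members}
    if top_levels != {name}:
        problems.append(f"expected one top-level directory '{name}/', found {sorted(top_levels)}")
    # The dataset YAML should sit directly inside that directory.
    if f"{name}/{name}.yaml" not in members:
        problems.append(f"missing '{name}/{name}.yaml'")
    return problems
```

Calling `check_zip_layout('path/to/coco8.zip')` returns an empty list when the layout matches the convention; it does not replace the full validation that Ultralytics HUB performs after upload.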
|
||||
|
||||
Once your dataset ZIP is ready, navigate to the [Datasets](https://hub.ultralytics.com/datasets) page by clicking on the **Datasets** button in the sidebar.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also upload a dataset directly from the [Home](https://hub.ultralytics.com/home) page.
|
||||
|
||||

|
||||
|
||||
Click on the **Upload Dataset** button on the top right of the page. This action will trigger the **Upload Dataset** dialog.
|
||||
|
||||

|
||||
|
||||
Upload your dataset in the _Dataset .zip file_ field.
|
||||
|
||||
You have the additional option to set a custom name and description for your Ultralytics HUB dataset.
|
||||
|
||||
When you're happy with your dataset configuration, click **Upload**.
|
||||
|
||||

|
||||
|
||||
After your dataset is uploaded and processed, you will be able to access it from the Datasets page.
|
||||
|
||||

|
||||
|
||||
You can view the images in your dataset grouped by splits (Train, Validation, Test).
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
Each image can be enlarged for better visualization.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
Also, you can analyze your dataset by clicking on the **Overview** tab.
|
||||
|
||||

|
||||
|
||||
Next, [train a model](https://docs.ultralytics.com/hub/models/#train-model) on your dataset.
|
||||
|
||||

|
||||
|
||||
## Share Dataset
|
||||
|
||||
!!! info "Info"
|
||||
|
||||
Ultralytics HUB's sharing functionality provides a convenient way to share datasets with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
|
||||
|
||||
??? note "Note"
|
||||
|
||||
You have control over the general access of your datasets.
|
||||
|
||||
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the dataset, regardless of whether they have an Ultralytics HUB account or not.
|
||||
|
||||
Navigate to the Dataset page of the dataset you want to share, open the dataset actions dropdown and click on the **Share** option. This action will trigger the **Share Dataset** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also share a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
|
||||
|
||||

|
||||
|
||||
Set the general access to "Unlisted" and click **Save**.
|
||||
|
||||

|
||||
|
||||
Now, anyone who has the direct link to your dataset can view it.
|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
Click on the dataset's link shown in the **Share Dataset** dialog to copy it.
|
||||
|
||||

|
||||
|
||||
## Edit Dataset
|
||||
|
||||
Navigate to the Dataset page of the dataset you want to edit, open the dataset actions dropdown and click on the **Edit** option. This action will trigger the **Update Dataset** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also edit a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
|
||||
|
||||

|
||||
|
||||
Apply the desired modifications to your dataset and then confirm the changes by clicking **Save**.
|
||||
|
||||

|
||||
|
||||
## Delete Dataset
|
||||
|
||||
Navigate to the Dataset page of the dataset you want to delete, open the dataset actions dropdown and click on the **Delete** option. This action will delete the dataset.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also delete a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
|
||||
|
||||

|
||||
|
||||
??? note "Note"
|
||||
|
||||
If you change your mind, you can restore the dataset from the [Trash](https://hub.ultralytics.com/trash) page.
|
||||
|
||||

|
||||
61
docs/en/hub/index.md
Normal file
|
|
@ -0,0 +1,61 @@
|
|||
---
|
||||
comments: true
|
||||
description: Gain insights into training and deploying your YOLOv5 and YOLOv8 models with Ultralytics HUB. Explore pre-trained models, templates and various integrations.
|
||||
keywords: Ultralytics HUB, YOLOv5, YOLOv8, model training, model deployment, pretrained models, model integrations
|
||||
---
|
||||
|
||||
# Ultralytics HUB
|
||||
|
||||
<a href="https://bit.ly/ultralytics_hub" target="_blank">
|
||||
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/im/ultralytics-hub.png" alt="Ultralytics HUB preview image"></a>
|
||||
<br>
|
||||
<div align="center">
|
||||
<a href="https://github.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="Ultralytics GitHub"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.linkedin.com/company/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="Ultralytics LinkedIn"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://twitter.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="3%" alt="Ultralytics Twitter"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://youtube.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="3%" alt="Ultralytics YouTube"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.tiktok.com/@ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="3%" alt="Ultralytics TikTok"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.instagram.com/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="Ultralytics Instagram"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://ultralytics.com/discord"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-discord.png" width="3%" alt="Ultralytics Discord"></a>
|
||||
<br>
|
||||
<br>
|
||||
<a href="https://github.com/ultralytics/hub/actions/workflows/ci.yaml">
|
||||
<img src="https://github.com/ultralytics/hub/actions/workflows/ci.yaml/badge.svg" alt="CI CPU"></a>
|
||||
<a href="https://colab.research.google.com/github/ultralytics/hub/blob/master/hub.ipynb">
|
||||
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
|
||||
</div>
|
||||
|
||||
👋 Hello from the [Ultralytics](https://ultralytics.com/) Team! We've been working hard these last few months to launch [Ultralytics HUB](https://bit.ly/ultralytics_hub), a new web tool for training and deploying all your YOLOv5 and YOLOv8 🚀 models from one spot!
|
||||
|
||||
## Introduction
|
||||
|
||||
HUB is designed to be user-friendly and intuitive, with a drag-and-drop interface that allows users to easily upload their data and train new models quickly. It offers a range of pre-trained models and templates to choose from, making it easy for users to get started with training their own models. Once a model is trained, it can be easily deployed and used for real-time object detection, instance segmentation and classification tasks.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/lveF9iCMIzc?si=_Q4WB5kMB5qNe7q6"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Train Your Custom YOLO Models In A Few Clicks with Ultralytics HUB.
|
||||
</p>
|
||||
|
||||
We hope that the resources here will help you get the most out of HUB. Please browse the HUB <a href="https://docs.ultralytics.com/hub">Docs</a> for details, raise an issue on <a href="https://github.com/ultralytics/hub/issues/new/choose">GitHub</a> for support, and join our <a href="https://ultralytics.com/discord">Discord</a> community for questions and discussions!
|
||||
|
||||
- [**Quickstart**](./quickstart.md). Start training and deploying YOLO models with HUB in seconds.
|
||||
- [**Datasets: Preparing and Uploading**](./datasets.md). Learn how to prepare and upload your datasets to HUB in YOLO format.
|
||||
- [**Projects: Creating and Managing**](./projects.md). Group your models into projects for improved organization.
|
||||
- [**Models: Training and Exporting**](./models.md). Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
|
||||
- [**Integrations: Options**](./integrations.md). Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
|
||||
- [**Ultralytics HUB App**](./app/index.md). Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
|
||||
* [**iOS**](./app/ios.md). Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
|
||||
* [**Android**](./app/android.md). Explore TFLite acceleration on mobile devices.
|
||||
- [**Inference API**](./inference_api.md). Understand how to use the Inference API for running your trained models in the cloud to generate predictions.
|
||||
458
docs/en/hub/inference_api.md
Normal file
|
|
@ -0,0 +1,458 @@
|
|||
---
|
||||
comments: true
|
||||
description: Access object detection capabilities of YOLOv8 via our RESTful API. Learn how to use the YOLO Inference API with Python or CLI for swift object detection.
|
||||
keywords: Ultralytics, YOLOv8, Inference API, object detection, RESTful API, Python, CLI, Quickstart
|
||||
---
|
||||
|
||||
# YOLO Inference API
|
||||
|
||||
The YOLO Inference API allows you to access the YOLOv8 object detection capabilities via a RESTful API. This enables you to run object detection on images without the need to install and set up the YOLOv8 environment locally.
|
||||
|
||||

|
||||
Screenshot of the Inference API section in the trained model Preview tab.
|
||||
|
||||
## API URL
|
||||
|
||||
The API URL is the address used to access the YOLO Inference API. In this case, the base URL is:
|
||||
|
||||
```
https://api.ultralytics.com/v1/predict
```
|
||||
|
||||
## Example Usage in Python
|
||||
|
||||
To access the YOLO Inference API with the specified model and API key using Python, you can use the following code:
|
||||
|
||||
```python
import requests

# API URL, use actual MODEL_ID
url = "https://api.ultralytics.com/v1/predict/MODEL_ID"

# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}

# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}

# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"image": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())
```
|
||||
|
||||
In this example, replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `path/to/image.jpg` with the path to the image you want to analyze.
|
||||
|
||||
## Example Usage with CLI
|
||||
|
||||
You can use the YOLO Inference API from the command-line interface (CLI) with the `curl` command. Replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `/path/to/image.jpg` with the path to the image you want to analyze:
|
||||
|
||||
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
    -H "x-api-key: API_KEY" \
    -F "image=@/path/to/image.jpg" \
    -F "size=640" \
    -F "confidence=0.25" \
    -F "iou=0.45"
```
|
||||
|
||||
## Passing Arguments
|
||||
|
||||
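The `curl` command above sends a POST request to the YOLO Inference API with the specified `MODEL_ID` in the URL and the `API_KEY` in the request headers, along with the image file specified by `@/path/to/image.jpg`. The optional `size`, `confidence`, and `iou` arguments are passed as additional form fields.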
|
||||
|
||||
Here's an example of passing the `size`, `confidence`, and `iou` arguments in the request body using the `requests` library in Python:
|
||||
|
||||
```python
import requests

# API URL, use actual MODEL_ID
url = "https://api.ultralytics.com/v1/predict/MODEL_ID"

# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}

# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}

# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"image": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())
```
|
||||
|
||||
In this example, the `data` dictionary contains the inference arguments `size`, `confidence`, and `iou`, which tell the API to run inference at image size 640 with confidence and IoU thresholds of 0.25 and 0.45, respectively.
|
||||
|
||||
The `requests` library sends these arguments as form fields alongside the image file in the POST request. See the table below for a full list of available inference arguments.
|
||||
|
||||
| Inference Argument | Default | Type    | Notes                                         |
|--------------------|---------|---------|-----------------------------------------------|
| `size`             | `640`   | `int`   | valid range is `32` - `1280` pixels           |
| `confidence`       | `0.25`  | `float` | valid range is `0.01` - `1.0`                 |
| `iou`              | `0.45`  | `float` | valid range is `0.0` - `0.95`                 |
| `url`              | `''`    | `str`   | optional image URL if no image file is passed |
| `normalize`        | `False` | `bool`  |                                               |
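
Checking arguments on the client before sending a request can save a round trip. The following is a minimal sketch that encodes the ranges from the table above; `ARG_RANGES` and `validate_inference_args` are hypothetical names introduced for illustration and are not part of the Inference API or the `ultralytics` package.

```python
# Allowed ranges copied from the table above: (minimum, maximum) per argument.
ARG_RANGES = {
    "size": (32, 1280),
    "confidence": (0.01, 1.0),
    "iou": (0.0, 0.95),
}


def validate_inference_args(args: dict) -> dict:
    """Raise ValueError if a known argument is out of range; otherwise return args unchanged."""
    for key, value in args.items():
        if key in ARG_RANGES:
            low, high = ARG_RANGES[key]
            if not low <= value <= high:
                raise ValueError(f"{key}={value} is outside the valid range {low} - {high}")
    return args


data = validate_inference_args({"size": 640, "confidence": 0.25, "iou": 0.45})
print(data)
```

The returned dictionary can be passed as the `data` argument of `requests.post` in the examples above.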
|
||||
|
||||
## Return JSON format
|
||||
|
||||
The YOLO Inference API returns a JSON list with the detection results. The format of the JSON list will be the same as the one produced locally by the `results[0].tojson()` command.
|
||||
|
||||
The JSON list contains information about the detected objects, their coordinates, classes, and confidence scores.
|
||||
|
||||
### Detect Model Format
|
||||
|
||||
YOLO detection models, such as `yolov8n.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
|
||||
|
||||
!!! example "Detect Model JSON Response"
|
||||
|
||||
=== "Local"
|
||||
```python
from ultralytics import YOLO

# Load model
model = YOLO('yolov8n.pt')

# Run inference
results = model('image.jpg')

# Print image.jpg results in JSON format
print(results[0].tojson())
```
|
||||
|
||||
=== "CLI API"
|
||||
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
    -H "x-api-key: API_KEY" \
    -F "image=@/path/to/image.jpg" \
    -F "size=640" \
    -F "confidence=0.25" \
    -F "iou=0.45"
```
|
||||
|
||||
=== "Python API"
|
||||
```python
import requests

# API URL, use actual MODEL_ID
url = "https://api.ultralytics.com/v1/predict/MODEL_ID"

# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}

# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}

# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"image": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())
```
|
||||
|
||||
=== "JSON Response"
|
||||
```json
{
  "success": true,
  "message": "Inference complete.",
  "data": [
    {
      "name": "person",
      "class": 0,
      "confidence": 0.8359682559967041,
      "box": {
        "x1": 0.08974208831787109,
        "y1": 0.27418340047200523,
        "x2": 0.8706787109375,
        "y2": 0.9887352837456598
      }
    },
    {
      "name": "person",
      "class": 0,
      "confidence": 0.8189555406570435,
      "box": {
        "x1": 0.5847355842590332,
        "y1": 0.05813225640190972,
        "x2": 0.8930277824401855,
        "y2": 0.9903111775716146
      }
    },
    {
      "name": "tie",
      "class": 27,
      "confidence": 0.2909725308418274,
      "box": {
        "x1": 0.3433395862579346,
        "y1": 0.6070465511745877,
        "x2": 0.40964522361755373,
        "y2": 0.9849439832899306
      }
    }
  ]
}
```
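
The box coordinates in the sample response above all lie between 0 and 1, which suggests they are expressed as fractions of the image width and height. Assuming that is the case, the sketch below scales one detection back to pixel coordinates. `box_to_pixels` is a hypothetical helper, and the 640x360 image size is an arbitrary value chosen for illustration.

```python
def box_to_pixels(box: dict, width: int, height: int) -> tuple[int, int, int, int]:
    """Scale a normalized x1/y1/x2/y2 box to integer pixel coordinates."""
    return (
        round(box["x1"] * width),
        round(box["y1"] * height),
        round(box["x2"] * width),
        round(box["y2"] * height),
    )


# One detection copied from the sample response above.
detection = {
    "name": "person",
    "class": 0,
    "confidence": 0.8359682559967041,
    "box": {
        "x1": 0.08974208831787109,
        "y1": 0.27418340047200523,
        "x2": 0.8706787109375,
        "y2": 0.9887352837456598,
    },
}

# 640x360 is a hypothetical image size (assumption).
print(detection["name"], box_to_pixels(detection["box"], width=640, height=360))
```

If your responses contain pixel values instead, this scaling step is unnecessary.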
|
||||
|
||||
### Segment Model Format
|
||||
|
||||
YOLO segmentation models, such as `yolov8n-seg.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
|
||||
|
||||
!!! example "Segment Model JSON Response"
|
||||
|
||||
=== "Local"
|
||||
```python
from ultralytics import YOLO

# Load model
model = YOLO('yolov8n-seg.pt')

# Run inference
results = model('image.jpg')

# Print image.jpg results in JSON format
print(results[0].tojson())
```
|
||||
|
||||
=== "CLI API"
|
||||
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
    -H "x-api-key: API_KEY" \
    -F "image=@/path/to/image.jpg" \
    -F "size=640" \
    -F "confidence=0.25" \
    -F "iou=0.45"
```
|
||||
|
||||
=== "Python API"
|
||||
```python
import requests

# API URL, use actual MODEL_ID
url = "https://api.ultralytics.com/v1/predict/MODEL_ID"

# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}

# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}

# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"image": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())
```
|
||||
|
||||
=== "JSON Response"
|
||||
Note that the lengths of the `segments` `x` and `y` lists may vary from one object to another. Larger or more complex objects may have more segment points.
|
||||
```json
{
  "success": true,
  "message": "Inference complete.",
  "data": [
    {
      "name": "person",
      "class": 0,
      "confidence": 0.856913149356842,
      "box": {
        "x1": 0.1064866065979004,
        "y1": 0.2798851860894097,
        "x2": 0.8738358497619629,
        "y2": 0.9894873725043403
      },
      "segments": {
        "x": [
          0.421875,
          0.4203124940395355,
          0.41718751192092896
          ...
        ],
        "y": [
          0.2888889014720917,
          0.2916666567325592,
          0.2916666567325592
          ...
        ]
      }
    },
    {
      "name": "person",
      "class": 0,
      "confidence": 0.8512625694274902,
      "box": {
        "x1": 0.5757311820983887,
        "y1": 0.053943040635850696,
        "x2": 0.8960096359252929,
        "y2": 0.985154045952691
      },
      "segments": {
        "x": [
          0.7515624761581421,
          0.75,
          0.7437499761581421
          ...
        ],
        "y": [
          0.0555555559694767,
          0.05833333358168602,
          0.05833333358168602
          ...
        ]
      }
    },
    {
      "name": "tie",
      "class": 27,
      "confidence": 0.6485961675643921,
      "box": {
        "x1": 0.33911995887756347,
        "y1": 0.6057066175672743,
        "x2": 0.4081430912017822,
        "y2": 0.9916408962673611
      },
      "segments": {
        "x": [
          0.37187498807907104,
          0.37031251192092896,
          0.3687500059604645
          ...
        ],
        "y": [
          0.6111111044883728,
          0.6138888597488403,
          0.6138888597488403
          ...
        ]
      }
    }
  ]
}
```
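
Because `segments` stores the mask outline as two parallel lists, a common first step is to pair them into polygon vertices. The sketch below does that and computes the enclosed area with the shoelace formula. Both helper names are hypothetical, and a unit square stands in for real data because the sample response above is truncated.

```python
def segments_to_polygon(segments: dict) -> list[tuple[float, float]]:
    """Pair the parallel x and y lists into (x, y) polygon vertices."""
    return list(zip(segments["x"], segments["y"]))


def polygon_area(points: list[tuple[float, float]]) -> float:
    """Area of a simple polygon via the shoelace formula (in squared input units)."""
    total = 0.0
    # Walk each edge, wrapping from the last vertex back to the first.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


# A hypothetical unit-square outline, used instead of the truncated sample data.
segments = {"x": [0.0, 1.0, 1.0, 0.0], "y": [0.0, 0.0, 1.0, 1.0]}
polygon = segments_to_polygon(segments)
print(polygon_area(polygon))
```

With coordinates in the 0 to 1 range, the result is the fraction of the image covered by the mask outline.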
|
||||
|
||||
### Pose Model Format
|
||||
|
||||
YOLO pose models, such as `yolov8n-pose.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
|
||||
|
||||
!!! example "Pose Model JSON Response"
|
||||
|
||||
=== "Local"
|
||||
```python
from ultralytics import YOLO

# Load model
model = YOLO('yolov8n-pose.pt')

# Run inference
results = model('image.jpg')

# Print image.jpg results in JSON format
print(results[0].tojson())
```
|
||||
|
||||
=== "CLI API"
|
||||
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
    -H "x-api-key: API_KEY" \
    -F "image=@/path/to/image.jpg" \
    -F "size=640" \
    -F "confidence=0.25" \
    -F "iou=0.45"
```
|
||||
|
||||
=== "Python API"
|
||||
```python
import requests

# API URL, use actual MODEL_ID
url = "https://api.ultralytics.com/v1/predict/MODEL_ID"

# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}

# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}

# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"image": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())
```
|
||||
|
||||
=== "JSON Response"
|
||||
Note that COCO-keypoints pretrained models have 17 human keypoints. The `visible` part of the keypoints indicates whether a keypoint is visible or obscured. Obscured keypoints may be outside the image or hidden from view, e.g. a person's eyes when they are facing away from the camera.
|
||||
```json
{
  "success": true,
  "message": "Inference complete.",
  "data": [
    {
      "name": "person",
      "class": 0,
      "confidence": 0.8439509868621826,
      "box": {
        "x1": 0.1125,
        "y1": 0.28194444444444444,
        "x2": 0.7953125,
        "y2": 0.9902777777777778
      },
      "keypoints": {
        "x": [
          0.5058594942092896,
          0.5103894472122192,
          0.4920862317085266
          ...
        ],
        "y": [
          0.48964157700538635,
          0.4643048942089081,
          0.4465252459049225
          ...
        ],
        "visible": [
          0.8726999163627625,
          0.653947651386261,
          0.9130823612213135
          ...
        ]
      }
    },
    {
      "name": "person",
      "class": 0,
      "confidence": 0.7474289536476135,
      "box": {
        "x1": 0.58125,
        "y1": 0.0625,
        "x2": 0.8859375,
        "y2": 0.9888888888888889
      },
      "keypoints": {
        "x": [
          0.778544008731842,
          0.7976160049438477,
          0.7530890107154846
          ...
        ],
        "y": [
          0.27595141530036926,
          0.2378823608160019,
          0.23644638061523438
          ...
        ],
        "visible": [
          0.8900790810585022,
          0.789978563785553,
          0.8974530100822449
          ...
        ]
      }
    }
  ]
}
```
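
To work with the `keypoints` lists, it helps to attach names and drop points with low `visible` scores. The sketch below assumes the keypoints follow the standard COCO order and treats `visible` as a score between 0 and 1. The helper name `visible_keypoints` and the thresholds are illustrative choices, not values defined by the API.

```python
# Standard COCO keypoint order for the 17 human keypoints.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def visible_keypoints(keypoints: dict, threshold: float = 0.5) -> dict:
    """Map keypoint names to (x, y) for keypoints whose visibility score meets the threshold."""
    rows = zip(COCO_KEYPOINTS, keypoints["x"], keypoints["y"], keypoints["visible"])
    return {name: (x, y) for name, x, y, score in rows if score >= threshold}


# First three keypoints of the first person in the sample response above
# (the remaining values are elided there).
keypoints = {
    "x": [0.5058594942092896, 0.5103894472122192, 0.4920862317085266],
    "y": [0.48964157700538635, 0.4643048942089081, 0.4465252459049225],
    "visible": [0.8726999163627625, 0.653947651386261, 0.9130823612213135],
}
print(visible_keypoints(keypoints, threshold=0.7))
```

With a threshold of 0.7, the second keypoint (score of about 0.65) is filtered out while the other two are kept.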
|
||||
62
docs/en/hub/integrations.md
Normal file
|
|
@ -0,0 +1,62 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore integration options for Ultralytics HUB. Currently featuring Roboflow for dataset integration and multiple export formats for your trained models.
|
||||
keywords: Ultralytics HUB, Integrations, Roboflow, Dataset, Export, YOLOv5, YOLOv8, ONNX, CoreML, TensorRT, TensorFlow
|
||||
---
|
||||
|
||||
# HUB Integrations
|
||||
|
||||
🚧 **Under Construction** 🚧
|
||||
|
||||
Welcome to the Integrations guide for [Ultralytics HUB](https://hub.ultralytics.com/)! We are in the process of expanding this section to provide you with comprehensive guidance on integrating your YOLOv5 and YOLOv8 models with various platforms and formats. Currently, Roboflow is the only available dataset integration, alongside a wide range of export integrations for your trained models.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/lveF9iCMIzc?si=_Q4WB5kMB5qNe7q6"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Train Your Custom YOLO Models In A Few Clicks with Ultralytics HUB.
|
||||
</p>
|
||||
|
||||
## Available Integrations
|
||||
|
||||
### Dataset Integrations
|
||||
|
||||
- **Roboflow**: Seamlessly import your datasets for training.
|
||||
|
||||
### Export Integrations
|
||||
|
||||
| Format                                                             | `format` Argument | Model                     | Metadata | Arguments                                           |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
| [PyTorch](https://pytorch.org/)                                    | -                 | `yolov8n.pt`              | ✅        | -                                                   |
| [TorchScript](https://pytorch.org/docs/stable/jit.html)            | `torchscript`     | `yolov8n.torchscript`     | ✅        | `imgsz`, `optimize`                                 |
| [ONNX](https://onnx.ai/)                                           | `onnx`            | `yolov8n.onnx`            | ✅        | `imgsz`, `half`, `dynamic`, `simplify`, `opset`     |
| [OpenVINO](../integrations/openvino.md)                            | `openvino`        | `yolov8n_openvino_model/` | ✅        | `imgsz`, `half`                                     |
| [TensorRT](https://developer.nvidia.com/tensorrt)                  | `engine`          | `yolov8n.engine`          | ✅        | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
| [CoreML](https://github.com/apple/coremltools)                     | `coreml`          | `yolov8n.mlpackage`       | ✅        | `imgsz`, `half`, `int8`, `nms`                      |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model)      | `saved_model`     | `yolov8n_saved_model/`    | ✅        | `imgsz`, `keras`                                    |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb`              | `yolov8n.pb`              | ❌        | `imgsz`                                             |
| [TF Lite](https://www.tensorflow.org/lite)                         | `tflite`          | `yolov8n.tflite`          | ✅        | `imgsz`, `half`, `int8`                             |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/)         | `edgetpu`         | `yolov8n_edgetpu.tflite`  | ✅        | `imgsz`                                             |
| [TF.js](https://www.tensorflow.org/js)                             | `tfjs`            | `yolov8n_web_model/`      | ✅        | `imgsz`                                             |
| [PaddlePaddle](https://github.com/PaddlePaddle)                    | `paddle`          | `yolov8n_paddle_model/`   | ✅        | `imgsz`                                             |
| [NCNN](https://github.com/Tencent/ncnn)                            | `ncnn`            | `yolov8n_ncnn_model/`     | ✅        | `imgsz`, `half`                                     |
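
The `format` argument and per-format arguments in the table above map onto export options. As a rough sketch, and assuming you export locally with the `yolo export` command from the `ultralytics` package rather than through the HUB web interface, the helper below builds a command string and rejects arguments that the table does not list for a format. `EXPORT_ARGS` (a subset of the table) and `build_export_command` are hypothetical names used only for illustration.

```python
# Arguments listed per export format, copied from a few rows of the table above.
EXPORT_ARGS = {
    "torchscript": {"imgsz", "optimize"},
    "onnx": {"imgsz", "half", "dynamic", "simplify", "opset"},
    "openvino": {"imgsz", "half"},
    "coreml": {"imgsz", "half", "int8", "nms"},
    "tflite": {"imgsz", "half", "int8"},
}


def build_export_command(model: str, fmt: str, **kwargs) -> str:
    """Build a `yolo export` CLI string, rejecting arguments the format does not list."""
    unsupported = set(kwargs) - EXPORT_ARGS[fmt]
    if unsupported:
        raise ValueError(f"{fmt} does not list: {sorted(unsupported)}")
    extras = " ".join(f"{key}={value}" for key, value in kwargs.items())
    return f"yolo export model={model} format={fmt} {extras}".strip()


print(build_export_command("yolov8n.pt", "coreml", imgsz=640, int8=True))
```

The resulting string can be run in a shell where the `ultralytics` package is installed.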
|
||||
|
||||
## Coming Soon
|
||||
|
||||
- Additional Dataset Integrations
|
||||
- Detailed Export Integration Guides
|
||||
- Step-by-Step Tutorials for Each Integration
|
||||
|
||||
## Need Immediate Assistance?
|
||||
|
||||
While we're in the process of creating detailed guides:
|
||||
|
||||
- Browse through other [HUB Docs](https://docs.ultralytics.com/hub/) for detailed guides and tutorials.
|
||||
- Raise an issue on our [GitHub](https://github.com/ultralytics/hub/) for technical support.
|
||||
- Join our [Discord Community](https://ultralytics.com/discord/) for live discussions and community support.
|
||||
|
||||
We appreciate your patience as we work to make this section comprehensive and user-friendly. Stay tuned for updates!
|
||||
213
docs/en/hub/models.md
Normal file
|
|
@ -0,0 +1,213 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to use Ultralytics HUB models for efficient and user-friendly AI model training. For easy model creation, training, evaluation and deployment, follow our detailed guide.
|
||||
keywords: Ultralytics, HUB Models, AI model training, model creation, model training, model evaluation, model deployment
|
||||
---
|
||||
|
||||
# Ultralytics HUB Models
|
||||
|
||||
[Ultralytics HUB](https://hub.ultralytics.com/) models provide a streamlined solution for training vision AI models on your custom datasets.
|
||||
|
||||
The process is user-friendly and efficient, involving a simple three-step creation process and accelerated training powered by Ultralytics YOLOv8. During training, real-time updates on model metrics are available so that you can monitor each step of the progress. Once training is complete, you can preview your model and easily deploy it to real-world applications. Ultralytics HUB therefore offers a comprehensive yet straightforward system for model creation, training, evaluation, and deployment.
|
||||
|
||||
## Train Model
|
||||
|
||||
Navigate to the [Models](https://hub.ultralytics.com/models) page by clicking on the **Models** button in the sidebar.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also train a model directly from the [Home](https://hub.ultralytics.com/home) page.
|
||||
|
||||

|
||||
|
||||
Click on the **Train Model** button on the top right of the page. This action will trigger the **Train Model** dialog.
|
||||
|
||||

|
||||
|
||||
The **Train Model** dialog has three simple steps, explained below.
|
||||
|
||||
### 1. Dataset
|
||||
|
||||
In this step, select the dataset you want to train your model on. After you have selected a dataset, click **Continue**.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can skip this step if you train a model directly from the Dataset page.
|
||||
|
||||

|
||||
|
||||
### 2. Model
|
||||
|
||||
In this step, you have to choose the project in which you want to create your model, the name of your model and your model's architecture.
|
||||
|
||||
??? note "Note"
|
||||
|
||||
Ultralytics HUB will try to pre-select the project.
|
||||
|
||||
If you opened the **Train Model** dialog as described above, Ultralytics HUB will pre-select the last project you used.
|
||||
|
||||
If you opened the **Train Model** dialog from the Project page, Ultralytics HUB will pre-select the project you were inside of.
|
||||
|
||||

|
||||
|
||||
In case you don't have a project created yet, you can set the name of your project in this step and it will be created together with your model.
|
||||
|
||||

|
||||
|
||||
!!! info "Info"
|
||||
|
||||
You can read more about the available [YOLOv8](https://docs.ultralytics.com/models/yolov8) (and [YOLOv5](https://docs.ultralytics.com/models/yolov5)) architectures in our documentation.
|
||||
|
||||
When you're happy with your model configuration, click **Continue**.
|
||||
|
||||

|
||||
|
||||
??? note "Note"
|
||||
|
||||
By default, your model will use a pre-trained model (trained on the [COCO](https://docs.ultralytics.com/datasets/detect/coco) dataset) to reduce training time.
|
||||
|
||||
You can change this behaviour by opening the **Advanced Options** accordion.
|
||||
|
||||
### 3. Train
|
||||
|
||||
In this step, you will start training your model.
|
||||
|
||||
Ultralytics HUB offers three training options:
|
||||
|
||||
- Ultralytics Cloud **(COMING SOON)**
|
||||
- Google Colab
|
||||
- Bring your own agent
|
||||
|
||||
In order to start training your model, follow the instructions presented in this step.
|
||||
|
||||

|
||||
|
||||
??? note "Note"
|
||||
|
||||
When you are on this step, before the training starts, you can change the default training configuration by opening the **Advanced Options** accordion.
|
||||
|
||||

|
||||
|
||||
??? note "Note"
|
||||
|
||||
When you are on this step, you have the option to close the **Train Model** dialog and start training your model from the Model page later.
|
||||
|
||||

|
||||
|
||||
To start training your model using Google Colab, simply follow the instructions shown above or on the Google Colab notebook.
|
||||
|
||||
<a href="https://colab.research.google.com/github/ultralytics/hub/blob/master/hub.ipynb" target="_blank">
|
||||
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
|
||||
</a>
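
The **Train Model** dialog shows the exact snippet to run, with your own API key and model ID filled in. As a rough sketch of what such a script looks like when you bring your own agent: `API_KEY` and `MODEL_ID` are placeholders, the helper names `hub_model_url` and `train_hub_model` are hypothetical, and the model URL pattern is an assumption, so prefer the snippet copied from the dialog.

```python
# Assumed URL pattern for a HUB model page; copy the real URL from the dialog.
HUB_MODELS_URL = "https://hub.ultralytics.com/models"


def hub_model_url(model_id: str) -> str:
    """Build the (assumed) HUB URL for a model ID such as 'MODEL_ID'."""
    model_id = model_id.strip()
    if not model_id:
        raise ValueError("model_id must not be empty")
    return f"{HUB_MODELS_URL}/{model_id}"


def train_hub_model(api_key: str, model_id: str):
    """Log in to HUB and start training the model configured there."""
    # Imported lazily so the helper above works without `ultralytics` installed.
    from ultralytics import YOLO, hub

    hub.login(api_key)                     # authenticate with your HUB API key
    model = YOLO(hub_model_url(model_id))  # load the model defined in HUB
    return model.train()                   # dataset and settings come from HUB


# Example (requires `pip install ultralytics` and valid credentials):
# train_hub_model("API_KEY", "MODEL_ID")
```

Keeping the import inside `train_hub_model` is only a convenience for this sketch; in a Colab notebook you would normally import at the top of the cell.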
|
||||
|
||||
When the training starts, you can click **Done** and monitor the training progress on the Model page.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
??? note "Note"
|
||||
|
||||
In case the training stops and a checkpoint was saved, you can resume training your model from the Model page.
|
||||
|
||||

|
||||
|
||||
## Preview Model
|
||||
|
||||
Ultralytics HUB offers a variety of ways to preview your trained model.
|
||||
|
||||
You can preview your model if you click on the **Preview** tab and upload an image in the **Test** card.
|
||||
|
||||

|
||||
|
||||
You can also use our Ultralytics Cloud API to effortlessly [run inference](https://docs.ultralytics.com/hub/inference_api) with your custom model.
|
||||
|
||||

|
||||
|
||||
Furthermore, you can preview your model in real-time directly on your [iOS](https://apps.apple.com/xk/app/ultralytics/id1583935240) or [Android](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app) mobile device by [downloading](https://ultralytics.com/app_install) our [Ultralytics HUB Mobile Application](./app/index.md).
|
||||
|
||||

|
||||
|
||||
## Deploy Model
|
||||
|
||||
You can export your model to 13 different formats, including ONNX, OpenVINO, CoreML, TensorFlow, Paddle and many others.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can customize the export options of each format if you open the export actions dropdown and click on the **Advanced** option.
|
||||
|
||||

|
||||
|
||||
## Share Model
|
||||
|
||||
!!! info "Info"
|
||||
|
||||
Ultralytics HUB's sharing functionality provides a convenient way to share models with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
|
||||
|
||||
??? note "Note"
|
||||
|
||||
You have control over the general access of your models.
|
||||
|
||||
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the model, regardless of whether they have an Ultralytics HUB account or not.
|
||||
|
||||
Navigate to the Model page of the model you want to share, open the model actions dropdown and click on the **Share** option. This action will trigger the **Share Model** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also share a model directly from the [Models](https://hub.ultralytics.com/models) page or from the Project page of the project where your model is located.
|
||||
|
||||

|
||||
|
||||
Set the general access to "Unlisted" and click **Save**.
|
||||
|
||||

|
||||
|
||||
Now, anyone who has the direct link to your model can view it.
|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can easily click on the model's link shown in the **Share Model** dialog to copy it.
|
||||
|
||||

|
||||
|
||||
## Edit Model
|
||||
|
||||
Navigate to the Model page of the model you want to edit, open the model actions dropdown and click on the **Edit** option. This action will trigger the **Update Model** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also edit a model directly from the [Models](https://hub.ultralytics.com/models) page or from the Project page of the project where your model is located.
|
||||
|
||||

|
||||
|
||||
Apply the desired modifications to your model and then confirm the changes by clicking **Save**.
|
||||
|
||||

|
||||
|
||||
## Delete Model
|
||||
|
||||
Navigate to the Model page of the model you want to delete, open the model actions dropdown and click on the **Delete** option. This action will delete the model.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also delete a model directly from the [Models](https://hub.ultralytics.com/models) page or from the Project page of the project where your model is located.
|
||||
|
||||

|
||||
|
||||
??? note "Note"
|
||||
|
||||
If you change your mind, you can restore the model from the [Trash](https://hub.ultralytics.com/trash) page.
|
||||
|
||||

|
||||
169
docs/en/hub/projects.md
Normal file
|
|
@ -0,0 +1,169 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to manage Ultralytics HUB projects. Understand effective strategies to create, share, edit, delete, and compare models in an organized workspace.
|
||||
keywords: Ultralytics, HUB projects, Create project, Edit project, Share project, Delete project, Compare Models, Model Management
|
||||
---
|
||||
|
||||
# Ultralytics HUB Projects
|
||||
|
||||
[Ultralytics HUB](https://hub.ultralytics.com/) projects provide an effective solution for consolidating and managing your models. If you are working with several models that perform similar tasks or have related purposes, Ultralytics HUB projects allow you to group these models together.
|
||||
|
||||
This creates a unified and organized workspace that facilitates easier model management, comparison and development. Having similar models or various iterations together can facilitate rapid benchmarking, as you can compare their effectiveness. This can lead to faster, more insightful iterative development and refinement of your models.
|
||||
|
||||
## Create Project
|
||||
|
||||
Navigate to the [Projects](https://hub.ultralytics.com/projects) page by clicking on the **Projects** button in the sidebar.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also create a project directly from the [Home](https://hub.ultralytics.com/home) page.
|
||||
|
||||

|
||||
|
||||
Click on the **Create Project** button on the top right of the page. This action will trigger the **Create Project** dialog, opening up a suite of options for tailoring your project to your needs.
|
||||
|
||||

|
||||
|
||||
Type the name of your project in the _Project name_ field or keep the default name and finalize the project creation with a single click.
|
||||
|
||||
You have the additional option to enrich your project with a description and a unique image, enhancing its recognizability on the Projects page.
|
||||
|
||||
When you're happy with your project configuration, click **Create**.
|
||||
|
||||

|
||||
|
||||
After your project is created, you will be able to access it from the Projects page.
|
||||
|
||||

|
||||
|
||||
Next, [train a model](https://docs.ultralytics.com/hub/models/#train-model) inside your project.
|
||||
|
||||

|
||||
|
||||
## Share Project
|
||||
|
||||
!!! info "Info"
|
||||
|
||||
Ultralytics HUB's sharing functionality provides a convenient way to share projects with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
|
||||
|
||||
??? note "Note"
|
||||
|
||||
You have control over the general access of your projects.
|
||||
|
||||
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the project, regardless of whether they have an Ultralytics HUB account or not.
|
||||
|
||||
Navigate to the Project page of the project you want to share, open the project actions dropdown and click on the **Share** option. This action will trigger the **Share Project** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also share a project directly from the [Projects](https://hub.ultralytics.com/projects) page.
|
||||
|
||||

|
||||
|
||||
Set the general access to "Unlisted" and click **Save**.
|
||||
|
||||

|
||||
|
||||
!!! warning "Warning"
|
||||
|
||||
When changing the general access of a project, the general access of the models inside the project will be changed as well.
|
||||
|
||||
Now, anyone who has the direct link to your project can view it.
|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can easily click on the project's link shown in the **Share Project** dialog to copy it.
|
||||
|
||||

|
||||
|
||||
## Edit Project
|
||||
|
||||
Navigate to the Project page of the project you want to edit, open the project actions dropdown and click on the **Edit** option. This action will trigger the **Update Project** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also edit a project directly from the [Projects](https://hub.ultralytics.com/projects) page.
|
||||
|
||||

|
||||
|
||||
Apply the desired modifications to your project and then confirm the changes by clicking **Save**.
|
||||
|
||||

|
||||
|
||||
## Delete Project
|
||||
|
||||
Navigate to the Project page of the project you want to delete, open the project actions dropdown and click on the **Delete** option. This action will delete the project.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also delete a project directly from the [Projects](https://hub.ultralytics.com/projects) page.
|
||||
|
||||

|
||||
|
||||
!!! warning "Warning"
|
||||
|
||||
When deleting a project, the models inside the project will be deleted as well.
|
||||
|
||||
??? note "Note"
|
||||
|
||||
If you change your mind, you can restore the project from the [Trash](https://hub.ultralytics.com/trash) page.
|
||||
|
||||

|
||||
|
||||
## Compare Models
|
||||
|
||||
Navigate to the Project page of the project where the models you want to compare are located. To use the model comparison feature, click on the **Charts** tab.
|
||||
|
||||

|
||||
|
||||
This will display all the relevant charts. Each chart corresponds to a different metric and contains the performance of each model for that metric. The models are represented by different colors and you can hover over each data point to get more information.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
Each chart can be enlarged for better visualization.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You have the flexibility to customize your view by selectively hiding certain models. This feature allows you to concentrate on the models of interest.
|
||||
|
||||

|
||||
|
||||
## Reorder Models
|
||||
|
||||
??? note "Note"
|
||||
|
||||
Ultralytics HUB's reordering functionality works only inside projects you own.
|
||||
|
||||
Navigate to the Project page of the project where the models you want to reorder are located. Click on the designated reorder icon of the model you want to move and drag it to the desired location.
|
||||
|
||||

|
||||
|
||||
## Transfer Models
|
||||
|
||||
Navigate to the Project page of the project where the model you want to move is located, open the project actions dropdown and click on the **Transfer** option. This action will trigger the **Transfer Model** dialog.
|
||||
|
||||

|
||||
|
||||
??? tip "Tip"
|
||||
|
||||
You can also transfer a model directly from the [Models](https://hub.ultralytics.com/models) page.
|
||||
|
||||

|
||||
|
||||
Select the project you want to transfer the model to and click **Save**.
|
||||
|
||||

|
||||
52
docs/en/hub/quickstart.md
Normal file
|
|
@ -0,0 +1,52 @@
|
|||
---
|
||||
comments: true
|
||||
description: Kickstart your journey with Ultralytics HUB. Learn how to train and deploy YOLOv5 and YOLOv8 models in seconds with our Quickstart guide.
|
||||
keywords: Ultralytics HUB, Quickstart, YOLOv5, YOLOv8, model training, quick deployment, drag-and-drop interface, real-time object detection
|
||||
---
|
||||
|
||||
# Quickstart Guide for Ultralytics HUB
|
||||
|
||||
🚧 **Under Construction** 🚧
|
||||
|
||||
Thank you for visiting the Quickstart guide for [Ultralytics HUB](https://hub.ultralytics.com/)! We're currently hard at work building out this page to provide you with step-by-step instructions on how to get up and running with HUB in no time.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/lveF9iCMIzc?si=_Q4WB5kMB5qNe7q6"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Train Your Custom YOLO Models In A Few Clicks with Ultralytics HUB.
|
||||
</p>
|
||||
|
||||
In the meantime, here's a brief overview of what you can expect from Ultralytics HUB:
|
||||
|
||||
## What is Ultralytics HUB?
|
||||
|
||||
Ultralytics HUB is your one-stop solution for training and deploying YOLOv5 and YOLOv8 models. It's designed with user experience in mind, featuring a drag-and-drop interface to make uploading data and training new models a breeze. Whether you're a beginner or an experienced machine learning practitioner, HUB has a range of pre-trained models and templates to accelerate your projects.
|
||||
|
||||
## Key Features
|
||||
|
||||
- **User-Friendly Interface**: Simply drag and drop your data to start training.
|
||||
- **Pre-Trained Models**: Choose from a selection of pre-trained models to kick-start your projects.
|
||||
- **Real-Time Object Detection**: Deploy trained models easily for real-time object detection, instance segmentation, and classification tasks.
|
||||
|
||||
## Coming Soon
|
||||
|
||||
- Detailed Steps to Start Your First Project
|
||||
- Guide on Preparing and Uploading Datasets
|
||||
- Tutorial on Model Training and Exporting
|
||||
- Integration Options and How-To's
|
||||
- And much more!
|
||||
|
||||
## Need Help Now?
|
||||
|
||||
While we're polishing this page, feel free to:
|
||||
|
||||
- Browse through other [HUB Docs](https://docs.ultralytics.com/hub/) for detailed guides and tutorials.
|
||||
- Raise an issue on our [GitHub](https://github.com/ultralytics/hub/) for technical support.
|
||||
- Join our [Discord Community](https://ultralytics.com/discord/) for live discussions and community support.
|
||||
|
||||
Stay tuned! We'll be back soon with more detailed information to help you get the most out of Ultralytics HUB. Thank you for your patience and interest!
|
||||
78
docs/en/index.md
Normal file
|
|
@ -0,0 +1,78 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore a complete guide to Ultralytics YOLOv8, a high-speed, high-accuracy object detection & image segmentation model. Installation, prediction, training tutorials and more.
|
||||
keywords: Ultralytics, YOLOv8, object detection, image segmentation, machine learning, deep learning, computer vision, YOLOv8 installation, YOLOv8 prediction, YOLOv8 training, YOLO history, YOLO licenses
|
||||
---
|
||||
|
||||
<div align="center">
|
||||
<p>
|
||||
<a href="https://yolovision.ultralytics.com" target="_blank">
|
||||
<img width="1024" src="https://raw.githubusercontent.com/ultralytics/assets/main/yolov8/banner-yolov8.png" alt="Ultralytics YOLO banner"></a>
|
||||
</p>
|
||||
<a href="https://github.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="Ultralytics GitHub"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.linkedin.com/company/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="Ultralytics LinkedIn"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://twitter.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="3%" alt="Ultralytics Twitter"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://youtube.com/ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="3%" alt="Ultralytics YouTube"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.tiktok.com/@ultralytics"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="3%" alt="Ultralytics TikTok"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://www.instagram.com/ultralytics/"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="Ultralytics Instagram"></a>
|
||||
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%">
|
||||
<a href="https://ultralytics.com/discord"><img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-discord.png" width="3%" alt="Ultralytics Discord"></a>
|
||||
<br>
|
||||
<br>
|
||||
<a href="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml"><img src="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml/badge.svg" alt="Ultralytics CI"></a>
|
||||
<a href="https://codecov.io/github/ultralytics/ultralytics"><img src="https://codecov.io/github/ultralytics/ultralytics/branch/main/graph/badge.svg?token=HHW7IIVFVY" alt="Ultralytics Code Coverage"></a>
|
||||
<a href="https://zenodo.org/badge/latestdoi/264818686"><img src="https://zenodo.org/badge/264818686.svg" alt="YOLOv8 Citation"></a>
|
||||
<a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
|
||||
<br>
|
||||
<a href="https://console.paperspace.com/github/ultralytics/ultralytics"><img src="https://assets.paperspace.io/img/gradient-badge.svg" alt="Run on Gradient"/></a>
|
||||
<a href="https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
|
||||
<a href="https://www.kaggle.com/ultralytics/yolov8"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open In Kaggle"></a>
|
||||
</div>
|
||||
|
||||
Introducing [Ultralytics](https://ultralytics.com) [YOLOv8](https://github.com/ultralytics/ultralytics), the latest version of the acclaimed real-time object detection and image segmentation model. YOLOv8 is built on cutting-edge advancements in deep learning and computer vision, offering unparalleled performance in terms of speed and accuracy. Its streamlined design makes it suitable for various applications and easily adaptable to different hardware platforms, from edge devices to cloud APIs.
|
||||
|
||||
Explore the YOLOv8 Docs, a comprehensive resource designed to help you understand and utilize its features and capabilities. Whether you are a seasoned machine learning practitioner or new to the field, this hub aims to maximize YOLOv8's potential in your projects.
|
||||
|
||||
## Where to Start
|
||||
|
||||
- **Install** `ultralytics` with pip and get up and running in minutes [:material-clock-fast: Get Started](quickstart.md){ .md-button }
|
||||
- **Predict** new images and videos with YOLOv8 [:octicons-image-16: Predict on Images](modes/predict.md){ .md-button }
|
||||
- **Train** a new YOLOv8 model on your own custom dataset [:fontawesome-solid-brain: Train a Model](modes/train.md){ .md-button }
|
||||
- **Explore** YOLOv8 tasks like segment, classify, pose and track [:material-magnify-expand: Explore Tasks](tasks/index.md){ .md-button }
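
The first three steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming `ultralytics` has been installed with `pip install ultralytics`; the function name `quickstart` and its default values (the `yolov8n.pt` weights, the small `coco128.yaml` dataset, and a sample image URL) are choices made for this example, not requirements.

```python
def quickstart(
    weights="yolov8n.pt",
    data="coco128.yaml",
    source="https://ultralytics.com/images/bus.jpg",
    epochs=1,
):
    """Load a pretrained model, fine-tune it briefly, then run prediction."""
    # Imported lazily so this file can be loaded without `ultralytics` installed.
    from ultralytics import YOLO

    model = YOLO(weights)                  # load pretrained weights
    model.train(data=data, epochs=epochs)  # train on the given dataset config
    return model.predict(source)           # run inference on an image or video


# Example (downloads weights and data on first run):
# results = quickstart()
```

One epoch is only enough to confirm that the pipeline runs; real training needs many more epochs and your own dataset configuration.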
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/LNwODJXcvt4?si=7n1UvGRLSd9p5wKs"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> How to Train a YOLOv8 model on Your Custom Dataset in <a href="https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb" target="_blank">Google Colab</a>.
|
||||
</p>
|
||||
|
||||
## YOLO: A Brief History
|
||||
|
||||
[YOLO](https://arxiv.org/abs/1506.02640) (You Only Look Once), a popular object detection and image segmentation model, was developed by Joseph Redmon and Ali Farhadi at the University of Washington. Launched in 2015, YOLO quickly gained popularity for its high speed and accuracy.
|
||||
|
||||
- [YOLOv2](https://arxiv.org/abs/1612.08242), released in 2016, improved the original model by incorporating batch normalization, anchor boxes, and dimension clusters.
|
||||
- [YOLOv3](https://pjreddie.com/media/files/papers/YOLOv3.pdf), launched in 2018, further enhanced the model's performance using a more efficient backbone network, multiple anchors and spatial pyramid pooling.
|
||||
- [YOLOv4](https://arxiv.org/abs/2004.10934) was released in 2020, introducing innovations like Mosaic data augmentation and a new loss function.
|
||||
- [YOLOv5](https://github.com/ultralytics/yolov5) further improved the model's performance and added new features such as hyperparameter optimization, integrated experiment tracking and automatic export to popular export formats.
|
||||
- [YOLOv6](https://github.com/meituan/YOLOv6) was open-sourced by [Meituan](https://about.meituan.com/) in 2022 and is in use in many of the company's autonomous delivery robots.
|
||||
- [YOLOv7](https://github.com/WongKinYiu/yolov7) added further tasks such as pose estimation on the COCO keypoints dataset.
|
||||
- [YOLOv8](https://github.com/ultralytics/ultralytics) is the latest version of YOLO by Ultralytics. As a cutting-edge, state-of-the-art (SOTA) model, YOLOv8 builds on the success of previous versions, introducing new features and improvements for enhanced performance, flexibility, and efficiency. YOLOv8 supports a full range of vision AI tasks, including [detection](tasks/detect.md), [segmentation](tasks/segment.md), [pose estimation](tasks/pose.md), [tracking](modes/track.md), and [classification](tasks/classify.md). This versatility allows users to leverage YOLOv8's capabilities across diverse applications and domains.
|
||||
|
||||
## YOLO Licenses: How is Ultralytics YOLO licensed?
|
||||
|
||||
Ultralytics offers two licensing options to accommodate diverse use cases:
|
||||
|
||||
- **AGPL-3.0 License**: This [OSI-approved](https://opensource.org/licenses/) open-source license is ideal for students and enthusiasts, promoting open collaboration and knowledge sharing. See the [LICENSE](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) file for more details.
|
||||
- **Enterprise License**: Designed for commercial use, this license permits seamless integration of Ultralytics software and AI models into commercial goods and services, bypassing the open-source requirements of AGPL-3.0. If your scenario involves embedding our solutions into a commercial offering, reach out through [Ultralytics Licensing](https://ultralytics.com/license).
|
||||
|
||||
Our licensing strategy is designed to ensure that any improvements to our open-source projects are returned to the community. We hold the principles of open source close to our hearts ❤️, and our mission is to guarantee that our contributions can be utilized and expanded upon in ways that are beneficial to all.
|
||||
71
docs/en/integrations/index.md
Normal file
71
docs/en/integrations/index.md
Normal file
|
|
@ -0,0 +1,71 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore Ultralytics integrations with tools for dataset management, model optimization, ML workflows automation, experiment tracking, version control, and more. Learn about our support for various model export formats for deployment.
|
||||
keywords: Ultralytics integrations, Roboflow, Neural Magic, ClearML, Comet ML, DVC, Ultralytics HUB, MLFlow, Neptune, Ray Tune, TensorBoard, W&B, model export formats, PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, CoreML, TF SavedModel, TF GraphDef, TF Lite, TF Edge TPU, TF.js, PaddlePaddle, NCNN
|
||||
---
|
||||
|
||||
# Ultralytics Integrations
|
||||
|
||||
Welcome to the Ultralytics Integrations page! This page provides an overview of our partnerships with various tools and platforms, designed to streamline your machine learning workflows, enhance dataset management, simplify model training, and facilitate efficient deployment.
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Datasets Integrations
|
||||
|
||||
- [Roboflow](roboflow.md): Facilitate seamless dataset management for Ultralytics models, offering robust annotation, preprocessing, and augmentation capabilities.
|
||||
|
||||
## Training Integrations
|
||||
|
||||
- [Comet ML](https://www.comet.ml/): Enhance your model development with Ultralytics by tracking, comparing, and optimizing your machine learning experiments.
|
||||
|
||||
- [ClearML](https://clear.ml/): Automate your Ultralytics ML workflows, monitor experiments, and foster team collaboration.
|
||||
|
||||
- [DVC](https://dvc.org/): Implement version control for your Ultralytics machine learning projects, synchronizing data, code, and models effectively.
|
||||
|
||||
- [Ultralytics HUB](https://hub.ultralytics.com): Access and contribute to a community of pre-trained Ultralytics models.
|
||||
|
||||
- [MLFlow](mlflow.md): Streamline the entire ML lifecycle of Ultralytics models, from experimentation and reproducibility to deployment.
|
||||
|
||||
- [Neptune](https://neptune.ai/): Maintain a comprehensive log of your ML experiments with Ultralytics in this metadata store designed for MLOps.
|
||||
|
||||
- [Ray Tune](ray-tune.md): Optimize the hyperparameters of your Ultralytics models at any scale.
|
||||
|
||||
- [TensorBoard](https://tensorboard.dev/): Visualize your Ultralytics ML workflows, monitor model metrics, and foster team collaboration.
|
||||
|
||||
- [Weights & Biases (W&B)](https://wandb.ai/site): Monitor experiments, visualize metrics, and foster reproducibility and collaboration on Ultralytics projects.
|
||||
|
||||
## Deployment Integrations
|
||||
|
||||
- [Neural Magic](https://neuralmagic.com/): Leverage Quantization Aware Training (QAT) and pruning techniques to optimize Ultralytics models for superior performance and leaner size.
|
||||
|
||||
### Export Formats
|
||||
|
||||
We also support a variety of model export formats for deployment in different environments. Here are the available formats:
|
||||
|
||||
| Format | `format` Argument | Model | Metadata | Arguments |
|
||||
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
|
||||
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
|
||||
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize` |
|
||||
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset` |
|
||||
| [OpenVINO](openvino.md) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half` |
|
||||
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
|
||||
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms` |
|
||||
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras` |
|
||||
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ | `imgsz` |
|
||||
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8` |
|
||||
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz` |
|
||||
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz` |
|
||||
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz` |
|
||||
| [NCNN](https://github.com/Tencent/ncnn) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half` |
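
As a minimal sketch of how the `format` argument relates to the exported artifact, the helper below maps a few `format` values to the file or directory names listed in the table. The names `EXPORT_SUFFIXES`, `artifact_name`, and `export_model` are illustrative and not part of the Ultralytics API; the wrapped call assumes the `ultralytics` package is installed and uses `model.export(format=...)` with the arguments from the table.

```python
# Suffixes copied from the "Model" column of the table above (subset only).
EXPORT_SUFFIXES = {
    "torchscript": ".torchscript",
    "onnx": ".onnx",
    "openvino": "_openvino_model/",
    "engine": ".engine",
    "tflite": ".tflite",
    "ncnn": "_ncnn_model/",
}


def artifact_name(weights: str, fmt: str) -> str:
    """Return the artifact name the table lists for a '.pt' weights file."""
    stem = weights[:-3] if weights.endswith(".pt") else weights
    return stem + EXPORT_SUFFIXES[fmt]


def export_model(weights: str, fmt: str, **kwargs):
    """Export `weights` to `fmt`; requires the `ultralytics` package."""
    from ultralytics import YOLO  # imported lazily so the helpers above need no dependencies

    return YOLO(weights).export(format=fmt, **kwargs)


print(artifact_name("yolov8n.pt", "onnx"))      # yolov8n.onnx
print(artifact_name("yolov8n.pt", "openvino"))  # yolov8n_openvino_model/
```

For example, `export_model('yolov8n.pt', 'onnx', imgsz=640, half=False)` would pass two of the arguments the table lists for ONNX.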
|
||||
|
||||
Explore the links to learn more about each integration and how to get the most out of them with Ultralytics.
|
||||
|
||||
## Contribute to Our Integrations
|
||||
|
||||
We're always excited to see how the community integrates Ultralytics YOLO with other technologies, tools, and platforms! If you have successfully integrated YOLO with a new system or have valuable insights to share, consider contributing to our Integrations Docs.
|
||||
|
||||
By writing a guide or tutorial, you can help expand our documentation and provide real-world examples that benefit the community. It's an excellent way to contribute to the growing ecosystem around Ultralytics YOLO.
|
||||
|
||||
To contribute, please check out our [Contributing Guide](https://docs.ultralytics.com/help/contributing) for instructions on how to submit a Pull Request (PR) 🛠️. We eagerly await your contributions!
|
||||
|
||||
Let's collaborate to make the Ultralytics YOLO ecosystem more expansive and feature-rich 🙏!
|
||||
112
docs/en/integrations/mlflow.md
Normal file
|
|
@ -0,0 +1,112 @@
|
|||
---
|
||||
comments: true
|
||||
description: Uncover the utility of MLflow for effective experiment logging in your Ultralytics YOLO projects.
|
||||
keywords: ultralytics docs, YOLO, MLflow, experiment logging, metrics tracking, parameter logging, artifact logging
|
||||
---
|
||||
|
||||
# MLflow Integration for Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://user-images.githubusercontent.com/26833433/274929143-05e37e72-c355-44be-a842-b358592340b7.png" alt="MLflow ecosystem">
|
||||
|
||||
## Introduction
|
||||
|
||||
Experiment logging is a crucial aspect of machine learning workflows that enables tracking of various metrics, parameters, and artifacts. It helps to enhance model reproducibility, debug issues, and improve model performance. [Ultralytics](https://ultralytics.com) YOLO, known for its real-time object detection capabilities, now offers integration with [MLflow](https://mlflow.org/), an open-source platform for complete machine learning lifecycle management.
|
||||
|
||||
This documentation page is a comprehensive guide to setting up and utilizing the MLflow logging capabilities for your Ultralytics YOLO project.
|
||||
|
||||
## What is MLflow?
|
||||
|
||||
[MLflow](https://mlflow.org/) is an open-source platform developed by [Databricks](https://www.databricks.com/) for managing the end-to-end machine learning lifecycle. It includes tools for tracking experiments, packaging code into reproducible runs, and sharing and deploying models. MLflow is designed to work with any machine learning library and programming language.
|
||||
|
||||
## Features
|
||||
|
||||
- **Metrics Logging**: Logs metrics at the end of each epoch and at the end of the training.
|
||||
- **Parameter Logging**: Logs all the parameters used in the training.
|
||||
- **Artifacts Logging**: Logs model artifacts, including weights and configuration files, at the end of the training.
|
||||
|
||||
## Setup and Prerequisites
|
||||
|
||||
Ensure MLflow is installed. If not, install it using pip:
|
||||
|
||||
```bash
|
||||
pip install mlflow
|
||||
```
|
||||
|
||||
Make sure that MLflow logging is enabled in Ultralytics settings. This is usually controlled by the `mlflow` settings key. See the [settings](https://docs.ultralytics.com/quickstart/#ultralytics-settings) page for more info.
|
||||
|
||||
!!! example "Update Ultralytics MLflow Settings"
|
||||
|
||||
=== "Python"
|
||||
Within the Python environment, call the `update` method on the `settings` object to change your settings:
|
||||
```python
|
||||
from ultralytics import settings
|
||||
|
||||
# Update a setting
|
||||
settings.update({'mlflow': True})
|
||||
|
||||
# Reset settings to default values
|
||||
settings.reset()
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
If you prefer using the command-line interface, the following commands will allow you to modify your settings:
|
||||
```bash
|
||||
# Update a setting
|
||||
yolo settings mlflow=True
|
||||
|
||||
# Reset settings to default values
|
||||
yolo settings reset
|
||||
```
|
||||
|
||||
## How to Use
|
||||
|
||||
### Commands
|
||||
|
||||
1. **Set a Project Name**: You can set the project name via an environment variable:
|
||||
```bash
|
||||
export MLFLOW_EXPERIMENT_NAME=<your_experiment_name>
|
||||
```
|
||||
Or use the `project=<project>` argument when training a YOLO model, e.g. `yolo train project=my_project`.
|
||||
|
||||
2. **Set a Run Name**: Similar to setting a project name, you can set the run name via an environment variable:
|
||||
```bash
|
||||
export MLFLOW_RUN=<your_run_name>
|
||||
```
|
||||
Or use the `name=<name>` argument when training a YOLO model, e.g. `yolo train project=my_project name=my_name`.
|
||||
|
||||
3. **Start Local MLflow Server**: To start tracking, use:
|
||||
```bash
|
||||
mlflow server --backend-store-uri runs/mlflow
|
||||
```
|
||||
This starts a local server at http://127.0.0.1:5000 by default and saves all MLflow logs to the `runs/mlflow` directory. To specify a different URI, set the `MLFLOW_TRACKING_URI` environment variable.
|
||||
|
||||
4. **Kill MLflow Server Instances**: To stop all running MLflow instances, run:
|
||||
```bash
|
||||
ps aux | grep 'mlflow' | grep -v 'grep' | awk '{print $2}' | xargs kill -9
|
||||
```
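The environment variables above can also be set from Python before training starts. This is a minimal sketch: `configure_mlflow` is a hypothetical helper, the example values are placeholders, and it assumes the integration reads these variables when training begins.

```python
import os


def configure_mlflow(experiment: str, run: str, tracking_uri: str = "") -> dict:
    """Set the MLflow environment variables described above for this process."""
    os.environ["MLFLOW_EXPERIMENT_NAME"] = experiment
    os.environ["MLFLOW_RUN"] = run
    if tracking_uri:
        os.environ["MLFLOW_TRACKING_URI"] = tracking_uri
    # Return every MLFLOW_* variable currently set, for inspection.
    return {k: v for k, v in os.environ.items() if k.startswith("MLFLOW_")}


# Placeholder names; http://127.0.0.1:5000 is the default local server address.
print(configure_mlflow("my_project", "my_name", "http://127.0.0.1:5000"))
```

Variables set this way only affect the current process and its children, so they need to be set in the same script or session that launches training.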
|
||||
|
||||
### Logging
|
||||
|
||||
The logging is taken care of by the `on_pretrain_routine_end`, `on_fit_epoch_end`, and `on_train_end` callback functions. These functions are automatically called during the respective stages of the training process, and they handle the logging of parameters, metrics, and artifacts.
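To make the order of these hooks concrete, here is a toy, dependency-free sketch. It is not the Ultralytics implementation: the `trainer` object, the `logged` list (standing in for MLflow), the `run` dispatcher, and the artifact names are all hypothetical. It also shows a custom metric being added to `trainer.metrics` before the `on_fit_epoch_end` logging hook runs.

```python
from types import SimpleNamespace

logged = []  # stands in for the MLflow backend


def on_pretrain_routine_end(trainer):
    logged.append(("params", dict(trainer.args)))  # parameters, once before training


def on_fit_epoch_end(trainer):
    logged.append(("metrics", trainer.epoch, dict(trainer.metrics)))  # metrics, every epoch


def on_train_end(trainer):
    logged.append(("artifacts", list(trainer.artifacts)))  # artifacts, once at the end


def add_custom_metric(trainer):
    # Runs before the logging hook, so the extra value is logged with the rest.
    trainer.metrics["custom/score"] = 0.5 * (trainer.epoch + 1)


callbacks = {
    "on_pretrain_routine_end": [on_pretrain_routine_end],
    "on_fit_epoch_end": [add_custom_metric, on_fit_epoch_end],
    "on_train_end": [on_train_end],
}


def run(trainer, event):
    """Call every function registered for `event`, in order."""
    for fn in callbacks[event]:
        fn(trainer)


trainer = SimpleNamespace(
    args={"epochs": 2, "lr0": 0.01}, metrics={}, artifacts=["best.pt", "args.yaml"], epoch=0
)
run(trainer, "on_pretrain_routine_end")
for epoch in range(trainer.args["epochs"]):
    trainer.epoch = epoch
    trainer.metrics = {"loss": 1.0 / (epoch + 1)}  # fake training result
    run(trainer, "on_fit_epoch_end")
run(trainer, "on_train_end")

for entry in logged:
    print(entry)
```

The sketch produces one parameter record, one metrics record per epoch, and one artifact record, which mirrors the three stages named above.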
|
||||
|
||||
## Examples
|
||||
|
||||
1. **Logging Custom Metrics**: You can add custom metrics to be logged by modifying the `trainer.metrics` dictionary before `on_fit_epoch_end` is called.
|
||||
|
||||
2. **View Experiment**: To view your logs, navigate to your MLflow server (usually http://127.0.0.1:5000) and select your experiment and run.
|
||||
<img width="1024" src="https://user-images.githubusercontent.com/26833433/274933329-3127aa8c-4491-48ea-81df-ed09a5837f2a.png" alt="YOLO MLflow Experiment">
|
||||
|
||||
3. **View Run**: Runs are individual models inside an experiment. Click on a Run and see the Run details, including uploaded artifacts and model weights.
|
||||
<img width="1024" src="https://user-images.githubusercontent.com/26833433/274933337-ac61371c-2867-4099-a733-147a2583b3de.png" alt="YOLO MLflow Run">
|
||||
|
||||
## Disabling MLflow
|
||||
|
||||
To turn off MLflow logging:
|
||||
|
||||
```bash
|
||||
yolo settings mlflow=False
|
||||
```
|
||||
|
||||
## Conclusion
|
||||
|
||||
MLflow logging integration with Ultralytics YOLO offers a streamlined way to keep track of your machine learning experiments. It empowers you to monitor performance metrics and manage artifacts effectively, thus aiding in robust model development and deployment. For further details please visit the MLflow [official documentation](https://mlflow.org/docs/latest/index.html).
|
||||
284
docs/en/integrations/openvino.md
Normal file
|
|
@ -0,0 +1,284 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover the power of deploying your Ultralytics YOLOv8 model using OpenVINO format for up to 10x speedup vs PyTorch.
|
||||
keywords: ultralytics docs, YOLOv8, export YOLOv8, YOLOv8 model deployment, exporting YOLOv8, OpenVINO, OpenVINO format
|
||||
---
|
||||
|
||||
# Intel OpenVINO Export
|
||||
|
||||
<img width="1024" src="https://user-images.githubusercontent.com/26833433/252345644-0cf84257-4b34-404c-b7ce-eb73dfbcaff1.png" alt="OpenVINO Ecosystem">
|
||||
|
||||
In this guide, we cover exporting YOLOv8 models to the [OpenVINO](https://docs.openvino.ai/) format, which can provide up to 3x [CPU](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_supported_plugins_CPU.html) speedup as well as accelerating on other Intel hardware ([iGPU](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_supported_plugins_GPU.html), [dGPU](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_supported_plugins_GPU.html), [VPU](https://docs.openvino.ai/2022.3/openvino_docs_OV_UG_supported_plugins_VPU.html), etc.).
|
||||
|
||||
OpenVINO, short for Open Visual Inference & Neural Network Optimization toolkit, is a comprehensive toolkit for optimizing and deploying AI inference models. Even though the name contains Visual, OpenVINO also supports various additional tasks including language, audio, time series, etc.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/kONm9nE5_Fk?si=kzquuBrxjSbntHoU"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> How To Export and Optimize an Ultralytics YOLOv8 Model for Inference with OpenVINO.
|
||||
</p>
|
||||
|
||||
## Usage Examples
|
||||
|
||||
Export a YOLOv8n model to OpenVINO format and run inference with the exported model.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a YOLOv8n PyTorch model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Export the model
|
||||
model.export(format='openvino') # creates 'yolov8n_openvino_model/'
|
||||
|
||||
# Load the exported OpenVINO model
|
||||
ov_model = YOLO('yolov8n_openvino_model/')
|
||||
|
||||
# Run inference
|
||||
results = ov_model('https://ultralytics.com/images/bus.jpg')
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Export a YOLOv8n PyTorch model to OpenVINO format
|
||||
yolo export model=yolov8n.pt format=openvino # creates 'yolov8n_openvino_model/'
|
||||
|
||||
# Run inference with the exported model
|
||||
yolo predict model=yolov8n_openvino_model source='https://ultralytics.com/images/bus.jpg'
|
||||
```
|
||||
|
||||
## Arguments
|
||||
|
||||
| Key | Value | Description |
|
||||
|----------|--------------|------------------------------------------------------|
|
||||
| `format` | `'openvino'` | format to export to |
|
||||
| `imgsz` | `640` | image size as scalar or (h, w) list, i.e. (640, 480) |
|
||||
| `half` | `False` | FP16 quantization |
|
||||
|
||||
## Benefits of OpenVINO
|
||||
|
||||
1. **Performance**: OpenVINO delivers high-performance inference by utilizing the power of Intel CPUs, integrated and discrete GPUs, and FPGAs.
|
||||
2. **Support for Heterogeneous Execution**: OpenVINO provides an API to write once and deploy on any supported Intel hardware (CPU, GPU, FPGA, VPU, etc.).
|
||||
3. **Model Optimizer**: OpenVINO provides a Model Optimizer that imports, converts, and optimizes models from popular deep learning frameworks such as PyTorch, TensorFlow, TensorFlow Lite, Keras, ONNX, PaddlePaddle, and Caffe.
|
||||
4. **Ease of Use**: The toolkit comes with more than [80 tutorial notebooks](https://github.com/openvinotoolkit/openvino_notebooks) (including [YOLOv8 optimization](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/230-yolov8-optimization)) teaching different aspects of the toolkit.
|
||||
|
||||
## OpenVINO Export Structure
|
||||
|
||||
When you export a model to OpenVINO format, it results in a directory containing the following:
|
||||
|
||||
1. **XML file**: Describes the network topology.
|
||||
2. **BIN file**: Contains the weights and biases binary data.
|
||||
3. **Mapping file**: Holds mapping of original model output tensors to OpenVINO tensor names.
|
||||
|
||||
You can use these files to run inference with the OpenVINO Runtime.
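Before deploying, it can help to confirm that the exported directory contains the files listed above. The following is a small sketch using only the standard library; `find_openvino_files` is a hypothetical helper, the file names in the demo are placeholders, and it assumes the BIN file shares the XML file's base name.

```python
import tempfile
from pathlib import Path


def find_openvino_files(export_dir):
    """Return (xml_path, bin_path) for an export directory, or raise if either is missing."""
    export_dir = Path(export_dir)
    xml_files = sorted(export_dir.glob("*.xml"))
    if not xml_files:
        raise FileNotFoundError(f"no .xml file in {export_dir}")
    xml_path = xml_files[0]
    bin_path = xml_path.with_suffix(".bin")  # assumed to sit next to the XML file
    if not bin_path.exists():
        raise FileNotFoundError(f"missing weights file {bin_path.name}")
    return xml_path, bin_path


# Demonstrate on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "yolov8n_openvino_model"
    demo.mkdir()
    (demo / "yolov8n.xml").touch()
    (demo / "yolov8n.bin").touch()
    xml_path, bin_path = find_openvino_files(demo)
    print(xml_path.name, bin_path.name)  # yolov8n.xml yolov8n.bin
```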
|
||||
|
||||
## Using OpenVINO Export in Deployment
|
||||
|
||||
Once you have the OpenVINO files, you can use the OpenVINO Runtime to run the model. The Runtime provides a unified API for inference across all supported Intel hardware. It also provides advanced capabilities like load balancing across Intel hardware and asynchronous execution. For more information on running inference, refer to the [Inference with OpenVINO Runtime Guide](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_OV_Runtime_User_Guide.html).
|
||||
|
||||
Remember, you'll need the XML and BIN files as well as any application-specific settings like input size, scale factor for normalization, etc., to correctly set up and use the model with the Runtime.
|
||||
|
||||
In your deployment application, you would typically do the following steps:
|
||||
|
||||
1. Initialize OpenVINO by creating `core = Core()`.
|
||||
2. Load the model using the `core.read_model()` method.
|
||||
3. Compile the model using the `core.compile_model()` function.
|
||||
4. Prepare the input (image, text, audio, etc.).
|
||||
5. Run inference using `compiled_model(input_data)`.
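The five steps above can be sketched as follows. This is an outline under stated assumptions rather than a complete deployment: `input_shape` and `run_openvino` are hypothetical helper names, the input is an all-zeros placeholder instead of a preprocessed image, the model is assumed to take a single 3-channel NCHW tensor, and the `openvino` and `numpy` packages must be installed for `run_openvino` to work.

```python
def input_shape(imgsz):
    """Turn `imgsz` (scalar or (h, w)) into an NCHW shape for one 3-channel image."""
    h, w = (imgsz, imgsz) if isinstance(imgsz, int) else imgsz
    return (1, 3, int(h), int(w))


def run_openvino(xml_path, imgsz=640, device="CPU"):
    """Follow the five steps above with a dummy input."""
    import numpy as np  # imported lazily so `input_shape` has no dependencies
    from openvino.runtime import Core

    core = Core()                                        # 1. initialize OpenVINO
    model = core.read_model(xml_path)                    # 2. load the model from its XML file
    compiled_model = core.compile_model(model, device)   # 3. compile for the target device
    input_data = np.zeros(input_shape(imgsz), dtype=np.float32)  # 4. placeholder input
    results = compiled_model([input_data])               # 5. run inference
    return results[compiled_model.output(0)]             # raw tensor from the first output


print(input_shape(640))         # (1, 3, 640, 640)
print(input_shape((640, 480)))  # (1, 3, 640, 480)
```

The returned array is the raw model output; decoding it into boxes and classes is application-specific and not covered by this sketch.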
|
||||
|
||||
For more detailed steps and code snippets, refer to the [OpenVINO documentation](https://docs.openvino.ai/) or [API tutorial](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/002-openvino-api/002-openvino-api.ipynb).
|
||||
|
||||
## OpenVINO YOLOv8 Benchmarks
|
||||
|
||||
YOLOv8 benchmarks below were run by the Ultralytics team on four model formats, measuring speed and accuracy: PyTorch, TorchScript, ONNX, and OpenVINO. Benchmarks were run on Intel Flex and Arc GPUs, and on Intel Xeon and Core CPUs, at FP32 precision (with the `half=False` argument).
|
||||
|
||||
!!! note
|
||||
|
||||
The benchmarking results below are for reference and might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run.
|
||||
|
||||
All benchmarks run with `openvino` Python package version [2023.0.1](https://pypi.org/project/openvino/2023.0.1/).
|
||||
|
||||
### Intel Flex GPU
|
||||
|
||||
The Intel® Data Center GPU Flex Series is a versatile and robust solution designed for the intelligent visual cloud. This GPU supports a wide array of workloads including media streaming, cloud gaming, AI visual inference, and virtual desktop infrastructure workloads. It stands out for its open architecture and built-in support for AV1 encoding, providing a standards-based software stack for high-performance, cross-architecture applications. The Flex Series GPU is optimized for density and quality, offering high reliability, availability, and scalability.
|
||||
|
||||
Benchmarks below run on Intel® Data Center GPU Flex 170 at FP32 precision.
|
||||
|
||||
<div align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/253741543-62659bf8-1765-4d0b-b71c-8a4f9885506a.jpg">
|
||||
</div>
|
||||
|
||||
| Model | Format | Status | Size (MB) | mAP50-95(B) | Inference time (ms/im) |
|
||||
|---------|-------------|--------|-----------|-------------|------------------------|
|
||||
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.3709 | 21.79 |
|
||||
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.3704 | 23.24 |
|
||||
| YOLOv8n | ONNX | ✅ | 12.2 | 0.3704 | 37.22 |
|
||||
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.3703 | 3.29 |
|
||||
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.4471 | 31.89 |
|
||||
| YOLOv8s | TorchScript | ✅ | 42.9 | 0.4472 | 32.71 |
|
||||
| YOLOv8s | ONNX | ✅ | 42.8 | 0.4472 | 43.42 |
|
||||
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.4470 | 3.92 |
|
||||
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.5013 | 50.75 |
|
||||
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.4999 | 47.90 |
|
||||
| YOLOv8m | ONNX | ✅ | 99.0 | 0.4999 | 63.16 |
|
||||
| YOLOv8m | OpenVINO | ✅ | 49.8 | 0.4997 | 7.11 |
|
||||
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.5293 | 77.45 |
|
||||
| YOLOv8l | TorchScript | ✅ | 167.2 | 0.5268 | 85.71 |
|
||||
| YOLOv8l | ONNX | ✅ | 166.8 | 0.5268 | 88.94 |
|
||||
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.5264 | 9.37 |
|
||||
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.5404 | 100.09 |
|
||||
| YOLOv8x | TorchScript | ✅ | 260.7 | 0.5371 | 114.64 |
|
||||
| YOLOv8x | ONNX | ✅ | 260.4 | 0.5371 | 110.32 |
|
||||
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.5367 | 15.02 |
|
||||
|
||||
This table represents the benchmark results for five different models (YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x) across four different formats (PyTorch, TorchScript, ONNX, OpenVINO), giving us the status, size, mAP50-95(B) metric, and inference time for each combination.
|
||||
|
||||
### Intel Arc GPU
|
||||
|
||||
Intel® Arc™ represents Intel's foray into the dedicated GPU market. The Arc™ series, designed to compete with leading GPU manufacturers like AMD and Nvidia, caters to both the laptop and desktop markets. The series includes mobile versions for compact devices like laptops, and larger, more powerful versions for desktop computers.
|
||||
|
||||
The Arc™ series is divided into three categories: Arc™ 3, Arc™ 5, and Arc™ 7, with each number indicating the performance level. Each category includes several models, and the 'M' in the GPU model name signifies a mobile, integrated variant.
|
||||
|
||||
Early reviews have praised the Arc™ series, particularly the integrated A770M GPU, for its impressive graphics performance. The availability of the Arc™ series varies by region, and additional models are expected to be released soon. Intel® Arc™ GPUs offer high-performance solutions for a range of computing needs, from gaming to content creation.
|
||||
|
||||
Benchmarks below run on Intel® Arc™ A770 GPU at FP32 precision.
|
||||
|
||||
<div align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/253741545-8530388f-8fd1-44f7-a4ae-f875d59dc282.jpg">
|
||||
</div>
|
||||
|
||||
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|
||||
|---------|-------------|--------|-----------|---------------------|------------------------|
|
||||
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.3709 | 88.79 |
|
||||
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.3704 | 102.66 |
|
||||
| YOLOv8n | ONNX | ✅ | 12.2 | 0.3704 | 57.98 |
|
||||
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.3703 | 8.52 |
|
||||
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.4471 | 189.83 |
|
||||
| YOLOv8s | TorchScript | ✅ | 42.9 | 0.4472 | 227.58 |
|
||||
| YOLOv8s | ONNX | ✅ | 42.7 | 0.4472 | 142.03 |
|
||||
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.4469 | 9.19 |
|
||||
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.5013 | 411.64 |
|
||||
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.4999 | 517.12 |
|
||||
| YOLOv8m | ONNX | ✅ | 98.9 | 0.4999 | 298.68 |
|
||||
| YOLOv8m | OpenVINO | ✅ | 99.1 | 0.4996 | 12.55 |
|
||||
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.5293 | 725.73 |
|
||||
| YOLOv8l | TorchScript | ✅ | 167.1 | 0.5268 | 892.83 |
|
||||
| YOLOv8l | ONNX | ✅ | 166.8 | 0.5268 | 576.11 |
|
||||
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.5262 | 17.62 |
|
||||
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.5404 | 988.92 |
|
||||
| YOLOv8x | TorchScript | ✅ | 260.7 | 0.5371 | 1186.42 |
|
||||
| YOLOv8x | ONNX | ✅ | 260.4 | 0.5371 | 768.90 |
|
||||
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.5367 | 19 |
|
||||
|
||||
### Intel Xeon CPU
|
||||
|
||||
The Intel® Xeon® CPU is a high-performance, server-grade processor designed for complex and demanding workloads. From high-end cloud computing and virtualization to artificial intelligence and machine learning applications, Xeon® CPUs provide the power, reliability, and flexibility required for today's data centers.
|
||||
|
||||
Notably, Xeon® CPUs deliver high compute density and scalability, making them ideal for both small businesses and large enterprises. By choosing Intel® Xeon® CPUs, organizations can confidently handle their most demanding computing tasks and foster innovation while maintaining cost-effectiveness and operational efficiency.
|
||||
|
||||
Benchmarks below run on 4th Gen Intel® Xeon® Scalable CPU at FP32 precision.
|
||||
|
||||
<div align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/253741546-dcd8e52a-fc38-424f-b87e-c8365b6f28dc.jpg">
|
||||
</div>
|
||||
|
||||
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|
||||
|---------|-------------|--------|-----------|---------------------|------------------------|
|
||||
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.3709 | 24.36 |
|
||||
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.3704 | 23.93 |
|
||||
| YOLOv8n | ONNX | ✅ | 12.2 | 0.3704 | 39.86 |
|
||||
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.3704 | 11.34 |
|
||||
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.4471 | 33.77 |
|
||||
| YOLOv8s | TorchScript | ✅ | 42.9 | 0.4472 | 34.84 |
|
||||
| YOLOv8s | ONNX | ✅ | 42.8 | 0.4472 | 43.23 |
|
||||
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.4471 | 13.86 |
|
||||
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.5013 | 53.91 |
|
||||
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.4999 | 53.51 |
|
||||
| YOLOv8m | ONNX | ✅ | 99.0 | 0.4999 | 64.16 |
|
||||
| YOLOv8m | OpenVINO | ✅ | 99.1 | 0.4996 | 28.79 |
|
||||
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.5293 | 75.78 |
|
||||
| YOLOv8l | TorchScript | ✅ | 167.2 | 0.5268 | 79.13 |
|
||||
| YOLOv8l | ONNX | ✅ | 166.8 | 0.5268 | 88.45 |
|
||||
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.5263 | 56.23 |
|
||||
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.5404 | 96.60 |
|
||||
| YOLOv8x | TorchScript | ✅ | 260.7 | 0.5371 | 114.28 |
|
||||
| YOLOv8x | ONNX | ✅ | 260.4 | 0.5371 | 111.02 |
|
||||
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.5371 | 83.28 |
|
||||
|
||||
### Intel Core CPU
|
||||
|
||||
The Intel® Core® series is a range of high-performance processors by Intel. The lineup includes Core i3 (entry-level), Core i5 (mid-range), Core i7 (high-end), and Core i9 (extreme performance). Each series caters to different computing needs and budgets, from everyday tasks to demanding professional workloads. With each new generation, improvements are made to performance, energy efficiency, and features.
|
||||
|
||||
Benchmarks below run on 13th Gen Intel® Core® i7-13700H CPU at FP32 precision.
|
||||
|
||||
<div align="center">
|
||||
<img width="800" src="https://user-images.githubusercontent.com/26833433/254559985-727bfa43-93fa-4fec-a417-800f869f3f9e.jpg">
|
||||
</div>
|
||||
|
||||
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|
||||
|---------|-------------|--------|-----------|---------------------|------------------------|
|
||||
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.4478 | 104.61 |
|
||||
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.4525 | 112.39 |
|
||||
| YOLOv8n | ONNX | ✅ | 12.2 | 0.4525 | 28.02 |
|
||||
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.4504 | 23.53 |
|
||||
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.5885 | 194.83 |
|
||||
| YOLOv8s | TorchScript | ✅ | 43.0 | 0.5962 | 202.01 |
|
||||
| YOLOv8s | ONNX | ✅ | 42.8 | 0.5962 | 65.74 |
|
||||
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.5966 | 38.66 |
|
||||
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.6101 | 355.23 |
|
||||
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.6120 | 424.78 |
|
||||
| YOLOv8m | ONNX | ✅ | 99.0 | 0.6120 | 173.39 |
|
||||
| YOLOv8m | OpenVINO | ✅ | 99.1 | 0.6091 | 69.80 |
|
||||
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.6591 | 593.00 |
|
||||
| YOLOv8l | TorchScript | ✅ | 167.2 | 0.6580 | 697.54 |
|
||||
| YOLOv8l | ONNX | ✅ | 166.8 | 0.6580 | 342.15 |
|
||||
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.0708 | 117.69 |
|
||||
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.6651 | 804.65 |
|
||||
| YOLOv8x | TorchScript | ✅ | 260.8 | 0.6650 | 921.46 |
|
||||
| YOLOv8x | ONNX | ✅ | 260.4 | 0.6650 | 526.66 |
|
||||
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.6619 | 158.73 |
|
||||
|
||||
## Reproduce Our Results
|
||||
|
||||
To reproduce the Ultralytics benchmarks above on all export [formats](../modes/export.md) run this code:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a YOLOv8n PyTorch model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Benchmark YOLOv8n speed and accuracy on the COCO128 dataset for all export formats
|
||||
results = model.benchmark(data='coco128.yaml')
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Benchmark YOLOv8n speed and accuracy on the COCO128 dataset for all export formats
|
||||
yolo benchmark model=yolov8n.pt data=coco128.yaml
|
||||
```
|
||||
|
||||
Note that benchmarking results might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run. For the most reliable results, use a dataset with a large number of images, e.g. `data='coco128.yaml'` (128 val images) or `data='coco.yaml'` (5000 val images).
|
||||
|
||||
## Conclusion
|
||||
|
||||
The benchmarking results clearly demonstrate the benefits of exporting the YOLOv8 model to the OpenVINO format. Across different models and hardware platforms, the OpenVINO format consistently outperforms other formats in terms of inference speed while maintaining comparable accuracy.
|
||||
|
||||
For the Intel® Data Center GPU Flex Series, the OpenVINO format delivered inference roughly 6.6 to 8.3 times faster than the original PyTorch format, depending on the model. On the Xeon CPU, the OpenVINO format was about twice as fast as PyTorch for the smaller models (YOLOv8n and YOLOv8s), with the advantage narrowing to roughly 1.2 times for YOLOv8x. The accuracy of the models remained nearly identical across the different formats.
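The speedup ratios can be recomputed directly from the PyTorch and OpenVINO inference times in the Flex GPU and Xeon CPU tables above. The short script below is only a worked calculation; the dictionary and function names are illustrative.

```python
# (PyTorch ms/im, OpenVINO ms/im) copied from the benchmark tables above.
flex_gpu = {
    "YOLOv8n": (21.79, 3.29),
    "YOLOv8s": (31.89, 3.92),
    "YOLOv8m": (50.75, 7.11),
    "YOLOv8l": (77.45, 9.37),
    "YOLOv8x": (100.09, 15.02),
}
xeon_cpu = {
    "YOLOv8n": (24.36, 11.34),
    "YOLOv8s": (33.77, 13.86),
    "YOLOv8m": (53.91, 28.79),
    "YOLOv8l": (75.78, 56.23),
    "YOLOv8x": (96.60, 83.28),
}


def speedups(table):
    """Return PyTorch time divided by OpenVINO time for each model."""
    return {model: round(pt / ov, 2) for model, (pt, ov) in table.items()}


for name, table in (("Flex GPU", flex_gpu), ("Xeon CPU", xeon_cpu)):
    ratios = speedups(table)
    print(name, ratios, "min:", min(ratios.values()), "max:", max(ratios.values()))
```

A ratio above 1 means OpenVINO was faster than PyTorch for that model on that device.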
|
||||
|
||||
The benchmarks underline the effectiveness of OpenVINO as a tool for deploying deep learning models. By converting models to the OpenVINO format, developers can achieve significant performance improvements, making it easier to deploy these models in real-world applications.
|
||||
|
||||
For more detailed information and instructions on using OpenVINO, refer to the [official OpenVINO documentation](https://docs.openvinotoolkit.org/latest/index.html).
|
||||
179
docs/en/integrations/ray-tune.md
Normal file
|
|
@ -0,0 +1,179 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover how to streamline hyperparameter tuning for YOLOv8 models with Ray Tune. Learn to accelerate tuning, integrate with Weights & Biases, and analyze results.
|
||||
keywords: Ultralytics, YOLOv8, Ray Tune, hyperparameter tuning, machine learning optimization, Weights & Biases integration, result analysis
|
||||
---
|
||||
|
||||
# Efficient Hyperparameter Tuning with Ray Tune and YOLOv8
|
||||
|
||||
Hyperparameter tuning is vital in achieving peak model performance by discovering the optimal set of hyperparameters. This involves running trials with different hyperparameters and evaluating each trial’s performance.
|
||||
|
||||
## Accelerate Tuning with Ultralytics YOLOv8 and Ray Tune
|
||||
|
||||
[Ultralytics YOLOv8](https://ultralytics.com) incorporates Ray Tune for hyperparameter tuning, streamlining the optimization of YOLOv8 model hyperparameters. With Ray Tune, you can utilize advanced search strategies, parallelism, and early stopping to expedite the tuning process.
|
||||
|
||||
### Ray Tune
|
||||
|
||||
<p align="center">
|
||||
<img width="640" src="https://docs.ray.io/en/latest/_images/tune_overview.png" alt="Ray Tune Overview">
|
||||
</p>
|
||||
|
||||
[Ray Tune](https://docs.ray.io/en/latest/tune/index.html) is a hyperparameter tuning library designed for efficiency and flexibility. It supports various search strategies, parallelism, and early stopping strategies, and seamlessly integrates with popular machine learning frameworks, including Ultralytics YOLOv8.
|
||||
|
||||
### Integration with Weights & Biases
|
||||
|
||||
YOLOv8 also allows optional integration with [Weights & Biases](https://wandb.ai/site) for monitoring the tuning process.
|
||||
|
||||
## Installation
|
||||
|
||||
To install the required packages, run:
|
||||
|
||||
!!! tip "Installation"
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Install and update Ultralytics and Ray Tune packages
|
||||
pip install -U ultralytics "ray[tune]"
|
||||
|
||||
# Optionally install W&B for logging
|
||||
pip install wandb
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
!!! example "Usage"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Start tuning hyperparameters for YOLOv8n training on the COCO8 dataset
|
||||
result_grid = model.tune(data='coco8.yaml', use_ray=True)
|
||||
```
|
||||
|
||||
## `tune()` Method Parameters
|
||||
|
||||
The `tune()` method in YOLOv8 provides an easy-to-use interface for hyperparameter tuning with Ray Tune. It accepts several arguments that allow you to customize the tuning process. Below is a detailed explanation of each parameter:
|
||||
|
||||
| Parameter | Type | Description | Default Value |
|
||||
|-----------------|------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|
|
||||
| `data` | `str` | The dataset configuration file (in YAML format) to run the tuner on. This file should specify the training and validation data paths, as well as other dataset-specific settings. | |
|
||||
| `space` | `dict, optional` | A dictionary defining the hyperparameter search space for Ray Tune. Each key corresponds to a hyperparameter name, and the value specifies the range of values to explore during tuning. If not provided, YOLOv8 uses a default search space with various hyperparameters. | |
|
||||
| `grace_period` | `int, optional` | The grace period in epochs for the [ASHA scheduler](https://docs.ray.io/en/latest/tune/api/schedulers.html) in Ray Tune. The scheduler will not terminate any trial before this number of epochs, allowing the model to have some minimum training before making a decision on early stopping. | 10 |
|
||||
| `gpu_per_trial` | `int, optional` | The number of GPUs to allocate per trial during tuning. This helps manage GPU usage, particularly in multi-GPU environments. If not provided, the tuner will use all available GPUs. | None |
|
||||
| `iterations` | `int, optional` | The maximum number of trials to run during tuning. This parameter helps control the total number of hyperparameter combinations tested, ensuring the tuning process does not run indefinitely. | 10 |
|
||||
| `**train_args` | `dict, optional` | Additional arguments to pass to the `train()` method during tuning. These arguments can include settings like the number of training epochs, batch size, and other training-specific configurations. | {} |
|
||||
|
||||
By customizing these parameters, you can fine-tune the hyperparameter optimization process to suit your specific needs and available computational resources.
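As a sketch of how the parameters in the table fit together in one call, the example below collects them in a dictionary and forwards them to `model.tune()`. The values are illustrative, `TUNE_ARGS` and `run_tuning` are hypothetical names, `epochs` is assumed to be forwarded to `train()` through `**train_args`, and calling `run_tuning` requires the `ultralytics` and `ray[tune]` packages.

```python
# Keyword arguments named in the table above; the values are examples only.
TUNE_ARGS = {
    "data": "coco8.yaml",   # dataset configuration file
    "use_ray": True,        # run the tuner through Ray Tune
    "grace_period": 10,     # minimum epochs before ASHA may stop a trial
    "gpu_per_trial": None,  # None leaves GPU allocation at its default
    "iterations": 10,       # maximum number of trials
    "epochs": 50,           # extra training argument passed via **train_args
}


def run_tuning(weights="yolov8n.pt", **overrides):
    """Start tuning with TUNE_ARGS, optionally overriding individual values."""
    from ultralytics import YOLO  # imported lazily; only needed when tuning actually runs

    model = YOLO(weights)
    return model.tune(**{**TUNE_ARGS, **overrides})


print(sorted(TUNE_ARGS))
```

For instance, `run_tuning(iterations=20, gpu_per_trial=1)` would keep the other values and change only those two.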
|
||||
|
||||
## Default Search Space Description
|
||||
|
||||
The following table lists the default search space parameters for hyperparameter tuning in YOLOv8 with Ray Tune. Each parameter has a specific value range defined by `tune.uniform()`.
|
||||
|
||||
| Parameter | Value Range | Description |
|
||||
|-------------------|----------------------------|------------------------------------------|
|
||||
| `lr0` | `tune.uniform(1e-5, 1e-1)` | Initial learning rate |
|
||||
| `lrf` | `tune.uniform(0.01, 1.0)` | Final learning rate factor |
|
||||
| `momentum` | `tune.uniform(0.6, 0.98)` | Momentum |
|
||||
| `weight_decay` | `tune.uniform(0.0, 0.001)` | Weight decay |
|
||||
| `warmup_epochs` | `tune.uniform(0.0, 5.0)` | Warmup epochs |
|
||||
| `warmup_momentum` | `tune.uniform(0.0, 0.95)` | Warmup momentum |
|
||||
| `box` | `tune.uniform(0.02, 0.2)` | Box loss weight |
|
||||
| `cls` | `tune.uniform(0.2, 4.0)` | Class loss weight |
|
||||
| `hsv_h` | `tune.uniform(0.0, 0.1)` | Hue augmentation range |
|
||||
| `hsv_s` | `tune.uniform(0.0, 0.9)` | Saturation augmentation range |
|
||||
| `hsv_v` | `tune.uniform(0.0, 0.9)` | Value (brightness) augmentation range |
|
||||
| `degrees` | `tune.uniform(0.0, 45.0)` | Rotation augmentation range (degrees) |
|
||||
| `translate` | `tune.uniform(0.0, 0.9)` | Translation augmentation range |
|
||||
| `scale` | `tune.uniform(0.0, 0.9)` | Scaling augmentation range |
|
||||
| `shear` | `tune.uniform(0.0, 10.0)` | Shear augmentation range (degrees) |
|
||||
| `perspective` | `tune.uniform(0.0, 0.001)` | Perspective augmentation range |
|
||||
| `flipud` | `tune.uniform(0.0, 1.0)` | Vertical flip augmentation probability |
|
||||
| `fliplr` | `tune.uniform(0.0, 1.0)` | Horizontal flip augmentation probability |
|
||||
| `mosaic` | `tune.uniform(0.0, 1.0)` | Mosaic augmentation probability |
|
||||
| `mixup` | `tune.uniform(0.0, 1.0)` | Mixup augmentation probability |
|
||||
| `copy_paste` | `tune.uniform(0.0, 1.0)` | Copy-paste augmentation probability |
|
||||
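To make the meaning of a uniform range concrete, the sketch below draws one candidate configuration from a few of the ranges in the table. It uses Python's standard `random.uniform` as a stand-in for Ray Tune's sampling, so it is an illustration of the idea rather than what Ray Tune does internally; `SEARCH_SPACE` and `sample_config` are hypothetical names.

```python
import random

# A few ranges copied from the table above: name -> (low, high)
SEARCH_SPACE = {
    "lr0": (1e-5, 1e-1),
    "momentum": (0.6, 0.98),
    "weight_decay": (0.0, 0.001),
    "mosaic": (0.0, 1.0),
}


def sample_config(space, rng):
    """Draw one value per hyperparameter, uniformly within its range."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
config = sample_config(SEARCH_SPACE, rng)
for name, value in config.items():
    print(f"{name}: {value:.6f}")
```

Each trial in a tuning run corresponds to one such configuration; the scheduler then decides which trials to keep training.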
|
||||
## Custom Search Space Example
|
||||
|
||||
In this example, we demonstrate how to use a custom search space for hyperparameter tuning with Ray Tune and YOLOv8. By providing a custom search space, you can focus the tuning process on specific hyperparameters of interest.
|
||||
|
||||
!!! example "Usage"
|
||||
|
||||
```python
|
||||
from ray import tune
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Define a YOLO model
|
||||
model = YOLO("yolov8n.pt")
|
||||
|
||||
# Run Ray Tune on the model
|
||||
result_grid = model.tune(data="coco128.yaml",
|
||||
space={"lr0": tune.uniform(1e-5, 1e-1)},
|
||||
epochs=50,
|
||||
use_ray=True)
|
||||
```
|
||||
|
||||
In the code snippet above, we create a YOLO model with the "yolov8n.pt" pretrained weights. Then, we call the `tune()` method, specifying the dataset configuration with "coco128.yaml". We provide a custom search space for the initial learning rate `lr0` using a dictionary with the key "lr0" and the value `tune.uniform(1e-5, 1e-1)`. Finally, we pass additional training arguments, such as the number of epochs, directly to the `tune()` method as `epochs=50`.
|
||||
|
||||
## Processing Ray Tune Results
|
||||
|
||||
After running a hyperparameter tuning experiment with Ray Tune, you might want to perform various analyses on the obtained results. This guide will take you through common workflows for processing and analyzing these results.
|
||||
|
||||
### Loading Tune Experiment Results from a Directory
|
||||
|
||||
After running the tuning experiment with `tuner.fit()`, you can load the results from a directory. This is useful, especially if you're performing the analysis after the initial training script has exited.
|
||||
|
||||
```python
|
||||
experiment_path = f"{storage_path}/{exp_name}"
|
||||
print(f"Loading results from {experiment_path}...")
|
||||
|
||||
restored_tuner = tune.Tuner.restore(experiment_path, trainable=train_mnist)
|
||||
result_grid = restored_tuner.get_results()
|
||||
```
|
||||
|
||||
### Basic Experiment-Level Analysis
|
||||
|
||||
Get an overview of how trials performed. You can quickly check if there were any errors during the trials.
|
||||
|
||||
```python
|
||||
if result_grid.errors:
|
||||
print("One or more trials failed!")
|
||||
else:
|
||||
print("No errors!")
|
||||
```
|
||||
|
||||
### Basic Trial-Level Analysis
|
||||
|
||||
Access individual trial hyperparameter configurations and the last reported metrics.
|
||||
|
||||
```python
|
||||
for i, result in enumerate(result_grid):
|
||||
print(f"Trial #{i}: Configuration: {result.config}, Last Reported Metrics: {result.metrics}")
|
||||
```
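To go one step further and pick the best trial, the sketch below works on plain dictionaries standing in for the `config` and `metrics` attributes printed above. The metric name `score` and the sample values are made up for illustration; check which keys your own results actually report.

```python
def best_trial(trials, metric, mode="max"):
    """Return the trial whose last reported `metric` is highest (or lowest)."""
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    # Ignore trials that never reported the metric (for example, failed trials)
    scored = [t for t in trials if metric in (t["metrics"] or {})]
    if not scored:
        raise ValueError(f"no trial reported {metric!r}")
    pick = max if mode == "max" else min
    return pick(scored, key=lambda t: t["metrics"][metric])


# Made-up values standing in for (result.config, result.metrics) pairs
trials = [
    {"config": {"lr0": 0.010}, "metrics": {"score": 0.61}},
    {"config": {"lr0": 0.001}, "metrics": {"score": 0.67}},
    {"config": {"lr0": 0.050}, "metrics": {}},
]
print(best_trial(trials, "score")["config"])
```

With a real `result_grid`, the same list can be built as `[{"config": r.config, "metrics": r.metrics} for r in result_grid]`. Ray Tune's `ResultGrid` also provides `get_best_result(metric=..., mode=...)`, which is usually the more direct route.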
|
||||
|
||||
### Plotting the Entire History of Reported Metrics for a Trial
|
||||
|
||||
You can plot the history of reported metrics for each trial to see how the metrics evolved over time.
|
||||
|
||||
```python
|
||||
import matplotlib.pyplot as plt
|
||||
|
||||
for i, result in enumerate(result_grid):
|
||||
plt.plot(result.metrics_dataframe["training_iteration"], result.metrics_dataframe["mean_accuracy"], label=f"Trial {i}")
|
||||
|
||||
plt.xlabel('Training Iterations')
|
||||
plt.ylabel('Mean Accuracy')
|
||||
plt.legend()
|
||||
plt.show()
|
||||
```
|
||||
|
||||
## Summary
|
||||
|
||||
In this documentation, we covered common workflows to analyze the results of experiments run with Ray Tune using Ultralytics. The key steps include loading the experiment results from a directory, performing basic experiment-level and trial-level analysis, and plotting metrics.
|
||||
|
||||
Explore further by looking into Ray Tune’s [Analyze Results](https://docs.ray.io/en/latest/tune/examples/tune_analyze_results.html) docs page to get the most out of your hyperparameter tuning experiments.
|
||||
239
docs/en/integrations/roboflow.md
Normal file
|
|
@ -0,0 +1,239 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to use Roboflow with Ultralytics for labeling and managing images for use in training, and for evaluating model performance.
|
||||
keywords: Ultralytics, YOLOv8, Roboflow, vector analysis, confusion matrix, data management, image labeling
|
||||
---
|
||||
|
||||
# Roboflow
|
||||
|
||||
[Roboflow](https://roboflow.com/?ref=ultralytics) has everything you need to build and deploy computer vision models. Connect Roboflow at any step in your pipeline with APIs and SDKs, or use the end-to-end interface to automate the entire process from image to inference. Whether you’re in need of [data labeling](https://roboflow.com/annotate?ref=ultralytics), [model training](https://roboflow.com/train?ref=ultralytics), or [model deployment](https://roboflow.com/deploy?ref=ultralytics), Roboflow gives you building blocks to bring custom computer vision solutions to your project.
|
||||
|
||||
!!! warning
|
||||
|
||||
Roboflow users can use Ultralytics under the [AGPL license](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) or procure an [Enterprise license](https://ultralytics.com/license) directly from Ultralytics. Be aware that Roboflow does **not** provide Ultralytics licenses, and it is the responsibility of the user to ensure appropriate licensing.
|
||||
|
||||
In this guide, we are going to showcase how to find, label, and organize data for use in training a custom Ultralytics YOLOv8 model. Use the table of contents below to jump directly to a specific section:
|
||||
|
||||
- Gather data for training a custom YOLOv8 model
|
||||
- Upload, convert and label data for YOLOv8 format
|
||||
- Pre-process and augment data for model robustness
|
||||
- Dataset management for [YOLOv8](https://docs.ultralytics.com/models/yolov8/)
|
||||
- Export data in 40+ formats for model training
|
||||
- Upload custom YOLOv8 model weights for testing and deployment
|
||||
## Gather Data for Training a Custom YOLOv8 Model
|
||||
|
||||
Roboflow provides two services that can help you collect data for YOLOv8 models: [Universe](https://universe.roboflow.com/?ref=ultralytics) and [Collect](https://roboflow.com/collect?ref=ultralytics).
|
||||
|
||||
Universe is an online repository with over 250,000 vision datasets totalling over 100 million images.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_universe.png" alt="Roboflow Universe" width="800"/>
|
||||
</p>
|
||||
|
||||
With a [free Roboflow account](https://app.roboflow.com/?ref=ultralytics), you can export any dataset available on Universe. To export a dataset, click the "Download this Dataset" button on any dataset.
|
||||
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_dataset.png" alt="Roboflow Universe dataset export" width="800"/>
|
||||
</p>
|
||||
|
||||
For YOLOv8, select "YOLOv8" as the export format:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_data_format.png" alt="Roboflow Universe dataset export" width="800"/>
|
||||
</p>
|
||||
|
||||
Universe also has a page that aggregates all [public fine-tuned YOLOv8 models uploaded to Roboflow](https://universe.roboflow.com/search?q=model:yolov8). You can use this page to explore pre-trained models you can use for testing or [for automated data labeling](https://docs.roboflow.com/annotate/use-roboflow-annotate/model-assisted-labeling) or to prototype with [Roboflow inference](https://roboflow.com/inference?ref=ultralytics).
|
||||
|
||||
If you want to gather images yourself, try [Collect](https://github.com/roboflow/roboflow-collect), an open source project that allows you to automatically gather images using a webcam on the edge. You can use text or image prompts with Collect to instruct what data should be collected, allowing you to capture only the useful data you need to build your vision model.
|
||||
|
||||
## Upload, Convert and Label Data for YOLOv8 Format
|
||||
|
||||
[Roboflow Annotate](https://docs.roboflow.com/annotate/use-roboflow-annotate) is an online annotation tool for use in labeling images for object detection, classification, and segmentation.
|
||||
|
||||
To label data for a YOLOv8 object detection, instance segmentation, or classification model, first create a project in Roboflow.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_create_project.png" alt="Create a Roboflow project" width="400"/>
|
||||
</p>
|
||||
|
||||
Next, upload your images, and any pre-existing annotations you have from other tools ([using one of the 40+ supported import formats](https://roboflow.com/formats?ref=ultralytics)), into Roboflow.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_upload_data.png" alt="Upload images to Roboflow" width="800"/>
|
||||
</p>
|
||||
|
||||
After uploading, you are taken to the Annotate page. Select the batch of images you have uploaded, then click "Start Annotating" to label images.
|
||||
|
||||
To label with bounding boxes, press the `B` key on your keyboard or click the box icon in the sidebar. Click on a point where you want to start your bounding box, then drag to create the box:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_annotate.png" alt="Annotating an image in Roboflow" width="800"/>
|
||||
</p>
|
||||
|
||||
Once you have created an annotation, a pop-up will appear asking you to select a class for it.
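For reference, a box drawn this way ends up in a YOLO-format label file as one line of the form `class x_center y_center width height`, with coordinates normalized by the image size. The conversion can be sketched as follows; the function name and the sample numbers are illustrative, and this is not Roboflow's export code.

```python
def to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel box (top-left x1,y1; bottom-right x2,y2) to a YOLO label line."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 100x100 pixel box in a 640x480 image, labeled as class 0
print(to_yolo_line(0, 200, 200, 300, 300, img_w=640, img_h=480))
```

Because the values are normalized, the same label line stays valid if the image is later resized without cropping.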
|
||||
|
||||
To label with polygons, press the `P` key on your keyboard, or click the polygon icon in the sidebar. With the polygon annotation tool enabled, click on individual points in the image to draw a polygon.
|
||||
|
||||
Roboflow offers a SAM-based label assistant with which you can label images faster than ever. SAM (Segment Anything Model) is a state-of-the-art computer vision model that can precisely label images. With SAM, you can significantly speed up the image labeling process. Annotating images with polygons becomes as simple as a few clicks, rather than the tedious process of precisely clicking points around an object.
|
||||
|
||||
To use the label assistant, click the cursor icon in the sidebar. SAM will then be loaded for use in your project.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_annotate_interactive.png" alt="Annotating an image in Roboflow with SAM-powered label assist" width="800"/>
|
||||
</p>
|
||||
|
||||
Hover over any object in the image and SAM will recommend an annotation. You can hover to find the right place to annotate, then click to create your annotation. To amend your annotation to be more or less specific, you can click inside or outside of the annotation SAM has created on the image.
|
||||
|
||||
You can also add tags to images from the Tags panel in the sidebar. You can apply tags to data from a particular area, taken from a specific camera, and more. You can then use these tags to search through data for images matching a tag and generate versions of a dataset with images that contain a particular tag or set of tags.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_tags.png" alt="Adding tags to an image in Roboflow" width="300"/>
|
||||
</p>
|
||||
|
||||
Models hosted on Roboflow can be used with Label Assist, an automated annotation tool that uses your YOLOv8 model to recommend annotations. To use Label Assist, first upload a YOLOv8 model to Roboflow (see instructions later in the guide). Then, click the magic wand icon in the left sidebar and select your model for use in Label Assist.
|
||||
|
||||
Choose a model, then click "Continue" to enable Label Assist:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_label_assist.png" alt="Enabling Label Assist" width="800"/>
|
||||
</p>
|
||||
|
||||
When you open new images for annotation, Label Assist will trigger and recommend annotations.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_label_assist.png" alt="Label Assist recommending an annotation" width="800"/>
|
||||
</p>
|
||||
|
||||
## Dataset Management for YOLOv8
|
||||
|
||||
Roboflow provides a suite of tools for understanding computer vision datasets.
|
||||
|
||||
First, you can use dataset search to find images that meet a semantic text description (e.g. find all images that contain people), or that meet a specified label (e.g. the image is associated with a specific tag). To use dataset search, click "Dataset" in the sidebar. Then, input a search query using the search bar and associated filters at the top of the page.
|
||||
|
||||
For example, the following text query finds images that contain people in a dataset:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_dataset_management.png" alt="Searching for an image" width="800"/>
|
||||
</p>
|
||||
|
||||
You can narrow your search to images with a particular tag using the "Tags" selector:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_filter_by_tag.png" alt="Filter images by tag" width="350"/>
|
||||
</p>
|
||||
|
||||
Before you start training a model with your dataset, we recommend using Roboflow [Health Check](https://docs.roboflow.com/datasets/dataset-health-check), a web tool that provides an insight into your dataset and how you can improve the dataset prior to training a vision model.
|
||||
|
||||
To use Health Check, click the "Health Check" sidebar link. A list of statistics will appear that show the average size of images in your dataset, class balance, a heatmap of where annotations are in your images, and more.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_dataset_health_check.png" alt="Roboflow Health Check analysis" width="800"/>
|
||||
</p>
|
||||
|
||||
Health Check may recommend changes to help enhance dataset performance. For example, the class balance feature may show that there is an imbalance in labels that, if solved, may boost the performance of your model.
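If you want a quick local look at class balance before uploading, the idea can be sketched by counting class IDs in YOLO-format label lines. This is a minimal illustration with made-up labels, not how Health Check itself is implemented; `class_counts` is a hypothetical helper.

```python
from collections import Counter


def class_counts(label_lines):
    """Count annotations per class ID from YOLO-format label lines."""
    counts = Counter()
    for line in label_lines:
        parts = line.split()
        if parts:  # skip blank lines
            counts[int(parts[0])] += 1
    return counts


# Made-up label lines: class ID followed by normalized box coordinates
labels = [
    "0 0.50 0.50 0.20 0.20",
    "0 0.30 0.40 0.10 0.10",
    "1 0.70 0.60 0.25 0.30",
    "0 0.10 0.90 0.05 0.05",
]
counts = class_counts(labels)
total = sum(counts.values())
for class_id, n in sorted(counts.items()):
    print(f"class {class_id}: {n} annotations ({n / total:.0%})")
```

A class that accounts for only a small share of annotations is a candidate for collecting or labeling more examples.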
|
||||
|
||||
## Export Data in 40+ Formats for Model Training
|
||||
|
||||
To export your data, you will need a dataset version. A version is a snapshot of your dataset, frozen in time. To create a version, first click "Versions" in the sidebar. Then, click the "Create New Version" button. On this page, you will be able to choose augmentations and preprocessing steps to apply to your dataset:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_generate_dataset.png" alt="Creating a dataset version on Roboflow" width="800"/>
|
||||
</p>
|
||||
|
||||
For each augmentation you select, a pop-up will appear allowing you to tune the augmentation to your needs. Here is an example of tuning a brightness augmentation within specified parameters:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_augmentations.png" alt="Applying augmentations to a dataset" width="800"/>
|
||||
</p>
|
||||
|
||||
When your dataset version has been generated, you can export your data into a range of formats. Click the "Export Dataset" button on your dataset version page to export your data:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_export_data.png" alt="Exporting a dataset" width="800"/>
|
||||
</p>
|
||||
|
||||
You are now ready to train YOLOv8 on a custom dataset. Follow this [written guide](https://blog.roboflow.com/how-to-train-yolov8-on-a-custom-dataset/) and [YouTube video](https://www.youtube.com/watch?v=wuZtUMEiKWY) for step-by-step instructions or refer to the [Ultralytics documentation](https://docs.ultralytics.com/modes/train/).
|
||||
|
||||
## Upload Custom YOLOv8 Model Weights for Testing and Deployment
|
||||
|
||||
Roboflow offers an infinitely scalable API for deployed models and SDKs for use with NVIDIA Jetsons, Luxonis OAKs, Raspberry Pis, GPU-based devices, and more.
|
||||
|
||||
You can deploy YOLOv8 models by uploading YOLOv8 weights to Roboflow. You can do this in a few lines of Python code. Create a new Python file and add the following code:
|
||||
|
||||
```python
|
||||
import roboflow # install with 'pip install roboflow'
|
||||
|
||||
roboflow.login()
|
||||
|
||||
rf = roboflow.Roboflow()
|
||||
|
||||
project = rf.workspace(WORKSPACE_ID).project("football-players-detection-3zvbc")
|
||||
dataset = project.version(VERSION).download("yolov8")
|
||||
|
||||
project.version(dataset.version).deploy(model_type="yolov8", model_path=f"{HOME}/runs/detect/train/")
|
||||
```
|
||||
|
||||
In this code, replace the workspace ID, project ID, and version ID with the values for your account and project, and set `HOME` to the directory that contains your `runs/` folder. [Learn how to retrieve your Roboflow API key](https://docs.roboflow.com/api-reference/authentication#retrieve-an-api-key).
|
||||
|
||||
When you run the code above, you will be asked to authenticate. Then, your model will be uploaded and an API will be created for your project. This process can take up to 30 minutes to complete.
|
||||
|
||||
To test your model and find deployment instructions for supported SDKs, go to the "Deploy" tab in the Roboflow sidebar. At the top of this page, a widget will appear with which you can test your model. You can use your webcam for live testing or upload images or videos.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_test_project.png" alt="Running inference on an example image" width="800"/>
|
||||
</p>
|
||||
|
||||
You can also use your uploaded model as a [labeling assistant](https://docs.roboflow.com/annotate/use-roboflow-annotate/model-assisted-labeling). This feature uses your trained model to recommend annotations on images uploaded to Roboflow.
|
||||
|
||||
## How to Evaluate YOLOv8 Models
|
||||
|
||||
Roboflow provides a range of features for use in evaluating models.
|
||||
|
||||
Once you have uploaded a model to Roboflow, you can access our model evaluation tool, which provides a confusion matrix showing the performance of your model as well as an interactive vector analysis plot. These features can help you find opportunities to improve your model.
|
||||
|
||||
To access a confusion matrix, go to your model page on the Roboflow dashboard, then click "View Detailed Evaluation":
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_model_eval.png" alt="Start a Roboflow model evaluation" width="800"/>
|
||||
</p>
|
||||
|
||||
A pop-up will appear showing a confusion matrix:
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_confusion_matrix.png" alt="A confusion matrix" width="800"/>
|
||||
</p>
|
||||
|
||||
Hover over a box on the confusion matrix to see the value associated with the box. Click on a box to see images in the respective category. Click on an image to view the model predictions and ground truth data associated with that image.
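As a reminder of what each box in a confusion matrix counts, here is a minimal sketch over already-matched pairs of ground-truth and predicted class names. It is a simplification: a detection confusion matrix also has to match boxes to each other and account for missed or spurious detections, and the row/column orientation varies between tools. The class names and labels are made up.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Build a matrix where row = ground-truth class and column = predicted class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[pred]] += 1
    return matrix


classes = ["ball", "player", "referee"]
truth = ["player", "player", "ball", "referee", "player"]
preds = ["player", "referee", "ball", "referee", "player"]
for name, row in zip(classes, confusion_matrix(truth, preds, classes)):
    print(f"{name:>8}: {row}")
```

Off-diagonal entries are the confusions worth inspecting; here one `player` was predicted as `referee`.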
|
||||
|
||||
For more insights, click Vector Analysis. This will show a scatter plot of the images in your dataset, calculated using CLIP. The closer images are in the plot, the more similar they are, semantically. Each image is represented as a dot with a color between white and red. The more red the dot, the worse the model performed.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_vector_analysis.png" alt="A vector analysis plot" width="800"/>
|
||||
</p>
|
||||
|
||||
You can use Vector Analysis to:
|
||||
|
||||
- Find clusters of images;
|
||||
- Identify clusters where the model performs poorly; and
|
||||
- Visualize commonalities between images on which the model performs poorly.
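One common way to compare embedding vectors such as those produced by CLIP is cosine similarity. How Roboflow computes its plot is not described here, so treat the sketch below as a generic illustration of "similar images end up close together", using tiny made-up vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Tiny made-up 3-D vectors; real image embeddings have hundreds of dimensions
image_a = [0.9, 0.1, 0.0]
image_b = [0.8, 0.2, 0.1]
image_c = [0.0, 0.1, 0.9]
print(cosine_similarity(image_a, image_b) > cosine_similarity(image_a, image_c))
```

Images whose embeddings score close to 1 against each other would be expected to fall into the same cluster.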
|
||||
|
||||
## Learning Resources
|
||||
|
||||
Want to learn more about using Roboflow for creating YOLOv8 models? The following resources may be helpful in your work.
|
||||
|
||||
- [Train YOLOv8 on a Custom Dataset](https://github.com/roboflow/notebooks/blob/main/notebooks/train-yolov8-object-detection-on-custom-dataset.ipynb): Follow our interactive notebook that shows you how to train a YOLOv8 model on a custom dataset.
|
||||
- [Autodistill](https://autodistill.github.io/autodistill/): Use large foundation vision models to label data for specific models. You can label images for use in training YOLOv8 classification, detection, and segmentation models with Autodistill.
|
||||
- [Supervision](https://roboflow.github.io/supervision/): A Python package with helpful utilities for use in working with computer vision models. You can use supervision to filter detections, compute confusion matrices, and more, all in a few lines of Python code.
|
||||
- [Roboflow Blog](https://blog.roboflow.com/): The Roboflow Blog features over 500 articles on computer vision, covering topics from how to train a YOLOv8 model to annotation best practices.
|
||||
- [Roboflow YouTube channel](https://www.youtube.com/@Roboflow): Browse dozens of in-depth computer vision guides on our YouTube channel, covering topics from training YOLOv8 models to automated image labeling.
|
||||
|
||||
## Project Showcase
|
||||
|
||||
Below are a few of the many pieces of feedback we have received for using YOLOv8 and Roboflow together to create computer vision models.
|
||||
|
||||
<p align="center">
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_showcase_1.png" alt="Showcase image" width="500"/>
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_showcase_2.png" alt="Showcase image" width="500"/>
|
||||
<img src="https://media.roboflow.com/ultralytics/rf_showcase_3.png" alt="Showcase image" width="500"/>
|
||||
</p>
|
||||
186
docs/en/models/fast-sam.md
Normal file
|
|
@ -0,0 +1,186 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore FastSAM, a CNN-based solution for real-time object segmentation in images. Enhanced user interaction, computational efficiency and adaptable across vision tasks.
|
||||
keywords: FastSAM, machine learning, CNN-based solution, object segmentation, real-time solution, Ultralytics, vision tasks, image processing, industrial applications, user interaction
|
||||
---
|
||||
|
||||
# Fast Segment Anything Model (FastSAM)
|
||||
|
||||
The Fast Segment Anything Model (FastSAM) is a novel, real-time CNN-based solution for the Segment Anything task. This task is designed to segment any object within an image based on various possible user interaction prompts. FastSAM significantly reduces computational demands while maintaining competitive performance, making it a practical choice for a variety of vision tasks.
|
||||
|
||||

|
||||
|
||||
## Overview
|
||||
|
||||
FastSAM is designed to address the limitations of the [Segment Anything Model (SAM)](sam.md), a heavy Transformer model with substantial computational resource requirements. FastSAM decouples the segment anything task into two sequential stages: all-instance segmentation and prompt-guided selection. The first stage uses [YOLOv8-seg](../tasks/segment.md) to produce the segmentation masks of all instances in the image. In the second stage, it outputs the region-of-interest corresponding to the prompt.
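The second stage can be pictured as a lookup over the masks produced by the first. The sketch below is a deliberately simplified illustration, not FastSAM's implementation: masks are represented as sets of pixel coordinates, a point prompt selects the masks containing that pixel, and a box prompt selects the mask with the highest IoU against the box. All function names are hypothetical.

```python
def select_by_point(masks, point):
    """Return indices of masks that contain the prompted (x, y) pixel."""
    return [i for i, mask in enumerate(masks) if point in mask]


def select_by_box(masks, box):
    """Return the index of the mask with the highest IoU against box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    box_area = (x2 - x1) * (y2 - y1)

    def iou(mask):
        # Pixels of the mask that fall inside the box form the intersection
        inside = sum(1 for (x, y) in mask if x1 <= x < x2 and y1 <= y < y2)
        return inside / (len(mask) + box_area - inside)

    return max(range(len(masks)), key=lambda i: iou(masks[i]))


def rectangle(x1, y1, x2, y2):
    """Helper: a filled rectangular mask as a set of pixel coordinates."""
    return {(x, y) for x in range(x1, x2) for y in range(y1, y2)}


# Stage 1 stand-in: pretend the segmenter returned these two instance masks
masks = [rectangle(0, 0, 4, 4), rectangle(6, 6, 10, 10)]

# Stage 2: choose the mask that matches each prompt
print(select_by_point(masks, (7, 8)))
print(select_by_box(masks, (5, 5, 10, 10)))
```

Because the expensive segmentation happens once, trying a different prompt only repeats the cheap selection step.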
|
||||
|
||||
## Key Features
|
||||
|
||||
1. **Real-time Solution:** By leveraging the computational efficiency of CNNs, FastSAM provides a real-time solution for the segment anything task, making it valuable for industrial applications that require quick results.
|
||||
|
||||
2. **Efficiency and Performance:** FastSAM offers a significant reduction in computational and resource demands without compromising on performance quality. It achieves comparable performance to SAM but with drastically reduced computational resources, enabling real-time application.
|
||||
|
||||
3. **Prompt-guided Segmentation:** FastSAM can segment any object within an image guided by various possible user interaction prompts, providing flexibility and adaptability in different scenarios.
|
||||
|
||||
4. **Based on YOLOv8-seg:** FastSAM is based on [YOLOv8-seg](../tasks/segment.md), an object detector equipped with an instance segmentation branch. This allows it to effectively produce the segmentation masks of all instances in an image.
|
||||
|
||||
5. **Competitive Results on Benchmarks:** On the object proposal task on MS COCO, FastSAM achieves high scores at a significantly faster speed than [SAM](sam.md) on a single NVIDIA RTX 3090, demonstrating its efficiency and capability.
|
||||
|
||||
6. **Practical Applications:** The proposed approach provides a new, practical solution for a large number of vision tasks at very high speed, tens or hundreds of times faster than current methods.
|
||||
|
||||
7. **Model Compression Feasibility:** FastSAM demonstrates the feasibility of a path that can significantly reduce the computational effort by introducing an artificial prior to the structure, thus opening new possibilities for large model architecture for general vision tasks.
|
||||
|
||||
## Usage
|
||||
|
||||
### Python API
|
||||
|
||||
The FastSAM models are easy to integrate into your Python applications. Ultralytics provides a user-friendly Python API to streamline the process.
|
||||
|
||||
#### Predict Usage
|
||||
|
||||
To run segmentation on an image, use the `predict` method as shown below:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
|
||||
from ultralytics import FastSAM
|
||||
from ultralytics.models.fastsam import FastSAMPrompt
|
||||
|
||||
# Define an inference source
|
||||
source = 'path/to/bus.jpg'
|
||||
|
||||
# Create a FastSAM model
|
||||
model = FastSAM('FastSAM-s.pt') # or FastSAM-x.pt
|
||||
|
||||
# Run inference on an image
|
||||
everything_results = model(source, device='cpu', retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
|
||||
|
||||
# Prepare a Prompt Process object
|
||||
prompt_process = FastSAMPrompt(source, everything_results, device='cpu')
|
||||
|
||||
# Everything prompt
|
||||
ann = prompt_process.everything_prompt()
|
||||
|
||||
# Bbox default shape [0,0,0,0] -> [x1,y1,x2,y2]
|
||||
ann = prompt_process.box_prompt(bbox=[200, 200, 300, 300])
|
||||
|
||||
# Text prompt
|
||||
ann = prompt_process.text_prompt(text='a photo of a dog')
|
||||
|
||||
# Point prompt
|
||||
# points default [[0,0]] [[x1,y1],[x2,y2]]
|
||||
# pointlabel default [0] [1,0] 0:background, 1:foreground
|
||||
ann = prompt_process.point_prompt(points=[[200, 200]], pointlabel=[1])
|
||||
prompt_process.plot(annotations=ann, output='./')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
```bash
|
||||
# Load a FastSAM model and segment everything with it
|
||||
yolo segment predict model=FastSAM-s.pt source=path/to/bus.jpg imgsz=640
|
||||
```
|
||||
|
||||
This snippet demonstrates the simplicity of loading a pre-trained model and running a prediction on an image.
|
||||
|
||||
#### Val Usage
|
||||
|
||||
Validation of the model on a dataset can be done as follows:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
|
||||
from ultralytics import FastSAM
|
||||
|
||||
# Create a FastSAM model
|
||||
model = FastSAM('FastSAM-s.pt') # or FastSAM-x.pt
|
||||
|
||||
# Validate the model
|
||||
results = model.val(data='coco8-seg.yaml')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
```bash
|
||||
# Load a FastSAM model and validate it on the COCO8-seg example dataset at image size 640
|
||||
yolo segment val model=FastSAM-s.pt data=coco8-seg.yaml imgsz=640
|
||||
```
|
||||
|
||||
Please note that FastSAM only supports detection and segmentation of a single class of object. This means it will recognize and segment all objects as the same class. Therefore, when preparing the dataset, you need to convert all object category IDs to 0.
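The class ID conversion can be sketched as a small rewrite of YOLO-format label lines, where the first field of each line is the class ID. The helper name and sample lines are illustrative; adapt the file handling to your own dataset layout.

```python
def to_single_class(label_lines):
    """Rewrite YOLO-format label lines so that every object uses class ID 0."""
    converted = []
    for line in label_lines:
        parts = line.split()
        if not parts:
            continue  # drop blank lines
        # Keep the coordinates untouched and replace only the class ID
        converted.append(" ".join(["0"] + parts[1:]))
    return converted


labels = ["16 0.48 0.63 0.69 0.71", "2 0.74 0.52 0.31 0.93"]
print(to_single_class(labels))
```

The same rewrite applies to segmentation labels, since only the leading class field changes and the remaining polygon coordinates are kept as they are.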
|
||||
|
||||
### FastSAM Official Usage
|
||||
|
||||
FastSAM is also available directly from the [https://github.com/CASIA-IVA-Lab/FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) repository. Here is a brief overview of the typical steps you might take to use FastSAM:
|
||||
|
||||
#### Installation
|
||||
|
||||
1. Clone the FastSAM repository:
|
||||
```shell
|
||||
git clone https://github.com/CASIA-IVA-Lab/FastSAM.git
|
||||
```
|
||||
|
||||
2. Create and activate a Conda environment with Python 3.9:
|
||||
```shell
|
||||
conda create -n FastSAM python=3.9
|
||||
conda activate FastSAM
|
||||
```
|
||||
|
||||
3. Navigate to the cloned repository and install the required packages:
|
||||
```shell
|
||||
cd FastSAM
|
||||
pip install -r requirements.txt
|
||||
```
|
||||
|
||||
4. Install the CLIP model:
|
||||
```shell
|
||||
pip install git+https://github.com/openai/CLIP.git
|
||||
```
|
||||
|
||||
#### Example Usage
|
||||
|
||||
1. Download a [model checkpoint](https://drive.google.com/file/d/1m1sjY4ihXBU1fZXdQ-Xdj-mDltW-2Rqv/view?usp=sharing).
|
||||
|
||||
2. Use FastSAM for inference. Example commands:
|
||||
|
||||
- Segment everything in an image:
|
||||
```shell
|
||||
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg
|
||||
```
|
||||
|
||||
- Segment specific objects using text prompt:
|
||||
```shell
|
||||
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --text_prompt "the yellow dog"
|
||||
```
|
||||
|
||||
- Segment objects within a bounding box (provide box coordinates in xywh format):
|
||||
```shell
|
||||
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --box_prompt "[570,200,230,400]"
|
||||
```
|
||||
|
||||
- Segment objects near specific points:
|
||||
```shell
|
||||
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --point_prompt "[[520,360],[620,300]]" --point_label "[1,0]"
|
||||
```
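Note that the box prompt in the commands above is given as xywh, while the Ultralytics Python example earlier on this page uses `[x1, y1, x2, y2]`. A minimal conversion sketch, assuming `x, y` is the top-left corner of the box, looks like this:

```python
def xywh_to_xyxy(box):
    """Convert [x, y, w, h] (top-left corner plus size) to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


# The box from the command-line example above
print(xywh_to_xyxy([570, 200, 230, 400]))
```

If your boxes are centre-based instead, subtract half the width and height from `x` and `y` first.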
|
||||
|
||||
Additionally, you can try FastSAM through a [Colab demo](https://colab.research.google.com/drive/1oX14f6IneGGw612WgVlAiy91UHwFAvr9?usp=sharing) or on the [HuggingFace web demo](https://huggingface.co/spaces/An-619/FastSAM) for a visual experience.
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
We would like to acknowledge the FastSAM authors for their significant contributions in the field of real-time instance segmentation:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{zhao2023fast,
|
||||
title={Fast Segment Anything},
|
||||
author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang},
|
||||
year={2023},
|
||||
eprint={2306.12156},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
The original FastSAM paper can be found on [arXiv](https://arxiv.org/abs/2306.12156). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/CASIA-IVA-Lab/FastSAM). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
|
||||
90
docs/en/models/index.md
Normal file
|
|
@ -0,0 +1,90 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the diverse range of YOLO family, SAM, MobileSAM, FastSAM, YOLO-NAS, and RT-DETR models supported by Ultralytics. Get started with examples for both CLI and Python usage.
|
||||
keywords: Ultralytics, documentation, YOLO, SAM, MobileSAM, FastSAM, YOLO-NAS, RT-DETR, models, architectures, Python, CLI
|
||||
---
|
||||
|
||||
# Models Supported by Ultralytics
|
||||
|
||||
Welcome to Ultralytics' model documentation! We offer support for a wide range of models, each tailored to specific tasks like [object detection](../tasks/detect.md), [instance segmentation](../tasks/segment.md), [image classification](../tasks/classify.md), [pose estimation](../tasks/pose.md), and [multi-object tracking](../modes/track.md). If you're interested in contributing your model architecture to Ultralytics, check out our [Contributing Guide](../help/contributing.md).
|
||||
|
||||
## Featured Models
|
||||
|
||||
Here are some of the key models supported:
|
||||
|
||||
1. **[YOLOv3](./yolov3.md)**: The third iteration of the YOLO model family, originally by Joseph Redmon, known for its efficient real-time object detection capabilities.
|
||||
2. **[YOLOv4](./yolov4.md)**: A darknet-native update to YOLOv3, released by Alexey Bochkovskiy in 2020.
|
||||
3. **[YOLOv5](./yolov5.md)**: An improved version of the YOLO architecture by Ultralytics, offering better performance and speed trade-offs compared to previous versions.
|
||||
4. **[YOLOv6](./yolov6.md)**: Released by [Meituan](https://about.meituan.com/) in 2022, and in use in many of the company's autonomous delivery robots.
|
||||
5. **[YOLOv7](./yolov7.md)**: Updated YOLO models released in 2022 by the authors of YOLOv4.
|
||||
6. **[YOLOv8](./yolov8.md)**: The latest version of the YOLO family, featuring enhanced capabilities such as instance segmentation, pose/keypoints estimation, and classification.
|
||||
7. **[Segment Anything Model (SAM)](./sam.md)**: Meta's Segment Anything Model (SAM).
|
||||
8. **[Mobile Segment Anything Model (MobileSAM)](./mobile-sam.md)**: MobileSAM for mobile applications, by Kyung Hee University.
|
||||
9. **[Fast Segment Anything Model (FastSAM)](./fast-sam.md)**: FastSAM by Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences.
|
||||
10. **[YOLO-NAS](./yolo-nas.md)**: YOLO Neural Architecture Search (NAS) Models.
|
||||
11. **[Realtime Detection Transformers (RT-DETR)](./rtdetr.md)**: Baidu's PaddlePaddle Realtime Detection Transformer (RT-DETR) models.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/MWq1UxqTClU?si=nHAW-lYDzrz68jR0"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Run Ultralytics YOLO models in just a few lines of code.
|
||||
</p>
|
||||
|
||||
## Getting Started: Usage Examples
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()`, `SAM()`, `NAS()` and `RTDETR()` classes to create a model instance in Python:
|
||||
|
||||
```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)

# Run inference with the YOLOv8n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
CLI commands are available to directly run the models:
|
||||
|
||||
```bash
# Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640

# Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image
yolo predict model=yolov8n.pt source=path/to/bus.jpg
```
|
||||
|
||||
## Contributing New Models
|
||||
|
||||
Interested in contributing your model to Ultralytics? Great! We're always open to expanding our model portfolio.
|
||||
|
||||
1. **Fork the Repository**: Start by forking the [Ultralytics GitHub repository](https://github.com/ultralytics/ultralytics).
|
||||
|
||||
2. **Clone Your Fork**: Clone your fork to your local machine and create a new branch to work on.
|
||||
|
||||
3. **Implement Your Model**: Add your model following the coding standards and guidelines provided in our [Contributing Guide](../help/contributing.md).
|
||||
|
||||
4. **Test Thoroughly**: Make sure to test your model rigorously, both in isolation and as part of the pipeline.
|
||||
|
||||
5. **Create a Pull Request**: Once you're satisfied with your model, create a pull request to the main repository for review.
|
||||
|
||||
6. **Code Review & Merging**: After review, if your model meets our criteria, it will be merged into the main repository.
|
||||
|
||||
For detailed steps, consult our [Contributing Guide](../help/contributing.md).
|
||||
109
docs/en/models/mobile-sam.md
Normal file
|
|
@ -0,0 +1,109 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn more about MobileSAM, its implementation, comparison with the original SAM, and how to download and test it in the Ultralytics framework. Improve your mobile applications today.
|
||||
keywords: MobileSAM, Ultralytics, SAM, mobile applications, Arxiv, GPU, API, image encoder, mask decoder, model download, testing method
|
||||
---
|
||||
|
||||

|
||||
|
||||
# Mobile Segment Anything (MobileSAM)
|
||||
|
||||
The MobileSAM paper is now available on [arXiv](https://arxiv.org/pdf/2306.14289.pdf).
|
||||
|
||||
A demonstration of MobileSAM running on a CPU can be accessed at this [demo link](https://huggingface.co/spaces/dhkim2810/MobileSAM). On a Mac i5 CPU, inference takes approximately 3 seconds. On the Hugging Face demo, the interface and lower-performance CPUs contribute to a slower response, but it continues to function effectively.
|
||||
|
||||
MobileSAM is implemented in various projects including [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything), [AnyLabeling](https://github.com/vietanhdev/anylabeling), and [Segment Anything in 3D](https://github.com/Jumpat/SegmentAnythingin3D).
|
||||
|
||||
MobileSAM is trained on a single GPU with a 100k dataset (1% of the original images) in less than a day. The code for this training will be made available in the future.
|
||||
|
||||
## Adapting from SAM to MobileSAM
|
||||
|
||||
Since MobileSAM retains the same pipeline as the original SAM, we have incorporated the original's pre-processing, post-processing, and all other interfaces. Consequently, those currently using the original SAM can transition to MobileSAM with minimal effort.
|
||||
|
||||
MobileSAM performs comparably to the original SAM and retains the same pipeline except for a change in the image encoder. Specifically, we replace the original heavyweight ViT-H encoder (632M) with a smaller Tiny-ViT (5M). On a single GPU, MobileSAM operates at about 12ms per image: 8ms on the image encoder and 4ms on the mask decoder.
|
||||
|
||||
The following table provides a comparison of ViT-based image encoders:
|
||||
|
||||
| Image Encoder | Original SAM | MobileSAM |
|---------------|--------------|-----------|
| Parameters    | 611M         | 5M        |
| Speed         | 452ms        | 8ms       |
|
||||
|
||||
Both the original SAM and MobileSAM utilize the same prompt-guided mask decoder:
|
||||
|
||||
| Mask Decoder | Original SAM | MobileSAM |
|--------------|--------------|-----------|
| Parameters   | 3.876M       | 3.876M    |
| Speed        | 4ms          | 4ms       |
|
||||
|
||||
Here is the comparison of the whole pipeline:
|
||||
|
||||
| Whole Pipeline (Enc+Dec) | Original SAM | MobileSAM |
|--------------------------|--------------|-----------|
| Parameters               | 615M         | 9.66M     |
| Speed                    | 456ms        | 12ms      |
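The whole-pipeline latency figures follow from the two component tables. A quick worked check in Python, using only the numbers quoted in the tables (the variable names are illustrative):

```python
# Per-component latency in milliseconds, as quoted in the tables
sam = {"encoder_ms": 452, "decoder_ms": 4}
mobile_sam = {"encoder_ms": 8, "decoder_ms": 4}

sam_total = sam["encoder_ms"] + sam["decoder_ms"]  # 456 ms
mobile_total = mobile_sam["encoder_ms"] + mobile_sam["decoder_ms"]  # 12 ms

speedup = sam_total / mobile_total
# Whole-pipeline parameter counts (millions), taken from the whole-pipeline table
param_ratio = 615 / 9.66

print(f"SAM: {sam_total} ms, MobileSAM: {mobile_total} ms")
print(f"MobileSAM is about {speedup:.0f}x faster with about {param_ratio:.0f}x fewer parameters")
```

Note that the rounded encoder sizes in the first table do not sum exactly to the whole-pipeline parameter counts, so the ratio here uses the whole-pipeline row directly.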
|
||||
|
||||
The performance of MobileSAM and the original SAM is demonstrated using both a point and a box as prompts.
|
||||
|
||||

|
||||
|
||||

|
||||
|
||||
With its superior performance, MobileSAM is approximately 5 times smaller and 7 times faster than the current FastSAM. More details are available at the [MobileSAM project page](https://github.com/ChaoningZhang/MobileSAM).
|
||||
|
||||
## Testing MobileSAM in Ultralytics
|
||||
|
||||
Just like the original SAM, we offer a straightforward testing method in Ultralytics, including modes for both Point and Box prompts.
|
||||
|
||||
### Model Download
|
||||
|
||||
You can download the model [here](https://github.com/ChaoningZhang/MobileSAM/blob/master/weights/mobile_sam.pt).
|
||||
|
||||
### Point Prompt
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
from ultralytics import SAM

# Load the model
model = SAM('mobile_sam.pt')

# Predict a segment based on a point prompt
model.predict('ultralytics/assets/zidane.jpg', points=[900, 370], labels=[1])
```
|
||||
|
||||
### Box Prompt
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
from ultralytics import SAM

# Load the model
model = SAM('mobile_sam.pt')

# Predict a segment based on a box prompt
model.predict('ultralytics/assets/zidane.jpg', bboxes=[439, 437, 524, 709])
```
|
||||
|
||||
We have implemented `MobileSAM` and `SAM` using the same API. For more usage information, please see the [SAM page](./sam.md).
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you find MobileSAM useful in your research or development work, please consider citing the original paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@article{mobile_sam,
|
||||
title={Faster Segment Anything: Towards Lightweight SAM for Mobile Applications},
|
||||
author={Zhang, Chaoning and Han, Dongshen and Qiao, Yu and Kim, Jung Uk and Bae, Sung Ho and Lee, Seungkyu and Hong, Choong Seon},
|
||||
journal={arXiv preprint arXiv:2306.14289},
|
||||
year={2023}
|
||||
}
|
||||
```
|
||||
101
docs/en/models/rtdetr.md
Normal file
|
|
@ -0,0 +1,101 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover the features and benefits of RT-DETR, Baidu’s efficient and adaptable real-time object detector powered by Vision Transformers, including pre-trained models.
|
||||
keywords: RT-DETR, Baidu, Vision Transformers, object detection, real-time performance, CUDA, TensorRT, IoU-aware query selection, Ultralytics, Python API, PaddlePaddle
|
||||
---
|
||||
|
||||
# Baidu's RT-DETR: A Vision Transformer-Based Real-Time Object Detector
|
||||
|
||||
## Overview
|
||||
|
||||
Real-Time Detection Transformer (RT-DETR), developed by Baidu, is a cutting-edge end-to-end object detector that provides real-time performance while maintaining high accuracy. It leverages the power of Vision Transformers (ViT) to efficiently process multiscale features by decoupling intra-scale interaction and cross-scale fusion. RT-DETR is highly adaptable, supporting flexible adjustment of inference speed using different decoder layers without retraining. The model excels on accelerated backends like CUDA with TensorRT, outperforming many other real-time object detectors.
|
||||
|
||||

|
||||
**Overview of Baidu's RT-DETR.** The RT-DETR model architecture diagram shows the last three stages of the backbone {S3, S4, S5} as the input to the encoder. The efficient hybrid encoder transforms multiscale features into a sequence of image features through attention-based intra-scale feature interaction (AIFI) and the CNN-based cross-scale feature-fusion module (CCFM). IoU-aware query selection then picks a fixed number of image features to serve as initial object queries for the decoder. Finally, the decoder with auxiliary prediction heads iteratively optimizes object queries to generate boxes and confidence scores ([source](https://arxiv.org/pdf/2304.08069.pdf)).
|
||||
|
||||
### Key Features
|
||||
|
||||
- **Efficient Hybrid Encoder:** Baidu's RT-DETR uses an efficient hybrid encoder that processes multiscale features by decoupling intra-scale interaction and cross-scale fusion. This unique Vision Transformers-based design reduces computational costs and allows for real-time object detection.
|
||||
- **IoU-aware Query Selection:** Baidu's RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene, enhancing the detection accuracy.
|
||||
- **Adaptable Inference Speed:** Baidu's RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios.
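For readers unfamiliar with the metric behind IoU-aware query selection, here is a minimal sketch of intersection over union (IoU) for two axis-aligned boxes in `[x1, y1, x2, y2]` format. It illustrates only the IoU quantity itself, not RT-DETR's query selection mechanism, and the function name `box_iou` is hypothetical:

```python
def box_iou(a, b):
    """Return the IoU of two boxes given as [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(round(box_iou([0, 0, 10, 10], [5, 5, 15, 15]), 3))  # 25 / 175, about 0.143
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap at all.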
|
||||
|
||||
## Pre-trained Models
|
||||
|
||||
The Ultralytics Python API provides pre-trained PaddlePaddle RT-DETR models with different scales:
|
||||
|
||||
- RT-DETR-L: 53.0% AP on COCO val2017, 114 FPS on T4 GPU
|
||||
- RT-DETR-X: 54.8% AP on COCO val2017, 74 FPS on T4 GPU
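To relate the FPS figures above to per-image latency, a small conversion helps. This is a sketch that assumes images are processed one at a time, so latency is simply the reciprocal of throughput; `fps_to_latency_ms` is an illustrative helper name:

```python
def fps_to_latency_ms(fps):
    """Convert throughput in frames per second to per-image latency in milliseconds."""
    return 1000.0 / fps


for name, fps in {"RT-DETR-L": 114, "RT-DETR-X": 74}.items():
    print(f"{name}: {fps} FPS is about {fps_to_latency_ms(fps):.1f} ms per image")
```

Under that assumption, RT-DETR-L takes roughly 8.8 ms per image and RT-DETR-X roughly 13.5 ms on the T4 GPU quoted above.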
|
||||
|
||||
## Usage
|
||||
|
||||
You can use RT-DETR for object detection tasks using the `ultralytics` pip package. The following is a sample code snippet showing how to use RT-DETR models for training and inference:
|
||||
|
||||
!!! example ""
|
||||
|
||||
This example provides simple inference code for RT-DETR. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using RT-DETR with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
from ultralytics import RTDETR

# Load a COCO-pretrained RT-DETR-l model
model = RTDETR('rtdetr-l.pt')

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)

# Run inference with the RT-DETR-l model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
# Load a COCO-pretrained RT-DETR-l model and train it on the COCO8 example dataset for 100 epochs
yolo train model=rtdetr-l.pt data=coco8.yaml epochs=100 imgsz=640

# Load a COCO-pretrained RT-DETR-l model and run inference on the 'bus.jpg' image
yolo predict model=rtdetr-l.pt source=path/to/bus.jpg
```
|
||||
|
||||
### Supported Tasks
|
||||
|
||||
| Model Type          | Pre-trained Weights | Tasks Supported  |
|---------------------|---------------------|------------------|
| RT-DETR Large       | `rtdetr-l.pt`       | Object Detection |
| RT-DETR Extra-Large | `rtdetr-x.pt`       | Object Detection |
|
||||
|
||||
### Supported Modes
|
||||
|
||||
| Mode       | Supported          |
|------------|--------------------|
| Inference  | :heavy_check_mark: |
| Validation | :heavy_check_mark: |
| Training   | :heavy_check_mark: |
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you use Baidu's RT-DETR in your research or development work, please cite the [original paper](https://arxiv.org/abs/2304.08069):
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{lv2023detrs,
|
||||
title={DETRs Beat YOLOs on Real-time Object Detection},
|
||||
author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
|
||||
year={2023},
|
||||
eprint={2304.08069},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to acknowledge Baidu and the [PaddlePaddle](https://github.com/PaddlePaddle/PaddleDetection) team for creating and maintaining this valuable resource for the computer vision community. Their contribution to the field with the development of the Vision Transformers-based real-time object detector, RT-DETR, is greatly appreciated.
|
||||
|
||||
*Keywords: RT-DETR, Transformer, ViT, Vision Transformers, Baidu RT-DETR, PaddlePaddle, Paddle Paddle RT-DETR, real-time object detection, Vision Transformers-based object detection, pre-trained PaddlePaddle RT-DETR models, Baidu's RT-DETR usage, Ultralytics Python API*
|
||||
232
docs/en/models/sam.md
Normal file
|
|
@ -0,0 +1,232 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the cutting-edge Segment Anything Model (SAM) from Ultralytics that allows real-time image segmentation. Learn about its promptable segmentation, zero-shot performance, and how to use it.
|
||||
keywords: Ultralytics, image segmentation, Segment Anything Model, SAM, SA-1B dataset, real-time performance, zero-shot transfer, object detection, image analysis, machine learning
|
||||
---
|
||||
|
||||
# Segment Anything Model (SAM)
|
||||
|
||||
Welcome to the frontier of image segmentation with the Segment Anything Model, or SAM. This revolutionary model has changed the game by introducing promptable image segmentation with real-time performance, setting new standards in the field.
|
||||
|
||||
## Introduction to SAM: The Segment Anything Model
|
||||
|
||||
The Segment Anything Model, or SAM, is a cutting-edge image segmentation model that allows for promptable segmentation, providing unparalleled versatility in image analysis tasks. SAM forms the heart of the Segment Anything initiative, a groundbreaking project that introduces a novel model, task, and dataset for image segmentation.
|
||||
|
||||
SAM's advanced design allows it to adapt to new image distributions and tasks without prior knowledge, a feature known as zero-shot transfer. Trained on the expansive [SA-1B dataset](https://ai.facebook.com/datasets/segment-anything/), which contains more than 1 billion masks spread over 11 million carefully curated images, SAM has displayed impressive zero-shot performance, surpassing previous fully supervised results in many cases.
|
||||
|
||||

|
||||
Example images with overlaid masks from our newly introduced dataset, SA-1B. SA-1B contains 11M diverse, high-resolution, licensed, and privacy protecting images and 1.1B high-quality segmentation masks. These masks were annotated fully automatically by SAM, and as verified by human ratings and numerous experiments, are of high quality and diversity. Images are grouped by number of masks per image for visualization (there are ∼100 masks per image on average).
|
||||
|
||||
## Key Features of the Segment Anything Model (SAM)
|
||||
|
||||
- **Promptable Segmentation Task:** SAM was designed with a promptable segmentation task in mind, allowing it to generate valid segmentation masks from any given prompt, such as spatial or text clues identifying an object.
|
||||
- **Advanced Architecture:** The Segment Anything Model employs a powerful image encoder, a prompt encoder, and a lightweight mask decoder. This unique architecture enables flexible prompting, real-time mask computation, and ambiguity awareness in segmentation tasks.
|
||||
- **The SA-1B Dataset:** Introduced by the Segment Anything project, the SA-1B dataset features over 1 billion masks on 11 million images. As the largest segmentation dataset to date, it provides SAM with a diverse and large-scale training data source.
|
||||
- **Zero-Shot Performance:** SAM displays outstanding zero-shot performance across various segmentation tasks, making it a ready-to-use tool for diverse applications with minimal need for prompt engineering.
|
||||
|
||||
For an in-depth look at the Segment Anything Model and the SA-1B dataset, please visit the [Segment Anything website](https://segment-anything.com) and check out the research paper [Segment Anything](https://arxiv.org/abs/2304.02643).
|
||||
|
||||
## How to Use SAM: Versatility and Power in Image Segmentation
|
||||
|
||||
The Segment Anything Model can be employed for a multitude of downstream tasks that go beyond its training data. This includes edge detection, object proposal generation, instance segmentation, and preliminary text-to-mask prediction. With prompt engineering, SAM can swiftly adapt to new tasks and data distributions in a zero-shot manner, establishing it as a versatile and potent tool for all your image segmentation needs.
|
||||
|
||||
### SAM prediction example
|
||||
|
||||
!!! example "Segment with prompts"
|
||||
|
||||
Segment image with given prompts.
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
from ultralytics import SAM

# Load a model
model = SAM('sam_b.pt')

# Display model information (optional)
model.info()

# Run inference with bboxes prompt
model('ultralytics/assets/zidane.jpg', bboxes=[439, 437, 524, 709])

# Run inference with points prompt
model('ultralytics/assets/zidane.jpg', points=[900, 370], labels=[1])
```
|
||||
|
||||
!!! example "Segment everything"
|
||||
|
||||
Segment the whole image.
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
from ultralytics import SAM

# Load a model
model = SAM('sam_b.pt')

# Display model information (optional)
model.info()

# Run inference
model('path/to/image.jpg')
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Run inference with a SAM model
|
||||
yolo predict model=sam_b.pt source=path/to/image.jpg
|
||||
```
|
||||
|
||||
- If you don't pass any prompts (bboxes/points/masks), the whole image is segmented.
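The rule described above can be sketched in plain Python. This is only an illustration of the described behavior, not the library's actual code, and `choose_sam_mode` is a hypothetical name:

```python
def choose_sam_mode(bboxes=None, points=None, masks=None):
    """Return 'everything' when no prompt is supplied, otherwise 'prompt'."""
    if all(p is None for p in (bboxes, points, masks)):
        return "everything"
    return "prompt"


print(choose_sam_mode())  # everything
print(choose_sam_mode(bboxes=[439, 437, 524, 709]))  # prompt
print(choose_sam_mode(points=[900, 370]))  # prompt
```

Supplying any one of the three prompt types is enough to switch from whole-image segmentation to prompted segmentation.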
|
||||
|
||||
!!! example "SAMPredictor example"
|
||||
|
||||
This way you can set the image once and run prompt inference multiple times without re-running the image encoder each time.
|
||||
|
||||
=== "Prompt inference"
|
||||
|
||||
```python
import cv2

from ultralytics.models.sam import Predictor as SAMPredictor

# Create SAMPredictor
overrides = dict(conf=0.25, task='segment', mode='predict', imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)

# Set image
predictor.set_image("ultralytics/assets/zidane.jpg")  # set with image file
predictor.set_image(cv2.imread("ultralytics/assets/zidane.jpg"))  # set with np.ndarray
results = predictor(bboxes=[439, 437, 524, 709])
results = predictor(points=[900, 370], labels=[1])

# Reset image
predictor.reset_image()
```
|
||||
|
||||
Segment everything with additional args.
|
||||
|
||||
=== "Segment everything"
|
||||
|
||||
```python
from ultralytics.models.sam import Predictor as SAMPredictor

# Create SAMPredictor
overrides = dict(conf=0.25, task='segment', mode='predict', imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)

# Segment with additional args
results = predictor(source="ultralytics/assets/zidane.jpg", crop_n_layers=1, points_stride=64)
```
|
||||
|
||||
- For more `Segment everything` arguments, see the [`Predictor/generate` Reference](../reference/models/sam/predict.md).
|
||||
|
||||
## Available Models and Supported Tasks
|
||||
|
||||
| Model Type | Pre-trained Weights | Tasks Supported       |
|------------|---------------------|-----------------------|
| SAM base   | `sam_b.pt`          | Instance Segmentation |
| SAM large  | `sam_l.pt`          | Instance Segmentation |
|
||||
|
||||
## Operating Modes
|
||||
|
||||
| Mode       | Supported          |
|------------|--------------------|
| Inference  | :heavy_check_mark: |
| Validation | :x:                |
| Training   | :x:                |
|
||||
|
||||
## SAM comparison vs YOLOv8
|
||||
|
||||
Here we compare Meta's smallest SAM model, SAM-b, with Ultralytics' smallest segmentation model, [YOLOv8n-seg](../tasks/segment.md):
|
||||
|
||||
| Model                                          | Size                       | Parameters             | Speed (CPU)                |
|------------------------------------------------|----------------------------|------------------------|----------------------------|
| Meta's SAM-b                                   | 358 MB                     | 94.7 M                 | 51096 ms/im                |
| [MobileSAM](mobile-sam.md)                     | 40.7 MB                    | 10.1 M                 | 46122 ms/im                |
| [FastSAM-s](fast-sam.md) with YOLOv8 backbone  | 23.7 MB                    | 11.8 M                 | 115 ms/im                  |
| Ultralytics [YOLOv8n-seg](../tasks/segment.md) | **6.7 MB** (53.4x smaller) | **3.4 M** (27.9x less) | **59 ms/im** (866x faster) |
|
||||
|
||||
This comparison shows the order-of-magnitude differences in the model sizes and speeds between models. Whereas SAM presents unique capabilities for automatic segmenting, it is not a direct competitor to YOLOv8 segment models, which are smaller, faster and more efficient.
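The ratios quoted for YOLOv8n-seg can be reproduced from the table values. The sketch below uses only the numbers reported in the table; the dictionary layout and variable names are illustrative:

```python
# (size in MB, parameters in millions, CPU speed in ms/image), from the comparison table
models = {
    "SAM-b": (358, 94.7, 51096),
    "MobileSAM": (40.7, 10.1, 46122),
    "FastSAM-s": (23.7, 11.8, 115),
    "YOLOv8n-seg": (6.7, 3.4, 59),
}

base = models["SAM-b"]
yolo = models["YOLOv8n-seg"]

size_ratio = base[0] / yolo[0]
param_ratio = base[1] / yolo[1]
speed_ratio = base[2] / yolo[2]

print(f"{size_ratio:.1f}x smaller, {param_ratio:.1f}x fewer parameters, {speed_ratio:.0f}x faster")
```

All three ratios are computed relative to SAM-b, which is why they match the parenthetical figures in the YOLOv8n-seg row.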
|
||||
|
||||
Tests were run on a 2023 Apple M2 MacBook with 16GB of RAM. To reproduce this test:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
from ultralytics import FastSAM, SAM, YOLO

# Profile SAM-b
model = SAM('sam_b.pt')
model.info()
model('ultralytics/assets')

# Profile MobileSAM
model = SAM('mobile_sam.pt')
model.info()
model('ultralytics/assets')

# Profile FastSAM-s
model = FastSAM('FastSAM-s.pt')
model.info()
model('ultralytics/assets')

# Profile YOLOv8n-seg
model = YOLO('yolov8n-seg.pt')
model.info()
model('ultralytics/assets')
```
|
||||
|
||||
## Auto-Annotation: A Quick Path to Segmentation Datasets
|
||||
|
||||
Auto-annotation is a key feature of SAM, allowing users to generate a [segmentation dataset](https://docs.ultralytics.com/datasets/segment) using a pre-trained detection model. This feature enables rapid and accurate annotation of a large number of images, bypassing the need for time-consuming manual labeling.
|
||||
|
||||
### Generate Your Segmentation Dataset Using a Detection Model
|
||||
|
||||
To auto-annotate your dataset with the Ultralytics framework, use the `auto_annotate` function as shown below:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
from ultralytics.data.annotator import auto_annotate

auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model='sam_b.pt')
```
|
||||
|
||||
| Argument   | Type                | Description                                                                                             | Default      |
|------------|---------------------|---------------------------------------------------------------------------------------------------------|--------------|
| data       | str                 | Path to a folder containing images to be annotated.                                                     |              |
| det_model  | str, optional       | Pre-trained YOLO detection model. Defaults to 'yolov8x.pt'.                                             | 'yolov8x.pt' |
| sam_model  | str, optional       | Pre-trained SAM segmentation model. Defaults to 'sam_b.pt'.                                             | 'sam_b.pt'   |
| device     | str, optional       | Device to run the models on. Defaults to an empty string (CPU or GPU, if available).                    |              |
| output_dir | str, None, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | None         |
|
||||
|
||||
The `auto_annotate` function takes the path to your images, with optional arguments for specifying the pre-trained detection and SAM segmentation models, the device to run the models on, and the output directory for saving the annotated results.
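Once annotation has run, the generated label files can be inspected with a few lines of Python. The sketch below assumes each label line follows the YOLO segmentation text format (a class index followed by normalized `x y` polygon coordinates); that format, the sample line, and the helper name `parse_segment_label` are assumptions made for illustration:

```python
def parse_segment_label(line):
    """Split one label line into a class index and a list of (x, y) polygon points."""
    values = line.split()
    class_id = int(values[0])
    coords = [float(v) for v in values[1:]]
    if len(coords) % 2 != 0:
        raise ValueError("polygon coordinates must come in x, y pairs")
    # Pair up alternating x and y values
    points = list(zip(coords[0::2], coords[1::2]))
    return class_id, points


class_id, points = parse_segment_label("0 0.10 0.20 0.40 0.20 0.40 0.60")
print(class_id, points)  # 0 [(0.1, 0.2), (0.4, 0.2), (0.4, 0.6)]
```

A check like this is a cheap way to spot empty or malformed polygons before training on the auto-annotated data.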
|
||||
|
||||
Auto-annotation with pre-trained models can dramatically cut down the time and effort required for creating high-quality segmentation datasets. This feature is especially beneficial for researchers and developers dealing with large image collections, as it allows them to focus on model development and evaluation rather than manual annotation.
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you find SAM useful in your research or development work, please consider citing the original paper:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{kirillov2023segment,
|
||||
title={Segment Anything},
|
||||
author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
|
||||
year={2023},
|
||||
eprint={2304.02643},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
We would like to express our gratitude to Meta AI for creating and maintaining this valuable resource for the computer vision community.
|
||||
|
||||
*keywords: Segment Anything, Segment Anything Model, SAM, Meta SAM, image segmentation, promptable segmentation, zero-shot performance, SA-1B dataset, advanced architecture, auto-annotation, Ultralytics, pre-trained models, SAM base, SAM large, instance segmentation, computer vision, AI, artificial intelligence, machine learning, data annotation, segmentation masks, detection model, YOLO detection model, bibtex, Meta AI.*
|
||||
127
docs/en/models/yolo-nas.md
Normal file
|
|
@ -0,0 +1,127 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore detailed documentation of YOLO-NAS, a superior object detection model. Learn about its features, pre-trained models, usage with Ultralytics Python API, and more.
|
||||
keywords: YOLO-NAS, Deci AI, object detection, deep learning, neural architecture search, Ultralytics Python API, YOLO model, pre-trained models, quantization, optimization, COCO, Objects365, Roboflow 100
|
||||
---
|
||||
|
||||
# YOLO-NAS
|
||||
|
||||
## Overview
|
||||
|
||||
Developed by Deci AI, YOLO-NAS is a groundbreaking object detection foundational model. It is the product of advanced Neural Architecture Search technology, meticulously designed to address the limitations of previous YOLO models. With significant improvements in quantization support and accuracy-latency trade-offs, YOLO-NAS represents a major leap in object detection.
|
||||
|
||||

|
||||
**Overview of YOLO-NAS.** YOLO-NAS employs quantization-aware blocks and selective quantization for optimal performance. The model, when converted to its INT8 quantized version, experiences a minimal precision drop, a significant improvement over other models. These advancements culminate in a superior architecture with unprecedented object detection capabilities and outstanding performance.
|
||||
|
||||
### Key Features
|
||||
|
||||
- **Quantization-Friendly Basic Block:** YOLO-NAS introduces a new basic block that is friendly to quantization, addressing one of the significant limitations of previous YOLO models.
|
||||
- **Sophisticated Training and Quantization:** YOLO-NAS leverages advanced training schemes and post-training quantization to enhance performance.
|
||||
- **AutoNAC Optimization and Pre-training:** YOLO-NAS utilizes AutoNAC optimization and is pre-trained on prominent datasets such as COCO, Objects365, and Roboflow 100. This pre-training makes it extremely suitable for downstream object detection tasks in production environments.
|
||||
|
||||
## Pre-trained Models
|
||||
|
||||
Experience the power of next-generation object detection with the pre-trained YOLO-NAS models provided by Ultralytics. These models are designed to deliver top-notch performance in terms of both speed and accuracy. Choose from a variety of options tailored to your specific needs:
|
||||
|
||||
| Model | mAP | Latency (ms) |
|
||||
|------------------|-------|--------------|
|
||||
| YOLO-NAS S | 47.5 | 3.21 |
|
||||
| YOLO-NAS M | 51.55 | 5.85 |
|
||||
| YOLO-NAS L | 52.22 | 7.87 |
|
||||
| YOLO-NAS S INT-8 | 47.03 | 2.36 |
|
||||
| YOLO-NAS M INT-8 | 51.0 | 3.78 |
|
||||
| YOLO-NAS L INT-8 | 52.1 | 4.78 |
|
||||
|
||||
Each model variant is designed to offer a balance between Mean Average Precision (mAP) and latency, helping you optimize your object detection tasks for both performance and speed.
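As a rough illustration of that trade-off, the sketch below picks the most accurate variant that fits a latency budget. The figures are copied from the table above; the `pick_variant` helper and the budget values are hypothetical and are not part of the `ultralytics` package.

```python
# (name, mAP, latency in ms), copied from the table above.
YOLO_NAS_VARIANTS = [
    ("YOLO-NAS S", 47.5, 3.21),
    ("YOLO-NAS M", 51.55, 5.85),
    ("YOLO-NAS L", 52.22, 7.87),
    ("YOLO-NAS S INT-8", 47.03, 2.36),
    ("YOLO-NAS M INT-8", 51.0, 3.78),
    ("YOLO-NAS L INT-8", 52.1, 4.78),
]


def pick_variant(max_latency_ms):
    """Return the highest-mAP variant within the latency budget, or None.

    Hypothetical helper for illustration only.
    """
    candidates = [v for v in YOLO_NAS_VARIANTS if v[2] <= max_latency_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda v: v[1])


if __name__ == "__main__":
    print(pick_variant(5.0))
```

With a 5 ms budget the quantized large model is selected, because INT-8 quantization cuts its latency while costing only about 0.1 mAP in the table.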
|
||||
|
||||
## Usage
|
||||
|
||||
Ultralytics has made YOLO-NAS models easy to integrate into your Python applications via our `ultralytics` Python package. The package provides a user-friendly Python API to streamline the process.
|
||||
|
||||
The following examples show how to use YOLO-NAS models with the `ultralytics` package for inference and validation:
|
||||
|
||||
### Inference and Validation Examples
|
||||
|
||||
In this example we validate YOLO-NAS-s on the COCO8 dataset.
|
||||
|
||||
!!! example ""
|
||||
|
||||
This example provides simple inference and validation code for YOLO-NAS. For handling inference results see [Predict](../modes/predict.md) mode. For using YOLO-NAS with additional modes see [Val](../modes/val.md) and [Export](../modes/export.md). YOLO-NAS on the `ultralytics` package does not support training.
|
||||
|
||||
=== "Python"
|
||||
|
||||
Pretrained PyTorch `*.pt` model files can be passed to the `NAS()` class to create a model instance in Python:
|
||||
|
||||
```python
|
||||
from ultralytics import NAS
|
||||
|
||||
# Load a COCO-pretrained YOLO-NAS-s model
|
||||
model = NAS('yolo_nas_s.pt')
|
||||
|
||||
# Display model information (optional)
|
||||
model.info()
|
||||
|
||||
# Validate the model on the COCO8 example dataset
|
||||
results = model.val(data='coco8.yaml')
|
||||
|
||||
# Run inference with the YOLO-NAS-s model on the 'bus.jpg' image
|
||||
results = model('path/to/bus.jpg')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
CLI commands are available to directly run the models:
|
||||
|
||||
```bash
|
||||
# Load a COCO-pretrained YOLO-NAS-s model and validate its performance on the COCO8 example dataset
|
||||
yolo val model=yolo_nas_s.pt data=coco8.yaml
|
||||
|
||||
# Load a COCO-pretrained YOLO-NAS-s model and run inference on the 'bus.jpg' image
|
||||
yolo predict model=yolo_nas_s.pt source=path/to/bus.jpg
|
||||
```
|
||||
|
||||
### Supported Tasks
|
||||
|
||||
The YOLO-NAS models are primarily designed for object detection tasks. You can download the pre-trained weights for each variant of the model as follows:
|
||||
|
||||
| Model Type | Pre-trained Weights | Tasks Supported |
|
||||
|------------|-----------------------------------------------------------------------------------------------|------------------|
|
||||
| YOLO-NAS-s | [yolo_nas_s.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolo_nas_s.pt) | Object Detection |
|
||||
| YOLO-NAS-m | [yolo_nas_m.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolo_nas_m.pt) | Object Detection |
|
||||
| YOLO-NAS-l | [yolo_nas_l.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolo_nas_l.pt) | Object Detection |
|
||||
|
||||
### Supported Modes
|
||||
|
||||
The YOLO-NAS models support both inference and validation modes, allowing you to predict and validate results with ease. Training mode, however, is currently not supported.
|
||||
|
||||
| Mode | Supported |
|
||||
|------------|--------------------|
|
||||
| Inference | :heavy_check_mark: |
|
||||
| Validation | :heavy_check_mark: |
|
||||
| Training | :x: |
|
||||
|
||||
Harness the power of the YOLO-NAS models to drive your object detection tasks to new heights of performance and speed.
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you employ YOLO-NAS in your research or development work, please cite SuperGradients:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{supergradients,
|
||||
doi = {10.5281/ZENODO.7789328},
|
||||
url = {https://zenodo.org/record/7789328},
|
||||
author = {Aharon, Shay and {Louis-Dupont} and {Ofri Masad} and Yurkova, Kate and {Lotem Fridman} and {Lkdci} and Khvedchenya, Eugene and Rubin, Ran and Bagrov, Natan and Tymchenko, Borys and Keren, Tomer and Zhilko, Alexander and {Eran-Deci}},
|
||||
title = {Super-Gradients},
|
||||
publisher = {GitHub},
|
||||
journal = {GitHub repository},
|
||||
year = {2021},
|
||||
}
|
||||
```
|
||||
|
||||
We express our gratitude to Deci AI's [SuperGradients](https://github.com/Deci-AI/super-gradients/) team for their efforts in creating and maintaining this valuable resource for the computer vision community. We believe YOLO-NAS, with its innovative architecture and superior object detection capabilities, will become a critical tool for developers and researchers alike.
|
||||
|
||||
*Keywords: YOLO-NAS, Deci AI, object detection, deep learning, neural architecture search, Ultralytics Python API, YOLO model, SuperGradients, pre-trained models, quantization-friendly basic block, advanced training schemes, post-training quantization, AutoNAC optimization, COCO, Objects365, Roboflow 100*
|
||||
107
docs/en/models/yolov3.md
Normal file
|
|
@ -0,0 +1,107 @@
|
|||
---
|
||||
comments: true
|
||||
description: Get an overview of YOLOv3, YOLOv3-Ultralytics and YOLOv3u. Learn about their key features, usage, and supported tasks for object detection.
|
||||
keywords: YOLOv3, YOLOv3-Ultralytics, YOLOv3u, Object Detection, Inference, Training, Ultralytics
|
||||
---
|
||||
|
||||
# YOLOv3, YOLOv3-Ultralytics, and YOLOv3u
|
||||
|
||||
## Overview
|
||||
|
||||
This document presents an overview of three closely related object detection models, namely [YOLOv3](https://pjreddie.com/darknet/yolo/), [YOLOv3-Ultralytics](https://github.com/ultralytics/yolov3), and [YOLOv3u](https://github.com/ultralytics/ultralytics).
|
||||
|
||||
1. **YOLOv3:** This is the third version of the You Only Look Once (YOLO) object detection algorithm. Originally developed by Joseph Redmon, YOLOv3 improved on its predecessors by introducing features such as multiscale predictions and three different sizes of detection kernels.
|
||||
|
||||
2. **YOLOv3-Ultralytics:** This is Ultralytics' implementation of the YOLOv3 model. It reproduces the original YOLOv3 architecture and offers additional functionalities, such as support for more pre-trained models and easier customization options.
|
||||
|
||||
3. **YOLOv3u:** This is an updated version of YOLOv3-Ultralytics that incorporates the anchor-free, objectness-free split head used in YOLOv8 models. YOLOv3u maintains the same backbone and neck architecture as YOLOv3 but with the updated detection head from YOLOv8.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- **YOLOv3:** Introduced detection at three different scales, predicting on feature-map grids of 13x13, 26x26, and 52x52 cells (for a 416x416 input). This significantly improved detection accuracy for objects of different sizes. Additionally, YOLOv3 added features such as multi-label predictions for each bounding box and a better feature extractor network.
|
||||
|
||||
- **YOLOv3-Ultralytics:** Ultralytics' implementation of YOLOv3 provides the same performance as the original model but comes with added support for more pre-trained models, additional training methods, and easier customization options. This makes it more versatile and user-friendly for practical applications.
|
||||
|
||||
- **YOLOv3u:** This updated model incorporates the anchor-free, objectness-free split head from YOLOv8. By eliminating the need for pre-defined anchor boxes and objectness scores, this detection head design can improve the model's ability to detect objects of varying sizes and shapes. This makes YOLOv3u more robust and accurate for object detection tasks.
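The 13x13, 26x26, and 52x52 figures in the YOLOv3 bullet follow from the network's three output strides. A minimal sketch, assuming the standard YOLOv3 strides of 32, 16, and 8 and a square input (the `grid_sizes` helper is hypothetical, for illustration only):

```python
def grid_sizes(input_size, strides=(32, 16, 8)):
    """Return the side length of the prediction grid at each output stride."""
    return [input_size // stride for stride in strides]


if __name__ == "__main__":
    print(grid_sizes(416))  # [13, 26, 52]
    print(grid_sizes(640))  # [20, 40, 80]
```

The coarse grid handles large objects and the fine grid handles small ones, which is why predicting at all three scales helps across object sizes.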
|
||||
|
||||
## Supported Tasks
|
||||
|
||||
YOLOv3, YOLOv3-Ultralytics, and YOLOv3u all support the following tasks:
|
||||
|
||||
- Object Detection
|
||||
|
||||
## Supported Modes
|
||||
|
||||
All three models support the following modes:
|
||||
|
||||
- Inference
|
||||
- Validation
|
||||
- Training
|
||||
- Export
|
||||
|
||||
## Performance
|
||||
|
||||
Below is a comparison of the performance of the three models. The performance is measured in terms of the Mean Average Precision (mAP) on the COCO dataset:
|
||||
|
||||
TODO
|
||||
|
||||
## Usage
|
||||
|
||||
You can use YOLOv3 for object detection tasks using the Ultralytics repository. The following sample code shows how to load a YOLOv3 model, train it, and run inference:
|
||||
|
||||
!!! example ""
|
||||
|
||||
This example provides simple inference code for YOLOv3. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv3 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
|
||||
|
||||
=== "Python"
|
||||
|
||||
Pretrained PyTorch `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in Python:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a COCO-pretrained YOLOv3n model
|
||||
model = YOLO('yolov3n.pt')
|
||||
|
||||
# Display model information (optional)
|
||||
model.info()
|
||||
|
||||
# Train the model on the COCO8 example dataset for 100 epochs
|
||||
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
|
||||
|
||||
# Run inference with the YOLOv3n model on the 'bus.jpg' image
|
||||
results = model('path/to/bus.jpg')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
CLI commands are available to directly run the models:
|
||||
|
||||
```bash
|
||||
# Load a COCO-pretrained YOLOv3n model and train it on the COCO8 example dataset for 100 epochs
|
||||
yolo train model=yolov3n.pt data=coco8.yaml epochs=100 imgsz=640
|
||||
|
||||
# Load a COCO-pretrained YOLOv3n model and run inference on the 'bus.jpg' image
|
||||
yolo predict model=yolov3n.pt source=path/to/bus.jpg
|
||||
```
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you use YOLOv3 in your research, please cite the original YOLO papers and the Ultralytics YOLOv3 repository:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@article{redmon2018yolov3,
|
||||
title={YOLOv3: An Incremental Improvement},
|
||||
author={Redmon, Joseph and Farhadi, Ali},
|
||||
journal={arXiv preprint arXiv:1804.02767},
|
||||
year={2018}
|
||||
}
|
||||
```
|
||||
|
||||
Thank you to Joseph Redmon and Ali Farhadi for developing the original YOLOv3.
|
||||
71
docs/en/models/yolov4.md
Normal file
|
|
@ -0,0 +1,71 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore our detailed guide on YOLOv4, a state-of-the-art real-time object detector. Understand its architectural highlights, innovative features, and application examples.
|
||||
keywords: ultralytics, YOLOv4, object detection, neural network, real-time detection, object detector, machine learning
|
||||
---
|
||||
|
||||
# YOLOv4: High-Speed and Precise Object Detection
|
||||
|
||||
Welcome to the Ultralytics documentation page for YOLOv4, a state-of-the-art, real-time object detector launched in 2020 by Alexey Bochkovskiy at [https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet). YOLOv4 is designed to provide the optimal balance between speed and accuracy, making it an excellent choice for many applications.
|
||||
|
||||

|
||||
**YOLOv4 architecture diagram**. Showcasing the intricate network design of YOLOv4, including the backbone, neck, and head components, and their interconnected layers for optimal real-time object detection.
|
||||
|
||||
## Introduction
|
||||
|
||||
YOLOv4 stands for You Only Look Once version 4. It is a real-time object detection model developed to address the limitations of previous YOLO versions like [YOLOv3](./yolov3.md) and other object detection models. Whereas many convolutional neural network (CNN) based object detectors are practical only for recommendation systems, YOLOv4 is also intended for standalone process management and human input reduction. It runs in real time on a single conventional graphics processing unit (GPU) and needs only one such GPU for training, which makes it affordable for mass usage.
|
||||
|
||||
## Architecture
|
||||
|
||||
YOLOv4 makes use of several innovative features that work together to optimize its performance. These include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT), Mish-activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. These features are combined to achieve state-of-the-art results.
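Of the features listed above, CIoU loss is the easiest to show in a few lines. The sketch below follows the commonly published definition (IoU minus a normalized center-distance term and an aspect-ratio term). It is a plain-Python illustration and not the YOLOv4 implementation; the `(x1, y1, x2, y2)` box format and the `eps` stabilizer are assumptions made for this example.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU between two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    iou = inter / (wa * ha + wb * hb - inter + eps)

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + (
        (ay1 + ay2) / 2 - (by1 + by2) / 2
    ) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (
        max(ay2, by2) - min(ay1, by1)
    ) ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi**2) * (
        math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)

    return iou - rho2 / c2 - alpha * v


def ciou_loss(box_a, box_b):
    """Loss is 1 - CIoU, so perfectly matching boxes give a loss near 0."""
    return 1.0 - ciou(box_a, box_b)


if __name__ == "__main__":
    print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))
```

Unlike plain IoU, the loss still changes when the boxes do not overlap, because the center-distance term stays informative.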
|
||||
|
||||
A typical object detector is composed of several parts: the input, the backbone, the neck, and the head. The backbone is pre-trained on ImageNet and extracts features from the input image; in general it can be a network such as VGG, ResNet, ResNeXt, or DenseNet. The neck collects feature maps from different stages and usually includes several bottom-up and top-down paths. The head uses these features to predict the classes and bounding boxes of objects.
|
||||
|
||||
## Bag of Freebies
|
||||
|
||||
YOLOv4 also makes use of methods known as "bag of freebies," which are techniques that improve the accuracy of the model during training without increasing the cost of inference. Data augmentation is a common bag of freebies technique used in object detection, which increases the variability of the input images to improve the robustness of the model. Some examples of data augmentation include photometric distortions (adjusting the brightness, contrast, hue, saturation, and noise of an image) and geometric distortions (adding random scaling, cropping, flipping, and rotating). These techniques help the model to generalize better to different types of images.
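The two kinds of distortion described above can be sketched in plain Python. This is an illustrative toy and not the Darknet augmentation pipeline: the image is assumed to be a list of rows of 0-255 grayscale values, boxes are assumed to be `(x1, y1, x2, y2)` pixel coordinates, and both helper names are hypothetical.

```python
def hflip(image, boxes):
    """Geometric distortion: mirror the image left-right and move the boxes with it."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    flipped_boxes = [(width - x2, y1, width - x1, y2) for x1, y1, x2, y2 in boxes]
    return flipped, flipped_boxes


def adjust_brightness(image, factor):
    """Photometric distortion: scale pixel values, clamped to 0-255. Boxes are unaffected."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


if __name__ == "__main__":
    image = [[10, 20, 30, 40]]
    boxes = [(0, 0, 1, 1)]
    print(hflip(image, boxes))
    print(adjust_brightness(image, 2.0))
```

Geometric distortions must update the labels along with the pixels, while photometric ones leave the labels untouched. Both happen only at training time, which is why they add no inference cost.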
|
||||
|
||||
## Features and Performance
|
||||
|
||||
YOLOv4 is designed for optimal speed and accuracy in object detection. The architecture of YOLOv4 includes CSPDarknet53 as the backbone, PANet as the neck, and YOLOv3 as the detection head. This design allows YOLOv4 to perform object detection at an impressive speed, making it suitable for real-time applications. YOLOv4 also excels in accuracy, achieving state-of-the-art results in object detection benchmarks.
|
||||
|
||||
## Usage Examples
|
||||
|
||||
At the time of writing, Ultralytics does not support YOLOv4 models. Users interested in YOLOv4 will need to refer directly to the YOLOv4 GitHub repository for installation and usage instructions.
|
||||
|
||||
Here is a brief overview of the typical steps you might take to use YOLOv4:
|
||||
|
||||
1. Visit the YOLOv4 GitHub repository: [https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet).
|
||||
|
||||
2. Follow the instructions provided in the README file for installation. This typically involves cloning the repository, installing necessary dependencies, and setting up any necessary environment variables.
|
||||
|
||||
3. Once installation is complete, you can train and use the model as per the usage instructions provided in the repository. This usually involves preparing your dataset, configuring the model parameters, training the model, and then using the trained model to perform object detection.
|
||||
|
||||
Please note that the exact steps may vary depending on your use case and the current state of the YOLOv4 repository. It is therefore strongly recommended to follow the instructions provided in the YOLOv4 GitHub repository.
|
||||
|
||||
We regret any inconvenience this may cause and will strive to update this document with usage examples for Ultralytics once support for YOLOv4 is implemented.
|
||||
|
||||
## Conclusion
|
||||
|
||||
YOLOv4 is a powerful and efficient object detection model that strikes a balance between speed and accuracy. Its use of unique features and bag of freebies techniques during training allows it to perform excellently in real-time object detection tasks. YOLOv4 can be trained and used by anyone with a conventional GPU, making it accessible and practical for a wide range of applications.
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
We would like to acknowledge the YOLOv4 authors for their significant contributions in the field of real-time object detection:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{bochkovskiy2020yolov4,
|
||||
title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
|
||||
author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
|
||||
year={2020},
|
||||
eprint={2004.10934},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
The original YOLOv4 paper can be found on [arXiv](https://arxiv.org/pdf/2004.10934.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/AlexeyAB/darknet). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
|
||||
115
docs/en/models/yolov5.md
Normal file
|
|
@ -0,0 +1,115 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover YOLOv5u, a boosted version of the YOLOv5 model featuring an improved accuracy-speed tradeoff and numerous pre-trained models for various object detection tasks.
|
||||
keywords: YOLOv5u, object detection, pre-trained models, Ultralytics, Inference, Validation, YOLOv5, YOLOv8, anchor-free, objectness-free, real-time applications, machine learning
|
||||
---
|
||||
|
||||
# YOLOv5
|
||||
|
||||
## Overview
|
||||
|
||||
YOLOv5u represents an advancement in object detection methodologies. Originating from the foundational architecture of the [YOLOv5](https://github.com/ultralytics/yolov5) model developed by Ultralytics, YOLOv5u integrates the anchor-free, objectness-free split head, a feature previously introduced in the [YOLOv8](./yolov8.md) models. This adaptation refines the model's architecture, leading to an improved accuracy-speed tradeoff in object detection tasks. Given the empirical results and its derived features, YOLOv5u provides an efficient alternative for those seeking robust solutions in both research and practical applications.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- **Anchor-free Split Ultralytics Head:** Traditional object detection models rely on predefined anchor boxes to predict object locations. However, YOLOv5u modernizes this approach. By adopting an anchor-free split Ultralytics head, it ensures a more flexible and adaptive detection mechanism, consequently enhancing the performance in diverse scenarios.
|
||||
|
||||
- **Optimized Accuracy-Speed Tradeoff:** Speed and accuracy often pull in opposite directions. But YOLOv5u challenges this tradeoff. It offers a calibrated balance, ensuring real-time detections without compromising on accuracy. This feature is particularly invaluable for applications that demand swift responses, such as autonomous vehicles, robotics, and real-time video analytics.
|
||||
|
||||
- **Variety of Pre-trained Models:** YOLOv5u provides pre-trained models at several scales, from nano (`yolov5nu`) to extra-large (`yolov5xu`), plus P6 variants (such as `yolov5n6u`) for 1280-pixel input. All of them can be used for Inference, Validation, and Training, so you can choose the size that matches your accuracy and latency requirements rather than relying on a one-size-fits-all solution.
|
||||
|
||||
## Supported Tasks
|
||||
|
||||
| Model Type | Pre-trained Weights | Task |
|
||||
|------------|-----------------------------------------------------------------------------------------------------------------------------|-----------|
|
||||
| YOLOv5u | `yolov5nu`, `yolov5su`, `yolov5mu`, `yolov5lu`, `yolov5xu`, `yolov5n6u`, `yolov5s6u`, `yolov5m6u`, `yolov5l6u`, `yolov5x6u` | Detection |
|
||||
|
||||
## Supported Modes
|
||||
|
||||
| Mode | Supported |
|
||||
|------------|--------------------|
|
||||
| Inference | :heavy_check_mark: |
|
||||
| Validation | :heavy_check_mark: |
|
||||
| Training | :heavy_check_mark: |
|
||||
|
||||
!!! Performance
|
||||
|
||||
=== "Detection"
|
||||
|
||||
| Model | YAML | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|
||||
|---------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|-----------------------|----------------------|--------------------------------|-------------------------------------|--------------------|-------------------|
|
||||
| [yolov5nu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5nu.pt) | [yolov5n.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 34.3 | 73.6 | 1.06 | 2.6 | 7.7 |
|
||||
| [yolov5su.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5su.pt) | [yolov5s.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 43.0 | 120.7 | 1.27 | 9.1 | 24.0 |
|
||||
| [yolov5mu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5mu.pt) | [yolov5m.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 49.0 | 233.9 | 1.86 | 25.1 | 64.2 |
|
||||
| [yolov5lu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5lu.pt) | [yolov5l.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 52.2 | 408.4 | 2.50 | 53.2 | 135.0 |
|
||||
| [yolov5xu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5xu.pt) | [yolov5x.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 53.2 | 763.2 | 3.81 | 97.2 | 246.4 |
|
||||
| | | | | | | | |
|
||||
| [yolov5n6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5n6u.pt) | [yolov5n6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 42.1 | 211.0 | 1.83 | 4.3 | 7.8 |
|
||||
| [yolov5s6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5s6u.pt) | [yolov5s6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 48.6 | 422.6 | 2.34 | 15.3 | 24.6 |
|
||||
| [yolov5m6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5m6u.pt) | [yolov5m6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 53.6 | 810.9 | 4.36 | 41.2 | 65.7 |
|
||||
| [yolov5l6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5l6u.pt) | [yolov5l6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 55.7 | 1470.9 | 5.47 | 86.1 | 137.4 |
|
||||
| [yolov5x6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5x6u.pt) | [yolov5x6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 56.8 | 2436.5 | 8.98 | 155.4 | 250.7 |
|
||||
|
||||
## Usage
|
||||
|
||||
You can use YOLOv5u for object detection tasks using the Ultralytics repository. The following sample code shows how to load a YOLOv5u model, train it, and run inference:
|
||||
|
||||
!!! example ""
|
||||
|
||||
This example provides simple inference code for YOLOv5. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv5 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
|
||||
|
||||
=== "Python"
|
||||
|
||||
Pretrained PyTorch `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in Python:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a COCO-pretrained YOLOv5n model
|
||||
model = YOLO('yolov5n.pt')
|
||||
|
||||
# Display model information (optional)
|
||||
model.info()
|
||||
|
||||
# Train the model on the COCO8 example dataset for 100 epochs
|
||||
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
|
||||
|
||||
# Run inference with the YOLOv5n model on the 'bus.jpg' image
|
||||
results = model('path/to/bus.jpg')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
CLI commands are available to directly run the models:
|
||||
|
||||
```bash
|
||||
# Load a COCO-pretrained YOLOv5n model and train it on the COCO8 example dataset for 100 epochs
|
||||
yolo train model=yolov5n.pt data=coco8.yaml epochs=100 imgsz=640
|
||||
|
||||
# Load a COCO-pretrained YOLOv5n model and run inference on the 'bus.jpg' image
|
||||
yolo predict model=yolov5n.pt source=path/to/bus.jpg
|
||||
```
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you use YOLOv5 or YOLOv5u in your research, please cite the Ultralytics YOLOv5 repository as follows:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
```bibtex
|
||||
@software{yolov5,
|
||||
title = {Ultralytics YOLOv5},
|
||||
author = {Glenn Jocher},
|
||||
year = {2020},
|
||||
version = {7.0},
|
||||
license = {AGPL-3.0},
|
||||
url = {https://github.com/ultralytics/yolov5},
|
||||
doi = {10.5281/zenodo.3908559},
|
||||
orcid = {0000-0001-5950-6979}
|
||||
}
|
||||
```
|
||||
|
||||
Special thanks to Glenn Jocher and the Ultralytics team for their work on developing and maintaining the YOLOv5 and YOLOv5u models.
|
||||
113
docs/en/models/yolov6.md
Normal file
|
|
@ -0,0 +1,113 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore Meituan YOLOv6, a state-of-the-art object detection model striking a balance between speed and accuracy. Dive into features, pre-trained models, and Python usage.
|
||||
keywords: Meituan YOLOv6, object detection, Ultralytics, YOLOv6 docs, Bi-directional Concatenation, Anchor-Aided Training, pretrained models, real-time applications
|
||||
---
|
||||
|
||||
# Meituan YOLOv6
|
||||
|
||||
## Overview
|
||||
|
||||
[Meituan](https://about.meituan.com/) YOLOv6 is a cutting-edge object detector that offers a remarkable balance between speed and accuracy, making it a popular choice for real-time applications. This model introduces several notable enhancements to its architecture and training scheme, including a Bi-directional Concatenation (BiC) module, an anchor-aided training (AAT) strategy, and an improved backbone and neck design for state-of-the-art accuracy on the COCO dataset.
|
||||
|
||||

|
||||

|
||||
**Overview of YOLOv6.** Model architecture diagram showing the redesigned network components and training strategies that have led to significant performance improvements. (a) The neck of YOLOv6 (N and S are shown). Note for M/L, RepBlocks is replaced with CSPStackRep. (b) The structure of a BiC module. (c) A SimCSPSPPF block. ([source](https://arxiv.org/pdf/2301.05586.pdf)).
|
||||
|
||||
### Key Features
|
||||
|
||||
- **Bi-directional Concatenation (BiC) Module:** YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation.
|
||||
- **Anchor-Aided Training (AAT) Strategy:** This model proposes AAT to enjoy the benefits of both anchor-based and anchor-free paradigms without compromising inference efficiency.
|
||||
- **Enhanced Backbone and Neck Design:** By deepening YOLOv6 to include another stage in the backbone and neck, this model achieves state-of-the-art performance on the COCO dataset at high-resolution input.
|
||||
- **Self-Distillation Strategy:** A new self-distillation strategy is implemented to boost the performance of smaller models of YOLOv6, enhancing the auxiliary regression branch during training and removing it at inference to avoid a marked speed decline.
|
||||
|
||||
## Pre-trained Models
|
||||
|
||||
YOLOv6 provides various pre-trained models with different scales:
|
||||
|
||||
- YOLOv6-N: 37.5% AP on COCO val2017 at 1187 FPS with NVIDIA Tesla T4 GPU.
|
||||
- YOLOv6-S: 45.0% AP at 484 FPS.
|
||||
- YOLOv6-M: 50.0% AP at 226 FPS.
|
||||
- YOLOv6-L: 52.8% AP at 116 FPS.
|
||||
- YOLOv6-L6: State-of-the-art accuracy in real-time.
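To compare these throughput figures with the per-image latencies quoted for other models, FPS can be converted to average milliseconds per image. A minimal sketch using the numbers from the list above; note that if the reported FPS was measured with batched inference (an assumption not confirmed here), the result is an average time per image and not a single-image latency.

```python
# FPS figures copied from the list above (NVIDIA Tesla T4).
YOLOV6_FPS = {
    "YOLOv6-N": 1187,
    "YOLOv6-S": 484,
    "YOLOv6-M": 226,
    "YOLOv6-L": 116,
}


def fps_to_ms(fps):
    """Average time per image in milliseconds for a given frames-per-second rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    for name, fps in YOLOV6_FPS.items():
        print(f"{name}: {fps_to_ms(fps):.2f} ms per image")
```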
|
||||
|
||||
YOLOv6 also provides quantized models for different precisions and models optimized for mobile platforms.
|
||||
|
||||
## Usage
|
||||
|
||||
You can use YOLOv6 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv6 models for training:
|
||||
|
||||
!!! example ""
|
||||
|
||||
This example provides simple training code for YOLOv6. For more options including training settings see [Train](../modes/train.md) mode. For using YOLOv6 with additional modes see [Predict](../modes/predict.md), [Val](../modes/val.md) and [Export](../modes/export.md).
|
||||
|
||||
=== "Python"
|
||||
|
||||
Pretrained PyTorch `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in Python:
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Build a YOLOv6n model from scratch
|
||||
model = YOLO('yolov6n.yaml')
|
||||
|
||||
# Display model information (optional)
|
||||
model.info()
|
||||
|
||||
# Train the model on the COCO8 example dataset for 100 epochs
|
||||
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
|
||||
|
||||
# Run inference with the YOLOv6n model on the 'bus.jpg' image
|
||||
results = model('path/to/bus.jpg')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
CLI commands are available to directly run the models:
|
||||
|
||||
```bash
|
||||
# Build a YOLOv6n model from scratch and train it on the COCO8 example dataset for 100 epochs
|
||||
yolo train model=yolov6n.yaml data=coco8.yaml epochs=100 imgsz=640
|
||||
|
||||
# Build a YOLOv6n model from scratch and run inference on the 'bus.jpg' image
|
||||
yolo predict model=yolov6n.yaml source=path/to/bus.jpg
|
||||
```
|
||||
|
||||
### Supported Tasks
|
||||
|
||||
| Model Type | Pre-trained Weights | Tasks Supported |
|
||||
|------------|---------------------|------------------|
|
||||
| YOLOv6-N | `yolov6-n.pt` | Object Detection |
|
||||
| YOLOv6-S | `yolov6-s.pt` | Object Detection |
|
||||
| YOLOv6-M | `yolov6-m.pt` | Object Detection |
|
||||
| YOLOv6-L | `yolov6-l.pt` | Object Detection |
|
||||
| YOLOv6-L6 | `yolov6-l6.pt` | Object Detection |
|
||||
|
||||
## Supported Modes
|
||||
|
||||
| Mode | Supported |
|
||||
|------------|--------------------|
|
||||
| Inference | :heavy_check_mark: |
|
||||
| Validation | :heavy_check_mark: |
|
||||
| Training | :heavy_check_mark: |
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
We would like to acknowledge the authors for their significant contributions in the field of real-time object detection:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@misc{li2023yolov6,
|
||||
title={YOLOv6 v3.0: A Full-Scale Reloading},
|
||||
author={Chuyi Li and Lulu Li and Yifei Geng and Hongliang Jiang and Meng Cheng and Bo Zhang and Zaidan Ke and Xiaoming Xu and Xiangxiang Chu},
|
||||
year={2023},
|
||||
eprint={2301.05586},
|
||||
archivePrefix={arXiv},
|
||||
primaryClass={cs.CV}
|
||||
}
|
||||
```
|
||||
|
||||
The original YOLOv6 paper can be found on [arXiv](https://arxiv.org/abs/2301.05586). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/meituan/YOLOv6). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
|
||||
65
docs/en/models/yolov7.md
Normal file
|
|
@ -0,0 +1,65 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the YOLOv7, a real-time object detector. Understand its superior speed, impressive accuracy, and unique trainable bag-of-freebies optimization focus.
|
||||
keywords: YOLOv7, real-time object detector, state-of-the-art, Ultralytics, MS COCO dataset, model re-parameterization, dynamic label assignment, extended scaling, compound scaling
|
||||
---
|
||||
|
||||
# YOLOv7: Trainable Bag-of-Freebies
|
||||
|
||||
YOLOv7 is a state-of-the-art real-time object detector that surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS. It has the highest accuracy (56.8% AP) among all known real-time object detectors with 30 FPS or higher on GPU V100. Moreover, YOLOv7 outperforms other object detectors such as YOLOR, YOLOX, Scaled-YOLOv4, YOLOv5, and many others in speed and accuracy. The model is trained on the MS COCO dataset from scratch without using any other datasets or pre-trained weights. Source code for YOLOv7 is available on GitHub.
|
||||
|
||||

|
||||
**Comparison of state-of-the-art object detectors.** From the results in Table 2 we know that the proposed method has the best speed-accuracy trade-off comprehensively. If we compare YOLOv7-tiny-SiLU with YOLOv5-N (r6.1), our method is 127 fps faster and 10.7% more accurate on AP. In addition, YOLOv7 has 51.4% AP at a frame rate of 161 fps, while PPYOLOE-L with the same AP has only a 78 fps frame rate. In terms of parameter usage, YOLOv7 uses 41% fewer parameters than PPYOLOE-L. If we compare YOLOv7-X with 114 fps inference speed to YOLOv5-L (r6.1) with 99 fps inference speed, YOLOv7-X can improve AP by 3.9%. If YOLOv7-X is compared with YOLOv5-X (r6.1) of similar scale, the inference speed of YOLOv7-X is 31 fps faster. In addition, in terms of the amount of parameters and computation, YOLOv7-X reduces parameters by 22% and computation by 8% compared to YOLOv5-X (r6.1), while improving AP by 2.2% ([Source](https://arxiv.org/pdf/2207.02696.pdf)).
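To make the frame-rate comparison concrete, the short calculation below converts the two frame rates quoted in the caption (161 fps for YOLOv7 and 78 fps for PPYOLOE-L at the same AP) into a relative speedup and a per-frame time. It is plain arithmetic on the quoted numbers, not a measurement; the function name is chosen for this example.

```python
def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


yolov7_fps = 161      # quoted in the caption
ppyoloe_l_fps = 78    # quoted in the caption

speedup = yolov7_fps / ppyoloe_l_fps
saved_ms = frame_time_ms(ppyoloe_l_fps) - frame_time_ms(yolov7_fps)

print(f"relative speedup: {speedup:.2f}x")
print(f"YOLOv7 frame time: {frame_time_ms(yolov7_fps):.2f} ms")
print(f"PPYOLOE-L frame time: {frame_time_ms(ppyoloe_l_fps):.2f} ms")
print(f"time saved per frame: {saved_ms:.2f} ms")
```

At equal AP, the quoted frame rates correspond to roughly a 2x speedup, or about 6.6 ms saved per frame.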
|
||||
|
||||
## Overview
|
||||
|
||||
Real-time object detection is an important component in many computer vision systems, including multi-object tracking, autonomous driving, robotics, and medical image analysis. In recent years, real-time object detection development has focused on designing efficient architectures and improving inference speed on various CPUs, GPUs, and neural processing units (NPUs). YOLOv7 supports both mobile GPU and GPU devices, from the edge to the cloud.
|
||||
|
||||
Unlike traditional real-time object detectors that focus on architecture optimization, YOLOv7 introduces a focus on the optimization of the training process. This includes modules and optimization methods designed to improve the accuracy of object detection without increasing the inference cost, a concept known as the "trainable bag-of-freebies".
|
||||
|
||||
## Key Features
|
||||
|
||||
YOLOv7 introduces several key features:
|
||||
|
||||
1. **Model Re-parameterization**: YOLOv7 proposes a planned re-parameterized model, which is a strategy applicable to layers in different networks with the concept of gradient propagation path.
|
||||
|
||||
2. **Dynamic Label Assignment**: The training of the model with multiple output layers presents a new issue: "How to assign dynamic targets for the outputs of different branches?" To solve this problem, YOLOv7 introduces a new label assignment method called coarse-to-fine lead guided label assignment.
|
||||
|
||||
3. **Extended and Compound Scaling**: YOLOv7 proposes "extend" and "compound scaling" methods for the real-time object detector that can effectively utilize parameters and computation.
|
||||
|
||||
4. **Efficiency**: The method proposed by YOLOv7 can reduce about 40% of the parameters and 50% of the computation of a state-of-the-art real-time object detector, while achieving faster inference speed and higher detection accuracy.
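The general idea behind re-parameterization can be illustrated with a toy example: two parallel linear branches used during training can be folded into a single equivalent branch for inference, so the extra branch costs nothing at inference time. The sketch below shows only this generic merging of parallel linear maps; it is not YOLOv7's planned re-parameterized model, and all names and values are made up for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def merge_branches(w_a, w_b):
    """Fold two parallel linear branches into one weight matrix."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(w_a, w_b)]


# Toy weights for two parallel branches and a toy input (illustrative values)
w_a = [[1, 2], [3, 4]]
w_b = [[0, 1], [1, 0]]
x = [5, -2]

# Training-time view: run both branches and add their outputs
two_branch = [a + b for a, b in zip(matvec(w_a, x), matvec(w_b, x))]

# Inference-time view: one merged branch, one matrix-vector product
merged = matvec(merge_branches(w_a, w_b), x)

print(two_branch, merged)
```

Both computations give the same output because matrix-vector multiplication is linear, which is why the merged form can replace the two-branch form without changing predictions.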
|
||||
|
||||
## Usage Examples
|
||||
|
||||
At the time of writing, Ultralytics does not support YOLOv7 models. Users interested in YOLOv7 will therefore need to refer directly to the YOLOv7 GitHub repository for installation and usage instructions.
|
||||
|
||||
Here is a brief overview of the typical steps you might take to use YOLOv7:
|
||||
|
||||
1. Visit the YOLOv7 GitHub repository: [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7).
|
||||
|
||||
2. Follow the instructions provided in the README file for installation. This typically involves cloning the repository, installing necessary dependencies, and setting up any necessary environment variables.
|
||||
|
||||
3. Once installation is complete, you can train and use the model as per the usage instructions provided in the repository. This usually involves preparing your dataset, configuring the model parameters, training the model, and then using the trained model to perform object detection.
|
||||
|
||||
Please note that the specific steps may vary depending on your specific use case and the current state of the YOLOv7 repository. Therefore, it is strongly recommended to refer directly to the instructions provided in the YOLOv7 GitHub repository.
|
||||
|
||||
We regret any inconvenience this may cause and will strive to update this document with usage examples for Ultralytics once support for YOLOv7 is implemented.
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
We would like to acknowledge the YOLOv7 authors for their significant contributions in the field of real-time object detection:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@article{wang2022yolov7,
|
||||
title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
|
||||
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
|
||||
journal={arXiv preprint arXiv:2207.02696},
|
||||
year={2022}
|
||||
}
|
||||
```
|
||||
|
||||
The original YOLOv7 paper can be found on [arXiv](https://arxiv.org/pdf/2207.02696.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/WongKinYiu/yolov7). We appreciate their efforts in advancing the field and making their work accessible to the broader community.
|
||||
154
docs/en/models/yolov8.md
Normal file
|
|
@ -0,0 +1,154 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore the thrilling features of YOLOv8, the latest version of our real-time object detector! Learn how advanced architectures, pre-trained models and optimal balance between accuracy & speed make YOLOv8 the perfect choice for your object detection tasks.
|
||||
keywords: YOLOv8, Ultralytics, real-time object detector, pre-trained models, documentation, object detection, YOLO series, advanced architectures, accuracy, speed
|
||||
---
|
||||
|
||||
# YOLOv8
|
||||
|
||||
## Overview
|
||||
|
||||
YOLOv8 is the latest iteration in the YOLO series of real-time object detectors, offering cutting-edge performance in terms of accuracy and speed. Building upon the advancements of previous YOLO versions, YOLOv8 introduces new features and optimizations that make it an ideal choice for various object detection tasks in a wide range of applications.
|
||||
|
||||

|
||||
|
||||
## Key Features
|
||||
|
||||
- **Advanced Backbone and Neck Architectures:** YOLOv8 employs state-of-the-art backbone and neck architectures, resulting in improved feature extraction and object detection performance.
|
||||
- **Anchor-free Split Ultralytics Head:** YOLOv8 adopts an anchor-free split Ultralytics head, which contributes to better accuracy and a more efficient detection process compared to anchor-based approaches.
|
||||
- **Optimized Accuracy-Speed Tradeoff:** With a focus on maintaining an optimal balance between accuracy and speed, YOLOv8 is suitable for real-time object detection tasks in diverse application areas.
|
||||
- **Variety of Pre-trained Models:** YOLOv8 offers a range of pre-trained models to cater to various tasks and performance requirements, making it easier to find the right model for your specific use case.
|
||||
|
||||
## Supported Tasks
|
||||
|
||||
| Model Type | Pre-trained Weights | Task |
|
||||
|-------------|---------------------------------------------------------------------------------------------------------------------|-----------------------|
|
||||
| YOLOv8 | `yolov8n.pt`, `yolov8s.pt`, `yolov8m.pt`, `yolov8l.pt`, `yolov8x.pt` | Detection |
|
||||
| YOLOv8-seg | `yolov8n-seg.pt`, `yolov8s-seg.pt`, `yolov8m-seg.pt`, `yolov8l-seg.pt`, `yolov8x-seg.pt` | Instance Segmentation |
|
||||
| YOLOv8-pose | `yolov8n-pose.pt`, `yolov8s-pose.pt`, `yolov8m-pose.pt`, `yolov8l-pose.pt`, `yolov8x-pose.pt`, `yolov8x-pose-p6.pt` | Pose/Keypoints |
|
||||
| YOLOv8-cls | `yolov8n-cls.pt`, `yolov8s-cls.pt`, `yolov8m-cls.pt`, `yolov8l-cls.pt`, `yolov8x-cls.pt` | Classification |
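The weight filenames in the table encode the task in a suffix (`-seg`, `-pose`, `-cls`), with no suffix meaning detection. The sketch below reads that convention back out of a filename. `task_from_weights` is a hypothetical helper for illustration and is not an Ultralytics function; the library determines the task by its own means.

```python
from pathlib import Path

# Suffix-to-task mapping taken from the table above
SUFFIX_TO_TASK = {
    "seg": "Instance Segmentation",
    "pose": "Pose/Keypoints",
    "cls": "Classification",
}


def task_from_weights(filename):
    """Guess the task from a YOLOv8 weight filename (hypothetical helper)."""
    stem = Path(filename).stem          # e.g. 'yolov8x-pose-p6'
    for part in stem.split("-")[1:]:    # skip the 'yolov8x' prefix
        if part in SUFFIX_TO_TASK:
            return SUFFIX_TO_TASK[part]
    return "Detection"                  # no task suffix


for name in ["yolov8n.pt", "yolov8s-seg.pt", "yolov8x-pose-p6.pt", "yolov8m-cls.pt"]:
    print(name, "->", task_from_weights(name))
```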
|
||||
|
||||
## Supported Modes
|
||||
|
||||
| Mode | Supported |
|
||||
|------------|--------------------|
|
||||
| Inference | :heavy_check_mark: |
|
||||
| Validation | :heavy_check_mark: |
|
||||
| Training | :heavy_check_mark: |
|
||||
|
||||
!!! Performance
|
||||
|
||||
=== "Detection (COCO)"
|
||||
|
||||
| Model | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|
||||
| ------------------------------------------------------------------------------------ | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
|
||||
| [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt) | 640 | 37.3 | 80.4 | 0.99 | 3.2 | 8.7 |
|
||||
| [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s.pt) | 640 | 44.9 | 128.4 | 1.20 | 11.2 | 28.6 |
|
||||
| [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m.pt) | 640 | 50.2 | 234.7 | 1.83 | 25.9 | 78.9 |
|
||||
| [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l.pt) | 640 | 52.9 | 375.2 | 2.39 | 43.7 | 165.2 |
|
||||
| [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x.pt) | 640 | 53.9 | 479.1 | 3.53 | 68.2 | 257.8 |
|
||||
|
||||
=== "Detection (Open Images V7)"
|
||||
|
||||
See [Detection Docs](https://docs.ultralytics.com/tasks/detect/) for usage examples with these models trained on [Open Images V7](https://docs.ultralytics.com/datasets/detect/open-images-v7/), which include 600 pre-trained classes.
|
||||
|
||||
| Model | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|
||||
| ----------------------------------------------------------------------------------------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
|
||||
| [YOLOv8n](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-oiv7.pt) | 640 | 18.4 | 142.4 | 1.21 | 3.5 | 10.5 |
|
||||
| [YOLOv8s](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s-oiv7.pt) | 640 | 27.7 | 183.1 | 1.40 | 11.4 | 29.7 |
|
||||
| [YOLOv8m](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m-oiv7.pt) | 640 | 33.6 | 408.5 | 2.26 | 26.2 | 80.6 |
|
||||
| [YOLOv8l](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l-oiv7.pt) | 640 | 34.9 | 596.9 | 2.43 | 44.1 | 167.4 |
|
||||
| [YOLOv8x](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-oiv7.pt) | 640 | 36.3 | 860.6 | 3.56 | 68.7 | 260.6 |
|
||||
|
||||
=== "Segmentation (COCO)"
|
||||
|
||||
| Model | size<br><sup>(pixels) | mAP<sup>box<br>50-95 | mAP<sup>mask<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|
||||
| -------------------------------------------------------------------------------------------- | --------------------- | -------------------- | --------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
|
||||
| [YOLOv8n-seg](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-seg.pt) | 640 | 36.7 | 30.5 | 96.1 | 1.21 | 3.4 | 12.6 |
|
||||
| [YOLOv8s-seg](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s-seg.pt) | 640 | 44.6 | 36.8 | 155.7 | 1.47 | 11.8 | 42.6 |
|
||||
| [YOLOv8m-seg](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m-seg.pt) | 640 | 49.9 | 40.8 | 317.0 | 2.18 | 27.3 | 110.2 |
|
||||
| [YOLOv8l-seg](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l-seg.pt) | 640 | 52.3 | 42.6 | 572.4 | 2.79 | 46.0 | 220.5 |
|
||||
| [YOLOv8x-seg](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-seg.pt) | 640 | 53.4 | 43.4 | 712.1 | 4.02 | 71.8 | 344.1 |
|
||||
|
||||
=== "Classification (ImageNet)"
|
||||
|
||||
| Model | size<br><sup>(pixels) | acc<br><sup>top1 | acc<br><sup>top5 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) at 640 |
|
||||
| -------------------------------------------------------------------------------------------- | --------------------- | ---------------- | ---------------- | ------------------------------ | ----------------------------------- | ------------------ | ------------------------ |
|
||||
| [YOLOv8n-cls](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-cls.pt) | 224 | 66.6 | 87.0 | 12.9 | 0.31 | 2.7 | 4.3 |
|
||||
| [YOLOv8s-cls](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s-cls.pt) | 224 | 72.3 | 91.1 | 23.4 | 0.35 | 6.4 | 13.5 |
|
||||
| [YOLOv8m-cls](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m-cls.pt) | 224 | 76.4 | 93.2 | 85.4 | 0.62 | 17.0 | 42.7 |
|
||||
| [YOLOv8l-cls](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l-cls.pt) | 224 | 78.0 | 94.1 | 163.0 | 0.87 | 37.5 | 99.7 |
|
||||
| [YOLOv8x-cls](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-cls.pt) | 224 | 78.4 | 94.3 | 232.0 | 1.01 | 57.4 | 154.8 |
|
||||
|
||||
=== "Pose (COCO)"
|
||||
|
||||
| Model | size<br><sup>(pixels) | mAP<sup>pose<br>50-95 | mAP<sup>pose<br>50 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|
||||
| ---------------------------------------------------------------------------------------------------- | --------------------- | --------------------- | ------------------ | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
|
||||
| [YOLOv8n-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n-pose.pt) | 640 | 50.4 | 80.1 | 131.8 | 1.18 | 3.3 | 9.2 |
|
||||
| [YOLOv8s-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8s-pose.pt) | 640 | 60.0 | 86.2 | 233.2 | 1.42 | 11.6 | 30.2 |
|
||||
| [YOLOv8m-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8m-pose.pt) | 640 | 65.0 | 88.8 | 456.3 | 2.00 | 26.4 | 81.0 |
|
||||
| [YOLOv8l-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l-pose.pt) | 640 | 67.6 | 90.0 | 784.5 | 2.59 | 44.4 | 168.6 |
|
||||
| [YOLOv8x-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-pose.pt) | 640 | 69.2 | 90.2 | 1607.1 | 3.73 | 69.4 | 263.2 |
|
||||
| [YOLOv8x-pose-p6](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-pose-p6.pt) | 1280 | 71.6 | 91.2 | 4088.7 | 10.04 | 99.1 | 1066.4 |
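One way to read the performance tables is to pick the most accurate model that fits a latency budget. The sketch below does this for the Detection (COCO) rows, using the mAP and CPU ONNX latency values copied from that table. The function name and the example budgets are assumptions for illustration, and latency on other hardware will differ from the listed figures.

```python
# (model, mAP val 50-95, CPU ONNX latency in ms) from the Detection (COCO) table
DETECTION_COCO = [
    ("YOLOv8n", 37.3, 80.4),
    ("YOLOv8s", 44.9, 128.4),
    ("YOLOv8m", 50.2, 234.7),
    ("YOLOv8l", 52.9, 375.2),
    ("YOLOv8x", 53.9, 479.1),
]


def most_accurate_within(budget_ms, rows=DETECTION_COCO):
    """Return the highest-mAP model whose latency fits the budget, or None."""
    candidates = [row for row in rows if row[2] <= budget_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda row: row[1])[0]


print(most_accurate_within(250.0))   # budget between YOLOv8m and YOLOv8l
print(most_accurate_within(50.0))    # tighter than any listed model
```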
|
||||
|
||||
## Usage
|
||||
|
||||
You can use YOLOv8 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv8 models for inference:
|
||||
|
||||
!!! example ""
|
||||
|
||||
This example provides simple inference code for YOLOv8. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv8 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
|
||||
|
||||
=== "Python"
|
||||
|
||||
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in Python:
|
||||
|
||||
```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)

# Run inference with the YOLOv8n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
CLI commands are available to directly run the models:
|
||||
|
||||
```bash
# Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640

# Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image
yolo predict model=yolov8n.pt source=path/to/bus.jpg
```
|
||||
|
||||
## Citations and Acknowledgements
|
||||
|
||||
If you use the YOLOv8 model or any other software from this repository in your work, please cite it using the following format:
|
||||
|
||||
!!! note ""
|
||||
|
||||
=== "BibTeX"
|
||||
|
||||
```bibtex
|
||||
@software{yolov8_ultralytics,
|
||||
author = {Glenn Jocher and Ayush Chaurasia and Jing Qiu},
|
||||
title = {Ultralytics YOLOv8},
|
||||
version = {8.0.0},
|
||||
year = {2023},
|
||||
url = {https://github.com/ultralytics/ultralytics},
|
||||
orcid = {0000-0001-5950-6979, 0000-0002-7603-6750, 0000-0003-3783-7069},
|
||||
license = {AGPL-3.0}
|
||||
}
|
||||
```
|
||||
|
||||
Please note that the DOI is pending and will be added to the citation once it is available. The usage of the software is in accordance with the AGPL-3.0 license.
|
||||
94
docs/en/modes/benchmark.md
Normal file
|
|
@ -0,0 +1,94 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to profile speed and accuracy of YOLOv8 across various export formats; get insights on mAP50-95, accuracy_top5 metrics, and more.
|
||||
keywords: Ultralytics, YOLOv8, benchmarking, speed profiling, accuracy profiling, mAP50-95, accuracy_top5, ONNX, OpenVINO, TensorRT, YOLO export formats
|
||||
---
|
||||
|
||||
# Model Benchmarking with Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Introduction
|
||||
|
||||
Once your model is trained and validated, the next logical step is to evaluate its performance in various real-world scenarios. Benchmark mode in Ultralytics YOLOv8 serves this purpose by providing a robust framework for assessing the speed and accuracy of your model across a range of export formats.
|
||||
|
||||
## Why Is Benchmarking Crucial?
|
||||
|
||||
- **Informed Decisions:** Gain insights into the trade-offs between speed and accuracy.
|
||||
- **Resource Allocation:** Understand how different export formats perform on different hardware.
|
||||
- **Optimization:** Learn which export format offers the best performance for your specific use case.
|
||||
- **Cost Efficiency:** Make more efficient use of hardware resources based on benchmark results.
|
||||
|
||||
### Key Metrics in Benchmark Mode
|
||||
|
||||
- **mAP50-95:** For object detection, segmentation, and pose estimation.
|
||||
- **accuracy_top5:** For image classification.
|
||||
- **Inference Time:** Time taken for each image in milliseconds.
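As a rough picture of what a per-image inference time in milliseconds means, the sketch below times a stand-in prediction function with `time.perf_counter` after a few untimed warm-up calls. This is a generic timing pattern under stated assumptions, not how Benchmark mode is implemented; `mean_latency_ms` and `fake_predict` are hypothetical names, and the "images" are plain lists standing in for real inputs.

```python
import time
from statistics import mean


def mean_latency_ms(predict, images, warmup=2):
    """Average wall-clock time per image in milliseconds (illustrative)."""
    # Warm-up calls are not timed, so one-off setup costs do not skew the mean
    for image in images[:warmup]:
        predict(image)

    timings = []
    for image in images:
        start = time.perf_counter()
        predict(image)
        timings.append((time.perf_counter() - start) * 1000.0)
    return mean(timings)


def fake_predict(image):
    """Stand-in for a model call; it just sums the input values."""
    return sum(image)


images = [list(range(1000)) for _ in range(10)]
latency = mean_latency_ms(fake_predict, images)
print(f"{latency:.4f} ms per image")
```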
|
||||
|
||||
### Supported Export Formats
|
||||
|
||||
- **ONNX:** For optimal CPU performance
|
||||
- **TensorRT:** For maximal GPU efficiency
|
||||
- **OpenVINO:** For Intel hardware optimization
|
||||
- **CoreML, TensorFlow SavedModel, and More:** For diverse deployment needs.
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
* Export to ONNX or OpenVINO for up to 3x CPU speedup.
|
||||
* Export to TensorRT for up to 5x GPU speedup.
|
||||
|
||||
## Usage Examples
|
||||
|
||||
Run YOLOv8n benchmarks on all supported export formats, including ONNX and TensorRT. See the Arguments section below for a full list of arguments.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
from ultralytics.utils.benchmarks import benchmark

# Benchmark on GPU
benchmark(model='yolov8n.pt', data='coco8.yaml', imgsz=640, half=False, device=0)
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
```
|
||||
|
||||
## Arguments
|
||||
|
||||
Arguments such as `model`, `data`, `imgsz`, `half`, `device`, and `verbose` provide users with the flexibility to fine-tune the benchmarks to their specific needs and compare the performance of different export formats with ease.
|
||||
|
||||
| Key       | Value   | Description                                                           |
|-----------|---------|-----------------------------------------------------------------------|
| `model`   | `None`  | path to model file, e.g. yolov8n.pt, yolov8n.yaml                     |
| `data`    | `None`  | path to YAML referencing the benchmarking dataset (under `val` label) |
| `imgsz`   | `640`   | image size as scalar or (h, w) list, e.g. (640, 480)                  |
| `half`    | `False` | FP16 quantization                                                     |
| `int8`    | `False` | INT8 quantization                                                     |
| `device`  | `None`  | device to run on, e.g. cuda device=0 or device=0,1,2,3 or device=cpu  |
| `verbose` | `False` | do not continue on error (bool), or val floor threshold (float)       |
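The `device` argument accepts several spellings (`0`, `0,1,2,3`, `cpu`). The sketch below shows one way such values could be normalized before use. `parse_device` is a hypothetical helper written for this example; how Ultralytics itself parses `device` is not described here and may differ.

```python
def parse_device(device):
    """Normalize a device value to None, 'cpu', or a list of GPU indices.

    Hypothetical helper for illustration; not the Ultralytics implementation.
    """
    if device is None:
        return None                      # leave the choice to the library
    text = str(device).strip().lower()
    if text == "cpu":
        return "cpu"
    # '0' -> [0], '0,1,2,3' -> [0, 1, 2, 3]
    return [int(index) for index in text.split(",") if index.strip()]


for value in [None, 0, "0,1,2,3", "cpu"]:
    print(repr(value), "->", parse_device(value))
```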
|
||||
|
||||
## Export Formats
|
||||
|
||||
Benchmarks will attempt to run automatically on all possible export formats below.
|
||||
|
||||
| Format | `format` Argument | Model | Metadata | Arguments |
|
||||
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
|
||||
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
|
||||
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize` |
|
||||
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset` |
|
||||
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half` |
|
||||
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
|
||||
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms` |
|
||||
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras` |
|
||||
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ | `imgsz` |
|
||||
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8` |
|
||||
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz` |
|
||||
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz` |
|
||||
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz` |
|
||||
| [ncnn](https://github.com/Tencent/ncnn) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half` |
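The Arguments column of the table can be treated as a lookup: for each `format` value, which arguments are listed as applying to it. The sketch below encodes the table and reports arguments that are not listed for a given format. `unsupported_arguments` is a hypothetical helper; whether the library rejects or ignores unlisted arguments is not stated in the table, so the sketch only reports them.

```python
# Arguments listed per `format` value in the table above
FORMAT_ARGUMENTS = {
    "torchscript": {"imgsz", "optimize"},
    "onnx": {"imgsz", "half", "dynamic", "simplify", "opset"},
    "openvino": {"imgsz", "half"},
    "engine": {"imgsz", "half", "dynamic", "simplify", "workspace"},
    "coreml": {"imgsz", "half", "int8", "nms"},
    "saved_model": {"imgsz", "keras"},
    "pb": {"imgsz"},
    "tflite": {"imgsz", "half", "int8"},
    "edgetpu": {"imgsz"},
    "tfjs": {"imgsz"},
    "paddle": {"imgsz"},
    "ncnn": {"imgsz", "half"},
}


def unsupported_arguments(fmt, **kwargs):
    """Return the argument names not listed for this format (hypothetical helper)."""
    return sorted(set(kwargs) - FORMAT_ARGUMENTS[fmt])


print(unsupported_arguments("onnx", imgsz=640, half=True, opset=12))
print(unsupported_arguments("tflite", imgsz=640, dynamic=True, nms=True))
```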
|
||||
|
||||
See full `export` details in the [Export](https://docs.ultralytics.com/modes/export/) page.
|
||||
108
docs/en/modes/export.md
Normal file
|
|
@ -0,0 +1,108 @@
|
|||
---
|
||||
comments: true
|
||||
description: Step-by-step guide on exporting your YOLOv8 models to various formats like ONNX, TensorRT, CoreML and more for deployment. Explore now!
|
||||
keywords: YOLO, YOLOv8, Ultralytics, Model export, ONNX, TensorRT, CoreML, TensorFlow SavedModel, OpenVINO, PyTorch, export model
|
||||
---
|
||||
|
||||
# Model Export with Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Introduction
|
||||
|
||||
The ultimate goal of training a model is to deploy it for real-world applications. Export mode in Ultralytics YOLOv8 offers a versatile range of options for exporting your trained model to different formats, making it deployable across various platforms and devices. This comprehensive guide aims to walk you through the nuances of model exporting, showcasing how to achieve maximum compatibility and performance.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/WbomGeoOT_k?si=aGmuyooWftA0ue9X"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> How To Export Custom Trained Ultralytics YOLOv8 Model and Run Live Inference on Webcam.
|
||||
</p>
|
||||
|
||||
## Why Choose YOLOv8's Export Mode?
|
||||
|
||||
- **Versatility:** Export to multiple formats including ONNX, TensorRT, CoreML, and more.
|
||||
- **Performance:** Gain up to 5x GPU speedup with TensorRT and 3x CPU speedup with ONNX or OpenVINO.
|
||||
- **Compatibility:** Make your model universally deployable across numerous hardware and software environments.
|
||||
- **Ease of Use:** Simple CLI and Python API for quick and straightforward model exporting.
|
||||
|
||||
### Key Features of Export Mode
|
||||
|
||||
Here are some of the standout functionalities:
|
||||
|
||||
- **One-Click Export:** Simple commands for exporting to different formats.
|
||||
- **Batch Export:** Export batched-inference capable models.
|
||||
- **Optimized Inference:** Exported models are optimized for quicker inference times.
|
||||
- **Tutorial Videos:** In-depth guides and tutorials for a smooth exporting experience.
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
* Export to ONNX or OpenVINO for up to 3x CPU speedup.
|
||||
* Export to TensorRT for up to 5x GPU speedup.
|
||||
|
||||
## Usage Examples
|
||||
|
||||
Export a YOLOv8n model to a different format like ONNX or TensorRT. See the Arguments section below for a full list of export arguments.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
from ultralytics import YOLO

# Load a model
model = YOLO('yolov8n.pt')  # load an official model
model = YOLO('path/to/best.pt')  # load a custom trained model

# Export the model
model.export(format='onnx')
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
yolo export model=yolov8n.pt format=onnx  # export official model
yolo export model=path/to/best.pt format=onnx  # export custom trained model
```
|
||||
|
||||
## Arguments
|
||||
|
||||
Export settings for YOLO models refer to the various configurations and options used to save or export the model for use in other environments or platforms. These settings can affect the model's performance, size, and compatibility with different systems. Some common YOLO export settings include the format of the exported model file (e.g. ONNX, TensorFlow SavedModel), the device on which the model will be run (e.g. CPU, GPU), and the presence of additional features such as masks or multiple labels per box. Other factors that may affect the export process include the specific task the model is being used for and the requirements or constraints of the target environment or platform. It is important to carefully consider and configure these settings to ensure that the exported model is optimized for the intended use case and can be used effectively in the target environment.
|
||||
|
||||
| Key         | Value           | Description                                          |
|-------------|-----------------|------------------------------------------------------|
| `format`    | `'torchscript'` | format to export to                                  |
| `imgsz`     | `640`           | image size as scalar or (h, w) list, e.g. (640, 480) |
| `keras`     | `False`         | use Keras for TF SavedModel export                   |
| `optimize`  | `False`         | TorchScript: optimize for mobile                     |
| `half`      | `False`         | FP16 quantization                                    |
| `int8`      | `False`         | INT8 quantization                                    |
| `dynamic`   | `False`         | ONNX/TensorRT: dynamic axes                          |
| `simplify`  | `False`         | ONNX/TensorRT: simplify model                        |
| `opset`     | `None`          | ONNX: opset version (optional, defaults to latest)   |
| `workspace` | `4`             | TensorRT: workspace size (GB)                        |
| `nms`       | `False`         | CoreML: add NMS                                      |
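The `imgsz` argument is described as either a scalar or an (h, w) list. The sketch below normalizes both forms to an `(height, width)` tuple. `normalize_imgsz` is a hypothetical helper, and treating a scalar as a square size is an assumption made for this example rather than something the table states.

```python
def normalize_imgsz(imgsz):
    """Return imgsz as an (height, width) tuple (hypothetical helper)."""
    if isinstance(imgsz, int):
        # Assumption: a scalar means a square image of that side length
        return (imgsz, imgsz)
    height, width = imgsz  # raises ValueError unless exactly two values are given
    return (int(height), int(width))


print(normalize_imgsz(640))
print(normalize_imgsz([640, 480]))
```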
|
||||
|
||||
## Export Formats
|
||||
|
||||
Available YOLOv8 export formats are in the table below. You can export to any format using the `format` argument, e.g. `format='onnx'` or `format='engine'`.
|
||||
|
||||
| Format | `format` Argument | Model | Metadata | Arguments |
|
||||
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
|
||||
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
|
||||
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize` |
|
||||
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset` |
|
||||
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half` |
|
||||
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
|
||||
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms` |
|
||||
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras` |
|
||||
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ | `imgsz` |
|
||||
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8` |
|
||||
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz` |
|
||||
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz` |
|
||||
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz` |
|
||||
| [ncnn](https://github.com/Tencent/ncnn) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half` |
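
As a quick reference, the `format` argument and the resulting artifact name from the table can be captured in a lookup. This is only a sketch: `EXPORT_ARTIFACTS` and `expected_artifact` are hypothetical names introduced here for illustration and are not part of the Ultralytics API. The patterns are copied from the table and may differ between releases.

```python
from pathlib import Path

# Artifact name pattern per `format` argument, copied from the table above.
EXPORT_ARTIFACTS = {
    "torchscript": "{stem}.torchscript",
    "onnx": "{stem}.onnx",
    "openvino": "{stem}_openvino_model/",
    "engine": "{stem}.engine",
    "coreml": "{stem}.mlpackage",
    "saved_model": "{stem}_saved_model/",
    "pb": "{stem}.pb",
    "tflite": "{stem}.tflite",
    "edgetpu": "{stem}_edgetpu.tflite",
    "tfjs": "{stem}_web_model/",
    "paddle": "{stem}_paddle_model/",
    "ncnn": "{stem}_ncnn_model/",
}


def expected_artifact(weights: str, fmt: str) -> str:
    """Return the artifact name the table lists for a `format` value (hypothetical helper)."""
    if fmt not in EXPORT_ARTIFACTS:
        raise ValueError(f"Unknown export format: {fmt!r}")
    return EXPORT_ARTIFACTS[fmt].format(stem=Path(weights).stem)


print(expected_artifact("yolov8n.pt", "onnx"))
print(expected_artifact("yolov8n.pt", "openvino"))
```

A lookup like this can help a deployment script check that the expected file or directory exists after an export finishes.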
|
||||
74
docs/en/modes/index.md
Normal file
|
|
@ -0,0 +1,74 @@
|
|||
---
|
||||
comments: true
|
||||
description: From training to tracking, make the most of YOLOv8 with Ultralytics. Get insights and examples for each supported mode including validation, export, and benchmarking.
|
||||
keywords: Ultralytics, YOLOv8, Machine Learning, Object Detection, Training, Validation, Prediction, Export, Tracking, Benchmarking
|
||||
---
|
||||
|
||||
# Ultralytics YOLOv8 Modes
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Introduction
|
||||
|
||||
Ultralytics YOLOv8 is not just another object detection model; it's a versatile framework designed to cover the entire lifecycle of machine learning models—from data ingestion and model training to validation, deployment, and real-world tracking. Each mode serves a specific purpose and is engineered to offer you the flexibility and efficiency required for different tasks and use-cases.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/j8uQc0qB91s?si=dhnGKgqvs7nPgeaM"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Ultralytics Modes Tutorial: Train, Validate, Predict, Export & Benchmark.
|
||||
</p>
|
||||
|
||||
### Modes at a Glance
|
||||
|
||||
Understanding the different **modes** that Ultralytics YOLOv8 supports is critical to getting the most out of your models:
|
||||
|
||||
- **Train** mode: Fine-tune your model on custom or preloaded datasets.
|
||||
- **Val** mode: A post-training checkpoint to validate model performance.
|
||||
- **Predict** mode: Unleash the predictive power of your model on real-world data.
|
||||
- **Export** mode: Make your model deployment-ready in various formats.
|
||||
- **Track** mode: Extend your object detection model into real-time tracking applications.
|
||||
- **Benchmark** mode: Analyze the speed and accuracy of your model in diverse deployment environments.
|
||||
|
||||
This comprehensive guide aims to give you an overview and practical insights into each mode, helping you harness the full potential of YOLOv8.
|
||||
|
||||
## [Train](train.md)
|
||||
|
||||
Train mode is used for training a YOLOv8 model on a custom dataset. In this mode, the model is trained using the specified dataset and hyperparameters. The training process involves optimizing the model's parameters so that it can accurately predict the classes and locations of objects in an image.
|
||||
|
||||
[Train Examples](train.md){ .md-button .md-button--primary}
|
||||
|
||||
## [Val](val.md)
|
||||
|
||||
Val mode is used for validating a YOLOv8 model after it has been trained. In this mode, the model is evaluated on a validation set to measure its accuracy and generalization performance. This mode can be used to tune the hyperparameters of the model to improve its performance.
|
||||
|
||||
[Val Examples](val.md){ .md-button .md-button--primary}
|
||||
|
||||
## [Predict](predict.md)
|
||||
|
||||
Predict mode is used for making predictions using a trained YOLOv8 model on new images or videos. In this mode, the model is loaded from a checkpoint file, and the user can provide images or videos to perform inference. The model predicts the classes and locations of objects in the input images or videos.
|
||||
|
||||
[Predict Examples](predict.md){ .md-button .md-button--primary}
|
||||
|
||||
## [Export](export.md)
|
||||
|
||||
Export mode is used for exporting a YOLOv8 model to a format that can be used for deployment. In this mode, the model is converted to a format that can be used by other software applications or hardware devices. This mode is useful when deploying the model to production environments.
|
||||
|
||||
[Export Examples](export.md){ .md-button .md-button--primary}
|
||||
|
||||
## [Track](track.md)
|
||||
|
||||
Track mode is used for tracking objects in real-time using a YOLOv8 model. In this mode, the model is loaded from a checkpoint file, and the user can provide a live video stream to perform real-time object tracking. This mode is useful for applications such as surveillance systems or self-driving cars.
|
||||
|
||||
[Track Examples](track.md){ .md-button .md-button--primary}
|
||||
|
||||
## [Benchmark](benchmark.md)
|
||||
|
||||
Benchmark mode is used to profile the speed and accuracy of various export formats for YOLOv8. The benchmarks provide information on the size of the exported format, its `mAP50-95` metrics (for object detection, segmentation and pose)
|
||||
or `accuracy_top5` metrics (for classification), and the inference time in milliseconds per image across various export formats like ONNX, OpenVINO, TensorRT and others. This information can help users choose the optimal export format for their specific use case based on their requirements for speed and accuracy.
|
||||
|
||||
[Benchmark Examples](benchmark.md){ .md-button .md-button--primary}
|
||||
715
docs/en/modes/predict.md
Normal file
|
|
@ -0,0 +1,715 @@
|
|||
---
|
||||
comments: true
|
||||
description: Discover how to use YOLOv8 predict mode for various tasks. Learn about different inference sources like images, videos, and data formats.
|
||||
keywords: Ultralytics, YOLOv8, predict mode, inference sources, prediction tasks, streaming mode, image processing, video processing, machine learning, AI
|
||||
---
|
||||
|
||||
# Model Prediction with Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Introduction
|
||||
|
||||
In the world of machine learning and computer vision, the process of making sense out of visual data is called 'inference' or 'prediction'. Ultralytics YOLOv8 offers a powerful feature known as **predict mode** that is tailored for high-performance, real-time inference on a wide range of data sources.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/QtsI0TnwDZs?si=ljesw75cMO2Eas14"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> How to Extract the Outputs from Ultralytics YOLOv8 Model for Custom Projects.
|
||||
</p>
|
||||
|
||||
## Real-world Applications
|
||||
|
||||
| Manufacturing | Sports | Safety |
|
||||
|:-----------------------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------:|
|
||||
|  |  |  |
|
||||
| Vehicle Spare Parts Detection | Football Player Detection | People Fall Detection |
|
||||
|
||||
## Why Use Ultralytics YOLO for Inference?
|
||||
|
||||
Here's why you should consider YOLOv8's predict mode for your various inference needs:
|
||||
|
||||
- **Versatility:** Capable of making inferences on images, videos, and even live streams.
|
||||
- **Performance:** Engineered for real-time, high-speed processing without sacrificing accuracy.
|
||||
- **Ease of Use:** Intuitive Python and CLI interfaces for rapid deployment and testing.
|
||||
- **Highly Customizable:** Various settings and parameters to tune the model's inference behavior according to your specific requirements.
|
||||
|
||||
### Key Features of Predict Mode
|
||||
|
||||
YOLOv8's predict mode is designed to be robust and versatile, featuring:
|
||||
|
||||
- **Multiple Data Source Compatibility:** Whether your data is in the form of individual images, a collection of images, video files, or real-time video streams, predict mode has you covered.
|
||||
- **Streaming Mode:** Use the streaming feature to generate a memory-efficient generator of `Results` objects. Enable this by setting `stream=True` in the predictor's call method.
|
||||
- **Batch Processing:** Process multiple images or video frames in a single batch to speed up inference.
|
||||
- **Integration Friendly:** Easily integrate with existing data pipelines and other software components, thanks to its flexible API.
|
||||
|
||||
Ultralytics YOLO models return either a Python list of `Results` objects, or a memory-efficient Python generator of `Results` objects when `stream=True` is passed to the model during inference:
|
||||
|
||||
!!! example "Predict"
|
||||
|
||||
=== "Return a list with `stream=False`"
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # pretrained YOLOv8n model
|
||||
|
||||
# Run batched inference on a list of images
|
||||
results = model(['im1.jpg', 'im2.jpg']) # return a list of Results objects
|
||||
|
||||
# Process results list
|
||||
for result in results:
    boxes = result.boxes  # Boxes object for bbox outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
|
||||
```
|
||||
|
||||
=== "Return a generator with `stream=True`"
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # pretrained YOLOv8n model
|
||||
|
||||
# Run batched inference on a list of images
|
||||
results = model(['im1.jpg', 'im2.jpg'], stream=True) # return a generator of Results objects
|
||||
|
||||
# Process results generator
|
||||
for result in results:
    boxes = result.boxes  # Boxes object for bbox outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
|
||||
```
|
||||
|
||||
## Inference Sources
|
||||
|
||||
YOLOv8 can process different types of input sources for inference, as shown in the table below. The sources include static images, video streams, and various data formats. The table also indicates whether each source can be used in streaming mode with the argument `stream=True` ✅. Streaming mode is beneficial for processing videos or live streams as it creates a generator of results instead of loading all frames into memory.
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
Use `stream=True` for processing long videos or large datasets to efficiently manage memory. When `stream=False`, the results for all frames or data points are stored in memory, which can quickly add up and cause out-of-memory errors for large inputs. In contrast, `stream=True` utilizes a generator, which only keeps the results of the current frame or data point in memory, significantly reducing memory consumption and preventing out-of-memory issues.
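
The memory difference described in the tip comes from the general list-versus-generator pattern in Python. The sketch below illustrates that pattern only: `fake_inference`, `predict_list`, and `predict_stream` are hypothetical stand-ins, not Ultralytics functions, and no model is loaded.

```python
from typing import Iterator, List


def fake_inference(frame_id: int) -> dict:
    """Stand-in for one model call; returns a small placeholder result."""
    return {"frame": frame_id, "detections": frame_id % 3}


def predict_list(num_frames: int) -> List[dict]:
    """Eager pattern (analogous to stream=False): all results are held in memory at once."""
    return [fake_inference(i) for i in range(num_frames)]


def predict_stream(num_frames: int) -> Iterator[dict]:
    """Lazy pattern (analogous to stream=True): one result is produced per iteration."""
    for i in range(num_frames):
        yield fake_inference(i)


# Consume results one at a time; each result can be discarded after use
total = 0
for result in predict_stream(1000):
    total += result["detections"]

print(total)
```

Note that a generator can only be iterated once. If the results are needed again, either rerun inference or collect them into a list and accept the memory cost.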
|
||||
|
||||
| Source | Argument | Type | Notes |
|
||||
|----------------|--------------------------------------------|-----------------|---------------------------------------------------------------------------------------------|
|
||||
| image | `'image.jpg'` | `str` or `Path` | Single image file. |
|
||||
| URL | `'https://ultralytics.com/images/bus.jpg'` | `str` | URL to an image. |
|
||||
| screenshot | `'screen'` | `str` | Capture a screenshot. |
|
||||
| PIL | `Image.open('im.jpg')` | `PIL.Image` | HWC format with RGB channels. |
|
||||
| OpenCV | `cv2.imread('im.jpg')` | `np.ndarray` | HWC format with BGR channels `uint8 (0-255)`. |
|
||||
| numpy | `np.zeros((640,1280,3), dtype=np.uint8)` | `np.ndarray` | HWC format with BGR channels `uint8 (0-255)`. |
|
||||
| torch | `torch.zeros(16,3,320,640)` | `torch.Tensor` | BCHW format with RGB channels `float32 (0.0-1.0)`. |
|
||||
| CSV | `'sources.csv'` | `str` or `Path` | CSV file containing paths to images, videos, or directories. |
|
||||
| video ✅ | `'video.mp4'` | `str` or `Path` | Video file in formats like MP4, AVI, etc. |
|
||||
| directory ✅ | `'path/'` | `str` or `Path` | Path to a directory containing images or videos. |
|
||||
| glob ✅ | `'path/*.jpg'` | `str` | Glob pattern to match multiple files. Use the `*` character as a wildcard. |
|
||||
| YouTube ✅ | `'https://youtu.be/LNwODJXcvt4'` | `str` | URL to a YouTube video. |
|
||||
| stream ✅ | `'rtsp://example.com/media.mp4'` | `str` | URL for streaming protocols such as RTSP, RTMP, TCP, or an IP address. |
|
||||
| multi-stream ✅ | `'list.streams'` | `str` or `Path` | `*.streams` text file with one stream URL per row, i.e. 8 streams will run at batch-size 8. |
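
To make the multi-stream row concrete, here is a minimal sketch of what a `*.streams` file looks like and how the number of rows relates to the batch size. The helper `read_streams` is hypothetical and written only for illustration, the URLs are placeholders, and the parsing shown (skipping blank lines) is an assumption rather than the library's own loader.

```python
import tempfile
from pathlib import Path


def read_streams(path: Path) -> list:
    """Return the non-empty lines of a *.streams file, one stream URL per row."""
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


# Write a throwaway file with placeholder addresses (not real streams)
tmp_dir = Path(tempfile.mkdtemp())
streams_file = tmp_dir / "list.streams"
streams_file.write_text(
    "rtsp://example.com/media1.mp4\n"
    "rtsp://example.com/media2.mp4\n"
    "rtsp://example.com/media3.mp4\n"
)

urls = read_streams(streams_file)
batch_size = len(urls)  # per the table: N streams run at batch-size N
print(batch_size, urls)
```

The path to such a file is what would be passed as `source` for multi-stream inference.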
|
||||
|
||||
Below are code examples for using each source type:
|
||||
|
||||
!!! example "Prediction sources"
|
||||
|
||||
=== "image"
|
||||
Run inference on an image file.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define path to the image file
|
||||
source = 'path/to/image.jpg'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "screenshot"
|
||||
Run inference on the current screen content as a screenshot.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define current screenshot as source
|
||||
source = 'screen'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "URL"
|
||||
Run inference on an image or video hosted remotely via URL.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define remote image or video URL
|
||||
source = 'https://ultralytics.com/images/bus.jpg'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "PIL"
|
||||
Run inference on an image opened with Python Imaging Library (PIL).
|
||||
```python
|
||||
from PIL import Image
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Open an image using PIL
|
||||
source = Image.open('path/to/image.jpg')
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "OpenCV"
|
||||
Run inference on an image read with OpenCV.
|
||||
```python
|
||||
import cv2
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Read an image using OpenCV
|
||||
source = cv2.imread('path/to/image.jpg')
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "numpy"
|
||||
Run inference on an image represented as a numpy array.
|
||||
```python
|
||||
import numpy as np
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Create a random numpy array of HWC shape (640, 640, 3) with values in range [0, 255] and type uint8
|
||||
source = np.random.randint(low=0, high=256, size=(640, 640, 3), dtype='uint8')  # `high` is exclusive
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "torch"
|
||||
Run inference on an image represented as a PyTorch tensor.
|
||||
```python
|
||||
import torch
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
|
||||
source = torch.rand(1, 3, 640, 640, dtype=torch.float32)
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "CSV"
|
||||
Run inference on a collection of images, URLs, videos and directories listed in a CSV file.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define a path to a CSV file with images, URLs, videos and directories
|
||||
source = 'path/to/file.csv'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source) # list of Results objects
|
||||
```
|
||||
|
||||
=== "video"
|
||||
Run inference on a video file. By using `stream=True`, you can create a generator of Results objects to reduce memory usage.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define path to video file
|
||||
source = 'path/to/video.mp4'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source, stream=True) # generator of Results objects
|
||||
```
|
||||
|
||||
=== "directory"
|
||||
Run inference on all images and videos in a directory. To also capture images and videos in subdirectories use a glob pattern, i.e. `path/to/dir/**/*`.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define path to directory containing images and videos for inference
|
||||
source = 'path/to/dir'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source, stream=True) # generator of Results objects
|
||||
```
|
||||
|
||||
=== "glob"
|
||||
Run inference on all images and videos that match a glob expression with `*` characters.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define a glob search for all JPG files in a directory
|
||||
source = 'path/to/dir/*.jpg'
|
||||
|
||||
# OR define a recursive glob search for all JPG files including subdirectories
|
||||
source = 'path/to/dir/**/*.jpg'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source, stream=True) # generator of Results objects
|
||||
```
|
||||
|
||||
=== "YouTube"
|
||||
Run inference on a YouTube video. By using `stream=True`, you can create a generator of Results objects to reduce memory usage for long videos.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Define source as YouTube video URL
|
||||
source = 'https://youtu.be/LNwODJXcvt4'
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source, stream=True) # generator of Results objects
|
||||
```
|
||||
|
||||
=== "Streams"
|
||||
Run inference on remote streaming sources using RTSP, RTMP, TCP, or IP address protocols. If multiple streams are listed in a `*.streams` text file, batched inference runs with one batch slot per stream (e.g. 8 streams run at batch-size 8); a single stream runs at batch-size 1.
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Single stream with batch-size 1 inference
|
||||
source = 'rtsp://example.com/media.mp4' # RTSP, RTMP, TCP or IP streaming address
|
||||
|
||||
# OR multiple streams with batched inference (e.g. batch-size 8 for 8 streams)
|
||||
source = 'path/to/list.streams' # *.streams text file with one streaming address per row
|
||||
|
||||
# Run inference on the source
|
||||
results = model(source, stream=True) # generator of Results objects
|
||||
```
|
||||
|
||||
## Inference Arguments
|
||||
|
||||
`model.predict()` accepts multiple arguments that can be passed at inference time to override defaults:
|
||||
|
||||
!!! example
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Run inference on 'bus.jpg' with arguments
|
||||
model.predict('bus.jpg', save=True, imgsz=320, conf=0.5)
|
||||
```
|
||||
|
||||
All supported arguments:
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|-----------------|----------------|------------------------|--------------------------------------------------------------------------------|
|
||||
| `source` | `str` | `'ultralytics/assets'` | source directory for images or videos |
|
||||
| `conf` | `float` | `0.25` | object confidence threshold for detection |
|
||||
| `iou` | `float` | `0.7` | intersection over union (IoU) threshold for NMS |
|
||||
| `imgsz` | `int or tuple` | `640` | image size as a scalar or an (h, w) tuple, e.g. (640, 480) |
|
||||
| `half` | `bool` | `False` | use half precision (FP16) |
|
||||
| `device` | `None or str` | `None` | device to run on, i.e. cuda device=0/1/2/3 or device=cpu |
|
||||
| `show` | `bool` | `False` | show results if possible |
|
||||
| `save` | `bool` | `False` | save images with results |
|
||||
| `save_txt` | `bool` | `False` | save results as .txt file |
|
||||
| `save_conf` | `bool` | `False` | save results with confidence scores |
|
||||
| `save_crop` | `bool` | `False` | save cropped images with results |
|
||||
| `hide_labels` | `bool` | `False` | hide labels |
|
||||
| `hide_conf` | `bool` | `False` | hide confidence scores |
|
||||
| `max_det` | `int` | `300` | maximum number of detections per image |
|
||||
| `vid_stride` | `int` | `1` | video frame-rate stride |
|
||||
| `stream_buffer` | `bool` | `False` | buffer all streaming frames (True) or return the most recent frame (False) |
|
||||
| `line_width` | `None or int` | `None` | The line width of the bounding boxes. If None, it is scaled to the image size. |
|
||||
| `visualize` | `bool` | `False` | visualize model features |
|
||||
| `augment` | `bool` | `False` | apply image augmentation to prediction sources |
|
||||
| `agnostic_nms` | `bool` | `False` | class-agnostic NMS |
|
||||
| `retina_masks` | `bool` | `False` | use high-resolution segmentation masks |
|
||||
| `classes` | `None or list` | `None` | filter results by class, i.e. classes=0, or classes=[0,2,3] |
|
||||
| `boxes` | `bool` | `True` | Show boxes in segmentation predictions |
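
The `conf`, `iou`, and `max_det` arguments interact during post-processing. The sketch below shows one common way such thresholds are applied, using a greedy non-maximum suppression loop in plain Python. It is an illustration under stated assumptions and not the Ultralytics implementation: `box_iou` and `filter_detections` are hypothetical helpers, boxes are assumed to be in `(x1, y1, x2, y2)` format, and the defaults mirror the table above.

```python
def box_iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf=0.25, iou=0.7, max_det=300):
    """Greedy NMS over (box, score) pairs; illustrative only."""
    # 1. Drop detections below the confidence threshold, highest score first
    candidates = sorted((d for d in dets if d[1] >= conf), key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        if len(kept) >= max_det:
            break  # 3. Cap the number of detections per image
        # 2. Keep a box only if it does not overlap a kept box by more than `iou`
        if all(box_iou(box, k[0]) <= iou for k in kept):
            kept.append((box, score))
    return kept


dets = [
    ((0, 0, 100, 100), 0.90),
    ((5, 5, 105, 105), 0.80),  # heavy overlap with the first box
    ((200, 200, 300, 300), 0.60),
    ((0, 0, 50, 50), 0.10),  # below the confidence threshold
]
kept = filter_detections(dets, conf=0.25, iou=0.7)
print(kept)
```

In this sketch, raising `iou` keeps more overlapping boxes, while raising `conf` discards more low-scoring ones before suppression runs.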
|
||||
|
||||
## Image and Video Formats
|
||||
|
||||
YOLOv8 supports various image and video formats, as specified in [data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/utils.py). See the tables below for the valid suffixes and example predict commands.
|
||||
|
||||
### Images
|
||||
|
||||
The table below lists the image formats that Ultralytics accepts.
|
||||
|
||||
| Image Suffixes | Example Predict Command | Reference |
|
||||
|----------------|----------------------------------|-------------------------------------------------------------------------------|
|
||||
| .bmp | `yolo predict source=image.bmp` | [Microsoft BMP File Format](https://en.wikipedia.org/wiki/BMP_file_format) |
|
||||
| .dng | `yolo predict source=image.dng` | [Adobe DNG](https://www.adobe.com/products/photoshop/extend.displayTab2.html) |
|
||||
| .jpeg | `yolo predict source=image.jpeg` | [JPEG](https://en.wikipedia.org/wiki/JPEG) |
|
||||
| .jpg | `yolo predict source=image.jpg` | [JPEG](https://en.wikipedia.org/wiki/JPEG) |
|
||||
| .mpo | `yolo predict source=image.mpo` | [Multi Picture Object](https://fileinfo.com/extension/mpo) |
|
||||
| .png | `yolo predict source=image.png` | [Portable Network Graphics](https://en.wikipedia.org/wiki/PNG) |
|
||||
| .tif | `yolo predict source=image.tif` | [Tag Image File Format](https://en.wikipedia.org/wiki/TIFF) |
|
||||
| .tiff | `yolo predict source=image.tiff` | [Tag Image File Format](https://en.wikipedia.org/wiki/TIFF) |
|
||||
| .webp | `yolo predict source=image.webp` | [WebP](https://en.wikipedia.org/wiki/WebP) |
|
||||
| .pfm | `yolo predict source=image.pfm` | [Portable FloatMap](https://en.wikipedia.org/wiki/Netpbm#File_formats) |
|
||||
|
||||
### Videos
|
||||
|
||||
The table below lists the video formats that Ultralytics accepts.
|
||||
|
||||
| Video Suffixes | Example Predict Command | Reference |
|
||||
|----------------|----------------------------------|----------------------------------------------------------------------------------|
|
||||
| .asf | `yolo predict source=video.asf` | [Advanced Systems Format](https://en.wikipedia.org/wiki/Advanced_Systems_Format) |
|
||||
| .avi | `yolo predict source=video.avi` | [Audio Video Interleave](https://en.wikipedia.org/wiki/Audio_Video_Interleave) |
|
||||
| .gif | `yolo predict source=video.gif` | [Graphics Interchange Format](https://en.wikipedia.org/wiki/GIF) |
|
||||
| .m4v | `yolo predict source=video.m4v` | [MPEG-4 Part 14](https://en.wikipedia.org/wiki/M4V) |
|
||||
| .mkv | `yolo predict source=video.mkv` | [Matroska](https://en.wikipedia.org/wiki/Matroska) |
|
||||
| .mov | `yolo predict source=video.mov` | [QuickTime File Format](https://en.wikipedia.org/wiki/QuickTime_File_Format) |
|
||||
| .mp4 | `yolo predict source=video.mp4` | [MPEG-4 Part 14](https://en.wikipedia.org/wiki/MPEG-4_Part_14) |
|
||||
| .mpeg | `yolo predict source=video.mpeg` | [MPEG-1 Part 2](https://en.wikipedia.org/wiki/MPEG-1) |
|
||||
| .mpg | `yolo predict source=video.mpg` | [MPEG-1 Part 2](https://en.wikipedia.org/wiki/MPEG-1) |
|
||||
| .ts | `yolo predict source=video.ts` | [MPEG Transport Stream](https://en.wikipedia.org/wiki/MPEG_transport_stream) |
|
||||
| .wmv | `yolo predict source=video.wmv` | [Windows Media Video](https://en.wikipedia.org/wiki/Windows_Media_Video) |
|
||||
| .webm | `yolo predict source=video.webm` | [WebM Project](https://en.wikipedia.org/wiki/WebM) |
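
Before passing a large set of files to predict mode, it can help to check suffixes against the two tables above. The following is a minimal sketch: the suffix sets are copied from these tables (the authoritative lists live in `data/utils.py` and may change), and `media_kind` is a hypothetical helper, not an Ultralytics function.

```python
from pathlib import Path

# Suffixes copied from the image and video tables above
IMAGE_SUFFIXES = {".bmp", ".dng", ".jpeg", ".jpg", ".mpo", ".png", ".tif", ".tiff", ".webp", ".pfm"}
VIDEO_SUFFIXES = {".asf", ".avi", ".gif", ".m4v", ".mkv", ".mov", ".mp4", ".mpeg", ".mpg", ".ts", ".wmv", ".webm"}


def media_kind(path: str) -> str:
    """Classify a file as 'image', 'video', or 'unsupported' by its suffix (case-insensitive)."""
    suffix = Path(path).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return "image"
    if suffix in VIDEO_SUFFIXES:
        return "video"
    return "unsupported"


for name in ["bus.JPG", "clip.mp4", "notes.txt"]:
    print(name, media_kind(name))
```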
|
||||
|
||||
## Working with Results
|
||||
|
||||
By default, Ultralytics `predict()` calls return a list of `Results` objects (or a generator of `Results` objects when `stream=True` is passed):
|
||||
|
||||
!!! example "Results"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Run inference on an image
|
||||
results = model('bus.jpg') # list of 1 Results object
|
||||
results = model(['bus.jpg', 'zidane.jpg']) # list of 2 Results objects
|
||||
```
|
||||
|
||||
`Results` objects have the following attributes:
|
||||
|
||||
| Attribute | Type | Description |
|
||||
|--------------|-----------------------|------------------------------------------------------------------------------------------|
|
||||
| `orig_img` | `numpy.ndarray` | The original image as a numpy array. |
|
||||
| `orig_shape` | `tuple` | The original image shape in (height, width) format. |
|
||||
| `boxes` | `Boxes, optional` | A Boxes object containing the detection bounding boxes. |
|
||||
| `masks` | `Masks, optional` | A Masks object containing the detection masks. |
|
||||
| `probs` | `Probs, optional` | A Probs object containing probabilities of each class for classification task. |
|
||||
| `keypoints` | `Keypoints, optional` | A Keypoints object containing detected keypoints for each object. |
|
||||
| `speed` | `dict` | A dictionary of preprocess, inference, and postprocess speeds in milliseconds per image. |
|
||||
| `names` | `dict` | A dictionary of class names. |
|
||||
| `path` | `str` | The path to the image file. |
|
||||
|
||||
`Results` objects have the following methods:
|
||||
|
||||
| Method | Return Type | Description |
|
||||
|-----------------|-----------------|-------------------------------------------------------------------------------------|
|
||||
| `__getitem__()` | `Results` | Return a Results object for the specified index. |
|
||||
| `__len__()` | `int` | Return the number of detections in the Results object. |
|
||||
| `update()` | `None` | Update the boxes, masks, and probs attributes of the Results object. |
|
||||
| `cpu()` | `Results` | Return a copy of the Results object with all tensors on CPU memory. |
|
||||
| `numpy()` | `Results` | Return a copy of the Results object with all tensors as numpy arrays. |
|
||||
| `cuda()` | `Results` | Return a copy of the Results object with all tensors on GPU memory. |
|
||||
| `to()` | `Results` | Return a copy of the Results object with tensors on the specified device and dtype. |
|
||||
| `new()` | `Results` | Return a new Results object with the same image, path, and names. |
|
||||
| `keys()` | `List[str]` | Return a list of non-empty attribute names. |
|
||||
| `plot()` | `numpy.ndarray` | Plots the detection results. Returns a numpy array of the annotated image. |
|
||||
| `verbose()` | `str` | Return log string for each task. |
|
||||
| `save_txt()` | `None` | Save predictions into a txt file. |
|
||||
| `save_crop()` | `None` | Save cropped predictions to `save_dir/cls/file_name.jpg`. |
|
||||
| `tojson()` | `str` | Convert the object to JSON format. |
|
||||
|
||||
For more details see the `Results` class [documentation](../reference/engine/results.md).
|
||||
|
||||
### Boxes
|
||||
|
||||
The `Boxes` object can be used to index, manipulate, and convert bounding boxes to different formats.
|
||||
|
||||
!!! example "Boxes"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Run inference on an image
|
||||
results = model('bus.jpg') # results list
|
||||
|
||||
# View results
|
||||
for r in results:
    print(r.boxes)  # print the Boxes object containing the detection bounding boxes
|
||||
```
|
||||
|
||||
Here is a table for the `Boxes` class methods and properties, including their name, type, and description:
|
||||
|
||||
| Name | Type | Description |
|
||||
|-----------|---------------------------|--------------------------------------------------------------------|
|
||||
| `cpu()` | Method | Move the object to CPU memory. |
|
||||
| `numpy()` | Method | Convert the object to a numpy array. |
|
||||
| `cuda()` | Method | Move the object to CUDA memory. |
|
||||
| `to()` | Method | Move the object to the specified device. |
|
||||
| `xyxy` | Property (`torch.Tensor`) | Return the boxes in xyxy format. |
|
||||
| `conf` | Property (`torch.Tensor`) | Return the confidence values of the boxes. |
|
||||
| `cls` | Property (`torch.Tensor`) | Return the class values of the boxes. |
|
||||
| `id` | Property (`torch.Tensor`) | Return the track IDs of the boxes (if available). |
|
||||
| `xywh` | Property (`torch.Tensor`) | Return the boxes in xywh format. |
|
||||
| `xyxyn` | Property (`torch.Tensor`) | Return the boxes in xyxy format normalized by original image size. |
|
||||
| `xywhn` | Property (`torch.Tensor`) | Return the boxes in xywh format normalized by original image size. |
|
||||
|
||||
For more details see the `Boxes` class [documentation](../reference/engine/results.md).
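
The relationship between the `xyxy`, `xywh`, `xyxyn`, and `xywhn` properties can be sketched with plain arithmetic. This example assumes that `xywh` means (center x, center y, width, height) and that normalization divides x values by the original image width and y values by the original image height. The helpers `xyxy_to_xywh` and `normalize` are hypothetical and operate on plain tuples, not on `Boxes` objects.

```python
def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) corners to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box, orig_shape):
    """Scale a 4-value box by the original image size given as (height, width)."""
    h, w = orig_shape
    a, b, c, d = box
    # Positions 0 and 2 are x-like values, positions 1 and 3 are y-like values
    return (a / w, b / h, c / w, d / h)


orig_shape = (800, 1000)  # (height, width), same order as Results.orig_shape
xyxy = (100, 200, 300, 400)

xywh = xyxy_to_xywh(xyxy)
xyxyn = normalize(xyxy, orig_shape)
xywhn = normalize(xywh, orig_shape)
print(xywh, xyxyn, xywhn)
```

Normalized coordinates are independent of image resolution, which is why they are commonly used when saving labels to text files.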
|
||||
|
||||
### Masks
|
||||
|
||||
The `Masks` object can be used to index, manipulate, and convert masks to segments.
|
||||
|
||||
!!! example "Masks"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n-seg Segment model
|
||||
model = YOLO('yolov8n-seg.pt')
|
||||
|
||||
# Run inference on an image
|
||||
results = model('bus.jpg') # results list
|
||||
|
||||
# View results
|
||||
for r in results:
    print(r.masks)  # print the Masks object containing the detected instance masks
|
||||
```
|
||||
|
||||
Here is a table for the `Masks` class methods and properties, including their name, type, and description:
|
||||
|
||||
| Name | Type | Description |
|
||||
|-----------|---------------------------|-----------------------------------------------------------------|
|
||||
| `cpu()` | Method | Returns the masks tensor on CPU memory. |
|
||||
| `numpy()` | Method | Returns the masks tensor as a numpy array. |
|
||||
| `cuda()` | Method | Returns the masks tensor on GPU memory. |
|
||||
| `to()` | Method | Returns the masks tensor with the specified device and dtype. |
|
||||
| `xyn` | Property (`torch.Tensor`) | A list of normalized segments represented as tensors. |
|
||||
| `xy` | Property (`torch.Tensor`) | A list of segments in pixel coordinates represented as tensors. |
|
||||
|
||||
For more details see the `Masks` class [documentation](../reference/engine/results.md).
|
||||
|
||||
### Keypoints
|
||||
|
||||
A `Keypoints` object can be used to index, manipulate, and normalize coordinates.
|
||||
|
||||
!!! example "Keypoints"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n-pose Pose model
|
||||
model = YOLO('yolov8n-pose.pt')
|
||||
|
||||
# Run inference on an image
|
||||
results = model('bus.jpg') # results list
|
||||
|
||||
# View results
|
||||
for r in results:
|
||||
print(r.keypoints) # print the Keypoints object containing the detected keypoints
|
||||
```
|
||||
|
||||
Here is a table for the `Keypoints` class methods and properties, including their name, type, and description:
|
||||
|
||||
| Name | Type | Description |
|
||||
|-----------|---------------------------|-------------------------------------------------------------------|
|
||||
| `cpu()` | Method | Returns the keypoints tensor on CPU memory. |
|
||||
| `numpy()` | Method | Returns the keypoints tensor as a numpy array. |
|
||||
| `cuda()` | Method | Returns the keypoints tensor on GPU memory. |
|
||||
| `to()` | Method | Returns the keypoints tensor with the specified device and dtype. |
|
||||
| `xyn` | Property (`torch.Tensor`) | A list of normalized keypoints represented as tensors. |
|
||||
| `xy` | Property (`torch.Tensor`) | A list of keypoints in pixel coordinates represented as tensors. |
|
||||
| `conf` | Property (`torch.Tensor`) | Returns confidence values of keypoints if available, else None. |
|
||||
|
||||
For more details see the `Keypoints` class [documentation](../reference/engine/results.md).
|
||||
|
||||
### Probs
|
||||
|
||||
A `Probs` object can be used to index classification results and to get the `top1` and `top5` indices and scores.
|
||||
|
||||
!!! example "Probs"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n-cls Classify model
|
||||
model = YOLO('yolov8n-cls.pt')
|
||||
|
||||
# Run inference on an image
|
||||
results = model('bus.jpg') # results list
|
||||
|
||||
# View results
|
||||
for r in results:
|
||||
print(r.probs) # print the Probs object containing the detected class probabilities
|
||||
```
|
||||
|
||||
Here's a table summarizing the methods and properties for the `Probs` class:
|
||||
|
||||
| Name | Type | Description |
|
||||
|------------|---------------------------|-------------------------------------------------------------------------|
|
||||
| `cpu()` | Method | Returns a copy of the probs tensor on CPU memory. |
|
||||
| `numpy()` | Method | Returns a copy of the probs tensor as a numpy array. |
|
||||
| `cuda()` | Method | Returns a copy of the probs tensor on GPU memory. |
|
||||
| `to()` | Method | Returns a copy of the probs tensor with the specified device and dtype. |
|
||||
| `top1` | Property (`int`) | Index of the top 1 class. |
|
||||
| `top5` | Property (`list[int]`) | Indices of the top 5 classes. |
|
||||
| `top1conf` | Property (`torch.Tensor`) | Confidence of the top 1 class. |
|
||||
| `top5conf` | Property (`torch.Tensor`) | Confidences of the top 5 classes. |
|
||||
|
||||
For more details see the `Probs` class [documentation](../reference/engine/results.md).
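
To make the meaning of `top1`, `top5`, `top1conf` and `top5conf` concrete, here is a minimal sketch on a plain Python list. The probabilities are made up and the helper `top_k` is hypothetical; the real `Probs` object computes these values for you.

```python
def top_k(probs, k=5):
    """Return (indices, scores) of the k largest probabilities, highest first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return order, [probs[i] for i in order]


probs = [0.02, 0.50, 0.08, 0.25, 0.01, 0.10, 0.04]  # made-up class probabilities
top5, top5conf = top_k(probs, k=5)
top1, top1conf = top5[0], top5conf[0]

print(top1, top1conf)  # 1 0.5
print(top5)            # [1, 3, 5, 2, 6]
```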
|
||||
|
||||
## Plotting Results
|
||||
|
||||
You can use the `plot()` method of a `Results` object to visualize predictions. It plots all prediction types (boxes, masks, keypoints, probabilities, etc.) contained in the `Results` object onto a numpy array that can then be shown or saved.
|
||||
|
||||
!!! example "Plotting"
|
||||
|
||||
```python
|
||||
from PIL import Image
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a pretrained YOLOv8n model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Run inference on 'bus.jpg'
|
||||
results = model('bus.jpg') # results list
|
||||
|
||||
# Show the results
|
||||
for r in results:
|
||||
im_array = r.plot() # plot a BGR numpy array of predictions
|
||||
im = Image.fromarray(im_array[..., ::-1]) # RGB PIL image
|
||||
im.show() # show image
|
||||
im.save('results.jpg') # save image
|
||||
```
|
||||
|
||||
The `plot()` method supports the following arguments:
|
||||
|
||||
| Argument | Type | Description | Default |
|
||||
|--------------|-----------------|--------------------------------------------------------------------------------|---------------|
|
||||
| `conf` | `bool` | Whether to plot the detection confidence score. | `True` |
|
||||
| `line_width` | `float` | The line width of the bounding boxes. If None, it is scaled to the image size. | `None` |
|
||||
| `font_size` | `float` | The font size of the text. If None, it is scaled to the image size. | `None` |
|
||||
| `font` | `str` | The font to use for the text. | `'Arial.ttf'` |
|
||||
| `pil` | `bool` | Whether to return the image as a PIL Image. | `False` |
|
||||
| `img` | `numpy.ndarray` | Image to plot on. If None, plot on the original image. | `None` |
|
||||
| `im_gpu` | `torch.Tensor` | Normalized image in gpu with shape (1, 3, 640, 640), for faster mask plotting. | `None` |
|
||||
| `kpt_radius` | `int` | Radius of the drawn keypoints. Default is 5. | `5` |
|
||||
| `kpt_line` | `bool` | Whether to draw lines connecting keypoints. | `True` |
|
||||
| `labels` | `bool` | Whether to plot the label of bounding boxes. | `True` |
|
||||
| `boxes` | `bool` | Whether to plot the bounding boxes. | `True` |
|
||||
| `masks` | `bool` | Whether to plot the masks. | `True` |
|
||||
| `probs` | `bool` | Whether to plot the classification probability. | `True` |
|
||||
|
||||
## Thread-Safe Inference
|
||||
|
||||
Ensuring thread safety during inference is crucial when you are running multiple YOLO models in parallel across different threads. Thread-safe inference guarantees that each thread's predictions are isolated and do not interfere with one another, avoiding race conditions and ensuring consistent and reliable outputs.
|
||||
|
||||
When using YOLO models in a multi-threaded application, it's important to instantiate separate model objects for each thread or employ thread-local storage to prevent conflicts:
|
||||
|
||||
!!! example "Thread-Safe Inference"
|
||||
|
||||
Instantiate a separate model inside each thread for thread-safe inference:
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
from threading import Thread
|
||||
|
||||
def thread_safe_predict(image_path):
|
||||
# Instantiate a new model inside the thread
|
||||
local_model = YOLO("yolov8n.pt")
|
||||
results = local_model.predict(image_path)
|
||||
# Process results
|
||||
|
||||
|
||||
# Starting threads that each have their own model instance
|
||||
Thread(target=thread_safe_predict, args=("image1.jpg",)).start()
|
||||
Thread(target=thread_safe_predict, args=("image2.jpg",)).start()
|
||||
```
|
||||
|
||||
For an in-depth look at thread-safe inference with YOLO models and step-by-step instructions, please refer to our [YOLO Thread-Safe Inference Guide](../guides/yolo-thread-safe-inference.md). This guide will provide you with all the necessary information to avoid common pitfalls and ensure that your multi-threaded inference runs smoothly.
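
The thread-local storage option mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the approach from the linked guide: `StandInModel` is a hypothetical placeholder used so the snippet runs without downloading weights, and `get_model` is a hypothetical helper. In real code you would construct `YOLO(weights)` where the placeholder is created.

```python
import threading

_local = threading.local()  # each thread sees its own attributes on this object


class StandInModel:
    """Hypothetical placeholder for a YOLO model (assumption for this sketch)."""

    def __init__(self, weights):
        self.weights = weights

    def predict(self, source):
        return f"{self.weights} -> {source}"


def get_model(weights="yolov8n.pt"):
    """Return the calling thread's model, creating it on first use."""
    if not hasattr(_local, "model"):
        _local.model = StandInModel(weights)  # replace with YOLO(weights) in real code
    return _local.model


models_seen = {}  # keeps a reference to the model each thread used


def worker(name, sources):
    for src in sources:
        get_model().predict(src)  # the same per-thread model is reused for every source
    models_seen[name] = get_model()


threads = [threading.Thread(target=worker, args=(f"t{i}", ["a.jpg", "b.jpg"])) for i in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(models_seen["t1"] is models_seen["t2"])  # False: one model per thread
```

Compared with creating a model on every call, this keeps one model per thread for the thread's lifetime, which avoids reloading weights for each image.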
|
||||
|
||||
## Streaming Source `for`-loop
|
||||
|
||||
Here's a Python script using OpenCV (`cv2`) and YOLOv8 to run inference on video frames. This script assumes you have already installed the necessary packages (`opencv-python` and `ultralytics`).
|
||||
|
||||
!!! example "Streaming for-loop"
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load the YOLOv8 model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Open the video file
|
||||
video_path = "path/to/your/video/file.mp4"
|
||||
cap = cv2.VideoCapture(video_path)
|
||||
|
||||
# Loop through the video frames
|
||||
while cap.isOpened():
|
||||
# Read a frame from the video
|
||||
success, frame = cap.read()
|
||||
|
||||
if success:
|
||||
# Run YOLOv8 inference on the frame
|
||||
results = model(frame)
|
||||
|
||||
# Visualize the results on the frame
|
||||
annotated_frame = results[0].plot()
|
||||
|
||||
# Display the annotated frame
|
||||
cv2.imshow("YOLOv8 Inference", annotated_frame)
|
||||
|
||||
# Break the loop if 'q' is pressed
|
||||
if cv2.waitKey(1) & 0xFF == ord("q"):
|
||||
break
|
||||
else:
|
||||
# Break the loop if the end of the video is reached
|
||||
break
|
||||
|
||||
# Release the video capture object and close the display window
|
||||
cap.release()
|
||||
cv2.destroyAllWindows()
|
||||
```
|
||||
|
||||
This script will run predictions on each frame of the video, visualize the results, and display them in a window. The loop can be exited by pressing 'q'.
|
||||
354
docs/en/modes/track.md
Normal file
|
|
@ -0,0 +1,354 @@
|
|||
---
|
||||
comments: true
|
||||
description: Learn how to use Ultralytics YOLO for object tracking in video streams. Guides to use different trackers and customise tracker configurations.
|
||||
keywords: Ultralytics, YOLO, object tracking, video streams, BoT-SORT, ByteTrack, Python guide, CLI guide
|
||||
---
|
||||
|
||||
# Multi-Object Tracking with Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://user-images.githubusercontent.com/26833433/243418637-1d6250fd-1515-4c10-a844-a32818ae6d46.png" alt="Multi-object tracking examples">
|
||||
|
||||
Object tracking is a core video analytics task: it identifies the location and class of objects within each frame and also maintains a unique ID for each detected object as the video progresses. Applications range from surveillance and security to real-time sports analytics.
|
||||
|
||||
## Why Choose Ultralytics YOLO for Object Tracking?
|
||||
|
||||
The output from Ultralytics trackers is consistent with standard object detection but has the added value of object IDs. This makes it easy to track objects in video streams and perform subsequent analytics. Here's why you should consider using Ultralytics YOLO for your object tracking needs:
|
||||
|
||||
- **Efficiency:** Process video streams in real-time without compromising accuracy.
|
||||
- **Flexibility:** Supports multiple tracking algorithms and configurations.
|
||||
- **Ease of Use:** Simple Python API and CLI options for quick integration and deployment.
|
||||
- **Customizability:** Easy to use with custom trained YOLO models, allowing integration into domain-specific applications.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/hHyHmOtmEgs?si=VNZtXmm45Nb9s-N-"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> Object Detection and Tracking with Ultralytics YOLOv8.
|
||||
</p>
|
||||
|
||||
## Real-world Applications
|
||||
|
||||
| Transportation | Retail | Aquaculture |
|
||||
|:----------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
|
||||
|  |  |  |
|
||||
| Vehicle Tracking | People Tracking | Fish Tracking |
|
||||
|
||||
## Features at a Glance
|
||||
|
||||
Ultralytics YOLO extends its object detection features to provide robust and versatile object tracking:
|
||||
|
||||
- **Real-Time Tracking:** Seamlessly track objects in high-frame-rate videos.
|
||||
- **Multiple Tracker Support:** Choose from a variety of established tracking algorithms.
|
||||
- **Customizable Tracker Configurations:** Tailor the tracking algorithm to meet specific requirements by adjusting various parameters.
|
||||
|
||||
## Available Trackers
|
||||
|
||||
Ultralytics YOLO supports the following tracking algorithms. They can be enabled by passing the relevant YAML configuration file such as `tracker=tracker_type.yaml`:
|
||||
|
||||
* [BoT-SORT](https://github.com/NirAharon/BoT-SORT) - Use `botsort.yaml` to enable this tracker.
|
||||
* [ByteTrack](https://github.com/ifzhang/ByteTrack) - Use `bytetrack.yaml` to enable this tracker.
|
||||
|
||||
The default tracker is BoT-SORT.
|
||||
|
||||
## Tracking
|
||||
|
||||
To run the tracker on video streams, use a trained Detect, Segment or Pose model such as YOLOv8n, YOLOv8n-seg and YOLOv8n-pose.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load an official or custom model
|
||||
model = YOLO('yolov8n.pt') # Load an official Detect model
|
||||
model = YOLO('yolov8n-seg.pt') # Load an official Segment model
|
||||
model = YOLO('yolov8n-pose.pt') # Load an official Pose model
|
||||
model = YOLO('path/to/best.pt') # Load a custom trained model
|
||||
|
||||
# Perform tracking with the model
|
||||
results = model.track(source="https://youtu.be/LNwODJXcvt4", show=True) # Tracking with default tracker
|
||||
results = model.track(source="https://youtu.be/LNwODJXcvt4", show=True, tracker="bytetrack.yaml") # Tracking with ByteTrack tracker
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Perform tracking with various models using the command line interface
|
||||
yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" # Official Detect model
|
||||
yolo track model=yolov8n-seg.pt source="https://youtu.be/LNwODJXcvt4" # Official Segment model
|
||||
yolo track model=yolov8n-pose.pt source="https://youtu.be/LNwODJXcvt4" # Official Pose model
|
||||
yolo track model=path/to/best.pt source="https://youtu.be/LNwODJXcvt4" # Custom trained model
|
||||
|
||||
# Track using ByteTrack tracker
|
||||
yolo track model=path/to/best.pt tracker="bytetrack.yaml"
|
||||
```
|
||||
|
||||
As can be seen in the above usage, tracking is available for all Detect, Segment and Pose models run on videos or streaming sources.
|
||||
|
||||
## Configuration
|
||||
|
||||
### Tracking Arguments
|
||||
|
||||
Tracking configuration shares properties with Predict mode, such as `conf`, `iou`, and `show`. For further configurations, refer to the [Predict](https://docs.ultralytics.com/modes/predict/) mode page.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Configure the tracking parameters and run the tracker
|
||||
model = YOLO('yolov8n.pt')
|
||||
results = model.track(source="https://youtu.be/LNwODJXcvt4", conf=0.3, iou=0.5, show=True)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Configure tracking parameters and run the tracker using the command line interface
|
||||
yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" conf=0.3 iou=0.5 show
|
||||
```
|
||||
|
||||
### Tracker Selection
|
||||
|
||||
Ultralytics also allows you to use a modified tracker configuration file. To do this, copy a tracker config file from [ultralytics/cfg/trackers](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/trackers) (for example, to `custom_tracker.yaml`) and modify any settings except `tracker_type` as needed.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load the model and run the tracker with a custom configuration file
|
||||
model = YOLO('yolov8n.pt')
|
||||
results = model.track(source="https://youtu.be/LNwODJXcvt4", tracker='custom_tracker.yaml')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Load the model and run the tracker with a custom configuration file using the command line interface
|
||||
yolo track model=yolov8n.pt source="https://youtu.be/LNwODJXcvt4" tracker='custom_tracker.yaml'
|
||||
```
|
||||
|
||||
For a comprehensive list of tracking arguments, refer to the [ultralytics/cfg/trackers](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/trackers) page.
|
||||
|
||||
## Python Examples
|
||||
|
||||
### Persisting Tracks Loop
|
||||
|
||||
Here is a Python script using OpenCV (`cv2`) and YOLOv8 to run object tracking on video frames. This script still assumes you have already installed the necessary packages (`opencv-python` and `ultralytics`). The `persist=True` argument tells the tracker that the current image or frame is the next in a sequence and to expect tracks from the previous image in the current image.
|
||||
|
||||
!!! example "Streaming for-loop with tracking"
|
||||
|
||||
```python
|
||||
import cv2
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load the YOLOv8 model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Open the video file
|
||||
video_path = "path/to/video.mp4"
|
||||
cap = cv2.VideoCapture(video_path)
|
||||
|
||||
# Loop through the video frames
|
||||
while cap.isOpened():
|
||||
# Read a frame from the video
|
||||
success, frame = cap.read()
|
||||
|
||||
if success:
|
||||
# Run YOLOv8 tracking on the frame, persisting tracks between frames
|
||||
results = model.track(frame, persist=True)
|
||||
|
||||
# Visualize the results on the frame
|
||||
annotated_frame = results[0].plot()
|
||||
|
||||
# Display the annotated frame
|
||||
cv2.imshow("YOLOv8 Tracking", annotated_frame)
|
||||
|
||||
# Break the loop if 'q' is pressed
|
||||
if cv2.waitKey(1) & 0xFF == ord("q"):
|
||||
break
|
||||
else:
|
||||
# Break the loop if the end of the video is reached
|
||||
break
|
||||
|
||||
# Release the video capture object and close the display window
|
||||
cap.release()
|
||||
cv2.destroyAllWindows()
|
||||
```
|
||||
|
||||
Please note the change from `model(frame)` to `model.track(frame)`, which enables object tracking instead of simple detection. This modified script will run the tracker on each frame of the video, visualize the results, and display them in a window. The loop can be exited by pressing 'q'.
|
||||
|
||||
### Plotting Tracks Over Time
|
||||
|
||||
Visualizing object tracks over consecutive frames can provide valuable insights into the movement patterns and behavior of detected objects within a video. With Ultralytics YOLOv8, plotting these tracks is a seamless and efficient process.
|
||||
|
||||
In the following example, we demonstrate how to utilize YOLOv8's tracking capabilities to plot the movement of detected objects across multiple video frames. This script involves opening a video file, reading it frame by frame, and utilizing the YOLO model to identify and track various objects. By retaining the center points of the detected bounding boxes and connecting them, we can draw lines that represent the paths followed by the tracked objects.
|
||||
|
||||
!!! example "Plotting tracks over multiple video frames"
|
||||
|
||||
```python
|
||||
from collections import defaultdict
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load the YOLOv8 model
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Open the video file
|
||||
video_path = "path/to/video.mp4"
|
||||
cap = cv2.VideoCapture(video_path)
|
||||
|
||||
# Store the track history
|
||||
track_history = defaultdict(lambda: [])
|
||||
|
||||
# Loop through the video frames
|
||||
while cap.isOpened():
|
||||
# Read a frame from the video
|
||||
success, frame = cap.read()
|
||||
|
||||
if success:
|
||||
# Run YOLOv8 tracking on the frame, persisting tracks between frames
|
||||
results = model.track(frame, persist=True)
|
||||
|
||||
# Get the boxes and track IDs
|
||||
boxes = results[0].boxes.xywh.cpu()
|
||||
track_ids = results[0].boxes.id.int().cpu().tolist()
|
||||
|
||||
# Visualize the results on the frame
|
||||
annotated_frame = results[0].plot()
|
||||
|
||||
# Plot the tracks
|
||||
for box, track_id in zip(boxes, track_ids):
|
||||
x, y, w, h = box
|
||||
track = track_history[track_id]
|
||||
track.append((float(x), float(y))) # x, y center point
|
||||
if len(track) > 30:  # retain only the 30 most recent points per track
|
||||
track.pop(0)
|
||||
|
||||
# Draw the tracking lines
|
||||
points = np.hstack(track).astype(np.int32).reshape((-1, 1, 2))
|
||||
cv2.polylines(annotated_frame, [points], isClosed=False, color=(230, 230, 230), thickness=10)
|
||||
|
||||
# Display the annotated frame
|
||||
cv2.imshow("YOLOv8 Tracking", annotated_frame)
|
||||
|
||||
# Break the loop if 'q' is pressed
|
||||
if cv2.waitKey(1) & 0xFF == ord("q"):
|
||||
break
|
||||
else:
|
||||
# Break the loop if the end of the video is reached
|
||||
break
|
||||
|
||||
# Release the video capture object and close the display window
|
||||
cap.release()
|
||||
cv2.destroyAllWindows()
|
||||
```
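
The track-history bookkeeping in the script above can also be written with a bounded `deque`, which discards the oldest point automatically. The sketch below runs on simulated center points rather than real detections. It also guards against frames with no track IDs, since the `Boxes` table lists `id` as available only when tracks exist. The helper `update_history` and the constant `MAX_POINTS` are hypothetical names introduced for illustration.

```python
from collections import defaultdict, deque

MAX_POINTS = 30  # number of recent center points kept per track (assumed value)
track_history = defaultdict(lambda: deque(maxlen=MAX_POINTS))


def update_history(history, centers, track_ids):
    """Append one (x, y) center per track ID; skip frames without track IDs."""
    if track_ids is None:
        return
    for (x, y), track_id in zip(centers, track_ids):
        history[track_id].append((float(x), float(y)))


# Simulate 40 frames for a single object with track ID 7
for frame_idx in range(40):
    update_history(track_history, [(frame_idx, 2 * frame_idx)], [7])

# A frame where the tracker returned no IDs leaves the history untouched
update_history(track_history, [], None)

print(len(track_history[7]))  # 30
print(track_history[7][0])    # (10.0, 20.0)
```

In the real loop, `centers` would come from the first two columns of `results[0].boxes.xywh` and `track_ids` from `results[0].boxes.id` after checking that it is not `None`.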
|
||||
|
||||
### Multithreaded Tracking
|
||||
|
||||
Multithreaded tracking provides the capability to run object tracking on multiple video streams simultaneously. This is particularly useful when handling multiple video inputs, such as from multiple surveillance cameras, where concurrent processing can greatly enhance efficiency and performance.
|
||||
|
||||
In the provided Python script, we make use of Python's `threading` module to run multiple instances of the tracker concurrently. Each thread is responsible for running the tracker on one video file, and all the threads run simultaneously in the background.
|
||||
|
||||
To ensure that each thread receives the correct parameters (the video file, the model to use and the file index), we define a function `run_tracker_in_thread` that accepts these parameters and contains the main tracking loop. This function reads the video frame by frame, runs the tracker, and displays the results.
|
||||
|
||||
Two different models are used in this example: `yolov8n.pt` and `yolov8n-seg.pt`, each tracking objects in a different video file. The video files are specified in `video_file1` and `video_file2`.
|
||||
|
||||
The `daemon=True` parameter in `threading.Thread` means that these threads will be closed as soon as the main program finishes. We then start the threads with `start()` and use `join()` to make the main thread wait until both tracker threads have finished.
|
||||
|
||||
Finally, after all threads have completed their task, the windows displaying the results are closed using `cv2.destroyAllWindows()`.
|
||||
|
||||
!!! example "Multithreaded tracking"
|
||||
|
||||
```python
|
||||
import threading
|
||||
import cv2
|
||||
from ultralytics import YOLO
|
||||
|
||||
|
||||
def run_tracker_in_thread(filename, model, file_index):
|
||||
"""
|
||||
Runs a video file or webcam stream concurrently with the YOLOv8 model using threading.
|
||||
|
||||
This function captures video frames from a given file or camera source and utilizes the YOLOv8 model for object
|
||||
tracking. The function runs in its own thread for concurrent processing.
|
||||
|
||||
Args:
|
||||
filename (str): The path to the video file or the identifier for the webcam/external camera source.
|
||||
model (obj): The YOLOv8 model object.
|
||||
file_index (int): An index to uniquely identify the file being processed, used for display purposes.
|
||||
|
||||
Note:
|
||||
Press 'q' to quit the video display window.
|
||||
"""
|
||||
video = cv2.VideoCapture(filename) # Read the video file
|
||||
|
||||
while True:
|
||||
ret, frame = video.read() # Read the video frames
|
||||
|
||||
# Exit the loop if no more frames in either video
|
||||
if not ret:
|
||||
break
|
||||
|
||||
# Track objects in frames if available
|
||||
results = model.track(frame, persist=True)
|
||||
res_plotted = results[0].plot()
|
||||
cv2.imshow(f"Tracking_Stream_{file_index}", res_plotted)
|
||||
|
||||
key = cv2.waitKey(1)
|
||||
if key == ord('q'):
|
||||
break
|
||||
|
||||
# Release video sources
|
||||
video.release()
|
||||
|
||||
|
||||
# Load the models
|
||||
model1 = YOLO('yolov8n.pt')
|
||||
model2 = YOLO('yolov8n-seg.pt')
|
||||
|
||||
# Define the video files for the trackers
|
||||
video_file1 = "path/to/video1.mp4" # Path to video file, 0 for webcam
|
||||
video_file2 = 0 # Path to video file, 0 for webcam, 1 for external camera
|
||||
|
||||
# Create the tracker threads
|
||||
tracker_thread1 = threading.Thread(target=run_tracker_in_thread, args=(video_file1, model1, 1), daemon=True)
|
||||
tracker_thread2 = threading.Thread(target=run_tracker_in_thread, args=(video_file2, model2, 2), daemon=True)
|
||||
|
||||
# Start the tracker threads
|
||||
tracker_thread1.start()
|
||||
tracker_thread2.start()
|
||||
|
||||
# Wait for the tracker threads to finish
|
||||
tracker_thread1.join()
|
||||
tracker_thread2.join()
|
||||
|
||||
# Clean up and close windows
|
||||
cv2.destroyAllWindows()
|
||||
```
|
||||
|
||||
This example can easily be extended to handle more video files and models by creating more threads and applying the same methodology.
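
One way to scale to more sources is to build the threads from a list of jobs instead of naming each thread by hand. The sketch below is only an outline: `run_tracker` is a hypothetical stand-in that records its arguments instead of opening a video, and the third source path and the choice to pass weight names rather than model objects are assumptions made for illustration.

```python
import threading


def run_tracker(source, weights, index, log):
    """Stand-in for the tracking loop: records the call instead of opening a video."""
    log.append((index, source, weights))


# (source, weights) pairs; the entries here are placeholders
jobs = [
    ("path/to/video1.mp4", "yolov8n.pt"),
    (0, "yolov8n-seg.pt"),
    ("path/to/video2.mp4", "yolov8n-pose.pt"),
]

log = []
threads = [
    threading.Thread(target=run_tracker, args=(src, weights, i, log), daemon=True)
    for i, (src, weights) in enumerate(jobs, start=1)
]

for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for every tracker thread to finish

print(sorted(entry[0] for entry in log))  # [1, 2, 3]
```

Passing weight names lets each thread load its own model inside the worker, in line with the thread-safety advice of one model instance per thread.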
|
||||
|
||||
## Contribute New Trackers
|
||||
|
||||
Are you proficient in multi-object tracking and have successfully implemented or adapted a tracking algorithm with Ultralytics YOLO? We invite you to contribute to our Trackers section in [ultralytics/cfg/trackers](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/trackers)! Your real-world applications and solutions could be invaluable for users working on tracking tasks.
|
||||
|
||||
By contributing to this section, you help expand the scope of tracking solutions available within the Ultralytics YOLO framework, adding another layer of functionality and utility for the community.
|
||||
|
||||
To initiate your contribution, please refer to our [Contributing Guide](https://docs.ultralytics.com/help/contributing) for comprehensive instructions on submitting a Pull Request (PR) 🛠️. We are excited to see what you bring to the table!
|
||||
|
||||
Together, let's enhance the tracking capabilities of the Ultralytics YOLO ecosystem 🙏!
|
||||
294
docs/en/modes/train.md
Normal file
|
|
@ -0,0 +1,294 @@
|
|||
---
|
||||
comments: true
|
||||
description: Step-by-step guide to train YOLOv8 models with Ultralytics YOLO including examples of single-GPU and multi-GPU training
|
||||
keywords: Ultralytics, YOLOv8, YOLO, object detection, train mode, custom dataset, GPU training, multi-GPU, hyperparameters, CLI examples, Python examples
|
||||
---
|
||||
|
||||
# Model Training with Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Introduction
|
||||
|
||||
Training a deep learning model involves feeding it data and adjusting its parameters so that it can make accurate predictions. Train mode in Ultralytics YOLOv8 is engineered for effective and efficient training of object detection models, fully utilizing modern hardware capabilities. This guide aims to cover all the details you need to get started with training your own models using YOLOv8's robust set of features.
|
||||
|
||||
<p align="center">
|
||||
<br>
|
||||
<iframe width="720" height="405" src="https://www.youtube.com/embed/LNwODJXcvt4?si=7n1UvGRLSd9p5wKs"
|
||||
title="YouTube video player" frameborder="0"
|
||||
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share"
|
||||
allowfullscreen>
|
||||
</iframe>
|
||||
<br>
|
||||
<strong>Watch:</strong> How to Train a YOLOv8 model on Your Custom Dataset in Google Colab.
|
||||
</p>
|
||||
|
||||
## Why Choose Ultralytics YOLO for Training?
|
||||
|
||||
Here are some compelling reasons to opt for YOLOv8's Train mode:
|
||||
|
||||
- **Efficiency:** Make the most out of your hardware, whether you're on a single-GPU setup or scaling across multiple GPUs.
|
||||
- **Versatility:** Train on custom datasets in addition to readily available ones like COCO, VOC, and ImageNet.
|
||||
- **User-Friendly:** Simple yet powerful CLI and Python interfaces for a straightforward training experience.
|
||||
- **Hyperparameter Flexibility:** A broad range of customizable hyperparameters to fine-tune model performance.
|
||||
|
||||
### Key Features of Train Mode
|
||||
|
||||
The following are some notable features of YOLOv8's Train mode:
|
||||
|
||||
- **Automatic Dataset Download:** Standard datasets like COCO, VOC, and ImageNet are downloaded automatically on first use.
|
||||
- **Multi-GPU Support:** Scale your training efforts seamlessly across multiple GPUs to expedite the process.
|
||||
- **Hyperparameter Configuration:** The option to modify hyperparameters through YAML configuration files or CLI arguments.
|
||||
- **Visualization and Monitoring:** Real-time tracking of training metrics and visualization of the learning process for better insights.
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
* YOLOv8 datasets like COCO, VOC, ImageNet and many others automatically download on first use, e.g. `yolo train data=coco.yaml`
|
||||
|
||||
## Usage Examples
|
||||
|
||||
Train YOLOv8n on the COCO128 dataset for 100 epochs at image size 640. The training device can be specified using the `device` argument. If no argument is passed, GPU `device=0` will be used if available; otherwise `device=cpu` will be used. See the Arguments section below for a full list of training arguments.
|
||||
|
||||
!!! example "Single-GPU and CPU Training Example"
|
||||
|
||||
Device is determined automatically. If a GPU is available then it will be used, otherwise training will start on CPU.
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.yaml') # build a new model from YAML
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
model = YOLO('yolov8n.yaml').load('yolov8n.pt') # build from YAML and transfer weights
|
||||
|
||||
# Train the model
|
||||
results = model.train(data='coco128.yaml', epochs=100, imgsz=640)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Build a new model from YAML and start training from scratch
|
||||
yolo detect train data=coco128.yaml model=yolov8n.yaml epochs=100 imgsz=640
|
||||
|
||||
# Start training from a pretrained *.pt model
|
||||
yolo detect train data=coco128.yaml model=yolov8n.pt epochs=100 imgsz=640
|
||||
|
||||
# Build a new model from YAML, transfer pretrained weights to it and start training
|
||||
yolo detect train data=coco128.yaml model=yolov8n.yaml pretrained=yolov8n.pt epochs=100 imgsz=640
|
||||
```
|
||||
|
||||
### Multi-GPU Training
|
||||
|
||||
Multi-GPU training allows for more efficient utilization of available hardware resources by distributing the training load across multiple GPUs. This feature is available through both the Python API and the command-line interface. To enable multi-GPU training, specify the GPU device IDs you wish to use.
|
||||
|
||||
!!! example "Multi-GPU Training Example"
|
||||
|
||||
To train with 2 GPUs (CUDA devices 0 and 1), use the following commands. Expand to additional GPUs as required.
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model with 2 GPUs
|
||||
results = model.train(data='coco128.yaml', epochs=100, imgsz=640, device=[0, 1])
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model using GPUs 0 and 1
|
||||
yolo detect train data=coco128.yaml model=yolov8n.pt epochs=100 imgsz=640 device=0,1
|
||||
```
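The two tabs above spell the same device selection differently: the Python API takes a list (`device=[0, 1]`), while the CLI takes a comma-separated string (`device=0,1`). As a rough illustration of how these forms relate, the sketch below normalizes either one. `normalize_device` is a hypothetical helper written for this page and is not part of the `ultralytics` API.

```python
def normalize_device(device):
    """Return 'cpu', 'mps', a list of CUDA device indices, or None.

    Hypothetical helper for illustration only; it does not check which
    devices actually exist on the machine.
    """
    if device is None:
        return None  # leave the choice to automatic device selection
    if isinstance(device, int):
        return [device]
    if isinstance(device, (list, tuple)):
        return [int(d) for d in device]
    text = str(device).strip().lower()
    if text in ("cpu", "mps"):
        return text
    # CLI-style string such as "0" or "0,1"
    return [int(part) for part in text.split(",") if part.strip()]


print(normalize_device("0,1"))   # CLI-style string
print(normalize_device([0, 1]))  # Python-style list
print(normalize_device("mps"))   # Apple silicon backend
```

A helper like this can be handy when a script accepts the device from a command-line flag but passes it on to the Python API.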
|
||||
|
||||
### Apple M1 and M2 MPS Training
|
||||
|
||||
Ultralytics YOLO supports Apple M1 and M2 chips, so you can train your models on devices that use the Metal Performance Shaders (MPS) framework. MPS offers a high-performance way of executing computation and image processing tasks on Apple's custom silicon.
|
||||
|
||||
To enable training on Apple M1 and M2 chips, you should specify 'mps' as your device when initiating the training process. Below is an example of how you could do this in Python and via the command line:
|
||||
|
||||
!!! example "MPS Training Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
|
||||
|
||||
# Train the model with MPS
|
||||
results = model.train(data='coco128.yaml', epochs=100, imgsz=640, device='mps')
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Start training from a pretrained *.pt model using MPS
|
||||
yolo detect train data=coco128.yaml model=yolov8n.pt epochs=100 imgsz=640 device=mps
|
||||
```
|
||||
|
||||
Training on the `mps` device uses the computational power of the M1/M2 chips to process training tasks more efficiently. For more detailed guidance and advanced configuration options, refer to the [PyTorch MPS documentation](https://pytorch.org/docs/stable/notes/mps.html).
|
||||
|
||||
### Resuming Interrupted Trainings
|
||||
|
||||
Resuming training from a previously saved state is a crucial feature when working with deep learning models. This can come in handy in various scenarios, like when the training process has been unexpectedly interrupted, or when you wish to continue training a model with new data or for more epochs.
|
||||
|
||||
When training is resumed, Ultralytics YOLO loads the weights from the last saved model and also restores the optimizer state, learning rate scheduler, and the epoch number. This allows you to continue the training process seamlessly from where it was left off.
|
||||
|
||||
You can easily resume training in Ultralytics YOLO by setting the `resume` argument to `True` when calling the `train` method, and specifying the path to the `.pt` file containing the partially trained model weights.
|
||||
|
||||
Below is an example of how to resume an interrupted training using Python and via the command line:
|
||||
|
||||
!!! example "Resume Training Example"
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('path/to/last.pt') # load a partially trained model
|
||||
|
||||
# Resume training
|
||||
results = model.train(resume=True)
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
# Resume an interrupted training
|
||||
yolo train resume model=path/to/last.pt
|
||||
```
|
||||
|
||||
By setting `resume=True`, the `train` function will continue training from where it left off, using the state stored in the 'path/to/last.pt' file. If the `resume` argument is omitted or set to `False`, the `train` function will start a new training session.
|
||||
|
||||
Remember that checkpoints are saved at the end of every epoch by default, or at a fixed interval using the `save_period` argument, so you must complete at least 1 epoch to resume a training run.
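Because resuming only works once a checkpoint exists, a script can check for one before deciding whether to resume. The sketch below is a minimal example under stated assumptions: it assumes the run directory keeps its checkpoint at `weights/last.pt`, and both `find_last_checkpoint` and the `runs/detect/train` path are hypothetical names chosen for illustration.

```python
from pathlib import Path


def find_last_checkpoint(run_dir):
    """Return the path to weights/last.pt inside run_dir, or None if absent.

    Assumption: checkpoints live under 'weights/last.pt' in the run
    directory; adjust the relative path if your layout differs.
    """
    candidate = Path(run_dir) / "weights" / "last.pt"
    return candidate if candidate.is_file() else None


checkpoint = find_last_checkpoint("runs/detect/train")  # hypothetical run directory
if checkpoint is None:
    print("No checkpoint found; start a new training run instead.")
else:
    print(f"Resume with: yolo train resume model={checkpoint}")
```

If the first epoch never finished, no `last.pt` is found and the script falls back to a fresh run.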
|
||||
|
||||
## Arguments
|
||||
|
||||
Training settings for YOLO models refer to the various hyperparameters and configurations used to train the model on a dataset. These settings can affect the model's performance, speed, and accuracy. Some common YOLO training settings include the batch size, learning rate, momentum, and weight decay. Other factors that may affect the training process include the choice of optimizer, the choice of loss function, and the size and composition of the training dataset. It is important to carefully tune and experiment with these settings to achieve the best possible performance for a given task.
|
||||
|
||||
| Key | Value | Description |
|
||||
|-------------------|----------|------------------------------------------------------------------------------------------------|
|
||||
| `model` | `None` | path to model file, i.e. yolov8n.pt, yolov8n.yaml |
|
||||
| `data` | `None` | path to data file, i.e. coco128.yaml |
|
||||
| `epochs` | `100` | number of epochs to train for |
|
||||
| `patience` | `50` | epochs to wait for no observable improvement for early stopping of training |
|
||||
| `batch` | `16` | number of images per batch (-1 for AutoBatch) |
|
||||
| `imgsz` | `640` | size of input images as integer |
|
||||
| `save` | `True` | save train checkpoints and predict results |
|
||||
| `save_period` | `-1` | Save checkpoint every x epochs (disabled if < 1) |
|
||||
| `cache` | `False` | True/ram, disk or False. Use cache for data loading |
|
||||
| `device` | `None` | device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu |
|
||||
| `workers` | `8` | number of worker threads for data loading (per RANK if DDP) |
|
||||
| `project` | `None` | project name |
|
||||
| `name` | `None` | experiment name |
|
||||
| `exist_ok` | `False` | whether to overwrite existing experiment |
|
||||
| `pretrained` | `True` | (bool or str) whether to use a pretrained model (bool) or a model to load weights from (str) |
|
||||
| `optimizer` | `'auto'` | optimizer to use, choices=[SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, auto] |
|
||||
| `verbose` | `False` | whether to print verbose output |
|
||||
| `seed` | `0` | random seed for reproducibility |
|
||||
| `deterministic` | `True` | whether to enable deterministic mode |
|
||||
| `single_cls` | `False` | train multi-class data as single-class |
|
||||
| `rect` | `False` | rectangular training with each batch collated for minimum padding |
|
||||
| `cos_lr` | `False` | use cosine learning rate scheduler |
|
||||
| `close_mosaic` | `10` | (int) disable mosaic augmentation for final epochs (0 to disable) |
|
||||
| `resume` | `False` | resume training from last checkpoint |
|
||||
| `amp` | `True` | Automatic Mixed Precision (AMP) training, choices=[True, False] |
|
||||
| `fraction` | `1.0` | dataset fraction to train on (default is 1.0, all images in train set) |
|
||||
| `profile` | `False` | profile ONNX and TensorRT speeds during training for loggers |
|
||||
| `freeze` | `None` | (int or list, optional) freeze first n layers, or freeze list of layer indices during training |
|
||||
| `lr0` | `0.01` | initial learning rate (i.e. SGD=1E-2, Adam=1E-3) |
|
||||
| `lrf` | `0.01` | final learning rate (lr0 * lrf) |
|
||||
| `momentum` | `0.937` | SGD momentum/Adam beta1 |
|
||||
| `weight_decay` | `0.0005` | optimizer weight decay 5e-4 |
|
||||
| `warmup_epochs` | `3.0` | warmup epochs (fractions ok) |
|
||||
| `warmup_momentum` | `0.8` | warmup initial momentum |
|
||||
| `warmup_bias_lr` | `0.1` | warmup initial bias lr |
|
||||
| `box` | `7.5` | box loss gain |
|
||||
| `cls` | `0.5` | cls loss gain (scale with pixels) |
|
||||
| `dfl` | `1.5` | dfl loss gain |
|
||||
| `pose` | `12.0` | pose loss gain (pose-only) |
|
||||
| `kobj` | `2.0` | keypoint obj loss gain (pose-only) |
|
||||
| `label_smoothing` | `0.0` | label smoothing (fraction) |
|
||||
| `nbs` | `64` | nominal batch size |
|
||||
| `overlap_mask` | `True` | masks should overlap during training (segment train only) |
|
||||
| `mask_ratio` | `4` | mask downsample ratio (segment train only) |
|
||||
| `dropout` | `0.0` | use dropout regularization (classify train only) |
|
||||
| `val` | `True` | validate/test during training |
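To make the relationship between `lr0`, `lrf` and `cos_lr` concrete, here is a minimal sketch of a per-epoch learning rate that decays from `lr0` to `lr0 * lrf`. It assumes a linear decay by default and a half-cosine decay when `cos_lr=True`, and it ignores warmup, so the scheduler in the library may differ in detail. `lr_at_epoch` is a hypothetical name.

```python
import math


def lr_at_epoch(epoch, epochs=100, lr0=0.01, lrf=0.01, cos_lr=False):
    """Learning rate at `epoch` (0..epochs), decaying from lr0 to lr0 * lrf."""
    progress = epoch / epochs  # 0.0 at the start, 1.0 at the end
    if cos_lr:
        # Half-cosine from 1.0 down to lrf
        factor = lrf + (1.0 - lrf) * 0.5 * (1.0 + math.cos(math.pi * progress))
    else:
        # Straight line from 1.0 down to lrf
        factor = 1.0 - progress * (1.0 - lrf)
    return lr0 * factor


for e in (0, 50, 100):
    print(e, lr_at_epoch(e), lr_at_epoch(e, cos_lr=True))
```

With the table defaults, both variants start at `lr0` and end at `lr0 * lrf`; they differ only in how quickly the rate falls in between.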
|
||||
|
||||
## Logging
|
||||
|
||||
In training a YOLOv8 model, you might find it valuable to keep track of the model's performance over time. This is where logging comes into play. Ultralytics' YOLO provides support for three types of loggers - Comet, ClearML, and TensorBoard.
|
||||
|
||||
To use a logger, install it and run the setup snippet from the matching subsection below.
|
||||
|
||||
### Comet
|
||||
|
||||
[Comet](https://www.comet.ml/site/) is a platform that allows data scientists and developers to track, compare, explain and optimize experiments and models. It provides functionalities such as real-time metrics, code diffs, and hyperparameters tracking.
|
||||
|
||||
To use Comet:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
|
||||
# pip install comet_ml
|
||||
import comet_ml
|
||||
|
||||
comet_ml.init()
|
||||
```
|
||||
|
||||
Remember to sign in to your Comet account on their website and get your API key. You will need to add this to your environment variables or your script to log your experiments.
|
||||
|
||||
### ClearML
|
||||
|
||||
[ClearML](https://www.clear.ml/) is an open-source platform that automates tracking of experiments and helps with efficient sharing of resources. It is designed to help teams manage, execute, and reproduce their ML work more efficiently.
|
||||
|
||||
To use ClearML:
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
```python
|
||||
# pip install clearml
|
||||
import clearml
|
||||
|
||||
clearml.browser_login()
|
||||
```
|
||||
|
||||
After running this script, you will need to sign in to your ClearML account on the browser and authenticate your session.
|
||||
|
||||
### TensorBoard
|
||||
|
||||
[TensorBoard](https://www.tensorflow.org/tensorboard) is a visualization toolkit for TensorFlow. It allows you to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it.
|
||||
|
||||
To use TensorBoard in [Google Colab](https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb):
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "CLI"
|
||||
```bash
|
||||
%load_ext tensorboard
|
||||
%tensorboard --logdir ultralytics/runs # replace with 'runs' directory
|
||||
```
|
||||
|
||||
To use TensorBoard locally, run the command below and view the results at http://localhost:6006/.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "CLI"
|
||||
```bash
|
||||
tensorboard --logdir ultralytics/runs # replace with 'runs' directory
|
||||
```
|
||||
|
||||
This will load TensorBoard and direct it to the directory where your training logs are saved.
|
||||
|
||||
After setting up your logger, you can then proceed with your model training. All training metrics will be automatically logged in your chosen platform, and you can access these logs to monitor your model's performance over time, compare different models, and identify areas for improvement.
|
||||
86
docs/en/modes/val.md
Normal file
|
|
@ -0,0 +1,86 @@
|
|||
---
|
||||
comments: true
|
||||
description: Guide for Validating YOLOv8 Models. Learn how to evaluate the performance of your YOLO models using validation settings and metrics with Python and CLI examples.
|
||||
keywords: Ultralytics, YOLO Docs, YOLOv8, validation, model evaluation, hyperparameters, accuracy, metrics, Python, CLI
|
||||
---
|
||||
|
||||
# Model Validation with Ultralytics YOLO
|
||||
|
||||
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png" alt="Ultralytics YOLO ecosystem and integrations">
|
||||
|
||||
## Introduction
|
||||
|
||||
Validation is a critical step in the machine learning pipeline, allowing you to assess the quality of your trained models. Val mode in Ultralytics YOLOv8 provides a robust suite of tools and metrics for evaluating the performance of your object detection models. This guide serves as a complete resource for understanding how to effectively use the Val mode to ensure that your models are both accurate and reliable.
|
||||
|
||||
## Why Validate with Ultralytics YOLO?
|
||||
|
||||
Here's why using YOLOv8's Val mode is advantageous:
|
||||
|
||||
- **Precision:** Get accurate metrics like mAP50, mAP75, and mAP50-95 to comprehensively evaluate your model.
|
||||
- **Convenience:** Utilize built-in features that remember training settings, simplifying the validation process.
|
||||
- **Flexibility:** Validate your model with the same or different datasets and image sizes.
|
||||
- **Hyperparameter Tuning:** Use validation metrics to fine-tune your model for better performance.
|
||||
|
||||
### Key Features of Val Mode
|
||||
|
||||
These are the notable functionalities offered by YOLOv8's Val mode:
|
||||
|
||||
- **Automated Settings:** Models remember their training configurations for straightforward validation.
|
||||
- **Multi-Metric Support:** Evaluate your model based on a range of accuracy metrics.
|
||||
- **CLI and Python API:** Choose from command-line interface or Python API based on your preference for validation.
|
||||
- **Data Compatibility:** Works seamlessly with datasets used during the training phase as well as custom datasets.
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
* YOLOv8 models automatically remember their training settings, so you can validate a model at the same image size and on the original dataset easily with just `yolo val model=yolov8n.pt` or `YOLO('yolov8n.pt').val()`
|
||||
|
||||
## Usage Examples
|
||||
|
||||
Validate the accuracy of a trained YOLOv8n model on the COCO128 dataset. No arguments need to be passed, as the `model` retains its training `data` and arguments as model attributes. See the Arguments section below for a full list of validation arguments.
|
||||
|
||||
!!! example ""
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Load a model
|
||||
model = YOLO('yolov8n.pt') # load an official model
|
||||
model = YOLO('path/to/best.pt') # load a custom model
|
||||
|
||||
# Validate the model
|
||||
metrics = model.val() # no arguments needed, dataset and settings remembered
|
||||
metrics.box.map # map50-95
|
||||
metrics.box.map50 # map50
|
||||
metrics.box.map75 # map75
|
||||
metrics.box.maps # a list containing the mAP50-95 of each category
|
||||
```
|
||||
=== "CLI"
|
||||
|
||||
```bash
|
||||
yolo detect val model=yolov8n.pt # val official model
|
||||
yolo detect val model=path/to/best.pt # val custom model
|
||||
```
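For orientation, `map50` and `map75` are average precision (AP) at IoU thresholds of 0.50 and 0.75, while `map` (mAP50-95) follows the COCO convention of averaging over ten thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows that averaging for a single class using made-up AP values; the numbers are illustrative and are not results from any model.

```python
# IoU thresholds 0.50, 0.55, ..., 0.95 (ten values)
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Made-up AP values, one per threshold, for illustration only
ap_per_threshold = [0.60, 0.58, 0.55, 0.52, 0.48, 0.43, 0.37, 0.29, 0.18, 0.05]

ap_by_threshold = dict(zip(thresholds, ap_per_threshold))
map50 = ap_by_threshold[0.5]    # AP at IoU 0.50
map75 = ap_by_threshold[0.75]   # AP at IoU 0.75
map50_95 = sum(ap_per_threshold) / len(ap_per_threshold)  # mean over all ten

print(f"mAP50={map50:.3f} mAP75={map75:.3f} mAP50-95={map50_95:.3f}")
```

In a multi-class dataset the same averaging is applied per class and then averaged across classes, which is why mAP50-95 is usually noticeably lower than mAP50.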
|
||||
|
||||
## Arguments
|
||||
|
||||
Validation settings for YOLO models refer to the various hyperparameters and configurations used to evaluate the model's performance on a validation dataset. These settings can affect the model's performance, speed, and accuracy. Some common YOLO validation settings include the batch size, the frequency with which validation is performed during training, and the metrics used to evaluate the model's performance. Other factors that may affect the validation process include the size and composition of the validation dataset and the specific task the model is being used for. It is important to carefully tune and experiment with these settings to ensure that the model is performing well on the validation dataset and to detect and prevent overfitting.
|
||||
|
||||
| Key | Value | Description |
|
||||
|---------------|---------|--------------------------------------------------------------------|
|
||||
| `data` | `None` | path to data file, i.e. coco128.yaml |
|
||||
| `imgsz` | `640` | size of input images as integer |
|
||||
| `batch` | `16` | number of images per batch (-1 for AutoBatch) |
|
||||
| `save_json` | `False` | save results to JSON file |
|
||||
| `save_hybrid` | `False` | save hybrid version of labels (labels + additional predictions) |
|
||||
| `conf` | `0.001` | object confidence threshold for detection |
|
||||
| `iou` | `0.6` | intersection over union (IoU) threshold for NMS |
|
||||
| `max_det` | `300` | maximum number of detections per image |
|
||||
| `half` | `True` | use half precision (FP16) |
|
||||
| `device` | `None` | device to run on, i.e. cuda device=0/1/2/3 or device=cpu |
|
||||
| `dnn` | `False` | use OpenCV DNN for ONNX inference |
|
||||
| `plots` | `False` | show plots during validation |
|
||||
| `rect` | `False` | rectangular val with each batch collated for minimum padding |
|
||||
| `split` | `val` | dataset split to use for validation, i.e. 'val', 'test' or 'train' |
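The `iou` argument is a threshold on intersection over union, the overlap measure that non-maximum suppression (NMS) uses to decide whether two boxes cover the same object. A minimal sketch for axis-aligned boxes in `(x1, y1, x2, y2)` format follows; `box_iou` is a hypothetical helper and not the library's implementation.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
# Under NMS, a lower-confidence box is suppressed only if its IoU with a
# kept box exceeds the threshold (0.6 by default in the table above).
print(overlap, overlap > 0.6)
```

Raising `iou` keeps more overlapping boxes; lowering it suppresses them more aggressively.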
|
||||
|
|
||||
314
docs/en/quickstart.md
Normal file
|
|
@ -0,0 +1,314 @@
|
|||
---
|
||||
comments: true
|
||||
description: Explore various methods to install Ultralytics using pip, conda, git and Docker. Learn how to use Ultralytics with command line interface or within your Python projects.
|
||||
keywords: Ultralytics installation, pip install Ultralytics, Docker install Ultralytics, Ultralytics command line interface, Ultralytics Python interface
|
||||
---
|
||||
|
||||
## Install Ultralytics
|
||||
|
||||
Ultralytics provides various installation methods including pip, conda, and Docker. Install YOLOv8 via the `ultralytics` pip package for the latest stable release or by cloning the [Ultralytics GitHub repository](https://github.com/ultralytics/ultralytics) for the most up-to-date version. Docker can be used to execute the package in an isolated container, avoiding local installation.
|
||||
|
||||
!!! example "Install"
|
||||
|
||||
=== "Pip install (recommended)"
|
||||
Install the `ultralytics` package using pip, or update an existing installation by running `pip install -U ultralytics`. Visit the Python Package Index (PyPI) for more details on the `ultralytics` package: [https://pypi.org/project/ultralytics/](https://pypi.org/project/ultralytics/).
|
||||
|
||||
|
||||
|
||||
```bash
|
||||
# Install the ultralytics package from PyPI
|
||||
pip install ultralytics
|
||||
```
|
||||
|
||||
You can also install the `ultralytics` package directly from the GitHub [repository](https://github.com/ultralytics/ultralytics). This might be useful if you want the latest development version. Make sure to have the Git command-line tool installed on your system. The `@main` command installs the `main` branch and may be modified to another branch, i.e. `@my-branch`, or removed entirely to default to `main` branch.
|
||||
|
||||
```bash
|
||||
# Install the ultralytics package from GitHub
|
||||
pip install git+https://github.com/ultralytics/ultralytics.git@main
|
||||
```
|
||||
|
||||
|
||||
=== "Conda install"
|
||||
Conda is an alternative package manager to pip which may also be used for installation. Visit Anaconda for more details at [https://anaconda.org/conda-forge/ultralytics](https://anaconda.org/conda-forge/ultralytics). Ultralytics feedstock repository for updating the conda package is at [https://github.com/conda-forge/ultralytics-feedstock/](https://github.com/conda-forge/ultralytics-feedstock/).
|
||||
|
||||
|
||||
|
||||
|
||||
```bash
|
||||
# Install the ultralytics package using conda
|
||||
conda install -c conda-forge ultralytics
|
||||
```
|
||||
|
||||
!!! note
|
||||
|
||||
If you are installing in a CUDA environment, best practice is to install `ultralytics`, `pytorch` and `pytorch-cuda` in the same command to allow the conda package manager to resolve any conflicts, or else to install `pytorch-cuda` last to allow it to override the CPU-specific `pytorch` package if necessary.
|
||||
```bash
|
||||
# Install all packages together using conda
|
||||
conda install -c pytorch -c nvidia -c conda-forge pytorch torchvision pytorch-cuda=11.8 ultralytics
|
||||
```
|
||||
|
||||
### Conda Docker Image
|
||||
|
||||
Ultralytics Conda Docker images are also available from [DockerHub](https://hub.docker.com/r/ultralytics/ultralytics). These images are based on [Miniconda3](https://docs.conda.io/projects/miniconda/en/latest/) and are a simple way to start using `ultralytics` in a Conda environment.
|
||||
|
||||
```bash
|
||||
# Set image name as a variable
|
||||
t=ultralytics/ultralytics:latest-conda
|
||||
|
||||
# Pull the latest ultralytics image from Docker Hub
|
||||
sudo docker pull $t
|
||||
|
||||
# Run the ultralytics image in a container with GPU support
|
||||
sudo docker run -it --ipc=host --gpus all $t # all GPUs
|
||||
sudo docker run -it --ipc=host --gpus '"device=2,3"' $t # specify GPUs
|
||||
```
|
||||
|
||||
=== "Git clone"
|
||||
Clone the `ultralytics` repository if you are interested in contributing to the development or wish to experiment with the latest source code. After cloning, navigate into the directory and install the package in editable mode `-e` using pip.
|
||||
```bash
|
||||
# Clone the ultralytics repository
|
||||
git clone https://github.com/ultralytics/ultralytics
|
||||
|
||||
# Navigate to the cloned directory
|
||||
cd ultralytics
|
||||
|
||||
# Install the package in editable mode for development
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
=== "Docker"
|
||||
|
||||
Utilize Docker to effortlessly execute the `ultralytics` package in an isolated container, ensuring consistent and smooth performance across various environments. By choosing one of the official `ultralytics` images from [Docker Hub](https://hub.docker.com/r/ultralytics/ultralytics), you not only avoid the complexity of local installation but also benefit from access to a verified working environment. Ultralytics offers 6 main supported Docker images, each designed to provide high compatibility and efficiency for different platforms and use cases:
|
||||
|
||||
<a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
|
||||
|
||||
- **Dockerfile:** GPU image recommended for training.
|
||||
- **Dockerfile-arm64:** Optimized for ARM64 architecture, allowing deployment on devices like Raspberry Pi and other ARM64-based platforms.
|
||||
- **Dockerfile-cpu:** Ubuntu-based CPU-only version suitable for inference and environments without GPUs.
|
||||
- **Dockerfile-jetson:** Tailored for NVIDIA Jetson devices, integrating GPU support optimized for these platforms.
|
||||
- **Dockerfile-python:** Minimal image with just Python and necessary dependencies, ideal for lightweight applications and development.
|
||||
- **Dockerfile-conda:** Based on Miniconda3 with conda installation of ultralytics package.
|
||||
|
||||
Below are the commands to get the latest image and execute it:
|
||||
|
||||
```bash
|
||||
# Set image name as a variable
|
||||
t=ultralytics/ultralytics:latest
|
||||
|
||||
# Pull the latest ultralytics image from Docker Hub
|
||||
sudo docker pull $t
|
||||
|
||||
# Run the ultralytics image in a container with GPU support
|
||||
sudo docker run -it --ipc=host --gpus all $t # all GPUs
|
||||
sudo docker run -it --ipc=host --gpus '"device=2,3"' $t # specify GPUs
|
||||
```
|
||||
|
||||
The above command initializes a Docker container with the latest `ultralytics` image. The `-it` flag assigns a pseudo-TTY and maintains stdin open, enabling you to interact with the container. The `--ipc=host` flag sets the IPC (Inter-Process Communication) namespace to the host, which is essential for sharing memory between processes. The `--gpus all` flag enables access to all available GPUs inside the container, which is crucial for tasks that require GPU computation.
|
||||
|
||||
Note: To work with files on your local machine within the container, use Docker volumes for mounting a local directory into the container:
|
||||
|
||||
```bash
|
||||
# Mount local directory to a directory inside the container
|
||||
sudo docker run -it --ipc=host --gpus all -v /path/on/host:/path/in/container $t
|
||||
```
|
||||
|
||||
Replace `/path/on/host` with the directory path on your local machine, and `/path/in/container` with the desired path inside the Docker container.
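When the container is launched from a script, it can help to assemble the same `docker run` arguments programmatically so the volume paths are filled in consistently. The sketch below only builds and prints the command; `docker_run_command` and the example paths are hypothetical, and `sudo` is left out on the assumption that the caller adds it if needed.

```python
import shlex


def docker_run_command(image, host_dir, container_dir, gpus="all"):
    """Build the `docker run` argument list with a volume mount."""
    return [
        "docker", "run", "-it", "--ipc=host",
        "--gpus", gpus,
        "-v", f"{host_dir}:{container_dir}",  # host path : container path
        image,
    ]


cmd = docker_run_command(
    "ultralytics/ultralytics:latest",
    "/data/coco128",       # hypothetical directory on the host
    "/datasets/coco128",   # hypothetical directory inside the container
)
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids shell-quoting mistakes if the list is later passed to `subprocess.run`.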
|
||||
|
||||
For advanced Docker usage, feel free to explore the [Ultralytics Docker Guide](https://docs.ultralytics.com/guides/docker-quickstart/).
|
||||
|
||||
See the `ultralytics` [requirements.txt](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) file for a list of dependencies. Note that all examples above install all required dependencies.
|
||||
|
||||
!!! tip "Tip"
|
||||
|
||||
PyTorch requirements vary by operating system and CUDA requirements, so it's recommended to install PyTorch first following instructions at [https://pytorch.org/get-started/locally](https://pytorch.org/get-started/locally).
|
||||
|
||||
<a href="https://pytorch.org/get-started/locally/">
|
||||
<img width="800" alt="PyTorch Installation Instructions" src="https://user-images.githubusercontent.com/26833433/228650108-ab0ec98a-b328-4f40-a40d-95355e8a84e3.png">
|
||||
</a>
|
||||
|
||||
## Use Ultralytics with CLI
|
||||
|
||||
The Ultralytics command line interface (CLI) allows for simple single-line commands without the need for a Python environment. CLI requires no customization or Python code. You can simply run all tasks from the terminal with the `yolo` command. Check out the [CLI Guide](usage/cli.md) to learn more about using YOLOv8 from the command line.
|
||||
|
||||
!!! example
|
||||
|
||||
=== "Syntax"
|
||||
|
||||
Ultralytics `yolo` commands use the following syntax:
|
||||
```bash
|
||||
yolo TASK MODE ARGS
|
||||
|
||||
Where TASK (optional) is one of [detect, segment, classify]
|
||||
MODE (required) is one of [train, val, predict, export, track]
|
||||
ARGS (optional) are any number of custom 'arg=value' pairs like 'imgsz=320' that override defaults.
|
||||
```
|
||||
See all ARGS in the full [Configuration Guide](usage/cfg.md) or with `yolo cfg`
|
||||
|
||||
=== "Train"
|
||||
|
||||
Train a detection model for 10 epochs with an initial learning rate (`lr0`) of 0.01:
|
||||
```bash
|
||||
yolo train data=coco128.yaml model=yolov8n.pt epochs=10 lr0=0.01
|
||||
```
|
||||
|
||||
=== "Predict"
|
||||
|
||||
Predict a YouTube video using a pretrained segmentation model at image size 320:
|
||||
```bash
|
||||
yolo predict model=yolov8n-seg.pt source='https://youtu.be/LNwODJXcvt4' imgsz=320
|
||||
```
|
||||
|
||||
=== "Val"
|
||||
|
||||
Validate a pretrained detection model at batch size 1 and image size 640:
|
||||
```bash
|
||||
yolo val model=yolov8n.pt data=coco128.yaml batch=1 imgsz=640
|
||||
```
|
||||
|
||||
=== "Export"
|
||||
|
||||
Export a YOLOv8n classification model to ONNX format at image size 224 by 128 (no TASK required)
|
||||
```bash
|
||||
yolo export model=yolov8n-cls.pt format=onnx imgsz=224,128
|
||||
```
|
||||
|
||||
=== "Special"
|
||||
|
||||
Run special commands to see version, view settings, run checks and more:
|
||||
```bash
|
||||
yolo help
|
||||
yolo checks
|
||||
yolo version
|
||||
yolo settings
|
||||
yolo copy-cfg
|
||||
yolo cfg
|
||||
```
|
||||
|
||||
!!! warning "Warning"
|
||||
|
||||
Arguments must be passed as `arg=val` pairs, split by an equals `=` sign and delimited by spaces ` ` between pairs. Do not use `--` argument prefixes or commas `,` between arguments.
|
||||
|
||||
- `yolo predict model=yolov8n.pt imgsz=640 conf=0.25` ✅
|
||||
- `yolo predict model yolov8n.pt imgsz 640 conf 0.25` ❌
|
||||
- `yolo predict --model yolov8n.pt --imgsz 640 --conf 0.25` ❌
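The rule above can be expressed as a small check on the argument tokens. The following is a minimal sketch that accepts only `arg=value` tokens and rejects the two invalid styles shown; `parse_yolo_args` is a hypothetical helper written for illustration, and it may be stricter than the parsing the real CLI performs.

```python
def parse_yolo_args(tokens):
    """Split 'arg=value' tokens into a dict.

    Hypothetical helper for illustration; raises ValueError for '--'
    prefixes and for tokens that are not arg=value pairs.
    """
    parsed = {}
    for token in tokens:
        if token.startswith("--"):
            raise ValueError(f"'{token}': do not use '--' prefixes, write arg=value")
        if "=" not in token:
            raise ValueError(f"'{token}': expected an arg=value pair")
        key, value = token.split("=", 1)  # split on the first '=' only
        parsed[key] = value
    return parsed


print(parse_yolo_args("model=yolov8n.pt imgsz=640 conf=0.25".split()))

for bad in ("model yolov8n.pt imgsz 640", "--model yolov8n.pt"):
    try:
        parse_yolo_args(bad.split())
    except ValueError as err:
        print("rejected:", err)
```

Values stay as strings here; converting `640` or `0.25` to numbers is left to whatever consumes the dictionary.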
|
||||
|
||||
[CLI Guide](usage/cli.md){ .md-button .md-button--primary}
|
||||
|
||||
## Use Ultralytics with Python
|
||||
|
||||
YOLOv8's Python interface allows for seamless integration into your Python projects, making it easy to load, run, and process the model's output. Designed with simplicity and ease of use in mind, the Python interface enables users to quickly implement object detection, segmentation, and classification in their projects. This makes YOLOv8's Python interface an invaluable tool for anyone looking to incorporate these functionalities into their Python projects.
|
||||
|
||||
For example, users can load a model, train it, evaluate its performance on a validation set, and even export it to ONNX format with just a few lines of code. Check out the [Python Guide](usage/python.md) to learn more about using YOLOv8 within your Python projects.
|
||||
|
||||
!!! example
|
||||
|
||||
```python
|
||||
from ultralytics import YOLO
|
||||
|
||||
# Create a new YOLO model from scratch
|
||||
model = YOLO('yolov8n.yaml')
|
||||
|
||||
# Load a pretrained YOLO model (recommended for training)
|
||||
model = YOLO('yolov8n.pt')
|
||||
|
||||
# Train the model using the 'coco128.yaml' dataset for 3 epochs
|
||||
results = model.train(data='coco128.yaml', epochs=3)
|
||||
|
||||
# Evaluate the model's performance on the validation set
|
||||
results = model.val()
|
||||
|
||||
# Perform object detection on an image using the model
|
||||
results = model('https://ultralytics.com/images/bus.jpg')
|
||||
|
||||
# Export the model to ONNX format
|
||||
success = model.export(format='onnx')
|
||||
```
|
||||
|
||||
[Python Guide](usage/python.md){.md-button .md-button--primary}
|
||||
|
||||
## Ultralytics Settings
|
||||
|
||||
The Ultralytics library provides a powerful settings management system to enable fine-grained control over your experiments. By making use of the `SettingsManager` housed within the `ultralytics.utils` module, users can readily access and alter their settings. These are stored in a YAML file and can be viewed or modified either directly within the Python environment or via the Command-Line Interface (CLI).
|
||||
|
||||
### Inspecting Settings
|
||||
|
||||
To gain insight into the current configuration of your settings, you can view them directly:
|
||||
|
||||
!!! example "View settings"
|
||||
|
||||
=== "Python"
|
||||
You can use Python to view your settings. Start by importing the `settings` object from the `ultralytics` module. Print and return settings using the following commands:
|
||||
```python
|
||||
from ultralytics import settings
|
||||
|
||||
# View all settings
|
||||
print(settings)
|
||||
|
||||
# Return a specific setting
|
||||
value = settings['runs_dir']
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
Alternatively, the command-line interface allows you to check your settings with a simple command:
|
||||
```bash
|
||||
yolo settings
|
||||
```
|
||||
|
||||
### Modifying Settings
|
||||
|
||||
Ultralytics allows users to easily modify their settings. Changes can be performed in the following ways:
|
||||
|
||||
!!! example "Update settings"
|
||||
|
||||
=== "Python"
|
||||
Within the Python environment, call the `update` method on the `settings` object to change your settings:
|
||||
```python
|
||||
from ultralytics import settings
|
||||
|
||||
# Update a setting
|
||||
settings.update({'runs_dir': '/path/to/runs'})
|
||||
|
||||
# Update multiple settings
|
||||
settings.update({'runs_dir': '/path/to/runs', 'tensorboard': False})
|
||||
|
||||
# Reset settings to default values
|
||||
settings.reset()
|
||||
```
|
||||
|
||||
=== "CLI"
|
||||
If you prefer using the command-line interface, the following commands will allow you to modify your settings:
|
||||
```bash
# Update a setting
yolo settings runs_dir='/path/to/runs'

# Update multiple settings
yolo settings runs_dir='/path/to/runs' tensorboard=False

# Reset settings to default values
yolo settings reset
```
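The `key=value` form used on the command line maps naturally onto the dictionary passed to `settings.update` in Python. The sketch below shows one way such arguments could be turned into a typed dictionary. The helpers `parse_value` and `parse_overrides` are hypothetical names written for this illustration and are not the parser the `yolo` command uses.

```python
from typing import Any, Dict, List


def parse_value(text: str) -> Any:
    """Convert a CLI string to bool, int, float, or a quote-stripped string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text.strip("'\"")


def parse_overrides(args: List[str]) -> Dict[str, Any]:
    """Turn ['key=value', ...] into a dict; reject arguments without '='."""
    overrides: Dict[str, Any] = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        overrides[key.strip()] = parse_value(value.strip())
    return overrides


overrides = parse_overrides(["runs_dir='/path/to/runs'", "tensorboard=False"])
print(overrides)
```

The resulting dictionary has the same shape as the argument shown in the Python tab, so a string such as `tensorboard=False` ends up as a real boolean rather than the text `"False"`.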
|
||||
|
||||
### Understanding Settings
|
||||
|
||||
The table below provides an overview of the settings available for adjustment within Ultralytics. Each setting is outlined along with an example value, the data type, and a brief description.
|
||||
|
||||
| Name               | Example Value         | Data Type | Description                                                                                                      |
|--------------------|-----------------------|-----------|------------------------------------------------------------------------------------------------------------------|
| `settings_version` | `'0.0.4'`             | `str`     | Ultralytics _settings_ version (different from Ultralytics [pip](https://pypi.org/project/ultralytics/) version) |
| `datasets_dir`     | `'/path/to/datasets'` | `str`     | The directory where the datasets are stored                                                                      |
| `weights_dir`      | `'/path/to/weights'`  | `str`     | The directory where the model weights are stored                                                                 |
| `runs_dir`         | `'/path/to/runs'`     | `str`     | The directory where the experiment runs are stored                                                               |
| `uuid`             | `'a1b2c3d4'`          | `str`     | The unique identifier for the current settings                                                                   |
| `sync`             | `True`                | `bool`    | Whether to sync analytics and crashes to HUB                                                                     |
| `api_key`          | `''`                  | `str`     | Ultralytics HUB [API Key](https://hub.ultralytics.com/settings?tab=api+keys)                                     |
| `clearml`          | `True`                | `bool`    | Whether to use ClearML logging                                                                                   |
| `comet`            | `True`                | `bool`    | Whether to use [Comet ML](https://bit.ly/yolov8-readme-comet) for experiment tracking and visualization          |
| `dvc`              | `True`                | `bool`    | Whether to use [DVC for experiment tracking](https://dvc.org/doc/dvclive/ml-frameworks/yolo) and version control |
| `hub`              | `True`                | `bool`    | Whether to use [Ultralytics HUB](https://hub.ultralytics.com) integration                                        |
| `mlflow`           | `True`                | `bool`    | Whether to use MLFlow for experiment tracking                                                                    |
| `neptune`          | `True`                | `bool`    | Whether to use Neptune for experiment tracking                                                                   |
| `raytune`          | `True`                | `bool`    | Whether to use Ray Tune for hyperparameter tuning                                                                |
| `tensorboard`      | `True`                | `bool`    | Whether to use TensorBoard for visualization                                                                     |
| `wandb`            | `True`                | `bool`    | Whether to use Weights & Biases logging                                                                          |
|
||||
|
||||
As you navigate through your projects or experiments, be sure to revisit these settings to ensure that they are optimally configured for your needs.
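Because the table lists a data type for every setting, a set of proposed changes can be checked before it is applied. The following is a minimal sketch under that assumption: the `EXPECTED_TYPES` mapping is copied from the table, while the `find_problems` helper is a hypothetical name and not part of the library.

```python
from typing import Any, Dict, List

# Names and types copied from the settings table; the mapping itself is illustrative.
EXPECTED_TYPES: Dict[str, type] = {
    "settings_version": str,
    "datasets_dir": str,
    "weights_dir": str,
    "runs_dir": str,
    "uuid": str,
    "sync": bool,
    "api_key": str,
    "clearml": bool,
    "comet": bool,
    "dvc": bool,
    "hub": bool,
    "mlflow": bool,
    "neptune": bool,
    "raytune": bool,
    "tensorboard": bool,
    "wandb": bool,
}


def find_problems(candidate: Dict[str, Any]) -> List[str]:
    """Return a message for each unknown key or wrongly typed value."""
    problems = []
    for key, value in candidate.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown setting: {key}")
        elif not isinstance(value, expected):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems


problems = find_problems(
    {"runs_dir": "/path/to/runs", "tensorboard": "no", "colour": 1}
)
for message in problems:
    print(message)
```

An empty list means every key is known and every value has the documented type, so the dictionary could then be handed to `settings.update`.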
|
||||
58
docs/en/reference/cfg/__init__.md
Normal file
|
|
@ -0,0 +1,58 @@
|
|||
---
|
||||
description: Explore Ultralytics cfg functions like cfg2dict, handle_deprecation, merge_equals_args & more to handle YOLO settings and configurations efficiently.
|
||||
keywords: Ultralytics, YOLO, Configuration, cfg2dict, handle_deprecation, merge_equals_args, handle_yolo_settings, copy_default_cfg, Image Detection
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/cfg/__init__.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/__init__.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/__init__.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/cfg/__init__.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.cfg2dict
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.get_cfg
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.get_save_dir
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg._handle_deprecation
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.check_dict_alignment
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.merge_equals_args
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.handle_yolo_hub
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.handle_yolo_settings
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.parse_key_value_pair
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.smart_value
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.entrypoint
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.cfg.copy_default_cfg
|
||||
<br><br>
|
||||
14
docs/en/reference/data/annotator.md
Normal file
|
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
description: Enhance your machine learning model with Ultralytics’ auto_annotate function. Simplify data annotation for improved model training.
|
||||
keywords: Ultralytics, Auto-Annotate, Machine Learning, AI, Annotation, Data Processing, Model Training
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/data/annotator.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/annotator.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/annotator.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/data/annotator.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.annotator.auto_annotate
|
||||
<br><br>
|
||||
86
docs/en/reference/data/augment.md
Normal file
|
|
@ -0,0 +1,86 @@
|
|||
---
|
||||
description: Detailed exploration into Ultralytics data augmentation methods including BaseTransform, MixUp, LetterBox, ToTensor, and more for enhancing model performance.
|
||||
keywords: Ultralytics, Data Augmentation, BaseTransform, MixUp, RandomHSV, LetterBox, Albumentations, classify_transforms, classify_albumentations
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/data/augment.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/augment.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/augment.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/data/augment.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.BaseTransform
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.Compose
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.BaseMixTransform
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.Mosaic
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.MixUp
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.RandomPerspective
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.RandomHSV
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.RandomFlip
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.LetterBox
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.CopyPaste
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.Albumentations
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.Format
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.ClassifyLetterBox
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.CenterCrop
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.ToTensor
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.v8_transforms
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.classify_transforms
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.hsv2colorjitter
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.augment.classify_albumentations
|
||||
<br><br>
|
||||
14
docs/en/reference/data/base.md
Normal file
|
|
@ -0,0 +1,14 @@
|
|||
---
|
||||
description: Explore BaseDataset in Ultralytics docs. Learn how this implementation simplifies dataset creation and manipulation.
|
||||
keywords: Ultralytics, docs, BaseDataset, data manipulation, dataset creation
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/data/base.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/base.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/base.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/data/base.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.base.BaseDataset
|
||||
<br><br>
|
||||
38
docs/en/reference/data/build.md
Normal file
|
|
@ -0,0 +1,38 @@
|
|||
---
|
||||
description: Explore the Ultralytics YOLO v3 data build procedures, including the InfiniteDataLoader, seed_worker, build_dataloader, and load_inference_source.
|
||||
keywords: Ultralytics, YOLO v3, Data build, DataLoader, InfiniteDataLoader, seed_worker, build_dataloader, load_inference_source
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/data/build.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/build.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/build.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/data/build.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build.InfiniteDataLoader
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build._RepeatSampler
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build.seed_worker
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build.build_yolo_dataset
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build.build_dataloader
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build.check_source
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.build.load_inference_source
|
||||
<br><br>
|
||||
34
docs/en/reference/data/converter.md
Normal file
|
|
@ -0,0 +1,34 @@
|
|||
---
|
||||
description: Explore Ultralytics data converter functions like coco91_to_coco80_class, merge_multi_segment, rle2polygon for efficient data handling.
|
||||
keywords: Ultralytics, Data Converter, coco91_to_coco80_class, merge_multi_segment, rle2polygon
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/data/converter.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/converter.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/converter.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/data/converter.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.converter.coco91_to_coco80_class
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.converter.coco80_to_coco91_class
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.converter.convert_coco
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.converter.convert_dota_to_yolo_obb
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.converter.min_index
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.converter.merge_multi_segment
|
||||
<br><br>
|
||||
30
docs/en/reference/data/dataset.md
Normal file
|
|
@ -0,0 +1,30 @@
|
|||
---
|
||||
description: Explore the YOLODataset and SemanticDataset classes in YOLO data. Learn how to efficiently handle and manipulate your data with Ultralytics.
|
||||
keywords: Ultralytics, YOLO, YOLODataset, SemanticDataset, data handling, data manipulation
|
||||
---
|
||||
|
||||
# Reference for `ultralytics/data/dataset.py`
|
||||
|
||||
!!! note
|
||||
|
||||
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/dataset.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/dataset.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/data/dataset.py) 🛠️. Thank you 🙏!
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.dataset.YOLODataset
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.dataset.ClassificationDataset
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.dataset.SemanticDataset
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.dataset.load_dataset_cache_file
|
||||
<br><br>
|
||||
|
||||
---
|
||||
## ::: ultralytics.data.dataset.save_dataset_cache_file
|
||||
<br><br>
|
||||
Some files were not shown because too many files have changed in this diff.