ultralytics 8.0.97 confusion matrix, windows, docs updates (#2511)

Co-authored-by: Yonghye Kwon <developer.0hye@gmail.com>
Co-authored-by: Dowon <ks2515@naver.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Laughing <61612323+Laughing-q@users.noreply.github.com>

parent 6ee3a9a74b · commit d1107ca4cb
138 changed files with 744 additions and 351 deletions

---
comments: true
description: 'Ultralytics YOLOv5 Docs: Learn model structure, data augmentation & training strategies. Build targets and the losses of object detection.'
---

## 1. Model Structure

YOLOv5 (v6.0/6.1) consists of:

- **Backbone**: `New CSP-Darknet53`
- **Neck**: `SPPF`, `New CSP-PAN`
- **Head**: `YOLOv3 Head`

Model structure (`yolov5l.yaml`):

(model structure diagram)

Some minor changes compared to previous versions:

1. Replace the `Focus` structure with a `6x6 Conv2d` (more efficient, refer to #4825)
2. Replace the `SPP` structure with `SPPF` (more than double the speed)

<details markdown>

(the SPP vs SPPF benchmark script is elided in this diff; only its output survives)

result:

```
True
spp time: 0.5373051166534424
sppf time: 0.20780706405639648
```

</details>
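The `True` in the output above is the check that SPP and SPPF produce identical results. Without the full script, the idea can still be shown in a minimal pure-Python 1-D sketch (the `maxpool1d` helper is illustrative, not YOLOv5 code): chaining one kernel-5 max filter three times matches running kernel-5, 9 and 13 filters in parallel, which is why SPPF is faster for the same output.

```python
def maxpool1d(x, k):
    # stride-1 max filter with 'same' padding; clipping the window at the
    # edges is equivalent to padding with -inf for a max operation
    r = k // 2
    n = len(x)
    return [max(x[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]

# SPP: three parallel max filters with growing kernel sizes
spp = [maxpool1d(x, k) for k in (5, 9, 13)]

# SPPF: one small max filter applied three times in sequence;
# each pass widens the effective receptive field (5 -> 9 -> 13)
y1 = maxpool1d(x, 5)
y2 = maxpool1d(y1, 5)
y3 = maxpool1d(y2, 5)
sppf = [y1, y2, y3]

print(spp == sppf)  # True: same outputs, but SPPF reuses intermediate results
```

The speedup in the benchmark output comes from exactly this reuse: SPPF never recomputes the large windows from scratch.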

## 2. Data Augmentation

- Mosaic
<img src="https://user-images.githubusercontent.com/31005897/159109235-c7aad8f2-1d4f-41f9-8d5f-b2fde6f2885e.png#pic_center" width=80%>

- Copy paste
<img src="https://user-images.githubusercontent.com/31005897/159116277-91b45033-6bec-4f82-afc4-41138866628e.png#pic_center" width=80%>

- Random affine (Rotation, Scale, Translation and Shear)
<img src="https://user-images.githubusercontent.com/31005897/159109326-45cd5acb-14fa-43e7-9235-0f21b0021c7d.png#pic_center" width=80%>

- MixUp
<img src="https://user-images.githubusercontent.com/31005897/159109361-3b24333b-f481-478b-ae00-df7838f0b5cd.png#pic_center" width=80%>

- Albumentations
- Augment HSV (Hue, Saturation, Value)
<img src="https://user-images.githubusercontent.com/31005897/159109407-83d100ba-1aba-4f4b-aa03-4f048f815981.png#pic_center" width=80%>

- Random horizontal flip
<img src="https://user-images.githubusercontent.com/31005897/159109429-0d44619a-a76a-49eb-bfc0-6709860c043e.png#pic_center" width=80%>

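As a rough illustration of the MixUp entry above, here is a toy sketch on nested lists (the `mixup` helper is illustrative; YOLOv5's version operates on numpy image arrays, but it likewise draws a Beta(32, 32) mixing ratio and simply concatenates the two label sets):

```python
import random

def mixup(img1, labels1, img2, labels2, rng=random):
    # blend the two images with a Beta(32, 32) mixing ratio and
    # concatenate the two label lists unchanged
    lam = rng.betavariate(32.0, 32.0)
    img = [[lam * a + (1.0 - lam) * b for a, b in zip(r1, r2)]
           for r1, r2 in zip(img1, img2)]
    return img, labels1 + labels2

# usage on a 1x2 toy "image"
mixed, labels = mixup([[0.0, 0.2]], ['dog'], [[1.0, 0.8]], ['cat'])
```

Beta(32, 32) concentrates the ratio near 0.5, so the blend is usually close to an even mix of the two images.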
## 3. Training Strategies

- Mixed precision
- Evolve hyper-parameters

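The hyper-parameter evolution bullet can be sketched as a single mutation step (a simplified stand-in for `--evolve`; the real implementation applies per-parameter gains and bounds, and selects parents by fitness):

```python
import random

def mutate(hyp, sigma=0.2, seed=None):
    # jitter every hyper-parameter by a random factor in [1 - sigma, 1 + sigma];
    # evolved candidates are then trained and ranked by fitness
    rng = random.Random(seed)
    return {k: v * (1.0 + sigma * rng.uniform(-1.0, 1.0)) for k, v in hyp.items()}

hyp = {'lr0': 0.01, 'momentum': 0.937}
child = mutate(hyp, seed=0)
```

Repeating mutate-train-select over many generations is what gradually improves the hyper-parameter set.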
## 4. Others

### 4.1 Compute Losses

The YOLOv5 loss consists of three parts:

- Classes loss (BCE loss)
- Objectness loss (BCE loss)
- Location loss (CIoU loss)

$$Loss = \lambda_1 L_{cls} + \lambda_2 L_{obj} + \lambda_3 L_{loc}$$

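The BCE terms and the weighted sum can be written out directly (a minimal sketch; `bce`, `total_loss` and the lambda values are illustrative, not the actual loss code):

```python
import math

def bce(p, y):
    # binary cross-entropy for one prediction p in (0, 1), target y in {0, 1}
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def total_loss(l_cls, l_obj, l_loc, lams=(1.0, 1.0, 1.0)):
    # weighted sum of the three parts; the lambdas are placeholders here
    return lams[0] * l_cls + lams[1] * l_obj + lams[2] * l_loc
```

A confident correct prediction (`p` near the target) yields a small BCE term, e.g. `bce(0.9, 1)` is far below `bce(0.5, 1)`.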
### 4.2 Balance Losses

The objectness losses of the three prediction layers (`P3`, `P4`, `P5`) are weighted differently. The balance weights are `[4.0, 1.0, 0.4]` respectively.

$$L_{obj} = 4.0 \cdot L_{obj}^{small} + 1.0 \cdot L_{obj}^{medium} + 0.4 \cdot L_{obj}^{large}$$

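Applied in code, the weighted sum might look like this (`balanced_obj_loss` is a hypothetical helper name):

```python
def balanced_obj_loss(layer_losses, balance=(4.0, 1.0, 0.4)):
    # weight the per-layer objectness losses (P3, P4, P5 order) before
    # summing; the small-object layer P3 gets the largest weight
    return sum(w * l for w, l in zip(balance, layer_losses))
```

With equal per-layer losses of 1.0, the balanced total is 4.0 + 1.0 + 0.4 = 5.4, showing how much the small-object layer dominates.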
### 4.3 Eliminate Grid Sensitivity

In YOLOv2 and YOLOv3, the formula for calculating the predicted target information is:

$$b_x = \sigma(t_x) + c_x$$
$$b_y = \sigma(t_y) + c_y$$

In YOLOv5, the formula is:

$$b_x = (2 \cdot \sigma(t_x) - 0.5) + c_x$$
$$b_y = (2 \cdot \sigma(t_y) - 0.5) + c_y$$
$$b_w = p_w \cdot (2 \cdot \sigma(t_w))^2$$
$$b_h = p_h \cdot (2 \cdot \sigma(t_h))^2$$

Compare the center point offset before and after scaling: the offset range is adjusted from (0, 1) to (-0.5, 1.5), so an offset of exactly 0 or 1 is now easy to reach.

Compare the height and width scaling ratio (relative to anchor) before and after adjustment.

<img src="https://user-images.githubusercontent.com/31005897/158508089-5ac0c7a3-6358-44b7-863e-a6e45babb842.png#pic_center" width=40%>

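The two decodes can be compared numerically in a short sketch (function names are illustrative): the old sigmoid offset saturates strictly inside (0, 1), while the YOLOv5 form spans (-0.5, 1.5) and bounds width/height to at most 4x the anchor.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_xy_v3(t, c):
    # YOLOv2/YOLOv3 center decode: offset sigma(t) stays strictly in (0, 1)
    return sigmoid(t) + c

def decode_xy_v5(t, c):
    # YOLOv5 center decode: offset 2*sigma(t) - 0.5 spans (-0.5, 1.5),
    # so offsets of 0 or 1 are reachable at finite t
    return 2.0 * sigmoid(t) - 0.5 + c

def decode_wh_v5(t, p):
    # width/height relative to anchor p is capped at (2*sigma)^2 <= 4,
    # unlike the unbounded exp(t) used previously
    return p * (2.0 * sigmoid(t)) ** 2
```

For example, a large raw output `t = 10` pushes the v5 center offset past 1.0, which the v3 decode can never reach.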
### 4.4 Build Targets

Match positive samples:

- Calculate the aspect ratio of GT and Anchor Templates

- Because the center point offset range is adjusted from (0, 1) to (-0.5, 1.5), a GT box can be assigned to more anchors.

<img src="https://user-images.githubusercontent.com/31005897/158508139-9db4e8c2-cf96-47e0-bc80-35d11512f296.png#pic_center" width=70%>
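The aspect-ratio test from the matching steps above can be sketched as follows (`match_anchor` is an illustrative helper; the threshold of 4.0 corresponds to YOLOv5's `anchor_t` hyper-parameter):

```python
def match_anchor(gt_wh, anchor_wh, anchor_t=4.0):
    # a GT box is matched to an anchor template when the worst
    # width/height ratio between them stays below anchor_t
    rw = gt_wh[0] / anchor_wh[0]
    rh = gt_wh[1] / anchor_wh[1]
    worst = max(rw, 1.0 / rw, rh, 1.0 / rh)
    return worst < anchor_t

# a near-square GT matches a similar anchor, but a 5:1 mismatch does not
```

Boxes passing this shape test are then assigned to the matching grid cells, including the extra neighboring cells enabled by the wider (-0.5, 1.5) offset range.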