Add Docs models pages FAQs (#14167)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Glenn Jocher 2024-07-02 20:03:58 +02:00 committed by GitHub
parent 0f2bee4cc6
commit b06c5a4b9e
16 changed files with 821 additions and 47 deletions

@@ -117,3 +117,52 @@ If you find MobileSAM useful in your research or development work, please consid
year={2023}
}
```

## FAQ

### How do I use MobileSAM for image segmentation on a mobile application?

MobileSAM is specifically designed for lightweight and fast image segmentation on mobile applications. To get started, you can download the model weights [here](https://github.com/ChaoningZhang/MobileSAM/blob/master/weights/mobile_sam.pt) and use the following Python code snippet for inference:

```python
from ultralytics import SAM

# Load the MobileSAM model weights
model = SAM("mobile_sam.pt")

# Segment the object at point (900, 370); label 1 marks the point as foreground
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
```

For more detailed usage and various prompts, refer to the [SAM page](sam.md).
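
For instance, MobileSAM also accepts bounding-box prompts through the `bboxes` argument. The sketch below assumes the same example image and weights as above; the box coordinates are illustrative:

```python
from ultralytics import SAM

# Load the MobileSAM model weights
model = SAM("mobile_sam.pt")

# Segment the object inside a bounding-box prompt given as [x1, y1, x2, y2]
model.predict("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])
```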

### What are the performance benefits of using MobileSAM over the original SAM?

MobileSAM offers significant improvements in both size and speed over the original SAM. Here is a detailed comparison:

- **Image Encoder**: MobileSAM uses a smaller Tiny-ViT encoder (5M parameters) instead of the original heavyweight ViT-H encoder (611M parameters), cutting encoding time from 452ms with SAM to 8ms.
- **Overall Pipeline**: MobileSAM's entire pipeline, including image encoding and mask decoding, runs at 12ms per image compared to SAM's 456ms, roughly a 38-fold speedup.

In summary, MobileSAM is dozens of times smaller and faster than the original SAM, making it ideal for mobile applications.
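
To sanity-check these numbers on your own hardware, you can time the pipeline yourself. The snippet below is a minimal benchmarking sketch, assuming the `mobile_sam.pt` weights are available locally; absolute timings will vary by device:

```python
import time

from ultralytics import SAM

model = SAM("mobile_sam.pt")

# Warm-up run so model loading and initialization are not timed
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

# Time a single end-to-end prediction (image encoding + mask decoding)
start = time.perf_counter()
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
print(f"Pipeline latency: {(time.perf_counter() - start) * 1000:.1f} ms")
```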

### Why should developers adopt MobileSAM for mobile applications?

Developers should consider MobileSAM for mobile applications because its lightweight design and fast inference make it highly efficient for real-time image segmentation tasks:

- **Efficiency**: MobileSAM's Tiny-ViT encoder enables rapid processing, delivering segmentation results in just 12ms per image.
- **Size**: The model is significantly smaller than the original SAM, making it easier to deploy and run on mobile devices.

These advantages enable real-time applications such as augmented reality, mobile games, and other interactive experiences. Learn more about MobileSAM's performance on its [project page](https://github.com/ChaoningZhang/MobileSAM).

### How easy is it to transition from the original SAM to MobileSAM?

Transitioning from the original SAM to MobileSAM is straightforward. MobileSAM retains the same pipeline as the original SAM, including pre-processing, post-processing, and all interfaces; only the image encoder has been swapped for the more efficient Tiny-ViT. Users currently running SAM can switch to MobileSAM with minimal code modifications and benefit from the improved performance without significant reconfiguration.
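
As a rough illustration, the only change is typically the weights file passed to the `SAM` class; the prompt-based calls stay identical. In the sketch below, `sam_b.pt` stands in for an original SAM checkpoint:

```python
from ultralytics import SAM

# Before: original SAM checkpoint
# model = SAM("sam_b.pt")

# After: MobileSAM checkpoint; the prompt-based API is unchanged
model = SAM("mobile_sam.pt")
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
```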

### What tasks are supported by the MobileSAM model?

The MobileSAM model supports instance segmentation tasks. Currently, it is optimized for [Inference](../modes/predict.md) mode; additional modes like validation, training, and export are not supported at this time, as indicated in the mode compatibility table:

| Model Type | Tasks Supported                              | Inference | Validation | Training | Export |
| ---------- | -------------------------------------------- | --------- | ---------- | -------- | ------ |
| MobileSAM  | [Instance Segmentation](../tasks/segment.md) | ✅        | ❌         | ❌       | ❌     |

For more information about supported tasks and operational modes, check the [tasks page](../tasks/segment.md) and the mode details like [Inference](../modes/predict.md), [Validation](../modes/val.md), and [Export](../modes/export.md).
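
To confirm what you have loaded, you can inspect the model and run inference, the one supported mode. A minimal sketch, assuming the `mobile_sam.pt` weights as before:

```python
from ultralytics import SAM

model = SAM("mobile_sam.pt")

# Display model information (layers, parameter counts)
model.info()

# Inference is the only supported mode for MobileSAM
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
```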