Add FAQs to Docs Datasets and Help sections (#14211)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
This commit is contained in:
Glenn Jocher 2024-07-04 20:42:31 +02:00 committed by GitHub
parent 64862f1b69
commit d5db9c916f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
73 changed files with 3296 additions and 110 deletions

View file

@ -332,3 +332,55 @@ Try our GUI Demo based on Explorer API
- [ ] Remove images that have a higher similarity index than the given threshold
- [ ] Automatically persist new datasets after merging/removing entries
- [ ] Advanced Dataset Visualizations
## FAQ
### What is the Ultralytics Explorer API used for?
The Ultralytics Explorer API is designed for comprehensive dataset exploration. It allows users to filter and search datasets using SQL queries, vector similarity search, and semantic search. This powerful Python API can handle large datasets, making it ideal for various computer vision tasks using Ultralytics models.
### How do I install the Ultralytics Explorer API?
To install the Ultralytics Explorer API along with its dependencies, use the following command:
```bash
pip install ultralytics[explorer]
```
This will automatically install all necessary external libraries for the Explorer API functionality. For additional setup details, refer to the [installation section](#installation) of our documentation.
### How can I use the Ultralytics Explorer API for similarity search?
You can use the Ultralytics Explorer API to perform similarity searches by creating an embeddings table and querying it for similar images. Here's a basic example:
```python
from ultralytics import Explorer
# Create an Explorer object
explorer = Explorer(data="coco128.yaml", model="yolov8n.pt")
explorer.create_embeddings_table()
# Search for similar images to a given image
similar_images_df = explorer.get_similar(img="path/to/image.jpg")
print(similar_images_df.head())
```
For more details, please visit the [Similarity Search section](#1-similarity-search).
### What are the benefits of using LanceDB with Ultralytics Explorer?
LanceDB, used under the hood by Ultralytics Explorer, provides scalable, on-disk embeddings tables. This ensures that you can create and reuse embeddings for large datasets like COCO without running out of memory. These tables are only created once and can be reused, enhancing efficiency in data handling.
### How does the Ask AI feature work in the Ultralytics Explorer API?
The Ask AI feature allows users to filter datasets using natural language queries. This feature leverages LLMs to convert these queries into SQL queries behind the scenes. Here's an example:
```python
from ultralytics import Explorer
# Create an Explorer object
explorer = Explorer(data="coco128.yaml", model="yolov8n.pt")
explorer.create_embeddings_table()
# Query with natural language
query_result = explorer.ask_ai("show me 100 images with exactly one person and 2 dogs. There can be other objects too")
print(query_result.head())
```
For more examples, check out the [Ask AI section](#2-ask-ai-natural-language-querying).

View file

@ -34,7 +34,7 @@ pip install ultralytics[explorer]
Ask AI feature works using OpenAI, so you'll be prompted to set the api key for OpenAI when you first run the GUI.
You can set it like this - `yolo settings openai_api_key="..."`
## Semantic Search / Vector Similarity Search
## Vector Semantic Similarity Search
Semantic search is a technique for finding similar images to a given image. It is based on the idea that similar images will have similar embeddings. In the UI, you can select one of more images and search for the images similar to them. This can be useful when you want to find images similar to a given image or a set of images that don't perform as expected.
@ -71,3 +71,52 @@ WHERE labels LIKE '%person%' AND labels LIKE '%dog%'
</p>
This is a Demo build using the Explorer API. You can use the API to build your own exploratory notebooks or scripts to get insights into your datasets. Learn more about the Explorer API [here](api.md).
## FAQ
### What is Ultralytics Explorer GUI and how do I install it?
Ultralytics Explorer GUI is a powerful interface that unlocks advanced data exploration capabilities using the [Ultralytics Explorer API](api.md). It allows you to run semantic/vector similarity search, SQL queries, and natural language queries using the Ask AI feature powered by Large Language Models (LLMs).
To install the Explorer GUI, you can use pip:
```bash
pip install ultralytics[explorer]
```
Note: To use the Ask AI feature, you'll need to set the OpenAI API key: `yolo settings openai_api_key="..."`.
### How does the semantic search feature in Ultralytics Explorer GUI work?
The semantic search feature in Ultralytics Explorer GUI allows you to find images similar to a given image based on their embeddings. This technique is useful for identifying and exploring images that share visual similarities. To use this feature, select one or more images in the UI and execute a search for similar images. The result will display images that closely resemble the selected ones, facilitating efficient dataset exploration and anomaly detection.
Learn more about semantic search and other features by visiting the [Feature Overview](#vector-semantic-similarity-search) section.
### Can I use natural language to filter datasets in Ultralytics Explorer GUI?
Yes, with the Ask AI feature powered by large language models (LLMs), you can filter your datasets using natural language queries. You don't need to be proficient in SQL. For instance, you can ask "Show me 100 images with exactly one person and 2 dogs. There can be other objects too," and the AI will generate the appropriate query under the hood to deliver the desired results.
See an example of a natural language query [here](#ask-ai).
### How do I run SQL queries on datasets using Ultralytics Explorer GUI?
Ultralytics Explorer GUI allows you to run SQL queries directly on your dataset to filter and manage data efficiently. To run a query, navigate to the SQL query section in the GUI and write your query. For example, to show images with at least one person and one dog, you could use:
```sql
WHERE labels LIKE '%person%' AND labels LIKE '%dog%'
```
You can also provide only the WHERE clause, making the querying process more flexible.
For more details, refer to the [SQL Queries Section](#run-sql-queries-on-your-cv-datasets).
### What are the benefits of using Ultralytics Explorer GUI for data exploration?
Ultralytics Explorer GUI enhances data exploration with features like semantic search, SQL querying, and natural language interactions through the Ask AI feature. These capabilities allow users to:
- Efficiently find visually similar images.
- Filter datasets using complex SQL queries.
- Utilize AI to perform natural language searches, eliminating the need for advanced SQL expertise.
These features make it a versatile tool for developers, researchers, and data scientists looking to gain deeper insights into their datasets.
Explore more about these features in the [Explorer GUI Documentation](#explorer-gui).

View file

@ -58,3 +58,51 @@ yolo explorer
<p>
<img width="1709" alt="Ultralytics Explorer OpenAI Integration" src="https://github.com/AyushExel/assets/assets/15766192/1b5f3708-be3e-44c5-9ea3-adcd522dfc75">
</p>
## FAQ
### What is Ultralytics Explorer and how can it help with CV datasets?
Ultralytics Explorer is a powerful tool designed for exploring computer vision (CV) datasets through semantic search, SQL queries, vector similarity search, and even natural language. This versatile tool provides both a GUI and a Python API, allowing users to seamlessly interact with their datasets. By leveraging technologies like LanceDB, Ultralytics Explorer ensures efficient, scalable access to large datasets without excessive memory usage. Whether you're performing detailed dataset analysis or exploring data patterns, Ultralytics Explorer streamlines the entire process.
Learn more about the [Explorer API](api.md).
### How do I install the dependencies for Ultralytics Explorer?
To manually install the optional dependencies needed for Ultralytics Explorer, you can use the following `pip` command:
```bash
pip install ultralytics[explorer]
```
These dependencies are essential for the full functionality of semantic search and SQL querying. By including libraries powered by [LanceDB](https://lancedb.com/), the installation ensures that the database operations remain efficient and scalable, even for large datasets like COCO.
### How can I use the GUI version of Ultralytics Explorer?
Using the GUI version of Ultralytics Explorer is straightforward. After installing the necessary dependencies, you can launch the GUI with the following command:
```bash
yolo explorer
```
The GUI provides a user-friendly interface for creating dataset embeddings, searching for similar images, running SQL queries, and conducting semantic searches. Additionally, the integration with OpenAI's Ask AI feature allows you to query datasets using natural language, enhancing the flexibility and ease of use.
For storage and scalability information, check out our [installation instructions](#installation-of-optional-dependencies).
### What is the Ask AI feature in Ultralytics Explorer?
The Ask AI feature in Ultralytics Explorer allows users to interact with their datasets using natural language queries. Powered by OpenAI, this feature enables you to ask complex questions and receive insightful answers without needing to write SQL queries or similar commands. To use this feature, you'll need to set your OpenAI API key the first time you run the GUI:
```bash
yolo settings openai_api_key="YOUR_API_KEY"
```
For more on this feature and how to integrate it, see our [GUI Explorer Usage](#gui-explorer-usage) section.
### Can I run Ultralytics Explorer in Google Colab?
Yes, Ultralytics Explorer can be run in Google Colab, providing a convenient and powerful environment for dataset exploration. You can start by opening the provided Colab notebook, which is pre-configured with all the necessary settings:
<a href="https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/docs/en/datasets/explorer/explorer.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
This setup allows you to explore your datasets fully, taking advantage of Google's cloud resources. Learn more in our [Google Colab Guide](../../integrations/google-colab.md).