ultralytics 8.3.25 Alibaba MNN export and predict support (#16802)

Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Francesco Mattioli <Francesco.mttl@gmail.com>
Co-authored-by: Laughing <61612323+Laughing-q@users.noreply.github.com>
Co-authored-by: Laughing-q <1185102784@qq.com>
Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2024-10-30 20:59:48 +08:00
commit 9c72d94ba4
parent 11b4194344
14 changed files with 465 additions and 39 deletions
--- a/docs/en/guides/model-deployment-options.md
+++ b/docs/en/guides/model-deployment-options.md
@@ -258,25 +258,30 @@ NCNN is a high-performance neural network inference framework optimized for the

 - **Hardware Acceleration**: Tailored for ARM CPUs and GPUs, with specific optimizations for these architectures.

+#### MNN
+
+MNN is a highly efficient, lightweight deep learning framework. It supports both inference and training of deep learning models, with an emphasis on strong on-device performance. MNN is also used on embedded devices, such as IoT hardware.
+
 ## Comparative Analysis of YOLO11 Deployment Options

 The following table provides a snapshot of the various deployment options available for YOLO11 models, helping you to assess which may best fit your project needs based on several critical criteria. For an in-depth look at each deployment option's format, please see the [Ultralytics documentation page on export formats](../modes/export.md#export-formats).

-| Deployment Option | Performance Benchmarks                          | Compatibility and Integration                  | Community Support and Ecosystem               | Case Studies                               | Maintenance and Updates                     | Security Considerations                           | Hardware Acceleration              |
-| ----------------- | ----------------------------------------------- | ---------------------------------------------- | --------------------------------------------- | ------------------------------------------ | ------------------------------------------- | ------------------------------------------------- | ---------------------------------- |
-| PyTorch           | Good flexibility; may trade off raw performance | Excellent with Python libraries                | Extensive resources and community             | Research and prototypes                    | Regular, active development                 | Dependent on deployment environment               | CUDA support for GPU acceleration  |
-| TorchScript       | Better for production than PyTorch              | Smooth transition from PyTorch to C++          | Specialized but narrower than PyTorch         | Industry where Python is a bottleneck      | Consistent updates with PyTorch             | Improved security without full Python             | Inherits CUDA support from PyTorch |
-| ONNX              | Variable depending on runtime                   | High across different frameworks               | Broad ecosystem, supported by many orgs       | Flexibility across ML frameworks           | Regular updates for new operations          | Ensure secure conversion and deployment practices | Various hardware optimizations     |
-| OpenVINO          | Optimized for Intel hardware                    | Best within Intel ecosystem                    | Solid in computer vision domain               | IoT and edge with Intel hardware           | Regular updates for Intel hardware          | Robust features for sensitive applications        | Tailored for Intel hardware        |
-| TensorRT          | Top-tier on NVIDIA GPUs                         | Best for NVIDIA hardware                       | Strong network through NVIDIA                 | Real-time video and image inference        | Frequent updates for new GPUs               | Emphasis on security                              | Designed for NVIDIA GPUs           |
-| CoreML            | Optimized for on-device Apple hardware          | Exclusive to Apple ecosystem                   | Strong Apple and developer support            | On-device ML on Apple products             | Regular Apple updates                       | Focus on privacy and security                     | Apple neural engine and GPU        |
-| TF SavedModel     | Scalable in server environments                 | Wide compatibility in TensorFlow ecosystem     | Large support due to TensorFlow popularity    | Serving models at scale                    | Regular updates by Google and community     | Robust features for enterprise                    | Various hardware accelerations     |
-| TF GraphDef       | Stable for static computation graphs            | Integrates well with TensorFlow infrastructure | Resources for optimizing static graphs        | Scenarios requiring static graphs          | Updates alongside TensorFlow core           | Established TensorFlow security practices         | TensorFlow acceleration options    |
-| TF Lite           | Speed and efficiency on mobile/embedded         | Wide range of device support                   | Robust community, Google backed               | Mobile applications with minimal footprint | Latest features for mobile                  | Secure environment on end-user devices            | GPU and DSP among others           |
-| TF Edge TPU       | Optimized for Google's Edge TPU hardware        | Exclusive to Edge TPU devices                  | Growing with Google and third-party resources | IoT devices requiring real-time processing | Improvements for new Edge TPU hardware      | Google's robust IoT security                      | Custom-designed for Google Coral   |
-| TF.js             | Reasonable in-browser performance               | High with web technologies                     | Web and Node.js developers support            | Interactive web applications               | TensorFlow team and community contributions | Web platform security model                       | Enhanced with WebGL and other APIs |
-| PaddlePaddle      | Competitive, easy to use and scalable           | Baidu ecosystem, wide application support      | Rapidly growing, especially in China          | Chinese market and language processing     | Focus on Chinese AI applications            | Emphasizes data privacy and security              | Including Baidu's Kunlun chips     |
-| NCNN              | Optimized for mobile ARM-based devices          | Mobile and embedded ARM systems                | Niche but active mobile/embedded ML community | Android and ARM systems efficiency         | High performance maintenance on ARM         | On-device security advantages                     | ARM CPUs and GPUs optimizations    |
+| Deployment Option | Performance Benchmarks                          | Compatibility and Integration                  | Community Support and Ecosystem               | Case Studies                               | Maintenance and Updates                        | Security Considerations                           | Hardware Acceleration              |
+| ----------------- | ----------------------------------------------- | ---------------------------------------------- | --------------------------------------------- | ------------------------------------------ | ---------------------------------------------- | ------------------------------------------------- | ---------------------------------- |
+| PyTorch           | Good flexibility; may trade off raw performance | Excellent with Python libraries                | Extensive resources and community             | Research and prototypes                    | Regular, active development                    | Dependent on deployment environment               | CUDA support for GPU acceleration  |
+| TorchScript       | Better for production than PyTorch              | Smooth transition from PyTorch to C++          | Specialized but narrower than PyTorch         | Industry where Python is a bottleneck      | Consistent updates with PyTorch                | Improved security without full Python             | Inherits CUDA support from PyTorch |
+| ONNX              | Variable depending on runtime                   | High across different frameworks               | Broad ecosystem, supported by many orgs       | Flexibility across ML frameworks           | Regular updates for new operations             | Ensure secure conversion and deployment practices | Various hardware optimizations     |
+| OpenVINO          | Optimized for Intel hardware                    | Best within Intel ecosystem                    | Solid in computer vision domain               | IoT and edge with Intel hardware           | Regular updates for Intel hardware             | Robust features for sensitive applications        | Tailored for Intel hardware        |
+| TensorRT          | Top-tier on NVIDIA GPUs                         | Best for NVIDIA hardware                       | Strong network through NVIDIA                 | Real-time video and image inference        | Frequent updates for new GPUs                  | Emphasis on security                              | Designed for NVIDIA GPUs           |
+| CoreML            | Optimized for on-device Apple hardware          | Exclusive to Apple ecosystem                   | Strong Apple and developer support            | On-device ML on Apple products             | Regular Apple updates                          | Focus on privacy and security                     | Apple neural engine and GPU        |
+| TF SavedModel     | Scalable in server environments                 | Wide compatibility in TensorFlow ecosystem     | Large support due to TensorFlow popularity    | Serving models at scale                    | Regular updates by Google and community        | Robust features for enterprise                    | Various hardware accelerations     |
+| TF GraphDef       | Stable for static computation graphs            | Integrates well with TensorFlow infrastructure | Resources for optimizing static graphs        | Scenarios requiring static graphs          | Updates alongside TensorFlow core              | Established TensorFlow security practices         | TensorFlow acceleration options    |
+| TF Lite           | Speed and efficiency on mobile/embedded         | Wide range of device support                   | Robust community, Google backed               | Mobile applications with minimal footprint | Latest features for mobile                     | Secure environment on end-user devices            | GPU and DSP among others           |
+| TF Edge TPU       | Optimized for Google's Edge TPU hardware        | Exclusive to Edge TPU devices                  | Growing with Google and third-party resources | IoT devices requiring real-time processing | Improvements for new Edge TPU hardware         | Google's robust IoT security                      | Custom-designed for Google Coral   |
+| TF.js             | Reasonable in-browser performance               | High with web technologies                     | Web and Node.js developers support            | Interactive web applications               | TensorFlow team and community contributions    | Web platform security model                       | Enhanced with WebGL and other APIs |
+| PaddlePaddle      | Competitive, easy to use and scalable           | Baidu ecosystem, wide application support      | Rapidly growing, especially in China          | Chinese market and language processing     | Focus on Chinese AI applications               | Emphasizes data privacy and security              | Including Baidu's Kunlun chips     |
+| MNN               | High performance on mobile devices              | Mobile/embedded ARM systems and x86-64 CPUs    | Mobile/embedded ML community                  | Mobile systems efficiency                  | High performance maintenance on mobile devices | On-device security advantages                     | ARM CPUs and GPUs optimizations    |
+| NCNN              | Optimized for mobile ARM-based devices          | Mobile and embedded ARM systems                | Niche but active mobile/embedded ML community | Android and ARM systems efficiency         | High performance maintenance on ARM            | On-device security advantages                     | ARM CPUs and GPUs optimizations    |

 This comparative analysis gives you a high-level overview. For deployment, it's essential to consider the specific requirements and constraints of your project, and consult the detailed documentation and resources available for each option.