ultralytics 8.3.72 Fix NVIDIA Jetson DLA core support for DLA inference (#19078)

Signed-off-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: UltralyticsAssistant <web@ultralytics.com>
Co-authored-by: Lakshantha Dissanayake <lakshanthad@yahoo.com>
Co-authored-by: Lakshantha Dissanayake <lakshantha@ultralytics.com>
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
2025-02-06 11:10:36 +08:00
commit c1860b8333
parent 84a8b067c4
5 changed files with 15 additions and 12 deletions
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -99,7 +99,7 @@ jobs:
      fail-fast: false
      matrix:
        # Temporarily disable windows-latest due to https://github.com/ultralytics/ultralytics/actions/runs/13020330819/job/36319338854?pr=18921
-        os: [ubuntu-latest, macos-15, ubuntu-24.04-arm]
+        os: [ubuntu-latest, macos-15]
        python-version: ["3.11"]
        model: [yolo11n]
    steps:
--- a/docs/en/guides/nvidia-jetson.md
+++ b/docs/en/guides/nvidia-jetson.md
@@ -289,10 +289,13 @@ The YOLO11n model in PyTorch format is converted to TensorRT to run inference wi

 The following Jetson devices are equipped with DLA hardware:

- Jetson Orin NX 16GB
- Jetson AGX Orin Series
- Jetson AGX Xavier Series
- Jetson Xavier NX Series
+| Jetson Device            | DLA Cores | DLA Max Frequency |
+| ------------------------ | --------- | ----------------- |
+| Jetson AGX Orin Series   | 2         | 1.6 GHz           |
+| Jetson Orin NX 16GB      | 2         | 614 MHz           |
+| Jetson Orin NX 8GB       | 1         | 614 MHz           |
+| Jetson AGX Xavier Series | 2         | 1.4 GHz           |
+| Jetson Xavier NX Series  | 2         | 1.1 GHz           |

 !!! example

@@ -318,6 +321,7 @@ The following Jetson devices are equipped with DLA hardware:

        ```bash
        # Export a YOLO11n PyTorch model to TensorRT format with DLA enabled (only works with FP16 or INT8)
+        # Once DLA core number is specified at export, it will use the same core at inference
        yolo export model=yolo11n.pt format=engine device="dla:0" half=True  # dla:0 or dla:1 corresponds to the DLA cores

        # Run inference with the exported model on the DLA
--- a/ultralytics/__init__.py
+++ b/ultralytics/__init__.py
@@ -1,6 +1,6 @@
 # Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

-__version__ = "8.3.71"
+__version__ = "8.3.72"

 import os

--- a/ultralytics/engine/exporter.py
+++ b/ultralytics/engine/exporter.py
@@ -386,6 +386,8 @@ class Exporter:
            "names": model.names,
            "args": {k: v for k, v in self.args if k in fmt_keys},
        }  # model metadata
+        if dla is not None:
+            self.metadata["dla"] = dla  # make sure `AutoBackend` uses correct dla device if it has one
        if model.task == "pose":
            self.metadata["kpt_shape"] = model.model[-1].kpt_shape

--- a/ultralytics/nn/autobackend.py
+++ b/ultralytics/nn/autobackend.py
@@ -292,13 +292,10 @@ class AutoBackend(nn.Module):
                    metadata = json.loads(f.read(meta_len).decode("utf-8"))  # read metadata
                except UnicodeDecodeError:
                    f.seek(0)  # engine file may lack embedded Ultralytics metadata
+                dla = metadata.get("dla", None)
+                if dla is not None:
+                    runtime.DLA_core = int(dla)
                model = runtime.deserialize_cuda_engine(f.read())  # read engine
-                if "dla" in str(device.type):
-                    dla_core = int(device.type.split(":")[1])
-                    assert dla_core in {0, 1}, (
-                        "Expected device type for inference in DLA is 'dla:0' or 'dla:1', but received '{device.type}'"
-                    )
-                    runtime.DLA_core = dla_core

            # Model context
            try:
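
The exporter and `AutoBackend` hunks above move the DLA core selection out of the inference-time `device` string and into the metadata stored with the engine file. A minimal sketch of that round trip follows, using plain bytes in place of a real TensorRT engine. The helper names `pack_engine` and `read_dla_core` are hypothetical, and the 4-byte little-endian length prefix and the string-valued `"dla"` entry are assumptions about the file layout that this diff does not show.

```python
import io
import json


def pack_engine(metadata: dict, engine_bytes: bytes) -> bytes:
    """Prefix engine bytes with length-tagged JSON metadata (assumed layout)."""
    meta = json.dumps(metadata).encode("utf-8")
    # Assumption: a 4-byte little-endian length precedes the JSON metadata.
    return len(meta).to_bytes(4, byteorder="little", signed=True) + meta + engine_bytes


def read_dla_core(blob: bytes):
    """Return (dla_core or None, engine_bytes), mirroring the AutoBackend hunk."""
    f = io.BytesIO(blob)
    metadata = {}
    try:
        meta_len = int.from_bytes(f.read(4), byteorder="little")
        metadata = json.loads(f.read(meta_len).decode("utf-8"))  # read metadata
    except (UnicodeDecodeError, json.JSONDecodeError):
        metadata = {}
        f.seek(0)  # blob may lack embedded metadata, so treat it all as engine bytes
    dla = metadata.get("dla", None)
    # In the real code this value would be assigned to `runtime.DLA_core`
    # before the engine is deserialized.
    core = int(dla) if dla is not None else None
    return core, f.read()


if __name__ == "__main__":
    # Export side: record the DLA core chosen at export time (hypothetical values).
    blob = pack_engine({"dla": "1", "names": {0: "person"}}, b"ENGINE")
    # Inference side: the core comes from the file, not from the `device` argument.
    core, engine = read_dla_core(blob)
    print(core, engine)
```

One consequence of this design is that an engine exported for `dla:1` keeps using core 1 at inference without the caller repeating the core number, which matches the note added to the Jetson guide in this commit.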