ultralytics 8.3.37 TensorRT auto-workspace size (#17748)
Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
parent 6a762564c8
commit ee6fde0beb
5 changed files with 24 additions and 24 deletions
@@ -127,11 +127,11 @@ The arguments provided when using [export](../modes/export.md) for an Ultralytics
 - Adjust the `workspace` value according to your calibration needs and resource availability. While a larger `workspace` may increase calibration time, it allows TensorRT to explore a wider range of optimization tactics, potentially enhancing model performance and [accuracy](https://www.ultralytics.com/glossary/accuracy). Conversely, a smaller `workspace` can reduce calibration time but may limit the optimization strategies, affecting the quality of the quantized model.

-- Default is `workspace=4` (GiB), this value may need to be increased if calibration crashes (exits without warning).
+- Default is `workspace=None`, which allows TensorRT to automatically allocate memory; when configuring manually, this value may need to be increased if calibration crashes (exits without warning).

-- TensorRT will report `UNSUPPORTED_STATE` during export if the value for `workspace` is larger than the memory available to the device, which means the value for `workspace` should be lowered.
+- TensorRT will report `UNSUPPORTED_STATE` during export if the value for `workspace` is larger than the memory available to the device, which means the value for `workspace` should be lowered or set to `None`.

-- If `workspace` is set to max value and calibration fails/crashes, consider reducing the values for `imgsz` and `batch` to reduce memory requirements.
+- If `workspace` is set to the max value and calibration fails/crashes, consider using `None` for auto-allocation or reducing the values for `imgsz` and `batch` to lower memory requirements.

 - <u><b>Remember</b> calibration for INT8 is specific to each device</u>; borrowing a "high-end" GPU for calibration might result in poor performance when inference is run on another device.
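For reference, here is a minimal sketch of the export call these notes apply to, using the Ultralytics Python `export` API. The weights file, calibration dataset, and `batch`/`imgsz` values are illustrative assumptions, not part of this commit:

```python
from ultralytics import YOLO

# Illustrative weights; any Ultralytics detection model exports the same way.
model = YOLO("yolo11n.pt")

# Export to a TensorRT engine with INT8 calibration.
# workspace=None (the new default in this commit) lets TensorRT allocate
# workspace memory automatically; pass a size in GiB (e.g. workspace=4)
# to manage it manually.
model.export(
    format="engine",    # TensorRT engine
    int8=True,          # INT8 quantization, requires calibration data
    data="coco8.yaml",  # illustrative calibration dataset
    workspace=None,     # auto-allocation; set a GiB value to tune manually
    batch=8,            # lower batch and imgsz if calibration runs out of memory
    imgsz=640,
)
```

If calibration still crashes with a manually set `workspace`, the notes above suggest falling back to `workspace=None` or reducing `imgsz` and `batch`.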