Quantization

Script for quantizing models to reduce their size and improve inference speed.

src.pipeline.quantization.fp16_quantization(model, output_path)

Apply FP16 quantization to the model.

Parameters:
  • model – YOLO model to quantize

  • output_path – Path to save the quantized model

Returns:
  Path to the quantized model

Return type:
  str
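
A minimal usage sketch; the checkpoint and output paths below are assumptions for illustration, not values taken from this module:

    from ultralytics import YOLO
    from src.pipeline.quantization import fp16_quantization

    model = YOLO("runs/train/weights/best.pt")  # assumed checkpoint path
    # Returns the path to the FP16-quantized model as a string.
    quantized_path = fp16_quantization(model, "models/best_fp16")
    print(quantized_path)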

src.pipeline.quantization.imx_quantization(model, output_path, quantize_yaml)

Apply IMX post-training quantization to the model.

Note: 300+ images are recommended for IMX INT8 calibration.

Parameters:
  • model – YOLO model to quantize

  • output_path – Path to save the quantized model

  • quantize_yaml – Path to quantize.yaml with nc and class names

Returns:
  Path to the quantized model, or None on non-Linux platforms

Return type:
  str or None
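
A usage sketch that also handles the Linux-only behaviour; the paths and the quantize.yaml contents shown in the comments are assumptions:

    from ultralytics import YOLO
    from src.pipeline.quantization import imx_quantization

    # quantize.yaml is expected to define nc and the class names, e.g.:
    #   nc: 2
    #   names: ["cat", "dog"]
    model = YOLO("runs/train/weights/best.pt")  # assumed checkpoint path
    result = imx_quantization(model, "models/best_imx", "quantize.yaml")
    if result is None:
        # IMX post-training quantization returns None on non-Linux platforms.
        print("Skipping IMX quantization on this platform.")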

src.pipeline.quantization.onnx_quantization(model, output_path, preprocessed_path)

Apply ONNX dynamic quantization to the model.

Parameters:
  • model – YOLO model to quantize

  • output_path – Path to save the final quantized model

  • preprocessed_path – Path to save the preprocessed ONNX model

Returns:
  Path to the quantized model

Return type:
  str
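
The two-step flow implied by preprocessed_path matches onnxruntime's public quantization API. A plausible sketch of that flow is shown below; the actual implementation of this function may differ, and all file paths are assumptions:

    from onnxruntime.quantization import QuantType, quantize_dynamic
    from onnxruntime.quantization.shape_inference import quant_pre_process

    onnx_path = "models/best.onnx"               # assumed: model already exported to ONNX
    preprocessed_path = "models/best_prep.onnx"  # intermediate, shape-inferred model
    output_path = "models/best_int8.onnx"        # final dynamically quantized model

    # Run shape inference and graph optimization before quantizing.
    quant_pre_process(onnx_path, preprocessed_path)

    # Dynamic quantization: weights are stored as 8-bit integers,
    # activations are quantized on the fly at inference time.
    quantize_dynamic(preprocessed_path, output_path, weight_type=QuantType.QUInt8)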

src.pipeline.quantization.quantize_model(model_path: str, quantize_config_path: str) → str

Apply quantization to the model to reduce size and improve inference speed.

Parameters:
  • model_path (str) – Path to the model to quantize

  • quantize_config_path (str) – Path to quantization configuration file

Returns:
  Path to the quantized model

Return type:
  str
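
Typical entry-point usage; the config file name and its keys are assumptions, since the expected schema is not documented here:

    from src.pipeline.quantization import quantize_model

    # The config is assumed to select the quantization method
    # (e.g. fp16, onnx, or imx) and any method-specific options.
    quantized = quantize_model("runs/train/weights/best.pt", "quantize_config.yaml")
    print(f"Quantized model saved to {quantized}")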