Quantization
Utilities for quantizing models to reduce size and improve inference speed.
- src.pipeline.quantization.fp16_quantization(model, output_path)[source]
Apply FP16 quantization to the model.
- Parameters:
model – YOLO model to quantize
output_path – Path to save the quantized model
- Returns:
Path to the quantized model
- Return type:
str
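The effect of FP16 quantization can be illustrated with a small self-contained sketch (the `to_fp16` helper below is hypothetical and does not use the real YOLO export API): every float32 weight is rounded to the nearest IEEE-754 half-precision value, halving storage at the cost of roughly three significant decimal digits of precision.

```python
import struct

def to_fp16(value: float) -> float:
    """Round-trip a float through IEEE-754 half precision (struct format 'e')."""
    return struct.unpack('<e', struct.pack('<e', value))[0]

# Powers of two survive exactly; long mantissas are rounded, since the
# half-precision format only has a 10-bit mantissa (~3 decimal digits).
weights = [0.5, 0.1234567, 1e-5]
print([to_fp16(w) for w in weights])
```

This precision loss is usually negligible for detection accuracy, which is why FP16 is the lowest-risk quantization option offered here.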
- src.pipeline.quantization.imx_quantization(model, output_path, quantize_yaml)[source]
Apply IMX post-training quantization to the model.
Note: at least 300 images are recommended for IMX INT8 calibration.
- Parameters:
model – YOLO model to quantize
output_path – Path to save the quantized model
quantize_yaml – Path to quantize.yaml with nc and class names
- Returns:
Path to the quantized model, or None on non-Linux platforms (IMX export is Linux-only).
- Return type:
str or None
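INT8 post-training quantization needs a calibration pass over representative images to pick a scale for each tensor. The sketch below is a simplified, symmetric per-tensor version of that idea (the helper names are assumptions, not the module's internals); it shows why the 300+ image recommendation matters: a larger calibration set observes a more representative value range, so the chosen scale generalizes better to unseen inputs.

```python
def int8_scale(calibration_values, num_bits=8):
    """Symmetric per-tensor scale derived from the observed value range."""
    max_abs = max(abs(v) for v in calibration_values)
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    return max_abs / qmax

def quantize(values, scale):
    """Map real values onto the int8 grid, clamping to [-128, 127]."""
    return [max(-128, min(127, round(v / scale))) for v in values]

# A small calibration set can miss the true dynamic range: the larger
# batch below observes a wider range and therefore picks a larger scale.
small_batch = [0.2, -0.1, 0.05]
large_batch = small_batch + [1.5, -0.9, 0.7]
print(int8_scale(small_batch), int8_scale(large_batch))
```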
- src.pipeline.quantization.onnx_quantization(model, output_path, preprocessed_path)[source]
Apply ONNX dynamic quantization to the model.
- Parameters:
model – YOLO model to quantize
output_path – Path to save the final quantized model
preprocessed_path – Path to save the preprocessed ONNX model
- Returns:
Path to the quantized model
- Return type:
str
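ONNX dynamic quantization quantizes weights offline and computes activation scales at inference time; an implementation like this one presumably wraps `onnxruntime.quantization.quantize_dynamic`, which that library does provide. The self-contained sketch below (not the module's actual code) shows the asymmetric uint8 scheme applied to a weight tensor: scale and zero point come from the tensor's own min/max, and dequantization recovers each value to within half a quantization step.

```python
def dynamic_quantize(weights):
    """Asymmetric uint8 quantization from the tensor's observed min/max."""
    lo, hi = min(weights), max(weights)
    lo, hi = min(lo, 0.0), max(hi, 0.0)   # the range must include zero
    scale = (hi - lo) / 255 or 1.0        # guard against all-zero tensors
    zero_point = round(-lo / scale)       # uint8 value that represents 0.0
    q = [round(w / scale) + zero_point for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]

q, s, zp = dynamic_quantize([-1.0, 0.0, 0.5, 2.0])
print(q, s, zp)
```

Because only weights are quantized ahead of time, no calibration dataset is needed, in contrast to the IMX INT8 path above.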
- src.pipeline.quantization.quantize_model(model_path: str, quantize_config_path: str) → str [source]
Apply quantization to the model to reduce size and improve inference speed.
- Parameters:
model_path (str) – Path to the model to quantize
quantize_config_path (str) – Path to quantization configuration file
- Returns:
Path to the quantized model
- Return type:
str
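Given the three back-ends above, `quantize_model` presumably reads the quantization config and dispatches to one of them. The sketch below is a hypothetical dispatcher, not the module's actual code: the `format`, `output_path`, `preprocessed_path`, and `quantize_yaml` config keys are assumptions, the back-ends are stubbed, and the real function takes a path to the config file rather than a parsed dict.

```python
import sys

# Stubs standing in for the real back-ends documented above.
def fp16_quantization(model, output_path):
    return output_path

def onnx_quantization(model, output_path, preprocessed_path):
    return output_path

def imx_quantization(model, output_path, quantize_yaml):
    if not sys.platform.startswith("linux"):
        return None  # IMX export is documented as Linux-only
    return output_path

def quantize_model(model_path, config):
    """Route a parsed config to the matching quantization back-end."""
    fmt = config.get("format", "fp16")
    if fmt == "fp16":
        return fp16_quantization(model_path, config["output_path"])
    if fmt == "onnx":
        return onnx_quantization(model_path, config["output_path"],
                                 config["preprocessed_path"])
    if fmt == "imx":
        return imx_quantization(model_path, config["output_path"],
                                config["quantize_yaml"])
    raise ValueError(f"unknown quantization format: {fmt!r}")

print(quantize_model("model.pt", {"format": "fp16",
                                  "output_path": "model_fp16.pt"}))
```

Centralizing the dispatch behind one entry point keeps the pipeline code independent of which quantization scheme the config selects.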