Matching Logic

This page outlines the unit test coverage for the matching.py script, which compares YOLO and Grounding DINO predictions and routes files into labeled or pending folders based on IoU, confidence, and class agreement.

Coverage Overview

This test suite validates the following core functions:

compute_iou: Intersection-over-Union between two bounding boxes
normalize_class: Standardizes class names (e.g., removes suffixes)
match_predictions: Compares YOLO and DINO predictions using class name and IoU
match_and_filter: Reads YOLO/DINO JSON predictions and splits them into labeled or pending output folders using thresholds

Unit Test Descriptions

`compute_iou`

test_compute_iou_overlap: Validates that IoU is calculated correctly for overlapping boxes.
test_compute_iou_no_overlap: Ensures IoU is 0 for non-overlapping boxes.
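
The two IoU cases above can be sketched as follows. This is a minimal stand-in, not the code in matching.py: the `(x1, y1, x2, y2)` box layout and the function body are assumptions made for illustration.

```python
def compute_iou(box_a, box_b):
    """Return the IoU of two boxes in (x1, y1, x2, y2) form (assumed layout)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def test_compute_iou_overlap():
    # Intersection is 1 x 1 = 1; union is 4 + 4 - 1 = 7.
    assert abs(compute_iou((0, 0, 2, 2), (1, 1, 3, 3)) - 1 / 7) < 1e-9


def test_compute_iou_no_overlap():
    assert compute_iou((0, 0, 1, 1), (2, 2, 3, 3)) == 0.0
```

Comparing against a hand-computed value such as 1/7 keeps the overlap test independent of the implementation it checks.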

`normalize_class`

test_normalize_class_strips_suffix: Checks normalization of class strings (e.g., lowercasing, suffix removal).
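
As an illustration of what such a normalization test might exercise, the sketch below lowercases a class name and strips a trailing numeric suffix. The suffix convention (`_<digits>`) is purely an assumption; this page does not say which suffixes matching.py actually removes.

```python
import re


def normalize_class(name):
    """Lowercase a class name and drop a trailing '_<digits>' suffix.

    The suffix pattern is a hypothetical example, not the real rule.
    """
    cleaned = name.strip().lower()
    return re.sub(r"_\d+$", "", cleaned)


def test_normalize_class_strips_suffix():
    assert normalize_class("Person_1") == "person"
    # Names without a suffix are only lowercased and trimmed.
    assert normalize_class(" Car ") == "car"
```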

`match_predictions`

test_match_predictions_positive: Confirms that two bounding boxes with high IoU and the same class are matched.
test_match_predictions_negative_iou: Confirms that boxes with low IoU do not match, even if the class names are the same.

`match_and_filter`

All tests below use the match_dirs fixture, which sets up temporary yolo, dino, labeled, and pending directories for testing.

test_match_and_filter_matched_goes_labeled: A prediction that agrees on class, meets the IoU threshold, and has sufficient confidence is written to labeled.
test_match_and_filter_low_conf_goes_pending: A low-confidence YOLO prediction sends the file to pending.
test_dino_false_negative_goes_pending: A high-confidence DINO detection with no YOLO counterpart sends the file to pending.
test_match_and_filter_invalid_yolo_json_skipped: Malformed YOLO JSON is handled gracefully: the file is skipped and written to neither labeled nor pending.
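
The directory setup and the invalid-JSON case might look roughly like the sketch below. It uses only the standard library rather than a test-framework fixture, and `make_match_dirs`, `load_predictions`, and `process_file` are hypothetical helpers, not functions from matching.py. In particular, `process_file` copies every readable file to labeled only so that the skip path can be shown in isolation.

```python
import json
import tempfile
from pathlib import Path


def make_match_dirs(root):
    """Create yolo, dino, labeled, and pending directories under root."""
    dirs = {name: Path(root) / name for name in ("yolo", "dino", "labeled", "pending")}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs


def load_predictions(path):
    """Return parsed JSON, or None when the file is not valid JSON."""
    try:
        return json.loads(Path(path).read_text())
    except json.JSONDecodeError:
        return None


def process_file(name, dirs):
    """Hypothetical stand-in: skip unreadable files, copy valid ones to labeled."""
    preds = load_predictions(dirs["yolo"] / name)
    if preds is None:
        return "skipped"
    (dirs["labeled"] / name).write_text(json.dumps(preds))
    return "labeled"


def run_invalid_json_case():
    """Write a malformed YOLO file and report the outcome and any outputs."""
    with tempfile.TemporaryDirectory() as root:
        dirs = make_match_dirs(root)
        (dirs["yolo"] / "img1.json").write_text("{not valid json")
        outcome = process_file("img1.json", dirs)
        written = [p.name for d in ("labeled", "pending") for p in dirs[d].iterdir()]
        return outcome, written
```

Listing both output folders after the call is what distinguishes "skipped" from "routed to pending", which is the distinction the last test in this group draws.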

Configuration Parameters Used

These thresholds are passed as a dictionary to match_and_filter and influence labeling decisions:

Key	Description
`iou_threshold`	Minimum IoU for a match (e.g., `0.5`)
`low_conf_threshold`	YOLO predictions below this are flagged as false positives
`mid_conf_threshold`	Intermediate YOLO confidence triggers review
`dino_false_negative_threshold`	DINO detections above this are considered missed by YOLO
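
One plausible reading of how these four keys interact is sketched below. Only the key names and the `0.5` IoU example come from this page; the other threshold values, the `route` function, and the order of the checks are assumptions for illustration.

```python
config = {
    "iou_threshold": 0.5,                  # example value given on this page
    "low_conf_threshold": 0.3,             # illustrative value
    "mid_conf_threshold": 0.6,             # illustrative value
    "dino_false_negative_threshold": 0.5,  # illustrative value
}


def route(yolo_conf, iou, unmatched_dino_confs, config):
    """Return (destination, reason) for one YOLO prediction (hypothetical logic)."""
    # A strong DINO detection with no YOLO counterpart suggests a miss.
    if any(c > config["dino_false_negative_threshold"] for c in unmatched_dino_confs):
        return "pending", "dino_false_negative"
    if yolo_conf < config["low_conf_threshold"]:
        return "pending", "possible_false_positive"
    if yolo_conf < config["mid_conf_threshold"]:
        return "pending", "needs_review"
    if iou < config["iou_threshold"]:
        return "pending", "low_iou"
    return "labeled", "match"
```

For example, `route(0.9, 0.8, [], config)` yields `("labeled", "match")`, while lowering the confidence to `0.2` yields `("pending", "possible_false_positive")`.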

Summary

The test suite covers the core matching.py logic, checking:

Reliable IoU calculation
Consistent class name normalization
Correct behavior across all matching outcomes
Fault tolerance for malformed files
Accurate routing of predictions based on configurable thresholds

Together, these tests check that only confident, well-aligned predictions pass automatically into the labeled dataset, while edge cases and uncertain predictions are flagged for human review.
