Matching Logic
This script compares YOLO and Grounding DINO predictions for the same image and flags mismatches for human review. It evaluates object matches based on class name and Intersection-over-Union (IoU) and applies configurable thresholds to determine the confidence of each detection.
Overview
Input:
YOLO-generated JSON files
DINO-generated JSON files
Output:
Matched files → saved to labeled directory
Mismatched files → saved to pending directory
Functions
compute_iou(box1, box2)
Computes the Intersection-over-Union between two bounding boxes.
box1, box2: Lists in [x1, y1, x2, y2] format
Returns: float IoU score
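The IoU computation described above can be sketched as follows. This is a minimal illustration assuming axis-aligned boxes with x1 <= x2 and y1 <= y2; the zero-union guard is an added assumption, not something the script is documented to do.

```python
from typing import Sequence


def compute_iou(box1: Sequence[float], box2: Sequence[float]) -> float:
    """Return the IoU of two boxes given as [x1, y1, x2, y2]."""
    # Corners of the intersection rectangle (may be empty).
    ix1 = max(box1[0], box2[0])
    iy1 = max(box1[1], box2[1])
    ix2 = min(box1[2], box2[2])
    iy2 = min(box1[3], box2[3])

    # Clamp to zero so non-overlapping boxes give zero intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area1 = max(0.0, box1[2] - box1[0]) * max(0.0, box1[3] - box1[1])
    area2 = max(0.0, box2[2] - box2[0]) * max(0.0, box2[3] - box2[1])
    union = area1 + area2 - inter

    # Assumption: degenerate (zero-area) inputs score 0.0 instead of raising.
    return inter / union if union > 0 else 0.0
```

For example, two 2x2 boxes offset by one unit in each direction overlap in a 1x1 square, so the score is 1 / (4 + 4 - 1) = 1/7.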
normalize_class(c)
Cleans up class names for matching purposes.
Removes "BSI" suffixes and lowercases the name
Returns: normalized class name
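One possible way to implement this normalization is sketched below. The exact form of the suffix is not documented here, so the sketch assumes "BSI" appears as a trailing token, optionally separated from the class name by whitespace, an underscore, or a hyphen.

```python
import re

# Assumption: the suffix is a trailing "BSI", matched case-insensitively,
# with an optional separator (space, underscore, or hyphen) before it.
_BSI_SUFFIX = re.compile(r"[\s_\-]*BSI\s*$", flags=re.IGNORECASE)


def normalize_class(c: str) -> str:
    """Strip a trailing "BSI" suffix and lowercase the class name."""
    return _BSI_SUFFIX.sub("", c.strip()).lower()
```

Note that this pattern would also strip "bsi" from the end of a class name that ends in those letters with no separator; if that can occur in the label set, a stricter pattern that requires the separator may be safer.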
match_predictions(yolo_preds, dino_preds, iou_thresh)
Matches YOLO predictions to DINO predictions by class name and IoU.
yolo_preds, dino_preds: Lists of prediction dictionaries
iou_thresh: Minimum IoU to consider a match
Returns: List[bool] indicating which YOLO predictions matched
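The matching step can be sketched as a greedy search over same-class DINO boxes. Several details here are assumptions: the prediction dictionaries are taken to have "class" and "bbox" keys, each DINO prediction is allowed to match at most one YOLO prediction, and the helpers _iou and _norm are simplified stand-ins for compute_iou and normalize_class.

```python
from typing import Dict, List, Sequence


def _iou(a: Sequence[float], b: Sequence[float]) -> float:
    # Compact stand-in for compute_iou on [x1, y1, x2, y2] boxes.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _norm(c: str) -> str:
    # Simplified stand-in for normalize_class (lowercasing only).
    return c.strip().lower()


def match_predictions(
    yolo_preds: List[Dict], dino_preds: List[Dict], iou_thresh: float
) -> List[bool]:
    """Return one bool per YOLO prediction: True if a DINO box matched it."""
    used = set()  # indices of DINO predictions already claimed
    matched = []
    for yp in yolo_preds:
        best_j, best_iou = None, iou_thresh
        for j, dp in enumerate(dino_preds):
            if j in used or _norm(yp["class"]) != _norm(dp["class"]):
                continue
            iou = _iou(yp["bbox"], dp["bbox"])
            if iou >= best_iou:  # keep the highest-IoU candidate
                best_j, best_iou = j, iou
        if best_j is not None:
            used.add(best_j)
        matched.append(best_j is not None)
    return matched
```

Tracking claimed DINO indices keeps two overlapping YOLO boxes from both being confirmed by a single DINO detection; whether the script enforces this one-to-one constraint is not stated above.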
match_and_filter(yolo_dir, dino_dir, labeled_dir, pending_dir, config)
Main function to match predictions and split into labeled or pending sets.
Loads predictions from YOLO and DINO
Flags mismatches based on:
Low/medium YOLO confidence
High-confidence DINO detections missed by YOLO
Output Actions:
Adds confidence_flag to flagged predictions
Saves:
Confident matches to labeled_dir
Mismatches to pending_dir for human review
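The save step can be sketched as a small routing helper. The function name route_file, the missed_dino argument, and the output JSON layout ("predictions", "missed_by_yolo") are all hypothetical; only the confidence_flag key and the two destination directories come from the description above.

```python
import json
from pathlib import Path
from typing import Dict, List


def route_file(
    name: str,
    yolo_preds: List[Dict],
    missed_dino: List[Dict],
    labeled_dir: Path,
    pending_dir: Path,
) -> Path:
    """Write one image's predictions to labeled_dir or pending_dir."""
    # A file needs review if any prediction carries a confidence_flag
    # or if DINO found objects that YOLO missed.
    needs_review = bool(missed_dino) or any(
        "confidence_flag" in p for p in yolo_preds
    )
    target = pending_dir if needs_review else labeled_dir
    target.mkdir(parents=True, exist_ok=True)

    out_path = target / name
    # Hypothetical output layout, for illustration only.
    payload = {"predictions": yolo_preds, "missed_by_yolo": missed_dino}
    out_path.write_text(json.dumps(payload, indent=2))
    return out_path
```

Routing at the file level means a single flagged prediction sends the whole image to human review, which matches the matched/mismatched split described in the overview.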
Config Example:
config = {
    "iou_threshold": 0.5,
    "low_conf_threshold": 0.3,
    "mid_conf_threshold": 0.6,
    "dino_false_negative_threshold": 0.5
}
Summary Output:
Total successfully processed files
Skipped/unmatched files
Files that failed to process due to error
Configuration Parameters (from pipeline_config.json)
The following fields from the pipeline_config.json file directly control YOLO–DINO matching behavior:
| Key | Description |
|---|---|
| iou_threshold | Minimum IoU score to consider two boxes (YOLO and DINO) a match |
| low_conf_threshold | YOLO confidence below this is considered a likely false positive |
| mid_conf_threshold | YOLO confidence below this (but above the low threshold) triggers a human review |
| dino_false_negative_threshold | If DINO detects an object above this confidence and YOLO misses it, flag for review |
These thresholds guide whether a prediction is confidently accepted, flagged for review, or rejected.
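The way these thresholds partition confidences can be sketched as follows. The function names and the "accept" / "review" / "reject" labels are illustrative assumptions; the comparison directions follow the descriptions above (strictly below for the YOLO thresholds, strictly above for the DINO threshold).

```python
from typing import Dict


def triage_yolo(confidence: float, config: Dict[str, float]) -> str:
    """Classify one YOLO confidence as "reject", "review", or "accept"."""
    if confidence < config["low_conf_threshold"]:
        return "reject"  # likely false positive
    if confidence < config["mid_conf_threshold"]:
        return "review"  # uncertain: send to a human
    return "accept"


def dino_miss_needs_review(
    dino_confidence: float, matched_by_yolo: bool, config: Dict[str, float]
) -> bool:
    """Flag a DINO detection that YOLO missed despite high DINO confidence."""
    return (not matched_by_yolo) and (
        dino_confidence > config["dino_false_negative_threshold"]
    )


# Threshold values taken from the Config Example in this document.
example_config = {
    "iou_threshold": 0.5,
    "low_conf_threshold": 0.3,
    "mid_conf_threshold": 0.6,
    "dino_false_negative_threshold": 0.5,
}
```

With the example values, a YOLO confidence of 0.45 falls between the low and mid thresholds and is sent to review, while an unmatched DINO detection at 0.7 is flagged as a possible YOLO false negative.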
Example Usage
from pathlib import Path

# config is the threshold dictionary shown under "Config Example"
match_and_filter(
    yolo_dir=Path("data_pipeline/prelabeled/yolo/"),
    dino_dir=Path("data_pipeline/prelabeled/gdino/"),
    labeled_dir=Path("data_pipeline/labeled/"),
    pending_dir=Path("data_pipeline/label_studio/pending/"),
    config=config,
)