# Matching Logic This script compares YOLO and Grounding DINO predictions for the same image and flags mismatches for human review. It evaluates object matches based on class name and Intersection-over-Union (IoU) and applies configurable thresholds to determine the confidence of each detection. ## Overview - **Input**: - YOLO-generated JSON files - DINO-generated JSON files - **Output**: - Matched files → saved to labeled directory - Mismatched files → saved to pending directory --- ## Functions ### `compute_iou(box1, box2)` Computes the Intersection-over-Union between two bounding boxes. - `box1`, `box2`: Lists of `[x1, y1, x2, y2]` format - **Returns**: `float` IoU score --- ### `normalize_class(c)` Cleans up class names for matching purposes. - Removes `"BSI"` suffixes and lowercases - **Returns**: normalized class name --- ### `match_predictions(yolo_preds, dino_preds, iou_thresh)` Matches YOLO predictions to DINO predictions by class name and IoU. - `yolo_preds`, `dino_preds`: Lists of prediction dictionaries - `iou_thresh`: Minimum IoU to consider a match - **Returns**: `List[bool]` indicating which YOLO predictions matched --- ### `match_and_filter(yolo_dir, dino_dir, labeled_dir, pending_dir, config)` Main function to match predictions and split into labeled or pending sets. - Loads predictions from YOLO and DINO - Flags mismatches based on: - Low/medium YOLO confidence - High-confidence DINO detections missed by YOLO #### Output Actions: - adds `confidence_flag` to flagged predictions - Saves: - Confident matches to `labeled_dir` - Mismatches to `pending_dir` for human review #### Config Example: ```python config = { "iou_threshold": 0.5, "low_conf_threshold": 0.3, "mid_conf_threshold": 0.6, "dino_false_negative_threshold": 0.5 } ``` #### Summary Output: - Total successfully processed files - Skipped/unmatched files - Files that failed to process due to error --- ## Configuration Parameters (from `pipeline_config.json`) The following fields from the `pipeline_config.json` file directly control **YOLO–DINO Matching Behavior**: | **Key** | **Description** | |--------------------------------|---------------------------------------------------------------------------------| | `iou_threshold` | Minimum IoU score to consider two boxes (YOLO and DINO) a match (default: `0.5`). | | `low_conf_threshold` | YOLO confidence below this is considered a likely false positive (default: `0.3`). | | `mid_conf_threshold` | YOLO confidence below this (but above low) triggers a human review (default: `0.6`). | | `dino_false_negative_threshold`| If DINO detects an object above this confidence and YOLO misses it, flag for review (default: `0.5`). | These thresholds guide whether a prediction is confidently accepted, flagged for review, or rejected. --- ## Example Usage ```python match_and_filter( yolo_dir=Path("data_pipeline/prelabeled/yolo/"), dino_dir=Path("data_pipeline/prelabeled/gdino/"), labeled_dir=Path("data_pipeline/labeled/"), pending_dir=Path("data_pipeline/label_studio/pending/"), config=config ) ```