# Matching Logic

This script compares YOLO and Grounding DINO predictions for the same image and flags mismatches for human review. It matches objects by class name and Intersection-over-Union (IoU), then applies configurable confidence thresholds to decide whether each prediction is accepted or flagged.

## Overview

- **Input**:  
  - YOLO-generated JSON files  
  - DINO-generated JSON files

- **Output**:  
  - Matched files → saved to the labeled directory (`labeled_dir`)  
  - Mismatched files → saved to the pending directory (`pending_dir`)

---

## Functions

### `compute_iou(box1, box2)`
Computes the Intersection-over-Union between two bounding boxes.

- `box1`, `box2`: Lists in `[x1, y1, x2, y2]` format  
- **Returns**: `float` IoU score
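
For reference, IoU over corner-format boxes can be sketched as follows. This is a minimal illustration, not necessarily the script's exact implementation; it assumes `x1 <= x2` and `y1 <= y2`, and returning `0.0` for degenerate (zero-area) boxes is a choice made here.

```python
def compute_iou(box1, box2):
    """IoU of two boxes given as [x1, y1, x2, y2] (assumes x1 <= x2, y1 <= y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box1[0], box2[0])
    iy1 = max(box1[1], box2[1])
    ix2 = min(box1[2], box2[2])
    iy2 = min(box1[3], box2[3])

    # Width/height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter

    # Assumption: degenerate boxes yield 0.0 instead of raising.
    return inter / union if union > 0 else 0.0
```

Two 10×10 boxes offset horizontally by 5 pixels share an area of 50 out of a union of 150, giving an IoU of 1/3, which falls below the default `iou_threshold` of `0.5`.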

---

### `normalize_class(c)`
Cleans up class names for matching purposes.

- Removes `"BSI"` suffixes and lowercases the name  
- **Returns**: normalized class name
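
One way this normalization might look is sketched below. How the suffix is separated from the rest of the name (space, underscore, hyphen) is not specified here, so the separator handling and the case-insensitive suffix check are assumptions.

```python
def normalize_class(c):
    """Strip a trailing "BSI" suffix and lowercase the class name."""
    name = c.strip()
    # Assumption: the suffix check is case-insensitive.
    if name.upper().endswith("BSI"):
        name = name[:-3]
    # Assumption: drop any separator left behind by the suffix.
    return name.rstrip(" _-").lower()
```

This sketch would also trim names that merely end in those three letters; a stricter version could require a separator before the suffix.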

---

### `match_predictions(yolo_preds, dino_preds, iou_thresh)`
Matches YOLO predictions to DINO predictions by class name and IoU.

- `yolo_preds`, `dino_preds`: Lists of prediction dictionaries  
- `iou_thresh`: Minimum IoU to consider a match  
- **Returns**: `List[bool]` indicating which YOLO predictions matched
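
The matching rule can be sketched as below. The dictionary keys `"class"` and `"bbox"` are assumptions about the prediction format, the helpers `_iou` and `_norm` are simplified stand-ins for `compute_iou` and `normalize_class`, and treating an IoU exactly equal to the threshold as a match is also an assumption.

```python
from typing import Dict, List


def _iou(a, b):
    """Stand-in for compute_iou on [x1, y1, x2, y2] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def _norm(name):
    """Stand-in for normalize_class (lowercasing only)."""
    return name.strip().lower()


def match_predictions(yolo_preds: List[Dict], dino_preds: List[Dict],
                      iou_thresh: float) -> List[bool]:
    """Return one bool per YOLO prediction: True if some DINO prediction
    has the same normalized class and an IoU of at least iou_thresh."""
    matched = []
    for y in yolo_preds:
        hit = any(
            _norm(y["class"]) == _norm(d["class"])
            and _iou(y["bbox"], d["bbox"]) >= iou_thresh
            for d in dino_preds
        )
        matched.append(hit)
    return matched
```

In this sketch several YOLO predictions may match the same DINO box; a one-to-one assignment would need extra bookkeeping.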

---

### `match_and_filter(yolo_dir, dino_dir, labeled_dir, pending_dir, config)`
Main entry point: matches predictions and splits files into labeled or pending sets.

- Loads predictions from `yolo_dir` and `dino_dir`
- Flags mismatches based on:
  - Low/medium YOLO confidence
  - High-confidence DINO detections missed by YOLO

#### Output Actions:
- Adds `confidence_flag` to flagged predictions
- Saves:
  - Confident matches to `labeled_dir`
  - Mismatches to `pending_dir` for human review
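
The flagging step could be sketched roughly as follows. The helper name `flag_for_review`, the `"confidence"` key, the flag values `"low"` and `"medium"`, and the choice to flag only YOLO predictions that DINO did not confirm are all assumptions made for illustration, not the script's confirmed behavior.

```python
def flag_for_review(yolo_preds, matched, missed_dino_preds, config):
    """Hypothetical helper: return True if the file should go to pending_dir.

    yolo_preds        -- YOLO prediction dicts (assumed key: "confidence")
    matched           -- one bool per YOLO prediction, e.g. from match_predictions
    missed_dino_preds -- DINO prediction dicts with no matching YOLO box
    config            -- dict with the threshold keys documented here
    """
    needs_review = False

    for pred, is_matched in zip(yolo_preds, matched):
        if is_matched:
            continue  # assumption: confirmed predictions are not flagged
        conf = pred["confidence"]
        if conf < config["low_conf_threshold"]:
            pred["confidence_flag"] = "low"      # assumed flag value
            needs_review = True
        elif conf < config["mid_conf_threshold"]:
            pred["confidence_flag"] = "medium"   # assumed flag value
            needs_review = True

    # High-confidence DINO detections that YOLO missed also trigger review.
    for d in missed_dino_preds:
        if d["confidence"] > config["dino_false_negative_threshold"]:
            needs_review = True

    return needs_review
```

A caller would then write the file to `pending_dir` when the function returns `True` and to `labeled_dir` otherwise.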

#### Config Example:

```python
config = {
    "iou_threshold": 0.5,
    "low_conf_threshold": 0.3,
    "mid_conf_threshold": 0.6,
    "dino_false_negative_threshold": 0.5
}
```

#### Summary Output:
- Total successfully processed files  
- Skipped/unmatched files  
- Files that failed to process due to an error

---

## Configuration Parameters (from `pipeline_config.json`)

The following fields in `pipeline_config.json` directly control YOLO–DINO matching behavior:

| **Key**                         | **Description**                                                                 |
|---------------------------------|---------------------------------------------------------------------------------|
| `iou_threshold`                 | Minimum IoU for a YOLO box and a DINO box to count as a match (default: `0.5`). |
| `low_conf_threshold`            | YOLO confidence below this is treated as a likely false positive (default: `0.3`). |
| `mid_conf_threshold`            | YOLO confidence below this, but not below `low_conf_threshold`, triggers human review (default: `0.6`). |
| `dino_false_negative_threshold` | If DINO detects an object with confidence above this and YOLO misses it, it is flagged for review (default: `0.5`). |

These thresholds guide whether a prediction is confidently accepted, flagged for review, or rejected.
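
As an illustration, the thresholds could be read from `pipeline_config.json` with the documented defaults as fallbacks. The helper names below are hypothetical, and the sketch assumes the four keys sit at the top level of the JSON file, which may not match the actual file layout.

```python
import json
from pathlib import Path

# Defaults taken from the table above.
DEFAULTS = {
    "iou_threshold": 0.5,
    "low_conf_threshold": 0.3,
    "mid_conf_threshold": 0.6,
    "dino_false_negative_threshold": 0.5,
}


def merge_with_defaults(user_cfg):
    """Hypothetical helper: keep only the matching keys, filling gaps with defaults."""
    return {key: user_cfg.get(key, default) for key, default in DEFAULTS.items()}


def load_matching_config(path):
    """Hypothetical loader: read the JSON file if it exists, else use defaults."""
    p = Path(path)
    # Assumption: the threshold keys are top-level entries in the file.
    user_cfg = json.loads(p.read_text()) if p.exists() else {}
    return merge_with_defaults(user_cfg)
```

The resulting dictionary has the same shape as the `config` argument passed to `match_and_filter`.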

---


## Example Usage

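```python
from pathlib import Path

# Assumes match_and_filter has been imported from this script's module and
# `config` is the threshold dictionary shown in the Config Example above.
match_and_filter(
    yolo_dir=Path("data_pipeline/prelabeled/yolo/"),
    dino_dir=Path("data_pipeline/prelabeled/gdino/"),
    labeled_dir=Path("data_pipeline/labeled/"),
    pending_dir=Path("data_pipeline/label_studio/pending/"),
    config=config
)
```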