# Grounding DINO Prelabeling

This page outlines the unit test coverage for the `gdino_prelabelling.py` script. The module runs object detection on a directory of images using the Grounding DINO model and saves prediction results in JSON format.

---

## Coverage Overview

This test suite includes validation for:

- File scanning via `_get_image_files`
- End-to-end behavior of `generate_gd_prelabelling`
- Device detection and fallback logic
- Handling of corrupted or invalid image files
- Handling of empty or missing directories
- JSON output structure and formatting

---

## Constants and Configs

The following are tested either directly or via mock configuration injection:

- `TEXT_PROMPTS`
- `BOX_THRESHOLD`
- `TEXT_THRESHOLD`
- Device resolution: `"cuda"`, `"cpu"`, `"auto"`
- Paths to model weights and config files

---

## Functions Tested

### `_get_image_files(directory)`

- **Returns**: Valid image paths (`.jpg`, `.jpeg`, `.png`)
- **Tests**:
  - Valid image files
  - Mixed content (image + non-image)
  - Empty directories
  - Nonexistent directory (expected to raise error upstream)

---

### `generate_gd_prelabelling(...)`

**Core test focus:**

- Successful predictions written to JSON
- Skipped or unreadable files are logged
- Predictions include class, confidence, bounding box
- All output files use expected format
- Handles model loading errors
- Applies config thresholds correctly
- Automatically creates output directories if missing
- Uses fallback device detection logic when set to `"auto"`

---

## Key Edge Cases Tested

- **Empty folder**: Returns no predictions and exits gracefully
- **Non-image files**: Skipped without error
- **Corrupted images**: Skipped and logged
- **Missing model file**: Raises `FileNotFoundError`
- **Output folder does not exist**: Auto-created
- **Multiple valid predictions per image**: Verified in output structure
- **Verbose mode**: Confirms detailed logs are printed to stdout

---

## Example Test Assertions

- Count of processed images matches input count
- Prediction JSON contains required keys: `bbox`, `confidence`, `class`, `source`
- Output filenames correspond to input image names
- Invalid images do not result in written JSON
- `torch_device` settings are passed to model correctly

---

## Summary

This test suite ensures that `gdino_prelabelling.py`:

- Works as expected across different environments (CPU, GPU, MPS)
- Produces consistently formatted output
- Handles missing, corrupted, or unexpected files gracefully
- Is configurable via the pipeline-level `config` and can be extended

The use of mocks and temporary directories isolates test behavior from model internals, ensuring that unit-level functionality is verified in a reproducible way.
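
As an illustration of the temporary-directory approach, the file-scanning and JSON-key checks listed above could be sketched roughly as follows. This is a minimal sketch, not the suite's actual code: `get_image_files` is a hypothetical stand-in for `_get_image_files` (whose implementation is not shown on this page), the case-insensitive extension match and the `FileNotFoundError` for a missing directory are assumptions, and the sample prediction values, including the bounding-box layout, are made up for the example.

```python
import json
import tempfile
from pathlib import Path

# Extensions and required keys are taken from this page.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
REQUIRED_KEYS = {"bbox", "confidence", "class", "source"}


def get_image_files(directory):
    """Hypothetical stand-in for `_get_image_files`.

    Returns a sorted list of image paths found directly in `directory`.
    Assumption: a missing directory raises FileNotFoundError and the
    extension match ignores case.
    """
    root = Path(directory)
    if not root.is_dir():
        raise FileNotFoundError(f"Directory not found: {root}")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def has_required_keys(prediction):
    """Check that one prediction dict carries every required key."""
    return REQUIRED_KEYS.issubset(prediction)


def check_mixed_content():
    """Create image and non-image files, then return the names that are kept."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        for name in ("a.jpg", "b.png", "notes.txt"):
            (tmp_path / name).write_bytes(b"")  # empty placeholder files
        return [p.name for p in get_image_files(tmp_path)]


def check_empty_directory():
    """An empty directory should yield an empty list rather than an error."""
    with tempfile.TemporaryDirectory() as tmp:
        return get_image_files(tmp)


def check_prediction_roundtrip():
    """Write one illustrative prediction to JSON and read it back."""
    prediction = {
        "bbox": [10, 20, 110, 220],  # illustrative values; layout is assumed
        "confidence": 0.87,
        "class": "person",
        "source": "example",
    }
    with tempfile.TemporaryDirectory() as tmp:
        out_file = Path(tmp) / "a.json"
        out_file.write_text(json.dumps([prediction]))
        return json.loads(out_file.read_text())


if __name__ == "__main__":
    assert check_mixed_content() == ["a.jpg", "b.png"]
    assert check_empty_directory() == []
    assert all(has_required_keys(p) for p in check_prediction_roundtrip())
```

Because every check builds its own temporary directory, no model weights or real images are needed, which keeps the checks independent of model internals.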