Object Detection

The Task

Object detection is the task of identifying objects in images and their associated classes and bounding boxes.

The ObjectDetector and ObjectDetectionData classes internally rely on IceVision.
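
To make the task concrete, here’s a minimal sketch of what a single detection carries: a class label and a bounding box. The Detection class, the labels_in_image helper, and the (x, y, width, height) box layout are illustrative assumptions, not the types Flash or IceVision use internally.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """One detected object: a class label plus an axis-aligned box.

    The (x, y, width, height) layout is an assumption borrowed from the COCO
    convention; it is not the type Flash or IceVision use internally.
    """

    label: str
    bbox: Tuple[float, float, float, float]  # x, y, width, height in pixels

    @property
    def area(self) -> float:
        _, _, width, height = self.bbox
        return width * height


def labels_in_image(detections: List[Detection]) -> List[str]:
    """Return the distinct class labels present, in sorted order."""
    return sorted({d.label for d in detections})


# Made-up detections for a single image.
detections = [
    Detection("person", (10.0, 20.0, 30.0, 40.0)),
    Detection("dog", (50.0, 60.0, 20.0, 10.0)),
    Detection("person", (70.0, 5.0, 15.0, 25.0)),
]
print(labels_in_image(detections))
print([d.area for d in detections])
```

An object detector produces a list like this for every input image, usually with a confidence score attached to each entry.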


Example

Let’s look at object detection with the COCO 128 data set, a subset of COCO train2017 with only 128 images, annotated with the 80 COCO object classes. The data set is organized following the COCO format. Here’s an outline:

coco128
├── annotations
│   └── instances_train2017.json
├── images
│   └── train2017
│       ├── 000000000009.jpg
│       ├── 000000000025.jpg
│       ...
└── labels
    └── train2017
        ├── 000000000009.txt
        ├── 000000000025.txt
        ...
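
For orientation, here’s a rough sketch of how a COCO-format annotation file ties images, boxes, and classes together. The top-level "images", "annotations", and "categories" keys and the [x, y, width, height] box layout follow the COCO format; the sample values and the boxes_per_image helper are made up for illustration and are not taken from instances_train2017.json.

```python
import json
from collections import defaultdict

# A tiny, made-up stand-in for a COCO annotation file.
coco_json = """
{
  "images": [
    {"id": 9, "file_name": "000000000009.jpg", "width": 640, "height": 480},
    {"id": 25, "file_name": "000000000025.jpg", "width": 640, "height": 426}
  ],
  "annotations": [
    {"id": 1, "image_id": 9, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0]},
    {"id": 2, "image_id": 9, "category_id": 18, "bbox": [50.0, 60.0, 20.0, 10.0]},
    {"id": 3, "image_id": 25, "category_id": 1, "bbox": [5.0, 5.0, 100.0, 200.0]}
  ],
  "categories": [
    {"id": 1, "name": "person"},
    {"id": 18, "name": "dog"}
  ]
}
"""


def boxes_per_image(coco):
    """Group (class name, bbox) pairs by image file name."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    id_to_file = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        file_name = id_to_file[ann["image_id"]]
        grouped[file_name].append((id_to_name[ann["category_id"]], ann["bbox"]))
    return dict(grouped)


coco = json.loads(coco_json)
for file_name, boxes in boxes_per_image(coco).items():
    print(file_name, boxes)
```

To inspect the real file, replace the inline string with json.load on the annotation file once the data has been downloaded.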

Once we’ve downloaded the data using download_data(), we can create the ObjectDetectionData. We select a pre-trained EfficientDet (the "efficientdet" head with the "d0" backbone) to use for our ObjectDetector and fine-tune on the COCO 128 data. We then use the trained ObjectDetector for inference. Finally, we save the model. Here’s the full example:

import flash
from flash.core.data.utils import download_data
from flash.image import ObjectDetectionData, ObjectDetector

# 1. Create the DataModule
# Dataset Credit: https://www.kaggle.com/ultralytics/coco128
download_data("https://github.com/zhiqwang/yolov5-rt-stack/releases/download/v0.3.0/coco128.zip", "data/")

datamodule = ObjectDetectionData.from_coco(
    train_folder="data/coco128/images/train2017/",
    train_ann_file="data/coco128/annotations/instances_train2017.json",
    val_split=0.1,
    image_size=128,
)

# 2. Build the task
model = ObjectDetector(head="efficientdet", backbone="d0", num_classes=datamodule.num_classes, image_size=128)

# 3. Create the trainer and finetune the model
trainer = flash.Trainer(max_epochs=1)
trainer.finetune(model, datamodule=datamodule, strategy="freeze")

# 4. Detect objects in a few images!
predictions = model.predict(
    [
        "data/coco128/images/train2017/000000000625.jpg",
        "data/coco128/images/train2017/000000000626.jpg",
        "data/coco128/images/train2017/000000000629.jpg",
    ]
)
print(predictions)

# 5. Save the model!
trainer.save_checkpoint("object_detection_model.pt")
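
Raw detector output often includes low-confidence boxes, so a common follow-up step is to filter by score. The structure of predictions is not specified above and may differ between Flash versions; the sketch below assumes, purely for illustration, that each prediction is a dict with parallel lists under the hypothetical keys "bboxes", "labels", and "scores". Check the actual output before adapting it.

```python
def keep_confident(prediction, min_score=0.5):
    """Drop detections whose score is below min_score.

    Assumes a hypothetical layout: a dict with parallel lists under the
    keys "bboxes", "labels", and "scores". This is not a documented
    Flash output format.
    """
    kept = [i for i, score in enumerate(prediction["scores"]) if score >= min_score]
    return {
        key: [prediction[key][i] for i in kept]
        for key in ("bboxes", "labels", "scores")
    }


# Made-up prediction for one image.
sample_prediction = {
    "bboxes": [[10, 20, 30, 40], [50, 60, 20, 10], [70, 5, 15, 25]],
    "labels": [1, 18, 1],
    "scores": [0.91, 0.12, 0.55],
}
filtered = keep_confident(sample_prediction, min_score=0.5)
print(filtered)
```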

Flash Zero

The object detector can be used directly from the command line with zero code using Flash Zero. You can run the above example with:

flash object_detection

To view configuration options and options for running the object detector with your own data, use:

flash object_detection --help

Custom Transformations

Flash automatically applies some default image / mask transformations and augmentations, but you may wish to customize these for your own use case. The base Preprocess defines 7 hooks for different stages in the data loading pipeline. For object-detection tasks, you can leverage the transformations from Albumentations with the IceVisionTransformAdapter.

import albumentations as alb
from icevision.tfms import A

from flash.core.integrations.icevision.transforms import IceVisionTransformAdapter
from flash.image import ObjectDetectionData

train_transform = {
    "pre_tensor_transform": IceVisionTransformAdapter(
        # Pass the flip probability by keyword so it is not read as another positional argument
        [*A.resize_and_pad(128), A.Normalize(), A.Flip(p=0.4), alb.RandomBrightnessContrast()]
    )
}

datamodule = ObjectDetectionData.from_coco(
    train_folder="data/coco128/images/train2017/",
    train_ann_file="data/coco128/annotations/instances_train2017.json",
    val_split=0.1,
    image_size=128,
    train_transform=train_transform,
)
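
Geometric transforms in object detection have to move the bounding boxes along with the pixels, which is why a box-aware library such as Albumentations is used here. As a rough illustration of the idea (not the Albumentations implementation), here’s how a horizontal flip would remap a box, assuming the COCO-style [x, y, width, height] layout; the hflip_bbox helper is hypothetical.

```python
def hflip_bbox(bbox, image_width):
    """Mirror an (x, y, width, height) box across the vertical centre line.

    The box keeps its size and vertical position; only the left edge moves.
    """
    x, y, width, height = bbox
    return (image_width - x - width, y, width, height)


# A box near the left edge of a 128-pixel-wide image ends up near the right edge.
original = (10, 20, 30, 40)
flipped = hflip_bbox(original, image_width=128)
print(flipped)

# Flipping twice restores the original box.
print(hflip_bbox(flipped, image_width=128) == original)
```

If only the image were flipped and the boxes left untouched, the annotations would no longer line up with the objects they describe.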