ObjectDetectionData
- class flash.image.detection.data.ObjectDetectionData(train_input=None, val_input=None, test_input=None, predict_input=None, data_fetcher=None, transform=<class 'flash.core.data.io.input_transform.InputTransform'>, transform_kwargs=None, val_split=None, batch_size=None, num_workers=0, sampler=None, pin_memory=True, persistent_workers=False)
The ObjectDetectionData class is a DataModule with a set of classmethods for loading data for object detection.
- classmethod from_coco(train_folder=None, train_ann_file=None, val_folder=None, val_ann_file=None, test_folder=None, test_ann_file=None, predict_folder=None, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, input_cls=<class 'flash.core.integrations.icevision.data.IceVisionInput'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given data folders and annotation files in the COCO JSON format.
For help understanding and using the COCO format, take a look at this tutorial: Create COCO annotations from scratch.
To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
train_folder (Optional[str]) – The folder containing images to use when training.
train_ann_file (Optional[str]) – The COCO format annotation file to use when training.
val_folder (Optional[str]) – The folder containing images to use when validating.
val_ann_file (Optional[str]) – The COCO format annotation file to use when validating.
test_folder (Optional[str]) – The folder containing images to use when testing.
test_ann_file (Optional[str]) – The COCO format annotation file to use when testing.
predict_folder (Optional[str]) – The folder containing images to use when predicting.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
The folder train_folder has the following contents:
train_folder
├── image_1.png
├── image_2.png
├── image_3.png
...
The file train_annotations.json contains the following:
{
    "annotations": [
        {"area": 50, "bbox": [10, 20, 5, 10], "category_id": 1, "id": 1, "image_id": 1, "iscrowd": 0},
        {"area": 100, "bbox": [20, 30, 10, 10], "category_id": 2, "id": 2, "image_id": 2, "iscrowd": 0},
        {"area": 125, "bbox": [10, 20, 5, 25], "category_id": 1, "id": 3, "image_id": 3, "iscrowd": 0}
    ],
    "categories": [
        {"id": 1, "name": "cat", "supercategory": "cat"},
        {"id": 2, "name": "dog", "supercategory": "dog"}
    ],
    "images": [
        {"file_name": "image_1.png", "height": 64, "width": 64, "id": 1},
        {"file_name": "image_2.png", "height": 64, "width": 64, "id": 2},
        {"file_name": "image_3.png", "height": 64, "width": 64, "id": 3}
    ]
}
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_coco(
...     train_folder="train_folder",
...     train_ann_file="train_annotations.json",
...     predict_folder="predict_folder",
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
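An annotation file like the one in the from_coco example can be assembled with the standard library alone. The following is a minimal sketch, not part of Flash: the helper name build_coco_annotations and its argument layout are hypothetical, and it assumes each box is given in the COCO [x, y, width, height] pixel convention, so that area is width times height.

```python
import json


def build_coco_annotations(images, boxes, categories):
    """Assemble a COCO-style dict (hypothetical helper, not a Flash API).

    images: list of (file_name, width, height) tuples
    boxes: list of (image_index, category_name, [x, y, w, h]) tuples
    categories: list of category names; ids are assigned starting at 1
    """
    category_ids = {name: i for i, name in enumerate(categories, start=1)}
    coco = {
        "annotations": [],
        "categories": [
            {"id": cid, "name": name, "supercategory": name}
            for name, cid in category_ids.items()
        ],
        "images": [
            {"file_name": file_name, "height": height, "width": width, "id": i}
            for i, (file_name, width, height) in enumerate(images, start=1)
        ],
    }
    for ann_id, (image_index, name, bbox) in enumerate(boxes, start=1):
        _, _, w, h = bbox
        coco["annotations"].append(
            {
                "area": w * h,  # box area in pixels
                "bbox": list(bbox),
                "category_id": category_ids[name],
                "id": ann_id,
                "image_id": image_index + 1,  # image ids start at 1
                "iscrowd": 0,
            }
        )
    return coco


coco = build_coco_annotations(
    images=[("image_1.png", 64, 64), ("image_2.png", 64, 64), ("image_3.png", 64, 64)],
    boxes=[
        (0, "cat", [10, 20, 5, 10]),
        (1, "dog", [20, 30, 10, 10]),
        (2, "cat", [10, 20, 5, 25]),
    ],
    categories=["cat", "dog"],
)
annotations_json = json.dumps(coco, indent=2)
```

Writing annotations_json to train_annotations.json would give a file with the same content as the one shown in the example.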
- classmethod from_fiftyone(cls, train_dataset=None, val_dataset=None, test_dataset=None, predict_dataset=None, label_field='ground_truth', iscrowd='iscrowd', transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, input_cls=<class 'flash.image.detection.input.ObjectDetectionFiftyOneInput'>, transform_kwargs=None, **data_module_kwargs)
Load the ObjectDetectionData from FiftyOne SampleCollection objects.
Targets will be extracted from the label_field in the SampleCollection objects. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
train_dataset (None) – The SampleCollection to use when training.
val_dataset (None) – The SampleCollection to use when validating.
test_dataset (None) – The SampleCollection to use when testing.
predict_dataset (None) – The SampleCollection to use when predicting.
label_field (str) – The field in the SampleCollection objects containing the targets.
iscrowd (str) – The field in the SampleCollection objects containing the iscrowd annotation (if required).
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
>>> import numpy as np
>>> import fiftyone as fo
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> train_dataset = fo.Dataset.from_images(
...     ["image_1.png", "image_2.png", "image_3.png"]
... )
...
>>> samples = [train_dataset[filepath] for filepath in train_dataset.values("filepath")]
>>> for sample, label, bounding_box in zip(
...     samples,
...     ["cat", "dog", "cat"],
...     [[0.1, 0.2, 0.15, 0.3], [0.2, 0.3, 0.3, 0.4], [0.1, 0.2, 0.15, 0.45]],
... ):
...     sample["ground_truth"] = fo.Detections(
...         detections=[fo.Detection(label=label, bounding_box=bounding_box)],
...     )
...     sample.save()
...
>>> predict_dataset = fo.Dataset.from_images(
...     ["predict_image_1.png", "predict_image_2.png", "predict_image_3.png"]
... )
...
>>> datamodule = ObjectDetectionData.from_fiftyone(
...     train_dataset=train_dataset,
...     predict_dataset=predict_dataset,
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
...
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
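FiftyOne stores a detection's bounding_box as [top-left-x, top-left-y, width, height] in coordinates relative to the image size, each in the range [0, 1]. If boxes are available in pixels (for example as the xmin, ymin, width, height dictionaries used by the other loaders on this page), a conversion along the following lines may be useful. This is only a sketch; the helper name to_relative_bbox is hypothetical and not part of Flash or FiftyOne.

```python
def to_relative_bbox(bbox, image_width, image_height):
    """Convert a pixel box dict to a relative [x, y, w, h] list.

    bbox: dict with integer pixel values under xmin, ymin, width, height
    image_width, image_height: image size in pixels
    """
    return [
        bbox["xmin"] / image_width,
        bbox["ymin"] / image_height,
        bbox["width"] / image_width,
        bbox["height"] / image_height,
    ]


# A 64x64 image with a box whose top-left corner is at (16, 32)
relative = to_relative_bbox(
    {"xmin": 16, "ymin": 32, "width": 8, "height": 16},
    image_width=64,
    image_height=64,
)
```

The resulting list can then be used as the bounding_box value when constructing a detection.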
- classmethod from_files(train_files=None, train_targets=None, train_bboxes=None, val_files=None, val_targets=None, val_bboxes=None, test_files=None, test_targets=None, test_bboxes=None, predict_files=None, target_formatter=None, input_cls=<class 'flash.image.detection.input.ObjectDetectionFilesInput'>, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given lists of image files, bounding boxes, and targets.
The supported file extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp, and .npy. The targets can be in any of our supported classification target formats. The bounding boxes are expected to be dictionaries with integer values (representing pixels) and the following keys: xmin, ymin, width, height. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
train_files (Optional[Sequence[str]]) – The list of image files to use when training.
train_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when training.
train_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when training.
val_files (Optional[Sequence[str]]) – The list of image files to use when validating.
val_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when validating.
val_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when validating.
test_files (Optional[Sequence[str]]) – The list of image files to use when testing.
test_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when testing.
test_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when testing.
predict_files (Optional[Sequence[str]]) – The list of image files to use when predicting.
target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_files(
...     train_files=["image_1.png", "image_2.png", "image_3.png"],
...     train_targets=[["cat"], ["dog"], ["cat"]],
...     train_bboxes=[
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 10}],
...         [{"xmin": 20, "ymin": 30, "width": 10, "height": 10}],
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 25}],
...     ],
...     predict_files=["predict_image_1.png", "predict_image_2.png", "predict_image_3.png"],
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(labels=datamodule.labels)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
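Many annotation tools export boxes as corner coordinates (xmin, ymin, xmax, ymax) rather than the xmin, ymin, width, height dictionaries expected by from_files. A small conversion step, sketched below, bridges the two. The helper name corners_to_bbox is hypothetical and not part of Flash.

```python
def corners_to_bbox(xmin, ymin, xmax, ymax):
    """Convert corner coordinates (pixels) to an xmin/ymin/width/height dict."""
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("expected xmax > xmin and ymax > ymin")
    return {
        "xmin": int(xmin),
        "ymin": int(ymin),
        "width": int(xmax - xmin),
        "height": int(ymax - ymin),
    }


# One list of boxes per image, matching the layout of train_bboxes
train_bboxes = [
    [corners_to_bbox(10, 20, 15, 30)],
    [corners_to_bbox(20, 30, 30, 40)],
    [corners_to_bbox(10, 20, 15, 45)],
]
```

The resulting train_bboxes has the same values as the list used in the from_files example.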
- classmethod from_folders(predict_folder=None, predict_transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, input_cls=<class 'flash.core.integrations.icevision.data.IceVisionInput'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given data folders. This is currently supported only for the predicting stage.
- Parameters
predict_folder (Optional[str]) – The folder containing the predict data.
predict_transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The dictionary of transforms to use during predicting.
data_module_kwargs (Any) – The keyword arguments for creating the datamodule.
- Return type
ObjectDetectionData
- Returns
The constructed data module.
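Because from_folders covers only the predicting stage, it can help to inspect which files in a folder look like images before building a data module. The sketch below uses only the standard library. The helper name list_image_files is hypothetical, and the extension set is copied from the from_files description; whether from_folders applies exactly the same filter is an assumption.

```python
import tempfile
from pathlib import Path

# Extensions listed in the from_files description; assumed (not confirmed)
# to be a sensible filter for a prediction folder as well.
IMAGE_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp", ".npy",
}


def list_image_files(folder):
    """Return sorted paths of files in `folder` with a supported extension."""
    return sorted(
        str(path)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


# Demonstration on a temporary folder with two images and one unrelated file
with tempfile.TemporaryDirectory() as predict_folder:
    for name in ("predict_image_1.png", "notes.txt", "predict_image_2.JPG"):
        (Path(predict_folder) / name).touch()
    found = [Path(p).name for p in list_image_files(predict_folder)]
```

A list produced this way could also be passed as predict_files to from_files when more control over the selected files is needed.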
- classmethod from_images(train_images=None, train_targets=None, train_bboxes=None, val_images=None, val_targets=None, val_bboxes=None, test_images=None, test_targets=None, test_bboxes=None, predict_images=None, target_formatter=None, input_cls=<class 'flash.image.detection.input.ObjectDetectionImageInput'>, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given lists of PIL images and corresponding lists of bounding boxes and targets.
The targets can be in any of our supported classification target formats. The bounding boxes are expected to be dictionaries with integer values (representing pixels) and the following keys: xmin, ymin, width, height. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
train_images (Optional[List[object]]) – The list of PIL images to use when training.
train_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when training.
train_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when training.
val_images (Optional[List[object]]) – The list of PIL images to use when validating.
val_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when validating.
val_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when validating.
test_images (Optional[List[object]]) – The list of PIL images to use when testing.
test_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when testing.
test_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when testing.
predict_images (Optional[List[object]]) – The list of PIL images to use when predicting.
target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
>>> from PIL import Image
>>> import numpy as np
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_images(
...     train_images=[
...         Image.fromarray(np.random.randint(0, 255, (64, 64, 3), dtype="uint8")),
...         Image.fromarray(np.random.randint(0, 255, (64, 64, 3), dtype="uint8")),
...         Image.fromarray(np.random.randint(0, 255, (64, 64, 3), dtype="uint8")),
...     ],
...     train_targets=[["cat"], ["dog"], ["cat"]],
...     train_bboxes=[
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 10}],
...         [{"xmin": 20, "ymin": 30, "width": 10, "height": 10}],
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 25}],
...     ],
...     predict_images=[Image.fromarray(np.random.randint(0, 255, (64, 64, 3), dtype="uint8"))],
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(labels=datamodule.labels)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
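The images, targets, and bounding boxes are passed as parallel lists: one inner list of targets and one inner list of boxes per image. A quick consistency check before constructing the data module can catch misaligned inputs early. The sketch below is illustrative only; check_detection_inputs is a hypothetical helper, and it assumes one target per bounding box, which the examples on this page follow but which Flash itself may not enforce in this form.

```python
REQUIRED_KEYS = {"xmin", "ymin", "width", "height"}


def check_detection_inputs(images, targets, bboxes):
    """Raise ValueError if the parallel lists are not aligned."""
    if not (len(images) == len(targets) == len(bboxes)):
        raise ValueError("images, targets and bboxes must have the same length")
    for index, (image_targets, image_bboxes) in enumerate(zip(targets, bboxes)):
        # Assumption: every box has exactly one target
        if len(image_targets) != len(image_bboxes):
            raise ValueError(f"image {index}: targets and boxes differ in count")
        for bbox in image_bboxes:
            missing = REQUIRED_KEYS - bbox.keys()
            if missing:
                raise ValueError(f"image {index}: box is missing {sorted(missing)}")
    return True


train_images = ["img_a", "img_b", "img_c"]  # stand-ins for PIL images
train_targets = [["cat"], ["dog"], ["cat"]]
train_bboxes = [
    [{"xmin": 10, "ymin": 20, "width": 5, "height": 10}],
    [{"xmin": 20, "ymin": 30, "width": 10, "height": 10}],
    [{"xmin": 10, "ymin": 20, "width": 5, "height": 25}],
]
inputs_ok = check_detection_inputs(train_images, train_targets, train_bboxes)
```

The same check applies unchanged to the inputs of from_files, from_numpy, and from_tensors, since they share the targets and bounding box layout.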
- classmethod from_numpy(train_data=None, train_targets=None, train_bboxes=None, val_data=None, val_targets=None, val_bboxes=None, test_data=None, test_targets=None, test_bboxes=None, predict_data=None, target_formatter=None, input_cls=<class 'flash.image.detection.input.ObjectDetectionNumpyInput'>, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given numpy arrays (or lists of arrays) and corresponding lists of bounding boxes and targets.
The targets can be in any of our supported classification target formats. The bounding boxes are expected to be dictionaries with integer values (representing pixels) and the following keys: xmin, ymin, width, height. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
train_data (Optional[Collection[ndarray]]) – The numpy array or list of arrays to use when training.
train_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when training.
train_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when training.
val_data (Optional[Collection[ndarray]]) – The numpy array or list of arrays to use when validating.
val_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when validating.
val_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when validating.
test_data (Optional[Collection[ndarray]]) – The numpy array or list of arrays to use when testing.
test_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when testing.
test_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when testing.
predict_data (Optional[Collection[ndarray]]) – The numpy array or list of arrays to use when predicting.
target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
>>> import numpy as np
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_numpy(
...     train_data=[np.random.rand(3, 64, 64), np.random.rand(3, 64, 64), np.random.rand(3, 64, 64)],
...     train_targets=[["cat"], ["dog"], ["cat"]],
...     train_bboxes=[
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 10}],
...         [{"xmin": 20, "ymin": 30, "width": 10, "height": 10}],
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 25}],
...     ],
...     predict_data=[np.random.rand(3, 64, 64)],
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(labels=datamodule.labels)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
- classmethod from_tensors(train_data=None, train_targets=None, train_bboxes=None, val_data=None, val_targets=None, val_bboxes=None, test_data=None, test_targets=None, test_bboxes=None, predict_data=None, target_formatter=None, input_cls=<class 'flash.image.detection.input.ObjectDetectionTensorInput'>, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given torch tensors (or lists of tensors) and corresponding lists of bounding boxes and targets.
The targets can be in any of our supported classification target formats. The bounding boxes are expected to be dictionaries with integer values (representing pixels) and the following keys: xmin, ymin, width, height. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
train_data (Optional[Collection[Tensor]]) – The torch tensor or list of tensors to use when training.
train_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when training.
train_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when training.
val_data (Optional[Collection[Tensor]]) – The torch tensor or list of tensors to use when validating.
val_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when validating.
val_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when validating.
test_data (Optional[Collection[Tensor]]) – The torch tensor or list of tensors to use when testing.
test_targets (Optional[Sequence[Sequence[Any]]]) – The list of lists of targets to use when testing.
test_bboxes (Optional[Sequence[Sequence[Dict[str, int]]]]) – The list of lists of bounding boxes to use when testing.
predict_data (Optional[Collection[Tensor]]) – The torch tensor or list of tensors to use when predicting.
target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
>>> import torch
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_tensors(
...     train_data=[torch.rand(3, 64, 64), torch.rand(3, 64, 64), torch.rand(3, 64, 64)],
...     train_targets=[["cat"], ["dog"], ["cat"]],
...     train_bboxes=[
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 10}],
...         [{"xmin": 20, "ymin": 30, "width": 10, "height": 10}],
...         [{"xmin": 10, "ymin": 20, "width": 5, "height": 25}],
...     ],
...     predict_data=[torch.rand(3, 64, 64)],
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(labels=datamodule.labels)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
- classmethod from_via(labels, label_field='label', train_folder=None, train_ann_file=None, val_folder=None, val_ann_file=None, test_folder=None, test_ann_file=None, predict_folder=None, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, input_cls=<class 'flash.core.integrations.icevision.data.IceVisionInput'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given data folders and annotation files in the VIA (VGG Image Annotator) JSON format.
To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
labels (List[str]) – A list of class labels. Note that the list should not include a label for the background class, which will be added automatically as class zero (additional labels will be sorted).
label_field (str) – The field within region_attributes which corresponds to the region label.
train_folder (Optional[str]) – The folder containing images to use when training.
train_ann_file (Optional[str]) – The VIA format annotation file to use when training.
val_folder (Optional[str]) – The folder containing images to use when validating.
val_ann_file (Optional[str]) – The VIA format annotation file to use when validating.
test_folder (Optional[str]) – The folder containing images to use when testing.
test_ann_file (Optional[str]) – The VIA format annotation file to use when testing.
predict_folder (Optional[str]) – The folder containing images to use when predicting.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
The folder train_folder has the following contents:
train_folder
├── image_1.png
├── image_2.png
├── image_3.png
...
The file train_annotations.json contains the following:
{
    "image_1.png": {
        "filename": "image_1.png",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 10, "y": 20, "width": 5, "height": 10},
            "region_attributes": {"label": "cat"}
        }]
    },
    "image_2.png": {
        "filename": "image_2.png",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 20, "y": 30, "width": 10, "height": 10},
            "region_attributes": {"label": "dog"}
        }]
    },
    "image_3.png": {
        "filename": "image_3.png",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 10, "y": 20, "width": 5, "height": 25},
            "region_attributes": {"label": "cat"}
        }]
    }
}
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_via(
...     ["cat", "dog"],
...     train_folder="train_folder",
...     train_ann_file="train_annotations.json",
...     predict_folder="predict_folder",
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
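Since from_via requires the class labels up front, it can be convenient to derive them from the annotation file instead of typing them by hand. The sketch below walks a VIA-style dictionary shaped like the example file and collects the distinct values of label_field inside region_attributes. The helper name collect_via_labels is hypothetical and not part of Flash; the background class is deliberately left out because it is added automatically.

```python
def collect_via_labels(annotations, label_field="label"):
    """Return the sorted, distinct region labels in a VIA-style dict."""
    labels = set()
    for entry in annotations.values():
        for region in entry.get("regions", []):
            labels.add(region["region_attributes"][label_field])
    return sorted(labels)


# Same structure as the train_annotations.json example (e.g. from json.load)
via_annotations = {
    "image_1.png": {
        "filename": "image_1.png",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 10, "y": 20, "width": 5, "height": 10},
            "region_attributes": {"label": "cat"},
        }],
    },
    "image_2.png": {
        "filename": "image_2.png",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 20, "y": 30, "width": 10, "height": 10},
            "region_attributes": {"label": "dog"},
        }],
    },
    "image_3.png": {
        "filename": "image_3.png",
        "regions": [{
            "shape_attributes": {"name": "rect", "x": 10, "y": 20, "width": 5, "height": 25},
            "region_attributes": {"label": "cat"},
        }],
    },
}

labels = collect_via_labels(via_annotations)
```

The resulting list matches the ["cat", "dog"] argument passed to from_via in the example.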
- classmethod from_voc(labels, train_folder=None, train_ann_folder=None, val_folder=None, val_ann_folder=None, test_folder=None, test_ann_folder=None, predict_folder=None, transform=<class 'flash.core.integrations.icevision.transforms.IceVisionInputTransform'>, input_cls=<class 'flash.core.integrations.icevision.data.IceVisionInput'>, transform_kwargs=None, **data_module_kwargs)
Creates an ObjectDetectionData object from the given data folders and annotation files in the PASCAL VOC (Visual Object Classes) XML format.
To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
labels (List[str]) – A list of class labels. Note that the list should not include a label for the background class, which will be added automatically as class zero (additional labels will be sorted).
train_folder (Optional[str]) – The folder containing images to use when training.
train_ann_folder (Optional[str]) – The folder containing VOC format annotation files to use when training.
val_folder (Optional[str]) – The folder containing images to use when validating.
val_ann_folder (Optional[str]) – The folder containing VOC format annotation files to use when validating.
test_folder (Optional[str]) – The folder containing images to use when testing.
test_ann_folder (Optional[str]) – The folder containing VOC format annotation files to use when testing.
predict_folder (Optional[str]) – The folder containing images to use when predicting.
input_cls (Type[Input]) – The Input type to use for loading the data.
transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
ObjectDetectionData
- Returns
The constructed ObjectDetectionData.
Examples
The folder train_folder has the following contents:
train_folder
├── image_1.png
├── image_2.png
├── image_3.png
...
The folder train_annotations has the following contents:
train_annotations
├── image_1.xml
├── image_2.xml
├── image_3.xml
...
The file image_1.xml contains the following:
<annotation>
    <filename>image_1.png</filename>
    <path>image_1.png</path>
    <source><database>example</database></source>
    <size><width>64</width><height>64</height><depth>3</depth></size>
    <object>
        <name>cat</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <occluded>0</occluded>
        <bndbox><xmin>10</xmin><xmax>15</xmax><ymin>20</ymin><ymax>30</ymax></bndbox>
    </object>
</annotation>
>>> from flash import Trainer
>>> from flash.image import ObjectDetector, ObjectDetectionData
>>> datamodule = ObjectDetectionData.from_voc(
...     ["cat", "dog"],
...     train_folder="train_folder",
...     train_ann_folder="train_annotations",
...     predict_folder="predict_folder",
...     transform_kwargs=dict(image_size=(128, 128)),
...     batch_size=2,
... )
>>> datamodule.num_classes
3
>>> datamodule.labels
['background', 'cat', 'dog']
>>> model = ObjectDetector(num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
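VOC annotations store each box as corner coordinates inside a bndbox element. To inspect an annotation file, or to reuse its boxes with the loaders on this page that expect xmin, ymin, width, height dictionaries, the XML can be read with the standard library. This is a minimal sketch; parse_voc_objects is a hypothetical helper, not the parser Flash uses internally, and the XML string is a trimmed version of the example file.

```python
import xml.etree.ElementTree as ET

VOC_XML = """<annotation>
  <filename>image_1.png</filename>
  <size><width>64</width><height>64</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>10</xmin><xmax>15</xmax><ymin>20</ymin><ymax>30</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_objects(xml_text):
    """Return a list of (label, bbox dict) pairs from a VOC annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        xmin = int(box.findtext("xmin"))
        ymin = int(box.findtext("ymin"))
        xmax = int(box.findtext("xmax"))
        ymax = int(box.findtext("ymax"))
        bbox = {
            "xmin": xmin,
            "ymin": ymin,
            "width": xmax - xmin,  # corner pair converted to an extent
            "height": ymax - ymin,
        }
        objects.append((obj.findtext("name"), bbox))
    return objects


voc_objects = parse_voc_objects(VOC_XML)
```

For the example annotation this yields a single cat box whose width and height follow from the corner coordinates.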
- input_transform_cls
alias of flash.core.integrations.icevision.transforms.IceVisionInputTransform