VideoClassificationData
- class flash.video.classification.data.VideoClassificationData(train_input=None, val_input=None, test_input=None, predict_input=None, data_fetcher=None, transform=<class 'flash.core.data.io.input_transform.InputTransform'>, transform_kwargs=None, val_split=None, batch_size=None, num_workers=0, sampler=None, pin_memory=True, persistent_workers=False)
The VideoClassificationData class is a DataModule with a set of classmethods for loading data for video classification.
- classmethod from_csv(input_field, target_fields=None, train_file=None, train_videos_root=None, train_resolver=None, val_file=None, val_videos_root=None, val_resolver=None, test_file=None, test_videos_root=None, test_resolver=None, predict_file=None, predict_videos_root=None, predict_resolver=None, target_formatter=None, clip_sampler='random', clip_duration=2, clip_sampler_kwargs=None, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, decode_audio=False, decoder='pyav', input_cls=<class 'flash.video.classification.input.VideoClassificationCSVInput'>, predict_input_cls=<class 'flash.video.classification.input.VideoClassificationCSVPredictInput'>, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Load the VideoClassificationData from CSV files containing video file paths and their corresponding targets.
Input video file paths will be extracted from the input_field column in the CSV files. The supported file extensions are: .mp4 and .avi. The targets will be extracted from the target_fields in the CSV files and can be in any of our supported classification target formats. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
  - input_field (str) – The field (column name) in the CSV files containing the video file paths.
  - target_fields (Union[str, List[str], None]) – The field (column name) or list of fields in the CSV files containing the targets.
  - train_file (Union[str, bytes, PathLike, None]) – The CSV file to use when training.
  - train_videos_root (Union[str, bytes, PathLike, None]) – The root directory containing train videos.
  - train_resolver (Optional[Callable[[Union[str, bytes, PathLike], Any], Union[str, bytes, PathLike]]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - val_file (Union[str, bytes, PathLike, None]) – The CSV file to use when validating.
  - val_videos_root (Union[str, bytes, PathLike, None]) – The root directory containing validation videos.
  - val_resolver (Optional[Callable[[Union[str, bytes, PathLike], Any], Union[str, bytes, PathLike]]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - test_file (Optional[str]) – The CSV file to use when testing.
  - test_videos_root (Optional[str]) – The root directory containing test videos.
  - test_resolver (Optional[Callable[[Union[str, bytes, PathLike], Any], Union[str, bytes, PathLike]]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - predict_file (Optional[str]) – The CSV file to use when predicting.
  - predict_videos_root (Optional[str]) – The root directory containing predict videos.
  - predict_resolver (Optional[Callable[[Union[str, bytes, PathLike], Any], Union[str, bytes, PathLike]]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
  - clip_sampler (Optional[str]) – The clip sampler to use. One of: "uniform", "random", "constant_clips_per_video".
  - clip_sampler_kwargs (Optional[Dict[str, Any]]) – Additional keyword arguments to use when constructing the clip sampler.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.
  - decode_audio (bool) – If True, also decode audio from video.
  - decoder (str) – The decoder to use to decode videos. One of: "pyav", "torchvision". Not used for frame videos.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - predict_input_cls (Type[Input]) – The Input type to use for loading the prediction data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
  VideoClassificationData
- Returns
  The constructed VideoClassificationData.
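The resolver parameters take a callable that maps an entry from the input_field column to a video file path. The following is a minimal sketch of such a callable, assuming it is invoked with the videos root first and the column entry second (as the Callable type in the parameter list suggests). The name append_mp4_resolver and the convention of storing IDs without a file extension are hypothetical.

```python
import os


def append_mp4_resolver(root, file_id):
    """Hypothetical resolver: turn a bare ID such as "video_1" into "<root>/video_1.mp4".

    Assumes the resolver is called as resolver(root, entry); check this against
    the installed Flash version before relying on it.
    """
    file_name = f"{file_id}.mp4"
    # root may be None when no *_videos_root is given, so fall back to the bare file name.
    return file_name if root is None else os.path.join(root, file_name)


print(append_mp4_resolver("train_folder", "video_1"))
```

A function like this could then be passed as train_resolver=append_mp4_resolver when the CSV column holds IDs rather than file names.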
Examples
The files can be in Comma Separated Values (CSV) format with either a .csv or .txt extension.

The file train_data.csv contains the following:

    videos,targets
    video_1.mp4,cat
    video_2.mp4,dog
    video_3.mp4,cat

The file predict_data.csv contains the following:

    videos
    predict_video_1.mp4
    predict_video_2.mp4
    predict_video_3.mp4

>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> datamodule = VideoClassificationData.from_csv(
...     "videos",
...     "targets",
...     train_file="train_data.csv",
...     train_videos_root="train_folder",
...     predict_file="predict_data.csv",
...     predict_videos_root="predict_folder",
...     transform_kwargs=dict(image_size=(244, 244)),
...     batch_size=2,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['cat', 'dog']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
Alternatively, the files can be in Tab Separated Values (TSV) format with a .tsv extension.

The file train_data.tsv contains the following (columns separated by tabs):

    videos	targets
    video_1.mp4	cat
    video_2.mp4	dog
    video_3.mp4	cat

The file predict_data.tsv contains the following:

    videos
    predict_video_1.mp4
    predict_video_2.mp4
    predict_video_3.mp4

>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> datamodule = VideoClassificationData.from_csv(
...     "videos",
...     "targets",
...     train_file="train_data.tsv",
...     train_videos_root="train_folder",
...     predict_file="predict_data.tsv",
...     predict_videos_root="predict_folder",
...     transform_kwargs=dict(image_size=(244, 244)),
...     batch_size=2,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['cat', 'dog']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
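The from_csv examples assume that the CSV and TSV files already exist on disk. As a rough sketch, files with the same contents could be produced with Python's standard csv module. The helper name write_table and the use of a temporary directory are choices made for this illustration only.

```python
import csv
import os
import tempfile

rows = [("video_1.mp4", "cat"), ("video_2.mp4", "dog"), ("video_3.mp4", "cat")]


def write_table(path, header, rows, delimiter=","):
    """Write a header row followed by data rows using the given delimiter."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=delimiter)
        writer.writerow(header)
        writer.writerows(rows)


# A temporary directory keeps the sketch from touching the working directory.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "train_data.csv")
tsv_path = os.path.join(out_dir, "train_data.tsv")
write_table(csv_path, ["videos", "targets"], rows)
write_table(tsv_path, ["videos", "targets"], rows, delimiter="\t")

with open(csv_path) as f:
    print(f.read())
```

The only difference between the two formats here is the delimiter passed to csv.writer.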
- classmethod from_data_frame(input_field, target_fields=None, train_data_frame=None, train_videos_root=None, train_resolver=None, val_data_frame=None, val_videos_root=None, val_resolver=None, test_data_frame=None, test_videos_root=None, test_resolver=None, predict_data_frame=None, predict_videos_root=None, predict_resolver=None, clip_sampler='random', clip_duration=2, clip_sampler_kwargs=None, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, decode_audio=False, decoder='pyav', input_cls=<class 'flash.video.classification.input.VideoClassificationDataFrameInput'>, predict_input_cls=<class 'flash.video.classification.input.VideoClassificationDataFramePredictInput'>, target_formatter=None, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Load the VideoClassificationData from pandas DataFrame objects containing video file paths and their corresponding targets.
Input video file paths will be extracted from the input_field in the DataFrame. The supported file extensions are: .mp4 and .avi. The targets will be extracted from the target_fields in the DataFrame and can be in any of our supported classification target formats. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
  - input_field (str) – The field (column name) in the DataFrames containing the video file paths.
  - target_fields (Union[str, Sequence[str], None]) – The field (column name) or list of fields in the DataFrames containing the targets.
  - train_data_frame (Optional[DataFrame]) – The DataFrame to use when training.
  - train_videos_root (Optional[str]) – The root directory containing train videos.
  - train_resolver (Optional[Callable[[str, str], str]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - val_data_frame (Optional[DataFrame]) – The DataFrame to use when validating.
  - val_videos_root (Optional[str]) – The root directory containing validation videos.
  - val_resolver (Optional[Callable[[str, str], str]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - test_data_frame (Optional[DataFrame]) – The DataFrame to use when testing.
  - test_videos_root (Optional[str]) – The root directory containing test videos.
  - test_resolver (Optional[Callable[[str, str], str]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - predict_data_frame (Optional[DataFrame]) – The DataFrame to use when predicting.
  - predict_videos_root (Optional[str]) – The root directory containing predict videos.
  - predict_resolver (Optional[Callable[[str, str], str]]) – Optionally provide a function which converts an entry from the input_field into a video file path.
  - target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
  - clip_sampler (Optional[str]) – The clip sampler to use. One of: "uniform", "random", "constant_clips_per_video".
  - clip_sampler_kwargs (Optional[Dict[str, Any]]) – Additional keyword arguments to use when constructing the clip sampler.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.
  - decode_audio (bool) – If True, also decode audio from video.
  - decoder (str) – The decoder to use to decode videos. One of: "pyav", "torchvision". Not used for frame videos.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - predict_input_cls (Type[Input]) – The Input type to use for loading the prediction data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
  VideoClassificationData
- Returns
  The constructed VideoClassificationData.
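The clip_sampler and clip_duration arguments control which segments of each video are decoded. The sketch below illustrates, in plain Python, the general idea behind the "uniform" and "random" strategies. It is not the library's implementation: the function names are hypothetical, and it assumes "uniform" means back-to-back non-overlapping windows and "random" means a single window with a uniformly random start.

```python
import random


def uniform_clips(video_duration, clip_duration):
    """Back-to-back, non-overlapping (start, end) windows covering the video."""
    clips = []
    start = 0.0
    while start + clip_duration <= video_duration:
        clips.append((start, start + clip_duration))
        start += clip_duration
    return clips


def random_clip(video_duration, clip_duration, rng):
    """One (start, end) window whose start is drawn uniformly at random."""
    max_start = max(video_duration - clip_duration, 0.0)
    start = rng.uniform(0.0, max_start)
    return (start, start + clip_duration)


print(uniform_clips(7.0, 2.0))
print(random_clip(7.0, 2.0, random.Random(0)))
```

Under these assumptions, a 7-second video with clip_duration=2 yields three uniform clips, while random sampling yields one clip per draw.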
Examples
>>> from pandas import DataFrame
>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> train_data_frame = DataFrame.from_dict(
...     {
...         "videos": ["video_1.mp4", "video_2.mp4", "video_3.mp4"],
...         "targets": ["cat", "dog", "cat"],
...     }
... )
>>> predict_data_frame = DataFrame.from_dict(
...     {
...         "videos": ["predict_video_1.mp4", "predict_video_2.mp4", "predict_video_3.mp4"],
...     }
... )
>>> datamodule = VideoClassificationData.from_data_frame(
...     "videos",
...     "targets",
...     train_data_frame=train_data_frame,
...     train_videos_root="train_folder",
...     predict_data_frame=predict_data_frame,
...     predict_videos_root="predict_folder",
...     transform_kwargs=dict(image_size=(244, 244)),
...     batch_size=2,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['cat', 'dog']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
- classmethod from_fiftyone(cls, train_dataset=None, val_dataset=None, test_dataset=None, predict_dataset=None, target_formatter=None, clip_sampler='random', clip_duration=2, clip_sampler_kwargs=None, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, decode_audio=False, decoder='pyav', label_field='ground_truth', input_cls=<class 'flash.video.classification.input.VideoClassificationFiftyOneInput'>, predict_input_cls=<class 'flash.video.classification.input.VideoClassificationFiftyOnePredictInput'>, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Load the VideoClassificationData from FiftyOne SampleCollection objects.
The supported file extensions are: .mp4 and .avi. The targets will be extracted from the label_field in the SampleCollection objects and can be in any of our supported classification target formats. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
  - train_dataset (None) – The SampleCollection to use when training.
  - val_dataset (None) – The SampleCollection to use when validating.
  - test_dataset (None) – The SampleCollection to use when testing.
  - predict_dataset (None) – The SampleCollection to use when predicting.
  - label_field (str) – The field in the SampleCollection objects containing the targets.
  - target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
  - clip_sampler (Optional[str]) – The clip sampler to use. One of: "uniform", "random", "constant_clips_per_video".
  - clip_sampler_kwargs (Optional[Dict[str, Any]]) – Additional keyword arguments to use when constructing the clip sampler.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.
  - decode_audio (bool) – If True, also decode audio from video.
  - decoder (str) – The decoder to use to decode videos. One of: "pyav", "torchvision". Not used for frame videos.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - predict_input_cls (Type[Input]) – The Input type to use for loading the prediction data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs – Additional keyword arguments to provide to the DataModule constructor.
- Return type
  VideoClassificationData
- Returns
  The constructed VideoClassificationData.
Examples
>>> import fiftyone as fo
>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> train_dataset = fo.Dataset.from_videos(
...     ["video_1.mp4", "video_2.mp4", "video_3.mp4"]
... )
...
>>> samples = [train_dataset[filepath] for filepath in train_dataset.values("filepath")]
>>> for sample, label in zip(samples, ["cat", "dog", "cat"]):
...     sample["ground_truth"] = fo.Classification(label=label)
...     sample.save()
...
>>> predict_dataset = fo.Dataset.from_videos(
...     ["predict_video_1.mp4", "predict_video_2.mp4", "predict_video_3.mp4"]
... )
...
>>> datamodule = VideoClassificationData.from_fiftyone(
...     train_dataset=train_dataset,
...     predict_dataset=predict_dataset,
...     transform_kwargs=dict(image_size=(244, 244)),
...     batch_size=2,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['cat', 'dog']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
- classmethod from_files(train_files=None, train_targets=None, val_files=None, val_targets=None, test_files=None, test_targets=None, predict_files=None, target_formatter=None, clip_sampler='random', clip_duration=2, clip_sampler_kwargs=None, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, decode_audio=False, decoder='pyav', input_cls=<class 'flash.video.classification.input.VideoClassificationFilesInput'>, predict_input_cls=<class 'flash.video.classification.input.VideoClassificationPathsPredictInput'>, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Load the VideoClassificationData from lists of files and corresponding lists of targets.
The supported file extensions are: .mp4 and .avi. The targets can be in any of our supported classification target formats. To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
  - train_files (Optional[Sequence[str]]) – The list of video files to use when training.
  - train_targets (Optional[Sequence[Any]]) – The list of targets to use when training.
  - val_files (Optional[Sequence[str]]) – The list of video files to use when validating.
  - val_targets (Optional[Sequence[Any]]) – The list of targets to use when validating.
  - test_files (Optional[Sequence[str]]) – The list of video files to use when testing.
  - test_targets (Optional[Sequence[Any]]) – The list of targets to use when testing.
  - predict_files (Optional[Sequence[str]]) – The list of video files to use when predicting.
  - target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
  - clip_sampler (Optional[str]) – The clip sampler to use. One of: "uniform", "random", "constant_clips_per_video".
  - clip_sampler_kwargs (Optional[Dict[str, Any]]) – Additional keyword arguments to use when constructing the clip sampler.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.
  - decode_audio (bool) – If True, also decode audio from video.
  - decoder (str) – The decoder to use to decode videos. One of: "pyav", "torchvision". Not used for frame videos.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - predict_input_cls (Type[Input]) – The Input type to use for loading the prediction data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs – Additional keyword arguments to provide to the DataModule constructor.
- Return type
  VideoClassificationData
- Returns
  The constructed VideoClassificationData.
Examples
>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> datamodule = VideoClassificationData.from_files(
...     train_files=["video_1.mp4", "video_2.mp4", "video_3.mp4"],
...     train_targets=["cat", "dog", "cat"],
...     predict_files=["predict_video_1.mp4", "predict_video_2.mp4", "predict_video_3.mp4"],
...     transform_kwargs=dict(image_size=(244, 244)),
...     batch_size=2,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['cat', 'dog']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
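Because from_files accepts separate file and target lists for training and validation, a single labelled list has to be split before it is passed in. The following is one possible sketch using only the standard library. The helper name split_files, the default val_fraction, and the fixed seed are assumptions made for illustration.

```python
import random


def split_files(files, targets, val_fraction=0.2, seed=0):
    """Shuffle (file, target) pairs and split them into train and validation lists."""
    pairs = list(zip(files, targets))
    if len(pairs) < 2:
        raise ValueError("Need at least two samples to create a train/validation split.")
    random.Random(seed).shuffle(pairs)
    # Keep at least one sample on each side of the split.
    n_val = min(max(1, int(len(pairs) * val_fraction)), len(pairs) - 1)
    val_pairs, train_pairs = pairs[:n_val], pairs[n_val:]
    train_files, train_targets = map(list, zip(*train_pairs))
    val_files, val_targets = map(list, zip(*val_pairs))
    return train_files, train_targets, val_files, val_targets


files = ["video_1.mp4", "video_2.mp4", "video_3.mp4", "video_4.mp4"]
targets = ["cat", "dog", "cat", "dog"]
train_files, train_targets, val_files, val_targets = split_files(files, targets, val_fraction=0.25)
print(len(train_files), len(val_files))
```

The four returned lists line up with the train_files, train_targets, val_files, and val_targets arguments documented above.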
- classmethod from_folders(train_folder=None, val_folder=None, test_folder=None, predict_folder=None, target_formatter=None, clip_sampler='random', clip_duration=2, clip_sampler_kwargs=None, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, decode_audio=False, decoder='pyav', input_cls=<class 'flash.video.classification.input.VideoClassificationFoldersInput'>, predict_input_cls=<class 'flash.video.classification.input.VideoClassificationPathsPredictInput'>, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Load the VideoClassificationData from folders containing videos.
The supported file extensions are: .mp4 and .avi. For train, test, and validation data, the folders are expected to contain a sub-folder for each class. Here's the required structure:

    train_folder
    ├── cat
    │   ├── video_1.mp4
    │   ├── video_3.mp4
    │   ...
    └── dog
        ├── video_2.mp4
        ...

For prediction, the folder is expected to contain the files for inference, like this:

    predict_folder
    ├── predict_video_1.mp4
    ├── predict_video_2.mp4
    ├── predict_video_3.mp4
    ...

To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
  - train_folder (Optional[str]) – The folder containing videos to use when training.
  - val_folder (Optional[str]) – The folder containing videos to use when validating.
  - test_folder (Optional[str]) – The folder containing videos to use when testing.
  - predict_folder (Optional[str]) – The folder containing videos to use when predicting.
  - target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
  - clip_sampler (Optional[str]) – The clip sampler to use. One of: "uniform", "random", "constant_clips_per_video".
  - clip_sampler_kwargs (Optional[Dict[str, Any]]) – Additional keyword arguments to use when constructing the clip sampler.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.
  - decode_audio (bool) – If True, also decode audio from video.
  - decoder (str) – The decoder to use to decode videos. One of: "pyav", "torchvision". Not used for frame videos.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - predict_input_cls (Type[Input]) – The Input type to use for loading the prediction data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs – Additional keyword arguments to provide to the DataModule constructor.
- Return type
  VideoClassificationData
- Returns
  The constructed VideoClassificationData.
Examples
>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> datamodule = VideoClassificationData.from_folders(
...     train_folder="train_folder",
...     predict_folder="predict_folder",
...     transform_kwargs=dict(image_size=(244, 244)),
...     batch_size=2,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['cat', 'dog']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
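Since from_folders relies on the one-sub-folder-per-class layout described above, it can help to check a folder before loading it. The sketch below counts the supported video files per class sub-folder using only the standard library. The helper name count_videos_per_class is hypothetical, and the layout is recreated with empty placeholder files purely for illustration.

```python
import os
import tempfile

VIDEO_EXTENSIONS = (".mp4", ".avi")


def count_videos_per_class(folder):
    """Map each class sub-folder name to the number of .mp4/.avi files it contains."""
    counts = {}
    for entry in sorted(os.listdir(folder)):
        class_dir = os.path.join(folder, entry)
        if not os.path.isdir(class_dir):
            continue  # Files at the top level do not belong to any class.
        counts[entry] = sum(
            1 for name in os.listdir(class_dir) if name.lower().endswith(VIDEO_EXTENSIONS)
        )
    return counts


# Build a throwaway copy of the expected layout, using empty placeholder files.
root = tempfile.mkdtemp()
for label, names in {"cat": ["video_1.mp4", "video_3.mp4"], "dog": ["video_2.mp4"]}.items():
    os.makedirs(os.path.join(root, label))
    for name in names:
        open(os.path.join(root, label, name), "w").close()

print(count_videos_per_class(root))
```

An empty result, or a class with zero videos, would point to a layout problem worth fixing before constructing the datamodule.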
- classmethod from_labelstudio(export_json=None, train_export_json=None, val_export_json=None, test_export_json=None, predict_export_json=None, data_folder=None, train_data_folder=None, val_data_folder=None, test_data_folder=None, predict_data_folder=None, val_split=None, multi_label=False, clip_sampler='random', clip_duration=2, clip_sampler_kwargs=None, video_sampler=<class 'torch.utils.data.sampler.RandomSampler'>, decode_audio=False, decoder='pyav', input_cls=<class 'flash.core.integrations.labelstudio.input.LabelStudioVideoClassificationInput'>, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Creates a DataModule object from the given export file and data directory using the Input of name FOLDERS from the passed or constructed InputTransform.
- Parameters
  - export_json (Optional[str]) – Path to the Label Studio export file.
  - train_export_json (Optional[str]) – Path to the Label Studio export file for the train set (overrides export_json if specified).
  - val_export_json (Optional[str]) – Path to the Label Studio export file for validation.
  - test_export_json (Optional[str]) – Path to the Label Studio export file for test.
  - predict_export_json (Optional[str]) – Path to the Label Studio export file for predict.
  - data_folder (Optional[str]) – Path to the Label Studio data folder.
  - train_data_folder (Optional[str]) – Path to the Label Studio data folder for the train data set (overrides data_folder if specified).
  - val_data_folder (Optional[str]) – Path to the Label Studio data folder for validation data.
  - test_data_folder (Optional[str]) – Path to the Label Studio data folder for test data.
  - predict_data_folder (Optional[str]) – Path to the Label Studio data folder for predict data.
  - val_split (Optional[float]) – The val_split argument to pass to the DataModule.
  - multi_label (Optional[bool]) – Whether the labels are multi-encoded.
  - clip_sampler (Optional[str]) – Defines how clips should be sampled from each video.
  - clip_duration (float) – Defines how long the sampled clips should be for each video.
  - clip_sampler_kwargs (Optional[Dict[str, Any]]) – Additional keyword arguments to use when constructing the clip sampler.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order videos are decoded and, if necessary, the distributed split.
  - decode_audio (bool) – If True, also decode audio from video.
  - decoder (str) – Defines the type of decoder used to decode a video.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs – Additional keyword arguments to use when constructing the datamodule.
- Return type
  VideoClassificationData
- Returns
  The constructed data module.
Examples:

    data_module = VideoClassificationData.from_labelstudio(
        export_json='project.json',
        data_folder='label-studio/media/upload',
        val_split=0.8,
    )
- classmethod from_tensors(train_data=None, train_targets=None, val_data=None, val_targets=None, test_data=None, test_targets=None, predict_data=None, target_formatter=None, video_sampler=<class 'torch.utils.data.sampler.SequentialSampler'>, input_cls=<class 'flash.video.classification.input.VideoClassificationTensorsInput'>, predict_input_cls=<class 'flash.video.classification.input.VideoClassificationTensorsPredictInput'>, transform=<function VideoClassificationInputTransform>, transform_kwargs=None, **data_module_kwargs)
Load the VideoClassificationData from PyTorch tensors representing input video frames and their corresponding targets.
The input tensors are passed directly through the train_data, val_data, test_data, and predict_data arguments, and the targets through the matching train_targets, val_targets, and test_targets arguments. The targets can be in any of our supported classification target formats.
To learn how to customize the transforms applied for each stage, read our customizing transforms guide.
- Parameters
  - train_data (Union[Collection[Tensor], Tensor, None]) – The torch tensor or list of tensors to use when training.
  - train_targets (Optional[Collection[Any]]) – The list of targets to use when training.
  - val_data (Union[Collection[Tensor], Tensor, None]) – The torch tensor or list of tensors to use when validating.
  - val_targets (Optional[Sequence[Any]]) – The list of targets to use when validating.
  - test_data (Optional[Collection[Tensor]]) – The torch tensor or list of tensors to use when testing.
  - test_targets (Optional[Sequence[Any]]) – The list of targets to use when testing.
  - predict_data (Union[Collection[Tensor], Tensor, None]) – The torch tensor or list of tensors to use when predicting.
  - target_formatter (Optional[TargetFormatter]) – Optionally provide a TargetFormatter to control how targets are handled. See Formatting Classification Targets for more details.
  - video_sampler (Type[Sampler]) – Sampler for the internal video container. This defines the order tensors are used and, if necessary, the distributed split.
  - input_cls (Type[Input]) – The Input type to use for loading the data.
  - predict_input_cls (Type[Input]) – The Input type to use for loading the prediction data.
  - transform (TypeVar(INPUT_TRANSFORM_TYPE, Type[flash.core.data.io.input_transform.InputTransform], Callable, Tuple[Union[StrEnum, str], Dict[str, Any]], Union[StrEnum, str], None)) – The InputTransform type to use.
  - transform_kwargs (Optional[Dict]) – Dict of keyword arguments to be provided when instantiating the transforms.
  - data_module_kwargs (Any) – Additional keyword arguments to provide to the DataModule constructor.
- Return type
  VideoClassificationData
- Returns
  The constructed VideoClassificationData.
Examples
>>> import torch
>>> from flash import Trainer
>>> from flash.video import VideoClassifier, VideoClassificationData
>>> frame = torch.randint(low=0, high=255, size=(3, 5, 10, 10), dtype=torch.uint8, device="cpu")
>>> datamodule = VideoClassificationData.from_tensors(
...     train_data=[frame, frame, frame],
...     train_targets=["fruit", "vegetable", "fruit"],
...     val_data=[frame, frame],
...     val_targets=["vegetable", "fruit"],
...     predict_data=[frame],
...     batch_size=1,
... )
>>> datamodule.num_classes
2
>>> datamodule.labels
['fruit', 'vegetable']
>>> model = VideoClassifier(backbone="x3d_xs", num_classes=datamodule.num_classes)
>>> trainer = Trainer(fast_dev_run=True)
>>> trainer.fit(model, datamodule=datamodule)
Training...
>>> trainer.predict(model, datamodule=datamodule)
Predicting...
- input_transform_cls(image_size=244, temporal_sub_sample=8, mean=tensor([0.4500, 0.4500, 0.4500]), std=tensor([0.2250, 0.2250, 0.2250]), data_format='BCTHW', same_on_frame=False) = <function VideoClassificationInputTransform>
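The temporal_sub_sample=8 default suggests that each clip is reduced to 8 frames along the time axis. The sketch below shows one way evenly spaced frame indices could be chosen. It is an assumption about the general technique, not a description of what VideoClassificationInputTransform does internally, and the function name evenly_spaced_indices is hypothetical.

```python
def evenly_spaced_indices(num_frames, num_samples):
    """Pick num_samples frame indices spread evenly from the first to the last frame."""
    if num_samples == 1:
        return [0]
    # Integer arithmetic avoids floating-point rounding at the last index.
    return [(i * (num_frames - 1)) // (num_samples - 1) for i in range(num_samples)]


print(evenly_spaced_indices(16, 8))
```

Keyword arguments such as image_size and temporal_sub_sample are the kind of values supplied through transform_kwargs in the examples above.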