DataModule

class flash.core.data.data_module.DataModule(train_dataset=None, val_dataset=None, test_dataset=None, predict_dataset=None, data_source=None, preprocess=None, postprocess=None, data_fetcher=None, val_split=None, batch_size=4, num_workers=0, sampler=None)[source]

A basic DataModule class for all Flash tasks. This class includes references to a DataSource, Preprocess, Postprocess, and a BaseDataFetcher.

available_data_sources()[source]

Get the list of available data source names for use with this DataModule.

Return type

Sequence[str]

Returns

The list of data source names.

static configure_data_fetcher(*args, **kwargs)[source]

Configures the BaseDataFetcher attached to this DataModule.

Override this method to provide a custom data fetcher.

Return type

BaseDataFetcher

classmethod from_csv(input_fields, target_fields=None, train_file=None, val_file=None, test_file=None, predict_file=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given CSV files using the DataSource of name CSV from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

Examples:

import torch
from flash.core.data.data_module import DataModule

data_module = DataModule.from_csv(
    "input",
    "target",
    train_file="train_data.csv",
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)
classmethod from_data_source(data_source, train_data=None, val_data=None, test_data=None, predict_data=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given inputs to load_data() (train_data, val_data, test_data, predict_data). The data source will be resolved from the instantiated Preprocess using data_source_of_name().

Parameters
Return type

DataModule

Returns

The constructed data module.

Examples:

import torch
from flash.core.data.data_module import DataModule
from flash.core.data.data_source import DefaultDataSources

data_module = DataModule.from_data_source(
    DefaultDataSources.FOLDERS,
    train_data="train_folder",
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)
classmethod from_datasets(train_dataset=None, val_dataset=None, test_dataset=None, predict_dataset=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given datasets using the DataSource of name DATASETS from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

Examples:

import torch
from flash.core.data.data_module import DataModule

# `train_dataset` is any torch.utils.data.Dataset instance.
data_module = DataModule.from_datasets(
    train_dataset=train_dataset,
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)
classmethod from_fiftyone(train_dataset=None, val_dataset=None, test_dataset=None, predict_dataset=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, **preprocess_kwargs)[source]

Creates a DataModule object from the given FiftyOne Datasets using the DataSource of name FIFTYONE from the passed or constructed Preprocess.

Parameters
  • train_dataset (Optional[SampleCollection]) – The fiftyone.core.collections.SampleCollection containing the train data.

  • val_dataset (Optional[SampleCollection]) – The fiftyone.core.collections.SampleCollection containing the validation data.

  • test_dataset (Optional[SampleCollection]) – The fiftyone.core.collections.SampleCollection containing the test data.

  • predict_dataset (Optional[SampleCollection]) – The fiftyone.core.collections.SampleCollection containing the predict data.

  • train_transform (Optional[Dict[str, Callable]]) – The dictionary of transforms to use during training which maps Preprocess hook names to callable transforms.

  • val_transform (Optional[Dict[str, Callable]]) – The dictionary of transforms to use during validation which maps Preprocess hook names to callable transforms.

  • test_transform (Optional[Dict[str, Callable]]) – The dictionary of transforms to use during testing which maps Preprocess hook names to callable transforms.

  • predict_transform (Optional[Dict[str, Callable]]) – The dictionary of transforms to use during predicting which maps Preprocess hook names to callable transforms.

  • data_fetcher (Optional[BaseDataFetcher]) – The BaseDataFetcher to pass to the DataModule.

  • preprocess (Optional[Preprocess]) – The Preprocess to pass to the DataModule. If None, cls.preprocess_cls will be constructed and used.

  • val_split (Optional[float]) – The val_split argument to pass to the DataModule.

  • batch_size (int) – The batch_size argument to pass to the DataModule.

  • num_workers (int) – The num_workers argument to pass to the DataModule.

  • preprocess_kwargs (Any) – Additional keyword arguments to use when constructing the preprocess. Will only be used if preprocess = None.

Return type

DataModule

Returns

The constructed data module.

Examples:

import fiftyone as fo
import torch
from flash.core.data.data_module import DataModule

train_dataset = fo.Dataset.from_dir(
    "/path/to/dataset",
    dataset_type=fo.types.ImageClassificationDirectoryTree,
)
data_module = DataModule.from_fiftyone(
    train_dataset=train_dataset,
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)
classmethod from_files(train_files=None, train_targets=None, val_files=None, val_targets=None, test_files=None, test_targets=None, predict_files=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given sequences of files using the DataSource of name FILES from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

classmethod from_folders(train_folder=None, val_folder=None, test_folder=None, predict_folder=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given folders using the DataSource of name FOLDERS from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

classmethod from_json(input_fields, target_fields=None, train_file=None, val_file=None, test_file=None, predict_file=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, field=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given JSON files using the DataSource of name JSON from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

Examples:

import torch
from flash.core.data.data_module import DataModule

data_module = DataModule.from_json(
    "input",
    "target",
    train_file="train_data.json",
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)

# In the case where the data is of the form:
# {
#     "version": 0.0.x,
#     "data": [
#         {
#             "input_field" : "input_data",
#             "target_field" : "target_output"
#         },
#         ...
#     ]
# }

data_module = DataModule.from_json(
    "input",
    "target",
    train_file="train_data.json",
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
    field="data",
)
classmethod from_numpy(train_data=None, train_targets=None, val_data=None, val_targets=None, test_data=None, test_targets=None, predict_data=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given numpy arrays using the DataSource of name NUMPY from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

Examples:

import numpy as np
import torch
from flash.core.data.data_module import DataModule

data_module = DataModule.from_numpy(
    train_data=np.random.rand(3, 128),
    train_targets=[1, 0, 1],
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)
classmethod from_tensors(train_data=None, train_targets=None, val_data=None, val_targets=None, test_data=None, test_targets=None, predict_data=None, train_transform=None, val_transform=None, test_transform=None, predict_transform=None, data_fetcher=None, preprocess=None, val_split=None, batch_size=4, num_workers=0, sampler=None, **preprocess_kwargs)[source]

Creates a DataModule object from the given tensors using the DataSource of name TENSOR from the passed or constructed Preprocess.

Parameters
Return type

DataModule

Returns

The constructed data module.

Examples:

import torch
from flash.core.data.data_module import DataModule

data_module = DataModule.from_tensors(
    train_data=torch.rand(3, 128),
    train_targets=[1, 0, 1],
    train_transform={
        "to_tensor_transform": torch.as_tensor,
    },
)
postprocess_cls

alias of flash.core.data.process.Postprocess

property predict_dataset: Optional[torch.utils.data.Dataset]

This property returns the predict dataset.

Return type

Optional[Dataset]

show_predict_batch(hooks_names='load_sample', reset=True)[source]

Visualizes a batch from the predict dataloader.

Return type

None

show_test_batch(hooks_names='load_sample', reset=True)[source]

Visualizes a batch from the test dataloader.

Return type

None

show_train_batch(hooks_names='load_sample', reset=True)[source]

Visualizes a batch from the train dataloader.

Return type

None

show_val_batch(hooks_names='load_sample', reset=True)[source]

Visualizes a batch from the validation dataloader.

Return type

None
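Taken together, the show_*_batch helpers display what the Preprocess hooks produce for one batch. A non-runnable sketch, assuming `data_module` was built with one of the from_* constructors above:

```
# Illustrative only: `data_module` stands for any constructed DataModule;
# hooks_names selects which Preprocess hook output to visualize.
data_module.show_train_batch()                 # defaults to "load_sample"
data_module.show_val_batch("per_batch_transform")
```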

property test_dataset: Optional[torch.utils.data.Dataset]

This property returns the test dataset.

Return type

Optional[Dataset]

property train_dataset: Optional[torch.utils.data.Dataset]

This property returns the train dataset.

Return type

Optional[Dataset]

property val_dataset: Optional[torch.utils.data.Dataset]

This property returns the validation dataset.

Return type

Optional[Dataset]
