QuestionAnsweringData

class flash.text.question_answering.data.QuestionAnsweringData(train_input=None, val_input=None, test_input=None, predict_input=None, data_fetcher=None, val_split=None, batch_size=None, num_workers=0, sampler=None, pin_memory=True, persistent_workers=True, output_transform=None)[source]

Data module for the QuestionAnswering task.

classmethod from_csv(train_file=None, val_file=None, test_file=None, predict_file=None, train_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, val_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, test_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, predict_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, input_cls=<class 'flash.text.question_answering.input.QuestionAnsweringCSVInput'>, transform_kwargs=None, max_source_length=384, max_target_length=30, padding='max_length', question_column_name='question', context_column_name='context', answer_column_name='answer', doc_stride=128, **data_module_kwargs)[source]

Creates a QuestionAnsweringData object from the given CSV files.

Parameters
  • train_file (Union[str, bytes, PathLike, None]) – The CSV file containing the training data.

  • val_file (Union[str, bytes, PathLike, None]) – The CSV file containing the validation data.

  • test_file (Union[str, bytes, PathLike, None]) – The CSV file containing the testing data.

  • predict_file (Union[str, bytes, PathLike, None]) – The CSV file containing the data to use when predicting.

  • train_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during training which maps InputTransform hook names to callable transforms.

  • val_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during validation which maps InputTransform hook names to callable transforms.

  • test_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during testing which maps InputTransform hook names to callable transforms.

  • predict_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during predicting which maps InputTransform hook names to callable transforms.

  • max_source_length (int) – Max length of the sequence to be considered during tokenization.

  • max_target_length (int) – Max length of each answer to be produced.

  • padding (Union[str, bool]) – Padding type during tokenization.

  • question_column_name (str) – The name of the column in the CSV file that contains the question.

  • context_column_name (str) – The name of the column in the CSV file that contains the context.

  • answer_column_name (str) – The name of the column in the CSV file that contains the answer.

  • doc_stride (int) – The stride amount to be taken when splitting up a long document into chunks.
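
The interaction between max_source_length and doc_stride can be sketched in plain Python. This is a minimal illustration, not Flash's implementation: it assumes the Hugging Face tokenizer convention, in which the stride is the number of tokens shared by consecutive chunks, and it ignores the question and special tokens that also count toward max_source_length. The helper name split_into_chunks is hypothetical.

```python
from typing import List


def split_into_chunks(tokens: List[str], max_length: int, doc_stride: int) -> List[List[str]]:
    """Split `tokens` into windows of at most `max_length` tokens.

    Assumption: consecutive windows overlap by `doc_stride` tokens, so each
    new window starts `max_length - doc_stride` tokens after the previous one.
    """
    if not 0 <= doc_stride < max_length:
        raise ValueError("doc_stride must satisfy 0 <= doc_stride < max_length")

    step = max_length - doc_stride
    chunks: List[List[str]] = []
    start = 0
    while True:
        chunks.append(tokens[start:start + max_length])
        # Stop once the current window reaches the end of the document.
        if start + max_length >= len(tokens):
            break
        start += step
    return chunks


# Ten toy tokens, windows of four tokens, two tokens of overlap.
tokens = [f"t{i}" for i in range(10)]
chunks = split_into_chunks(tokens, max_length=4, doc_stride=2)
for chunk in chunks:
    print(chunk)
```

A larger overlap makes it less likely that an answer is cut at a chunk boundary, at the cost of producing more chunks per document.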

Return type

QuestionAnsweringData

Returns

The constructed data module.
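
As a rough sketch of how the column-name parameters map onto a file, the snippet below writes a small CSV with the standard library and wraps the from_csv call in a helper. The column names (query, passage, reply), the row contents, and the helper names are hypothetical. This page does not specify how the answer cell should be encoded, so the plain-string answers here are an assumption to verify against your Flash version.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column names, chosen to differ from the defaults
# ("question", "context", "answer") so the mapping is visible.
COLUMNS = ["query", "passage", "reply"]

ROWS = [
    {"query": "Who wrote the report?", "passage": "The report was written by Ada.", "reply": "Ada"},
    {"query": "Where is the office?", "passage": "The office is in Oslo.", "reply": "Oslo"},
]


def write_csv(path: Path) -> None:
    """Write the example rows to `path`, preceded by a header row."""
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(ROWS)


def build_datamodule(train_csv: str):
    """Build the data module. Flash is imported lazily so the rest runs without it."""
    from flash.text import QuestionAnsweringData

    return QuestionAnsweringData.from_csv(
        train_file=train_csv,
        question_column_name="query",
        context_column_name="passage",
        answer_column_name="reply",
        batch_size=2,  # assumed to reach the constructor via **data_module_kwargs
    )


tmp_dir = tempfile.mkdtemp()
train_csv = Path(tmp_dir) / "train.csv"
write_csv(train_csv)

# Read the file back to confirm the header matches the names passed above.
with train_csv.open(newline="", encoding="utf-8") as handle:
    reader = csv.reader(handle)
    header = next(reader)
    row_count = sum(1 for _ in reader)
```

With Flash installed, build_datamodule(str(train_csv)) would be the call that constructs the data module.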

classmethod from_dicts(train_data=None, val_data=None, test_data=None, predict_data=None, train_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, val_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, test_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, predict_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, input_cls=<class 'flash.text.question_answering.input.QuestionAnsweringDictionaryInput'>, transform_kwargs=None, max_source_length=384, max_target_length=30, padding='max_length', question_column_name='question', context_column_name='context', answer_column_name='answer', doc_stride=128, **data_module_kwargs)[source]

Creates a QuestionAnsweringData object from the given data dictionaries.

Parameters
  • train_data (Optional[Dict[str, Any]]) – The dictionary containing the training data.

  • val_data (Optional[Dict[str, Any]]) – The dictionary containing the validation data.

  • test_data (Optional[Dict[str, Any]]) – The dictionary containing the testing data.

  • predict_data (Optional[Dict[str, Any]]) – The dictionary containing the data to use when predicting.

  • train_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during training which maps InputTransform hook names to callable transforms.

  • val_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during validation which maps InputTransform hook names to callable transforms.

  • test_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during testing which maps InputTransform hook names to callable transforms.

  • predict_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during predicting which maps InputTransform hook names to callable transforms.

  • max_source_length (int) – Max length of the sequence to be considered during tokenization.

  • max_target_length (int) – Max length of each answer to be produced.

  • padding (Union[str, bool]) – Padding type during tokenization.

  • question_column_name (str) – The key in the data dictionary that holds the question field.

  • context_column_name (str) – The key in the data dictionary that holds the context field.

  • answer_column_name (str) – The key in the data dictionary that holds the answer field.

  • doc_stride (int) – The stride amount to be taken when splitting up a long document into chunks.

Return type

QuestionAnsweringData

Returns

The constructed data module.
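
The shape of a data dictionary is not spelled out above, so the following is only a sketch under two assumptions: that the dictionary is column-oriented (one list per field, with index i describing example i), and that each answer follows the SQuAD convention of a text list plus a character offset list. The helper check_columns is hypothetical and simply validates that layout before the dictionary is handed to from_dicts.

```python
from typing import Any, Dict, List, Sequence

# Assumed layout: one list per field, index i describes example i.
train_data: Dict[str, List[Any]] = {
    "id": ["q1", "q2"],
    "question": ["Who wrote the report?", "Where is the office?"],
    "context": ["The report was written by Ada.", "The office is in Oslo."],
    # Assumed SQuAD-style answer encoding: answer text plus character offset.
    "answer": [
        {"text": ["Ada"], "answer_start": [26]},
        {"text": ["Oslo"], "answer_start": [17]},
    ],
}


def check_columns(data: Dict[str, List[Any]], required: Sequence[str]) -> int:
    """Return the number of examples, or raise if the layout is inconsistent."""
    missing = [key for key in required if key not in data]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    lengths = {len(values) for values in data.values()}
    if len(lengths) != 1:
        raise ValueError("all columns must have the same number of entries")
    return lengths.pop()


num_examples = check_columns(train_data, ["question", "context", "answer"])


def build_datamodule(data: Dict[str, List[Any]]):
    """Build the data module. Flash is imported lazily so the rest runs without it."""
    from flash.text import QuestionAnsweringData

    return QuestionAnsweringData.from_dicts(train_data=data, batch_size=2)
```

Because the default column names are used, no question_column_name, context_column_name, or answer_column_name arguments are needed in this case.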

classmethod from_json(train_file=None, val_file=None, test_file=None, predict_file=None, train_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, val_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, test_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, predict_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, input_cls=<class 'flash.text.question_answering.input.QuestionAnsweringJSONInput'>, transform_kwargs=None, field=None, max_source_length=384, max_target_length=30, padding='max_length', question_column_name='question', context_column_name='context', answer_column_name='answer', doc_stride=128, **data_module_kwargs)[source]

Creates a QuestionAnsweringData object from the given JSON files.

Parameters
  • train_file (Union[str, bytes, PathLike, None]) – The JSON file containing the training data.

  • val_file (Union[str, bytes, PathLike, None]) – The JSON file containing the validation data.

  • test_file (Union[str, bytes, PathLike, None]) – The JSON file containing the testing data.

  • predict_file (Union[str, bytes, PathLike, None]) – The JSON file containing the data to use when predicting.

  • train_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during training which maps InputTransform hook names to callable transforms.

  • val_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during validation which maps InputTransform hook names to callable transforms.

  • test_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during testing which maps InputTransform hook names to callable transforms.

  • predict_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during predicting which maps InputTransform hook names to callable transforms.

  • field (Optional[str]) – The field that holds the data in the JSON file.

  • max_source_length (int) – Max length of the sequence to be considered during tokenization.

  • max_target_length (int) – Max length of each answer to be produced.

  • padding (Union[str, bool]) – Padding type during tokenization.

  • question_column_name (str) – The key in the JSON file that identifies the question field.

  • context_column_name (str) – The key in the JSON file that identifies the context field.

  • answer_column_name (str) – The key in the JSON file that identifies the answer field.

  • doc_stride (int) – The stride amount to be taken when splitting up a long document into chunks.

Return type

QuestionAnsweringData

Returns

The constructed data module.
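
The field parameter is easiest to see with a file whose records are nested under a top-level key. The sketch below assumes that field names such a key, in the same way as the field argument of the Hugging Face datasets JSON loader; the file contents, the key name "data", and the helper names are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, List, Optional

# The records sit under the top-level "data" key rather than at the root.
payload = {
    "version": "demo",
    "data": [
        {"question": "Who wrote the report?", "context": "The report was written by Ada."},
        {"question": "Where is the office?", "context": "The office is in Oslo."},
    ],
}

tmp_dir = tempfile.mkdtemp()
predict_json = Path(tmp_dir) / "predict.json"
predict_json.write_text(json.dumps(payload), encoding="utf-8")


def read_records(path: Path, field: Optional[str] = None) -> List[Any]:
    """Load the records, descending into `field` when one is given."""
    loaded = json.loads(path.read_text(encoding="utf-8"))
    return loaded if field is None else loaded[field]


records = read_records(predict_json, field="data")


def build_datamodule(path: str):
    """Build the data module. Flash is imported lazily so the rest runs without it."""
    from flash.text import QuestionAnsweringData

    return QuestionAnsweringData.from_json(predict_file=path, field="data", batch_size=2)
```

When the records form the root of the JSON document, field can be left at its default of None.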

classmethod from_squad_v2(train_file=None, val_file=None, test_file=None, predict_file=None, train_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, val_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, test_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, predict_transform=<class 'flash.text.question_answering.input_transform.QuestionAnsweringInputTransform'>, input_cls=<class 'flash.text.question_answering.input.QuestionAnsweringSQuADInput'>, transform_kwargs=None, max_source_length=384, max_target_length=30, padding='max_length', question_column_name='question', context_column_name='context', answer_column_name='answer', doc_stride=128, **data_module_kwargs)[source]

Creates a QuestionAnsweringData object from the given data JSON files in the SQuAD2.0 format.

Parameters
  • train_file (Optional[str]) – The JSON file containing the training data.

  • val_file (Optional[str]) – The JSON file containing the validation data.

  • test_file (Optional[str]) – The JSON file containing the testing data.

  • predict_file (Optional[str]) – The JSON file containing the data to use when predicting.

  • train_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during training which maps InputTransform hook names to callable transforms.

  • val_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during validation which maps InputTransform hook names to callable transforms.

  • test_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during testing which maps InputTransform hook names to callable transforms.

  • predict_transform (~INPUT_TRANSFORM_TYPE) – The dictionary of transforms to use during predicting which maps InputTransform hook names to callable transforms.

  • max_source_length (int) – Max length of the sequence to be considered during tokenization.

  • max_target_length (int) – Max length of each answer to be produced.

  • padding (Union[str, bool]) – Padding type during tokenization.

  • question_column_name (str) – The key in the JSON file that identifies the question field.

  • context_column_name (str) – The key in the JSON file that identifies the context field.

  • answer_column_name (str) – The key in the JSON file that identifies the answer field.

  • doc_stride (int) – The stride amount to be taken when splitting up a long document into chunks.

Return type

QuestionAnsweringData

Returns

The constructed data module.
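
For reference, a SQuAD2.0 file nests questions inside paragraphs inside articles, and marks unanswerable questions with is_impossible. The sketch below builds a tiny document in that format and flattens it into one record per question. The flatten helper and its output keys are hypothetical and only illustrate the nesting; they are not a description of how Flash parses the file internally.

```python
from typing import Any, Dict, Iterator

squad = {
    "version": "v2.0",
    "data": [
        {
            "title": "Example",
            "paragraphs": [
                {
                    "context": "The office is in Oslo.",
                    "qas": [
                        {
                            "id": "q1",
                            "question": "Where is the office?",
                            "is_impossible": False,
                            "answers": [{"text": "Oslo", "answer_start": 17}],
                        },
                        {
                            # Unanswerable question: no answers are listed.
                            "id": "q2",
                            "question": "Who founded the office?",
                            "is_impossible": True,
                            "answers": [],
                        },
                    ],
                }
            ],
        }
    ],
}


def flatten(dataset: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per question in a SQuAD-style document."""
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                yield {
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": paragraph["context"],
                    "answer": {
                        "text": [a["text"] for a in qa["answers"]],
                        "answer_start": [a["answer_start"] for a in qa["answers"]],
                    },
                }


examples = list(flatten(squad))
for example in examples:
    print(example["id"], example["answer"]["text"])
```

A file in this layout is what train_file, val_file, test_file, and predict_file are expected to point to when calling from_squad_v2.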

input_transform_cls

alias of flash.text.question_answering.input_transform.QuestionAnsweringInputTransform
