QuestionAnsweringTask
- class flash.text.question_answering.model.QuestionAnsweringTask(backbone='sshleifer/tiny-distilbert-base-cased-distilled-squad', max_source_length=384, max_target_length=30, padding='max_length', doc_stride=128, loss_fn=None, optimizer='Adam', lr_scheduler=None, metrics=None, learning_rate=None, enable_ort=False, n_best_size=20, version_2_with_negative=True, null_score_diff_threshold=0.0, use_stemmer=True)
The QuestionAnsweringTask is a Task for extractive question answering. For more details, see question_answering.

You can change the backbone to any question answering model from HuggingFace/transformers using the backbone argument.

Note
When changing the backbone, make sure you pass in the same backbone to the Task and the DataModule object! Since this is a QuestionAnswering task, make sure you use a QuestionAnswering model.

- Parameters
  - max_source_length (int) – Max length of the sequence to be considered during tokenization.
  - max_target_length (int) – Max length of each answer to be produced.
  - padding (Union[str, bool]) – Padding type during tokenization.
  - doc_stride (int) – The stride amount to be taken when splitting up a long document into chunks.
  - loss_fn (Union[Callable, Mapping, Sequence, None]) – Loss function for training.
  - optimizer (TypeVar(OPTIMIZER_TYPE, str, Callable, Tuple[str, Dict[str, Any]], None)) – Optimizer to use for training.
  - lr_scheduler (Optional[TypeVar(LR_SCHEDULER_TYPE, str, Callable, Tuple[str, Dict[str, Any]], Tuple[str, Dict[str, Any], Dict[str, Any]], None)]) – The LR scheduler to use during training.
  - metrics (Optional[TypeVar(METRICS_TYPE, Metric, Mapping, Sequence, None)]) – Metrics to compute for training and evaluation. Defaults to calculating the ROUGE metric. Changing this argument currently has no effect.
  - learning_rate (Optional[float]) – Learning rate to use for training, defaults to 3e-4.
  - enable_ort (bool) – Enable Torch ONNX Runtime Optimization: https://onnxruntime.ai/docs/#onnx-runtime-for-training
  - n_best_size (int) – The total number of n-best predictions to generate when looking for an answer.
  - version_2_with_negative (bool) – If true, some of the examples do not have an answer.
  - max_answer_length – The maximum length of an answer that can be generated. This is needed because the start and end predictions are not conditioned on one another.
  - null_score_diff_threshold (float) – The threshold used to select the null answer: if the best answer has a score that is less than the score of the null answer minus this threshold, the null answer is selected for this example. Only useful when version_2_with_negative=True.
  - use_stemmer (bool) – Whether Porter stemmer should be used to strip word suffixes to improve matching.
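The interaction between n_best_size, max_answer_length and null_score_diff_threshold can be illustrated with a minimal sketch of extractive answer selection. This is not Flash's implementation: the function `select_answer`, its inputs (per-token start and end scores plus a single null score) and the way candidates are scored (start score plus end score) are assumptions made for illustration only. The null-answer rule follows the parameter description above.

```python
from typing import List, Optional, Tuple


def select_answer(
    start_scores: List[float],
    end_scores: List[float],
    null_score: float,
    n_best_size: int = 20,
    max_answer_length: int = 30,
    null_score_diff_threshold: float = 0.0,
) -> Optional[Tuple[int, int]]:
    """Hypothetical helper: return the chosen (start, end) token span, or None for "no answer"."""
    # Keep only the n_best_size highest-scoring start and end positions.
    top_starts = sorted(
        range(len(start_scores)), key=lambda i: start_scores[i], reverse=True
    )[:n_best_size]
    top_ends = sorted(
        range(len(end_scores)), key=lambda i: end_scores[i], reverse=True
    )[:n_best_size]

    best_span = None
    best_score = float("-inf")
    for start in top_starts:
        for end in top_ends:
            # Start and end are predicted independently, so reversed or
            # overly long spans have to be filtered out explicitly.
            if end < start or end - start + 1 > max_answer_length:
                continue
            score = start_scores[start] + end_scores[end]  # assumed scoring rule
            if score > best_score:
                best_span, best_score = (start, end), score

    # Select the null answer when the best span scores less than the
    # null score minus the threshold (or when no valid span exists).
    if best_span is None or best_score < null_score - null_score_diff_threshold:
        return None
    return best_span
```

Under these assumptions, raising null_score_diff_threshold makes the null answer harder to select, while lowering max_answer_length discards longer spans even if their combined score is higher.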
- classmethod available_finetuning_strategies(cls)
Returns a list containing the keys of the available Finetuning Strategies.
- classmethod available_lr_schedulers(cls)
Returns a list containing the keys of the available LR schedulers.
- classmethod available_optimizers(cls)
Returns a list containing the keys of the available Optimizers.
- classmethod available_outputs(cls)
Returns the list of available outputs (that can be used during prediction or serving) for this Task.

Examples
>>> from flash import Task
>>> print(Task.available_outputs())
['preds', 'raw']
- classmethod load_from_checkpoint(cls, checkpoint_path, map_location=None, hparams_file=None, strict=True, **kwargs)
Primary way of loading a model from a checkpoint. When Lightning saves a checkpoint it stores the arguments passed to __init__ in the checkpoint under "hyper_parameters".

Any arguments specified through **kwargs will override args stored in "hyper_parameters".

- Parameters
  - checkpoint_path (Union[str, Path, IO]) – Path to checkpoint. This can also be a URL, or file-like object.
  - map_location (Union[device, str, int, Callable[[Union[device, str, int]], Union[device, str, int]], Dict[Union[device, str, int], Union[device, str, int]], None]) – If your checkpoint saved a GPU model and you now load on CPUs or a different number of GPUs, use this to map to the new setup. The behaviour is the same as in torch.load().
  - hparams_file (Union[str, Path, None]) – Optional path to a .yaml or .csv file with hierarchical structure as in this example:

        drop_prob: 0.2
        dataloader:
            batch_size: 32

    You most likely won’t need this since Lightning will always save the hyperparameters to the checkpoint. However, if your checkpoint weights don’t have the hyperparameters saved, use this method to pass in a .yaml file with the hparams you’d like to use. These will be converted into a dict and passed into your LightningModule for use.

    If your model’s hparams argument is Namespace and .yaml file has hierarchical structure, you need to refactor your model to treat hparams as dict.
  - strict (bool) – Whether to strictly enforce that the keys in checkpoint_path match the keys returned by this module’s state dict.
  - **kwargs – Any extra keyword args needed to init the model. Can also be used to override saved hyperparameter values.
- Return type
  Self
- Returns
  LightningModule instance with loaded weights and hyperparameters (if available).

Note
load_from_checkpoint is a class method. You should use your LightningModule class to call it instead of the LightningModule instance.

Example:
    # load weights without mapping ...
    model = MyLightningModule.load_from_checkpoint('path/to/checkpoint.ckpt')

    # or load weights mapping all weights from GPU 1 to GPU 0 ...
    map_location = {'cuda:1':'cuda:0'}
    model = MyLightningModule.load_from_checkpoint(
        'path/to/checkpoint.ckpt',
        map_location=map_location
    )

    # or load weights and hyperparameters from separate files.
    model = MyLightningModule.load_from_checkpoint(
        'path/to/checkpoint.ckpt',
        hparams_file='/path/to/hparams_file.yaml'
    )

    # override some of the params with new values
    model = MyLightningModule.load_from_checkpoint(
        PATH,
        num_layers=128,
        pretrained_ckpt_path=NEW_PATH,
    )

    # predict
    pretrained_model.eval()
    pretrained_model.freeze()
    y_hat = pretrained_model(x)
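
The override rule for **kwargs (keyword arguments take precedence over the values stored under "hyper_parameters") can be pictured as a plain dictionary merge. The sketch below is an illustration only: `resolve_init_args` is a hypothetical helper, not a Lightning function, and the checkpoint is modelled as an ordinary dict rather than a loaded checkpoint file.

```python
from typing import Any, Dict


def resolve_init_args(checkpoint: Dict[str, Any], **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge saved hyperparameters with caller overrides."""
    saved = checkpoint.get("hyper_parameters", {})
    # In a dict merge, later entries win, so kwargs take precedence
    # over the values stored in the checkpoint.
    return {**saved, **kwargs}


# A stand-in for a loaded checkpoint (assumed structure for illustration).
checkpoint = {"hyper_parameters": {"num_layers": 12, "learning_rate": 3e-4}}

# num_layers is overridden, learning_rate keeps its saved value.
init_args = resolve_init_args(checkpoint, num_layers=128)
```

The merge builds a new dict, so the saved hyperparameters in the checkpoint are left untouched.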