Question Answering

The Task

Question Answering is the task of answering questions about a known context. For example, given a passage about a historical figure, any question about that passage should be answerable. In our case, the context and the question are the inputs, and the answer is the output of the model.

Note

We currently only support Extractive Question Answering, where the answer is a span of text taken from the context, as in SQuAD-like datasets.
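
As a rough illustration of what "extractive" means, the sketch below picks the best answer span from per-token start and end scores, limited to a maximum answer length. It is plain Python and not Flash's implementation: the function name best_span, the tokens, and the scores are all made up for the example.

```python
def best_span(start_scores, end_scores, max_answer_length=30):
    """Return the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most max_answer_length tokens
    are considered.
    """
    best = (0, 0)
    best_score = float("-inf")
    for i, start_score in enumerate(start_scores):
        last = min(i + max_answer_length, len(end_scores))
        for j in range(i, last):
            score = start_score + end_scores[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best


# Hypothetical tokens and scores, chosen by hand for illustration.
tokens = ["in", "the", "10th", "and", "11th", "centuries"]
start_scores = [0.1, 0.2, 2.5, 0.0, 0.3, 0.1]
end_scores = [0.0, 0.1, 0.4, 0.2, 0.5, 2.8]

start, end = best_span(start_scores, end_scores)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

The answer is always a slice of the input tokens, which is why extractive models cannot produce text that does not appear in the context.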


Example

Let’s look at an example. We’ll use the SQuAD 2.0 dataset, which contains train-v2.0.json and dev-v2.0.json. Each example in these files looks like this:

{
    "answers": {
        "answer_start": [94, 87, 94, 94],
        "text": ["10th and 11th centuries", "in the 10th and 11th centuries", "10th and 11th centuries", "10th and 11th centuries"]
    },
    "context": "\"The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th and 11th centuries gave thei...",
    "id": "56ddde6b9a695914005b9629",
    "question": "When were the Normans in Normandy?",
    "title": "Normans"
}
...

In the above, the context key holds the passage that the question and answers refer to, the question key holds the question being asked about the context, and the answers key stores the answer(s) to the question, with answer_start giving the character offset of each answer within the context. id and title are used for unique identification and for grouping related examples, respectively.

Once we’ve downloaded the data using download_data(), we create the QuestionAnsweringData. We select a pre-trained backbone to use for our QuestionAnsweringTask and finetune on the SQuAD 2.0 data. The backbone can be any Question Answering model from HuggingFace/transformers.
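
To make the link between the answers and context keys concrete, here is a small sketch in plain Python (no Flash required). It assumes the context starts directly with "The Normans" (no leading quote character) and uses only the first sentence of the passage; the helper name extract_answers is ours and is not part of Flash.

```python
# Illustrative record in the format shown above. The context is shortened to
# its first sentence and is assumed to start directly with "The Normans".
record = {
    "answers": {
        "answer_start": [94, 87, 94, 94],
        "text": [
            "10th and 11th centuries",
            "in the 10th and 11th centuries",
            "10th and 11th centuries",
            "10th and 11th centuries",
        ],
    },
    "context": (
        "The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) "
        "were the people who in the 10th and 11th centuries gave their name "
        "to Normandy, a region in France."
    ),
    "question": "When were the Normans in Normandy?",
}


def extract_answers(record):
    """Slice each answer out of the context using its character offset."""
    context = record["context"]
    answers = record["answers"]
    return [
        context[start:start + len(text)]
        for start, text in zip(answers["answer_start"], answers["text"])
    ]


extracted = extract_answers(record)
print(extracted)
```

If a slice does not match its text entry, the offsets and the context are out of sync, which is a useful sanity check before training on your own data.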

Note

When changing the backbone, make sure you pass in the same backbone to the QuestionAnsweringData and the QuestionAnsweringTask!
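
One way to keep the two in sync is to define the backbone name once and reuse it. The sketch below only builds the keyword arguments, so it runs without Flash. It assumes that QuestionAnsweringData.from_squad_v2 accepts a backbone keyword, as the note above implies; build_qa_kwargs is a hypothetical helper, and the Flash calls are shown as comments only.

```python
BACKBONE = "distilbert-base-uncased"  # any Question Answering backbone name


def build_qa_kwargs(backbone, train_file, val_file):
    """Build keyword arguments for the data module and the task.

    Both dictionaries receive the same backbone, so they cannot drift apart.
    """
    data_kwargs = {
        "train_file": train_file,
        "val_file": val_file,
        "backbone": backbone,  # assumed keyword on from_squad_v2
    }
    task_kwargs = {"backbone": backbone}
    return data_kwargs, task_kwargs


data_kwargs, task_kwargs = build_qa_kwargs(
    BACKBONE,
    "./data/squad_tiny/train.json",
    "./data/squad_tiny/val.json",
)

# With Flash installed, the arguments would be used roughly like this:
# datamodule = QuestionAnsweringData.from_squad_v2(**data_kwargs)
# model = QuestionAnsweringTask(**task_kwargs)
print(data_kwargs["backbone"], task_kwargs["backbone"])
```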

Next, we use the trained QuestionAnsweringTask for inference. Finally, we save the model. Here’s the full example:

from flash import Trainer
from flash.core.data.utils import download_data
from flash.text import QuestionAnsweringData, QuestionAnsweringTask

# 1. Create the DataModule
download_data("https://pl-flash-data.s3.amazonaws.com/squad_tiny.zip", "./data/")

datamodule = QuestionAnsweringData.from_squad_v2(
    train_file="./data/squad_tiny/train.json",
    val_file="./data/squad_tiny/val.json",
)

# 2. Build the task
model = QuestionAnsweringTask()

# 3. Create the trainer and finetune the model
trainer = Trainer(max_epochs=3, limit_train_batches=1, limit_val_batches=1)
trainer.finetune(model, datamodule=datamodule)

# 4. Answer some Questions!
predictions = model.predict(
    {
        "id": ["56ddde6b9a695914005b9629", "56ddde6b9a695914005b9628"],
        "context": [
            """
        The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th
        and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse
        ("Norman" comes from "Norseman") raiders and pirates from Denmark, Iceland and Norway who, under
        their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations
        of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their
        descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct
        cultural and ethnic identity of the Normans emerged initially in the first half of the 10th
        century, and it continued to evolve over the succeeding centuries.
        """,
            """
        The Normans (Norman: Nourmands; French: Normands; Latin: Normanni) were the people who in the 10th
        and 11th centuries gave their name to Normandy, a region in France. They were descended from Norse
        ("Norman" comes from "Norseman") raiders and pirates from Denmark, Iceland and Norway who, under
        their leader Rollo, agreed to swear fealty to King Charles III of West Francia. Through generations
        of assimilation and mixing with the native Frankish and Roman-Gaulish populations, their
        descendants would gradually merge with the Carolingian-based cultures of West Francia. The distinct
        cultural and ethnic identity of the Normans emerged initially in the first half of the 10th
        century, and it continued to evolve over the succeeding centuries.
        """,
        ],
        "question": ["When were the Normans in Normandy?", "In what country is Normandy located?"],
    }
)
print(predictions)

# 5. Save the model!
trainer.save_checkpoint("question_answering_on_squad_v2.pt")

Accelerate Training & Inference with Torch ORT

Torch ORT converts your model into an optimized ONNX graph, speeding up training and inference on NVIDIA or AMD GPUs. Once Torch ORT is installed, enabling it requires only a single flag passed to the QuestionAnsweringTask. See installation instructions here.

Note

Not all Transformer models are supported. See this table for supported models and for branches containing fixes for certain models.

...

model = QuestionAnsweringTask(backbone="distilbert-base-uncased", max_answer_length=30, enable_ort=True)