This is an implementation of the baselines reported in the paper A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers by Dasigi et al., published at NAACL 2021.
- Download data from here.
- Install requirements as follows:
  ```
  pip install -r requirements.txt
  ```
The configuration file to use is `training_config/led_base_with_evidence_scaffold.jsonnet`. Remember to set the data paths before training. Then run:
```
allennlp train training_config/led_base_with_evidence_scaffold.jsonnet -s <PATH TO SERIALIZATION DIRECTORY> --include-package qasper_baselines
```
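For reference, the data-path fields you need to set look roughly like the following sketch. The field names follow standard AllenNLP config conventions; check them against the actual config file, and the paths shown are placeholders:

```jsonnet
// Sketch of the fields to edit before training (placeholder paths).
{
  train_data_path: "/path/to/qasper/train.json",
  validation_data_path: "/path/to/qasper/dev.json",
  // ... rest of the config unchanged ...
}
```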
At the end of training, you will see results on the development set. `best_validation_answer_f1` and `best_validation_evidence_f1` should give you the Answer F1 and Evidence F1 reported in the paper.
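If you prefer to read these metrics programmatically rather than from the console output, AllenNLP also writes them to a `metrics.json` file in the serialization directory. A minimal sketch (the helper name and returned keys are ours):

```python
import json
from pathlib import Path


def load_dev_metrics(serialization_dir: str) -> dict:
    """Read the metrics file AllenNLP writes at the end of training.

    The key names match the ones printed to the console
    (`best_validation_answer_f1`, `best_validation_evidence_f1`).
    """
    metrics = json.loads(Path(serialization_dir, "metrics.json").read_text())
    return {
        "answer_f1": metrics.get("best_validation_answer_f1"),
        "evidence_f1": metrics.get("best_validation_evidence_f1"),
    }
```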
If you do not have a GPU, you will need to set `cuda_device` to `-1` in the configuration.
To train the model without the evidence scaffold, just set `use_evidence_scaffold` in the `model` section of the configuration to `false`.
The paper also reports results of training and evaluating models given contexts shorter than the full text of the paper. Use the configuration file `training_config/led_base_smaller_context.jsonnet` for these experiments, and set the `context` field in the `dataset_reader` and `validation_dataset_reader` sections of the configuration to appropriate values.
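The relevant fields are sketched below; `"<CONTEXT VALUE>"` is a placeholder for whichever context setting your experiment calls for (see the config file for the supported values):

```jsonnet
// Sketch of only the fields to change; merge into the full config.
{
  dataset_reader: { context: "<CONTEXT VALUE>" },
  validation_dataset_reader: { context: "<CONTEXT VALUE>" },
  // ... rest of the config unchanged ...
}
```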
The script `scripts/evidence_retrieval_heuristic_baselines.py` contains the heuristic evidence retrieval baselines. Just run:
```
python scripts/evidence_retrieval_heuristic_baselines.py <PATH TO DEV DATA>
```
You will need to install `scikit-learn` for this script.
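To illustrate the kind of heuristic involved, here is a simplified sketch of a TF-IDF paragraph ranker built on scikit-learn. This is an illustration of the general technique, not the code in the script, and the function name is ours:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def rank_paragraphs_by_tfidf(question: str, paragraphs: list) -> list:
    """Rank paper paragraphs by TF-IDF cosine similarity to the question.

    Returns paragraph indices, most similar first.
    """
    vectorizer = TfidfVectorizer()
    # Fit on the paragraphs and the question together so they share a vocabulary.
    matrix = vectorizer.fit_transform(paragraphs + [question])
    question_vec = matrix[-1]
    scores = cosine_similarity(matrix[:-1], question_vec).ravel()
    return sorted(range(len(paragraphs)), key=lambda i: -scores[i])
```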
Feel free to open pull requests if you find anything that needs fixing.
You can run the LED-large experiments by changing the value of the `transformer_model` variable to `allenai/led-large-16384`. Note that, as stated in the paper, the `answer_f1` value will be very low (less than 20 F1 points).