jlscheerer/xtr-eval

Baseline Evaluation of Google DeepMind's XTR


We build on the code provided by Google DeepMind to evaluate XTR. This evaluation serves as the baseline for the highly optimized XTR/WARP retrieval engine.

Installation

xtr-eval requires Python 3.8+, PyTorch 1.9+, and TensorFlow 2.8.2, and uses the Hugging Face Transformers library. We evaluate XTR using the XTR_base checkpoint provided on Hugging Face.

It is strongly recommended to create a conda environment using the commands below. We include the corresponding environment file (environment.yml).

conda activate xtr-eval
source ./scripts/build_indexes.sh
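
Once the environment is active, a quick sanity check can confirm that the installed packages match the versions listed above. The following is a minimal sketch and not part of xtr-eval: the helper names are hypothetical, and it assumes the packages are installed under the distribution names torch and tensorflow (variants such as tensorflow-cpu would be reported as missing). It relies on importlib.metadata, which itself requires Python 3.8, the minimum stated above.

```python
import re
from importlib import metadata

# Versions taken from the requirements stated above. TensorFlow is listed
# there as exactly 2.8.2, so it is compared for equality rather than ">=".
MIN_TORCH = (1, 9)
EXPECTED_TENSORFLOW = (2, 8, 2)


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def installed_version(distribution):
    """Return the parsed version of a distribution, or None if it is absent."""
    try:
        return parse_version(metadata.version(distribution))
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect human-readable descriptions of any version mismatches."""
    problems = []

    torch_version = installed_version("torch")
    if torch_version is None:
        problems.append("torch is not installed")
    elif torch_version[:2] < MIN_TORCH:
        problems.append(f"torch {torch_version} is older than {MIN_TORCH}")

    tf_version = installed_version("tensorflow")
    if tf_version is None:
        problems.append("tensorflow is not installed")
    elif tf_version[:3] != EXPECTED_TENSORFLOW:
        problems.append(
            f"tensorflow {tf_version} differs from {EXPECTED_TENSORFLOW}"
        )

    return problems


# Print one line per mismatch; no output means the check passed.
for problem in check_environment():
    print(problem)
```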

Environment Setup

To construct indexes and perform retrieval, define the following values in a config.yml file in the repository root:

BEIR_COLLECTION_PATH: "..."
LOTTE_COLLECTION_PATH: "..."
  • BEIR_COLLECTION_PATH: Designates the path to the datasets of the BEIR Benchmark.
  • LOTTE_COLLECTION_PATH: Specifies the path to the LoTTE dataset.
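
To catch a missing or empty entry before building indexes, the file can be validated up front. The sketch below is an illustration only, not how xtr-eval itself reads config.yml: parse_flat_config and load_config are hypothetical helpers, and the parser handles only flat `KEY: "value"` lines like the two shown above rather than general YAML.

```python
from pathlib import Path

# The two keys named in the section above.
REQUIRED_KEYS = ("BEIR_COLLECTION_PATH", "LOTTE_COLLECTION_PATH")


def parse_flat_config(text):
    """Parse flat `KEY: "value"` lines; blank lines and # comments are skipped."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"expected 'KEY: value', got: {raw_line!r}")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_config(path="config.yml"):
    """Read the config file and fail early if a required key is missing or empty."""
    values = parse_flat_config(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if not values.get(key)]
    if missing:
        raise ValueError(f"missing config values: {', '.join(missing)}")
    return values


# Example with placeholder paths (not real locations).
example = parse_flat_config(
    'BEIR_COLLECTION_PATH: "/data/beir"\n'
    'LOTTE_COLLECTION_PATH: "/data/lotte"\n'
)
print(example["BEIR_COLLECTION_PATH"])
```

Calling load_config() from the repository root would then return both paths as a dictionary, or raise a ValueError naming whichever key is absent.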

BEIR Benchmark

To download and extract a dataset from the BEIR Benchmark, use the extract_collection.py script provided in XTR/WARP:

python utility/extract_collection.py -d ${dataset} -i "${BEIR_COLLECTION_PATH}" -s test

Replace ${dataset} with the desired dataset name as specified here.
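
When several datasets are needed, the same command can be issued in a loop. The sketch below is a hedged illustration: build_extract_command and extract_all are hypothetical helpers, the dataset names and collection path are examples only, and it assumes the script is invoked from the XTR/WARP repository root with exactly the flags shown above. By default it only prints the commands; pass dry_run=False to execute them.

```python
import shlex
import subprocess


def build_extract_command(dataset, collection_path, split="test"):
    """Build the argument list for the extract_collection.py call shown above."""
    return [
        "python", "utility/extract_collection.py",
        "-d", dataset,
        "-i", collection_path,
        "-s", split,
    ]


def extract_all(datasets, collection_path, dry_run=True):
    """Print (dry run) or execute one extraction command per dataset."""
    commands = [build_extract_command(name, collection_path) for name in datasets]
    for command in commands:
        if dry_run:
            print(shlex.join(command))
        else:
            # check=True stops at the first dataset whose extraction fails.
            subprocess.run(command, check=True)
    return commands


# Illustrative dataset names and placeholder path; substitute your own.
extract_all(["nfcorpus", "scifact"], "/data/beir")
```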

LoTTE Dataset

  1. Download the LoTTE dataset files from here.
  2. Extract the files manually to the directory specified in LOTTE_COLLECTION_PATH.
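
The manual extraction step can also be scripted. This is a minimal sketch under stated assumptions: extract_lotte is a hypothetical helper, the download is assumed to be a single archive in a format that shutil.unpack_archive recognizes (for example .tar.gz or .zip), and the directory layout expected after extraction is not verified here. Only unpack archives from a source you trust.

```python
import shutil
from pathlib import Path


def extract_lotte(archive_path, lotte_collection_path):
    """Unpack the downloaded archive into LOTTE_COLLECTION_PATH.

    Returns the sorted names of the top-level entries found in the target
    directory afterwards, so the result can be inspected.
    """
    target = Path(lotte_collection_path)
    target.mkdir(parents=True, exist_ok=True)
    # The archive format is inferred from the file extension.
    shutil.unpack_archive(str(archive_path), extract_dir=str(target))
    return sorted(entry.name for entry in target.iterdir())
```

A call such as extract_lotte(downloaded_archive, config["LOTTE_COLLECTION_PATH"]) would then populate the configured directory, where downloaded_archive and config stand in for your own values.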
