adymaharana/cococon


Exposing and Addressing Cross-Task Inconsistency in Unified Vision-Language Models

CoCoCON Dataset

We provide the CoCoCON evaluation dataset, consisting of 1500 samples, at ./data/cococon.json. Each sample contains 1-5 contrast sets. See the paper for details; a few examples are shown in the figure below.
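As a quick sanity check after cloning, the dataset file can be loaded and summarized with a few lines of Python. This is a minimal sketch: the helper names `load_cococon` and `describe` are hypothetical (they are not part of this repository), and the sketch deliberately assumes nothing about the JSON schema beyond the file being valid JSON.

```python
import json
from pathlib import Path


def load_cococon(path="./data/cococon.json"):
    """Load the CoCoCON JSON file and return the parsed object.

    Hypothetical helper, not part of the repository.
    """
    with Path(path).open("r", encoding="utf-8") as f:
        return json.load(f)


def describe(data):
    """Return a short summary without assuming a particular schema."""
    if isinstance(data, list):
        return f"list with {len(data)} entries"
    if isinstance(data, dict):
        return f"dict with {len(data)} top-level keys"
    return type(data).__name__
```

Calling `describe(load_cococon())` from the repository root should report the top-level container type and its size, which is a reasonable first step before inspecting individual samples and their contrast sets.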

Inference using Unified-IO Models

We evaluate the pretrained checkpoints provided here on CoCoCON.

  1. Change to the unified-io directory and follow the instructions in the original repository to create the JAX environment.
    cd unified-io
  2. Download the pretrained Unified-IO checkpoints and save them in the ./checkpoints/ directory.
  3. To run the likelihood-based evaluation of cross-task consistency on CoCoCON, execute the following command. The size can be small, base, large, or xl. Output files are saved to ./results/ by default. The path to the validation split (val2014) of the MS-COCO images is required as an additional argument.
    bash evaluate_cococon.sh <size> <path-to-image-directory>
  4. To generate predictions for the samples in CoCoCON, execute the following command:
    bash evaluate_tasks.sh <size> <path-to-image-directory>
  5. Follow the instructions here to evaluate task-specific accuracies using the predictions generated in Step 4.
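To evaluate every model size in one go, the commands from Steps 3 and 4 can be wrapped in a small driver. The sketch below is only an illustration: `build_commands` and `run_all` are hypothetical helpers, and it assumes the scripts take exactly the two positional arguments shown above (`<size>` and `<path-to-image-directory>`) and are run from the unified-io directory.

```python
import subprocess

# Model sizes listed in Step 3.
SIZES = ("small", "base", "large", "xl")


def build_commands(image_dir, script="evaluate_cococon.sh", sizes=SIZES):
    """Return one argv list per model size for the given script."""
    return [["bash", script, size, image_dir] for size in sizes]


def run_all(image_dir, script="evaluate_cococon.sh", cwd="."):
    """Run the script once per size, stopping at the first failure.

    Assumes `cwd` is the unified-io directory containing the script.
    """
    for cmd in build_commands(image_dir, script):
        subprocess.run(cmd, cwd=cwd, check=True)
```

For example, `run_all("/path/to/val2014")` would run the consistency evaluation for all four sizes, and passing `script="evaluate_tasks.sh"` would generate predictions instead. Using `check=True` makes a failing run raise immediately rather than continuing silently with the next size.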

Training and Inference using OFA Models

We first finetune pretrained checkpoints of OFA models on the four tasks in CoCoCON and then evaluate them on CoCoCON. Instructions for training OFA models coming soon!

Evaluation of COCO Tasks

Change to the evaluators directory, i.e., cd evaluators/.

Image Captioning
  1. Install the packages required for COCO Caption Evaluation.
    pip install -r requirements.txt
  2. Run the following command on the output files from Unified-IO or OFA.
    python coco_eval.py <path-to-output-file> ../data/cococon.json

Acknowledgements

We thank the researchers behind Unified-IO and OFA for making their models available for training and inference.
