- The dataset can be found at this link
- For visualization of brain masks using voxelwise_tutorials, download the data present at this link
- Download the dataset and note down the path to the directory where all the `nsd_*` dataset directories are present. We will need this path in the scripts later on.
- Use the `environment.yml` file to create the conda environment with all the dependencies. You might have to change the CUDA version to the one on your local device.
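As a sketch, creating and activating the environment typically looks like this; the environment name is whatever the `name:` field in `environment.yml` defines, so the placeholder below must be replaced:

```sh
# Create the conda environment from the dependency file
conda env create -f environment.yml
# Activate it; substitute the name defined in environment.yml
conda activate <env-name-from-environment.yml>
```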
- Go to the `./brain_activity_prediction/extract_image_embeddings/` directory. All the current prompts are in the `prompts.py` file there.
- Open and edit the `run_experiments.sh` file:
  - Change the value of the `BASE_DIRECTORY` variable to the path of the dataset folder. This is also the directory where the hidden states will be stored.
  - Change the value of the `GPU_ID` variable to the CUDA ID number of the GPU, for example `1`.
  - You can change which prompts to run the experiments for in line 27 (`for prompt_number in {1..9}; do`). Currently it will run for all 9 prompts.
- Then just run the script with `./run_experiments.sh`.
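As an illustration, the edited portion of `run_experiments.sh` might look like the sketch below; the path and GPU id are placeholders, and only the variable names and the loop come from the steps above:

```sh
# Folder containing the nsd_* directories; hidden states are also written here
BASE_DIRECTORY="/path/to/nsd_data"
# CUDA id of the GPU to run on
GPU_ID=1
# Line 27: shrink this range (e.g. {1..3}) to run only some of the prompts
for prompt_number in {1..9}; do
    # ... existing per-prompt commands from the script ...
done
```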
- Go to the `./brain_activity_prediction/extract_image_embeddings/` directory. In this directory, there are scripts present for each model:
  - `idefics_9b_instruct.py`
  - `instruct_blip_13b.py`
  - `llava_1.5_13b.py`
  - `mplug_owl_llama_7b.py`
  - `openflamingo_9b.py`
  - `vit_huge_patch14_224_in21k.py`
- All of them take the following flags as input:

      -b BATCH_SIZE, --batch-size BATCH_SIZE
                            The number of images which will be passed into the model together
      -d BASE_DIR, --base-dir BASE_DIR
                            The path to the directory where all the inputs and the outputs will be cached and loaded from
      -s {1,2,5,7}, --subject {1,2,5,7}
                            The subject number from the NSD dataset whose images' embeddings are to be extracted
      -t, --test-run, --no-test-run
                            Enable test-run to output only the first batch's results and exit
      -p PROMPT_NUMBER, --prompt-number PROMPT_NUMBER
                            The number of the prompt from prompts.py that will be passed into the model along with the images
      -g GPU_ID, --gpu-id GPU_ID
                            The CUDA GPU id on which to run inference
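For example, a hypothetical test invocation (the path is a placeholder) could combine these flags as follows:

```sh
# Dry run: embed subject 1's images with prompt 3 on GPU 0,
# writing only the first batch thanks to --test-run
python llava_1.5_13b.py \
    --batch-size 16 \
    --base-dir /path/to/nsd_data \
    --subject 1 \
    --prompt-number 3 \
    --gpu-id 0 \
    --test-run
```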
- In all the scripts, `BASE_DIR` is the path to the directory which contains all the `nsd_*` directories from the dataset. All the outputs will also be extracted into this folder.
- All the scripts output the hidden layers in batches inside the `BASE_DIR/image_embeddings` directory. They are batchified so that if there is a failure partway through, the program can automatically continue from the last saved point.
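The checkpoint-and-resume idea can be sketched as follows; the file names, layout, and `embed_batch` helper here are illustrative, not the scripts' actual implementation:

```python
import json
from pathlib import Path

def embed_batch(batch_idx):
    """Stand-in for a model forward pass over one batch of images."""
    return [[float(batch_idx)] * 4 for _ in range(2)]

def run_with_resume(out_dir, num_batches):
    """Save each batch's embeddings to its own file and skip batches whose
    output already exists, so a crashed run resumes from the saved point."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    computed = 0
    for batch_idx in range(num_batches):
        out_file = out_dir / f"batch_{batch_idx}.json"
        if out_file.exists():  # already produced by a previous run
            continue
        out_file.write_text(json.dumps(embed_batch(batch_idx)))
        computed += 1
    return computed
```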
- Go to the `./brain_activity_prediction/align_image_embeddings/` directory.
- Here, you will find the `align_model.py` script, which takes the following flags as input:
      -s {1,2,5,7}, --subject {1,2,5,7}
                            The subject number from the NSD dataset whose image embeddings are to be trained
      -d BASE_DIR, --base-dir BASE_DIR
                            The path to the directory where all the models, inputs and outputs will be stored and loaded from
      -m MODEL_ID, --model-id MODEL_ID
                            The model id whose hidden state representations are to be used
      -l LAYER_NUM, --layer-number LAYER_NUM
                            The layer number to align. It can be 0, 1, etc., or -1 for the last layer. Not passing it will train all layers.
      -p PROMPT_NUMBER, --prompt-number PROMPT_NUMBER
                            The prompt number to use for aligning
      --max-log-10-alpha MAX_LOG_10_ALPHA
                            Maximum value of log10 alpha to consider; default is 4 (i.e. up to 10^4)
      --num-alphas NUM_ALPHAS
                            Number of alpha values to sample; default 60
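The two alpha flags imply a log-spaced grid of regularization strengths. A minimal sketch, assuming the grid runs from 10^-4 up to 10^max_log10_alpha (the lower bound is a guess; only the maximum exponent and the count come from the flags):

```python
def build_alpha_grid(max_log10_alpha=4, num_alphas=60, min_log10_alpha=-4):
    """Log-spaced candidate alphas; min_log10_alpha is an assumed lower bound."""
    step = (max_log10_alpha - min_log10_alpha) / (num_alphas - 1)
    return [10 ** (min_log10_alpha + i * step) for i in range(num_alphas)]

alphas = build_alpha_grid()  # 60 values from about 1e-4 up to 1e4
```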
- In this, `MODEL_ID` can be:
  - `google/vit-huge-patch14-224-in21k`
  - `Salesforce/instructblip-vicuna-13b`
  - `HuggingFaceM4/idefics-9b-instruct`
  - `llava-hf/llava-1.5-13b-hf`
  - `MAGAer13/mplug-owl-llama-7b`
  - `openflamingo/OpenFlamingo-9B-vitl-mpt7b`
- Around line 50 in this file, you can change the parameters to fully utilize the GPU.
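Putting it together, a hypothetical invocation of the alignment script (the path is a placeholder) might be:

```sh
# Align the last layer of LLaVA embeddings for subject 1 with prompt 3
python align_model.py \
    --subject 1 \
    --base-dir /path/to/nsd_data \
    --model-id llava-hf/llava-1.5-13b-hf \
    --layer-number -1 \
    --prompt-number 3
```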