Large Language Models for Next Point-of-Interest Recommendation

This repository includes the implementation of the paper "Large Language Models for Next Point-of-Interest Recommendation".

Please select the version you wish to use (we strongly recommend you try the v2 implementation):

🌟 v2: ms-swift-based training

Note: This is the latest version of the framework.

Install

Clone this repository to your local machine.
Create the conda environment with the commands below, or follow the installation instructions from ms-swift:

cd v2
conda env create -f environment.yml

Note: Installing FlashAttention can be tricky.

Dataset

Download the raw data from datasets.

Unzip datasets.zip to ./datasets
Unzip datasets/nyc/raw.zip to datasets/nyc.
Unzip datasets/tky/raw.zip to datasets/tky.
Unzip datasets/ca/raw.zip to datasets/ca.
Run python preprocessing/generate_ca_raw.py --dataset_name {dataset_name}
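
The per-dataset unzip steps above can also be scripted. The following is a minimal sketch, assuming the layout named in those steps (a raw.zip inside datasets/nyc, datasets/tky, and datasets/ca). The helper name extract_raw_archives is hypothetical and not part of this repository.

```python
import zipfile
from pathlib import Path


def extract_raw_archives(root, names=("nyc", "tky", "ca")):
    """Extract <root>/<name>/raw.zip into <root>/<name> for each dataset name.

    Returns the names whose archive was found and extracted.
    """
    extracted = []
    for name in names:
        archive = Path(root) / name / "raw.zip"
        if not archive.is_file():
            # Skip datasets that have not been downloaded.
            continue
        with zipfile.ZipFile(archive) as zf:
            # Unpack next to the archive, e.g. datasets/nyc/raw.zip -> datasets/nyc
            zf.extractall(archive.parent)
        extracted.append(name)
    return extracted
```

Calling extract_raw_archives("datasets") from the repository root would cover the three unzip steps; generate_ca_raw.py still has to be run afterwards.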

Preprocess

We observe that user history alone, without trajectory similarity, is enough to achieve good performance.

cd ../preprocessing
python run.py -f best_conf/{dataset_name}.yml
cd ../v2
python convert_prompt_llm4poi.py \
    --dataset {dataset_name} \
    --train_csv {your train csv path} \
    --test_csv {your test csv path} \
    --out_dir {your output path} \
    --history_limit 50
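
To illustrate what a history limit means, here is a minimal sketch that keeps only each user's most recent check-ins before turning them into a prompt. This is not the logic of convert_prompt_llm4poi.py. The record fields (user_id, timestamp, poi_id), the function names, and the prompt wording are all assumptions made for illustration.

```python
from collections import defaultdict


def recent_history(checkins, history_limit=50):
    """Group check-ins by user and keep the latest `history_limit` per user.

    Assumes each check-in is a dict with hypothetical keys
    'user_id', 'timestamp' (a sortable value), and 'poi_id'.
    """
    by_user = defaultdict(list)
    for row in sorted(checkins, key=lambda r: r["timestamp"]):
        by_user[row["user_id"]].append(row)
    # Slicing from the end keeps the most recent visits in time order.
    return {user: rows[-history_limit:] for user, rows in by_user.items()}


def build_prompt(user_id, rows):
    """Render one user's truncated history as a plain-text question."""
    visits = "; ".join(f"POI {r['poi_id']} at {r['timestamp']}" for r in rows)
    return f"User {user_id} visited: {visits}. Which POI will the user visit next?"
```

A smaller limit shortens the prompt at the cost of dropping older visits.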

Main Performance

Train

cd v2
bash sft.sh

Test

bash server_vllm.sh
bash eval.sh
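
The contents of eval.sh are not reproduced here. As a rough illustration of how next-POI predictions can be scored, the sketch below computes top-1 accuracy by exact match. It assumes predictions and ground truth are available as parallel lists of POI ids; the function name accuracy_at_1 is hypothetical, and the repository's own evaluation may differ.

```python
def accuracy_at_1(predictions, targets):
    """Fraction of samples whose predicted POI id equals the ground truth.

    Ids are compared as stripped strings so that 42 and " 42" match.
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        str(pred).strip() == str(target).strip()
        for pred, target in zip(predictions, targets)
    )
    return hits / len(targets)
```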

📜 v1: Legacy

Note: Original implementation for the SIGIR 2024 paper.

Install

Clone this repository to your local machine.
Install the environment by running:

conda env create -f environment.yml

Alternatively, on Linux you can download a prepackaged conda environment directly from this Google Drive link, then unpack and activate it:

mkdir -p llm4poi
tar -xzf "venv.tar.gz" -C "llm4poi"
conda activate llm4poi

Download the model from https://huggingface.co/Yukang/Llama-2-7b-longlora-32k-ft

Dataset

Download the raw data from datasets.

Unzip datasets.zip to ./datasets
Unzip datasets/nyc/raw.zip to datasets/nyc.
Unzip datasets/tky/raw.zip to datasets/tky.
Unzip datasets/ca/raw.zip to datasets/ca.
Run python preprocessing/generate_ca_raw.py --dataset_name {dataset_name}

Preprocess

cd preprocessing

run python run.py -f best_conf/{dataset_name}.yml

run python traj_qk.py

cd ..

run python traj_sim.py --dataset_name {dataset_name} --model_path {your_model_path}

run python preprocessing/to_nextpoi_qkt.py --dataset_name {dataset_name}

Main Performance

Train

Run:

torchrun --nproc_per_node=8 supervised-fine-tune-qlora.py \
    --model_name_or_path {your_model_path} \
    --bf16 True \
    --output_dir {your_output_path} \
    --model_max_length 32768 \
    --use_flash_attn True \
    --data_path datasets/processed/{dataset_name}/train_qa_pairs_kqt.json \
    --low_rank_training True \
    --num_train_epochs 3 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1000 \
    --save_total_limit 2 \
    --learning_rate 2e-5 \
    --weight_decay 0.0 \
    --warmup_steps 20 \
    --lr_scheduler_type "constant_with_warmup" \
    --logging_steps 1 \
    --deepspeed "ds_configs/stage2.json" \
    --tf32 True
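
The effective global batch size of the command above follows from the number of processes, the per-device batch size, and the gradient accumulation steps. A quick sketch of that arithmetic, assuming a single node (the helper name and the nnodes parameter are illustrative):

```python
def effective_batch_size(nproc_per_node, per_device_train_batch_size,
                         gradient_accumulation_steps, nnodes=1):
    """Number of samples that contribute to one optimizer step."""
    return (nnodes * nproc_per_node
            * per_device_train_batch_size
            * gradient_accumulation_steps)


# Values taken from the torchrun command: 8 processes, batch size 1, 1 accumulation step.
print(effective_batch_size(8, 1, 1))  # 8
```

If you train on fewer GPUs, raising --gradient_accumulation_steps by the same factor keeps this product unchanged, for example 2 GPUs with 4 accumulation steps.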

Test

Run:

python eval_next_poi.py --model_path {your_model_path} --dataset_name {dataset_name} --output_dir {your_finetuned_model} --test_file "test_qa_pairs_kqt.txt"

Acknowledgement

This code is developed based on STHGCN and LongLoRA.

Citation

If you find our work useful, please consider citing our paper:

@inproceedings{li-2024-large,
author = {Li, Peibo and de Rijke, Maarten and Xue, Hao and Ao, Shuang and Song, Yang and Salim, Flora D.},
booktitle = {SIGIR 2024: 47th International ACM SIGIR Conference on Research and Development in Information Retrieval},
date-added = {2024-03-26 23:47:40 +0000},
date-modified = {2024-03-26 23:48:47 +0000},
month = {July},
publisher = {ACM},
title = {Large Language Models for Next Point-of-Interest Recommendation},
year = {2024}}

