This repository is supplemental to the paper 'Context is Key: Aligning Large Language Models with Human Moral Judgments through Retrieval-Augmented Generation', presented at FLAIRS-38 and published in Florida Online Journals.
The project introduces an AI agent that evaluates interpersonal conflicts by using Retrieval-Augmented Generation (RAG) to:
- Collect similar conflicts from a dataset
- Use these conflicts as context to refine the LLM's judgment
- Provide adaptable moral evaluations without costly fine-tuning
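The retrieve-then-judge loop above can be sketched in a few lines. This is a minimal illustration, not the repository's implementation: it uses a toy bag-of-words similarity in place of a real embedding store, and a stand-in `stub_llm` that votes by retrieved precedent instead of calling GPT-4o. All function and field names here are hypothetical.

```python
from collections import Counter
import math

def embed(text):
    # Toy bag-of-words vector; the real system would use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=2):
    # Rank stored AITA conflicts by similarity to the new conflict; keep top k.
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc["post"])), reverse=True)
    return ranked[:k]

def judge(conflict, corpus, llm):
    # Build a prompt from retrieved precedents, then ask the LLM for a verdict.
    precedents = retrieve(conflict, corpus)
    context = "\n".join(f"Past case: {d['post']} -> {d['verdict']}" for d in precedents)
    prompt = f"{context}\nNew conflict: {conflict}\nVerdict?"
    return llm(prompt)

def stub_llm(prompt):
    # Stand-in for GPT-4o: returns the majority verdict among retrieved cases.
    votes = [line.rsplit("-> ", 1)[1] for line in prompt.splitlines() if "-> " in line]
    return max(set(votes), key=votes.count) if votes else "NTA"

corpus = [
    {"post": "refused to lend my car to my brother", "verdict": "NTA"},
    {"post": "yelled at my roommate over dishes", "verdict": "YTA"},
    {"post": "would not lend money to a friend", "verdict": "NTA"},
]
print(judge("refused to lend my laptop to a friend", corpus, stub_llm))  # -> NTA
```

In the paper's setup, the verdict is then fed back with the retrieved evidence for further refinement rather than returned after a single pass.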
A dataset containing the top 50,000 submissions to the r/AmITheAsshole (r/AITA) subreddit from 2018-2022 was created, including the top ten comments for each post.
Using OpenAI's GPT-4o as the base LLM, two agents were developed:
- Base: doesn't use RAG to refine its responses.
- RAG: uses RAG to retrieve AITA conflicts to use as evidence to iteratively refine its response.
The RAG agent demonstrated clear improvements over the Base agent: accuracy increased from 77% to 84%, and the Matthews correlation coefficient (MCC) improved from 0.357 to 0.469. Additionally, the generation of toxic responses was practically eliminated.
These findings demonstrate that integrating LLMs into RAG frameworks effectively improves alignment with human moral judgments while mitigating harmful language.