DSGram is a novel evaluation framework designed to enhance the performance evaluation of Grammatical Error Correction (GEC) models, especially in the era of large language models (LLMs). Traditional reference-based evaluation metrics often fall short due to the inherent discrepancies between model-generated corrections and provided gold references. DSGram addresses this issue by introducing a dynamic weighting mechanism that integrates Semantic Coherence, Edit Level, and Fluency.
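As a rough illustration of the idea (not the repository's actual API; the function and variable names here are hypothetical), the overall DSGram score can be sketched as a weighted combination of the three sub-metric scores, where the weights are determined dynamically per input rather than fixed in advance:

```python
def dsgram_score(sub_scores, weights):
    """Combine sub-metric scores with dynamically derived weights.

    sub_scores / weights: dicts keyed by sub-metric name.
    The weights are expected to sum to 1.
    """
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[name] * sub_scores[name] for name in sub_scores)

# Placeholder values for illustration only; in DSGram the weights would be
# computed dynamically (e.g. by an LLM judge) for each sentence.
scores = {"semantic_coherence": 8.0, "edit_level": 6.5, "fluency": 9.0}
weights = {"semantic_coherence": 0.4, "edit_level": 0.3, "fluency": 0.3}
overall = dsgram_score(scores, weights)
```

The key design point is that the weighting adapts to the input, so, for example, a minimally edited sentence and a heavily rewritten one are not penalized by the same fixed trade-off between Edit Level and Fluency.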
This repository contains the code and data associated with the paper: "DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models" by Jinxiang Xie, Yilin Li, Xunjian Yin, and Xiaojun Wan.
Dataset/: Contains the datasets used for evaluation, including human-annotated and LLM-simulated sentences.
DSGram/: Source code implementing the DSGram evaluation framework.
results/: Directory for storing the evaluation results.
If you use DSGram in your research, please cite our paper:
@misc{xie2024dsgramdynamicweightingsubmetrics,
  title={DSGram: Dynamic Weighting Sub-Metrics for Grammatical Error Correction in the Era of Large Language Models},
  author={Jinxiang Xie and Yilin Li and Xunjian Yin and Xiaojun Wan},
  year={2024},
  eprint={2412.12832},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2412.12832},
}

