LingGym is a new benchmark that evaluates LLMs' capacity for metalinguistic reasoning using Interlinear Glossed Text (IGT) and grammatical descriptions extracted from 18 typologically diverse reference grammars. Our work is presented in "LingGym: How Far Are LLMs from Thinking Like Field Linguists?"
In this GitHub repo, we release three types of data:

- Benchmark data (Main): the complete multiple-choice dataset used for benchmark evaluation.
- CSV files: examples and explanations extracted from the grammar books.
- IGT files: all IGT-format data extracted from the grammar books.
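For readers unfamiliar with IGT, an entry pairs a morpheme-segmented source line with a morpheme-by-morpheme gloss line and a free translation. The sketch below is purely illustrative of this general structure; the `parse_igt` helper and the example sentence are hypothetical and do not reflect the exact file layout used in this repo.

```python
# Illustrative sketch of the IGT structure: a segmented transcription line,
# an aligned gloss line, and a free translation. Not the repo's actual format.

def parse_igt(transcription: str, gloss: str, translation: str) -> dict:
    """Pair each word/morpheme group with its gloss; token counts must match."""
    morphemes = transcription.split()
    glosses = gloss.split()
    if len(morphemes) != len(glosses):
        raise ValueError("transcription and gloss lines are misaligned")
    return {
        "pairs": list(zip(morphemes, glosses)),
        "translation": translation,
    }

# Hypothetical Swahili-style example (not taken from the dataset):
entry = parse_igt(
    "ni-ka-soma kitabu",
    "1SG-PST-read book",
    "I read a book",
)
print(entry["pairs"])  # [('ni-ka-soma', '1SG-PST-read'), ('kitabu', 'book')]
```

The alignment check matters in practice: a gloss line with a different token count than its transcription usually signals an extraction error.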
The benchmark data is also available on Hugging Face: LINK
If you find our work useful, please consider citing our paper:
```bibtex
@inproceedings{yang-etal-2025-linggym,
    title = "{L}ing{G}ym: How Far Are {LLM}s from Thinking Like Field Linguists?",
    author = "Yang, Changbing and
      Ma, Franklin and
      Shi, Freda and
      Zhu, Jian",
    editor = "Christodoulopoulos, Christos and
      Chakraborty, Tanmoy and
      Rose, Carolyn and
      Peng, Violet",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.69/",
    doi = "10.18653/v1/2025.emnlp-main.69",
    pages = "1314--1340",
    ISBN = "979-8-89176-332-6"
}
```