LLM Norms

Code and data for the following paper:

Trott, S. Can large language models help augment English psycholinguistic datasets? Behav Res 56, 6082–6100 (2024). https://doi.org/10.3758/s13428-024-02337-z

The data

All data can be found in the data directory:

data/raw: contains the original human norms for each judgment task, as well as the instructions.
data/processed: contains the output of the GPT-4 norming process.
data/lexical_statistics: contains the files needed to reproduce the substitution analyses.

The LLM-generated norms are already included, but if you'd like to regenerate them, see the section below.

Reproducing the norms

The norms can be reproduced using either src/models/similarity.py (for judgments comparing two words or two contexts) or src/models/single_word.py (for judgments involving a single word). The task itself can be changed in the __main__ part of the script or with a command-line argument.
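As a rough illustration of selecting a task from the command line, here is a minimal sketch using argparse. The flag name --task and the default value example_task are hypothetical and are not taken from the scripts; check the __main__ block of each script for the argument it actually accepts.

```python
import argparse


def parse_args(argv=None):
    """Parse a task name from the command line.

    The --task flag and its default are hypothetical placeholders,
    not the names used by the scripts in src/models.
    """
    parser = argparse.ArgumentParser(
        description="Select which judgment task to run (illustrative only)."
    )
    parser.add_argument(
        "--task",
        default="example_task",  # hypothetical default task name
        help="Name of the judgment task to collect norms for.",
    )
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"Selected task: {args.task}")
```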

Note that accessing the OpenAI API requires authentication. The code reads this authentication information from a file in src/models called gpt_key; to run the code, you'll need to create an analogous file with your own authentication information.
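For orientation, a minimal sketch of reading a key from such a file is shown below. It assumes gpt_key is a plain-text file containing only the key; the function name read_api_key is hypothetical, and the scripts may load the file differently.

```python
from pathlib import Path


def read_api_key(path="src/models/gpt_key"):
    """Return the key stored in a plain-text file.

    Assumes the file contains only the key, optionally followed by a
    newline. This is an illustration, not the repository's own loader.
    """
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"No key found in {path}")
    return key
```

Keeping the key in a separate file that is not committed avoids hard-coding credentials in the scripts.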

Processing the output

Once you've reproduced the norms, you can run src/processing/process_datasets.py, which will convert .txt files to .csv files.
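To give a sense of what a .txt to .csv conversion can look like, here is a minimal sketch. The layout of the .txt output is not described in this README, so the sketch assumes, purely for illustration, that each non-empty line holds tab-separated fields (for example an item and a rating). The function name txt_to_csv and the column names are hypothetical; process_datasets.py remains the script to use for the real conversion.

```python
import csv
from pathlib import Path


def txt_to_csv(txt_path, csv_path, header=("item", "rating")):
    """Convert a tab-separated .txt file into a .csv file with a header row.

    Assumed input format (hypothetical): one record per line, fields
    separated by tabs. Blank lines are skipped. Returns the number of
    records written.
    """
    rows = []
    for line in Path(txt_path).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        rows.append(line.split("\t"))

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```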

Running the analyses

Finally, the relevant analyses are contained in src/analysis as .Rmd files. The results have already been knit to .html.
