wrong split? #5
Description
I tried to use the following script to generate a stimulus set with a split condition for frequency (split_by) and a control condition for number of syllables (control_for). The script does not always generate a stimulus set that respects the split conditions: some items have a frequency outside the specified limits. Any idea what is wrong with the script or with LexOPS?
Best wishes,
Jens
# Demo of the LexOPS package
myLibs <- c("LexOPS", "tidyverse")
lapply(myLibs, require, character.only = TRUE)
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
# Example: dlexDB, four-letter words only
dlexDB <- read_tsv("dlexdb_results.csv", locale = locale(encoding = "UTF-8"))
colnames(dlexDB) <- c("Type", "PoSTag", "Lemma", "Silben", "AnTypeFreqN", "TypeFreqN")
dlexDB$NrSilben <- str_count(dlexDB$Silben, "-") + 1
dlexDB$TypeL <- nchar(dlexDB$Type)
dlexDB$LemmaL <- nchar(dlexDB$Lemma)
range(dlexDB$AnTypeFreqN)
stimuli <- dlexDB |>
  set_options(id_col = "Lemma") |>
  split_by(AnTypeFreqN, 10:20 ~ 200:6557) |>
  control_for(NrSilben, 0:0) |>
  generate(n = "all", match_null = "inclusive")
stimLong <- long_format(stimuli)
stimLong <- stimLong[order(stimLong$condition),]
# Descriptive statistics per condition
stimLong %>%
  group_by(condition) %>%
  summarise(M = mean(AnTypeFreqN),
            SD = sd(AnTypeFreqN),
            Max = max(AnTypeFreqN))
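
To pin the problem down, a check along the following lines could list the items that fall outside the split limits. This is only a minimal sketch in base R on made-up data: the `toy` data frame is a hypothetical stand-in for `stimLong`, the condition labels `A1` and `A2` are an assumption about how the long-format output names the two levels, and the bounds are copied from the `split_by()` call above. The last step checks whether the values used as `id_col` are unique, since a non-unique identifier is one thing worth ruling out; that is a guess, not a confirmed cause.

```r
# Hypothetical stand-in for stimLong (made-up values, not real dlexDB data)
toy <- data.frame(
  Lemma       = c("Haus", "Haus", "Baum", "Tisch"),
  condition   = c("A1", "A2", "A1", "A2"),
  AnTypeFreqN = c(15, 12, 18, 950),
  stringsAsFactors = FALSE
)

# Limits copied from split_by(AnTypeFreqN, 10:20 ~ 200:6557);
# the labels A1/A2 are assumed, adjust them to the actual output
bounds <- data.frame(
  condition = c("A1", "A2"),
  lower     = c(10, 200),
  upper     = c(20, 6557),
  stringsAsFactors = FALSE
)

# Attach the limits to every item and flag frequencies outside them
checked <- merge(toy, bounds, by = "condition")
checked$in_range <- checked$AnTypeFreqN >= checked$lower &
  checked$AnTypeFreqN <= checked$upper
out_of_range <- checked[!checked$in_range, ]
print(out_of_range)

# Identifier values that occur more than once
dup_ids <- unique(toy$Lemma[duplicated(toy$Lemma)])
print(dup_ids)
```

Replacing `toy` with `stimLong` (and `toy$Lemma` with `dlexDB$Lemma` for the uniqueness check) would apply the same check to the real data.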