wrong split? #5

@JensBoelte

I tried to use the following script to generate a stimulus set that is split by frequency (split_by) and controlled for number of syllables (control_for). The script does not always generate a stimulus set that respects the split conditions: some of the items have a frequency that exceeds the set limits. Any idea what is wrong with the script or with LexOPS?

Best wishes, Jens

dlexdb_results.csv

# Demo of the LexOPS package

myLibs <- c("LexOPS", "tidyverse")
lapply(myLibs, require, character.only = TRUE)

setwd(dirname(rstudioapi::getActiveDocumentContext()$path))

# Example: dlexDB, four-letter words only

dlexDB <- read_tsv("dlexdb_results.csv", locale = locale(encoding = "UTF-8"))

colnames(dlexDB) <- c("Type", "PoSTag", "Lemma", "Silben", "AnTypeFreqN", "TypeFreqN")

dlexDB$NrSilben <- str_count(dlexDB$Silben, "-") + 1
dlexDB$TypeL <- nchar(dlexDB$Type)
dlexDB$LemmaL <- nchar(dlexDB$Lemma)

range(dlexDB$AnTypeFreqN)

stimuli <- dlexDB |>
  set_options(id_col = "Lemma") |>
  split_by(AnTypeFreqN, 10:20 ~ 200:6557) |>
  control_for(NrSilben, 0:0) |>
  generate(n = "all", match_null = "inclusive")

stimLong <- long_format(stimuli)
stimLong <- stimLong[order(stimLong$condition),]
# Descriptive statistics
stimLong %>%
  group_by(condition) %>%
  summarise(M = mean(AnTypeFreqN),
            SD = sd(AnTypeFreqN),
            Max = max(AnTypeFreqN))
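
To see exactly which items fall outside the split limits, a check along the following lines might help. This is only a minimal sketch on invented toy data: the data frame `toy` and the helper `in_bounds` are hypothetical and not part of LexOPS, and it assumes the two levels of the split are labelled `A1` and `A2` in the `condition` column of the long-format output.

```r
# Toy stand-in for the long-format output; all values are invented for illustration.
toy <- data.frame(
  condition   = c("A1", "A1", "A2", "A2"),
  Lemma       = c("w1", "w2", "w3", "w4"),
  AnTypeFreqN = c(12, 150, 250, 5),
  stringsAsFactors = FALSE
)

# Limits intended by split_by(AnTypeFreqN, 10:20 ~ 200:6557);
# the A1/A2 labels are an assumption about how the conditions are named.
bounds <- list(A1 = c(10, 20), A2 = c(200, 6557))

# Hypothetical helper: TRUE where a value lies within the limits of its condition.
in_bounds <- function(value, condition, bounds) {
  lower <- vapply(condition, function(k) bounds[[k]][1], numeric(1))
  upper <- vapply(condition, function(k) bounds[[k]][2], numeric(1))
  unname(value >= lower & value <= upper)
}

# Flag each row, then keep only the rows that violate their condition's limits.
toy$ok <- in_bounds(toy$AnTypeFreqN, toy$condition, bounds)
violations <- toy[!toy$ok, ]
print(violations)
```

Applied to `stimLong` instead of `toy`, the same helper would list the offending rows rather than only their per-condition maximum.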
