This database is registered with Zenodo and is part of a published paper (to be linked to). Please cite this data by referring to that paper (citation to follow).
All modules of the database are provided in the CLDF format. The cldf-atlas module contains the main ATLAs database, which encodes typological variables for whole languages across many domains. This is the data exposed through the atlasclld CLLD web interface. Descriptions of the individual feature sets in the main database are included in the featuresets folder and provide further documentation of the database.
The cldf-alignment module encodes data about morphological alignment at the level of constructions and paradigms. The cldf-possession module encodes data about adnominal possession at the level of possession classes and individual possession and unpossession constructions. The cldf-sgpl-verbs module encodes data about singular-plural verb stem alternation, at the level of individual verb pairs. Derived features based on these modules are presented in the main ATLAs database.
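As a rough illustration of how one of the modules described above might be read, the following Python sketch groups the values in a module by language. It assumes that the module folder contains a values.csv file with the default CLDF column names Language_ID, Parameter_ID, and Value; this README does not confirm those names, so check the module's metadata file before relying on them. The function name load_values is hypothetical and is not part of the repository.

```python
import csv
from collections import defaultdict
from pathlib import Path


def load_values(module_dir, values_file="values.csv"):
    """Return {language_id: {parameter_id: value}} for one CLDF module.

    Assumption: the module stores its values in `values_file` using the
    default CLDF column names (Language_ID, Parameter_ID, Value).
    """
    path = Path(module_dir) / values_file
    values = defaultdict(dict)
    # newline="" lets the csv module handle quoted line breaks correctly.
    with path.open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            values[row["Language_ID"]][row["Parameter_ID"]] = row["Value"]
    return dict(values)
```

Under those assumptions, calling load_values("cldf-atlas") from the repository root would return one dictionary of parameter values per language. If a language has more than one row for the same parameter, this sketch keeps only the last one.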
Derived features, which come from both the independent modules and "raw" features within the main database, are generated by running the scripts in the processing-scripts folder. Scripts are written in both Python and R. Python dependencies can be installed from processing-scripts/alignment/requirements.txt. The R code depends on the packages argparse, dplyr, data.table, tidyr, tibble, stringr, and rje. All scripts can be run at once by calling run.sh from within the processing-scripts folder. Errors and warnings are written out to outputs.
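For anyone who prefers to start the processing step from Python rather than from a shell, a minimal sketch is given below. It assumes that bash is available on the system and that run.sh takes no arguments; neither is stated in this README. The function name run_processing_scripts is hypothetical.

```python
import subprocess
from pathlib import Path


def run_processing_scripts(repo_root="."):
    """Run run.sh from inside the processing-scripts folder.

    Assumptions: bash is installed and run.sh needs no arguments.
    """
    scripts_dir = Path(repo_root) / "processing-scripts"
    run_script = scripts_dir / "run.sh"
    if not run_script.is_file():
        raise FileNotFoundError(f"run.sh not found in {scripts_dir}")
    # cwd=scripts_dir mirrors calling run.sh from within processing-scripts.
    return subprocess.run(["bash", "run.sh"], cwd=scripts_dir, check=False)
```

The returned object's returncode attribute shows whether the script exited cleanly. The Python and R dependencies listed above still need to be installed beforehand.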
For more information about the structure of the database, please see the associated paper.