NB: there is no need to clone the repos yourself; the script will check out the relevant branches.
The script assumes that Java 11+ is installed and that JAVA_HOME is /usr/lib/jvm/java-11-openjdk-amd64/. If your Java is installed elsewhere, you can either create a soft link at that path or modify the JAVA_HOME setting in install_semehr.sh.
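A quick way to check the Java assumption before running the installer is sketched below. It is only an illustration: the helper names `has_java` and `suggest_fix` are hypothetical and not part of SemEHR, and the `ln -s` command is merely printed as a suggestion, not executed.

```python
import os

# Path the installer is described as expecting; change it if you edit the script.
EXPECTED_JAVA_HOME = "/usr/lib/jvm/java-11-openjdk-amd64"


def has_java(java_home):
    """Return True if java_home contains an executable bin/java."""
    java_bin = os.path.join(java_home, "bin", "java")
    return os.path.isfile(java_bin) and os.access(java_bin, os.X_OK)


def suggest_fix(actual_java_home, expected=EXPECTED_JAVA_HOME):
    """Return a soft-link command that would make `expected` point at an existing JDK."""
    return "sudo ln -s {} {}".format(actual_java_home, expected)
```

For example, if `has_java(EXPECTED_JAVA_HOME)` is false but your JDK lives in some other folder, `suggest_fix("<your JDK folder>")` gives the soft-link command to run by hand.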
- download the following two files to your computer
# bash script for installation
https://github.com/CogStack/CogStack-SemEHR/blob/safehaven_mini/installation/install_semehr.sh
# semehr configuration template
https://github.com/CogStack/CogStack-SemEHR/blob/safehaven_mini/installation/semehr_conf_template.json
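The links above point to GitHub's web view, so saving them directly from a script would fetch an HTML page instead of the file. A minimal sketch of one way around this is to rewrite each link to its raw.githubusercontent.com form before downloading. The helper names `blob_to_raw` and `download` are hypothetical.

```python
from urllib.request import urlretrieve


def blob_to_raw(url):
    """Convert a github.com '/blob/' link into the matching raw file URL."""
    prefix = "https://github.com/"
    if not url.startswith(prefix) or "/blob/" not in url:
        raise ValueError("not a GitHub blob URL: " + url)
    owner_repo, rest = url[len(prefix):].split("/blob/", 1)
    return "https://raw.githubusercontent.com/{}/{}".format(owner_repo, rest)


def download(url):
    """Download a GitHub blob link into the current folder; return the file name."""
    filename = url.rsplit("/", 1)[-1]
    urlretrieve(blob_to_raw(url), filename)
    return filename
```

Calling `download(...)` once for each of the two links would leave install_semehr.sh and semehr_conf_template.json in the current folder.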
- run the downloaded bash script
sh install_semehr.sh
The script will ask for the full path of an installation folder. All packages and repos will be installed there.
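Once the installer has finished, the chosen path should contain the gcp, semehr and data folders, with input_docs, output_docs, semehr_results and phenome_results under data. The following sketch checks for them; `missing_folders` is a hypothetical helper, not something the installer provides.

```python
import os

# Folders expected under the installation path after a successful install.
EXPECTED_FOLDERS = [
    "gcp",
    "semehr",
    os.path.join("data", "input_docs"),
    os.path.join("data", "output_docs"),
    os.path.join("data", "semehr_results"),
    os.path.join("data", "phenome_results"),
]


def missing_folders(install_path):
    """Return the expected folders that do not exist under install_path."""
    return [
        rel for rel in EXPECTED_FOLDERS
        if not os.path.isdir(os.path.join(install_path, rel))
    ]
```

An empty list from `missing_folders(install_path)` suggests the layout is complete; any names returned point at a step that may have failed.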
- folder structure
When it is installed successfully, the installation folder will contain the following folders.
gcp - contains Java based NLP core software packages
semehr - contains two Github repos of NLP and machine learning modules
data - the working folder, which contains the following subfolders
  - input_docs: the free text documents to be analysed
  - output_docs: the NLP raw outputs
  - semehr_results: the semehr post processed results
  - phenome_results: the text phenotyping results
- copy UMLS ontology into the system. (only needed if you would like to identify all UMLS concept mentions from free-text)
- unzip the preprocessed UMLS file (please get in touch once you have obtained your UMLS license, and a preprocessed copy will be shared with you)
- copy the two subfolders in output/en/ into YOUR_INSTALLATION_FOLDER/gcp/bio-yodie-1-2-1/bio-yodie-resources/en
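The copy step can also be scripted. The sketch below copies every subfolder found in the unzipped output/en/ folder into the bio-yodie resources folder. It assumes Python 3.8+ (for `dirs_exist_ok`) and that output/en/ holds only the subfolders meant to be copied. `copy_umls_subfolders` is a hypothetical helper name.

```python
import os
import shutil


def copy_umls_subfolders(umls_en_dir, install_path):
    """Copy each subfolder of umls_en_dir into the bio-yodie 'en' resources folder.

    Returns the sorted names of the subfolders that were copied.
    """
    target = os.path.join(
        install_path, "gcp", "bio-yodie-1-2-1", "bio-yodie-resources", "en"
    )
    os.makedirs(target, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(umls_en_dir)):
        src = os.path.join(umls_en_dir, name)
        if os.path.isdir(src):
            # Merge into any existing folder of the same name (Python 3.8+).
            shutil.copytree(src, os.path.join(target, name), dirs_exist_ok=True)
            copied.append(name)
    return copied
```

Checking that the returned list has exactly two entries is a cheap sanity test that the archive was unzipped as expected.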
- run nlp
cd $install_path/semehr/CogStack-SemEHR
python semehr_processor.py ../../data/semehr_settings.json
Results will be saved to $install_path/data/semehr_results
- [optional] run phenotype computing for stroke, for example
cd $install_path/semehr/nlp2phenome
python predict_helper.py ./pretrained_models/stroke_settings/prediction_task.json
python doc_inference.py ./pretrained_models/stroke_settings/doc_infer.json
Results will be saved to $install_path/data/phenome_results
- [optional] use customised document level rules
- go to
cd $install_path/semehr/nlp2phenome
- edit ./pretrained_models/stroke_settings/prediction_task.json. Change the rule_file to a customised rule file, for example $install_path/semehr/nlp2phenome/settings/stroke-subtype-rules-full.json.
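The rule_file edit can be done by hand or with a short script. The sketch below makes one loud assumption: it does not know where rule_file sits inside prediction_task.json, so it replaces every key named rule_file at any nesting level and reports how many it changed. Both helper names are hypothetical; keep a backup of the original file before overwriting it.

```python
import json


def set_rule_file(settings, new_rule_file):
    """Replace every 'rule_file' value in a nested JSON structure.

    ASSUMPTION: the key's position in the file is unknown, so all matches
    are updated. Returns the number of values replaced.
    """
    count = 0
    if isinstance(settings, dict):
        for key, value in settings.items():
            if key == "rule_file":
                settings[key] = new_rule_file
                count += 1
            else:
                count += set_rule_file(value, new_rule_file)
    elif isinstance(settings, list):
        for item in settings:
            count += set_rule_file(item, new_rule_file)
    return count


def update_settings_file(path, new_rule_file):
    """Load the JSON settings at path, update rule_file, and write it back."""
    with open(path) as fh:
        settings = json.load(fh)
    count = set_rule_file(settings, new_rule_file)
    with open(path, "w") as fh:
        json.dump(settings, fh, indent=2)
    return count
```

A return value of 0 from `update_settings_file` means no rule_file key was found, in which case the file should be inspected manually.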
- goto