Similarity based Molecular Generation (SiMGen)

SiMGen is a local, similarity-based molecular generation method. It uses a pretrained MACE model to compute local molecular descriptors and a time-dependent similarity kernel to generate new molecules.

SiMGen is available as an online web-tool at https://zndraw.icp.uni-stuttgart.de/.

The paper is now out in Nature Communications!

Installation

The package can be installed using pip:

pip install simgen

Note that the analysis code requires the rdkit package, which is not installed by default. To install it, run:

pip install "simgen[all]" # quotes keep shells such as zsh from expanding the brackets
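
Before running the analysis code, it can help to confirm that the optional dependency is importable. The following is a minimal sketch using only the Python standard library; the helper name `has_module` is illustrative and is not part of SiMGen.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("rdkit"):
        print("rdkit found: the analysis dependencies are available")
    else:
        print('rdkit missing: run pip install "simgen[all]"')
```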

Usage

There are two main ways to use the package: via a command line interface or interactively using ZnDraw.

Interactive use

We host an online GPU-powered web-tool at https://zndraw.icp.uni-stuttgart.de/. The documentation for the web-tool is available here.

You can also run ZnDraw locally. After installing the package, run the following commands to start the web-tool and connect SiMGen to it:

zndraw --port 1234 PATH_TO_XYZ_FILE # path is optional; use --no-browser on remote servers
# Run the next command in a separate terminal
simgen connect --device cuda # default port is 1234

If you want to try out linker generation, add the --add-linkers flag to the simgen connect command.

Run simgen connect --help for more information.
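
Before running simgen connect, it may be useful to check that something is actually listening on the ZnDraw port. The sketch below assumes the server runs on the local machine at port 1234, as in the commands above; the function `is_port_open` is a hypothetical helper, not part of SiMGen or ZnDraw.

```python
import socket


def is_port_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open(1234):
        print("A server is listening on port 1234")
    else:
        print("Nothing is listening on port 1234; start zndraw first")
```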

Tip

SiMGen uses the mace-models package to download data and the hydrogenation model. Downloading local copies can speed up your workflow. To do so, run:

git clone https://github.com/RokasEl/MACE-Models
cd MACE-Models
dvc pull
simgen init . # or simgen init /path/to/MACE-Models

This will set SiMGen's default path to the local MACE models.

CLI use

For unconstrained generation, you can use the following command:

python scripts/generate_mols_cli.py --save-path PATH_TO_SAVE_MOLS \
    --num-molecules 10 \
    --num-heavy-atoms 9 \
    --track-trajectories \
    --prior-gaussian-covariance 1. 1. 0.1 # controls the shape of the prior
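
To build intuition for how the three covariance values could shape the prior, here is a minimal sketch. It assumes the values are the diagonal entries (variances along x, y, and z) of a zero-mean Gaussian from which positions are drawn. That interpretation, and the functions `sample_gaussian_prior` and `axis_spread`, are assumptions for illustration and are not taken from the SiMGen source.

```python
import math
import random
import statistics


def sample_gaussian_prior(num_atoms, variances, seed=0):
    """Draw `num_atoms` 3D points from a zero-mean, axis-aligned Gaussian.

    `variances` holds the assumed diagonal covariance entries (x, y, z).
    """
    rng = random.Random(seed)
    sigmas = [math.sqrt(v) for v in variances]
    return [tuple(rng.gauss(0.0, s) for s in sigmas) for _ in range(num_atoms)]


def axis_spread(points):
    """Standard deviation of the point cloud along each axis."""
    return tuple(statistics.pstdev(axis) for axis in zip(*points))


if __name__ == "__main__":
    points = sample_gaussian_prior(num_atoms=2000, variances=(1.0, 1.0, 0.1))
    sx, sy, sz = axis_spread(points)
    # A small third value squeezes the cloud along z, giving a flatter shape.
    print(f"spread x={sx:.2f}, y={sy:.2f}, z={sz:.2f}")
```

Under this assumption, the values 1. 1. 0.1 would favor flat, disc-like arrangements, while three equal values would give a roughly spherical cloud.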

To construct molecules with more complicated shapes, you will have to manually define the shape via a point cloud prior. See scripts/paper_examples/generate_macrocycles.py for an example.
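
As a rough illustration of what a point cloud prior might look like, the sketch below places points evenly on a circle, a shape one might use for a ring-like molecule. It is not the code from scripts/paper_examples/generate_macrocycles.py; the function name `ring_point_cloud`, the radius, and the point count are hypothetical, and how such points are passed to SiMGen is left to that example script.

```python
import math


def ring_point_cloud(num_points, radius):
    """Return `num_points` (x, y, z) points evenly spaced on a circle in the z = 0 plane."""
    points = []
    for i in range(num_points):
        angle = 2.0 * math.pi * i / num_points
        points.append((radius * math.cos(angle), radius * math.sin(angle), 0.0))
    return points


if __name__ == "__main__":
    # Radius is in the same length units as the molecular coordinates.
    for x, y, z in ring_point_cloud(num_points=12, radius=4.0):
        print(f"{x:7.3f} {y:7.3f} {z:7.3f}")
```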

docker-compose setup

If you have already set up ZnDraw, you can set up SiMGen with a docker-compose.yaml file as follows:

services:
  simgen:
    image: pythonf/simgen
    restart: always
    command: simgen connect --device "cuda" --mace-model-name "medium_spice" --reference-data-name "simgen_reference_data_medium" --path . --hydrogenation-model-name hydromace_spice_only
    environment:
      - SIMGEN_AUTH_TOKEN=XXXXXXXXXXX
      - SIMGEN_URL=XXXXXXXXXXX
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

References

If you use SiMGen in your research, please cite the following paper:

@article{elijosius_zero_2025,
	title = {Zero shot molecular generation via similarity kernels},
	volume = {16},
	issn = {2041-1723},
	url = {https://www.nature.com/articles/s41467-025-60963-3},
	doi = {10.1038/s41467-025-60963-3},
	number = {1},
	urldate = {2025-07-01},
	journal = {Nature Communications},
	author = {Elijošius, Rokas and Zills, Fabian and Batatia, Ilyes and Norwood, Sam Walton and Kovács, Dávid Péter and Holm, Christian and Csányi, Gábor},
	month = jul,
	year = {2025},
	note = {Publisher: Nature Publishing Group},
	pages = {1--16},
}

License

The code is licensed under the MIT license. See LICENSE for more information.
