Hi, I am looking for a faster alternative to the common-substrings algorithm used in my Python bioinformatics package pydna.
I currently use a Python implementation of the Kärkkäinen-Sanders suffix array algorithm, mostly for predicting homologous recombination between DNA molecules. It is surprisingly fast for pure Python, but profiling shows that it is still a bottleneck.
My background is not CS, so I wonder if anyone could show me how to use pydivsufsort to find all substrings shared by two strings that are longer than a given cutoff.
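
To make the request concrete, here is a slow pure-Python sketch of the output I am after. It is not the pydna code and not a pydivsufsort recipe: the names `suffix_array`, `lcp_array` and `common_substrings` are only illustrative, the suffix array is built by naive sorting, and I assume a match counts when its length is at least `limit` and that the separator character never occurs in the input. I would guess that pydivsufsort's `divsufsort` and `kasai` could replace the two helper functions, but I have not checked that their output conventions (for example how the LCP array is indexed) match what is used here.

```python
def suffix_array(text):
    """Naive suffix array: suffix start positions sorted by suffix."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    """Kasai's algorithm. lcp[k] is the length of the common prefix of
    the suffixes starting at sa[k] and sa[k + 1]; the last entry is 0."""
    n = len(text)
    rank = [0] * n
    for k, pos in enumerate(sa):
        rank[pos] = k
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == n - 1:
            h = 0
            continue
        j = sa[rank[i] + 1]
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp


def common_substrings(s1, s2, limit=25, sep="\x00"):
    """Return sorted (start_in_s1, start_in_s2, length) tuples for every
    maximal match of length >= limit. Assumes sep is in neither string."""
    text = s1 + sep + s2
    offset = len(s1) + 1  # position where s2 starts inside text
    sa = suffix_array(text)
    lcp = lcp_array(text, sa)
    n = len(text)
    matches = []
    k = 0
    while k < n - 1:
        if lcp[k] < limit:
            k += 1
            continue
        # Extend to the run sa[k..end] whose neighbours all share >= limit.
        end = k
        while end < n - 1 and lcp[end] >= limit:
            end += 1
        for a in range(k, end):
            shared = n
            for b in range(a + 1, end + 1):
                # Common prefix of sa[a] and sa[b] is the minimum LCP between.
                shared = min(shared, lcp[b - 1])
                i, j = sorted((sa[a], sa[b]))
                if i < len(s1) and j >= offset:  # one suffix from each string
                    # Keep only matches that cannot be extended to the left.
                    if i == 0 or j == offset or text[i - 1] != text[j - 1]:
                        matches.append((i, j - offset, shared))
        k = end
    return sorted(matches)


print(common_substrings("TTTGATTACAGGG", "CCCGATTACAAAA", limit=5))
# prints [(3, 3, 7)]
```

The pairwise loop inside each run is quadratic in the run length, so this is only meant to pin down the expected result, not to be fast.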