Experiments with cleanup of dirty ALTO OCR files using anagram hashing.
As of 2013-11-09 the experiments are primarily about the applicability of anagram hashing for machine-based synonym detection. No real effort has been made to make it scale.
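
For readers unfamiliar with the technique, here is a minimal sketch of anagram hashing in Python. It is not taken from this repository's code. It assumes one common formulation, where a word's key is the sum of its character code points raised to a fixed power; the exponent 5 and the names `anagram_key` and `group_by_key` are illustrative assumptions.

```python
from collections import defaultdict

# Assumed exponent; the actual experiments may use a different key function.
POWER = 5


def anagram_key(word: str) -> int:
    """Return an order-independent key: words with the same letters share it."""
    return sum(ord(ch) ** POWER for ch in word.lower())


def group_by_key(words):
    """Group words by their anagram key."""
    groups = defaultdict(set)
    for word in words:
        groups[anagram_key(word)].add(word)
    return dict(groups)


groups = group_by_key(["listen", "silent", "cat"])
```

Because the key is a plain sum, substituting one character for another shifts the key by a fixed amount (for example `ord("o") ** POWER - ord("a") ** POWER` for `a` to `o`), which is what makes it possible to look up candidate variants of an OCR-damaged word by arithmetic on keys rather than by comparing strings.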