Description
I believe that all words following "Roller bearing" from the CIDE.R source file are missing from the resulting dictionary. See "Rut", "Ruta-baga", etc.
I ran a git bisect, and it appears that the breaking change was introduced in 3375fe6. I read through the changes introduced there, but I haven't been able to pinpoint the cause yet.
There's a similar issue for words following "Stooge", such as "Sweet". Here the issue seems to be that the source data is missing a closing </p> tag for the "Stooge" entry. I guess this should be reported upstream to GCIDE, but perhaps we could make the parser more robust against things like that?
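To illustrate what "more robust" could mean here, below is a minimal Python sketch, not the project's actual parser. It assumes entries are delimited by `<p>` ... `</p>` and treats every opening `<p>` as the start of a new paragraph, so a missing `</p>` only affects the one paragraph that lacks it instead of swallowing everything that follows. The function name `split_paragraphs` is hypothetical.

```python
def split_paragraphs(text):
    """Split source text into paragraph bodies, tolerating missing </p> tags.

    Assumption: paragraphs open with a literal "<p>" and never nest.
    """
    bodies = []
    # Everything before the first "<p>" is preamble, so skip piece 0.
    for piece in text.split("<p>")[1:]:
        # Keep only what precedes the closing tag, if there is one.
        body = piece.split("</p>", 1)[0].strip()
        if body:
            bodies.append(body)
    return bodies


# The "Stooge" paragraph lacks its closing tag, yet "Sweet" is still found.
sample = "<p>Stone</p>\n<p>Stooge\n<p>Sweet</p>\n"
print(split_paragraphs(sample))
```

The trade-off is that text appearing between a `</p>` and the next `<p>` is dropped, which may or may not be acceptable for the real source files.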
Given that I only stumbled upon these examples by chance, I suspect that quite a few more words are missing. I wonder if we could come up with an automated way to verify that the resulting dictionary contains all words from the source files?
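One possible shape for such a check, as a rough Python sketch: collect the headwords from the source text independently of the main parser, then diff them against the words in the generated dictionary. This assumes headwords are marked with `<ent>...</ent>` tags in the source files and that the dictionary's words are available as a plain collection of strings. The helper names `source_headwords` and `missing_words` are hypothetical.

```python
import re

# Assumption: each entry's headword appears as <ent>Headword</ent>.
ENT_RE = re.compile(r"<ent>(.*?)</ent>")


def source_headwords(source_text):
    """Return the set of headwords found in the raw source text."""
    return {match.strip() for match in ENT_RE.findall(source_text)}


def missing_words(source_text, dictionary_words):
    """Return the headwords present in the source but absent from the output."""
    return sorted(source_headwords(source_text) - set(dictionary_words))


sample = "<p><ent>Rut</ent> ...</p>\n<p><ent>Ruta-baga</ent> ...</p>\n"
print(missing_words(sample, {"Rut"}))
```

Because the regex scan does not depend on paragraph boundaries, it should still see headwords that the main parser loses to a malformed entry, which is what makes it useful as a cross-check. A test could run it over each source file and fail if the returned list is non-empty.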