Hello all,
I am interested in obtaining the top N fuzzy variations of a
string (a person or company name) using the same concept
as the Levenshtein distance. Usually Levenshtein is used to
compute the distance between two given strings, but I would
instead like an algorithm that generates the top N highest-scoring
fuzzy variations for any given term, e.g.
Giovanni - 100%
Giovann - 98%
iovanni - 98%
Govanni - 98%
....
anni - 55%
This way I can precompute the variations in advance rather than
computing distances during online matching.
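To make the idea concrete, here is a rough sketch of what I have in mind. It only generates variations by deleting characters (which is all my examples above do), expanding one edit at a time so that the closest variations come out first. The class and method names are made up for illustration, and the score (1 - distance / length) is just an assumption on my part; it does not reproduce the percentages in my example, which would need some other weighting.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: deletion-only variations, closest first.
class FuzzyVariations {

    /**
     * Returns up to n variations of term mapped to an assumed score,
     * in order of increasing edit distance (ties in alphabetical order).
     */
    static Map<String, Double> topN(String term, int n) {
        Map<String, Double> result = new LinkedHashMap<>();
        Set<String> seen = new HashSet<>();
        Set<String> frontier = new TreeSet<>(); // sorted for a stable order
        frontier.add(term);
        seen.add(term);
        int distance = 0;

        while (!frontier.isEmpty() && result.size() < n) {
            // Every string in the frontier is exactly `distance` deletions away.
            for (String s : frontier) {
                if (result.size() == n) {
                    break;
                }
                // Assumed scoring: 1 - distance / length of the original term.
                double score = term.isEmpty()
                        ? 1.0
                        : 1.0 - (double) distance / term.length();
                result.put(s, score);
            }
            // Build the next level by deleting one more character.
            Set<String> next = new TreeSet<>();
            for (String s : frontier) {
                for (int i = 0; i < s.length(); i++) {
                    String v = s.substring(0, i) + s.substring(i + 1);
                    // Skip the empty string and duplicates such as "Giovani".
                    if (!v.isEmpty() && seen.add(v)) {
                        next.add(v);
                    }
                }
            }
            frontier = next;
            distance++;
        }
        return result;
    }
}
```

Calling FuzzyVariations.topN("Giovanni", 8) would give the term itself plus its seven distinct one-deletion variations. Substitutions and insertions could be added in the same loop, but they need an alphabet to draw from, so the number of candidates grows much faster.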
Can anyone recommend an existing implementation, e.g. in Java?
Many thanks in advance,
Best Regards,
Giovanni