Saturday, November 22, 2014

Python's Levenshtein method is much faster than the difflib.SequenceMatcher method

Last week I worked on a text-mining project. The main task was comparing the similarity between two strings.

I used Python's difflib.SequenceMatcher to do the similarity comparison. It took around 120 minutes to finish about 400M comparisons.
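For reference, a single comparison with difflib.SequenceMatcher can be sketched roughly as below. The helper name `similarity` and the sample strings are only illustrative assumptions, not the code from the project.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 (hypothetical helper)."""
    # The first argument is the "junk" predicate; None means no element is
    # treated as junk. ratio() returns 2 * matches / (len(a) + len(b)).
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Illustrative inputs only.
    print(similarity("kitten", "sitting"))
    print(similarity("same text", "same text"))
```

Calling a function like this hundreds of millions of times is where the running time adds up, because each call runs entirely in Python.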

After I switched to the Levenshtein module for Python, the same amount of computation took only 30 minutes, about four times faster.

I was told that Levenshtein does its computation in C, while difflib.SequenceMatcher does its computation in Python. That is why Levenshtein is much faster.
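A minimal sketch of the Levenshtein approach follows. It assumes the third-party `Levenshtein` package (installed as python-Levenshtein) and falls back to a slow pure-Python edit distance when the package is missing. The names `edit_distance` and `normalized_similarity` are hypothetical helpers, and the normalization is just one possible way to turn a distance into a 0 to 1 score, not necessarily what the project used.

```python
try:
    # C extension, assumed to be installed via: pip install python-Levenshtein
    from Levenshtein import distance as edit_distance
except ImportError:
    def edit_distance(a: str, b: str) -> int:
        """Pure-Python fallback: classic dynamic-programming edit distance."""
        if len(a) < len(b):
            a, b = b, a  # keep the inner row as short as possible
        previous = list(range(len(b) + 1))
        for i, ch_a in enumerate(a, start=1):
            current = [i]
            for j, ch_b in enumerate(b, start=1):
                cost = 0 if ch_a == ch_b else 1
                current.append(min(previous[j] + 1,          # deletion
                                   current[j - 1] + 1,       # insertion
                                   previous[j - 1] + cost))  # substitution
            previous = current
        return previous[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Map the edit distance to a 0..1 score (an assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(normalized_similarity("kitten", "sitting"))
```

Note that this score is not numerically identical to SequenceMatcher's ratio, so any similarity threshold would need to be checked again after switching.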
