Easy Programming: Python's Levenshtein method is much faster than difflib.SequenceMatcher method

Saturday, November 22, 2014

Python's Levenshtein method is much faster than difflib.SequenceMatcher method

I worked on a project last week. It was mainly about text mining: comparing the similarity between two strings.

At first I used Python's difflib.SequenceMatcher to do the similarity comparison. It took around 120 minutes to finish about 400 million comparisons.
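
For readers who have not used it, here is a minimal sketch of a string comparison with difflib.SequenceMatcher. The helper name `similarity` and the sample strings are only for illustration and are not from my project.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two strings."""
    # The first argument is an optional "junk" predicate; None disables it.
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Hypothetical sample pair, just to show the call.
    print(similarity("kitten", "sitting"))
```

ratio() returns 1.0 for identical strings and 0.0 when the two strings have nothing in common.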

After I switched to the Levenshtein method in Python, it took only 30 minutes to finish the same amount of computation, roughly four times faster.
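
To show what a Levenshtein (edit) distance measures, here is a small pure-Python sketch of the classic dynamic-programming version. This is only an illustration of the idea: it is not the code the Levenshtein library runs, and it will be far slower than a C implementation. The function name `levenshtein_distance` is my own.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Count the minimum single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep when equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein_distance("kitten", "sitting"))
```

A smaller distance means the strings are more alike, and 0 means they are identical.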

I was told that Levenshtein does its computation in C, while difflib.SequenceMatcher does it in pure Python. That is why Levenshtein is so much faster.
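
If you want to check the difference on your own machine, a rough timing sketch could look like the one below. It assumes the third-party Levenshtein package (the C extension) is installed, and it skips that part if the import fails. The sample pairs and the `benchmark` helper are made up for this example, so the numbers will not match the 120 and 30 minutes from my project. Note also that the two ratio() functions are not guaranteed to return identical scores, so this compares speed only.

```python
import timeit
from difflib import SequenceMatcher

try:
    import Levenshtein  # third-party C extension, assumed to be installed
except ImportError:
    Levenshtein = None

# Hypothetical sample data, repeated to get a measurable workload.
PAIRS = [
    ("kitten", "sitting"),
    ("saturday", "sunday"),
    ("text mining", "test mining"),
] * 1000


def run_difflib():
    return [SequenceMatcher(None, a, b).ratio() for a, b in PAIRS]


def run_levenshtein():
    return [Levenshtein.ratio(a, b) for a, b in PAIRS]


def benchmark(repeats: int = 3) -> dict:
    """Return the best wall-clock time in seconds for each available method."""
    results = {"difflib": min(timeit.repeat(run_difflib, number=1, repeat=repeats))}
    if Levenshtein is not None:
        results["Levenshtein"] = min(
            timeit.repeat(run_levenshtein, number=1, repeat=repeats)
        )
    return results


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s for {len(PAIRS)} comparisons")
```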
