I did one project last week. The project is mainly about text mining, comparing the similarity between two strings.
And I used Python's difflib.SequenceMatcher to do the similarity comparison. It took around 120 mins to finish a 400M times computations.
And after I switched to Levenshtein method on Python, it only took 30 mins to finish the same amount of computation.
I was told Levenshtein used C to do the computation and difflib.SequenceMatcher used Python to do the computation. So Levenshtein is much faster.
I wrote about the solutions to some problems I found from programming and data analytics. They may help you on your work. Thank you.
ezoic
Subscribe to:
Post Comments (Atom)
looking for a man
I am a mid aged woman. I was born in 1980. I do not have any kid. no complicated dating before . I am looking for a man here for marriage...
-
I tried to commit script to bitbucket using sourcetree. I first cloned from bitbucket using SSH, and I got an error, "authentication ...
-
https://github.com/boto/boto3/issues/134 import boto3 import botocore client = boto3.client('s3') result = client.list_obje...
-
Previously, I wanted to install "script" on Atom to run PHP. And there was some problem, like the firewall. So I tried atom-runner...
No comments:
Post a Comment