
Monday, June 6, 2016

A possible search algorithm

One of my tasks at work is to match one set of strings against another set of strings. Some of the strings in one set are similar to strings in the other set.

I used fuzzy string matching to do this: calculate the similarity score between one string and each string in the other set using the Levenshtein algorithm, sort the similarity scores, and assign the string with the highest score as the match for the original string.

Here is the wiki page for the Levenshtein algorithm:

https://en.wikipedia.org/wiki/Levenshtein_distance
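
The idea described above can be sketched in a few lines of Python. This is only a minimal, self-contained illustration, not our production code: the function names (`levenshtein_distance`, `similarity`, `best_match`) are made up for this example, and the score is normalized as 1 - distance / longest length, which is an assumption and not the same formula as `Levenshtein.ratio` from the python-Levenshtein package.

```python
def levenshtein_distance(a, b):
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Illustrative normalization to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


def best_match(query, candidates):
    """Score every candidate, sort by score, return the (candidate, score) pair
    with the highest score. Assumes candidates is non-empty."""
    scored = [(c, similarity(query, c)) for c in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[0]
```

For example, `best_match("appel", ["apple", "maple", "grape"])` picks `"apple"`, because it needs only two single-character edits while the other candidates need at least three.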



Here is part of the code.

The whole code is very long. We used some other search logic to refine the results.

We deployed it as a web service.

Google uses search algorithms of its own. In a similar way, this method could possibly serve as a search algorithm.


if level2=='UNK' and level3=='unk' and level1!='unk':
    # level2 and level3 are unknown here, so match on level1 alone.
    scores = {}
    level = level1
    for row in feature:
        # Compare against the last field of each row; key the score by the third field.
        ratio = float(Levenshtein.ratio(str(level), row[-1]))
        scores[str(row[2].strip())] = ratio
    # Keep the five highest-scoring candidates.
    my_list = sorted(scores.items(), key=lambda x: x[1], reverse=True)[:5]
    match = selection(my_list, avails_dic)
    if logging_level > 5:
        print("returned value : %s" % match)
    step2 = time.strftime("%x %X")

    tdelta = datetime.datetime.strptime(step2, FMT) - datetime.datetime.strptime(step1, FMT)
    if logging_level > 4:
        print("first step ending time is: " + step2)
        print("first step used time is: %s" % tdelta)
    return match
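
The pattern in the fragment above (score every candidate, keep the top five, then hand them to a second step) can be sketched as a standalone example. Everything here is an assumption for illustration: `top_candidates`, `select_available`, and `stand_in_score` are hypothetical names, the availability check is only a guess at the kind of refinement a function like `selection` might do, and `difflib.SequenceMatcher` from the standard library stands in for the Levenshtein ratio. It is a different similarity metric.

```python
import difflib
import heapq


def top_candidates(query, candidates, score, k=5):
    """Return the k highest-scoring (candidate, score) pairs, best first."""
    scored = ((c, score(query, c)) for c in candidates)
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


def select_available(ranked, available):
    """Hypothetical refinement step: return the first ranked candidate
    that is also in `available`, or None if there is none."""
    for candidate, _score in ranked:
        if candidate in available:
            return candidate
    return None


def stand_in_score(a, b):
    # Standard-library stand-in for a Levenshtein-based ratio (NOT the same metric).
    return difflib.SequenceMatcher(None, a, b).ratio()
```

A call could look like `select_available(top_candidates("appel", names, stand_in_score), available_names)`, where `names` and `available_names` are whatever collections the caller has. Using `heapq.nlargest` avoids sorting the full score list when only a few top candidates are needed.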
