
Saturday, November 22, 2014

Save a file as .csv or as a tab-delimited file

One day I was working on a project that used some flat files. First, I saved the file as a tab-delimited .txt file. When I read it back in, I kept getting an error saying that the rows were not all the same length. Then I converted the file to .csv, and the error went away. Later I found out that in the tab-delimited file, some columns were not actually separated by a tab, i.e. two or more columns had collapsed into one, so the number of columns differed from row to row. The .csv file had no such issue, since the commas were always there to separate the columns.

So maybe .csv files are safer to use than tab-delimited files.
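A quick way to check for this kind of problem before loading a file is to count the fields in each row. The sketch below uses Python's standard csv module. The function name field_count_report and the sample data are made up for illustration; they are not from my project.

```python
import csv
import io
from collections import Counter


def field_count_report(text, delimiter):
    """Map 'number of fields in a row' to 'how many rows have that many'."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return Counter(len(row) for row in reader)


# Hypothetical sample: in the last row the tab between "Bob" and "NYC"
# was lost, so two columns collapsed into one.
tab_text = "id\tname\tcity\n1\tAnn\tLA\n2\tBob NYC\n"
csv_text = "id,name,city\n1,Ann,LA\n2,Bob,NYC\n"

tab_report = field_count_report(tab_text, "\t")
csv_report = field_count_report(csv_text, ",")

# More than one distinct field count means the rows are ragged.
print("tab file ragged:", len(tab_report) > 1)
print("csv file ragged:", len(csv_report) > 1)
```

If the report has more than one key, some rows have a different number of columns, which is exactly the error I ran into.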

Python's Levenshtein method is much faster than the difflib.SequenceMatcher method

Last week I worked on a project that was mainly about text mining: comparing the similarity between pairs of strings.

At first I used Python's difflib.SequenceMatcher to do the similarity comparison. It took around 120 minutes to finish about 400 million comparisons.

After I switched to the Levenshtein method in Python, the same amount of computation took only 30 minutes.

I was told that Levenshtein does its computation in C, while difflib.SequenceMatcher does its computation in Python. That is why Levenshtein is so much faster.
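Here is a minimal sketch of the two approaches side by side. It is not the code from my project. The function names are made up for illustration, and the third-party Levenshtein package is assumed to be installed; if it is not, the sketch falls back to a slow pure-Python edit distance so that it still runs. Note that the two methods measure different things: SequenceMatcher.ratio() gives a similarity between 0 and 1, while the Levenshtein distance counts single-character edits.

```python
import difflib

try:
    import Levenshtein  # third-party C extension, assumed to be installed
except ImportError:
    Levenshtein = None


def similarity_difflib(a, b):
    """Similarity ratio in [0, 1] from the pure-Python standard library."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def edit_distance_python(a, b):
    """Pure-Python Levenshtein distance, used only as a fallback."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def edit_distance(a, b):
    """Use the C implementation when available, else the fallback."""
    if Levenshtein is not None:
        return Levenshtein.distance(a, b)
    return edit_distance_python(a, b)


print(similarity_difflib("kitten", "sitting"))
print(edit_distance("kitten", "sitting"))
```

To compare speed on your own data, you could wrap each function in timeit over a sample of your string pairs.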

Thursday, November 6, 2014

This tutorial greatly helped me understand classes in the Python language

This tutorial greatly helped me understand classes in Python. It explains everything very clearly.

http://www.sthurlow.com/python/lesson08/

looking for a man

 I am a middle-aged woman. I live in Southern California. I was born in 1980. I do not have any kids. No complicated dating. I am looking for ...