
Thursday, June 22, 2017

If a Python program shows the error "Error tokenizing data. C error: Expected 1 fields in line 13, saw 2", the likely reason

If a Python program shows the error "Error tokenizing data. C error: Expected 1 fields in line 13, saw 2", the reason is probably that you did not use the right delimiter for your file.

I used pandas' read_csv to read a CSV file, and I got the error: Error tokenizing data. C error: Expected 1 fields in line 13, saw 2.

Later on, I found the cause: I had passed the wrong delimiter in sep. The columns are separated by "\t", but I used "," as the delimiter. Once I changed it, everything worked fine.
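As a quick illustration (a minimal sketch using an in-memory string in place of my actual file, with made-up column names), passing sep="\t" lets read_csv split the tab-separated columns correctly:

```python
import io
import pandas as pd

# Two tab-separated columns; with the default comma delimiter,
# pandas would not split these lines into columns at all.
data = "name\tage\nalice\t30\nbob\t25\n"

df = pd.read_csv(io.StringIO(data), sep="\t")
print(df.shape)          # (2, 2): two rows, two columns
print(list(df.columns))  # ['name', 'age']
```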

Customer segmentation using data science



http://blog.yhat.com/posts/customer-segmentation-using-python.html


https://www.r-bloggers.com/search/segmentation/


http://planetpython.org/


http://www.pythonblogs.com/


http://pythonthusiast.pythonblogs.com/230_pythonthusiast/archive/1347_starting_to_use_kivy__developing_letter_of_heroes_an_android_alphabet_teaching_aid_for_kids_part_2_of_2.html


Affinity propagation clustering:

http://scikit-learn.org/stable/modules/generated/sklearn.cluster.AffinityPropagation.html


Wednesday, June 14, 2017

foreach on a Spark DataFrame can not update the list, but a for loop over collected rows can

I wrote a script, part of which updates a list inside a loop. Calling foreach directly on the DataFrame can not update the list, but collecting the rows first and using a local for loop can. Here is the sample code:

For loop:

val Typea = scala.collection.mutable.ListBuffer[String]()
val Typeb = scala.collection.mutable.ListBuffer[String]()

val Pattern1 = "TYPEA".r
val Pattern2 = "TYPEB".r

// collect() brings every row to the driver, so the loop below runs locally
val rrr = rdd.toDF.collect()
for (f <- rrr) {
  val fName = f.getString(0)
  for (k <- fName.split("/")) {
    if (Pattern1.findFirstIn(k).isDefined) { println(k); Typea += k }
    else if (Pattern2.findFirstIn(k).isDefined) { println(k); Typeb += k }
  }
}

val Typea1 = Typea.distinct.sorted
val Typeb1 = Typeb.distinct.sorted
println(Typea)
println(Typeb)
println(Typea1)
println(Typeb1)

This version updates the lists Typea and Typeb: after the loop, printing them shows the matched strings.



Foreach loop:

val Typea = scala.collection.mutable.ListBuffer[String]()
val Typeb = scala.collection.mutable.ListBuffer[String]()

val Pattern1 = "TYPEA".r
val Pattern2 = "TYPEB".r

// foreach runs on the executors; each task receives a serialized copy
// of Typea and Typeb, so the driver's buffers are never modified
rdd.toDF.foreach { f =>
  val fName = f.getString(0)
  fName.split("/").foreach { k =>
    if (Pattern1.findFirstIn(k).isDefined) { println(k); Typea += k; println(Typea) }
    else if (Pattern2.findFirstIn(k).isDefined) { println(k); Typeb += k; println(Typeb) }
  }
}

val Typea1 = Typea.distinct.sorted
val Typeb1 = Typeb.distinct.sorted
println(Typea)
println(Typeb)
println(Typea1)
println(Typeb1)

After the loop, printing the lists on the driver shows they are empty: the foreach closure was shipped to the executors, and each task appended to its own deserialized copy of the buffers, not to the driver's originals.
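The underlying behaviour can be sketched in plain Python (an analogy, not Spark itself): serializing and deserializing a buffer, which is what Spark does when it ships a closure to an executor, yields an independent copy, so appends to the copy never reach the original.

```python
import pickle

# Driver-side buffer, analogous to the ListBuffer in the Scala code
buf = []

# Spark serializes the closure (and everything it captures) before
# sending it to an executor; pickle plays that role here
executor_copy = pickle.loads(pickle.dumps(buf))
executor_copy.append("TYPEA_file")  # the mutation happens on the copy

print(buf)            # [] - the original buffer is unchanged
print(executor_copy)  # ['TYPEA_file']
```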

Tuesday, June 13, 2017

amazon s3 spark scala get sub directories



https://stackoverflow.com/questions/42063077/spark-read-multiple-directories-into-mutiple-dataframes


import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val path = "foo/"

val hadoopConf = new Configuration()
val fs = FileSystem.get(hadoopConf)
val paths: Array[String] = fs.listStatus(new Path(path))
  .filter(_.isDirectory)
  .map(_.getPath.toString)




Wednesday, June 7, 2017

Spark SQL query IN clause for strings


val list = List("a", "b")
val query = s"select * from df where uid in (${list.map(x => "'" + x + "'").mkString(",")})"
 
 

Monday, June 5, 2017

Scala Spark loop through a data frame



http://permalink.gmane.org/gmane.comp.lang.scala.spark.user/30128

You could call the collect() method provided by the DataFrame API. This gives you an Array[Row]. You can then iterate over the array and build your map, something like this:

val mapOfVals = scala.collection.mutable.Map[String, String]()
val rows = df.collect()  // df is the DataFrame to read from
rows.foreach(r => mapOfVals.put(r.getString(0), r.getString(1)))
println("KEYS : " + mapOfVals.keys)
println("VALS : " + mapOfVals.values)
 
 


