Princeton CS class recommendations on Coursera:
Algorithms: Design and Analysis
They offered it previously.
Now they offer Analysis of Algorithms:
https://www.coursera.org/learn/analysis-of-algorithms
Related courses:
https://www.coursera.org/specializations/algorithms
https://www.edx.org/course/algorithm-design-analysis-pennx-sd3x
I write about solutions to problems I have run into in programming and data analytics. They may help you in your work. Thank you.
Wednesday, June 28, 2017
Thursday, June 22, 2017
pandas read_csv: Error tokenizing data. C error: Expected 1 fields in line 13, saw 2
If a Python program fails with "Error tokenizing data. C error: Expected 1 fields in line 13, saw 2", the likely reason is that the wrong delimiter was used for the file.
I used pandas read_csv to read a CSV file and got this error.
Later on, I found the cause: I had passed the wrong delimiter in sep. The columns were separated by "\t", but I had used ",". With sep=",", pandas expected one field per line (the header contains no comma), and line 13 happened to contain a comma, so it was split into two fields. After I changed sep to "\t", the file loaded fine.
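One way to avoid guessing is to check which delimiter the file actually uses before calling read_csv. The sketch below uses csv.Sniffer from Python's standard library on a small in-memory sample. The sample text and the helper name guess_delimiter are made up for illustration, and the candidate list is only an assumption about which delimiters are plausible.

```python
import csv


def guess_delimiter(sample, candidates=",\t;|"):
    """Return the delimiter csv.Sniffer infers from a text sample.

    `candidates` limits the characters the sniffer may choose from.
    """
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


# Hypothetical tab-separated sample standing in for the first lines of the file.
sample = "id\tname\n1\talice\n2\tbob\n"
delimiter = guess_delimiter(sample)
print(repr(delimiter))
```

The detected value can then be passed to pandas, for example pd.read_csv(path, sep=delimiter), where path is your own file.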
Customer segmentation using data science
http://blog.yhat.com/posts/customer-segmentation-using-python.html
https://www.r-bloggers.com/search/segmentation/
http://planetpython.org/
http://www.pythonblogs.com/
http://pythonthusiast.pythonblogs.com/230_pythonthusiast/archive/1347_starting_to_use_kivy__developing_letter_of_heroes_an_android_alphabet_teaching_aid_for_kids_part_2_of_2.html
Affinity propagation clustering:
http://scikit-learn.org/stable/modules/generated/sklearn.cluster.AffinityPropagation.html
Wednesday, June 21, 2017
Wednesday, June 14, 2017
DataFrame.foreach cannot update a driver-side list in Scala Spark, but a for loop over collect() can
Part of a script I wrote updates a list inside a loop. Calling foreach directly on the DataFrame did not update the list, but a for loop over the collected rows did. Here is the sample code:
For loop:
import scala.collection.mutable.ListBuffer

// Assumes a spark-shell session in which `rdd` already exists and the
// first column of rdd.toDF holds a path string.
val Typea = ListBuffer[String]()
val Typeb = ListBuffer[String]()
val Pattern1 = "TYPEA".r
val Pattern2 = "TYPEB".r

// collect() copies every row to the driver, so the loop below runs locally.
val rrr = rdd.toDF.collect()
for (f <- rrr) {
  val fName = f.getString(0)
  // Check each path segment against the two patterns.
  for (k <- fName.split("/")) {
    if (Pattern1.findFirstIn(k).isDefined) { println(k); Typea += k }
    else if (Pattern2.findFirstIn(k).isDefined) { println(k); Typeb += k }
  }
}

val Typea1 = Typea.distinct.sorted
val Typeb1 = Typeb.distinct.sorted
println(Typea)
println(Typeb)
println(Typea1)
println(Typeb1)
This updated the lists Typea and Typeb: when I printed them after the loop, they contained the matching strings.
Foreach loop:
import scala.collection.mutable.ListBuffer

val Typea = ListBuffer[String]()
val Typeb = ListBuffer[String]()
val Pattern1 = "TYPEA".r
val Pattern2 = "TYPEB".r

// This closure is serialized and run on the executors, so it appends to
// executor-side copies of Typea and Typeb, not to the driver's buffers.
rdd.toDF.foreach { f =>
  val fName = f.getString(0)
  fName.split("/").foreach { k =>
    if (Pattern1.findFirstIn(k).isDefined) { println(k); Typea += k; println(Typea) }
    else if (Pattern2.findFirstIn(k).isDefined) { println(k); Typeb += k; println(Typeb) }
  }
}

val Typea1 = Typea.distinct.sorted
val Typeb1 = Typeb.distinct.sorted
println(Typea)
println(Typeb)
println(Typea1)
println(Typeb1)
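After the loop, both printed lists were empty on the driver. The function passed to foreach on a DataFrame runs on the executors against serialized copies of Typea and Typeb, so the appends never reach the driver's lists. Calling collect() first brings the rows to the driver, where an ordinary loop updates the lists as expected.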
Tuesday, June 13, 2017
Spark Scala: get the subdirectories of a path on Amazon S3
https://stackoverflow.com/questions/42063077/spark-read-multiple-directories-into-mutiple-dataframes
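import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// "foo/" is a placeholder; for S3, use the full URI of the bucket prefix.
val path = "foo/"
val dir = new Path(path)
val hadoopConf = new Configuration()
// Resolve the file system from the path itself, so a path on S3 is not
// looked up on the cluster's default file system.
val fs = dir.getFileSystem(hadoopConf)
// Keep only the entries that are directories and return their full paths.
val paths: Array[String] = fs.listStatus(dir)
  .filter(_.isDirectory)
  .map(_.getPath.toString)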
Wednesday, June 7, 2017
Spark SQL query: IN clause for strings
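val list = List("a", "b")
// Quote each value and join with commas, giving: ... where uid in ('a','b')
// The interpolated string must stay on one line; a plain s"..." literal cannot span lines.
val query = s"select * from df where uid in (${list.map(x => "'" + x + "'").mkString(",")})"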
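For comparison, here is a minimal sketch of the same string building in Python, as it might be used from PySpark. The helper name sql_in_list is hypothetical, the table name df is taken from the query above, and the values are assumed not to contain single quotes, since nothing here escapes them.

```python
def sql_in_list(values):
    """Wrap each value in single quotes and join them with commas."""
    return ",".join("'" + v + "'" for v in values)


uids = ["a", "b"]
# Same query text as the Scala version builds.
query = f"select * from df where uid in ({sql_in_list(uids)})"
print(query)  # select * from df where uid in ('a','b')
```

If the values come from user input, building SQL by string concatenation like this is unsafe; filtering with the DataFrame API is the safer choice.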
Monday, June 5, 2017
Scala Spark: loop through a DataFrame
http://permalink.gmane.org/gmane.comp.lang.scala.spark.user/30128
You could call the collect() method provided by the DataFrame API. This will give you an Array[Row]. You could then iterate over this array and create your map. Something like this:
val mapOfVals = scala.collection.mutable.Map[String, String]()
// df stands for your own DataFrame with two string columns.
val rows = df.collect()
rows.foreach(r => mapOfVals.put(r.getString(0), r.getString(1)))
println("KEYS : " + mapOfVals.keys)
println("VALS : " + mapOfVals.values)