
Thursday, March 23, 2017

Spark Scala mapreduce example




https://resources.sei.cmu.edu/asset_files/Presentation/2016_017_001_454701.pdf


Split String in Spark Scala



http://stackoverflow.com/questions/29820501/splitting-strings-in-apache-spark-using-scala


Spark, scala map functions


http://www.brunton-spall.co.uk/post/2011/12/02/map-map-and-flatmap-in-scala/


Spark Date and Time



https://github.com/SparklineData/spark-datetime




Spark SQL tutorial:


https://www.tutorialspoint.com/spark_sql/spark_sql_tutorial.pdf



Basic RDD Spark Scala




https://piazza-resources.s3.amazonaws.com/i386b733m0i5tx/i76rsi9c1wz2nv/spark.pdf?AWSAccessKeyId=AKIAIEDNRLJ4AZKBW6HA&Expires=1490311603&Signature=FvO8b0ONVFXme5yE%2B4JBTm0KT4k%3D



http://www.sparktutorials.net/Getting+Started+with+Apache+Spark+DataFrames+in+Python+and+Scala




Some Scala Tutorials


https://www.tutorialspoint.com/scala/scala_tutorial.pdf


http://www.scala-lang.org/docu/files/ScalaTutorial.pdf


https://www.cs.rice.edu/~javaplt/411/12-fall/Lectures/ScalaBasics.pdf


https://www.scala-lang.org/old/sites/default/files/linuxsoft_archives/docu/files/ScalaByExample.pdf








Wednesday, March 22, 2017

How I build my first Spark Application

A Hadoop MapReduce program is submitted as a job; a Spark program is submitted as an application.

I followed these two guides to build a Spark application.


http://backtobazics.com/big-data/spark/building-spark-application-jar-using-scala-and-sbt/




http://scalatutorials.com/beginner/2013/07/18/getting-started-with-sbt/

I made the directory myProject and the rest of the layout following the second post.

Then I set up WordCount.scala and WordCount.sbt in myProject/src/main/scala:








Then I ran "sbt package" in that directory and got a jar.

Next I created a bash script:






It took a while to find where spark-submit is located.

I ran the bash script, but something was still wrong.

 



But it seemed close.

Then I rewrote the sh file as:






I also rewrote the Scala word count as:




I got the results in the /home/ubuntu output directory, in a file named part-00000:



My original input file looks like this:





Building Spark Application using scala and sbt


http://backtobazics.com/big-data/spark/building-spark-application-jar-using-scala-and-sbt/

I tried this method. The sbt download link in that post had expired, so I used another guide to install sbt:



http://scalatutorials.com/beginner/2013/07/18/getting-started-with-sbt/

Run Spark App on linux


https://www.cs.duke.edu/courses/fall15/compsci290.1/TA_Material/jungkang/how_to_run_spark_app.pdf

build a jar app using sbt:

http://www.learn4master.com/learn-how-to/how-to-package-a-scala-project-to-a-jar-file-with-sbt



https://www.cs.helsinki.fi/u/lagerspe/courses/scala.pdf

spark-submit


https://www.cloudera.com/documentation/enterprise/5-6-x/PDF/cloudera-spark.pdf



https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.0/bk_spark-component-guide/content/spark-dataframe-api.html


Tuesday, March 21, 2017

run spark using bash script


http://stackoverflow.com/questions/40606903/execute-apache-spark-scala-code-in-bash-script


A real Python Spark example tested on the Linux I use

The Linux distribution is Ubuntu.

The example:





How to start the Scala Spark and Python Spark shells?

To start the Scala Spark shell:


./bin/spark-shell

The shell:



To start the Python Spark shell:

./bin/pyspark

The shell:





Examples in Spark:

http://spark.apache.org/docs/latest/quick-start.html

Tested the Python version.


Where is the Spark data located?

Spark data is located in the directory where the Spark shell runs:



We can put the data there, or in a subdirectory, and then refer to the data using:


Where is the Spark Data located? How to store data in Spark?


Here is a tutorial about it:

https://www.dezyre.com/apache-spark-tutorial/apache-spark-installation-tutorial


As the tutorial shows, we add an entry for Spark to the .bashrc file:


SPARK_HOME=/DeZyre/spark

export PATH=$SPARK_HOME/bin:$PATH

And we put the data under the spark directory:



Spark Data examples

Here are some examples about how to work on data in Spark.


http://spark.apache.org/docs/latest/sql-programming-guide.html


Spark Data tutorial:

https://www.tutorialspoint.com/spark_sql/spark_sql_dataframes.htm




How to check which Linux version is installed, how I installed Spark, and how I ran the first example

How to check which Linux is installed?

Use "cat /etc/issue". I currently have two systems, Ubuntu and Red Hat.





I installed Spark on my Ubuntu machine, following this guide:


http://blog.prabeeshk.com/blog/2014/10/31/install-apache-spark-on-ubuntu-14-dot-04/

I followed it step by step, but the sbt/sbt assembly step failed with an error:

Error: Invalid or corrupt jarfile sbt/sbt-launch-0.13.5.jar

Then I found this solution:

http://stackoverflow.com/questions/31594937/error-invalid-or-corrupt-jarfile-sbt-sbt-launch-0-13-5-jar

I followed the first solution:

cd spark-1.2.0-bin-hadoop2.4/
./bin/spark-shell

Then I ran the example from here:

http://blog.prabeeshk.com/blog/2014/10/31/install-apache-spark-on-ubuntu-14-dot-04/

And I got the following results:










Wednesday, March 15, 2017

Python problem

Write a function find_longest_word() that takes a list of words and returns the length of the longest one. Use only higher order functions.

Solution:

class Solution:
    def longest1(self, list1):
        # Return the length of the longest word, as the problem asks,
        # using only the higher-order functions map() and max().
        # default=0 covers an empty list.
        return max(map(len, list1), default=0)

kk=Solution()

if __name__=="__main__":

    print(kk.longest1(["atr","i","ertuo","worutynd"]))

Python problem



In English, the present participle is formed by adding the suffix -ing to the infinitive form: go -> going. A simple set of heuristic rules can be given as follows:
  1. If the verb ends in e, drop the e and add ing (if not exception: be, see, flee, knee, etc.)
  2. If the verb ends in ie, change ie to y and add ing
  3. For words consisting of consonant-vowel-consonant, double the final letter before adding ing
  4. By default just add ing
Your task in this exercise is to define a function make_ing_form() which given a verb in infinitive form returns its present participle form. Test your function with words such as lie, see, move and hug. However, you must not expect such simple rules to work for all cases.



Partial Solution:


class Solution:
    def make_ing1(self,word1):
        if word1[-1].lower()=='e' and word1[-2].lower()!='i':
            word1=word1[:-1]+"ing"
        elif word1[-1].lower()=="e" and word1[-2].lower()=="i":
            word1=word1[:-2]+"y"+"ing"
        else:
            word1=word1+"ing"
        return(word1)



kk=Solution()
if __name__=="__main__":

    print(kk.make_ing1("like"))


Result: liking
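The partial solution above handles rules 1, 2 and 4 but not the "ee" exceptions or rule 3. Here is a rough sketch that tries all four rules. It is only one way to read the rules: I assume rule 3 applies to three-letter words only, and that the exceptions are short words such as "be" plus words ending in "ee". The function name make_ing_form comes from the exercise; VOWELS is my own helper name.

```python
VOWELS = "aeiou"


def make_ing_form(verb):
    """Return the present participle of verb using the four heuristic rules."""
    word = verb.lower()
    if word.endswith("ie"):
        # Rule 2: lie -> lying
        return verb[:-2] + "ying"
    if word.endswith("e") and not word.endswith("ee") and len(word) > 2:
        # Rule 1: move -> moving (be, see, flee, knee fall through to rule 4)
        return verb[:-1] + "ing"
    if (len(word) == 3 and word[0] not in VOWELS
            and word[1] in VOWELS and word[2] not in VOWELS):
        # Rule 3: consonant-vowel-consonant, hug -> hugging
        return verb + verb[-1] + "ing"
    # Rule 4: by default just add ing
    return verb + "ing"


if __name__ == "__main__":
    for v in ["lie", "see", "move", "hug", "be"]:
        print(v, "->", make_ing_form(v))
```

As the exercise warns, these rules still fail on some words (for example, "fix" would become "fixxing").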








Monday, March 13, 2017

Sorting algorithms

Sort a list from highest to lowest.

https://www.cs.cmu.edu/~adamchik/15-121/lectures/Sorting%20Algorithms/sorting.html
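As a small illustration of sorting from highest to lowest, here is a sketch of selection sort in descending order, next to Python's built-in sorted(). The function name selection_sort_desc and the sample list are my own choices and are not taken from the linked lecture.

```python
def selection_sort_desc(values):
    """Return a new list sorted from highest to lowest (selection sort)."""
    result = list(values)  # copy so the input list is not modified
    for i in range(len(result)):
        # Find the index of the largest remaining element.
        largest = i
        for j in range(i + 1, len(result)):
            if result[j] > result[largest]:
                largest = j
        # Move it to position i.
        result[i], result[largest] = result[largest], result[i]
    return result


if __name__ == "__main__":
    data = [6, 8, 5, 2, 4]
    print(selection_sort_desc(data))
    print(sorted(data, reverse=True))  # built-in equivalent
```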


Python problem 9

Write a Python program to find the highest 3 values in a dictionary.

class Solution:
    def high3(self,dic1):
        list1=[]
        for value in dic1.values():
            list1.append(value)

        a=sorted(list1,reverse=True)[:3]

        return(a)

kk=Solution()

if __name__=="__main__":

    print(kk.high3({"a":6,"l":8,"o":5,"i":2,"t":4}))
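Another way to get the same answer, sketched here with the standard heapq module, avoids sorting the whole list. The function names top3_values and top3_items are my own; the second one also keeps the keys, which the solution above drops.

```python
import heapq


def top3_values(dic1):
    """Return the three highest values, highest first."""
    return heapq.nlargest(3, dic1.values())


def top3_items(dic1):
    """Return the (key, value) pairs with the three highest values."""
    return heapq.nlargest(3, dic1.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    scores = {"a": 6, "l": 8, "o": 5, "i": 2, "t": 4}
    print(top3_values(scores))
    print(top3_items(scores))
```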

Python problem 8

Write a Python program to create a dictionary from a string. Note: Track the count of the letters from the string.

class Solution:
    def dicstr1(self, str1):
        dic1={}
        list1=list(str1)
        for item in list1:
            dic1[item]=0
        for item in list1:
            dic1[item]=dic1[item]+1
        return(dic1)


kk=Solution()
if __name__=="__main__":

    print(kk.dicstr1("aattyyliert"))


Results:







A shorter version that iterates over the string directly instead of converting it to a list first:

class Solution:
    def dicstr1(self, str1):
        dic1={}

        for item in str1:
            dic1[item]=0
        for item in str1:
            dic1[item]=dic1[item]+1
        return(dic1)


kk=Solution()
if __name__=="__main__":

    print(kk.dicstr1("aattyyliert"))
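The standard library can do the same counting. This is a sketch with collections.Counter; the function name letter_counts is my own.

```python
from collections import Counter


def letter_counts(text):
    """Return a plain dict mapping each letter to how often it occurs."""
    return dict(Counter(text))


if __name__ == "__main__":
    print(letter_counts("aattyyliert"))
```

Counter is itself a dict subclass, so the dict() call is only there to return the same type as the solutions above.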

Thursday, March 9, 2017

Python problem 7

Write a Python program to create and display all combinations of letters, selecting each letter from a different key in a dictionary.

class Solution:
    def comb1(self,dict1):

        list1=[]
        for key in dict1.keys():
            for key2 in dict1.keys():
                if key!=key2:
                    for i in range(len(dict1[key])):
                        for j in range(len(dict1[key2])):
                            a1=dict1[key][i]+dict1[key2][j]
                            list1.append(a1)
        list2=[]
        for item in list1:
            item=sorted(item)
            item="".join(item)
            list2.append(item)
        list2=list(set(list2))


        return(list2)

kk=Solution()

if __name__=="__main__":

    print(kk.comb1({'1':['a','b'],'2':['c','d']}))
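The solution above pairs up two keys at a time, so with three or more keys it would only give two-letter results. A possible generalization, sketched with itertools.product, takes one letter from every key. The function name letter_combinations is my own, and I assume the letters should appear in the order of the dictionary's keys (insertion order in Python 3.7 and later).

```python
from itertools import product


def letter_combinations(dict1):
    """Return all strings built by picking one letter from each key's list."""
    return ["".join(letters) for letters in product(*dict1.values())]


if __name__ == "__main__":
    print(letter_combinations({'1': ['a', 'b'], '2': ['c', 'd']}))
    print(letter_combinations({'1': ['a'], '2': ['b'], '3': ['c', 'd']}))
```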


Wednesday, March 8, 2017

Python problem 6

Write a Python program to match key values in two dictionaries

Sample dictionary: {'key1': 1, 'key2': 3, 'key3': 2}, {'key1': 1, 'key2': 2}
Expected output: key1: 1 is present in both x and y

class Solution:
    def two1(self, dic1,dic2):
        dic3={}
        for key in dic1.keys():
            if key in dic2.keys() and dic1[key]==dic2[key]:
                dic3[key]=dic1[key]
        return(dic3)


kk=Solution()
if __name__=="__main__":

    print(kk.two1({"a":1,"b":5,"i":9,"o":3},{"a":1,"b":5,"o":3,"t":19,"u":100}))
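The same matching can be written as a dict comprehension. This sketch also prints the result in the format of the expected output; the function name matching_items is my own.

```python
def matching_items(x, y):
    """Return the key/value pairs that appear in both dictionaries."""
    return {key: value for key, value in x.items()
            if key in y and y[key] == value}


if __name__ == "__main__":
    x = {'key1': 1, 'key2': 3, 'key3': 2}
    y = {'key1': 1, 'key2': 2}
    for key, value in matching_items(x, y).items():
        print("%s: %s is present in both x and y" % (key, value))
```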


Where to find Python exercises to practice

If you want to find Python exercises to practice, search for "python exercises" on Google and you will find plenty.

Tuesday, March 7, 2017

Python problem 5

Write a Python program to print all unique values in a dictionary.

class Solution:
    def unique1(self, dict1):
        a1=[]
        for key in dict1.keys():
            a1.append(dict1[key])
        b1=set(a1)

        return(b1)


kk=Solution()
if __name__=="__main__":

    print(kk.unique1({"a":1,"i":1,"u":5}))

Python problem 4

Write a Python program to combine two dictionary adding values for common keys: 

class Solution:
    def add1(self, dic1,dic2):
        for key in dic1.keys():
            if key in dic2.keys():
                dic1[key]=dic1[key]+dic2[key]

        for key in dic2.keys():
            if key not in dic1.keys():
                dic1[key]=dic2[key]


        return(dic1)

kk=Solution()
if __name__=="__main__":
    print(kk.add1({"a":1,"k":3,"u":9},{"a":100,"u":5,"r":100}))
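Note that add1 changes dic1 in place. If the inputs should stay untouched, a variant like the following sketch builds a new dictionary instead. The function name combine is my own.

```python
def combine(dic1, dic2):
    """Return a new dict with the values of common keys added together."""
    result = dict(dic1)  # copy so dic1 is not modified
    for key, value in dic2.items():
        # get() returns 0 for keys that are only in dic2.
        result[key] = result.get(key, 0) + value
    return result


if __name__ == "__main__":
    first = {"a": 1, "k": 3, "u": 9}
    second = {"a": 100, "u": 5, "r": 100}
    print(combine(first, second))
    print(first)  # unchanged
```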

Friday, March 3, 2017

Python problem 3

Given an integer n, write a program to generate a dictionary that contains (i, i*i) for every integer i between 1 and n (both included). The program should then print the dictionary.

class Solution:
    def square1(self,n):
        dict1={}
        for i in range(1,n+1):
            dict1[i]=i**2
        return(dict1)


if __name__=="__main__":
    kk=Solution()
    print(kk.square1(5))
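The loop can also be written as a dict comprehension. A short sketch, with the function name squares being my own:

```python
def squares(n):
    """Return {i: i*i} for every integer i from 1 to n, both included."""
    return {i: i * i for i in range(1, n + 1)}


if __name__ == "__main__":
    print(squares(5))
```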

Thursday, March 2, 2017

Python problem 2

Write a program which can compute the factorial of given numbers.
The results should be printed in a comma-separated sequence on a single line.
Suppose the following input is supplied to the program


class Solution:
    def fac1(self, num):
        if num==0:
            return(1)
        else:
            return(num*self.fac1(num-1))




if __name__=="__main__":
    kk=Solution()
    print(kk.fac1(5))

If you do not put self. before fac1 in the else branch, Python raises a NameError, because fac1 is a method of the class and not a global function.
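The recursive version prints one factorial and will hit Python's recursion limit for large inputs. Here is a sketch that uses the standard math.factorial and prints the results comma-separated, as the problem asks. The function name factorials_csv and the sample inputs are my own.

```python
import math


def factorials_csv(numbers):
    """Return the factorials of numbers as one comma-separated string."""
    return ",".join(str(math.factorial(n)) for n in numbers)


if __name__ == "__main__":
    print(factorials_csv([0, 5, 8]))
```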




A python problem

Write a program which will find all such numbers which are divisible by 7 but are not a multiple of 5,
between 2000 and 3200 (both included).
The numbers obtained should be printed in a comma-separated sequence on a single line.

class Solution:
    def div1(self,a1,a2):
        q1=[]
        for i in range(a1,a2+1,1):
            if i%5!=0 and i%7==0:
                q1.append(i)
        return(q1)
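kk=Solution()
# Use the range from the problem statement and print the numbers
# comma-separated on a single line.
print(",".join(str(i) for i in kk.div1(2000,3200)))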


Wednesday, March 1, 2017

A python problem, and solution



Write a program that maps a list of words into a list of integers representing the lengths of the corresponding words.


class Solution:

    def len1(self, list1):
        list2=[]
        for item in list1:
            list2.append(len(item))
        return(list2)


kk=Solution()

print(kk.len1(['aaa','erty','aaa','aaaar']))
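Since the problem is about mapping, the built-in map() fits it directly. A short sketch, with the function name word_lengths being my own:

```python
def word_lengths(words):
    """Map a list of words to a list of their lengths."""
    return list(map(len, words))


if __name__ == "__main__":
    print(word_lengths(['aaa', 'erty', 'aaa', 'aaaar']))
```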


looking for a man

I am a middle-aged woman. I live in Southern California. I was born in 1980. I do not have any kids. No complicated dating. I am looking for ...