Pyspark tutorial:
https://www.dezyre.com/apache-spark-tutorial/pyspark-tutorial
I write about solutions to problems I have run into in programming and data analytics. They may help you in your work. Thank you.
Wednesday, March 28, 2018
How I set up a pyspark job.
I installed Spark on an Ubuntu machine.
I wrote a PySpark hello-world script in Python:
from pyspark import SparkContext
from operator import add
sc = SparkContext()
data = sc.parallelize(list("Hello World"))
counts = data.map(lambda x: (x, 1)).reduceByKey(add).sortBy(lambda x: x[1], ascending=False).collect()
for (word, count) in counts:
    print("{}: {}".format(word, count))
sc.stop()
Then I wrote a shell script to run the PySpark code:
/home/ubuntu/spark/bin/spark-submit --master local[8] --driver-memory 12g --executor-memory 12g helloworld.py
Then I ran it and got the results:
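A plain-Python sketch of the same count, with no Spark involved, is a quick way to check what the job should print. It uses only the standard library; `char_counts` is a hypothetical helper name, not part of the original script. Note that the script counts characters, not words, because `list("Hello World")` splits the string into single characters.

```python
from collections import Counter


def char_counts(text):
    """Count each character and return (char, count) pairs, most frequent first.

    This mirrors the map -> reduceByKey -> sortBy(descending) chain in the
    PySpark script, but runs locally on a plain string.
    """
    return Counter(text).most_common()


if __name__ == "__main__":
    for char, count in char_counts("Hello World"):
        print("{}: {}".format(char, count))
```

The most frequent character is `l` with 3 occurrences, followed by `o` with 2. Every other character appears once, and in Spark the order among those ties may differ from run to run.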
Thursday, March 22, 2018
Wednesday, March 21, 2018
Thursday, March 8, 2018
Can one use tools to simulate logon by python script?
I once saw some people use tools to simulate a logon with Python scripts. I was not sure it would work, so I tried a script I found, and it works. Here is the script:
# -*- coding:utf-8 -*-
import requests
from bs4 import BeautifulSoup
import urllib
import re

url = 'https://accounts.douban.com/login'
data = {
    'redir': 'https://www.douban.com/',
    'form_email': 'XXXXXX',
    'form_password': 'XXXXXX',
    'login': u'登录'
}
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.87 Safari/537.36'}

r = requests.post(url, data, headers=headers)
page = r.text
soup = BeautifulSoup(page, "html.parser")
print(soup)
It was written in Python 3.6. The headers did not matter in my test; I just used an arbitrary User-Agent string.
I got what I wanted.
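For context on what `requests.post(url, data, headers=headers)` sends: a dict passed as `data` is form-encoded into the request body. The sketch below builds an equivalent POST request with the standard library's `urllib` and does not send it, so the encoded body can be inspected offline. `build_login_request` is a hypothetical helper name, and the field names are simply the ones used in the script above; whether the site still accepts this form is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_login_request(url, email, password, user_agent="Mozilla/5.0"):
    """Build (but do not send) a form-encoded POST request for a login form."""
    form = {
        "redir": "https://www.douban.com/",
        "form_email": email,
        "form_password": password,
        "login": "登录",  # label of the submit button ("Log in")
    }
    # urlencode percent-encodes the values (UTF-8 for non-ASCII text);
    # the request body must be bytes.
    body = urlencode(form).encode("ascii")
    return Request(url, data=body, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_login_request("https://accounts.douban.com/login", "XXXXXX", "XXXXXX")
    print(req.get_method())  # a Request that carries data defaults to POST
    print(req.data.decode("ascii"))
```

Passing the request to `urllib.request.urlopen` would actually send it. Keeping construction and sending separate makes it easy to see exactly which fields are posted before any network call is made.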
Friday, March 2, 2018
Touch Typing without looking at keyboard
https://www.youtube.com/watch?v=QSomh1VsKSU
https://www.tipp10.com/en/
https://www.youtube.com/watch?v=soYGESxy76g
https://www.keybr.com/