A Python Scrapy tutorial
https://www.youtube.com/watch?v=OJ8isyws2yw
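Scrapy is a Python package for web crawling and scraping.
The video above walks through a small Scrapy example.
First, create a project by running:
scrapy startproject tutorial
This creates a project directory named tutorial. Inside the project, in the tutorial/spiders directory (tutorial/tutorial/spiders from where you ran the command), create a Python file called quotes_spider.py.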
The code:
import scrapy


class QuotesSpider(scrapy.Spider):
    # Spider name used on the command line: scrapy crawl quotes
    name = "quotes"

    def start_requests(self):
        urls = [
            'http://quotes.toscrape.com/page/1/',
            'http://quotes.toscrape.com/page/2/',
        ]
        for url in urls:
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # The second-to-last URL segment is the page number, e.g. "1"
        page = response.url.split("/")[-2]
        filename = "quotes-%s.html" % page
        with open(filename, "wb") as f:
            f.write(response.body)
        self.log("Saved file %s" % filename)
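The filename logic in parse() can be tried on its own, without running a crawl. The following is a minimal sketch in plain Python; the helper name page_filename is hypothetical and not part of Scrapy or the video. It only repeats the string handling from the spider above.

```python
def page_filename(url):
    """Build the output filename the same way parse() does (hypothetical helper)."""
    # "http://quotes.toscrape.com/page/1/".split("/") ends with
    # [..., "page", "1", ""], so index -2 is the page number.
    page = url.split("/")[-2]
    return "quotes-%s.html" % page


if __name__ == "__main__":
    for url in [
        "http://quotes.toscrape.com/page/1/",
        "http://quotes.toscrape.com/page/2/",
    ]:
        print(url, "->", page_filename(url))
```

Note that this relies on the trailing slash in the URL. Without it, index -2 would pick up "page" instead of the page number.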
How to run it:
From the spiders directory, run:
scrapy crawl quotes
Results, in the spiders folder:
Two files, quotes-1.html and quotes-2.html, one for each page requested.
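To check the result without opening the folder by hand, a small script can list the saved pages. This is only a sketch using the standard library; the function name list_saved_pages is hypothetical, and it assumes the files follow the quotes-<page>.html naming used by the spider.

```python
import glob
import os


def list_saved_pages(directory="."):
    """Return sorted (filename, size in bytes) pairs for saved quotes pages."""
    pattern = os.path.join(directory, "quotes-*.html")
    return sorted(
        (os.path.basename(path), os.path.getsize(path))
        for path in glob.glob(pattern)
    )


if __name__ == "__main__":
    # Run this from the directory where scrapy crawl quotes was run.
    for name, size in list_saved_pages("."):
        print("%s (%d bytes)" % (name, size))
```

An empty list means no matching files were found in that directory, which usually means the crawl was run from somewhere else.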