Python: ScrapyFromScript

May 15, 2021

Background:

So I was asked to study the topic of “Web Scraping/Crawling” at work. Specifically, my task was to collect all the text data from a given website domain, and I had no idea where to start. Fortunately, I came across a Python tool called Scrapy!
Link to the official documentation:

Scrapy 2.5 documentation - Scrapy 2.5.0 documentation

Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data…

docs.scrapy.org

Great Tutorial to Start:

I followed this author's tutorials. They do a great job of introducing the basics of Scrapy, so I highly recommend checking them out!

Python Scrapy tutorial for beginners - 01 - Creating your first spider

Learn how to fetch the data of any website with Python and the Scrapy Framework in just minutes. On the first lesson of…

letslearnabout.net

Main Content:

If you look for Scrapy tutorials online, you will soon notice that few articles cover how to run Scrapy from a script.

One reason is that Scrapy already ships with a well-designed built-in command-line workflow, so most tutorials stick to it. As an alternative, I have created demos for you to play with. The Python script builds on the tutorials listed above, so please go through them before proceeding to my repo.
Good Luck and Have Fun!
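
For orientation, here is a minimal sketch of what running Scrapy from a script can look like, as opposed to going through the scrapy command. It is not the code from the repo: the function names (parse_book, run_single_page), the CSS selectors, and the output file name are assumptions made for illustration. CrawlerProcess and the FEEDS setting are part of Scrapy itself.

```python
def parse_book(response):
    """Build one item from a book detail page.

    The CSS selectors below are assumptions about the page markup;
    adjust them after inspecting the site you want to scrape.
    """
    return {
        "title": response.css("div.product_main h1::text").get(),
        "price": response.css("p.price_color::text").get(),
    }


def run_single_page(start_url, output_path="book.json"):
    """Scrape a single page by starting Scrapy from plain Python."""
    # Imported inside the function so parse_book stays usable
    # (and testable) even where Scrapy is not installed.
    import scrapy
    from scrapy.crawler import CrawlerProcess

    class SinglePageSpider(scrapy.Spider):
        name = "single_page"  # hypothetical spider name
        start_urls = [start_url]

        def parse(self, response):
            # One page in, one item out.
            yield parse_book(response)

    process = CrawlerProcess(settings={
        # Write the scraped items to a JSON file.
        "FEEDS": {output_path: {"format": "json"}},
        "LOG_LEVEL": "WARNING",
    })
    process.crawl(SinglePageSpider)
    # Blocks until the crawl is done. The underlying Twisted reactor
    # cannot be restarted, so call this once per Python process.
    process.start()
```

Calling run_single_page with the URL of one book page at the bottom of a file, and then running that file with python, should produce a small JSON file without needing a Scrapy project around it.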

CJsGit-tech/ScrapyFromScript

github.com

Assumptions

1. You have a basic understanding of HTML and know how to inspect a website
2. You have Python installed and know how to run Python from the CLI

What you will get out of this REPO

Learn to run Scrapy from the CLI (python WebScraper.py) in two scenarios:
1. Single-page scraping
2. Multi-page crawling
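
The multi-page scenario comes down to two callbacks: one that walks a listing page and follows its links, and one that extracts an item from each detail page. The sketch below shows that shape under stated assumptions. The names follow_listing and run_multi_page, the selectors, and the output file name are hypothetical and not taken from the repo; response.follow and CrawlerProcess are real Scrapy APIs.

```python
# Assumed selectors for a catalogue-style site; check them in your browser.
BOOK_LINK_CSS = "article.product_pod h3 a::attr(href)"
NEXT_PAGE_CSS = "li.next a::attr(href)"


def follow_listing(response, book_callback, listing_callback):
    """Yield one request per book on a listing page, then the next page."""
    for href in response.css(BOOK_LINK_CSS).getall():
        # response.follow resolves relative links against the current page.
        yield response.follow(href, callback=book_callback)

    next_href = response.css(NEXT_PAGE_CSS).get()
    if next_href is not None:
        # Keep crawling until there is no "next" link left.
        yield response.follow(next_href, callback=listing_callback)


def run_multi_page(start_url, output_path="books.json"):
    """Crawl every listing page reachable from start_url."""
    # Imported lazily so follow_listing can be used without Scrapy installed.
    import scrapy
    from scrapy.crawler import CrawlerProcess

    class BooksSpider(scrapy.Spider):
        name = "books"  # hypothetical spider name
        start_urls = [start_url]

        def parse(self, response):
            # Listing page: queue the book pages and the next listing page.
            yield from follow_listing(response, self.parse_book, self.parse)

        def parse_book(self, response):
            # Detail page: extract the fields you care about.
            yield {
                "title": response.css("div.product_main h1::text").get(),
                "price": response.css("p.price_color::text").get(),
            }

    process = CrawlerProcess(settings={
        "FEEDS": {output_path: {"format": "json"}},
    })
    process.crawl(BooksSpider)
    process.start()  # blocks until the whole crawl has finished
```

Keeping the link-following logic in a plain function is a design choice for this sketch only: it makes the pagination step easy to check with a fake response object before pointing the spider at a real site.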

Sample Output

{"title": "It’s Only the Himalayas",
"UPC": "a22124811bfa8350",
"IMG": "http://books.toscrape.com/media/cache/6d/41/6d418a73cc7d4ecfd75ca11d854041db.jpg",
"stars": " Two",
"des": "Wherever you go, whatever you do, just . . .",
"price": "£45.17",
"price_inc_tax": "£45.17",
"instock": "In stock (19 available)"},