Scrapy (SKRAY-pee) is a free and open-source web-crawling framework written in Python. Originally designed for web scraping, it can also be used to extract data using APIs or as a general-purpose web crawler. It is currently maintained by Scrapinghub Ltd., a web-scraping development and services company.
|Initial release||26 June 2008|
|Stable release||1.7.3 / 1 August 2019|
|Operating system||Windows, macOS, Linux|
Scrapy's project architecture is built around "spiders", self-contained crawlers that are given a set of instructions. In the spirit of other "don't repeat yourself" frameworks, such as Django, it makes large crawling projects easier to build and scale by letting developers reuse their code. Scrapy also provides a web-crawling shell, which developers can use to test their assumptions about a site’s behavior.
Scrapy was born at London-based web-aggregation and e-commerce company Mydeco, where it was developed and maintained by employees of Mydeco and Insophia (a web-consulting company based in Montevideo, Uruguay). The first public release was in August 2008 under the BSD license. In 2011, Scrapinghub became the new official maintainer, and the milestone 1.0 release followed in June 2015.