Scraping the Web in Python — IMDb

2 min read · May 16, 2021

In this post, we scrape an IMDb web page and extract the name, runtime and genre of each movie using the urllib and Beautiful Soup libraries in Python.

For an understanding of HTML tags, refer to the attached reference.

Layout of the page.

Filter movies by release date between January 1950 and December 2012, ordered by “Number of Votes” descending.
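The filter above can also be expressed directly in the search URL. Here is a minimal sketch, assuming IMDb's advanced title search lives at `/search/title/` and accepts `release_date` and `sort` query parameters. The endpoint, the parameter names, the exact start and end days, and the helper `build_search_url` are assumptions for illustration, not something taken from the original post.

```python
from urllib.parse import urlencode

# Assumed endpoint for IMDb's advanced title search.
BASE_URL = "https://www.imdb.com/search/title/"


def build_search_url(start="1950-01-01", end="2012-12-31"):
    """Build a search URL for a release-date range, most-voted titles first."""
    params = {
        "release_date": f"{start},{end}",  # assumed parameter name
        "sort": "num_votes,desc",          # assumed parameter name and value
    }
    # safe="," keeps the commas readable instead of encoding them as %2C.
    return BASE_URL + "?" + urlencode(params, safe=",")


url = build_search_url()
print(url)
```

Building the query with `urlencode` rather than by hand makes it easy to change the date range later without worrying about escaping.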

The first page of results lists 50 movies out of a total of 3,711,876.

Inspect the HTML and locate the data of interest, i.e. name, runtime and genre.

Import the libraries
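A plausible set of imports for this walkthrough looks like the following. The post only names urllib and Beautiful Soup; the `csv` module is my assumption for writing the output file.

```python
# Standard library: request the page and write the CSV output.
import csv
from urllib.request import Request, urlopen

# Third-party: parse the HTML (install with `pip install beautifulsoup4`).
from bs4 import BeautifulSoup
```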

Request data from the URL and dump the HTML into a variable page_html
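One way to do this with urllib is sketched below. The helper name `fetch_html`, the timeout and the `User-Agent` value are assumptions for illustration. Some sites reject urllib's default user agent, which is why a header is set here.

```python
from urllib.request import Request, urlopen


def fetch_html(url, timeout=30):
    """Download a page and return its HTML as a string."""
    # Send a browser-like User-Agent; the exact value is an arbitrary choice.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        # Use the charset announced by the server, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Usage (makes a network call, so it is left commented out here):
# page_html = fetch_html("https://www.imdb.com/search/title/?...")
```

The result is a plain string, which is what Beautiful Soup expects in the next step.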

Extract the values from the tags and direct the output to a .csv file

Parse the HTML dump, iterate through the items and extract the name, year and runtime tags. Direct the output to a .csv file.
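The parsing and export step could look roughly like this. The CSS class names (`lister-item-content`, `lister-item-header`, `lister-item-year`, `runtime`, `genre`) are assumptions based on how IMDb search results were marked up around the time of writing. The site may have changed since, so confirm them in the browser inspector first. The helper names and the `movies.csv` file name are also hypothetical.

```python
import csv

from bs4 import BeautifulSoup

FIELDS = ["name", "year", "runtime", "genre"]


def text_or_blank(parent, selector):
    """Return the stripped text of the first match, or '' if nothing matches."""
    tag = parent.select_one(selector)
    return tag.get_text(strip=True) if tag else ""


def extract_movies(page_html):
    """Turn the HTML dump into a list of dicts, one per movie."""
    soup = BeautifulSoup(page_html, "html.parser")
    rows = []
    # Assumed markup: each result sits in a div.lister-item-content block.
    for item in soup.select("div.lister-item-content"):
        rows.append({
            "name": text_or_blank(item, "h3.lister-item-header a"),
            "year": text_or_blank(item, "span.lister-item-year"),
            "runtime": text_or_blank(item, "span.runtime"),
            "genre": text_or_blank(item, "span.genre"),
        })
    return rows


def write_csv(rows, path):
    """Write the extracted rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


# Usage, given a page_html string from the previous step:
# write_csv(extract_movies(page_html), "movies.csv")
```

Returning an empty string for a missing tag keeps the columns aligned when a movie has no runtime or genre listed.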

Check the contents of the .csv file; it should contain the extracted data.
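A quick way to check is to read the first few rows back. This is a small sketch using only the standard library. The helper `preview_csv` and the file name `movies.csv` are hypothetical.

```python
import csv


def preview_csv(path, limit=5):
    """Return up to `limit` rows from a CSV file as dictionaries."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # zip stops at whichever runs out first: the limit or the file.
        return [row for _, row in zip(range(limit), reader)]


# Usage, assuming the scraper wrote its output to movies.csv:
# for row in preview_csv("movies.csv"):
#     print(row)
```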


Written by Hoda Saiful
