SQL tutorial using IMDb Database

Hoda Saiful
3 min readNov 28, 2020

An advanced learner wanting to brush through SQL skills or a beginner trying to learn SQL, this post is for you !

I run all SQL scripts against Amazon IMDb database. You may install SQL sever and SSMS on your local device, execute the attached to create IMDb database, associated tables and insert data into the tables. The data is a very small subset of the actual database, but good enough to understand the concepts.

Relationship between the tables

In this post, we start with the fundamentals of SQL, and build up complexity as we progress.

SELECT, WHERE, DISTINCT, AND, OR, NOT

We shall learn the above statements executing SQLs against the actors table in IMDb Database

IN, BETWEEN, ORDER BY, MAX, MIN, AVERAGE, SUM, COUNT

JOINS — Inner Join, Left Join, Right Join, Outer Join

To retrieve data from multiple tables in a single SELECT statement, we JOIN the tables together using a common key. In this post, we discuss the 4 major types of joins .

For purpose of illustration, I’ve used directors and directors_genres table. The primary key in directors table is “id” which maps as a foreign key “director_id” in the directors_genres table. To understand more about the primary and foreign key relationship, explore the link.

The director table contains the id, first and last name of the director. The director_genres table contains the director_id, genre of the movies directed and probability. The point I’m trying to make is a single director could have directed movies of multiple genres, so there is a one to many relationship between the director (id, name)and genres.

GROUP BY, HAVING, ORDER BY

In the pictorial representation of GROUP BY, I use the aggregate function SUM; whereas in the script, I use COUNT.

SUBQUERY

UNION, UNION ALL

Set like operations using SQL statements. It comes in very handy when data needs to retrieved from two or more databases (say current Database and Back up or historical Database in a single SQL query).

Intersection returns the data common to the SELECT statements, Union eliminating duplicates whilst Union all does not .

CASE Statement

CASE statement is used to implement if-then-else logic in SQL queries.

TEMP Tables

Temporary tables reduce the complexity of a program and help break down a given problem into small logical units. It can be created within a SQL query (or stored procedure) and dropped at the end.

In the below post, we discuss a very easy way of accomplishing a given analytical problem through temporary tables

--

--