In the last lesson, we learned how to use the SQL SELECT statement to return rows from a table within a database.
In this lesson, we will learn how to improve the functionality of the SQL SELECT statement by using the DISTINCT keyword.
How To Write a SQL SELECT DISTINCT Statement
There are many cases in which you will encounter a table that contains a column with duplicate values. When those duplicates serve no purpose for your query, it can be helpful to remove them and return only the unique values from the column.
This is where the DISTINCT keyword can be useful. In SQL, we can use the DISTINCT keyword to return only the distinct values from a column.
The syntax for the DISTINCT keyword looks like this:
SELECT DISTINCT column_name FROM table_name;
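To make the effect concrete, here is a minimal sketch that compares a plain SELECT with SELECT DISTINCT. The demo_orders table and its rows are invented for illustration and are not part of the DVD rental database. The ORDER BY clause is included only so that the output order is predictable.

```sql
-- Hypothetical demo table; it is not part of the DVD rental database.
CREATE TEMPORARY TABLE demo_orders (
    order_id INTEGER,
    city     VARCHAR(20)
);

INSERT INTO demo_orders (order_id, city) VALUES
    (1, 'Austin'),
    (2, 'Boston'),
    (3, 'Austin'),
    (4, 'Chicago'),
    (5, 'Boston');

-- Without DISTINCT: one row per order, so cities repeat (5 rows).
SELECT city FROM demo_orders;

-- With DISTINCT: each city appears only once (3 rows).
SELECT DISTINCT city FROM demo_orders ORDER BY city;
-- Austin
-- Boston
-- Chicago
```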
Using the SQL SELECT DISTINCT Statement With Multiple Columns
As we saw in our last lesson, you can return multiple columns from a database table by listing them separated by commas, like this:
SELECT column_name1, column_name2 FROM table_name;
When we add the DISTINCT keyword to this query, it looks like this:
SELECT DISTINCT column_name1, column_name2 FROM table_name;
This raises a great question: will this return the unique values for column_name1, or will it return the unique values for column_name2?
Interestingly, it is neither! When you use the DISTINCT keyword with multiple columns, it only drops rows where ALL of the selected column values are the same as those of another row. Said differently, in order for a row to be removed in the example above, it must have the same values in both column_name1 and column_name2 as a different row in the table.
For example, if three rows hold the pairs (1, 'A'), (1, 'A'), and (1, 'B'), the query returns two rows: (1, 'A') and (1, 'B').
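The multi-column behavior can also be checked with a small throwaway table. This is only a sketch: the demo_pairs table and its rows are hypothetical, and the ORDER BY clause is there purely to make the output order predictable.

```sql
-- Hypothetical demo table used only to illustrate multi-column DISTINCT.
CREATE TEMPORARY TABLE demo_pairs (
    column_name1 INTEGER,
    column_name2 VARCHAR(10)
);

INSERT INTO demo_pairs (column_name1, column_name2) VALUES
    (1, 'A'),
    (1, 'A'),   -- exact duplicate of the first row: dropped by DISTINCT
    (1, 'B'),   -- same column_name1, different column_name2: kept
    (2, 'A');   -- same column_name2, different column_name1: kept

SELECT DISTINCT column_name1, column_name2
FROM demo_pairs
ORDER BY column_name1, column_name2;
-- 1 | A
-- 1 | B
-- 2 | A
```

Notice that the value 1 still appears twice in column_name1 and the value A still appears twice in column_name2. Only the combination of the two columns is guaranteed to be unique.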
A Few Examples Of The SELECT DISTINCT Statement From The DVD Rental Database
To conclude this lesson, let's work through a few examples of how we might use the SELECT DISTINCT statement in our DVD rental database.
The table named payment is an excellent candidate for this, since it contains a list of all payments recorded by the DVD Rental store.
To start, let's run the following query to see all of the columns available in the payment table:
SELECT * FROM payment;
There are a few columns that make nice candidates for using a SELECT DISTINCT statement:
customer_id: to see which customers have made a payment at the store
staff_id: to see which staff members have processed a payment
rental_id: to see which rentals have a payment recorded against them (note that rental_id identifies a rental transaction, not a film title)
Here's how you would perform each of the queries from the bullet points above:
SELECT DISTINCT customer_id FROM payment;
SELECT DISTINCT staff_id FROM payment;
SELECT DISTINCT rental_id FROM payment;
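If you do not have the DVD rental database at hand, the three queries can be tried against a small stand-in. The sketch below assumes a simplified demo_payment table whose column names mirror the ones discussed above. The table name and every value in it are invented for illustration, and ORDER BY is added only for predictable output.

```sql
-- Hypothetical, simplified stand-in for the payment table.
CREATE TEMPORARY TABLE demo_payment (
    payment_id  INTEGER,
    customer_id INTEGER,
    staff_id    INTEGER,
    rental_id   INTEGER,
    amount      NUMERIC(5, 2)
);

INSERT INTO demo_payment (payment_id, customer_id, staff_id, rental_id, amount) VALUES
    (1, 341, 2, 1520, 7.99),
    (2, 341, 1, 1778, 1.99),
    (3, 342, 2, 2190, 5.99),
    (4, 343, 2, 1547, 4.99),
    (5, 341, 1, 1520, 0.99);

-- Five payments, but only three distinct customers.
SELECT DISTINCT customer_id FROM demo_payment ORDER BY customer_id;
-- 341
-- 342
-- 343

-- Only two staff members processed these payments.
SELECT DISTINCT staff_id FROM demo_payment ORDER BY staff_id;
-- 1
-- 2

-- Rental 1520 has two payments here but is listed once.
SELECT DISTINCT rental_id FROM demo_payment ORDER BY rental_id;
-- 1520
-- 1547
-- 1778
-- 2190
```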
Final Thoughts
In this lesson, you learned how to pair the DISTINCT keyword with the SQL SELECT statement to return the distinct values from a column, or the distinct combinations of values from several columns, in a database table. We'll learn how to use the COUNT function next, which can be nicely paired with the DISTINCT keyword to count the number of distinct entries in a database table's column.
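As a brief preview of that pairing, the sketch below counts distinct values in a single column. The demo_counts table and its rows are hypothetical, and the details of COUNT are left for the next lesson.

```sql
-- Hypothetical demo table; the values are invented for illustration.
CREATE TEMPORARY TABLE demo_counts (
    payment_id  INTEGER,
    customer_id INTEGER
);

INSERT INTO demo_counts (payment_id, customer_id) VALUES
    (1, 341),
    (2, 341),
    (3, 342),
    (4, 343);

-- Four payments were made by three distinct customers.
SELECT COUNT(DISTINCT customer_id) FROM demo_counts;
-- 3
```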