JOIN

So far, we have only queried single tables. Things get more interesting when there are multiple related tables. Why would tables have relationships? Let's look at a use case.

Suppose we are working on a social media website called "Tweeter", where people can write short excerpts of texts called "tweets". The users are in a table called tweeters. The tweets are in tweets.


tweeters
column_name type
id
INTEGER
username
VARCHAR
email
VARCHAR
registered_at
TIMESTAMPTZ

tweets
column_name type reference
id
INTEGER
NULL
body
TEXT
NULL
user_id
INTEGER
tweeters.id
created_at
TIMESTAMPTZ
NULL

The tweets are always written by a user (a tweeter). There is a relationship between the tables because each tweet is owned by exactly one user. That information is stored in the user_id column. This is equal to the id of the user who wrote the tweet. In other words, tweets.user_id references tweeters.id. We call this relationship a foreign key. So, the user_id column is a foreign key to tweeters.id.

Visually, engineers often like to depict this relationship with something like this:

Table Name: tweeters column_name type id INTEGER username VARCHAR email VARCHAR registered_at TIMESTAMPTZ Table Name: tweets column_name type id INTEGER body TEXT user_id INTEGER created_at TIMESTAMPTZ

Table Name: tweeters column_name type id INTEGER username VARCHAR email VARCHAR registered_at TIMESTAMPTZ Table Name: tweets column_name type id INTEGER body TEXT user_id INTEGER created_at TIMESTAMPTZ

This shows the actual relationship very clearly. The forked tip at the user_id column is meant to show that a user can write many tweets, i.e. the relationship is "one to many".

Getting Tweets From a Tweeter

Suppose we need to get the tweets for username judy.blick60. It's no problem to filter the tweeters by WHERE username = 'judy.blick60'. However, the tweets table doesn't have username, but the tweeter ID. We don't know what ID we would filter on. So, we can 2 do separate queries, and use the ID we retrieve from the first query.

SELECT id
FROM tweeters
WHERE username = 'judy.blick60';
id
INTEGER
11

We see that her ID is 11, then we put that into our next query…

SELECT body
FROM tweets
WHERE user_id = 11;
body
TEXT
If we program the program, we can get to the THX application through the solid state USB bus!
I'll generate the optical IB bandwidth, that should application the AGP program!
generating the system won't do anything, we need to quantify the optical SQL firewall!

This works, but we had to run 2 queries to get our data. This isn't always the most efficient. There is better way.

Performing a JOIN

SQL gives us the ability to get data from multiple related tables at once. How? The idea is to combine 2 (or more) tables side-by-side using a JOIN clause. Here is what it looks like in code:

SELECT *
FROM tweeters
JOIN tweets
  ON tweeters.id = tweets.user_id
ORDER BY username;
id
INTEGER
username
VARCHAR
email
VARCHAR
registered_at
TIMESTAMPTZ
id
INTEGER
body
TEXT
user_id
INTEGER
created_at
TIMESTAMPTZ
5
annabelle74
annabelle_erdman83@hotmail.com
2016-07-27 14:03:26.745+00
16
If we input the monitor, we can get to the IB transmitter through the primary HDD transmitter!
5
2018-07-26 05:28:38.075+00
6
berry.yost15
berry_yost19@gmail.com
2018-04-13 13:24:45.756+00
18
We need to connect the cross-platform SMTP application!
6
2018-12-03 13:31:28.755+00
3
brisa_jenkins39
brisa_jenkins@hotmail.com
2014-06-12 05:12:17.099+00
17
The SCSI protocol is down, index the multi-byte sensor so we can bypass the AI microchip!
3
2018-08-17 10:39:47.727+00
3
brisa_jenkins39
brisa_jenkins@hotmail.com
2014-06-12 05:12:17.099+00
4
If we override the monitor, we can get to the CSS capacitor through the wireless XSS array!
3
2016-06-21 16:19:17.977+00
3
brisa_jenkins39
brisa_jenkins@hotmail.com
2014-06-12 05:12:17.099+00
22
You can't hack the bandwidth without quantifying the auxiliary SQL firewall!
3
2019-12-03 07:27:54.773+00
4
dax1
dax.bode@gmail.com
2018-07-17 11:50:09.979+00
28
Try to input the SQL port, maybe it will back up the solid state bandwidth!
4
2021-05-01 22:18:55.039+00
1
deonte_murray45
deonte_murray57@hotmail.com
2021-08-02 08:54:42.061+00
36
You can't input the circuit without hacking the back-end CSS matrix!
1
2021-11-13 01:05:58.795+00
1
deonte_murray45
deonte_murray57@hotmail.com
2021-08-02 08:54:42.061+00
34
parsing the monitor won't do anything, we need to index the primary SMS transmitter!
1
2021-11-10 01:39:21.523+00
1
deonte_murray45
deonte_murray57@hotmail.com
2021-08-02 08:54:42.061+00
37
generating the protocol won't do anything, we need to back up the virtual USB hard drive!
1
2021-12-20 11:41:28.716+00
13
elinore56
elinore11@hotmail.com
2015-12-05 12:43:36.332+00
10
compressing the system won't do anything, we need to index the optical SMTP alarm!
13
2017-05-24 13:28:46.409+00
13
elinore56
elinore11@hotmail.com
2015-12-05 12:43:36.332+00
20
Try to bypass the SAS feed, maybe it will bypass the 1080p application!
13
2019-02-13 21:39:53.676+00
13
elinore56
elinore11@hotmail.com
2015-12-05 12:43:36.332+00
2
Try to transmit the SMTP hard drive, maybe it will input the back-end card!
13
2016-05-23 05:03:29.62+00
13
elinore56
elinore11@hotmail.com
2015-12-05 12:43:36.332+00
9
I'll copy the auxiliary PCI alarm, that should card the GB firewall!
13
2017-01-26 05:13:38.387+00
9
enrico.durgan1
enrico45@gmail.com
2015-09-02 18:36:33.47+00
5
We need to hack the bluetooth GB port!
9
2016-07-09 13:23:46.107+00
9
enrico.durgan1
enrico45@gmail.com
2015-09-02 18:36:33.47+00
7
Use the bluetooth XML capacitor, then you can bypass the bluetooth bandwidth!
9
2016-08-13 10:53:39.711+00
9
enrico.durgan1
enrico45@gmail.com
2015-09-02 18:36:33.47+00
35
I'll back up the mobile AI circuit, that should array the EXE hard drive!
9
2021-11-10 23:56:02.234+00
9
enrico.durgan1
enrico45@gmail.com
2015-09-02 18:36:33.47+00
6
Use the optical AGP program, then you can connect the back-end transmitter!
9
2016-07-16 14:24:41.202+00
7
ethan67
ethan_kuhn@gmail.com
2019-05-31 06:42:38.464+00
23
Try to program the SMS interface, maybe it will generate the cross-platform matrix!
7
2020-07-04 21:40:52.919+00
7
ethan67
ethan_kuhn@gmail.com
2019-05-31 06:42:38.464+00
25
Use the solid state SMTP protocol, then you can index the cross-platform monitor!
7
2021-01-24 12:23:05.948+00
2
hilda.grant53
hilda_grant48@hotmail.com
2016-04-24 13:52:43.58+00
12
programming the pixel won't do anything, we need to copy the cross-platform TCP monitor!
2
2018-01-01 07:01:16.356+00
2
hilda.grant53
hilda_grant48@hotmail.com
2016-04-24 13:52:43.58+00
29
You can't copy the array without indexing the primary SMS card!
2
2021-05-14 09:26:34.859+00
2
hilda.grant53
hilda_grant48@hotmail.com
2016-04-24 13:52:43.58+00
11
The PNG card is down, parse the virtual alarm so we can index the SQL capacitor!
2
2017-08-22 16:26:06.154+00
2
hilda.grant53
hilda_grant48@hotmail.com
2016-04-24 13:52:43.58+00
3
You can't bypass the driver without bypassing the virtual HTTP hard drive!
2
2016-06-11 01:12:38.474+00
11
judy.blick60
judy14@yahoo.com
2014-02-11 02:53:02.395+00
1
If we program the program, we can get to the THX application through the solid state USB bus!
11
2016-02-25 22:36:30.812+00
11
judy.blick60
judy14@yahoo.com
2014-02-11 02:53:02.395+00
8
I'll generate the optical IB bandwidth, that should application the AGP program!
11
2016-09-09 10:18:32.523+00
11
judy.blick60
judy14@yahoo.com
2014-02-11 02:53:02.395+00
30
generating the system won't do anything, we need to quantify the optical SQL firewall!
11
2021-06-16 12:42:05.054+00
10
madge.wehner
madge_wehner@yahoo.com
2021-02-15 14:22:34.017+00
26
If we override the array, we can get to the SMS panel through the online RAM interface!
10
2021-03-15 23:07:23.918+00
10
madge.wehner
madge_wehner@yahoo.com
2021-02-15 14:22:34.017+00
32
I'll back up the digital SMS sensor, that should port the XSS firewall!
10
2021-09-15 09:22:09.36+00
10
madge.wehner
madge_wehner@yahoo.com
2021-02-15 14:22:34.017+00
33
We need to back up the back-end AGP pixel!
10
2021-09-23 20:10:58.765+00
10
madge.wehner
madge_wehner@yahoo.com
2021-02-15 14:22:34.017+00
27
You can't index the driver without connecting the wireless FTP application!
10
2021-04-26 20:48:05.334+00
14
manuel_torp
manuel79@gmail.com
2015-05-16 11:41:28.533+00
13
Use the back-end PNG port, then you can index the cross-platform port!
14
2018-02-21 07:24:38.871+00
14
manuel_torp
manuel79@gmail.com
2015-05-16 11:41:28.533+00
15
The COM program is down, synthesize the optical alarm so we can calculate the ADP bus!
14
2018-05-30 20:35:11.712+00
12
marjorie.thompson
marjorie_thompson99@gmail.com
2015-03-29 01:28:47.113+00
19
If we connect the interface, we can get to the EXE capacitor through the bluetooth JBOD pixel!
12
2018-12-07 16:41:13.131+00
8
trevion_schuppe15
trevion_schuppe83@gmail.com
2017-07-21 21:15:58.491+00
31
programming the port won't do anything, we need to hack the haptic SSL driver!
8
2021-07-07 16:26:38.927+00
8
trevion_schuppe15
trevion_schuppe83@gmail.com
2017-07-21 21:15:58.491+00
24
backing up the capacitor won't do anything, we need to navigate the haptic PCI array!
8
2020-12-14 17:53:08.401+00
8
trevion_schuppe15
trevion_schuppe83@gmail.com
2017-07-21 21:15:58.491+00
21
We need to navigate the wireless SQL panel!
8
2019-11-01 00:03:20.609+00
8
trevion_schuppe15
trevion_schuppe83@gmail.com
2017-07-21 21:15:58.491+00
14
You can't synthesize the alarm without backing up the optical SSL application!
8
2018-04-24 21:18:00.194+00

Notice that the result's columns are a fusion of both tables' columns. Here's what it look like visually.

... ... ... ... ... ... col_2 ... ... ... ... ... ... col_3 1 2 2 2 3 3 user_id Table Name: tweets ... ... ... ... ... ... col_2 1 2 2 2 3 3 id ... ... ... ... ... ... col_1 ... ... ... ... ... ... col_3 1 2 2 2 3 3 user_id Table Name: tweeters id 1 2 3 ... ... ... col_1

The rows from tweeters try to match up with any rows from tweets. When the rows from the tables match, they combine to form 1 combined row. Those combined rows form the output table. However, notice that 1 row on the left matches multiple rows on the right. What happens in this case?

Whenever a single row matches multiple other rows, its contents are duplicated in the result. So, in the result, a user having 3 tweets will have their user information (email, username, etc.) appear 3 times.

Why is that useful? Well now these new combined rows represent tweets and they have the tweet information and the user information. We can therefore select the tweet information and filter on user information, like username, in a single query!

Let's have a look at the code to make this possible.

You specify the left-hand table in FROM, and then the right-hand table in JOIN. You also need to specify how the rows match up. How do we know which rows in tweets match which rows in tweeters? We know from the ON clause after the JOIN. Rows will match when the expression in the ON clause is true. In our case, when the user_id column in tweets equals the id column in tweeters, i.e. tweeters.id = tweets.user_id.

Notice the notation table_name.column_name, which is new. This is just a more specific way to refer to a column. It wasn't necessary when only one table was involved, but it is now. Watch what happens if we refer to the id column without being specific…

SELECT id
FROM tweeters
JOIN tweets
  ON tweeters.id = tweets.user_id;
error: column reference "id" is ambiguous

Just selecting id is ambiguous because both tables have an id column. Do we mean the tweeter ID or the tweet ID? We can prefix our column names with their respective table names to disambiguate. Let's also add the WHERE clause to filter to the tweets we want.

SELECT
  tweets.id AS tweet_id,
  tweets.body,
  tweets.created_at
FROM tweeters
JOIN tweets
  ON tweeters.id = tweets.user_id
WHERE tweeters.email = 'judy14@yahoo.com';
tweet_id
INTEGER
body
TEXT
created_at
TIMESTAMPTZ
1
If we program the program, we can get to the THX application through the solid state USB bus!
2016-02-25 22:36:30.812+00
8
I'll generate the optical IB bandwidth, that should application the AGP program!
2016-09-09 10:18:32.523+00
30
generating the system won't do anything, we need to quantify the optical SQL firewall!
2021-06-16 12:42:05.054+00

Notice how we can safely request tweets.id without error. Furthermore, we alias it to tweet_id instead. Otherwise, the name in the result would just be id.

Also notice that I use tweeters.email and tweets.body instead of just email and body. Because these columns are unambiguous (only occur in one of the tables involved), prefixing with the table name is optional. However, people reading your query may not know that. It is better to be specific and add the table name unless it is very obvious which table a column belongs to.

Table Aliases

Much like aliases can be assigned to columns in the SELECT clause, we can assign aliases to tables in the FROM and JOIN clauses.

SELECT
  t.id AS tweet_id,
  t.body,
  t.created_at
FROM tweeters u
JOIN tweets t
  ON u.id = t.user_id
WHERE u.email = 'judy14@yahoo.com';
tweet_id
INTEGER
body
TEXT
created_at
TIMESTAMPTZ
1
If we program the program, we can get to the THX application through the solid state USB bus!
2016-02-25 22:36:30.812+00
8
I'll generate the optical IB bandwidth, that should application the AGP program!
2016-09-09 10:18:32.523+00
30
generating the system won't do anything, we need to quantify the optical SQL firewall!
2021-06-16 12:42:05.054+00

When table aliases are used, the column references need to match the table aliases, not the original table name.

Table aliases never end up in the result. They are only used to disambiguate between columns or between multiple instances of the same table (we'll see exapmles of that later).

You also don't use AS before the alias, at least according to the SQL standard. However, Postgres lets you add AS if you want. I prefer to stick to the standard.

Other JOIN Types

This is just one type of join, called an "inner join". We could've typed INNER JOIN instead of JOIN. Inner joins are the default join type. We'll look at more join types soon. First, let's get some practice with the inner join.

© 2022 Andrew Carlson. All rights reserved.