Skip to main content

Intro to Relational Databases and SQL

Learning Objectives

Students will be able to:
Describe the use case of databases
Describe the anatomy of a relational database
Describe the use case of SQL
Use the psql Interactive Terminal
Use SQL to create a database and a table
Use SQL to perform CRUD data operations
Use a SQL JOIN clause to combine data from multiple tables

Road Map

  1. Intro to SQL/Relational Databases
  2. Anatomy of a Relational Database
  3. Structured Query Language (SQL)
  4. Essential Questions
  5. Further Study

1. Intro to SQL/Relational Databases

What is a Database?

The vast majority of applications manipulate and display data.

Early on, our programs held data in "memory" using data structures such as arrays and objects. However, when the app was exited, any changes to the data were lost - databases avoid this...

Simply put, a database provides permanent storage for data.

Different Types of Databases

This site ranks databases according to their popularity.

As you can see, there are several different types of databases and several "brands" within each type.

Most database technologies can be broken down into two main categories:

  • Relational databases
  • NoSQL databases

Relational databases are by far the most popular type of database technology. Conceived by E.F. Codd while working for IBM back in 1970.

In SEI, we'll be using PostgreSQL since it's arguably the best open-source relational database management system (RDBMS) available.

2. Anatomy of a Relational Database

Schema

The structure of a particular database is defined by its schema.

Schemas define the database's:

  • Tables, including the number and data type of each column
  • Indexes for efficient access of data
  • Constraints (rules, such as whether a field can be null or not)

Tables

The primary container for data in a relational database is a table:

As you can see, database tables look a lot like a spreadsheet since they consist of columns and rows.

A single table in a relational database holds data for a particular data resource, for example, customers, orders, reviews, etc.

TABLE: artists

id (PK)namenationality
1PrinceAmerican
2Sir Elton JohnBritish

TABLE: songs

id (PK)nameyear_releasedartist_id (FK)
1Tiny Dancer19712
2Little Red Corvette19821
3Raspberry Beret19851
4Your Song19702

Rows (Records)

A row in a table represents a single instance of the data entity.

For example a particular artist in the artists table.

Columns (Fields)

The columns of a table have a:

  • Name
  • Data type (all data in a column must be of the same type)
  • Optional constraints

The typical naming convention is usually snake_cased and singular.

PostgreSQL has many data types for columns, but common ones include:

  • integer
  • decimal
  • varchar (variable-length strings)
  • text (same as varchar)
  • date (does not include time)
  • timestamp (both date and time)
  • boolean

Common constraints for a column include:

  • PRIMARY KEY: column, or group of columns, uniquely identify a row
  • REFERENCES (Foreign Key): value in column must match the primary key in another table
  • NOT NULL: column must have a value, it cannot be empty (null)
  • UNIQUE: data in this column must be unique among all rows in the table

Primary Keys (PK) and Foreign Keys (FK)

The field (or fields) that uniquely identify each row in table are know known as that table's primary key (PK).

Since only one type of data entity can be held in a single table, related data, for example, the songs for an artist, are stored in separate tables and "linked" via what is known as a foreign key (FK). Note that foreign key fields hold the value of its related parent's PK.

❓ Database Review Questions (1 min)

(1) A __________ defines the structure of a particular database.


schema


(2) A table in a database consists of and.


rows and columns


(3) A __________ key uniquely identifies a row within a table.


primary key


(4) A __________ key references the primary key of a related table.


foreign key


3. Structured Query Language (SQL)

What is SQL?

SQL (Structured Query Language), also pronounced "sequel", is a programming language used to CRUD data stored in a relational database.

SQL syntax is similar to the English language.

Although SQL is fairly standard, it can vary from depending on the particular RDBMS (Relational Database Management System). For example, the SQLite RDBMS implements fewer SQL commands than that of PostgreSQL.

The psql Interactive Terminal

There are several GUI tools available for working with PostgreSQL, however, in SEI we won't need one because we'll be primarily accessing the database using code (Python/Django) and we'll be using Terminal today to learn a touch of SQL commands instead of using a GUI.

psql is a tool that runs in terminal and allows us to work with PostgreSQL databases by typing in commands. It was installed with PostgreSQL.

Open a terminal session and type: psql.

You'll see your PostgreSQL version and psql's prompt:

$ psql
psql (14.5)
Type "help" for help.

jimclark=#

Here are some useful commands (note the use of a backslash):

help -- general help
\? -- help with psql commands
\h -- help with SQL commands
\l -- Lists all databases
\c -- Connect to a database
\d -- List tables in database
\q -- exits psql
q -- exits a psql list or dialogue

Creating a Database and a Table

Let's create a database named music and a bands table:

CREATE DATABASE music; -- Don't forget the semicolon!

\l -- What changed?

\c music -- Connect to the music database

\d -- Lists all tables

-- Define a table's schema
CREATE TABLE bands (
id serial PRIMARY KEY, -- serial is auto-incrementing integer
name varchar NOT NULL,
genre varchar
);

\d -- There should now be a bands table

The backslash commands, e.g. \d, are psql commands.

The CREATE DATABASE and CREATE TABLE are SQL commands.

Basic Querying and Inserting Data

Now let's write some more SQL to query (SELECT) and create data (INSERT INTO):

SELECT * FROM bands; -- The * represents all fields

-- For text, use single quotes, not double
INSERT INTO bands (name) VALUES ('The Smiths');

INSERT INTO bands (name, genre) VALUES ('Rush', 'prog rock');

SELECT * FROM bands; -- Use the up arrow to access previous commands
tip

👀 Comments in SQL begin with --

Because the id column is defined as serial, the database automatically assigns the next available integer.

Let's say we have the following data relationship: Band ---< Musician

A Band has many Musicians, and
a Musician belongs to a Band

Whenever you have a one:many relationship like above, the rows in the table for the many-side must include a column that references which row in the table on the one-side it belongs to.

This column is known as a foreign key (FK) and it must be of the same data type as the primary key in the related/parent table (typically an integer).

Now let's define the musicians table:

-- REFERENCES creates a FK constraint
CREATE TABLE musicians (
id serial PRIMARY KEY,
name varchar NOT NULL,
quote text,
band_id integer NOT NULL REFERENCES bands (id)
);

\d musicians -- details for table

The REFERENCES constraint is what makes a column a FK.

Now let's attempt to add a musician with a bogus foreign key:

INSERT INTO musicians (name, band_id) VALUES ('Geddy Lee', 999);
--- The above command will fail because there's no matching PK in the bands table

-- Let's try again, but first, let's verify the ids of the bands
SELECT * FROM bands;

-- Assuming 'Rush' has an id of 2
INSERT INTO musicians (name, band_id) VALUES ('Geddy Lee', 2);

SELECT * FROM musicians; -- There's Geddy!

-- Now let's add Neil
-- Use two single quotes to embed an apostrophe
INSERT INTO musicians (name, quote, band_id)
VALUES (
'Neil Peart',
'If you''ve got a problem, take it out on a drum',
2);
note

👀 It's possible to insert multiple rows by providing comma separated value lists:
...VALUES ('Geddy Lee', 2), ('Neil Peart', 2);

Querying Data using a JOIN Clause

The JOIN clause is used with a SELECT to query for data from more than one table.

Let's query for all of the bands with their musicians:

-- table right of JOIN has the FKs
SELECT * FROM bands JOIN musicians ON bands.id = musicians.band_id;

Note that no records are returned for bands without any musicians. This is called an INNER JOIN, which is the default.

There are several types of joins.

If we want to return all bands, regardless of whether or not there's any matches for musicians, we use whats called a LEFT JOIN:

-- Using aliases for the table names
SELECT *
FROM bands b
LEFT JOIN musicians m ON b.id = m.band_id;

Querying Data using a WHERE Clause

The WHERE clause allows selecting records that meet a condition or conditions:

SELECT *
FROM bands b
LEFT JOIN musicians m ON b.id = m.band_id
WHERE b.name = 'Rush' AND m.name LIKE 'G%';

The LIKE operator uses:

  • % to match any number of characters (wildcard)
  • _ to match any single character

Updating Data

Time to give Geddy a quote by using the SQL UPDATE command:

UPDATE musicians
SET quote = 'I love to write, it''s my first love.'
WHERE name = 'Geddy Lee';

Deleting Data

Be careful with this command because if you don't use a WHERE clause, you can accidentally delete all of the data from a table:

SELECT * FROM bands;

DELETE FROM bands WHERE name LIKE '%Smiths';

SELECT * FROM bands;

SQL - Summary

As much fun as it is to write SQL, most developers don't have many opportunities to do so because they typically software known as an Object Relational Mapper (ORM) to automatically write SQL and communicate with the database server.

Regardless, understanding enough SQL to put it on your resume is what's important!

For additional practice after the lab, check out this interactive site: PG Exercises

4. ❓ Essential Questions (1 min)

(1) A relational database contains a _________ for each data entity/resource that an application has.


table


(2) True or False: In a relational database, all of the data in a particular column in a table must be of the same data type.


True


(3) A single instance of a data entity, e.g., Band, is represented by a ______ in a table.


row


(4) _____ is the "programming" language used by relational databases.


Structured Query Language (SQL)


5. Further Study

If you'd like to go deeper into SQL, look into:

Additional Practice

References