- SQL was the most in-demand skill in the tech sector in 2016, according to data from Indeed.com.
- Course uses PostgreSQL, one of the most in-demand versions of SQL. Other database systems that will have significant overlap with PostgreSQL include Amazon Redshift, mySQL, Microsoft SQL Server, and Oracle Database.
Course Curriculum Overview
- Install PostgreSQL and PG Admin
- Databases and Tables Basics, SQL Statement Fundamentals, GROUP BY Clause, Assessment Test 1
- JOINs, Advanced SQL Commands, Assessment Test 2
- Create Databases and Tables directly, Assessment Test 3 Bonus: Views, and Using PostgreSQL with Python
Overview of Databases
What are Databases?
- Databases are systems that allow users to store and organize data
- Especially useful when dealing with large amounts of data
Why use Databases?
Spreadsheet Use Cases:
- One-time, quick analysis of a reasonably sized data set
- Useful for untrained people
Advantages of Databases:
- Data integrity - data cannot be readily changed. Can’t just click into a cell and change the data.
- Can scale massively
- Quickly combine different datasets
- Can automate steps to derive the same data, thanks to SQL syntax
Analogies to Spreadsheets
- Tabs in a spreadsheet are analogous to tables in a database
- Spreadsheet Columns/Rows are analogous to Database Columns/Rows
Database Platform Options
|PostgreSQL & MySQL/MariaSQL||Free, widely used on the internet, multi platform|
|MS SQL Server Express||Free, some limitations, compatible with SQL server, Windows only|
|Microsoft Access||Cost, SQL functionality is hidden away|
|SQLite||Free, mainly on command line|
PostgreSQL is used in this course for the reasons in the table above, and because it is easy to install and get up and running.
All of the following use some form of SQL: MySQL, PostgreSQL, Oracle Databases, Microsoft Access, Amazon’s Redshift, Looker, MemSQL, Periscope Data, Hive (related to Hadoop), Google’s BigQuery, Facebook’s Presto
Most of the content of this course are general SQL, and applicable to many of the foregoing.
PgAdmin and PostgreSQL
PgAdmin - graphical interface to connect to a database
PostgreSQL - actual SQL engine
PgAdmin 4 and PostgreSQL 10 utilized for this course
Databases and Tables
pgAdmin is a great tool for creating and restoring databases. It saves a lot of time for those types of tasks.
Creating and Restoring a Database
This section shows how you would use pgAdmin to implement a PostgreSQL database.
Create a New Database
To create a new database within pgAdmin, right-click Databases and select
Provide the database a name, and click
OK. The result is an empty database. There is also a means of creating a database programmatically in SQL (using
CREATE DATABASE db_name).
Opening the database using the explorer, there are no Tables in the empty database.
Restore a Database
To restore a database using pgAdmin, an empty database is required. Right-click the empty database and select
Restore, navigate to the appropriate file (such as dvdrental.tar), then select
Delete a Database
Simply right-click the database name, and select
Restoring a Table Schema
Restoring a Table Schema means restoring the table names and data types only, without any data. An example of a situation requiring restoring a table schema is opening a new location for a business. The new location will, in time, generate the same type of data as existing locations, with the same relationships. But, the database itself should start out empty. This is a fairly common task.
Restoring a table schema is similar to the restoring an entire database. To begin, an empty database is required. As before, navigate to the .tar file. Under the “Restore Options #1” tab, select
Restoring an existing database such that all the data is removed is very similar. Right-click the existing database, click
Restore, navigate to the .tar file. Under the “Restore Options #1” tab, select
Only schema. Under the “Restore Options #2” tab, select
Clean before restore. This will clean away any data before the restoration process begins.