Writing to a SQLite DB from Pandas

01 Mar 2020

This note demonstrates writing from a Pandas dataframe to a SQLite database.

import pandas as pd
import sqlite3

Read CSV Data into a DataFrame

f = (pd.read_csv('writing-to-sqlite-db-from-pandas/fruit_data_with_colors.txt',
                 sep='\t')
     .rename(columns={'fruit_name':'fruit'}))

f.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 59 entries, 0 to 58
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   fruit_label    59 non-null     int64  
 1   fruit          59 non-null     object 
 2   fruit_subtype  59 non-null     object 
 3   mass           59 non-null     int64  
 4   width          59 non-null     float64
 5   height         59 non-null     float64
 6   color_score    59 non-null     float64
dtypes: float64(3), int64(2), object(2)
memory usage: 3.4+ KB

f.head()

	fruit_label	fruit	fruit_subtype	mass	width	height	color_score
0	1	apple	granny_smith	192	8.4	7.3	0.55
1	1	apple	granny_smith	180	8.0	6.8	0.59
2	1	apple	granny_smith	176	7.4	7.2	0.60
3	2	mandarin	mandarin	86	6.2	4.7	0.80
4	2	mandarin	mandarin	84	6.0	4.6	0.79

Create the Database Connection and Write to It

%%bash
cd writing-to-sqlite-db-from-pandas
rm -f fruits.db
ls

fruit_data_with_colors.txt

The following line creates the SQLite DB file if it does not already exist.

conn = sqlite3.connect('writing-to-sqlite-db-from-pandas/fruits.db')

%%bash
cd writing-to-sqlite-db-from-pandas
ls

fruit_data_with_colors.txt
fruits.db

f.to_sql('fruits',             # Name of the sql table
         conn,                 # sqlite.Connection or sqlalchemy.engine.Engine
         if_exists='replace')  # If the table already exists, {‘fail’, ‘replace’, ‘append’}, default ‘fail’

conn.close()   # Closing the connection is best practice. This prevents the database from being locked.

Connect to the Database and Read from It

sql_string = '''SELECT * 
                FROM fruits'''
conn = sqlite3.connect('writing-to-sqlite-db-from-pandas/fruits.db')

f_out = pd.read_sql(sql_string,
                    conn,
                    index_col='index')
conn.close()

f_out.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 59 entries, 0 to 58
Data columns (total 7 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   fruit_label    59 non-null     int64  
 1   fruit          59 non-null     object 
 2   fruit_subtype  59 non-null     object 
 3   mass           59 non-null     int64  
 4   width          59 non-null     float64
 5   height         59 non-null     float64
 6   color_score    59 non-null     float64
dtypes: float64(3), int64(2), object(2)
memory usage: 3.7+ KB

f_out.head()

	fruit_label	fruit	fruit_subtype	mass	width	height	color_score
index
0	1	apple	granny_smith	192	8.4	7.3	0.55
1	1	apple	granny_smith	180	8.0	6.8	0.59
2	1	apple	granny_smith	176	7.4	7.2	0.60
3	2	mandarin	mandarin	86	6.2	4.7	0.80
4	2	mandarin	mandarin	84	6.0	4.6	0.79

%%bash
cd writing-to-sqlite-db-from-pandas
sqlite3 fruits.db
.tables

fruits

%%bash
cd writing-to-sqlite-db-from-pandas
sqlite3 fruits.db
.schema fruits

CREATE TABLE IF NOT EXISTS "fruits" (
"index" INTEGER,
  "fruit_label" INTEGER,
  "fruit" TEXT,
  "fruit_subtype" TEXT,
  "mass" INTEGER,
  "width" REAL,
  "height" REAL,
  "color_score" REAL
);
CREATE INDEX "ix_fruits_index"ON "fruits" ("index");