Writing to a SQLite DB from Pandas
This note demonstrates writing from a Pandas dataframe to a SQLite database.
import pandas as pd
import sqlite3
Read CSV Data into a DataFrame
f = (pd.read_csv('writing-to-sqlite-db-from-pandas/fruit_data_with_colors.txt',
sep='\t')
.rename(columns={'fruit_name':'fruit'}))
f.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 59 entries, 0 to 58
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 fruit_label 59 non-null int64
1 fruit 59 non-null object
2 fruit_subtype 59 non-null object
3 mass 59 non-null int64
4 width 59 non-null float64
5 height 59 non-null float64
6 color_score 59 non-null float64
dtypes: float64(3), int64(2), object(2)
memory usage: 3.4+ KB
f.head()
fruit_label | fruit | fruit_subtype | mass | width | height | color_score | |
---|---|---|---|---|---|---|---|
0 | 1 | apple | granny_smith | 192 | 8.4 | 7.3 | 0.55 |
1 | 1 | apple | granny_smith | 180 | 8.0 | 6.8 | 0.59 |
2 | 1 | apple | granny_smith | 176 | 7.4 | 7.2 | 0.60 |
3 | 2 | mandarin | mandarin | 86 | 6.2 | 4.7 | 0.80 |
4 | 2 | mandarin | mandarin | 84 | 6.0 | 4.6 | 0.79 |
Create the Database Connection and Write to It
%%bash
cd writing-to-sqlite-db-from-pandas
rm -f fruits.db
ls
fruit_data_with_colors.txt
The following line creates the SQLite DB file if it does not already exist.
conn = sqlite3.connect('writing-to-sqlite-db-from-pandas/fruits.db')
%%bash
cd writing-to-sqlite-db-from-pandas
ls
fruit_data_with_colors.txt
fruits.db
f.to_sql('fruits', # Name of the sql table
conn, # sqlite.Connection or sqlalchemy.engine.Engine
if_exists='replace') # If the table already exists, {‘fail’, ‘replace’, ‘append’}, default ‘fail’
conn.close() # Closing the connection is best practice. This prevents the database from being locked.
Connect to the Database and Read from It
sql_string = '''SELECT *
FROM fruits'''
conn = sqlite3.connect('writing-to-sqlite-db-from-pandas/fruits.db')
f_out = pd.read_sql(sql_string,
conn,
index_col='index')
conn.close()
f_out.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 59 entries, 0 to 58
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 fruit_label 59 non-null int64
1 fruit 59 non-null object
2 fruit_subtype 59 non-null object
3 mass 59 non-null int64
4 width 59 non-null float64
5 height 59 non-null float64
6 color_score 59 non-null float64
dtypes: float64(3), int64(2), object(2)
memory usage: 3.7+ KB
f_out.head()
fruit_label | fruit | fruit_subtype | mass | width | height | color_score | |
---|---|---|---|---|---|---|---|
index | |||||||
0 | 1 | apple | granny_smith | 192 | 8.4 | 7.3 | 0.55 |
1 | 1 | apple | granny_smith | 180 | 8.0 | 6.8 | 0.59 |
2 | 1 | apple | granny_smith | 176 | 7.4 | 7.2 | 0.60 |
3 | 2 | mandarin | mandarin | 86 | 6.2 | 4.7 | 0.80 |
4 | 2 | mandarin | mandarin | 84 | 6.0 | 4.6 | 0.79 |
%%bash
cd writing-to-sqlite-db-from-pandas
sqlite3 fruits.db
.tables
fruits
%%bash
cd writing-to-sqlite-db-from-pandas
sqlite3 fruits.db
.schema fruits
CREATE TABLE IF NOT EXISTS "fruits" (
"index" INTEGER,
"fruit_label" INTEGER,
"fruit" TEXT,
"fruit_subtype" TEXT,
"mass" INTEGER,
"width" REAL,
"height" REAL,
"color_score" REAL
);
CREATE INDEX "ix_fruits_index"ON "fruits" ("index");