The to_sql()
method in Pandas is used to write records stored in a DataFrame to a SQL database. It requires the SQLAlchemy
engine to make a connection to the database.
Example
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='people', con=engine)
to_sql() Syntax
The syntax of the to_sql()
method in Pandas is:
df.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)
to_sql() Arguments
The to_sql()
method takes the following common arguments:
name
: the name of the target tablecon
: engine or database connection objectschema
(optional): specifies the schemaif_exists
: how to behave if the table already existsindex
: write index as a columnindex_label
: column label for index column(s)chunksize
: specifies the number of rows in each batch to be written at a timedtype
: specifies the datatype for columnsmethod
: controls the SQL insertion clause used.
to_sql() Return Value
The return value of to_sql()
is None
. It does not return anything since its purpose is to write the DataFrame to a database.
Example 1: Write to SQL with Default Settings
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='people', con=engine)
In this example, we wrote the DataFrame df to the SQL table people
using the default settings.
Example 2: Writing to SQL with Replace Option
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Anna', 'Alex', 'Alice'], 'Age': [28, 22, 23]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='people', con=engine, if_exists='replace')
Here, if the table people already exists, it will be replaced with the new DataFrame df.
Example 3: Specifying Data Types
import pandas as pd
from sqlalchemy import create_engine, Integer, Text
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Sam', 'Lily', 'Max'], 'Age': [32, 24, 27]}
df = pd.DataFrame(data)
# write DataFrame to database
df.to_sql(name='employees', con=engine, dtype={'Name': Text, 'Age': Integer})
In this example, we specified that the Name
column should be stored as Text
and the Age
as Integer
in the SQL table employees
.
Example 4: Append to Existing Table
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
new_data = {'Name': ['Brian', 'Smith'], 'Age': [34, 29]}
new_df = pd.DataFrame(new_data)
# write to database
new_df.to_sql(name='people', con=engine, if_exists='append')
In this example, we appended the records in new_df
to the people table by using if_exists='append'
.
Example 5: Using method Parameter
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('sqlite:///mydatabase.db')
# create a DataFrame
data = {'Name': ['Sara', 'Alex'], 'Age': [26, 22]}
df = pd.DataFrame(data)
# use the 'multi' method to pass multiple insert values
df.to_sql(name='new_people', con=engine, method='multi')
In this example, we used the 'multi'
method to pass multiple insert values in a single INSERT
clause.
Using the method='multi'
argument can lead to performance benefits when inserting multiple records at once.