Pandas to_sql()

The to_sql() method in Pandas is used to write records stored in a DataFrame to a SQL database. It requires the SQLAlchemy engine to make a connection to the database.

Example

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///mydatabase.db')

# create a DataFrame
data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
df = pd.DataFrame(data)

# write DataFrame to database
df.to_sql(name='people', con=engine)

to_sql() Syntax

The syntax of the to_sql() method in Pandas is:

df.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None, method=None)

to_sql() Arguments

The to_sql() method takes the following common arguments:

  • name: the name of the target table
  • con: engine or database connection object
  • schema (optional): specifies the schema
  • if_exists: how to behave if the table already exists
  • index: write index as a column
  • index_label: column label for index column(s)
  • chunksize: specifies the number of rows in each batch to be written at a time
  • dtype: specifies the datatype for columns
  • method: controls the SQL insertion clause used.

to_sql() Return Value

The return value of to_sql() is None. It does not return anything since its purpose is to write the DataFrame to a database.


Example 1: Write to SQL with Default Settings

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///mydatabase.db')

# create a DataFrame
data = {'Name': ['Tom', 'Nick', 'John'], 'Age': [20, 21, 19]}
df = pd.DataFrame(data)

# write DataFrame to database
df.to_sql(name='people', con=engine)

In this example, we wrote the DataFrame df to the SQL table people using the default settings.


Example 2: Writing to SQL with Replace Option

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///mydatabase.db')

# create a DataFrame
data = {'Name': ['Anna', 'Alex', 'Alice'], 'Age': [28, 22, 23]}
df = pd.DataFrame(data)

# write DataFrame to database
df.to_sql(name='people', con=engine, if_exists='replace')

Here, if the table people already exists, it will be replaced with the new DataFrame df.


Example 3: Specifying Data Types

import pandas as pd
from sqlalchemy import create_engine, Integer, Text

engine = create_engine('sqlite:///mydatabase.db')

# create a DataFrame
data = {'Name': ['Sam', 'Lily', 'Max'], 'Age': [32, 24, 27]}
df = pd.DataFrame(data)

# write DataFrame to database
df.to_sql(name='employees', con=engine, dtype={'Name': Text, 'Age': Integer})

In this example, we specified that the Name column should be stored as Text and the Age as Integer in the SQL table employees.


Example 4: Append to Existing Table

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///mydatabase.db')

# create a DataFrame
new_data = {'Name': ['Brian', 'Smith'], 'Age': [34, 29]}
new_df = pd.DataFrame(new_data)

# write to database
new_df.to_sql(name='people', con=engine, if_exists='append')

In this example, we appended the records in new_df to the people table by using if_exists='append'.


Example 5: Using method Parameter

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine('sqlite:///mydatabase.db')

# create a DataFrame
data = {'Name': ['Sara', 'Alex'], 'Age': [26, 22]}
df = pd.DataFrame(data)

# use the 'multi' method to pass multiple insert values
df.to_sql(name='new_people', con=engine, method='multi')

In this example, we used the 'multi' method to pass multiple insert values in a single INSERT clause.

Using the method='multi' argument can lead to performance benefits when inserting multiple records at once.