Pandas to_datetime()

The to_datetime() method in Pandas converts date-like values, such as date strings in various formats or epoch numbers, into standardized datetime objects.

Example

import pandas as pd
 
# create a Series with date strings in 'YYYYMMDD' format
date_series = pd.Series(['20200101', '20200201', '20200301'])

# convert string dates to datetime objects using pd.to_datetime
converted_dates = pd.to_datetime(date_series, format='%Y%m%d')

print(converted_dates)

'''
Output
0   2020-01-01
1   2020-02-01
2   2020-03-01
dtype: datetime64[ns]
'''

to_datetime() Syntax

The syntax of the to_datetime() method in Pandas is:

pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=False, format=None, exact=True, unit=None)

to_datetime() Arguments

The to_datetime() method takes the following arguments:

  • arg - an object to convert to a datetime
  • errors (optional) - specifies how to handle errors for unparsable dates
  • dayfirst (optional) - if True, parses dates with the day first
  • yearfirst (optional) - if True, parses dates with the year first
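  • utc (optional) - if True, returns timezone-aware datetimes in UTC
  • format (optional) - strftime-style format string used to parse the dates, such as '%Y%m%d'
  • exact (optional) - if True, requires the input to match format exactly
  • unit (optional) - the time unit of arg when it holds epoch times, such as 's' or 'ms'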

to_datetime() Return Value

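The return type of the to_datetime() method depends on the input: a scalar returns a Timestamp, a list or array returns a DatetimeIndex, and a Series returns a Series with a datetime64 dtype.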
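
As a minimal sketch of how the returned object follows the type of the input, the snippet below passes a string, a list, and a Series to pd.to_datetime(). The variable names are illustrative only, and the exact dtype resolution printed for the Series may vary between Pandas versions.

```python
import pandas as pd

# a single string returns a scalar Timestamp
single = pd.to_datetime('2020-01-01')
print(type(single).__name__)

# a list returns a DatetimeIndex
from_list = pd.to_datetime(['2020-01-01', '2020-02-01'])
print(type(from_list).__name__)

# a Series returns a Series with a datetime64 dtype
from_series = pd.to_datetime(pd.Series(['2020-01-01', '2020-02-01']))
print(type(from_series).__name__, from_series.dtype)
```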


Example 1: Convert String Dates to Datetime Objects

import pandas as pd
 
# create a Series with date strings in 'YYYYMMDD' format
date_series = pd.Series(['20201010', '20201111', '20201212'])

# convert string dates to datetime objects using pd.to_datetime
converted_dates = pd.to_datetime(date_series)

print(converted_dates)

Output

0   2020-10-10
1   2020-11-11
2   2020-12-12
dtype: datetime64[ns]

In the above example, we used the pd.to_datetime() method without a format argument, so Pandas inferred the YYYYMMDD layout from the strings and converted them into datetime objects.

The resulting datetime objects are then printed, showing the converted dates.


Example 2: Handle Date Parsing Errors with to_datetime()

import pandas as pd

# create a Series with some valid and some invalid date strings
date_series = pd.Series(['20200101', 'invalid date', '20200301', 'another invalid'])

# use 'coerce' in the errors argument to handle invalid dates
converted_dates = pd.to_datetime(date_series, format='%Y%m%d', errors='coerce')

print(converted_dates)

Output

0   2020-01-01
1          NaT
2   2020-03-01
3          NaT
dtype: datetime64[ns]

In this example, date_series includes both valid dates in the YYYYMMDD format and strings that are not valid dates.

Then we used pd.to_datetime() with errors='coerce'. This ensures that instead of raising an error for the invalid dates, Pandas converts them to NaT (Not a Time), the missing-value marker for datetimes.

The result is a Series where valid dates are correctly parsed, and invalid dates are represented as NaT.
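
After coercing, it is often useful to find out which entries failed. The following is a small sketch of one way to do that with isna() and dropna(); the names failed and valid_dates are chosen here for illustration.

```python
import pandas as pd

date_series = pd.Series(['20200101', 'invalid date', '20200301', 'another invalid'])
converted_dates = pd.to_datetime(date_series, format='%Y%m%d', errors='coerce')

# isna() is True wherever parsing failed and NaT was inserted
failed = converted_dates.isna()

# number of unparsable entries
print(failed.sum())

# the original strings that could not be parsed
print(date_series[failed])

# keep only the successfully parsed dates
valid_dates = converted_dates.dropna()
print(valid_dates)
```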


Example 3: Use of dayfirst and yearfirst in to_datetime()

import pandas as pd

# create a Series with ambiguous date strings
date_series = pd.Series(['01-02-2020', '03-04-2021', '05-06-2022'])

# parse dates with dayfirst=True
dates_dayfirst = pd.to_datetime(date_series, dayfirst=True)

# parse dates with yearfirst=True
dates_yearfirst = pd.to_datetime(date_series, yearfirst=True)

print("Dates with dayfirst=True:\n", dates_dayfirst)
print("\nDates with yearfirst=True:\n", dates_yearfirst)

Output

Dates with dayfirst=True:
0   2020-02-01
1   2021-04-03
2   2022-06-05
dtype: datetime64[ns]

Dates with yearfirst=True:
0   2020-01-02
1   2021-03-04
2   2022-05-06
dtype: datetime64[ns]

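Here, we parsed the same strings twice:

  • once with dayfirst=True, which tells Pandas to interpret the first number as the day, and
  • once with yearfirst=True, which tells Pandas to prefer reading the first number as the year when a string is ambiguous

Because these strings end with an unambiguous four-digit year, yearfirst=True has no visible effect here, and the dates are read in the default month-first order.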
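
When the layout of the strings is known in advance, an explicit format removes the ambiguity altogether. The sketch below assumes the strings are day-month-year and compares the result with the dayfirst=True parse; dates_explicit is an illustrative name.

```python
import pandas as pd

date_series = pd.Series(['01-02-2020', '03-04-2021', '05-06-2022'])

# spell out day-month-year explicitly instead of relying on dayfirst
dates_explicit = pd.to_datetime(date_series, format='%d-%m-%Y')
dates_dayfirst = pd.to_datetime(date_series, dayfirst=True)

print(dates_explicit)

# check whether both approaches produce the same dates
print((dates_explicit == dates_dayfirst).all())
```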

Example 4: Convert datetime to UTC (Coordinated Universal Time)

import pandas as pd

# create a Series with date strings
date_series = pd.Series(['2021-01-01 12:00:00', '2021-06-01 15:30:00', '2021-12-31 23:59:59'])

# convert string dates to UTC datetime objects using pd.to_datetime
converted_dates_utc = pd.to_datetime(date_series, utc=True)

print(converted_dates_utc)

Output

0   2021-01-01 12:00:00+00:00
1   2021-06-01 15:30:00+00:00
2   2021-12-31 23:59:59+00:00
dtype: datetime64[ns, UTC]

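In this example, we used the pd.to_datetime() method to convert the string dates in date_series into datetime objects.

The utc parameter is set to True, so the result is timezone-aware and in UTC. Since the input strings carry no timezone information, Pandas treats them as already being in UTC, which is why the clock times are unchanged and only the +00:00 offset is added.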
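
The difference between inputs with and without a UTC offset can be seen in a short sketch. The +05:30 offset below is an arbitrary value chosen for illustration, and naive and aware are illustrative names.

```python
import pandas as pd

# a string without an offset is treated as already being in UTC
naive = pd.to_datetime(pd.Series(['2021-01-01 12:00:00']), utc=True)
print(naive)

# a string that carries an offset is shifted to the equivalent UTC time
aware = pd.to_datetime(pd.Series(['2021-01-01 12:00:00+05:30']), utc=True)
print(aware)
```

In the second case, 12:00 at an offset of +05:30 corresponds to 06:30 in UTC.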


Example 5: Use of unit Argument in to_datetime()

The unit argument of the to_datetime() method specifies the time unit of the input when converting epoch times.

The common units include D (days), s (seconds), ms (milliseconds), us (microseconds), and ns (nanoseconds).

Let's look at an example.

import pandas as pd

# create a Series with epoch times (in seconds)
epoch_series = pd.Series([1609459200, 1612137600, 1614556800])

# convert the epoch times to datetime objects
# the 'unit' argument is set to 's' for seconds
converted_dates = pd.to_datetime(epoch_series, unit='s')

print(converted_dates)

Output

0   2021-01-01
1   2021-02-01
2   2021-03-01
dtype: datetime64[ns]

Here, epoch_series is the Series containing epoch times. These are Unix timestamps representing the number of seconds since January 1, 1970.

pd.to_datetime() is used with the unit argument set to s to indicate that the input numbers are in seconds.

The method converts these epoch times to standard datetime objects, which are then printed.
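
To illustrate why unit must match the data, the sketch below expresses the same instant (2021-01-01 00:00:00) in seconds, milliseconds, and days. The variable names are illustrative.

```python
import pandas as pd

# the same instant expressed in three different units
from_seconds = pd.to_datetime(1609459200, unit='s')
from_millis = pd.to_datetime(1609459200000, unit='ms')
from_days = pd.to_datetime(18628, unit='D')

print(from_seconds)
print(from_millis)
print(from_days)
```

All three calls refer to the same date. Passing the seconds value with unit='ms' would instead give a date in January 1970, because the number would be read as milliseconds.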


Example 6: to_datetime() With Custom Format

import pandas as pd

# create a dataframe with date strings in custom format
df = pd.DataFrame({'date': ['2021/22/01', '2022/13/01', '2023/30/03']})

# convert the 'date' column to datetime with custom format
df['date'] = pd.to_datetime(df['date'], format='%Y/%d/%m')

print(df)

Output

        date
0 2021-01-22
1 2022-01-13
2 2023-03-30

In this example, we converted the date column from strings in YYYY/DD/MM format to the datetime data type by passing the matching format string '%Y/%d/%m'.
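
The format string has to match the actual layout of the data. As a rough sketch of what happens otherwise, the snippet below first tries a year/month/day format on the same strings, which fails because 22 is not a valid month, and then parses them with the matching format. The names dates, parsed_ok, and parsed are illustrative.

```python
import pandas as pd

dates = pd.Series(['2021/22/01', '2022/13/01', '2023/30/03'])

# '%Y/%m/%d' reads the middle field as the month, so '22' is rejected
try:
    pd.to_datetime(dates, format='%Y/%m/%d')
    parsed_ok = True
except ValueError as err:
    parsed_ok = False
    print("Parsing failed:", err)

# the matching format parses every row
parsed = pd.to_datetime(dates, format='%Y/%d/%m')
print(parsed)
```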