Convert timedelta64[ns] column to seconds in Python Pandas DataFrame

This works properly in pandas 0.14 and later:

In [132]: df[:5]['duration'] / np.timedelta64(1, 's')
Out[132]: 
0    1232
1    1390
2    1495
3     797
4    1132
Name: duration, dtype: float64
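
As a self-contained illustration of the division approach, here is a minimal sketch. The sample DataFrame is made up; only the 'duration' column name comes from the question. Changing the unit of the divisor gives minutes instead of seconds.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the question's 'duration' column.
df = pd.DataFrame({"duration": pd.to_timedelta([1232, 1390, 1495], unit="s")})

# Dividing by a one-unit timedelta yields a float64 Series in that unit.
seconds = df["duration"] / np.timedelta64(1, "s")
minutes = df["duration"] / np.timedelta64(1, "m")

print(seconds.tolist())  # [1232.0, 1390.0, 1495.0]
print(seconds.dtype)     # float64
```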

Here is a workaround for older versions of Pandas/NumPy:

In [131]: df[:5]['duration'].values.view('<i8')//10**9
Out[131]: array([1232, 1390, 1495,  797, 1132], dtype=int64)

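timedelta64 and datetime64 data are stored internally as 8-byte integers (dtype '<i8'); for the [ns] dtypes, each value is a count of nanoseconds. The code above views the timedelta64 values as 8-byte integers and then uses integer division by 10**9 to convert nanoseconds to seconds, which discards any fractional second.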
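
To make the storage detail concrete, here is a small sketch with made-up durations. The explicit cast to timedelta64[ns] is my own precaution, on the assumption that the data might not already be in nanosecond resolution, so that the integer view is guaranteed to count nanoseconds.

```python
import numpy as np
import pandas as pd

# Made-up durations: 20 min 32 s, and 13 min 17.9 s.
durations = pd.Series(pd.to_timedelta(["00:20:32", "00:13:17.900"]))

# Cast to nanosecond resolution explicitly, then reinterpret the bytes as int64.
raw_ns = durations.to_numpy().astype("timedelta64[ns]").view("i8")
print(raw_ns.tolist())  # [1232000000000, 797900000000]

# Floor division drops the fractional second (797.9 s becomes 797).
whole_seconds = raw_ns // 10**9
print(whole_seconds.tolist())  # [1232, 797]
```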

Note that you need NumPy version 1.7 or newer to work with datetime64/timedelta64s.


Use the Series dt accessor to reach the methods and attributes of a datetime or timedelta Series.

>>> s
0   -1 days +23:45:14.304000
1   -1 days +23:46:57.132000
2   -1 days +23:49:25.913000
3   -1 days +23:59:48.913000
4            00:00:00.820000
dtype: timedelta64[ns]
>>>
>>> s.dt.total_seconds()
0   -885.696
1   -782.868
2   -634.087
3    -11.087
4      0.820
dtype: float64
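
One pitfall worth a quick sketch: dt.total_seconds() is not the same as dt.seconds. The latter is only the seconds component (0 to 86399) left over once whole days are removed. The values below are made up, with the first one chosen to mirror the first row above.

```python
import pandas as pd

# -885.696 s and 2 days 10 s, given as integer milliseconds to keep them exact.
s = pd.Series(pd.to_timedelta([-885696, 172810000], unit="ms"))

total = s.dt.total_seconds()   # whole duration as float seconds
days = s.dt.days               # whole-day component
component = s.dt.seconds       # leftover seconds within the day

print(days.tolist())       # [-1, 2]
print(component.tolist())  # [85514, 10]
```

So for anything that can be negative or longer than a day, total_seconds() is the one to reach for.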

pandas also provides Series accessors for string (.str), categorical (.cat), and sparse (.sparse) data.


I just realized this is an old thread, but I'm leaving this here in case wanderers like me, who only click the top 5 search results, end up here.

Make sure that your types are correct.

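  • If you want to convert a datetime to seconds, sum the seconds contributed by its hour, minute, and second components. This only makes sense for a duration within a single day, and the column must be datetime64 (the .dt.hour and .dt.minute attributes are not available on timedelta columns).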

      • hours - hours x 3600 = seconds
      • minutes - minutes x 60 = seconds
      • seconds - seconds

linear_df['duration'].dt.hour*3600 + linear_df['duration'].dt.minute*60 + linear_df['duration'].dt.second
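
Here is a runnable sketch of that component sum on a datetime64 Series. The timestamps are made up, and the cross-check with dt.normalize() is my own addition, not part of the recipe above.

```python
import pandas as pd

# Hypothetical datetime64 values, for illustration only.
times = pd.Series(pd.to_datetime(["2016-12-30 17:47:33", "2016-12-31 09:33:27"]))

# Seconds elapsed since midnight, built from the time-of-day components.
since_midnight = times.dt.hour * 3600 + times.dt.minute * 60 + times.dt.second
print(since_midnight.tolist())  # [64053, 34407]

# Cross-check: subtract midnight of the same day and take total seconds.
check = (times - times.dt.normalize()).dt.total_seconds()
print(check.tolist())  # [64053.0, 34407.0]
```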

  • If you want to convert a timedelta to seconds, use the one below.

linear_df[:5]['duration'].astype('timedelta64[s]')

I got it to work like this:

start_dt and end_dt columns are in this format:

import datetime

linear_df[:5]['start_dt']

0   1970-02-22 21:32:48.000
1   2016-12-30 17:47:33.216
2   2016-12-31 09:33:27.931
3   2016-12-31 09:52:53.486
4   2016-12-31 10:29:44.611
Name: start_dt, dtype: datetime64[ns]

My duration was in timedelta64[ns] format, the result of subtracting the start datetime from the end datetime.

linear_df['duration'] = linear_df['end_dt'] - linear_df['start_dt']

The resulting duration column looks like this:

linear_df[:5]['duration']

0          0 days 00:00:14
1   2 days 17:44:50.558000
2   0 days 15:37:28.418000
3   0 days 18:45:45.727000
4   0 days 19:21:27.159000
Name: duration, dtype: timedelta64[ns]

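With the pandas version I was using, astype('timedelta64[s]') gave me the duration between the two dates as float seconds (whole seconds only, the fractional part is dropped), which is easier to compare or filter afterwards. Note that in pandas 2.0 and later this astype call keeps a timedelta dtype rather than returning floats, so .dt.total_seconds() is the safer choice there.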

linear_df[:5]['duration'].astype('timedelta64[s]')

0        14.0
1    236690.0
2     56248.0
3     67545.0
4     69687.0
Name: duration, dtype: float64

In my case, I wanted all the durations longer than 1 second, which is a simple comparison once the column holds float seconds.
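
The snippet for that last step isn't shown above, so here is a sketch of one way to do it. The linear_df, start_dt, end_dt and duration names come from this answer; the sample rows and the duration_s column are hypothetical, and I use dt.total_seconds() here instead of astype so that the result is float seconds regardless of pandas version.

```python
import pandas as pd

# Made-up rows: durations of 0.5 s, 14 s, and 2 days 17:44:50.
linear_df = pd.DataFrame({
    "start_dt": pd.to_datetime([
        "2016-12-31 09:00:00.000",
        "2016-12-31 09:00:00.000",
        "2016-12-31 10:00:00.000",
    ]),
    "end_dt": pd.to_datetime([
        "2016-12-31 09:00:00.500",
        "2016-12-31 09:00:14.000",
        "2017-01-03 03:44:50.000",
    ]),
})

linear_df["duration"] = linear_df["end_dt"] - linear_df["start_dt"]

# Hypothetical helper column holding the duration as float seconds.
linear_df["duration_s"] = linear_df["duration"].dt.total_seconds()

# Keep only the rows whose duration is longer than 1 second.
longer = linear_df[linear_df["duration_s"] > 1]
print(longer["duration_s"].tolist())  # [14.0, 236690.0]

# The same filter without converting, comparing against a Timedelta directly.
same = linear_df[linear_df["duration"] > pd.Timedelta(seconds=1)]
```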

Hope it helps.