python linear regression predict by date

Linear regression cannot use date values directly, so the dates first need to be converted into numerical values. The following code converts the date column into ordinal day numbers:

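import datetime as dt
import pandas as pd

# Parse the column as datetimes, then map each date to its ordinal day number
data_df['Date'] = pd.to_datetime(data_df['Date'])
data_df['Date'] = data_df['Date'].map(dt.datetime.toordinal)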
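To show how the converted column can then be used, here is a minimal sketch that fits a straight line to ordinal dates and predicts a value for a later date. The sample data, the `Value` and `DateOrdinal` column names, and the use of `numpy.polyfit` as the least-squares fit are all assumptions for illustration; any regression library that accepts numeric input should work the same way.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Hypothetical sample data: a value that grows by 2 per day.
data_df = pd.DataFrame({
    "Date": ["2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04"],
    "Value": [10.0, 12.0, 14.0, 16.0],
})

# Convert the dates to ordinal day numbers (kept in a separate column here
# so the original dates stay available).
data_df["Date"] = pd.to_datetime(data_df["Date"])
data_df["DateOrdinal"] = data_df["Date"].map(dt.datetime.toordinal)

# Ordinary least-squares fit of a degree-1 polynomial: Value = slope * x + intercept.
slope, intercept = np.polyfit(
    data_df["DateOrdinal"].to_numpy(), data_df["Value"].to_numpy(), deg=1
)

# Predict for a future date by converting it the same way.
future = dt.datetime(2021, 1, 10).toordinal()
prediction = slope * future + intercept
print(slope, prediction)
```

The date you predict for has to go through the same conversion as the training dates, otherwise the fitted slope and intercept do not apply to it.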

When using dt.datetime.toordinal, be careful: it only converts the date part and ignores hours, minutes, seconds, etc. To generate a number from a full datetime object, you can use something like:

import time
df['Datetime column'].apply(lambda x: time.mktime(x.timetuple()))
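A small sketch of the difference, using made-up timestamps: two datetimes on the same day get the same ordinal, while a seconds-based value keeps the time of day. Note that `time.mktime` interprets its input in the machine's local time zone; the variant below instead subtracts a fixed epoch, which is a swapped-in alternative and not part of the answer above.

```python
import datetime as dt

import pandas as pd

# Hypothetical data: two timestamps on the same calendar day.
df = pd.DataFrame({
    "Datetime column": pd.to_datetime(
        ["2021-01-01 00:00:00", "2021-01-01 12:30:00"]
    )
})

# toordinal drops the time of day, so both rows get the same number.
ordinals = df["Datetime column"].map(dt.datetime.toordinal)

# Seconds since 1970-01-01 keep the time of day and do not depend on
# the local time zone of the machine running the code.
seconds = (df["Datetime column"] - pd.Timestamp("1970-01-01")) / pd.Timedelta(seconds=1)

print(ordinals.tolist())
print(seconds.tolist())
```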

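Convert the dates as follows:

1) Set the date column as the DataFrame index (the column must already be of datetime type, e.g. via pd.to_datetime)

df = df.set_index('date', append=False)

2) Convert the datetime index to float64 Julian dates, stored in a new column so the rest of the DataFrame is kept

df['julian_date'] = df.index.to_julian_date()

3) Run the regression with the date as the independent variable.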
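The steps above can be sketched end to end as follows. The data and the `value` column are hypothetical, and `numpy.polyfit` stands in for whatever regression routine you actually use. Subtracting the first Julian date before fitting is my own addition: Julian dates are large numbers (around 2.4 million), and shifting them keeps the fit numerically well behaved.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one observation per day.
df = pd.DataFrame({
    "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-03"]),
    "value": [1.0, 2.0, 3.0],
})

# 1) Use the date column as the index (this gives a DatetimeIndex).
df = df.set_index("date", append=False)

# 2) Convert the index to float64 Julian dates.
julian = df.index.to_julian_date().to_numpy()

# Shift so the first observation is day 0 (assumption: done only to keep
# the numbers small; it does not change the slope).
origin = julian[0]
x = julian - origin

# 3) Regress value on date.
slope, intercept = np.polyfit(x, df["value"].to_numpy(), deg=1)

# Predict for a new date, converted and shifted the same way.
new_x = pd.Timestamp("2021-01-06").to_julian_date() - origin
prediction = slope * new_x + intercept
print(slope, intercept, prediction)
```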


Linear regression works on numerical data, and a datetime column is not suitable as input. You should split that column into three separate numeric columns (year, month and day) and then remove the original column.
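A minimal sketch of that split, assuming a hypothetical DataFrame with a `Date` column and a `Value` column; the `year`, `month` and `day` column names are my own choice:

```python
import pandas as pd

# Hypothetical data with a datetime column.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2021-03-15", "2022-11-02"]),
    "Value": [1.0, 2.0],
})

# Split the datetime into three integer columns via the .dt accessor.
df["year"] = df["Date"].dt.year
df["month"] = df["Date"].dt.month
df["day"] = df["Date"].dt.day

# Drop the original datetime column so only numeric columns remain.
features = df.drop(columns=["Date"])
print(features)
```

Keep in mind that with this encoding the model treats year, month and day as three independent numeric features, so it no longer sees the dates as a single continuous time axis.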