How to get the regression intercept using Statsmodels.api

statsmodels provides an add_constant function that you need to call to add the intercept column explicitly. IMHO, this is better than the R alternative, where the intercept is added by default.

In your case, you need to do this:

import statsmodels.api as sm
endog = Sorted_Data3['net_realization_rate']
exog = sm.add_constant(Sorted_Data3[['Cohort_2','Cohort_3']])

# Fit and summarize OLS model
mod = sm.OLS(endog, exog)
results = mod.fit()
print(results.summary())

Note that you can place the constant column before or after your data by passing True (the default) or False to the prepend kwarg of sm.add_constant.


Alternatively (though not recommended), you can use NumPy to add the constant column explicitly:

import numpy as np
# Prepend a column of ones to the regressor values
exog = np.concatenate((np.ones((len(Sorted_Data3), 1)),
                       Sorted_Data3[['Cohort_2','Cohort_3']].values),
                      axis=1)

You can also do something like this:

df['intercept'] = 1

Here you are explicitly creating a column for the intercept.

Then pass that column along with your regressors to sm.OLS like so:

lm = sm.OLS(df['y_column'], df[['intercept', 'x_column']])
results = lm.fit()
print(results.summary())