Fine-tuning parameters in Logistic Regression

Grid search is a brutal way of finding the optimal parameters because it train and test every possible combination. best way is using bayesian optimization which learns for past evaluation score and takes less computation time.


You can use grid search to find out the best C value for you. Basically smaller C specify stronger regularization.

>>> param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000] }
>>> clf = GridSearchCV(LogisticRegression(penalty='l2'), param_grid)
GridSearchCV(cv=None,
             estimator=LogisticRegression(C=1.0, intercept_scaling=1,   
               dual=False, fit_intercept=True, penalty='l2', tol=0.0001),
             param_grid={'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]})

See the GridSearchCv document for more details on your application.


You may use below code for more general details:

LR = LogisticRegression()
LRparam_grid = {
    'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000],
    'penalty': ['l1', 'l2'],
    # 'max_iter': list(range(100,800,100)),
    'solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']
}
LR_search = GridSearchCV(LR, param_grid=LRparam_grid, refit = True, verbose = 3, cv=5)

# fitting the model for grid search 
LR_search.fit(X_train , y_train)
LR_search.best_params_
# summarize
print('Mean Accuracy: %.3f' % LR_search.best_score_)
print('Config: %s' % LR_search.best_params_)