Machine Learning Algorithms for Land Cover Classification

I would say that the most complete software environment for machine learning and nonparametric modeling is R. This is a big field in statistics, spanning k-NN, kernel smoothing, generalized additive models, weak learners, support vector machines, neural nets, semi-parametric spline regression, imputation, etc. I would highly recommend reading: Hastie, T., R. Tibshirani, and J. Friedman (2009) The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Series in Statistics.

Besides R, commercial software from Salford Systems offers Random Forests, Multivariate Adaptive Regression Splines (MARS), CART, and gradient boosting (TreeNet) in a GUI environment. RuleQuest still sells See5/C5.0, an updated version of Quinlan's C4.5/ID3 decision tree algorithms (a separate lineage from CART). The University of Waikato's Weka 3 is an open-source Java project with both GUI and command-line interfaces and a large number of models available.


I'd strongly recommend scikit-learn for Python. It supports supervised and unsupervised classification, and the documentation is excellent. In particular, check out the Machine Learning for Astronomical Data Analysis tutorial and the accompanying YouTube video (note: it is 3 hours long).
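To give a rough idea of the unsupervised side, here is a minimal sketch that clusters the pixels of a synthetic multi-band "image" with k-means and reshapes the labels back into a class map. The image size, band count, and brightness values are made up for illustration, and the import paths are those of current scikit-learn releases, which may differ from older versions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)

# Synthetic 20x30 "image" with 3 bands (made-up values):
# the left half is dark, the right half is bright.
rows, cols, bands = 20, 30, 3
image = rng.normal(loc=0.1, scale=0.02, size=(rows, cols, bands))
image[:, cols // 2:, :] += 0.5

# scikit-learn expects an array of shape (n_samples, n_features),
# so each pixel becomes one row and each band one column.
pixels = image.reshape(-1, bands)

# Group the pixels into two spectral clusters.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(pixels)

# Reshape the cluster labels back into a 2-D class map.
class_map = labels.reshape(rows, cols)
print(class_map.shape)
```

With real data the `image` array would come from a raster reader instead of a random generator, and the cluster numbers would still need to be matched to land cover classes by hand.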

The project is under active development; the latest release at the time of writing is 0.12, which came out in September.

As for what the package is capable of, see the documentation sections on Nearest Neighbours, Random Forest (under Ensemble Methods), and Decision Trees, which cover the examples you gave.
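As a minimal sketch of how those three classifiers might be applied to land cover data, the snippet below trains each one on synthetic "pixels" with four band values and three classes, then reports accuracy on held-out samples. The class means, noise level, and sample counts are assumptions for illustration only; in practice the band values would come from a raster and the labels from training polygons. The import paths are those of current scikit-learn releases and may differ in older versions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical mean reflectance per class in 4 bands (made-up numbers).
class_means = np.array([
    [0.05, 0.08, 0.04, 0.02],   # class 0, e.g. water
    [0.04, 0.09, 0.05, 0.45],   # class 1, e.g. vegetation
    [0.20, 0.22, 0.25, 0.30],   # class 2, e.g. bare soil
])
n_per_class = 200

# Draw noisy samples around each class mean to stand in for training pixels.
X = np.vstack([
    mean + rng.normal(scale=0.03, size=(n_per_class, 4))
    for mean in class_means
])
y = np.repeat(np.arange(3), n_per_class)

# Hold out 30% of the samples to estimate accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

classifiers = {
    "nearest neighbours": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
    print(f"{name}: held-out accuracy = {scores[name]:.3f}")
```

All three share the same `fit` / `predict` / `score` interface, so swapping one algorithm for another is a one-line change. The synthetic classes here are far better separated than real imagery usually is, so the scores say nothing about how the methods compare on actual land cover data.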

Unfortunately there is no GUI unless you want to devote time to building one, but I'd recommend IPython as an excellent interactive scripting environment; its Qt console supports inline plots with matplotlib.


A good overview of machine learning techniques in R is the CRAN Machine Learning task view. It lists a host of packages and algorithms, curated by experts.