Multivariate linear regression

LinearModelFit doesn't do multivariate regression (i.e., multiple dependent variables) as far as I'm aware, but you can use my repository function BayesianLinearRegression instead. The first example in the "Scope" section of its documentation shows how. You can provide the data in either of the formats

data[[All, {1, 2}]] -> data[[All, {3, 4}]]

or

#[[{1, 2}]] -> #[[{3, 4}]]& /@ data

For example:

fitData = ResourceFunction["BayesianLinearRegression"][
  data[[All, {1, 2}]] -> data[[All, {3, 4}]],
  {1, x1, x2}, (* basis functions *)
  {x1, x2} (* independent variables *)
];

You can find the best-fit expression with:

Mean[fitData["Posterior", "PredictiveDistribution"]]

The output should provide you with all the details about the uncertainties of the predictions and the regression coefficients. My blog post has more background information, if you need it.
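
For example, the regression coefficients have their own posterior distribution under a similar key (name as in the repository function's documentation; this is a sketch, assuming Mean evaluates for the distribution it returns):

coefDist = fitData["Posterior", "RegressionCoefficientDistribution"];
Mean[coefDist] (* point estimates of the regression coefficients *)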


You can build the regression yourself with NMinimize, using residuals that are squared distances between the predicted and observed points.

First let's build some synthetic noisy data:

(* create some noisy data that follows a linear model *)
n = 1000;
datax = RandomReal[{-1, 1}, {n, 2}];
testmtx = {{3, 4}, {1/2, 1/6}};
testoffset = {3/2, 5/7};
fn[{x1_, x2_}] := testmtx.{x1, x2} + testoffset
noise = RandomVariate[NormalDistribution[0, 1/10], {n, 2}];
datay = (fn /@ datax) + noise;

(* this is the noisy 4d data *)
data = MapThread[Join, {datax, datay}];

ListPlot[{datax, datay}, PlotRange -> {{-4, 4}, {-4, 4}}, 
 AspectRatio -> 1, PlotStyle -> PointSize[Small]]

The ideal fit is:

$$ \left( \begin{array}{c} y_1 \\ y_2 \end{array} \right) = \left( \begin{array}{cc} 3 & 4 \\ 1/2 & 1/6 \end{array} \right) \left( \begin{array}{c} x_1 \\ x_2 \end{array} \right) + \left( \begin{array}{c} 3/2 \\ 5/7 \end{array} \right) $$

... but pretend we don't know that and that we only work with the data from this point on. The ListPlot above shows the $x_1,x_2$ values (blue) vs. the noisy $y_1,y_2$ values (orange).

Then construct a squared-residual function and an objective, which is the total squared residual over all data points:

matrix = {{a1, a2}, {a3, a4}};
offset = {c1, c2};
sqresidual[{x1_, x2_, y1_, y2_}, mtx_, c_] := 
 SquaredEuclideanDistance[c + mtx.{x1, x2}, {y1, y2}]
objective = Total[sqresidual[#, matrix, offset] & /@ data];

... and finally use NMinimize:

NMinimize[objective, {a1, a2, a3, a4, c1, c2}]
(* result: {19.8142, {a1 -> 2.99722, a2 -> 4.00609, a3 -> 0.498218, 
  a4 -> 0.165467, c1 -> 1.49577, c2 -> 0.7118}} *)

The result is pretty close to ideal!
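
To get the fitted matrix and offset as concrete arrays, you can substitute the solution rules back into matrix and offset ({min, sol} below are just illustrative names):

{min, sol} = NMinimize[objective, {a1, a2, a3, a4, c1, c2}];
fittedmtx = matrix /. sol
fittedoffset = offset /. sol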


For simple applications of regression, there is no need to fit multiple dependent variables simultaneously: the coefficient estimates are the same as those from regressing each dependent variable separately. There are caveats if you are doing further analysis (the joint fit also estimates the error covariance between the dependent variables, which separate fits do not), but if you just want the basic regression results, you can do separate fits.
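
As a quick check with the synthetic data from the previous answer (a sketch reusing the variables defined there), two separate LinearModelFit calls recover the rows of the matrix and the components of the offset:

fit1 = LinearModelFit[data[[All, {1, 2, 3}]], {x1, x2}, {x1, x2}];
fit2 = LinearModelFit[data[[All, {1, 2, 4}]], {x1, x2}, {x1, x2}];
fit1["BestFitParameters"] (* ≈ {3/2, 3, 4}: offset and first row of testmtx *)
fit2["BestFitParameters"] (* ≈ {5/7, 1/2, 1/6}: offset and second row of testmtx *)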