How does Mathematica calculate CovarianceMatrix in NonlinearModelFit?

Just a note: X has been called the design matrix for linear models since long before I was born. A statistics class would help with the terminology.

In a nonlinear setting, however, X is slightly more complicated: it is the matrix of partial derivatives of the model with respect to the parameters, evaluated at each data point. When the model is linear, this reduces to the usual design matrix.
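As a sketch of what the example in this answer computes, the unweighted calculation can be written out as follows. The symbols $f$, $\theta$, $p$ (number of parameters) and $s^2$ are my own notation for illustration, not names that NonlinearModelFit exposes:

$$
X_{ij} = \left.\frac{\partial f(x_i;\theta)}{\partial \theta_j}\right|_{\theta=\hat\theta},
\qquad
s^2 = \frac{1}{n-p}\sum_{i=1}^{n}\bigl(y_i - f(x_i;\hat\theta)\bigr)^2,
\qquad
\widehat{\operatorname{Cov}}(\hat\theta) = s^2\,\bigl(X^{\mathsf T}X\bigr)^{-1}.
$$

Here $s^2$ is the maximum likelihood estimate of $\sigma^2$ multiplied by $n/(n-p)$, which is where the factor `n/(n - 2)` in the code comes from.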

I'm going to ignore weighting, because the question comes down to what X is for a nonlinear model. The example below fits a nonlinear model twice, once with NonlinearModelFit and once by brute-force maximum likelihood, to show the underlying calculations.

 (* Generate data from a nonlinear model *)
 SeedRandom[12472475];
 x = Flatten[Table[{0, 1, 2, 3, 4, 5, 6, 7, 8}, {i, 5}]];
 n = Length[x];
 aa = 1;
 bb = 0.5;
 σσ = 0.05;
 y = aa Exp[-bb x] + 
    RandomVariate[NormalDistribution[0, σσ], n];

 (* Fit with NonlinearModelFit *)
 nlm = NonlinearModelFit[Transpose[{x, y}], a Exp[-b z], {a, b}, z];

 (* Fit with brute force method *)
 (* Get log of the likelihood *)
 logL = LogLikelihood[NormalDistribution[0, σ], y - a Exp[-b x]];
 (* Find maximum likelihood estimates of parameters *)
 mle = FindMaximum[{logL, σ > 0}, {{a, 1}, {b, 0.5}, {σ, 0.05}}];

 (* Now approximate and then estimate the covariance matrix *)
 (* Note the definition of X, the design matrix *)
 X = D[a Exp[-b x], {{a, b}}];
 (* Multiplication by n/(n-2) is to use the unbiased estimate of σ^2 *)
 (* n - 2 is really sample size minus number of parameters *)
 cov = (n/(n - 2)) (σ^2 Inverse[Transpose[X]. X]) /. mle[[2]];


 (* Parameter estimates *)
 nlm["BestFitParameters"]
 (* {a -> 0.9842981255137611,b -> 0.5023511174255275} *)
 mle[[2]][[1 ;; 2]]
 (* {a -> 0.9842981255018496,b -> 0.50235111740454} *)

 (* Estimates of covariance matrix *)
 nlm["CovarianceMatrix"]
 (* {{0.00034959357568038475,0.00016592363407431294},
    {0.00016592363407431294,0.0002923502562857745}} *)
 cov
 (* {{0.00034959357700581665,0.0001659236346998225},
     {0.0001659236346998225,0.00029235025737224096}} *)

 (* Element-wise ratio of the two covariance estimates *)
 Rationalize[cov/nlm["CovarianceMatrix"], 0.00001]

 (* Estimates of σ^2 *)
 nlm["EstimatedVariance"]
 (* 0.002014630678417603 *)
 ((n/(n - 2)) σ^2 /. mle[[2, 3]])
 (* 0.0020146306860807216 *)

I will answer my own question halfway. I misunderstood the design matrix: it is the design matrix built from the x values, which is obvious in hindsight given that it is called X. It has the form {{1., Xestimate_1}, {1., Xestimate_2}, ...}. The following code produces the same matrix as nlm["CovarianceMatrix"], but only for linear models of the form b + a*x, so there is still an open question. :-)

 (* Linear model b + a*x fitted to weighted data *)
 f = Function[{a, b, x}, b + a*x];
 data = {{6., 21.}, {8., 31.}, {9., 39.}};
 weights = {0.5, 4.5, 0.1};
 nlm = NonlinearModelFit[data, f[a, b, x], {a, b}, x,
    Weights -> weights, VarianceEstimatorFunction -> Automatic]

 (* Fitted y values, then the x values recovered from them *)
 yestimates = Table[nlm[data[[i, 1]]], {i, Length[data]}]
 xestimates = Flatten[
    Table[x /. Solve[nlm[x] == yestimates[[i]], x], {i, Length[data]}]]

 (* Design matrix with rows {1., x_i} and the diagonal weight matrix *)
 xdesignmatrix = Table[{1., xestimates[[i]]}, {i, Length[data]}];
 varianceestimate = nlm["EstimatedVariance"];
 matrixofweights = DiagonalMatrix[weights];

 (* Weighted covariance estimate, for comparison with the built-in property *)
 varianceestimate*Inverse[Transpose[xdesignmatrix].matrixofweights.xdesignmatrix]
 % // MatrixForm
 nlm["CovarianceMatrix"] // MatrixForm

output: {{25.9587, -3.29752}, {-3.29752, 0.421488}}
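To connect this linear case with a derivative-based definition of X, here is a short worked sketch. The symbols $W$, $w_i$ and $s^2$ are my own notation, with $s^2$ standing for the value of nlm["EstimatedVariance"]:

$$
f(x;a,b) = b + a\,x,
\qquad
\frac{\partial f}{\partial b} = 1,
\qquad
\frac{\partial f}{\partial a} = x
$$

$$
X = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix},
\qquad
W = \operatorname{diag}(w_1,\dots,w_n),
\qquad
\widehat{\operatorname{Cov}} = s^2\,\bigl(X^{\mathsf T} W X\bigr)^{-1}
$$

So for this model the rows {1., x_i} are exactly the partial derivatives of f with respect to b and a. The column order of X fixes the parameter order of the resulting matrix: with columns (1, x) the first row and column refer to b and the second to a. Whether the same weighted formula, with X built from derivatives evaluated at the fitted parameters, reproduces nlm["CovarianceMatrix"] for a genuinely nonlinear model is an assumption that is not checked here.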