Linearly constrained eigenvalue problem

I think Pushpendre's answer isn't quite right, but it gets you most of the way there. Getting rid of that pesky constant term is a bit tricky relative to the homogeneous case.

Let's take his suggested substitutions: $$ \begin{array}{rl} M&:=N^\top N\\ y&:=Nx\\ D&:=CN^{-1}\\ B&:=N^{-\top}AN^{-1} \end{array} $$ I don't think these are strictly necessary (e.g. this encodes an assumption that $M$ is positive definite, since $N$ must be invertible), but it simplifies notation quite a bit for the most realistic/common case. I'm going to be cavalier about assuming certain matrices are symmetric/invertible as convenient.
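
For concreteness, here is a minimal numerical sketch of these substitutions (assuming $M$ is symmetric positive definite so a Cholesky factor exists; the inputs `M`, `A`, `C` are hypothetical):

```python
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def transform(M, A, C):
    """Change variables y = N x, where M = N^T N is a Cholesky factorization."""
    N = cholesky(M)                                  # upper triangular, M = N^T N
    N_inv = solve_triangular(N, np.eye(N.shape[0]))  # N^{-1}
    B = N_inv.T @ A @ N_inv                          # B = N^{-T} A N^{-1}
    D = C @ N_inv                                    # D = C N^{-1}
    return N, B, D
```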

Then, your problem becomes $$ \begin{array}{rl} \min_y &y^\top By\\ \mathrm{s.t.} & \|y\|^2=1\\ & Dy=b \end{array} $$ This problem has Lagrangian (I am sprinkling in constant factors so that they simplify my arithmetic later) $$ \Lambda(y;\lambda,\mu):=\frac{1}{2}y^\top By+\frac{1}{2}\lambda(1-\|y\|^2)+\mu^\top (b-Dy) $$ Differentiating with respect to $y$ shows $$ 0=\nabla_y\Lambda(y;\lambda,\mu)=By-\lambda y-D^\top\mu. $$ Here I am assuming w.l.o.g. that $B$ is symmetric.
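
As a quick numerical sanity check of these first-order conditions (a sketch; the candidate point and `B`, `D`, `b` are hypothetical):

```python
import numpy as np

def kkt_residuals(y, lam, mu, B, D, b):
    """All three residuals should vanish (up to round-off) at a critical point."""
    stationarity = B @ y - lam * y - D.T @ mu   # gradient of the Lagrangian in y
    norm_residual = y @ y - 1.0                 # ||y||^2 = 1
    linear_residual = D @ y - b                 # D y = b
    return stationarity, norm_residual, linear_residual
```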

Pre-multiplying the critical point condition by $D$ shows $$ DBy=\lambda Dy + DD^\top\mu=\lambda b + DD^\top \mu. $$ Let's further assume that $D$ has full rank. The most common case is that $D\in\mathbb{R}^{m\times n},$ where $m<n$; I'll assume $D$ has rank $m$. Then, $DD^\top$ is an invertible matrix, and $D$ admits a pseudoinverse $D^+:=D^\top (DD^\top)^{-1}$ satisfying $DD^+=I$.

Then, we can isolate $\mu$ as $$ \mu=(DD^\top)^{-1}(DBy-\lambda b). $$ Plugging this back into the expression for $\nabla_y\Lambda$ shows $$ \begin{array}{rl} 0&=By-\lambda y-D^\top (DD^\top)^{-1}(DBy-\lambda b)\\ &=By-\lambda y-D^+(DBy-\lambda b)\\ \implies [(I-D^+D)B-\lambda I]y&=-\lambda D^+b. \end{array} $$ For each $\lambda$, this expression gives a system of equations solvable for $y$. In other words, we can think of $y$ as a function $y(\lambda)$ of $\lambda$.
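
A sketch of $y(\lambda)$ as a direct solve (continuing the hypothetical `B`, `D`, `b` from above; the system matrix becomes singular at some values of $\lambda$, which is where $f$ below blows up):

```python
import numpy as np

def y_of_lambda(lam, B, D, b):
    """Solve [(I - D^+ D) B - lam I] y = -lam D^+ b for y."""
    n = B.shape[0]
    D_plus = np.linalg.solve(D @ D.T, D).T   # D^+ = D^T (D D^T)^{-1}, needs full row rank
    lhs = (np.eye(n) - D_plus @ D) @ B - lam * np.eye(n)
    return np.linalg.solve(lhs, -lam * (D_plus @ b))
```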

Define a function $f(\lambda):\mathbb{R}\rightarrow\mathbb{R}\cup\{-\infty,\infty\}$ as $$f(\lambda):=\|y(\lambda)\|^2-1.$$ Use any standard 1D method to find a root $\lambda$ such that $f(\lambda)=0.$ Then, $y(\lambda)$ is a critical point of your optimization problem. To be totally rigorous, you should check that $y(\lambda)$ satisfies the constraint $Dy=b$, but this follows from the system for $y(\lambda)$: multiplying both sides by $D$ and using $D(I-D^+D)=0$ gives $-\lambda Dy=-\lambda b$, so $Dy=b$ whenever $\lambda\neq 0$.
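
A root-finding sketch on top of `y_of_lambda` above (the scan grid is hypothetical; in practice you would bracket sign changes of $f$ between its poles):

```python
import numpy as np
from scipy.optimize import brentq

def f(lam, B, D, b):
    try:
        y = y_of_lambda(lam, B, D, b)
    except np.linalg.LinAlgError:        # lam is at a pole of f
        return np.inf
    return y @ y - 1.0

def find_roots(B, D, b, lam_grid):
    """Scan a grid for sign changes of f and refine each bracket with Brent's method."""
    roots = []
    vals = [f(lam, B, D, b) for lam in lam_grid]
    for lo, hi, flo, fhi in zip(lam_grid, lam_grid[1:], vals, vals[1:]):
        if np.isfinite(flo) and np.isfinite(fhi) and flo * fhi < 0:
            roots.append(brentq(f, lo, hi, args=(B, D, b)))
    return roots
```

Each root gives a critical point $y(\lambda)$; evaluating $y^\top By$ at each lets you pick the minimizer, and $x = N^{-1}y$ undoes the substitution.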

Note that, numerically, finding roots of $f(\cdot)$ is not a pleasant problem: $f$ likely has multiple roots, as well as poles where the system of equations for $y(\lambda)$ is not solvable. Equations of this form are known as secular equations, for which some specialized solution algorithms exist.


NOTE: This trick is similar to the one suggested for "LSQI" in Golub/Van Loan's book "Matrix Computations".


Your problem has been answered at https://scicomp.stackexchange.com/questions/14096/sparse-smallest-eigenvalue-problem-on-a-linear-subspace :) Or you can read Golub's original paper, "Some modified matrix eigenvalue problems".

The basic intuition is that you want to find the eigenvalues over the subspace defined by $Cx = b$, so just find its basis and convert the Rayleigh quotient to that basis.
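
For the homogeneous case $Cx = 0$ (the case those links treat), a minimal sketch of this basis conversion, with hypothetical inputs `A`, `M`, `C`:

```python
import numpy as np
from scipy.linalg import eigh, null_space

def eigs_on_nullspace(A, M, C):
    """Minimize x^T A x / x^T M x subject to C x = 0 via a null-space basis."""
    Z = null_space(C)                        # orthonormal basis of {x : C x = 0}
    w, V = eigh(Z.T @ A @ Z, Z.T @ M @ Z)    # reduced generalized eigenproblem
    return w[0], Z @ V[:, 0]                 # smallest eigenvalue, mapped back to x
```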

EDIT: I just realized that you need the space which satisfies $Cx=b$, and zero need not be in this space, so it's not a subspace. But I think the techniques in those answers could still be adapted.

WRONG One way I can think of handling the nonzero right hand side is to make an augmented matrix $C'=[C\; b]$ and augmented $x' = [x; 1]$, then $C'x' = 0$, and then also create $M' = [M\; 0; 0\; 0]$ and same for $A$. After this augmentation and zero padding everything works out. (This fails because the homogeneous eigenproblem only determines $x'$ up to scale, and the zero-padded $M'$ puts no weight on the last entry, so nothing forces it to equal $1$.)

2nd try: we can rewrite the objective as $$\arg \min \frac{x^TAx}{x^TMx} \textrm{ subject to } C(x-x')=0$$ where $x'$ is any point satisfying $Cx'=b$, and then assume that $M$ is positive definite, which means that it can be factored as $M = N^TN$ (assuming real entries).

Let $y = Nx$ and $y' = Nx'$.

Let $B = (N^{-1})^TAN^{-1}$.

Let $D = CN^{-1}$.

Then the objective becomes $$\arg\min \frac{y^TBy}{y^Ty} \textrm{ subject to } D(y-y') = 0.$$ Now we are pretty close to Golub's setup in 1.1, 1.2 and 1.3, but not quite, because of the nonzero RHS. We can still use the Lagrangian (assuming throughout that things are positive definite and inverses are well conditioned, practically speaking).

So the Lagrangian is $$y^TBy - \lambda (y^Ty -1) + 2 \mu^TD(y-y')$$

EDIT: See Justin's answer above for a correct explanation.

The first derivative is $$By - \lambda y + D^T\mu = 0.$$ Multiply on the left by $D$ and use the constraint that $Dy = Dy'$ to get $$DBy - \lambda Dy' + DD^T \mu = 0.$$ Since $C$ was full row rank, $DD^T$ is invertible (if not invertible, then we have a range of solutions for $\mu$ and would have to pick the best), so this can be solved for $\mu$ in terms of $y$ and $\lambda$. Note that $DBy$ still involves the unknown $y$, so $\mu$ is not determined by $B$ and $y'$ alone; substituting $\mu$ back into the first-order condition gives a $\lambda$-dependent linear system for $y$, as carried through correctly in Justin's answer above.