Why does the Laplacian matrix need normalization, and where does the square root of the degree matrix come from?

The unnormalized Laplacian arises in approximating the minimization of RatioCut, while the normalized Laplacian arises in approximating the minimization of NCut.

Basically, in the unnormalized case you optimize an objective relative to the number of nodes in each cluster, while in the normalized case you optimize it relative to the volume (total degree) of each cluster.

The square root comes from this identity: $f^\top L f = f^\top(D-A)f = g^\top(I-D^{-1/2}AD^{-1/2})g$ where $g=D^{1/2}f$.

So, if your optimization method relies on a symmetric PSD matrix, you work with the symmetric normalized Laplacian, and afterwards undo the substitution by computing $f=D^{-1/2}g$.
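Here is a minimal numpy sketch of that workflow, assuming a small weighted graph of my own choosing: build the symmetric normalized Laplacian, eigendecompose it with `np.linalg.eigh` (which requires exactly this symmetry), then map each eigenvector $g$ back to $f=D^{-1/2}g$, which solves the generalized problem $Lf=\lambda Df$.

```python
import numpy as np

# A small symmetric weight matrix, chosen only for illustration.
A = np.array([[0., 1., 1., 0.],
              [1., 0., 1., 0.],
              [1., 1., 0., 1.],
              [0., 0., 1., 0.]])
d = A.sum(axis=1)                       # degrees d_i = sum_j w_ij
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^{-1/2}

L = np.diag(d) - A                      # unnormalized Laplacian L = D - A
L_sym = D_inv_sqrt @ L @ D_inv_sqrt     # symmetric normalized Laplacian

# L_sym is symmetric PSD, so eigh applies; each eigenvector g of L_sym
# yields f = D^{-1/2} g, a generalized eigenvector of (L, D).
eigvals, G = np.linalg.eigh(L_sym)
F = D_inv_sqrt @ G                      # undo the substitution g = D^{1/2} f

for lam, f in zip(eigvals, F.T):
    assert np.allclose(L @ f, lam * d * f)  # L f = lambda D f
```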


More insight into RatioCut and NCut.

Let $G=(V,E)$ be a graph with vertex set $V=\{v_1,\dots,v_n\}$ and edge set $E$, and let $w_{ij}$ denote the positive weight on the edge between $v_i$ and $v_j$. Write $d_i=\sum_j w_{ij}$ for the degree of $v_i$, $|A|$ for the number of vertices in $A\subset V$, and $vol(A)=\sum_{v_i\in A}d_i$ for its volume. Let $\{A_i:1 \le i \le k\}$ be a disjoint partition of $V$. Define $$ Cut(A_i:1\le i\le k)\triangleq\frac{1}{2}\sum_{c=1}^k \sum_{i \in A_c,\,j\in \bar A_c}w_{ij} $$ $$ RatioCut(A_i:1\le i \le k)\triangleq\sum_{i=1}^k\frac{Cut(A_i,\bar A_i)}{|A_i|} $$ $$ NCut(A_i:1\le i \le k)\triangleq\sum_{i=1}^k\frac{Cut(A_i,\bar A_i)}{vol(A_i)} $$
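To make these definitions concrete, here is a short numpy sketch computing the three quantities for a given partition; the weight matrix `W`, the function names, and the example split are hypothetical illustrations, not anything fixed by the text above.

```python
import numpy as np

# A hypothetical symmetric weight matrix W (w_ij = 0 means no edge).
W = np.array([[0., 2., 1., 0.],
              [2., 0., 1., 0.],
              [1., 1., 0., 3.],
              [0., 0., 3., 0.]])

def cut(W, S, T):
    """Sum of weights w_ij with v_i in S and v_j in T."""
    return W[np.ix_(S, T)].sum()

def ratio_cut(W, partition):
    """Sum over blocks of Cut(A_i, complement) / |A_i|."""
    n = W.shape[0]
    comp = lambda S: [i for i in range(n) if i not in S]
    return sum(cut(W, S, comp(S)) / len(S) for S in partition)

def ncut(W, partition):
    """Sum over blocks of Cut(A_i, complement) / vol(A_i)."""
    d = W.sum(axis=1)  # degrees d_i = sum_j w_ij
    n = W.shape[0]
    comp = lambda S: [i for i in range(n) if i not in S]
    return sum(cut(W, S, comp(S)) / d[S].sum() for S in partition)

print(ratio_cut(W, [[0, 1], [2, 3]]))  # crossing weight w_02 + w_12 = 2, so 2/2 + 2/2 = 2.0
print(ncut(W, [[0, 1], [2, 3]]))       # 2/vol({0,1}) + 2/vol({2,3}) = 2/6 + 2/8
```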

RatioCut: Let $$\tag{1}\label{1} f_i=\begin{cases}\sqrt{|\bar A|/|A|} & \text{if } v_i\in A\\-\sqrt{|A|/|\bar A|} & \text{if } v_i \in \bar A\end{cases} $$

\begin{align*} f^\top L f & = \frac{1}{2}\sum_{i,j=1}^n w_{ij}(f_i-f_j)^2 \\ & = \frac{1}{2}\sum_{i\in A,j\in \bar A}w_{ij}\left(\sqrt{\frac{|\bar A|}{|A|}}+\sqrt{\frac{|A|}{|\bar A|}}\right)^2 + \frac{1}{2}\sum_{i\in \bar A,j\in A}w_{ij}\left(-\sqrt{\frac{|\bar A|}{|A|}}-\sqrt{\frac{|A|}{|\bar A|}}\right)^2 \\ &=Cut(A,\bar A)\left(\frac{|\bar A|}{|A|}+\frac{|A|}{|\bar A|}+2\right)\\ &=Cut(A,\bar A)\left(\frac{|A|+|\bar A|}{|A|}+\frac{|A|+|\bar A|}{|\bar A|}\right)\\ &=|V|\,RatioCut(A,\bar A) \end{align*}

You can also see that $\sum_{i=1}^n f_i=|A|\sqrt{|\bar A|/|A|}-|\bar A|\sqrt{|A|/|\bar A|}=\sqrt{|A||\bar A|}-\sqrt{|A||\bar A|}=0$, so $f\perp \mathbb 1$, and that $\|f\|^2=|A|\frac{|\bar A|}{|A|}+|\bar A|\frac{|A|}{|\bar A|}=n$.

So minimizing RatioCut is equivalent to the following problem: $$ \min_{A\subset V}f^\top L f\text{ subject to $f\perp\mathbb 1$, $f_i$ as defined in Eq.\eqref{1}},\ \|f\|^2=n $$ Relaxing the discreteness constraint \eqref{1} and letting $f$ range over all of $\mathbb R^n$ turns this into a Rayleigh-quotient problem whose solution, by the Rayleigh–Ritz theorem, is the eigenvector of $L$ for the second-smallest eigenvalue.
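As a numerical sanity check of the identity $f^\top L f=|V|\,RatioCut(A,\bar A)$ and of the two constraints, here is a sketch on a random weighted graph; the graph and the split $A$ are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
W = rng.random((n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)  # random weighted graph
L = np.diag(W.sum(axis=1)) - W

A = [0, 1, 2]                               # an arbitrary two-way split A, A-bar
Abar = [i for i in range(n) if i not in A]

f = np.empty(n)
f[A] = np.sqrt(len(Abar) / len(A))          # Eq. (1)
f[Abar] = -np.sqrt(len(A) / len(Abar))

cut = W[np.ix_(A, Abar)].sum()
ratiocut = cut / len(A) + cut / len(Abar)

assert np.isclose(f @ L @ f, n * ratiocut)  # f' L f = |V| RatioCut(A, A-bar)
assert np.isclose(f.sum(), 0)               # f is orthogonal to the all-ones vector
assert np.isclose(f @ f, n)                 # ||f||^2 = n
```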

NCut: For this case, define $$\tag{2}\label{2} f_i=\begin{cases} \sqrt{\frac{vol(\bar A)}{vol(A)}} &\text{if $v_i\in A$}\\ -\sqrt{\frac{vol(A)}{vol(\bar A)}} &\text{if $v_i\in \bar A$}\\ \end{cases} $$

Similarly to the above, we have $Df\perp \mathbb 1$, $f^\top D f=vol(V)$, and $f^\top L f=vol(V)\,NCut(A,\bar A)$. Minimizing NCut is therefore equivalent to $$ \min_{A\subset V} f^\top L f \text{ subject to $f$ as in Eq.\eqref{2}, $Df\perp \mathbb 1$ and $f^\top Df=vol(V)$} $$ Substituting $g=D^{1/2}f$, we get $$ \min_{A\subset V} g^\top D^{-1/2}LD^{-1/2} g \text{ subject to $g=D^{1/2}f$, $f$ as in Eq.\eqref{2}, $g\perp D^{1/2}\mathbb 1$ and $\|g\|^2=vol(V)$} $$

Then observe that $D^{-1/2}LD^{-1/2}$ is the symmetric normalized Laplacian, which is exactly where the square root of the degree matrix comes from.
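The same kind of numerical check works for the NCut chain of identities, including the fact that the quadratic form is unchanged by the substitution $g=D^{1/2}f$; again, the random graph and the split are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8
W = rng.random((n, n)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
d = W.sum(axis=1)
L = np.diag(d) - W
L_sym = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)  # symmetric normalized Laplacian

A = [0, 1, 2]
Abar = [i for i in range(n) if i not in A]
volA, volAbar = d[A].sum(), d[Abar].sum()

f = np.empty(n)
f[A] = np.sqrt(volAbar / volA)                   # Eq. (2)
f[Abar] = -np.sqrt(volA / volAbar)
g = np.sqrt(d) * f                               # g = D^{1/2} f

cut = W[np.ix_(A, Abar)].sum()
ncut = cut / volA + cut / volAbar

assert np.isclose(f @ L @ f, d.sum() * ncut)     # f' L f = vol(V) NCut(A, A-bar)
assert np.isclose(g @ L_sym @ g, f @ L @ f)      # same value in the g variables
assert np.isclose(d @ f, 0)                      # Df is orthogonal to the all-ones vector
assert np.isclose(f @ (d * f), d.sum())          # f' D f = vol(V)
```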


The notion of the normalized Laplacian was introduced by Fan R. K. Chung in her book Spectral Graph Theory. Two basic reasons for its wide acceptance are that its eigenvalues are consistent with the eigenvalues arising in spectral geometry and in stochastic processes, and that many results previously known only for regular graphs can be generalized to all graphs. The Laplacian $L=D-A$ works well for regular graphs, but the normalized Laplacian $\mathcal L=D^{-1/2}LD^{-1/2}=D^{-1/2}(D-A)D^{-1/2}=I-D^{-1/2}AD^{-1/2}$ works well not only for regular graphs but also for irregular ones. The first few pages of Chung's book will answer your question in detail.
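To illustrate the regular-vs-irregular point numerically: for a $d$-regular graph $\mathcal L=L/d$, so the two matrices carry the same spectral information up to scaling, whereas for an irregular graph the two spectra are no longer simple rescalings of each other. A quick sketch, with graphs chosen only for illustration:

```python
import numpy as np

def spectra(A):
    """Eigenvalues of L = D - A and of I - D^{-1/2} A D^{-1/2}."""
    d = A.sum(axis=1)
    L = np.diag(d) - A
    L_norm = np.diag(d**-0.5) @ L @ np.diag(d**-0.5)
    return np.linalg.eigvalsh(L), np.linalg.eigvalsh(L_norm)

# 2-regular graph (the 4-cycle): every vertex has degree 2, so L_norm = L / 2.
C4 = np.array([[0, 1, 0, 1],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [1, 0, 1, 0]], dtype=float)
ev_L, ev_norm = spectra(C4)
assert np.allclose(ev_norm, ev_L / 2)  # spectra agree up to the constant degree

# Irregular graph (the 3-vertex path): no longer a simple rescaling.
P3 = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]], dtype=float)
ev_L, ev_norm = spectra(P3)
print(ev_L, ev_norm)                   # [0, 1, 3] vs [0, 1, 2]
```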