Orthogonal Projection onto the Unit Simplex

Projection onto the unit simplex solves $ \arg \min_{\boldsymbol{1}^{T} x = 1, \; x \succeq 0} \frac{1}{2} {\left\| x - y \right\|}^{2} $ and can be calculated as follows.
The Lagrangian in that case is given by:

$$ \begin{align} L \left( x, \mu \right) & = \frac{1}{2} {\left\| x - y \right\|}^{2} + \mu \left( \boldsymbol{1}^{T} x - 1 \right) \end{align} $$

The trick is to leave the non-negativity constraint implicit.
Hence the dual function is given by:

$$ \begin{align} g \left( \mu \right) & = \inf_{x \succeq 0} L \left( x, \mu \right) \\ & = \inf_{x \succeq 0} \sum_{i = 1}^{n} \left( \frac{1}{2} { \left( {x}_{i} - {y}_{i} \right) }^{2} + \mu {x}_{i} \right) - \mu && \text{component-wise form} \end{align} $$
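To see why the infimum has a closed form, complete the square in each coordinate:

$$ \begin{align} \frac{1}{2} { \left( {x}_{i} - {y}_{i} \right) }^{2} + \mu {x}_{i} = \frac{1}{2} { \left( {x}_{i} - \left( {y}_{i} - \mu \right) \right) }^{2} + \text{const} \end{align} $$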

Taking advantage of the component-wise form, the solution is given by:

$$ \begin{align} {x}_{i}^{\ast} = { \left( {y}_{i} - \mu \right) }_{+} \end{align} $$

where the solution enforces the non-negativity constraint by projecting onto $ {\mathbb{R}}_{+} $.
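In code, evaluating this candidate solution for a given $ \mu $ is a one-liner; a minimal NumPy sketch (the function name here is mine, for illustration):

```python
import numpy as np

def x_of_mu(y, mu):
    """Candidate primal solution: project y - mu onto the non-negative orthant."""
    return np.maximum(y - mu, 0.0)
```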

The solution is completed by finding the $ \mu $ which satisfies the constraint (pay attention: since the above was an equality constraint, $ \mu $ can have any sign and is not limited to non-negativity like an inequality multiplier $ \lambda $).

The objective function (from the KKT conditions) is given by:

$$ \begin{align} 0 = h \left( \mu \right) = \sum_{i = 1}^{n} {x}_{i}^{\ast} - 1 & = \sum_{i = 1}^{n} { \left( {y}_{i} - \mu \right) }_{+} - 1 \end{align} $$

The above is a piecewise-linear function of $ \mu $.

Since the function is continuous yet not differentiable everywhere (due to its piecewise nature), in theory one must use derivative-free methods for root finding. One could use the bisection method, for instance.
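As a minimal bisection sketch in Python / NumPy (my own illustration, not the referenced implementation; it uses the bracket $ h \left( \min_i {y}_{i} - 1 / n \right) \geq 0 $ and $ h \left( \max_i {y}_{i} \right) = -1 < 0 $):

```python
import numpy as np

def h(mu, y):
    """Constraint residual: sum of the clipped coordinates minus 1."""
    return np.maximum(y - mu, 0.0).sum() - 1.0

def project_simplex_bisection(y, tol=1e-12, max_iter=200):
    """Projection onto the unit simplex by bisection on the dual variable mu.

    h(mu) is continuous and non-increasing, so a bracket [lo, hi] with
    h(lo) >= 0 >= h(hi) can be halved until it is tighter than tol.
    """
    y = np.asarray(y, dtype=float)
    lo = y.min() - 1.0 / y.size   # h(lo) >= 0
    hi = y.max()                  # h(hi) = -1 < 0
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = 0.5 * (lo + hi)
        if h(mid, y) > 0.0:
            lo = mid
        else:
            hi = mid
    mu = 0.5 * (lo + hi)
    return np.maximum(y - mu, 0.0)
```

For example, `project_simplex_bisection([0.5, 1.2, -0.3])` returns approximately `[0.15, 0.85, 0.0]`, which indeed sums to one.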

The derivative of the function (where it is defined) is given by:

$$ \begin{align} \frac{\mathrm{d} }{\mathrm{d} \mu} h \left( \mu \right) & = \frac{\mathrm{d} }{\mathrm{d} \mu} \sum_{i = 1}^{n} { \left( {y}_{i} - \mu \right) }_{+} \\ & = \sum_{i = 1}^{n} -{ \mathbf{1} }_{\left\{ {y}_{i} - \mu > 0 \right\}} \end{align} $$

In practice, the root can be found using Newton iterations (since the probability of landing exactly on a joint between 2 linear sections is essentially zero).
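A sketch of such a Newton iteration (again my own illustration): since $ h $ is convex, piecewise linear, and non-increasing, starting from the all-coordinates-active guess $ \mu = \left( \sum_{i} {y}_{i} - 1 \right) / n $ the iterates increase monotonically to the root and terminate after finitely many steps.

```python
import numpy as np

def project_simplex_newton(y, tol=1e-12, max_iter=100):
    """Projection onto the unit simplex via Newton iterations on h(mu)."""
    y = np.asarray(y, dtype=float)
    mu = (y.sum() - 1.0) / y.size      # exact root if every coordinate stays active
    for _ in range(max_iter):
        active = y - mu > 0.0          # coordinates still positive at this mu
        h = (y[active] - mu).sum() - 1.0
        if abs(h) <= tol:
            break
        mu += h / active.sum()         # Newton step: h'(mu) = -(number of active)
    return np.maximum(y - mu, 0.0)
```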

Accurate / Exact Solution

If we look at the values of the function $ h \left( \mu \right) = \sum_{i = 1}^{n} { \left( {y}_{i} - \mu \right) }_{+} - 1 $, one can easily infer a method to calculate the exact solution:

[Plot of $ h \left( \mu \right) $ evaluated at the breakpoints $ \mu = {y}_{i} $, with extra points beyond the minimum and maximum of $ y $]

In the above, the parameter $ \mu $ took the values of the vector $ y $, with additional values at the edges (a value greater than the maximum of $ {y}_{i} $ and a value lower than the minimum of $ {y}_{i} $).
By iterating over these breakpoints one can easily track the 2 adjacent values where $ h $ is greater than $ 0 $ on one side and lower than $ 0 $ on the other (in case one of them is zero, it is the optimal value of $ \mu $). Since $ h $ is linear between breakpoints and we have 2 points, we can infer the parameters of the model $ h \left( \mu \right) = a \mu + b $. Then the optimal value is $ \hat{\mu} = - \frac{b}{a} $.
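A sketch of this exact method in Python / NumPy (my own illustration; it pads only below $ \min_i {y}_{i} $, since $ h \left( \max_i {y}_{i} \right) = -1 < 0 $ already provides the other side of the bracket):

```python
import numpy as np

def project_simplex_exact(y):
    """Exact projection onto the unit simplex via the breakpoints of h."""
    y = np.asarray(y, dtype=float)
    # Breakpoints of h, padded below min(y) so that h starts non-negative.
    mus = np.concatenate(([y.min() - 1.0 / y.size], np.sort(y)))
    h = np.array([np.maximum(y - mu, 0.0).sum() - 1.0 for mu in mus])
    k = np.flatnonzero(h >= 0.0)[-1]   # last breakpoint with h >= 0 (h is non-increasing)
    if h[k] == 0.0:
        mu = mus[k]                    # a breakpoint hits the root exactly
    else:
        # Fit the linear piece h(mu) = a * mu + b through the bracketing pair
        # and take its root mu = -b / a.
        a = (h[k + 1] - h[k]) / (mus[k + 1] - mus[k])
        mu = mus[k] - h[k] / a
    return np.maximum(y - mu, 0.0)
```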

I wrote MATLAB code which implements the method with Newton iterations at Mathematics StackExchange Question 2327504 - GitHub. I extended the method to the case $ \sum {x}_{i} = r, \; r > 0 $ (a pseudo radius).
There is a test which compares the result to a reference calculated by CVX.


The best algorithm to compute the exact solution to this problem can be found in Projection Onto A Simplex.


The paper by Condat [1] presents a review and comparison of existing algorithms with a new proposal for projection onto the unit simplex. This paper lists the worst-case complexity and empirical complexity of those algorithms, and presents concise pseudo-code for all algorithms. In particular, the algorithm proposed by Condat takes $O(n)$ time in practice, whereas sorting-based methods take $O(n \log n)$ time in practice. I have implemented Condat's algorithm in the past, and can vouch for its speed relative to direct sorting-based approaches.

[1] Laurent Condat, Fast Projection onto the Simplex and the $\ell_1$ Ball, Mathematical Programming, 158(1):575-585, 2016.

EDIT: Condat has included C and MATLAB implementations of all the algorithms mentioned in his paper here: https://lcondat.github.io/software.html