Size of sets with complete double

Let $S\subset [1,N]$ with $S+S=2N$. I will show that $S$ must have at least $(2.033 +o(1))\sqrt{N}$ elements for large $N$. The argument can certainly be improved, but I don't know what the right answer should be.

Assume that $N$ is large and that $|S| =O(\sqrt{N})$ (else we are done of course). Let $r(n)$ denote the number of ways of writing $n$ as $a+b$ with $a$ and $b$ in $S$. Apart from the cases $a=b$ which occur for at $O(\sqrt{N})$ values, we have $r(n)\ge 2$ for $n\le 2N$ (since $a+b$ and $b+a$ will be counted as two different solutions). As noted in the problem, we then have $$ 4N +O(\sqrt{N}) \le \sum_{n\le 2N} r(n) = |S|^2, $$ which gave the asymptotic bound $|S| \ge 2\sqrt{N}+O(1)$ that we now wish to improve.

Decompose $S$ into three sets $A$, $B$, $C$, where $A=S\cap[1,N/3)$, $B=S\cap[N/3,2N/3)$, and $C= S\cap[2N/3,N]$. Say $|A|=\alpha\sqrt{N}$, $|B|=\beta\sqrt{N}$ and $|C| =\gamma\sqrt{N}$, so that we know already that $\alpha+\beta+\gamma\ge 2+o(1)$.

Consider $A+A$. This must cover all numbers in $[1,N/3]$ and therefore we must have $$ |A|^2 \ge \sum_{n\le N/3} r(n) \ge 2N/3+ O(\sqrt{N}). $$ Therefore we must have $$ \alpha \ge \sqrt{2/3} +o(1). \tag{1} $$

Similarly $C+C$ must cover all elements in $[5N/3,2N]$ which leads to $$ \gamma \ge \sqrt{2/3}+ o(1). \tag{2} $$

Now consider $B+B$ and $A+C$. These sums are all in $[2N/3,4N/3]$ and therefore we obtain $$ \sum_{2N/3 \le n \le 4N/3} r(n) \ge |B|^2 + 2|A| |C| +O(\sqrt{N}) = (\beta^2+2\alpha\gamma+o(1)) N. $$ Therefore $$ |S|^2 =\sum_{n\le 2N} r(n) \ge \Big(\sum_{n< 2N/3} +\sum_{2N/3\le n\le 4N/3} +\sum_{4N/3<n \le 2N} \Big) r(n) $$ must be at least $$ \Big(\frac{8}{3} + \beta^2 + 2\alpha \gamma +o(1)\Big) N. \tag{3} $$

Put $\alpha+\gamma =2\sqrt{2/3}+\delta$, with $\delta>0$. Since both $\alpha$ and $\gamma$ must be $\ge \sqrt{2/3}+o(1)$ we must have $$ \alpha\gamma \ge \sqrt{2/3}(\sqrt{2/3}+\delta)+o(1) = \frac 23 +\sqrt{\frac 23}\delta +o(1). $$ If $\delta\ge 2-2\sqrt{2/3}$, then using this in (3) we would get $$ |S|^2 \ge \Big( 4 +2 \sqrt{2/3} \delta+o(1)\Big) N \ge 4.599N, $$ which is more than we claimed.

Suppose then that $\delta \le 2-2\sqrt{2/3}$, so that $\beta \ge 2-(\alpha+\gamma)=2-2\sqrt{2/3}-\delta (>0)$. Using this in (3) we find $$ |S|^2 \ge \Big( 4+ 2\sqrt{2/3} \delta + (2-2\sqrt{2/3}-\delta)^2 +o(1) \Big) N. $$ The right side is smallest for $\delta=0$, yielding $$ |S|^2 \ge (4+(2-2\sqrt{2/3})^2+o(1))N, $$ which gives $|S| \ge (2.0333\ldots +o(1))\sqrt{N}$.


For a fixed $n$, you can solve the problem via integer linear programming. For $j \in [n]$, let binary decision variable $x_j$ indicate whether $j\in S$. For $0\le j_1\le j_2 \le n$, let binary decision variable $y_{j_1,j_2}$ indicate whether both $j_1$ and $j_2$ are in $S$. The problem is to minimize $\sum_j x_j$ subject to \begin{align} \sum_j y_{j,i-j} &\ge 1 &&\text{for $i \in [2n]$}\\ y_{j_1,j_2} &\le x_{j_1}\\ y_{j_1,j_2} &\le x_{j_2} \end{align}

I didn't try to find all optimal solutions, but here is one for each $n \le 50$:

 n  m optimal S
 1  2 {0,1}
 2  3 {0,1,2}
 3  4 {0,1,2,3}
 4  4 {0,1,3,4}
 5  5 {0,1,2,4,5}
 6  5 {0,1,3,5,6}
 7  6 {0,1,2,4,6,7}
 8  6 {0,1,3,5,7,8}
 9  7 {0,1,3,5,7,8,9}
10  7 {0,1,3,5,7,9,10}
11  8 {0,1,3,5,7,9,10,11}
12  8 {0,1,3,5,7,9,11,12}
13  8 {0,1,2,5,8,11,12,13}
14  9 {0,1,3,5,7,9,11,13,14}
15  9 {0,1,3,4,9,11,12,14,15}
16  9 {0,1,2,5,8,11,14,15,16}
17 10 {0,1,3,5,7,8,13,14,16,17}
18 10 {0,1,3,5,6,12,13,15,17,18}
19 10 {0,1,2,5,8,11,14,17,18,19}
20 10 {0,1,3,4,9,11,16,17,19,20}
21 11 {0,1,3,5,6,13,15,16,18,20,21}
22 11 {0,1,2,3,7,11,15,19,20,21,22}
23 12 {0,1,3,5,6,13,15,16,18,20,22,23}
24 12 {0,1,3,5,7,8,16,17,19,21,23,24}
25 12 {0,1,3,4,9,11,15,17,21,22,24,25}
26 12 {0,1,3,4,9,11,15,17,22,23,25,26}
27 12 {0,1,3,5,6,13,14,21,22,24,26,27}
28 13 {0,1,2,3,5,9,13,17,21,25,26,27,28}
29 13 {0,1,3,4,9,11,16,18,23,25,26,28,29}
30 13 {0,1,3,4,6,10,14,19,21,26,27,29,30}
31 14 {0,1,2,4,7,9,10,15,20,22,27,28,30,31}
32 13 {0,1,3,4,9,11,16,21,23,28,29,31,32}
33 14 {0,1,3,5,6,12,13,20,21,27,28,30,32,33}
34 14 {0,1,2,3,7,11,15,19,23,27,31,32,33,34}
35 14 {0,1,3,5,6,13,14,21,22,29,30,32,34,35}
36 14 {0,1,3,4,9,11,16,20,25,27,32,33,35,36}
37 15 {0,1,3,4,7,9,14,16,21,26,28,33,34,36,37}
38 15 {0,1,2,3,4,9,14,19,24,29,31,34,35,37,38}
39 15 {0,1,3,4,9,11,16,20,25,30,35,36,37,38,39}
40 15 {0,1,3,4,5,8,14,20,26,32,35,36,37,39,40}
41 16 {0,1,3,5,6,11,12,19,20,27,28,35,36,38,40,41}
42 16 {0,1,3,4,7,8,13,18,23,28,33,35,38,39,41,42}
43 16 {0,1,3,4,5,8,14,20,26,29,35,38,39,40,42,43}
44 16 {0,1,3,5,7,8,17,18,26,27,36,37,39,41,43,44}
45 16 {0,1,3,4,9,11,16,20,25,29,34,36,41,42,44,45}
46 16 {0,1,3,4,5,8,14,20,26,32,38,41,42,43,45,46}
47 17 {0,1,2,5,8,9,10,15,21,27,33,39,42,43,44,46,47}
48 17 {0,1,3,4,5,8,14,20,26,32,38,39,43,45,46,47,48}
49 17 {0,1,3,4,5,8,14,20,26,32,38,41,45,46,47,48,49}
50 17 {0,1,3,4,9,11,16,20,25,30,34,39,41,46,47,49,50}

Note that any solution will be of the form $S=\{0,1,s_2,\cdots,s_j,n-1,n\}$ and then $S'=\{0,1,n-s_j,\cdots,n-s_2,n-s_1,n-1,n\}$ is also a solution. If $S=S'$ one might call it a symmetric solution. This requires that $n$ and/or $m(n)$ is even. In any case it might be pleasant to try to maximize $|S \cap S'|.$ A possible alternate, or further, goal would be to minimize the number of distinct jumps between successive entries.Aside from aesthetics, when there are several optimal solutions, the ones with the must symmetry or regularity might be fruitful for suggesting generalizations.

In a somewhat trivial sense, for any two solutions $S_1,S_2$ of size $m(n),$ One can change $S_1$ to $S_2$ by shifting entries. So it is hard to say if one solution is essentially different from another.

Many values of $n$ (but not all) seem to have optimal solutions with this structure:

Start with $0,1,2,\cdots, d-1=s_d$ end with $s_{d+p+1}=n-(d-1),\cdots, n-2,n-1,n$ and in the middle put entries $s_{d+1},s_{d+2},\cdots,s_{d+p}$ which satisfy $s_{i+1}-s_i \leq d$ for $d-1\leq i \leq d+p.$

This will always give a solution of size $2d+p$. For an optimal solution $d$ should be around $\sqrt{\frac{n}2}$ and $p$ as small as possible given $n,d,$ so $p=\lceil \frac{n+2}d-3\rceil.$ In some cases there are $3$ values of $d$ which work.


$m(23)=12$ and one solution is $S=\{0,1,3,5,6,13,15,16,18,20,22,23\}$

There are $22$ symmetric solutions. The lower halves are

$ \left\{ 0,1,2,3,4,9 \right\} , \left\{ 0,1,2,3,6,10 \right\} , \left\{ 0,1,2,3,7,10 \right\} , \left\{ 0,1,2,3,7,11 \right\} , \mathbf{\left\{ 0,1,2,4,5,11 \right\}} , \left\{ 0,1,2,4,6,9 \right\} , \left\{ 0,1,2,4,7,10 \right\} , \left\{ 0,1,2,5,6,8 \right\} , \left\{ 0,1,2,5,7,10 \right\} , \left\{ 0,1,2,5,8,10 \right\} , \left\{ 0,1,2,5,8,11 \right\} , \mathbf{\left\{ 0,1,3,4,5,11 \right\} , \left\{ 0,1,3,4,6,11 \right\} , \left\{ 0,1,3,4,7,9 \right\} , \left\{ 0,1,3,4,8,9 \right\} , \left\{ 0,1,3,4,8,10 \right\} , \left\{ 0,1,3,4,9,10 \right\} , \left\{ 0,1,3,4,9,11 \right\} , \left\{ 0,1,3,5,6,8 \right\} , \left\{ 0,1,3,5,6,10 \right\} , \left\{ 0,1,3,5,6,11 \right\} , \left\{ 0,1,3,5,7,8 \right\}} $

The values of $d$ represented are $3,4,5.$ The ones in bold merit further perusal. They do not fully fit the scheme described as there are jumps greater than the relevant $d$.


$m(20)=10$ and the given solution $S=\{0,1,3,4,9,11,16,17,19,20\}$ is symmetric.

There are no solutions which fit the scheme above as $$6+\lceil \frac{22}3 \rceil-3=8+\lceil \frac{22}4 \rceil-3=11.$$


$m(38)=15$ and the solution $$S=\{0,1,2,3,4,9,14,19,24,29,31,34,35,37,38\}$$ can be shifted to give this solution with $d=4$

$$S=\{0, 1, 2, 3, 7, 11, 15, 19, 23, 27, 31, 35, 36, 37, 38\}$$

and also this one with $d=5$

$$\{0, 1, 2, 3, 4, 9, 14, 19, 24, 29, 34, 35, 36, 37, 38\}.$$ In those cases there is no choice for the middle entries as $\frac{40}{4}-3=7$ and $\frac{40}5-3=5.$


For $n=2d^2-2$ there are optimal symmetric solutions with $|S|=4d-3$. These are the ones listed by Rob Pratt for $n=6,16.$

  • $\{0,1,2,3,7,11,15,19,23,27,28,29,30\}$ works for $d=4.$
  • $\{0,1,2,3,4,9,14,19,24,29,34,39,44,45,46,47,48\}$ works for $d=5.$

One last example: $m(43)=16$ and one solution is $$S=\{0, 1, 3, 4, 5, 8, 14, 20, 26, 29, 35, 38, 39, 40, 42, 43\}$$

A symmetric solution is $$S =\{0, 1, 2, 3, 4, 9, 14, 19, 24, 29, 34, 39, 40, 41, 42, 43\}.$$