Scaling a set of reals to be nearly integers

Brute force is maybe not that brutal. Of course I don't know what $\epsilon$ and $|R|$ and $\max{R}$ you want for your application and all those things could affect the performance. However, since you have an application, it is worth checking the performance of a simple program to see if solvers (which are great) need to be hauled in.

If you were asking for the multiplier $s$ to be an integer you have the venerable question of simultaneous Diophantine approximation which is well researched and much harder for several $x$ than for one. Although I suppose that is more a question not of a fixed $\epsilon$ but rather of finding $s$ with exceptionally good corresponding $\epsilon$ relative to the size of $s.$ Adding a largish integer like $L=1000$ to your set restricts to $s$ within $\frac{\epsilon}L$ of an integer.

Here is a naive analysis followed by an actual example. The chance that a random real multiplier $s$ is good for a given $x$ is $1/{2\epsilon}$ and the set of good multipliers is the union of intervals $\bigcup_k[\frac{k-\epsilon}x,\frac{k+\epsilon}x].$ Given an upper bound $N$ on the multiplier one needs to consider $1 \le k \le Nx.$ I thought about what is a reasonable upper bound but didn't come to a conclusion. It is easy enough to repeat what I propose for $[N,2N]$ if $[1,N]$ does yield a satisfactory $s.$

Once the list of intervals is found (which is easy) for $x_1$ and $x_2,$ the non-empty intersections are intervals of various sizes which can then be intersected with the collection of intervals for $x_3$ etc. Based on the example below the candidates might quickly restrict to only a few intervals. Let me shift to the example.

I considered the same $10$ reals $\{{\sqrt {2},\sqrt {3},\sqrt {5},\sqrt {7},{\rm e},\pi,\sqrt {11},\sqrt {13},\sqrt {17},\sqrt {19}\}}$ as Joro and used $\epsilon=0.08.$ I thought that order might be best since by the time one gets to larger $x$ with many intervals, perhaps only a few will fall into regions which are still feasible. I took $N=2000$ since I know that $s=1696.29460245$ works well.

  • For $x_1=\sqrt{2}$ there are, of course, $2827$ intervals with combined length $319.8.$
  • Combining in the $3464$ intervals for $\sqrt{3}$ and intersecting yields $1006$ viable intervals with combined length $51.14 \approx 2000(0.16)^2$

  • This falls to $274$ intervals with combined length $8.12$ once $\sqrt{5}$ is taken into account. Then $65$ with combined length $1.27$ with $\sqrt{7}.$ Note that $2000(0.16)^4=1.31.$

  • Once $e$ is used, there are only $14$ intervals with combined length $0.158.$ This seems not as close as might be expected to $2000(0.16)^5=0.209.$ Here are the actual intervals:

$ [53.683547, 53.701192], [273.67287, 273.67652], [317.49646, 317.50939], $

$[371.19004, 371.19135], [815.23913, 815.25027], [827.33399, 827.37150],$

$ [881.04183, 881.04656], [924.86894, 924.87349], [1144.8394, 1144.8624],$

$ [1198.5330, 1198.5556], [1642.6034, 1642.6077], [1696.2926, 1696.3013],$

$[1752.2131, 1752.2186], [1969.9650, 1969.9653] $

This limits the number of values which need to be considered for the remaining $x$ values.

  • Moving on to $\pi$ leaves only one interval $[1696.29259952403, 1696.29884826432]$ of width $0.00624874029$ which is certainly less that $2000(0.16)^6=0.03355.$ That entire interval survives for $\sqrt{11}.$

  • It shrinks to $[1696.29259952403, 1696.29538806460]$ with width $0.002788$ once $\sqrt{13}$ is considered.

  • I'm not sure why, but that interval survives the last two $x$ values $\sqrt{17},\sqrt{19}.$

I guess the value $1696.29259 $ given by the sagemath solver was giving the left endpoint. That is actually a bit out of range, probably not enough digits. The midpoint $1696.29399$ would be a good choice but closer examination shows that, $\sqrt{2}$ and $\sqrt{13}$ are the worst approximated. A little jostling yields that $s=1696.29460245$ gives $s\sqrt{2}=2399-0.077167435$ and $s\sqrt{13}=6116.077167426.$ The other $8$ values of $sx$ are within $0.0018 $ to $0.0684$ of integers.

I'm not sure if this example is typical. It may be that the $N(2\epsilon)^j$ heuristic fails to be accurate once its value is sufficiently smaller than $\frac{\epsilon}{x}.$ Perhaps each of the few short surviving are likely to be eliminated or survive intact with probabilities nearly $1-2\epsilon$ and $2\epsilon$ leaving very little chance fairly to survive but be shortened.

LATER To answer a question from joro (see below): That is correct. The feasible range is about $[37000.0693616565, 37000.0693689711]$ with length $7.3 \cdot 10^{-6}.$ It would be more dramatic (probably) if you replaced the fourth constant with $\sqrt{3}+10^{-3}\sqrt{5}$ or something else so that the four constants were independent as vectors over $\mathbb{Q}$.

My naive kludged 3 line Maple program actually took $15$ minutes to process the first two constants and then something went wrong. From what I observed (which might have been expected) I did the speedup mentioned here: We can see from the first two constants that (within each interval of feasible multipliers) $10^{-3}s$ is within $2\cdot 10^{-3}$ of an integer. That means that if a further subinterval of one of those works well for $\sqrt{3}$ it will have a fairly good chance of having something that works for $\sqrt{3}+10^{-3}$. At any rate, I only used $k$ of the form $\lfloor 1000j\sqrt{2}\rceil$ for $j \leq 100$ to sieve for $\sqrt{2}$ (I think it would have been safe to go up to $j=499$). The $100$ intervals of course, each have length $\frac{2\epsilon}{\sqrt{2}}=0.00014142.$ Of them, $56$ survive sieving for $\sqrt{2}+10^{-3}$ in part. They have lengths ranging from $.000138$ down to $8\cdot 10^{-7}.$ A relevant few are

$[32999.9663242965, 32999.9664419055], [33999.8154196000, 33999.8154305033], $

$[37000.0693616565, 37000.0694539665], $

$[37999.9184079485, 37999.9184916756], [41000.1724228288, 41000.1724423150]$

Aside: That was actually not a very dramatic reduction in the number of intervals. It turns out that sieving for $\sqrt{3}$ immediately after $\sqrt{2}$ immediately cuts the $100$ intervals down to just $[37000.06936165, 37000.0694090].$ I guess that makes sense since the optimization for $\sqrt{2}+10^{-3}$ was already baked in. But anyway:

Out of all $56$, only the middle one shown, with length $7.9\cdot 10^{-5},$ survives $\sqrt{3}$ in part with the same left endpoint but length $4.7\cdot 10^{-5}.$ (aside: which is exactly what showed up in the aside above.) That one interval still survives $\sqrt{3}+10^{-3}$ in part as $[37000.0693616565, 37000.0693689711]$ with length $7.3 \cdot 10^{-6}.$

I am sure the more sophisticated procedures mentioned (which this speedup imitates in part) are better than the naive approach for sufficiently small $\epsilon.$ I just wanted to comment that the search space is simple in structure and narrows down quickly.


Let $R = \{x_1,\dots,x_n\}$. An integer relation between $x_j$ and $-1$ is a pair $(a_j,a_{j*}) \in \mathbb{Z}^2$ satisfying $a_j x_j = a_{j*}$. Suppose there are such integer relations for all $j \in [n]$ and let $a_0 := \text{lcm}_j a_j$ and $b_j := a_0/a_j$: then $(a_0,a_{j*}b_j)$ is also such an integer relation. In particular, $a_0 x_j = a_{j*} b_j \in \mathbb{Z}$ for all $j \in [n]$.

Now of course $a_0$ is not a priori the same as your desired $s$. However, PSLQ will produce integer relations of minimal norm in the relevant case of dimension 2 (see, e.g., p.2 of "Analysis of PSLQ, an integer relation finding algorithm'') and I would guess (hope?) that you can turn this into an optimal construction by building your $\epsilon$ into $R$ via suitable rational approximations.


Even though you asked for the optimal solution and others have already answered in this regard, I think it's still worth mentioning the following efficient approach to getting "good" solutions which may not be optimal:

Let me first mention a different but related problem: given $(\xi_1,\ldots,\xi_r) \in \mathbb{R}^r$ (all irrational, say), how can we simultaneously approximate the $\xi_i$ by rationals $p_i/q$ having the same denominator $q$? I.e., how can we scale $(\xi_1,\ldots,\xi_r)$ by a $q\in\mathbb{N}_{>0}$ so that $|q\xi_i - p_i|$ are small without $q$ being too large?

Dirichlet tells us that there exist $q$ arbitrarily large so that $|q\xi_i - p_i| \leq q^{-1/r}$ where $p_i = \lceil q\xi_i \rfloor$ ("closest integer to"); it doesn't help us find them, unfortunately. But here's how we can obtain a not-quite-so-good approximation in an algorithmically convenient way: for $A>0$ real, consider the image of the $\mathbb{Z}$-linear map $\mathbb{Z}^{r+1} \to \mathbb{R}^{r+1}$ taking $(p_1,\ldots,p_r,q)$ to $(A(q\xi_1-p_1),\ldots,A(q\xi_r-p_r),q/A^r)$. This is a lattice in $\mathbb{R}^{r+1}$, which I just described through a matrix, and we are trying to find short nonzero vectors in it: using the LLL algorithm we can find something like $|q\xi_i - p_i| \leq 2^{r/2}/A$ with $q\leq 2^{r/2} A^r$.

As for your original problem, if we are given $(\xi_0,\ldots,\xi_r)$, you can first scale by $\xi_0^{-1}$, say putting $\xi'_i = \xi_i/\xi_0$ so that $\xi'_0 = 1$, forget about this one and apply what I just said to $(\xi'_1,\ldots,\xi'_r)$: this gives you a scale factor $s := q/\xi_0$ such that $s\xi_i \approx p_i$ is close to an integer and $s\xi_0 = q$ is an integer.

Given that the problem of finding the shortest nonzero vector in a lattice is hard (in various practical or conjectural ways), I suspect your problem is algorithmically hard if you insist on getting the optimal solution and if $r$ (your $n-1$) is large. But I thought the LLL algorithm deserved at least a mention.