The behavior of a certain greedy algorithm for Erdős Discrepancy Problem

Update 2: Original answer below. I've put together graphs showing more than just champions, using every $N\leq 10^4$ and also every $N\equiv 0\pmod{100}$ up to $10^5$. This is for the original version, not the variant, but I'd expect that to be essentially the same. I am starting to be somewhat skeptical of my $1/3$ estimate. I'll collect some data further out, but it'll be less complete since it's more costly to collect.

Here's the graphs of $D(N)$ versus $N$. The added curves are $N^{1/3}$ and $\log N$. enter image description here enter image description here

Here's the graph of $\log D/\log N$ versus $N$. The horizontal line is at $1/3$. (Note the different scales.) enter image description here enter image description here

Original answer: I have some basic numerical observations. I hacked together some code in c++ to work on this, and would be happy to collect more focused data or to share the code. Also, whenever there was a choice of whether to assign the value $+1$ or $-1$ to $f(p)$, I chose $-1$ for consistency of output. This yields a well-defined function $f_N$ for each $N$. Let $D(N)$ denote the discrepancy of $f_N$ up to $N$.

(I've also looked at choosing $f(p)=+1$ if it's undetermined, and at $f(p)=\pm 1$ according to whether $p\equiv 1,3\pmod{4}$. In these cases, the data is essentially the same as below.)

  1. $D(N)$ is roughly increasing, but is not monotonic. Its champion values for $N\leq10000$ are, in the form $(N,D(N))$, $(1,1)$,$(10,2)$,$(24,3)$,$(70,5)$,$(91,6)$,$(391,7)$,$(553,8)$,$(668,9)$,$(961,10)$,$(1235,11)$,$(1265,13)$,$(2561,14)$,$(2604,17)$,$(6275,18)$,$(6276,19),\dots$. This growth is more than logarithmic, and is probably polynomial -- $\log D/\log N$ hovers pretty close to $1/3$ for each of these points, so that may be the answer for the $\Omega$ result.

  2. It appears that the functions $f_N$ may converge as $N\to\infty$, but I'm not sure of this and need more data. Let $l(N)$ denote the least prime $p$ for which the value of $f_N(p)$ is undetermined. Certainly $l(N)\geq 5$ once $N\geq 4$, and while there is a great deal of fluctuation, it appears that maybe $l(N)\geq 7$ once $N\geq40500$; I will be seeing if this is (numerically) true.

I hope to update this answer once I have more data.

Update: For the variation where we only look at the values of $f_N$ on squarefree integers, the behavior appears to be the same.

$(N,D(N),\log D/\log N)$:

  • $(1,1,\text{NaN})$
  • $(30,2,0.2037950471)$
  • $(42,3,0.2939297479)$
  • $(77,4,0.3191428313)$
  • $(190,5,0.3067334722)$
  • $(238,6,0.3274252273)$
  • $(319,8,0.3606890916)$
  • $(939,9,0.3210056698)$
  • $(1358,10,0.3191931033)$
  • $(1461,11,0.3290703914)$
  • $(2185,13,0.3335707591)$
  • $(2769,14,0.3329519195)$
  • $(3354,15,0.3335896252)$
  • $(3689,17,0.3449622741)$

Here is a plot of rlo's data (as listed in his/her item (1)):
          Plot D(N)


And here is the $\log D(N) / \log N$ ratio he mentioned in the comment (where the constant red line is $\frac{1}{3}$):
          Plot LogD/LogN