Average digit sum in different bases

Numerics suggest that $S(n) \sim Cn^2$ for some constant $C$ between $0.175$ and $0.18$.

Note that the bases $\frac n2<b\le n$ are easy to calculate: we get a sequence of two-digit numbers of the form $1x$, where $x$ runs from $\frac n2$ or so to $0$ (covering all integers in between). The sum of all these digits is asymptotic to $\frac12(\frac n2)^2$.

In the next range $\frac n3<b\le \frac n2$, we get a sequence of two-digit numbers of the form $2x$, where $x$ hits every other integer between about $\frac n3$ and $0$. The sum of all these digits is asymptotic to $\frac22(\frac n6)^2$.

Continuing to look at these ranges, we get a sequence of contributions of the form $\frac k2(\frac n{k(k+1)})^2$ from the bases $\frac n{k+1}<b\le\frac nk$ (at least until $k$ is about $\sqrt n$ or so). And one can calculate that $$ \sum_{k=1}^\infty \frac k2\bigg(\frac 1{k(k+1)}\bigg)^2 = 1-\frac{\pi^2}{12} = 0.17753... $$
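For reference, the value of the series can be checked by hand. The following is only a sketch of one route (partial fractions plus the standard value $\sum_{m\ge1} m^{-2} = \frac{\pi^2}{6}$), not necessarily the calculation originally intended:

$$\begin{align} \frac k2\bigg(\frac 1{k(k+1)}\bigg)^2 &= \frac{1}{2k(k+1)^2} = \frac12\bigg(\frac1k-\frac1{k+1}-\frac1{(k+1)^2}\bigg), \\ \sum_{k=1}^\infty \bigg(\frac1k-\frac1{k+1}\bigg) &= 1, \qquad \sum_{k=1}^\infty \frac1{(k+1)^2} = \frac{\pi^2}{6}-1, \\ \sum_{k=1}^\infty \frac k2\bigg(\frac 1{k(k+1)}\bigg)^2 &= \frac12\bigg(1-\Big(\frac{\pi^2}{6}-1\Big)\bigg) = 1-\frac{\pi^2}{12}. \end{align}$$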

So I'm guessing that one can prove in this way that $S(n) \sim (1-\frac{\pi^2}{12})n^2$.
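A quick way to test this guess numerically is sketched below. It assumes that $S(n)$ means the sum of the base-$b$ digit sums of $n$ over all bases $2\le b\le n$; the function names `digit_sum` and `S` are illustrative only.

```python
import math

def digit_sum(n, b):
    """Sum of the base-b digits of n."""
    total = 0
    while n > 0:
        n, r = divmod(n, b)
        total += r
    return total

def S(n):
    """Assumed definition: digit sums of n summed over all bases 2 <= b <= n."""
    return sum(digit_sum(n, b) for b in range(2, n + 1))

C = 1 - math.pi ** 2 / 12  # conjectured constant, about 0.17753

# Print n, the ratio S(n)/n^2, and the conjectured limit side by side.
for n in (100, 1000, 10000):
    print(n, S(n) / n ** 2, C)
```

The ratio should drift toward $C$ as $n$ grows; the heuristic above suggests the discrepancy is of lower order than $n^2$, so convergence is fairly slow.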

(Numerically $S_w(n)$ seems to have size $n$, perhaps asymptotic to $Dn$ with some constant $D$ between $0.43$ and $0.44$. A similar heuristic doesn't give such a formula, however, because the large bases don't dominate the sum to the same extent as for $S(n)$.)


The observation by Greg Martin is indeed correct. I worked with these expressions in my bachelor thesis, which is freely available here: Digit sums. Proposition 2.10, page 12 states that $$\sum_{2\leq b\leq n}S_b(n)\sim (1-\frac{\pi^2}{12})n^2$$ as $n \to \infty$.

Relationship between the two sums

We have
$$\sum_{2\leq b\leq n}\frac{S_b(n)}{b}=\frac{1}{n}\sum_{2\leq b\leq n}S_b(n)+\int_{1}^{n}\frac{\sum \limits_{2 \leq b \leq x}S_b(n)}{x^2}dx$$

Proof:

Fix a positive integer $n_0$ and consider the sum $\sum \limits_{2\leq b\leq n}\frac{S_b(n_0)}{b}$. Let $(a_k)_{k=1,2,\ldots}$ be the sequence defined by $a_1=0$ and $a_t=S_t(n_0)$ for $t\geq 2$, and let $\phi:t\mapsto\frac{1}{t}$. Then the sum becomes $\sum \limits_{k=1}^{n}a_k\phi(k)$, which by Abel's summation formula equals $$\phi(n)\sum_{k=1}^na_k -\int_{1}^{n}\phi'(x)\sum_{k\leq x}a_k \;\;dx$$ or equivalently, since $\phi'(x)=-\frac{1}{x^2}$, $$\frac{1}{n}\sum_{2\leq b\leq n}S_b(n_0)+\int_{1}^{n}\frac{\sum \limits_{2 \leq b \leq x}S_b(n_0)}{x^2}dx$$ Setting $n_0$ equal to $n$ completes the proof.
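The identity can also be checked exactly for small $n$. The sketch below uses rational arithmetic and the fact that the partial sum $\sum_{k\le x}a_k$ is constant on each interval $[k,k+1)$, so the integral reduces to a finite sum; the helper names `digit_sum` and `abel_sides` are illustrative only.

```python
from fractions import Fraction

def digit_sum(n, b):
    """Sum of the base-b digits of n."""
    total = 0
    while n > 0:
        n, r = divmod(n, b)
        total += r
    return total

def abel_sides(n):
    """Return both sides of the identity for this n as exact fractions."""
    # a[k] = S_k(n) for k >= 2, with a[1] = 0; a[0] is an unused placeholder.
    a = [0, 0] + [digit_sum(n, b) for b in range(2, n + 1)]
    # A[k] = a[1] + ... + a[k]
    A = [0] * (n + 1)
    for k in range(1, n + 1):
        A[k] = A[k - 1] + a[k]
    lhs = sum(Fraction(a[k], k) for k in range(1, n + 1))
    boundary = Fraction(A[n], n)
    # On [k, k+1) the partial sum equals A[k], and the integral of
    # 1/x^2 over [k, k+1] is 1/k - 1/(k+1).
    integral = sum(A[k] * (Fraction(1, k) - Fraction(1, k + 1))
                   for k in range(1, n))
    return lhs, boundary + integral

for n in (2, 10, 57, 200):
    lhs, rhs = abel_sides(n)
    print(n, lhs == rhs)
```

Using `Fraction` avoids floating-point noise, so the two sides can be compared with `==`.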

Asymptotic formula for the second sum

Proposition 4.4, page 27 states that $$\int_{1}^{n}\frac{\sum \limits_{p \leq x}S_p(n)}{x^2}dx \sim (\frac{\pi^2}{12}-\gamma)\frac{n}{\log(n)},$$ where $\gamma$ is the Euler-Mascheroni constant. Here, the summation is taken over the primes $p$. If you modify the deduction of this formula so that the sum runs over all bases $b$ instead of only primes, you could conclude that $$\int_{1}^{n}\frac{\sum \limits_{2 \leq b \leq x}S_b(n)}{x^2}dx \sim (\frac{\pi^2}{12}-\gamma)n$$ Using the relationship obtained above between the two sums, along with these two asymptotic formulae, we get: \begin{align} \sum_{2\leq b\leq n}\frac{S_b(n)}{b}\;=&\;\frac{1}{n}\sum_{2\leq b\leq n}S_b(n)+\int_{1}^{n}\frac{\sum \limits_{2 \leq b \leq x}S_b(n)}{x^2}dx \\ \sim \; &(1-\frac{\pi^2}{12})n+(\frac{\pi^2}{12}-\gamma)n \\[2mm] = \; &(1-\gamma)n \\[2mm] =&\; n \cdot 0.42278433\ldots \end{align} which closely matches the numerics reported by Greg Martin.
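A numerical sanity check of the constant $1-\gamma$ might look like the following sketch. The name `S_w` for the weighted sum $\sum_{2\le b\le n}S_b(n)/b$ is borrowed from the answer above, and `EULER_GAMMA` is a hard-coded value of $\gamma$, since Python's `math` module does not provide it.

```python
EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, hard-coded

def digit_sum(n, b):
    """Sum of the base-b digits of n."""
    total = 0
    while n > 0:
        n, r = divmod(n, b)
        total += r
    return total

def S_w(n):
    """Weighted sum: digit sum of n in base b divided by b, over 2 <= b <= n."""
    return sum(digit_sum(n, b) / b for b in range(2, n + 1))

# Print n, the ratio S_w(n)/n, and the predicted limit 1 - gamma.
for n in (1000, 10000, 100000):
    print(n, S_w(n) / n, 1 - EULER_GAMMA)
```

Because the small bases contribute a noticeable lower-order term, the ratio can be expected to approach $1-\gamma$ more slowly than $S(n)/n^2$ approaches its limit.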

What if we sum over only primes p?

Proposition 2.12 (p. 14), Proposition 4.4 (p. 27) and the formulae on page 25 show that

\begin{align}\sum_{p \leq n}S_p(n)\; =\; &(1-\frac{\pi^2}{12})\frac{n^2}{\log(n)}+C\frac{n^2}{\log^2(n)}+o(\frac{n^2}{\log^2(n)}) \\ \sum_{p\leq n}\frac{S_p(n)}{p}\sim& \;(1-\gamma)\frac{n}{\log(n)}, \end{align}

where $C=0.119\ldots$
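To experiment with the prime version, one could compute the sum over primes directly, as in the sketch below. The helpers `primes_up_to` and `S_primes` are illustrative names, not taken from the thesis. Since the second-order term is only a factor $\log(n)$ smaller than the main term, the printed ratio should not be expected to sit close to $1-\frac{\pi^2}{12}$ for moderate $n$.

```python
import math

def digit_sum(n, b):
    """Sum of the base-b digits of n."""
    total = 0
    while n > 0:
        n, r = divmod(n, b)
        total += r
    return total

def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]

def S_primes(n):
    """Digit sums of n summed over prime bases p <= n."""
    return sum(digit_sum(n, p) for p in primes_up_to(n))

# Print n, the ratio to the main term n^2/log(n), and the leading constant.
for n in (1000, 10000, 100000):
    print(n, S_primes(n) * math.log(n) / n ** 2, 1 - math.pi ** 2 / 12)
```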


Check out the closely related article by L. E. Bush in the 1940 American Mathematical Monthly. Bush shows that, for a fixed base $r$, the average of the base-$r$ digit sums of the integers $n < N$ is asymptotic to: $$ \frac{(r-1) \log N}{2 \log r}. $$
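Bush's formula is easy to compare against a direct computation. The sketch below assumes the average is taken over $0\le n<N$ (whether $0$ is included does not affect the asymptotics); the function names are illustrative only. When $N$ is a power of $r$, the average is exactly $\frac{(r-1)\log N}{2\log r}$, because each of the $\log_r N$ digit positions takes every value $0,\ldots,r-1$ equally often.

```python
import math

def digit_sum(n, b):
    """Sum of the base-b digits of n."""
    total = 0
    while n > 0:
        n, r = divmod(n, b)
        total += r
    return total

def average_digit_sum(r, N):
    """Average base-r digit sum over the integers 0 <= n < N (assumed range)."""
    return sum(digit_sum(n, r) for n in range(N)) / N

def bush_estimate(r, N):
    """Bush's asymptotic value (r - 1) log N / (2 log r)."""
    return (r - 1) * math.log(N) / (2 * math.log(r))

# Print base, N, the empirical average, and Bush's estimate.
for r, N in ((10, 1000), (2, 1024), (7, 5000)):
    print(r, N, average_digit_sum(r, N), bush_estimate(r, N))
```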