What would base $1$ be?

You're exactly right that such a system is represented by repeating an arbitrarily chosen tally mark. It is known as the unary numeral system (Wikipedia entry):

The unary numeral system is the bijective base-1 numeral system. It is the simplest numeral system to represent natural numbers: in order to represent a number N, an arbitrarily chosen symbol representing 1 is repeated N times. This system is used in tallying. For example, using the tally mark |, the number 6 is represented as ||||||.

...

There is no explicit symbol representing zero in unary as there is in other traditional bases, so unary is a bijective numeration system with a single digit. If there were a 'zero' symbol, unary would effectively be a binary system. [boldface mine] In a true unary system there is no way to explicitly represent none of something, though simply making no marks represents it implicitly. Even in advanced tallying systems like Roman numerals, there is no zero character; instead the Latin word for "nothing," nullae, is used.


I would like to expand on Trevor Wilson's answer. Base-$b$ representation of integers is rooted in the fact that, for any non-negative integer $n$, there is a unique representation of $n$ in the form $$n = \sum_{i=0}^\infty a_ib^i$$ where $0 \le a_i < b$. For example, when $b$ is 3, and $n$ is 47, the unique solution has $a_0 = 2, a_1 = 0, a_2 = 2, a_3 = 1, $ and $a_i = 0$ for all $i>3$. The $a_i$ are called the "base-$b$ digits of $n$"; in our example the base-3 digits of 47 are 1202. We say that the sequence of digits is a numeral, and that it represents the number $n$.
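The digits $a_i$ can be computed by repeated division with remainder. Here is a minimal Python sketch; the function name `to_base` is an illustrative choice, and it assumes $n \ge 0$ and $b \ge 2$:

```python
def to_base(n, b):
    """Return the base-b digits of n, least significant first: [a_0, a_1, ...]."""
    if n < 0 or b < 2:
        raise ValueError("this sketch assumes n >= 0 and b >= 2")
    digits = []
    while n > 0:
        n, r = divmod(n, b)  # r is the next digit a_i, with 0 <= r < b
        digits.append(r)
    return digits  # empty list for n == 0, i.e. every a_i is zero


digits = to_base(47, 3)
print(digits)                                     # [2, 0, 2, 1]
print("".join(str(d) for d in reversed(digits)))  # 1202
```

The list is least significant digit first, matching the indices $a_0, a_1, \ldots$; reversing it gives the usual written numeral.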

The uniqueness property means that each $n$ has exactly one base-$b$ representation. If one requires that the sequence of $a_i$ is eventually zero (that is, that $a_i = 0$ for all sufficiently large $i$) then the converse holds also: each sequence of digits corresponds to exactly one $n$. In fact there are four properties that hold:

  1. Each $n$ has at least one representation
  2. Each $n$ has no more than one representation
  3. Each representation corresponds to at least one $n$
  4. Each representation corresponds to no more than one $n$

It is quite possible to construct representations that lack some of these properties. For example, consider the base-3 representation, but drop the restriction that says that $0\le a_i < 3$. Then property 2 fails: The number 47 has many base-3 representations: 502, for example, or 362, or 1 12 2 (here $a_1 = 12$), or even one (harder to write) where $a_0 = 47$. Each sequence of digits still represents a single $n$, but a particular $n$ might have many representations as a sequence of digits. Sometimes such representations even have some use.
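To check these alternative numerals, one can evaluate $\sum a_ib^i$ directly. A small sketch, assuming the digits are given as a list, most significant first (the helper name `value` is hypothetical):

```python
def value(digits_msd_first, b):
    """Evaluate sum(a_i * b**i) by Horner's rule; digits are most significant first."""
    total = 0
    for a in digits_msd_first:
        total = total * b + a
    return total


# The standard numeral 1202 and the unrestricted ones from the text.
for numeral in ([1, 2, 0, 2], [5, 0, 2], [3, 6, 2], [1, 12, 2], [47]):
    print(numeral, "->", value(numeral, 3))  # each line ends in 47
```

Every digit sequence still evaluates to a single number (property 4), but five different sequences evaluate to 47 (property 2 fails).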

Some of these properties are more important than others. Property 4, for example, is crucial, because if it doesn't hold, then there is some sequence of digits that might represent two different numbers, and when you see it you don't know what number is being represented. Such a system can't really be called a system for representing numbers.

Similarly, a system which fails to have property 1 has limited usefulness. Such a system can represent some $n$, but not all.

Depending on where and how it fails, a representation might be more or less useful. Fraction notation, for example, is universally used to represent rational numbers. But it fails to have properties 2 and 3! (It fails 2 since each rational number has many representations, say as $\frac12, \frac24, $ or $\frac{120}{240}$. And it fails 3 since $\frac10$ and $\frac00$ do not represent any rational numbers.) But these failures don't prevent it from being useful as a representation of rational numbers. A more serious failure arises if you try to make fractions represent real numbers; then property 1 fails, since there is no fraction representation for the number $\pi$ or $\sqrt2$.
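The two failures of fraction notation can be seen with Python's standard `fractions` module. This is only an illustration; the wrapper `as_rational` is a hypothetical helper introduced here:

```python
from fractions import Fraction


def as_rational(p, q):
    """Return the rational number written p/q, or None if p/q represents nothing."""
    try:
        return Fraction(p, q)
    except ZeroDivisionError:
        return None


# Property 2 fails: several notations, one number.
print(as_rational(1, 2) == as_rational(2, 4) == as_rational(120, 240))  # True

# Property 3 fails: some notations represent no rational number.
print(as_rational(1, 0), as_rational(0, 0))  # None None
```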

Now let's return to $$n = \sum_{i=0}^\infty a_ib^i.$$ I said that this representation of non-negative integers has all four properties, but I left out an important limitation: the four properties only hold for $b\ge 2$. If $b=1$, the restriction $0\le a_i<b$ degenerates to $a_i=0$, and we can no longer represent any number except 0. So only 0 has a base-1 representation. As a numeral system, this is completely useless.

If we drop the $0\le a_i<b$ restriction, we get something that hardly resembles a system of representation at all: each number $n$ now has many base-1 representations. For example, one could write 5 as 14, or 32, or 1121.
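Since $1^i = 1$ for every $i$, the sum $\sum a_i 1^i$ is just the sum of the digits, so the positions carry no information. A short sketch (the name `value_base1` is made up for this example):

```python
def value_base1(digits):
    """Evaluate sum(a_i * 1**i); digit order is irrelevant because 1**i == 1."""
    return sum(a * 1**i for i, a in enumerate(reversed(digits)))


for numeral in ([1, 4], [3, 2], [1, 1, 2, 1], [5], [0, 5, 0]):
    print(numeral, "->", value_base1(numeral))  # each line ends in 5
```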

So, although it is inconsistent with the meaning of "base-$b$" for $b\ge 2$, mathematicians, and especially computer scientists, adopt a different meaning for "base-$1$ representation". They abandon $\sum a_ib^i$ completely and agree to represent the number $n$ as a sequence of exactly $n$ ones. For example, $7$ is represented as 1111111. This restores properties 1–4, so it is a sensible representation.
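This convention is easy to state in code. A minimal sketch, assuming the tally symbol is the character `1` and that zero is the empty string (the function names are illustrative):

```python
def to_unary(n):
    """Represent the non-negative integer n as a string of exactly n ones."""
    if n < 0:
        raise ValueError("unary represents only non-negative integers")
    return "1" * n


def from_unary(s):
    """Return the number represented by a string consisting only of ones."""
    if any(ch != "1" for ch in s):
        raise ValueError("not a unary numeral")
    return len(s)


print(to_unary(7))            # 1111111
print(from_unary("1111111"))  # 7
```

Encoding and then decoding returns the original number, and decoding and then encoding returns the original string, which is what properties 1 through 4 amount to together.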


Yes, the usual answer is that numbers are represented by "tally marks" in base $1$. However, the numeral $0$ might not be the best choice of a tally mark because if $00000$ were interpreted in base $1$ analogously to its interpretation in other bases, then it would be interpreted as $0 \cdot 1^4 + 0\cdot 1^3 + 0 \cdot 1^2 + 0 \cdot 1^1 + 0 \cdot 1^0$, which is $0$ rather than $5$.
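The difference between the two readings can be checked directly. A small sketch, with `positional_value` as a hypothetical helper that reads a digit string positionally in base $b$:

```python
def positional_value(numeral, b):
    """Read a string of decimal digit characters positionally in base b."""
    k = len(numeral)
    return sum(int(ch) * b ** (k - 1 - i) for i, ch in enumerate(numeral))


print(positional_value("00000", 1))  # 0: the positional reading
print(len("00000"))                  # 5: the tally reading
print(positional_value("11111", 1))  # 5: with the mark 1, both readings agree
```

With $1$ as the tally mark the positional sum happens to give the tally count, although the digit $1$ is outside the usual range $0 \le a_i < b$ when $b = 1$.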