Why do upper case letters come before lower case letters in the ASCII table?

Basically, when sorting strings you want 'a' to come before 'b', and the character code of 'a' is smaller than that of 'b'.

The same applies to uppercase: 'A' comes before 'a'.

That way you can easily sort 'Anthony' before 'ant', just by comparing character codes, even though lowercase 'anthony' would normally appear just after 'ant' due to length.

Sorting strings would have been much more complex if uppercase letters had larger character codes than lowercase ones.
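A minimal Python sketch of that comparison, using the words from the example above. Python's default string ordering compares code points, which for ASCII-only text is the same as comparing ASCII codes; the variable names are just for illustration.

```python
words = ["ant", "anthony", "Anthony"]

# sorted() compares strings code point by code point, so for these
# ASCII-only words it is a plain character-code comparison.
by_code = sorted(words)
print(by_code)  # ['Anthony', 'ant', 'anthony']

# 'A' has a smaller code than 'a', which is why 'Anthony' sorts first.
upper_code, lower_code = ord("A"), ord("a")
print(upper_code, lower_code)  # 65 97
```

Note that 'ant' lands before 'anthony' only because it is a prefix of it; the case ordering comes purely from the codes.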

As 'Slaks' mentioned, however, Unicode makes it more complicated, in that you have characters such as ȦAÁÀÂÄĀĂǍÃȂ, most of which have Unicode code points larger than that of 'a' but are generally expected to sort before 'a'.
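To illustrate the problem, here is a rough Python sketch. The helper `rough_sort_key` is hypothetical and only strips accents and case before comparing; it is not a real collation algorithm, and proper sorting would need locale-aware rules such as the Unicode Collation Algorithm.

```python
import unicodedata

def rough_sort_key(text):
    """Hypothetical helper: ignore accents and case, then compare code points."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # The original text is a tie-breaker so the ordering stays deterministic.
    return (base.casefold(), text)

# "\u00c1" is the precomposed letter A with acute accent (code point 193).
words = ["zebra", "\u00c1nt", "apple", "Ant"]

by_code_point = sorted(words)
by_rough_key = sorted(words, key=rough_sort_key)

print(by_code_point)  # ['Ant', 'apple', 'zebra', 'Ánt']
print(by_rough_key)   # ['Ant', 'Ánt', 'apple', 'zebra']
```

With plain code point ordering the accented word ends up after 'zebra'; with the rough key it sits next to 'Ant', which is closer to what a reader would expect.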


I'm only guessing, but I imagine it's because the earliest character sets had no lowercase at all. The Baudot telegraph code was only 5 bits, and CDC mainframes natively used a 6-bit code; there was no room for lowercase. When ASCII was developed as a 7-bit code which finally had enough room for lowercase letters, they were considered something of a luxury add-on, so it made sense to put them in the back half of the set.

Of course, we can dive a little deeper and ask why that attitude exists; historically, upper case came first and was the only shape the letters had for centuries or even millennia before the idea of case distinction was invented. For most folks literate in a language that uses the Latin alphabet, uppercase is the base form; you learn it first, the archetype of each letter is the capital shape, etc.

But it's worth noting that this ordering is nonetheless specific to ASCII, and not necessarily true of other character sets; for example, EBCDIC has the lowercase letters first. Commodore microcomputers could switch between two character sets, and even though both were based on ASCII, the one with lowercase letters had them first. (The other set had extra graphic characters in place of lowercase.)
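The EBCDIC ordering can be checked with a short Python sketch. It assumes the `cp037` codec, which is one EBCDIC variant among several that ship with Python's standard library; other EBCDIC code pages may differ in detail.

```python
# Encode the same letters with an EBCDIC code page and inspect the byte values.
lower_a = "a".encode("cp037")[0]
upper_a = "A".encode("cp037")[0]
print(hex(lower_a), hex(upper_a))  # 0x81 0xc1

# Sorting by the EBCDIC bytes therefore puts lowercase words first.
ebcdic_order = sorted(["Anthony", "ant"], key=lambda s: s.encode("cp037"))
print(ebcdic_order)  # ['ant', 'Anthony']
```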

Unicode has taken its cue from ASCII (and the extended-Latin character sets based on it), so most of the alphabets that have case distinctions have the uppercase versions come first within their code blocks. But there are exceptions, and of course many alphabets don't have case distinctions at all, while others have more complicated relationships than our simple 1-to-1 mapping.
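A few of these cases can be inspected from Python. This is only a handful of hand-picked examples under the assumption that Python's built-in case mapping reflects the Unicode data; it is not a survey of every cased script.

```python
# Greek and Cyrillic follow the ASCII pattern: uppercase has the smaller code point.
greek_upper, greek_lower = ord("\u03a9"), ord("\u03c9")        # Omega / omega
cyrillic_upper, cyrillic_lower = ord("\u042f"), ord("\u044f")  # Ya / ya
print(hex(greek_upper), hex(greek_lower))        # 0x3a9 0x3c9
print(hex(cyrillic_upper), hex(cyrillic_lower))  # 0x42f 0x44f

# One pair where the lowercase letter has the smaller code point:
# y with diaeresis is U+00FF, and its uppercase form is U+0178.
y_lower = "\u00ff"
y_upper = y_lower.upper()
print(hex(ord(y_lower)), hex(ord(y_upper)))  # 0xff 0x178

# A mapping that is not 1-to-1: German sharp s uppercases to two letters.
sharp_s_upper = "\u00df".upper()
print(sharp_s_upper)  # SS
```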


To make sure that lowercase letters don't come before uppercase letters when sorting text.

In the modern Unicode era, sorting text is far more complicated, but 20 years ago, you could sort text by ASCII values.