Regular Expression for City Names

This answer assumes that the letters @Manaysah refers to also include letters with diacritical marks. I've added the apostrophe (') since many place names in Canada and France contain it. I've also added the period (dot) since it's required for abbreviated names such as St. Catharines.

Building upon @UID's answer, I came up with:

^([a-zA-Z\u0080-\u024F]+(?:\. |-| |'))*[a-zA-Z\u0080-\u024F]*$

The list of cities it accepts:

Toronto
St. Catharines
San Francisco
Val-d'Or
Presqu'ile
Niagara on the Lake
Niagara-on-the-Lake
München
toronto
toRonTo
villes du Québec
Provence-Alpes-Côte d'Azur
Île-de-France
Kópavogur
Garðabær
Sauðárkrókur
Þorlákshöfn

And what it rejects:

A----B
------
*******
&&
()
//
\\

I didn't add brackets and other marks, since they fall outside the scope of this question.

I've stayed away from \s for whitespace. Tabs and line feeds aren't part of a city name and, in my opinion, shouldn't be accepted.
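The accept and reject lists above can be checked mechanically. What follows is a minimal sketch in Python, not part of the original answer; it assumes the period is meant as a literal dot and therefore escapes it (an unescaped . matches any character, so a string like "A* B" would slip through). The names CITY, UNESCAPED, accepted and rejected are made up for this illustration.

```python
import re

# Pattern from this answer, with the period escaped (assumption: a literal
# dot followed by a space was intended).
CITY = re.compile(r"^([a-zA-Z\u0080-\u024F]+(?:\. |-| |'))*[a-zA-Z\u0080-\u024F]*$")

# Same pattern with the dot left unescaped, for comparison only.
UNESCAPED = re.compile(r"^([a-zA-Z\u0080-\u024F]+(?:. |-| |'))*[a-zA-Z\u0080-\u024F]*$")

accepted = [
    "Toronto", "St. Catharines", "San Francisco", "Val-d'Or", "Presqu'ile",
    "Niagara on the Lake", "Niagara-on-the-Lake", "München", "toronto",
    "toRonTo", "villes du Québec", "Provence-Alpes-Côte d'Azur",
    "Île-de-France", "Kópavogur", "Garðabær", "Sauðárkrókur", "Þorlákshöfn",
]
rejected = ["A----B", "------", "*******", "&&", "()", "//", "\\\\"]

# fullmatch() requires the whole string to match, which also avoids "$"
# accepting a trailing newline.
for name in accepted:
    assert CITY.fullmatch(name), name
for name in rejected:
    assert CITY.fullmatch(name) is None, name

print(bool(UNESCAPED.fullmatch("A* B")))  # True: "." matched the asterisk
print(bool(CITY.fullmatch("A* B")))       # False
```

Two things worth knowing before relying on it: because both the group and the final letter class are optional, the pattern also matches the empty string and names ending in a separator such as "San-"; and the range \u0080-\u024F contains a few non-letters (for example × and ÷).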


This can be arbitrarily complex, depending on how precise you need the match to be and how much variation you're willing to allow.

Something fairly simple like ^[a-zA-Z]+(?:[\s-][a-zA-Z]+)*$ should work.

Warning: this does not match cities like München. To handle those, you need to adjust the [a-zA-Z] part of the expression and define which characters are allowed for your particular case.
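One way to make the letter set adjustable is to build the pattern from a character-class string. The sketch below is only an illustration in Python: the helper city_pattern and the choice of \u0080-\u024F as the extended range (borrowed from the other answer) are assumptions, not something this answer prescribes.

```python
import re

def city_pattern(letters: str) -> re.Pattern:
    """Build the simple city pattern for a given set of allowed letters.

    `letters` is the inside of a character class, e.g. "a-zA-Z".
    """
    return re.compile(rf"^[{letters}]+(?:[\s-][{letters}]+)*$")

ascii_only = city_pattern(r"a-zA-Z")
# Assumption: Latin letters with diacritics are covered by U+0080..U+024F.
extended = city_pattern(r"a-zA-Z\u0080-\u024F")

print(bool(ascii_only.fullmatch("München")))      # False
print(bool(extended.fullmatch("München")))        # True
print(bool(extended.fullmatch("Île-de-France")))  # True
```

Names containing an apostrophe, such as Val-d'Or, are still rejected here because only whitespace and dashes are allowed as separators.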

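Keep in mind that, as written, it allows exactly one separator between words, so something like San----Francisco, or several consecutive spaces, is rejected; use [\s-]+ if you want to accept repeated separators. Also note that \s matches tabs and line breaks as well as spaces.

It translates to something like: one or more letters, followed by a block consisting of a single whitespace character or dash and one or more letters; this last block can occur zero or more times.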

Weird stuff in there: the ?: bit. If you're not familiar with regexes, it might be confusing, but it simply states that the piece of regex between the parentheses is not a capturing group (I don't want to capture the part it matches to reuse later), so the parentheses are only used to group the expression (and not to capture the match).
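To see what ?: changes in practice, here is a small Python comparison (a sketch added for illustration; the variable names are made up). Both patterns accept the same strings; only the captured groups differ.

```python
import re

# Same expression twice: once with a capturing group, once with (?:...).
capturing = re.compile(r"^[a-zA-Z]+([\s-][a-zA-Z]+)*$")
non_capturing = re.compile(r"^[a-zA-Z]+(?:[\s-][a-zA-Z]+)*$")

name = "San Fran Cisco"

# .groups on a compiled pattern is the number of capturing groups;
# .groups() on a match returns what those groups captured.
print(capturing.groups, capturing.fullmatch(name).groups())          # 1 (' Cisco',)
print(non_capturing.groups, non_capturing.fullmatch(name).groups())  # 0 ()
```

A repeated capturing group keeps only its last iteration, which is why the first line shows ' Cisco' rather than both trailing words.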

"New York" // passes

"San-Francisco" // passes

"San Fran Cisco" // passes (sorry, needed an example with three tokens)

"Chicago" // passes

"  Chicago" // doesn't pass, starts with spaces

"San-" // doesn't pass, ends with a dash
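The examples above can be run as a quick check. This is a minimal Python sketch (the helper is_city_name and the cases table are names invented here); it also probes repeated separators and a tab, since [\s-] matches exactly one character and \s is not limited to spaces.

```python
import re

CITY = re.compile(r"^[a-zA-Z]+(?:[\s-][a-zA-Z]+)*$")

def is_city_name(name: str) -> bool:
    # fullmatch() requires the entire string to match the pattern.
    return CITY.fullmatch(name) is not None

# Expected result for each input.
cases = {
    "New York": True,
    "San-Francisco": True,
    "San Fran Cisco": True,
    "Chicago": True,
    "  Chicago": False,         # starts with spaces
    "San-": False,              # ends with a dash
    "San----Francisco": False,  # only one separator allowed between words
    "San  Francisco": False,    # two spaces
    "San\tFrancisco": True,     # \s also matches a tab
    "München": False,           # ü is outside [a-zA-Z]
}

for text, expected in cases.items():
    assert is_city_name(text) == expected, text
print("all cases behave as expected")
```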
