C# UK postcode splitting

I've written something similar in the past. I think you can just split before the last digit (i.e. remove all spaces, find the last digit and then insert a space before it):

static readonly char[] Digits = "0123456789".ToCharArray();

...

string noSpaces = original.Replace(" ", "");
int lastDigit = noSpaces.LastIndexOfAny(Digits);
if (lastDigit == -1)
{
    throw new ArgumentException("No digits!");
}
string normalized = noSpaces.Insert(lastDigit, " ");

The Wikipedia entry has a lot of detail including regular expressions for validation (after normalisation :)


I'm not sure how UK Post Codes work, so is the last part considered the last 3 characters with the first part being everything before?

If it is, something like this should work, assuming you've already handled appropriate validation (edited thanks to Jon Skeet's comment):

string postCode = "AB111AD".Replace(" ", "");
string firstPart = postCode.Substring(0, postCode.Length - 3);

That will return the Post Code minus the last 3 characters.
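
The same idea gives both halves at once. This is only a minimal sketch, assuming the input has already been validated and written as a C# 9+ top-level program; `SplitPostcode` is a hypothetical helper name, not something from the question:

```csharp
using System;

// Hypothetical helper: returns everything before the last three characters
// and the last three characters themselves. Assumes a validated postcode.
static (string First, string Last) SplitPostcode(string postCode)
{
    string noSpaces = postCode.Replace(" ", "");
    if (noSpaces.Length < 4)
    {
        // Guard so Substring never receives a non-positive length.
        throw new ArgumentException("Too short to split.", nameof(postCode));
    }
    int splitAt = noSpaces.Length - 3;
    return (noSpaces.Substring(0, splitAt), noSpaces.Substring(splitAt));
}

var parts = SplitPostcode("AB111AD");
Console.WriteLine(parts.First); // AB11
Console.WriteLine(parts.Last);  // 1AD
```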


UK postcode format explained:

Ref: http://www.mrs.org.uk/pdf/postcodeformat.pdf

POSTCODE FORMAT

A Postcode is made up of the following elements: PO1 3AX

  • PO the area. There are 124 postcode areas in the UK
  • 1 the district. There are approximately 20 Postcode districts in an area
  • 3 the sector. There are approximately 3000 addresses in a sector.
  • AX the Unit. There are approximately 15 addresses per unit.

The following list shows all valid Postcode formats. "A" indicates an alphabetic character and "N" indicates a numeric character.

FORMAT EXAMPLE:

AN NAA - M1 1AA
ANN NAA - M60 1NW
AAN NAA - CR2 6XH
AANN NAA - DN55 1PT
ANA NAA - W1A 1HQ
AANA NAA - EC1A 1BB

Please note the following:

  • The letters Q, V and X are not used in the first position.
  • The letters I, J and Z are not used in the second position.
  • The only letters to appear in the third position are A, B, C, D, E, F, G, H, J, K, S, T, U and W.
  • The second half of the postcode always has the format numeric, alpha, alpha, and the letters C, I, K, M, O and V are never used in it.
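
The formats and letter rules above can be turned into a validation pattern. The following is a rough sketch rather than an official expression: it assumes the postcode is already upper-case with a single space, it encodes only the rules listed here, and it leaves the final letter of the AANA format unrestricted because the list does not mention it. All names (`FirstLetter`, `postcodeRegex`, and so on) are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

const string FirstLetter = "[A-PR-UWYZ]";        // no Q, V or X
const string SecondLetter = "[A-HK-Y]";          // no I, J or Z
const string ThirdLetter = "[A-HJKSTUW]";        // A-H, J, K, S, T, U, W
const string InwardLetter = "[ABD-HJLNP-UW-Z]";  // no C, I, K, M, O or V

string outward =
    "(?:" + FirstLetter + "[0-9][0-9]?"                      // AN, ANN
    + "|" + FirstLetter + SecondLetter + "[0-9][0-9]?"       // AAN, AANN
    + "|" + FirstLetter + "[0-9]" + ThirdLetter              // ANA
    + "|" + FirstLetter + SecondLetter + "[0-9][A-Z]" + ")"; // AANA

// The second half is always digit, letter, letter.
string pattern = "^" + outward + " [0-9]" + InwardLetter + @"{2}\z";
var postcodeRegex = new Regex(pattern);

Console.WriteLine(postcodeRegex.IsMatch("M1 1AA"));   // True
Console.WriteLine(postcodeRegex.IsMatch("EC1A 1BB")); // True
Console.WriteLine(postcodeRegex.IsMatch("Q1 1AA"));   // False (Q in first position)
Console.WriteLine(postcodeRegex.IsMatch("M1 1AC"));   // False (C in second half)
```

A pattern like this only checks the shape of the string; it cannot tell whether a postcode actually exists.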

It is also safe to assume that the space is the fourth character from the end, i.e. if a postcode is missing its space (SW109RL), you can blindly insert one at the fourth position from the end (SW10 9RL).
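
As a small sketch of that rule (assuming the input is otherwise a valid postcode; `InsertSpace` is a hypothetical name), strip any existing spaces and insert one three characters before the end. Because the second half is always digit, letter, letter, this should land in the same place as splitting before the last digit.

```csharp
using System;

// Hypothetical helper: puts the space back so that it ends up as the
// fourth character from the end. Assumes an otherwise valid postcode.
static string InsertSpace(string postcode)
{
    string noSpaces = postcode.Replace(" ", "");
    if (noSpaces.Length < 4)
    {
        throw new ArgumentException("Too short to be a postcode.", nameof(postcode));
    }
    return noSpaces.Insert(noSpaces.Length - 3, " ");
}

Console.WriteLine(InsertSpace("SW109RL")); // SW10 9RL
Console.WriteLine(InsertSpace("M1 1AA"));  // M1 1AA
```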

Tags:

C#

String