Regex for S3 bucket name

Answer

The simplest and safest regex is:

(?!(^xn--|.+-s3alias$))^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$

It ensures that names work in all cases, including when you are using S3 Transfer Acceleration. Also, because it doesn't include any backslashes, it's easier to use in string contexts.
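As a rough illustration, here is how the pattern above might be used from Python. The constant and function names are hypothetical, and `fullmatch` is my own choice: in Python, `$` also matches just before a trailing newline, so `fullmatch` is used to reject names like "my-bucket\n".

```python
import re

# Pattern copied verbatim from the answer; no backslashes, so no escaping issues.
BUCKET_NAME_RE = re.compile(r"(?!(^xn--|.+-s3alias$))^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes the dot-free bucket name regex (hypothetical helper)."""
    # fullmatch requires the whole string to be consumed, so a trailing
    # newline cannot slip past the `$` anchor.
    return BUCKET_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-bucket", "xn--bucket", "logs-s3alias", "ab", "My-Bucket", "my.bucket"]:
        print(candidate, is_valid_bucket_name(candidate))
```

Only the first candidate in the loop is accepted; the others break rule 5, rule 6, the length rule, the lowercase rule, and the no-dots rule respectively.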

Alternative

If you need S3 bucket names that include dots (and you don't use S3 Transfer Acceleration), you can use this instead:

(?!(^((2(5[0-5]|[0-4][0-9])|[01]?[0-9]{1,2})\.){3}(2(5[0-5]|[0-4][0-9])|[01]?[0-9]{1,2})$|^xn--|.+-s3alias$))^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$

Explanation

The Amazon S3 bucket naming rules as of 2022-05-14 are:

  1. Bucket names must be between 3 (min) and 63 (max) characters long.
  2. Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
  3. Bucket names must begin and end with a letter or number.
  4. Bucket names must not be formatted as an IP address (for example, 192.168.5.4).
  5. Bucket names must not start with the prefix xn--.
  6. Bucket names must not end with the suffix -s3alias.
  7. Buckets used with Amazon S3 Transfer Acceleration can't have dots (.) in their names.

This regex matches all the rules (including rule 7):

(?!(^xn--|.+-s3alias$))^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$

The first group (?!(^xn--|.+-s3alias$)) is a negative lookahead that ensures that the name doesn't start with xn-- or end with -s3alias (satisfying rules 5 and 6).

The rest of the expression ^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$ ensures that:

  • the name starts with a lowercase letter or number (^[a-z0-9]) and ends with a lowercase letter or number ([a-z0-9]$) (rule 3).
  • the rest of the name consists of 1 to 61 lowercase letters, numbers or hyphens ([a-z0-9-]{1,61}) (rule 2).
  • the entire expression matches names from 3 to 63 characters in length (rule 1).

Lastly, we don't need to worry about rule 4 (which forbids names that look like IP addresses) because rule 7 implicitly covers this by forbidding dots in names.
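The breakdown above can be written down as a commented pattern. This is only a sketch using Python's re.VERBOSE flag (my choice, not part of the original regex). The pattern is unchanged apart from whitespace and comments, and the variable name is hypothetical.

```python
import re

# Same regex as above, split into its parts with re.VERBOSE so that
# whitespace and "#" comments inside the pattern are ignored.
BUCKET_NAME_VERBOSE_RE = re.compile(
    r"""
    (?!(^xn--|.+-s3alias$))   # rules 5 and 6: no xn-- prefix, no -s3alias suffix
    ^[a-z0-9]                 # rule 3: first character is a lowercase letter or digit
    [a-z0-9-]{1,61}           # rule 2: 1 to 61 lowercase letters, digits or hyphens
    [a-z0-9]$                 # rule 3: last character is a lowercase letter or digit
    """,
    re.VERBOSE,
)

# Rule 1 falls out of the counts: 1 + (1..61) + 1 gives 3 to 63 characters.
for length in (2, 3, 63, 64):
    matched = BUCKET_NAME_VERBOSE_RE.fullmatch("a" * length) is not None
    print(length, matched)
```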

If you do not use Amazon S3 Transfer Acceleration and want to permit more complex bucket names, then you can use this more complicated regular expression:

(?!(^((2(5[0-5]|[0-4][0-9])|[01]?[0-9]{1,2})\.){3}(2(5[0-5]|[0-4][0-9])|[01]?[0-9]{1,2})$|^xn--|.+-s3alias$))^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$

The main change is the addition of an expression that matches IPv4 addresses. The spec simply says that bucket names must not be formatted as IP addresses, but because IPv6 addresses contain colons, they are already forbidden by rule 2.
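For illustration, the dotted variant might be wrapped like this in Python. The names are hypothetical, and the pattern is split across adjacent raw string literals purely for readability; concatenated, it is the same regex as above.

```python
import re

# The longer regex from the answer, split over three raw strings.
DOTTED_BUCKET_NAME_RE = re.compile(
    # Negative lookahead, part 1: three "octet." groups of an IPv4 address ...
    r"(?!(^((2(5[0-5]|[0-4][0-9])|[01]?[0-9]{1,2})\.){3}"
    # ... then the final octet, or the xn-- prefix, or the -s3alias suffix.
    r"(2(5[0-5]|[0-4][0-9])|[01]?[0-9]{1,2})$|^xn--|.+-s3alias$))"
    # Main body: same as the simple regex, but dots are allowed in the middle.
    r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$"
)


def is_valid_dotted_bucket_name(name: str) -> bool:
    """Return True if `name` passes the dotted bucket name regex (hypothetical helper)."""
    return DOTTED_BUCKET_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my.bucket.name", "192.168.5.4", "xn--my.bucket", "my.bucket-s3alias"]:
        print(candidate, is_valid_dotted_bucket_name(candidate))
```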


I've adapted Zak's answer a little bit. I found it was a little too complicated and threw out valid domain names. Here's the new regex (available with tests on regex101.com**):

(?!^(\d{1,3}\.){3}\d{1,3}$)(^[a-z0-9]([a-z0-9-]*(\.[a-z0-9])?)*$)

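The first part is the negative lookahead (?!^(\d{1,3}\.){3}\d{1,3}$), which rejects names that are formatted like IPv4 addresses. Basically, we try to match 1-3 digits followed by a period, 3 times ((\d{1,3}\.){3}), followed by 1-3 digits (\d{1,3}). Note that it does not check that each number is in the range 0-255, so it also rejects strings such as 999.999.999.999.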

The second part says that the name must start with a lowercase letter or a number (^[a-z0-9]) followed by lowercase letters, numbers, or hyphens repeated 0 to many times ([a-z0-9-]*). If there is a period, it must be followed by a lowercase letter or number ((\.[a-z0-9])?). These last 2 patterns are repeated 0 to many times (([a-z0-9-]*(\.[a-z0-9])?)*).

The regex does not attempt to enforce the size restrictions set forth by AWS (3-63 characters). That can either be handled by another regex (^.{3,63}$) or by checking the length of the string.
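A minimal sketch of the second option, checking the length in code before applying the regex. It is written in Python with hypothetical names; the pattern itself is the one given above.

```python
import re

# Shape-only regex from this answer; it does not limit the length.
NAME_SHAPE_RE = re.compile(r"(?!^(\d{1,3}\.){3}\d{1,3}$)(^[a-z0-9]([a-z0-9-]*(\.[a-z0-9])?)*$)")


def is_valid_bucket_name_with_length(name: str) -> bool:
    """Length check first (3-63 characters), then the shape regex (hypothetical helper)."""
    if not 3 <= len(name) <= 63:
        return False
    return NAME_SHAPE_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["my-bucket", "my.bucket.name", "1.2.3.4", "ab", "my..bucket"]:
        print(candidate, is_valid_bucket_name_with_length(candidate))
```

One caveat, as far as I can tell: the nested repetition in ([a-z0-9-]*(\.[a-z0-9])?)* can backtrack heavily on some non-matching inputs, such as a long run of valid characters followed by an invalid one, so this is best treated as a sketch rather than something to run on untrusted input.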


** At that link, one of the tests I added is failing, but if you switch to the test area and type in the same pattern, it passes. It also works if you copy/paste it into the terminal, so I assume that's a bug on the regex101.com side.