How do I write a regular expression that excludes rather than matches, e.g., not (this|string)?

This is not easily possible. Regular expressions are designed to match things, and this is all they can do.

First off: [^] does not designate an "excludes group", it designates a negated character class. Character classes do not support grouping in any form or shape. They support single characters (and, for convenience, character ranges). Your try [^(not|this)] is 100% equivalent to [^)(|hinots], as far as the regex engine is concerned.
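To see that equivalence for yourself, here is a minimal sketch in Python (my choice of language, the question does not name one). The names `attempt`, `explicit` and `same_behaviour` are made up for illustration.

```python
import re

# The attempted "exclude group" and the character class it really is.
attempt = re.compile(r"[^(not|this)]")
explicit = re.compile(r"[^)(|hinots]")

def same_behaviour(chars):
    """True if both classes accept or reject every character identically."""
    return all(
        bool(attempt.fullmatch(c)) == bool(explicit.fullmatch(c))
        for c in chars
    )

# Compare the two over printable ASCII.
printable = [chr(code) for code in range(32, 127)]
print(same_behaviour(printable))

# A single 'n' is rejected, a single 'x' is accepted, by both.
print(bool(attempt.fullmatch("n")), bool(attempt.fullmatch("x")))
```

Inside the brackets, the parentheses and the pipe are just literal characters, so no grouping or alternation ever happens.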

There are three ways out of this situation:

  1. match (not|this) and exclude any matches with the help of the environment you are in (negate match results)
  2. use negative look-ahead, if supported by your regex engine and feasible in the situation
  3. rewrite the expression so it can match: see a similar question I asked earlier
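The first two options could look roughly like this in Python, assuming an engine with look-ahead support (Python's `re` has it). The helper names `ok_by_negation` and `ok_by_lookahead` are hypothetical, and the sketch only considers single-line strings.

```python
import re

# Option 1: match the unwanted words, then negate the result in code.
UNWANTED = re.compile(r"not|this")

def ok_by_negation(text):
    return UNWANTED.search(text) is None

# Option 2: a negative look-ahead that refuses any string containing
# either word. Note that '.' does not cross newlines by default.
NO_UNWANTED = re.compile(r"^(?!.*(?:not|this)).*$")

def ok_by_lookahead(text):
    return NO_UNWANTED.search(text) is not None

for sample in ["hello world", "do not enter", "this one", "thin"]:
    print(sample, ok_by_negation(sample), ok_by_lookahead(sample))
```

Both helpers should agree on these samples; the first is usually easier to read and works in engines without look-ahead.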

First of all: [^n][^o][^t] is not a solution. This would also exclude words like nil ([^n] does not match), bob ([^o] does not match) or cat ([^t] does not match).
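A quick check in Python (the language is my assumption, and `naive` is just an illustrative name) shows the problem on three-letter words:

```python
import re

# Each class constrains one position independently of the others.
naive = re.compile(r"[^n][^o][^t]")

for word in ["nil", "bob", "cat", "not", "sun"]:
    # fullmatch: the whole word must fit the three character classes
    print(word, bool(naive.fullmatch(word)))
```

Only sun gets through: the other words are rejected because of a single letter in the "wrong" position, even though none of them except not is the word you wanted to exclude.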

But it is possible to build a regular expression with basic syntax that matches strings containing neither not nor this:

^([^nt]|n($|[^o]|o($|[^t]))|t($|[^h]|h($|[^i]|i($|[^s]))))*$

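The idea behind this regular expression is to allow any character that is not the first character of either word, and to allow a prefix of either word only as long as it does not continue into the whole word. Be aware that the alternative rejecting a continuation also consumes the following character, so the expression is not airtight: nnot, for example, still matches.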
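If you want to try the expression, here is a small Python harness (Python is my assumption; `PATTERN` and `accepted` are hypothetical names). The samples include one edge case that the expression lets through.

```python
import re

# The expression from above, unchanged.
PATTERN = re.compile(
    r"^([^nt]|n($|[^o]|o($|[^t]))|t($|[^h]|h($|[^i]|i($|[^s]))))*$"
)

def accepted(text):
    """True if the expression matches the whole string."""
    return PATTERN.search(text) is not None

samples = [
    "hello",         # no trace of either word
    "no",            # prefix of "not" at the end of the string
    "thin",          # prefix "thi" followed by something other than "s"
    "not",           # rejected
    "this",          # rejected
    "do not enter",  # rejected
    "nnot",          # contains "not" but slips through: "nn" is consumed as n[^o]
]
for sample in samples:
    print(repr(sample), accepted(sample))
```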


Hard to believe that the accepted answer (from Gumbo) was actually accepted! Unless it was accepted because it indicated that you cannot do what you want. Unless you have a function that generates such regexps (as Gumbo shows), composing them would be a real pain.

What is the real use case -- what are you really trying to do?

As Tomalak indicated, (a) this is not what regexps do; (b) see the other post he linked to, for a good explanation, including what to do about your problem.

The answer is to use a regexp to match what you do not want, and then subtract that from the initial domain. IOW, do not try to make the regexp do the excluding (it cannot); do the excluding after using a regexp to match what you want to exclude.

This is how tools built on regexps typically work (e.g., grep): they offer a separate option (grep's -v, for instance) that carries out the subtraction after matching what needs to be subtracted.
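In code, the same subtraction is a one-line filter. A minimal Python sketch, where `grep_v` is a made-up helper that only imitates the idea of grep's inverted match and is not part of any library:

```python
import re

def grep_v(pattern, lines):
    """Keep only the lines in which the pattern does NOT match."""
    unwanted = re.compile(pattern)
    return [line for line in lines if not unwanted.search(line)]

lines = ["keep me", "do not keep", "this goes too", "thin is fine"]
print(grep_v(r"not|this", lines))
```

The regexp stays simple (it matches what you do not want) and the surrounding code does the excluding.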