Escaping special characters in Java Regular Expressions

The Pattern.quote(String s) sort of does what you want. However it leaves a little left to be desired; it doesn't actually escape the individual characters, just wraps the string with \Q...\E.

There is not a method that does exactly what you are looking for, but the good news is that it is actually fairly simple to escape all of the special characters in a Java regular expression:

regex.replaceAll("[\\W]", "\\\\$0")

Why does this work? Well, the documentation for Pattern specifically says that its permissible to escape non-alphabetic characters that don't necessarily have to be escaped:

It is an error to use a backslash prior to any alphabetic character that does not denote an escaped construct; these are reserved for future extensions to the regular-expression language. A backslash may be used prior to a non-alphabetic character regardless of whether that character is part of an unescaped construct.

For example, ; is not a special character in a regular expression. However, if you escape it, Pattern will still interpret \; as ;. Here are a few more examples:

  • > becomes \> which is equivalent to >
  • [ becomes \[ which is the escaped form of [
  • 8 is still 8.
  • \) becomes \\\) which is the escaped forms of \ and ( concatenated.

Note: The key is is the definition of "non-alphabetic", which in the documentation really means "non-word" characters, or characters outside the character set [a-zA-Z_0-9].


Is there any method in Java or any open source library for escaping (not quoting) a special character (meta-character), in order to use it as a regular expression?

If you are looking for a way to create constants that you can use in your regex patterns, then just prepending them with "\\" should work but there is no nice Pattern.escape('.') function to help with this.

So if you are trying to match "\\d" (the string \d instead of a decimal character) then you would do:

// this will match on \d as opposed to a decimal character
String matchBackslashD = "\\\\d";
// as opposed to
String matchDecimalDigit = "\\d";

The 4 slashes in the Java string turn into 2 slashes in the regex pattern. 2 backslashes in a regex pattern matches the backslash itself. Prepending any special character with backslash turns it into a normal character instead of a special one.

matchPeriod = "\\.";
matchPlus = "\\+";
matchParens = "\\(\\)";
... 

In your post you use the Pattern.quote(string) method. This method wraps your pattern between "\\Q" and "\\E" so you can match a string even if it happens to have a special regex character in it (+, ., \\d, etc.)


The only way the regex matcher knows you are looking for a digit and not the letter d is to escape the letter (\d). To type the regex escape character in java, you need to escape it (so \ becomes \\). So, there's no way around typing double backslashes for special regex chars.


I wrote this pattern:

Pattern SPECIAL_REGEX_CHARS = Pattern.compile("[{}()\\[\\].+*?^$\\\\|]");

And use it in this method:

String escapeSpecialRegexChars(String str) {

    return SPECIAL_REGEX_CHARS.matcher(str).replaceAll("\\\\$0");
}

Then you can use it like this, for example:

Pattern toSafePattern(String text)
{
    return Pattern.compile(".*" + escapeSpecialRegexChars(text) + ".*");
}

We needed to do that because, after escaping, we add some regex expressions. If not, you can simply use \Q and \E:

Pattern toSafePattern(String text)
{
    return Pattern.compile(".*\\Q" + text + "\\E.*")
}