In a regular expression, which characters need escaping?

This depends on the application. In your example, [ must be quoted as an argument to grep but not to echo.

For the shell (from the POSIX specs):

Quoting is used to remove the special meaning of certain characters or words to the shell. Quoting can be used to preserve the literal meaning of the special characters in the next paragraph, prevent reserved words from being recognized as such, and prevent parameter expansion and command substitution within here-document processing (see Here-Document).

The application shall quote the following characters if they are to represent themselves:

|  &  ;  <  >  (  )  $  `  \  "  '  <space>  <tab>  <newline>

and the following may need to be quoted under certain circumstances. That is, these characters may be special depending on conditions described elsewhere in this volume of IEEE Std 1003.1-2001:

*   ?   [   #   ~   =   %

The various quoting mechanisms are the escape character, single-quotes, and double-quotes. The here-document represents another form of quoting; see Here-Document.
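As a minimal sketch of the three quoting mechanisms named above (the variable names a, b and c are only for illustration), each of the following should hand the same literal string to printf:

```shell
# Goal: pass the literal text "$HOME *" as one argument, with no
# parameter expansion, no word splitting and no globbing.

a=$(printf '%s' \$HOME\ \*)   # escape character: backslash before $, space and *
b=$(printf '%s' '$HOME *')    # single-quotes: everything inside is literal
c=$(printf '%s' "\$HOME *")   # double-quotes: * is literal, but $ still needs a backslash

printf '%s\n' "$a" "$b" "$c"
```

Single-quotes are usually the least surprising choice when nothing inside the string should be expanded.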

Specific programs (using regexes, perl, awk) could have additional requirements on escaping.


Each application will have its own set of 'special' characters. The issue that you ran into was with grep, not the shell. For which characters need to be quoted in grep, read the manpage's section on "REGULAR EXPRESSIONS".

For the shell, the characters that should be quoted are:

;'"`#$&*?[]<>{}\

and any whitespace.

Depending on the shell, other characters may need to be quoted as well:

!^%

Look under "SHELL GRAMMAR" on the shell's manpage.


There are multiple types of regular expressions, and the set of special characters depends on the particular type. Some of them are described below. In all cases, special characters are escaped with a backslash \. E.g. to match [ you write \[ instead. Alternatively, the characters (except ^) can be escaped by enclosing them in square brackets one at a time, like [[].
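A small sketch of the two escaping styles, assuming a POSIX grep is available; the sample input lines and variable names are made up for the illustration:

```shell
# Count the input lines that contain a literal "[" using both styles.
n_backslash=$(printf '%s\n' 'a[1]' 'a1' | grep -c '\[')   # backslash escape
n_bracket=$(printf '%s\n' 'a[1]' 'a1' | grep -c '[[]')    # bracket expression

echo "$n_backslash $n_bracket"   # prints "1 1"
```

Both patterns are single-quoted so that the shell passes them to grep unchanged.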

Characters that are special only in some contexts, such as ^, which is special at the beginning of a (sub-)expression, can be escaped in all contexts.

As others wrote: in the shell, if you do not enclose the expression in single quotes, you have to additionally escape the shell's special characters in the already escaped regex. Example: instead of '\[' you can write \\[ (alternatively: "\[" or "\\[") in Bourne-compatible shells like bash, but this is another story.
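To see what the regex engine actually receives, here is a quick check, assuming a Bourne-compatible shell such as bash or dash (other shells may treat the unquoted form differently); the names p1 to p4 are only for illustration:

```shell
# Four spellings of the same argument; the shell strips its own quoting layer first.
p1='\['     # single-quotes: passed through untouched
p2=\\[      # unquoted: \\ collapses to one backslash
p3="\["     # double-quotes: backslash before [ is kept as-is
p4="\\["    # double-quotes: \\ collapses to one backslash

printf '%s\n' "$p1" "$p2" "$p3" "$p4"   # each line prints \[
```

In every case the program sees the two characters \[, which is the escaped regex for a literal bracket.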

Basic Regular Expressions (BRE)

  • POSIX: Basic Regular Expressions
  • Commands: grep, sed
  • Special characters: .[\
  • Special in some contexts: *^$
  • Escape a string: "$(printf '%s' "$string" | sed 's/[.[\*^$]/\\&/g')"
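As a usage sketch for the BRE escape expression, the sed command can be wrapped in a small helper. The function name bre_escape and the sample strings are hypothetical, not part of any standard tool:

```shell
# Hypothetical helper: escape a string so grep (BRE) matches it literally.
bre_escape() {
  printf '%s' "$1" | sed 's/[.[\*^$]/\\&/g'
}

bre_string='a.b[1]*'
bre_pattern=$(bre_escape "$bre_string")
printf '%s\n' "$bre_pattern"   # prints a\.b\[1]\*

# Only the first input line is the literal string; the second would match
# if "." were still treated as a wildcard.
bre_hits=$(printf '%s\n' 'a.b[1]*' 'aXb[1]*' | grep -c -- "$bre_pattern")
echo "$bre_hits"               # prints 1
```

The -- before the pattern is a precaution in case the escaped string starts with a hyphen.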

Extended Regular Expressions (ERE)

  • POSIX: Extended Regular Expressions
  • Commands: grep -E, GNU: sed -r, *BSD: sed -E
  • Special characters: .[\(
  • Special in some contexts: *^$)+?{|
  • Escape a string: "$(printf '%s' "$string" | sed 's/[.[\*^$()+?{|]/\\&/g')"
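The ERE variant can be exercised the same way. Again, the helper name ere_escape and the sample input are assumptions made for this sketch:

```shell
# Hypothetical helper: escape a string so grep -E (ERE) matches it literally.
ere_escape() {
  printf '%s' "$1" | sed 's/[.[\*^$()+?{|]/\\&/g'
}

ere_string='f(x)+1|2'
ere_pattern=$(ere_escape "$ere_string")
printf '%s\n' "$ere_pattern"   # prints f\(x\)\+1\|2

# Unescaped, the string is an alternation ("f", one or more "x", "1", or just "2")
# and matches all three lines; escaped, it matches only the literal string.
ere_raw=$(printf '%s\n' 'f(x)+1|2' 'fx1' '2' | grep -E -c -- "$ere_string")
ere_hits=$(printf '%s\n' 'f(x)+1|2' 'fx1' '2' | grep -E -c -- "$ere_pattern")
echo "$ere_raw $ere_hits"      # prints "3 1"
```

When the goal is only a literal match, grep -F avoids the need to escape at all; the helper is mainly useful when the escaped string has to be embedded in a larger pattern.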