What flavor of regex does git use

The Git source uses regcomp and regexec, which are defined by POSIX 1003.2. The code to compile a diff regexp is:

            if (regcomp(ecbdata->diff_words->word_regex,
                        o->word_regex,
                        REG_EXTENDED | REG_NEWLINE))

which in POSIX means that these are "extended" regular expressions as defined here.

(Not every C library actually implements the same POSIX REG_EXTENDED. Git includes its own implementation, which can be built in place of the system's.)

Edit (per updated question): POSIX EREs have neither lookahead nor lookbehind, nor do they have \w (but [_[:alnum:]] is probably close enough for most purposes).


Thanks for the hints from @torek 's answer above, now I realize that there are different flavors of regular expression engines and they could even have different syntax.

Even for one particular program, such as git, it could be compiled with a different regex engine. For example, this blog post hints that \w would be supported by git, contradicting with what I observed from my machine or what the OP here asked.

I ended up finding this section from your recommended wikipedia page most helpful, in terms of presenting different syntax in one table, so that I could do some "translation" between for example [:alnum:] and \w, [:digit:] and \d, [:space:] and \s, etc..