Difference between \b and \B in regex

The confusion stems from your thinking \b matches spaces (probably because "b" suggests "blank").

\b matches the empty string at the beginning or end of a word. \B matches the empty string not at the beginning or end of a word. The key here is that "-" is not a part of a word. So <left>-<right> matches \b-\b because there are word boundaries on either side of the -. On the other hand for <left> - <right> (note the spaces), there are not word boundaries on either side of the dash. The word boundaries are one space further left and right.

On the other hand, when searching for \bcat\b word boundaries behave more intuitively, and it matches " cat " as expected.


\b is a zero-width word boundary. Specifically:

Matches at the position between a word character (anything matched by \w) and a non-word character (anything matched by [^\w] or \W) as well as at the start and/or end of the string if the first and/or last characters in the string are word characters.

Example: .\b matches c in abc

\B is a zero-width non-word boundary. Specifically:

Matches at the position between two word characters (i.e the position between \w\w) as well as at the position between two non-word characters (i.e. \W\W).

Example: \B.\B matches b in abc

See regular-expressions.info for more great regex info

Tags:

Regex