What do 'lazy' and 'greedy' mean in the context of regular expressions?

Greedy will consume as much as possible. From http://www.regular-expressions.info/repeat.html we see the example of trying to match HTML tags with <.+>. Suppose you have the following:

<em>Hello World</em>

You may think that <.+> (. means any non newline character and + means one or more) would only match the <em> and the </em>, when in reality it will be very greedy, and go from the first < to the last >. This means it will match <em>Hello World</em> instead of what you wanted.

Making it lazy (<.+?>) will prevent this. By adding the ? after the +, we tell it to repeat as few times as possible, so the first > it comes across, is where we want to stop the matching.

I'd encourage you to download RegExr, a great tool that will help you explore Regular Expressions - I use it all the time.


'Greedy' means match longest possible string.

'Lazy' means match shortest possible string.

For example, the greedy h.+l matches 'hell' in 'hello' but the lazy h.+?l matches 'hel'.


Greedy quantifier Lazy quantifier Description
* *? Star Quantifier: 0 or more
+ +? Plus Quantifier: 1 or more
? ?? Optional Quantifier: 0 or 1
{n} {n}? Quantifier: exactly n
{n,} {n,}? Quantifier: n or more
{n,m} {n,m}? Quantifier: between n and m

Add a ? to a quantifier to make it ungreedy i.e lazy.

Example:
test string : stackoverflow
greedy reg expression : s.*o output: stackoverflow
lazy reg expression : s.*?o output: stackoverflow