How can I pass a callback to re.sub, but still insert match captures?

If you pass a function, you lose the automatic expansion of backreferences: re.sub hands your function the match object and uses whatever string it returns as-is, so you have to do the work yourself. So you could:

Choose the replacement string before calling re.sub, rather than passing a function:

import random
import re

text = "abcdef"
pattern = r"(b|e)cd(b|e)"

repl = [r"\1bla\2", r"\1blabla\2"]
re.sub(pattern, random.choice(repl), text)
# 'abblaef' or 'abblablaef'
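
One thing to keep in mind with this approach: random.choice runs once, before re.sub ever sees the text, so every match in the string gets the same template. A minimal sketch, using a two-match input that is my own assumption rather than part of the question:

```python
import random
import re

pattern = r"(b|e)cd(b|e)"
templates = [r"\1bla\2", r"\1blabla\2"]

# Hypothetical input with two matches.
text = "abcdef abcdef"

# The template is chosen once, up front, so both matches
# receive the same replacement.
chosen = random.choice(templates)
result = re.sub(pattern, chosen, text)

first, second = result.split(" ")
print(result, first == second)
```

If you want each match to get its own random choice, you need the function form.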

Or write a function that receives the match object, which allows more complex processing. You can use the match object's expand method to apply backreferences:

text = "abcdef abcdef"
pattern = "(b|e)cd(b|e)"

def repl(m):
    # Chosen per match; expand fills in \1 and \2 from this match
    templates = [r"\1bla\2", r"\1blabla\2"]
    return m.expand(random.choice(templates))

re.sub(pattern, repl, text)
# 'abblaef abblablaef' and variations
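
As a rough illustration of "more complex processing", the function can mix plain Python on the captures with expand for the rest. The name shout and the upper-casing rule here are made up for the example, not something the question asked for:

```python
import re

pattern = r"(b|e)cd(b|e)"

def shout(m):
    # m.group(1) is the raw first capture, so any Python code can
    # transform it; expand still handles the \2 backreference.
    return m.group(1).upper() + m.expand(r"-\2")

result = re.sub(pattern, shout, "abcdef")
print(result)  # aB-ef
```

This kind of per-match logic is not possible with a plain replacement string.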

You can, of course, write that function as a lambda:

repl = [r"\1bla\2", r"\1blabla\2"]
re.sub(pattern, lambda m: m.expand(random.choice(repl)), text)
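
To see why expand is needed at all, it may help to compare a callback that returns the template directly with one that passes it through expand. A small sketch on the same pattern (the variable names literal and expanded are just for this example):

```python
import re

pattern = r"(b|e)cd(b|e)"
text = "abcdef"

# The callback's return value is used as-is: the backslash
# sequences are not interpreted as backreferences.
literal = re.sub(pattern, lambda m: r"\1bla\2", text)

# Passed through expand, \1 and \2 become the captured groups.
expanded = re.sub(pattern, lambda m: m.expand(r"\1bla\2"), text)

print(literal)   # a\1bla\2f
print(expanded)  # abblaef
```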
