Perl: using a constant in a regex

On performance grounds there is little reason to prefer a constant over a variable here. It makes almost no difference, because perl has to compile the regex either way.

For example:

#!/usr/bin/perl

use warnings;
use strict;
use Benchmark qw(:all);

use constant FOO => "foo";
use constant BAR => "bar";

my $FOO_VAR = 'foo';
my $BAR_VAR = 'bar';

sub pattern_replace_const {
   my $somvar = "prefix1_foo test";
   $somvar =~ s/prefix1_${\FOO}/prefix2_${\BAR}/g;
}

sub pattern_replace_var {
   my $somvar = "prefix1_foo test";
   $somvar =~ s/prefix1_$FOO_VAR/prefix2_$BAR_VAR/g;
}

cmpthese(
   1_000_000,
   {  'const' => \&pattern_replace_const,
      'var'   => \&pattern_replace_var
   }
);

Gives:

          Rate const   var
const 917095/s    --   -1%
var   923702/s    1%    --

Really not enough in it to worry about.

However, it may be worth noting that you can precompile a regex with qr// and use that instead, which (provided the RE is static) might improve performance. Then again it might not, because perl can detect static regexes and does that itself.

             Rate      var    const compiled
var      910498/s       --      -2%      -9%
const    933097/s       2%       --      -7%
compiled 998502/s      10%       7%       --

With code like:

my $compiled_regex = qr/prefix1_$FOO_VAR/;
sub compiled_regex { 
    my $somvar = "prefix1_foo test";
    $somvar =~ s/$compiled_regex/prefix2_$BAR_VAR/g;
}

Honestly though, this is a micro-optimisation. The regex engine is fast compared to your code, so don't worry about it. If performance is critical to your code, then the correct way of dealing with it is to first write the code, and then profile it to look for hotspots to optimise.


The problem shown is due to those constants being barewords, built at compile time. As the documentation for the constant pragma puts it:

Constants defined using this module cannot be interpolated into strings like variables.

In the current implementation of the constant pragma they are "inlinable subroutines".
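A minimal sketch of what that means in practice, using only the core constant pragma. The variable names here are illustrative and not from the original question.

```perl
use strict;
use warnings;
use constant FOO => 'foo';

# Inside a double-quoted string, FOO is just the letters F, O, O.
my $literal = "prefix1_FOO";

# Outside the string, FOO is a call to the (inlinable) constant sub.
my $joined = 'prefix1_' . FOO;

print "$literal\n";   # prefix1_FOO
print "$joined\n";    # prefix1_foo

# The same holds in a pattern: /prefix1_FOO/ looks for the literal text
# "prefix1_FOO", not for the constant's value.
my $matches_value = 'prefix1_foo' =~ /prefix1_FOO/ ? 1 : 0;
print "matches: $matches_value\n";   # matches: 0
```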

This problem can be solved nicely by using a module like Const::Fast

use Const::Fast;

const my $foo => 'FOO';
const my $bar => 'BAR';

my $var = 'prefix1_FOO_more';

$var =~ s/prefix1_$foo/prefix2_$bar/g;

Now they do get interpolated. Note that more complex replacements may need /e.
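As a rough illustration of the /e case: with /e the replacement side is evaluated as Perl code, so it can call functions rather than only interpolate. In this sketch plain lexicals stand in for the const variables so that it runs without Const::Fast installed, and lc() is just an arbitrary example of a "more complex" replacement.

```perl
use strict;
use warnings;

# Assumption: plain "my" variables stand in for the Const::Fast ones above.
my $foo = 'FOO';
my $bar = 'BAR';

my $var = 'prefix1_FOO_more';

# With /e the replacement is an expression, here a string concatenation
# that calls lc() on the value.
$var =~ s/prefix1_$foo/'prefix2_' . lc($bar)/e;

print "$var\n";   # prefix2_bar_more
```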

These are built at runtime so you can assign results of expressions to them. In particular, you can use the qr operator, for example

const my $patt => qr/$foo/i;  # case-insensitive 

The qr is the recommended way to build regex patterns. (It interpolates unless the delimiter is '.) The performance gain is most often tiny, but you get a proper regular expression, which can be built and used as such (and in this case a constant as well).
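The remark about the delimiter can be checked with a short sketch. The names are illustrative, and a plain lexical is used instead of a const variable so that only core Perl is needed.

```perl
use strict;
use warnings;

my $foo = 'FOO';

my $interpolated = qr/$foo/i;   # $foo is replaced by its value, 'FOO'
my $literal      = qr'$foo'i;   # single-quote delimiter: no interpolation

# The interpolated pattern matches case-insensitively on "foo".
my $hit = 'prefix1_foo' =~ $interpolated ? 1 : 0;

# The non-interpolated pattern is the regex "$foo": an end-of-string
# anchor followed by "foo", which cannot match this string.
my $miss = 'prefix1_foo' =~ $literal ? 1 : 0;

print "interpolated: $hit, literal: $miss\n";   # interpolated: 1, literal: 0
```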

I readily recommend the Const::Fast module over the other one, and in fact over all others at this time. See a recent article with a detailed discussion of both. Here is a review of many other options.

I strongly recommend using a constant (of your chosen sort) for things meant to be read-only. That is good for the health of the code, and of developers who come into contact with it (yourself in the proverbial six months included).


These being subroutines, we need to be able to run code in order to have them evaluated and replaced by their values. We can't just "interpolate" (evaluate) a variable, because a constant is not a variable.

A way to run code inside a string (which needs to be interpolated, so effectively double-quoted) is to dereference a block that contains an expression producing a reference; the expression is then evaluated. So we first need to make that reference. So either

say "@{ [FOO] }";  # make array reference, then dereference

or

say "${ \FOO }";   # make scalar reference then dereference

prints foo. See the docs for why this works and for examples. Thus one can do the same inside a regex, in both the matching and the replacement parts

s/prefix1_${\FOO}/prefix2_${\BAR}/g;

(or with @{[...]}), since they are evaluated as interpolated strings.
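For completeness, a small sketch that runs both forms side by side on the same input. The constants mirror the FOO and BAR used earlier; the variable names are made up for the illustration.

```perl
use strict;
use warnings;
use constant FOO => 'foo';
use constant BAR => 'bar';

# Scalar-reference form: \FOO makes a reference, ${ ... } dereferences it.
my $via_scalar_ref = 'prefix1_foo test';
$via_scalar_ref =~ s/prefix1_${\FOO}/prefix2_${\BAR}/g;

# Array-reference form: [FOO] makes an anonymous array, @{ ... } dereferences it.
my $via_array_ref = 'prefix1_foo test';
$via_array_ref =~ s/prefix1_@{[FOO]}/prefix2_@{[BAR]}/g;

print "$via_scalar_ref\n";   # prefix2_bar test
print "$via_array_ref\n";    # prefix2_bar test
```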

Which is "better"? These are tricks. There is rarely, if ever, a need for doing this. It has a very good chance of confusing the reader. So I just wouldn't recommend resorting to these at all.

As for (?{ code }), that is a regex feature, whereby code is evaluated inside a pattern (matching side only). It is complex and tricky and very rarely needed. See about it in perlretut and in perlre.

Discussing speed of these things isn't really relevant. They are certainly outside the realm of clean and idiomatic code, while you'd be hard pressed to detect runtime differences.

But if you must use one of these, I'd much rather interpolate inside a scalar reference via a trick than reach for a complex regex feature.


According to PerlMonks, you had better create an already-interpolated string if you are concerned about performance:

use constant PATTERN => 'def';
my $regex = qr/${\(PATTERN)}/; #options such as /m can go here.
if ($string =~ $regex) { ... }

Here is the link to the whole discussion.
