Find and Increment a Number in an XML File

Process the file using an XML parser. This is just better in every way than hacking it with a regex.

use warnings;
use strict;

use XML::LibXML;

my $file = shift // die "Usage: $0 file\n";

my $doc = XML::LibXML->load_xml(location => $file);

my ($node) = $doc->findnodes('//value');

my $new_value = $node->to_literal =~ s/node1\-\K([0-9]+)/1+$1/er;

$node->removeChildNodes();
$node->appendText($new_value);

$doc->toFile('new_' . $file);   # or just $file to overwrite

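Once this is fully tested, pass $file itself to toFile to overwrite the input. Note that the 'new_' prefix assumes $file is a bare file name; with a directory component (dir/file.xml) it would produce new_dir/file.xml.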

Removing the child nodes and appending new text, as above, is one way to change the content of an element.

Or call setData on the first child:

$node->firstChild->setData($new_value);

Here setData works on a node of type text, CDATA, or comment, so this assumes the first child of the element is one of those.

Or search for the text node itself and work with it directly:

my ($tnode) = $doc->findnodes('//value/text()');

my $new_value = $tnode =~ s/node1\-\K([0-9]+)/1+$1/er;

$tnode->setData($new_value);

print $doc->toString;

There are more ways. Which method to use depends on everything else that needs to be done. If the sole job is indeed just to edit that text, then the simplest way is probably to get the text node directly.
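
If the file may hold several value elements, or none, a loop with a guard is safer than taking the first match. This is only a minimal sketch: the inline sample document with two attribute entries is made up for illustration, and it reads from a string rather than a file to stay self-contained.

```perl
use strict;
use warnings;

use XML::LibXML;

# Hypothetical sample: two <value> elements, only one carries the node1- prefix
my $xml = <<'XML';
<attributes>
  <attribute><name>a</name><value>node1-3</value></attribute>
  <attribute><name>b</name><value>other-7</value></attribute>
</attributes>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# Collect every text node directly under a <value> element
my @tnodes = $doc->findnodes('//value/text()');
die "No <value> text nodes found\n" unless @tnodes;

for my $tnode (@tnodes) {
    my $old = $tnode->data;
    (my $new = $old) =~ s/node1-\K([0-9]+)/$1 + 1/e;

    # Touch only the nodes where the substitution changed something
    $tnode->setData($new) if $new ne $old;
}

my $result = $doc->toString;
print $result;
```

Only the node1-3 value is incremented; the other-7 value is left alone because the pattern doesn't match it.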


I don't like using line-oriented text processing for modifying XML. You lose context and position and you can't tell if you are actually modifying what you think you are (inside comments, CDATA, etc).
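
To make that concrete, here is a small sketch with a made-up input in which the same value also appears inside a comment. The line-by-line substitution has no way of knowing that, so it edits both lines:

```perl
use strict;
use warnings;

# Hypothetical input: the first occurrence is commented out
my @lines = (
    "<!-- <value>node1-3</value> old setting -->\n",
    "<value>node1-3</value>\n",
);

# The same substitution a line-oriented one-liner would apply to each line
my @edited = map { s/node1-\K(\d+)/$1 + 1/er } @lines;

print @edited;
```

Both lines come out with node1-4, including the one inside the comment. A parser-based approach would only see the real value element.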

But, ignoring that, your one-liner has an easy fix. Basically, you aren't anchoring correctly: you match the first group of digits (the 1 in node1) when you want the second:

$ perl -i -pe '/node1-/ && s/(\d+)(.*)/$1+1 . $2/e' filepath

Instead, match a group of digits immediately before a <. The (?=...) is a positive lookahead that doesn't match characters (just the condition), so you don't substitute those:

$ perl -i -pe '/node1-/ && s/(\d+)(?=<)/$1+1/e' filepath

However, I'd combine the first match. The \K allows you to ignore part of a substitution's match. You have to match the stuff before \K, but you won't replace that part:

$ perl -i -pe 's/node1-\K(\d+)/$1+1/e' filepath
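
To see the difference without touching a file, the three substitutions above can be run against a single sample line. This is just a sketch; the sample line is assumed from the value node1-3 used elsewhere in this thread, and /r is added so each substitution returns a new string:

```perl
use strict;
use warnings;

my $line = "    <value>node1-3</value>";

# Original attempt: the first run of digits is the 1 in "node1"
my $first_digits = $line =~ s/(\d+)(.*)/$1+1 . $2/er;

# Lookahead: only digits immediately followed by "<" qualify
my $lookahead = $line =~ s/(\d+)(?=<)/$1+1/er;

# \K: "node1-" must match but is kept out of the replaced part
my $keep = $line =~ s/node1-\K(\d+)/$1+1/er;

print "$first_digits\n";   #     <value>node2-3</value>
print "$lookahead\n";      #     <value>node1-4</value>
print "$keep\n";           #     <value>node1-4</value>
```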

Again, these might work, but eventually you (more likely the next guy) will be burned by it. I don't know your situation, but as I often advise people: it's not the rarity, it's the calamity.


Just for fun, I used Perl's Mojo::DOM to do the same task using CSS selectors. This isn't as powerful as XML::Twig (no stream parsing!), but for simple things it can work out nicely:

#!perl
use v5.26;

use Mojo::DOM;

my $xml = <<~"XML";
    <attribute>
        <name>test</name>
        <type>java.lang.String</type>
        <value>node1-3</value>
    </attribute>
    XML

my $dom = Mojo::DOM->new( $xml );
my $node = $dom->at( 'attribute value' ); # CSS Selector

my $current = $node->text;
say "Current text is $current";

# how you change the value is up to you. This line is
# just how I did it.
my $next = $current =~ s/(\d+)\z/ $1 + 1 /re;
say "Next text is $next";

$node->content( $next );

say $dom;

It's not so bad as a one-liner, but it's a bit verbose for that. The -0777 switch enables slurp mode, so the first read returns the entire file (the file name is the command-line argument at the end):

$ perl -MMojo::DOM -0777 -E '$d=Mojo::DOM->new(<>); $n=$d->at(q(attribute value)); $n->content($n->text =~ s/(\d+)\z/$1+1/er); say $d' text.xml
<attribute>
    <name>test</name>
    <type>java.lang.String</type>
    <value>node1-4</value>
</attribute>

Mojo has an ojo module (so, with -M, it spells Mojo) that makes this slightly simpler at the expense of declaring variables (it enables strict). Its x() is a shortcut for Mojo::DOM->new():

$ perl -Mojo -0777 -E 'my $d=x(<>); my $n=$d->at(q(attribute value)); $n->content($n->text =~ s/(\d+)\z/$1+1/er); say $d' text.xml
<attribute>
    <name>test</name>
    <type>java.lang.String</type>
    <value>node1-4</value>
</attribute>
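
One caveat if your real file isn't all lowercase: Mojo::DOM defaults to HTML semantics, which lowercase tag names, unless it sees an XML declaration. A minimal sketch of forcing XML mode with the xml method, assuming a made-up document with mixed-case tag names:

```perl
use strict;
use warnings;

use Mojo::DOM;

# Hypothetical mixed-case input with no XML declaration
my $xml = '<Attribute><Value>node1-3</Value></Attribute>';

# xml(1) turns on case-sensitive XML mode before parsing
my $dom  = Mojo::DOM->new->xml(1)->parse($xml);
my $node = $dom->at('Attribute > Value');   # selectors are case-sensitive here

$node->content( $node->text =~ s/(\d+)\z/$1 + 1/er );

my $result = "$dom";
print "$result\n";
```

The tag names keep their case in the output, which matters if whatever consumes the file is case-sensitive.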

Here's an example using Perl's XML::Twig. Basically, you create a handler for a node, then do whatever you need to do in that handler. You can see the current text, make a new string, and set the node text to that string. It's a bit intimidating at first, but it's very powerful once you get used to it. I prefer this to other Perl XML parsers, but for very simple things it might not be the best tool:

#!perl
use v5.26;

use XML::Twig;

my $xml = <<~"XML";
    <attribute>
        <name>test</name>
        <type>java.lang.String</type>
        <value>node1-3</value>
    </attribute>
    XML

my $twig = XML::Twig->new(
    pretty_print  => 'indented',
    twig_handlers => {
        # the key is the name of the node you want to process
        value => sub {
            # each handler gets the twig and the current node
            my( $t, $node ) = @_;
            my $current = $node->text;
            # how you modify the text is not important. This
            # is just a Perl substitution that does not modify
            # the original but returns the new string
            my $next = $current =~ s/(\d+)\z/ $1 + 1 /re;
            $node->set_text( $next );
            }
        }
    );
$twig->parse( $xml );
my $updated_xml = $twig->sprint;

say $updated_xml;
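
The example above parses a string. For a file, the same handler can be used with parsefile and print_to_file. This is only a sketch: the temporary directory and the in.xml and out.xml names are made up so that the example is self-contained.

```perl
use strict;
use warnings;

use File::Temp qw(tempdir);
use XML::Twig;

# Hypothetical file names inside a throwaway directory
my $dir = tempdir( CLEANUP => 1 );
my $in  = "$dir/in.xml";
my $out = "$dir/out.xml";

# Write a small sample input file
open my $wfh, '>', $in or die "Cannot write $in: $!";
print {$wfh} "<attribute><name>test</name><value>node1-3</value></attribute>\n";
close $wfh;

my $twig = XML::Twig->new(
    twig_handlers => {
        value => sub {
            my( $t, $node ) = @_;
            $node->set_text( $node->text =~ s/(\d+)\z/$1 + 1/er );
            },
        },
    );

$twig->parsefile( $in );       # parse from disk instead of a string
$twig->print_to_file( $out );  # write the modified document to a new file

# Read the result back to check it
open my $rfh, '<', $out or die "Cannot read $out: $!";
my $result = do { local $/; <$rfh> };
close $rfh;

print $result;
```

Writing to a separate file first, as in the XML::LibXML answer, lets you compare the output with the input before you overwrite anything.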

Some other things to read for XML::Twig:

  • I give a long example in Modify XML data with XML::Twig
  • Perlmonks has a parallel example Edit a node value in xml
