Parse an RPM name into its components

Solution 1:

You don't need to do any of this; RPM has a query format argument which will let you specify exactly the data you want to receive. It will even output without line endings if you don't specify them.

For instance:

rpm --queryformat "%{NAME} %{VERSION} %{RELEASE} %{ARCH}" -q coreutils
rpm --queryformat "The version of %{NAME} is %{VERSION}\n" -q coreutils

rpm --queryformat "%{NAME} %{VERSION} %{RELEASE} %{ARCH}" -qp file.rpm

The complete list of variables you can use can be obtained with:

rpm --querytags

Note that in the case of RELEASE, output like 84.el6 is normal and expected, since this is actually how RPM packages are versioned when packaged by or for a distribution.
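
If you want to consume that output from a script, here is a minimal Python sketch that wraps the same rpm invocation. The helper names (build_query_command, parse_query_output, query_package) are made up for illustration, and it assumes that none of the queried fields contain whitespace, which should hold for NAME, VERSION, RELEASE and ARCH.

```python
import subprocess

# Tags to query; any tag listed by `rpm --querytags` could be used instead.
FIELDS = ("NAME", "VERSION", "RELEASE", "ARCH")


def build_query_command(target, fields=FIELDS, is_file=False):
    """Build the rpm argument list (hypothetical helper, not part of rpm)."""
    # Space-separated tags, ending in a literal backslash-n that rpm expands
    # to a newline, so each matching package prints on its own line.
    query_format = " ".join("%{" + f + "}" for f in fields) + "\\n"
    flag = "-qp" if is_file else "-q"  # -qp queries a package file
    return ["rpm", "--queryformat", query_format, flag, target]


def parse_query_output(output, fields=FIELDS):
    """Turn each well-formed output line into a dict keyed by tag name."""
    records = []
    for line in output.splitlines():
        values = line.split()
        # Skip anything that does not have exactly one value per field.
        if len(values) == len(fields):
            records.append(dict(zip(fields, values)))
    return records


def query_package(target, is_file=False):
    """Run rpm and parse its output; raises CalledProcessError on failure."""
    output = subprocess.check_output(
        build_query_command(target, is_file=is_file),
        universal_newlines=True,
    )
    return parse_query_output(output)
```

Calling query_package("coreutils") on a machine with rpm installed should return one dict per installed version of the package; query_package("file.rpm", is_file=True) does the same for a package file.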

Solution 2:

I've been told the official way to do what I'm seeking is in Python:

from rpmUtils.miscutils import splitFilename

(n, v, r, e, a) = splitFilename(filename)

I've written a short Python program that does what I need. I will offer the script to the rpmdev project for inclusion.
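
rpmUtils ships with yum, so it may not be importable everywhere. As a fallback, here is a rough pure-Python sketch of the same kind of split, working from the right-hand end of the filename. The function name split_rpm_filename is hypothetical and this is not the library's code. It assumes the usual name-version-release.arch.rpm layout with an optional "epoch:" prefix, and returns the fields in the same (n, v, r, e, a) order as above.

```python
import os


def split_rpm_filename(filename):
    """Split an RPM filename into (name, version, release, epoch, arch).

    Illustrative sketch only; assumes name-version-release.arch.rpm,
    optionally prefixed with "epoch:".
    """
    filename = os.path.basename(filename)  # drop any leading directory
    if filename.endswith(".rpm"):
        filename = filename[:-4]

    # Peel fields off the right: arch follows the last dot, release and
    # version follow the last two dashes. Whatever remains is the name,
    # which may itself contain dashes.
    rest, _, arch = filename.rpartition(".")
    rest, _, release = rest.rpartition("-")
    rest, _, version = rest.rpartition("-")

    # An epoch, if present, is separated from the name by a colon.
    epoch, _, name = rest.rpartition(":")
    return name, version, release, epoch, arch
```

Splitting from the right is what lets a dotted release such as 19.el6 and a dashed name such as perl-Foo-Bar come through intact.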

Solution 3:

I worked out regular expressions that fit all the data I was able to test them with. I had to use a mixture of greedy and non-greedy matches. That said, here are my Perl and Python versions:


#! /usr/bin/perl

foreach (@ARGV) {
    # Groups: optional path, name, version, release, platform, ".rpm" suffix
    ($path, $name, $version, $release, $platform,
      @junk) = m#(.*/)*(.*)-(.*)-(.*?)\.(.*)(\.rpm)#;
    $verrel = $version . '-' . $release;

    print join("\t", $path, $name, $verrel, $version, $release, $platform), "\n";
}


#! /usr/bin/python

import sys
import re

for x in sys.argv[1:]:
    # Groups: optional path, name, version, release, platform, ".rpm" suffix
    m = re.search(r'(.*/)*(.*)-(.*)-(.*?)\.(.*)(\.rpm)', x)
    if m:
        (path, name, version, release, platform, _) = m.groups()
        path = path or ''
        verrel = version + '-' + release
        print("\t".join([path, name, verrel, version, release, platform]))
    else:
        sys.stderr.write('ERROR: Invalid name: %s\n' % x)
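
To make the greedy/non-greedy interplay concrete, here is a small sketch that wraps the same pattern in a function and runs it on a few made-up filenames. The function name split_with_regex and the sample filenames are only for illustration. Note that because the release group is non-greedy, a release containing a dot (such as 19.el6) is cut at its first dot and the remainder lands in the platform field.

```python
import re

# Same pattern as in the scripts above.
RPM_NAME_RE = re.compile(r'(.*/)*(.*)-(.*)-(.*?)\.(.*)(\.rpm)')


def split_with_regex(filename):
    """Return (path, name, version, release, platform), or None if no match."""
    m = RPM_NAME_RE.search(filename)
    if m is None:
        return None
    path, name, version, release, platform, _ = m.groups()
    return (path or '', name, version, release, platform)


# The greedy name group backtracks until two dashes remain after it,
# so dashes inside the package name are kept.
print(split_with_regex("perl-Foo-Bar-1.0-2.noarch.rpm"))

# The optional leading path is captured separately.
print(split_with_regex("/tmp/pkgs/foo-1.0-1.i386.rpm"))

# The non-greedy release stops at the first dot, so the ".el6" part
# ends up in the platform field here.
print(split_with_regex("coreutils-8.4-19.el6.x86_64.rpm"))
```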

I'd rather have a regex that comes from the RPM project. The one that I invented above will have to do for now.