Getting a set of numbers with regex in Python

Another way to do it using one regular expression:

import re

string = "serial 7's 93-86-79-72-65 very slow, recall 3/3 "

regex = r"(?<=serial 7's) (\d+-?)+"

matches = re.finditer(regex, string)

for match in matches:
    # group(0) is " 93-86-79-72-65": strip the leading space, then split on "-"
    integers = match.group(0).strip().split("-")
    print(integers) # ['93', '86', '79', '72', '65']

I would use re.findall here combined with split:

import re

string = "serial 7's 93-86-79-72-65 very slow"
matches = re.findall(r"\bserial 7's (\S+)", string)
nums = matches[0].split('-')  # assumes at least one match; otherwise matches[0] raises IndexError
print(nums)

This prints:

['93', '86', '79', '72', '65']
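
One caveat with \S+: it captures whatever non-whitespace token follows the phrase, whether or not it consists of digits. A minimal sketch of the difference, assuming a made-up second sample string and a stricter digits-and-hyphens pattern for comparison (neither is from the question):

```python
import re

# Pattern from the answer above: captures any non-whitespace token.
loose = re.compile(r"\bserial 7's (\S+)")
# Hypothetical stricter variant: digits separated by single hyphens only.
strict = re.compile(r"\bserial 7's (\d+(?:-\d+)*)")

samples = [
    "serial 7's 93-86-79-72-65 very slow",
    "serial 7's refused by patient",  # made-up input with no numbers
]

for text in samples:
    print(loose.findall(text), strict.findall(text))
# ['93-86-79-72-65'] ['93-86-79-72-65']
# ['refused'] []
```

If the input is known to always contain numbers after the phrase, the looser pattern is fine as it stands.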

My two cents: you could use the pattern below with re.search:

\bserial 7's\s(\d+(?:-\d+)*)

import re
s = "serial 7's 93-86-79-72-65 very slow, recall 3/3 "
res = re.search(r"\bserial 7's\s(\d+(?:-\d+)*)", s)
if res:
    print(res.group(1).split('-')) # ['93', '86', '79', '72', '65']
else:
    print('No match')

I'd first check whether a match occurs at all. The pattern requires at least one number, and multiple values must be delimited by a hyphen, which fits what you mentioned: "Note that there might be unknown number of integers we are trying to extract. I only want numbers with the specific pattern."

  • \b - Word boundary.
  • serial 7's - Match "serial 7's" literally.
  • \s - A single whitespace character.
  • ( - Open capture group.
  • \d+ - Match at least a single digit.
  • (?:-\d+)* - Non-capture group matching, zero or more times, a hyphen followed by at least a single digit.
  • ) - Close capture group.
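
If integers rather than strings are wanted, the same pattern can be wrapped in a small helper. This is only a sketch: the function name extract_serial_sevens and the choice to return an empty list on no match are my own assumptions, not something from the question.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
SERIAL_SEVENS = re.compile(r"\bserial 7's\s(\d+(?:-\d+)*)")

def extract_serial_sevens(text):
    """Return the numbers following "serial 7's" as ints, or [] if there is no match."""
    res = SERIAL_SEVENS.search(text)
    if res is None:
        return []
    # group(1) is the hyphen-delimited run, e.g. "93-86-79-72-65"
    return [int(n) for n in res.group(1).split("-")]

print(extract_serial_sevens("serial 7's 93-86-79-72-65 very slow, recall 3/3 "))
# [93, 86, 79, 72, 65]
print(extract_serial_sevens("recall 3/3"))
# []
```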

Alternatively, one could use the third-party regex module (pip install regex) instead, which supports a variable-width positive lookbehind:

(?<=\bserial 7's\s+(?:\d+-)*)\d+

import regex
s = "serial 7's 93-86-79-72-65 very slow, recall 77 3/3 "
lst = regex.findall(r"(?<=\bserial 7's\s+(?:\d+-)*)\d+", s)
print(lst) # ['93', '86', '79', '72', '65']

  • (?<= - Start of the positive lookbehind.
    • \b - A word boundary.
    • serial 7's - Literally "serial 7's".
    • \s+ - One or more whitespace characters.
    • (?: - Open non-capture group.
      • \d+- - Match at least a single digit followed by a hyphen.
      • )* - Close non-capture group and match it zero or more times.
    • ) - Close positive lookbehind.
  • \d+ - Match at least a single digit.
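
For comparison, the built-in re module only accepts fixed-width lookbehinds, so as far as I know the pattern above fails to compile there with re.error. A rough standard-library sketch that gets the same list in two steps (the two-step split is my own workaround, not part of the regex-module approach):

```python
import re

s = "serial 7's 93-86-79-72-65 very slow, recall 77 3/3 "

# Expected to fail with the built-in module: the lookbehind is not fixed-width.
try:
    re.compile(r"(?<=\bserial 7's\s+(?:\d+-)*)\d+")
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

# Two-step fallback: capture the whole hyphen-delimited run, then pull out the digits.
run = re.search(r"\bserial 7's\s+(\d+(?:-\d+)*)", s)
lst = re.findall(r"\d+", run.group(1)) if run else []
print(lst) # ['93', '86', '79', '72', '65']
```

As with the lookbehind version, the later "77" and "3/3" are left out because they are not part of the run that directly follows "serial 7's".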