Regex to extract numbers and trailing letter or white space

I would suggest the following regex pattern:

[0-9]{1,3}(?:,[0-9]{3})*(?:\\.[0-9]+)?[A-Za-z]*

This pattern generates the outputs you expect. Here is an explanation:

[0-9]{1,3}      match 1 to 3 initial digits
(?:,[0-9]{3})*  followed by zero or more optional thousands groups
(?:\\.[0-9]+)?  followed by an optional decimal component
[A-Za-z]*       followed by an optional text unit

I tend to lean towards base R solutions whenever possible, and here is one using gregexpr and regmatches:

txt <- "53.2k Followers, 11 Following, 1,396 Posts"
m <- gregexpr("[0-9]{1,3}(?:,[0-9]{3})*(?:\\.[0-9]+)?[A-Za-z]*", txt)
regmatches(txt, m)

[[1]]
[1] "53.2k"   "11"   "1,396"

Tags:

Regex

R