Use gsub to remove everything before the first numeric character

From R 3.6.0 onwards, trimws has a whitespace argument that specifies what is regarded as whitespace. In this case, that is any non-digit character:

x <- c("lala65lolo","papa3hihi","george365meumeu")
trimws(x, "left", "\\D")
## [1] "65lolo"    "3hihi"     "365meumeu"
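Because the whitespace argument does not exist in older releases, a version check can guard the call. The following is a minimal sketch: strip_leading_nondigits is a hypothetical helper name, not part of base R, and the sub() fallback is assumed to be an acceptable substitute for this particular pattern.

```r
x <- c("lala65lolo", "papa3hihi", "george365meumeu")

# Hypothetical helper: use trimws() where the `whitespace` argument exists,
# otherwise fall back to an anchored sub() with the same pattern.
strip_leading_nondigits <- function(s) {
  if (getRversion() >= "3.6.0") {
    trimws(s, which = "left", whitespace = "\\D")
  } else {
    sub("^\\D+", "", s)
  }
}

strip_leading_nondigits(x)
## [1] "65lolo"    "3hihi"     "365meumeu"
```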

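You may use sub (the pattern is anchored at the start of the string, so it can match only once and gsub is not needed):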

> x <- c("lala65lolo","papa3hihi","george365meumeu")
> sub("^\\D+", "", x)
[1] "65lolo"    "3hihi"     "365meumeu"

Or, to remove the prefix only when a digit actually follows it (strings without a digit are then left unchanged):

sub("^\\D+(\\d)", "\\1", x)

The pattern matches

  • ^ - start of string
  • \\D+ - one or more characters other than a digit
  • (\\d) - Capturing group 1: a digit (the \\1 in the replacement pattern restores the digit captured in this group).
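The difference between the two patterns only shows up when a string contains no digit at all. A small sketch, where y is a made-up input vector used purely for illustration:

```r
# Hypothetical input: the second element contains no digit
y <- c("lala65lolo", "nodigits")

# Without the capturing group, \\D+ consumes the whole digit-free string,
# so it is replaced by an empty string
plain <- sub("^\\D+", "", y)
plain
## [1] "65lolo" ""

# With the capturing group, the match fails when no digit follows,
# so the digit-free string is returned unchanged
guarded <- sub("^\\D+(\\d)", "\\1", y)
guarded
## [1] "65lolo"   "nodigits"
```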

In a similar way, you may achieve the following:

  • sub("^\\s+", "", x) - remove all text up to the first non-whitespace char
  • sub("^\\W+", "", x) - remove all text up to the first word char
  • sub("^[^-]+", "", x) - remove all text up to the first hyphen (note that a string with no hyphen at all is emptied, because the pattern then matches the whole string), etc.
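The variants above can be tried on a few sample strings. This is only a sketch, and the inputs are invented to exercise each pattern:

```r
# Leading whitespace is removed
ws <- sub("^\\s+", "", "   padded text")
ws
## [1] "padded text"

# Leading non-word characters are removed
nw <- sub("^\\W+", "", "...!word, more")
nw
## [1] "word, more"

# Everything before the first hyphen is removed; the hyphen itself stays
hy <- sub("^[^-]+", "", "key-value-pair")
hy
## [1] "-value-pair"

# No hyphen present: the whole string matches and is removed
nohy <- sub("^[^-]+", "", "nohyphen")
nohy
## [1] ""
```

If strings without the delimiter should be kept, the capturing-group form shown earlier carries over, for example sub("^[^-]+(-)", "\\1", x).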
