R remove last word from string

Here's a regexp that does what you need:

sub(df1$city, pattern = " [[:alpha:]]*$", replacement = "")

[1] "Middletown" "Sunny Valley" "Hillside"

That's replacing a substring that starts with a space, then contains only letters until the end of the string, with an empty string.
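
As a quick check, here is a minimal sketch on made-up data. The question's df1 isn't shown here, so the city values below are assumptions chosen to reproduce the output above; the second call illustrates a limitation of matching letters only.

```r
# Hypothetical data: the original df1 is not shown, so these values are assumed.
df1 <- data.frame(
  city = c("Middletown City", "Sunny Valley Town", "Hillside Village"),
  stringsAsFactors = FALSE
)

# Drop the final space plus the run of letters that follows it.
trimmed <- sub(pattern = " [[:alpha:]]*$", replacement = "", x = df1$city)
print(trimmed)

# A last word containing digits is left alone, because [[:alpha:]]
# matches letters only and so the pattern never reaches the end of the string.
with_digits <- sub(pattern = " [[:alpha:]]*$", replacement = "", x = "Route 66")
print(with_digits)
```

If the last word may contain digits or punctuation, a broader class such as [^ ] (anything but a space) is one possible substitute.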


I would use word() from the stringr package (with dplyr loaded for mutate() and %>%), like so:

df1 %>% mutate(city = word(city, 1, -2))

The start argument (1) indicates that you're starting from the first word, and the end argument (-2) indicates that you're keeping everything up to and including the second-to-last word. Negative positions count backwards from the end of the string.
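
The same call can be tried on a plain character vector without dplyr. This is only a sketch: the input values are assumed, since the original df1 isn't shown, and it needs the stringr package installed.

```r
library(stringr)

# Hypothetical input; the original df1 is not shown.
city <- c("Middletown City", "Sunny Valley Town", "Hillside Village")

# Keep words 1 through the second-to-last one (end = -2 counts from the end).
kept <- word(city, start = 1, end = -2)
print(kept)
```

Note that word() splits on a single space by default (its sep argument), so strings with irregular whitespace may need cleaning first.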


This will work:

gsub("\\s*\\w*$", "", df1$city)
[1] "Middletown"   "Sunny Valley" "Hillside"   

It removes any substring consisting of zero or more whitespace characters, followed by any number of "word" characters (letters, digits, or underscores), followed by the end of the string.
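
One edge case worth knowing: because \\s* and \\w* can both match nothing, a string that is a single word gets removed entirely. The sketch below uses assumed input values to show this, alongside a stricter variant (my own suggestion, not part of the answer above) that requires at least one whitespace character before the last word.

```r
# Hypothetical input mixing a multi-word name and a single-word name.
city <- c("Sunny Valley Town", "Hillside")

# The pattern can match with zero spaces, so a lone word is removed entirely.
loose <- gsub("\\s*\\w*$", "", city)

# Requiring at least one whitespace character and one word character
# leaves single-word strings untouched.
guarded <- sub("\\s+\\w+$", "", city)

print(loose)
print(guarded)
```

Which behaviour is right depends on the data: if every value is known to have at least two words, both patterns give the same result.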

Tags: String, Regex, R