Match sequences of consecutive characters in a string

Using a regex in Ruby 1.8.7+:

p s.scan(/((\d)\2*)/).map(&:first)
#=> ["111", "22", "1"]

This works because (\d) captures any single digit, and \2* then matches zero or more repetitions of whatever that group matched (it is group 2 because its opening parenthesis is the second one). The outer (…) is needed so that scan returns the entire run as a capture. On its own, scan returns:

[["111", "1"], ["22", "2"], ["1", "1"]]

…so we need to map over the result and keep just the first item in each array. In Ruby 1.8.6 (which lacks the Symbol#to_proc convenience), use an explicit block:

p s.scan(/((\d)\2*)/).map{ |x| x.first }
#=> ["111", "22", "1"]
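
To see the two steps separately, here is a minimal sketch. The value of s is an assumption (the snippets above never show it); "111221" is simply one input consistent with the output shown.

```ruby
# Assumed sample input; the original snippets do not define s.
s = "111221"

# With capture groups in the pattern, scan returns one array of
# captures per match: [whole run, the single repeated digit].
pairs = s.scan(/((\d)\2*)/)
p pairs

# Keep only the first capture (the whole run) from each pair.
runs = pairs.map { |whole, _digit| whole }
p runs
```

Destructuring the block arguments does the same job as x.first and makes it explicit which capture is being kept.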

Without a regex, here's a fun one (it matches runs of any character, not just digits) that works in Ruby 1.9.2+:

p s.chars.chunk{|c|c}.map{ |n,a| a.join }
#=> ["111", "22", "1"]
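
On newer Rubies (2.3 and later, where Enumerable#chunk_while exists) the same idea can be written by comparing neighbouring characters directly. This is a sketch of an alternative, not part of the approach above, and the value of s is again an assumed sample input.

```ruby
# Assumed sample input; the original snippets do not define s.
s = "111221"

# chunk_while keeps extending the current chunk for as long as the
# block returns true for each adjacent pair of elements.
runs = s.each_char.chunk_while { |a, b| a == b }.map(&:join)
p runs

# Because nothing here is digit-specific, letters work too.
letters = "aabccc".each_char.chunk_while { |a, b| a == b }.map(&:join)
p letters
```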

Here's another version that should work even in Ruby 1.8.6:

p s.scan(/./).inject([]){|a,c| (a.last && a.last[0]==c[0] ? a.last : a)<<c; a }
#=> ["111", "22", "1"]
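
The one-liner is dense, so here is the same logic spelled out as a sketch. The names acc and ch are mine, and s is an assumed sample input.

```ruby
# Assumed sample input; the original snippets do not define s.
s = "111221"

# scan(/./) yields one single-character string per character
# (note that . does not match a newline by default).
runs = s.scan(/./).inject([]) do |acc, ch|
  if acc.last && acc.last[0] == ch[0]
    # Same character as the current run: append to the last string in place.
    acc.last << ch
  else
    # Different character (or first iteration): start a new run.
    acc << ch
  end
  acc # inject needs the accumulator returned from the block
end
p runs
```

Comparing a.last[0] with c[0] rather than whole strings is what keeps it portable: on 1.8 String#[] with an integer returns a character code, on 1.9+ it returns a one-character string, and either way the two sides are comparable.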

Tags:

Ruby

Regex