I’ve reorganized the regular expression content in the new Programming Ruby, and added some cool new advanced examples. This one’s fairly straightforward, but I love the fact that I can now start refactoring my more complex patterns, removing duplication.

The stuff below is an extract from the unedited update. It’ll appear in the next beta. It follows a discussion of named groups, \k and related stuff.

There’s a trick that lets us write subroutines inside regular expressions. Recall that we can invoke a named group using \g<name>, and we define the group using (?<name>...). Normally, the definition of the group is itself matched as part of executing the pattern. However, if you add the suffix {0} to the group, it means “zero matches of this group,” so the group is not executed when first encountered. The group is still defined, though, so later \g<name> calls can invoke it.

sentence = %r{ 
    (?<subject>   cat   | dog   | gerbil    ){0} 
    (?<verb>      eats  | drinks| generates ){0} 
    (?<object>    water | bones | PDFs      ){0} 
    (?<adjective> big   | small | smelly    ){0} 

    (?<opt_adj>   (\g<adjective>\s)?     ){0} 

    The\s\g<opt_adj>\g<subject>\s\g<verb>\s\g<opt_adj>\g<object> 
}x

md = sentence.match("The cat drinks water") 
puts "The subject is #{md[:subject]} and the verb is #{md[:verb]}"
 
md = sentence.match("The big dog eats smelly bones") 
puts "The adjective in the second sentence is #{md[:adjective]}" 

sentence =~ "The gerbil generates big PDFs" 
puts "And the object in the last is #{$~[:object]}" 

produces:

The subject is cat and the verb is drinks 
The adjective in the second sentence is smelly 
And the object in the last is PDFs 

Cool, eh?
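
The same trick can take the duplication out of more everyday patterns. Here is a minimal sketch, not from the book: the constant IPV4 and the group name octet are made up for illustration, and the pattern only checks dotted-quad shape and range. The octet rule is written once, marked {0} so it isn't matched where it is defined, and then invoked four times with \g<octet>.

```ruby
# Hypothetical example: IPV4 and the "octet" group are illustrative names.
# The octet definition carries {0}, so it is defined but not matched in place;
# the last line of the pattern calls it once, then three more times after dots.
IPV4 = %r{
    (?<octet> 25[0-5] | 2[0-4]\d | 1\d\d | [1-9]?\d ){0}

    \A \g<octet> (\.\g<octet>){3} \z
}x

puts IPV4.match?("192.168.0.1")   # => true
puts IPV4.match?("256.1.1.1")     # => false (256 is out of range)
puts IPV4.match?("1.2.3")         # => false (only three octets)
```

Without the subroutine, the four-way alternation for an octet would have to be spelled out twice (once before the first dot and once inside the repeated group), which is exactly the kind of duplication that is easy to get out of sync.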
