Saturday, October 14, 2006

Ruby Regular Expression Replace

Ruby provides regular expression replacement via the gsub method of String. For example, to replace a word I need only a regex that matches the word and the replacement text.
irb(main):006:0> "replacing in regex".gsub /\sin\s/, ' is easy in '
=> "replacing is easy in regex"
The gsub method also lets you reuse the matched text in the replacement: \0 stands for the entire match.
irb(main):007:0> "replacing in regex".gsub /\sin\s/, ' is easy\0'
=> "replacing is easy in regex"
If you want only part of the match rather than the whole thing, you will need a capturing group (parentheses).
irb(main):008:0> "replacing in regex".gsub /\si(n)\s/, ' is easy i\1 '
=> "replacing is easy in regex"
When the pattern matches multiple times in a single string, \1 still refers to the group captured by each individual match.
irb(main):015:0> "1%, 10%, 100%".gsub /(\d+)%/, '0.0\1'
=> "0.01, 0.010, 0.0100"
Another important detail to note is that I'm using single quotes in my replacement string. Double quotes work with neither \0 nor \1, because a double-quoted string treats the backslash sequence as a character escape: "\0" #=> "\000".
irb(main):009:0> "replacing in regex".gsub /\sin\s/, " is easy\0"
=> "replacing is easy\000regex"
irb(main):010:0> "replacing in regex".gsub /\si(n)\s/, " is easy i\1 "
=> "replacing is easy i\001 regex"
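To make the quoting difference concrete, here is a minimal sketch that compares the two literals directly. It uses only core Ruby; the variable names are illustrative and not from the original examples.

```ruby
# Single quotes keep the backslash, so gsub receives a backreference.
single = '\1'
# Double quotes turn \1 into the single control character with code 1.
double = "\1"

puts single.length        # backslash plus "1"
puts double.length        # one control character
puts double.bytes.inspect

# Escaping the backslash makes the double-quoted literal equal to the single-quoted one.
escaped = "\\1"
puts escaped == single

# With the escaped form, the backreference works inside double quotes.
result = "replacing in regex".gsub(/\si(n)\s/, " is easy i\\1 ")
puts result
```

The escaped double-quoted form is mainly useful when the replacement also needs interpolation, which single quotes do not support.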

7 comments:

  1. Using double quotes neither works with \0 or \1 since "\0" #=> "\000".

    Actually, all you need to do is escape the backslash so that it is not consumed as a character escape when the double-quoted string is parsed.

    >> suffix = '!'; "fi fie fo fum".gsub(/([aeiou]+)/,"\\1#{suffix}")

    => "fi! fie! fo! fu!m"

  2. Anonymous 2:15 AM

    Dude, you didn't cover the most important thing about Ruby's gsub that distinguishes it from Perl and other languages. It takes a block argument! The block is passed the string matching the regular expression, and the return value of the block is used as the replacement string. Thus you can do something like the following:

    irb(main):001:0> "blah23blah45".gsub(/\d+/){|num| ((num.to_i)*3).to_s}
    => "blah69blah135"

  3. Anonymous 9:23 AM

    I was trying to focus on replacement via regex. But, I agree that being able to pass a block to gsub is very cool.

  4. Anonymous 12:38 AM

    I am new to Ruby and did not understand the 'block argument' concept.
    <<{|num| ((num.to_i)*3).to_s}>>

    Can someone please explain what it's doing? As I recall, even Perl supports an expression in the replacement part of a regex, doesn't it?

  5. Anonymous 1:15 AM

    escaping is fine, but another way is:

    "you may have to use double quotes, and keep access on: #{'\0'}"

  6. i wonder if you can do this:

    string.gsub(/#{variable}/)

    i.e. substitute a variable into the regexp part, just the way you can do with a string literal in double quotes

  7. Anonymous 2:45 PM

    this is a better script. the suggested method above would lead to errors when parse_query tries to generate the hashes

    rack_input = env["rack.input"].read.gsub(/\&[a-zA-Z]+=/,'nnd\0').gsub("nnd&","^^").gsub("&","^*^").gsub("^^","&")

    params = Rack::Utils.parse_query(rack_input, "&")

    params.each_pair do |k,y|
    params[k] = y.gsub("^*^","&")
    end

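The block form of gsub raised in the comments can be sketched as follows. This is a minimal illustration rather than any commenter's code: gsub calls the block once per match, passes it the matched text, and substitutes the block's return value. The helper name triple_numbers is hypothetical.

```ruby
# Hypothetical helper: triple every run of digits in a string.
def triple_numbers(text)
  text.gsub(/\d+/) do |num|
    # num is the matched substring, e.g. "23"; the block's value replaces it.
    (num.to_i * 3).to_s
  end
end

puts triple_numbers("blah23blah45")

# Inside the block, capture groups are available through $1, $2, and so on.
percentages = "1%, 10%, 100%".gsub(/(\d+)%/) { ($1.to_i / 100.0).to_s }
puts percentages
```

Because the block runs arbitrary Ruby, it can compute a replacement that a plain '\1' template cannot, such as the arithmetic in the second call.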

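On the question of substituting a variable into the regexp: #{} interpolation does work inside a regex literal. A minimal sketch, assuming the hypothetical variables word and literal; Regexp.escape is used when the variable should be matched literally rather than interpreted as a pattern.

```ruby
word = "in"
# Interpolation inside /.../ works as it does in a double-quoted string.
interpolated = "replacing in regex".gsub(/\s#{word}\s/, ' is easy\0')
puts interpolated

# If the variable may contain regex metacharacters, escape it first.
literal = "1.5"
unescaped = "1.5 and 1x5".gsub(/#{literal}/, "N")                 # "." matches any character
escaped   = "1.5 and 1x5".gsub(/#{Regexp.escape(literal)}/, "N")  # "." matches only a dot
puts unescaped
puts escaped
```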