Archive for the ‘regular expressions’ Category

How to replace a substring using regex in Python

June 6th, 2010 7 comments

The problem: You match a string with your regex, but you need to replace just a portion of the match. How can you do that?
The trick is simple: put the part of the pattern you want to replace inside “()”, which marks a “group” in regex syntax. If the regex matches, you can replace just that portion by using the positions stored in Python's match object, as in this example:

import re

# the first group contains the expression we want to replace
pat = r"word1\s(.*)\sword2"
test = "word1 will never be a word2"
repl = "replace"

m = re.search(pat, test)

# m.group(1) is None when the group did not take part in the match
if m and m.group(1) is not None:
  # keep everything before and after the group, swap only the group itself
  line = test[:m.start(1)] + repl + test[m.end(1):]
  print(line)
else:
  print("the pattern didn't capture any text")

This will print: ‘word1 replace word2’

The group to be replaced can be located at any position in the string.
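
To reuse the same trick with other patterns, the slicing can be wrapped in a small helper. This is only a sketch: the name replace_group and its group parameter are made up for illustration and are not part of the re module. It replaces the group in the first match only and returns the text unchanged when nothing was captured.

```python
import re

def replace_group(pattern, text, repl, group=1):
    """Replace only the text captured by `group` in the first match of `pattern`.

    Hypothetical helper for illustration; not part of the standard library.
    """
    m = re.search(pattern, text)
    # No match at all, or the group did not take part in the match
    if m is None or m.group(group) is None:
        return text
    # Keep the text before and after the group, swap only the group itself
    return text[:m.start(group)] + repl + text[m.end(group):]

# Group in the middle of the match
print(replace_group(r"word1\s(.*)\sword2", "word1 will never be a word2", "replace"))
# Group at the start of the match
print(replace_group(r"(\d+) apples", "I ate 12 apples", "three"))
```

The first call prints ‘word1 replace word2’ and the second prints ‘I ate three apples’.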

Multiline regex pattern

June 5th, 2010 6 comments

Task: Parse a file and capture whatever text appears between a pair of double quotes like the following:

"Catch me"

Not so difficult: you could use the following regex:

".*"

This will match any characters enclosed in double quotes.
Any character? Read more…
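
As a quick check of the regex above, here is a minimal sketch that tries it with Python's re module. The parentheses, the QUOTED name and the find_quoted helper are additions for illustration, used only to pull out the text between the quotes. The second call relies on a documented default: `.` does not match a newline unless a flag such as re.DOTALL is passed.

```python
import re

# The regex from the post, with a group added around .* (an assumption
# made here so the quoted text can be extracted on its own)
QUOTED = re.compile(r'"(.*)"')

def find_quoted(text):
    """Return the text between double quotes, or None if nothing matches."""
    m = QUOTED.search(text)
    return m.group(1) if m else None

# Quoted text on a single line is found
print(find_quoted('He said "Catch me" and left.'))

# Quoted text that spans two lines is not, because . stops at the newline
print(find_quoted('"Catch\nme"'))
```

The first call prints ‘Catch me’ and the second prints ‘None’.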