<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Web-In-Sight &#187; backref</title>
	<atom:link href="http://web-in-sight.nl/tag/backref/feed/" rel="self" type="application/rss+xml" />
	<link>http://web-in-sight.nl</link>
	<description>Insight into the internet and work</description>
	<lastBuildDate>Mon, 30 Jan 2012 09:00:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Solved python regex raising exception &#8220;unmatched group&#8221;</title>
		<link>http://web-in-sight.nl/2008/07/11/solved-python-regex-raising-exception-unmatched-group/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=solved-python-regex-raising-exception-unmatched-group</link>
		<comments>http://web-in-sight.nl/2008/07/11/solved-python-regex-raising-exception-unmatched-group/#comments</comments>
		<pubDate>Fri, 11 Jul 2008 08:07:40 +0000</pubDate>
		<dc:creator>Gerard</dc:creator>
				<category><![CDATA[All ENGLISH articles]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[backref]]></category>
		<category><![CDATA[bug]]></category>
		<category><![CDATA[exception]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[regex]]></category>

		<guid isPermaLink="false">http://www.gp-net.nl/?p=51</guid>
		<description><![CDATA[If you&#8217;re a regex guru and you know why you came here, you can go straight to the brief explanation. If not, just keep reading. I found a workaround for Python bug 1519638. It most definitely will not solve all &#8230; <a href="http://web-in-sight.nl/2008/07/11/solved-python-regex-raising-exception-unmatched-group/">Read more <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><!--TOC-->If you&#8217;re a regex guru and you know why you came here, you can go straight to the <a href="#toc-brief-explanation">brief explanation</a>. If not, just keep reading.</p>
<p>I found a workaround for Python bug <a title="issue1519638" href="http://bugs.python.org/issue1519638" target="_blank">1519638</a>. It will not solve every puzzle out there, but it stops the sub method from breaking when the replacement string uses backrefs.</p>
<h3>The problem</h3>
<p>If you would like to replace this:</p>
<pre>&lt;label for="author"&gt;&lt;small&gt;Name</pre>
<p>With this:</p>
<pre>&lt;label for="author"&gt;&lt;small&gt;Naam</pre>
<p>And you&#8217;re not sure if the &lt;small&gt; tag is there, you would group the chars &#8220;&lt;small&gt;&#8221; and use a question mark to make them optional. BTW, running a replace on just &#8220;Name&#8221; is not an option because it would mess up other parts of the file in question.</p>
<p><em>Example updated. Thanx dbr!</em></p>
<h3>The solution</h3>
<p>Using a compiled regex pattern, a solution might look like this:</p>
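<pre>import re
# data is assumed to hold the text of the file in question.
# Group 2 is optional: the &lt;small&gt; tag may or may not be there.
reg = re.compile(r'(&lt;label for="author"&gt;)(&lt;small&gt;)?(Name)')
# Keep groups 1 and 2 and swap "Name" for "Naam".
replace = r'\g&lt;1&gt;\g&lt;2&gt;Naam'
search = reg.sub(replace, data)</pre>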
<p>In this case the replacement string uses backreferences to the groups, that is, the subexpressions enclosed in parentheses in the search pattern.</p>
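<p>As a minimal sketch of the case that works, assuming a sample string (not taken from the original file) in which the &lt;small&gt; tag is present, both backreferences have something to refer to:</p>

```python
import re

# Hypothetical sample input; the <small> tag is present here.
data = '<label for="author"><small>Name'

# Group 1 is the label tag, group 2 the optional <small> tag.
reg = re.compile(r'(<label for="author">)(<small>)?Name')

# \g<1> and \g<2> put the matched groups back, followed by the new text.
result = reg.sub(r'\g<1>\g<2>Naam', data)
print(result)  # <label for="author"><small>Naam
```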
<h3>The oops</h3>
<p>However, if the &#8220;&lt;small&gt;&#8221; tag is not there, the sub call raises an exception.</p>
<pre>$ python regex.py
Traceback (most recent call last):
  File "regex.py", line 14, in &lt;module&gt;
    search = reg.sub(replace, data)
  File "/usr/lib/python2.5/re.py", line 274, in filter
    return sre_parse.expand_template(template, match)
  File "/usr/lib/python2.5/sre_parse.py", line 793, in expand_template
    raise error, "unmatched group"
sre_constants.error: unmatched group</pre>
<p>This happens because the second group, referenced as &#8220;\g&lt;2&gt;&#8221; in the replacement string, did not take part in the match, so it yields &#8220;None&#8221; instead of an empty string. That is (or seems to be) the bug.</p>
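<p>The None can be made visible by inspecting the match object directly. This is only a sketch, and the sample string is an assumption. As far as I know, Python 3.5 and later substitute an empty string for an unmatched group in sub, so the exception itself only shows up on older interpreters, while group(2) still reports None.</p>

```python
import re

reg = re.compile(r'(<label for="author">)(<small>)?Name')

# Hypothetical sample input without the <small> tag.
match = reg.search('<label for="author">Name')

print(match.group(1))  # <label for="author">
print(match.group(2))  # None, because the optional group did not participate
```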
<h3>Solving the oops</h3>
<p>This can be resolved by replacing the optional notation &#8220;(&lt;small&gt;)?&#8221; with an alternation &#8220;(|&lt;small&gt;)&#8221;. When the &#8220;&lt;small&gt;&#8221; tag is absent, the group still takes part in the match through its empty subexpression, so it returns an empty string and the sub call won&#8217;t raise the exception.</p>
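<p>A small sketch of the workaround, again with made-up sample strings, shows the same pattern handling both inputs:</p>

```python
import re

# The optional group is written as an alternation with an empty branch.
reg = re.compile(r'(<label for="author">)(|<small>)Name')
replace = r'\g<1>\g<2>Naam'

# Hypothetical inputs: one with the <small> tag, one without it.
with_tag = reg.sub(replace, '<label for="author"><small>Name')
without_tag = reg.sub(replace, '<label for="author">Name')

print(with_tag)     # <label for="author"><small>Naam
print(without_tag)  # <label for="author">Naam
```

<p>The empty branch is tried first, but the regex engine backtracks to the &lt;small&gt; branch when &#8220;Name&#8221; does not follow immediately, so the tag is still captured when it is there.</p>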
<p>In other words &#8230;</p>
<h3>Brief explanation</h3>
<p>When doing a search and replace with sub, replace the optional group with a group written as an alternation that has one empty subexpression. So instead of &#8220;(.+?)?&#8221; use &#8220;(|.+?)&#8221; (without the double quotes).</p>
<p>If this group matches nothing, the empty subexpression matches. An empty string is then returned instead of None, and the sub method runs normally instead of raising the &#8220;unmatched group&#8221; error.</p>
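<p>To compare the two notations side by side, here is a rough sketch. The surrounding &#8220;x&#8221; and &#8220;y&#8221; literals and the variable names are only there for illustration:</p>

```python
import re

optional = re.compile(r'x(.+?)?y')     # optional group
alternation = re.compile(r'x(|.+?)y')  # alternation with an empty branch

# Nothing sits between "x" and "y", so the group has nothing to capture.
opt_empty = optional.search('xy').group(1)
alt_empty = alternation.search('xy').group(1)

# With text in between, the alternation captures it as usual.
alt_full = alternation.search('xabcy').group(1)

print(opt_empty)        # None
print(repr(alt_empty))  # ''
print(alt_full)         # abc
```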
<p>That&#8217;s all folks &#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://web-in-sight.nl/2008/07/11/solved-python-regex-raising-exception-unmatched-group/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
	</channel>
</rss>

