GREP - MSUB Find and Replace for Back Button

The Argument

When I wrote the first of my web pages, I didn't include the back-button that links back to the Psy web site's homepage. I wanted to add the button to all my pages, without having to go through each one individually, and manually cut and paste the button code.

Also, after a first attempt at doing this, I got the replace string wrong, so I had some files with broken button-code. This meant that I couldn't restrict my search to the <HR> tag followed by empty space (perhaps whitespace and/or empty lines), and terminated by the <ADDRESS> tag. Now, I had to allow for non-blank-lines between the two tags, to include the broken code.

Limitation

Having to allow for the non-blank lines widens the margin for error, because there is more opportunity for false positives. In other words, the regular expression (RE) may match more than I want. Taken in its widest sense, the RE will match an <HR> tag followed by anything followed by an <ADDRESS> tag. I can't imagine many circumstances, other than the intended one, where a page of mine would contain that sequence. However, it is a limitation in the general case.
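To make the false-positive risk concrete, here is a minimal Python sketch. It is not MSUB: the pattern is only an assumed stand-in for the search expression given in the next section, with [ \t]* standing in for :w* and an explicit \n for each line end. The sample page is invented for illustration.

```python
import re

# Assumed Python stand-in for the MSUB search expression (not the MSUB syntax itself).
LOOSE_RE = re.compile(
    r"^[ \t]*<hr>[ \t]*\n(?:.*\n)*?^[ \t]*<address>",
    re.MULTILINE,
)

# Hypothetical page with an extra <hr> in the middle of the real content.
page = (
    "<h1>Title</h1>\n"
    "<hr>\n"
    "<p>Real content that should be kept.</p>\n"
    "<hr>\n"
    "\n"
    "<address>\n"
)

# The match starts at the FIRST <hr>, so the paragraph is swallowed too.
swallowed = LOOSE_RE.search(page).group(0)
print(swallowed)
```

Because the match begins at the first <hr> it finds, any real content between that rule and the <ADDRESS> tag would be replaced along with the footer.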

The Search Expression

^:w*"<hr>":w*$(^.*$)*^:w*"<address>"

Explanation

This MSUB script looks for an <HR> tag, followed by blank or non-blank lines (anything, in other words), followed by the <ADDRESS> tag. This defines the footer of my pages. I end the main content of my web pages with a horizontal rule, and then I put my authorship details between <ADDRESS> tags. When I wrote this pattern, I was using only lower-case for my tags, so that is what the pattern specifies. I have since changed to upper-case, and this is reflected in the prose of this explanation.

To make things easier, I have chunked the RE into search lines.

First Line

^:w*"<hr>":w*$

SOL, whitespace (0 or more), literal string for the <HR> tag, whitespace (0 or more), EOL

Whitespace is of indefinite length, but terminated by EOF. So, you can never have two whitespace matches in a row, because a single match already absorbs all of the contiguous whitespace characters. However, I have to allow for no whitespace at all (if the <HR> tag is at the very beginning of the line), so I need the closure operator (*), which means 0 or more occurrences. I could equally have used the optional operator (?), meaning 0 or 1 occurrence, given that whitespace is of indefinite length.
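The claim that * and ? are interchangeable here can be checked with a small Python sketch. It rests on one assumption taken from the description above: that :w matches a whole run of whitespace. That run is modelled here as (?:[ \t]+), which is my stand-in and not MSUB syntax.

```python
import re

# Assumption: one ":w" equals one whole run of spaces/tabs.
RUN = r"(?:[ \t]+)"

# Same first-line pattern, once with the closure (*) and once with the optional (?).
closure = re.compile(r"^" + RUN + r"*<hr>" + RUN + r"*$")
optional = re.compile(r"^" + RUN + r"?<hr>" + RUN + r"?$")

samples = ["<hr>", "   <hr>", "\t<hr>  ", "x<hr>"]

# Each pair is (closure matched, optional matched) for one sample line.
results = [(bool(closure.match(s)), bool(optional.match(s))) for s in samples]
print(results)
```

Under that assumption both forms accept and reject the same lines. If :w matched only a single character, the two would differ, and only the closure form would accept deeper indentation.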

Second Line

(^.*$)*

This matches any line, blank or not. The .* means 0 or more appearances of any character other than SOL and EOL. Putting the SOL, any-char, EOL pattern in brackets allows me to specify that the blank/non-blank line may appear multiple times, or not at all, using the * qualifier outside the brackets.

Third Line

^:w*"<address>"

This matches the line containing the <ADDRESS> tag. It starts with the SOL, then allows for any indentation (some or no whitespace), followed by the tag. This terminates the search string.
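For readers without MSUB, the three search lines can be approximated in Python as follows. This is a rough translation under stated assumptions, not the original tool's behaviour: :w* becomes [ \t]*, each $ followed by ^ becomes an explicit \n (Python's anchors are zero-width and do not consume the line break), and the middle group is made non-greedy so that it stops at the first <address>. The sample page is hypothetical.

```python
import re

FOOTER_RE = re.compile(
    r"^[ \t]*<hr>[ \t]*\n"   # first line: the <hr> tag alone on its line
    r"(?:.*\n)*?"            # second line: any lines, blank or not, zero or more
    r"^[ \t]*<address>",     # third line: optional indentation, then <address>
    re.MULTILINE,            # lets ^ match at the start of every line
)

# Hypothetical page footer containing leftover broken button code.
page = (
    "<p>Main content</p>\n"
    "<hr>\n"
    "\n"
    "  broken button code\n"
    "<address>\n"
    "Neil Carter\n"
    "</address>\n"
)

matched_text = FOOTER_RE.search(page).group(0)
print(repr(matched_text))
```

The match runs from the <hr> line up to and including the opening <address> tag, taking the blank line and the broken code with it, which is the span the replace expression then overwrites.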

The Replace Expression

^"<hr>"$^$^"<center><a href="'"'"http://psy.swansea.ac.uk/"'"'"
onMouseOver="'"'"document.images.UWSPsyHome.src='http://psy.swansea.ac.uk/template/back_F2.gif'"'"'$
^"onMouseOut="'"'"document.images.UWSPsyHome.src='http://psy.swansea.ac.uk/template/back.gif'"'">'$
^"<img src="'"'"http://psy.swansea.ac.uk/template/back.gif"'"'"
alt="'"'UWS Psychology Homepage'"'" name="'"'UWSPsyHome'"'" border="'"'0'"'">
</a></center>"$^$^"<address>"
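The same replacement can be sketched in Python. This is an approximation of the intended output, reconstructed from the replace expression above; the names FOOTER_RE, BUTTON and add_back_button are mine, and the search pattern is the assumed Python stand-in rather than the MSUB original.

```python
import re

# Assumed Python stand-in for the MSUB search expression.
FOOTER_RE = re.compile(
    r"^[ \t]*<hr>[ \t]*\n(?:.*\n)*?^[ \t]*<address>",
    re.MULTILINE,
)

BASE = "http://psy.swansea.ac.uk/"

# The footer to write: rule, blank line, button code, blank line, <address>.
BUTTON = (
    "<hr>\n"
    "\n"
    f'<center><a href="{BASE}"\n'
    f'onMouseOver="document.images.UWSPsyHome.src=\'{BASE}template/back_F2.gif\'"\n'
    f'onMouseOut="document.images.UWSPsyHome.src=\'{BASE}template/back.gif\'">\n'
    f'<img src="{BASE}template/back.gif"\n'
    'alt="UWS Psychology Homepage" name="UWSPsyHome" border="0">\n'
    "</a></center>\n"
    "\n"
    "<address>"
)


def add_back_button(html: str) -> str:
    """Replace everything from <hr> to <address> with the button footer."""
    # A function replacement avoids backslash processing of BUTTON.
    return FOOTER_RE.sub(lambda m: BUTTON, html, count=1)


page = "<p>x</p>\n<hr>\n\n<address>\nNeil\n</address>\n"
once = add_back_button(page)
twice = add_back_button(once)
print(once)
```

Because the search span includes whatever sits between the two tags, running the function on an already-processed page simply rewrites the same footer, which is what makes it usable for repairing pages with broken button code.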

Copyright © Neil Carter

Content last updated: 2000-06-30