When I wrote the first of my web pages, I didn't include the back-button that links back to the Psy web site's homepage. I wanted to add the button to all my pages, without having to go through each one individually, and manually cut and paste the button code.
Also, after a first attempt at doing this, I got the replace string wrong, so I had some files with broken button-code. This meant that I couldn't restrict my search to the <HR> tag followed by empty space (perhaps whitespace and/or empty lines), and terminated by the <ADDRESS> tag. Now, I had to allow for non-blank lines between the two tags, to include the broken code.
Having to allow for the non-blank lines does widen the margin for error. There is a greater opportunity for false positives. In other words, the regular expression (RE) may match more than I want. Taken in its widest sense, the RE will match an <HR> tag followed by anything, followed by an <ADDRESS> tag. Now, I can't imagine many circumstances where I would have that situation, other than the intended one. However, it is a limitation in the general case.
^:w*"<hr>":w*$(^.*$)*^:w*"<address>"
This MSUB script looks for a <HR> tag, followed by blank or non-blank lines (anything, in other words), followed by the <ADDRESS> tag. This defines the footer of my pages. I end the main content of my web pages with a horizontal rule, and then I put my authorship details between <ADDRESS> tags. When I wrote this pattern, I was using only lower-case for my tags, so this is what I specified. I have since changed to upper-case, and this is reflected in this explanation.
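For readers who don't have MSUB to hand, here is a rough analogue of the same search using Python's re module. It is only a sketch: it assumes that MSUB's :w corresponds to a run of spaces and tabs, and it adds case-insensitive matching (not part of the original pattern) so that both <hr> and <HR> are found. The name FOOTER and the sample page are invented for illustration.

```python
import re

# Approximate Python translation of: ^:w*"<hr>":w*$(^.*$)*^:w*"<address>"
FOOTER = re.compile(
    r"^[ \t]*<hr>[ \t]*$\n"   # a line holding only the <hr> tag
    r"(?:^.*$\n)*"            # any number of lines, blank or not
    r"^[ \t]*<address>",      # the (possibly indented) <address> tag
    re.MULTILINE | re.IGNORECASE,
)

# Hypothetical page with broken button code left in the footer.
page = (
    "<p>Body</p>\n"
    "<HR>\n"
    "\n"
    "<center>broken button</center>\n"
    "<ADDRESS>Neil</ADDRESS>\n"
)

m = FOOTER.search(page)
print(repr(m.group(0)))
```

The match runs from the <HR> line up to and including the opening <ADDRESS> tag, taking the broken lines in between with it.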
To make things easier, I have chunked the RE into search lines.
^:w*"<hr>":w*$
SOL (start of line), whitespace (0 or more), literal string for the <HR> tag, whitespace (0 or more), EOL (end of line)
Whitespace is of indefinite length, but terminated by EOF. So, you can't have more than one whitespace in a row, because the first whitespace would already include all the other presumed contiguous whitespaces. However, I have to allow for no whitespace at all (if the <HR> tag was at the very beginning of the line), so I need the closure operator (*), which means 0 or more occurrences. I could have used the optional operator (?), meaning 0 or 1 occurrence, given that whitespace is of indefinite length.
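The equivalence between the closure and optional operators can be illustrated in Python, with one caveat: Python has no single token for "a run of whitespace", so the sketch below assumes [ \t]+ plays the role of :w. Under that assumption, "zero or more runs" and "an optional run" accept exactly the same lines. The names star and optional_run are made up for this example.

```python
import re

# "zero or more" written directly as a character-class closure
star = re.compile(r"^[ \t]*<hr>[ \t]*$", re.IGNORECASE)

# "zero or one run of whitespace", the analogue of using ? on :w
optional_run = re.compile(r"^(?:[ \t]+)?<hr>(?:[ \t]+)?$", re.IGNORECASE)

samples = ["<HR>", "   <HR>", "\t<HR>  ", "text <HR>"]

# Each pair is (matched by star, matched by optional_run).
results = [(bool(star.match(s)), bool(optional_run.match(s))) for s in samples]

for s, r in zip(samples, results):
    print(repr(s), r)
```

Both forms accept the tag with or without indentation, and both reject the line that has other text before the tag.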
(^.*$)*
This matches any line, blank or not. The .* means 0 or more appearances of any character other than SOL and EOL. Putting the SOL, any-char, EOL pattern in brackets allows me to specify that the blank/non-blank line may appear multiple times, or not at all, using the * qualifier outside the brackets.
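Because this group accepts any line at all, it is also where the risk of false positives comes from. The sketch below uses a Python re approximation of the full pattern (assuming :w behaves like a run of spaces and tabs, and with case-insensitivity added) on a made-up page that has an extra horizontal rule in the body. It also checks the "not at all" case, where the two tags sit on adjacent lines.

```python
import re

FOOTER = re.compile(
    r"^[ \t]*<hr>[ \t]*$\n"   # line holding only the <hr> tag
    r"(?:^.*$\n)*"            # zero or more lines of anything
    r"^[ \t]*<address>",      # the <address> tag
    re.MULTILINE | re.IGNORECASE,
)

# Hypothetical page with a second <hr> in the main content.
page = (
    "<h1>Title</h1>\n"
    "<hr>\n"
    "<p>Main content that must be kept.</p>\n"
    "<hr>\n"
    "\n"
    "<address>Author details</address>\n"
)

matched = FOOTER.search(page).group(0)
print(repr(matched))

# Zero intervening lines is also accepted.
adjacent = FOOTER.search("<hr>\n<address>x</address>\n")
```

Here the match starts at the first <hr>, so a replacement would swallow the paragraph of main content. On pages with only one horizontal rule on a line of its own, this does not arise.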
^:w*"<address>"
This matches the line containing the <ADDRESS> tag. It starts with the SOL, then allows for any indentation (some or no whitespace), followed by the tag. This terminates the search string.
^"<hr>"$^$^"<center><a href="'"'"http://psy.swansea.ac.uk/"'"'"
onMouseOver="'"'"document.images.UWSPsyHome.src='http://psy.swansea.ac.uk/template/back_F2.gif'"'"'$
^"onMouseOut="'"'"document.images.UWSPsyHome.src='http://psy.swansea.ac.uk/template/back.gif'"'">'$
^"<img src="'"'"http://psy.swansea.ac.uk/template/back.gif"'"'"
alt="'"'UWS Psychology Homepage'"'" name="'"'UWSPsyHome'"'" border="'"'0'"'">
</a></center>"$^$^"<address>"
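As a rough illustration of the whole search-and-replace, the following Python sketch does the same job with re.sub. It is not the MSUB script itself: the button markup is simplified (the onMouseOver and onMouseOut handlers are left out), :w is assumed to mean a run of spaces and tabs, case-insensitive matching is added, and the names FOOTER, BUTTON, REPLACEMENT and add_button are invented for the example.

```python
import re

FOOTER = re.compile(
    r"^[ \t]*<hr>[ \t]*$\n(?:^.*$\n)*^[ \t]*<address>",
    re.MULTILINE | re.IGNORECASE,
)

# Simplified button: the rollover handlers from the real replace string are omitted.
BUTTON = (
    '<center><a href="http://psy.swansea.ac.uk/">'
    '<img src="http://psy.swansea.ac.uk/template/back.gif" '
    'alt="UWS Psychology Homepage" name="UWSPsyHome" border="0">'
    "</a></center>"
)

# Rule, blank line, button, blank line, then the address tag.
REPLACEMENT = "<hr>\n\n" + BUTTON + "\n\n<address>"


def add_button(page: str) -> str:
    """Replace everything from the <hr> line to <address> with a clean footer."""
    # A function is used as the replacement so that no characters in
    # REPLACEMENT are treated as escape sequences or group references.
    return FOOTER.sub(lambda _m: REPLACEMENT, page, count=1)


# Hypothetical page left over from a botched first attempt.
page = "<p>Body</p>\n<HR>\n<center>broken\n<ADDRESS>Neil</ADDRESS>\n"
fixed = add_button(page)
print(fixed)
```

Because the search allows any lines between the two tags, running add_button a second time on an already-fixed page simply rewrites the same footer, which is what makes it safe to rerun after a mistake.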
Copyright © Neil Carter
Content last updated: 2000-06-30