Using Vim to Replace Characters with HTML Entities between Specified Lines

This is a simple demonstration of Vim's search and replace feature. It converts individual characters to short strings of characters, but only on lines that fall between two lines containing specific text strings. Such a situation occurs when including program source code in an HTML page.

C source code often contains the ampersand character, which is used to take the address of a variable. Since HTML uses this character to start an entity, a bare ampersand in the HTML source may not appear as expected in the resulting web page. Other symbols that appear frequently in program source code (e.g. less–than, ‘<’) would also interfere with a web page's HTML source. To have such characters display properly in a browser, it is necessary to use an entity. The ampersand entity is &amp; so this is what you would type into the HTML source.

Thus, it is useful to have a convenient process for dealing with HTML–offending characters. For this example, let's say you've inserted some C source code into an HTML page. The next thing you would do is surround the code with pre and code tags. Let's say the code tags are on lines of their own, not on the same lines as the source code (it'll make the search expression simpler).

The command (this is a ‘colon’ command) is described part–by–part below. Note that Vim separates the parts of a substitute command with the forward slash character ‘/’.

  1. First, we specify the range of lines to operate on, i.e. from the line after the one that contains the opening code tag (the text ‘<code>’) to the line before the one that contains the closing tag (the text ‘</code>’). Each end of the range is specified with a search pattern that matches text on the tag's line (and hopefully nowhere else), followed by an offset: plus one for the line after the opening tag, and minus one for the line before the closing tag. Note that the slash in the HTML closing tag (‘/’) has to be escaped with a backslash, otherwise Vim will think it marks the end of this search pattern.
  2. Next, we specify the search pattern, which is simply the ampersand character alone (the text ‘&’).
  3. Next, we specify the replace text, which is the entity (the text ‘&amp;’). An important issue here is that the ampersand character should be escaped with the backslash character ‘\’. This is because Vim uses the un–escaped ampersand character in the replace text to refer to the ‘found’ text (so you can include the search–for text in the replacement text if necessary).
  4. Finally, we specify some command modifiers. ‘g’ (for Global) tells Vim to replace all occurrences of the ampersand on a line (by default Vim only changes the first occurrence per line). ‘c’ (for Confirm) tells Vim to ask you to confirm each replacement (not necessary, but safer).

Putting these parts together gives the complete command:

   :/<code>/+1,/<\/code>/-1s/&/\&amp;/gc
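
Before changing anything, it can help to check which lines the range actually selects. The following is a minimal sketch that uses the same range with the ‘p’ (print) command in place of the substitution. It assumes the cursor is somewhere above the line containing ‘<code>’, since the searches in a range start from the cursor line. The second form uses a semicolon instead of the comma, which makes the search for the closing tag start from the first line of the range rather than from the cursor; this may be safer when a page contains several code blocks.

```vim
" Print the lines that the range selects, without changing them.
:/<code>/+1,/<\/code>/-1p

" Same check, but with ';' the search for the closing tag starts
" from the first line of the range instead of from the cursor.
:/<code>/+1;/<\/code>/-1p
```

If the printed lines are exactly the source code between the tags, the same range can be used with the substitute command.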

Here is a similar command to convert less–than symbols (‘<’) to &lt;

   :/<code>/+1,/<\/code>/-1s/</\&lt;/gc

To reinforce the lesson, here is another command, which converts greater–than symbols (‘>’) to &gt;

 
   :/<code>/+1,/<\/code>/-1s/>/\&gt;/gc

Note that you should run the ampersand conversion before the other entity conversions. If you run it afterwards, it will also replace the ampersands that were inserted when the other characters (e.g. ‘<’) were converted to entities, so that &lt; becomes &amp;lt; and the browser shows the text ‘&lt;’ instead of ‘<’.
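
The ordering problem can also be avoided by converting all three characters in a single pass, since each character is then examined only once. The following is a sketch rather than part of the procedure above. It assumes Vim 7 or later, where dictionaries are available and a replacement beginning with ‘\=’ is evaluated as an expression. The function submatch(0) returns the matched character, which is looked up in a small table of entities.

```vim
" Replace &, < and > in one pass between the code tags.
" Inside a \= expression, & has no special meaning, so it is not escaped.
:/<code>/+1,/<\/code>/-1s/[&<>]/\={'&': '&amp;', '<': '&lt;', '>': '&gt;'}[submatch(0)]/g
```

With this command, a line such as ‘if (a < b && p == &x)’ becomes ‘if (a &lt; b &amp;&amp; p == &amp;x)’. Add the ‘c’ flag after ‘g’ if you want to confirm each replacement.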


Copyright © Neil Carter

Content last updated: 2003-11-25