Using the :substitute Command to Join Lines

Vim can word-wrap lines either physically (end-of-line (EOL) characters are inserted into the file) or only visually (the line is wrapped on the display; the file is unchanged). Physical word-wrapping normally follows rules about how long a line should be and where to wrap (between, rather than within, words). In contrast, visual wrapping can break a line at any character, and the width of Vim's window dictates the length of each displayed line.

In some cases, it is desirable to leave the lines physically unwrapped. The advantage is that the wrapping is then done on the fly, so it adjusts automatically to whatever program and window size you happen to use when reading the file. Also, log files are typically structured so that each record occupies exactly one line.

For this example, let's suppose you have some physically word-wrapped text that you want to convert to dynamically wrapped text. In this case, a natural format is to have one (long) line per paragraph. To enable Vim to dynamically wrap these paragraphs, we need to join the file's current multi-line paragraphs.

Thus, we need to remove EOLs that occur within paragraphs, but not between paragraphs. This will put each paragraph onto a single line. This can be done with the following command:


:%s/\(\S\)\n\(\S\)/\1 \2/c

: invokes command mode.

% means perform the operation on "every line in file".

s is the "substitute" command.

/ is the character used to separate the search and replace expressions (e.g. s/a/b/ means replace "a" with "b").

\ is the character used to indicate that the next character is special in some way (not to be taken literally). So "s" means the letter s, but "\s" means whitespace (space and tab characters).

\n means an EOL. Note: on Unix, which is where vi/Vim originated, \n is the single Newline character. On Microsoft systems, however, Vim is clever enough to properly handle the fact that EOLs consist of two consecutive characters: Carriage Return (\r) and Newline (\n).

Letters after the last / are command options. g means change all occurrences on that line (as opposed to only the first one). c means confirm every change.

Try


:%s/\n//c

But this will put the entire file onto a single line!

Undo this with


u

Note the lack of a colon; this command is entered in Normal mode.
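
To see why the blank lines between paragraphs vanish too, the same idea can be tried outside Vim. The following is only a rough Python sketch, not a Vim command: it assumes Python's re module as a stand-in for :substitute, and the sample text and variable names are invented for illustration.

```python
import re

# Hypothetical sample: a two-line paragraph, a blank line, then a second paragraph.
text = "alpha beta\ngamma delta\n\nepsilon\n"

# Rough analogue of :%s/\n// (every EOL is deleted, including paragraph breaks).
flattened = re.sub(r"\n", "", text)

print(repr(flattened))
```

Everything ends up on one line, and the words on either side of each removed EOL run together because nothing is put in its place.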

As discussed, we should remove EOLs only if they are within a paragraph. This condition can be detected by searching for an EOL with a non-blank character before and after it (i.e. at the end of this line and the start of the next).

Try the following


:%s/\S\n\S//c

\S means any character that is not whitespace (i.e. not a space or tab). It does not match an EOL either, so a blank line cannot satisfy it.

This is better, but it deletes the characters at the end and start of the joined lines. This is because the command, as it stands, replaces the characters with nothing. Instead, we need to put the characters back (i.e. replace them with themselves), whilst still removing the EOL.
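
The loss of characters can be reproduced in a small Python sketch. This is an approximation under stated assumptions: Python's re module stands in for Vim's :substitute, and the sample text and variable names are hypothetical.

```python
import re

# Hypothetical sample: a two-line paragraph, a blank line, then a second paragraph.
text = "alpha beta\ngamma delta\n\nepsilon\n"

# Rough analogue of :%s/\S\n\S// (the match covers the EOL AND its two neighbours,
# so all three characters are replaced with nothing).
damaged = re.sub(r"\S\n\S", "", text)

print(repr(damaged))
```

The paragraph break survives, because a blank line has no non-whitespace character next to its EOL, but "beta" and "gamma" lose a letter each and merge into "betamma".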

Vim remembers what's being replaced if it is enclosed within \( and \). Note: the backslash tells Vim that the bracket is a marker (i.e. it should not be matched against a literal bracket character).

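We need to remember the start and end characters separately. In the replacement expression, \1 and \2 refer to the items matched in the first and second marker-pairs. The space between them takes the place of the removed EOL, so that the joined words do not run together.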

Thus, we end up with the following command


:%s/\(\S\)\n\(\S\)/\1 \2/c
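
The effect of the two marker-pairs can be checked with a short Python sketch. It is only an approximation of the Vim command: Python's re module is assumed as a stand-in, its groups are written ( ) rather than \( \), and the sample text and variable names are invented for illustration.

```python
import re

# Hypothetical sample: a two-line paragraph, a blank line, then a second paragraph.
text = "alpha beta\ngamma delta\n\nepsilon\n"

# Rough analogue of :%s/\(\S\)\n\(\S\)/\1 \2/ (the two neighbouring characters
# are captured and put back, with a space where the EOL used to be).
joined = re.sub(r"(\S)\n(\S)", r"\1 \2", text)

print(repr(joined))
```

The first paragraph becomes one line, no characters are lost, and the blank line separating the paragraphs is left alone.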

This demonstrates the :substitute command, but there are other means of joining lines within a paragraph; see: http://superuser.com/questions/200423/join-lines-inside-paragraphs-in-vim



Copyright © Neil Carter

Content last updated: 2011-06-20