gnuplot Plotting Time

One often encounters plots that have time measures on both the x and y axes. Unfortunately, gnuplot's support for time data does not work for both the x and y values at the same time. This web page discusses some solutions to this problem. Hopefully, gnuplot will, one day, support this feature.

First, let's try the obvious approach to see the problem. Consider the contents of the data file iso8601_hhmm.tsv:

2011-10-20	11:16
2011-10-21	11:11
2011-10-24	12:05
2011-10-26	14:42
2011-10-27	11:55
2011-10-28	10:18
2011-11-02	09:32
2011-11-03	11:07
2011-11-04	11:11
2011-11-07	10:57
2011-11-08	09:12
2011-11-09	09:05

Clearly, both the x (first column) and y (second column) values are time values: dates for x, and minutes:seconds (or hours:minutes) for y. For this demo, we'll ignore whether the time is clock time or elapsed time (where the leftmost field can have values greater than 12 or 24 hours, and 60 minutes). We tell gnuplot what the time format is using timefmt, which is the source of the problem: there is only a single value for timefmt, preventing us from handling different time formats in the x and y values. For instance, we can set timefmt "%Y-%m-%d" so that it treats values as dates (in ISO8601 format), or we can set timefmt "%H:%M" for time (%H = 0...24 hours, %M = 0...60 minutes, %S = 0...60 seconds), but not both. Note that lowercase %m is month, whereas uppercase %M is minutes.

Note that set timefmt is not the same as set format x "sometimeformat"; the latter command refers to the x axis labels.

For time-based data, gnuplot needs a full using specification for the plot command.


reset
clear
set key off
set title "HH:MM versus Date (xdata is time)"

# Setting xdata to time precludes the use of histograms.
set xdata time
set timefmt "%Y-%m-%d"

# set format x controls the way that gnuplot displays the dates on the x axis.
# "%Y-%m-%d" is the same as "%F", but "%F" applies to output only; it won't work
# for timefmt, which controls data reading.
set format x "%Y-%m-%d"

# out draws the tic marks on the outside of the border; otherwise they'd be
# obscured by the boxes.
set xtics rotate by 90 offset 0,-4 out nomirror

# Impulses are rather thin by default, so thicken them up with linewidth.
plot "iso8601_hhmm.tsv" using 1:2 with impulses linewidth 10


Figure 1: HH:MM versus Date (xdata is time)

As you can see, the minutes have been ignored so that the impulses stop at integer values of y. This is because gnuplot doesn't recognise "9:15" as nine minutes and fifteen seconds (or quarter-past-nine). More accurately, gnuplot could be told to recognise the second column as times, but only at the expense of no longer indicating that the first column is a date.

We now know what happens to hh:mm values without a timefmt specification. This time, let's see what happens to the dates if we use timefmt for the hh:mm values:


reset
clear
set key off
set title "HH:MM versus Date (ydata is time)"

set ydata time
set timefmt "%M:%S"
set format y "%M:%S"

# We need a smaller y offset since only YYYY is shown.
set xtics rotate by 90 offset 0,-2 out nomirror

# Using time format characters for a non-time axis results in a
# "Bad format character" error, so leave the default in place for x format.
# set format x "%Y-%m-%d"

plot "iso8601_hhmm.tsv" using 1:2 with impulses linewidth 10
   

Although it can now separate hours and minutes in the y values, gnuplot now sees the x value as 2011 for every row; it won't try evaluating the entire field as an expression (e.g. mathematically, 2011-10-20 = 1981). It displays a warning indicating that the x range is empty. Because all values end up at the same point on the x axis, only the largest value in the file is visible in our plot. So we really do need to identify the x values as dates with timefmt.


Figure 2: HH:MM versus Date (ydata is time)

The most convenient solution is to use timefmt for the first column and interpret the time values manually. We can do this by converting the colon to whitespace (a tab character, say), then multiplying the number in front of the colon by 60 and adding the number after the colon. This works whether the times are hours:minutes or minutes:seconds. For hours:minutes:seconds, you'll need to multiply hours by 3600 and minutes by 60. Naturally, you'll need to adjust this if you're using elapsed time, where the leftmost field might be larger than 24 (or 60).
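The arithmetic described above can be checked with a short sketch. It is written in Python, which this page does not otherwise use; the function name to_units is hypothetical, and the sketch assumes well-formed input with two or three colon-separated integer fields.

```python
def to_units(clock):
    """Convert "a:b" to a*60 + b, or "h:m:s" to h*3600 + m*60 + s."""
    parts = [int(p) for p in clock.strip().split(":")]
    if len(parts) == 2:
        # hours:minutes gives minutes; minutes:seconds gives seconds.
        major, minor = parts
        return major * 60 + minor
    if len(parts) == 3:
        hours, minutes, seconds = parts
        return hours * 3600 + minutes * 60 + seconds
    raise ValueError("expected a:b or h:m:s, got %r" % clock)


print(to_units("11:16"))    # first row of iso8601_hhmm.tsv: prints 676
print(to_units("1:02:03"))  # prints 3723
```

The two-field case is the same calculation that a gnuplot using expression of the form ($2*60+$3) performs once the colon has been replaced by a tab.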

gnuplot can read the output of a command, which allows us to write a small program to replace the colons with a tab. Then, we can call this program along with the data file, in the plot command. Unfortunately, neither pgnuplot.exe nor wgnuplot.exe can read the output of commands (you'll get a warning about skipping an unreadable file). So, for this technique, we need to use gnuplot.exe (a text-mode edition of gnuplot) instead. The biggest disadvantage of the text-mode interface is that it is harder to reenter commands using the history.

The VBScript program I wrote to replace the colons with tabs, colon2tab.vbs, is sufficient for this demo, but doesn't do any error checking, so you might want to enhance it for serious use.
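The source of colon2tab.vbs is not reproduced on this page. For readers without VBScript, a filter with the same effect might look like the following Python sketch. The file name colon2tab.py and the function names are hypothetical, and, like the original, it does no error checking beyond what Python itself reports.

```python
import sys


def colons_to_tabs(line):
    """Replace every colon in one line of text with a tab character."""
    return line.replace(":", "\t")


def convert_file(path, out=sys.stdout):
    """Write the contents of the file at path to out, colons replaced by tabs."""
    with open(path) as handle:
        for line in handle:
            out.write(colons_to_tabs(line))


if __name__ == "__main__":
    # Convert each file named on the command line; with no arguments, do nothing.
    for path in sys.argv[1:]:
        convert_file(path)
```

Assuming a python executable is on the search path, the plot command could then read "<python colon2tab.py iso8601_hhmm.tsv" in place of the cscript call. The dates in the first column are unaffected because they are separated by hyphens, not colons.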


reset
clear
set key off
set title "HH:MM versus Date (ydata is integer minutes)"

set timefmt "%Y-%m-%d"
set xdata time
set format x "%Y-%m-%d"

set xtics rotate by 90 offset 0,-4 out nomirror

# We'd like to use "%H:%M" for Y values, but we can't because y isn't
# time-based.
# set format y "%H:%M"

plot "<cscript //nologo colon2tab.vbs iso8601_hhmm.tsv" using 1:($2*60+$3) with impulses linewidth 10
   

Figure 3: HH:MM versus Date (ydata is integer minutes)

This gives us a plot with simple minutes on the y axis. If we really want to break this down into hours and minutes, we can do this by adjusting the using part. Leave the hours alone and convert the minutes to a decimal by dividing by 60. This means we'll have decimal time (1:15 comes out as 1.25 hours), but as long as we're not displaying labels for sub-values (minor tics in gnuplot parlance) on the y axis, this won't make any difference. Try this (you'll need to set ytics to account for the fact that the y values will be much smaller):


reset
clear
set key off
set title "HH:MM versus Date (ydata is decimal hours)"

set timefmt "%Y-%m-%d"
set xdata time
set format x "%Y-%m-%d"

set xtics rotate by 90 offset 0,-4 out nomirror

plot "<cscript //nologo colon2tab.vbs iso8601_hhmm.tsv" using 1:($2+$3/60) with impulses linewidth 10
   

Figure 4: HH:MM versus Date (ydata is decimal hours)
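The decimal-hours conversion used for this last plot can be checked outside gnuplot with a small Python sketch. Both function names are hypothetical; to_clock is an extra illustration, not part of the method above, showing how a decimal value maps back to an hh:mm label, rounded to the nearest minute.

```python
def to_decimal_hours(clock):
    """Convert "hh:mm" to hours as a float, e.g. "1:15" -> 1.25."""
    hours, minutes = (int(p) for p in clock.strip().split(":"))
    return hours + minutes / 60.0


def to_clock(decimal_hours):
    """Convert decimal hours back to an "hh:mm" label, rounded to the minute."""
    total_minutes = int(round(decimal_hours * 60))
    return "%02d:%02d" % divmod(total_minutes, 60)


print(to_decimal_hours("1:15"))  # prints 1.25
print(to_clock(9.5))             # prints 09:30
```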



Copyright © Neil Carter

Content last updated: 2012-02-19