One often encounters plots that have time measures on both the x and y axes. Unfortunately, gnuplot's support for time data does not work for both the x and y values at the same time. This web page discusses some solutions to this problem. Hopefully, gnuplot will, one day, support this feature.
First, let's look at the original, histogram-style approach to see the problem. Consider the contents of the data file iso8601_hhmm.tsv:
2011-10-20 11:16
2011-10-21 11:11
2011-10-24 12:05
2011-10-26 14:42
2011-10-27 11:55
2011-10-28 10:18
2011-11-02 09:32
2011-11-03 11:07
2011-11-04 11:11
2011-11-07 10:57
2011-11-08 09:12
2011-11-09 09:05
Clearly, both the x (first column) and y (second column) values are time values: dates for x, and minutes:seconds (or hours:minutes) for y. For this demo, we'll ignore whether the time is clock time or elapsed time (where the leftmost field can have values greater than 12 or 24 hours, or 60 minutes). We tell gnuplot what the time format is using timefmt, which is the source of the problem: there is only a single value for timefmt, preventing us from handling different time formats in the x and y values. For instance, we can set timefmt "%Y-%m-%d" so that it treats values as dates (in ISO 8601 format), or we can set timefmt "%H:%M" for time (%H = 0...24 hours, %M = 0...60 minutes, %S = 0...60 seconds), but not both. Lowercase m is month, uppercase M is minutes.
set timefmt is not the same as set format x "sometimeformat"; the former controls how data is read, while the latter refers only to the x axis labels.
For time-based data, gnuplot needs a full using specification in the plot command.
reset
clear
set key off
set title "HH:MM versus Date (xdata is time)"
# Setting xdata to time precludes the use of histograms.
set xdata time
set timefmt "%Y-%m-%d"
# set format x controls the way that gnuplot displays the dates on the x axis.
# "%Y-%m-%d" is the same as "%F", but "%F" applies to output only; it won't work
# for timefmt, which controls data reading.
set format x "%Y-%m-%d"
# out draws the tic marks on the outside of the border; otherwise they'd be
# obscured by the boxes.
set xtics rotate by 90 offset 0,-4 out nomirror
# Impulses are rather thin by default, so thicken them up with linewidth.
plot "iso8601_hhmm.tsv" using 1:2 with impulses linewidth 10
As you can see, the minutes have been ignored so that the impulses stop at integer values of y. This is because gnuplot doesn't recognise "9:15" as nine minutes and fifteen seconds (or quarter-past-nine). More accurately, gnuplot could be told to recognise the second column as times, but only at the expense of no longer indicating that the first column is a date.
We now know what happens to hh:mm values without a timefmt specification. This time, let's see what happens to the dates if we use timefmt for the hh:mm values:
reset
clear
set key off
set title "HH:MM versus Date (ydata is time)"
set ydata time
set timefmt "%M:%S"
set format y "%M:%S"
# We need a smaller y offset since only YYYY is shown.
set xtics rotate by 90 offset 0,-2 out nomirror
# Using time format characters for a non-time axis results in a
# "Bad format character" error, so leave the default in place for x format.
# set format x "%Y-%m-%d"
plot "iso8601_hhmm.tsv" using 1:2 with impulses linewidth 10
Although it can now separate hours and minutes in the y values, gnuplot now sees the x values as being 2011 for all rows; it won't try evaluating the entire column as an expression (e.g. mathematically, 2011-10-20 = 1981). It displays a warning indicating that the x range is empty. Thus, only the largest value in the file is visible in our plot, since all values end up at the same point on the x axis. We really do need to identify the x values as dates with timefmt.
The most convenient solution to this is to use timefmt for the first column, and interpret the time values manually. We can do this by converting the colon to whitespace (presumably a tab character), multiplying the number in front of the colon by 60, and adding it to the number after the colon. This works whether the times are hours:minutes or minutes:seconds. For hours:minutes:seconds, you'll need to multiply hours by 3600 and minutes by 60. Naturally, you'll need to adjust this if you're using elapsed time, where the leftmost field might be larger than 24.
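The arithmetic described above can be sketched in a few lines of Python. This is only an illustration, not part of the original page: the function name colon_time_to_number is made up, and the sketch assumes well-formed input with two or three colon-separated integer fields.

```python
def colon_time_to_number(text):
    """Collapse "a:b" or "a:b:c" into one number in the smallest unit.

    "hh:mm"    -> minutes  (hours * 60 + minutes)
    "mm:ss"    -> seconds  (minutes * 60 + seconds)
    "hh:mm:ss" -> seconds  (hours * 3600 + minutes * 60 + seconds)
    """
    total = 0
    for field in text.strip().split(":"):
        # Each step shifts the running total up by one base-60 place.
        total = total * 60 + int(field)
    return total


if __name__ == "__main__":
    print(colon_time_to_number("11:16"))    # 11 * 60 + 16
    print(colon_time_to_number("1:02:03"))  # 1 * 3600 + 2 * 60 + 3
```

Because each field is simply multiplied up by 60, the same loop handles hours:minutes and minutes:seconds alike; only the unit of the result differs.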
gnuplot can read the output of a command, which allows us to write a small program to replace the colons with a tab. Then, we can call this program, along with the data file, in the plot command. Unfortunately, neither pgnuplot.exe nor wgnuplot.exe can read the output of commands (you'll get a warning about skipping an unreadable file). So, for this technique, we need to use gnuplot.exe (a text-mode edition of gnuplot) instead. The biggest disadvantage of the text-mode interface is that it is harder to re-enter commands using the history.
The VBScript program I wrote to replace the colons with tabs, colon2tab.vbs, is sufficient for this demo, but doesn't do any error checking, so you might want to enhance it for serious use.
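Since colon2tab.vbs itself is not reproduced here, the following is a minimal Python sketch of what such a filter might look like. It is not the author's script: the file name colon2tab.py and the function names are hypothetical, and like the original it does very little error checking (for instance, it would also replace colons appearing anywhere else in the file).

```python
import sys


def colon_to_tab(line):
    """Return the line with every colon replaced by a tab character."""
    return line.replace(":", "\t")


def convert_lines(lines):
    """Yield each input line with its colons converted to tabs."""
    for line in lines:
        yield colon_to_tab(line)


# Only act as a filter when a data file name is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "r") as handle:
        for converted in convert_lines(handle):
            sys.stdout.write(converted)
```

Assuming a Python interpreter is on the PATH and the sketch is saved as colon2tab.py, the data source in the plot command would become "<python colon2tab.py iso8601_hhmm.tsv" in place of the cscript call.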
reset
clear
set key off
set title "HH:MM versus Date (ydata is integer minutes)"
set timefmt "%Y-%m-%d"
set xdata time
set format x "%Y-%m-%d"
set xtics rotate by 90 offset 0,-4 out nomirror
# We'd like to use "%H:%M" for Y values, but we can't because y isn't
# time-based.
# set format y "%H:%M"
plot "<cscript //nologo colon2tab.vbs iso8601_hhmm.tsv" using 1:($2*60+$3) with impulses linewidth 10
This gives us a plot with simple minutes on the y axis. If we really want to break this down into hours and minutes, we can do this by adjusting the using part. Leave the hours alone and convert the minutes to a decimal fraction of an hour by dividing by 60. This means we'll have decimal time (1:15 comes out as 1.25 hours), but as long as we're not displaying labels for sub-values (minor tics in gnuplot parlance) on the y axis, this won't make any difference. Try this (you'll need to set ytics to account for the fact that the y values will be much smaller):
reset
clear
set key off
set title "HH:MM versus Date (ydata is decimal hours)"
set timefmt "%Y-%m-%d"
set xdata time
set format x "%Y-%m-%d"
set xtics rotate by 90 offset 0,-4 out nomirror
plot "<cscript //nologo colon2tab.vbs iso8601_hhmm.tsv" using 1:($2+$3/60) with impulses linewidth 10
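To make the decimal-hours conversion concrete, here is a small Python sketch of the same arithmetic as the using expression, plus the reverse step in case you want to turn a decimal value back into an hh:mm string (for hand-written tic labels, say). Both function names are invented for this illustration, and the sketch assumes non-negative values.

```python
def to_decimal_hours(hours, minutes):
    """Mirror of the using expression: hours + minutes / 60."""
    return hours + minutes / 60.0


def to_hhmm_label(decimal_hours):
    """Turn decimal hours back into an "h:mm" string."""
    # Round to whole minutes first so that floating-point noise
    # (e.g. 11.266666...) does not lose a minute.
    total_minutes = int(round(decimal_hours * 60))
    return "%d:%02d" % divmod(total_minutes, 60)


if __name__ == "__main__":
    print(to_decimal_hours(1, 15))  # quarter past one as decimal hours
    print(to_hhmm_label(1.25))      # and back to an h:mm label
```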
Copyright © Neil Carter
Content last updated: 2012-02-19