A VBScript to Group Hits in a Google Analytics Report

The Top Content report produced by Google Analytics lists the hits received by each web page (a page may be a file or a folder). Often, several pages belong to the same group, and one might want the total number of hits for each group rather than for each page. If the pages are stored in directories that correspond to their groups, then all is well. When a group's pages are spread across several directories, however, the following script can help.

For example, one can count the hits for each of a university's colleges by gathering the relevant directories into the corresponding group. Swansea University's College of the Environment and Society has pages in various directories alongside its own main directory, and the same applies to the university's other colleges. The following shows some lines from the Analytics Top Content report.

# ----------------------------------------
# Table
# ----------------------------------------
Page	Pageviews	Unique Pageviews	Avg. Time on Page	Bounce Rate	% Exit	$ Index
/humanandhealthsciences/	4870	3572	25.975413402959095	0.19148936170212766	0.05626283367556468	0.0
/environment_society/	2451	1925	40.55918827508455	0.5826558265582655	0.2762137902896777	0.0
/geography/	807	599	62.330128205128204	0.30116959064327486	0.22676579925650558	0.0
/biosci/	637	458	42.221374045801525	0.22395833333333334	0.17739403453689168	0.0
# --------------------------------------------------------------------------------
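As an illustration of how one line of such a report might be read, here is a minimal sketch. It assumes the columns are separated by tabs, with the page in the first column and Pageviews in the second, as in the lines above. The function name ParseReportLine is hypothetical and is not taken from the downloadable script.

```vbscript
Option Explicit

' Hypothetical helper: split one tab-separated report line and return
' the page path and its Pageviews count through the ByRef parameters.
' Returns False for lines that are not data rows, such as the header
' row, whose second column is not numeric.
Function ParseReportLine(line, ByRef page, ByRef pageviews)
    Dim fields
    ParseReportLine = False
    fields = Split(line, vbTab)
    If UBound(fields) < 1 Then Exit Function
    If Not IsNumeric(fields(1)) Then Exit Function
    page = fields(0)
    pageviews = CLng(fields(1))
    ParseReportLine = True
End Function

' Usage with one of the report lines shown above (first three columns only).
Dim page, views
If ParseReportLine("/geography/" & vbTab & "807" & vbTab & "599", page, views) Then
    WScript.Echo page & " received " & views & " pageviews"
End If
```

Returning False for non-numeric rows means the header line can be passed through the same function and simply skipped.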

The script reads the report and adds the value under the Pageviews column to the running total for the corresponding group. Groups and their component directories are listed in a second file. The following is a snippet from such a groups file, illustrating the entries for two of the colleges. The order of the lines is unimportant, the columns are separated by a tab, and the first column must match the page in the report exactly:

/humanandhealthsciences/	Human and Health Sciences
/cscr/	Human and Health Sciences
/biosci/	Environment and Society
/geography/	Environment and Society
/environment_society/	Environment and Society
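A groups file in this format could be loaded into a lookup table along the following lines. This is only a sketch under stated assumptions, not the code of the downloadable script: the function name LoadGroups is hypothetical, and a Scripting.Dictionary is assumed as the lookup structure because its default key comparison is case-sensitive, which suits the exact-match requirement.

```vbscript
Option Explicit

' Hypothetical helper: read a tab-separated groups file and return a
' Scripting.Dictionary that maps each directory to its group name.
Function LoadGroups(path)
    Const ForReading = 1
    Dim fso, file, fields, map
    Set map = CreateObject("Scripting.Dictionary")
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set file = fso.OpenTextFile(path, ForReading)
    Do Until file.AtEndOfStream
        fields = Split(file.ReadLine, vbTab)
        ' Skip blank or malformed lines that lack a second column.
        If UBound(fields) >= 1 Then
            map(Trim(fields(0))) = Trim(fields(1))
        End If
    Loop
    file.Close
    Set LoadGroups = map
End Function

' Usage: the groups file is assumed to be the second command-line argument.
Dim groupMap
If WScript.Arguments.Count >= 2 Then
    Set groupMap = LoadGroups(WScript.Arguments(1))
    WScript.Echo groupMap.Count & " directories loaded"
End If
```

Because the dictionary is keyed on the directory, a directory that appears twice in the groups file keeps the group from its last occurrence.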

The script is run with two command-line arguments: the name of the report file and the name of the groups file. The following is an example of the program's output. It lists the matched directories first (excluding any directories that aren't included in the groups file), and then lists the hits per group:

>cscript //nologo groupanalytics.vbs analytics.tsv groups.tsv

/humanandhealthsciences/        004870
/environment_society/   002451
/geography/     000807
/biosci/        000637

Human and Health Sciences       004870
Environment and Society 003895
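The accumulation and the zero-padded formatting seen in this output can be sketched as follows. The sample values are those from the report and groups file above, held in memory rather than read from files. The helper name PadCount is hypothetical, and the six-digit padding is an assumption based on the look of the output; it would truncate counts of 1,000,000 or more.

```vbscript
Option Explicit

' Hypothetical helper: left-pad a count with zeros to six digits.
' Assumes the count is below 1,000,000.
Function PadCount(n)
    PadCount = Right("000000" & CStr(n), 6)
End Function

Dim groups, hits, totals, page, groupName
Set groups = CreateObject("Scripting.Dictionary")  ' directory -> group name
Set hits = CreateObject("Scripting.Dictionary")    ' directory -> pageviews
Set totals = CreateObject("Scripting.Dictionary")  ' group name -> summed pageviews

' Sample data copied from the groups file and report shown earlier.
groups.Add "/humanandhealthsciences/", "Human and Health Sciences"
groups.Add "/cscr/", "Human and Health Sciences"
groups.Add "/biosci/", "Environment and Society"
groups.Add "/geography/", "Environment and Society"
groups.Add "/environment_society/", "Environment and Society"
hits.Add "/humanandhealthsciences/", 4870
hits.Add "/environment_society/", 2451
hits.Add "/geography/", 807
hits.Add "/biosci/", 637

' List each matched directory and add its pageviews to its group's total.
For Each page In hits.Keys
    If groups.Exists(page) Then
        groupName = groups(page)
        If Not totals.Exists(groupName) Then totals.Add groupName, 0
        totals(groupName) = totals(groupName) + hits(page)
        WScript.Echo page & vbTab & PadCount(hits(page))
    End If
Next

' List the total for each group.
WScript.Echo ""
For Each groupName In totals.Keys
    WScript.Echo groupName & vbTab & PadCount(totals(groupName))
Next
```

Directories that are absent from the groups dictionary fail the Exists check and so contribute to neither list, which mirrors the exclusion described above.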


Download the script

Copyright © Neil Carter

Content last updated: 2011-07-25