Offline Website Analytics
Posted by Sajit N | October 16, 2009
“The Internet is arguably the most measurable medium in history.” - Bill Perry (Audit Bureau of Circulation)
Website traffic is the de facto performance indicator for all websites, so the analytics tools that track traffic stats are pivotal to performance reporting. Knowing the major traffic sources, time-sliced traffic fluctuations, internal traffic flow, top-performing pages, top traffic-drawing keywords, geographic distribution of traffic and conversions helps one gain greater control over a website’s performance metrics. Moreover, a website’s traffic stats help rate online properties for ad-space purchases, domain reselling or embedded marketing.
“Lies, damned lies, and statistics” -Benjamin Disraeli
Website traffic analytics tools can be divided into two types: on-line analytics tools and off-line analytics tools.
On-line analytics tools are meant to track visitor activity at a granular level, giving details about the keywords that led to visits, the navigation path, the visitor's location, time spent and so on. The techniques they use involve either server-resident analysis, such as log file analysis, or 3rd-party solutions that rely on client-side scripts activated through page tagging.
Off-line analytics tools offer, at best, a bird’s eye view of a website’s traffic data. What they provide is broad-brush data on overall traffic volume, along with visitor demography, psychography and bizography, and they are mostly 3rd-party in nature. This blog post specifically discusses this category of website analytics solutions.
There is considerable demand for data about competitor and peer websites, besides data about our own. This demand is met by 3rd-party off-line website analytics services, which use distinct methodologies to extract data and distil takeaways from it. Most services offer only a “satisficed” data set that often falls short of optimal accuracy standards. At the same time, there is an exigent need for website rating indices, a la Nielsen ratings for television.
“Three’s a crowd” -Saying
Choosing a fairly accurate website analytics solution is like picking the best of a bad lot. None of them is accurate enough for one to bet money on. Among all the services, a troika of analytics solutions has emerged that has gained widespread, though a tad begrudging, acceptance: Alexa, Quantcast and Comscore.
Alexa: A backronym for “Address Lookup EXperts Authority”, Alexa is a website traffic tracker that makes use of the Alexa Toolbar for Internet Explorer and an integrated sidebar for Mozilla. The ratings are generated from the browsing habits of users who have installed the toolbar and who, in return, get access to Alexa data. The ratings themselves, as explained by Alexa.com, are “calculated using a combination of average daily visitors and pageviews over the past 3 months. The site with the highest combination of visitors and pageviews is ranked #1.”
Plusses:
• Worldwide distribution
• Sample size of over 10 million users
Minuses:
• Sample size with a major webmaster-community bias
• Ratings are open for manipulation through automated tools
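
Alexa does not spell out how it combines the two figures, so the following is only a rough sketch of the idea in the quote above. It assumes the "combination" is the geometric mean of average daily visitors and average daily pageviews; that formula, the function name rank_sites and the site figures are all made up for illustration.

```python
from math import sqrt

def rank_sites(stats):
    """Rank sites by a combined visitor/pageview score, highest first.

    stats maps a site name to (avg_daily_visitors, avg_daily_pageviews),
    averaged over the trailing three months. The geometric mean used as
    the combination is an assumption, not Alexa's published formula.
    """
    scores = {site: sqrt(v * p) for site, (v, p) in stats.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {site: position for position, site in enumerate(ordered, start=1)}

# Hypothetical figures, for illustration only.
sample = {
    "site-a.example": (900_000, 2_700_000),
    "site-b.example": (1_200_000, 1_500_000),
    "site-c.example": (300_000, 4_000_000),
}
ranks = rank_sites(sample)
for site, position in sorted(ranks.items(), key=lambda item: item[1]):
    print(position, site)
```

Under this assumed formula, site-b.example has the most visitors yet is ranked second, because site-a.example has the larger combined score. A ranking of this kind is only as good as the toolbar sample feeding it.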

TechCrunch reported a systemic blooper in Alexa, pointing out that Alexa had put YouTube ahead of Google in traffic volume.
Quantcast: Quantcast is a website analytics service restricted to measuring US internet behavior. Unlike Alexa, Quantcast requires website owners to insert special HTML code in their pages for a site to be tracked. Users who access Quantcast-tracked websites have cookies set in their browsers, and from these Quantcast estimates extensive data about the users' gender, income level, education level and so on. To this it adds data collected from anonymous usage at major internet destinations, i.e. it maps activity on those sites back to cookie-identified visitors. So, as Quantcast officially states, its system “couples machine learning with massive quantities of directly measured data”. This technique is referred to as the direct-measurement model, a paradigm that is obviously crippled by the disconnect between cookie counts and unique visitor counts. Quantcast claims to statistically normalize all the figures garnered from its direct-measurement model, though the projected traffic figures still fall short of acceptable accuracy standards.
Plusses:
• Fairly distributed sample size compared to Alexa
• Not subject to the Hawthorne effect, i.e. the act of measurement distorting the behavior being measured
Minuses:
• Major disconnect between cookie counts and actual unique visitor counts makes the data inaccurate
• Restricted to USA

A study by RedEye shows how the error percentage in cookie-based tracking of traffic data grows as the measurement period gets longer.
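
Why the error grows with the measurement period can be seen in a toy model. The sketch below is not taken from the RedEye study: it assumes every visitor returns daily and clears cookies independently each day with a fixed probability, and the 1% daily deletion rate is a made-up figure.

```python
def expected_cookie_overcount(days, daily_deletion_rate):
    """Expected number of cookies counted per real visitor over `days`.

    Assumes each visitor returns every day and clears cookies on any given
    day with probability `daily_deletion_rate`, so each deletion after the
    first day shows up as one extra "unique visitor". Toy model only.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    if not 0 <= daily_deletion_rate <= 1:
        raise ValueError("daily_deletion_rate must be between 0 and 1")
    return 1 + daily_deletion_rate * (days - 1)

# Hypothetical 1% daily deletion rate over a day, a week and a month.
for period in (1, 7, 30):
    ratio = expected_cookie_overcount(period, 0.01)
    print(f"{period:>2} days: cookies overstate visitors by {ratio - 1:.0%}")
```

In this model a one-day count is exact, while a week's count overstates unique visitors by about 6% and a month's by about 29%. The real figures depend on actual deletion and return-visit behavior, which this sketch does not attempt to capture.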
All the services listed up to now are server-based measurement solutions. They collect data from different sources (website log file uploads, JavaScript tag activations, HTML code snippets, etc.) and analyze it to report website usage patterns. The obverse of this type is panel-based measurement, which takes a cross-section of people as a sample for tracking internet usage habits. The data obtained is extrapolated to arrive at rough estimates of total traffic. Comscore, the last of the three, uses this model of traffic measurement.
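
The extrapolation step in panel-based measurement can be sketched as follows. This is a minimal illustration, not any vendor's method: it assumes the panel is a simple random sample of the population, uses a normal-approximation margin of error, and all the numbers are hypothetical.

```python
from math import sqrt

def extrapolate_visitors(panel_size, panel_visitors, population, z=1.96):
    """Scale a panel's visit share up to the whole population.

    Returns (estimate, low, high), where low/high bound an approximate 95%
    interval when z=1.96. Assumes the panel is a simple random sample.
    """
    share = panel_visitors / panel_size
    margin = z * sqrt(share * (1 - share) / panel_size)
    low = max(0.0, share - margin) * population
    high = min(1.0, share + margin) * population
    return share * population, low, high

# Hypothetical: 250 of 10,000 panelists visited the site, and the panel
# stands in for an online population of 200 million.
estimate, low, high = extrapolate_visitors(10_000, 250, 200_000_000)
print(f"about {estimate:,.0f} visitors (between {low:,.0f} and {high:,.0f})")
```

The interval only reflects sampling noise. If the panel is not representative of the population, the estimate can be off by far more than the interval suggests.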
Comscore: Comscore is the most prominent website traffic analyzer that uses the panel-based traffic measurement methodology. To counter the cohort-specific data errors that emerge in opt-in server-based measurement, Comscore chooses participants from the widest possible spectrum of internet demography and tracks their behavior. In its own words, Comscore’s panel “includes approximately 2 million people under continuous measurement on a global basis, with 1 million residents in the U.S., and the remaining 1 million distributed across more than 170 countries.” Comscore also tries to reduce the “self-selection bias” of its panel by including people that it selects itself and by ensuring that all groups in the panel are adequately represented. Since August ’09, Comscore has also introduced server-based measurement into its analytics data mix, to counter criticism of its panel-based approach.
Plusses:
• Most representative sample space
• Accurate tracking of each participant, without the redundant counts or automated gaming of the system that occur in server-based measurement solutions
Minuses:
• Like pre-election surveys, panel-based measurement depends heavily on the distribution of its sample for its accuracy
• USA-centric data
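
How much a panel's make-up matters can be shown with a simple reweighting sketch. It is not Comscore's actual procedure: it assumes a basic post-stratification in which each group's visit rate is weighted by that group's share of the real population, and the two age groups and all figures are invented.

```python
def panel_reach(panel, population_shares):
    """Compare raw and reweighted reach estimates from a panel.

    panel maps a group name to (panel_members, members_who_visited).
    population_shares maps the same group names to their share of the real
    population (shares should sum to 1). Returns (raw, weighted) reach.
    """
    total_members = sum(members for members, _ in panel.values())
    total_visitors = sum(visitors for _, visitors in panel.values())
    raw = total_visitors / total_members
    weighted = sum(
        population_shares[group] * visitors / members
        for group, (members, visitors) in panel.items()
    )
    return raw, weighted

# Hypothetical panel that over-represents younger users.
panel = {"18-34": (600, 300), "35+": (400, 40)}
population_shares = {"18-34": 0.3, "35+": 0.7}
raw_reach, adjusted_reach = panel_reach(panel, population_shares)
print(f"raw: {raw_reach:.0%}, reweighted: {adjusted_reach:.0%}")
```

Here the skewed panel suggests a reach of 34%, while reweighting to the assumed population mix brings it down to 22%. The correction is only as good as the population shares it relies on.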

Typical panel-based website audience tracking model
Tags: alexa > comscore > quantcast > website analytics review > website stats > website tracking > website traffic measurement