New web analytics tutorial

October 12th, 2009

This is just a brief post to tell everyone (who is interested) that we (Netminers) have just launched a new website, including a brief web analytics tutorial.

The old one was hopelessly outdated - not simply because of the graphics and layout, but also because we previously released a new version of our web analytics tool, which it didn’t describe at all! In fact, our old website has been misleading our visitors for quite some time now, and I am therefore very happy to see the new, updated version live now.

I invite all of you to check it out. For those who are new to web analytics, you might also want to check out our new web analytics tutorial. It is very brief but (hopefully) to the point. It guides you through the four main steps in web analytics:

  1. Define your goals
  2. Measure your website against those goals
  3. Analyze how you can improve your site
  4. Optimize your site by means of testing

We have also included a web analytics glossary, which you might find useful. I hope you like the tutorial and our website in general. As always, please feel free to leave a comment or two :-)

Are Online Surveys Reliable?

July 24th, 2008

Whenever I recommend online surveys to my customers, the number one question I get is whether or not the results are trustworthy. My customers ask this question because they fear that website visitors typically won’t bother to fill out questionnaires online and that the collected data are therefore unlikely to be representative.

In this post I will present a case study which confirms that fear. My study shows that there are in fact significant differences between respondents and other visitors: they tend to be more engaged with the website, they see different content and they even come from different geographical areas.

As such the results from online surveys should not be generalized to your entire website without first correcting for sampling error. The way you should do this, however, depends heavily on the purpose of your survey. In some cases, you can get by without correcting for sampling error at all – for example if your purpose is to analyze the (behavioral) causes of satisfaction / dissatisfaction.

Why use online surveys in web analytics?
According to my own definition, web analytics is primarily concerned with measuring online behavior. Web analytics can tell you exactly what people do on your website as well as how often they do it. However, it cannot tell you who the visitors are, why they act as they do and what they think or feel while browsing your site.

The classical example of this limitation in web analytics is the ambiguous meaning of popular metrics such as “Page Views per Visit” or “Average Visit Duration”. High values for these metrics are often seen as signs of satisfaction. In principle, however, they could just as well mean that it was hard for the visitors to find what they were looking for.

Imagine a stubborn visitor who repeatedly tries to complete a check-out procedure. Such a visitor may produce many page views and spend a lot of time on your site even though he or she never completes the task. When you think about it, any given web metric or KPI based purely on behavioral data can always be interpreted positively or negatively.

Given this weakness of traditional web analytics, more and more web analysts are turning to online surveys as a supplement to behavioral data (see here and here). Such surveys enable you to ask the visitors directly instead of guessing from their clickstreams. If they are conducted continuously, you can even use them to include opinion scores in your dashboard alongside your conversion rate and other traditional web metrics.

If done right, online surveys make up a powerful tool for gaining access to the minds of your visitors. They are a perfect companion to web analytics, because they add a subjective dimension to the otherwise purely objective observation of behavior.

The problem of representativeness
Although online surveys complement web analytics in a powerful way, they have one major drawback: Not all visitors want to fill out questionnaires online and as such data are always sampled. I recently calculated the average response rate for all our survey customers and found that it was only around 8%.

The question is, therefore, whether or not survey data are representative. To answer this we first need to know what is required for a sample to be representative. Contrary to common belief, the small response rates of online surveys are not in themselves a problem.  As long as the entire visitor population is relatively large, the sample can be proportionally very small and still be representative.

For example, if you have a total of 5,000 visitors on your website in a given period, you only need 357 respondents in order to be 95% sure that your results are correct +/- 5%. This corresponds to a response rate of only 7% (i.e. less than our average of 8%). If your visitor population increases, the ratio gets even better. Thus, if you have 30,000 visitors on your website, the required sample size is 380, corresponding to a response rate of only 1%.
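The figures above can be reproduced with a few lines of Python. This is only a sketch: it assumes the standard sample size formula for a proportion (Cochran's formula) with a finite population correction, a 95% confidence level (z = 1.96) and the most conservative expected proportion of 50%. The function name and parameters are chosen for illustration.

```python
import math


def required_sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Sample size for estimating a proportion in a finite population.

    Assumes Cochran's formula with finite population correction;
    z=1.96 corresponds to a 95% confidence level and p=0.5 is the
    most conservative guess for the true proportion.
    """
    # Sample size needed for an infinitely large population
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Shrink it to account for the limited number of visitors
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


for visitors in (5_000, 30_000):
    n = required_sample_size(visitors)
    print(f"{visitors} visitors: {n} respondents, response rate {n / visitors:.1%}")
# 5000 visitors: 357 respondents, response rate 7.1%
# 30000 visitors: 380 respondents, response rate 1.3%
```

Note how slowly the required number of respondents grows: six times as many visitors call for only 23 more respondents.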

In most cases the response rate is not the main problem. What is much more problematic is whether or not there is a systematic difference between respondents and other visitors. For a sample to be representative, all members of the population must have an equal chance to be drawn.

This is not the case if certain groups of visitors are more inclined than others to participate in online surveys. In this case, your sample will be biased even if you tried to increase the response rate (e.g. by offering an incentive to participate such as a gift or a chance to win a prize). If there is a systematic difference between respondents and non-respondents, an increased response rate will do little more than underscore this difference.

A case study
It is normally difficult to compare online survey respondents with other visitors on a website. In most cases, we have no information about visitors who do not respond. At Netminers, however, we have developed an integrated web analytics and online survey tool. This tool enables us not only to see what survey respondents do online, but also how their behavior differs from non-respondents. We are therefore in a unique position to study the sampling bias of online surveys.

The following study compares respondents with all visitors on 12 websites belonging to the same company (a customer of ours who has kindly allowed us to use their data in an anonymous form). The websites are similar in structure and content, but differ in terms of language. A total of 55 surveys were launched across all of these websites. This resulted in 59,957 respondents from a variety of countries, including Denmark, Sweden, Norway, Finland, Holland, Germany, Poland, France, Italy, Spain, United Kingdom and United States.

In order to make the data more comparable, all repeat visitors were filtered out. This brought the base of respondents down to 43,154 and the total population of visitors to 8.6 million. The reason why repeat visitors should be disregarded here is to avoid counting respondents multiple times. Returning respondents are not invited to participate in the survey again and would therefore wrongly be considered non-respondents. This would distort the comparison.

Let us now look at the results. The following charts show that respondents and other visitors do indeed differ in terms of their behavior. The first chart shows the difference in traffic sources. As we can see, respondents tend to enter the site directly, whereas the rest of the visitors more often come from search engines.

This means that respondents are more likely to know the website beforehand and to visit with a particular purpose. They are unlikely to enter “by chance” because a particular search word happened to bring them to the website.

Online Survey Bias - Traffic Sources

If we look at the next chart, we see another interesting difference, namely that respondents are less likely to “bounce” when they land on the site. A “bounce” is here defined as a single-page visit, whereas “retained” means a visit which views at least two pages.  The chart shows a huge difference: whereas the general bounce rate for the website is 52%, it is only 23% for respondents!

 Online Survey Bias - Bounce Rate

Respondents seem to be much more engaged in the website: they both enter directly and delve deeper into the content after arrival. This is underscored by the fact that if we look only at “retained” visitors, respondents view on average 2 more pages per visit. That is to say, retained respondents view 12 pages per visit, whereas all retained visitors view 10 pages per visit.

The level of engagement is certainly higher for respondents than for all visitors. However, respondents also tend to see different content. The next chart shows the difference in exit pages for respondents and all visitors. More specifically, it shows which content section the visitors exited from.

The website in this case study is in the travel business and its two biggest content sections are called “Inspiration” and “Tourist Information”.  Both of these sections have an over-representation of respondents. The reverse is true for the rest of the sections, where respondents are under-represented. It is especially noteworthy that the “Online Booking” section which includes the company’s conversion pages has proportionally fewer respondents.

Online Survey Bias - Content Section Exits

The last chart compares the geography of respondents and all visitors. Again we see considerable differences. In general, the biggest target groups for the website (i.e. the Nordic countries and Germany) tend to have lower response rates than the smaller ones. What is especially interesting is that the UK stands out with an extremely high response rate. For some reason, Britons are much more likely to accept participation in online surveys than visitors from other countries.

Online Survey Bias - Geography

Consequences for analyzing engagement
In this post I have shown that online surveys are indeed biased. The most striking difference is that survey respondents tend to be much more engaged than non-respondents: they know the site beforehand, they bounce less often and they see more pages during their visits.

This is perhaps not surprising: the more involved you are in a website, the more of an incentive you have to provide feedback which could lead to improvements. In contrast, if you find the website irrelevant from the beginning, and perhaps bounce as a consequence, you have less of an incentive to answer.

What this means is that online surveys are weak when it comes to measuring or analyzing the causes of engagement (or lack thereof). We cannot simply ask visitors why they do not engage since these visitors have no intention of answering. We probably cannot even correct this sampling error by weighting the data since the difference is too big. The bounce rate, for example, is so much lower for respondents that it is doubtful whether bounced respondents and bounced non-respondents are comparable at all.

Consequences for analyzing satisfaction
It could be argued that satisfaction scores are likely to be artificially high among online survey respondents. Given that respondents are more engaged than non-respondents, you might think that they are also more satisfied. This is certainly true if engagement is caused by satisfaction or vice versa.

However, in my view, the relationship between engagement and satisfaction is not that simple. Engagement can be defined as an intensive or sustained focus on something (which is often accompanied by intensive use). This focus is not the same as satisfaction; rather, it is the act of building an experience with the object which eventually leads to an evaluation. If the evaluation turns out positive the person is likely to continue being engaged, whereas if it turns out negative he or she is likely to stop. This is why it is sometimes possible to observe a correlation between satisfaction and engagement (measured by use intensity) over longer periods of time.

However, in a short-term perspective, such as during a visit, engagement and satisfaction are not correlated. They are only related in the sense that satisfaction presupposes engagement. As such, it could be argued that respondents who do not engage at all (e.g. those who bounce) should be disregarded entirely when calculating satisfaction scores. Given that such respondents have almost no experience with the website, their “evaluation” of it must be considered unreliable. By the same token, it could be argued that highly engaged respondents who still express dissatisfaction should be given more weight insofar as their evaluations are more reliable.

(See also the discussion on engagement in “Measuring Online Engagement: What Role Does Web Analytics Play?” and “Responding to Geertz, Papadakis and others.”)

If the above argument is true, then online surveys are not weak when it comes to analyzing the causes of satisfaction with a website. By comparing the page views of satisfied and dissatisfied respondents it becomes possible to identify those areas of the website which tend to cause this satisfaction / dissatisfaction. It is less important to correct for sampling error here because those visitors who respond to online surveys are exactly the most reliable ones.

Still, it might be relevant to weight data under certain circumstances. If your aim is to measure the overall satisfaction as accurately as possible (rather than analyzing the causes of satisfaction), you need to make sure that respondents are exposed to more or less the same content as other visitors. As I have demonstrated in this post, this is far from always the case. If possible, you should therefore apply weights to those respondents who have visited areas where respondents are generally under-represented.
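One simple way to apply such weights is to give each respondent a weight equal to the visitor share of the content section divided by the respondent share of that section. The sketch below assumes this approach (often called post-stratification). The section names come from the case study, but all counts and satisfaction scores are invented for illustration, as are the function names.

```python
def section_weights(respondent_counts, visitor_counts):
    """Weight per content section = visitor share / respondent share."""
    total_respondents = sum(respondent_counts.values())
    total_visitors = sum(visitor_counts.values())
    weights = {}
    for section, respondents in respondent_counts.items():
        if respondents == 0:
            continue  # a section without respondents cannot be weighted up
        respondent_share = respondents / total_respondents
        visitor_share = visitor_counts[section] / total_visitors
        weights[section] = visitor_share / respondent_share
    return weights


def weighted_satisfaction(responses, weights):
    """Weighted mean of (section, score) pairs."""
    weighted_sum = sum(weights[section] * score for section, score in responses)
    weight_total = sum(weights[section] for section, _ in responses)
    return weighted_sum / weight_total


# Hypothetical counts: respondents are over-represented in "Inspiration" and
# "Tourist Information" and under-represented in "Online Booking".
respondents = {"Inspiration": 50, "Tourist Information": 30, "Online Booking": 20}
visitors = {"Inspiration": 400, "Tourist Information": 200, "Online Booking": 400}

weights = section_weights(respondents, visitors)
for section, weight in weights.items():
    print(f"{section}: weight {weight:.2f}")

# Two hypothetical answers on a 1-5 satisfaction scale
answers = [("Inspiration", 4), ("Online Booking", 2)]
print(f"Weighted satisfaction: {weighted_satisfaction(answers, weights):.2f}")
```

In this made-up example the respondents from “Online Booking” count double, which pulls the overall score towards their answers.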

Consequences for analyzing demography
Finally, an important reason to weight your survey data is that respondents tend to differ in terms of demography. In this post I have shown considerable geographical differences between respondents and non-respondents. These differences are likely to skew other, underlying demographic data. It is probably always a good idea to correct for this type of sampling error. However, if your aim is to analyze the demography of your visitors, it becomes imperative.

Did you find this post helpful? Do you have experience yourself with online surveys? Perhaps you have tried to integrate web analytics and online surveys? Please share your thoughts or experience by leaving a comment!

 

Online Marketing KPIs - An Overview

May 27th, 2008

In my post Designing Web Analytics Dashboards I mentioned that web analytics KPIs (Key Performance Indicators) can be divided into five main groups: Marketing KPIs, Engagement KPIs, Usability KPIs, Conversion KPIs and Loyalty KPIs.

I didn’t specify, however, which KPIs belonged to each of these groups. As such I thought I should write a series of posts to clarify the issue. Here comes the first one, which focuses on measuring your online marketing activities.

As I see it, there are three types of online marketing activities you can measure: Online Campaigns (PPC links, affiliate marketing, banners, etc.), Email Marketing and Search Engine Optimization (SEO).

In the following I shall provide a list of KPIs and associated actions for each of these types.

Online Campaigns KPIs

  1. How much new traffic do you get from your online campaigns?
  2. What is the click-through rate for each campaign?
  3. What is the bounce rate for each campaign?
  4. What is the conversion rate for each campaign?
  5. What is the ROI or cost-per-acquisition for each campaign?

Actions for online campaign KPIs:

If the click-through rate is low, your campaign itself should be improved. If your bounce rate is high, your landing page(s) should be improved. If the conversion rate and ROI are low and your cost-per-acquisition is high, discard the campaign or re-negotiate the price!
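The campaign KPIs listed above can be calculated from a handful of raw numbers. The following sketch assumes that every click results in one visit and that ROI is defined as (revenue - cost) / cost. The field names and the example figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    name: str
    impressions: int   # how many times the ad or link was shown
    clicks: int        # assumed to equal the visits from the campaign
    bounces: int       # single-page visits among those clicks
    conversions: int
    cost: float
    revenue: float


def campaign_kpis(c):
    """Return the campaign KPIs as a dictionary of ratios."""
    return {
        "click_through_rate": c.clicks / c.impressions if c.impressions else 0.0,
        "bounce_rate": c.bounces / c.clicks if c.clicks else 0.0,
        "conversion_rate": c.conversions / c.clicks if c.clicks else 0.0,
        # Without conversions the cost per acquisition is unbounded
        "cost_per_acquisition": c.cost / c.conversions if c.conversions else float("inf"),
        "roi": (c.revenue - c.cost) / c.cost if c.cost else 0.0,
    }


# Invented example figures
banner = Campaign("banner", impressions=10_000, clicks=200, bounces=120,
                  conversions=10, cost=500.0, revenue=750.0)
for kpi, value in campaign_kpis(banner).items():
    print(f"{kpi}: {value:.2f}")
```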

E-mail Marketing KPIs

  1. How many emails have you sent?
  2. How many emails bounced (could not be delivered)?
  3. What is the open rate for each delivered email?
  4. What is the click-through rate for each opened email?
  5. What is the click-through rate for each link in your emails?
  6. What is the conversion rate and/or ROI for each email?
  7. What is your unsubscribe rate?

Actions for Email Marketing KPIs:

If the open rate is low, the subject text and visible text in the message should be improved. Learn from the click-through rates of individual links what works and what doesn’t. If your unsubscribe rate is high, the news articles should be improved.
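Note that the email rates above have different denominators: the open rate is based on delivered emails, while the click-through rate is based on opened emails. A minimal sketch follows, assuming in addition that the unsubscribe rate is measured against delivered emails. The function name and numbers are made up.

```python
def email_kpis(sent, bounced, opened, clicked, unsubscribed):
    """Email marketing rates; each rate uses the denominator named in the KPI list."""
    delivered = sent - bounced
    return {
        "delivered": delivered,
        "bounce_rate": bounced / sent if sent else 0.0,
        "open_rate": opened / delivered if delivered else 0.0,      # per delivered email
        "click_through_rate": clicked / opened if opened else 0.0,  # per opened email
        # Assumption: unsubscribes are measured against delivered emails
        "unsubscribe_rate": unsubscribed / delivered if delivered else 0.0,
    }


# Invented example figures for a single newsletter
kpis = email_kpis(sent=1000, bounced=50, opened=380, clicked=95, unsubscribed=19)
for name, value in kpis.items():
    print(name, value)
```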

Search Engine Optimization (SEO) KPIs

  1. How much traffic do you get from each of the keywords you have chosen to focus on for optimization?
  2. What is the average rank of your keywords on the search result pages across all search engines?
  3. What is the conversion rate for each keyword?
  4. What is the cost-per-acquisition for your SEO activities?
  5. What is the ROI for your SEO activities?
  6. Which keywords rank low on the search result pages while at the same time contributing with many visits?

Actions for SEO KPIs

If the average ranking of the keywords you have chosen to focus on is low, you should consider ways to further improve your website for SEO purposes.

If the conversion rates for the keywords you have chosen to focus on are low, you should consider replacing them with other, potentially more effective keywords.

All keywords which you have NOT chosen to focus on, and which have low average rankings, are candidates for optimization. Choose those which result in many visits or high conversion rates in spite of their low ranking.
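The last rule can be expressed as a simple filter. In the sketch below a “low ranking” means a large average position number on the search result pages. The thresholds, field names and keywords are all hypothetical and should be adjusted to your own site.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    term: str
    in_focus: bool          # already chosen for optimization?
    average_rank: float     # position on result pages; larger = lower ranking
    visits: int
    conversion_rate: float


def optimization_candidates(keywords, rank_threshold=10.0,
                            min_visits=100, min_conversion_rate=0.05):
    """Non-focus keywords that rank low but still bring visits or conversions."""
    candidates = [
        k for k in keywords
        if not k.in_focus
        and k.average_rank > rank_threshold
        and (k.visits >= min_visits or k.conversion_rate >= min_conversion_rate)
    ]
    # Most visited candidates first
    return sorted(candidates, key=lambda k: k.visits, reverse=True)


# Invented example data
keywords = [
    Keyword("city break", True, 25.0, 500, 0.01),     # already in focus
    Keyword("family hotel", False, 18.0, 300, 0.02),  # many visits
    Keyword("ferry tickets", False, 22.0, 40, 0.08),  # high conversion rate
    Keyword("weather", False, 3.0, 900, 0.01),        # already ranks well
    Keyword("old brochure", False, 30.0, 5, 0.0),     # no value
]
print([k.term for k in optimization_candidates(keywords)])
# ['family hotel', 'ferry tickets']
```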

That’s it for now! I hope you found this overview helpful. If not, please leave a comment.

Can usability be measured with web analytics?

May 16th, 2008

Traditionally, usability is tested qualitatively by conducting interviews with a small number of potential users.

The interviews take place in a “lab” where the participants are given certain tasks to complete on the web site in question. While interacting with the site, a moderator observes their behavior and asks them to share their experiences.

While I’m a big fan of this kind of usability test, the method also has a number of drawbacks. The most important is that it takes a long time to carry out and doesn’t provide the kind of instant and continuous feedback we know from web analytics.

As such, I find it interesting to search for possible ways of using web analytics to measure usability. What we need, ideally, is some kind of easy-to-understand, yet robust, web metric that indicates how easy it is for the visitors to navigate our site.

Such a web metric should not necessarily replace qualitative tests in a lab. Rather, it should help us to obtain early warnings and point to the parts of our website which are most in need of usability attention.

Moving beyond standard metrics
Measuring usability by means of web analytics is not as easy as one might think.

Traditional web metrics, to be sure, will not do the job: The number of page views per visit, the bounce rate, the number of visits per visitor, the average visit duration, etc. None of these metrics can tell us anything about navigational problems.

Although it is true that frustrated or confused visitors might end up producing many page views per visit, for example, highly engaged visitors might do the exact same.

Introducing Back Navigation Rate
So, the question remains: How can we identify usability problems simply by looking at click stream data? Is it possible at all?

Some years ago, my colleagues and I at Netminers set out to answer that question. We came up with the idea that if there is any way to identify frustrated or confused visitors it must be based on how often they go back and forth between the same pages.

A visitor who runs into a problem, we thought, is likely to stop and go back to make sure he or she didn’t miss something. This might even happen repeatedly since visitors are likely to double-check before abandoning the site altogether.

We therefore decided to develop a technique for tracking what we call “back navigations” and “turn pages”. Consider the following click stream.

Click Stream With Back Navigations and Turn Page

The blue dots represent pages seen by a single visitor. The arrows indicate the direction of the click stream.

A is the entry page, while B and C are pages that lead to a particular interest point, D. Having looked at D, the visitor decides to go back to C, and further back to B. Finally the visitor makes a new move forward from B to E.

As can be seen from the figure, there are four forward navigations and two back navigations. We say in this case that the Back Navigation Rate is 50% (2 divided by 4). Notice also that D can be seen as a turning point in that the visitor decides to turn around here. We call D a “Turn Page”.
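To make the counting concrete, here is a small sketch that classifies each step in the click stream above. It is not the actual Netminers tracking technique, which is not described here. It simply assumes that a step counts as a back navigation when the visitor returns to the page he or she came from, and that a turn page is the page on which a forward move is followed by a back move. The function name is made up.

```python
def analyze_click_stream(pages):
    """Return (forward, back, back_navigation_rate, turn_pages) for one visit."""
    path = []                  # trail from the entry page to the current page
    forward = back = 0
    turn_pages = []
    last_move_forward = False
    for page in pages:
        if not path:
            path.append(page)  # entry page, no navigation yet
            continue
        if page == path[-1]:
            continue           # reload of the same page, ignored
        if len(path) >= 2 and page == path[-2]:
            # The visitor returns to the page he or she came from
            back += 1
            if last_move_forward:
                turn_pages.append(path[-1])  # the visitor turned around here
            path.pop()
            last_move_forward = False
        else:
            forward += 1
            path.append(page)
            last_move_forward = True
    rate = back / forward if forward else 0.0
    return forward, back, rate, turn_pages


print(analyze_click_stream(["A", "B", "C", "D", "C", "B", "E"]))
# (4, 2, 0.5, ['D'])
```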

When is a high Back Navigation Rate bad?
Is it good or bad that the Back Navigation Rate is 50%? Well, it depends.

You cannot always say it’s a bad thing, because sometimes visitors just like to explore links by moving back and forth on the site. This is especially the case on the home page where visitors often explore many different links before selecting a specific page for closer inspection.

According to Netminers Index (i.e. a series of benchmarks based on all the data we collect across all of our clients), the Back Navigation Rate is on average 30%. Back navigations are therefore not as rare as you might first think.

Keep it low for your check-out procedure!
However, imagine now that you are looking exclusively at a check-out procedure or a similar conversion process. In this case you certainly do not want users to go back 30% of the time! A check-out procedure, to be sure, is supposed to be linear, not zigzag, and, in our experience, here the Back Navigation Rate definitely shouldn’t exceed 5%.

Turn Page Rate
But you can do more to identify problematic back-and-forth movements. Notice that in the click stream presented above D appears as a turn page. This, however, happens only once.

Consider now a case where the visitor produces repeated turn pages.

Click Stream with Many Turn Pages

In this case B and C are both turn pages, and both of them appear as such two times. We can capture the difference between the former and the latter click stream by what we at Netminers call Turn Page Rate. This rate is calculated as follows: First, you slice your data so that you only look at turn pages. Then you divide the number of page views by the number of visits. This rate is 1 for D in the former click stream, but 2 for both B and C in the latter.
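The calculation can be sketched as follows. The code reads the definition as “the number of times a page appears as a turn page, divided by the number of visits in which it does so”, and it detects turn pages with the simple rule that a forward move is followed by a return to the previous page. Both are assumptions made for illustration. The second click stream is an invented path in which B and C each act as a turn page twice; it is not necessarily the path shown in the figure.

```python
from collections import Counter


def turn_pages_in_visit(pages):
    """List every occurrence of a turn page in one visit's click stream."""
    path, turns, last_move_forward = [], [], False
    for page in pages:
        if path and page == path[-1]:
            continue                       # reload, ignored
        if len(path) >= 2 and page == path[-2]:
            if last_move_forward:
                turns.append(path[-1])     # the visitor turned around here
            path.pop()
            last_move_forward = False
        else:
            last_move_forward = bool(path)  # the entry page is not a move
            path.append(page)
    return turns


def turn_page_rates(visits):
    """Turn page views divided by the visits in which the page was a turn page."""
    turn_views = Counter()
    turn_visits = Counter()
    for pages in visits:
        turns = turn_pages_in_visit(pages)
        turn_views.update(turns)
        turn_visits.update(set(turns))
    return {page: turn_views[page] / turn_visits[page] for page in turn_views}


former = ["A", "B", "C", "D", "C", "B", "E"]
latter = ["A", "B", "A", "B", "A", "C", "A", "C", "A"]  # hypothetical path
print(turn_page_rates([former]))   # {'D': 1.0}
print(turn_page_rates([latter]))   # {'B': 2.0, 'C': 2.0}
```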

What do you think?
What is your opinion about Back Navigation Rate and Turn Page Rate? Can you think of any flaws of these metrics? Do you have suggestions for some alternative and perhaps better usability metrics? Please let me know!

Designing Web Analytics Dashboards

April 4th, 2008

Dashboards, dashboards, dashboards! Dashboards are a hot topic within the web analytics world at the moment.

Dashboards are great because they enable you to set targets, track your progress and obtain early warnings. This is extremely important if you want to optimize your web activities on a continuous basis.

But how do you build a successful dashboard? How do you ensure relevance and impact? How do you ensure that your dashboard leads to concrete action?

The steps towards a successful dashboard
Although there are many ways of building dashboards, in my experience it always boils down to the following steps:

  1. Target your dashboard
  2. Select only the most relevant KPIs
  3. Visualize data as much as possible
  4. Include benchmarks
  5. Explain KPI trends
  6. Make the dashboard interactive

1. Target your dashboard
Do you work for a large company with several departments? Are you responsible for distributing updated reports to many different persons on a monthly basis?

In this case, you should first of all think about your target groups. How can the users of your dashboards be categorized according to their needs? And how can you best build dashboards that serve these different needs?

For example, you could operate with the following groups:

  • Web team
  • Marketing department
  • Sales department
  • IT department
  • Top management

Each of these groups is likely to demand very different types of dashboards. The top management might want a high-level aggregate view of all web activities. The sales department might want a list of new hot leads. The marketing department might want to measure ROI of their online marketing activities. And so on.

Even if you are only building dashboards for yourself, you should think about the different contexts in which you will use dashboards.

What is the purpose of the specific dashboard you are building? Are you trying to measure the effects of the newsletter? Are you aiming to optimize the internal search field on your website?

The more focused you make your dashboard, the higher relevance and impact it is likely to have. Focus is also the prerequisite for the next step, namely to select the right KPIs. 

2. Select only the most relevant KPIs
In case you don’t already know, a KPI (Key Performance Indicator) is a measurement which quantifies the success of your web activities. Given that “success” is a subjective thing (it depends on the goal you set yourself), a measurement becomes a KPI the moment you decide it to be so.

In principle, all measurements could therefore be turned into KPIs. The art, however, is to confine yourself to the most important measurements.

Web analytics includes so many interesting measurements that they cannot all be listed here. Let me instead simply say that web analytics measurements divide into the following main groups:

  • Marketing KPIs which measure the effectiveness of search engines, online banner campaigns, e-mails/newsletters and other traffic sources.
  • Engagement KPIs which measure the website’s ability to retain visitors, stimulate interest in content and make visitors perform certain actions
  • Usability KPIs which measure the effectiveness of the website’s navigation, including the website’s search field
  • Conversion KPIs which measure the website’s ability to convert visitors into customers, leads or prospects
  • Loyalty KPIs which measure the website’s ability to make visitors come back often and at short intervals and to use certain functionality repeatedly

If you break down each of these main groups, you will get a very long list of potentially interesting measurements. From this long list, you should carefully choose a unique collection for each dashboard target group.

3. Visualize data as much as possible
Gestalt psychology tells us that the human mind is much better at understanding visual forms than processing raw numbers. If you want your dashboard to communicate clearly and rapidly, you must carefully consider how to apply visualization. In the end visualization can be what makes or breaks your dashboard.

Visualization of data, as I see it, is not simply about choosing the right chart for the right numbers. It is also about using colors appropriately, arranging the information in meaningful ways, putting visual emphasis on important information, using icons for warnings, exploiting space in the most efficient manner, etc.

One of the best books I have read on data visualization for dashboards is “Information Dashboard Design” by Stephen Few. Although his views are sometimes a bit extreme, the underlying principles are no doubt sound.

4. Include benchmarks
KPIs are almost useless if they are not related to standards of good and bad values. In order to make a particular KPI stand out as good or bad, you must include benchmarks.

There are several ways of doing so. The simplest type is a static value indicating your goal. For example, such a static value can be displayed in a column graph by a straight horizontal line. In gauges, you can add color ranges – e.g. green, yellow, red – indicating good, medium and bad values.

It is also possible to work with dynamic benchmarks such as a line calculated as an X% increase over the previous year's value or, even more sophisticated, an error band from a linear regression. Finally, benchmarks can be calculated as average values across content sections, across the company’s websites and/or across competing websites.
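Two of the benchmark types mentioned above are easy to sketch: the color ranges of a gauge and a dynamic benchmark line based on last year's values. The example assumes a KPI where higher values are better. The thresholds, function names and figures are hypothetical.

```python
def gauge_status(value, yellow_from, green_from):
    """Map a KPI value to a gauge color, assuming higher values are better."""
    if value >= green_from:
        return "green"
    if value >= yellow_from:
        return "yellow"
    return "red"


def yearly_increase_benchmark(previous_year_values, increase_pct):
    """Benchmark line: last year's value for each period plus X percent."""
    factor = 1 + increase_pct / 100
    return [value * factor for value in previous_year_values]


# Hypothetical conversion rate of 3% with a goal of 4% and a warning level of 2%
print(gauge_status(0.03, yellow_from=0.02, green_from=0.04))

# Hypothetical monthly visits last year and a target of 10% growth
last_year = [1000, 1200, 900]
this_year = [1150, 1250, 1000]
benchmark = yearly_increase_benchmark(last_year, increase_pct=10)
for actual, target in zip(this_year, benchmark):
    status = "above" if actual >= target else "below"
    print(f"actual {actual}, benchmark {target:.0f}: {status}")
```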

5. Explain KPI trends
Benchmarks are one way of providing your dashboard with “context”. Another important way is to include information which explains your main KPIs. Ideally, a dashboard should consist of a number of data visualizations which supplement each other.

Although dashboards can rarely offer exhaustive explanations, you should at least try to include the most obvious first step in an in-depth analysis. Let’s say you include a line chart showing an upward traffic trend on your website. In this case, it would be interesting to know which traffic source has contributed the most new visitors. This can be done, for example, by adding a bar chart which breaks down the traffic by source, such as direct entries, search engines, link traffic or paid-for campaigns.

Immediately the dashboard viewer will understand not only that the traffic has gone up in the last few weeks, for example, but also why this might be the case. Even if you always need to supplement with in-depth analyses, you should still think in terms of cause and effects when designing your dashboard.

6. Make the dashboard interactive
I believe it is important to distinguish between dashboards and in-depth analysis. A dashboard should enable the viewer to see all of the relevant information on a single screen. In-depth analysis, on the other hand, involves exploring data and searching for answers by applying new filters and looking at data from different angles.

The primary purpose of a dashboard is to monitor, not to analyze. Having said this, however, there is no reason for a dashboard not to support exploratory analysis. The best dashboard applications allow the viewers to move seamlessly from a passive state of monitoring to an active state of analyzing. In doing so the dashboard transforms itself into a powerful analytical application, which enables the viewer not only to see “what” is happening, but also “why”.

Here are some interactive features that I consider important to a dashboard that supports analysis:

  • Global filters: The ability to remove unwanted data from the entire display
  • Local filters: The ability to remove unwanted data from a particular chart
  • Highlights: The ability to highlight selected segments across all of the views
  • From static to trend: The ability to transform a single KPI (e.g. a gauge) into a trend line and vice versa
  • More charts: The ability to quickly add or remove a chart to the display

Conclusion
Dashboards are a powerful way of monitoring websites and distributing information to different target groups. However, they can be extremely difficult to design, because successful designs require deep insight into the needs of the target groups as well as into the field of web analytics.

In this post I have given some guidelines as to the characteristics of an effective dashboard. If you have other ideas about best practices for building dashboards, please leave a comment below.

Web analytics defined

December 7th, 2007

This blog is dedicated to web analytics. A good place to start is therefore to try and define what web analytics is.

My definition goes as follows:

Web analytics is a new, fast-growing discipline that helps companies optimize their websites, online marketing activities and customer relationships by means of collecting, analyzing and reporting data about internet users.

To this I think we should add that web analytics has four distinctive characteristics:

  • Web analytics can use a variety of data sources, but it is almost always based on behavior tracking
  • Web analytics always aims to understand internet users in a natural setting - that is, as they navigate a specific website or web universe
  • Web analytics is primarily based on quantitative data and statistical analysis
  • Web analytics helps companies not only to achieve certain business goals, but also to understand web site visitors and to give them an optimal experience by enabling them to complete the tasks they themselves define