
Usability evaluation


The pressure to innovate and to create products and services that ease users’ workflows and combine previously separate tasks places even more weight on the usability of the system. Usability, in short, can be described as the ease of use and learnability of a product or a service. As stated by Hertzum [3], the term usability is used ubiquitously and contradictorily. Jakob Nielsen [1], for example, defines usability as “a quality attribute that assesses how easy user interfaces are to use” and identifies five components of it: learnability, efficiency, memorability, errors, and satisfaction. According to the International Organization for Standardization’s ISO 9241 standard [2], usability also consists of these three dimensions:

  1. Efficiency: the level of resource consumed in performing tasks
  2. Effectiveness: the ability of users to complete tasks using the technology and the quality of output of those tasks
  3. Satisfaction: users’ subjective satisfaction with using the technology
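
As a rough illustration of how these three dimensions might be turned into numbers after a test session, consider the Python sketch below. The standard as quoted here does not prescribe these formulas. The record layout (`TaskAttempt`), the 1 to 5 rating scale, and the choice of completion rate and mean task time as measures are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    """One participant's attempt at one task (hypothetical record layout)."""
    completed: bool      # did the participant finish the task?
    seconds: float       # time spent on the attempt
    satisfaction: int    # self-reported rating, assumed 1 (low) to 5 (high)


def summarize(attempts):
    """Return rough effectiveness, efficiency, and satisfaction figures."""
    if not attempts:
        raise ValueError("need at least one attempt")
    completed = [a for a in attempts if a.completed]
    return {
        # Effectiveness: share of attempts that ended in a completed task.
        "effectiveness": len(completed) / len(attempts),
        # Efficiency: mean time consumed per completed task (lower is better);
        # None when nobody completed the task.
        "efficiency_seconds": (
            mean(a.seconds for a in completed) if completed else None
        ),
        # Satisfaction: mean of the subjective ratings.
        "satisfaction": mean(a.satisfaction for a in attempts),
    }


# Made-up data for four attempts at the same task.
attempts = [
    TaskAttempt(True, 40.0, 4),
    TaskAttempt(True, 60.0, 5),
    TaskAttempt(False, 90.0, 2),
    TaskAttempt(True, 50.0, 5),
]
summary = summarize(attempts)
print(summary)
```

Measures like these capture only the narrow, ISO-style view of usability. They say nothing about the quality of the output or about why an attempt failed.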

Interactive systems are often designed through an iterative process involving design, evaluation, and redesign [10]. Usability evaluation is especially important in two phases of this process: when usability issues need to be solved before the final design, and when the final product’s efficacy needs to be evaluated. To recognize usability issues efficiently, many different aspects have to be considered. This paper first introduces different images of usability and shows that the definition of usability is not as narrow as the ISO standard makes it out to be. These images are then used to examine three sets of usability evaluation methods: user-based, evaluator-based, and tool-based. Finally, we conclude that the selection of methods should be treated as being as important as the actual evaluation phase, and that certain work-arounds can make the process more straightforward.

2. Images of usability

Usability evaluation cannot be said to consist of a fixed set of skills or of commonly known facts. Instead, it demands situation- and context-specific knowledge about the system and the people using it [3,4]. This is why, according to Hertzum, usability cannot be defined as a single set of properties, as the ISO standard aims to do; such a definition will eventually lead to problems. He describes six images of usability that cover different aspects of usability. The six partly overlapping images are:

  1. Universal usability: usability entails embracing the challenge of making systems for everybody to use.
  2. Situational usability: usability is equivalent to the quality-in-use of a system in a specified situation with its users, tasks, and the wider context of use.
  3. Perceived usability: usability concerns the user’s subjective experience of a system based on his or her interaction with it.
  4. Hedonic usability: usability is about joy of use rather than ease of use, task accomplishment, and freedom from discomfort.
  5. Organizational usability: usability implies groups of people collaborating in an organizational setting.
  6. Cultural usability: usability takes on different meanings depending on the users’ cultural background.

Hertzum also notes that the images, even when used together, are not enough to create a unified concept of usability, nor are they mutually exclusive. The images are complementary in that each has a different focus, but they are also interwoven and based on the same commonsense meaning of usability: products should be fit, convenient, and easy to use. It is recommended to use multiple images, with one image placed as the dominant one. The dominant image depends on the approach and goals at hand.

Designers have an implicit image of usability, which often guides their choices and solutions [3]. This kind of narrow view of usability may cause some usability problems to stay hidden [3,4]. To counter this, Hertzum describes a three-stage process for working with different images of usability. The process starts with discovery, where the perspective is shifted by applying different images of usability to sensitize the evaluator to multiple ways of perceiving usability. Assigning different images to different persons can enhance the discovery phase; it is also reasonable to assume that involving persons with different backgrounds is useful. The discovery phase is followed by an integration phase, where the findings from the different images are analyzed to form the essential aspects of the usability of the system. The integration phase results in the identification of the dominant image, which may be used to dictate the most essential usability aspects of the system. The final stage is challenging, where the dominant image is periodically changed to reveal the blind spots of the previous dominant image.

Collectively, the images span a wide set of approaches to usability. However, they do not cover usability from the perspective of the stakeholder. According to Hornbæk, the usability of a system depends heavily on the business goals of the system under evaluation, as they are beneficial for the system’s utility and impact [4]. In a way, business goals could be described as an alternative image of usability that emphasizes the stakeholder’s perspective.

3. Usability evaluation methods

There are three types of usability evaluation methods: user-based, evaluator-based, and tool-based [8]. The number of usability problems reported depends on the methods used, on which image of usability is emphasized, and, in the case of evaluator-based methods, also on the expertise of the evaluators [7,8]. Different methods and method types help uncover different usability issues, so triangulation of both methods and method types should yield better results than using just one. Evaluation method types can also be presumed to cater to different images of usability. For example, evaluator-based methods such as task analysis can be viewed as representing the universal usability image, whereas tool-based methods such as emotion collection are most suitable for the hedonic usability image. Next, I will present some of the usability evaluation methods from each method type.

3.1. User-based UEMs

The user-centered design process emphasizes the importance of “knowing thy user” and of not making assumptions about them. For this reason, user-based methods are especially important when the chosen image of usability builds on the users’ perspective. They are most often conducted in laboratories, where distractions can be kept to a minimum. User-based methods are, however, demanding in both time and resources, as recruiting suitable participants who are both able and willing can prove troublesome [10]. Because of this, user-based methods may prove too costly if the design is at an early phase or needs several iterations to meet user demands. In that case they should be scheduled at a point in the design process where most of the issues have already been uncovered by less costly methods [8].

According to Nielsen, thinking aloud might be the single most valuable usability evaluation method [10]. In the basic procedure, the user is given a task to perform with the system under evaluation and is asked to say his or her thoughts out loud continuously while performing it. The evaluator forms descriptions of the usability problems from the user’s actions and comments. The evaluator can use video or audio recordings as an aid to better capture the user’s interaction and thoughts and, if needed, to go back and inspect the user’s actions. The evaluator is, however, discouraged from interpreting the user’s words and actions, in order to retain objectivity.

Thinking aloud demands a skilled evaluator to produce results and to recognize usability issues [4,10]. Nielsen also points to a study which revealed that users performed 9% better with the system when thinking aloud [10]. Because the method itself affects performance, thinking aloud may be particularly unsuitable for collecting performance data; instead, its strength lies in the qualitative data it is able to collect even from quite a small sample [10].

3.2. Evaluator-based UEMs

Although user-based evaluation methods often reveal the usability issues that most affect the user, they are not the most efficient way to test if time and money are constraints, as stated earlier. Evaluator-based methods are a possible solution to both of these problems and might even reveal problems that are not revealed when testing with only a few users [6,10]. However, they require an experienced evaluator to work efficiently [4].

One possible way to test without users is the cognitive walkthrough. The process consists of imagining users’ actions and thoughts when using, for example, a prototype of the interface [6]. In a cognitive walkthrough, the evaluator goes through one of the system’s tasks by telling a story of all the actions the user has to take to complete the task. Details and motivations behind the user’s actions are added to make the story believable. Emphasis has to be placed on completing the task with only general knowledge and the information the system gives. If the designer is incapable of constructing a believable story about the task, it can be assumed that the task has usability problems. Walkthroughs can be used when the designer wants to quickly inspect small changes made to the design, as the method was developed mainly for developing the interface, not for validating it [6].

Another, quite similar way to evaluate is action analysis. Lewis divides action analysis into formal action analysis and “back of the envelope” analysis [6]. Both approaches consist of two phases. The first is to determine what physical and mental steps a user will perform to complete the task. The second is to analyze the steps to find possible problems. Formal analysis is highly detailed and can be used even before an actual prototype has been created. This comes at the cost of requiring a very skilled evaluator and considerable time. The “back of the envelope” approach, as the name suggests, is almost the opposite: it is much faster to do and can be used to reveal large-scale problems.

Action analysis is quite good at finding out whether completing a task with an interface takes too many steps, but the severity of the problems found is one property that should be kept in mind when considering this or other evaluator-based methods. It has been pointed out in several sources that these methods uncover a larger number of problems than traditional user-based methods but offer no means to validate the severity of the issues [8].
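
The two phases of a “back of the envelope” analysis can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-step time costs in `SECONDS_PER_STEP`, the `max_steps` threshold, and the example task are hypothetical and not taken from Lewis.

```python
# Assumed rough cost per step type, in seconds. These numbers are
# placeholders for illustration, not measured values.
SECONDS_PER_STEP = {"physical": 2.0, "mental": 3.0}


def estimate_task(steps, max_steps=10):
    """Phase 2 of the analysis: total the listed steps and flag long tasks.

    `steps` is the outcome of phase 1: a list of (description, kind) pairs,
    where kind is "physical" or "mental".
    """
    total = sum(SECONDS_PER_STEP[kind] for _, kind in steps)
    return {
        "step_count": len(steps),
        "estimated_seconds": total,
        "too_many_steps": len(steps) > max_steps,
    }


# Phase 1: list the steps for a hypothetical "attach a file" task.
attach_file = [
    ("decide to attach a file", "mental"),
    ("click the attach button", "physical"),
    ("recall where the file is stored", "mental"),
    ("browse to the folder", "physical"),
    ("select the file", "physical"),
    ("confirm the dialog", "physical"),
]

report = estimate_task(attach_file, max_steps=5)
print(report)
```

A sketch like this only counts steps and time. In line with the limitation noted above, it gives no indication of how severe a flagged problem would be for real users.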

3.3. Tool-based evaluation methods

The third approach to usability evaluation consists of methods that employ a tool of some sort. Emotions are one of the characteristics of the hedonic image of usability [3], and according to Isomursu, emotions also affect how users plan to interact with the product. They are, however, quite difficult to observe in the user and therefore require a method that emphasizes either self-reporting or physiological reactions. Physiological methods, however, are often expensive and require a laboratory setting. For these reasons, tool-based methods might provide alternatives, especially for collecting data about emotions.

Isomursu describes a case in which an experience-sampling-inspired mobile application was developed to capture users’ emotions in action, whereas previous methods had relied more on capturing the emotions before or after use [5]. Isomursu argues that this kind of approach provided better information about the emotions felt while using the product, as it was always present and provided a quicker way to gather data. An example of the interface used in Isomursu’s study is presented in Figure 1. Other tool-based methods are programs that collect statistics regarding the detailed use of an interface, for example, web analytics [8]. There are also models such as GOMS (Goals, Operators, Methods, and Selection rules) [8], which provide measurements of performance without actually involving users. Tool-based methods are not flawless; most of them still require a skilled analyst to make sense of the data.

Figure 1. The interface of an emotion-gathering application. The application would periodically prompt the user to input the current emotional state felt when using the target app.

5. Approaching usability issues

Selecting the methods to use may prove to be quite troublesome. Images of usability can provide a work-around for choosing usability evaluation methods, as some methods are more suitable than others for specific images of usability. For example, hedonic and perceived usability both place weight on the end user’s view of usability; for this reason, employing user-based or tool-based methods will provide insight into the problems that would cause the most harm to the user. Models such as GOMS would most likely be of little benefit for finding problems in this setting. Besides this, images of usability could also be employed to broaden the scope of evaluator-based methods to aspects that are usually associated with other methods. This should provide a means of balancing resources when the design is still at an early stage.

Images of usability are not, however, the best way to choose which methods to employ. Hartson points out that choosing between usability evaluation methods is often hindered by a general lack of understanding of the different methods’ capabilities and limitations [9]. For this reason, Hartson introduces criteria for evaluating the effectiveness of usability evaluation methods through the factors of thoroughness, validity, and reliability. Usability evaluation methods have not yet been thoroughly categorized against these criteria, so they should not be viewed as a bulletproof solution.
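
To make the criteria more concrete, the Python sketch below computes thoroughness and validity as ratios over sets of problems, following the general idea in Hartson’s criteria [9]. It assumes that the set of real problems is known, which in practice can only be approximated. The `overlap` function is merely an assumed stand-in for reliability, measuring how much two evaluators’ reports agree. All problem identifiers are made up.

```python
def thoroughness(reported, real):
    """Share of the real problems that the method found."""
    return len(reported & real) / len(real)


def validity(reported, real):
    """Share of the reported issues that are real problems."""
    return len(reported & real) / len(reported)


def overlap(reported_a, reported_b):
    """Jaccard overlap of two reports, an assumed proxy for reliability."""
    return len(reported_a & reported_b) / len(reported_a | reported_b)


# Hypothetical problem identifiers; "X1" and "X2" are false alarms.
real_problems = {"P1", "P2", "P3", "P4", "P5"}
evaluator_a = {"P1", "P2", "P3", "X1"}
evaluator_b = {"P2", "P3", "P4", "X2"}

t = thoroughness(evaluator_a, real_problems)
v = validity(evaluator_a, real_problems)
r = overlap(evaluator_a, evaluator_b)
print(t, v, round(r, 2))
```

In this made-up case, evaluator A finds three of the five real problems and raises one false alarm, while the two evaluators agree on only two of the six distinct issues they report between them.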

Which methods are eventually selected depends heavily on the goals of the product. The most favorable situation would be one where multiple methods could be used in different phases of the design, as no evaluation method provides final answers by itself, nor do multiple methods of the same type used together [7]. Instead, the methods should be chosen with a focus on spending resources such as time as efficiently as possible [3,4,7,8].


  1. Nielsen, J. Usability 101: Introduction to Usability. Jakob Nielsen’s Alertbox. Retrieved 2010-06-01.
  2. Suomen Standardoimisliitto SFS ry, Helsinki, Finland, and European Committee for Standardization, Brussels, Belgium. Ergonomics of human-system interaction - Part 210: Human-centred design for interactive systems (ISO 9241-210:2010). Helsinki: Suomen Standardoimisliitto SFS ry, Helsinki, Finland, and European Committee for Standardization, Brussels, Belgium, 2010.
  3. Hertzum, M. (2010) Images of usability. International Journal of Human–Computer Interaction, 26(6), pp. 567-600.
  4. Hornbæk, K. and Frøkjær, E. (2008) Making use of business goals in usability evaluation: An experiment with novice evaluators. In Proceedings of the twenty-sixth annual SIGCHI conference on Human factors in computing systems (CHI ‘08). ACM, New York, USA, pp. 903-912.
  5. Isomursu, M., Tähti, M., Väinämö, S. and Kuutti, K. (2007) Experimental evaluation of five methods for collecting emotions in field settings with mobile applications. International Journal of Human-Computer Studies, Vol. 65, No. 4, pp. 404-418.
  6. Lewis, C. and Rieman, J. (1994) Task-Centered User Interface Design. Chapter 4.
  7. Molich, R. and Dumas, J.S. (2008) Comparative usability evaluation (CUE-4). Behaviour & Information Technology, Vol. 27, No. 3, pp. 263-281.
  8. Hasana L. (2011) A comparison of usability evaluation methods for evaluating ecommerce websites.
  9. Hartson, H.R., Andre, T.S., and Williges, R.C. (2003) Criteria for evaluating usability evaluation methods. International Journal of Human-Computer Interaction, Vol. 13, No. 4, pp. 145-181.
  10. Nielsen, J. (1993) Usability Engineering.