If you inspect a prototype or product and systematically check whether it adheres to a set of heuristics, you are conducting what is called a heuristic inspection or heuristic evaluation. It is a simple, effective, and inexpensive means of identifying problems and defects and is an excellent first technique to use before moving on to more costly and involved methods such as user observation sessions.
It is usually best when a heuristic evaluation is carried out by an experienced usability specialist, but heuristic evaluations can also be very effective when they are conducted by a team of individuals with diverse backgrounds (for example, domain experts, developers, and users).
To conduct a heuristic evaluation, you should choose several scenarios for various tasks that a user would perform. As you act out each of the steps of the task flows in the scenarios, consult the list of heuristics, and judge whether the interface conforms to each heuristic (if it is applicable).
Jakob Nielsen introduced the idea of heuristic evaluations, and his 1994 list of ten heuristics, reproduced below, is still the most commonly used set of heuristics today (Nielsen, 1994, p. 30):
Visibility of system status | “The system should always keep users informed about what is going on, through appropriate feedback within reasonable time.” |
Match between system and the real world | “The system should speak the users’ language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order.” |
User control and freedom | “Users often choose system functions by mistake and will need a clearly marked ‘emergency exit’ to leave the unwanted state without having to go through an extended dialog. Support undo and redo.” |
Consistency and standards | “Users should not have to wonder whether different words, situations, or actions mean the same thing. Follow platform conventions.” |
Error prevention | “Even better than a good error message is a careful design that prevents a problem from occurring in the first place.” |
Recognition rather than recall | “Make objects, actions, and options visible. The user should not have to remember information from one part of the dialog to another. Instructions for use of the system should be visible or easily retrievable whenever appropriate.” |
Flexibility and efficiency of use | “Accelerators — unseen by the novice user — may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. Allow users to tailor frequent actions.” |
Aesthetic and minimalist design | “Dialogs should not contain information that is irrelevant or rarely needed. Every extra unit of information in a dialog competes with the relevant units of information and diminishes their relative visibility.” |
Help users recognize, diagnose, and recover from errors | “Error messages should be expressed in plain language (no codes), precisely indicate the problem, and constructively suggest a solution.” |
Help and documentation | “Even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. Any such information should be easy to search, focused on the user’s task, list concrete steps to be carried out, and not be too large.” |
An obvious weakness of the heuristic inspection technique is that the inspectors are usually not the actual users. Biases, pre-existing knowledge, incorrect assumptions about how users go about tasks, and the skill or lack of skill of the inspectors are all factors that can skew the results of a heuristic inspection.
Heuristic inspections can also be combined with standards inspections or checklist inspections, where you inspect the interface and verify that it conforms to documents such as style guides, platform standards guides, or specific checklists devised by your project team. This can help ensure conformity and consistency throughout your application.
Some examples of the type of data that you can collect through analytics include:
Websites and web apps are well suited to logging and tracking user activities. Many web analytics packages and services can provide additional contextual data such as the user’s geographic location, whether they have visited the site before, and what search terms were used to find the site if the user visited via a search engine.
Desktop and mobile apps can also collect usage data, but because of privacy concerns and regulations, it is important to declare to the user what data you intend to collect, and you must gain the user’s permission before transmitting any usage data.
No matter what type of product you offer, privacy concerns are important and you must ensure that your practices and Terms of Service follow the legal regulations appropriate for your jurisdiction. Tracking abstract usage data such as button presses is generally acceptable, but it is usually considered unacceptable to pry into content the user creates with the product.
The KLM-GOMS model, the Keystroke-Level Model for the Goals, Operators, Methods, and Selection Rules analysis approach (Card, Moran, and Newell, 1983), is one analysis technique based on this idea, but instead of assigning scores representing effort, it estimates the time required to do each action. The amount of time it takes to complete a task is a good proxy for physical effort, although it does not accurately measure the intensity of mental effort.
Let’s take a very condensed tour of the KLM-GOMS approach.
To accomplish a goal, the user will break the work into tasks, and for each task unit, the user will take a moment to construct a mental representation and choose a strategy or method for carrying out the task. This preparation time is called the “task acquisition” time, and can be very short — perhaps 1 to 3 seconds — for routine tasks, or much longer, perhaps even extending into several minutes, for creative design and composition tasks.
After the task acquisition, the user carries out the task by means of a sequence of actions or operations. The total time taken to carry out the actions is called the “task execution” time. Thus the total time required to complete a task is the sum of the task acquisition and task execution times.
To estimate the task execution time, KLM-GOMS defines basic operations (we assume here that we are dealing with a keyboard-and-mouse system):
Operator | Operation | Description | Suggested average values |
K | Keystroking | Pressing a key or mouse button, including the Shift key and other modifier keys | Best typist: 0.08 sec; good typist: 0.12 sec; average typist: 0.20 sec; worst typist: 1.20 sec |
P | Pointing | Moving the mouse pointer to a target on the screen | 1.1 sec |
H | Homing | Moving a hand from the keyboard to the mouse or vice-versa | 0.40 sec |
M | Mental operation | Preparation | 1.35 sec |
R | System response operation | Time taken for the system to respond | varies |
So to use the mouse to click on a button, we would have a sequence of operations encoded as “HPK”: homing, to move the hand to the mouse; pointing, to point the mouse cursor to the button; and a keystroke, representing the pressing of the mouse button.
In addition to these operators, the KLM-GOMS model also includes a set of heuristic rules governing how the “M” operation, the mental operation, is to be inserted into an encoded sequence. For instance, “M” operations should be placed before most “K” and “P” operations, except for various special cases. So the “HPK” sequence discussed above would become “HMPK”. The rules are fairly arcane and we won’t go into the details here.
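To make the encoding idea concrete, here is a minimal Python sketch that adds up the operator times for an encoded sequence. The times come from the table above; the function name and the decision to reject the R operator (whose duration varies) are our own assumptions for illustration, not part of the KLM-GOMS model itself.

```python
# Average operator times in seconds, taken from the table above.
OPERATOR_TIMES = {
    "P": 1.1,   # pointing with the mouse
    "H": 0.40,  # homing between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def estimate_execution_time(encoding, keystroke_time=0.20):
    """Sum the operator times for an encoded sequence such as "HMPK".

    keystroke_time defaults to the table's value for an average typist.
    The R (system response) operator is rejected because its duration
    varies from system to system.
    """
    total = 0.0
    for operator in encoding:
        if operator == "K":
            total += keystroke_time
        elif operator in OPERATOR_TIMES:
            total += OPERATOR_TIMES[operator]
        else:
            raise ValueError(f"unsupported operator: {operator!r}")
    return total


# Clicking a button with the mouse: 0.40 + 1.35 + 1.1 + 0.20
print(round(estimate_execution_time("HMPK"), 2))
```

With the average-typist value for K, the “HMPK” sequence comes to about 3.05 seconds.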
As an example, let’s consider the task of finding instances of a search term in a document in a text editor. One possible sequence of actions to accomplish this might be:
This can be encoded using KLM-GOMS and used to formulate an estimate of the average time required as follows:
Action/Operation | Encoding | Time (s) |
Task acquisition | (none) | 1.5 |
Click on the “Search” menu | H[mouse] | 0.40 |
 | MP[“Search” menu] | 1.35 + 1.1 |
 | K[“Search” menu] | 0.20 |
Click on the “Find Text” item | MP[“Find Text” item] | 1.35 + 1.1 |
 | K[“Find Text” item] | 0.20 |
 | H[keyboard] | 0.40 |
Enter “puppy” as the search term | 5K[p u p p y] | 5(0.20) |
Click on the “OK” button | H[mouse] | 0.40 |
 | MP[OK button] | 1.35 + 1.1 |
 | K[OK button] | 0.20 |
Total | | 11.65 |
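The arithmetic in this table can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the H[keyboard] operation is grouped with the “Find Text” step simply because that is where it appears in the table.

```python
# Operator times in seconds, from the KLM-GOMS operator table.
K, P, H, M = 0.20, 1.1, 0.40, 1.35  # K is the "average typist" value
TASK_ACQUISITION = 1.5


def find_text_steps(k=K):
    """Return (action, seconds) pairs for the mouse-driven find task."""
    return [
        ("Task acquisition", TASK_ACQUISITION),
        ("Click on the Search menu", H + (M + P) + k),
        ("Click on the Find Text item", (M + P) + k + H),
        ("Enter 'puppy' as the search term", 5 * k),
        ("Click on the OK button", H + (M + P) + k),
    ]


def total_time(steps):
    """Add up the step times, rounded to hundredths of a second."""
    return round(sum(seconds for _, seconds in steps), 2)


for action, seconds in find_text_steps():
    print(f"{action:<34} {seconds:5.2f}")
print(f"{'Total':<34} {total_time(find_text_steps()):5.2f}")
```

Passing a different keystroke time, for example `find_text_steps(k=0.08)` for the table's “best typist” value, shows how much of the estimate depends on typing speed: the same sequence then comes to 10.69 seconds instead of 11.65.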
Of course, we would expect a more skilled user to be able to accomplish the same task in substantially less time by using shortcut keystrokes rather than the mouse, and by typing faster than an “average” user.
There are obviously limitations to this kind of analysis: it provides only a rough estimate, it assumes that users know the right sequence of actions to complete a task, and it does not account for errors and mistakes. But when you are designing an interface and weighing different ways to design an interaction, methods such as the KLM-GOMS model give you a way to compare the efficiency of the alternatives. All other things being equal, the alternative that can be done in the least amount of time is the most convenient for the user, and may involve the least cognitive load.
To create an ergonomically sound software application, it is important to first think about the properties and the context of use of the hardware device on which the application will run. For the majority of consumer and business applications, there are currently three main forms of general-purpose personal computing devices:
For more specialized applications, you might have a combination of software and custom-designed, special-purpose hardware. Examples include a machine that sells subway tickets, an automated teller machine, or an industrial thermostat control. If you are a designer for such a product, you may have responsibility for designing the form of the physical interface in addition to the software.
To give you an idea of some of the practical ergonomic aspects that you should keep in mind when designing for different devices, let’s compare desktop computers with touchscreen tablets:
When designing a product, understanding the constraints and limitations, as well as the opportunities, of the hardware devices the software will run on will help you design appropriate and comfortable interactions.
Let’s examine search systems and look at what factors you have to consider when designing one.
In a search scenario, the user enters a search query, and in response, the system retrieves matching items from a repository.
Alternatively, some people prefer to think of searching as a filtering mechanism: The user chooses filter criteria, and the system filters out any items that do not match those criteria.
When designing a search system, you need to think about and decide on the following:
Are there multiple fields, checkboxes, and drop-down lists that act as filter criteria?
If it is a textual search, does it search for an exact phrase match? Is it case sensitive?
Will you provide “basic” and “advanced” search interfaces to cater to different user audiences?
If you support features such as wildcards, regular expressions, and boolean operators (AND, OR, and NOT), how will you communicate to the user that these features are available, and where will you explain the proper syntax? For example, Google’s Advanced Search page offers many options like these, and it simultaneously explains the syntax for the shortcuts:
If there are many results, are the results broken up across multiple pages?
If the search results are textual documents, is a small snippet of the text surrounding the match presented to provide context? What if there are multiple matches within the same document?
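As a rough illustration of how a few of these decisions show up in code, the following Python sketch finds every match of a term in a piece of text, with a switch for case sensitivity and a snippet of surrounding context for each match. The function name `find_snippets` and its `context` parameter are hypothetical, and a real search system would more likely rely on a dedicated search engine or index.

```python
import re


def find_snippets(text, term, case_sensitive=False, context=15):
    """Return one snippet of surrounding text for each match of `term`.

    The term is matched as an exact phrase; re.escape ensures that any
    regular-expression metacharacters in it are treated literally.
    """
    if not term:
        return []
    flags = 0 if case_sensitive else re.IGNORECASE
    snippets = []
    for match in re.finditer(re.escape(term), text, flags):
        start = max(0, match.start() - context)
        end = match.end() + context
        snippets.append(text[start:end])
    return snippets


document = "The puppy barked. Another Puppy slept."
for snippet in find_snippets(document, "puppy", context=8):
    print(f"...{snippet}...")
```

Because multiple matches within the same document each produce their own snippet, the design question of how many snippets to show per document still has to be answered separately.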
The perceived quality of search results has a big impact on the user experience. Search quality is a function of the following aspects:
A specialized form of search functionality is the lookup function associated with some data-entry fields. Some fields have constraints on what values are valid, but there are too many valid values to make a drop-down list practical.
For example, a customer number field might allow the user to enter a customer number directly, but it is rare that users will have memorized the number for a particular customer, and there may be tens of thousands of customers on file. In this case, the field should provide a lookup button (or a shortcut keystroke) that allows the user to search for a customer by name. Upon selection of a customer, the customer number field is then populated with the corresponding customer number.
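The lookup behavior described here can be sketched in Python as follows. The customer data, the function name, and the substring-matching rule are all assumptions made for illustration; a real system would query its customer database instead.

```python
# Hypothetical sample data: customer number -> customer name.
CUSTOMERS = {
    "10482": "Alvarez, Maria",
    "20917": "Chen, Wei",
    "31150": "Chen, Lucy",
}


def lookup_customers(name_fragment, customers=CUSTOMERS):
    """Return (number, name) pairs whose name contains the fragment.

    Matching ignores case, and results are sorted by name so that the
    user can scan the list before selecting a customer.
    """
    fragment = name_fragment.strip().lower()
    if not fragment:
        return []
    matches = [
        (number, name)
        for number, name in customers.items()
        if fragment in name.lower()
    ]
    return sorted(matches, key=lambda pair: pair[1])


# The user types part of a name; selecting a result would then
# populate the customer number field with the first element of the pair.
for number, name in lookup_customers("chen"):
    print(number, name)
```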
In document-based applications like word processors and web browsers, users will expect to be able to find all of the instances of a search term within the current document. (The generally accepted terminology in English-language software is to “search” to locate instances of a term within a repository of documents, and to “find” to locate instances of a term within an individual document.)
For editable documents, the ability to replace instances of the search term with another term will also be expected.
Some applications will highlight all instances of a search term within a document:
While searching is convenient, in most applications, it should not be the only means of navigation.
In many systems, there will be some items that are accessed frequently. You might offer shortcut links, for example, to provide rapid access to the most popular items, the most recently added items, or the user’s most recently accessed items. Allowing the user to bookmark locations or search results may be useful as well.
We can also imagine cases where the user might prefer to browse the contents of the repository. For example:
Hierarchical menus, keyword indices, and sitemaps can be useful strategies for allowing users to browse the repository and discover content.
Some systems might benefit from allowing users to tag items with keywords. Browsing the list of keywords then becomes another way of getting an overview of the repository contents and accessing items.
Some applications can take advantage of metaphors that simulate real-world situations. For example, a website for a bookstore or library might allow users to view the covers of books in various categories, providing an experience similar to browsing titles on a physical bookshelf.
If your application has a set of similar tasks, you will first want to create a list to keep track of them.
You can then design an interaction framework that describes the commonalities of the user interface and behavior for those tasks.
Some of the issues you should consider include:
Designing an interaction framework helps ensure that you really understand how your application fundamentally works. It ensures consistency across similar tasks, which helps users perceive patterns and form correct mental models.
Documenting the commonalities among the tasks in an interaction framework also saves you from having to re-document the same aspects for each individual task. The interaction framework will also be critical for helping the development team design and build the technical “platform” on which the various tasks can be implemented.
Take this page for example:
The banner is the highest element in the hierarchy of this page. The banner and logo tell the viewer that everything on the page is associated with the site named in the banner.
The navigation bar on the left-hand side of the page comes second in the visual hierarchy.
The main content panel’s heading, “Events Calendar”, which describes the contents that follow, forms the third element in the visual hierarchy.
The two subheadings are subordinate to the main heading, so they come next in the visual hierarchy.
Finally, the sections of body text are subordinate to their respective headings. These come last in the visual hierarchy.
When scanning the page, the viewer’s eye will tend to look first at the banner, then move to the navigation sidebar, then the main heading. While the viewer may read the content under the main heading from top to bottom, it is likely that the viewer’s eye will be caught by the subheadings first, and then the viewer’s eye may go back to read the body text.
The visual hierarchy helps the viewer interpret the content on the page in a logical way. Let’s now take a closer look at how to create a visual hierarchy and express relationships between different elements on the page. Our main tools for achieving this are the use of similar or contrasting visual attributes of elements, and the relative positioning of elements.
Attributes are the general properties of things on the page. Common attributes are size, shape, color, texture, and direction. For text, attributes include the typeface, weight, spacing, and decorations such as italicization or underlining. In other words, attributes are ways to style a visual element.
Visual elements that are similar or belong to the same category should share the same attributes, whereas elements that are intended to be different should have one or more contrasting attributes. If one element is intended to be stronger or more important than the other element, then the attributes should be chosen to reflect that.
For example, if you have a list or a menu, then all of the entries belong to the same class or category of elements, and so they should be styled consistently with the same attributes. But the heading that sits atop the list serves a different function. It describes or summarizes the contents of the list or menu, and so it should be styled with contrasting attributes that emphasize its dominance. The heading might be larger or bolder, or it may take a different typeface or color.
Contrast is weak when the elements being contrasted are only slightly different. When two elements differ only slightly, it can often look like the difference was made by accident. Strong contrast is produced when the differences are clearly intentional. To create intentional contrast between two elements, the general guideline is to make sure the elements differ in at least two ways; that is, at least two attributes should be different between the elements.
For the purposes of this guideline, surrounding space is often considered to be an attribute as well, so leaving a gap of whitespace between two elements can count as one of the differences.
The following diagram shows some examples of weak contrast and strong contrast between a heading and a list of items:
In example (1), there are no differences between the heading “Commodities” and the entries in the list, so it does not look like a heading at all.
Example (2) is better — the heading is in bold type — but the difference still does not stand out strongly.
Example (3) places a gap between the heading and the list. While this is also better than (1), it is still not satisfying, as the heading is set in the same type as the list entries.
Examples (4) through (6) illustrate how using two differences produces much stronger visual contrast. Example (4) uses a gap and sets the heading in bold type. Example (5) sets the heading in bold type and uses indented bullets to offset the list from the heading. Example (6) increases the size of the heading’s font and sets the heading in a different color.
The latter three examples communicate the relationship between the heading and the list entries much more effectively than do the first three examples.
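The “at least two differences” guideline is simple enough to express as a check. The Python sketch below models the six examples as pairs of style dictionaries and counts the attributes on which the heading and the list entries differ. The attribute names and the concrete values (font size 16, the color “navy”) are invented for illustration, and treating the gap and the indentation as attributes follows the whitespace convention mentioned above.

```python
BASE = {"weight": "normal", "size": 12, "color": "black",
        "gap": False, "indent": False}


def style(**overrides):
    """Return the base style with the given attributes changed."""
    return {**BASE, **overrides}


def differing_attributes(heading, item):
    """Return the names of the attributes on which two styles differ."""
    names = set(heading) | set(item)
    return {name for name in names if heading.get(name) != item.get(name)}


def has_strong_contrast(heading, item, minimum=2):
    """Apply the guideline: at least `minimum` attributes must differ."""
    return len(differing_attributes(heading, item)) >= minimum


# (heading style, list entry style) for examples (1) through (6).
EXAMPLES = {
    1: (style(), style()),
    2: (style(weight="bold"), style()),
    3: (style(gap=True), style()),
    4: (style(weight="bold", gap=True), style()),
    5: (style(weight="bold"), style(indent=True)),
    6: (style(size=16, color="navy"), style()),
}

for number, (heading, item) in EXAMPLES.items():
    verdict = "strong" if has_strong_contrast(heading, item) else "weak"
    print(number, sorted(differing_attributes(heading, item)), verdict)
```

A count like this says nothing about how large each difference is, so it is a sanity check on a style guide rather than a substitute for looking at the result.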
Positioning
In the English-speaking world, and in other left-to-right languages, we read from left to right and from top to bottom. What is at the top of the page is considered to be more important than what is at the bottom of the page, and to a lesser extent, things on the left in a row of things are perceived to come first. (In right-to-left languages like Arabic and Hebrew, the horizontal direction is reversed.)
Thus, the top-left corner of the page is where the eye begins when scanning the page, and so the most important element in the visual hierarchy is usually placed there.
If we have two visual elements A and B, we should ensure that A is positioned either above, or to the left of, element B, when we want to show that:
As an example, let’s look at a poor design that I’ve encountered. One system had a screen for editing customer details that looked roughly like this:
Users were expected to enter a value in the Customer Number field and then click Retrieve. The other fields on the screen would then be populated with the data on file for that particular customer.
The above design is poor because the relationship between the customer number and the remaining fields is not communicated by the visual design.
The data on this screen is dependent on the customer number, because the customer number is the identifying piece of information, or key, for a customer record. If the user enters a new customer number and clicks Retrieve, new data for the new customer number will be presented.
But because the user will start reading the screen from the top left, the user might assume that the last name and first name are identifying the customer record. Additionally, the fact that the user is expected to locate the Customer Number field first is troubling; it is buried deep in the screen, and there are no visual cues that it is the most important element upon which the others are dependent. If it is the identifying field upon which the other fields depend, then it should be situated in a place that better communicates its importance: the upper left, where the user begins scanning the screen.
And the fact that the user has to jump from the Customer Number field up to the Retrieve button is poor design as well. There are no cues that this is how the interaction flow is supposed to work; because we read from left to right and from top to bottom, jumping from below to above is counterintuitive. The button should be moved so that there is a left-to-right or top-to-bottom flow from the Customer Number field to the Retrieve button.
Thus, one possibility for an improved layout might be something like the following:
In this design, it is clearer that the details are dependent upon the chosen customer number. There is a left-to-right flow from the Customer Number field to the Retrieve button, and there is top-to-bottom flow that leads towards the finalizing Save and Cancel buttons.
Practical aspects of visual hierarchy for user interface design
While you may not necessarily explicitly design a visual hierarchy when creating a page composition, an awareness of the general concept of the visual hierarchy and an understanding of how relationships between elements can be expressed can help you produce better designs.
In large project teams, you can try to ensure some degree of visual design consistency throughout your product by creating a style guide that defines the general look-and-feel of the interface in terms of a visual hierarchy. Writing a style guide is not always easy; it’s not always possible to completely document everything that makes up a consistent set of visual designs. But by specifying rules for the styles and positioning of headings and other visual elements, and by providing page layout templates and examples, a style guide can help communicate your design intentions to your project team.
To help us understand what’s involved in interaction design for persistence and transactions, let’s first look at how data is saved in two typical classes of applications: document-oriented desktop apps, and multi-user web and client-server apps.
Document-oriented desktop applications
For document-oriented desktop applications, like word processors and spreadsheets, documents are usually saved as individual files on a disk.
When a user is working with such an application, there is a copy of the document stored in the working memory of the user’s computer. As the user edits the document, the copy in working memory is modified, and so it will no longer match the copy on disk. Saving the document updates the copy on disk to match the copy in working memory. If the user makes changes to the document and then closes the document or closes the application without saving, the changes will be lost.
Most people with computing experience are familiar with this model. You can indicate that your application uses this model by following standard conventions (which can vary between platforms). There should be “Save” and “Save As…” commands under the File menu, and (especially on Windows) there may be a “Save” icon in the toolbar. On Mac OS X, a black dot appears in the red “Close Window” button whenever unsaved changes are present, and this dot disappears after the document has been saved. On Windows, some applications place an asterisk next to the document title in the window’s title bar when unsaved changes are present.
Some usability specialists argue that the need to know about the separation between working memory and persistent storage is a “leaky abstraction”: an underlying aspect of the technology that is exposed to the user, creating an unnecessary mental burden. The Canon Cat was a unique word processing system in the 1980s that hid the distinction between working memory and persistent storage. No “Save” command was offered because the system automatically synchronized all changes with the copy on disk. The popular word processing application Scrivener similarly saves all changes automatically every few seconds, meaning that users never have to worry about explicitly saving their work. Deviating from the conventional way of doing things can initially confuse users, though, and so Scrivener still offers a “Save” command in the File menu for convenience, even though it’s never really necessary.
Web and client-server applications
For most web applications and client-server applications, data is usually stored in a database (which in turn stores the data in files on a disk). Database systems allow many different users to access the same data simultaneously.
In applications that are backed by a database, when a user creates, edits, or deletes data in the system, these changes can be accumulated in units called transactions. If other users of the system retrieve data from the database while the first user’s transaction is still in progress, the other users will not see these changes. But when the software issues a “commit” command for the first user’s transaction, the transaction is ended, and the user’s accumulated pending changes are saved “permanently” to the database so that other users of the system can see the user’s changes.
If instead a “rollback” command is issued, the transaction is also ended, but all of the pending changes for that user’s transaction are cancelled, and the database is not updated; other users see no changes in the data in the database.
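To see commit and rollback in action, here is a small Python sketch using the standard library’s `sqlite3` module, with two connections standing in for two users. The table layout, file name, and function names are assumptions made for the example, and SQLite is used only because it needs no server; the database behind a real multi-user application may behave differently in its details.

```python
import os
import sqlite3
import tempfile


def address_seen_by(connection):
    """Read the address on file as it currently appears to one user."""
    rows = connection.execute(
        "SELECT address FROM customer WHERE id = 1").fetchall()
    return rows[0][0]


def demo_transaction_visibility():
    """Show what each user sees before a commit and after a rollback."""
    seen = {}
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "customers.db")
        user_a = sqlite3.connect(path)
        user_b = sqlite3.connect(path)
        try:
            user_a.execute(
                "CREATE TABLE customer (id INTEGER PRIMARY KEY, address TEXT)")
            user_a.execute(
                "INSERT INTO customer VALUES (1, '123 Main Street')")
            user_a.commit()

            # User A edits the address; sqlite3 starts a transaction
            # implicitly, and the change stays pending until commit.
            user_a.execute(
                "UPDATE customer SET address = '456 First Ave.' WHERE id = 1")
            seen["b_before_commit"] = address_seen_by(user_b)

            # Rolling back cancels the pending change.
            user_a.rollback()
            seen["a_after_rollback"] = address_seen_by(user_a)

            # This time the change is committed, so User B can see it.
            user_a.execute(
                "UPDATE customer SET address = '456 First Ave.' WHERE id = 1")
            user_a.commit()
            seen["b_after_commit"] = address_seen_by(user_b)
        finally:
            user_a.close()
            user_b.close()
    return seen


for moment, address in demo_transaction_visibility().items():
    print(f"{moment}: {address}")
```

User B keeps seeing “123 Main Street” while User A’s change is pending, User A sees the original address again after the rollback, and only the committed change becomes visible to User B.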
Most applications hide the technical concepts of transactions, commits, and rollbacks from the user. This hiding is done by aligning the start and end of transactions with places in the user interface where various events or task flows start and end. Terminology is also used that is more familiar to the user.
For example, we can imagine that when a dialog box such as a Properties dialog is opened, a transaction will be started. If the user closes the dialog or presses the “Cancel” button, then the transaction will be rolled back and any changes the user had made in the dialog will be lost. If the user presses the “OK” button, the user’s changes in the dialog will be committed to the database.
For multi-user systems, you also need to think about what happens when two users try to edit the same information records simultaneously.
Imagine a situation where two users are attempting to make changes to the address information on file for a particular customer. The original address on file is “123 Main Street”. User A opens the address record and starts changing “123 Main Street” to “456 First Ave.”, while seconds later, User B opens the same address record and starts changing “123 Main Street” to “789 Second Ave.” If User A presses the “OK” button to save the changes, and then User B presses the “OK” button shortly afterwards, what happens to the data on file? There are a couple of possibilities:
Neither of these is particularly satisfying, as both users will think that their changes have been saved, but one user will have had their changes overwritten or lost without their knowledge.
One solution to this issue is to use some form of record locking: When User A opens the customer record, the system locks that record, so that if User B attempts to open the same record, she receives a message that the record is locked and unavailable for editing. When User A commits or rolls back his changes, then the lock is removed and the other users can edit the record again. One problem with locks is that if User A leaves his terminal and goes home, or if User A’s application or operating system crashes, the lock might be “stuck” in place for a long time, requiring an administrator to intervene so that other users can edit the record again.
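The locking behavior described here can be sketched as a small in-memory registry. The class and method names are hypothetical, and a real multi-user system would keep its locks in the database or another shared service rather than in one process’s memory; the point is only to show who is allowed to edit, and where an administrator’s override fits in.

```python
class RecordLocks:
    """In-memory registry of which user holds the lock on each record."""

    def __init__(self):
        self._holders = {}  # record id -> name of the user holding the lock

    def acquire(self, record_id, user):
        """Lock the record for `user`; return False if someone else holds it."""
        holder = self._holders.setdefault(record_id, user)
        return holder == user

    def release(self, record_id, user):
        """Remove the lock after a commit or rollback by its holder."""
        if self._holders.get(record_id) == user:
            del self._holders[record_id]
            return True
        return False

    def force_release(self, record_id):
        """Administrator override for a lock that is stuck in place."""
        return self._holders.pop(record_id, None)

    def holder(self, record_id):
        """Return the current lock holder, or None if the record is free."""
        return self._holders.get(record_id)


locks = RecordLocks()
print(locks.acquire("customer-1001", "User A"))  # User A opens the record
print(locks.acquire("customer-1001", "User B"))  # User B is turned away
locks.release("customer-1001", "User A")         # User A commits or rolls back
print(locks.acquire("customer-1001", "User B"))  # now User B can edit
```

When the second `acquire` call returns `False`, the interface can use `holder` to tell User B who is currently editing the record, which is usually more helpful than a bare “record locked” message.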
In many applications, it makes sense to allow multiple users to view the same record simultaneously, but only one user at a time should be able to edit the data. This raises the question of whether users who are viewing a record should be notified when the data they are viewing has been changed by another user. If there is no notification and if the display is not automatically refreshed, the user will be looking at “stale” data that was at one time correct, but no longer matches the current state of the database, and this may or may not be a problem depending on the nature of the application.
Collaborative web-based applications where users work together on editing the same document can present many challenges like these, and it can take some creative thinking to find usable and non-intrusive solutions to avoid or manage simultaneous editing conflicts.
We’ve seen that an application’s learnability and usability can be impacted by how it handles the persistence of data and manages multi-user editing conflicts, and how the persistence model is presented via the user interface and interaction design. Therefore, explicitly designing how these aspects will work from a user’s standpoint is a good idea for applications of significant complexity.
Questions you need to ask and eventually make decisions about include:
You’ll often need to clarify some of these questions with the technical architects and developers in your project, as the technology framework being used can often dictate how some of these aspects will have to work. At the same time, just because the technology requires things to be done in a certain way does not necessarily mean that you have to expose all of the details to the user; technical details can be hidden. Whenever possible, create the design that is clearest and easiest for the user, and then build the system to support that way of working.
Designing how navigation and wayfinding work is a key aspect of your application’s information architecture. Let’s find out what you need to consider to design navigation effectively.
In this post, we’ll use “places” as a general term to refer to locations or containers that can present content and controls, because these can have different names depending on the context. For instance:
“Places” could also refer to divisions such as panels, tabs, and subsections within a page, screen, or window.
Identification of places using names or titles
When your application has multiple places, each place should be clearly labelled with a title, so that the user can determine what place she is currently looking at. Assigning each place a title then allows you to refer to it in menus and in instructions and help text.
Examples of titles include a “Find and Replace” dialog, a “Departure Schedule” page on an airport website, or a “Level 5” in a video game. Some places might have user-assigned names; for instance, in many document-oriented applications, the title of a window or tab will be the filename of the document being displayed there.
Some guidelines on naming are given in this blog post.
Depending on the type of application, you may need to think about some of the following questions:
You may want to sketch a navigation map to keep track of how users can move from one place to another. For instance, a simple game may have several screens that are accessible from a main menu screen:
Navigation maps aren’t relevant or useful for all types of application, however, and there are many cases where drawing a complete map is impractical or impossible. In a wiki application, for instance, users can create new pages and interlink them however they like.
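Where a map is practical, it can also be recorded as data rather than only as a sketch, which makes simple checks possible. The Python example below stores a navigation map as a dictionary and checks which places can be reached from a starting place and which places offer no way out. The screen names are invented for illustration and are not taken from any particular game.

```python
from collections import deque

# Hypothetical navigation map: place -> places reachable in one step.
NAVIGATION_MAP = {
    "Main Menu": ["New Game", "High Scores", "Settings", "Help"],
    "New Game": ["Main Menu"],
    "High Scores": ["Main Menu"],
    "Settings": ["Main Menu"],
    "Help": ["Main Menu"],
}


def reachable_places(navigation_map, start):
    """Return every place the user can get to from `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        place = queue.popleft()
        for destination in navigation_map.get(place, []):
            if destination not in seen:
                seen.add(destination)
                queue.append(destination)
    return seen


def dead_ends(navigation_map):
    """Return the places that have no outgoing navigation at all."""
    places = set(navigation_map)
    for destinations in navigation_map.values():
        places.update(destinations)
    return sorted(place for place in places if not navigation_map.get(place))


print(sorted(reachable_places(NAVIGATION_MAP, "Main Menu")))
print(dead_ends(NAVIGATION_MAP))
```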
The following list explains some of the means by which the user might be able to navigate between different places. Most applications will use some combination of several of these.
The more complex the navigation is and the more places there are in your application, the more important it is to clearly show the user where she currently is. Some ways you can indicate the current location are:
For websites that organize content in a hierarchical fashion, the “breadcrumb” technique is simultaneously a location indicator and a means of navigation. As shown in the following example from Amazon.com, the breadcrumb lists the categories in the hierarchy that you would need to navigate through in order to get to the final destination, and each of these categories is a clickable link:
Because usability problems tend to emerge when key issues like navigation, workflow, and transactions haven’t been thought all the way through, it is worthwhile spending time on thinking about these issues and explicitly designing and evaluating solutions.
Not everything can or should be decided at the very beginning; preliminary decisions will often have to be made which can then be changed or refined as the project goes on. However, some fundamental things are very hard to change once product development is in full swing, and so these things — what we might call the architectural design — should receive special attention early on in the project.
It is possible to describe an interaction concept in a formal specification document, and for large teams building large products, this can be a reasonable approach for recording and communicating the design to team members. But creating a formal deliverable is not the point, nor is it the only way to document and communicate design decisions. Writing a specification is good if it forces the team to think through and decide on key issues, but a formal document can be difficult to write because many things, like the visual design, are very difficult to specify completely and accurately in writing.
A more agile approach that tends to be more successful is prototyping, in combination with some minimal, agile documentation. Prototyping can be a very effective way of trying out different design ideas and getting feedback through peer reviews and usability testing. A prototype alone cannot capture and communicate all of the design decisions and rationale, though, so lightweight written records can be used to supplement it.
What kinds of issues make up the interaction concept?
We’ll examine each of these in upcoming blog posts.