Time for thinking aloud: On testing for usability vs. testing for learnability

Over at Andrew Dillon’s Infomatters blog, an important point is made that almost escaped my notice the first time: the crucial distinction between measuring usability and measuring learnability (scroll down to the comments):

Distinguish between usability and learnability. Most usability tests are short run tests of people’s initial reactions to a technology. This privileges the earliest phases of human reactions, the making sense of something new. This is important but it is not the full story. I have shown repeatedly in my own work that people’s later reactions take time to emerge and often run counter to their initial ones.

The more I think about it, this strikes me as important on multiple levels: first, for the stated reasons about what is actually being measured. It is a substantially different thing to make a claim about a client’s experience with a system based on initial exposure vs. after some familiarity has set in and the learning curve has mostly been overcome.

But I think there is a deeper issue here: this is the sort of distinction academics draw that can be mystifying and frustrating to busy, deadline-focused development teams. My sense is that when usability is “owned” by the people running the QA process, as a practical matter whatever metrics and measurements are supplied come from the engineering and tech side. QA testers are people with impressive knowledge of technology, coding, the severity and urgency of bugs, and so forth, but when a usability evaluator with an advanced degree and a behavioral-science focus joins a team from outside the organization, there can easily be a sort of culture clash. The business of getting code developed to a point of reasonable stability and tested for show-stopper-level bugs can be exhausting and stressful, which means there is a tendency to want to proceed at a pace that does not easily pause for the critiques of analytical eggheads. An academic type who makes seemingly subtle points about the difference between testing for usability vs. testing for learnability can come off like a being from another dimension, where there is time and money to measure and analyze everything under the sun. After a few meetings the developers may come to believe, sometimes correctly, that the PhD is cheerfully agnostic about the difficulties of writing and testing code, which is to say, making the products that earn revenue.

***

This sort of cultural disconnect makes me recall the ways that certain organizations that have hired me to do usability work passively resist the recommendations from an evaluation. The story behind the story is that the design team and developers get what they wanted: a checked box for “user centered” design testing. But actually getting the design and development process to slow down because of flaws discovered in the UI or in the evaluation planning is a complicated political matter. Many moons ago, I did an evaluation of the usability of one of the world’s most important websites for software development, IBM Developerworks, and found that getting any changes to the markedly flawed UI (better nowadays) was hopeless. It’s not that I lacked data: I had users telling me about problems with the site clutter and visual infoglut. But were any internal stakeholders about to give up a little bit of their site real estate for the good of user navigation? I was gently but firmly disabused of the notion that overall usability was the true mark of quality. Donald Norman has made a similar point in the press at least once or twice:

When I was at Apple, I tried to get them to switch to a two-button mouse. I suggested that by that time, everyone was familiar with the mouse, so the earlier objections would no longer apply. Microsoft had proved the virtue of a two-button mouse by using the right button to provide contextual information: menus and help. But the use of a single button was an important branding symbol for Apple and my efforts to change this went nowhere.

***

If the usability or human factors professional is a contractor, they may feel they have nothing to lose by being the bearer of bad news at a meeting…but then what? It is unfortunately all too common to reach a sort of dead end, where the data gets discussed on a conference call, no one escalates to upper management, and the result of the work is only that some talk goes around about re-opening the issue “next release.”

Now, I am often very upfront with my students and with industry people: don’t be intimidated by people with PhDs who set the standards for measuring the usability of interaction design. Academics in information design and usability can be like hammers looking for nails: they prefer sample sizes and methods comparable to the standards of academic publishing and experimental psychology or cognitive science, and they want to take their time and do it by the book. Moreover, they might very well not really want to be part of the dev team, or feel any need to understand that dev teams are often made to deal with customer requirements, business rules, and staffing that are an invitation to design flaws or worse: one team I was with had a jaundiced older fellow who told me the security holes in the network-and-server monitoring system we were testing were so big “you could drive the whole world through them.” In such an environment, quality assurance or usability can only do what it can; there isn’t time to slow down such progress as there is to be had, because the continued viability of what is actually a very ramshackle operation rests on the steely nerves of angel investors and VCs, and their willingness to live by the old adage “in for a penny, in for a pound.” I don’t regret working for such “crazy” orgs, which made for great teaching experiences in their own way, but I do think usability pros ought to be willing to adapt their methodological toolkit to the available resources, and not the other way around.

But…having said that, my sense is that sometimes academics do insist on seemingly subtle distinctions that can have real-world impact on customer satisfaction. Given the need to really get it right, it is better to bring in an outsider, maybe even one with a PhD, and give them the time, resources, and empowerment to focus the team’s attention on precisely such a difference as the one between measuring how easy an application or product is to learn to use vs. how easy it is to use once familiar. Over the long run, our clients and end-users will benefit from this sort of investment in analytical expertise.
