Theory and Research in HCI: Back to the Drawing Board

Brian R Zaik

In their chapter of The Human-Computer Interaction Handbook, Welsh et al. bring up Fitts' Law, which helps HCI designers predict the perceptual-motor performance of users interacting with computer interfaces. Yet this and other insights were gained during the era of the two-dimensional, windowed graphical user interface. The future of interactive computing technology, as the authors note, seems especially bright when considering the kinds of haptic, augmented, virtual, and interactive interfaces currently under exploration at universities and in corporate labs all over the world. With such dramatic shifts in interface paradigms, Welsh et al. point out that we cannot continue to believe that the laws and theories of the past will readily apply to the future. I can only conclude that each new domain of HCI we explore will bring challenges specific to its paradigm, and that those challenges will likely require us to augment our existing knowledge of human capabilities with new strategies.

It is sometimes difficult to get past the novelty of an interface and recognize its faults when it is actually used by a human being. In fact, sometimes we never get it completely right. Alan Kay’s lecture from Apple in 1987 included a video of Sketchpad, a computer-aided drawing system first demonstrated in 1963. Watching the fluidity of the light pen’s interaction with the capable software in that video, my initial thought was, “Hey, this interface looks great!” But Kay quickly informs his audience that the light pen was a terrible input device, since the blood would drain out of the user’s fingers after only about 20 seconds of use (3). Examples like this remind us that no matter how interesting and intuitive an interface may appear at face value, we are ultimately restricted as interface designers by the mental and physical capabilities of our fragile minds and bodies. Unfortunately, new paradigms often bring with them a whole different set of challenges, and it is potentially dangerous to develop laws concerning human capabilities that are suited to only one interface paradigm.

Fitts' Law, as the four authors write, can be used as a predictive model of time to engage an object in an interface (4). They state that Fitts' Law is most useful when evaluating computer pointing devices, but should not be used as a crutch due to its limited usefulness with other types of interfaces. And they cite one type of interface for which laws like this just don't cut it: eye gazing interfaces.
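For reference, Fitts' Law is commonly written in the following form. This is the widely used Shannon formulation rather than a quotation from Welsh et al., and the symbols are my own labels: MT is the movement time, D is the distance to the target, W is the target's width along the axis of motion, and a and b are constants fitted empirically for a given device and user population.

```latex
MT = a + b \log_2\left(\frac{D}{W} + 1\right)
```

The logarithmic term is called the index of difficulty and is measured in bits. Fitts' original 1954 formulation used \log_2(2D/W) for that term; both versions predict longer movement times for farther or narrower targets.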

I wanted to peer a little deeper into this type of experimental interface, so I tracked down the thesis of Arne John Glenstrup and Theo Engell-Nielsen, two former students at the University of Copenhagen (1). In 1995, the two studied the implementation and practicality of eye gazing computer interfaces, and included a detailed analysis of the problems associated with such interfaces, as well as the potential opportunities afforded by this new realm of interaction. They cite two key problems with eye gazing interfaces that track users' eye movements for computer interaction:

The 'Midas Touch' problem: The user's eyes never leave the screen, so every glance risks being treated as a command, and thus it becomes necessary to use a "clutch" control to engage and disengage eye-gaze control.

The one-way zoom problem: In order to create a visual hierarchy for the user, it is necessary to have a zoom-out function that allows a user to zoom out from an object on which they have focused.

Would Fitts' Law (or the Hick-Hyman Law, or any other insight covered in Welsh et al., for that matter) ever allow us to identify and understand these types of problems? The two Copenhagen students realized that the aforementioned issues are truly problematic if eye gazing is used as a direct replacement for the mouse pointer, and we know that Fitts' Law is particularly applicable to mouse pointer interfaces. Eye gazing UI is one example where our current body of knowledge about human capabilities (and interests, and preferences for interface elements) is constrained by the types of hardware interfaces that were in use when that knowledge was gathered. This highlights the potential danger of conducting HCI research that relies too heavily on one particular type of UI paradigm. We can see here that, with eye gazing, the domain has changed so significantly from the old standard that many of the former insights into human capabilities fall short of predicting how humans will treat the interface. We need to develop new knowledge about this novel paradigm.

But sometimes we can simply extend what we already know to tackle new and distinct interface paradigms. I speak partially from my own experience when I discuss the difficulties of moving from one paradigm to another. I am a co-creator of the Concerto digital signage system (2), an open source advertising and communications medium that launched at RPI in 2008. We’ve delivered a fully functional digital signage package to the world, free of charge with openly distributed source code, and now our team of developers is exploring a whole set of new application areas for targeted digital advertising. Direct user interaction is an important and intriguing topic for digital signage in general. Until recently, most signage platforms, Concerto included, have simply displayed visual messages about events, services, and other information on television screens. Direct user interaction holds the promise of pulling users closer to the advertising they see, and it can also change the “dumb” broadcasting medium into a dynamic platform for providing targeted messages on demand, depending on how individual users interact with the medium.

With Concerto, gesture-based recognition and direct manipulation interfaces are both well worth exploring. These could extend Concerto’s capabilities to provide specific content on demand to users who stop in front of a Concerto unit and try to interact with it in certain ways. When people mostly use computers with mice and keyboards, the prospect of using hand gestures to virtually “flip” through screens of information seems foreign at first glance. Yet while these interfaces are quite different from a mouse and keyboard, the insight we have from Fitts' Law may still be relevant. The key conclusion of that law is that "movement time must increase as the distance of the movement increase and/or the width of the target decreases" (4). If we mount a webcam on a Concerto screen and connect it to computer vision software to track the user's hand movements in free space, we’ll be interested in considering how far the user would be expected to move her hands in order to successfully complete a “flipping” gesture that would slide one full-screen message out for another new one. With direct manipulation interfaces, such as a multi-touch kiosk that could provide contextual content on demand in response to the user pressing the screen at arm level, the relative sizes of target objects must be designed with Fitts' conclusions in mind. So while the interface paradigm is significantly distinct from a pointer-based interface, the insights of old can still be relevant. This is not to say that we are completely in the clear relying on old knowledge. Using hand gestures in front of a Concerto screen for an extended period of time may quickly fatigue the user, just as the Sketchpad light pen did. Only after entering this space and using physical gestures in such a way would we be able to accurately predict human limitations of that sort. So we need to augment our new insights into distinct interface domains with the knowledge that other HCI researchers before us have recorded.
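To make the kiosk case concrete, here is a minimal sketch of how Fitts' Law could be used to compare candidate target sizes. It assumes the Shannon formulation of the law, and everything specific in it is hypothetical: the function names, the reach distances, the button widths, and especially the coefficients a and b, which would have to be fitted from real measurements of people using the actual screen.

```python
import math


def index_of_difficulty(distance, width):
    """Index of difficulty in bits (Shannon formulation of Fitts' Law).

    distance and width must be in the same unit (here, centimeters).
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1)


def predicted_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds: MT = a + b * ID.

    a (seconds) and b (seconds per bit) are placeholder values chosen
    only for illustration; they are not measured Concerto data.
    """
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    # Hypothetical kiosk layouts: (label, reach distance in cm, button width in cm)
    layouts = [
        ("small button, long reach", 40, 2),
        ("large button, long reach", 40, 8),
        ("large button, short reach", 15, 8),
    ]
    for label, distance, width in layouts:
        bits = index_of_difficulty(distance, width)
        seconds = predicted_movement_time(distance, width)
        print(f"{label}: ID = {bits:.2f} bits, predicted MT = {seconds:.2f} s")
```

Even with made-up coefficients, the ordering of the results holds: a wider target or a shorter reach always lowers the predicted time. What a model like this cannot capture is the arm fatigue described above, which is exactly the kind of limit we would only discover by testing the interface with real users.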

Where do we go from here? It sure seems like we have a lot of work to do. But all is not lost: as we continue to explore new frontiers of HCI, we continue to build up a repository of insights into our own capabilities as human beings using a variety of different means of connecting with machines. It is our collective responsibility, as HCI adventurers in brave new worlds of computing, to study how human beings react to and interact with our new interfaces. Then, we must document what we find. It is clear that we will not be able to generalize many laws and observations of the past to virtual reality, augmented reality, three-dimensional, and gaze-tracking systems. We may, however, be able to piggy-back off of existing knowledge to gain insights that help us deal with new, daunting challenges. We are fortunate to be involved in such a fast-changing, applications-focused field, and we should do our part to keep advancing knowledge, so that we may build a more rounded and versatile view of how our physical and mental characteristics affect our use of machines.

FOOTNOTES:

1. Glenstrup, Arne J., and Theo Engell-Nielsen. "Eye Controlled Media: Present and Future State." Diss. University of Copenhagen, 1995. Denmark: DIKU, 1995. Datalogisk Institut. University of Copenhagen. Web. 12 Oct. 2009.

2. "Home - Concerto Digital Signage." Concerto Digital Signage. July 2009. Web. 12 Oct. 2009.

3. Kay, A. (1987). Doing with Images Makes Symbols. University Video Communications.

4. Welsh, T. N., Chua, R., Weeks, D. J., & Goodman, D. (2007). Perceptual-motor interaction: Some implications for HCI. In Sears, A. & Jacko, J. (Eds.). The Human-Computer Interaction Handbook: Fundamentals, Evolving Technologies and Emerging Applications, 2nd Edition. (pp. 27-42). Lawrence Erlbaum.

Tuesday, October 13, 2009
