Wednesday, September 23, 2009

Again, I experienced some trouble trying to get this to post...

"Three Laws of a Popular Visionary"

Before I was accepted into Rensselaer Polytechnic Institute's Technical Communication program, I would not even have known what the acronym "HCI" stood for. One would have been hard-pressed to divine any information about the field of human-computer interaction from me. My mind was a blank regarding this subject -- or so I thought. If the same person had asked me about the prophetic implications of science fiction writing on the field of modern science, the floodgates of information would have burst forth and inundated her mental landscape with a host of ideas derived from the imaginations of so many authors. Little did I know that these prophetic visions would delve deeply into reality.

Popular culture is something that social scientists and ethnographers have long studied to derive the understanding that a mass audience has regarding several different mental realms. Not surprisingly, science is one such realm. Figures -- media darlings such as Carl Sagan and Michio Kaku -- have brought about a popular notion of what science is. Their ideas bring forth a popular assertion that science is a part of life and will not fall away so long as the public imagination remains captured. Furthermore, their credentials allow the public to believe what they say, because, after all, they are experts in their fields. But something strange also permeates the public imagination more ubiquitously than scientists themselves. While the Sagans and Kakus of the world bring forth "real" science, the imagination is truly touched by the fictional author. Arthur C. Clarke, Robert Heinlein, Frank Herbert, and their ilk all contribute to the speculative realm of popular culture, a realm that in turn impacts science itself. Not surprisingly, when you ask the "common man or woman" what they see in the future, something from science-fiction often emerges.

Within the domains of science and fiction, there is contention. However, the world was blessed on January 2, 1920, with the birth of the truest exemplar of both science and fiction, Isaac Asimov. Born in Soviet Russia, Asimov grasped the public image of what it means to be both a legitimate scientist and a legitimate fictional prophet more fully than any other single figure. His pioneering fiction brought new, radical ideas into the common understanding. He hybridized science and fiction into something digestible by the average person and respected by the advanced scientist.

Discussion of Asimov's life often centers on the prolific authorship that helped to spread the importance of science, and rightfully so. Having written or edited nearly 500 books, and with over 90,000 pieces of correspondence to his name, he was indeed something of a literary fanatic. He was also always a scientist at heart. Having graduated from Columbia University in 1939, he earned his Ph.D. in Chemistry from the same institution in 1948. In the decades that followed, he would hone his craft toward humanistic science fiction that brought the genre out of the "pulp-magazine" domain and into the domain of widely respected fiction. Oddly, he existed in a void, separated from 20th-century literature. New York Times reporter Mervyn Rothstein, in his April 7, 1992 obituary of the author, quotes Asimov on this point:

"I make no effort to write poetically or in a high literary style," he said in 1984. "I try only to write clearly and I have the very good fortune to think clearly so that the writing comes out as I think, in satisfactory shape. I never read Hemingway or Fitzgerald or Joyce or Kafka," he once wrote. "To this day I am a stranger to 20th-century fiction and poetry, and I have no doubt that it shows in my writing."

Regardless of this separation from the works of contemporary authors, Asimov's brilliance prolongs a legacy that continues to impact thinkers to this day.

One area that Asimov's writing has influenced is human-computer interaction (the aforementioned "HCI"). The statements of his that would most affect this field are his "Three Laws of Robotics," first stated explicitly in his 1942 story "Runaround" and widely read in the 1950 collection I, Robot. These laws are:

1. A robot may not injure a human being, or, through inaction, allow a human being to come to harm, unless this would violate a higher order law.
2. A robot must obey orders given it by human beings, except where such orders would conflict with a higher order law.
3. A robot must protect its own existence as long as such protection does not conflict with a higher order law.
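The phrase "unless this would violate a higher order law" amounts to a strict priority ordering, and that ordering can be sketched in a few lines of code. What follows is only an illustration under loud assumptions: the `Action` class, its three boolean flags, and the `choose` function are all hypothetical names invented here, and a real robot would face the far harder problem of deciding what counts as "harm" in the first place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A candidate action, pre-labelled with which laws it would break (hypothetical)."""
    name: str
    harms_human: bool = False      # includes harm allowed through inaction
    disobeys_order: bool = False
    endangers_robot: bool = False


# The laws in priority order: earlier entries outrank later ones.
LAWS = [
    ("First Law", lambda a: a.harms_human),
    ("Second Law", lambda a: a.disobeys_order),
    ("Third Law", lambda a: a.endangers_robot),
]


def violation_profile(action):
    """Return a tuple of booleans, one per law, in priority order."""
    return tuple(check(action) for _, check in LAWS)


def choose(actions):
    """Pick the action whose violations are least serious.

    Tuples compare lexicographically and False sorts before True, so an
    action that breaks only a lower law always beats one that breaks a
    higher law.
    """
    return min(actions, key=violation_profile)


options = [
    Action("stand by while a human is harmed", harms_human=True),
    Action("retreat against orders", disobeys_order=True),
    Action("shield the human", endangers_robot=True),
]
print(choose(options).name)
```

Here the robot shields the human: risking itself breaks only the lowest-ranked law, while the alternatives break a higher one.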

It was, in fact, Asimov himself who coined the term "robotics," which appears in his 1942 story "Runaround."

Robots, and all automata, have been part of the popular imagination for hundreds of years. Mary Shelley's Frankenstein; or, The Modern Prometheus brings forward a horrific account of the dangers of automating and granting independence to machines and machine-like creations. Automata and clockwork technology existed before Shelley's masterpiece. "During the twentieth century, several new technologies moved automata into the utilitarian realm...[the] clay model, water clock, golem, homunculus, android, and cyborg [all] culminated in the contemporary concept of the robot," writes Roger Clarke of the Australian National University. He goes on to define the robot in more precise terms, as something that must "exhibit three key elements":

1. Programmability ("a robot is a computer")
2. Mechanical Capability ("a robot is a machine")
3. Flexibility

Clarke concludes that "we can conceive of a robot, therefore, as either a computer enhanced machine or as a computer with sophisticated input/output devices." He goes on to note that the distinction between "robot" and "computer" is becoming increasingly arbitrary, a notion that suggests that Asimov's optimistic visions of robotics will become more relevant as time progresses. Clarke does note, however, that "Many facets of Asimov’s fiction are clearly inapplicable to real information technology or too far in the future to be relevant to contemporary applications." Yet he continues,

"Some matters, however, deserve our consideration. For example, Asimov’s fiction could help us assess the practicability of embedding some appropriate set of general laws into robotic designs. Alternatively, the substantive content of the laws could be used as a set of guidelines to be applied during the conception, design, development, testing, implementation, use, and maintenance of robotic systems."

Applications of Asimov's laws are easy to find. Consider, for example, the implementation of a standardized computer-robotic interface across several cultural borders. Clarke asserts that such technology requires a flexibility that breaks from Asimov's rigidity: it must be able to adapt to its context.

Questions of what is and what ought to be follow as well. "The greater a technology's potential to promote change, the more carefully a society should consider the desirability of each application," writes Clarke.

Ultimately, one of Clarke's strongest points following Asimov's laws is that of the implementation of a code of ethics that must surround the interaction between humans and computer or computer-like interfaces. Information technology professionals must definitively examine the ethical implications of their creations and their interfaces. Humans will not always be accepting of robots, computers, and the like, and to properly allow greater ease among users, ethical lines must be drawn that afford people the chance to feel comfortable that their interface does not violate tacit codes of conduct and understanding. Maybe the rigidity of Asimov's laws will help guide engineers toward such an understanding.

Jean-Luc Doumont, a senior member of the Institute of Electrical and Electronics Engineers, agrees with this notion. He writes:

"The search for fundamental laws, unfortunately, has seldom, if ever, been applied to professional communication. Most how-to books on the subject seem content with long lists of phenomenological principles. Useful as each of these might be, a long list of them will always be hard to assimilate, at least without some perception of a simpler underlying logic."

Doumont feels that, using this rigidity, one can effectively communicate ideas through both personal and impersonal laws. He defines a set of "Three Laws of Professional Communication" that echo Asimov's robotic laws. From adapting to an audience to reducing noise to using effective redundancy, Doumont's laws may differ greatly from Asimov's (in terms of scope and applicability), but carry with them the same inspiration.

In a similar circumstance, Dror G. Feitelson uses Asimov's Three Laws of Robotics to derive three laws that apply to software itself, another fundamental element of HCI. Feitelson asserts in his reformulation of Asimov's first law that human production (be it text or the like) is the sacred element that must not come to harm. However, current trends in programming allow software to discard this record. Despite this, he continues to state that the data and the experience must be held as sacred as well, suggesting that if the software disrupts this contract, it will be impossible for the user to successfully employ the software. "These changes," he writes regarding ideas and programs that will protect the user's experience, "might seem small to the developers, who spend considerable time considering the best ways to perform various tasks, but they can be devastating to users who just want to perform their work with as little hassle as possible."
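To make the first of these software laws a little more concrete, here is one common way a program can avoid harming a user's work while saving it: write the new version to a temporary file, then swap it into place only once the write has succeeded. This is a minimal sketch of a general technique, not something taken from Feitelson's article; the function name `save_atomically` is an assumption made for illustration.

```python
import os
import tempfile


def save_atomically(path, text):
    """Replace the file at `path` with `text` without ever leaving it half-written.

    The new contents go to a temporary file in the same directory; only
    after they are safely on disk is the temporary file renamed over the
    original. If anything fails, the user's previous version is untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before the swap
        os.replace(tmp_path, path)     # swap the finished file into place
    except BaseException:
        # Clean up the temporary file and leave the original as it was.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The design choice is the point: the old copy of the user's work is never opened for writing, so a crash mid-save cannot destroy it.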

Feitelson's second law parallels Asimov's as well. "Software must obey orders given by its users." He explains how it must function "according to its specs." Likewise, his third law suggests that software, like the robot, must protect its own existence. It should not crash or regularly malfunction, but should continue to work properly, even if the user lacks the know-how to function gracefully within its parameters. Feitelson acknowledges that his adaptation of Asimov's laws is a "trade-off" for designers, but one that will undoubtedly increase the performance and enhance the experience that users face. "It’s time for software developers to be more accountable for their products and to remember that their software is there to serve its users -- just like Asimov’s robots," he writes.

In the end, Asimov is not responsible for an HCI revolution. He did not see himself as anything more than a humble humanist with a few good words to share. However, unlike so many others, his ideas, especially his Three Laws of Robotics, have been a launch pad for other innovators, essayists, and thinkers to move into a new realm of understanding regarding HCI. Without his rigid laws, there would be neither those laws we could emulate nor those laws which we could break so effectively.

WORKS REFERRED

Clarke, Roger. "Asimov's Laws of Robotics: Implications for Information Technology (Part 1)." Computer. Dec 1993: 55. IEEE Xplore. Web. 22 Sept, 2009.

Clarke, Roger. "Asimov's Laws of Robotics: Implications for Information Technology (Part 2)." Computer. Jan 1994: 62. IEEE Xplore. Web. 22 Sept, 2009.

Doumont, Jean-Luc. "Three Laws of Professional Communication." IEEE Transactions on Professional Communication. Dec 2002: 291. IEEE Xplore. Web. 22 Sept, 2009.

Feitelson, Dror G. "Asimov's Laws of Robotics Applied to Software." IEEE Software. July/Aug 2007: 112. IEEE Xplore. Web. 22 Sept, 2009.

"Isaac Asimov Biography and Notes." 2009. Web. 22 Sept, 2009.

"Robotics Research Group." University of Texas at Austin. Web. 22 Sept, 2009.

Rothstein, Mervyn. "Isaac Asimov, Whose Thoughts and Books Traveled the Universe, is Dead at 72." 7 April, 1992. Web. 22 Sept, 2009.

For a footnoted version of the essay (in which the works will be directly cited within the text), send me a private message.

Theodor Holm Nelson: “Most people are fools, most authority is malignant, God does not exist, and everything is wrong”

by David F. Bello




Ted Nelson’s biggest effect on the universe of computing is through his disdain for the limitation of specific cultural models on its development. Through his fifty-year-long failure to achieve widespread adoption of his own model of hypertext, Project Xanadu, he bitterly decries the restrictive qualities of technology as we use it, and longingly pines for the breaking down of conventional modes of metaphorical representation.

In other words, it doesn’t have to be this way; it can be any way. We must utilize the best way.


When I read Geeks Bearing Gifts last year, it revealed to me a revisionist perspective on computer history. Nelson is to the history of digital systems what Howard Zinn is to American history.

The cover of the book is a mugshot of Bill Gates with the subtitle, “How the computer world got this way.”

Is there a “computer world” we can talk about so generally, and if so, what “way” is this computer world? Why does Nelson see this as a bad thing, and not merely the grand, consistent meta-narrative of functional technology? Finally, what does he propose to do in order to change it?


A “COMPUTER WORLD”?


Yes, a “computer world.” In proposing ZigZag, an alternate “computer world,” Nelson speaks of a cosmology, or “a particular account or system of the universe and its laws” (via, requires university access). In this sense, the “world” of computing is the cultural conception of what computing is, how and why it works, and overarching ideas about what it can and cannot do or be. Can this be more general? Probably not; and that is Nelson’s point: everything we take to be concrete, immutable, inevitable, permanent, and inflexible about computing is only a single branch of potential ways of being! As he puts it:

I propose to go back to the beginning and branch in another direction. Let's go back to 1945, say, when anything seemed possible, and when conventional wisdom could be challenged more easily because it had only been made up a few months before, perhaps by another 20-year-old in the same lab. The evolving conventions of those days involved lump files with short names, hierarchical directories for storing them, programs as long strings, databases as tables stored in files. (These would later be packaged into "folders" and "applications".) But now imagine that a few guys had snuck off to another lab and created a whole new system of conventions, completely different, for everything (via).

THE WAY OF THE WORLD


The computing world is based on one principal system of conventions -- the simulation of hierarchy and the simulation of paper (via).

Because computation has the relatively unique capacity to interact with information recursively, a recursive system of data representation was adopted in the hierarchical file system. Folders can have folders inside of them, but what are they other than lumps of files? Nelson desires a more fluid and multidimensional model for storing information, which he attempted to instantiate in Project Xanadu, but it could not compete against the boundaries to which the traditional computing industry (Microsoft, Apple, Xerox, Adobe, etc.) ideologically adheres.
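For a rough feel of what "multidimensional" storage could mean, consider a sketch in which a piece of information is a cell that can sit in several independent sequences at once, rather than in exactly one folder. This is loosely inspired by public descriptions of ZigZag and is in no way Nelson's code; the `Cell` class, the `connect` and `walk` functions, and the dimension names "time" and "topic" are all hypothetical.

```python
class Cell:
    """A unit of content that can be linked along any number of named dimensions."""

    def __init__(self, content):
        self.content = content
        self.pos = {}  # dimension name -> next cell along that dimension
        self.neg = {}  # dimension name -> previous cell along that dimension


def connect(a, b, dim):
    """Link a -> b along `dim`; each cell gets at most one neighbour per direction."""
    if dim in a.pos or dim in b.neg:
        raise ValueError(f"cell already connected along {dim!r}")
    a.pos[dim] = b
    b.neg[dim] = a


def walk(start, dim):
    """Collect contents by following `dim` forward from `start`."""
    contents, seen, cell = [], set(), start
    while cell is not None and id(cell) not in seen:  # guard against loops
        seen.add(id(cell))
        contents.append(cell.content)
        cell = cell.pos.get(dim)
    return contents


notes, draft, letter = Cell("notes"), Cell("draft"), Cell("letter")
bibliography = Cell("bibliography")

connect(notes, draft, "time")          # the order in which things were written
connect(draft, letter, "time")
connect(draft, bibliography, "topic")  # a separate, unrelated ordering

print(walk(notes, "time"))
print(walk(draft, "topic"))
```

The draft belongs to both sequences at once, and neither is "the" location of the draft. In a folder tree, one of those relationships would have to be chosen as the file's home and the other faked with a shortcut.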

(On a somewhat tangential sidenote, in his book Gödel, Escher, Bach: An Eternal Golden Braid, Douglas Hofstadter describes human thought in terms of “strange loops” and “tangled hierarchies” which do not conform to typical representations of data structures and interpretation. If we were to eventually create a computational model of human cognition, it would most certainly not be confined to a strictly hierarchical model.)

Whether you are happy with hierarchy being imposed on your day-to-day computer life (along with the idea that it is imposed on nearly all computer use as well, in some form or another) or not, Nelson’s primary point is that it should not be imposed at all. There should be multiple branches of computer history, starting back at that point in the development of traditional file system architecture. So, when we talk about current “filesystems” in common data storage, we are really talking about multiple instances of the filesystem.

This idea is more apparent in Nelson’s discussion of the PARC user interface (or PUI) in the video at the start of this essay. Every graphical user interface used in PCs today utilizes the desktop metaphor, some kind of taskbar, shortcuts, a buffer folder of files ready to be deleted (such as the unmodifiable “trash” or “recycle bin” folders), a metaphorical “clipboard,” a two-dimensional plane where the mouse can “act,” “windows,” and software applications. However, there are many other ways of envisioning a graphical user interface. These are simply the conventions adopted over the course of computer history. These conventions make up, in part, what he calls the cosmology of computing, along with the simulation of paper.


THE WAY OF PAPER


One of Jakob Nielsen’s heuristics of user interface design is as follows:

Match between system and the real world. The system should speak the users' language, with words, phrases and concepts familiar to the user, rather than system-oriented terms. Follow real-world conventions, making information appear in a natural and logical order (via).

Nelson would see this as a grave restriction on the potential for data representation models. Of course, the learning curve for a system built this way is much shorter, because the user already knows the real-world system that its digital metaphors map onto. However, by exclusively adopting this cosmology of hierarchy and paper simulation, we lose the chance to use digital representation to develop a practically unlimited set of computational modelling systems. Metaphors make computing simpler to understand, but when they are adhered to at a fundamental level, they make it difficult to innovate outside of their bounds.

One simple failure of the way we simulate paper using computers comes from the idea of “cutting” a piece of text and “pasting” it somewhere else among text. Nelson describes the failure in Geeks Bearing Gifts:

The term "cut and paste", as used by writers and editors for many years, refers to rearrangement of paper manuscripts by actually cutting them and physically rearranging them on desktop or floor. It is a process of parallel contemplation and rearrangement, where you look at all the parts, move pieces around, put them in temporary nonsequential arrangements, and choose a final sequence. In 1984 this changed: when the Macintosh comes out, they change the meaning of "cut" to *hide* and the meaning of "paste" to *plug*. To be fair, many people, not just the PARCies, imagine that the serial process of hiding and plugging contents is THE SAME as the parallel process of rearrangement. This original parallel rearrangement process is fundamental to writing and revision, especially in large-scale projects. Because no good rearrangement software exists, it is far harder to organize large projects on the computer than it was with typewriters, before the PUI. The organization of large projects has become far more difficult, unless you print out and physically cut and paste the old way (via).

So, we are forced to communicate our ideas in only two dimensions. Unless...


(Assuming you didn’t want to sit through the ten minute video above, Nelson demonstrates an alternate cosmology for data representation that acts in multiple dimensions of relation.) The two-dimensionality of common computer use as it is today exists due to the WYSIWYG paradigm of early document design software.


THE FAILURE OF HYPERTEXT


Nelson’s book, Geeks Bearing Gifts, tells the history of what he calls the “paperdigm,” or the conventional adherence to paper-simulation, as having come out of the PARC user interface. He claims that the reason they stuck to a two-dimensional model of design for documents came from the fact that they were funded by Xerox, who wanted to sell printers. Wanting to sell as many printers as they could, they forced the first and most influential user interface designers to incorporate printability as a fundamental characteristic of document creation. A cynical view of why our computing capabilities are so limited, but nonetheless entirely believable.

To look outside of this paradigm, we must consider the idea of hypertext, a term Nelson himself coined in 1963. If text is two-dimensional, hypertext is three-dimensional in that there are links to other documents existing in parallel. However, Nelson is incredibly disappointed with how hypertext, as he envisioned it, has been misused as a building block of HTML and the World Wide Web.

HTML is precisely what we were trying to PREVENT-- ever-breaking links, links going outward only, quotes you can't follow to their origins, no version management, no rights management (via).
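The complaint about "links going outward only" is easy to see in miniature. On the Web, a link lives inside the page that points, so the page being pointed at has no record of it. A sketch of the alternative is to keep links in a store of their own, where either end can look them up. This is purely an illustration and not how Xanadu was designed; the `LinkStore` class and the document names are hypothetical.

```python
from collections import defaultdict


class LinkStore:
    """Keeps links outside the documents, indexed from both ends."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # source -> targets
        self._incoming = defaultdict(set)  # target -> sources

    def link(self, source, target):
        """Record one link so that both endpoints can discover it."""
        self._outgoing[source].add(target)
        self._incoming[target].add(source)

    def links_from(self, document):
        """Where does this document point? (All that HTML can answer.)"""
        return sorted(self._outgoing.get(document, ()))

    def links_to(self, document):
        """Who points at this document? (What a one-way link cannot answer.)"""
        return sorted(self._incoming.get(document, ()))


store = LinkStore()
store.link("essay", "original-speech")
store.link("review", "original-speech")

print(store.links_from("essay"))
print(store.links_to("original-speech"))
```

With both directions recorded, a quoted source can list everything that quotes it, which is the kind of traceability the passage above says HTML gave up.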

In addition to these complaints, the WWW also imposes hierarchy in a number of ways: “the exposed hierarchies of the directories and pages; internal hierarchical markup (HTML, XML); and Cascading Style Sheets” (via). So, the beast that he created corrals him within the same adopted standards from which he was attempting to break away. Nelson’s bitterness is perhaps his most important characteristic.

A DIFFERENT COMPUTER UNIVERSE


Nelson's hatred of conventional structure made him difficult to educate. Bored and disgusted by school, he once plotted to stab his seventh-grade teacher with a sharpened screwdriver, but lost his nerve at the last minute and instead walked out of the classroom, never to return. On his long walk home, he came up with the four maxims that have guided his life: most people are fools, most authority is malignant, God does not exist, and everything is wrong. Nelson loves these maxims and repeats them often. They lead him to sympathize, in every discussion, with the rejected idea and the discounted option (via).

Nelson’s direct impact on the history of human-computer interaction involves the development of hypertext as a part of the WWW and its general use, but Nelson is displeased with its application. He tried to impact computer history with Xanadu and ZigZag, but neither has been widely adopted. His primary impact on the way we deal with and understand computing is not through specific implementation of key ideas in the past or present development of software or hardware systems. No, the most important result of his decades-long career is found in his dissenting attitude and optimistic proclamations of alternative cosmologies. Steve Wozniak says that he “was one of the most inspirational figures in the home computer revolution” (via). Wendy Hall, president of the ACM, predicts that Geeks Bearing Gifts will “be one of the most important, and potentially one of the most influential, computing books ever” (also via). Stewart Brand also provides an advertising blurb for Nelson’s work alongside these most notable names in the development of computing.

As part of an internship I had as an undergrad, I attended a conference where Ted Nelson spoke. It was the first time I had encountered his ideas, outside of knowing him for having coined the term “hypertext.” In describing the ideas I’ve outlined above, Nelson blew my metaphorical mind when it came to computing. I assume he is able to do the same for many others. To realize the incredible, unending potential for digital representation and interaction was to forever become a proponent of far-reaching innovation without any restriction at any level.

Norm Cox

COMM 6480 Historical Figure Assignment

Fall Semester, 2009

Ann Marie McCarthy

For this assignment, I’d like to tell a story that is at once personal and historical. It’s a story about the beginning of graphical user interfaces and guess what—I’m in the story.

It was the mid-‘80s, and I was working in the third largest advertising agency in Boston as an artist. A lowlier position there has never been. We were poorly paid, and worked to the bone. Determined to rise above my circumstances, I watched the recruitment ads that we produced for the Boston Globe. This was a time before Monster.com, when jobs were advertised in a section of the Sunday newspaper. Our agency produced those ads.

My opportunity arrived through an ad for a graphic designer at a mini-computer company called Data General. Digital Equipment Corporation is the famous Massachusetts mini-computer company. As was the way in those days, DEC had spawned a stepchild—the product of an embittered engineer who left DEC and started up his own company a few towns away. That company was Data General, and, for a while, it was a fierce competitor to Digital.

The competition, however, was no longer about manufacturing mini-computers. As I discovered during the interview process, the competition was now inside the computer. There was this new thing, I was told, “on screen graphics”, and that’s where Data General wanted to beat DEC and Wang and the other mini-computer makers. Problem was, they were a bunch of engineers. Simply stated, they couldn’t draw. They needed someone who could draw to put a graphical face on their proprietary suite of office products.

It looked better than continuing as an artist, so I left my X-Acto knife behind and joined the Industrial Design Group at Data General. In between designing keyboard templates and nameplates for mini-computers, my job was to “infiltrate” the software engineering organization and persuade them that the Industrial Design Group could provide the services they needed to create on-screen graphics.

Much of the early innovation in HCI was done in the mini-computer companies. In a few short years, we made incredible leaps as a profession. Working with Human Factors Engineers who had been hired originally to study the ergonomics of keyboards and flicker rates on CRTs, we developed a practice around graphical user interfaces, user-friendly design, and usability.

History moves quickly. The original Macintosh, introduced in 1984, had some commercial success. IBM entered the fray with personal computers and the clones followed. So many personal computers were in use that software was developed to run on all of them—with MS Windows in the lead. The mini-computer and its proprietary software were going the way of the dinosaur.

Employees were cast off and scattered to startups. In each of those new companies, the same naïve question was raised: “Does anybody know someone who can draw?”

Soon I was moonlighting. It was while on one of these gigs that I first came to know the historical figure that continues to have a fundamental impact on both the field and me. I had been asked to provide an expert review of a graphical interface for a relational database management system software product. The human factors engineer who hired me gave me a report that had been written by Norm Cox, another designer who had done some work for her previously but was not available at this time. She gave me his phone number and said, “Give him a call if you want to talk.”

Norm Cox was the visual designer for the Xerox “Star” in the late ‘70s. The Star was the first product to include a graphical user interface, as we know it. Norm created the original bitmaps for the Star’s icons. To the best of my knowledge, Norm was the first designer to draw the document icon with the corner turned down, the folder icon, the printer icon, and so many others. Stop for a minute and consider the impact those bitmaps have had on our lives and our work. Norm’s background was in art/design/architecture. He’s described those first attempts at using early paint programs as being like trying to draw with a rock on the end of your pencil. And yet he was able to distill the essence of the desktop metaphor into 16x16 pixels in renderings so simple and clear that they have truly become iconic in the broadest sense of the word.

I had the good fortune to follow in Norm’s footsteps. I studied his report. I talked with him about the work I continued to do for our shared client. I confess that my first reports mimicked his very closely. He provided a model for the way to work.

When I reflect on that early work, I am struck by how different it was from the way I work today. The chief differences are thoroughness of thought, sole ownership of ideas, and care of communication.

The pace of work twenty years ago was much slower. It was not unusual to spend time learning a user interface. Research was conducted. Domain experts were interviewed. Specifications were read. Documentation was available. Products were learned and used. There was time to truly understand a software system. After a designer had a clear understanding of how a system worked, its strengths and weaknesses, there was time to think. Ideas were iterated, often using pencil and paper. More questions were asked and problems investigated. All of this learning and thinking was synthesized into a new design that advanced the previous design by solving problems and extending useful designs. A thoroughness of thought was allowed.

In contrast, a designer’s work today is often rushed and compartmentalized. The “knowledge base” of our profession has grown enormously. Instead of relying on basic tenets of design and communication, we employ rules of thumb and patterns. If it’s more than so many clicks, it fails. If it adheres to a standard, it is acceptable. We “check” for usability instead of knowing a product and thinking about the design.

I’m currently designing a study to provide baseline usability metrics for a large software system so that the usability of the current version can be measured against future versions. I am able to use very little of the software. I don’t know how. I seek out experts who may be able to perform the six tasks that make up the scenario. They are able to perform one task but are stumped by the next. The explanations are shocking. “Well, I worked on the team that developed this feature but that feature was developed up in Montreal. I don’t know how it works.” And yet, the user of the product moves from one task to the next in the same workflow.

That first report that I received from Norm Cox stood out because Norm solely owned it. There was a cohesiveness to work that is not always present today. Collaboration is valued more highly than solo design efforts. Teamwork is prized for its efficiency. Groups form and storm and norm. Ideas are brought to consensus. Although much work is accomplished in this way, the unique point of view of any one designer is lost. A single vision that may form is blurred. Design is homogenized.

Lastly, the report that Norm wrote was communication designed. It was many pages. It had a beginning, middle and end. It made sense and was easy to read. Fonts were carefully chosen. Pages were laid out. It was illustrated with pixel drawings that were embedded in the text so that you saw what he was talking about. Great care was taken in the way that his thoughts were communicated. It was a report to be printed and read more than once. The information density was high. It clearly articulated the way the product was designed, and where that design failed and succeeded. It recommended an improved design that could be executed. It represented a moment in time that was captured.

There’s only one word to describe communication of today: PowerPoint. Slides provide talking points. Stories are one-liners. Getting it done is what it’s all about. So much action is taken with so little thought or understanding.

In the last few decades, software design has matured. Graphical user interfaces have become the norm, and may become passé as voice, gesture and text interfaces become more usable. Mobile devices, ubiquitous computing, and computing in a cloud now overshadow the personal computer that eclipsed the mini-computer. The profession of design and usability has become institutionalized, no longer populated with talent borrowed from the worlds of writing, graphic arts, and architecture. What used to seem like the wild west of design now seems downright suburban.

Were Norm Cox and his pixel drawings a beginning or an end? Is thoroughness of thought and careful communication antiquated? Do synapses fire that much more quickly now? Have teams producing work replaced the sole designer creating ideas? Are bulleted lists enough to act on? Debate will continue. But the influence of Norm Cox on this designer is treasured.

Terry Winograd's Shift from AI to HCI

Introduction
In a more recent paper [1], Terry Winograd discusses the gulf between Artificial Intelligence and Human-Computer Interaction. He mentions that AI is primarily concerned with replicating the human, or the human mind, whereas HCI is primarily concerned with augmenting human capabilities. One question is whether or not we should use AI as a metaphor when constructing human interfaces to computers. Using the AI metaphor, the goal is for the user to attribute human characteristics to the interface and communicate with it just as if it were a human. There is also a divide in how researchers attempt to understand people. The first approach, which Winograd refers to as the “rationalistic” approach, attempts to model humans as cognitive machines within the workings of the computer. In contrast, the second approach, the “design” approach, focuses on modeling the interactions between a person and the enveloping environment.

During his career, Winograd shifted interests and crossed the gulf between AI and HCI. In his paper, he mentions that he started his career in the AI field, then rejected the AI approach, and subsequently ended up moving to the field of HCI. He writes “I have seen this as a battle between two competing philosophies of what is most effective to do with computers”. This paper looks at some of the work Winograd has done, and illustrates his shift between the two areas.

Winograd's Ph.D. Thesis and SHRDLU
In his Ph.D. thesis entitled “Procedures as a Representation for Data in a Computer Program for Understanding Natural Language” [2], Winograd describes a software system called SHRDLU that is capable of carrying on an English conversation with its user. The system contains a simulation of a robotic arm that can rearrange colored blocks within its environment. The user enters into discourse with the system and can instruct the arm to pick up, drop, and move objects. A “heuristic understander” is used by the software to infer what each command sentence means. Linguistic information about the sentence, information from other parts of the discourse, and general information are used to interpret the commands. Furthermore, the software asks for clarification from the user if it cannot understand what the inputted sentence means.
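To make the idea of interpreting commands against knowledge of the world more concrete, here is a minimal sketch in Python. It is not Winograd's implementation (SHRDLU was far more sophisticated); the world layout, the vocabulary, and the names `make_world`, `find_objects`, and `interpret` are all hypothetical, chosen only to illustrate how a program might use what it knows about the blocks to either carry out a command or ask for clarification.

```python
import re

# Hypothetical vocabulary for this toy example only.
COLORS = {"red", "green"}
SHAPES = {"block", "pyramid"}


def make_world():
    """Build a tiny blocks world: every object starts on the table."""
    return {
        "b1": {"color": "red", "shape": "block", "at": "table"},
        "b2": {"color": "green", "shape": "block", "at": "table"},
        "b3": {"color": "red", "shape": "pyramid", "at": "table"},
    }


def find_objects(world, words):
    """Return ids of objects that match the color/shape words in a command."""
    color = next((w for w in words if w in COLORS), None)
    shape = next((w for w in words if w in SHAPES), None)
    return sorted(
        oid for oid, obj in world.items()
        if (color is None or obj["color"] == color)
        and (shape is None or obj["shape"] == shape)
    )


def interpret(world, sentence):
    """Carry out a 'pick up' command, or ask for clarification if ambiguous."""
    words = re.findall(r"[a-z]+", sentence.lower())
    if "pick" not in words:
        return "I don't understand."
    matches = find_objects(world, words)
    if not matches:
        return "I can't find such an object."
    if len(matches) > 1:
        # The description fits several objects, so ask instead of guessing.
        options = ", ".join(
            f"the {world[m]['color']} {world[m]['shape']}" for m in matches
        )
        return f"Which one do you mean: {options}?"
    if any(obj["at"] == "arm" for obj in world.values()):
        # World knowledge: the simulated arm holds one object at a time.
        return "I am already holding something."
    obj = world[matches[0]]
    obj["at"] = "arm"
    return f"OK, I picked up the {obj['color']} {obj['shape']}."


world = make_world()
print(interpret(world, "Pick up the red thing"))
print(interpret(world, "Pick up the green block"))
print(interpret(world, "Pick up the pyramid"))
```

The first command is ambiguous in this world (two red objects), so the sketch asks which one is meant; the second succeeds; the third is refused because the arm is already occupied. Even this toy version shows that the response depends on knowledge of the scene, not just on the grammar of the sentence.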

The thesis examines the issue of talking with computers. Winograd underscores the idea that it is hard for computers and humans to communicate, since computers communicate in their own terms; the means of communication is not natural for the human user. More importantly, computers aren't able to use reasoning in an attempt to understand ambiguity in natural language. Computers are typically only supplied with syntactical rules and do not use semantic knowledge to understand meaning. To solve this problem, Winograd suggests giving computers the ability to use more knowledge. Computers need to have knowledge of the subject they are discussing, and they must be able to assemble facts in such a way that they can understand a sentence and respond to it. In SHRDLU, knowledge is represented in a structured manner, in a language that facilitates teaching the system about new subject domains.

SHRDLU is a rationalistic attempt to model how the human mind works. It seeks to replicate human understanding of natural language. Although this work is grounded in AI, there are clear implications for work in HCI. Interfaces that communicate naturally with their users are very familiar and have little to no learning curve. Donald Norman provides several examples of natural interfaces in his book “The Design of Future Things” [3]. One example that stands out is a tea kettle whistle. The tea kettle whistle offers natural communication that water is boiling. The user does not need to translate the sound of a tea kettle whistle from system terms into something he/she understands; it already naturally offers the affordance that the water is ready.

Thinking machines: Can there be? Are We?
In “Thinking Machines” [4], Winograd aligns his prospects for artificial intelligence with those of AI critics. The critics argue that a thinking machine is a contradiction in terms. “Computers with their cold logic, can never be creative or insightful or possess real judgement”. He asserts that the philosophy that has guided artificial intelligence research lacks depth and is a “patchwork” of rationalism and logical empiricism. The technology used in conducting artificial intelligence research is not to blame; it is the underpinnings and basic tenets that require scrutiny.

Winograd supports his argument by identifying some fundamental problems inherent in AI. He discusses gaps of anticipation: in any realistically sized domain, it is nearly impossible to think of all situations and all combinations of events arising from those situations. The hope is that the body of knowledge built into the cognitive agent will be broad enough to contain the relevant general knowledge needed for success. In most cases, the body of knowledge contributed by the human element is still required, since it cannot be modeled exhaustively within the system. He also writes on the blindness of representation, which concerns language and its interpretation. As expounded upon in his Ph.D. thesis, natural language processing goes far beyond grammatical and syntactic rules. The ambiguity of natural language requires a deep understanding of the subject matter as well as the context. When we de-contextualize symbols (representations), they become ambiguous and can be interpreted in varying ways. Finally, he discusses the idea of domain restriction. Since there is a chance of ambiguity in representations, AI programs must be relegated to very restricted domains. Most domains, or at least the domains that AI hopes to model, are not restricted (e.g., medicine, engineering, law). The corollary is that AI systems can give expected results only in simplified domains.

“Thinking Machines” offers some interesting and compelling arguments against the “rationalist approach”. It supports the idea that augmenting human capabilities is far more feasible than attempting to model human intelligence. This is in line with the “design approach” (i.e., placing focus on modeling interactions between the person and his/her surrounding environment).

Stanford Human-Computer Interaction Group
Winograd currently heads up Stanford's Human-Computer Interaction Group [5]. They are working on some interesting projects grounded in design. One such project, d.tools, is a hardware and software toolkit that allows designers to rapidly prototype physical interaction design. Designers can use the physical components (controllers, output devices) and the accompanying software (called the d.tools editor) to form prototypes and study their behavior. Another project, named Blueprint, integrates program examples into the development environment. Program examples are brought into the IDE through a built-in web search. The main idea behind Blueprint is that it helps to facilitate the prototyping and ideation process by allowing programmers to quickly build and compare competing designs. (More information on these projects can be found on the group's website, linked below.)


References
[1] Terry Winograd, Shifting viewpoints: Artificial intelligence and human–computer interaction, Artificial Intelligence 170 (2006) 1256–1258.
http://hci.stanford.edu/winograd/papers/ai-hci.pdf

[2] Winograd, Terry (1971), "Procedures as a Representation for Data in a Computer Program for Understanding Natural Language," MAC-TR-84, MIT Project MAC, 1971.
http://hci.stanford.edu/~winograd/shrdlu/

[3] Norman, Donald (2007), The Design of Future Things, New York: Basic Books, 2007.
http://www.amazon.com/Design-Future-Things-Author-Everyday/dp/0465002277

[4] Winograd, Terry (1991), "Thinking machines: Can there be? Are We?," in James Sheehan and Morton Sosna, eds., The Boundaries of Humanity: Humans, Animals, Machines, Berkeley: University of California Press, 1991 pp. 198-223. Reprinted in D. Partridge and Y. Wilks, The Foundations of Artificial Intelligence, Cambridge: Cambridge Univ. Press, 1990, pp. 167-189.
http://hci.stanford.edu/winograd/papers/thinking-machines.html

[5] The Stanford Human-Computer Interaction Group
http://hci.stanford.edu/

Jef Raskin: The Black Sheep Hero of the Humane Interface


Brian Zaik


The Apple Macintosh. The reason why 1984 wasn’t like 1984. The first commercially successful personal computer to use the now customary mouse and graphical user interface (GUI). As much as Steve Jobs would like to call the Mac his ‘baby,’ he didn’t even come up with the name. That name owes itself to another person – a human interface designer who literally offended his thesis committee with the statement that “design and implementation philosophy…demanded generality and human usability over execution speed and efficiency.”

That man, Jef Raskin, laid the groundwork for what was to become one of the most important commercial products of the late twentieth century. Raskin was a catalyst for creating user-focused commercial computer experiences in an environment of rapid technological advancement, matching people with visionary ideas. Yet he opted to step away from the epicenter of the personal computing revolution in order to realize a more complete vision of his own computer user interface style. And this was when he cemented his own legacy. Raskin can rightfully claim to have ignited the Macintosh project, but his single greatest contribution is a body of best practices for the “humane interface,” an ideal that is perhaps more pertinent to our rapidly approaching future than to our past or present.


A Renaissance Man Ahead of His Time

Jef Raskin was born in New York City in 1943, and was classically trained in mathematics and philosophy at SUNY Stony Brook (Bonnet 1). He developed an early love for music and art, which may have contributed to his strong sense for elegant yet exceedingly human software tools. He entered a Master’s program at Pennsylvania State University in computer science, before moving out to the West Coast to serve as an assistant professor at the University of California, San Diego.

At a time when many members of the burgeoning computer research field were talking about bytes and n-squared algorithmic complexity, Raskin broke the mold by focusing his college thesis on an application for graphics creation, which he dubbed the “Quick-Draw Graphics System” (Raskin “Articles” 3). The paper argued that computers should be completely graphic in nature to users, with various graphical fonts that appear on the screen. He advanced the notion that what the user sees on the screen should be what the user gets (the basis of ‘WYSIWYG’ editors). Raskin’s thesis was published in 1967, three years before Xerox PARC was formed. It’s easy to see why his ideas would have been so important way back then: they made the rare argument that human interface was more important than the efficient, compact design of code. Raskin was a true pioneer in human-computer interfaces, and many of his then groundbreaking ideas stuck with him for the rest of his life.

While still a student on the East Coast, Raskin clearly had an eye for computers that put people in charge. But it was when he moved to the West Coast that he found the people and project to really make his mark. Raskin met Steve Jobs and Steve Wozniak, the two founders of Apple Computer, at the West Coast Computer Faire in 1977 (Bonnet 1). He wrote the Apple II BASIC Programming Manual and was hired by Apple in 1978 as the Manager of Publications (the 31st Apple employee).

Raskin’s influence on Apple’s early projects was felt in many areas, from software review processes to publications put out by the company and, later, to full-fledged product design and implementation. What’s interesting about Raskin at Apple is that he truly embodied the spirit of the corporate Renaissance Man – he could swiftly move from department to department and rely on his visionary philosophies of how computers should work to carry his insights to final implementation. Raskin’s desire to have a more-than-usable editor for writing documentation led to the first Apple text editor, which Raskin shaped into a “what you see is what you get” visual editor (Raskin “Articles” 4). His experiences with crashes and strange errors on the Apple II prompted him to start writing a series of long design memoranda to lobby for the simplification of computing beyond what the Apple II featured. He theorized that the personal computer could become a true consumer appliance, and that Apple could be at the forefront of a push to give “computers to the millions” (Raskin “Computers” 1). This intense and detailed vision ignited the personal computer revolution at Apple. Raskin demonstrated here that he could turn the fundamental concepts being developed at places like Xerox PARC into a marketable, “insanely great” consumer device.


Initiating the Macintosh Project

The commercial personal computer would never be the same. Raskin’s contributions at Apple came in the form of visionary, contagious ideas he developed and spread to those who joined the project. Raskin eventually came up with a rallying cry – the Macintosh (Mac, for short) project, which he named after his favorite type of apple (Raskin “Articles” 1). As he says, he came up with the Mac due to his belief “that to reach a larger marketplace, future computers had to be designed from the user interface out” (Bonnet 1). Before then, he claimed, Apple and most other companies in the space would “provide the latest and most powerful hardware, and let the users and third-party software vendors figure out how to make it usable.”

Raskin built intellectual mindshare at Apple for the project, recruiting from his own pool of colleagues and acquaintances to fill his burgeoning Mac team with highly talented individuals, such as Bill Atkinson, Andy Hertzfeld, and Burrell Smith. Raskin’s approach demonstrated one thing right from the start: in an age of rapid technological innovation, bringing many smart people from disparate backgrounds under one roof with a vision can yield great results. In time, Mac team members would be found all over the computer technology industry, involving themselves with the open source Nautilus file manager (Hertzfeld, at Eazel in 1999), the user interface guidelines for the Macintosh and other platforms (Joanna Hoffman, from 1982 to the 1990’s), and NeXT Computer’s forerunners to the modern Mac OS X (George Crow and Bud Tribble, in the late 1980’s). Raskin was the catalyst for change in the consumer personal computing industry because he was the first motivator and father of the Mac project at Apple.


Between a Rock and a Hard Place

Success with the first Mac was slow in coming, but the machine’s place in history as the first commercially viable personal computer for everyone is absolute. Yet Raskin’s mark on the final design was fleeting; two years before the Mac’s release, he left Apple (Raskin, Aza 1). The single biggest factor in this falling out appears to have been Steve Jobs himself, who became interested in the Mac project in 1981. Jobs wanted to use the graphical user interface that originated from Xerox PARC as the primary design paradigm for the Mac. Ironically, Raskin has credited himself for introducing Jobs and the rest of the team to the concepts from PARC, which were licensed by Apple for the Mac project. Raskin’s overall vision for the user interface was beaten down by Jobs, and he bowed out. Computing for millions did in fact become a reality, but not with the kind of interface that Raskin envisioned.

Raskin disagreed with PARC’s GUI paradigm in a number of important ways. First, he believed that the GUI needed to be as invisible as possible – it should melt away in order to leave the user focused on the task at hand (Raskin “Down” 1). Second, Raskin did not like the view of computer programs as componentized applications that have their own special interfaces for tasks. Rather, he proposed command sets that apply to specific types of user tasks. In his explosive article for Wired in 1993, Raskin reiterated this vision of computing, stating that “vendors should supply not applications, but command sets, interoperable with all other command sets that you purchase…mix and match” (Raskin “Down” 1). This is a fundamentally different paradigm from PARC’s methodology. Raskin also had a problem with the three-button mouse proposed by PARC – while he later admitted that a three-button mouse similar to the design of Apple’s more recent Mighty Mouse would be ideal, Raskin had drafted a design memo claiming that a one-button mouse “could do everything that PARC’s three-button mouse could do and with the same number or fewer user actions” (Raskin “Articles” 4). The first Mac’s one-button mouse actually adhered to this paradigm, despite Raskin’s many other unutilized ideas.
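The contrast between applications and interoperable command sets can be illustrated with a small sketch. This is only one possible reading of the idea, not Raskin's design: the class `CommandRegistry`, the two example command sets, and their command names are hypothetical. The point it tries to show is that commands from different vendors live in one shared space and operate on the same content, rather than being locked inside separate applications.

```python
from typing import Callable, Dict

# A command takes the currently selected text and returns new text.
Command = Callable[[str], str]


class CommandRegistry:
    """One shared pool of commands, regardless of which vendor supplied them."""

    def __init__(self) -> None:
        self._commands: Dict[str, Command] = {}

    def install(self, command_set: Dict[str, Command]) -> None:
        """Add a purchased command set to the pool ("mix and match")."""
        self._commands.update(command_set)

    def run(self, name: str, text: str) -> str:
        """Apply any installed command to any piece of content."""
        if name not in self._commands:
            raise KeyError(f"no installed command named {name!r}")
        return self._commands[name](text)


# Two hypothetical command sets from two imaginary vendors.
TEXT_COMMANDS: Dict[str, Command] = {
    "uppercase": str.upper,
    "reverse": lambda s: s[::-1],
}
MATH_COMMANDS: Dict[str, Command] = {
    "sum": lambda s: str(sum(int(token) for token in s.split())),
}

registry = CommandRegistry()
registry.install(TEXT_COMMANDS)
registry.install(MATH_COMMANDS)

# A command from one set feeds a command from the other, with no
# application boundary in between.
total = registry.run("sum", "2 3")
print(registry.run("uppercase", "total " + total))
```

In an application-centered model, the arithmetic would live in a spreadsheet program and the text formatting in a word processor, each with its own interface; here the user only ever learns one way of invoking commands.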


Going His Own Way

The black sheep of the Mac project left Apple to start Information Appliance, Inc., in 1982, chiefly to realize his vision of computing for the millions (Raskin, Aza 2). Raskin named the company after a term he coined in 1979 that would only become more important to his larger view of how interfaces should work. He points to the Canon Cat, an information appliance he designed for release in 1987, as one of his greatest achievements. The Cat was a dedicated word processing device intended by the marketing folks at Canon for secretaries and others who deal with text-based tasks every day, but its design included many user-centered ideas for data manipulation and user interface that are still fascinating today (Craig 2). Users could “leap” around text strings effortlessly and use commands from anywhere in the computer interface; many of these design elements seem to have been lifted straight from Raskin’s real vision for the Mac computer. Yet the Canon Cat never achieved commercial success, and thus its innovative interface is relatively unknown to most people outside of the computer industry (Craig 4). This is an unfortunate but nevertheless essential reminder that the most hospitable climates for user-centered design require commercial success to become accepted in general society.

Despite the Cat’s ultimate fate, Raskin pushed on to cement his philosophies in The Humane Interface, a book published in 2000. The text preserves many of his most researched and integral user interface design methods, such as the definition of a humane interface, one that is “responsive to human needs and considerate of human frailties” (Raskin Humane 6). He cites numerous qualms with modern GUIs and fully realizes his vision for a new type of human-computer interface on paper: he discusses information appliances, “leaping” in text interfaces, and zooming user interfaces. Other influential HCI theorists, such as Donald Norman, have built on top of Raskin’s ideas and even refuted them. Norman has discussed the role of information appliances in modern HCI (Norman 2). While the interface described in The Humane Interface has not yet been realized, Raskin started a project named THE, The Humane Environment, in the early 21st century to finally implement his vision alongside other HCI colleagues. The project was renamed Archy, but had not yet been fully implemented by Raskin’s death in 2005. Nonetheless, The Humane Interface has become required reading for HCI professionals and students around the world, and a revival seems likely.


A Hero for a New Age

Raskin’s magnum opus – computing for the millions using an information appliance architecture that made transparent the graphical user interface – was not realized in the Mac or the Canon Cat. While the first Mac introduced the rest of the world to computing and fathered a new era of computer platforms still used today, the core experience diverges from Raskin’s vision and is based on the work of Xerox PARC. I would argue that the Mac of 1984 is so far from Raskin’s initial ideas that it is both inaccurate and unfair to claim the Mac to be his contribution to the development of HCI and computing. As we have already seen, Raskin existed in an environment and time in which many individuals collaborated to create the modern personal computing experience. He was the great Johnny Appleseed who brought together a handful of phenomenal people from engineering, design, cognitive science, and programming backgrounds, and planted in them the vision that personal computing should be focused squarely on the user, and how that user wants to use the machine.

Many may claim that the rocky history of Raskin’s alternative model for the computer user interface has tarnished his legacy. The Humane Environment, later known as Archy, was initiated in the early 21st century and has never been fully implemented. Despite the stated intentions of his son Aza Raskin and his close colleagues to continue the project following his death in 2005, the Archy project has fallen into the state of “vaporware.”

Raskin’s alternative has been studied by HCI students and companies for years, and I see a resurgence happening again. As Raskin wrote in Wired in 1993, “the first popularization of those ideas [was] on the Apple Macintosh in the 1980s, [and] we have had almost nothing really new in interface design” (Raskin “Down” 1). We are once again seeing the start of another period of massive change in computer technology with the Web browser. Browsers have become our portals within portals, and they are quickly usurping powers from their operating system parents. What’s more, Google made clear its intentions this past summer to launch an operating system centered around its Chrome browser (Pichai 1), and the Mozilla Foundation is well poised to follow Google’s lead. Is it surprising that Aza Raskin is now working for Mozilla to integrate his father’s humane interface vision with the Firefox browser?

Jef Raskin saw the future, even if he couldn’t quite build it. One can almost see Raskin’s information appliance staring back at us through the Web browser – a device that is primarily task-oriented and able to provide us with a growing number of commands to use the appliance in specific ways. The extensions in Firefox and other browsers seem to mimic those command sets, as they fluidly extend the number of tasks we can complete on the Web. But the browser can become a big, confusing mess if designers don’t get it right – we need Raskin’s humane interface now more than ever. And that is why his true legacy will live on for years to come.



Works Cited

Bonnet, Ruth. "Biography: Jef Raskin." The Apple Museum. 2007. Web. 23 Sept. 2009.

Craig, David T. "Canon's Cat Computer: The Real Macintosh." Historically Brewed (1994). 28 June 1998. Web. 23 Sept. 2009.

Norman, Donald A. The invisible computer: why good products can fail, the personal computer is so complex, and information appliances are the solution. Cambridge, MA: MIT, 1998. Print.

Pichai, Sundar. "Introducing the Google Chrome OS." The Official Google Blog. Google, Inc., 07 July 2009. Web. 22 Sept. 2009.

Raskin, Aza. "Jef Raskin." Jef Raskin. Raskin Center for Humane Interfaces, 3 May 2006. Web. 22 Sept. 2009.

Raskin, Jef. "Articles from Jef Raskin about the history of the Macintosh." Ed. Matt Mora. 1996. Web. 23 Sept. 2009.

Raskin, Jef. "Computers by the Millions." Apple Document 1979. Jef Raskin - Computers by the Millions. Raskin Center for Humane Interfaces, 2005. Web. 23 Sept. 2009.

Raskin, Jef. "Down With GUIs!" Wired. Dec. 1993. Web. 23 Sept. 2009.

Raskin, Jef. The Humane Interface: New Directions for Designing Interactive Systems. Reading, MA: Addison-Wesley, 2000. Print.


Myron Krueger and Virtual Reality

What distinguishes VR (Virtual Reality) research from other HCI research is that it has been pursued not only in science and technology but also in the arts and humanities. The history of VR research includes many famous researchers, scientists, and philosophers. This diversity of fields has been good for the development of VR research, but it has also left the concept of VR somewhat vague. VR is often not clearly distinguished from cyberspace or the virtual world. VR is also still new, and the technology is still being developed. So before talking about VR, we need to say more clearly what VR is.

Michael Heim, a philosopher, is one of the most influential VR researchers. He tried to conceptualize VR in terms of space, virtuality, and the question of what reality is. In his book The Metaphysics of Virtual Reality, Heim offered this definition: “Virtual Reality is an event or entity that is real in effect but not in fact” (Heim, 1998 p.109). Even though Heim is a great scholar, the definition is not clear enough for us, as HCI researchers or students, to use in our research. Heim himself also said that the definition is suggestive but cannot explain what is distinctive about VR.

In the field of HCI, VR is closer to the VR system as a technology. When it comes to the VR system, I personally imagine someone wearing an HMD and a data glove, or a big screen with equipment that lets people interact with the screen and immerse themselves in the virtual space it shows. William R. Sherman and Alan B. Craig suggest four useful key elements for understanding VR: a virtual world (the content of the given medium), physical and mental immersion, sensory feedback, and interactivity (Sherman & Craig, 2003). From these essential elements, they derive a concept of VR as a medium: “A medium composed of interactive computer simulations that sense the participant’s position and actions and replace or augment the feedback to one or more senses, giving the feeling of being mentally immersed or present in the simulation” (p.13). The notion of VR as a medium is more appropriate and useful for HCI study. So, in this paper, I will treat VR as a kind of communication medium.


A Brief History of VR
In his article “A Brief History of Human-Computer Interaction Technology,” Brad A. Myers says that virtual reality (VR) was started by Ivan Sutherland in 1965-1968 (Myers, 1998). Sutherland’s system was a head-mounted display (HMD) providing ultrasonic tracking and three-dimensional visual images. It was preceded by the first head-mounted display, by Albert B. Pratt in 1916, and by Comeau and Bryan’s early HMD that sensed head movement, in 1961. Another type of VR is the CAVE (Cave Automatic Virtual Environment), developed by the Electronic Visualization Lab at the University of Illinois at Chicago. It was introduced at the SIGGRAPH ’92 computer graphics conference in Chicago. Since then, VR technology has developed with a focus on sound and graphical representation, body and motion tracking, and hardware interfaces and application software.

In the early stage of VR, the technology was regarded as a way to give people an experience of realness. Now the technology is being developed in various fields for more practical uses, such as entertainment, games, and education. It has also expanded to various types of interface, such as CAVE, Augmented Reality, and Mixed Reality. Future VR will likely provide more vivid, real-time experiences with a greater variety of sensory channels, including olfactory and haptic displays, with smaller equipment, and in more collaborative environments.

VR is not yet a widespread technology. However, it may become one of the most important technologies in the near future, and many companies and research labs are developing it for various purposes. Because VR is one of the most complicated and advanced technologies in the field of HCI, the history of its technological experiments includes many pioneers. Myron Krueger is unique among them in that he considered the aesthetic aspect of VR technology.


Myron Krueger and Videoplace
Myron Krueger is a scientist and artist. He is one of the pioneers of artificial interactive environments. He studied liberal arts at Dartmouth College and computer science at the University of Wisconsin. Videoplace, his best-known work, was born during his PhD at the University of Wisconsin. As a scientist, he was interested in interaction technology between machine and machine, and between human and machine. As an artist, he created a unique interaction between the artwork and its participants using computer technology.

Videoplace is an artificially created responsive environment that lets people interact with graphical images on a screen. A video camera captures the participant’s motion and feeds it into the system, which translates the motion into a graphical image and shows it on a projection screen. The participant’s image is then merged with a program-created world or previously programmed images, or it interacts with another participant’s image.
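The capture, translate, and merge steps described above can be sketched in a few lines of Python. This is a loose illustration under stated assumptions, not a description of how Videoplace was actually built: the frame is a tiny grid of brightness values, the participant is assumed to appear darker than the backdrop, the scene is a grid of characters, and the function names `silhouette`, `composite`, and `touched` are hypothetical.

```python
def silhouette(frame, threshold=128):
    """Turn a grid of brightness values into a True/False participant mask.

    Assumption: the participant is darker than a bright backdrop.
    """
    return [[pixel < threshold for pixel in row] for row in frame]


def composite(mask, scene):
    """Merge the participant's silhouette ('#') into a character-grid scene."""
    return [
        "".join("#" if m else ch for m, ch in zip(mask_row, scene_row))
        for mask_row, scene_row in zip(mask, scene)
    ]


def touched(mask, scene, marker="o"):
    """Report whether the silhouette overlaps a graphic object in the scene."""
    return any(
        m and ch == marker
        for mask_row, scene_row in zip(mask, scene)
        for m, ch in zip(mask_row, scene_row)
    )


# One hypothetical camera frame (0 = black, 255 = white).
FRAME = [
    [255, 255, 255, 255],
    [255,  40,  30, 255],
    [255,  35, 255, 255],
]

# A program-created world containing one graphic object, 'o'.
SCENE = [
    "....",
    "..o.",
    "....",
]

mask = silhouette(FRAME)
for line in composite(mask, SCENE):
    print(line)
print("participant touched the object:", touched(mask, SCENE))
```

Running the same three steps on every new camera frame is what would make such an environment feel responsive: the participant needs no worn equipment, only a body in front of a camera.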




This work is an experiment in a new type of interaction between human and machine, or between human and human, using media. Most early virtual reality technology concentrated on providing realistic experiences to participants, so it required heavy and complicated equipment. In Videoplace, however, Myron Krueger focused more on the interaction itself. This comes from the notion of virtual reality as a communication tool. To Krueger, virtual reality was not just a fancy scientific technology for providing novel experiences, but a medium for communicating with each other. As a medium, the work was also a tool of expression and creation. This different approach made his work innovative not only in science and technology but also in art. So his work should be considered in both its technical and its artistic aspects.

Technically, Videoplace was an experiment in real-time interaction within a created world. The system doesn’t use equipment worn on the body. Even though it provided an immersive interactive environment, it didn’t restrict the participants’ bodies or movements. This helped later experiments with different types of virtual reality interface, such as Augmented Reality and Mixed Reality.

Videoplace received the first Golden Nica for interactive art at the 1990 Prix Ars Electronica, the world’s most famous new media art festival. This shows how important the work is in art history, especially in new media art. Artistically, one of Krueger’s contributions was that he regarded interactivity as a part of artistic expression. In particular, he thought human-machine interaction could itself be an artistic expression. This notion contributed to early interactive art by making the audience’s participation and interaction an important part of the artwork. Interactive art is an art form based on technology and communication, so his work brought about a conceptual change: HCI technology can be tested and used for aesthetic experiment as well as for technological development.

Myron Krueger tried various experiments using Videoplace. His experiments lasted into the 1990s, and the work eventually grew to include 25 interactive programs. Videoplace was a highly influential and innovative work in HCI development. After Videoplace, however, Krueger did not show a comparably innovative development in his work. Also, even though Videoplace was developed over several years, some of the experiments made for it were just small changes to interaction patterns.


Conclusion
Myron Krueger, as a computer scientist and artist, made an innovative experiment with his most famous work, Videoplace. He was not only a very important figure in HCI but also a pioneer in new media art, an interactive artist who brought reflection and conceptual change to the art scene. He also helped people to regard HCI technology as a creative tool.



Byul Shin

Clippy: Misunderstood animated pedagogical agent or spawn of Satan?


Let me help you with that. Oh, come on. I don't want anything. I just want to lend a helping hand. Look at me, I have eyebrows! I need attention. But that is all I need. Feed me attention and I will solve all your problems. It looks like you're writing a letter. I love writing letters. I love reading letters. I just finished reading The Collected Letters of Van Gogh in three volumes. That man could write a letter. Plus, he could paint. But you, look at you. You can't spell. I have to AutoCorrect most of your words. Don't be mad, I have eyebrows! It looks like you're writing a BORING letter. Let me spice it up with quotes from Vincent's letters to Theo. Did you notice that below my eyebrows are actual eyes? These eyes of mine have seen many things but nothing more pathetic than your attempts to write a letter. Do you think John gives a shit about your problems at work? I'm going to say no. Click F10 and I'll replace that uninteresting, grammatically weak, lexically poor sentence with one that will— Please don't, I have more suggestions. I can change shapes! Look, I can—
(Vanishes.)

— Justin Kahn
Short Imagined Monologues: Microsoft Office Assistant: The Paper Clip


Hello. My name is Daniel, and I hate Clippy. [“Hello, Daniel.”] The first step to recovery is admitting the problem. Fine. I admit it. I probably spend too much time obsessing over a recently deceased animated pedagogical agent. To clarify that techie term, these interface agents pop up in “instructional environments” and “draw upon human-human social communication” by “embodying observable human characteristics (such as the use of gestures and facial expression)” (“Pedagogical Software Agents”). But my Clippy contempt hardly seems problematic in a computer culture that loves to hate Microsoft’s maligned office assistant. Disdain for Clippy stems from some seemingly irrational social consciousness that has collectively perceived, judged, and condemned him to the deepest circle of Hell. I would not be surprised if our generation’s children were born with this hatred scripted in their DNA, afraid not of monsters under their bed but of Clippy in their documents. [“It looks like you’re having a nightmare…”] Clippy fans, if there are any, keep to themselves, seldom exclaiming happiness at his lame jokes or expressing thanks for his intrusive help. Microsoft may have buried its grotesque creation for the moment, but Microsoft’s patent history tells us that they won’t give up on animated pedagogical agents. Even if Clippy does not return, it’s not hard to imagine his offspring popping up in our .docx files soon enough.

It may not be a problem to hate Clippy—whose real name is Clippit, by the way—but Sun Tzu reminds us that to truly defeat the enemy we must “understand the enemy.” This essay attempts exactly that—by situating Clippy culturally, historically, and theoretically across modern media, consumer product development, and human-computer interaction research. By understanding Clippy in context, we can begin to answer questions such as, “Are all animated pedagogical agents bad?” or “How do we prevent future iterations of Clippy?” or even “Was Clippy the anti-Christ?”

Pop Culture Paper Clip

Anyone who grew up in the era of Microsoft Word is bound to have an opinion on the annoying paper clip that excitedly bounced and blinked whenever the program opened. Even before users started a document, Clippy would give them a helpful “hint” (nothing actually helpful, of course, like which horse to bet on at the track or how to pick up women in a bar). Clippy’s advice usually ranged from program functions to design tips, e.g., keyboard shortcuts for “cut” and “paste” or how to insert clip art to spruce up a page. I assume most users, moving past the advice and into the document, thought they were through with the strange animated being, but Clippy’s persistence and omnipresence are where this digital office supply went from unusual help interface to infamous rage agent.

When you typed the salutation to a letter, for instance, Clippy would pop up, like a Viagra advertisement on a sketchy website, and proclaim, “It looks like you’re writing a letter” [“No way, paper clip.”] and ask if you’d like help. Of course, it wasn’t just letter writing that prompted a visit from the Clipster. Any time Word determined, based on Bayesian probability, that the user needed assistance, the office assistant would appear (“Office Assistant”). This barrage of help, though not that frequent, was enough to create mass hatred.

A CNET news article from 2001, titled “Microsoft ‘Clippy’ gets pink slip” (six years before Microsoft actually canned the clip), quotes Ketan Deshpande, senior software engineer at Manage.com: “Not one person in my office, from the receptionist to the sales people to the engineers to the CEO use the blasted paper clip. Not even my wife, who is an elementary school teacher, uses it” (Luening). It’s not unfair to say Deshpande’s sentiments were representative of the time. Even Microsoft sought to capitalize on the unpopularity of its office mascot, launching officeclippy.com in 2001, a site with animations of Clippit voiced by Gilbert Gottfried where users could commiserate about the horrors of the office assistant and learn how to disable his “helpful” nagging (“Office Assistant”).

The mark of pop culture significance, though, comes with a mention on Family Guy, Fox’s “comedy of references” that targets the obscure, random, or painfully obvious. In the episode “Lois Kills Stewie,” matricidal baby Stewie breaks into the CIA to gain control of the world’s power grid. Suddenly, Clippy appears, saying, “Looks like you're trying to take over the world. Can I help?” To the excited cheers of millions of fans, Stewie replies, “Go away, paper clip! Nobody likes you!” (“Lois Kills Stewie”). Clippy has also made appearances in the comedy of Demetri Martin, College Humor, Drawn Together, and The Simpsons (“Office Assistant”).

Microsoft’s Frankensteinian Obsession

As humorous or pitiful as he may be, Clippy’s pop culture idolatry was an unintended consequence of his flawed design. Unlike Spuds MacKenzie or Ronald McDonald, whose sole purpose was advertising, the neurotic paper clip we’ve come to despise was actually meant to help us. To see how, it’s important to understand the history of Clippy’s development and Microsoft’s obsession with animated pedagogical agents.

In 1987 John Sculley, former CEO of Apple, wrote a book called Odyssey in which he described a system of retrieving information with the help of user interface agents—essentially “people” that would serve as liaisons between the user and the information at hand. Apple shot several promotional videos of their idea that featured a bowtie-wearing butler happily fetching information about various topics, informing the user of incoming phone calls, and generally assisting in data gathering and transferring. The concept as a whole was called the “Knowledge Navigator” (“Knowledge Navigator”).

Bill Gates saw potential for this concept and in the July 1995 edition of InfoWorld magazine extolled the virtues of “social interfaces,” or interfaces that allow users to communicate with them using basic social skills like talking and gesturing instead of new commands or difficult logic. Gates predicted that social interfaces were the next iteration of computing and not just experimental toys (Pontin 29).

At that time, Microsoft was banking on the success of its face-lift software for novice users of Windows 3.1. Meant as a competitor to Apple’s At Ease and Packard Bell’s Navigator (“At Ease”; “Packard Bell Navigator 3.5”), Microsoft’s Bob was a GUI designed to look like a house. Users could decorate their house however they pleased. Different icons, representing programs, could be placed in corresponding places throughout the house. Tying all of this customization together were Microsoft’s assistants—mainly a yellow dog named Rover—that let the user ask questions and seek advice about operating and navigating Bob (“Microsoft Bob”). In theory, Bob would let users intimidated by the Windows GUI play around in a more familiar environment through a more social interface.

Not surprisingly, people were not interested in asking a dog questions about their house, and Bob failed almost from the beginning (Newman). But Microsoft wasn’t deterred. They were certain that social interfaces, with animated pedagogical agents like Rover, were the future of interface design. Shortly after Bob’s release, Microsoft also introduced the world to Comic Chat and V-Chat, two GUI chat rooms where users were represented as animated avatars and could change the mood, setting, or method of communication with basic commands (Kurlander, Skelly, and Salesin 2-9; Damer). V-Chat also relied on automated “bots” to host the chat rooms and update users on the room’s status or other news. Users could interact with the bots, but their answers were greatly limited (Damer).

Even with limited success in the chat room world, where people actually wanted to talk to animated agents, Microsoft finally released Clippy on the world in Word 97. I’m sure his designer, Kevin J. Atteberry, who credits himself with creating one of the most annoying animations in history, will be forever proud and/or ashamed of the achievement (Atteberry). Clippy, of course, was not the only iteration of the office assistant. Other, equally annoying animations included Bosgrove the butler, Genie the…genie, Kairu the dolphin, Max the Macintosh computer, Peedy the parrot, Robby the robot, Dot the red ball, F1 the robot, Merlin the magician, Links the cat, Rocky the dog (who looked a lot like Rover), Mother Nature, Einstein, Shakespeare, and even an anthropomorphic Microsoft logo (“Office Assistant”). The default and most iconic assistant was Clippy—whose code name at Microsoft during the design phase was “The F****** Clown” (Sinofsky).


Even during production, programmers realized the pain and suffering embodied in Clippit, but Microsoft executives insisted, with the passion of mad scientists, that these animated agents would be helpful for users, even outside the social interface realm. In 2000, despite the cultural climate of disapproval and growing hatred of Clippy, Microsoft introduced Agent, speech recognition and text-to-speech software that used animated agents as its main user interface. To very few people’s surprise, the software will no longer be supported in Windows 7, a testament to its unpopularity (“Microsoft Agent”). Newer versions of Microsoft’s speech recognition software are not designed around user interface agents.

In the face of user disapproval and financial loss, Microsoft fanatically pursued animated pedagogical agents. Rover even continued his prominence in Windows XP as the search dog. Currently, Microsoft holds at least twelve patents for different animated agents, the latest of which was filed in 2006 (McCracken). But why? What does Microsoft know that everyone else apparently doesn’t? Reviewing relevant HCI research may provide some answers, and the names of a few people to burn in effigy.

HCI Researchers and Clippy, Sitting in a Tree...

Nass, Steuer, and Tauber are widely credited with bringing social interfaces and animated pedagogical agents to the forefront of interface design. In their article, “Computers Are Social Actors,” for CHI ’94, the authors found that users anthropomorphized the computers they were using, applying social rules of politeness and even seeing the computer as having a “self” (Nass, Steuer, and Tauber 77). The authors concluded that even the slightest resemblance to human characteristics was enough for users to interact socially with computers and that a “photo-realistic, full-motion video, or other high-bandwidth representation may be highly overrated” (77). Nass et al. expanded on the notion of human-computer social interaction in a CHI ’95 paper titled, “Can Computer Personalities be Human Personalities?” Their study confirmed that humans respond socially to computers. To quote at some length:

In contrast to the prevailing idea that the creation of personality requires natural language programming, artificial intelligence, complex graphical environments, or richly defined agents, this research demonstrates that even the most rudimentary manipulations are sufficient to produce powerful effects. (Nass et al. 229)

From then on, the dominant research in animated pedagogical agency focused on believability of actions instead of believability of representation. Lester and Stone’s “Hermann the Bug” implemented a competition-based sequencing engine with fifteen behaviors vying for dominance. Behaviors were chosen based on the demand of the moment; e.g., if a student waited too long to choose an item in Hermann’s environment, Hermann would reassure the student that none of the choices were wrong (Lester and Stone 5-6).

Johnson, Rickel, and Lester called for more study of the role of animated agents in pedagogy, arguing that many features still needed proper testing to determine if they were pedagogically relevant (Johnson, Rickel, and Lester 73). Moreno et al. took this call seriously and wanted to see if using animated agents directly correlated with a better education in computer-mediated environments (Moreno et al. 177). Their study showed that students liked their lessons better with animated pedagogical agents and that students had a greater understanding of material when it was presented in a social agency environment than when it was presented as simple text and images (209). Additionally, one of their studies confirmed Nass, Steuer, and Tauber’s earlier findings that the visual appearance of the agent did not significantly affect student learning (210). Atkinson’s study affirmed the efficacy of animated agents in an electronic pedagogical medium, as originally presented by Moreno et al. (426). Interestingly, Atkinson’s study used a parrot animation very similar to Microsoft’s Agent.

While this short snippet of literature is not by any means exhaustive, I argue that it’s representative of the academic culture at the time—both positive and hopeful about the possibilities of animated pedagogical agents. As far as research goes, there’s not much “out there” to contradict the optimistic outlook on user interface agents in pedagogical realms. And why should there be? The literature makes sense; it is well argued and intricately researched. Why, then, did we have so much trouble with that damned paper clip?

What did Clippy Teach Us?

I suppose I’m forced to admit that the idea of animated pedagogical agency isn’t inherently bad. Clippy’s original sin was not so much in his being but in his designed demeanor. I can theoretically imagine an interface agent that doesn’t compel me to injure it or myself, but its characteristics would need to be drastically different. For instance, Clippy threw the technology in our face. He was always there, like a frustrated school master watching over our shoulder—but with the personality of a used car salesman. For a lot of users, he broke the already fragile illusion of transparent computing. [The computer knows best. Big Clipper is watching you.] The invasiveness wasn’t conducive to learning, especially in a user’s home, an environment where most people want as much control as possible.

Yet it seems reasonable to me that people treat their computers like social actors, so the idea of a social animated agent doesn’t have to be annoying. But consider Clippy as a being. He’s analogous to that annoying guy at work who always knows more about computers than you do—that Jimmy Fallon, Saturday Night Live skit character whose first response to your technical issue is, “Move!” Of course, Clippy tries to help, but so might your well-meaning but overbearing and incompetent cousin who “took a computer class in high school” and once worked on his former girlfriend’s Apple IIe. [“It looks like you’re trying to insert pictures. Would you like help?” he asks. And three hours later you need a new graphics card. And a shower.] Clippy’s persona, and the persona of most of Microsoft’s animated pedagogical agents, simply isn’t up to par with the people I’d want to hang out with on a regular basis. I certainly don’t want them as teachers.

Or it may be that characters like Clippy are the last vestige of the social interface design—a concept that has since evolved into social media and cloud computing. The idea may be back with the revitalization of user experience design but certainly as a shell of its former self and, hopefully, without Clippit.

Regardless, Microsoft’s office assistant was the embodiment of an interesting human-computer interaction concept, and animated pedagogical assistants are likely here to stay. Clippy, on the other hand, who became a cultural icon and the personification of technological villainy, will be a twisted reminder of a good idea gone bad. [“Get bent, Clippy.”]

Works Cited

“At Ease.” Wikipedia. 27 March 2009. 19 September 2009. [http://en.wikipedia.org/wiki/At_Ease] Web.

Atkinson, Robert K. “Optimizing Learning From Examples Using Animated Pedagogical Agents.” Journal of Educational Psychology. 94.2. (2002): 416-427. Print.

Atteberry, Kevin J. Odd is Good. 19 September 2009. [http://oddisgood.com/pages/cd-clippy.html] Web.

Damer, Bruce. “Microsoft V-Chat: Frantic Antics.” Avatars! 1997. 19 September 2009. [http://www.digitalspace.com/avatars/book/fullbook/chww/chww7.htm] Web.

Johnson, W. Lewis, Jeff W. Rickel, and James C. Lester. “Animated Pedagogical Agents: Face-to-Face Interaction in Interactive Learning Environments.” International Journal of Artificial Intelligence in Education. 11 (2000): 47-78. Print.

Kahn, Justin. “Short Imagined Monologues: Microsoft Office Assistant: The Paper Clip.” McSweeney's. 21 April 2005. 19 September 2009. [http://www.mcsweeneys.net/2005/4/21kahn.html] Web.

“Knowledge Navigator.” Wikipedia. 7 July 2009. 19 September 2009. [http://en.wikipedia.org/wiki/Knowledge_Navigator] Web.

Kurlander, David, Tim Skelly, and David Salesin. “Comic Chat.” Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH. (1996): 225-36. Print.

Lester, James C., and Brian A. Stone. “Increasing Believability in Animated Pedagogical Agents.” Proceedings of the Eighth World Conference on Artificial Intelligence in Education. (1997): 16-21. Print.

“Lois Kills Stewie.” Family Guy. Fox Broadcasting Network. 11 November 2007. Television.

Luening, Erich. “Microsoft ‘Clippy’ gets pink slip.” CNET News. 11 April 2001. 19 September 2009. [http://news.cnet.com/2100-1001-255671.html] Web.

McCracken, Harry. “The Secret Origins of Clippy.” Technologizer. 2 January 2009. 19 September 2009. [http://technologizer.com/2009/01/02/microsoft-clippy-patents/] Web.

“Microsoft Agent.” Wikipedia. 22 September 2009. 22 September 2009. [http://en.wikipedia.org/wiki/Microsoft_Agent] Web.

“Microsoft Bob.” Wikipedia. 3 September 2009. 19 September 2009. [http://en.wikipedia.org/wiki/Microsoft_Bob] Web.

Moreno, Roxana et al. “The Case for Social Agency in Computer-Based Teaching: Do Students Learn More Deeply When They Interact With Animated Pedagogical Agents?” Cognition and Instruction. 19.2 (2001): 177-213. Print.

Nass, Clifford, Jonathan Steuer, and Ellen R. Tauber. “Computers are Social Actors.” Human Factors in Computing Systems. (1994): 72-78. Print.

Nass, Clifford, et al. “Can Computer Personalities be Human Personalities?” CHI ’95 Short Papers. (1995): 228-229. Print.

Newman, Michael. “Bob is dead; long live Bob.” Post-Gazette. 23 May 1999. 19 September 2009. [http://www.post-gazette.com/businessnews/19990523bob6.asp] Web.

“Office Assistant.” Wikipedia. 17 September 2009. 19 September 2009. [http://en.wikipedia.org/wiki/Office_Assistant#cite_note-13] Web.

“Packard Bell Navigator 3.5.” Nathan’s Toasty Technology. 19 September 2009. [http://toastytech.com/guis/pbnav35.html] Web.

“Pedagogical Software Agents.” Communications of the ACM. 47.4 (2004): 47. Print.

Pontin, Jason. “Microsoft plans social interfaces everywhere.” InfoWorld. 23 July 1995. 29. Print.

Sinofsky, Steven. “PM at Microsoft.” Steven Sinofsky’s Microsoft Tech Talk. 16 December 2005. 19 September 2009. [http://blogs.msdn.com/techtalk/archive/2005/12/16/504872.aspx] Web.