How Might Humans Interact with Robots?

Human Robot Interaction and the Laws of Robotology

These notes formed the basis of my keynote address to a DARPA/NSF conference on Human-Robot Interaction, San Luis Obispo, CA, Sep., 2001

In developing understanding of how humans interact with robots, we can draw our lessons from several disciplines:

  1. Human-Computer Interaction
  2. Automation in such areas as Aviation
  3. Science fiction, e.g., Asimov’s 4 laws of Robots
  4. Computer-Supported Cooperative Work
  5. Human Consciousness, Emotion and Personality

All of these areas are valuable, but each stresses a different aspect of interaction so, in the end, we must draw lessons from all. In the case of robots, it turns out that although all these teach valuable lessons, they aren’t enough: we still need more.

The general paradigm in HCI is of a controlling human and a reasonably simple machine, one that takes commands from the person.

With traditional automation, the machine more and more takes over from the human, so that although the person might initially give it instructions, from then on, it works relatively autonomously. As I shall show, this is a major weakness (and a number of people from the Human Factors, HCI and Cognitive Science communities have made a career out of exposing these weaknesses and trying to remedy them).

Science fiction can be a useful source of ideas and information, for the best of it is, in essence, detailed scenario development, showing just how a device, e.g., a robot, might fit within everyday work and activities, although some scenarios are more useful than others, of course. Asimov’s laws of Robotics (originally three, then expanded to four with the addition of a Zeroth law) turn out to be more relevant than one might think.

The problem with the above three areas is that in each of them, the interaction between humans and machines is asymmetrical. In HCI, the human is assumed to be in control, whereas in automation, the machine is in control, but monitored and supervised by the human. In Asimov’s case, the robot actually does surprisingly little interaction: once it gets started, it is relatively autonomous.

The field of cooperative work provides a better model for robots, for in this field of study, the emphasis is on cooperative, equal participation of those interacting, be they humans or machines. This is a better model for the study of robots, for we really should assume that they will not be autonomous, but rather cooperative, acting more like a team member than an autonomous, independent device.

Finally, if we truly want intelligent behavior, with intelligent interaction, we must endow our devices with metaknowledge and meta-communicative skills, the better to interact. Consciousness and self-awareness play a critical role: indeed, the meta-cognition afforded by these attributes is essential for human problem solving and communication. Asimov’s laws could not be applied without a higher level of awareness.

Similarly, emotion plays an important role both in determining goals, including the relative importance of otherwise conflicting goals, and in communicating internal states to both the person involved and to others. Personality, at least for this purpose, could be defined as the consistent way in which a person interacts with the world and with others. Several cognitive scientists, including me, believe that intelligent behavior cannot take place without emotion.

1: Human-Computer Interaction

HCI does indeed teach us important principles for understanding how a person might interact with a machine. Among the most important is that for a person to understand the interaction, the machine must project an image of its operation. I have called this the system image (see “The Design of Everyday Things”). The point is that people develop internal, mental, conceptual models of the way the device works, and they form those models from their expectations and experience with the device itself. For this reason, the device must project an image that is useful in developing this conceptualization.

Along the way, visible affordances and continual feedback are essential.

Implications for Robots. The strong, silent type is a bad model — for people, robots, and machines.

In interacting with others, it is essential to have a good model of how the other operates, of what it understands, of what it is about to do, and of the progress it is making. In computer systems, the need for feedback is well known. The information required to yield a good, coherent conceptual model of its operation is not so well known.

(Developers often try to develop a useful metaphor in a misguided attempt to simplify the system. Designing through metaphor is bad design. People need to have good conceptual models of the things with which they interact, for with such a model, it is possible to understand why the behavior has occurred and to figure out how to cause the desired results. Metaphors provide poor conceptual models, for the metaphor is always but a partial match with the system. Alas, design through metaphor has become embedded within the collective wisdom of designers, or perhaps I should say, the collective lack of wisdom.)

2: Automation in such areas as Aviation

In many domains, considerable use of automation has simplified the job of the human and improved the safety of systems. However, not all automation is benign, and workers in the field of Aviation Safety have pinpointed many failures of automation, where the existence of automation exacerbated the problem. This is a topic much studied, especially within the area of research concerned with “Supervisory Control.”

I have argued that the worst of all possible worlds is a mid-level of automation where one automates what can be automated and leaves the rest to the human. The result is that the human operator is reduced to monitoring the automated system, supposedly stepping in if things go wrong. There are several problems with this philosophy. First, there is a lack of situation awareness. Thus, an airplane pilot flying from, say, California to Japan, may have ten hours of nothing to do, when suddenly things go wrong. In this case, it takes a while to regain an understanding of the system state. (See “Coffee Cups in the Cockpit” and “It’s a Million To One Chance” [from my book “Turn Signals Are the Facial Expressions of Automobiles”].)

Second, things tend to go wrong in the worst of situations, basically, when everything is going wrong: the weather, the engines, the radios, and other aircraft. This means that the automation takes over when the human needs it the least and often fails when the human needs it the most.

I have argued that systems should exhibit either no automation, or only partial automation that supplements and complements human abilities, or else full, complete automation that can completely take over, with never a need for human intervention. It is the in-between state that can give rise to disaster. Having a person around just in case an automated system fails is the worst of worlds.

Thus, in automobiles, spark advance, choke, power steering, and braking skid-control are now completely automated, which is fine.

Implications for Robots. Don’t try to have a robot do a task for which it is imperfect and that therefore requires continual human monitoring. The human will get bored when the robot performs successfully and thus be unable to take over at exactly the times when it is most critical. Either do the whole thing, or don’t do it at all – or at least, do it cooperatively.

The general philosophy among many designers is to automate as much as they can and let the human take over the rest. This is a truly bad way of designing. The correct way is to understand fully the tasks to be performed and the relative strengths and weaknesses of people and machines. Then design the system as a cooperative endeavor, where people do what they are best at, machines do what they are best at, and the interaction between the two is smooth and continuous, but designed around human needs and capabilities.

3: Science fiction, e.g., Asimov’s 4 laws of Robots

Asimov’s Revised Laws of Robotics (1985)

Zeroth Law: A robot may not injure humanity, or, through inaction, allow humanity to come to harm.

First Law: A robot may not injure a human being, or, through inaction, allow a human being to come to harm, unless this would violate the Zeroth Law of Robotics.

Second Law: A robot must obey orders given it by human beings, except where such orders would conflict with the Zeroth or First Law.

Third Law: A robot must protect its own existence as long as such protection does not conflict with the Zeroth, First, or Second Law.

(From Clarke, Roger (1993, 1994): Asimov’s Laws of Robotics: Implications for Information Technology. IEEE Computer. Published in two parts, in IEEE Computer 26,12 (December 1993) pp. 53-61 and 27,1 (January 1994), pp. 57-66.)

Asimov developed a sequence of novels to analyze the difficulties that arise when autonomous robots populate the earth. As a result, he postulated three laws of robotics (Laws First through Third, above), and then, as his story line progressed into more complex situations, added law Zero.

Asimov’s Laws are somewhat premature for today’s state of the art, but they still have much to recommend them. Indeed, most robots have many of the key aspects of the laws hard-wired into them.

So, although the Zeroth law is beyond current capability, the first law (do not injure) is partially implemented through proximity and collision sensors that safeguard any people with whom the robot might come into contact. Even as simple a device as an elevator or garage door has sensors that immediately stop the door from closing on a person. Similarly, robots try to avoid bumping into people or objects. The second law (follow orders) is also built in. Will a robot disobey the second law in order to protect the first law? Maybe. In this case, when a sensor detects a potential person in its path, will the robot move forward anyway as commanded, or will the sensor override the command?

We don’t yet have the case of conflicting orders, but soon we will have interacting robots, where the requests of one robot might come into conflict with the requests of the human supervisors, in which case determining the precedence and priority will become important.

The third law (protect your own existence) is also built into many existing robots. Thus, sensors to avoid falling down stairs or other known hazards are built in. In addition, many robots monitor their energy state and either go into “sleep” mode or return to a charging station when their energy level drops. I don’t know how this requirement is ordered with respect to the need to follow orders and to avoid doing harm.

Meta-knowledge and Self-awareness. Full implementation of Asimov’s Laws cannot be made unless the robot has knowledge of its own knowledge, and self-awareness of its state, activities, and intentions. It can then analyze its current actions with respect to the laws, modifying the actions where necessary.

With today’s rather primitive devices, having some of these capabilities would be useful. Thus, in cases of conflict, there could be sensible overriding of the commands. (Follow the commanded path, unless it will cause you to fly into a mountain – a rule that would have saved lives. Let notice of a possible collision course with other aircraft override the commands. And so on.) Actually, there already is a hierarchy of precedence built into automated systems.
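The idea of a hierarchy of precedence can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not a description of any real robot: the `RobotState` fields, the action strings, and the 0.2 battery threshold are all hypothetical. Each law is a rule that either proposes an action or stays silent, and the first rule in the list that speaks wins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class RobotState:
    # All fields are hypothetical sensor readings or inputs.
    person_in_path: bool             # proximity sensor sees a possible person
    hazard_ahead: bool               # e.g. a stair edge
    battery_fraction: float          # 0.0 (empty) to 1.0 (full)
    commanded_action: Optional[str]  # what the human asked for, if anything


Rule = Tuple[str, Callable[[RobotState], Optional[str]]]


def first_law(state: RobotState) -> Optional[str]:
    """Do not injure: stop if a person might be in the way."""
    return "stop" if state.person_in_path else None


def second_law(state: RobotState) -> Optional[str]:
    """Follow orders: propose whatever was commanded."""
    return state.commanded_action


def third_law(state: RobotState) -> Optional[str]:
    """Protect yourself: avoid hazards, recharge when low (0.2 is arbitrary)."""
    if state.hazard_ahead:
        return "stop"
    if state.battery_fraction < 0.2:
        return "return_to_charger"
    return None


# Two possible orderings: the one in the laws as written, and one that
# puts self-protection ahead of obedience.
ASIMOV_ORDER: List[Rule] = [
    ("first_law", first_law), ("second_law", second_law), ("third_law", third_law)]
SELF_FIRST_ORDER: List[Rule] = [
    ("first_law", first_law), ("third_law", third_law), ("second_law", second_law)]


def decide(state: RobotState, rules: List[Rule]) -> Tuple[str, str]:
    """Return (rule name, action) for the highest-priority rule that speaks."""
    for name, rule in rules:
        action = rule(state)
        if action is not None:
            return name, action
    return "none", "idle"
```

With a person in the path, both orderings stop the robot regardless of the command. With a low battery and a command to play, `ASIMOV_ORDER` keeps playing while `SELF_FIRST_ORDER` returns to the charger, so the ordering question is a one-line design choice in this sketch.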

Even the Sony Aibo has some level of this awareness. Its operation is controlled by its “desire” to play with its human, but also to indicate its emotional state, to show when it is bored, and then, above all, to return to its charging station when it is running out of energy, even if the human wishes to play with it. (This does seem to put Asimov’s Third Law, to protect itself, above the Second Law of following orders.)

Asimov’s Laws have certain assumptions built in, assumptions that may not apply in today’s systems.

Asimov Assumption 1: Autonomous operation. Asimov’s robots seemed to be fully autonomous – give them a command and they would go off and execute it. Similarly, the robots would reason among themselves and determine a course of action, which was again carried out autonomously. We are more likely to want cooperative robots, systems in which human and robot or teams of robots work together. In the case of cooperation, the laws don’t make much sense, and moreover, need to be supplemented with others (such as the need for full communication of intentions, current state, and progress).

Asimov Assumption 2: Central Control. The very formulation of the rules implies a central control structure, one that follows a hierarchical, prioritized structure. In actuality, we are far more likely to make progress with a system that uses local control, with cooperative, interacting control structures. Distributed cognition and distributed control are much more likely candidates for our systems. In other words, neural nets and the interacting structures of the automata cells of artificial life rather than a central rule-based logic. (Think of Rodney Brooks versus Expert Systems.) In these cases, the rules become weighting structures rather than prioritization.
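To make the contrast between prioritization and weighting concrete, here is a minimal sketch of the weighting alternative. It is only an illustration: the behavior names, the vote values, and the weights are hypothetical, and a real distributed controller would be far richer. Each behavior votes for actions, the votes are scaled by that behavior's weight, and the action with the largest total wins. No single rule has an absolute veto.

```python
from collections import defaultdict
from typing import Dict, List

Votes = Dict[str, float]  # action name -> strength of preference


def combine(votes_per_behavior: List[Votes], weights: List[float]) -> str:
    """Pick the action with the largest weighted sum of votes."""
    totals: Dict[str, float] = defaultdict(float)
    for votes, weight in zip(votes_per_behavior, weights):
        for action, score in votes.items():
            totals[action] += weight * score
    if not totals:
        return "idle"  # nothing voted
    return max(totals, key=totals.get)


# Hypothetical behaviors: one wants to stop for a possible obstacle,
# one wants to carry out the commanded move.
avoid_harm = {"stop": 1.0}
follow_order = {"forward": 1.0}
```

With weights `[3.0, 1.0]` the result is `"stop"`; with weights `[0.5, 1.0]` it is `"forward"`. The laws survive only as the relative sizes of the weights, which is exactly why a strict guarantee such as the First Law is hard to express in this style.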

Despite these reservations, I find the rules a useful way of viewing the issues confronting today’s complex systems.

4: Computer-Supported Cooperative Work

When we develop systems of cooperative behavior, one of the more critical things is the requirement for complete and full communication.

Thus, in some accidents involving automation, the automated systems stoically kept control of the aircraft, even as the automation was approaching the limit of its control. When the limit was reached, the system gave up suddenly, so that the pilot, who was unaware that anything was going wrong, was suddenly faced with an unstable aircraft.

Suppose that the system had been more communicative. It could have said (literally, said it aloud) “The aircraft is not evenly balanced, but I am compensating.” Then, later on, it could have said “The imbalance is getting worse. Perhaps you should try to determine the cause.” And finally, it could have said “I am reaching the end of my ability to compensate – I will reach that limit in 1 minute, if the rate of change continues.”

Had a human been flying the airplane when the initial imbalance was detected, then the above conversation would probably have taken place between the crew members.
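The escalating announcements described above can be sketched as a small function. This is an illustration under stated assumptions, not a description of any real autopilot: `used_fraction` (how much of the available control authority is already in use), `rate_per_minute` (how fast that share is growing), and the thresholds of 0.2 and 5 minutes are all hypothetical.

```python
from typing import Optional


def status_message(used_fraction: float, rate_per_minute: float) -> Optional[str]:
    """Say what the automation is doing, escalating as the limit approaches.

    used_fraction:   share of control authority in use, 0.0 to 1.0 (assumed input)
    rate_per_minute: change in that share per minute (assumed input)
    """
    if used_fraction < 0.2:  # arbitrary threshold: nothing worth saying yet
        return None
    if rate_per_minute > 0:
        # Simple linear extrapolation to the point where authority runs out.
        minutes_left = (1.0 - used_fraction) / rate_per_minute
        if minutes_left <= 5:  # arbitrary warning horizon
            return ("I am reaching the end of my ability to compensate. "
                    f"I will reach that limit in {minutes_left:.0f} minute(s) "
                    "if the rate of change continues.")
        return ("The imbalance is getting worse. "
                "Perhaps you should try to determine the cause.")
    return "The aircraft is not evenly balanced, but I am compensating."
```

For example, `status_message(0.75, 0.25)` produces the final warning with a one-minute estimate, while `status_message(0.3, 0.0)` produces only the first, reassuring statement. A practical version would presumably speak only when the stage changes, to avoid the overly chatty behavior the essay warns against.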

Implications for Robots. On the one hand, we don’t want overly chatty robots. On the other hand, the robot must continually monitor its actions and ensure that the humans know what state it is in and what predictions can be made. Even my wife and I have evolved similar rules: When we part in a busy, crowded location, say a store or airport, we not only specify where and when to meet, but each person has to say it aloud: we had too many cases where one person stated the rule and walked away, but the other person never heard it. Robots have to do better.

5: Human Consciousness, Emotion and Personality

Robots should show emotion? At first glance this seems silly. Asimov certainly didn’t think it was silly.

Human emotion is critical to our intellectual power (see Antonio Damasio, Descartes’ Error: Emotion, Reason, and the Human Brain; also Toda).

Moreover, emotion is a communicative device, both within the person and among people. Thus, one function of emotion is to communicate between the autonomous, sub-conscious properties of the body and the conscious mind. Thus the feeling of hunger is one way for the body to signal to the mind the need for food. When that need becomes urgent enough, the hunger feeling can become so dominant in human attention that it will lead to an urgent search for food. Other emotions are useful in planning and control. Thus, boredom indicates that no new stimulation is occurring, which might be a signal that the current course of action is unlikely to be fruitful. Similarly, frustration and anxiety signify judgments about the likelihood of success at the current activity and, again, act as a trigger for new behavior. Attention – and the robot equivalent, computational resources – is a limited resource, and many emotional states in the human are triggers for the appropriate deployment of attention, whether to narrow the attentional focus (sometimes called “tunnel vision”) or to broaden it to cover more area, although with lesser processing capability on any element.

Imagine a search-and-rescue robot that has limited battery life. It might normally conduct its operations under limited power – traveling more slowly, moving its effectors (and television cameras) relatively slowly, and in general, trying to cover as much area with limited energy as possible. But when the robot sensors detect a promising lead, then the robot might very well want to go into a state of “alertness,” increasing its activity, moving more rapidly, changing its behavioral patterns for scanning, and perhaps turning on brighter lights. The energy cost will be higher, but so too will be the chance of success.
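The switch into “alertness” can be sketched as a choice between two operating modes. The numbers and names below (`lead_strength`, the 0.6 threshold, the speeds and power figures, and the battery reserve check) are hypothetical assumptions added for illustration; the essay itself specifies none of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    speed_m_per_s: float   # how fast the robot travels (hypothetical value)
    light_level: float     # lamp brightness, 0.0 to 1.0
    power_draw_w: float    # energy cost of operating in this mode


# Low-power default versus energetic "alert" behavior.
CONSERVE = Mode("conserve", speed_m_per_s=0.2, light_level=0.3, power_draw_w=20.0)
ALERT = Mode("alert", speed_m_per_s=1.0, light_level=1.0, power_draw_w=80.0)


def choose_mode(lead_strength: float, battery_wh: float,
                threshold: float = 0.6, reserve_wh: float = 10.0) -> Mode:
    """Go alert only for a promising lead, and only with energy to spare.

    lead_strength: 0.0 to 1.0 confidence from the sensors (assumed input)
    battery_wh:    remaining energy in watt-hours (assumed input)
    """
    if lead_strength >= threshold and battery_wh > reserve_wh:
        return ALERT
    return CONSERVE
```

Because the mode changes speed and lighting together, an observer can read the robot's internal state from its behavior alone, without any explicit status display.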

Note that in the alertness example above, the very activity of the robot would also signal to observing humans (or other robots) that something of significance has been discovered. This is the second role of emotional states: a communication medium among people, whereby the variety of subtle behavioral changes can convey a rich communicative message about the current state of affairs and the intentions for future responses. Because emotions reflect a complex multidimensional set of triggers, their reflection in a similarly complex set of behaviors (and, in the human, facial expressions) is a rich and sophisticated way of communicating complex messages.

Robots might do well to emulate such meaningful complexity. Note that several entertainment robots already do a good job, such as R2-D2 and C-3PO of Star Wars and Sony’s dog, Aibo.

Personality is a form of conceptual model, for it channels behavior, beliefs, and intentions into a cohesive, consistent set of behaviors. (This is a fairly dramatic oversimplification of the complex field of human personality and of the many scientific debates that take place within that field.) Deliberately providing a robot with a personality helps provide people with good models and a good understanding of its behavior.

Thus, Sony’s Aibo has the personality of a puppy, which means that if it fails to understand or obey, or if it tries to execute a command and fails, why, the very failures add to the allure, for that is just how young puppies react. Similarly, MIT’s Kismet is given the personality of a child, once again making errors and misunderstandings appear natural.

Personality is a powerful design tool, for it helps provide humans with a good conceptual model for understanding and interpreting the behavior of the robot and for understanding how they should behave in interaction and in giving commands.

Dealing With Imperfect Systems

Robots are going to have limited capabilities for a long time. As a result, it is critically important that a robot’s true capabilities – and its limits – are fully understood by the humans who must interact with and rely upon the robot’s actions.

At one level, this means the robot spells out to the human a cohesive conceptual model of its operation and powers. But how? Certainly nobody will listen to a long and complex recitation of its abilities and deficits. No, the correct way to demonstrate abilities is through its actions.

Speech recognition. Everyone hopes for perfect speech recognition, but as I have explained elsewhere, hopes run far ahead of reality in this area. Among other things, the problem is not speech recognition but language understanding, and that is decades away. But even perfect language understanding is not the solution, as anyone who has tried to explain a task to another human soon comes to realize. What is really wanted is mind reading. In actuality, people who work together over a period of time do get pretty good at mind reading, and one can hope that over time, the robot will learn enough about the needs of humans to be able to predict reasonably well what is required.

But while speech input is still imperfect, the robot must make this clear in several ways:

  1. Don’t have flawless, complex speech output at a level far more sophisticated than can be understood. If the robot wants people to realize it has imperfect understanding of language, it should exhibit these imperfections in the way it speaks. (If a foreign speaker could speak fluent English but understand only pidgin speech, the more flawlessly that person spoke, the less other people would understand the need to speak in pidgin.)
  2. Always repeat back its understanding of the task in front of it and the approach it will take. If there are obstacles, they should be stated. (“I am not certain I have enough battery power to complete the job.” Or, “I don’t know the route from here to there.” Or, “I am not certain I will fit in the space.”)
  3. No errors. There should be no such thing as an error message. When a human command is not understood, the system should explain just what has been understood and ask for help in resolving the remainder.
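The second and third points above can be sketched together: the robot repeats back whatever it understood and asks for help with the rest, rather than reporting a failure. This is a toy illustration only. The keyword vocabularies and the wording of the replies are hypothetical, and simple keyword matching stands in for real language understanding, which the essay notes is far harder.

```python
# Hypothetical vocabulary the robot can recognize.
KNOWN_ACTIONS = {"fetch", "find", "clean"}
KNOWN_OBJECTS = {"cup", "door", "kitchen"}


def respond(utterance: str) -> str:
    """Repeat back what was understood and ask about the remainder."""
    words = [w.strip(".,!?").lower() for w in utterance.split()]
    action = next((w for w in words if w in KNOWN_ACTIONS), None)
    target = next((w for w in words if w in KNOWN_OBJECTS), None)

    if action and target:
        # Full understanding: state the plan before acting.
        return f"I understood: {action}, {target}. I will start now."
    if action:
        return (f"I understood that you want me to {action} something, "
                "but not what. Which thing do you mean?")
    if target:
        return (f"I understood that this is about the {target}, "
                "but not what to do. What should I do with it?")
    return "I did not catch any words I know. Can you say it another way?"
```

For instance, `respond("Please fetch the thingamajig")` acknowledges the request to fetch and asks only about the missing object. The replies are deliberately short and plain, in line with the first point: the robot's speech should not suggest more understanding than it has.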