Should artificial agents’ responses to difficult choices align with our own moral intuitions? Johanna Thoma considers the difficulties involved in programming machines to deal with risk, and how things look different from an aggregate point of view.

A self-driving car is about to crash into an obstacle, which would kill its passengers. The only way to avoid this outcome would be to swerve to the side, killing the same number of innocent bystanders. How should it decide? What if the group of bystanders is larger or smaller in number than the passengers? What if the bystanders are children? Between January 2016 and July 2020, MIT’s Moral Machine website gathered responses from people all over the world to moral dilemmas such as these that could be faced by self-driving cars. The hope of the researchers involved was that gathering this data could usefully inform the way in which artificial agents are programmed to make choices in morally significant choice scenarios.

The project has been controversial for a variety of reasons. For one, the scenarios describe extreme choice situations of a type rarely faced by self-driving cars or human drivers alike. Perhaps it’s more important to focus resources on getting right the balancing of risks in more ordinary driving situations (and then to potentially reason from those to the extreme cases, rather than the other way around). Perhaps more worryingly, the scenarios are also unrepresentative of even the more extreme situations with likely fatalities self-driving cars might find themselves in. For instance, they do not feature uncertainty, even though self-driving cars make probabilistic projections of the movements of the objects in their surroundings. And finally, one might be sceptical more generally of the empirical approach to AI Ethics the project exemplifies. People’s moral judgements can be unreliable for all sorts of reasons, in particular when asked about situations they have never deliberated about in their own lives.

A Question of Moral Perspective

All this aside, however, there is a fundamental presupposition this project shares with many recent discussions in AI Ethics that I believe deserves more critical scrutiny. What difference does it make, morally speaking, to replace human agents with artificial agents, such as self-driving cars? Two answers to this question are generally acknowledged. For one, the response to morally difficult situations has to be codified if they are to be addressed by artificial agents. And secondly, we can make considered judgements at the programming stage about how artificial agents should make these decisions, whereas human agents who find themselves in an unavoidable crash scenario, for instance, have no time for deliberation.

But there is an important respect in which the Moral Machine project assumes nothing much changes when artificial agents replace human agents. It takes for granted that to inform moral design choices, we can simply look at choice problems of the same scope as those previously faced by individual human agents, and consider how those choices should now be made by artificial agents. The framing of the problem from the perspective of a single agent does not change: Should whoever is driving this one car (previously human, now machine) swerve left or right? The presupposition that this is the right kind of question to ask is shared by many authors in the AI Ethics literature who want artificial agent design to be informed by considered human moral judgement, even if they don’t necessarily go along with the empirical approach of the Moral Machine project.

Framing things in this way makes a lot of sense if we think that artificial agents are simply proxies for individual human agents, that they make decisions on their behalf. In the case of self-driving cars, for instance, you might think that the cars are proxy decision makers for their users. In the moral context, the question then is how the car should make important moral decisions on the user’s behalf. Framing these problems from the perspective of a single agent seems right for answering that question.

But there is another way of looking at ethical artificial agent design. Designers of artificial agents meant for wide distribution, such as self-driving cars, but also, widening our gaze, artificial trading agents used in financial markets, nursebots, robot teachers and so on, decide not only about the programming of one such agent. Rather, they determine how a large number of such agents will behave — including in morally difficult situations with potential for severe harm. From their perspective, and from the perspective of a policy-maker wishing to regulate artificial agent design, what seems relevant are the aggregate expected consequences of many artificial agents acting in the way we programme them to. They are not just deciding how one nursebot will respond to a particular patient symptom, or how one self-driving car will handle an unavoidable crash scenario. Rather, they are deciding how many nursebots and many self-driving cars will handle hundreds or thousands of instances of similar scenarios. They face a compound decision problem. Another approach to AI ethics would be to determine what considered human moral judgement regarding such compound decision problems is, and to prescribe the programming that implements it.

Importantly, in many areas of (potential) introduction of artificial agents, there was previously nobody who faced a compound choice of the type programmers and regulators are faced with now. There are limits to how laws, rules and regulations can control the behaviour of human drivers or the choices made by medical professionals. Within the leeway the rules and regulations leave, human decision-making is decentralised. When artificial agents are introduced in this sphere, the aggregate perspective suddenly becomes potentially relevant where previously it was not.

Why Perspective Matters

Does it make a difference for AI Ethics whether we frame things from the perspective of an individual agent or take the aggregate point of view? Shouldn’t our moral judgements simply scale up? If it is right for one car to swerve right in one instance of an unavoidable crash scenario when considered in isolation, shouldn’t we also judge that all cars should choose the same in all relevantly similar scenarios when considered in the aggregate?

There is a class of cases where moving to an aggregate perspective quite clearly makes a difference. Where decisions by one individual affect the potential outcomes of the choices made by others, or the likelihood with which they occur, there may be benefits in coordination, indeed benefits that could be acknowledged from the perspective of each. For instance, the aggregate perspective in the case of self-driving cars allows us to think about what set of driving styles is collectively safest for everybody when self-driving cars interact with each other, whereas when we frame things from the perspective of the individual, we have to keep fixed how all other cars behave.

But even in the absence of such interdependence between the consequences of individual choices, whether we frame the relevant decision problems from the perspective of the individual or from an aggregate perspective can make a difference. In my forthcoming paper “Risk Imposition by Artificial Agents: The Moral Proxy Problem”, I argue that the scope at which we frame the relevant decision problems can make an important difference whenever there is an element of risk — which there is in most situations. To illustrate, take the following example.

Artificial Rescue Coordination Centre, Single Choice. An artificial rescue coordination centre has to decide between sending a rescue team to one of two fatal accidents involving several victims. If it chooses Accident 1, one person will be saved for certain. If it chooses Accident 2, on the other hand, there is a 50% chance of saving three and a 50% chance of saving nobody. All other morally relevant factors are held fixed.

In this scenario, the expected number of lives saved is higher in Accident 2, namely 1.5 rather than one. If every life saved is equally important, and all that distinguishes the two scenarios is the potential number of lives saved, then a risk neutral rescue coordination centre would simply maximise the expected number of lives saved, and thus choose Accident 2. But I think that, considering this decision problem in isolation, there would be nothing immoral or irrational about choosing Accident 1. Such a choice would express an aversion to risk: you may want to make sure to save at least one, and not want to risk ending up saving nobody for the chance of saving three. Human agents commonly display risk aversion across a variety of choice contexts. But now consider this compound case.

Artificial Rescue Coordination Centre, Compound Choice. Some high-level agent, such as a designer or regulator, has to settle at once on one hundred instances of the choice between Accident 1 and Accident 2. Suppose these instances are probabilistically independent, and that the same choice needs to be implemented in each case. The two options are thus either always going for Accident 1, saving one person for certain each time, or always going for Accident 2, with a 50% chance of saving three and a 50% chance of saving nobody each time. The aggregate outcome of going for Accident 1 one hundred times is, of course, saving one hundred people for certain. The aggregate result of going for Accident 2 one hundred times, on the other hand, is a probability distribution with an expected number of one hundred and fifty lives saved, and, importantly, a less than 0.5% chance of saving fewer lives than if one always went for Accident 1.
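The figure of less than 0.5% can be checked directly. The following Python sketch computes the exact probability under the assumptions stated in the example (independent cases, a 50% chance of saving three lives in each, one life saved for certain on the safe option). The function name `prob_risky_does_worse` is a label chosen here for illustration.

```python
from math import comb


def prob_risky_does_worse(n: int) -> float:
    """Probability that always choosing Accident 2 saves strictly fewer
    lives than always choosing Accident 1 over n independent cases."""
    safe_total = n * 1  # Accident 1 saves exactly one life per case
    # Each of the 2**n success/failure sequences is equally likely.
    # Count those in which k successes, each saving three lives,
    # add up to fewer lives than the safe total.
    worse_sequences = sum(
        comb(n, k) for k in range(n + 1) if 3 * k < safe_total
    )
    return worse_sequences / 2 ** n


p_100 = prob_risky_does_worse(100)
print(f"Chance of doing worse over 100 cases: {p_100:.6f}")
```

For one hundred cases the result is well below the 0.5% threshold; for a single case the same function returns 0.5, matching the single-choice example.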

In this compound case, it seems unreasonably risk averse to choose the “safe option”. We are now almost certain to do worse by going with the risk averse option in each individual case. And as the number of repetitions increases, the appeal of the “risky” option only increases, since the probability of doing worse than on the “safe” option becomes ever smaller. This lesson carries over to cases that involve a more complex set of values: as independent instances of a risky choice are repeated, at some point the likelihood of doing better by each time choosing a safer option with lower expected value becomes very small. From a sufficiently large compound perspective, the virtual certainty of doing better by picking a riskier option with higher expected value is decisive.
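One way to see why the probability of doing worse shrinks with the number of repetitions is a standard concentration bound. The derivation below is a sketch that is not part of the original argument: it applies Hoeffding's inequality to the rescue example, with the symbols n (number of independent cases) and X_n (number of cases in which three lives are saved) introduced here for illustration.

```latex
% X_n ~ Binomial(n, 1/2): number of successful risky rescues in n cases.
% The risky policy saves 3 X_n lives, the safe policy saves n lives.
\begin{align*}
\Pr(3X_n < n)
  &= \Pr\!\left(X_n < \tfrac{n}{3}\right)
   \le \Pr\!\left(X_n - \tfrac{n}{2} \le -\tfrac{n}{6}\right) \\
  &\le \exp\!\left(-\frac{2\,(n/6)^2}{n}\right)
   = \exp\!\left(-\frac{n}{18}\right).
\end{align*}
% For n = 100 the bound is exp(-100/18), roughly 0.0039, i.e. below 0.5%.
```

The bound decays exponentially in n, so under these assumptions the chance that the risky policy saves fewer lives than the safe one vanishes quickly as the number of cases grows.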

A Hard Choice for Artificial Agent Design

What this example illustrates is that in most applications, when we take the perspective of a high-level decision-maker deciding on how a large number of artificial agents should behave in many risky choice scenarios, our considered moral judgement will point to implementing risk neutrality in any individual choice. We implement risk neutrality by specifying a function assigning values to the various outcomes the artificial agent may bring about, which captures the ends we want the agents to pursue. And we then design or train the agent so that it maximises the expectation of that function. As it happens, this is precisely what the current standard approach to artificial agent design is. Taking the aggregate perspective vindicates this standard approach.
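The risk neutral design described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how any real system is built: options are represented as lists of (probability, outcome) pairs, and the names `expected_value`, `risk_neutral_choice` and `lives_saved` are hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Tuple

# A lottery is a list of (probability, outcome) pairs whose
# probabilities sum to one.
Lottery = List[Tuple[float, Hashable]]


def expected_value(lottery: Lottery, value: Callable[[Hashable], float]) -> float:
    """Probability-weighted average of the value of each outcome."""
    return sum(p * value(outcome) for p, outcome in lottery)


def risk_neutral_choice(
    options: Dict[str, Lottery], value: Callable[[Hashable], float]
) -> str:
    """Return the name of the option with the highest expected value."""
    return max(options, key=lambda name: expected_value(options[name], value))


# The single-choice rescue example, with outcomes given as lives saved.
options = {
    "Accident 1": [(1.0, 1)],
    "Accident 2": [(0.5, 3), (0.5, 0)],
}


def lives_saved(outcome: int) -> float:
    # Value function: every life saved counts equally.
    return float(outcome)


print(risk_neutral_choice(options, lives_saved))
```

On these inputs the agent selects Accident 2, since its expected value of 1.5 exceeds the certain value of 1. All of the moral content sits in the value function; the choice rule itself is fixed.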

By contrast, if we think the right decision framing for thinking about ethical artificial agent design is from the perspective of an individual agent, as implicitly assumed by many AI ethicists, we should let design choices be guided by considered moral judgements about cases like “Artificial Rescue Coordination Centre, Single Choice”. As we have seen, considered human moral judgement in such cases may exhibit attitudes to risk other than risk neutrality. In particular, human agents are often risk averse when faced with important decisions, and this seems neither obviously immoral nor irrational. If we were to take the approach of the Moral Machine project and crowd-source judgements about individual choice scenarios featuring risk, we’d be unlikely to find perfect risk neutrality. And so, on the approach to AI Ethics we started out with, whereby ethical design should be guided by considered human judgement regarding individual choice scenarios, we get the result that artificial agent design should at least sometimes make room for risk aversion. The standard risk neutral approach in artificial agent design would then need revising.
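As a rough illustration of what making room for risk aversion could look like, the following Python sketch evaluates the two rescue options from the single-choice case with a risk-weighted (rank-dependent) expected value. This is only one model from decision theory, and the article does not commit to it here. The weighting function r(p) = p² and all names in the code are assumptions chosen for illustration.

```python
from typing import Callable, List, Tuple

# A lottery is a list of (probability, lives_saved) pairs.
Lottery = List[Tuple[float, float]]


def risk_weighted_value(
    lottery: Lottery, risk: Callable[[float], float] = lambda p: p ** 2
) -> float:
    """Start from the worst outcome, then add each improvement over it,
    weighted by risk(probability of getting at least that improvement).
    With risk(p) = p this reduces to the ordinary expected value."""
    ordered = sorted(lottery, key=lambda pair: pair[1])  # worst outcome first
    total = ordered[0][1]
    prob_at_least = 1.0
    for i in range(1, len(ordered)):
        prob_at_least -= ordered[i - 1][0]
        improvement = ordered[i][1] - ordered[i - 1][1]
        total += risk(prob_at_least) * improvement
    return total


accident_1 = [(1.0, 1)]
accident_2 = [(0.5, 0), (0.5, 3)]

print(risk_weighted_value(accident_1))  # certain outcome, unaffected by risk
print(risk_weighted_value(accident_2))  # discounted relative to 1.5
print(risk_weighted_value(accident_2, risk=lambda p: p))  # risk neutral
```

With r(p) = p² Accident 2 is valued at 0.75, below the certain 1 of Accident 1, so this agent picks the safe option even though every life counts equally. With r(p) = p the same function returns the ordinary expectation of 1.5, so risk neutrality is the special case.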

We thus face a hard choice: Either design artificial agents to emulate considered human moral judgement about individual choice scenarios, allow for risk aversion, and accept that this implies that, as a matter of fact, we may end up with aggregate outcomes that are almost certainly worse than if risk neutrality had been implemented. Or design artificial agents in a way that implements considered human moral judgements about the aggregate consequences of the choices of a large number of artificial agents, implement risk neutrality, and accept that artificial agents may choose in ways that deviate from the way in which human agents would permissibly choose when placed in the same choice scenarios, even if they had time for a considered judgement. We can’t do both.

I don’t think it’s obvious what the right way to go is, and the correct answer may vary from context to context. But either way, the dilemma shows how these new technologies can raise moral challenges beyond the need for codification, and transform our moral landscape in more profound ways.

By Johanna Thoma


Dr. Johanna Thoma is an Associate Professor in the Department of Philosophy, Logic and Scientific Method. She works in ethics, decision theory and economic methodology. Her recent work has centred around the question of how individuals and public decision-makers should deal with risk.


Further reading