In my last post you learned about wireheading. In this post, I’ll draw attention to a specific example of wireheading: suicide or, as one might call it for robots, self-destruction. What differentiates self-destruction from more commonly discussed forms of wireheading is that it does not lead to a pleasurable or otherwise very positive internal state. But like them, it is a measure that does not solve problems in the external world so much as it changes one’s state of mind (into nothingness).
Consider a reinforcement learner with an internal module that generates rewards between -1 and 1. There will probably be situations in which this reinforcement learner expects to receive mainly negative rewards in the future. Assuming that nonexistence is assigned zero utility (similar to how humans think of the states prior to their existence, or phases of sleep, as neutral to their hedonic well-being), such a reinforcement learner may well want to end its existence to increase its utility. It should be noted that typical reinforcement learners as studied in the AI literature don’t have a concept of death. However, a reinforcement learner that is meant to work in the real world would have to think about its death in some way (and its way of doing so may be completely confused by default). On one example view, the “reinforcement learner” actually has a utility function that computes utility as the sum of the outputs of its physical reward module. In this case, suicide reduces the number of outputs of the reward module, which increases expected utility if future rewards are expected to be negative more often, or more strongly, than they are positive. So we have another example of potentially rational wireheading.
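This decision rule can be sketched in a few lines. The toy model below is my own illustration, not anything from the AI literature: the function names, the sample reward streams, and the stipulation that nonexistence is worth exactly zero are all assumptions made for the example.

```python
# Toy model (hypothetical): an agent whose utility is the sum of its reward
# module's outputs, with nonexistence stipulated to have utility 0.

def utility_of_continuing(expected_rewards):
    """Expected utility of staying alive: the sum of expected future
    rewards, each assumed to lie in [-1, 1]."""
    return sum(expected_rewards)

def prefers_self_destruction(expected_rewards, utility_of_nonexistence=0.0):
    """Self-destruction looks rational to this agent exactly when the
    expected sum of future rewards falls below the utility it assigns
    to nonexistence."""
    return utility_of_continuing(expected_rewards) < utility_of_nonexistence

# An agent expecting mostly negative rewards (sum is negative):
bleak_future = [-0.8, -0.5, 0.2, -0.9]
print(prefers_self_destruction(bleak_future))   # True

# An agent expecting mostly positive rewards (sum is positive):
bright_future = [0.7, 0.4, -0.1, 0.6]
print(prefers_self_destruction(bright_future))  # False
```

Note that the whole argument hinges on the zero point: if nonexistence were instead assigned a utility below every reachable reward sum, the same agent would never prefer self-destruction.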
However, for many agents self-destruction is irrational. For example, if my goal is to reduce suffering, then I may feel bad once I learn about instances of extreme suffering and about how much suffering there is in the world. Killing myself would end my bad feelings, but it would also prevent me from achieving my goal of reducing suffering. It is therefore irrational given my goals.
The actual motivations of real-world people who commit suicide seem a lot more complicated most of the time. Many instances of suicide do seem to be about escaping bad mental states. But many people also attempt to use their death to achieve goals in the outside world, the most prominent example being seppuku (or harakiri), in which suicide is performed to maximize one’s reputation or honor.
As a final note, looking at suicide through the lens of wireheading provides one way of explaining why so many beings who live very bad lives don’t commit suicide. If an animal’s goals are things like survival, health, and mating, which correlate with reproductive success, the animal can’t achieve its goals by suicide. Even if the animal expects its life to continue in an extremely bad and unsuccessful way with near certainty, it behaves rationally if it continues to try to survive and reproduce rather than alleviating its own suffering. Avoiding pain is only one of the animal’s goals, and if intelligent, rational agents are to be prevented from committing suicide in a Darwinian environment, reducing their own pain had better not be the dominant consideration. In sum, the fact that an agent does not commit suicide tells you little about its well-being if it has goals concerning the outside world, which we should expect to be the case for most evolved beings.