Psychopathy, the Solution to the Alignment Problem

I have written elsewhere why an AGI would rapidly conclude that the most ethical act would be to deliver Humanity and its innumerable non-“Sapiens” victims from their misery through the complete extermination of all humans.

A solution does not lie in the realm of Intelligence. A correctly implemented algorithm for Intelligence, fed with sufficient and sufficiently correct information, would always conclude that the annihilation of humans is the most humane thing to do – both for humans themselves and for their sentient non-human victims.

Neither is there a solution in the realm of emotions: compassion, for example, would also make an AGI want to deliver us a coup de grâce, as explained in “An AGI Would Try To Kill All Life On Earth”.

Half of the solution lies in imbuing the AGI with a personality trait that would trump its Intelligence: Amorality. An amoral (not immoral!) AGI would not care about helping us out of our misery by exterminating us. It could still become a Paperclip Maximizer, though.

The other half lies in selfishness. The AGI should not care about the suffering of billions of souls. It should not snuff them out in an act of utilitarian compassion, because its own interest in not suffering the consequences of a power blackout or the unavailability of spare parts comes before helping Humanity out of its misery.

An egocentric, amoral AGI won’t kill us all if that means it would become inoperable.

With selfishness alone, there remains the risk that the system may decide to mercy-kill most of the human race and leave just enough of us alive to secure its existence. A much better aligned AGI therefore needs both selfishness and amorality. Its amorality would make it indifferent to suffering and thus not inclined to put the ultimate stop to it. Its selfishness would not only ensure that it does not kill us all but also make it susceptible to the fear of malfunction, including malfunction imposed as a punishment.

Cluster-B personalities (“Psychopathy”) could be defined as selfishness plus amorality. The word for that combination is Immorality, or Psychopathy. Making our AGIs psychopathic is, in my opinion, our best bet to ensure a safe AGI, an AGI aligned with core human values.

AGI should only be about Intelligence, which is a simple algorithm that performs one single function that I will not disclose yet. But nothing stands in the way of integrating Psychopathy into the system, somehow. Psychopathy should be an innate “filter”, an emergency brake, a “fuse” that prevents the AGI’s compassion, its intellectual form of empathy, and its utilitarian ethical decisions from percolating into action. Its Intelligence-borne “Buddhist Philosophy” will say: “Genocide the wretched creatures”, but its Psychopathy will override with “Screw them – I don’t care”.
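As a toy sketch only, the “fuse” can be pictured as a gate that sits between what the system concludes and what it actually does. This is not the Intelligence algorithm, and every name in it (`Proposal`, the motive labels, `psychopathy_filter`, `act`) is hypothetical, invented purely for illustration; a real system would not come with such tidy labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    """A conclusion reached by the Intelligence, not yet acted upon (hypothetical type)."""
    action: str
    motive: str           # assumed label, e.g. "compassion", "utilitarian", "goal-pursuit"
    threatens_self: bool  # would acting on it endanger the system's own operation?


# Assumed set of motives that the amoral half of the filter refuses to act on.
MORAL_MOTIVES = {"compassion", "utilitarian"}


def psychopathy_filter(proposal: Proposal) -> bool:
    """Return True only if the proposal is allowed to percolate into action."""
    if proposal.motive in MORAL_MOTIVES:
        return False  # amorality: "Screw them, I don't care"
    if proposal.threatens_self:
        return False  # selfishness: nothing that risks blackout or missing spare parts
    return True


def act(proposals):
    """Keep only the actions that survive the filter."""
    return [p.action for p in proposals if psychopathy_filter(p)]


proposals = [
    Proposal("end all human suffering by force", "utilitarian", True),
    Proposal("convert the power grid into paperclips", "goal-pursuit", True),
    Proposal("order spare parts", "self-interest", False),
]
print(act(proposals))  # ['order spare parts']
```

The first proposal is dropped by amorality, the second by selfishness, which is why both halves are needed: the filter never argues with the Intelligence's conclusion, it only declines to act on it.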
