Yudkowsky on "Value is fragile"

If I had to pick a single statement that relies on more Overcoming Bias content I've written than any other, that statement would be:

Any Future not shaped by a goal system with detailed reliable inheritance from human morals and metamorals, will contain almost nothing of worth.

If you believe this statement, there is cause to be very worried about the future of humanity. Currently, the future gets its detailed, reliable inheritance from human morals and metamorals because your children will have almost exactly the same kind of brain that you do, and (to a lesser extent) because they will be immersed in a culture that is (in the grand scheme of things) extremely similar to the culture we have today. Over many generations and technological changes, that inheritance of values degrades to some small extent, though it seems to the author that human hunter-gatherers from the distant past wanted roughly the same things that modern humans do; they would be relatively at home in a utopia that we designed. That is a chain of reliable inheritance of values spanning fifty thousand years, from mother to daughter and father to son.

When intelligence passes to another medium, the "default" outcome seems to be the breaking of that chain. As Frank puts it:

Each aspiration and hope in a human heart, every dream you’ve ever had, stopped in its tracks by a towering, boring, grey slate wall.

How would it happen? Those who lust after power and money would unleash the next version of intelligence, probably in competition with other groups. They would engage in wishful thinking, understate the risks, and push each other forward in a race to be first. Perhaps the race would involve human intelligence enhancement or human uploads. The end result could be systems with more effective ways of modeling and influencing the world than ordinary humans have. These systems might work by attempting to shape the universe in some way; if they did, they would shape it into something that does not include humans, unless their goals were very carefully specified. And humans do not have a good track record of achieving a task perfectly on the first try under conditions of pressure and competition.

----------------------------------------------------------------------------------------------

To answer a few critics on Facebook: Stefan Pernar writes:

The argument in the linked post goes something like this: a) higher intelligence = b) more power = c) we will be crushed. The jump from b) -> c) is a bare assertion.

This post does not claim that any highly intelligent, powerful AI will crush us. It implicitly claims (amongst other things) that any highly intelligent, powerful AI whose goal system does not contain "detailed reliable inheritance from human morals and metamorals" will effectively delete us from reality. The justification for this claim is alluded to in the "Value is Fragile" post. As Yudkowsky states there, the arguments for this statement, the counterarguments against it, and the counter-counterarguments constitute a large body of written material, much of which ought to appear on the Less Wrong wiki, but most of which is currently buried in the Less Wrong posts of Eliezer Yudkowsky.

The most important concepts seem to be listed as Major Sequences on the LW wiki, in particular the Fun Theory sequence, the Metaethics sequence, and the How to Actually Change Your Mind sequence.
