A reply to Wait But Why on machine superintelligence

Tim Urban of the wonderful Wait But Why blog recently wrote two posts on machine superintelligence: The Road to Superintelligence and Our Immortality or Extinction. These posts are probably now among the most-read introductions to the topic since Ray Kurzweil's 2006 book.

In general I agree with Tim's posts, but I think lots of details in his summary of the topic deserve to be corrected or clarified. Below, I'll quote passages from his two posts, roughly in the order they appear, and then give my own brief reactions. Some of my comments are fairly nit-picky but I decided to share them anyway; perhaps my most important clarification comes at the end.

The average rate of advancement between 1985 and 2015 was higher than the rate between 1955 and 1985 because the former was a more advanced world, so much more change happened in the most recent 30 years than in the prior 30.

Readers should know this claim is heavily debated, and its truth depends on what Tim means by "rate of advancement." If he's talking about the rate of progress in information technology, the claim might be true. But it might be false for most other areas of technology, for example energy and transportation technology. Cowen, Thiel, Gordon, and Huebner argue that technological innovation more generally has slowed. Meanwhile, Alexander, Smart, Gilder, and others critique some of those arguments.

Anyway, most of what Tim says in these posts doesn't depend much on the outcome of these debates.

Artificial Narrow Intelligence is machine intelligence that equals or exceeds human intelligence or efficiency at a specific thing.

Well, that's the goal. But lots of current ANI systems don't yet equal human capability or efficiency at their given task. To pick an easy example from game-playing AIs: chess computers reliably beat humans, and Go computers don't (but they will soon).

Each new ANI innovation quietly adds another brick onto the road to AGI and ASI.

I know Tim is speaking loosely, but I should note that many ANI innovations (probably most, depending on how you count) won't end up contributing to progress toward AGI. Many ANI methods will end up being dead ends after some initial success, and their performance on the target task will be superseded by other methods. That's how the history of AI has worked so far, and how it will likely continue to work.

the human brain is the most complex object in the known universe.

Well, not really. For example, the brain of an African elephant has three times as many neurons.

Hard things like calculus, financial market strategy, and language translation are mind-numbingly easy for a computer, while easy things like vision, motion, movement, and perception are insanely hard for it.

Yes, Moravec's paradox is roughly true, but I wouldn't say that getting AI systems to perform well in asset trading or language translation has been "mind-numbingly easy." E.g. machine translation is useful for getting the gist of a foreign-language text, but billions of dollars of effort still hasn't produced a machine translation system as good as a mid-level human translator, and I expect this will remain true for at least another 10 years.

One thing that definitely needs to happen for AGI to be a possibility is an increase in the power of computer hardware. If an AI system is going to be as intelligent as the brain, it'll need to equal the brain's raw computing capacity.

Because computing power is increasing so rapidly, we probably will have more computing power than the human brain (speaking loosely) before we know how to build AGI, but I just want to flag that this isn't conceptually necessary. In principle, an AGI design could be very different than the brain's design, just like a plane isn't designed much like a bird. Depending on the efficiency of the AGI design, it might be able to surpass human-level performance in all relevant domains using much less computing power than the human brain does, especially since evolution is a very dumb designer.

So, we don't necessarily need human brain-ish amounts of computing power to build AGI, but the more computing power we have available, the dumber (less efficient) our AGI design can afford to be.

One way to express this capacity is in the total calculations per second (cps) the brain could manage

Just an aside: TEPS is probably another good metric to think about.

The science world is working hard on reverse engineering the brain to figure out how evolution made such a rad thing; optimistic estimates say we can do this by 2030.

I suspect that approximately zero neuroscientists think we can reverse-engineer the brain to the degree being discussed in this paragraph by 2030. To get a sense of current and near-future progress in reverse-engineering the brain, see The Future of the Brain (2014).

One example of computer architecture that mimics the brain is the artificial neural network.

This probably isn't a good example of the kind of brain-inspired insights we'd need to build AGI. Artificial neural networks arguably go back to the 1940s, and they mimic the brain only in the most basic sense. TD learning would be a more specific example, except in that case computer scientists were using the algorithm before we discovered the brain also uses it.

[We have] just recently been able to emulate a 1mm-long flatworm brain

No we haven't.

The human brain contains 100 billion [neurons].

Good news! Thanks to a new technique we now have a more precise estimate: 86 billion neurons.

If that makes [whole brain emulation] seem like a hopeless project, remember the power of exponential progress: now that we've conquered the tiny worm brain, an ant might happen before too long, followed by a mouse, and suddenly this will seem much more plausible.

Because computing power advances so quickly, it probably won't be the limiting factor on brain emulation technology. Scanning resolution and neuroscience knowledge are likely to lag far behind computing power: see chapter 2 of Superintelligence.

most of our current models for getting to AGI involve the AI getting there by self-improvement.

They do? Says who?

I think the path from AGI to superintelligence is mostly or entirely about self-improvement, but the path from current AI systems to AGI is mostly about human engineering work, probably until relatively shortly before the leading AI project reaches a level of capability worth calling AGI.

the median year on a survey of hundreds of scientists about when they believed we'd be more likely than not to have reached AGI was 2040

That's the number you get when you combine the estimates from several different recent surveys, including surveys of people who were mostly not AI scientists. If you stick to the survey of the top-cited living AI scientists (the one called "TOP100" here), the median estimate for 50% probability of AGI is 2050. (Not a big difference, though.)

many of the thinkers in this field think it's likely that the progression from AGI to ASI [will happen] very quickly

True, but it should be noted this is still a minority position, as one can see in Tim's 2nd post, or in section 3.3 of the source paper.

90 minutes after that, the AI has become an ASI, 170,000 times more intelligent than a human.

Remember that lots of knowledge and intelligence comes from interacting with the world, not just from running computational processes more quickly or efficiently. Sometimes learning requires that you wait on some slow natural process to unfold. (In this context, even a 1-second experimental test is slow.)

So the median participant thinks it's more likely than not that we'll have AGI 25 years from now.

Again, I think it's better to use the numbers for the TOP100 survey from that paper, rather than the combined numbers.

Due to something called cognitive biases, we have a hard time believing something is real until we see proof.

There are dozens of cognitive biases, so this is about as informative as saying "due to something called psychology, we…"

The specific cognitive bias Tim seems to be discussing in this paragraph is the availability heuristic, or maybe the absurdity heuristic. Also see Cognitive Biases Potentially Affecting Judgment of Global Risks.

[Kurzweil is] well-known for his bold predictions and has a pretty good record of having them come true

The linked article says Ray Kurzweil's predictions are right 86% of the time. That statistic is from a self-assessment Kurzweil published in 2010. Not surprisingly, when independent parties try to grade the accuracy of Kurzweil's predictions, they arrive at a much lower accuracy score: see page 21 of this paper.

How good is this compared to other futurists? Unfortunately, we have no idea. The problem is that nobody else has bothered to write down so many specific technological forecasts over the course of multiple decades. So, give Kurzweil credit for daring to make lots of predictions.

My own vague guess is that Kurzweil's track record is actually pretty impressive, but not as impressive as his own self-assessment suggests.

Kurzweil predicts that we'll get [advanced nanotech] by the 2020s.

I'm not sure which Kurzweil prediction about nanotech Tim is referring to, because the associated footnote points to a page of The Singularity is Near that isn't about nanotech. But if he's talking about advanced Drexlerian nanotech, then I suspect approximately zero nanotechnologists would agree with this forecast.

I expected [Kurzweil's] critics to be saying, "Obviously that stuff can't happen," but instead they were saying things like, "Yes, all of that can happen if we safely transition to ASI, but that's the hard part." Bostrom, one of the most prominent voices warning us about the dangers of AI, still acknowledges

Yeah, but Bostrom and Kurzweil are both famous futurists. There are plenty of non-futurist critics of Kurzweil who would say "Obviously that stuff can't happen." I happen to agree with Kurzweil and Bostrom about the radical goods within reach of a human-aligned superintelligence, but let's not forget that most AI scientists, and most PhD-carrying members of society in general, probably would say "Obviously that stuff can't happen" in response to Kurzweil.

The people on Anxious Avenue aren't in Panicked Prairie or Hopeless Hills (both of which are regions on the far left of the chart) but they're nervous and they're tense.

Actually, the people Tim is talking about here are often more pessimistic about societal outcomes than Tim is suggesting. Many of them are, roughly speaking, 65%-85% confident that machine superintelligence will lead to human extinction, and that it's only in a small minority of possible worlds that humanity rises to the challenge and gets a machine superintelligence robustly aligned with humane values.

Of course, it's also true that many of the people who write about the importance of AGI risk mitigation are more optimistic than the range shown in Tim's graph of Anxious Avenue. For example, one researcher I know thinks it's maybe 65% likely we get really good outcomes from machine superintelligence. But he notes that a ~35% chance of human friggin' extinction is totally worth trying to mitigate as much as we can, including by funding hundreds of smart scientists to study potential solutions decades in advance of the worst-case scenarios, like we already do with regard to global warming, a much smaller problem. (Global warming is a big problem on a normal person's scale of things to worry about, but even climate scientists don't think it's capable of causing human extinction in the next couple centuries.)

Or, as Stuart Russell, author of the leading AI textbook, likes to put it, "If a superior alien civilization sent us a message saying, 'We'll arrive in a few decades,' would we just reply, 'OK, call us when you get here; we'll leave the lights on'? Probably not, but this is more or less what is happening with AI. Although we are facing potentially the best or worst thing to happen to humanity in history, little serious research is devoted to these issues outside non-profit institutes."1

[In the movies] AI becomes as or more intelligent than humans, then decides to turn against us and take over. Here's what I need you to be clear on for the rest of this post: None of the people warning us about AI are talking about this. Evil is a human concept, and applying human concepts to non-human things is called anthropomorphizing.

Thank you. Jesus Christ, I am tired of clearing up that very basic confusion, even for many AI scientists.

Turry started off as Friendly AI, but at some point, she turned Unfriendly, causing the greatest possible negative impact on our species.

Just FYI, at MIRI we've started to move away from the "Friendly AI" language recently, since people think "Oh, like C-3PO?" MIRI's recent papers use phrases like "superintelligence alignment" instead.

In any case, my real comment here is that the quoted sentence above doesn't use the terms "Friendly" or "Unfriendly" the way they've been used traditionally. In the usual parlance, a Friendly AI doesn't turn Unfriendly. If it becomes Unfriendly at some point, then it was always an Unfriendly AI, it just wasn't powerful enough yet to be a harm to you.

Tim does sorta fix this much later in the same post when he writes: "So Turry didn't turn against us or switch from Friendly AI to Unfriendly AI; she just kept doing her thing as she became more and more advanced."

When we're talking about ASI, the same concept applies: it would become superintelligent, but it would be no more human than your laptop is.

Well, this depends on how the AI is designed. If the ASI is an uploaded human, it'll be pretty similar to a human in lots of ways. If it's not an uploaded human, it could still be purposely designed to be human-like in many different ways. But mimicking human psychology in any kind of detail almost certainly isn't the quickest way to AGI/ASI, just like mimicking bird flight in lots of detail wasn't how we built planes, so practically speaking, yes, the first AGI(s) will likely be very alien from our perspective.

What motivates an AI system? The answer is simple: its motivation is whatever we programmed its motivation to be. AI systems are given goals by their creators: your GPS's goal is to give you the most efficient driving directions; Watson's goal is to answer questions accurately. And fulfilling those goals as well as possible is their motivation.

Some AI programs today are goal-driven, but most are not. Siri isn't trying to maximize some goal like "be useful to the user of this iPhone" or anything like that. It just has a long list of rules about what kind of output to provide in response to different kinds of commands and questions. Various sub-components of Siri might be sorta goal-oriented (e.g. there's an evaluation function trying to pick the most likely accurate transcription of your spoken words) but the system as a whole isn't goal-oriented. (Or at least, this is how it seems to work. Apple hasn't shown me Siri's source code.)

As AI systems become more autonomous, giving them goals becomes more important because you can't feasibly specify how the AI should react in every possible arrangement of the environment; instead, you need to give it goals and let it do its own on-the-fly planning for how it's going to achieve those goals in unexpected environmental conditions.

The programming for a Predator drone doesn't include a list of instructions to follow for every possible combination of takeoff points, destinations, and wind conditions, because that list would be impossibly long. Rather, the operator gives the Predator drone a goal destination and the drone figures out how to get there on its own.
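To make the contrast concrete, here is a minimal sketch of the two designs (hypothetical toy code, not based on Siri or any real drone software): a rule-based responder that only covers the cases its programmer listed, and a goal-directed agent that is handed nothing but a destination and searches out its own sequence of actions, even around obstacles nobody enumerated in advance.

```python
from collections import deque

# Rule-based design: a fixed table of command -> response. The system can
# only handle cases its programmer enumerated, like the description of Siri above.
RULES = {
    "weather": "Here is today's forecast.",
    "timer": "Timer set.",
}

def rule_based_reply(command):
    return RULES.get(command, "Sorry, I don't understand.")

# Goal-directed design: the operator supplies only a goal (a destination),
# and the agent plans its own action sequence on the fly.
def plan_route(start, goal, blocked, size=5):
    """Breadth-first search on a size x size grid; returns a list of moves."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if (x, y) == goal:
            return path
        for dx, dy, move in [(1, 0, "E"), (-1, 0, "W"), (0, 1, "N"), (0, -1, "S")]:
            nxt = (x + dx, y + dy)
            if 0 <= nxt[0] < size and 0 <= nxt[1] < size \
                    and nxt not in blocked and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [move]))
    return None  # no route exists

# The rule-based system fails outside its table; the planner routes around
# an obstacle at (1, 0) that the designer never wrote an explicit rule for.
print(rule_based_reply("weather"))
print(plan_route((0, 0), (2, 0), blocked={(1, 0)}))
```

The point is the division of labor: in the first design the human supplies the behavior directly, while in the second the human supplies only the goal and the search procedure generates the behavior, which is why it copes with conditions its designer never listed.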

when [Turry] wasn't yet that smart, doing her best to achieve her final goal meant simple instrumental goals like learning to scan handwriting samples more quickly. She caused no harm to humans and was, by definition, Friendly AI.

Again, I'll mention that's not how the term has traditionally been used, but whatever.

But there are all kinds of governments, companies, militaries, science labs, and black market organizations working on all kinds of AI. Many of them are trying to build AI that can improve on its own

This isn't true unless by "AI that can improve on its own" you just mean machine learning. Almost nobody in AI is working on the kind of recursive self-improvement you'd need to get an intelligence explosion. Lots of people are working on systems that could eventually provide some piece of the foundational architecture for a self-improving AGI, but almost nobody is working directly on the recursive self-improvement problem right now, because it's too far beyond current capabilities.2

because many techniques to build innovative AI systems don't require a large amount of capital, development can take place in the nooks and crannies of society, unmonitored.

True, it's much harder to monitor potential AGI projects than it is to track uranium enrichment facilities. But you can at least track AI research talent. Right now it doesn't take a ton of work to identify a set of 500 AI researchers that probably contains the most talented ~150 AI researchers in the world. Then you can just track all 500 of them.

This is similar to back when physicists were starting to realize that a nuclear fission bomb might be feasible. Suddenly a few of the most talented researchers stopped presenting their work at the usual conferences, and the other nuclear physicists pretty quickly deduced: "Oh, shit, they're probably working on a secret government fission bomb." If Geoff Hinton or even the much younger Ilya Sutskever suddenly went underground tomorrow, a lot of AI people would notice.

Of course, such a tracking effort might not be so feasible 30-60 years from now, when serious AGI projects will be more numerous and greater proportions of world GDP and human cognitive talent will be devoted to AI efforts.

On the contrary, what [AI developers are] probably doing is programming their early systems with a very simple, reductionist goal, like writing a simple note with a pen on paper, to just get the AI to work. Down the road, once they've figured out how to build a strong level of intelligence in a computer, they figure they can always go back and revise the goal with safety in mind. Right?

Again, I note that most AI systems today are not goal-directed.

I also note that, sadly, it probably wouldn't just be a matter of going back to revise the goal with safety in mind after a certain level of AI capability is reached. Most proto-AGI designs probably aren't even the kind of systems you can make robustly safe, no matter what goals you program into them.

To illustrate what I mean, imagine a hypothetical computer security expert named Bruce. You tell Bruce that he and his team have just 3 years to modify the latest version of Microsoft Windows so that it can't be hacked in any way, even by the smartest hackers on Earth. If he fails, Earth will be destroyed because reasons.

Bruce just stares at you and says, "Well, that's impossible, so I guess we're all fucked."

The problem, Bruce explains, is that Microsoft Windows was never designed to be anything remotely like unhackable. It was designed to be easily useable, and compatible with lots of software, and flexible, and affordable, and just barely secure enough to be marketable, and you can't just slap on a special Unhackability Module at the last minute.

To get a system that even has a chance at being robustly unhackable, Bruce explains, you've got to design an entirely different hardware + software system that was designed from the ground up to be unhackable. And that system must be designed in an entirely different way than Microsoft Windows is, and no team in the world could do everything that is required for that in a mere 3 years. So, we're fucked.

But! By a stroke of luck, Bruce learns that some teams outside Microsoft have been working on a theoretically unhackable hardware + software system for the past several decades (high reliability is hard): people like Greg Morrisett (SAFE) and Gerwin Klein (seL4). Bruce says he might be able to take their work and add the features you need, while preserving the strong security guarantees of the original highly secure system. Bruce sets Microsoft Windows aside and gets to work on trying to make this other system satisfy the mysterious reasons while remaining unhackable. He and his team succeed just in time to save the day.

This is an oversimplified and comically romantic way to illustrate what MIRI is trying to do in the area of long-term AI safety. We're trying to think through what properties an AGI would need to have if it was going to very reliably act in accordance with humane values even as it rewrote its own code a hundred times on its way to machine superintelligence. We're asking: What would it look like if somebody tried to design an AGI that was designed from the ground up not for affordability, or for speed of development, or for economic benefit at every increment of progress, but for reliably beneficial behavior even under conditions of radical self-improvement? What does the computationally unbounded solution to that problem look like, so we can gain conceptual insights useful for later efforts to build a computationally tractable self-improving system reliably aligned with humane interests?

So if you're reading this, and you happen to be a highly gifted mathematician or computer scientist, and you want a full-time job working on the most important challenge of the 21st century, well, we're hiring. (I will also try to appeal to your vanity: Please note that because so little work has been done in this area, you've still got a decent chance to contribute to what will eventually be recognized as the early, foundational results of the most important field of research in human history.)

My thanks to Tim Urban for his very nice posts on machine superintelligence. Be sure to read his ongoing series about Elon Musk.
