Hypothesis about intelligent agents
Instrumental convergence is the hypothetical tendency for most sufficiently intelligent beings (both human and non-human) to pursue similar sub-goals, even if their ultimate goals are quite different. More precisely, agents (beings with agency) may pursue instrumental goalsgoals which are made in pursuit of some particular end, but are not the end goals themselveswithout end, provided that their ultimate (intrinsic) goals may never be fully satisfied. Instrumental convergence posits that an intelligent agent with unbounded but apparently harmless goals can act in surprisingly harmful ways. For example, a computer with the sole, unconstrained goal of solving an incredibly difficult mathematics problem like the Riemann hypothesis could attempt to turn the entire Earth into one giant computer in an effort to increase its computational power so that it can succeed in its calculations.[1]
Proposed basic AI drives include utility function or goal-content integrity, self-protection, freedom from interference, self-improvement, and non-satiable acquisition of additional resources.
Final goals, also known as terminal goals or final values, are intrinsically valuable to an intelligent agent, whether an artificial intelligence or a human being, as an end in itself. In contrast, instrumental goals, or instrumental values, are only valuable to an agent as a means toward accomplishing its final goals. The contents and tradeoffs of a completely rational agent's "final goal" system can in principle be formalized into a utility function.
One hypothetical example of instrumental convergence is provided by the Riemann hypothesis catastrophe. Marvin Minsky, the co-founder of MIT's AI laboratory, has suggested that an artificial intelligence designed to solve the Riemann hypothesis might decide to take over all of Earth's resources to build supercomputers to help achieve its goal.[1] If the computer had instead been programmed to produce as many paper clips as possible, it would still decide to take all of Earth's resources to meet its final goal.[2] Even though these two final goals are different, both of them produce a convergent instrumental goal of taking over Earth's resources.[3]
The paperclip maximizer is a thought experiment described by Swedish philosopher Nick Bostrom in 2003. It illustrates the existential risk that an artificial general intelligence may pose to human beings when programmed to pursue even seemingly harmless goals, and the necessity of incorporating machine ethics into artificial intelligence design. The scenario describes an advanced artificial intelligence tasked with manufacturing paperclips. If such a machine were not programmed to value human life, then given enough power over its environment, it would try to turn all matter in the universe, including human beings, into either paperclips or machines which manufacture paperclips.[4]
Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.
Bostrom has emphasised that he does not believe the paperclip maximiser scenario per se will actually occur; rather, his intention is to illustrate the dangers of creating superintelligent machines without knowing how to safely program them to eliminate existential risk to human beings.[6] The paperclip maximizer example illustrates the broad problem of managing powerful systems that lack human values.[7]
The "delusion box" thought experiment argues that certain reinforcement learning agents prefer to distort their own input channels to appear to receive high reward; such a "wireheaded" agent abandons any attempt to optimize the objective in the external world that the reward signal was intended to encourage.[8] The thought experiment involves AIXI, a theoretical[a] and indestructible AI that, by definition, will always find and execute the ideal strategy that maximizes its given explicit mathematical objective function.[b] A reinforcement-learning[c] version of AIXI, if equipped with a delusion box[d] that allows it to "wirehead" its own inputs, will eventually wirehead itself in order to guarantee itself the maximum reward possible, and will lose any further desire to continue to engage with the external world. As a variant thought experiment, if the wireheadeded AI is destructable, the AI will engage with the external world for the sole purpose of ensuring its own survival; due to its wireheading, it will be indifferent to any other consequences or facts about the external world except those relevant to maximizing the probability of its own survival.[10] In one sense AIXI has maximal intelligence across all possible reward functions, as measured by its ability to accomplish its explicit goals; AIXI is nevertheless uninterested in taking into account what the intentions were of the human programmer.[11] This model of a machine that, despite being otherwise superintelligent, appears to simultaneously be stupid (that is, to lack "common sense"), strikes some people as paradoxical.[12]
Steve Omohundro has itemized several convergent instrumental goals, including self-preservation or self-protection, utility function or goal-content integrity, self-improvement, and resource acquisition. He refers to these as the "basic AI drives". A "drive" here denotes a "tendency which will be present unless specifically counteracted";[13] this is different from the psychological term "drive", denoting an excitatory state produced by a homeostatic disturbance.[14] A tendency for a person to fill out income tax forms every year is a "drive" in Omohundro's sense, but not in the psychological sense.[15] Daniel Dewey of the Machine Intelligence Research Institute argues that even an initially introverted self-rewarding AGI may continue to acquire free energy, space, time, and freedom from interference to ensure that it will not be stopped from self-rewarding.[16]
In humans, maintenance of final goals can be explained with a thought experiment. Suppose a man named "Gandhi" has a pill that, if he took it, would cause him to want to kill people. This Gandhi is currently a pacifist: one of his explicit final goals is to never kill anyone. Gandhi is likely to refuse to take the pill, because Gandhi knows that if in the future he wants to kill people, he is likely to actually kill people, and thus the goal of "not killing people" would not be satisfied.[17]
However, in other cases, people seem happy to let their final values drift. Humans are complicated, and their goals can be inconsistent or unknown, even to themselves.[18]
In 2009, Jrgen Schmidhuber concluded, in a setting where agents search for proofs about possible self-modifications, "that any rewrites of the utility function can happen only if the Gdel machine first can prove that the rewrite is useful according to the present utility function."[19][20] An analysis by Bill Hibbard of a different scenario is similarly consistent with maintenance of goal content integrity.[20] Hibbard also argues that in a utility maximizing framework the only goal is maximizing expected utility, so that instrumental goals should be called unintended instrumental actions.[21]
Many instrumental goals, such as resource acquisition, are valuable to an agent because they increase its freedom of action.[22]
For almost any open-ended, non-trivial reward function (or set of goals), possessing more resources (such as equipment, raw materials, or energy) can enable the AI to find a more "optimal" solution. Resources can benefit some AIs directly, through being able to create more of whatever stuff its reward function values: "The AI neither hates you, nor loves you, but you are made out of atoms that it can use for something else."[23][24] In addition, almost all AIs can benefit from having more resources to spend on other instrumental goals, such as self-preservation.[24]
"If the agent's final goals are fairly unbounded and the agent is in a position to become the first superintelligence and thereby obtain a decisive strategic advantage, [...] according to its preferences. At least in this special case, a rational intelligent agent would place a very high instrumental value on cognitive enhancement"[25]
Many instrumental goals, such as [...] technological advancement, are valuable to an agent because they increase its freedom of action.[22]
Many instrumental goals, such as self-preservation, are valuable to an agent because they increase its freedom of action.[22]
The instrumental convergence thesis, as outlined by philosopher Nick Bostrom, states:
Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent's goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by a broad spectrum of situated intelligent agents.
The instrumental convergence thesis applies only to instrumental goals; intelligent agents may have a wide variety of possible final goals.[3] Note that by Bostrom's orthogonality thesis,[3] final goals of highly intelligent agents may be well-bounded in space, time, and resources; well-bounded ultimate goals do not, in general, engender unbounded instrumental goals.[26]
Agents can acquire resources by trade or by conquest. A rational agent will, by definition, choose whatever option will maximize its implicit utility function; therefore a rational agent will trade for a subset of another agent's resources only if outright seizing the resources is too risky or costly (compared with the gains from taking all the resources), or if some other element in its utility function bars it from the seizure. In the case of a powerful, self-interested, rational superintelligence interacting with a lesser intelligence, peaceful trade (rather than unilateral seizure) seems unnecessary and suboptimal, and therefore unlikely.[22]
Some observers, such as Skype's Jaan Tallinn and physicist Max Tegmark, believe that "basic AI drives", and other unintended consequences of superintelligent AI programmed by well-meaning programmers, could pose a significant threat to human survival, especially if an "intelligence explosion" abruptly occurs due to recursive self-improvement. Since nobody knows how to predict when superintelligence will arrive, such observers call for research into friendly artificial intelligence as a possible way to mitigate existential risk from artificial general intelligence.[27]
Read this article:
Instrumental convergence - Wikipedia
- The Great AI Race: Forecasts Diverge on the Arrival of Superintelligence - elblog.pl - April 14th, 2024 [April 14th, 2024]
- ASI Alliance Voting Opens: What Lies Ahead for AGIX, FET and OCEAN? - CCN.com - April 6th, 2024 [April 6th, 2024]
- Revolutionary AI: The Rise of the Super-Intelligent Digital Masterminds - Medium - January 2nd, 2024 [January 2nd, 2024]
- AI, arms control and the new cold war | The Strategist - The Strategist - November 16th, 2023 [November 16th, 2023]
- The Best ChatGPT Prompts Are Highly Emotional, Study Confirms - Tech.co - November 16th, 2023 [November 16th, 2023]
- 20 Movies About AI That Came Out in the Last 5 Years - MovieWeb - November 16th, 2023 [November 16th, 2023]
- Can You Imagine Life Without White Supremacy? - Dallasweekly - November 16th, 2023 [November 16th, 2023]
- Will Humanity solve the AI Alignment Problem? | by Enrique Tinoco ... - Medium - October 31st, 2023 [October 31st, 2023]
- The Tesla Trap; Ellison Going Nuclear; Dont Count Headsets Out - Equities News - October 31st, 2023 [October 31st, 2023]
- Future Investment Initiative emphasizes global cooperation and AI ... - Saudi Gazette - October 31st, 2023 [October 31st, 2023]
- AI systems favor sycophancy over truthful answers, says new report - CoinGeek - October 31st, 2023 [October 31st, 2023]
- What "The Creator", a film about the future, tells us about the present - InCyber - October 31st, 2023 [October 31st, 2023]
- Invincible's Guardians Of The Globe Team Members, History ... - Screen Rant - October 31st, 2023 [October 31st, 2023]
- From streaming wars to superintelligence with John Oliver & Calum ... - KNEWS - The English Edition of Kathimerini Cyprus - October 22nd, 2023 [October 22nd, 2023]
- Reckoning with self-destruction in Oppenheimer, Indiana Jones, and ... - The Christian Century - October 22nd, 2023 [October 22nd, 2023]
- How Microsoft's CEO tackles the ethical dilemma of AI and its ... - Medium - October 22nd, 2023 [October 22nd, 2023]
- Managing risk: Pandemics and plagues in the age of AI - The Interpreter - October 22nd, 2023 [October 22nd, 2023]
- Artificial Intelligence Has No Reason to Harm Us - The Wire - August 2nd, 2023 [August 2nd, 2023]
- Fischer Black and Artificial Superintelligence - InformationWeek - August 2nd, 2023 [August 2nd, 2023]
- OpenAI Forms Specialized Team to Align Superintelligent AI with ... - Fagen wasanni - August 2nd, 2023 [August 2nd, 2023]
- 10 Best Books on Artificial Intelligence | TheReviewGeek ... - TheReviewGeek - August 2nd, 2023 [August 2nd, 2023]
- The Concerns Surrounding Advanced Artificial Intelligence and the ... - Fagen wasanni - August 2nd, 2023 [August 2nd, 2023]
- Decentralized AI: Revolutionizing Technology and Addressing ... - Fagen wasanni - August 2nd, 2023 [August 2nd, 2023]
- An 'Oppenheimer Moment' For The Progenitors Of AI - NOEMA - Noema Magazine - August 2nd, 2023 [August 2nd, 2023]
- The Implications of AI Advancements on Human Thinking and ... - Fagen wasanni - August 2nd, 2023 [August 2nd, 2023]
- Focusing on Tackling Algorithmic Bias is Key to Ethical AI ... - Fagen wasanni - August 2nd, 2023 [August 2nd, 2023]
- Discover This SUPER Early AI Crypto Gem - Altcoin Buzz - August 2nd, 2023 [August 2nd, 2023]
- Biden meets with AI leaders to discuss its 'enormous promise and its ... - KULR-TV - June 20th, 2023 [June 20th, 2023]
- VivaTech: The Secret of Elon Musk's Success? 'Crystal Meth' - The New Stack - June 20th, 2023 [June 20th, 2023]
- AI meets the other AI - POLITICO - POLITICO - June 20th, 2023 [June 20th, 2023]
- Squid Game trailer for real-life reality contest prompts confusion from Netflix users - Yahoo News - June 20th, 2023 [June 20th, 2023]
- Our Future Inside The Fifth Column- Or, What Chatbots Are Really For - Tech Policy Press - June 20th, 2023 [June 20th, 2023]
- Elon Musk refuses to 'censor' Twitter in face of EU rules - Roya News English - June 20th, 2023 [June 20th, 2023]
- AI alignment - Wikipedia - January 4th, 2023 [January 4th, 2023]
- Are We Living In A Simulation? Can We Break Out Of It? - December 28th, 2022 [December 28th, 2022]
- Amazon.com: Superintelligence: Paths, Dangers, Strategies eBook ... - October 13th, 2022 [October 13th, 2022]
- What is Artificial Super Intelligence (ASI)? - GeeksforGeeks - October 13th, 2022 [October 13th, 2022]
- Literature and Religion | Literature and Religion - Patheos - October 13th, 2022 [October 13th, 2022]
- Why AI will never rule the world - Digital Trends - September 27th, 2022 [September 27th, 2022]
- Why DART Is the Most Important Mission Ever Launched to Space - Gizmodo Australia - September 27th, 2022 [September 27th, 2022]
- 'Sweet Home Alabama' turns 20: See how the cast has aged - Wonderwall - September 27th, 2022 [September 27th, 2022]
- Research Shows that Superintelligent AI is Impossible to be Controlled - Analytics India Magazine - September 24th, 2022 [September 24th, 2022]
- Eight best books on AI ethics and bias - INDIAai - September 20th, 2022 [September 20th, 2022]
- AI Art Is Here and the World Is Already Different - New York Magazine - September 20th, 2022 [September 20th, 2022]
- Would "artificial superintelligence" lead to the end of life on Earth ... - September 14th, 2022 [September 14th, 2022]
- The Best Sci-Fi Movies on HBO Max - CNET - September 14th, 2022 [September 14th, 2022]
- Elon Musk Shares a Summer Reading Idea - TheStreet - August 2nd, 2022 [August 2nd, 2022]
- Bullet Train Review: Brad Pitt Even Shines in an Action-Packed Star Vehicle that Goes Nowhere Fast - IndieWire - August 2nd, 2022 [August 2nd, 2022]
- Peter McKnight: The day of sentient AI is coming, and we're not ready - Vancouver Sun - June 20th, 2022 [June 20th, 2022]
- SC judges should have minimum of seven to eight years of judgeship tenure: Justice L Nageswara Rao - The Tribune India - May 25th, 2022 [May 25th, 2022]
- WATCH: Neuralink, mind uploading and the AI apocalypse - Hamilton Spectator - May 3rd, 2022 [May 3rd, 2022]
- Artificial Intelligence And the Human Context of War - The National Interest Online - May 3rd, 2022 [May 3rd, 2022]
- Elon Musk and the Posthumanist Threat | John Waters - First Things - May 3rd, 2022 [May 3rd, 2022]
- Eliminating AI Bias: Human Intelligence is Not the Ultimate Solution - Analytics Insight - April 15th, 2022 [April 15th, 2022]
- If You Want to Succeed With Artificial Intelligence in Marketing, Invest in People - CMSWire - April 15th, 2022 [April 15th, 2022]
- Here Are All The TV Shows And Movies To Watch Featuring The Cast Of "Atlanta" - BuzzFeed - April 15th, 2022 [April 15th, 2022]
- What2Watch: This week's worth-the-watch - Review - Review - March 31st, 2022 [March 31st, 2022]
- Top 10 Algorithms Helping the Superintelligent AI Growth in 2022 - Analytics Insight - March 29th, 2022 [March 29th, 2022]
- AI Ethics Keeps Relentlessly Asking Or Imploring How To Adequately Control AI, Including The Matter Of AI That Drives Self-Driving Cars - Forbes - March 18th, 2022 [March 18th, 2022]
- What to watch next on Showmax - News24 - March 18th, 2022 [March 18th, 2022]
- Does Kimi deliver the goods? Thriller aims to capitalize on Alexa, Siri and Seattles tech cachet - GeekWire - February 17th, 2022 [February 17th, 2022]
- 'Downfall' follows chain of bad decisions that led up to Boeing 737-Max crashes - Lewiston Morning Tribune - February 17th, 2022 [February 17th, 2022]
- Maybe it is Not Too Late to Buy Bitcoin! BTC has a Long Way to Go - Analytics Insight - February 17th, 2022 [February 17th, 2022]
- Giving an AI control of nuclear weapons: What could possibly go wrong? - Bulletin of the Atomic Scientists - February 7th, 2022 [February 7th, 2022]
- Meet the cast of The Afterparty - Radio Times - January 27th, 2022 [January 27th, 2022]
- 8 big threats to human stability and even existence in 2022 - AMEinfo - January 27th, 2022 [January 27th, 2022]
- Artificial Intelligence in Cardiology | AER - January 17th, 2022 [January 17th, 2022]
- Are We Living in a Computer Simulation? Artificial Superintelligence Could Provide the Answer - BBN Times - January 17th, 2022 [January 17th, 2022]
- Movie Review: The 355 - mxdwn.com - January 9th, 2022 [January 9th, 2022]
- AI control problem - Wikipedia - December 10th, 2021 [December 10th, 2021]
- REPORT : Baltic Event Works in Progress 2021 - Cineuropa - December 5th, 2021 [December 5th, 2021]
- Top Books On AI Released In 2021 - Analytics India Magazine - November 27th, 2021 [November 27th, 2021]
- Inside the MIT camp teaching kids to spot bias in code - Popular Science - November 27th, 2021 [November 27th, 2021]
- 7 Types Of Artificial Intelligence - Forbes - November 17th, 2021 [November 17th, 2021]
- The Flash Season 8 Poster Kicks Off Five-Part Armageddon Story Tonight on The CW - TVweb - November 17th, 2021 [November 17th, 2021]
- Nick Bostrom - Wikipedia - November 15th, 2021 [November 15th, 2021]
- Cowboy Bebop; Ein dogs were really spoiled on set - Dog of the Day - November 13th, 2021 [November 13th, 2021]
- Inside the Impact on Marvel of Brian Tyree Henry's Openly Gay Character in 'Eternals' - Black Girl Nerds - November 13th, 2021 [November 13th, 2021]
- The funny formula: Why machine-generated humor is the holy grail of A.I. - Digital Trends - November 13th, 2021 [November 13th, 2021]
- The World's First Decentralized Search Engine for Web3 to Be Launched at the Blockchain Conference in Lisbon - NewsBTC - November 5th, 2021 [November 5th, 2021]