Machine Learning – harbinger of the future of AI?

Attempts to create Artificial Intelligence that can perform tasks at or beyond the level of a human being have, to date, been limited by AI researchers' tendency to hand-code knowledge into their systems, typically using something like first-order predicate calculus. Recently, the discipline of machine learning has shown that, for some limited problems, you can create an algorithm that learns its own knowledge, and learns it in a formalism suited to making predictions: probability theory.
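To make that contrast concrete, here is a minimal sketch (my own toy example, not any particular system) of the difference between hand-coding a rule and learning one: a few lines of naive Bayes that estimate word probabilities from labelled examples and then let probability theory do the predicting.

```python
# Toy illustration (invented example): instead of hand-coding a rule like
# "messages containing 'winner' are spam", estimate P(word | class) from data
# and let probability theory make the prediction.
from collections import Counter

training = [
    ("winner claim your prize now", "spam"),
    ("meeting moved to thursday", "ham"),
    ("claim your free prize", "spam"),
    ("thursday lunch with the team", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for text, label in training:
    class_counts[label] += 1
    word_counts[label].update(text.split())

def predict(text, alpha=1.0):
    """Naive Bayes with Laplace smoothing: argmax over c of P(c) * prod_w P(w | c)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = class_counts[label] / sum(class_counts.values())
        for w in text.split():
            score *= (word_counts[label][w] + alpha) / (total + alpha * len(vocab))
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("claim your prize"))        # spam
print(predict("team meeting thursday"))   # ham
```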
Why didn't the attempts to hand-code knowledge work? There are several reasons. The most prominent is that humans have the ability to introspect, but that ability is severely limited. Imagine a large cake of cognitive algorithms that we all have but are unaware of, and a thin layer of icing consisting of the pieces of knowledge and algorithms that we are aware of; this seems to be an accurate picture empirically, and a plausible one evolutionarily. The first attempts to create artificial intelligence were bound to fail, because the obvious thing to do was to introspect on your own cognition (and, of course, you can only see the icing) and then code that into a computer. The result is rather like that of the cargo cultists who

"attemped to get cargo to fall by parachute or land in planes or ships again, by imitating the same practices they had seen the soldiers, sailors, and airmen use. They carved headphones from wood and wore them while sitting in fabricated control towers. They waved the landing signals while standing on the runways. They lit signal fires and torches to light up runways and lighthouses. "

In fact, it is worse than that. Computers themselves were invented by men and women trying to formalize the notion of "computation" - a notion derived from the icing they could see in their own heads, and from activities such as adding up a list of numbers for accountancy, following definite algorithms, or evaluating deterministic computations. Those activities were in turn shaped by what humans could easily introspect upon and by what customs humans found socially useful to implement. My hypothesis is that humans found (and still find) it socially useful to have definite, digital laws and procedures, even though very few aspects of the world humans used to inhabit were digital. Consider the custom of paying an exact amount of money for a given product (try making an agreement with your local shop to buy 99p chocolate bars for 99 + Gaussian(0,10) pence, for example), or laws with crisp boundaries (having sex with a person who is 5844 days old is fine, but if s/he is 5843 days old, then you are a criminal). Why do we find such crisp, digital boundaries useful? Because they make it easier to catch people who cheat, and to impose the punishments and rewards that solve the social co-ordination problems humans are beset with. This must have influenced our notion of computation.
Those computing pioneers probably invented a notion of computation that was biased in favour of the icing (their own introspection and the social customs they were immersed in), and missed the cake. Vikash Kumar Mansinghka at MIT has just published a thesis on Natively Probabilistic Computation that might get closer to the "cake" - the invisible mainstay of human cognition:
Probabilistic algorithms and state machines work by massively parallel stochastic walks, rather than carefully coordinated sequences of deterministic steps. We expect them to eventually produce desired outputs in reasonable proportions, rather than perform any given step precisely. This may help us model biological, neural, psychological and social systems, which robustly exhibit reasonable behavior under a wide range of conditions but rarely - if ever - can be made to repeat themselves perfectly.

Our machines will begin to sanity check implausible inputs and sample plausible alternatives rather than blindly follow our instructions. Our interactions will someday be taken as noisy evidence, interpreted with respect to probabilistic programs that model our intent, rather than taken as definite inputs to some deterministic function. This epistemological flexibility, arising from the wiggle room afforded by probability, could potentially allow us to one day build a probabilistic computer that is not well described by the phrase “garbage in, garbage out”.

Probabilistic computation may also provide clues for understanding neural computation and cognitive architecture. We can let go of our focus on calculating probabilities and optimal actions, instead favoring systems that sample good guesses. For example, neural systems may appear noisy because they are trying to solve problems of inference and decision making under uncertainty by sampling. The variability might not be Gaussian error around some linearized set-point, but rather the natural dynamics of a distributed circuit that is robustly hallucinating world states in accordance with a generative probabilistic model and the evidence of the senses.
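As a concrete illustration of "sampling good guesses" rather than calculating probabilities exactly (a toy sketch of my own, not code from the thesis), here is rejection sampling on the standard rain/sprinkler example: draw complete guesses from a generative model and keep only those consistent with the evidence.

```python
# Minimal sketch (my own toy model, not from the thesis): estimate
# P(rain | grass is wet) by sampling guesses from a generative model
# and discarding those that contradict the evidence, instead of
# computing the posterior in closed form.
import random

def sample_world():
    rain = random.random() < 0.2
    sprinkler = random.random() < (0.01 if rain else 0.4)
    wet = random.random() < (0.99 if (rain or sprinkler) else 0.05)
    return rain, wet

accepted, rainy = 0, 0
while accepted < 10_000:
    rain, wet = sample_world()
    if wet:                      # keep only samples consistent with the evidence
        accepted += 1
        rainy += rain

print(f"P(rain | wet) is approximately {rainy / accepted:.2f}")
```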


Mansinghka proposes re-building computation from the ground up, with the basic physical components of computers replaced with components that are optimized for massively parallel stochastic simulation at low accuracy rather than deterministic sequential operations at very high accuracy.
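To give a feel for what such components might do (this is my own simplified sketch of the flavour, not Mansinghka's actual design), here is a small network of binary units in which each unit repeatedly resamples its state from a local conditional probability. No individual step needs to be precise, yet the stochastic walk settles into configurations that are probable under the model.

```python
# A sketch of the flavour of "stochastic circuits" (my own simplification,
# not Mansinghka's hardware): each binary unit resamples its state from a
# local conditional probability, so the network as a whole performs a
# stochastic walk that concentrates on high-probability configurations.
import math, random

# Symmetric couplings between 4 units: positive weight = "prefer to agree".
weights = {(0, 1): 2.0, (1, 2): 2.0, (2, 3): 2.0, (0, 3): -2.0}
def w(i, j):
    return weights.get((i, j), weights.get((j, i), 0.0))

state = [random.choice([0, 1]) for _ in range(4)]

for step in range(5000):
    i = random.randrange(4)                        # pick a unit at random
    field = sum(w(i, j) * state[j] for j in range(4) if j != i)
    p_on = 1.0 / (1.0 + math.exp(-field))          # local stochastic update rule
    state[i] = 1 if random.random() < p_on else 0

# Runs differ, but units joined by positive couplings tend to end up agreeing.
print(state)
```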
Researchers like Shane Legg and Marcus Hutter have proposed the machine learning paradigm as a foundation for general intelligence, and in the final chapter of my thesis I argue that finding structure in data overcomes many problems that the formal logic/hand-coding paradigm is beset with.