Future Goals in the AI Race: Explainable AI and Transfer Learning – Modern Diplomacy

Recent years have seen breakthroughs in neural network technology: computers can now beat any living person at the most complex game invented by humankind, as well as imitate human voices and faces (both real and non-existent) in a deceptively realistic manner. Is this a victory for artificial intelligence over human intelligence? And if not, what else do researchers and developers need to achieve to make the winners in the AI race the kings of the world?

Background

Over the last 60 years, artificial intelligence (AI) has been the subject of much discussion among researchers representing different approaches and schools of thought. One of the crucial reasons for this is that there is no unified definition of what constitutes AI, with differences persisting even now. This means that any objective assessment of the current state and prospects of AI, and its crucial areas of research, in particular, will be intricately linked with the subjective philosophical views of researchers and the practical experience of developers.

In recent years, the term general intelligence, meaning the ability to solve cognitive problems in general terms, adapting to the environment through learning, minimizing risks and optimizing the losses in achieving goals, has gained currency among researchers and developers. This led to the concept of artificial general intelligence (AGI), potentially vested not in a human, but a cybernetic system of sufficient computational power. Many refer to this kind of intelligence as strong AI, as opposed to weak AI, which has become a mundane topic in recent years.

As applied AI technology has developed over the last 60 years, we can see how many practical applications (knowledge bases, expert systems, image recognition systems, prediction systems, tracking and control systems for various technological processes) are no longer viewed as examples of AI and have become part of ordinary technology. The bar for what constitutes AI rises accordingly, and today it is the hypothetical general intelligence, human-level intelligence or strong AI, that is assumed to be the real thing in most discussions. Technologies that are already being used are broken down into knowledge engineering, data science or specific areas of narrow AI that combine elements of different AI approaches with specialized humanities or mathematical disciplines, such as stock market or weather forecasting, speech and text recognition and language processing.

Different schools of research, each working within their own paradigms, also have differing interpretations of the spheres of application, goals, definitions and prospects of AI, and are often dismissive of alternative approaches. However, there has been a kind of synergistic convergence of various approaches in recent years, and researchers and developers are increasingly turning to hybrid models and methodologies, coming up with different combinations.

Since the dawn of AI, two approaches to AI have been the most popular. The first, the symbolic approach, assumes that the roots of AI lie in philosophy, logic and mathematics and that it operates according to logical rules and sign and symbolic systems, interpreted in terms of the conscious human cognitive process. The second approach (biological in nature), referred to as connectionist, neural-network, neuromorphic, associative or subsymbolic, is based on reproducing the physical structures and processes of the human brain identified through neurophysiological research. The two approaches have evolved over 60 years, steadily becoming closer to each other. For instance, logical inference systems based on Boolean algebra have transformed into fuzzy logic or probabilistic programming, reproducing network architectures akin to neural networks that evolved within the neuromorphic approach. On the other hand, methods based on artificial neural networks are very far from reproducing the functions of actual biological neural networks and rely more on mathematical methods from linear algebra and tensor calculus.

Are There Holes in Neural Networks?

In the last decade, it was the connectionist, or subsymbolic, approach that brought about explosive progress in applying machine learning methods to a wide range of tasks. Examples include both traditional statistical methodologies, like logistic regression, and more recent achievements in artificial neural network modelling, like deep learning and reinforcement learning. The most significant breakthrough of the last decade was brought about not so much by new ideas as by the accumulation of a critical mass of tagged datasets, the low cost of storing massive volumes of training samples and, most importantly, the sharp decline of computational costs, including the possibility of using specialized, relatively cheap hardware for neural network modelling. It was the combination of these factors that made it possible to train and configure neural network algorithms to make a quantitative leap, as well as to provide a cost-effective solution to a broad range of applied problems relating to recognition, classification and prediction. The biggest successes here have been brought about by systems based on deep learning networks that build on the idea of the perceptron suggested 60 years ago by Frank Rosenblatt. However, achievements in the use of neural networks also uncovered a range of problems that cannot be solved using existing neural network methods.

First, any classic neural network model, whatever amount of data it is trained on and however precise it is in its predictions, is still a black box that does not provide any explanation of why a given decision was made, let alone disclose the structure and content of the knowledge it has acquired in the course of its training. This rules out the use of neural networks in contexts where explainability is required for legal or security reasons. For example, a decision to refuse a loan or to carry out a dangerous surgical procedure needs to be justified for legal purposes, and in the event that a neural network launches a missile at a civilian plane, the causes of this decision need to be identifiable if we want to correct it and prevent future occurrences.

Second, attempts to understand the nature of modern neural networks have demonstrated their weak ability to generalize. Neural networks remember isolated, often random, details of the samples they were exposed to during training and make decisions based on those details and not on a real general grasp of the object represented in the sample set. For instance, a neural network that was trained to recognize elephants and whales using sets of standard photos will see a stranded whale as an elephant and an elephant splashing around in the surf as a whale. Neural networks are good at remembering situations in similar contexts, but they lack the capacity to understand situations and cannot extrapolate the accumulated knowledge to situations in unusual settings.

Third, neural network models are random, fragmentary and opaque, which allows hackers to find ways of compromising applications based on these models by means of adversarial attacks. For example, a security system trained to identify people in a video stream can be confused when it sees a person in unusually colourful clothing. If this person is shoplifting, the system may not be able to distinguish them from shelves containing equally colourful items. While the brain structures underlying human vision are prone to so-called optical illusions, this problem acquires a more dramatic scale with modern neural networks: there are known cases where replacing an image with noise leads to the recognition of an object that is not there, or replacing one pixel in an image makes the network mistake the object for something else.
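The mechanics of such an attack can be sketched on a toy model. The example below is not taken from any real security system: it assumes a hypothetical eight-pixel linear classifier with made-up weights, and applies a perturbation in the style of the fast gradient sign method, nudging every pixel by a tiny amount in the direction that lowers the score. The names WEIGHTS, IMAGE and adversarial are illustrative only.

```python
def score(weights, pixels, bias=0):
    """Linear score: positive means 'person', otherwise 'background'."""
    return bias + sum(w * p for w, p in zip(weights, pixels))


def classify(weights, pixels):
    return "person" if score(weights, pixels) > 0 else "background"


def sign(value):
    """Return -1, 0 or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)


def adversarial(weights, pixels, epsilon):
    """Shift each pixel by at most epsilon so that the score drops.

    For a linear score the gradient with respect to a pixel is simply its
    weight, so stepping against sign(weight) is the steepest descent
    direction under a per-pixel change limit.
    """
    return [p - epsilon * sign(w) for w, p in zip(weights, pixels)]


# Hypothetical model and input (pixel intensities on a 0..255 scale).
WEIGHTS = [1, -1] * 4
IMAGE = [100, 98] * 4

ATTACKED = adversarial(WEIGHTS, IMAGE, epsilon=2)
LARGEST_CHANGE = max(abs(a - b) for a, b in zip(ATTACKED, IMAGE))

print(classify(WEIGHTS, IMAGE), "->", classify(WEIGHTS, ATTACKED))
print("largest per-pixel change:", LARGEST_CHANGE)
```

No pixel moves by more than 2 out of 255, yet the small shifts add up across all inputs and flip the decision. Real networks are non-linear, but the same accumulation effect is one commonly cited reason why imperceptible changes can mislead them.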

Fourth, the inadequacy of the information capacity and parameters of the neural network to the image of the world it is shown during training and operation can lead to the practical problem of catastrophic forgetting. This is seen when a system that had first been trained to identify situations in a set of contexts and then fine-tuned to recognize them in a new set of contexts may lose the ability to recognize them in the old set. For instance, a neural machine vision system initially trained to recognize pedestrians in an urban environment may be unable to identify dogs and cows in a rural setting, but additional training to recognize cows and dogs can make the model forget how to identify pedestrians, or start confusing them with small roadside trees.
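Catastrophic forgetting can be reproduced even with a two-weight perceptron. The sketch below is a deliberately tiny illustration under stated assumptions: the two "tasks" (URBAN and RURAL) are made-up pairs of two-dimensional points, not real image data, and the classic perceptron update rule stands in for neural network fine-tuning. A single set of weights that solves both tasks does exist, yet training on the tasks one after the other loses the first.

```python
def predict(weights, x):
    """Return +1 or -1 from the sign of the weighted sum."""
    return 1 if weights[0] * x[0] + weights[1] * x[1] > 0 else -1


def train(weights, data, epochs=10):
    """Perceptron rule: on each mistake, move the weights towards y * x."""
    weights = list(weights)
    for _ in range(epochs):
        for x, y in data:
            if predict(weights, x) != y:
                weights[0] += y * x[0]
                weights[1] += y * x[1]
    return weights


def accuracy(weights, data):
    return sum(predict(weights, x) == y for x, y in data) / len(data)


# Hypothetical tasks: (features, label) pairs invented for this example.
URBAN = [((2, 1), 1), ((-2, -1), -1)]
RURAL = [((-4, 1), 1), ((4, -1), -1)]

after_urban = train([0, 0], URBAN)
after_rural = train(after_urban, RURAL)   # fine-tune on the new task only
joint = train([0, 0], URBAN + RURAL)      # train on both tasks together

print("urban accuracy after urban training:", accuracy(after_urban, URBAN))
print("urban accuracy after rural fine-tuning:", accuracy(after_rural, URBAN))
print("joint training, urban and rural:",
      accuracy(joint, URBAN), accuracy(joint, RURAL))
```

After fine-tuning on the second task, the model classifies the second task perfectly and gets every example of the first task wrong, while joint training handles both. The forgetting therefore comes from the sequential training procedure, not from the tasks being incompatible.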

Growth Potential?

The expert community sees a number of fundamental problems that need to be solved before a general, or strong, AI is possible. In particular, as demonstrated by the biggest annual AI conference held in Macao, explainable AI and transfer learning are simply necessary in some cases, such as defence, security, healthcare and finance. Many leading researchers also think that mastering these two areas will be the key to creating a general, or strong, AI.

Explainable AI allows human beings (the users of the AI system) to understand the reasons why a system makes decisions and approve them if they are correct, or rework or fine-tune the system if they are not. This can be achieved by presenting data in an appropriate (explainable) manner or by using methods that allow this knowledge to be extracted with regard to specific precedents or the subject area as a whole. In a broader sense, explainable AI also refers to the capacity of a system to store, or at least present, its knowledge in a human-understandable and human-verifiable form. The latter can be crucial when the cost of an error is too high for it only to be explainable post factum. And here we come to the possibility of extracting knowledge from the system, either to verify it or to feed it into another system.
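As a minimal sketch of what a human-verifiable decision can look like, consider the loan example mentioned earlier. The code below assumes a hypothetical points-based scorecard: the feature names, weights and threshold are invented for illustration and do not describe any real credit model. Because the score is a plain sum, each feature's contribution can be reported alongside the decision. This shows an intrinsically interpretable model, not a method for explaining an existing neural network.

```python
# Hypothetical scorecard: points per unit of each feature (illustrative values).
WEIGHTS = {"income_thousands": 1, "debt_percent": -1, "late_payments": -10}
BIAS = -10  # baseline points before any feature is considered


def explain_decision(applicant):
    """Return the decision, the total score and per-feature contributions.

    Contributions are sorted from most harmful to most helpful so that the
    main reasons for a refusal come first.
    """
    contributions = {name: WEIGHTS[name] * applicant[name] for name in WEIGHTS}
    total = BIAS + sum(contributions.values())
    decision = "approve" if total >= 0 else "refuse"
    ranked = sorted(contributions.items(), key=lambda item: item[1])
    return decision, total, ranked


APPLICANT = {"income_thousands": 60, "debt_percent": 50, "late_payments": 2}
DECISION, TOTAL, REASONS = explain_decision(APPLICANT)

print(DECISION, TOTAL)
for name, points in REASONS:
    print(f"{name}: {points:+d} points")
```

A reviewer can check each line of the output against the applicant's file and see, for instance, that the debt level cost more points than the late payments. That is the kind of justification a black-box model does not provide by itself.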

Transfer learning is the possibility of transferring knowledge between different AI systems, as well as between man and machine so that the knowledge possessed by a human expert or accumulated by an individual system can be fed into a different system for use and fine-tuning. Theoretically speaking, this is necessary because the transfer of knowledge is only fundamentally possible when universal laws and rules can be abstracted from the system's individual experience. Practically speaking, it is the prerequisite for making AI applications that will not learn by trial and error or through the use of a training set, but can be initialized with a base of expert-derived knowledge and rules when the cost of an error is too high or when the training sample is too small.

How to Get the Best of Both Worlds?

There is currently no consensus on how to make an artificial general intelligence that is capable of solving the abovementioned problems or is based on technologies that could solve them.

One of the most promising approaches is probabilistic programming, which is a modern development of symbolic AI. In probabilistic programming, knowledge takes the form of algorithms, and source and target data are represented not by values of variables but by a probability distribution over all possible values. Alexei Potapov, a leading Russian expert on artificial general intelligence, thinks that this area is now in a state that deep learning technology was in about ten years ago, so we can expect breakthroughs in the coming years.
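The idea of data as distributions rather than single values can be sketched in a few lines. The example below is a hand-rolled illustration, not the API of any probabilistic programming language: it assumes a made-up "wet grass" model with invented probabilities, represents each variable as a dictionary mapping values to probabilities, and infers a posterior by enumerating every possible world. Real systems use far more scalable inference methods.

```python
from fractions import Fraction
from itertools import product


def bernoulli(p):
    """A distribution over True/False, stored as value -> probability."""
    return {True: p, False: 1 - p}


def posterior_rain(rain, sprinkler, p_wet_given, observed_wet=True):
    """Infer P(rain | observation) by exhaustive enumeration.

    Every combination of rain and sprinkler is weighted by its prior
    probability and by how well it explains the observation; the weights
    are then normalized so that they sum to one.
    """
    weights = {True: Fraction(0), False: Fraction(0)}
    for (r, p_r), (s, p_s) in product(rain.items(), sprinkler.items()):
        p_wet = p_wet_given[(r, s)]
        likelihood = p_wet if observed_wet else 1 - p_wet
        weights[r] += p_r * p_s * likelihood
    total = sum(weights.values())
    return {value: weight / total for value, weight in weights.items()}


# Hypothetical model parameters, chosen only for illustration.
RAIN = bernoulli(Fraction(1, 5))
SPRINKLER = bernoulli(Fraction(1, 2))
P_WET = {
    (True, True): Fraction(1),
    (True, False): Fraction(9, 10),
    (False, True): Fraction(4, 5),
    (False, False): Fraction(0),
}

POSTERIOR = posterior_rain(RAIN, SPRINKLER, P_WET, observed_wet=True)
print("P(rain | grass is wet) =", POSTERIOR[True])
```

The result is itself a distribution: seeing wet grass raises the probability of rain from 1/5 to 19/51 rather than producing a yes-or-no answer. Since the model is ordinary readable code, the knowledge it encodes can be inspected and corrected directly.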

Another promising symbolic area is Evgenii Vityaev's semantic probabilistic modelling, which makes it possible to build explainable predictive models based on information represented as semantic networks with probabilistic inference based on Pyotr Anokhin's theory of functional systems.

One of the most widely discussed ways to achieve this is through so-called neuro-symbolic integration, an attempt to get the best of both worlds by combining the learning capabilities of subsymbolic deep neural networks (which have already proven their worth) with the explainability of symbolic probabilistic modelling and programming (which hold significant promise). In addition to the technological considerations mentioned above, this area merits close attention from a cognitive psychology standpoint. As viewed by Daniel Kahneman, human thought can be construed as the interaction of two distinct but complementary systems: System 1 thinking is fast, unconscious, intuitive, unexplainable thinking, whereas System 2 thinking is slow, conscious, logical and explainable. System 1 provides for the effective performance of run-of-the-mill tasks and the recognition of familiar situations. In contrast, System 2 processes new information and makes sure we can adapt to new conditions by controlling and adapting the learning process of the first system. Systems of the first kind, as represented by neural networks, are already reaching Gartner's so-called plateau of productivity in a variety of applications. But working applications based on systems of the second kind (not to mention hybrid neuro-symbolic systems, which the most prominent industry players have only started to explore) have yet to be created.

This year, Russian researchers, entrepreneurs and government officials who are interested in developing artificial general intelligence have a unique opportunity to attend the first AGI-2020 international conference in St. Petersburg in late June 2020, where they can learn about all the latest developments in the field from the world's leading experts.

From our partner RIAC
