AI still can't recognise these simple pictures

Look at these black and yellow bars and tell me what you see. Not much, right? Ask state-of-the-art artificial intelligence the same question, however, and it will tell you they're a school bus. It will be over 99 percent certain of this assessment. And it will be totally wrong.

Computers are getting truly, freakishly good at identifying what they're looking at. They can't look at this picture and tell you it's a chihuahua wearing a sombrero, but they can say that it's a dog wearing a hat with a wide brim. A new paper, however, directs our attention to one place these super-smart algorithms are totally stupid. It details how researchers were able to fool cutting-edge deep neural networks using simple, randomly generated imagery. Over and over, the algorithms looked at abstract jumbles of shapes and thought they were seeing parrots, ping pong paddles, bagels, and butterflies.

The findings force us to acknowledge a somewhat obvious but hugely important fact: Computer vision and human vision are nothing alike. And yet, because computer vision increasingly relies on neural networks that teach themselves to see, we're not sure precisely how it differs from our own. As Jeff Clune, one of the researchers who conducted the study, puts it, when it comes to AI, "we can get the results without knowing how we're getting those results."

Evolving Images to Fool AI

One way to find out how these self-trained algorithms get their smarts is to find places where they are dumb. In this case, Clune, along with PhD students Anh Nguyen and Jason Yosinski, set out to see if leading image-recognising neural networks were susceptible to false positives. We know that a computer brain can recognise a koala bear. But could you get it to call something else a koala bear?

To find out, the group generated random imagery using evolutionary algorithms. Essentially, they bred highly effective visual bait. A program would produce an image, and then mutate it slightly. Both the copy and the original were shown to an "off the shelf" neural network trained on ImageNet, a data set of 1.3 million images that has become a go-to resource for training computer vision AI. If the copy was recognised as something -- anything -- in the algorithm's repertoire with more certainty than the original, the researchers would keep it, and repeat the process. Otherwise, they'd go back a step and try again. "Instead of survival of the fittest, it's survival of the prettiest," says Clune. Or, more accurately, survival of the most recognisable to a computer as an African grey parrot.
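The loop described above -- mutate an image, compare the network's confidence on the copy and the original, keep whichever scores higher -- is essentially a hill climber. Here is a minimal runnable sketch of that idea, with a toy confidence function standing in for the ImageNet-trained network; the function, the image representation, and all parameters are illustrative assumptions, not the paper's actual setup:

```python
import random

IMG_SIZE = 8  # a tiny one-dimensional "image" of grayscale pixels in [0, 1]

def confidence(image):
    """Stand-in for the network's confidence in some class.
    (Toy assumption: this fake 'network' likes high-contrast,
    alternating pixels -- think yellow-and-black bus stripes.)"""
    return sum(abs(image[i] - image[i - 1]) for i in range(1, len(image))) / len(image)

def mutate(image, rate=0.1):
    """Copy the image, randomly perturbing some pixels, clamped to [0, 1]."""
    return [min(1.0, max(0.0, p + random.uniform(-0.3, 0.3)))
            if random.random() < rate else p
            for p in image]

def evolve(generations=2000):
    """Keep a mutant only if the 'network' is more confident about it
    than about the current image -- survival of the most recognisable."""
    image = [random.random() for _ in range(IMG_SIZE)]
    for _ in range(generations):
        child = mutate(image)
        if confidence(child) > confidence(image):
            image = child
    return image

fooling_image = evolve()
```

Because the loop only ever accepts improvements, the final image can score arbitrarily high for the classifier while still looking like noise or abstract stripes to a person.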

Eventually, this technique produced dozens of images that were recognised by the neural network with over 99 percent confidence. To you, they won't seem like much. A series of wavy blue and orange lines. A mandala of ovals. Those alternating stripes of yellow and black. But to the AI, they were obvious matches: Starfish. Remote control. School bus.

Peering Inside the Black Box

In some cases, you can start to understand how the AI was fooled. Squint your eyes, and a school bus can look like alternating bands of yellow and black. Similarly, you could see how the randomly generated image that triggered "monarch" would resemble butterfly wings, or how the one that was recognised as "ski mask" does look like an exaggerated human face.

But it gets more complicated. The researchers also found that the AI could routinely be fooled by images of pure static. Using a slightly different evolutionary technique, they generated another set of images. These all look exactly alike -- which is to say, like nothing at all, save maybe a broken TV set. And yet, state-of-the-art neural networks pegged them, with upward of 99 percent certainty, as centipedes, cheetahs, and peacocks.

To Clune, the findings suggest that neural networks develop a variety of visual cues that help them identify objects. These cues might seem familiar to humans, as in the case of the school bus, or they might not. The results with the static-y images suggest that, at least sometimes, these cues can be very granular. Perhaps in training, the network notices that a string of "green pixel, green pixel, purple pixel, green pixel" is common among images of peacocks. When the images generated by Clune and his team happen on that same string, they trigger a "peacock" identification. The researchers were also able to elicit an identification of "lizard" with abstract images that looked nothing alike, suggesting that the networks come up with a handful of these cues for each object, any one of which can be enough to trigger a confident identification.
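Clune's hypothesis -- that a single granular cue, such as a short run of pixels, can by itself trigger a confident label -- can be caricatured in a few lines. Everything here (the cue, the labels, the confidence value) is invented purely for illustration:

```python
# Hypothetical "learned cue": a short contiguous pixel pattern the
# network has come to associate with peacocks.
PEACOCK_CUE = ["green", "green", "purple", "green"]

def contains_cue(pixels, cue):
    """True if the cue appears as a contiguous run in the pixel row."""
    n = len(cue)
    return any(pixels[i:i + n] == cue for i in range(len(pixels) - n + 1))

def classify(pixels):
    """Toy classifier: one matching cue is enough for a confident label."""
    if contains_cue(pixels, PEACOCK_CUE):
        return ("peacock", 0.99)
    return ("unknown", 0.0)

# A row of "static" that happens to contain the cue gets labelled
# a peacock with high confidence, just as in the study's static images.
static = ["red", "green", "green", "purple", "green", "blue"]
print(classify(static))  # -> ("peacock", 0.99)
```

The same mechanism explains the "lizard" result: if the network learns several independent cues per object, wildly different-looking images can each hit a different cue and still produce the same confident identification.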

The fact that we're cooking up elaborate schemes to trick these algorithms points to a broader truth about artificial intelligence today: Even when it works, we don't always know how it works. "These models have become very big and very complicated and they're learning on their own," says Clune, who heads the Evolving Artificial Intelligence Laboratory at the University of Wyoming. "There's millions of neurons and they're all doing their own thing. And we don't have a lot of understanding about how they're accomplishing these amazing feats."
