However, this assumes that you can get hold of that training data, says Kautz. He and his colleagues at Nvidia have come up with a different way to expose private data, including images of faces and other objects, medical data, and more, that doesn’t require access to the training data at all.
Instead, they developed an algorithm that can re-create the data a trained model has been exposed to by reversing the steps the model goes through in processing that data. Take a trained image recognition network: to identify what’s in an image, the network passes it through a series of layers of artificial neurons. Each layer extracts a different level of information, from edges to shapes to more recognizable features.
Kautz’s team found that they could interrupt a model in the middle of these steps and reverse its direction, re-creating the input image from the model’s internal data. They tested the technique on a variety of popular image recognition models and GANs. In one test, they showed that they could accurately reproduce images from ImageNet, one of the best-known image recognition data sets.
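To make the idea concrete, here is a minimal sketch of that kind of reversal. It is not the Nvidia team’s actual algorithm (the specific model, layer choice, and optimization details below are illustrative assumptions): it simply pauses a network partway through and optimizes a guessed image until its intermediate activations match the ones the real input produced.

```python
# Sketch of reconstructing an input image from a model's intermediate activations.
# NOT the Nvidia team's exact method; model, layers, and hyperparameters are
# illustrative assumptions meant only to show the general idea.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

# "Pause" the model in the middle of its steps: keep only the early layers.
partial = nn.Sequential(
    model.conv1, model.bn1, model.relu, model.maxpool,
    model.layer1, model.layer2,
)

target_image = torch.rand(1, 3, 224, 224)      # stands in for a private input
with torch.no_grad():
    target_feats = partial(target_image)       # internal data the attacker sees

# Reverse direction: optimize a guess until its activations match the target's.
guess = torch.rand(1, 3, 224, 224, requires_grad=True)
optimizer = torch.optim.Adam([guess], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    loss = F.mse_loss(partial(guess), target_feats)
    loss.backward()
    optimizer.step()
    guess.data.clamp_(0, 1)                    # keep pixel values in a valid range
```

In practice, work along these lines adds stronger image priors so the reconstruction looks natural rather than noisy, but the core loop of matching internal activations is the same.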
As with Webster’s work, the re-created images closely resemble the real ones. “We were surprised by the final quality,” says Kautz.
The researchers argue that this kind of attack is not merely hypothetical. Smartphones and other small devices are increasingly using AI. Because of battery and memory constraints, models are sometimes only half-run on the device itself, with the partially processed data sent to the cloud for the final number crunching, an approach known as split computing. Most researchers assume that split computing won’t reveal any private data from a person’s phone, because only the model’s internal data is shared rather than the raw images, says Kautz. But his attack shows that this isn’t the case.
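The sketch below shows why, under the same illustrative assumptions as before (the split point and model are hypothetical): in split computing, the raw photo never leaves the phone, but the intermediate activations do, and those are exactly what the reconstruction attack works on.

```python
# Sketch of split computing: the device runs the first part of the model and
# ships only intermediate activations (not the raw photo) to the cloud.
# The model and split point are illustrative assumptions.
import torch
import torch.nn as nn
from torchvision import models

model = models.mobilenet_v3_small(weights=None).eval()

on_device = model.features[:6]                       # runs on the phone
in_cloud = nn.Sequential(model.features[6:], model.avgpool,
                         nn.Flatten(1), model.classifier)

photo = torch.rand(1, 3, 224, 224)                   # private image, stays on the phone

with torch.no_grad():
    activations = on_device(photo)                   # this is all the cloud receives
    prediction = in_cloud(activations)

# An adversary on the cloud side could attempt to reconstruct `photo`
# from `activations`, as in the inversion sketch above.
```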
Kautz and his colleagues are now working on ways to prevent models from leaking private data. “We wanted to understand the risks in order to minimize vulnerabilities,” he says.
Although they use very different techniques, he thinks that his work and Webster’s complement each other well. Webster’s team showed that private data can be found in the output of a model; Kautz’s team showed that private data can be revealed by going the other way, re-creating the input. “Exploring both directions is important to come to a better understanding of how to prevent attacks,” says Kautz.