Building a World Where Data Privacy Exists Online

Dawn Song, an expert in computer security and trustworthy artificial intelligence, is working on making that vision a reality.

Dawn Song, a professor at the University of California, Berkeley, is an expert in computer security and trustworthy artificial intelligence.
Credit: Jason Henry for The New York Times

This article is part of our Women and Leadership special section, which focuses on approaches taken by women, minorities or other disadvantaged groups challenging traditional ways of thinking.

Data is valuable, something that companies like Facebook, Google and Amazon realized far earlier than most consumers did. But computer scientists have been working on alternative models, even as the public has grown weary of having their data used and abused.

Dawn Song, a professor at the University of California, Berkeley, and one of the world’s foremost experts in computer security and trustworthy artificial intelligence, envisions a new paradigm in which people control their data and are compensated for its use by corporations. While there have been many proposals for such a system, Professor Song is one who is actually building the platform to make it a reality.

“As we talk about data as the new oil, it’s particularly important to develop technologies that can utilize data in a privacy-preserving way,” Professor Song said recently from her San Francisco office with an expansive view of the bay.

It is an unlikely trajectory for Professor Song, who grew up in Dalian, China, a seaport in the northeastern province of Liaoning. She is the daughter of two local civil servants with no background in computers. And while she was an exceptional student in high school, she dreamed of being a National Geographic-style nature photographer. One of her teachers, a mentor, gently dissuaded her.

Her mother wanted her to study business and filled out an application on her behalf for a well-known business school. Then, shortly before the national college entrance exams, her mentor intervened again, convincing her mother that a brighter future lay ahead for her daughter in science. Professor Song applied instead to Tsinghua University, China’s top science university, to study physics. She went on to study physics at Cornell University but transferred to Carnegie Mellon University, where she received an M.S. in computer science before settling at Berkeley to finally finish her Ph.D. in computer science. By then, she was focused on computer security.

Professor Song drew attention while still a graduate student at Berkeley with pioneering work that showed a machine-learning algorithm can infer what someone is typing from the timing of their keystrokes picked up by eavesdropping on a network. Since then, she has been at the forefront of trustworthy A.I., including improving the resilience of machine-learning models themselves, the recursive blocks of computer code that learn to recognize patterns in the data they consume.

Professor Song with examples of stop signs that, with a few stickers attached, were able to fool computer-vision systems.
Credit: Jason Henry for The New York Times

Machine-learning models, as amazing as they are at identifying everything from tumors in X-ray images to words in slurred speech, remain disturbingly easy to fool. Professor Song and her students were the first ones to demonstrate that computer-vision systems could be fooled into identifying a stop sign as a 40-miles-per-hour speed limit sign simply by applying a few innocuous stickers to the sign. Examples of these altered traffic signs have been on exhibit at London’s Science Museum.

“Her work on the stop sign was among the first to craft adversarial examples in the physical domain rather than just manipulating image pixels on a computer,” said Battista Biggio, an assistant professor at Italy’s University of Cagliari and one of the first people to study the vulnerabilities of such systems.
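How a small change can flip a model's answer can be illustrated with a toy example. The sketch below is not the technique used in the stop-sign research; it assumes a made-up linear classifier with four input features, and every name and number in it is hypothetical. Each feature is nudged by the same small amount in the direction that lowers the "stop" score, which is enough to change the label.

```python
from typing import List

# Hypothetical weights for a toy linear "sign classifier" (illustration only).
WEIGHTS = [0.5, -0.25, 0.75, -0.5]
BIAS = 0.0


def score(x: List[float]) -> float:
    """Weighted sum of the features; positive means 'stop'."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def predict(x: List[float]) -> str:
    return "stop" if score(x) > 0 else "speed limit"


def sign(value: float) -> float:
    if value > 0:
        return 1.0
    if value < 0:
        return -1.0
    return 0.0


def perturb(x: List[float], epsilon: float) -> List[float]:
    """Shift every feature by at most epsilon, against the sign of its weight.

    For a linear model this lowers the score by epsilon * sum(|w|),
    the largest drop possible under that per-feature limit.
    """
    return [xi - epsilon * sign(w) for w, xi in zip(WEIGHTS, x)]


clean = [0.5, 0.25, 0.5, 0.25]
altered = perturb(clean, epsilon=0.25)

print(predict(clean), score(clean))
print(predict(altered), score(altered))
```

Real computer-vision systems are far more complex than a four-number weighted sum, but the idea carries over: many small, coordinated changes to the input can add up to a large change in the output.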

Professor Song, who has taught at Berkeley for a dozen years, has been working to develop techniques and systems that can provide not only security for computer systems but also privacy. She envisions a world of secure networks where individuals control their personal data and even derive income from it. She compares the world today to a time in human history when people did not have a clear notion of property rights. Once those rights were institutionalized and protected, she notes, it helped revolutionize economies.

She recently started a company, Oasis Labs, that is building a platform that can give people the ability to control their data and audit how it is used. She believes that once data is viewed as property, it can propel the global economy in ways unseen before. “New business models can be built on this,” she said.

Data, of course, is not like a physical object. If a person gives a friend an apple, then someone else cannot have that apple. But data is different, with a property that scientists call nonrivalry. People can give (or sell) as many copies as they want.

Most people give away their data, signing it over to companies by clicking “accept,” not even bothering to read the fine print. Either people online accept the terms and participate in the digital world or they unplug — something that is not really an option for anyone operating in the global economy. Fortunes were built on that data, enriching a handful of entrepreneurs.

“Our data has never been more at risk, and our need for new kinds of robust privacy solutions has never been greater,” said Guy Zyskind, co-founder and chief executive of Enigma, another company building a decentralized private computation protocol.

When people go online, data is collected and stored on centralized servers that are vulnerable to attack. But Professor Song and her colleagues believe that by marrying specialized computer chips and blockchain technology, they can build a system that provides greater scalability and privacy protection.

Some computer chips — those in most cellphones, for example — already incorporate a secure zone, called a trusted execution environment, that protects software from most kinds of attack. Professor Song’s group is working on enhancing the security of those zones by building an open-source secure enclave, Keystone. Within the secure enclave, bits of computer code, called smart contracts, allow data owners to control who has access to their data and how it is used.

“You can actually have the integrity that the blockchain ledger provides and also you can have privacy or confidentiality for the smart contract execution that’s provided by the secure enclave,” said Professor Song, who speaks rapidly as if rushing to keep pace with her thoughts. “No central server ever sees the data.”
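The idea of owner-controlled access with an auditable record can be sketched in a few lines of ordinary code. The example below is only an illustration under stated assumptions: it uses no secure enclave and no blockchain, it is not the Oasis Labs or Keystone interface, and the class and method names (`DataVault`, `grant`, `revoke`, `run`) are hypothetical. It shows an owner approving a requester for a named purpose, every request being logged, and only the result of an analysis leaving the vault.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class DataVault:
    """Toy stand-in for owner-controlled data (no real enclave or ledger)."""

    owner: str
    _records: List[float]
    # Maps each requester to the purposes the owner has approved.
    grants: Dict[str, Set[str]] = field(default_factory=dict)
    # Append-only record of (requester, purpose, allowed) for later audit.
    audit_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def grant(self, requester: str, purpose: str) -> None:
        self.grants.setdefault(requester, set()).add(purpose)

    def revoke(self, requester: str, purpose: str) -> None:
        self.grants.get(requester, set()).discard(purpose)

    def run(self, requester: str, purpose: str,
            analysis: Callable[[List[float]], float]) -> float:
        """Run an approved analysis and return only its result."""
        allowed = purpose in self.grants.get(requester, set())
        self.audit_log.append((requester, purpose, allowed))
        if not allowed:
            raise PermissionError(
                f"{requester} is not approved to use this data for {purpose}"
            )
        # The analysis receives a copy; the caller only gets the result back.
        # In this toy nothing stops a dishonest analysis from leaking values;
        # a real system would rely on hardware and protocol guarantees.
        return analysis(list(self._records))


def average(values: List[float]) -> float:
    return sum(values) / len(values)


vault = DataVault("alice", [1.0, 2.0, 3.0])
vault.grant("research-lab", "medical-research")
print(vault.run("research-lab", "medical-research", average))
print(vault.audit_log)
```

In the system the article describes, the policy check would run as a smart contract inside the secure enclave and the log would live on a blockchain ledger; here both are plain Python objects so the control flow is easy to follow.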

Oasis Labs has been building a platform to support enterprises and developers. It has begun a pilot with Nebula Genomics, a direct-to-consumer gene-sequencing company that offers whole genome sequencing reports on ancestry, wellness and genetic traits with weekly updates. Using Oasis Labs’ privacy-preserving tools, Nebula customers will retain full control and ownership over their genomic data while enabling Nebula to run specific analyses on the data without exposing the underlying information.

Another application, called Kara, a collaboration with Dr. Robert Chang at the Stanford University School of Medicine, gives eye patients the option to share retina scans and other medical data with researchers who use the data to train machine-learning models to recognize disease.

Part of the Kara project is studying what kind of incentives patients will find meaningful in return for contributing their data for medical research.

“Her approach is unique from other data aggregators,” Dr. Chang said. “This project is really asking the important question — who really owns the data?”

Someday, Professor Song believes, people will have an individual revenue stream from their data. It may not be significant on a monthly or even annual basis, but the fees that accumulate over the course of a lifetime from companies using personal data could contribute to retirement savings, for example. Or revenue from groups of people could be used to fund particular causes. The unlocking of data, meanwhile, could lead to improved services for consumers.

“Today, companies are taking users’ data and essentially using it as a product; they monetize it,” Professor Song said. “The world can be very different if this is turned around and users maintain control of the data and get revenue from it.”

Craig S. Smith is a former correspondent for The Times and now hosts the podcast Eye on A.I.