Philosophy In line with current policy thinking

openPDS/SafeAnswers allows users to collect, store, and give fine-grained access to their data all while protecting their privacy.

With the rise of smartphones, their built-in sensors, and web apps, an increasing amount of personal data is being silently collected. Personal data (digital information about users’ location, calls, web searches, and preferences) is undoubtedly the oil of the new economy. However, the lack of access to the data makes it very hard, if not impossible, for an individual to understand and manage the risks associated with the collected data. Therefore, advancements in using and mining this data have to evolve in parallel with considerations about ownership and privacy.

Many of the initial and critical steps towards individuals’ data ownership are technological. Given the huge number of data sources that a user interacts with on a daily basis, interoperability is not enough. Rather, the user needs to actually own a secured space, a Personal Data Store (PDS), acting as a centralized location where his data live. Owning a PDS would allow the user to view and reason about the data collected. The user can then truly control the flow of data and manage fine-grained authorizations for accessing his data.

Publications and Selected Press

Visuals

openPDS/SafeAnswers privacy settings (.zip)

Our vision

We believe that a New Deal on data is needed. When it comes to data, "ownership" should be thought of according to the old English common law. Data ownership would therefore be defined as the rights of possession, use, and disposal instead of a literal ownership.

Current thinking

Discussions on such changes and their implications for privacy must also take into account the current political and legal context. We developed openPDS to be the reference implementation of the policies proposed by the National Strategy for Trusted Identities in Cyberspace (NSTIC), the Department of Commerce Green Paper, and the Office of the President’s International Strategy for Cyberspace. The openPDS/SafeAnswers implementation is also aligned with the European Commission’s 2012 reform of the data protection rules. This reform states individuals’ right to be forgotten, to have easier access to their data, and to be able to easily transfer them. These recommendations, proposed reforms, and regulations all recognize the increasing need for personal data to be under the control of the individual, as he is the one who can best mitigate the associated risks.

Rules

The system rules and participation agreements address the need for harmonized business, legal, and technical measures to enable distributed and interoperable systems such as openPDS/SafeAnswers. The latest versions of the documents are available on our GitHub repository, where the current research and development on the legal and software code is openly available for public access and re-use.

Architecture and SafeAnswers

Privacy risks

Protecting the privacy of personal data is known to be a hard problem. The recent advances in collecting, storing, and processing high-dimensional data such as call or credit card records at scale make it even harder. The risks associated with these high-dimensional data are often subtle and hard to predict, and anonymizing them is known to be a challenge.

Geospatial data, the second most recorded information by smartphone apps, is probably the best example of the risks and rewards associated with high-dimensional data. On the one hand, the number of users of location-aware services such as Google Local Search, Foursquare, and Glancee is rising quickly as these services demonstrate their benefits to users. On the other hand, a recent study showed that four spatio-temporal points (approximate places and times) are enough to uniquely identify 95% of 1.5M people in a mobility database. The study further shows that these constraints hold even when the resolution of the dataset is low. Therefore, even coarse or blurred datasets provide little anonymity.
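To make the uniqueness argument concrete, the following is a minimal sketch of how one could estimate the fraction of users who are singled out by a handful of known points. It is not the cited study's code: the function name `unicity`, the trace representation (a set of `(place, hour)` pairs per user), and the sampling scheme are all assumptions made for illustration.

```python
import random
from typing import Dict, FrozenSet, Tuple

# Hypothetical representation: one spatio-temporal point is a
# (coarse place label, hour index) pair.
Point = Tuple[str, int]


def unicity(traces: Dict[str, FrozenSet[Point]], p: int, seed: int = 0) -> float:
    """Fraction of users whose trace is the only one containing
    p points drawn at random from their own trace."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    unique = 0
    eligible = 0
    for trace in traces.values():
        if len(trace) < p:
            continue  # not enough points to draw from
        eligible += 1
        # Draw p known points, as an outside observer might learn them.
        known = set(rng.sample(sorted(trace), p))
        # Count every trace in the dataset compatible with those points.
        matches = sum(1 for other in traces.values() if known <= other)
        if matches == 1:
            unique += 1
    return unique / eligible if eligible else 0.0
```

Coarsening the dataset corresponds to mapping places and hours onto larger cells before calling the function; the claim in the text is that the resulting fraction stays high even then.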

Only answers, no raw data

We strongly believe that it will be extremely difficult to anonymize high-dimensional data such as geolocation while retaining the value of the data. Consequently, openPDS turns the problem on its head using an innovative SafeAnswers framework. SafeAnswers allows applications to ask questions that will be answered using the user's personal data. In practice, applications will send code to be run against the data, and the answer will be sent back to them. openPDS/SafeAnswers ships code, not data. openPDS/SafeAnswers turns a very hard anonymization problem into an easier security problem.

SafeAnswers uses two separate layers for aggregating the user’s data: (1) sensitive data processing takes place within the user’s PDS, allowing the dimensionality of the data to be safely reduced on a per-need basis; (2) data can be anonymously aggregated across users, without the need to share sensitive data with an intermediate entity, through a privacy-preserving group computation method.
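The text does not say which group computation method is used. Purely as an illustration of the second layer, the sketch below uses additive secret sharing, a standard technique swapped in here as an assumption: each user splits a private value into random shares, and only sums of shares are ever revealed. The names `make_shares`, `secure_sum`, and `MODULUS` are hypothetical.

```python
import secrets
from typing import List

# Assumed public modulus; all arithmetic is done modulo this value.
MODULUS = 2**61 - 1


def make_shares(value: int, n: int) -> List[int]:
    """Split value into n random shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values: List[int]) -> int:
    """Simulate n users computing the sum of their values.

    share_matrix[i][j] is the share user i would send to user j.
    No single share reveals anything about the value it came from.
    """
    n = len(private_values)
    share_matrix = [make_shares(v, n) for v in private_values]
    # Each user j adds up the shares received and publishes the subtotal.
    subtotals = [
        sum(share_matrix[i][j] for i in range(n)) % MODULUS
        for j in range(n)
    ]
    # Combining the published subtotals yields only the aggregate.
    return sum(subtotals) % MODULUS
```

For example, `secure_sum([3, 5, 10])` returns 18 without any simulated participant seeing another's input. A real deployment would also need authenticated channels and handling for users who drop out, which this sketch omits.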

With SafeAnswers, generic computations on user data are performed in the safe environment of the PDS, under the control of the user: the user does not have to hand data over to receive a service. Only the answers (the summarized data necessary to the app) leave the boundaries of the user’s PDS. Rather than exporting raw accelerometer or GPS data, it could be sufficient for an app to know if you’re active or which general geographic zone you are currently in. Instead of sending raw accelerometer readings or GPS coordinates to the app owner’s server to process, that computation can be done inside the user’s PDS by the corresponding Q&A module.
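The question-and-answer flow described above can be sketched as follows. This is a toy model under stated assumptions, not the openPDS API: the class `PersonalDataStore`, its `answer` method, the `current_zone` question, and the one-degree grid used for zones are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical raw record: (unix_timestamp, latitude, longitude).
GpsPoint = Tuple[float, float, float]


@dataclass
class PersonalDataStore:
    """Toy stand-in for a PDS: holds raw data, runs authorized questions."""
    gps: List[GpsPoint]
    approved: Dict[str, Callable[[List[GpsPoint]], str]]

    def answer(self, question_name: str) -> str:
        # Only questions the user has authorized may run.
        if question_name not in self.approved:
            raise PermissionError(f"question not authorized: {question_name}")
        # The computation runs inside the store; only its result leaves.
        return self.approved[question_name](self.gps)


def current_zone(points: List[GpsPoint]) -> str:
    """Reduce the most recent GPS fix to a coarse one-degree grid cell."""
    if not points:
        return "unknown"
    _, lat, lon = max(points, key=lambda p: p[0])
    return f"zone_{math.floor(lat)}_{math.floor(lon)}"
```

An app would call `pds.answer("current_zone")` and receive a short label, never the coordinate list itself. In a real system the submitted code would also have to be sandboxed and vetted, which is the security problem the text refers to.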

Implementation and preliminary studies

All our code is open-source and available on our GitHub account.

Team and Partners

Yves-Alexandre de Montjoye MIT Media Lab yva@mit.edu
Alex "Sandy" Pentland MIT Media Lab
Erez Shmueli MIT Media Lab
John Clippinger IDcubed
Brian Sweatt MIT Media Lab
Arek Stopczynski DTU/MIT Media Lab
Dazza Greenwood MIT Media Lab
World Economic Forum Partners
Telecom Italia Users
Telefonica Partners
Mobile Territorial Labs Users
Fondazione Bruno Kessler Partners