Tag: datasets

01 Oct 2020
Posted in technology

NIST is crowdsourcing differential privacy techniques for public safety datasets

The National Institute of Standards and Technology (NIST) is launching the Differential Privacy Temporal Map Challenge. It’s a set of contests, with cash prizes attached, that’s intended to crowdsource new ways of handling personally identifiable information (PII) in public safety datasets.

The problem is that although rich, detailed data is valuable for researchers and for building AI models (in this case, in the areas of emergency planning and epidemiology), it raises serious and potentially dangerous privacy and rights issues. Even if datasets are kept under proverbial lock and key, malicious actors can infer sensitive information about individuals from just a few data points.

The solution is to de-identify the data so that it remains useful without compromising individuals’ privacy. NIST already has a clear standard for what that means. In part, it says, “De-identification removes identifying information from a dataset so that individual data cannot be linked with specific individuals.”

The purpose of the Challenge is to find better ways to do that with a technique called differential privacy. Differential privacy adds carefully calibrated random noise to a dataset, or to the results computed from it, so that the output barely changes whether or not any one person’s record is included. It’s widely used in products from companies like Google, Apple, and Nvidia, and lawmakers are leaning on it to inform data privacy policy.
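To make the idea of adding noise concrete, here is a minimal sketch of the Laplace mechanism, one common way to answer a counting question with differential privacy. It is an illustration only, not the method used in the NIST Challenge or by any company named above. The function names, the sample records, and the privacy parameter value (epsilon) are all assumptions made for the example.

```python
import random

def laplace_noise(scale, rng):
    """Sample from Laplace(0, scale).

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one person's record changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    Smaller epsilon means more noise and stronger privacy.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical records, invented for illustration.
records = [{"age": a} for a in (25, 67, 71, 40, 80)]
rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy = private_count(records, lambda r: r["age"] >= 65, epsilon=1.0, rng=rng)
print(f"noisy count of people aged 65+: {noisy:.2f}")
```

The true answer here is 3, but each run with a fresh random source returns a slightly different value near it. Plain floating-point sampling like this is fine for building intuition but is known to be unsafe for real releases, where a vetted differential privacy library is the better choice.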

Specifically, the Challenge focuses on temporal map data, which combines time and location information. The call for the NIST contest says, “Public safety agencies collect extensive data containing time, geographic, and potentially personally identifiable information.” For example, a 911 call could reveal a person’s name, age, gender, address, symptoms or situation, and more. “Temporal map data is of particular interest to the public safety community,” reads the call.
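As a rough picture of what temporal map data looks like once it is aggregated, the sketch below counts incidents per month and per region. The records, region names, and the `temporal_map` helper are hypothetical and are not taken from the Challenge materials. The one design point worth noting is that the grid of cells is fixed in advance, so empty cells show up as explicit zeros rather than being left out.

```python
from collections import Counter
from datetime import datetime
from itertools import product

# Hypothetical incident records: (timestamp, region). Direct identifiers
# such as name or address are deliberately not carried into the map.
events = [
    (datetime(2020, 1, 3, 14, 5), "north"),
    (datetime(2020, 1, 17, 9, 30), "north"),
    (datetime(2020, 1, 21, 22, 10), "south"),
    (datetime(2020, 2, 2, 8, 45), "south"),
]

def temporal_map(events, months, regions):
    """Count events in every (month, region) cell of a fixed grid.

    Events that fall outside the listed months or regions are ignored.
    """
    counts = Counter((ts.strftime("%Y-%m"), region) for ts, region in events)
    # Enumerate the full grid so that empty cells appear as zero.
    return {cell: counts.get(cell, 0) for cell in product(months, regions)}

grid = temporal_map(events, ["2020-01", "2020-02"], ["north", "south"])
for cell, n in sorted(grid.items()):
    print(cell, n)
```

A privacy-preserving release would then perturb or otherwise protect every cell in this grid, including the zeros, since the mere presence or absence of a cell can reveal that an incident happened in a given place and month.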

The Differential Privacy Temporal Map Challenge stands on the shoulders of similar previous NIST differential privacy Challenges — one centered on