What is differential privacy?

Craig Federighi explained at yesterday's keynote that Apple will use “differential privacy” techniques to keep private and secure the user data it needs to improve and deliver its services. That way, if a government or a third party were to get hold of this data, it should not be able to extract any reliable information about a specific individual:

We believe you should have great features and great privacy. Differential privacy is a research topic in the areas of statistics and data analytics that uses hashing, subsampling and noise injection to enable…crowdsourced learning while keeping the data of individual users completely private. Apple has been doing some super-important work in this area to enable differential privacy to be deployed at scale.

In very simple (and vague) terms: instead of anonymizing a dataset (which does not work, since such datasets are often de-anonymized with little effort), Apple, for example, injects fake data into it, making any single user's responses unreliable. The overall usage patterns still emerge, but the behavior recorded for a specific user may turn out to be bogus.

Wired explains:

Differential privacy, translated from Apple-speak, is the statistical science of trying to learn as much as possible about a group while learning as little as possible about any individual in it. With differential privacy, Apple can collect and store its users’ data in a format that lets it glean useful notions about what people do, say, like and want. But it can’t extract anything about a single, specific one of those people that might represent a privacy violation. And neither, in theory, could hackers or intelligence agencies. […]

As an example of that last method [noise injection], Microsoft’s Dwork points to the technique in which a survey asks if the respondent has ever, say, broken a law. But first, the survey asks them to flip a coin. If the result is tails, they should answer honestly. If the result is heads, they’re instructed to flip the coin again and then answer “yes” for heads or “no” for tails. The resulting random noise can be subtracted from the results with a bit of algebra, and every respondent is protected from punishment if they admitted to lawbreaking.
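
To make the “bit of algebra” concrete, here is a minimal Python sketch of the coin-flip survey described in the quote. It is only an illustration of that textbook technique (usually called randomized response), not Apple's actual implementation, which has not been detailed. The function names (`randomized_response`, `estimate_true_rate`, `simulate`) are made up for this example, and both coins are assumed to be fair. If p is the real fraction of “yes” respondents, the observed fraction of “yes” answers is 0.5 × p + 0.25, so p can be recovered as 2 × (observed − 0.25).

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Return the answer one respondent reports under the coin-flip scheme."""
    # First flip: on tails (probability 0.5) the respondent answers honestly.
    if rng.random() < 0.5:
        return truth
    # On heads, a second flip decides the answer regardless of the truth.
    return rng.random() < 0.5


def estimate_true_rate(answers: list[bool]) -> float:
    """Recover the population-level 'yes' rate from the noisy answers."""
    observed = sum(answers) / len(answers)
    # observed = 0.5 * p + 0.25, hence p = 2 * (observed - 0.25).
    estimate = 2.0 * (observed - 0.25)
    # Sampling noise can push the estimate slightly outside [0, 1].
    return min(1.0, max(0.0, estimate))


def simulate(true_rate: float, n: int, seed: int = 0) -> float:
    """Simulate n respondents and return the estimated 'yes' rate."""
    rng = random.Random(seed)
    truths = [rng.random() < true_rate for _ in range(n)]
    answers = [randomized_response(t, rng) for t in truths]
    return estimate_true_rate(answers)


if __name__ == "__main__":
    # With many respondents the estimate should land close to the true rate.
    print(simulate(true_rate=0.3, n=200_000, seed=1))
```

Any individual “yes” in `answers` could have come from the second coin flip, so no single response proves anything about the person who gave it. The aggregate, however, stays usable: the more respondents there are, the closer the estimate gets to the real rate.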