# Differential Privacy

I have written earlier about the loss of privacy and how one can find out anything about anybody using the Internet (see http://blog.sciencenet.cn/blog-1565-243352.html and http://blog.sciencenet.cn/home.php?mod=space&uid=1565&do=blog&id=35816). Nowadays, with “Big Data”, the problem becomes even more acute. Removing personal information from every piece of “big data” every time a query is posed is infeasible. It has also been suggested that the public should be allowed to pose only statistical (not personal) inquiries. However, even if implemented, this is not safe, because of what is called a “differential privacy attack”. In other words, personal data can be gleaned by combining the answers to several statistical inquiries. For example,

Statistical Query 1. How many employees of the company (say, SinoPac) have a criminal record?

Statistical Query 2. How many employees of the company (say, SinoPac), excluding the president, have a criminal record?

The answers to these two seemingly statistical questions can actually reveal personal information about the president of the company, SinoPac: if the two counts differ by one, the president has a criminal record; if they are equal, he or she does not. Now if you further couple this information with other publicly available information about the company, court/bank records, etc., you can find out all kinds of personal information. The possibilities for such differential privacy attacks using “big data” are endless. Thus, the question becomes how to defend against such intrusions.
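The two queries above can be sketched in a few lines of Python. This is only an illustration: the employee records, the names, and the helper `count_with_record` are all made up for the example and do not describe any real company or database.

```python
# Hypothetical records: (name, is_president, has_criminal_record).
# All names and values are invented for illustration.
employees = [
    ("Alice", True, True),    # the president in this toy example
    ("Bob", False, False),
    ("Carol", False, True),
    ("Dave", False, False),
]

def count_with_record(rows, exclude_president=False):
    """Answer a purely statistical query: how many people have a record?"""
    return sum(
        1
        for _name, is_president, has_record in rows
        if has_record and not (exclude_president and is_president)
    )

q1 = count_with_record(employees)                          # Statistical Query 1
q2 = count_with_record(employees, exclude_president=True)  # Statistical Query 2

# Subtracting the two aggregate answers isolates one individual.
president_has_record = (q1 - q2) == 1
print(q1, q2, president_has_record)
```

Neither query mentions the president's record directly, yet the difference between the two answers is exactly that record.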

Foremost among researchers in this area is Dr. Cynthia Dwork (google her), who, among other things, uses the following idea to defend against differential privacy attacks. If you add zero-mean noise when answering statistical questions, the accuracy of your answers is hardly affected when big data are involved, because the noise is small compared with an aggregate over many records. However, individual information, such as the difference between the two answers in the example above, is hidden from view. Mathematically, you can prove that privacy is preserved under well-defined conditions.
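The noise idea can be sketched as follows. The post only says "zero mean noise"; the choice of Laplace noise, the privacy parameter `epsilon`, the fixed random seed, and the helper names `laplace_noise` and `noisy_count` are assumptions made here for illustration, not a description of any particular system.

```python
import random

def laplace_noise(scale, rng):
    """Zero-mean Laplace noise, drawn as the difference of two
    independent exponential variables that each have mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(true_count, epsilon, rng):
    """Return a count with noise added (assumed scheme: Laplace noise).

    Adding or removing one person changes a count by at most 1,
    so the noise scale is taken to be 1 / epsilon.
    """
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(0)   # fixed seed only so the sketch is repeatable
epsilon = 0.5            # assumed value; smaller means more noise

# The two queries from the example, with true answers 2 and 1.
q1_noisy = noisy_count(2, epsilon, rng)
q2_noisy = noisy_count(1, epsilon, rng)
difference = q1_noisy - q2_noisy  # no longer a clean 0 or 1

# The same noise barely matters for a large aggregate.
big_true = 1_000_000
big_noisy = noisy_count(big_true, epsilon, rng)
relative_error = abs(big_noisy - big_true) / big_true
print(difference, relative_error)
```

With these assumed settings the noise is of the same order as the one-person difference an attacker is looking for, while its effect on a count in the millions is negligible.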

The battle between defenders and attackers never ends.

http://blog.sciencenet.cn/blog-1565-866637.html

## 何毓琦

GMT+8, 2019-3-20 05:04