大工至善|大学至真分享 http://blog.sciencenet.cn/u/lcj2212916


[Repost] [Open Machine Learning Projects] Open Research Projects Related to Personal Privacy

2019-2-3 11:59 | Category: Research notes | Source: Reposted

A recent paper shows how decision trees can be approximated in the "secure multiparty" setting using cryptographic tools.


This allows several parties to compute a decision tree on the union of their data without sharing the data itself with each other.


Project Idea 1: Differentially Private Decision Trees


See whether it is possible to implement a decision tree learner in a differentially private way.


This would entail creating a randomized algorithm that outputs a decision tree.


Furthermore, when a single element of the input data is changed, the distribution over the outputs should not change by much (cf. the definition of differential privacy).
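For reference, the standard definition of ε-differential privacy can be written as below. The symbols are notation introduced here and do not appear in the post: A is the randomized algorithm, D and D' are any two datasets that differ in a single element, S is any set of possible outputs, and ε ≥ 0 is the privacy parameter.

```latex
\Pr[\mathcal{A}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{A}(D') \in S]
\qquad \text{for all } S \subseteq \operatorname{Range}(\mathcal{A})
\text{ and all } D, D' \text{ differing in one element.}
```

A smaller ε forces the two output distributions closer together, which is the formal version of "should not change by much".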


Analyze the conditions under which the approach works, and analyze its error rate relative to that of the non-private decision tree.
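One possible starting point, sketched here under simplifying assumptions rather than as a complete learner, is a depth-one tree (a stump) over binary features and binary labels. The split feature is chosen with the exponential mechanism and the leaf counts are released with Laplace noise, with the privacy budget divided equally between the two steps. All function names are hypothetical.

```python
import math
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def split_counts(rows, labels, feature):
    # counts[branch][label] for a binary feature and binary labels.
    counts = [[0, 0], [0, 0]]
    for x, y in zip(rows, labels):
        counts[x[feature]][y] += 1
    return counts


def utility(counts):
    # Number of training points classified correctly when each branch
    # predicts its majority label. Replacing one record changes this
    # value by at most 1, so its sensitivity is 1.
    return sum(max(branch) for branch in counts)


def private_stump(rows, labels, epsilon, rng):
    n_features = len(rows[0])
    eps_split = eps_leaf = epsilon / 2.0  # sequential composition

    # Exponential mechanism: P(feature) is proportional to
    # exp(eps_split * utility / (2 * sensitivity)).
    scores = [utility(split_counts(rows, labels, f)) for f in range(n_features)]
    top = max(scores)  # subtracting the max avoids overflow in exp
    weights = [math.exp(eps_split * (s - top) / 2.0) for s in scores]
    feature = rng.choices(range(n_features), weights=weights, k=1)[0]

    # Laplace mechanism on the four (branch, label) counts. Replacing one
    # record changes two cells by 1 each, so the L1 sensitivity is 2.
    counts = split_counts(rows, labels, feature)
    noisy = [[c + laplace(2.0 / eps_leaf, rng) for c in branch] for branch in counts]
    prediction = [0 if branch[0] >= branch[1] else 1 for branch in noisy]
    return feature, prediction
```

Comparing the stump returned for different values of `epsilon` against the non-private choice (the feature with the highest utility) gives a first empirical view of the error-rate question raised above.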


Project Idea 2: Secure Multiparty EM


Implement EM for some kind of mixture model in a secure multiparty way (i.e., using cryptography to protect the intermediate quantities).


This has already been studied in the context of imputation.


Although existing cryptographic primitives can, in principle, be used to compute any function, obtaining a reasonably efficient algorithm will require approximations that are less expensive to compute.


Analyze the effects of any such approximations you make.
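As a rough illustration of protecting intermediate quantities, the sketch below simulates a secure sum based on additive secret sharing and uses it for the mean update of one mixture component. It assumes the data is split by rows across parties, that the parties are honest but curious, and that revealing the aggregated sums (but not any party's own contribution) is acceptable. The modulus, the fixed-point scale, and all function names are assumptions made for this example, and everything runs in a single process instead of over a network.

```python
import random

PRIME = 2**61 - 1  # modulus for the shares (assumed larger than any sum)
SCALE = 10**6      # fixed-point scale for real-valued statistics


def encode(x):
    # Map a real number to a field element using fixed-point rounding.
    return round(x * SCALE) % PRIME


def decode(v):
    # Map a field element back to a signed real number.
    if v > PRIME // 2:
        v -= PRIME
    return v / SCALE


def share(secret, n_parties, rng):
    # Split `secret` into n_parties random shares that sum to it mod PRIME.
    shares = [rng.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def secure_sum(local_values, rng):
    n = len(local_values)
    # Party i splits its value and sends share j to party j.
    all_shares = [share(encode(v), n, rng) for v in local_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partials = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    return decode(sum(partials) % PRIME)


def secure_mstep_mean(party_points, party_resps, rng):
    # Each party computes its local sufficient statistics for one component:
    # the responsibility-weighted sum of points and the sum of responsibilities.
    nums = [sum(r * x for x, r in zip(xs, rs))
            for xs, rs in zip(party_points, party_resps)]
    dens = [sum(rs) for rs in party_resps]
    return secure_sum(nums, rng) / secure_sum(dens, rng)
```

In a real deployment the shares would need a cryptographically secure source of randomness, for example `random.SystemRandom()`, which exposes the same `randrange` method. The fixed-point rounding in `encode` is one instance of the kind of approximation whose effect would need to be analyzed.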


Project Idea 3: Differentially Private Sparse Covariance Estimation


The estimation of Gaussian covariance matrices has received ample attention lately.


When the covariance matrix is sparse (i.e., most off-diagonal elements are 0), there is a nice correspondence between the multivariate Gaussian and a certain type of undirected graphical model.


See whether it is possible to estimate sparse covariance matrices in a way that satisfies the definition of differential privacy described above.


Analyze how much error the algorithm incurs compared with the non-private one.
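A simple baseline to compare against, sketched under stated assumptions and not proposed as the intended solution, is to perturb the empirical second-moment matrix with Laplace noise and then hard-threshold small off-diagonal entries. The sketch assumes zero-mean data (so the second-moment matrix stands in for the covariance) and clips every value to [-1, 1] so that the sensitivity is bounded. The function names and the `threshold` parameter are hypothetical.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def clip(v, bound=1.0):
    return max(-bound, min(bound, v))


def private_sparse_second_moment(rows, epsilon, threshold, rng):
    n, d = len(rows), len(rows[0])
    data = [[clip(v) for v in row] for row in rows]

    # The upper triangle of (1/n) X^T X has d(d+1)/2 entries. With values
    # in [-1, 1], replacing one row changes each entry by at most 2/n,
    # so the L1 sensitivity is at most d(d+1)/n.
    scale = d * (d + 1) / (n * epsilon)

    est = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            value = sum(r[i] * r[j] for r in data) / n + laplace(scale, rng)
            # Thresholding is post-processing of the noisy value,
            # so it does not weaken the privacy guarantee.
            if i != j and abs(value) < threshold:
                value = 0.0
            est[i][j] = est[j][i] = value
    return est
```

Because the noise scale grows with d² / (n · ε), this baseline makes the error question concrete: the noise can swamp small true entries unless n is large relative to d² / ε.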


Privacy Preserving Data Mining

http://www.pinkas.net/PAPERS/id3-final.pdf 


Differentially Private Empirical Risk Minimization

https://arxiv.org/abs/0912.0071


Learning in a Large Function Space: Privacy-Preserving Mechanisms for SVM Learning

https://arxiv.org/abs/0911.5708


Privacy-Preserving Data Imputation

https://www.cs.rutgers.edu/~rwright1/Publications/padm06.pdf


Open research datasets

http://archive.ics.uci.edu/ml/datasets.html




https://blog.sciencenet.cn/blog-69686-1160559.html
