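Wu Yishan's Sharing http://blog.sciencenet.cn/u/Wuyishan Research fellow, Chinese Academy of Science and Technology for Development; doctoral supervisor, Department of Information Management, Nanjing University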


A Discussion with a Fellow Blogger (2014)

Posted 2024-1-23 08:08 | Personal category: Scientometrics Research | System category: Commentary

A Discussion with a Fellow Blogger

Wu Yishan

20140910

XX, impressive! These three spots are exactly where I hesitated.

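1. The original word is "error", but as I see it, measurements have error in the sense of deviation (误差), whereas data contain mistakes (错误). A typical example: Taiwan, China has a university named 东吴大学, whose English name is Soochow University, the same English name used by 苏州大学 (Soochow University in Suzhou). We at the Institute of Scientific and Technical Information of China found, in our paper statistics of earlier years, that the papers the SCI database counted for 苏州大学 had included quite a few papers from 东吴大学, because the people processing the database did not realize that these are in fact two different universities. That is a mistake of misattribution, not a measurement error.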

2. The rendering "统计功效" (statistical power): accepted.

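3. Data Snooping: I have not found a unified definition. When it is translated as "数据探测" (data probing), the term clearly carries no pejorative sense, whereas this editorial takes a negative view of data snooping. I suspect the editorial's author accepts the following definition (the bold emphasis is mine):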

WHITE, H., 2000. A Reality Check for Data Snooping. Econometrica. [Cited by 193] (34.80/year)

"Data snooping occurs when a given set of data is used more than once for purposes of inference or model selection. When such data reuse occurs, there is always the possibility that any satisfactory results obtained may simply be due to chance rather than to any merit inherent in the method yielding the results. This problem is practically unavoidable in the analysis of time-series data, as typically only a single history measuring a given phenomenon of interest is available for analysis. It is widely acknowledged by empirical researchers that data snooping is a dangerous practice to be avoided, but in fact it is endemic. The main problem has been a lack of sufficiently simple practical methods capable of assessing the potential dangers of data snooping in a given situation. Our purpose here is to provide such methods by specifying a straightforward procedure for testing the null hypothesis that the best model encountered in a specification search has no predictive superiority over a given benchmark model. This permits data snooping to be undertaken with some degree of confidence that one will not mistake results that could have been generated by chance for genuinely good results."
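To make the danger described in the quoted abstract concrete, here is a minimal sketch in Python. It is only an illustration in the spirit of White's test, not the procedure from the paper: the data are simulated noise, the function names are hypothetical, and the bootstrap resamples time indices independently, which is a simplifying assumption of this sketch. It searches 40 models that have no true superiority over the benchmark, picks the best one, and compares a naive p-value (which ignores the search) with a p-value based on the maximum statistic across all models.

```python
import math
import random
import statistics


def simulate_performance(n_obs, n_models, rng):
    """Per-period performance of each model relative to a benchmark.

    Every series is pure noise, so no model has any true superiority.
    """
    return [[rng.gauss(0.0, 1.0) for _ in range(n_obs)] for _ in range(n_models)]


def naive_p_value(series):
    """One-sided p-value for mean > 0, ignoring that the series was selected."""
    n = len(series)
    t_stat = statistics.fmean(series) / (statistics.stdev(series) / math.sqrt(n))
    return 0.5 * math.erfc(t_stat / math.sqrt(2.0))  # normal approximation


def max_stat_p_value(perf, n_boot, rng):
    """Bootstrap p-value for the best model, accounting for the whole search.

    Assumption of this sketch: time indices are resampled independently.
    """
    n = len(perf[0])
    means = [statistics.fmean(s) for s in perf]
    observed = math.sqrt(n) * max(means)
    exceed = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        # Maximum over models of the recentred bootstrap mean.
        boot_max = max(
            math.sqrt(n) * (sum(s[i] for i in idx) / n - m)
            for s, m in zip(perf, means)
        )
        if boot_max >= observed:
            exceed += 1
    return exceed / n_boot


rng = random.Random(0)
perf = simulate_performance(n_obs=200, n_models=40, rng=rng)
best = max(perf, key=statistics.fmean)  # the "winner" of the specification search
p_naive = naive_p_value(best)
p_adjusted = max_stat_p_value(perf, n_boot=200, rng=rng)
print(f"naive p-value of best model: {p_naive:.3f}")
print(f"search-adjusted p-value:     {p_adjusted:.3f}")
```

Because the best of many noise models is selected after looking at the data, the naive p-value tends to look much smaller than the search-adjusted one; the exact numbers depend on the seed and the simulated data.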



https://blog.sciencenet.cn/blog-1557-1418997.html
