yueliusd07017's personal blog http://blog.sciencenet.cn/u/yueliusd07017


[Repost] Why Most Published Research Findings are False (collection of listening materials)

1446 views | 2024-2-15 09:30 | Personal category: Scientific English | System category: Popular science collection | Source: repost

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195890

'Why Most Published Research Findings are False' Part I

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195891

'Why Most Published Research Findings are False' Part II

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195892

'Why Most Published Research Findings are False' Part III.

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195893

'You should do solid work, that's priority one' Bruce Beutler, Nobel Laureate

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195894

The Statistical Crisis in Science and How to Move Forward by Professor Andrew Gelman

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195895

The Problem of Bad Research!

The Challenges of Evidence-Based Medicine (Part 1).mp4

The Challenges of Evidence-Based Medicine (Part 1)

The Challenges of Evidence-Based Medicine (Part 2).mp4

The Challenges of Evidence-Based Medicine (Part 2)

Can I trust what’s written in scientific journals- Nobel Laureate Tim Hunt.mp4

Can science be objective- - John Ioannidis, Claudia de Rham & Harry Collins.mp4

Is Science Reliable.mp4

English-Chinese bilingual transcript (machine translation)

https://blog.sciencenet.cn/home.php?mod=attachment&id=1195895

The Problem of Bad Research!

In 2005, Stanford medical professor John Ioannidis published an essay titled “Why Most Published Research Findings Are False”, where he showed that the results of many medical research studies could not be replicated by other researchers. This is obviously a problem!

 

A subsequent survey by the science journal Nature showed that more than 70% of researchers have tried and failed to reproduce another scientist's experiments. Not only that, but more than half admit to having failed to reproduce their own experiments.

 

During a decade as head of global cancer research at Amgen, C. Glenn Begley identified 53 “landmark” publications (papers in top journals, from reputable labs) for his team to reproduce. He sought to double-check the findings before trying to build on them for drug development.

 

He found that 47 of the 53 could not be replicated, causing huge problems for those trying to produce new medicines based upon the findings.

So, what might be causing this problem? Well, part way through his project to reproduce these landmark cancer studies, Begley met with the lead scientist of one of the problematic studies. He told the scientist that he had gone through the paper line by line, figure by figure, and redid the experiment 50 times and never got the published result. The scientist told him that they’d done the experiment six times, got the published result once and put it in the paper because it made the best story.
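To put a rough number on the "six tries, one hit" story, here is a minimal sketch of the arithmetic. It assumes things the transcript does not state: that there is no real effect, that each run is independent, and that a run counts as a "hit" when it clears a 5% significance threshold. The function name is made up for illustration.

```python
def chance_of_at_least_one_hit(attempts: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `attempts` independent tests of a
    non-existent effect comes out 'significant' at level `alpha`.

    Assumption: runs are independent and the effect is truly absent.
    """
    return 1.0 - (1.0 - alpha) ** attempts


# One attempt, the lab's six attempts, and Begley's fifty attempts.
for attempts in (1, 6, 50):
    print(attempts, round(chance_of_at_least_one_hit(attempts), 3))
```

Under these assumptions, six attempts give roughly a 26% chance of at least one publishable-looking result from pure noise, which is why reporting only the best run is so misleading.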

 

Such selective publication is just one reason that the scientific literature is peppered with incorrect results. Many blame the hypercompetitive academic environment, as researchers compete for diminishing funding. The surest ticket to getting a grant or a good job is getting published in a high-profile journal, and this can lead a scientist to engage in sensationalism and sometimes even dishonest behavior.

 

Obviously, this is most concerning in the world of medicine, but the same problem can be found in all other areas of research. Incredibly influential and commonly accepted theories have been found in recent years to be false under more rigorous retests.

 

In 2011, Joseph Simmons, a psychologist at the University of Pennsylvania, published a paper in the journal Psychological Science, where he showed that people who listened to the Beatles song "When I'm Sixty-Four" grew younger, by nearly 18 months. The result was obviously ridiculous, but the point the paper made was serious. It showed how standard scientific methods, when abused, could generate scientific support for just about anything.

Scientists have been shocked to discover that what they used to consider reasonable research practices were flawed and likely to generate false positives. This discovery has been labeled the “replication crisis” by the press.

 

Campbell Harvey, a professor of finance at Duke University, argues that at least half of the 400 supposedly market-beating strategies identified in top financial journals over the years are false.

 

“It’s a huge issue,” he told the Financial Times. “Step one in dealing with the replication crisis in finance is to accept that there is a crisis. And right now, many of my colleagues are not there yet.”

Harvey is the former editor of the Journal of Finance, a former president of the American Finance Association, and an adviser to investment firms like Research Affiliates and Man Group. He has written more than 150 papers on finance, several of which have won prestigious prizes. This is not like a child saying that the emperor has no clothes. Harvey’s criticism of the rigor of academic research in finance is more like the emperor himself announcing that he has no clothes.

 

Obviously, the stakes of the replication crisis are much higher in medicine, where people’s health can be at risk, than in the world of finance. But flawed financial research is often pitched to the public, either through the press or by fund management companies looking to raise assets. Bad financial research makes its way into people’s portfolios and can affect their wealth and the comfort of their retirement.

While Ioannidis’s 2005 paper has been criticized over time for its use of dramatic and exaggerated language, most academics do agree with his paper's conclusions and its recommendations. So, let's look at some of the issues that he raised.

 

In statistics, we don’t try to prove that something is definitely true. Instead, we show how unlikely it is that we would have found our test results if the underlying process were random, a process known as rejecting the null hypothesis.

This approach is based on the principle of falsification introduced by the philosopher Karl Popper. According to Popper, we can never prove that something is definitely true; we can only prove that something is false. Statistical hypothesis tests thus never prove a model is correct. They instead show how unlikely it is that we would have gotten our test results if the idea being tested was incorrect.

 

The p-value that we calculate in statistical hypothesis testing is a measure of the evidence against a null hypothesis. The smaller the p-value, the stronger the evidence is that our results are not attributable to randomness. P-values are used to help us decide, in medicine, whether a given drug is actually helpful, or, in finance, whether cheap stocks outperform over time.

 

P-values less than .05 are generally considered significant and worthy of publication. They tell us that, if nothing but randomness were at work, there would be less than a 5% chance of seeing results at least as extreme as ours. This 5% threshold was picked by Ronald Fisher, an important statistician, in a book he published in 1925, as being a reasonable threshold.
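One way to see what a p-value measures is a permutation test: shuffle the group labels many times and count how often chance alone produces a difference at least as large as the observed one. The sketch below is only an illustration; the measurements are made up, and the function name, shuffle count and seed are arbitrary choices, not anything from the video.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_shuffles=5_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the share of random relabelings whose absolute mean difference
    is at least as large as the one actually observed.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend the labels were assigned at random
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    return hits / n_shuffles


# Made-up measurements, for illustration only.
treated = [5.1, 5.8, 6.2, 6.0, 5.9, 6.4, 5.7, 6.1]
control = [5.0, 5.2, 4.9, 5.5, 5.1, 5.3, 4.8, 5.4]
print(permutation_p_value(treated, control))
```

A small result says only that random relabeling rarely reproduces a gap this large. It does not say how likely the underlying idea is to be true.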

The term p-hacking describes the deliberate or accidental manipulation of data in a study until it produces a sufficiently small p-value. It is the misuse of data analysis to find patterns in data that can be presented as statistically significant, thus dramatically increasing and understating the risk of false positives. If you took random data and tested enough hypotheses on it, you would eventually come up with a study that appears to prove something which is actually false.
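The claim about testing enough hypotheses on random data can be checked with a small simulation. This is a sketch under stated assumptions: every "hypothesis" compares two groups of pure Gaussian noise with a known variance of 1, so a simple two-sided z-test applies. The function name, group sizes and seed are illustrative.

```python
import random
from statistics import NormalDist, mean


def count_false_positives(n_hypotheses=200, n_per_group=30, alpha=0.05, seed=1):
    """Run `n_hypotheses` two-sample z-tests on pure noise and count how
    many come out 'significant' at level `alpha`."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of a difference in means when each observation has variance 1.
    standard_error = (2 / n_per_group) ** 0.5
    significant = 0
    for _ in range(n_hypotheses):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (mean(group_a) - mean(group_b)) / standard_error
        p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p-value
        if p_value < alpha:
            significant += 1
    return significant


print(count_false_positives(), "of 200 noise-only tests look significant")
```

On average about 5% of such tests clear the threshold even though no effect exists, so reporting only the winners manufactures findings out of nothing.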

Harvey (the former editor of the Journal of Finance, whom we mentioned earlier) attributes the scourge of p-hacking to incentives in academia. Getting a paper with a sensational finding published in a prestigious journal can earn an ambitious young professor the ultimate academic prize: tenure. Wasting months of work on a theory that does not hold up to scrutiny would frustrate anyone. It is therefore tempting to torture the data until it yields something interesting, even if other researchers are later unable to duplicate the results. And therein lies the problem of incentives: scientists have huge incentives to publish papers, in fact their careers depend on it. As one scientist, Brian Nosek, puts it: "There is no cost to getting things wrong, the cost is not getting things published".

But isn't science supposed to self-correct by having other scientists replicate the findings of an initial discovery? It is a lot less glamorous to just replicate other people’s studies. Scientists want to find their own breakthrough, not check other scientists’ homework. Additionally, many journals don’t publish replication studies. So, if you're a scientist, the successful strategy is clear: don’t waste your time on replication studies, do the kind of work that will get you published, and if you can find a result that is surprising and unusual, maybe it will get picked up in the popular press too.

 

Now, I don't want this to be seen as a negative piece on science or the scientific method, because people are more aware of this problem today than in the past and things have started changing for the better. Many scientists acknowledge the problems I’ve outlined and are starting to take steps to correct them: there are more large-scale replication studies going on, there's a site, Retraction Watch, that publicizes research that has been withdrawn, and there are online databases of unpublished negative results.

There has been a move in many fields towards preregistration of studies, where researchers write up what they plan on studying and the methods they will use. A journal then decides whether to accept it in principle. After the work is completed, reviewers simply check whether the researchers stuck to their own recipe; if so, the paper is published, regardless of what the data show.

 

This eliminates publication bias, promotes higher-powered studies and lessens the incentive for p-hacking.

The thing I find most striking about the replication crisis in academia is not the prevalence of incorrect information in published scientific journals. After all, getting to the truth, we know, is hard, and mathematically not everything that is published can be correct. What gets me is that if we use our best scientific and statistical tools and still make this many mistakes, how frequently do we delude ourselves when we're not using the scientific method? As flawed as our research methods may be, they are significantly more reliable than any other approach that we can use.
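The remark that, mathematically, not everything published can be correct follows from Bayes' rule. The sketch below works through that arithmetic with assumed inputs: the share of tested hypotheses that are actually true, an 80% chance of detecting a true effect (power), and a 5% significance threshold. None of these numbers come from the video; they are illustrative only.

```python
def false_share_of_positives(prior_true: float, power: float, alpha: float = 0.05) -> float:
    """Share of 'significant' findings that are false, by Bayes' rule.

    prior_true: assumed fraction of tested hypotheses that are really true
    power:      assumed probability that a true effect is detected
    alpha:      significance threshold (false positive rate per null test)
    """
    true_positives = prior_true * power
    false_positives = (1 - prior_true) * alpha
    return false_positives / (true_positives + false_positives)


# From well-motivated hypotheses to long shots.
for prior_true in (0.5, 0.1, 0.01):
    print(prior_true, round(false_share_of_positives(prior_true, power=0.8), 3))
```

With these assumed inputs, about 6% of significant results are false when half the tested ideas are true, 36% when one in ten is true, and about 86% when one in a hundred is true, all before any p-hacking or selective publication is taken into account.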

 

Amusingly, around nine years after John Ioannidis wrote his essay “Why Most Published Research Findings Are False”, a team of biostatisticians, Jager and Leek, attempted to replicate his findings and estimated the false positive rate in biomedical studies to be around 14%, not the 50% that Ioannidis had asserted. So, things are possibly not quite as bad as people thought 16 years ago, and science has moved in a positive direction, where researchers are more aware of the mistakes they might make than they were in the past.

 

Today’s video is based on my book Statistics for the Trading Floor, where I conclude with a chapter on common errors in statistical analysis and how to avoid them. There is a link to the book in the video description.

If you enjoyed this video, you should watch my video on chart crimes next.

See you later, bye.




https://blog.sciencenet.cn/blog-3589443-1421697.html
