
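[Repost] The mass production of junk papers causes more problems than mere false prosperity (Scientific English, English-Chinese parallel text)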

1718 reads | 2024-2-7 04:18 | Personal category: Scientific English | System category: Popular science collection | Source: repost

Significance

The size of scientific fields may impede the rise of new ideas. Examining 1.8 billion citations among 90 million papers across 241 subjects, we find a deluge of papers does not lead to turnover of central ideas in a field, but rather to ossification of canon. Scholars in fields where many papers are published annually face difficulty getting published, read, and cited unless their work references already widely cited articles. New papers containing potentially important contributions cannot garner field-wide attention through gradual processes of diffusion. These findings suggest fundamental progress may be stymied if quantitative growth of scientific endeavors—in number of scientists, institutes, and papers—is not balanced by structures fostering disruptive scholarship and focusing attention on novel ideas.

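The size of a scientific field may impede the rise of new ideas. Examining 1.8 billion citations among 90 million papers across 241 subjects, we find that a flood of papers in any field leads not to the production of new ideas but to dogma and ossification. Scholars in fields where many papers are published each year have difficulty getting published, read, and cited unless their work references already widely cited articles. Wide citation reflects consensus rather than original new ideas. Against a background of large numbers of junk papers, papers with potentially important new contributions cannot diffuse and spread smoothly, and so cannot gain field-wide attention. These findings indicate that a "Great Leap Forward" in research does not benefit scientific progress. If the quantitative growth of the scientific enterprise (in the number of scientists, institutes, and papers) is not balanced by structures that foster disruptive scholarship and focus attention on novel ideas, fundamental progress may be hindered. The essence of science is to produce new ideas.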

Reference:

https://www.pnas.org/doi/full/10.1073/pnas.2021636118


Slowed canonical progress in large fields of science

The result of a research "Great Leap Forward" is to impede scientific progress

Johan S. G. Chu and James A. Evans https://orcid.org/0000-0002-3669-0088

October 4, 2021

https://doi.org/10.1073/pnas.2021636118

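Further reading:

https://www.163.com/dy/article/FEE1RTDF05419EOY.html

Historian Li Bozhong: Any kind of "academic junk" is harmful, and all of it is a public nuisance

https://user.guancha.cn/main/content?id=73665

Li Bozhong: Rather than singing its own praises, Chinese research should start by clearing out academic junk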

https://www.zhihu.com/question/558204287

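Excerpts:

“Why do more and more researchers now say that what they produce is "academic junk"? Is there ever a moment when they feel their research has value?

The reason for raising this question is that a student taking our classes told us in private conversation that she felt everything she had produced so far was "academic junk." Not only she, but many of her labmates and other peers in academia think the same.

Yuhang Liu: In an interview at Tsinghua last month, Mr. Qiu (丘先生) said bluntly that most mathematics papers are junk. He was not referring to crank papers, but to academic papers published in proper journals.

...

Han Dongran: Because the current academic evaluation system, or rather our graduate training system, is basically built on a competition to pile up academic junk. Students must have papers to graduate, even if they have produced no real research results. Without papers they cannot graduate; if they cannot graduate, the university and the advisor cannot answer for it; and so all kinds of paper-padding begin.

...

Han Dongran: If one's research really is junk, yet there is a moment when one feels it has value, that is basically the start of self-deception, or even of academic hallucination.

...

Li Qingyan: The current situation is more like the production of an "academic bubble." Individuals and institutions alike chase inflated numbers too hard, and those immersed in it have no time to reflect.”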

------------------

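Yuluo (雨落): "During my PhD I was actually always puzzled by one thing:

why can so many things that I think are of little use, but that look "very novel," get published in very high-IF journals (with big names backing them, of course).

Meanwhile some things I think are very useful, such as the refractive index and extinction coefficient of various materials in different wavebands, or integrating absorbers for different wavelengths on the same substrate (which I think would be very useful for imaging sensors), have basically no impact."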

---------------------

Alex Julius: "The vast majority of results make it neither onto the store shelf nor onto the bookshelf; put bluntly, they are academic junk."

https://xzx.shnu.edu.cn/a5/33/c18434a501043/page.htm

Zhan Dan of Shanghai Normal University: "Academic junk" is also a public nuisance

=======

Full text of the paper (originally posted as an English-Chinese parallel text with machine translation)

=======

Abstract

In many academic fields, the number of papers published each year has increased significantly over time. Policy measures aim to increase the quantity of scientists, research funding, and scientific output, which is measured by the number of papers produced. These quantitative metrics determine the career trajectories of scholars and evaluations of academic departments, institutions, and nations. Whether and how these increases in the numbers of scientists and papers translate into advances in knowledge is unclear, however. Here, we first lay out a theoretical argument for why too many papers published each year in a field can lead to stagnation rather than advance. The deluge of new papers may deprive reviewers and readers of the cognitive slack required to fully recognize and understand novel ideas. Competition among many new ideas may prevent the gradual accumulation of focused attention on a promising new idea. Then, we show data supporting the predictions of this theory. When the number of papers published per year in a scientific field grows large, citations flow disproportionately to already well-cited papers; the list of most-cited papers ossifies; new papers are unlikely to ever become highly cited, and when they do, it is not through a gradual, cumulative process of attention gathering; and newly published papers become unlikely to disrupt existing work. These findings suggest that the progress of large scientific fields may be slowed, trapped in existing canon. Policy measures shifting how scientific work is produced, disseminated, consumed, and rewarded may be called for to push fields into new, more fertile areas of study.

A straightforward view of scientific progress would suggest more is better. The more papers published in a field, the greater the rate of scientific progress; the more researchers, the more ground covered. Even if not every article is earth shaking in its impact, each can contribute a metaphorical grain of sand to the sandpile, increasing the probability of an avalanche, wherein the scientific landscape is reconfigured and new paradigms arise to structure inquiry (1, 2). The publication of more papers also increases the probability at least one of them contains an important innovation. A disruptive new idea can destabilize the status quo, siphoning attention from previous work and garnering the lion’s share of new citations (3, 4).

Policy reflects this more-is-better view. Scholars are evaluated and rewarded on productivity. Publishing many articles within a set period of time is the surest path to tenure and promotion. Quantity remains the measuring stick at the university (5) and the national levels (6), where comparisons focus on the total number of publications, patents, scientists, and dollars spent.

“Quality” is also predominantly judged quantitatively. Citation counts are used to measure the importance of individuals (7), teams (8), and journals (9) within a field. At the paper level, the assumption is that the best and most valuable papers will attract more attention, shaping the research trajectory of the field (10).

Here, however, we predict that when the number of papers published each year grows very large, the rapid flow of new papers can force scholarly attention to already well-cited papers and limit attention for less-established papers, even those with novel, useful, and potentially transformative ideas. Rather than causing faster turnover of field paradigms, a deluge of new publications entrenches top-cited papers, precluding new work from rising into the most-cited, commonly known canon of the field.

These arguments, supported by our empirical analysis, suggest that the scientific enterprise’s focus on quantity may obstruct fundamental progress. This detrimental effect will intensify as the annual mass of publications in each field continues to grow, which is almost inevitable given the entrenched, interlocking structures motivating publication quantity. Policy measures restructuring the scientific production value chain may be required to allow mass attention to concentrate on promising, novel ideas.

This study focuses on the effects of field size: The number of papers published in a field in a given year. Previous studies have found that citation inequality is increasing across a range of disciplines (11), at least partially driven by processes of preferential attachment (12, 13). Papers do not always maintain their citation levels and rankings over the years, however. Disruptive papers can eclipse prior work (4) and natural fluctuations in citation numbers can upset rankings (14). We predict that when fields are large, the dynamics change. The most-cited papers become entrenched, garnering disproportionate shares of future citations. New papers cannot rise into canon by amassing citations through processes of preferential attachment. Newly published papers rarely disrupt established scholarship.

Two mechanisms underlie these predictions (15). First, when many papers are published within a short period of time, scholars are forced to resort to heuristics to make continued sense of the field. Rather than encountering and considering intriguing new ideas each on their own merits, cognitively overloaded reviewers and readers process new work only in relationship to existing exemplars (16–18). A novel idea that does not fit within extant schemas will be less likely to be published, read, or cited. Faced with this dynamic, authors are pushed to frame their work firmly in relationship to well-known papers, which serve as “intellectual badges” (19) identifying how the new work is to be understood, and discouraged from working on too-novel ideas that cannot be easily related to existing canon. The probabilities of a breakthrough novel idea being produced, published, and widely read all decline, and indeed, the publication of each new paper adds disproportionately to the citations for the already most-cited papers.

Second, if the arrival rate of new ideas is too fast, competition among new ideas may prevent any of the new ideas from becoming known and accepted field wide. To see why this is so, consider a sandpile model of idea spread in a field. When sand is dropped on a sandpile slowly, one grain at a time, waiting for movement on the sandpile to stop before dropping the next grain, the sandpile over time reaches a scale-free critical state wherein one dropped grain of sand can trigger an avalanche over the whole area of the pile (2). But when sand is dropped at a rapid rate, neighboring miniavalanches interfere with each other, and no individual grain of sand can trigger pile-wide shifts (20). The faster the rate of sand dropping, the smaller the domain each new grain of sand can affect. If the arrival rate of papers is too fast, no new paper can rise into canon through localized processes of diffusion and preferential attachment.
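The slow-driving half of the sandpile analogy can be tried out with a small simulation. The sketch below is only an illustration and not anything from the paper: it assumes the standard grid sandpile rule (a cell holding 4 or more grains topples and passes one grain to each neighbor, with grains falling off the edges), and the names `drop_grain` and `slow_drive` are hypothetical. It does not model the fast-dropping regime described in the text.

```python
import random


def drop_grain(grid, row, col, threshold=4):
    """Add one grain at (row, col), relax fully, and return the number of topplings."""
    rows, cols = len(grid), len(grid[0])
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < threshold:
            continue  # stale entry: this cell was already relaxed
        grid[r][c] -= threshold
        topplings += 1
        if grid[r][c] >= threshold:
            unstable.append((r, c))
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            # Grains pushed past the boundary leave the pile.
            if 0 <= nr < rows and 0 <= nc < cols:
                grid[nr][nc] += 1
                if grid[nr][nc] >= threshold:
                    unstable.append((nr, nc))
    return topplings


def slow_drive(size, n_grains, seed=0):
    """Drop grains one at a time at random cells, waiting for each avalanche to finish."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    avalanche_sizes = []
    for _ in range(n_grains):
        r, c = rng.randrange(size), rng.randrange(size)
        avalanche_sizes.append(drop_grain(grid, r, c))
    return grid, avalanche_sizes


if __name__ == "__main__":
    _, sizes = slow_drive(size=20, n_grains=5000)
    print("largest avalanche (topplings):", max(sizes))
```

Here the avalanche size is measured as the number of topplings triggered by a single grain, which is one common convention; the area touched by the avalanche would be another reasonable choice.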

The arguments above yield six predictions, two each predicting durable dominance of the most-cited papers, entrepreneurial futility for newly published papers, and decrease in the disruptiveness (3, 4) of newly published papers. Compared to when a field produces few publications each year, when that field produces many new publications each year: 1) new citations will be more likely to cite the most-cited papers rather than less-cited papers; 2) the list of most-cited papers will change little year to year (the canon ossifies); 3) the probability a new paper eventually becomes canon will drop; 4) new papers that do rise into the ranks of those most cited will not do so through gradual, cumulative processes of diffusion; 5) the proportion of newly published papers developing existing scientific ideas will increase and the proportion disrupting existing ideas will decrease; and 6) the probability of a new paper becoming highly disruptive will decline.

Results

Each of these predictions is borne out in citation patterns across the Web of Science dataset, as shown in Figs. 1–4. As fields get larger, the most-cited papers become durably dominant, entrenched atop the citation distribution. New papers, in contrast, suffer diminished probability of ever becoming very highly cited and cannot gradually accumulate attention over time. Published papers tend to develop existing ideas more than disrupt them, and rarely launch disruptive new streams of research.

The most-cited papers garner disproportionately higher shares of citations in larger fields. The largest fields have a Gini coefficient of citation shares of around 0.5 (Fig. 1A), which is as large as income inequality in the most unequal countries; only China and South Africa have Gini coefficients higher than 0.5 (21). Disproportionate numbers of citations to top-cited papers drive this increase in unequal attention. For example, when the field of Electrical and Electronic Engineering published ∼10,000 papers a year, the top 0.1% most-cited papers collected 1.5% and the top 1% most-cited collected 8.6% of total citations. When the field grew to 50,000 published papers a year, the top 0.1% captured 3.5% of citations, and the top 1% captured 11.9%. When the field was larger still with 100,000 published papers per year, the top 0.1% received 5.7% of citations within the field and the top 1% received 16.7%. The bottom 50% least-cited papers in contrast decreased in share as the field grew larger, dropping from garnering 43.7% of citations at 10,000 papers to slightly above 20% at both 50,000 and 100,000 papers per year.
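Both quantities used in this paragraph, the Gini coefficient of citation shares and the share captured by the top fraction of papers, can be computed from a plain list of per-paper citation counts. The following is a minimal sketch and not the authors' code: the function names and the sample data are illustrative assumptions, and the paper does not say which Gini estimator it used (this one is the usual rank-weighted formula).

```python
def gini(values):
    """Gini coefficient of non-negative numbers (0 means all equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


def top_share(values, fraction):
    """Share of the total held by the top `fraction` of items (0.01 = top 1%)."""
    xs = sorted(values, reverse=True)
    k = max(1, int(round(len(xs) * fraction)))
    total = sum(xs)
    return sum(xs[:k]) / total if total else 0.0


if __name__ == "__main__":
    # Made-up citation counts for one field-year: one dominant paper, many minor ones.
    citations = [90] + [1] * 10
    print("Gini:", gini(citations))
    print("share of top 10%:", top_share(citations, 0.10))
```

With real data, `values` would be the citations each paper in a subject received in one year, so that the Gini coefficient and the top 0.1% or top 1% shares can be compared across field-years of different sizes.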

Canons crystallize as fields grow large. Churn in the identity and ordering of the most-cited papers decreases with larger field size. The pattern holds consistent when looking at data across all fields and at individual large fields across time: When the number of papers published per year is larger, the rank correlation between the top-50 most-cited papers in the focal year and the next increases (Fig. 1B). The predicted Spearman rank correlation of the top-50 most-cited list in a field between subsequent years increases from 0.25 when 1,000 papers are published in the focal year to 0.74 when 100,000 papers are published yearly.
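A year-to-year rank correlation of this kind could be computed roughly as follows. This is a sketch under stated assumptions, not the paper's procedure: the text does not say how papers that enter or leave the top-50 list are handled, so this version takes the focal year's top-k papers and correlates their focal-year ranks with their ranks by next-year citations (a paper absent from the next year counts as 0). All function names are hypothetical.

```python
def average_ranks(values):
    """Ranks with 1 = largest value; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def top_k_rank_correlation(focal, following, k=50):
    """Spearman correlation for the focal year's top-k papers across two years.

    focal, following: dicts mapping paper id -> citations received that year.
    """
    top = sorted(focal, key=focal.get, reverse=True)[:k]
    focal_ranks = average_ranks([focal[p] for p in top])
    next_ranks = average_ranks([following.get(p, 0) for p in top])
    return pearson(focal_ranks, next_ranks)
```

Spearman's coefficient is simply the Pearson correlation of the ranks, which is why the sketch ranks first and then reuses `pearson`. A value near 1 means the ordering of the canon barely changed between the two years.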

This crystallization of canon happens because the most-cited papers maintain their number of citations year over year when fields are large, while all other papers’ citation counts decay. Fig. 2 displays the predicted ratio of current year to previous year citations for papers at various percentiles of citation-share ranking. In years where few papers are published, the ratio for the most-cited papers is significantly below 1 and not much different from less-cited papers. When the number of papers published grows large, however, the ratio for the most-cited papers is close to 1, significantly higher than that of less-cited papers. In very large field-years, with about 100,000 papers published, the most-cited papers on average see no decline in their numbers of citations received year over year. Papers just outside the top 1% most cited in the field-year, in contrast, lose on average about 17% of their citation counts each year, and those at the fifth percentile and below trend toward losing a quarter of their citations year over year.

The probability of a paper ever reaching (even for 1 y) the top 0.1% most cited in its field shrinks when it is published in the same year as many others. This holds true cross-sectionally across fields in the same year, and across years in individual fields (Fig. 3A). When papers in large fields do become most cited, it is rarely through a process of local diffusion and preferential attachment. Fig. 3B presents the median time in years for an article to break into the field’s canon, conditional on the paper ever becoming one of the top cited in its field. When a field is small, papers rise slowly over time into the top 0.1% most cited, consistent with a process of cumulative attention gathering. A linear regression across all subjects for the year 1980 predicts a median time of 9 y for a successful paper to reach the 0.1% most cited in its field when published in the same year as 1,000 other papers in the field. Papers entering the canon in the largest fields, by contrast, shoot quickly to the top, inconsistent with a cumulative process where scholars discover new work by reading references cited in others’ work. The same regression predicts a median of less than a year for papers to reach the top 0.1% in large fields with 100,000 papers published each year.

Most papers published in the same year as many others build on, rather than disrupt, existing literature (Fig. 4A). A logistic fit predicts 49% of papers have disruption measure (3, 4) D > 0 (and conversely 51% D < 0) when 1,000 papers are published in the field-year. The predicted proportion of disruptive papers drops to 27% when 10,000 papers are published and 13% at 100,000 papers. Even when D > 0, the disruptive impact of a newly published paper is muted in larger fields. Fig. 4B presents the proportion of new papers by field-year that rank in the top-5 percentile of disruption measure. Lowess estimates show the proportion of new papers with top-5 percentile disruption measure shrinks from 8.8% at 1,000 papers published in the field-year to 3.6% at 10,000 papers per year and 0.6% at 100,000 papers.
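The text cites the disruption measure D from references (3, 4) without restating it. The sketch below assumes the definition commonly used in that literature: among later papers, count those citing the focal paper but none of its references (n_i), those citing both (n_j), and those citing only its references (n_k), then take D = (n_i - n_j) / (n_i + n_j + n_k). The function name and the data layout are illustrative assumptions, not the authors' implementation.

```python
def disruption(focal, references, citing):
    """Disruption measure D in [-1, 1] for one focal paper.

    focal:      id of the focal paper
    references: set of ids the focal paper cites
    citing:     dict mapping a later paper's id -> set of ids it cites
    """
    n_i = n_j = n_k = 0
    for cited in citing.values():
        cites_focal = focal in cited
        cites_refs = bool(cited & references)
        if cites_focal and not cites_refs:
            n_i += 1  # treats the focal paper as a fresh starting point
        elif cites_focal and cites_refs:
            n_j += 1  # treats the focal paper as building on prior work
        elif cites_refs:
            n_k += 1  # bypasses the focal paper and cites prior work only
    total = n_i + n_j + n_k
    return (n_i - n_j) / total if total else 0.0


if __name__ == "__main__":
    later_papers = {
        "p1": {"F"},
        "p2": {"F", "R1"},
        "p3": {"R1"},
        "p4": {"F"},
        "p5": {"X"},  # unrelated paper, ignored by the measure
    }
    print(disruption("F", {"R1", "R2"}, later_papers))
```

Under this definition D > 0 means later work tends to cite the focal paper without its predecessors (disruptive), and D < 0 means later work tends to cite both together (developmental).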

These empirical results are aligned with our theory’s predictions. Our current analyses cannot, however, rule out other causal explanations. The SI Appendix considers the most salient alternative explanation: that the changes observed are driven by the passage of time and maturing of fields rather than field size. While the number of papers published in a field tends to increase over time, this increase is not lockstep. Analysis shows significant effects of field size over and above effects of time (SI Appendix, Table S1 and Fig. S1). The SI Appendix also examines the mechanisms of change. We find that veteran scholars change their citation patterns as a field grows. While field size at the time a scholar entered the field does influence their propensity to reference the most-cited articles, the field’s size when an article is published has a much stronger effect (SI Appendix, Tables S2 and S3). Even well-established, veteran scholars come to cite canonical articles much more often when many other papers are also being published.

Discussion

These findings suggest troubling implications for the current direction of science. If too many papers are published in short order, new ideas cannot be carefully considered against old, and processes of cumulative advantage cannot work to select valuable innovations. The more-is-better, quantity metric-driven nature of today’s scientific enterprise may ironically retard fundamental progress in the largest scientific fields. Proliferation of journals and the blurring of journal hierarchies due to online article-level access can exacerbate this problem.

Reducing quantity may be impossible. Proscribing the number of annual publications, shuttering journals, closing research institutions, and reducing the number of scientists are hard-to-swallow policy prescriptions. Even if a scientist wholeheartedly agreed with the implications of our study, curtailing their output would be impractical given the damage to their career prospects and those of their colleagues and students, for example. Limiting article quantity without altering other incentives risks deterring the publication of novel, important new ideas in favor of low-risk, canon-centric work.

Still, some changes in how scholarship is conducted, disseminated, consumed, and rewarded may help accelerate fundamental progress in large fields of science. A clearer hierarchy of journals with the most-prestigious, highly attended outlets devoting pages to less canonically rooted work could foster disruptive scholarship and focus attention on novel ideas. Reward and promotion systems, especially at the most prestigious institutions, that eschew quantity measures and value fewer, deeper, more novel contributions could reduce the deluge of papers competing for a field’s attention while inspiring less canon-centric, more innovative work. A widely adopted measure of novelty vis a vis the canon could provide a helpful guide for evaluations of papers, grant applications, and scholars. Revamped graduate training could push future researchers to better appreciate the uncomfortable novelty of ideas less rooted in established canon. These measures, while not easy to implement across large fields, may help push scholarship off the local attractor of existing canon and toward more novel frontiers.

The current study is at the level of fields and large subfields, and one could argue that progress now occurs at lower subdisciplinary levels. To examine lower levels at scale requires more precise methods for classifying papers (perhaps using temporal citation network community detection) than are currently available. But note that the fields and subfields identified in the Web of Science correspond closely to real-world self-classifications of journals and departments. Established scholars transmit their cognitive view of the world to their students via field-centric reading lists, syllabi, and course sequences, and field boundaries are enforced through career-shaping patterns of promotion and reward.

It may be that progress still occurs, even though the most-cited articles remain constant. While the most-cited article in molecular biology (22) was published in 1976 and has been the most-cited article every year since 1982, one would be hard pressed to say that the field has been stagnant, for example. But recent evidence (23) suggests that much more research effort and money are now required to produce similar scientific gains; productivity is declining precipitously. Could we be missing fertile new paradigms because we are locked into overworked areas of study?

Materials and Methods

We utilize the Web of Science dataset, analyzing papers published between 1960 and 2014 inclusive. The resulting dataset contains 90,637,277 papers and 1,821,810,360 citations. The Web of Science classifies academic fields, or in some cases, large subfields, into what it terms subjects. There are 241 subjects in the classification, and we use these as the basis for our field-level analyses. The annual count of citations received by a focal paper from newly published papers in the same subject constitutes our main variable of interest.

To calculate 1−decay rate (λ) for the 10 largest nonmultidisciplinary subjects (Fig. 2 A–C), for each subject, we binned years by the base 10 log of number of publications (cutpoints at 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, and 5.5), and paper years by percentile most cited in the field-year (cutpoints at 1, 2, 3, …, 100). For each (logged number of publications) × (citation percentile) bin, we regressed the number of citations to a paper the subsequent year on number of citations to a paper in the focal year. The coefficient of this regression yields 1–λ.

To calculate 1–λ across all subjects (Fig. 2D), we selected the top 100 most-cited papers from each subject-year in the 1st, 2nd, 5th, 10th, and 25th percentiles. We binned subject-years by the base 10 log of number of publications (cutpoints at 1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, and 5.5). For each bin × selected percentile, we regressed the number of citations to a paper the subsequent year on number of citations to a paper in the focal year. The coefficient of this regression yields 1–λ.
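The binning and regression steps described in these two paragraphs can be sketched as below. This is a hedged illustration rather than the authors' code: the text does not say whether the regression includes an intercept or on which side bin boundaries are closed, so this version assumes ordinary least squares with an intercept and assigns a value equal to a cutpoint to the higher bin. The names `size_bin`, `ols_slope`, and `one_minus_lambda` are hypothetical.

```python
import math
from bisect import bisect_right

# Cutpoints on log10(number of publications), as listed in the text.
CUTPOINTS = [1, 1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5]


def size_bin(n_publications):
    """Index of the field-size bin; 0 means below the first cutpoint."""
    return bisect_right(CUTPOINTS, math.log10(n_publications))


def ols_slope(x, y):
    """Slope of a least-squares line y = a + b*x (assumes x is not constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def one_minus_lambda(pairs):
    """Estimate 1 - lambda for one bin.

    pairs: (citations in focal year, citations in subsequent year) per paper.
    """
    focal = [p[0] for p in pairs]
    following = [p[1] for p in pairs]
    return ols_slope(focal, following)


if __name__ == "__main__":
    # Made-up papers that each keep 80% of their citations the next year.
    sample = [(10, 8), (20, 16), (40, 32)]
    print("bin for a 2,000-paper field-year:", size_bin(2000))
    print("estimated 1 - lambda:", one_minus_lambda(sample))
```

A slope near 1 corresponds to papers holding their citation counts from one year to the next, while a slope of 0.75 would correspond to losing a quarter of them, which is how the decay figures quoted in the Results can be read.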

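To repost this article, please contact the original author for authorization and credit it as coming from Liu Yue's ScienceNet blog.
Link: https://blog.sciencenet.cn/blog-3589443-1420867.html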
