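[Liwei's note] The ALPAC "Black Book" is an extremely important historical document in natural language processing and machine translation. The original is at:
http://books.nap.edu/html/alpac_lm/ARC000005.pdf. I assumed that a document this important must already have several Chinese translations, yet I searched everywhere and found none. If I had the time I would translate it myself, but right now I really do not. So, at the very least, here is a makeshift machine-translated Chinese version (with only minimal post-editing). The report set out to put machine translation to death, so making machine translation serve it is a small piece of poetic justice. Digging up an important historical document in full also counts as one good deed. Google Translate, put some effort in! If you slack off, one day I may well forsake the light for the dark and go looking for Baidu, the one "sought a thousand times" in the poem, who has long been waiting where the lantern light is dim.
ALPAC Black Book 1/n (machine-translated version)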
~~~~~~~~~~~~~~~~~~~~~~~~~
弗雷德里克塞茨院长博士
美国国家科学院
2101华盛顿宪法大道,D. C.20418
1965年8月20日
亲爱的博士塞茨:
在1964年4月你形成了一个自动语言处理咨询委员会,应利兰·霍沃斯博士,美国国家科学基金会主任的请求,以便告知国防部,中央情报局和美国国家科学基金会一般机械外语翻译领域的研究和发展状况。我们很快发现你是正确的,确实有很多强烈,但往往相互冲突的意见,关于机器翻译的承诺和现在应采取的最有成效的步骤是什么。
为了达到合理的结论,并提供合理的建议,我们觉得有必要咨询在各种各样领域的专家(他们的名字被列在附录20 ) 。我们已调查翻译的需求,考量翻译的评价,并比较了机器和人类的翻译和其他语言处理功能。
我们发现,我们所听到的都让我们得出同样的结论。我们谨此提交的报告阐明了我们共同的意见和建议。我们相信,这些可以形成有用的改变,旨在增加理解一个极其重要的现象:语言,并发展旨在改善人类翻译而适当使用的机器辅助。
我们很抱歉,由于有其他义务,查尔斯F.霍凯特,原委员会的成员之一,有必要在我们报告写作前就辞职了。然而,他对我们的工作作出了宝贵的贡献,这是我们要感谢的。
你真诚的,
J. R.皮尔斯,董事长
语言自动处理咨询委员会
Dr. Frederick Seitz, President
National Academy of Sciences
2101 Constitution Avenue, Washington, D.C. 20418
August 20, 1965
Dear Dr. Seitz:
In April of 1964 you formed an Automatic Language Processing Advisory Committee at the request of Dr. Leland Haworth, Director of the National Science Foundation, to advise the Department of Defense, the Central Intelligence Agency, and the National Science Foundation on research and development in the general field of mechanical translation of foreign languages. We quickly found that you were correct in stating that there are many strongly held but often conflicting opinions about the promise of machine translation and about what the most fruitful steps are that should be taken now.
In order to reach reasonable conclusions and to offer sensible advice we have found it necessary to learn from experts in a wide variety of fields (their names are listed in Appendix 20). We have informed ourselves concerning the needs for translation, considered the evaluation of translations, and compared the capabilities of machines and human beings in translation and in other language processing functions.
We found that what we heard led us all to the same conclusions, and the report which we are submitting herewith states our common views and recommendations. We believe that these can form the basis for useful changes in the support of research aimed at an increased understanding of a vitally important phenomenon–language, and development aimed at improved human translation, with an appropriate use of machine aids.
We are sorry that other obligations made it necessary for Charles F. Hockett, one of the original members of the Committee, to resign before the writing of our report. He nonetheless made valuable contributions to our work, which we wish to acknowledge.
Sincerely yours,
J. R. Pierce, Chairman
Automatic Language Processing Advisory Committee
××××××××××××××××××××××××××××××××××××
弗雷德里克塞茨院长博士
美国国家科学院
2101华盛顿宪法大道,D. C.20418
1966年7月27日
亲爱的博士塞茨:
科学与公共政策委员会于3月13日对国家研究理事会语言自动处理咨询委员会的报告,进行了审查后,要求董事长,约翰·皮尔斯,准备一份简短的声明,说明计算语言学的资助需求,这不同于自动语言翻译的需求。这一要求源于担心孤独阅读该委员会的报告,可能会导致终止计算语言学研究的支持,以及所建议的减少对在相对短期的翻译目标的资助。
皮尔斯博士的建议,部分内容如下:
计算机为语言学家打开了一系列挑战、部分见地和潜力。我们相信,这些挑战可与粒子物理面临的挑战、问题和见地类比。毫无疑问,语言在所有现象中的重要性是首屈一指的。计算语言学所需要的工具成本,比起需要数十亿伏加速器的粒子物理小多了。新的语言学提出一个有吸引力的,以及一个极其重要的挑战。
我们完全有理由相信,面对这一挑战,最终将导致在许多领域的重要贡献。一个更深的语言知识可以帮助:
1。更有效地教外语。
2。教语言的本质更有效。
3。更有效地使用自然语言下指令和通信。
4。帮助我们构造为特殊用途(例如,飞行员控制塔通讯语言)的人工语言。
5。使我们能够在语言的使用以及人的沟通和思想方面做有意义的心理实验。除非我们知道语言是什么,我们不知道我们必须解释什么。
6。用机器辅助翻译和信息检索。
然而,语言学的状态是这样的,本身具有价值的优秀研究是必不可少的,如果语言学最终要做出这些贡献。
这样的研究必须使用电脑。我们必须研究以找出有关语言奥妙的数据是压倒性的,无论在数量还是复杂性上。电脑承诺帮助我们控制巨大的数据量问题,并在较小程度上对付数据的复杂性问题。但我们尚未有很好的,很容易使用,普及了的方法让计算机处理语言数据。
因此,下列重要的研究,是需要做的,应予以支持:(1)计算机处理语言的方法的基本开发研究,譬如帮助语言科学家发现并说明他的概括的工具,并作为工具帮助检查对数据的概括建议;(2)发展研究的方法,让语言的科学家用电脑来陈述他们的详细复杂的各种理论(例如,语法和意义理论),使他们生产的理论可以被检查细节。
对计算语言学研究最合理的支持来自美国国家科学基金会。需要多大的支持?有些工作必须做在一个相当大的规模上,因为小规模的实验和语言的微缩模型已经证明在过去有严重的偏差,一个真正的问题,只有在一定规模以上的语法、字典、可用语料库的状态下才可把握。
我们估计,一个机构60或70万一年可以支持一个相当规模的工作。我们相信,这种规模的工作有理由在四个或五个中心进行。因此,每年250至300万美元,似乎是合理的研究开支。这个数字不包括在眼前的实际应用中的一种或另一种的工作。这个建议,我明白皮尔斯博士的委员会也认可,还送出了给科学与公共政策委员会的成员征求意见。虽然科学与公共政策委员会没有考虑所建议的计算语言学项目与其他国家科学基金会计划的竞争,但我们相信,皮尔斯博士的声明应提请给美国国家科学基金会注意,以便把信息咨询委员会的报告放在适当的角度来看。
此致,哈维·布鲁克斯,
科学与公共政策委员会主席
~~~~~~~~~~~~~~~~~~
Dr. Frederick Seitz, President
National Academy of Sciences
2101 Constitution Avenue, Washington, D.C. 20418
July 27, 1966
Dear Dr. Seitz:
In connection with the report of the Automatic Language Processing Advisory Committee, National Research Council, which was reviewed by the Committee on Science and Public Policy on March 13, John R. Pierce, the chairman, was asked to prepare a brief statement of the support needs for computational linguistics, as distinct from automatic language translation. This request was prompted by a fear that the committee report, read in isolation, might result in termination of research support for computational linguistics as well as in the recommended reduction of support aimed at relatively short-term goals in translation.
Dr. Pierce's recommendation states in part as follows:
The computer has opened up to linguists a host of challenges, partial insights, and potentialities. We believe these can be aptly compared with the challenges, problems, and insights of particle physics. Certainly, language is second to no phenomenon in importance. And the tools of computational linguistics are considerably less costly than the multibillion-volt accelerators of particle physics. The new linguistics presents an attractive as well as an extremely important challenge.
There is every reason to believe that facing up to this challenge will ultimately lead to important contributions in many fields. A deeper knowledge of language could help:
1. To teach foreign languages more effectively.
2. To teach about the nature of language more effectively.
3. To use natural language more effectively in instruction and communication.
4. To enable us to engineer artificial languages for special purposes (e.g., pilot-to-control-tower languages).
5. To enable us to make meaningful psychological experiments in language use and in human communication and thought. Unless we know what language is we don't know what we must explain.
6. To use machines as aids in translation and in information retrieval.
However, the state of linguistics is such that excellent research that has value in itself is essential if linguistics is ultimately to make such contributions.
Such research must make use of computers. The data we must examine in order to find out about language is overwhelming both in quantity and in complexity. Computers give promise of helping us control the problems relating to the tremendous volume of data, and to a lesser extent the problems of data complexity. But we do not yet have good, easily used, commonly known methods for having computers deal with language data.
Therefore, among the important kinds of research that need to be done and should be supported are (1) basic developmental research in computer methods for handling language, as tools to help the linguistic scientist discover and state his generalizations, and as tools to help check proposed generalizations against data; and (2) developmental research in methods to allow linguistic scientists to use computers to state in detail the complex kinds of theories (for example, grammars and theories of meaning) they produce, so that the theories can be checked in detail.
The most reasonable government source of support for research in computational linguistics is the National Science Foundation. How much support is needed? Some of the work must be done on a rather large scale, since small-scale experiments and work with miniature models of language have proved seriously deceptive in the past, and one can come to grips with real problems only above a certain scale of grammar size, dictionary size, and available corpus.
We estimate that work on a reasonably large scale can be supported in one institution for $600 or $700 thousand a year. We believe that work on this scale would be justified at four or five centers. Thus, an annual expenditure of $2.5 to $3 million seems reasonable for research. This figure is not intended to include support of work aimed at immediate practical applications of one sort or another.
This recommendation, which I understand has the endorsement of Dr. Pierce's committee, was also sent out for comment to the membership of the Committee on Science and Public Policy. While the Committee on Science and Public Policy has not considered the recommended program in computational linguistics in competition with other National Science Foundation programs, we do believe that Dr. Pierce's statement should be brought to the attention of the National Science Foundation as information necessary to put the report of the Advisory Committee in proper perspective.
Sincerely yours,
Harvey Brooks, Chairman
Committee on Science and Public Policy
×××××××××××××××××××××××××××××××××××××××××
前言
国防部,美国国家科学基金会和美国中央情报局支持的项目,外国语言的自动处理大约十年; 这些主要是机械翻译。为了提供一个协调的联邦计划,在这方面的研究和开发,这三个机构成立了联合自动语言处理集团( JALPG ) 。
早期JALPG就确认需要一个咨询委员会,可以提供所要求的技术援助以及促进计算语言学、机械翻译,以及其他相关领域的独立观测。 1963年10月美国国家科学基金会主任,利兰·霍沃斯,作为这三个机构的代表要求美国国家科学院建立这样一个委员会。
委员会就这样建立了,并在1964年4月,利用三个机构提供的基金,国家研究理事会国家科学院自动语言处理咨询委员会在约翰·皮尔斯主席主持下,举行了第一次会议。
委员会决定,支持自动语言处理研究的理由有两个基础: (1)智力挑战领域的研究,与支持机构的使命相关;(2)研究和开发具有明确的前景:促成早期成本降低,或大幅提高性能,或满足实际的需要。
委员会明白支持自动语言处理的工作的很大的动机一直是在上述(2)所代表的实用目的。根据这一目标,该委员会调查了整个翻译问题。本报告介绍了该委员会的调查结果和建议。
~~~~~~~~~~~~~~~~~~~
Preface
The Department of Defense, the National Science Foundation, and the Central Intelligence Agency have supported projects in the automatic processing of foreign languages for about a decade; these have been primarily projects in mechanical translation. In order to provide for a coordinated federal program of research and development in this area, these three agencies established the Joint Automatic Language Processing Group (JALPG).
Early in its existence JALPG recognized its need for an advisory committee that could provide directed technical assistance as well as contribute independent observations in computational linguistics, mechanical translation, and other related fields. In October 1963 the Director of the National Science Foundation, Leland J. Haworth, requested on behalf of the three agencies that the National Academy of Sciences establish such a committee.
This was done, and in April 1964, with funds made available by the three agencies, the Automatic Language Processing Advisory Committee of the National Academy of Sciences–National Research Council, under the chairmanship of John R. Pierce, held its first meeting.
The Committee determined that support for research in automatic language processing could be justified on one of two bases: (1) research in an intellectually challenging field that is broadly relevant to the mission of the supporting agency and (2) research and development with a clear promise of effecting early cost reductions, or substantially improving performance, or meeting an operational need.
It is clear to the Committee that the motivation for support of much of the work in automatic language processing has been the practical aim represented in (2) above. In the light of that objective, the Committee studied the whole translation problem. This report presents the findings and recommendations of the Committee.
××××××××××××××××××××××××××××
目录
人类翻译
类型译者就业
英语作为语言的科学
所需的时间,科学家学习俄语
在美国政府的翻译
政府转换数
花费金额为翻译
是否有短缺翻译或翻译吗?
就可能超出翻译
翻译的关键问题
机器翻译的现状
机器辅助翻译在曼海姆和卢森堡
自动语言处理和计算语言学
改善翻译大道
建议
附录
1。视译与全译实验
2。国防语言学院课程科学俄罗斯
3。联合出版物研究服务
4。公法 翻译
5。机器翻译的外国技术部,美国空军系统司令部
6。期刊翻译支持由美国国家科学基金会
7。公务员制度委员会的数据联邦翻译
8。需求和可翻译
9。翻译不同类型的成本估算
10。质量评价的实验翻译
11。机器翻译中常见错误类型
12。机器辅助翻译联邦武装部队翻译重刑局,德国曼海姆
13。机器辅助翻译的欧洲煤钢COM-群落,卢森堡
14。机器翻译的翻译对战后期编辑
15。评估的科学编辑和联合出版物研究服务副外国技术部翻译
16。政府支持的机器翻译研究
17。电脑出版
18。编程语言和语言学的关系
19。机器翻译及语言学系
20。委员会构成
~~~~~~~~~~~~~~~~~~~
Contents
Human Translation
Types of Translator Employment
English as the Language of Science
Time Required for Scientists to Learn Russian
Translation in the United States Government
Number of Government Translators
Amount Spent for Translation
Is There a Shortage of Translators or Translation?
Regarding a Possible Excess of Translation
The Crucial Problems of Translation
The Present State of Machine Translation
Machine-Aided Translation at Mannheim and Luxembourg
Automatic Language Processing and Computational Linguistics
Avenues to Improvement of Translation
Recommendations
APPENDIXES
1. Experiments in Sight Translation and Full Translation
2. Defense Language Institute Course in Scientific Russian
3. The Joint Publications Research Service
4. Public Law 480 Translations
5. Machine Translations at the Foreign Technology Division, U.S. Air Force Systems Command
6. Journals Translated with Support by the National Science Foundation
7. Civil Service Commission Data on Federal Translators
8. Demand for and Availability of Translators
9. Cost Estimates of Various Types of Translation
10. An Experiment in Evaluating the Quality of Translations
11. Types of Errors Common in Machine Translation
12. Machine-Aided Translation at the Federal Armed Forces Translation Agency, Mannheim, Germany
13. Machine-Aided Translation at the European Coal and Steel Community, Luxembourg
14. Translation Versus Postediting of Machine Translation
15. Evaluation by Science Editors of Joint Publications Research Service and Foreign Technology Division Translations
16. Government Support of Machine-Translation Research
17. Computerized Publishing
18. Relation Between Programming Languages and Linguistics
19. Machine Translation and Linguistics
20. Persons Who Appeared Before the Committee