||
§9. 可计算数的范围
还没有人试图证明“可计算”的数包括所有被自然视为可计算的数,所有能给出的论证本质上都必然是对直觉的诉求,因此在数学上相当不令人满意。真正的问题是,“能用于计算一个数的过程是什么?”
我将使用三种论证:
1. 直接诉诸直觉。
2. 两个定义等价的证明(如果新的定义具有更大的直觉吸引力)。
3. 举出大量可计算数的例子。
一旦承认可计算数都是“可计算的”,其他几个相同性质的命题也就随之而来。特别是,如果有一个通用过程判定希尔伯特函数演算的公式是否可证,那么这个判定过程就可以由机器来完成。
[类型 (a)] 这个论证只是对§1的观点的阐述
计算通常是通过在纸上书写某些符号来完成的。我们可以假设这张纸被分成一个个方格,就像儿童的算术本一样。在初级算术中,有时会用到纸的二维特性,但这种做法总是可以避免的。我想大家会同意,纸的二维特性不是计算的必要条件。那么,我假设计算是在一维纸上进行的,即在被分成方格的带上。我还假设,可以打印的符号数目是有限的。如果我们允许无穷多的符号,那么就会出现差异极小的符号。限制符号数目并不会有严重的影响,因为总是可以使用符号序列来代替单个符号。因此,如17或9999999999999999这样的阿拉伯数字将被视为单个符号。同样,在任何欧洲语言中,单词都被视为单个符号(然而,中文却试图使用无数无穷符号)。从我们的角度来看,单个符号和复合符号之间的区别在于,复合符号如果太长,一眼就无法看清,这符合经验,我们无法一眼看出9999999999999和99999999999999是否相同。
计算者在任何时刻的行为都取决于他所观察到的符号,以及他当时的 “思维状态”。我们可以假设,计算者在某一时刻能够观察到的符号或方格的数量有一个上限B,如果他想观察更多,就必须连续观察。我们还将假设,需要考虑的思维状态的数量是有限的,其原因与限制符号数量相同。如果我们允许无穷多的思维状态,那么其中一些状态就会 “无限接近”,从而造成混淆。同样,这种限制并不会严重影响计算,因为只要在带上多写几个符号,就可以避免使用更复杂的思维状态。
让我们想象一下,计算者所执行的操作被分解成 “简单的操作”,这些操作是如此的简单,以至于不容易想象它们还能被进一步分解。每个这样的操作都是由计算者及其带子组成的物理系统的变化构成。如果我们知道带的符号序列、计算者观察到的符号序列(可能有特殊的顺序)以及计算者的思维状态,我们就知道了系统的状态。我们可以假设,在一个简单的操作中,被改变的符号不超过一个,任何其他变化都可以归结为这种简单的变化。符号可以这样改变的方格的情况与观察到的方格的情况相同。因此,我们可以不失一般性地假设,符号被改变的方格总是 “观察到的 ”方格。
除了这些符号的变化,简单操作必须包括观察到的方格分布的变化。新观察到的方格必须能被计算者立即识别。我认为合理的假设是,它们只能是与之前观察到的最接近的方格的距离不超过某个特定值。假设每个新观察到的方格与之前观察到的方格的距离都在 L 个方格以内。
关于 “可直接识别性”,可以认为还有其他类型的方格也是可直接识别的。特别是,用特殊符号标记的方格可以被认为是可直接识别的。现在,如果这些方格只用单一符号标记,那么它们的数量是有限的,我们就不应该把这些标记的方格与观察到的方格相邻而扰乱我们的理论。另一方面,如果这些方格是由一系列符号标记的,我们就不能把识别过程看作是一个简单的过程,这是一个基本要点,应该加以说明。在大多数数学论文中,方程和定理都有编号。通常情况下,编号不会超过(例如)1000。因此,通过编号一眼就能认出一个定理。但是,如果论文很长,我们可能会看到定理 157767733443477;然后,在论文的更远处,我们可能会发现“......因此(应用定理 157767733443477)我们有......”。为了确定哪个是相关定理,我们应该逐个比较这两个数字,可能的话用铅笔勾出数字,以确保它们没有被重复计算。尽管如此,如果有人认为还有其他 “一眼就能认出的 ”方格,只要这些方格能通过我的机器所能完成的某种过程找到,就不会影响我的论点。这一观点将在下文第三部分中阐述。
因此,简单操作必须包括:
(a) 改变一个被观察方格的符号;
(b) 将观察到的一个方格变为与先前观察到的一个方格相距 L 个方格以内的另一个方格。
可能其中一些变化必然涉及到思维状态的改变。因此,最一般的单一操作必须是下列操作之一:
A. 符号的可能变化(a)加上思维状态的可能变化。
B. 观察到的方格的可能变化(b)以及思维状态的可能变化。
正如第250页所建议的那样,实际执行的操作是由计算者的思维状态和观察到的符号决定的。特别是,它们决定了操作进行后计算者的思维状态。
我们现在可以建造一台机器来做这个计算者的工作。计算者的每一种思维状态都对应着机器的一个m-格局。机器扫描B个方格,对应于计算机观察到的B个方格。在任何行动中,机器都可以改变一个被扫描的方格上的符号,也可以将任何一个被扫描的方格改为与其他被扫描的方格之一相距不超过L个方格的另一个方格。所做的动作和后续的格局是由扫描的符号和m-格局决定的。刚刚描述的机器与第2节中定义的计算机并无本质区别,与这种类型的任何机器相对应,可以构造一个计算机来计算相同的序列,即由计算机计算的序列。
II. [类型(b)]
如果对希尔伯特的函数演算的符号进行修改,使其系统化,并且只涉及有限数量的符号,那么就有可能构造一个自动机器K,用来寻找演算的所有可证明的公式。
令α是一个序列,用Gα(x)表示命题“α的第x个数字是1”,这样-Gα(x)表示“α的第x个数字是0”。假设我们可以找到一组定义序列α的性质,并且可以用Gα(x)和命题函数N(x)、F(x,y)来表达这些性质,其中,N(x)表示“x是一个非负整数”,F(x,y)表示 y = x +1。用合取符号把这些公式连接起来,我们可以得到一个定义α的公式,将其命名为U。U的项中必须包含皮亚诺公理的必要条件,也就是:
(Ǝu) N(u)& (x)(N(x)→(Ǝy)F(x,y)) & ( F(x,y)→N(y) ),
我们将把它缩写为P。
当我们说“U定义α”时,意思是,-U不是一个可证明的公式,同时,对于每个n,以下公式(An)或(Bn)之一是可证明的:
U & F(n) →Gα(u(n)), (An)
U & F(n) → ( –Gα(u(n))), (Bn)
其中 F(n) 代表F(u, u') & F(u', u'') & ... F(u(n-1)), u(n)) 。
我说α是一个可计算的序列:稍微修改K,便得到一个可以计算α的机器Kα。
我们把Kα的运动分成段。第n段用来求出α的第n个数字。在第(n-1)段完成后,在所有的符号后面打印上一个双冒号::,接下来的工作完全在这个双冒号右边的方格上完成。第一步是写字母A,再写公式(An),然后是B,再写公式(Bn)。然后机器Kα开始做K的工作,但只要发现一个可证明的公式,这个公式就会与(An)和(Bn)比较。如果是与(An)相同的公式,则打印出数字 “1”,第n部分结束。如果是与(Bn)相同的公式,则打印“0”,该部分结束。如果它和An、Bn都不相同,那么K的工作就从它停止的地方继续进行,迟早会得出公式(An)或(Bn);这源于我们对α和U的假设,以及K的已知性质。因此,第n个段最终会结束。Kα是非循环(circle-free)的,α是可计算的。
还可以证明,以这种方式通过使用公理来定义的数α包括所有可计算的数,这可以通过用函数演算来描述计算机器来完成。
必须记住,我们给“U定义α”这句话附上了相当特殊的含义。可计算数并不包括所有(一般意义上)的可定义数。让δ是一个序列,其第n个数字是1或0,根据n是或不是满足的。第8节的定理的一个直接结果是,δ是不可计算的。(就我们目前所知)δ 的任何指定数字都有可能被计算出来,但不是通过一个通用过程。当计算出足够多的δ的数字时,为了获得更多的数字,就需要一种全新的方法。
III. 这可以被看作是对I的修改,也可以看作是对II的推论
我们假设,就像在I中一样,计算是在带子上进行的;但我们通过考虑一个更物理的和明确的对应物来避免引入“思维状态”。计算者总是有可能中断工作,离开并忘记一切,然后回来继续工作。如果他这样做,他必须留下一份说明(以某种标准形式书写),解释如何继续工作。这个记录是“思维状态”的对应物。我们将假设计算机的工作方式很随意,以至于他一次只做一步。说明必须使他能够执行一步并写下下一步。因此,计算在任何阶段的进展状态,完全由指令的说明和带上的符号决定。也就是说,系统的状态可以由一个表达式(符号序列)来描述,包括带上的符号和Δ(我们假设它不会出现在其他地方),以及说明。这个表达式可以被称为“状态公式”。我们知道,任何给定阶段的状态公式都是由最后一步之前的状态公式决定的,我们假设这两个公式的关系可以用函数演算来表达。换句话说,我们假设有一个公理U,
它表达了控制计算机行为的规则,即任何阶段的状态公式与前一阶段的状态公式的关系。如果是这样,回到索引,我们可以构造一台机器来写下连续的状态公式,从而计算所需的数字。
原文:
9. The extent of the computable numbers.
No attempt has yet been made to show that the “computable” numbers include all numbers which would naturally be regarded as computable. All arguments which can be given are bound to be, fundamentally, appeals to intuition, and for this reason rather unsatisfactory mathematically. The real question at issue is “What are the possible processes which can be carried out in computing a number?”
The arguments which I shall use are of three kinds.
1. A direct appeal to intuition.
2. A proof of the equivalence of two definitions (in case the new definition has a greater intuitive appeal).
3. Giving examples of large classes of numbers which are computable.
Once it is granted that computable numbers are all “computable” several other propositions of the same character follow. In particular, it follows that, if there is a general process for determining whether a formula of the Hilbert function calculus is provable, then the determination can be carried out by a machine.
I. [Type (a)]. This argument is only an elaboration of the ideas of §1.
Computing is normally done by writing certain symbols on paper. We may suppose this paper is divided into squares like a child's arithmetic book. In elementary arithmetic the two-dimensional character of the paper is sometimes used. But such a use is always avoidable, and I think that it will be agreed that the two-dimensional character of paper is no essential of computation. I assume then that the computation is carried out on one-dimensional paper, i.e. on a tape divided into squares. I shall also suppose that the number of symbols which may be printed is finite. If we were to allow an infinity of symbols, then there would be symbols differing to an arbitrarily small extent.[6] The effect of this restriction of the number of symbols is not very serious. It is always possible to use sequences of symbols in the place of single symbols. Thus an Arabic numeral such as {250} 17 or 999999999999999 is normally treated as a single symbol. Similarly in any European language words are treated as single symbols (Chinese, however, attempts to have an enumerable infinity of symbols). The differences from our point of view between the single and compound symbols is that the compound symbols, if they are too lengthy, cannot be observed at one glance. This is in accordance with experience. We cannot tell at a glance whether 9999999999999999 and 999999999999999 are the same.
The behaviour of the computer at any moment is determined by the symbols which he is observing. and his “state of mind” at that moment. We may suppose that there is a bound B to the number of symbols or squares which the computer can observe at one moment. If he wishes to observe more, he must use successive observations. We will also suppose that the number of states of mind which need be taken into account is finite. The reasons for this are of the same character as those which restrict the number of symbols. If we admitted an infinity of states of mind, some of them will be “arbitrarily close” and will be confused. Again, the restriction is not one which seriously affects computation, since the use of more complicated states of mind can be avoided by writing more symbols on the tape.
Let us imagine the operations performed by the computer to be split up into “simple operations” which are so elementary that it is not easy to imagine them further divided. Every such operation consists of some change of the physical system consisting of the computer and his tape. We know the state of the system if we know the sequence of symbols on the tape, which of these are observed by the computer (possibly with a special order), and the state of mind of the computer. We may suppose that in a simple operation not more than one symbol is altered. Any other changes can be set up into simple changes of this kind. The situation in regard to the squares whose symbols may be altered in this way is the same as in regard to the observed squares. We may, therefore, without loss of generality, assume that the squares whose symbols are changed are always “observed” squares.
Besides these changes of symbols, the simple operations must include changes of distribution of observed squares. The new observed squares must be immediately recognisable by the computer. I think it is reasonable to suppose that they can only be squares whose distance from the closest of the immediately previously observed squares does not exceed a certain fixed amount. Let us say that each of the new observed squares is within L squares of an immediately previously observed square.
In connection with “immediate recognisability”, it may be thought that there are other kinds of square which are immediately recognisable. In particular, squares marked by special symbols might be taken as immediately recognisable. Now if these squares are marked only by single symbols there can be only a finite number of them, and we should not upset our theory by adjoining these marked squares to the observed squares. If, on the other hand, they are marked by a sequence of symbols, we cannot regard the process of recognition as a simple process. This is a fundamental point and should be illustrated. In most mathematical papers the equations and theorems are numbered. Normally the numbers do not go beyond (say) 1000. It is, therefore, possible to recognise a theorem at a glance by its number. But if the paper was very long, we might reach Theorem 157767733443477; then, farther on in the paper, we might find “... hence (applying Theorem 157767733443477) we have...”. In order to make sure which was the relevant theorem we should have to compare the two numbers figure by figure, possibly ticking the figures off in pencil to make sure of their not being counted twice. If in spite of this it is still thought that there are other “immediately recognisable” squares, it does not upset my contention so long as these squares can be found by some process of which my type of machine is capable. This idea is developed in III below.
The simple operations must therefore include:
(a) Changes of the symbol on one of the observed squares.
(b) Changes of one of the squares observed to another square within L squares of one of the previously observed squares.
It may be that some of these changes necessarily involve a change of state of mind. The most general single operation must therefore be taken to be one of the following:
A possible change (a) of symbol together with a possible change of state of mind.
B. A possible change (b) of observed squares, together with a possible change of state of mind.
The operation actually performed is determined, as has been suggested on p.250, by the state of mind of the computer and the observed symbols. In particular, they determine the state of mind of the computer after the operation is carried out.
We may now construct a machine to do the work of this computer. To each state of mind of the computer corresponds an “m-configuration” of the machine. The machine scans B squares corresponding to the B squares observed by the computer. In any move the machine can change a symbol on a scanned square or can change anyone of the scanned squares to another square distant not more than L squares from one of the other scanned {252} squares. The move which is done, and the succeeding configuration, are determined by the scanned symbol and the m-configuration. The machines just described do not differ very essentially from computing machines as defined in §2, and corresponding to any machine of this type a computing machine can be constructed to compute the same sequence, that is to say the sequence computed by the computer.
II. [Type (b)].
If the notation of the Hilbert functional calculus[7] is modified so as to be systematic, and so as to involve only a finite number of symbols, it becomes possible to construct an automatic[8] machine K which will find all the provable formulae of the calculus.[9]
Now let α be a sequence, and let us denote by Gα(x) the proposition “The x-th figure of α is 1”, so that [10] – Gα(x) means “The x-th figure of α is 0”. Suppose further that we can find a set of properties which define the sequence α and which can be expressed in terms of Gα(x) and of the propositional functions N(x) meaning “x is a non-negative integer” and F(x,y) meaning “y = x + 1”. When we join all these formulae together conjunctively we shall have a formula, U say, which defines α. The terms of U must include the necessary parts of the Peano axioms, viz.,
(Ǝu) N (u)& (x)(N(x)→(Ǝy)F(x,y)) & ( F(x,y)→N(y) ),
When we say “U defines α”, we mean that –U is not a provable formula, and also that, for each n, one of the following formulae (An) or (Bn) is provable.
U & F(n) →Gα(u(n)),(An)[11]
U & F(n) → ( –Gα(u(n))),(Bn)
where F(n)] stands for F(u, u') & F(u', u'') & … F(u(n-1), u(n)).
I say that α is then a computable sequence: a machine Kα to computeα can be obtained by a fairly simple modification of K.
We divide the motion of α into sections. The n-th section is devoted to finding the n-th figure of α. After the (n – 1)-th section is finished a double colon : : is printed after all the symbols, and the succeeding work is done wholly on the squares to the right of this double colon. The first step is to write the letter “A” followed by the formula (An) and then “B” followed by (Bn). The machine Kα then starts to do the work of K, but whenever a provable formula is found, this formula is compared with (An) and with (Bn). If it is the same formula as (An), then the figure “1” is printed, and the n-th section is finished. If it is (Bn), then “0” is printed and the section is finished. If it is different from both, then the work of K is continued from the point at which it had been abandoned. Sooner or later one of the formulae (An) or (Bn) is reached; this follows from our hypotheses about α and U, and the known nature of K. Hence the n-th section will eventually be finished; Kα is circle-free; α is computable.
It can also be shown that the numbers α definable in this way by the use of axioms include all the computable numbers. This is done by describing computing machines in terms of the function calculus.
It must be remembered that we have attached rather a special meaning to the phrase “U defines α ”. The computable numbers do not include all (in the ordinary sense) definable numbers. Let δ be a sequence whose n-th figure is 1 or 0 according as n is or is not satisfactory. It is an immediate consequence of the theorem of §8 that δ is not computable. It is (so far as we know at present) possible that any assigned number of figures of δ can be calculated, but not by a uniform process. When sufficiently many figures of δ have been calculated, an essentially new method is necessary in order to obtain more figures.
III. This may be regarded as a modification of I or as a corollary of II.
We suppose, as in I, that the computation is carried out on a tape; but we avoid introducing the “state of mind” by considering a more physical and definite counterpart of it. It is always possible for the computer to break off from his work, to go away and forget all about it, and later to come back and go on with it. If he does this he must leave a note of instructions (written in some standard form) explaining how the work is to be continued. This note is the counterpart of the “state of mind”. We will suppose that the computer works by such a desultory manner that he never does more than one step at a sitting. The note of instructions must enable him to carry out one step and write the next note. Thus the state of progress of the computation at any stage is completely determined by the note of {254} instructions and the symbols on the tape. That is, the state of the system may be described by a single expression (sequence of symbols), consisting of the symbols on the tape followed by Δ (which we suppose not to appear elsewhere) and then by the note of instructions. This expression may be called the “state formula”. We know that the state formula at any given stage is determined by the state formula before the last step was made, and we assume that the relation of these two formulae is expressible in the functional calculus. In other words we assume that there is an axiom U which expresses the rules governing the behaviour of the computer, in terms of the relation of the state formula at any stage to the state formula at the proceeding stage. If this is so, return to the indexwe can construct a machine to write down the successive state formulae, and hence to compute the required number.
Archiver|手机版|科学网 ( 京ICP备07017567号-12 )
GMT+8, 2024-12-21 23:59
Powered by ScienceNet.cn
Copyright © 2007- 中国科学报社