There is an English saying, "always put your best foot forward," which roughly translates into 永遠只表揚你的長处 (always show only your strengths). This is human nature, and it carries over into science and technology. Sometimes, however, this spirit is, in my opinion, carried a bit too far. Deficiencies in results and applications are deliberately overlooked or omitted so as to enhance the apparent value of scientific papers and theories. Although the author is not telling any untruth, the omission can nevertheless mislead the reading public and young scholars. Research can be led down a primrose path well beyond its usefulness. This problem arises whenever a scientific discipline purports to be practically useful, as applied mathematics does in contrast to pure mathematics. Let me give some examples.
The Holy Grail of control theory, decision analysis, and operations research is the determination of the feedback control law, the best decision rule, or the optimal strategy (see http://www.sciencenet.cn/blog/user_content.aspx?id=7555 for an explanation of the terms "strategy," "decision rule," and "control law").
This is a mathematical function that maps input information to output decisions, i.e., a general multivariable function. Thus, if we could somehow obtain this function, our problem would be SOLVED.
It is not an exaggeration to say that the entire aerospace industry, from the Apollo moon landing to the latest GPS, owes a debt to the control-theoretic developments of the late 1950s and early 1960s. As a result, finding the optimal control law for general problems remains an idealized goal for all problem solvers. We continue to hope that with each advance in computer hardware and mathematical theory, we move one step closer to this ultimate goal.
However, this is not quite the whole story. The simple but often unemphasized fact is this:
It is extremely difficult to specify, and impossible to implement, a general multivariable function, even if the function is known.
Generally speaking, a one-variable function is a two-column table; a two-variable function is then a book of tables; a three-variable function, a library of books; a four-variable function, a universe of libraries; and so on. How, then, does one store or specify a general, arbitrary 100-variable function, never mind implement it, even if the function is God-given? No amount of hardware advance will overcome this fundamental impossibility, even if mathematical advances provide the general solution. Exponential growth is a difficulty that, in general practice, cannot be overcome computationally. Our earlier successes with control theory and its extensions were enabled by the fact that the functions involved have a very special form: they decompose into sums or products of functions of single variables or of low dimension. As we move from the control of continuous-variable dynamic systems to discrete event systems, or to more complex human-made systems such as electric power grids, communication networks, huge manufacturing plants, and supply chains, there is no a priori reason to expect that the optimal control laws for such systems will have this convenient additive or multiplicative form. And even in the unlikely scenario that we are lucky enough to have such a simple functional form for the control law of the system under study, our effort should then be concentrated on searching for and characterizing such systems, as opposed to finding the more general form of the control law.
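The book-of-tables metaphor can be made concrete with a few lines of arithmetic. The sketch below is purely illustrative: it assumes a modest 10 grid points per axis and 8 bytes per stored value (both numbers are my own choices, not from the text) and counts the bytes needed to tabulate a function of a given number of variables.

```python
# Illustrative sketch: bytes needed to tabulate a d-variable function on a grid.
# Assumes 10 grid points per axis and 8 bytes per stored value -- both are
# assumed figures for illustration only.

def table_bytes(num_vars, points_per_axis=10, bytes_per_entry=8):
    """Bytes needed to store a full lookup table for a function of num_vars variables."""
    return (points_per_axis ** num_vars) * bytes_per_entry

for d in (1, 2, 3, 4, 100):
    print(f"{d:>3} variables: {table_bytes(d):.3e} bytes")
```

Even at this coarse resolution, four variables already need 80 kilobytes, and 100 variables need 8 x 10^100 bytes, vastly more than there are atoms in the observable universe. No conceivable hardware advance changes this picture.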
In stating the above, this writer does not mean to imply that theorists are in general blind to this fact. Hilbert's 13th problem, the naïve Bayes approach to Bayesian networks, product distribution theory, neural networks, and others are all examples of trying to approximate general multivariable functions by combinations of functions of single variables or of low dimension. But this writer submits that it is nevertheless our instinctive desire to seek the general solution theoretically, or to attempt to incorporate all relevant information into decision making practically. The consequence of such instinctive or unconscious desires can be misdirected effort and uneconomical use of resources.
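To see both the promise and the limit of such additive approximations, consider the best additive fit to a two-variable function on a grid. The sketch below uses the classical ANOVA-style decomposition (row mean plus column mean minus grand mean), which is the least-squares optimal additive fit on a balanced grid; the example functions are my own choices for illustration.

```python
# Illustrative sketch: best additive approximation of a two-variable function
# on a grid, via the ANOVA-style decomposition (row mean + col mean - grand mean).
# A genuinely coupled function like f(x, y) = x * y leaves a residual,
# while a truly additive f(x, y) = x + y is reproduced exactly.

def additive_fit(table):
    """Least-squares additive approximation of a 2-D table on a balanced grid."""
    rows, cols = len(table), len(table[0])
    grand = sum(sum(r) for r in table) / (rows * cols)
    row_mean = [sum(r) / cols for r in table]
    col_mean = [sum(table[i][j] for i in range(rows)) / rows for j in range(cols)]
    return [[row_mean[i] + col_mean[j] - grand for j in range(cols)]
            for i in range(rows)]

def max_err(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a[0])))

xs = [0, 1, 2]
product_table = [[x * y for y in xs] for x in xs]   # coupled: not additive
additive_table = [[x + y for y in xs] for x in xs]  # truly additive

print(max_err(additive_table, additive_fit(additive_table)))  # exactly zero
print(max_err(product_table, additive_fit(product_table)))    # nonzero residual
```

When the underlying law happens to be additive, the approximation is exact and the storage collapses from a d-dimensional table to d small one-variable tables; when it is not, an irreducible residual remains no matter how the one-variable pieces are chosen.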
A second example of omission, in a similar vein, concerns the fact that combinatorial explosion can defeat any computational scheme. If the computational burden of a scheme can be expressed as a polynomial (as opposed to an exponential or combinatorial) function of the size of the problem, then we say the scheme is in "P" and is good. (Note: "Does P = NP?" is an outstanding problem in computational complexity, where NP, roughly speaking, denotes a very wide class of problems for which no "P" solution is known so far. S. Cook proved in 1971 that if a P solution can ever be found for just one of the hardest problems in this class, the so-called NP-complete problems, then all problems in the class can be solved the same way; hence P = NP. But to date we do not know whether this is true or possible, despite the efforts of untold numbers of able mathematicians. It is one of the Clay Millennium Prize Problems, and I am sure solving it would win you a Fields Medal as well.) The above is well known. However, occasionally (most particularly in the Markov chain literature) we see papers state that ". . . the scheme is polynomial in the size of the state space . . ." (i.e., a "P" solution). This sounds very good. But what the author neglects to mention is that the size of the state space is itself combinatorially large; anything other than a toy problem will render the scheme impossible in the typical Markov chain formulation. I do not mean to imply that such papers are worthless. Theoretical papers often clarify and make precise the essence of a problem. It is the simultaneous giving of a misleading practical implication that I am objecting to.
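The "polynomial in the size of the state space" trap is easy to quantify. In the minimal sketch below (the component counts and the two-states-per-component assumption are illustrative), a system of n interacting on/off components has 2^n joint Markov states, so even a single linear pass over the state space, the mildest polynomial imaginable, is astronomical:

```python
# Illustrative sketch: the state space of a Markov chain built from n
# interacting components explodes combinatorially. Two states per
# component is an assumed, illustrative figure.

def state_count(num_components, states_per_component=2):
    """Number of joint states for num_components independent components."""
    return states_per_component ** num_components

for n in (10, 50, 100):
    print(f"{n:>3} components: {state_count(n):.3e} states")
```

A scheme that is "linear in the state space" is thus exponential in the natural problem size (the number of components), which is exactly the omission being objected to.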
A last (half) example, of which authors including this writer are partially guilty, is this. Many solutions to problems in statistics and probability require the ability to sample uniformly from an arbitrary space Q. As taught in textbooks (including my own), this is an easy task involving the generation of random numbers on the interval [0,1]. What is not mentioned is that for most real-world problems, the space over which we need to generate uniform samples is often very high-dimensional, irregular, and difficult to specify. The problem is thus not as simple as the textbooks make it out to be. However, I say this is only a "half" problem of omission, because there does exist a general approach to this real sampling problem; how effective it is depends on the problem and on the problem solver. In other words, if you are clever and the problem structure is right, you can have a very good solution; otherwise, you need to work harder. But it is not of the impossible nature of the first two examples above. Furthermore, the difficulty here is not that the technique of uniform sampling is bad, but that its application to a specific problem sometimes introduces additional difficulties. For these reasons, authors tend not to go into details, since each example is different. Nevertheless, this omission often makes people forget that a practical solution is not complete until one addresses this sampling issue. Let me give a practical example. In a busy international airport, the scheduling of aircraft arrivals and departures, subject to various constraints such as runways and weather, can be a complex problem in operations research. Here you need to define what a schedule is, what the search space of such schedules is, and how to sample uniformly in this "schedule" space subject to the various service constraints. Think about it, and you can easily convince yourself that this is far from a routine problem.
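One generic approach of the kind alluded to above is rejection sampling: draw uniform points from an easy enclosing box and keep only those that land inside the irregular region. The sketch below is a minimal illustration with an assumed toy region (the unit disc); the function name and interface are my own.

```python
# Illustrative sketch: rejection sampling, a generic way to sample uniformly
# from an irregular region Q -- sample uniformly from an enclosing box and
# discard points outside Q. The unit-disc region is an assumed toy example.

import random

def sample_uniform(in_region, box, max_tries=10_000):
    """Draw one uniform sample from {x in box : in_region(x)}."""
    lo, hi = box
    for _ in range(max_tries):
        point = [random.uniform(l, h) for l, h in zip(lo, hi)]
        if in_region(point):
            return point
    raise RuntimeError("acceptance rate too low; a better proposal is needed")

# Unit disc inside the square [-1, 1] x [-1, 1]: about 79% of box samples
# are accepted here, but the acceptance rate collapses in high dimension.
in_disc = lambda p: p[0] ** 2 + p[1] ** 2 <= 1.0
x, y = sample_uniform(in_disc, ([-1.0, -1.0], [1.0, 1.0]))
```

The method is correct in full generality, but its cost is governed by the acceptance rate, which is precisely where the airport-scheduling kind of problem gets hard: a "schedule" space under service constraints is a minuscule, oddly shaped region inside any box you can easily sample from.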
A similar and equally important problem is the scheduling of multiple elevators in high-rise office buildings to minimize passenger delay.
Anyway, these are several of my pet complaints, which I often voice in public. This has not always pleased the establishment, and at times I have paid a penalty for it (it is like washing your family's dirty laundry in public, 家醜外揚). But at my age this is of less concern to me. I just don't want young scholars to be misled, intentionally or unintentionally.