||
Fornew readers and those who request to be “好友 good friends” please read my 公告栏 first.
Another example of popular science writing
The algorithm that runs the world
13 August 2012 –
New Scientist 13/08/2012 14:06
Author: Richard Elwes is a visiting fellow at the University of Leeds,UK, and the author of Maths 1001: Absolutely everything that matters inmathematics (Quercus, 2010)
Abstract
The simplex algorithm directs wares to their destinations theworld over. Its services are called uponthousands of times a second to ensure the world's businessruns smoothly – but are its mathematics as dependable as we thought?
TEXT
YOU MIGHT not have heard of the algorithm that runs the world. Fewpeople have, though it can determine much that goes on in our day-to-day lives:the food we have to eat, our schedule at work, when the train will come to takeus there. Somewhere, in some server basement right now, it is probably workingon some aspect of your life tomorrow, next week, in a year's time. Perhapsignorance of the algorithm's workings is bliss. The door to Plato's Academy inancient Athens is said to have borne the legend "let no one ignorant of geometryenter". That was easy enough to say back then, when geometry was firmlygrounded in the three dimensions of space our brains were built to cope with.But the algorithm operates in altogether higher planes. Four, five, thousandsor even many millions of dimensions: these are the unimaginable spaces the algorithm's series of mathematicalinstructions was devised to probe. Perhaps, though, we should try a littleharder to get our heads round it. Because powerful though it undoubtedly is,the algorithm is running into a spot of bother. Its mathematical underpinnings,though not yet structurally unsound, are beginning to crumble at the edges.With so much resting on it, the algorithm may not be quite as dependable as it onceseemed.
To understand what all this is about, we must first delve into thedeep and surprising ways in which the abstractions of geometry describe theworld around us. Ideas about such connections stretch at least as far back asPlato, who picked out five 3D geometric shapes, orpolyhedra, whose perfect regularity he thoughtrepresented the essence of the cosmos. The tetrahedron, cube, octahedron and20-sided icosahedron embodied the "elements" of fire, earth, air andwater, and the 12-faced dodecahedron the shape of the universe itself. Thingshave moved on a little since then. Theories of physics today regularly invokestrangely warped geometries unknown to Plato, or propose the existence ofspatial dimensions beyond the immediately obvious three. Mathematicians, too,have reached for ever higher dimensions, extending ideas about polyhedra tomind-blowing "polytopes" with four, five or any number of dimensions.
A case in point is a law of polyhedra proposed in 1957 by the USmathematician Warren Hirsch. It stated that the maximum number of edges you have totraverse to get between two corners on any polyhedron is never greater than the number of its faces minus thenumber of dimensions in the problem, in this case three. The two oppositecorners on a six-sided cube, for example, are separated by exactly three edges, and no pair of corners is fouror more apart. Hirsch's rule holds true for all 3D polyhedra. But it has neverbeen proved generally for shapes in higher dimensions. The expectation that itshould translate has come largely through analogy with other geometrical rulesthat have proved similarly elastic (see "Edges, corners and faces").When it comes to guaranteeing short routes between points on the surface of a4D, 5D or 1000D shape, Hirsch's rule has remained one of those nigglingunsolved problems of mathematics - a mere conjecture.
How is this relevant? Because, for today's mathematicians, dimensions are not just about space. True, the concept arose because we have threecoordinates of location that can vary independently: up-down, left-right and forwards-backwards. Throwin time, and you have a fourth "dimension" that works very similarly,apart from the inexplicable fact that we can move through it in only onedirection. But beyond motion, we often encounter real-world situations where wecan vary many more than four things independently. Suppose, for instance, youare making a sandwich for lunch. Your fridge contains 10 ingredients that canbe used in varying quantities: cheese, chutney, tuna, tomatoes, eggs, butter,mustard, mayonnaise, lettuce, hummus. These ingredients are nothing other thanthe dimensions of a sandwich-making problem. This can be treated geometrically:combine your choice of ingredients in any particular way, and your completedsnack is represented by a single point in a 10-dimensional space.
In this multidimensional space, we are unlikely to have unlimitedfreedom of movement. There might be only two mouldering hunks of cheese in thefridge, for instance, or the merest of scrapings at the bottom of themayonnaise jar. Our personal preferences might supply other, more subtleconstraints to our sandwich-making problem: an eye on the calories, perhaps, ora desire not to mix tuna and hummus. Each of these constraints represents aboundary to our multidimensional space beyond which we cannot move. Ourresources and preferences in effect construct a multidimensional polytopethrough which we must navigate towards our perfect sandwich.
In reality, the decision-making processes in our sandwich-makingare liable to be a little haphazard; with just a few variables to consider, andmere gastric satisfaction riding on the outcome, that's not such a problem. Butin business, government and science, similar optimization problems crop upeverywhere and quickly morph into brutes with many thousands or even millions of variables and constraints. A fruit importer might have a1000-dimensional problem to deal with, for instance, shipping bananas from fivedistribution centres storing varying numbers of fruit to 200 shops withdifferent numbers in demand. How many items of fruit should be sent from which centresto which shops while minimising total transport costs?
A fund manager might similarly want to arrange a portfoliooptimally to balance risk and expected return over a range of stocks; a railwaytimetabler to decide how best to roster staff and trains; or a factory orhospital manager to work out how to juggle finite machine resources or wardspace. Each such problem can be depicted as a geometrical shape whose number ofdimensions is the number of variables in the problem, and whose boundaries aredelineated by whatever constraints there are. In each case, we need to box ourway through this polytope towards its optimal point.
This is the job of the algorithm. Its full name is the simplexalgorithm, and it emerged in the late 1940s from the work of the US mathematicianGeorge Dantzig, who had spent the second world war investigating ways toincrease the logistical efficiency of the US air force. Dantzig was a pioneerin the field of what he called linear programming, which uses the mathematics ofmultidimensional polytopes to solve optimisation problems. (Note added by blogger: Actually the idea of LP was firstmade by Kantorvich who won a Nobel prize for economics for this. But theSimplex algorithm is Danzig’s invention) Oneof the first insights he arrived at was that the optimum value of the "targetfunction" - the thing we want to maximise or minimise, be that profit,travelling time or whatever - is guaranteed to lie at one of the corners of thepolytope. This instantly makes things much more tractable: there are infinitelymany points within any polytope, but only ever a finite number of corners.
If we have just a few dimensions and constraints to play with,this fact is all we need. We can feel our way along the edges of the polytope,testing the value of the target function at every corner until we find its sweet spot. But things rapidly escalate. Evenjust a 10-dimensional problem with 50 constraints - perhaps trying to assign aschedule of work to 10 people with different expertise and time constraints -may already land us with several billion corners to try out. The simplexalgorithm finds a quicker way through. Rather than randomly wandering along a polytope'sedges, it implements a "pivot rule" at each corner. Subtly differentvariations of this pivot rule exist in different implementations of thealgorithm, but often it involves picking the edge along which the targetfunction descends most steeply, thus ensuring each step takes us nearer the optimalvalue. When a corner is found where no further descent is possible, we know wehave arrived at the optimal point. Practical experience shows that the simplexmethod is generally a very slick problem-solver indeed, typically reaching anoptimum solution after a number of pivots comparable to the number ofdimensions in the problem. That means a likely maximum of a few hundred stepsto solve a 50-dimensional problem, rather than billions with a suck-it-and-seeapproach. Such a running time is said to be "polynomial"or simply "P", the benchmark for practicalalgorithms that have to run on finite processors in the real world. Dantzig's algorithm saw itsfirst commercial application in 1952, when Abraham Charnes and William Cooper at what is now Carnegie Mellon University inPittsburgh, Pennsylvania, teamed up with Robert Mellon at the Gulf Oil Companyto discover how best to blend available stocksof four differentpetroleum products into an aviation fuel with an optimaloctane level. Since then the simplex algorithm has steadily conquered theworld, embedded both in commercial optimisation packages and bespoke software products. Whereveranyone is trying to solve a large scale optimisation problem, the chances arethat some computer chip is humming away to its tune. "Probably tens orhundreds of thousands of calls of the simplex method are made everyminute," says Jacek Gondzio, an optimisation specialist at the University ofEdinburgh, UK.
Yet even as its popularity grew in the 1950s and 1960s, thealgorithm's underpinnings were beginning to show signs of strain. To startwith, its running time is polynomial only on average. In 1972, US mathematicians Victor Klee and George Minty reinforced thispoint by running the algorithm around some ingeniously deformedmultidimensional "hypercubes". Just as a square has four corners, anda cube eight, a hypercube in n dimensions has 2n corners.The wonky way Klee and Minty put their hypercubes together meant that thesimplex algorithm had to run through all of these corners before landing on theoptimal one. In just 41 dimensions, that leaves the algorithm with over atrillion edges to traverse.
A similar story holds for every variation of the algorithm's pivotrule tried since Dantzig's original design: however well it does in general, italways seems possible to concoct some awkward optimisation problems in which itperforms poorly. The good news is that these pathological cases tend not toshow up in practical applications - though exactly why this should be soremains unclear. "This behaviour eludes any rigorous mathematicalexplanation, but it certainly pleases practitioners," says Gondzio.
The fault was still enough to spur on researchers to find analternative to the simplex method. The principal pretender came along in the1970s and 1980s with the discovery of "interior point methods", flashy algorithms which rather than feeling theirway around a polytope's surface drill a path through its core. They came with agenuine mathematical seal of approval - a guarantee always to run in polynomial time - and typically took fewer stepsto reach the optimum point than the simplex method, rarely needing over 100moves regardless of how many dimensions the problem had. The discovery generated a lot of excitement, and fora while it seemed that the demise of Dantzig's algorithm was on the cards. Yetit survived and even prospered. The trouble with interior point methods is thateach step entails far more computation than a simplex pivot: instead ofcomparing a target function along a small number of edges, you must analyse allthe possible directions within the polytope's interior, a gigantic undertaking.For some huge industrial problems, this trade-off is worth it, but for by no means all. Gondzio estimates that between80 and 90 per cent of today's linear optimisation problems are still solved bysome variant of the simplex algorithm. The same goes for a good few of the even more complex non-linear problems(see "Straight down the line"). "As a devoted interior-point researcher Ihave a huge respect for the simplex method," says Gondzio. "I'm doingmy best trying to compete."
We would still dearly love to find something better: some newvariant of the simplex algorithm that preserves all its advantages, but alsoinvariably runs in polynomial time. For US mathematician and Fields medallist Steve Smale, writing in 1998, discoveringsuch a "strongly polynomial" algorithm was one of 18 outstanding mathematical questions to be dealt with in the 21st century. Yet finding such an algorithm may not now even be possible.
That is because the existence of such an improved, infalliblealgorithm depends on a more fundamental geometrical assumption - that a shortenough path around the surface of a polytope between two corners actuallyexists. Yes, you've got it: the Hirsch conjecture. The fates of the conjectureand the algorithm have always been intertwined. Hirsch was himself a pioneer in operational research and an early collaborator ofDantzig's, and it was in a letter to Dantzig in 1957 musing about theefficiency of the algorithm that Hirsch first formulated his conjecture. Until recently, little had happened to cast doubt on it. Kleeproved it true for all 3D polyhedra in 1966, but had a hunch the same did nothold for higher-dimensional polytopes. In his later years, he made a habit of suggesting it as a problem to every freshlyscrubbed researcher he ran across.
In 2001 one of them, a young Spaniard called Francisco Santos,now at the University of Cantabria in Santander, took on the challenge. As isthe way of such puzzles, it took time. After almost a decade working on theproblem, Santos was ready to announce his findings at a conference in Seattlein 2010. Last month, the resulting paper was published in the Annals of Mathematics (vol 176, p 383). In it, Santos describes a 43-dimensional polytope with86 faces. According to Hirsch's conjecture, the longest path across this shapewould have (86-43) steps, that is, 43 steps. But Santos was able to establishconclusively that it contains a pair of corners at least 44 steps apart. Ifonly for a single special case, Hirsch's conjecture had been proved false."It settled a problem that we did not know how to approach for many decades," says Gil Kalai ofthe Hebrew University of Jerusalem. "The entire proof is deep, complicatedand very elegant. It is a great result." A great result, true, butdecidedly bad news for the simplex algorithm. Since Santos's first disproof, furtherHirsch-defying polytopes have been found in dimensions as low as 20. The onlyknown limit on the shortest distance between two points on a polytope's surfaceis now contained in a mathematical expression derived byKalai and Daniel Kleitman of the Massachusetts Institute of Technology in 1992.This bound is much larger than the one the Hirsch conjecture would have provided,had it proved to be true. It is far too big, in fact, to guarantee a reasonablerunning time for the simplex method, whatever fancy new pivot rule we mightdream up. If this is the best we can do, it may be that Smale's goal of anidealised algorithm will remain forever out of reach – with potentially seriousconsequences for the future of optimisation.
All is not lost, however. A highly efficient variant of thesimplex algorithm may still be possible if the so-called polynomial Hirschconjecture is true. This would considerably tighten Kalai and Kleitman's bound, guaranteeing that no polytopes have pathsdisproportionately long compared with their dimension and number of faces. Atopic of interest before the plain-vanilla Hirsch conjecture melted away, the polynomial version has been attractingintense attention since Santos's announcement, both as a deep geometricalconundrum and a promising place to sniff around for an optimally efficient optimisation procedure.
As yet, there is no conclusive sign that the polynomial conjecturecan be proved either. "I am not confident at all," says Kalai. Notthat this puts him off. "What is exciting about this problem is that we do not know the answer." A lot could be riding on thatanswer. As the algorithm continues to hum away in those basements it is still, for the most part, telling us what we want to know inthe time we want to know it. But its own fate is now more than ever in thehands of the mathematicians.
Edges, corners and faces
Since Plato laid down his stylus, a lot of work has gone intounderstanding the properties of 3D shapes, or polyhedra. Perhaps the mostcelebrated result came from the 18thcentury mathematician Leonhard Euler. Henoted that every polyhedron has a number of edges that is two fewer than thetotal of its faces and corners. The cube,for example, has six faces and eight corners, a total of 14, while its edges number12. The truncated icosahedron, meanwhile, is the familiar pattern of a standardsoccer ball. It has 32 faces (12 pentagonal and 20 hexagonal), 60 corners - and90 edges. The French mathematician Adrien-Marie Legendre proved this rule in1794 for every 3D shape that contains no holes and does not cut through itselfin any strange way. As geometry started to grow more sophisticated and extendinto higher dimensions in the 19th century, it became clear that Euler'srelationship didn't stop there: a simple extension to the rule applies to shapes, or polytopes, in any number ofdimensions. For a 4D "hypercube", for example, a variant of theformula guarantees that the total number of corners (16) and faces (24) will beequal to number of edges (32) added to the number of 3D "facets" theshape possesses (8).
The rule derived by Warren Hirsch in 1957 about the maximumdistance between two corners of a polyhedron was thought to be similarlycast-iron. Whether it truly is turns out to have surprising relevance to thesmooth workings of the modern world (see main story).
2000 years of algorithms
George Dantzig's simplex algorithm has a claim to be the world'smost significant (see main story). But algorithms go back far further.
c. 300 BC THEEUCLIDEAN ALGORITHM
From Euclid's mathematical primer Elements,this is the grandaddy of all algorithms, showing how, given two numbers, youcan find the largest number that divides into both. It has still not been bettered.
820 THEQUADRATIC ALGORITHM
The word algorithm is derived from the name of the Persian mathematicianAl-Khwarizmi. Experienced practitioners today perform his algorithm for solvingquadratic equations (those containing an x2 term) in their heads. For everyoneelse, modern algebra provides the formulafamiliar from school.
1936 THEUNIVERSAL TURING MACHINE
The British mathematician AlanTuring equated algorithms with mechanicalprocesses - and found one to mimic all the others, the theoretical template forthe programmable computer.
1946 THEMONTE CARLO METHOD
When your problem is just too hard to solve directly, enter thecasino of chance. John von Neumann, Stanislaw Ulam and Nicholas Metropolis'sMonte Carlo algorithm taught us how to play and win.
1957 THEFORTRAN COMPILER
Programming was a fiddly, laborious job until an IBM team led byJohn Backus inventedthe first high-level programming language, Fortran. At the centreis the compiler: the algorithm which converts the programmer's instructionsinto machine code.
1962 QUICKSORT
Extracting a word from the right place in a dictionary is an easytask; putting all the words in the right order in the first place is not. The Britishmathematician Tony Hoare provided the recipe, now an essential tool in managingdatabases of all kinds.
1965 THEFAST FOURIER TRANSFORM
Much digital technology depends on breaking down irregular signalsinto their pure sinewave components - making James Cooley and John Tukey'salgorithm one of the world's most widely used.
1994 SHOR'SALGORITHM
Bell Labs's Peter Shor found a new, fast algorithm for splitting awhole number into its constituent primes - but it could only be performed by aquantum computer. If ever implemented on a large scale, it would nullify almost all modern internet security.
1998 PAGERANK
The internet's vast repository of information would be of littleuse without a way to search it. Stanford University's Sergey Brin and LarryPage found a way to assign a rank to every web page - and the founders ofGoogle have been living off it ever since.
Straight down the line
When a young and nervous George Dantzig spoke about his newsimplex algorithm at a conference of eminent economists and statisticians inWisconsin in 1948, a rather large hand was raised in objection at the back ofthe room. It was that of the renowned mathematician Harold Hotelling. "But we all know the world is non-linear," he said. It was a devastating put-down. Thesimplex algorithm's success in solving optimization problems (see main story)depends on assuming that variables vary in response to other variables alongnice straight lines. A cutlery company increasing its expenditure on metal, forexample, will produce proportionately more finished knives, forks and profitthe next month. In fact, as Hotelling pointed out, the real world isjam-packed with non-linearity. As the cutlery company expands, economies ofscale may mean the marginal cost of each knife
or fork drops, making for a non-linear profit boost. Ingeometrical terms, such problems are represented by multidimensional shapesjust as linear problems are, but ones bounded by curved faces that the simplexalgorithm should have difficulty crawling round. Surprisingly, though, linearapproximations to non-linear processes turn out to be good enough for mostpractical purposes. "I would guess that 90 or 95 per cent of all optimisation problems solved in the world are linearprograms," says Jacek Gondzio of the University of Edinburgh, UK. Forthose few remaining problems that do not submit to linear wiles, there is arelated field of non-linear programming - and here too, specially adapted versions of the simplex algorithm have come to play animportant part.
Archiver|手机版|科学网 ( 京ICP备07017567号-12 )
GMT+8, 2024-12-26 10:53
Powered by ScienceNet.cn
Copyright © 2007- 中国科学报社