Thursday, April 11, 2024

Impossible? Perhaps not!

Lord Kelvin

Throughout the history of science there have been proofs that something is impossible. These proofs are usually true in mathematics, such as that it is impossible to generate the number π with a ruler and compass. Despite which, many amateurs continue to assert that they have made it. I myself have had to face one of these “proofs”.

Another similar case is the proof, this time related to physical science, that it is impossible to build machines with perpetual motion, because they oppose the first or the second principle of thermodynamics. Also, in this case many amateurs insist that they have made it. In these cases, one should not waste time looking for the error, which is known to exist.

Other times, especially in the field of technology, the demonstration of impossibility is not definitive, but depends on conditions that are not always met. Sometimes this has stopped for some time the advancement of science, until someone noticed those conditions (which are not always clear) and bypassed the limitations. Let’s look at three examples:

  • The nineteenth-century controversy about the flight of heavier-than-air vehicles, which many considered impossible. Lord Kelvin declined the invitation to join the Aeronautical Society, by saying: I have no faith in any form of aerial navigation other than by balloons. And in 1903, the Canadian astronomer Simon Newcomb said this:

Quite likely the 20th century is destined to see the natural forces which will enable us to fly from continent to continent with a speed far exceeding that of a bird. But when we inquire whether aerial flight is possible in the present state of our knowledge; whether, with such materials as we possess, a combination of steel, cloth and wire can be made which, moved by the power of electricity or steam, shall form a successful flying machine, the outlook may be altogether different

The first flight of the Wright brothers took place in that same year (1903).

  • In 1969, Marvin Minsky and Seymour Papert published a book (Perceptrons: An Introduction to Computational Geometry) where they proved that the artificial neural networks of that time (perceptrons with two layers of neurons) could not represent the binary logic function exclusive-or, whose result is 0 if both inputs are equal, and 1 if they are different. Some people say (not everyone agrees) that this proof halted neural network research for ten years. The introduction of a third layer made it possible to represent the exclusive-or function and many others.
  • One of the most important problems posed to computers was sorting a series of numbers (from smallest to largest, or vice versa). The first algorithm developed to do this was simple, but slow when the series of numbers to be sorted was long: its execution time is proportional to the square of the size of the series. In other words: if that algorithm takes (say) 10 units of time to sort 10 numbers, sorting 100 numbers would take 1000 units; sorting 1000 numbers would take 100,000 units; and so on. The rest of this post is dedicated to this case.

Algorithms were soon designed that performed more quickly, since the time they need is proportional to the size of the series times its logarithm. If this algorithm took 10 units of time to sort 10 numbers, sorting 100 numbers would take 200 units; sorting 1000 numbers would take 3000 units; and so on. It can be easily seen that processing time, as a function of the number of elements to be sorted, grows much more slowly than with the first algorithm.

Some computer scientists tried to find algorithms even faster than the previous ones. However, someone proved a theorem that stated that it is impossible to order n numbers in a time shorter than n.log(n). That theorem stopped research in this field for about ten years, until someone designed the following algorithm, which sorts n numbers in a time proportional to n. This means that, if this algorithm takes 10 units of time to sort 10 numbers, sorting 100 numbers would take 100 units; sorting 1000 numbers would take 1000 units; and so on. Let’s look at the algorithm:

To sort n numbers, we cut n spaghetti into lengths proportional to the numbers to be sorted and place them vertically on a flat surface. We lower towards them another flat surface impregnated with glue. As soon as it encounters the slightest resistance, the descending surface stops and rises, taking the longest spaghetti with it. We take out that spaghetti and do the same operation again. In the end we have the spaghetti (and therefore the numbers) sorted from longest to smallest, and the time to do so has been proportional to the number of spaghetti (or numbers).

Although this algorithm is not easy to implement in a computer, it shows that it is possible to sort n numbers in a time proportional to n. Then what happens with the proof that it is impossible to do it faster than n.log(n)? Where was the mistake?

There wasn’t any. The proof is correct if it is applied to a series of real numbers. But not all real numbers are the same. Some (those that can be expressed as ratios a/b) are rational. There are infinitely many real numbers for which this is not possible, such as π, e, and many others. But it so happens that all real numbers that can be represented on a computer are rational, because they always have a finite number of digits (usually more than 50, in binary code). The theorem does not apply when we are dealing with rational numbers.

As soon as it was known that it was possible to solve the problem, algorithms were searched and found that could do the same on a computer. I used one of them in my implementation of APL language interpreters for the IBM personal computer, while previous interpreters of this language had used algorithms of the type n.log(n), which were slower for large values of n.

The same post in Spanish

Thematic Thread on Mathematics and Statistics: Previous Next

Manuel Alfonseca

No comments:

Post a Comment