Information was not quantified until the 1940s.

Although, most of the time, I like to think of myself as a middle-aged man in his prime, there are moments when I realize that my shelf life ("bäst-före-datum", Swedish for "best-before date") is fast approaching its end. One such moment is when I contemplate the fact that Information Theory was put on a firm footing just 10 years before I started studying the subject at the Royal Institute of Technology in Stockholm, and that the fundamental unit of information, the "bit", was recognized as such as late as 1948. By then, the first computers had already been built, for goodness' sake!

Nowadays you do not have to be a specialist to run into the terms bit (a contraction of "binary digit") or byte (a group of 8 bits) on a daily basis, although more frequently perhaps in the form megabits per second (Mbps) or gigabytes (GB). In a way, it seems almost as strange that the "bit" was not named until 1948 as it would have been to learn that the number zero had not been assigned the symbol "0" until after World War 2. Today we are well aware that any type of information (text, numbers, images, music, movies...) can be represented as a stream of 1s and 0s, i.e. of bits.

The Morse code.

Granted, until the mid-19th century, information was communicated through the delivery of books, paintings, music sheets etc., so there was not much incentive to study the efficient transmission of information. Yet, the electrical telegraph was taken into use as early as the 1830s. The Morse code was developed shortly thereafter. It can be seen as the first attempt at data compression. Rather than assigning codes of equal length to the different characters, the most frequent letters were assigned shorter codes. This led to an overall improvement of transmission efficiency even though the least frequent characters then had to be assigned longer sequences. Even before the electrical telegraph, communication networks using a chain of optical telegraphs had been established in Sweden and France in the 1790s.
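To make the compression idea concrete, here is a minimal sketch in Python. It rests on several assumptions that are not in the text above: it uses the modern International Morse alphabet for A to Z (not necessarily the original 1830s code), it counts only dots and dashes (ignoring the pauses between letters and the different durations of dots and dashes), and it compares against a hypothetical equal-length code of 5 symbols per letter, the smallest width that can distinguish 26 letters with two symbols. The function names are made up for this illustration.

```python
import math

# International Morse code for the letters A-Z (dots and dashes only).
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}


def morse_symbol_count(text):
    """Total number of dots and dashes needed for the letters in text."""
    return sum(len(MORSE[ch]) for ch in text.upper() if ch in MORSE)


def fixed_symbol_count(text):
    """Symbols needed if every letter had the same code length."""
    width = math.ceil(math.log2(len(MORSE)))  # 5 symbols cover 26 letters
    letters = sum(1 for ch in text.upper() if ch in MORSE)
    return width * letters


if __name__ == "__main__":
    sample = "the sea"
    print(morse_symbol_count(sample))  # 12
    print(fixed_symbol_count(sample))  # 30
```

A text rich in rare letters (Q, J, X...) gains much less, which is exactly the trade-off described above: the saving comes from the statistics of the language, not from the code alone.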

In time, the electrical telegraph assumed great commercial importance. Later, wireless telegraphy became important, and remained so even when voice communications by radio became possible, using various modulation schemes. In 1936, the Olympics were televised to a number of theaters in Berlin.

One would think that all of those inventions should have triggered mathematical work on the optimal encoding of information, and of course there were many studies that were relevant to the subject (von Neumann, Wiener etc.), but it was not until Claude Shannon published his paper "A Mathematical Theory of Communication" in 1948 that a comprehensive theory of information was achieved. Shannon has been called "The Father of the Digital Age", with some justification!

In the late 1930s Shannon had been struck by the similarity between the structure of Boolean algebra and the properties of networks of electrical switching devices. He showed how Boolean algebra could be used to analyze the behavior of relay circuits, and in a leap of imagination proposed that such circuits could be used to perform logic operations. Conceptually, this was the basis for the electronic digital computer, which was developed during World War 2, based on vacuum tubes.

Claude Shannon (1916 - 2001)

In his 1948 paper, Shannon analyzed the concept of information and pointed out its close relationship to the concept of entropy, familiar from statistical mechanics. If a certain possibility has a relatively high statistical probability, its confirmation does not carry much information, so we should not waste many bits to send the corresponding message, regardless of its semantic content. (This is why the letter "e" in the Morse code is encoded in the shortest form available, "e" being the most frequent letter in the English language.) When there is no uncertainty, the entropy is zero, and it is pointless to send the information. (In an exercise at the Royal Institute of Technology, I had to calculate the optimal encoding of the announcement of that year's winner of the Nobel Prize in Literature, given certain a priori probabilities.)
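The relationship between probability and message length can be sketched in a few lines of Python. The entropy function below is the standard formula H = -Σ p·log2(p), in bits. For the encoding I have assumed Huffman's algorithm, a well-known way to build an optimal prefix code; whether the original exercise used it is not known, and the four candidate probabilities are purely illustrative, not the figures from that exercise.

```python
import heapq
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits: the sum of -p * log2(p) over all outcomes."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


def huffman_code_lengths(probabilities):
    """Code length (in bits) that a Huffman code gives each outcome."""
    # Heap entries: (probability, tie-breaker, outcomes in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probabilities)]
    heapq.heapify(heap)
    lengths = [0] * len(probabilities)
    tie_breaker = len(probabilities)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every outcome inside them.
        for outcome in group1 + group2:
            lengths[outcome] += 1
        heapq.heappush(heap, (p1 + p2, tie_breaker, group1 + group2))
        tie_breaker += 1
    return lengths


if __name__ == "__main__":
    # Hypothetical a priori probabilities for four candidates.
    candidates = [0.5, 0.25, 0.125, 0.125]
    lengths = huffman_code_lengths(candidates)
    average = sum(p * n for p, n in zip(candidates, lengths))
    print(entropy_bits(candidates))  # 1.75
    print(lengths)                   # [1, 2, 3, 3]
    print(average)                   # 1.75
```

With these particular probabilities (all powers of one half) the average code length equals the entropy exactly; in general it can only come close. With a single certain outcome, both the entropy and the code length come out as zero, which is the "pointless to send" case.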

Of course, the paper goes far beyond the preliminary observations on the mathematical characteristics of information. Pretty soon Shannon goes deep into the mathematics of reconstructing a signal corrupted by noise in a communications channel (leaving yours truly by the wayside). In 1949, Shannon developed the Sampling Theorem, which deals with the reconstruction of a continuous signal from a number of discrete samples. The Sampling Theorem has a complex history, with several contributors, as outlined in this reference.

Returning to the "bit", it seems so surprising to me that this was such a late discovery, or rather that its significance was discovered so late. For instance, if we should seek to communicate with a putative extraterrestrial civilization, would anyone propose to do it in some other form than a signal consisting of a series of 0s and 1s (or pulse code modulating a continuous carrier, which amounts to the same thing)?

We happen to attach a lot of significance to the number ten, but that is almost certainly a result of our having five digits on each hand. (Professor Nils Åslund at the Institute of Technology had this trick question: "A Martian believes that two times three is ten. How many fingers does he have on each hand?" - "Aha!", you say, "The base of his number system must be six, so the answer is three." - "Wrong!", says Professor Åslund, "The Martian has two fingers on each of his three hands!") If we could start from scratch, the most convenient base for our number system would probably not be ten (and certainly not two!) but twelve, divisible by 2, 3, 4 and 6.
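For readers who want to check the arithmetic, here is a small Python sketch. The function names are made up for this illustration. It finds the base in which the numeral "10" stands for two times three, lists the ways six fingers can be shared out over hands (which is why the puzzle has no single answer), and compares the divisors of ten and twelve.

```python
def base_where_numeral_equals(numeral, value):
    """Smallest base (2 to 36) in which the numeral string denotes value."""
    for base in range(2, 37):
        try:
            if int(numeral, base) == value:
                return base
        except ValueError:
            # The numeral contains a digit that does not exist in this base.
            continue
    return None


def hand_finger_splits(total_fingers):
    """All (hands, fingers per hand) pairs that give total_fingers."""
    return [(hands, total_fingers // hands)
            for hands in range(1, total_fingers + 1)
            if total_fingers % hands == 0]


def nontrivial_divisors(n):
    """Divisors of n other than 1 and n itself."""
    return [d for d in range(2, n) if n % d == 0]


if __name__ == "__main__":
    print(base_where_numeral_equals("10", 2 * 3))  # 6
    print(hand_finger_splits(6))    # [(1, 6), (2, 3), (3, 2), (6, 1)]
    print(nontrivial_divisors(10))  # [2, 5]
    print(nontrivial_divisors(12))  # [2, 3, 4, 6]
```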

Shannon himself seems to have been a colorful character. He designed and built chess-playing, maze-solving, and juggling machines, and a motorized pogo stick. He worked at Bell Labs for three decades. On occasion he would ride a unicycle down the corridors while juggling. He even fitted a unicycle with an off-center hub, so that he would bounce up and down while riding it. He is also credited with having made money at the roulette tables in Las Vegas and on the stock market by applying certain elements of Information Theory.

  Last edited or checked October 1, 2019
