Codes : Their Nature and Manipulation
by E. L. Bentley, Proprietor of Bentley's Complete Phrase Code

How the Code, the "shorthand of international commerce," has evolved, and the unthought-of economies of its application, are ably set out here. The companion entry, Cable, with its colour map, should be studied as well as the brief entries on the various code systems, e.g., A.B.C. ; Bentley's

Harmsworth's Business Encyclopedia and Commercial Educator (1925?) : 1483-88

For telegraphing or cabling to distant places a code is almost invariably used, the main reason for this being the great saving of money. The use of codes followed quickly on the establishment of cable communication between the two hemispheres, as rates were at first as high as 10s. a word.

In the earliest codes, dictionary words, not exceeding ten letters, were used; each code-word meant a phrase or sentence. Verini went a step further and had Latin roots and terminals, the root meaning one thing and terminals other things. Such roots and terminals could be tabulated and mean two things each, so that three or four lots of information could be expressed in one word.

The International Telegraphic Conferences, which have been held on several occasions, fix the regulations regarding the use of codes. Up to the London Conference of 1903 the conferences had decided that code messages must consist of real words from real languages, and they followed up this decision by issuing a list of admissible words, the Official Vocabulary, the last edition of which was a massive publication in four volumes, containing over a million and a quarter words, many defective and exceeding the ten-letter limit.

The London conference withdrew the insistence on real words, and laid it down that any groups of not more than ten letters might be used as code words, provided they were pronounceable according to the rules of the following languages: English, French, German, Italian, Spanish, Portuguese, Dutch and Latin. These regulations were confirmed at Lisbon in 1908, and a committee was appointed for the purpose of granting certificates to the owners of codes recognizing their code words as being in agreement with the regulations. Since the Great War, the committee which approves or rejects code words has not functioned, and there is some confusion as to what is permitted and what is not. The position now is that, there being no authority to pass or reject new code words, any Government or any cable company may, and sometimes does, reject words even though they are being accepted without challenge elsewhere. This gives a gratuitous advantage to those which were approved before the committee collapsed.

Modern codes are mainly of three different types:
Dictionary codes;
Chain and Sectional Cipher codes; and
Figure codes.

Each of these will now be discussed in turn.

DICTIONARY CODES . Under the Old Berne Vocabulary System, codes contained long stereotyped sentences. This had the great disadvantage that in such sentences there is often something which does not apply to the required case. The new rules admitting artificial words paved the way for a new method, and at the end of 1906 appeared the first code based on what may be called the Phrase System. This overcame the disadvantage by the use mainly of short phrases which could be built up into longer sentences. English is particularly suitable for this purpose.

This was the first of the usual type of modern public code, which may be called dictionary codes. These codes contain a number of words and phrases, and to each of them a five-letter code word is attached, the latter being in alphabetical order, as in an ordinary dictionary. Coding consists of writing the five-letter words for the phrases and joining them up in pairs to make words of ten letters for transmission. For example :
Referring to your telegram of the 31st .... UGVAY
do not agree .............................. AHHEM
prefer to ................................. OGYOZ
resume buying ............................. BISIJ

This would be arranged for transmission:
UGVAYAHHEM OGYOZBISIJ

These codes are very simple to handle, and are the best and most popular for cables of a general or discursive type, as they give upwards of a thousand million variations.
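
The joining of half-words in pairs can be sketched in a few lines of Python. The function name pair_half_words is illustrative only and comes from no code book; the treatment of an odd half-word left over (sent alone as a five-letter word) is an assumption, since the text does not say how it is handled.

```python
def pair_half_words(half_words):
    """Join five-letter half-words in pairs into ten-letter words."""
    for w in half_words:
        if len(w) != 5:
            raise ValueError(f"not a five-letter half-word: {w!r}")
    # Take the half-words two at a time; an odd one out is left on its own
    # (assumption: the source does not say how a leftover is handled).
    return ["".join(half_words[i:i + 2]) for i in range(0, len(half_words), 2)]


message = ["UGVAY", "AHHEM", "OGYOZ", "BISIJ"]
print(" ".join(pair_half_words(message)))  # UGVAYAHHEM OGYOZBISIJ
```

The four half-words of the example thus go out as two chargeable words instead of four.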

The Detection of Error
In a good code of this kind the half-words should be carefully constructed. The least virtue they should have is a difference of not less than two letters between any one and any other. This means that if a letter becomes mutilated by bad transmission the new word so formed is not a word in the code, and the recipient knows at once there is an error. With a two-letter difference he can usually correct the error. To assist in this, some codes have the code words repeated in a list at the end of the book, but arranged in terminational order, that is, in alphabetical order starting from the ends of the words. This aids in the correction of mutilations and incidentally helps to show whether there really is a two-letter difference or not. Other codes have various kinds of mutilation tables for the same purpose.
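
The two-letter difference test, the search for possible corrections, and the terminational ordering can be sketched as follows. This is a minimal illustration, not the method of any particular code book: all the function names are invented here, and the four half-words are simply those of the example above standing in for a full code.

```python
from itertools import combinations


def letter_difference(a, b):
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must be the same length")
    return sum(x != y for x, y in zip(a, b))


def has_two_letter_difference(code_words):
    """True if every pair of code words differs in at least two letters."""
    return all(letter_difference(a, b) >= 2 for a, b in combinations(code_words, 2))


def candidates(received, code_words):
    """Code words that one mutilated letter could have turned into `received`."""
    return [w for w in code_words if letter_difference(w, received) == 1]


def terminational_order(code_words):
    """Sort alphabetically starting from the ends of the words."""
    return sorted(code_words, key=lambda w: w[::-1])


code_words = ["UGVAY", "AHHEM", "OGYOZ", "BISIJ"]  # stand-in for a full code
print(has_two_letter_difference(code_words))       # True
print("UGVAX" in code_words)                       # False: the mutilation shows
print(candidates("UGVAX", code_words))             # ['UGVAY']
print(terminational_order(code_words))             # ['BISIJ', 'AHHEM', 'UGVAY', 'OGYOZ']
```

In a large code the list of candidates may hold more than one word, which is why the correction is usual rather than certain.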

The two-letter difference principle is a very important one. It is a well-established fact that large sums of money have been lost by ignoring it, sometimes over a thousand pounds on account of one error. The two-letter difference rule applies to each half-word. Sometimes the argument is raised that if a five-letter code has a one-letter difference among its half-words of five letters this ensures a two-letter difference in the complete words of ten letters. This is clearly an error. Each five letters, in the present type of code, must be considered separately. The second half-word might be identical in two different ten-letter words, and if the first half-words differed only in one letter there would only be a difference of one letter in the ten-letter words.

Morse Code and Code Ciphers
Though in general a two-letter difference is sufficient, there is one objectionable case which can be avoided by constructing code words, and at the same time keeping in mind their appearance in the Morse Code. Frequently one pair of letters can easily be mutilated into another pair, in such a case as the following: BE (-... .) and DI (-.. ..) are the same in Morse except for the spacing. So that if, say, BELUX were a code word, DILUX ought not to be. It is in refinements of this kind that a good series of code ciphers differs from an indifferent one. In compiling codes a system has also to be adopted to eliminate bad junctions of half-words. Before adopting ciphers, these features should be studied.
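
A compiler could screen for this kind of mutilation mechanically. The sketch below assumes the International Morse alphabet and compares two words with the gaps between letters removed; the function names are illustrative. Comparing whole words in this way is somewhat stricter than comparing single pairs of letters, since it also catches cases where the letter boundaries shift elsewhere in the word.

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}


def unspaced_morse(word):
    """Dots and dashes of a word with the gaps between letters removed."""
    return "".join(MORSE[ch] for ch in word.upper())


def spacing_confusable(a, b):
    """True if two different words are identical in Morse apart from spacing."""
    return a != b and unspaced_morse(a) == unspaced_morse(b)


print(unspaced_morse("BE"), unspaced_morse("DI"))  # -.... -....
print(spacing_confusable("BELUX", "DILUX"))        # True
```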

The number of pronounceable five-letter code words that can be constructed is limited, and naturally each extra refinement in their construction tends to reduce somewhat the number. To get a very large number of five-letter code words involves taking the utmost liberties with the rules of pronounceability.

In selecting a code of this five-letter kind a few points should be borne in mind. Firstly, a code with a very large number of phrases has obvious advantages. It includes a great deal of matter, and from that point of view is good. On the other hand much of this additional matter is of little use to the ordinary man. When a code gets up to, say, 50,000 phrases, all the really useful phrases should have been included, and any extra phrases do not mean a proportionate increase in the usefulness of the code. This increase is purchased at the expense of a decrease in the quality of the code words, with a consequent increase in the probability of a difficult mutilation and of the possibility of their being challenged. Also, when a letter gets mutilated there are a large number of possible corrections to consider, with a consequent diminution in the probability of one being able to choose the right one.

It should also be noticed that it is the short and common phrases that really make up the greater part of the usefulness of a code. Phrases like "you had" and "but if," which are of constant occurrence, are of much more importance than long phrases running into a dozen words or more, but rarely used. It will be obvious that, for the most part, the longer a phrase the more particular and exclusive the information it contains, and the fewer telegrams it is likely to fit into.

It is possible to exaggerate greatly the efficiency of a code by making up a hypothetical telegram containing several of the longer phrases in the code. This should be remembered when reading code advertisements.

To Suit Codes to Particular Cases
General codes frequently contain a number of blank five-letter words, to which correspondents can assign their own meanings by mutual arrangement. Where a private code is not used, this is a very great advantage. In choosing a suitable code the size of such a blank supplement is quite a point to be considered. No code as printed can exactly fit a particular business.

In deciding on a code the efficiency of the telegraph service on the routes to be used should be taken into account. Some of the main cable routes, e.g. the principal lines between London and New York, and elsewhere, are very efficient, though a few mutilations even there are inevitable. But if a cable comes from an up-country station in India, or the Balkans, there is a big chance it will get mutilated over the short land journey, before the longer but more efficient submarine cable is called upon to transmit it. The quality of the code words and any scheme for dealing with mutilations need to be specially good for use on routes where the efficiency of transmission is poor.

A public code puts its holders into touch with all other holders. Hence a code is of much more value if it is already widely spread. Most big offices keep copies of three or four of the best-known codes, even when for special reasons they do not use them much in the inside working of their business.

Use of Challenged Code Words
Some codes contain code words which have been internationally challenged. No attempt has been made to conform with due care to existing regulations. In such cases there is a risk of the defective code words being charged at double rates.

Several of the well-known codes are published in pocket form. These pocket editions are facsimiles of the office edition made in small size for the benefit of travellers.

Doubtless owing to the fact that English is the leading commercial language of the world, the greater part of the public codes so far produced are in that language. Other countries have not been prolific in their output of telegraphic codes. The brevity and simplicity of the English language, particularly of the English verb, have contributed much to this. Attempts have been made to produce codes in alternative languages, the code being printed in two or more languages with the idea that it may be coded by the sender in one and decoded by the recipient in another. Owing to the difference in systems of grammar, this has not proved a great success, as precision in expression is difficult.

CHAIN AND SECTIONAL CIPHER CODES . These codes are a later introduction, and they are almost all private codes. To be of much use they must be adapted to the work of a particular firm, and are usually used in conjunction with dictionary codes. In these codes the code word is built up a letter or two at a time. Each bit of the code word has a separate meaning and is taken from a separate table, the tables being used in a strict order. An example from a hypothetical code will make this clear.

Suppose a section of the code is used to report sales of some particular commodity and that the two letters RU introduce this table, and mean "we have sold." The second section will consist, say, of a list of possible quantities, with letters BA, BE, BI, etc. down to ZU against them (one of these two-letter syllables being left blank, to be used if the quantity in question is not in the list). The next section may consist of twenty consonants, B, C, D, F, G, etc., to Z, with positions, i.e., months, pairs of months, etc., against them; then a section with the five vowels, one being left blank and the other four having against them the principal ports to which sales are being made; after this a section BA to ZU, to indicate 99 of the chief prices at which the commodity is likely to be sold (with one blank for prices not on the list).

Example of Code Employment
Thus "we have sold 500 tons July/August Antwerp £1/2/6" might be RUCONEJA, where:
RU introduces the tables and means "we have sold."
CO means 500 tons from the list of tons.
N means July/August from the list of positions.
E means Antwerp from the list of ports.
JA means £1/2/6 from the list of prices.

The Table being headed "Sales" and the name of the commodity, the whole transaction is thus reported in eight letters.
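
The working of such a chain code can be sketched as a series of table look-ups taken in strict order. The tables below are hypothetical and hold only the five entries quoted in the example; a real code would list every syllable from BA to ZU, every consonant, and so on. The names SECTIONS, decode_chain and encode_chain are invented for this illustration.

```python
# Hypothetical tables holding only the entries quoted in the example.
# Each section is (name, number of letters, table of meanings).
SECTIONS = [
    ("heading",  2, {"RU": "we have sold"}),
    ("quantity", 2, {"CO": "500 tons"}),
    ("position", 1, {"N": "July/August"}),
    ("port",     1, {"E": "Antwerp"}),
    ("price",    2, {"JA": "£1/2/6"}),
]


def decode_chain(word):
    """Split a chain code word into its sections, in strict table order."""
    meanings, pos = {}, 0
    for name, width, table in SECTIONS:
        piece = word[pos:pos + width]
        if piece not in table:
            raise KeyError(f"{piece!r} is not in the {name} table")
        meanings[name] = table[piece]
        pos += width
    if pos != len(word):
        raise ValueError("code word has the wrong length")
    return meanings


def encode_chain(meanings):
    """Build the code word by looking each meaning up in its own table."""
    word = ""
    for name, _width, table in SECTIONS:
        reverse = {meaning: letters for letters, meaning in table.items()}
        word += reverse[meanings[name]]
    return word


report = decode_chain("RUCONEJA")
print(report)
print(encode_chain(report))  # RUCONEJA
```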

The word thus produced is obviously pronounceable, as it consists of vowels and consonants alternately. As it stands there is no means of checking whether it has been accurately transmitted. Occasionally such codes are used without any check system, but this practice cannot be too strongly condemned. A check system is essential, and several are in use. In one method the meaning is conveyed by eight letters, as in the example above, and an extra syllable of two letters used to make up the complete word and to serve as a check. Using five vowels and twenty consonants there are 100 possible syllables consisting of a consonant followed by a vowel. These are numbered 00 to 99.

Making a Check Syllable
To make a check syllable write down the number of each of the four two-letter syllables, add them up, reject the hundreds figure (if any), and find the syllable corresponding to the total. This syllable is written after the other four, and serves as a check. If a single letter goes wrong in transit a check thus formed will always show it. It will be noted that this system, a fairly common one, is rather extravagant, the telegrams being increased in length by 25 per cent owing to the check syllables.
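
The check syllable can be worked out mechanically, as the sketch below suggests. Two things in it are assumptions and not taken from any actual code: the choice of the twenty consonants (here the consonants of the alphabet without Y) and the numbering of the syllables in the order BA, BE, BI, BO, BU, CA, and so on. The function names are likewise illustrative.

```python
VOWELS = "AEIOU"
# Assumption: the twenty consonants are the alphabet's consonants without Y.
CONSONANTS = "BCDFGHJKLMNPQRSTVWXZ"

# Number the 100 consonant-vowel syllables 00 to 99: BA=0, BE=1, ... ZU=99.
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS]
NUMBER = {s: n for n, s in enumerate(SYLLABLES)}


def check_syllable(word):
    """Check syllable for an eight-letter word made of four syllables."""
    if len(word) != 8:
        raise ValueError("expected four two-letter syllables")
    total = sum(NUMBER[word[i:i + 2]] for i in range(0, 8, 2))
    return SYLLABLES[total % 100]  # reject the hundreds figure


def add_check(word):
    """Append the check syllable to make the complete ten-letter word."""
    return word + check_syllable(word)


def is_consistent(word10):
    """True if the last syllable agrees with the first four."""
    try:
        return check_syllable(word10[:8]) == word10[8:]
    except KeyError:  # a piece that is not a consonant-vowel syllable
        return False


sent = add_check("RUCONEJA")
print(sent)                         # RUCONEJAPO under the numbering assumed here
print(is_consistent(sent))          # True
print(is_consistent("RUCOMEJAPO"))  # False: N mutilated to M
```

With this numbering RU, CO, NE and JA are 69, 08, 51 and 30, which add up to 158; rejecting the hundreds figure leaves 58, the syllable PO. The two letters added to the eight are the 25 per cent increase noted above.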

There are several other systems. In one type all the five syllables are used to carry the message, a check number is calculated from them, as above, and this is indicated by turning some of the syllables round, i.e. writing the vowel before the consonant.

Closely allied to these chain cipher codes are sectional cipher codes. These are not so economical as the chain codes, but are much simpler to work and quicker in use, with rapid correction when mutilated. The error can frequently be located within very narrow limits. This helps in rectifying an error, and even when the error cannot safely be righted it is useful to know which part of the cable is wrong. These sectional ciphers consist of sets of three letters with a two-letter difference among themselves, and sets of two letters, again with a two-letter difference. The first three letters stand for part of a message, and the next two for another part, these two parts being joined in a half-word. This is the simplest case, but the system can be elaborated.
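
How a sectional cipher narrows down an error may be seen from the following sketch of the simplest case, a half-word of three letters plus two. Every code group and meaning in the two tables is invented purely for illustration, as are the names FIRST, SECOND and read_half_word; within each table the groups keep a two-letter difference.

```python
# Hypothetical tables: every group and meaning here is invented for illustration.
FIRST = {"BAL": "we offer", "COM": "we accept", "DUN": "we decline"}
SECOND = {"AX": "prompt shipment", "EZ": "shipment next month",
          "OY": "subject to confirmation"}


def read_half_word(half_word):
    """Decode a five-letter half-word as a three-letter part plus a two-letter part.

    Returns (meanings, faulty), where faulty names the parts not found in
    their tables, so that an error is located to one part of the half-word.
    """
    parts = {"first": (half_word[:3], FIRST), "second": (half_word[3:], SECOND)}
    meanings, faulty = {}, []
    for name, (letters, table) in parts.items():
        if letters in table:
            meanings[name] = table[letters]
        else:
            faulty.append(name)
    return meanings, faulty


print(read_half_word("BALAX"))  # both parts found
print(read_half_word("BALEX"))  # the second part is marked faulty
```

Even where the faulty part cannot safely be righted, the sound part of the half-word can still be read.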

FIGURE CODES . For economy a good figure code cannot be beaten. To the uninitiated the amount of information which can be conveyed in a word or two by a figure code borders on the miraculous. In such codes the message is expressed from the code book as a string of figures, and these figures are then converted into pronounceable ten-letter words for transmission. In the old days of the Berne Vocabulary the words in that monumental list could be numbered, and any six-figure number could be represented by one of a million numbered words.

Under the revised rules it is usual to use one code word to convey at least nine figures, and usually ten. Systems are available for turning a number of more than ten figures into a code word. But an efficient check system is essential. Most of the schemes for turning eleven, twelve or even thirteen figures into a code word either provide for no check at all, or, if they do, a letter may go wrong and the check still give the recipient the impression that the telegram is right.
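
The turning of ten figures into one pronounceable word can be sketched as follows. The mapping used here, two figures to each consonant-vowel syllable, is an assumption made for illustration and is not the scheme of any published condenser; the choice of twenty consonants (Y omitted) and the function names are likewise hypothetical.

```python
VOWELS = "AEIOU"
CONSONANTS = "BCDFGHJKLMNPQRSTVWXZ"  # assumption: twenty consonants, Y omitted
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS]  # 100 syllables, 00 to 99


def figures_to_word(figures):
    """Turn a string of ten figures into a ten-letter word, two figures per syllable."""
    if len(figures) != 10 or any(ch not in "0123456789" for ch in figures):
        raise ValueError("expected exactly ten figures")
    return "".join(SYLLABLES[int(figures[i:i + 2])] for i in range(0, 10, 2))


def word_to_figures(word):
    """Recover the ten figures from the word."""
    if len(word) != 10:
        raise ValueError("expected a ten-letter word")
    return "".join(f"{SYLLABLES.index(word[i:i + 2]):02d}" for i in range(0, 10, 2))


word = figures_to_word("0312745908")
print(word)                   # BODISUPUCO
print(word_to_figures(word))  # 0312745908
```

Since every one of the hundred syllables stands for some pair of figures, a mutilated letter will generally produce another word that decodes without complaint. A scheme of this bare kind therefore carries no check of its own, which is the weakness described above.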

A great variety of such schemes (sometimes called "condensers") is on the market. Each should be judged on its merits. It may be noted that the fact that a system has run for some time without serious trouble is no proof of its soundness. If a letter can go wrong and the check fail to show it, it may be months before such an accident occurs, and when it does occur the resulting error may be trifling, or it may be disastrous. Also a system may be in use on a route where the cables are transmitted with great accuracy, whereas the same system would be much more likely to lead to trouble where there was a less efficient service.

As already observed, a figure code is very economical. But it always has one disadvantage, the double process. First the message is put into figures and then, by a fresh piece of work, into letters. The extra labour is apt to discourage the use of such codes, whereas the saving is so great that to any firm with much cabling to do their use would be an economy.

Cumbersome Types of Code
Codes can be unnecessarily complicated. Some of the more cumbersome types, requiring sometimes even three processes in coding, are quite uneconomical when considering the delay involved and the salaries of the cabling clerks at both ends. Some condensers are printed in colours, the messages being read according to the colour of the figures transmitted by the code words.

Figure codes are apt to be a little bewildering to one unfamiliar with them. Ten minutes' careful examination may be necessary to dispel this, but busy people will often not give it! The risk of error in a figure code owing to bad or careless coding is not greater than in codes of other types. The coding has to be carefully done, and the coder feels this and keeps alert. It would not, perhaps, be saying too much to assert that in all probability more mistakes occur in the very simplest codes. The user gets so familiar with them that he is apt to be careless and trust to memory. Whatever code is used, the coder must see that the message is properly checked and should be certain that it is right. If the message arrives with a mistake in it the recipient should be able to feel sure that the cable company and not the sender is at fault. If he has to allow for mistakes in coding as well as in transmission his task in putting right an error is made far more difficult.

The difference between figure codes and chain codes is not so much as might appear. At first sight the latter looks simpler, but as a matter of fact there is always an appeal to figures for the purpose of the check. This nullifies most of the apparent simplicity. In practice figure codes are more used. They are more easily adapted to any type of stereotyped message with the maximum of economy.

Firms which use a figure code usually use a dictionary code as well. The system is often so adjusted that the two codes can be used in the same cable. When the forms of the code words of the two codes are markedly different, this mixing is a simple matter. It sometimes happens, however, that too many liberties are taken and words from different codes are mixed together without any obvious distinction. This may involve looking up words in two different codes, with a possibility of a word being in both codes with different meanings. In case of a mutilated word it will be seen how much more difficult it is to rectify the error when the faulty word may be in either of two codes.

A better way of mixing a figure code with a dictionary code is to proceed as follows: Use a dictionary code in which all the half-words are numbered. They will go up to, say, something under 40,000. Construct the figure code so that all the groups from it begin with a figure not less than 4. When coding from the dictionary code, ignore the code words in the code book and copy down the five-figure numbers instead. Using these and the figure code, the whole message is first of all expressed in figures, and then by a separate process turned into pronounceable groups for transmission. The decoder first turns the message as received into figures. As he decodes these figures he can at once tell which code each group comes from by noting whether it begins with a high or low number. In this scheme the printed five-letter code words are not used at all. There are a number of other devices (switch words, distinctive beginnings or endings, etc.) in use for mixing codes.
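
The decoder's test of the first figure can be sketched in a few lines. It is assumed here, for illustration only, that the groups of the figure code are five figures long like the numbers of the dictionary code; the function names are invented.

```python
def source_code(group):
    """Say which code a five-figure group belongs to, from its first figure."""
    if len(group) != 5 or any(ch not in "0123456789" for ch in group):
        raise ValueError(f"not a five-figure group: {group!r}")
    # Dictionary half-words are numbered below 40,000, so they begin with 0 to 3;
    # the figure code is built so that its groups begin with 4 or more.
    return "dictionary" if group[0] < "4" else "figure"


def split_message(figures):
    """Cut a decoded string of figures into five-figure groups and label each."""
    if len(figures) % 5:
        raise ValueError("message is not a whole number of five-figure groups")
    groups = [figures[i:i + 5] for i in range(0, len(figures), 5)]
    return [(g, source_code(g)) for g in groups]


print(split_message("0312745908"))
# [('03127', 'dictionary'), ('45908', 'figure')]
```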

Private Supplement to Public Codes
When the business done does not warrant the expense of preparing a private code, it is a common practice to adopt a suitable public code and produce a full private supplement to go with it. Some firms have arranged with the owner of a public code to print a special edition for them with different code words, so that the public should not be able to read their messages.

A firm doing a large business by cable (and some houses spend £30,000 a year on cables) will probably print a full private code of their own. In compiling such a code, the cables for a year or two past should be examined for phrases which constantly recur, and especially for such phrases as have not been coded economically hitherto. These phrases can then be embodied in the new code. Some firms regularly mark the phrases in their code whenever they are used. The code then becomes a valuable guide as to the relative usefulness of the phrases when the time comes to compile a new code. A number of code words must be left with no meanings against them, for additions, in case of the introduction of additional phrases for new business.
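
The search through past cables for recurring phrases is a counting exercise, and might be sketched as below. The cables shown are invented for illustration, and the function name and its limits (phrases of two to four words, occurring at least twice) are arbitrary assumptions, not a compiler's actual practice.

```python
from collections import Counter


def recurring_phrases(cables, max_words=4, min_count=2):
    """Count every run of 2..max_words consecutive words across past cables."""
    counts = Counter()
    for cable in cables:
        words = cable.lower().split()
        for n in range(2, max_words + 1):
            for i in range(len(words) - n + 1):
                counts[" ".join(words[i:i + n])] += 1
    # Keep only the phrases that recur, most frequent first.
    return [(p, c) for p, c in counts.most_common() if c >= min_count]


past_cables = [  # invented examples
    "we have sold 500 tons",
    "we have sold 200 tons",
    "do not agree",
]
for phrase, count in recurring_phrases(past_cables):
    print(count, phrase)
```

The phrases at the head of such a list are the first candidates for a place in the new code.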

There are several firms in London who compile codes. Most of the leading code publishers undertake the compiling of private codes for business houses. An expert will, after looking into the activities of a firm, be able to explain the relative advantages of various possible types of code and to advise what kind is best under the circumstances. Many questions will crop up in this connexion. What is the relative importance of speed, of economy and of secrecy? What cable routes are most used? How many commodities are dealt in? And a dozen other points. A really large firm, spending in cables thousands of pounds a year, will have a well-paid expert on the staff, to keep the codes up to date and to superintend the coding and decoding.

Codes are something of a nuisance in the routine of office work. They have, however, many points of interest, and will well repay a little careful thought on the part of the management of any firm whose overseas cabling is an important item. The compiling of a code, even by firms with an efficient coding staff, is best entrusted to a professional compiler. The chief reason for this is that one who devotes his time entirely to compiling codes has a greater selection of systems at his disposal and can more readily decide where they can be most advantageously introduced.


16 october 05