Classification

The classification of vocabulary used here is essentially a reduced version of the classification devised by Michael Samuels and Christian Kay for the HT and in use as editing of materials from the OED progressed.[1] The outline classification was developed in the HT project's early years and was continually refined as word senses were placed within it.[2] Its final shape could not therefore be seen until the full Historical Thesaurus was complete. From the outset of the HT project it was evident that a small and easily separable body of ancillary materials (that is, c. 48,000 word senses from the Anglo-Saxon dictionaries as opposed to the almost 800,000 word senses drawn from the OED) could usefully serve as the corpus for a pilot study for the full thesaurus.

The word senses abstracted from the Clark Hall and Bosworth-Toller dictionaries are presented in 18 categories, some of which themselves are further divided into sizeable subcategories. Wherever possible the defining headings, which are based on the wording of the standard dictionaries, are written so as to match grammatically what is defined, although sometimes an adjectival heading may define a noun sub-category to which it adds a degree of specificity. Thus the wording of the definition reflects the part of speech: verb definitions opening with 'To ...' and adjective ones with adjectival forms. The sense definition for a group of nouns may point to their use in mass or count contexts, but noun gender, specifically a matter of grammatical congruence, is not given. Again, with verbs an intransitive definition may indicate that the verb is used intransitively, but until more work has been done on the verb phrase in Old English, little that is certain can be stated. Nor should it be, if the evidence needs examination. To assume that one thousand years ago English verbs could be distinguished as transitive and intransitive, as has been done from the evidence contained in the Bosworth-Toller dictionaries, may stem from a wish to find them so and to contrast them with today's system, in which transitivity is viewed rather as a feature of the clause than of the verb.[3] From another standpoint it can be argued that Old English has the same basic sentence types as today's English and that too little is known of the differing transitivity realizations of even the commonest verbs. From the broad conceptual point of view it is more important to indicate the central meanings of verbs than to classify them into strong, weak and other grammatical types.

The overall structure of the classification is hierarchical, proceeding from the most general terms to the most specific. Thus, the meaning of a word at any particular point in the hierarchy is defined not only by its own heading but by the headings above and below it in the structure. Within each heading, further subordination or co-ordination may be indicated. The following selection from Category 02.06 Animal will serve as an example:

02.06 Animal
02.06.03 Wild animal
02.06.03.01 Particular animals (alphabetical order)
02.06.03.01.13 Rodents
.Mouse: mūs
..A thieving mouse: mūsþēof og
.Rat:

If the word we are interested in is mūs, we place it by reading back through the numerical hierarchy, which tells us that it is a rodent, one of the particular animals listed under wild animal, and ultimately an animal. The unnumbered dots give us the further information that Mouse and Rat are subordinate to Rodents and co-ordinate with each other, while mūsþēof, with two dots, is subordinated by a further degree of specificity to Mouse. (The system of dots representing degrees of internal subordination was originally devised by Tom Chase.[4]) Whether a category is numbered or introduced by dots depends partly on perception of the taxonomy and partly on how many words it contains: a large category may be given a number for ease of reference at any level. Within the category, an order of Noun, Adjective, Adverb, Verb, Other is followed unless the predominance of a particular part of speech suggests that a different order would be preferable. Where there is heavy lexicalization under a particular part of speech, it may have a category to itself, as with 02.02.03 To die, perish or the categories under 08.01.02.03 Friendliness, affection. This method of displaying the materials is intended to indicate the interdependence of items within a thesaurus structure, as opposed to the arrangement of an alphabetical dictionary, where each definition stands alone. The set of headings relevant to any given item may not add up to the precision of a dictionary definition, but often comes close to doing so, thus, we hope, rewarding the user for effort expended in mastering the system.
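The procedure of reading back through the numbered and dotted headings can be illustrated with a minimal Python sketch. It is not part of the TOE's own tooling: the function name heading_path and the parsing conventions are assumptions made for this example. In particular, it assumes that word forms follow a colon and are separated by spaces, and it does not distinguish flags such as og from word forms.

```python
# The excerpt from Category 02.06 Animal, as printed above.
EXCERPT = """\
02.06 Animal
02.06.03 Wild animal
02.06.03.01 Particular animals (alphabetical order)
02.06.03.01.13 Rodents
.Mouse: mūs
..A thieving mouse: mūsþēof og
.Rat:
"""


def heading_path(text, word):
    """Return the headings, most general first, under which `word` is listed.

    Returns None if the word does not occur in the text.
    """
    numbered = []  # stack of (code, heading) for the numbered categories
    dotted = []    # headings of the dotted levels below the current number
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("."):
            # The number of leading dots gives the degree of subordination.
            body = line.lstrip(".")
            depth = len(line) - len(body)
            heading, _, words = body.partition(":")
            del dotted[depth - 1:]  # discard siblings and deeper levels
            dotted.append(heading.strip())
        else:
            code, _, body = line.partition(" ")
            heading, _, words = body.partition(":")
            # Keep only numbered categories whose code is a prefix of this one.
            while numbered and not code.startswith(numbered[-1][0] + "."):
                numbered.pop()
            numbered.append((code, heading.strip()))
            dotted.clear()
        if word in words.split():
            return [h for _, h in numbered] + dotted
    return None


print(heading_path(EXCERPT, "mūs"))
print(heading_path(EXCERPT, "mūsþēof"))
```

For mūs the function returns the four numbered headings from Animal down to Rodents, followed by Mouse; for mūsþēof the same chain is extended by A thieving mouse. The chain of headings, taken together, is what stands in for a dictionary definition.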

Classifying the meanings

The structure described above developed out of the day-to-day work of sorting the slips into groups which shared at least one component of meaning. No claim is made that the words assigned to each group are synonyms in any strict sense of the term, i.e., that they are mutually substitutable in all or most contexts. Rather, they are loosely synonymous terms which express the concept defined by the heading, which will itself often be a descriptive phrase rather than a single word. The precision of meaning (whether real or apparent) indicated by the Old English compounds leads to many small categories, often containing only one word. Such categories have generally been retained, as in 04.02.03.04.02 Pasture/pasturage, but occasionally a set of partially synonymous terms of relative transparency will be grouped together, as for example under 02.06.10.01.03 Half of another kind/ .Man, where centaur, healfmann, meremenn(en) and twēomann are listed together.

The starting point for classificatory work was the 26 major categories already established for version 1 of the Historical Thesaurus (expanded in version 4 to 37). The fate of these categories, reduced to 18 for the TOE, illustrates the general principle on which both classifications are based: that the structure should evolve from the data rather than be imposed upon them. HT, for example, has separate categories for The supernatural and Faith, but when it came to the TOE, such a dichotomy proved difficult to operate in lexical terms and counterintuitive in historical and cultural terms. It was therefore abandoned and the material classified under the single heading Religion. Many other distinctions which seem clear-cut to the modern mind, such as that between Astrology and Astronomy, proved equally impossible to sustain. Below this macro-level, we allowed the taxonomy to emerge from the data, a relatively easy task for the vocabulary of the material universe, but one requiring a greater degree of subjective intervention in the abstract lexis. As far as possible, we tried to be guided by what we knew of the Anglo-Saxon world-view rather than by modern taxonomies (although our knowledge is obviously limited and this is another area in which we hope that the TOE will stimulate further research). Thus, the major headings in 02.06 Animal might not impress a modern zoologist, but seem to us to indicate the priorities reflected in the vocabulary:

02.06 Animal
02.06.01 Animal parts/activities
02.06.02 Domestic animals, livestock
02.06.03 Wild animal
02.06.04 Exotic animals
02.06.05 Marine animal
02.06.06 Fish
02.06.07 Reptile (serpent, snake, dragon)
02.06.08 Bird
02.06.09 Insect/small creature
02.06.10 Monster, strange creature

Technical terms such as 'mammal' or 'arachnid' simply do not appear in these headings, thus allowing us to place words such as hwæl 'whale' in 02.06.05 and ātorcoppa 'spider' in 02.06.09, without, we hope, offending either scientific or historical sensibilities (and, in the latter case, with some fidelity to the modern folk taxonomy). Where possible, categories within such headings follow a semantic order; 02.06.01 Animal parts/activities, for example, follows the order of Head to Tail, and the equivalent sections for human beings and for diseases are also organized in descending order. Where a semantic order is not apparent, alphabetical order is followed. Individual words are generally classified by their most specific component, so that general terms for tools are in 17 Work, doings, actions, labour, while tools and implements used in agriculture or the care of animals occur in different parts of 04.02 Farm.

The section on the four flags has described the difficulties faced in ascribing meaning to many Old English word forms. For a conceptually organized lexicon, some meaning must be attached to a form if it is to be included at all, and the vaguer the meaning, the greater the problems confronting the classifier. In addition to this, some concepts are inherently ambiguous and could find more than one home whatever the system of classification. Thus we have the categories 16 Religion and 18.02.07 Music: where do we put Religious Music? In such a case, it is Hobson's choice, and in the TOE those who turn first to Music will be supplied with cross references to appropriate points in Religion. Where the number of words is small, as in the sub-category War trumpet, the group may be repeated, here in both Music under 18.02.07.02.02.02 A trumpet and in Warfare under 13.02.08.01 Battle-horn. Economy must be practised for a printed book, but in a database repetition is less of a luxury. Polysemous words can legitimately be dealt with by placing their senses in as many categories as are necessary, but problems arise if there is a single definition of wide extension or if the boundaries of the senses are not clear cut. One solution is to use a comprehensive defining heading, as in 14.01.06 A rule, order, precept, tenet, principle, which can receive words which cover some or all of this area of meaning. Another is a series of overlapping headings, as in the verbs attached to 11.11.02 A hindrance. Neither solution is ideal, but both the nature of meaning in general and the nature of Old English meanings in particular render a degree of fuzziness inevitable. Where all else fails, we have fallen back on categories like the depressingly large 02.07.11.02 Unidentified plants (alphabetical order).

Compared with a dictionary, any thesaurus is something of a blunt instrument, sacrificing semantic or grammatical specificity to breadth of conceptual coverage. Schemes of classification have no inherent truth, but represent the best attempts of the compilers to present their materials within a coherent and illuminating framework. We hope that we have gone some way to achieving this goal in the TOE, and that it will provide its users with at least some new insights into the vocabulary of Old English and the lives of its speakers, as well as fulfilling its primary function of supplying data for further research.

[1] Christian J. Kay, 'Historical Thesaurus of English: Progress and Plans', Corpora across the Centuries, ed. Merja Kytö, Matti Rissanen and Susan Wright, Amsterdam, 1994, 111-20. See also http://historicalthesaurus.arts.gla.ac.uk/about/ for more information on the HT and its classification.

[2] C. J. Kay and M. L. Samuels, 'Componential Analysis in Semantics: Its Validity and Applications', Transactions of the Philological Society (1975), 49-81.

[3] See Halliday: System and Function in Language, ed. Gunther Kress, Oxford, 1976, p. 162.

[4] Thomas J. P. Chase, The English Religious Lexis, Texts and Studies in Religion, 37, Queenston, Ontario, 1988.