Verb-Noun Combinations in Machine Translation


by

Bill Niven

University of Aberdeen
E-mail: ger029@abdn.ac.uk (Bill Niven)
(received August 1996, revised February 1997)
Copyright Notice:

First published in Web Journal of Modern Language Linguistics in association with the publishers (to be announced). © 1997 Bill Niven.

The moral rights of the author(s) to be identified as author(s) of this work are asserted in accordance with §§.77 and 78 of the Copyright, Designs and Patents Act 1988. This work may be reproduced without the consent of the author, in part or in whole in any manner and in any medium subject only to the two following conditions:

(a) no charge shall be made for the copy containing the work or the excerpt,

(b) a copy of this notice shall precede the work or the excerpt.


Contents

1. Introduction
2. Noun-verb Combinations and MT: Analyzing the Source Language
3. Noun-verb Combinations and MT: Performing Transformations
4. Semantically Weak Verbs and the Boundaries of Existing MT systems
5. Conclusion

1. Introduction

The aim of this article is to demonstrate one of the main problem areas for automatic language analysis when translating from German to English: namely the handling of semantically weak verbs in verb-noun combinations introduced by prepositions (e.g. `in Rechnung stellen', which would be translated into English as `to charge' rather than `to put on account', which might be the literal rendering). In the first part of the article I shall look at the various tools available to automatic translation for the recognition and translation of simple verb-noun combinations. Then I shall try to apply these to the more complex verb-noun combinations involving semantically empty verbs, taking as my main example `zur Verfügung stellen'. It is my contention that these existing tools are often too clumsy to deal with the latter problem, given that, for instance, they assume the nominal phrase (e.g. `zur Verfügung') to be a fixed syntactico-semantic unit. The article aims to provide a few suggestions as to the kind of analytical principles an MT system needs to integrate into its range of rules if it is to be able to deal with a typical German Funktionsverbgefüge.


2. Noun-verb Combinations and MT: Analyzing the Source Language

Translating German verbs is an uncomplicated business when the verb to be translated has a single accepted equivalent in the target language. Thus `lächeln' can safely be translated as `smile' in all contexts. This one-to-one equivalence covers at least 70% of verbs when translating from German to English. As far as the remaining 30% of German verbs are concerned, however, they can have anything from two meanings upwards - just think of `stellen' or `setzen'. Many of these meanings will be rendered in English by different verbs. The question is: how can a machine translation system differentiate in the source language between the different meanings? Such differentiation - a prerequisite of accurate translation - is typically possible by means of context: syntactical, semantic and/or terminological.


2a. Syntactical Differentiation

Verbs typically have frames, that is, they occur in certain regular syntactical patterns. All verbs have a subject (SUBJ); many also take a direct object (DOBJ), an indirect object (IOBJ) or a prepositional object (POBJ). These categories, known in MT as `roles', can be broken down further into subroles. Thus a SUBJ can be nominal (`Der Mann ärgert mich') or it can be sentential (`Daß er so spät kommt, ärgert mich'). And a DOBJ can be nominal (`Ich erinnerte ihn an seine Tante') or reflexive (`Ich erinnere mich an sie'). Moreover, the same type of role can be filled in different ways. Thus a POBJ can be introduced by a range of different prepositions. The verb `bestehen' for instance can be followed by a POBJ introduced by `auf' (`Er bestand auf seinem Recht'), or by a POBJ introduced by `in' (`Ihre Aufgabe besteht in der Aufstellung der Liste'). Sophisticated MT systems have been provided with information on the roles of the different German verbs, and any automatic morphological and syntactic analysis of a sentence must take account of them if a correct interpretation of the source language sentence is to be achieved. Thus the information that `bestehen' takes a POBJ with `auf' will enable an MT system to tell that, in the sentence `Er besteht auf dem Geschenk', the `auf' introduces a POBJ attached to `bestehen' and not an independent (locative) adverbial phrase.

The use of roles in establishing a correct picture of syntactical relations in the source language (SL) clearly has implications for translation to the target language (TL). To stay with the last example: accurate identification of the syntactical function of `auf' will trigger the transfer (translation rule) stipulating that `bestehen' means `insist' when followed by a dependent POBJ with `auf'. Verb roles are the life-blood of machine translation because so much semantic differentiation depends on them. Again, `bestehen' provides a good example:

1. `bestehen' + DOBJ = `pass'
2. `bestehen' + POBJ with `in' = `consist'
3. `bestehen' + POBJ with `aus' = `consist'
4. `bestehen' + POBJ with `auf' = `insist'
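
The four rules above amount to a lookup keyed on the verb's role frame. The following is only a minimal sketch of that idea, not the data format of any actual MT system; the table name, the function name and the tuple keys are hypothetical.

```python
# Hypothetical transfer table for `bestehen`: (role, preposition) -> English verb.
BESTEHEN_FRAMES = {
    ("DOBJ", None): "pass",
    ("POBJ", "in"): "consist",
    ("POBJ", "aus"): "consist",
    ("POBJ", "auf"): "insist",
}


def translate_bestehen(role, preposition=None):
    """Return the English verb for the given frame, or None if no rule matches."""
    return BESTEHEN_FRAMES.get((role, preposition))


print(translate_bestehen("POBJ", "auf"))
print(translate_bestehen("DOBJ"))
```

The sketch presupposes that the parser has already decided that `auf' introduces a POBJ rather than a free adverbial, which is precisely the analysis step the role information is meant to support.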


2b. Semantic Differentiation

Syntax is however often not an adequate means of distinction. In the case of `setzen', it might seem possible to argue that, when followed by a POBJ introduced by `auf', `setzen' means `bet'. But in such a context it can also mean `count'. Here more information, namely the semantic character of the nominal constituent of the POBJ, needs to be drawn upon. Again, such information is generally available in machine translation systems. Thus we can stipulate that `setzen' followed by a POBJ consisting of `auf' and a human noun (`Ich setze auf Peter', `ich setze auf dich') can be rendered as `count'. When `auf' is followed by an animate noun, i.e. an animal such as a racehorse or greyhound, or a collective group noun, such as a football team, the translation would be `bet' (`Ich setze auf Red Rum'). Another example of the usefulness of semantic type as a means of differentiation can be provided by the verb `tragen', which can be rendered in English as either `carry' or `wear'. It seems a safe assumption to stipulate that, when the semantic type of the nominal DOBJ is material or fabric, then `tragen' should be translated as `wear' (`Er trägt immer einen Hut'). In all other cases, i.e. when the semantic type is not material or fabric, then `tragen' should be rendered as `carry' (`Er trug einen schweren Koffer').

The reader will already be thinking up examples of how the kind of rules for differentiation described above can misfire. What happens, for instance, when bets are being taken in the schoolyard on who is going to win the lunch-break punch-up? "`Ich setze 20 Mark auf Peter', sagte Maria, `weil er so schöne Muskeln hat'". Here the translation is clearly `bet', not `count'. In the case of `tragen', it is even easier to find ambiguous sentences of the kind that make MT researchers despair. Thus `Er trug die Hose ins Badezimmer' is clearly not `He wore his trousers into the bathroom': here additional information on the existence of an accusative POBJ would be required. But by and large the verb-role functionality makes it possible for MT to gain some sort of a picture of the verb from its immediate syntactical and semantic context, thus paving the way for a reasonable translation in most cases.
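
The semantic-type rules for `setzen' and `tragen', together with the refinement needed for `Er trug die Hose ins Badezimmer', might be sketched as follows. The type labels ("human", "animal", "collective", "fabric") and the function names are illustrative assumptions, not the categories of a particular system.

```python
def translate_setzen_auf(noun_semantic_type):
    """Render `setzen` + POBJ with `auf` according to the noun's semantic type."""
    if noun_semantic_type == "human":
        return "count"
    if noun_semantic_type in ("animal", "collective"):
        return "bet"
    return None  # no rule applies; other entries would have to be consulted


def translate_tragen(dobj_semantic_type, has_accusative_pobj=False):
    """Render `tragen` as `wear` or `carry`."""
    # An accusative (directional) POBJ such as `ins Badezimmer` overrides the fabric rule.
    if dobj_semantic_type == "fabric" and not has_accusative_pobj:
        return "wear"
    return "carry"


print(translate_tragen("fabric"))                            # `Er trägt immer einen Hut`
print(translate_tragen("fabric", has_accusative_pobj=True))  # `Er trug die Hose ins Badezimmer`
print(translate_setzen_auf("human"))                         # also fires for the schoolyard bet
```

Note that the `setzen' rule still returns `count' for the schoolyard sentence: nothing in the semantic type of `Peter' signals that a bet is being placed.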


2c. Differentiation by means of Subject-area

Another useful translation tool in MT is the structure of the lexical database. Usually there are several subject-areas within the dictionary. When processing a text, the MT user selects the subject-area felt to be most appropriate to the type of text to be translated. Thus if a German text relating to computer software is to be translated, it would make sense to instruct the MT system to look for meanings in the terminological subject-area `Computer Software' before scouring the rest of the lexical database. The `Computer Software' subject-area contains the information, for instance, that the noun `der Fehler' means `error' in software contexts, not `mistake' (which would be the more general meaning) or `defect' (the meaning in technical and electronic contexts) or `fault' (hardware contexts). The verb `löschen', to take another example, also has several meanings depending on the subject-area: `delete' is certainly the meaning in software contexts, `extinguish' would be the meaning in firefighting contexts, and the general meaning might be `erase' or `wipe out'. Differences in meaning are thus not always determined solely by the immediate linguistic context, i.e. at phrase and sentence level. Often they depend on more general criteria such as terminological field.
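
A subject-area lookup of this kind can be pictured as a dictionary search that tries the selected area first and then falls back to the general entry. The sketch below is an assumption about how such a lexicon might be laid out; the area labels and the names `LEXICON` and `lookup` are hypothetical.

```python
# Hypothetical lexicon: word -> {subject area -> English rendering}.
LEXICON = {
    "Fehler": {
        "software": "error",
        "technical": "defect",
        "hardware": "fault",
        "general": "mistake",
    },
    "löschen": {
        "software": "delete",
        "firefighting": "extinguish",
        "general": "erase",
    },
}


def lookup(word, subject_area):
    """Prefer the selected subject area, otherwise use the general meaning."""
    senses = LEXICON[word]
    return senses.get(subject_area, senses["general"])


print(lookup("Fehler", "software"))
print(lookup("löschen", "legal"))  # no legal entry, so the general meaning is used
```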


2d. Precise Differentiation by means of Canonical Form

The means of differentiating meaning defined above depend on the immediate grammatico-syntactical context and on the wider subject-area context. However, the meaning of a German verb often varies depending on the canonical form of the noun with which it is combined. In other words, it is often not enough to stipulate, for example, that a verb means one thing when it is followed by a DOBJ and another when followed by a POBJ. It can even be inadequate to define the character of the DOBJ (reflexive or non-reflexive - important with `erinnern') or define the preposition in the POBJ (as we did with `bestehen') or the semantic type of the DOBJ (as with `tragen'). Often meanings depend on the use of a particular noun. Consider again the earlier example of `löschen'. It is fair to say that `löschen' means `extinguish' in firefighting contexts. This is important information when translating not just the verbal form of `löschen', but also nominal compounds which are in part verb-based (e.g. `Löscharbeiten', `Löschvorgang' etc.). However, `löschen' ALWAYS means `extinguish' when combined with DOBJ `Feuer', `Brand' or `Flammen', regardless of the subject-area in question. Similarly, `löschen' ALWAYS means `close' when used with the DOBJ `Konto', or `switch off' when used with the DOBJ `Licht' or `Lampe'.
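
The priority of a canonical-form test over the subject-area can be sketched as a two-stage lookup for `löschen'. The table and function names are hypothetical, and the nouns are listed exactly as they are given in the text.

```python
# Renderings fixed by the canonical form of the DOBJ, whatever the subject area.
BY_DOBJ = {
    "Feuer": "extinguish",
    "Brand": "extinguish",
    "Flammen": "extinguish",
    "Konto": "close",
    "Licht": "switch off",
    "Lampe": "switch off",
}

# Renderings chosen by subject area when no canonical-form test fires.
BY_SUBJECT_AREA = {"software": "delete", "firefighting": "extinguish"}


def translate_loeschen(dobj, subject_area="general"):
    """Canonical-form tests take priority; the subject area is only a fallback."""
    if dobj in BY_DOBJ:
        return BY_DOBJ[dobj]
    return BY_SUBJECT_AREA.get(subject_area, "erase")


print(translate_loeschen("Konto", "software"))  # canonical form overrides the area
print(translate_loeschen("Datei", "software"))
```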


3. Noun-verb Combinations and MT: Performing Transformations

As demonstrated above, correct translation depends on a precise reading of the verbal context in the source language. But that is not the end of the story, at least not always. For while such a correct reading guides us to a good rendering in English of the German verb, we still have the correct translation of our defining criterion, namely the context, to concern ourselves with. Stipulating that `bestehen' means `consist' when used with POBJ `aus' is not enough. For we still need to stipulate in our transfer lexicon entry `bestehen' --> `consist' that the preposition `aus' should be translated as `of'. If we do not provide this information the MT system simply draws on the standard dictionary renderings of `aus' and provides a selection of possible renderings (e.g. `from', `out of', `of') or might even arbitrarily plump for one of these - with the risk of error. Such stipulations in transfer entries are known as `transformations'. This term is not to be confused with Chomsky's transformational grammar. In this particular area of MT it simply indicates that, once the correct rendering of a word has been established, it is then necessary to effect a number of `special translations' of the immediate context.

3a. MAPPING Transformation

The type of transformation I referred to in the above example `bestehen' --> `consist' is known as `mapping'. One preposition is `mapped' onto another one. There are various kinds of mapping. Probably the most common is mapping from one verb role to another. Thus it is possible to map from IOBJ to DOBJ. An example: `sich entsinnen' takes the genitive, e.g. `Ich entsinne mich eines schönen Tages im Herbst'. In English however the genitive will be rendered as an accusative, i.e. `I remember a beautiful day in autumn'. So a transformation of grammatical case is required. The most common grammatical case transformation is probably from POBJ to DOBJ. In the case of `Ich erinnere mich an meine erste Liebe', the MT system is provided with the information not only that `erinnern' is translated as `remember' when used with a reflexive DOBJ, but also that the following POBJ must be `transformed' in the English translation into a DOBJ. The transfer entry for `erinnern' will be as follows:

`erinnern' --> `remember'
TEST on presence of DOBJ reflexive
MAP POBJ with `an' to DOBJ
DELETE reflexive pronoun

We will then get `I remember my first love'. Mapping from one role to another is especially helpful with idiomatic expressions, as in the case of the sentence `Maria lebt immer in den Tag hinein'. This might be rendered in English as `Maria takes each day as it comes'. To achieve this, the transfer entry `hineinleben' --> `take' requires the stipulation that the POBJ be replaced by a DOBJ. The transformation of the definite article into the determiner `each' is however not possible, as a grammatical case can be transformed but not its pronominal or determiner constituents. But we would still get `Maria always takes the day as it comes', which is close, if not perfect.
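
The `erinnern' --> `remember' entry can be read as a small procedure: test, then map and delete. The following sketch assumes a parsed clause represented as a dictionary of roles, each carrying a ready-made English gloss under the key "en"; this representation and the function name are illustrative assumptions only.

```python
def transfer_erinnern(clause):
    """Apply the `erinnern' --> `remember' entry; return None when the tests fail."""
    dobj = clause.get("DOBJ", {})
    pobj = clause.get("POBJ", {})
    # TEST on presence of DOBJ reflexive (and of the POBJ with `an` that is to be mapped).
    if not dobj.get("reflexive") or pobj.get("prep") != "an":
        return None
    # DELETE reflexive pronoun: the reflexive DOBJ is simply not carried over.
    # MAP POBJ with `an` to DOBJ: its noun phrase fills the English object slot.
    return " ".join([clause["SUBJ"]["en"], "remember", pobj["en"]])


reflexive_clause = {
    "SUBJ": {"en": "I"},
    "DOBJ": {"en": "myself", "reflexive": True},
    "POBJ": {"prep": "an", "en": "my first love"},
}
print(transfer_erinnern(reflexive_clause))
```

For a non-reflexive clause such as `Ich erinnerte ihn an seine Tante' the test fails and the function returns None, leaving the clause to some other entry.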


3b. DELETE Transformation

In the above example `erinnern' --> `remember', there is an instruction `delete reflexive pronoun'. This is a specific and typical example of another kind of transformation, namely a DELETE transformation, whereby the MT system is instructed to `remove' a German word from the original sentence, i.e. not to translate it. Whole verb roles in the source language can thus simply be elided. Another good example would be the expression `Zieh Leine!', which translates not as `Pull the line!', but `Go away!'. Here we create a transfer entry `ziehen --> go away' and place a test on the existence of a DOBJ with the canonical form `Leine'. But of course we don't want `Go away line!', so the transfer entry would contain a delete transformation instructing the computer to remove `Leine' from the sentence, i.e. not to translate it:

`ziehen' --> `go away'
TEST on presence of DOBJ `Leine'
DELETE DOBJ `Leine'


3c. ADD Transformation

In addition to deleting a verb role, it is also possible to insert one into the English sentence that was not there in the German sentence. This, logically enough, is known as an ADD transformation. A good example of this is provided by the verb `fliegen' when used with a POBJ introduced by `aus' (in the sense of `Vor zwei Jahren flog er aus der Firma'). We can instruct the MT system that `fliegen' in this case is to be rendered in English as `be'. The transfer entry would then contain an ADD transformation stipulating the insertion of a predicative adjective `thrown'. A further MAP transformation would require `aus' to be rendered by the English prepositional compound `out of':

`fliegen' --> `be'
TEST on presence of POBJ introduced by `aus'
MAP POBJ with `aus' to POBJ with `out of'
ADD participial predicative adjective `thrown'

With any luck, `vor zwei Jahren flog er aus der Firma' will then correctly be translated as `two years ago he was thrown out of the firm'.

Testing on the existence of POBJ `aus' however can be problematic. `Der Pilot flog aus zeitlichen Gründen über Paris statt über Calais' will then be mistranslated as `the pilot was thrown out of reasons of time above Paris instead of Calais'. The problem in this example sentence is that `aus' is an adverbial marker: it does not introduce a POBJ. Even if we have another transfer entry in the lexicon for `fliegen' stipulating that it means `to fly' when used with `über', the MT system will typically still opt for the mistranslation as `aus' is positioned closer to `flog' than `über'. The definition of means of differentiation can be not exclusive enough. On the other hand it can also be too exclusive, as is exemplified by the fact that `fliegen' can mean `to be thrown out' in contexts where there is no contiguous `aus', albeit in colloquial contexts (e.g. `Du arbeitest, oder du fliegst!'): in other words, `aus' is not an indispensable semantic marker for the `be + thrown out' rendering.
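
The `fliegen' entry and its misfire can be sketched as follows. The sketch assumes a hypothetical parser that labels each prepositional phrase as POBJ or ADV; the class and function names are illustrative, and the English noun phrases are supplied as ready-made glosses.

```python
from dataclasses import dataclass


@dataclass
class PrepPhrase:
    prep: str      # German preposition
    role: str      # "POBJ" or "ADV", as assigned by the (hypothetical) parser
    en_noun: str   # English gloss of the noun phrase


def find_aus_trigger(phrases, check_role):
    """Return the phrase that satisfies the TEST, or None."""
    for phrase in phrases:
        if phrase.prep == "aus" and (phrase.role == "POBJ" or not check_role):
            return phrase
    return None


def transfer_fliegen(subject_en, phrases, check_role=True):
    trigger = find_aus_trigger(phrases, check_role)
    if trigger is None:
        return f"{subject_en} flew"  # default entry
    # `fliegen' --> `be', ADD `thrown', MAP `aus' to `out of'
    return f"{subject_en} was thrown out of {trigger.en_noun}"


firm = [PrepPhrase("aus", "POBJ", "the firm")]
pilot = [PrepPhrase("aus", "ADV", "reasons of time"), PrepPhrase("über", "ADV", "Paris")]
print(transfer_fliegen("he", firm))
print(transfer_fliegen("the pilot", pilot, check_role=False))  # preposition-only test misfires
print(transfer_fliegen("the pilot", pilot))
```

The stricter variant only helps if the parser has labelled `aus zeitlichen Gründen' as an adverbial in the first place, which, as noted above, is exactly what typically goes wrong.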


3d. Combining MAPPING, DELETE and ADD Transformations

It is possible to combine different transformations within one transfer lexicon entry, as can be seen in the case of `erinnern' --> `remember', which combines a delete and a mapping transformation, while `fliegen' --> `be' above combines an add and a mapping transformation. Add and delete transformations can also be combined within a single transfer entry. Take the example `Peter hat sich gestern einen Rausch angetrunken', which might best be translated as `Peter got drunk yesterday'. In the transfer lexicon we would need an entry that looks as follows:

`antrinken' --> `get'
TEST on presence of reflexive accusative
TEST on presence of DOBJ `Rausch'
DELETE DOBJ `Rausch'
DELETE reflexive accusative
ADD participial adjective `drunk'

This is a highly complex piece of coding, involving considerable lexical and syntactical rearrangement of the constituents in which both additions and deletions play their part.
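
An entry that combines tests with several transformations can be modelled as data interpreted by one small function. This is a sketch under stated assumptions: the clause is a flat dictionary of roles, and the role names REFL and PRED as well as the entry layout are hypothetical.

```python
def apply_entry(entry, clause):
    """Run an entry's tests, then apply its DELETE and ADD operations in order."""
    for role, expected in entry["tests"].items():
        if clause.get(role) != expected:
            return None  # a TEST failed: the entry does not apply
    result = dict(clause)
    result["VERB"] = entry["target"]
    for operation, role, value in entry["ops"]:
        if operation == "DELETE":
            result.pop(role, None)
        elif operation == "ADD":
            result[role] = value
    return result


ANTRINKEN = {
    "target": "get",
    "tests": {"REFL": "sich", "DOBJ": "Rausch"},
    "ops": [
        ("DELETE", "DOBJ", None),
        ("DELETE", "REFL", None),
        ("ADD", "PRED", "drunk"),
    ],
}

clause = {"SUBJ": "Peter", "VERB": "antrinken", "REFL": "sich",
          "DOBJ": "Rausch", "ADV": "yesterday"}
print(apply_entry(ANTRINKEN, clause))
```

What remains after the transformations (subject, `get', `drunk' and the adverb) is the material for `Peter got drunk yesterday'; word order and tense are left to later generation steps that the sketch does not cover.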


4. Semantically Weak Verbs and the Boundaries of Existing MT systems

It would seem then that MT provides us with a highly varied and effective range of techniques for dealing with complex verb-noun phrases, both at the analysis and translation level. However, there are limitations in scope. Some of these were hinted at above, but will be dealt with now in more detail by looking at problems in the handling of the German Funktionsverbgefüge.

German abounds with semi-idiomatic expressions in which verbs such as `bringen', `setzen', `ziehen', `stellen' and `kommen' are used together with prepositional phrases - frequently nominalizations - to form a fixed unit. Examples are:

bringen: `in Gefahr bringen', `in Umlauf bringen', `zur Anwendung bringen';
setzen: `in Beziehung setzen', `außer Kraft setzen', `in Umlauf setzen';
ziehen: `aus dem Verkehr ziehen', `in Betracht ziehen';
stellen: `in Rechnung stellen', `zur Verfügung stellen', `in Frage stellen';
kommen: `zum Vorschein kommen', `in Frage kommen'

In the case of these expressions, the verb has, semantically speaking, an ancillary function: the main semantic thrust lies with the POBJ. In English, such combinations of semantically weak verbs and POBJ are less frequent, and usually the German combinations such as those above will be translated into English by one verb. Thus:

in Gefahr bringen = endanger
in Umlauf bringen = circulate
in Betracht ziehen = consider
in Frage stellen = question
zum Vorschein kommen = appear

As far as machine translation is concerned, providing the correct rendering is not as easy as it might first appear to be.


4a. Testing on the Noun within the POBJ

One of the first problems is identification of the unit to be translated. In MT systems it is possible, as outlined above, to define the roles which accompany verbs, i.e. SUBJ, DOBJ, IOBJ, POBJ etc. It is also possible to test on these roles when translating, and to test on specific canonical forms within the roles. Thus `bestehen' is rendered as `pass' when used with DOBJ `Prüfung', and as `consist' when used with POBJ `aus'. However, while it is possible to test on the canonical form of the noun in the SUBJ, DOBJ and IOBJ, it is usually not possible to perform a test on the noun in a POBJ, only on the preposition. This is clearly a weakness. Take the example of `aus dem Verkehr ziehen'. Here we would need to be able to specify that `ziehen' is rendered as `withdraw' when used with the preposition `aus' followed by the noun `Verkehr'. This cannot be done because MT systems usually only allow for a test on the presence of a prepositional canonical form, in this case `aus', not on the canonical form of the following noun. It may allow a test on the semantic type of the following noun: thus we can stipulate that `fortfahren' means `drive away' when followed by a POBJ `mit' and a concrete noun (such as `Auto'), and `continue' when used with `mit' and an abstract noun (such as `Rede'). But this is of little use in most semantically weak verb plus POBJ combinations, where a test on the canonical form is essential. In the case of `aus dem Verkehr ziehen' --> `withdraw', our only option at present would be to make the test on `ziehen' applicable to all connected POBJs introduced by `aus', an impracticable step. We do not want `Der Mann zog seine Frau aus dem Wasser' to be translated as `the man withdrew his wife from the water', which sounds very awkward.


4b. Testing on the POBJ by Coding it as an ADV

While it is not possible to test on the canonical form of the noun in a POBJ, it is at least possible to test on an adverbial phrase (ADV), and many coders try to solve the problems described above by coding whole POBJs - preposition and subsequent noun - in the source lexicon as adverbs and then constructing the appropriate test. Take the example of `in Erfahrung bringen'. Here, in analogy with the problem described in 4a, it would not suffice to create an entry `bringen' --> `find out' when followed by the preposition `in', because then sentences such as `Er brachte die Schuhe ins Wohnzimmer' would also fall into this net and be accordingly mistranslated, in this case as `He found out the shoes into the lounge'. So it would seem apposite to code `in Erfahrung' in the monolingual German lexicon as an adverb and then stipulate in the transfer lexicon that, when `bringen' is followed by an ADV with canonical form `in Erfahrung', it should be rendered as `find out'. A delete transformation then stipulates that the adverb `in Erfahrung' should be deleted, i.e. not translated:

`bringen' --> `find out'
TEST on presence of ADV `in Erfahrung'
DELETE ADV `in Erfahrung'

So far so good. However, the POBJ combinations used with semantically weak verbs are not always unmodifiable syntactico-semantic units. One can conceive of adjectives being inserted between the preposition and the following noun. `Er hat es in schmerzhafte Erfahrung gebracht' is a rather fanciful example; but `das Feuer hat den geliebten Firmenchef in höchste Gefahr gebracht' is more realistic. Indeed whole participial structures can be inserted, as in `das Feuer hat den geliebten Firmenchef in höchste, die ganze Mannschaft in Angst und Schrecken versetzende Gefahr gebracht', which is somewhat Kleistian, but by no means absurd. In such cases, testing on the presence of the ADV `in Gefahr', `in Erfahrung' etc. will not work, as the system's understanding of these adverbs does not extend to modified forms.
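
The difference between testing on a fixed ADV and testing on preposition plus noun can be shown at the level of token lists. The second function is a sketch of the kind of test argued for here, not a feature of an existing system; it assumes that the head noun is the last token of the phrase, which holds for the examples above.

```python
def adv_test(phrase_tokens, canonical_adv):
    """Fixed-unit test: the whole phrase must equal the coded adverb."""
    return " ".join(phrase_tokens) == canonical_adv


def prep_noun_test(phrase_tokens, prep, head_noun):
    """Looser test: same preposition and head noun, any modifiers in between."""
    return (
        len(phrase_tokens) >= 2
        and phrase_tokens[0] == prep
        and phrase_tokens[-1] == head_noun  # assumption: the head noun comes last
    )


plain = ["in", "Gefahr"]
modified = ["in", "höchste", "Gefahr"]
print(adv_test(plain, "in Gefahr"), adv_test(modified, "in Gefahr"))
print(prep_noun_test(modified, "in", "Gefahr"))
```

The looser test still rejects `ins Wohnzimmer' for `in Erfahrung bringen', since neither the preposition token nor the head noun matches.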


4c. Single or Multi-word Translations?

While most semantically weak verb plus POBJ constructions can be rendered in English by a single verb of semantic weight, this is not always the case. Thus `in Betrieb nehmen' cannot comfortably be rendered by a one-word translation. The correct translation here would be `put into operation'. Other POBJ constructions can be rendered both by a one-word and a multi-word translation, such as `in Freiheit setzen', which can be both `free' and `set free'. These multi-word translations can be achieved in MT by placing tests on the adverbially-coded `in Betrieb' or `in Freiheit' in the transfer lexicon entries for `nehmen' and `setzen' respectively:

`nehmen' --> `put'
TEST on presence of ADV `in Betrieb'
DELETE ADV `in Betrieb'
ADD ADV `into operation'

`setzen' --> `set'
TEST on presence of ADV `in Freiheit'
DELETE ADV `in Freiheit'
ADD real adjective `free'


4d. Problems attendant on Transformation

There is however a problem with both single-word and multi-word renderings, as both depend on delete transformations. Delete transformations often have a habit of operating at too high a node level in the tree. This is particularly problematic in the case of delete transformations requiring the removal of the DOBJ, where the tendency is to delete the whole DOBJ node, not just the noun, but all attendant adjectives and extended attributes. There is a problem here with recursive rules, in which DOBJs are defined as consisting not just of determiner and noun, but also of a potentially unlimited number of adjectives, adjectival phrases and preceding adverbs. But the problem can also occur in the case of ADVs. Take the sentence `Wir wollen die Maschinen möglichst schnell in Betrieb nehmen'. The adverb here is a compound one, `möglichst' modifying `schnell' and `möglichst schnell' modifying `in Betrieb'. MT systems will see the `in Betrieb' as the specifier, i.e. the `main' adverb in the composite expression. `Möglichst' and `schnell' will be subsumed under the overarching node specification `ADV: in Betrieb', and then deleted along with `in Betrieb' in the course of translation. A mixture of top-down and bottom-up parsing not allowed for in unidirectional parsers would prevent such coarse excision, but this is the exception rather than the rule.

One might argue that a more sensible way of handling `in Betrieb nehmen' or `in Freiheit setzen' would be simply to stipulate the desired translations for the adverbs `in Betrieb' and `in Freiheit' in the adverb transfer lexicon, thus obviating the need for delete and add transformations in the verb transfer lexicon. But the standard adverbial translation of `in Betrieb' would surely be `in operation', not `into operation'. And `in Freiheit' cannot always be translated as `free' (`Wir leben in Freiheit' = `We live in freedom').
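
The over-deletion in `Wir wollen die Maschinen möglichst schnell in Betrieb nehmen' can be made concrete with a toy tree. The sketch below contrasts deleting the whole ADV node with deleting only its head and re-attaching the modifiers; the node structure and function names are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    head: str
    modifiers: list = field(default_factory=list)


def delete_node(children, head):
    """Coarse DELETE: the matching node disappears together with its modifiers."""
    return [child for child in children if child.head != head]


def delete_head_only(children, head):
    """Finer DELETE: drop the head but promote its modifiers to the parent."""
    kept = []
    for child in children:
        if child.head == head:
            kept.extend(child.modifiers)
        else:
            kept.append(child)
    return kept


schnell = Node("ADV", "schnell", [Node("ADV", "möglichst")])
in_betrieb = Node("ADV", "in Betrieb", [schnell])
children = [Node("DOBJ", "die Maschinen"), in_betrieb]

print([n.head for n in delete_node(children, "in Betrieb")])
print([n.head for n in delete_head_only(children, "in Betrieb")])
```

In the finer variant `schnell' survives with `möglichst' still attached, so `as quickly as possible' remains available for translation.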


4e. `Zur Verfügung stellen' and what needs to be made available...

Given the drawbacks of traditional coding options described above, it is clear that new, more precise functionality is required. I would like to demonstrate this by means of investigating two possible translations of `zur Verfügung stellen', probably one of the most common of semantically weak verb and prepositional phrase combinations.


4ei: `Zur Verfügung stellen' = `provide'

This is probably the most common and smooth rendering, as is clear from the example sentence `Wir stellen den Kunden das Programm zur Verfügung' = `We provide the customers with the program'. In order to achieve this translation with existing MT software, we would have to code `stellen' as meaning `provide', placing a test on the presence of an ADV with canonical form `zur Verfügung'. Then a number of transformations would be required:

1. DELETE `zur Verfügung'.
2. MAP IOBJ (case dative) to DOBJ.
3. MAP DOBJ to POBJ with `with'.

This piece of coding is certainly effective, even when other adverbs are inserted, such as in the sentence `Wir stellen Ihnen das Programm sofort zur Verfügung'. But problems arise with variant sentences such as `Wir stellen Ihnen das Programm zur sofortigen Verfügung', a piece of business-speak not uncommon in Germany. The possibility of interpolating adjectives into the adverbial phrase demonstrates that it is not a fixed syntactico-semantic unit. Treating it as a set canonical form is restrictive. The only potential solution here would be as follows. Firstly, we would need to be able to test both on the presence of the preposition `zu' and on the presence of a following noun with the canonical form `Verfügung'. Now comes the difficulty: the prepositional phrase `zu + Verfügung' cannot be deleted as a node without losing the `sofortig'. An algorithm would be required according to which only the preposition and its noun object would be deleted, the interpolated modifier being preserved for translation. But of course the modifier `sofortig' would in the English translation have an adverbial function, i.e. `provided immediately'. So we need:

1. A DELETE transformation to remove the preposition and its dependent noun.
2. A MAP transformation rendering any German adjectival modifier of the dependent noun as an English adverb.
3. A MAP transformation transforming any dative IOBJ into a DOBJ.
4. A MAP transformation transforming any DOBJ into a POBJ with `with'.

This would produce the translation: `We provide the customers with the program immediately'. This coding should also prove effective in the case of complicated extended attributes such as `Wir stellen Ihnen das Programm zur baldmöglichsten, wenn nicht sofortigen Verfügung', since both modifiers can happily be rendered as adverbs in English. At present there is to my knowledge no commercial system which allows step 2 above, which seems to me a failing.
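
The four proposed steps can be sketched as one function. Since, as just noted, no commercial system is known to support step 2, the code below is purely illustrative: the clause layout, the key "PP", the mini-lexicon `ADJ_TO_ADV` and the function name are all hypothetical, and the English noun phrases are given as ready-made glosses.

```python
# Hypothetical mini-lexicon for step 2: German adjective lemma -> English adverb.
ADJ_TO_ADV = {"sofortig": "immediately"}


def transfer_zur_verfuegung_stellen(clause):
    """`stellen' --> `provide' when a PP with `zu' + `Verfügung' is present."""
    pp = clause.get("PP")
    if not pp or pp["prep"] != "zu" or pp["noun"] != "Verfügung":
        return None
    # Step 1: DELETE the preposition and its noun, keeping the modifiers.
    # Step 2: MAP each adjectival modifier to an English adverb
    #         (raises KeyError for adjectives missing from the mini-lexicon).
    adverbs = [ADJ_TO_ADV[modifier] for modifier in pp.get("modifiers", [])]
    # Step 3: MAP the dative IOBJ to DOBJ. Step 4: MAP the DOBJ to a POBJ with `with'.
    parts = [clause["SUBJ"], "provide", clause["IOBJ"], "with", clause["DOBJ"]]
    return " ".join(parts + adverbs)


clause = {
    "SUBJ": "We",
    "IOBJ": "the customers",
    "DOBJ": "the program",
    "PP": {"prep": "zu", "noun": "Verfügung", "modifiers": ["sofortig"]},
}
print(transfer_zur_verfuegung_stellen(clause))
```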


4eii: `Zur Verfügung stellen' = `place at the disposal'

A second way of rendering `Wir stellen den Kunden das Programm zur Verfügung' might be: `We place the program at the disposal of our customers'. This is a rendering which mirrors syntactically (if not quite morphologically) the original German and therefore presupposes less rearrangement in the translation. With existing software we could test on adverbial phrase `zur Verfügung' and then carry out three transformations:

1. DELETE `zur Verfügung'.
2. ADD `at the disposal'.
3. MAP any indirect object into a POBJ with `of'.

However, this would not produce a correct translation. Indeed the resulting translation - `We place at the disposal the program of our customers' - is disastrously misleading. The problem is that add transformations at present allow for addition to the verb, but not for addition to its direct object, which is what would be required here. A potential solution would be to transform the IOBJ into a POBJ introduced by the compound preposition `at the disposal of', which would certainly give us `We place the program at the disposal of our customers'. But we would still have the problem as to what to do when `zur Verfügung' contains a modifying adjective. In this case it would need to be inserted into `at the disposal of' as `at the immediate disposal of', but there is no way we can cater in present MT systems for the introduction of additional elements into fixed lexical forms.


4f. Participial Use

The difficulties described above are not restricted to the strictly verbal use of such verb-noun combinations. Problems also occur when they are used as participial adjectives. At present sophisticated machine-translation systems are able to translate participial adjectives, whether past or present, by identifying them as forms of an original verb, and then seeking the corresponding verb form in English. Thus the morphological analysis of `Der lachende Mann' would recognise `lachend' as the present participle of the verb `lachen'. The transfer process would then seek the corresponding gerundive form of the English verb `laugh' to which `lachen' translates, namely `laughing'. This works wonderfully - except in the case of composite participials such as `zur Verfügung stellend'. The latter is not used often in this way, but analogous forms are, such as `zur Verfügung stehend'. In the transfer lexicon there will be an entry stipulating that `stehen' means `be' when used with the adverbial phrase `zur Verfügung'. An add transformation will supply the predicative adjective `available'. Thus `Ein Bus steht zur Verfügung' will translate as `A bus is available'. But unfortunately this transfer entry will not be taken in the case of the sentence `Die zur Verfügung stehenden Busse sind leider alle sehr alt'. Checking up on whether or not the participial adjective is derived from a lexicalised verb form, the default entry for `stehen'- i.e. the entry which has no tests appended - will be consulted. The phrase `zur Verfügung' will be translated as a separate adverb. The result will be: `the buses standing available are unfortunately all very old'. The weakness here is that the parser does not examine the context of the verb when it is used as a participial adjective: correspondingly it does not check those transfer entries with tests and therefore cannot apply these tests.
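
The participial weakness can be reduced to a question of which transfer entries are consulted. The sketch below models a lexicon in which tested entries precede the default entry; the flag `consult_tests` switches between the behaviour described above and the behaviour one would want. The lexicon layout and all names are assumptions for illustration.

```python
# Hypothetical transfer lexicon: tested entries first, the untested default last.
ENTRIES = {
    "stehen": [
        ({"ADV": "zur Verfügung"}, "available"),  # tested entry (`be' + ADD `available')
        ({}, "standing"),                         # default entry, no tests appended
    ],
}


def render_participle(verb, context, consult_tests):
    """Pick the rendering of a participial adjective formed from `verb`."""
    for tests, rendering in ENTRIES[verb]:
        if not tests:
            return rendering  # the default entry always applies
        if consult_tests and all(context.get(role) == value
                                 for role, value in tests.items()):
            return rendering
    return None


context = {"ADV": "zur Verfügung"}
print(render_participle("stehen", context, consult_tests=False))
print(render_participle("stehen", context, consult_tests=True))
```

With the tests ignored, the participle comes out as `standing' and `zur Verfügung' is left to be translated separately; with the tests consulted, the material for `the available buses' is selected instead.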


4g. Special case of `erfolgen'

Complication upon complication. What I have discussed above are predictable verb-noun combinations in which the nouns, themselves often nominalizations, carry the main semantic thrust, while in the context the verbs have a purely auxiliary function. There are however a number of verbs which are semantically weak in almost any context and can be combined with a potentially unlimited number of nouns. One of these is `erfolgen', a particular favourite of writers of technical documentation. Software manuals abound with phrases such as `die Speicherung erfolgt über....', or `die Datenübertragung erfolgt über....', or `das Einfügen erfolgt mittels...'. In a sentence like `Die Löschung erfolgt über die DEL-Taste', to take another example, the verb `erfolgen' is a pale fellow indeed; most of the semantic import is carried by the deverbal `Löschung', so that one could really rephrase this sentence as `Sie löschen mit der DEL-Taste'. The trouble is that German has a considerable potential for syntactical complexity which is often ruthlessly exploited by technical writers. Thus we rarely read `Sie löschen mit der F-Taste' in a manual. That sounds too simple, as if something has been forgotten. If we were to give the instruction `Sie löschen mit der F-Taste' to a virgin user of our new computer, he or she may well look questioningly, wondering why we did not at least endeavour to put an extended attribute in there somewhere, along the lines of `Sie löschen mit der sich auf der linken Seite der Tastatur befindenden F-Taste', which sounds much more professional. What we do encounter in manuals is `Löschungen werden vorgenommen mit...' or `Zur Löschung bedienen sie sich der...' or `Zur Löschung dient...'. Or, of course, `Die Löschung erfolgt...'.

In English the translation of such a structure could never be one-to-one. One would never write `The deletion is carried out with the F key', which sounds far too impersonal. Better would be `You can delete with the F key', or `To delete you use the F key', or `Use the F key to delete'. Whichever one of these we take, there is quite a bit of syntactic rearrangement going on, and this is a rearrangement of which MT is currently not capable.

For an MT system to turn `Die Löschung erfolgt über die F-Taste' into the simplest option `You delete with the F key', a number of transformations are required. First, we would have to instruct our system to turn the nominal subject of the verb `erfolgen' into a verb in the second person, by seeking the verb from which the noun derives and then rendering it by its English equivalent. There is no system currently on the market as far as I know which allows for such a monstrous transformation. The second step would be to instruct our system not to translate `erfolgen', which we can tell the system to do, but at the risk of it then also refusing to translate any dependent adverbs. But the second step is in any case pointless without the first one, which is, as I said, not feasible in current systems. And of course it is by no means advisable to make a general rule out of this, as there are always exceptions. In an instruction manual for budding doctors, `Die Behandlung erfolgt im Krankenhaus' could not reasonably be translated as `You treat in the hospital'; better would indeed be `The treatment is carried out in the hospital'. And imagine you are a patient sitting in the doctor's surgery in Germany. You do not speak German, and have your personal machine translator with you. The doctor tells you that you have a broken arm and adds `Die Behandlung erfolgt im Krankenhaus'. Your personal translator comes out with `You treat in the hospital'. No, no, you reply, I'm not the doctor, you are... A chaotic conversation ensues... So it's probably just as well that translating machines cannot transform nouns into second-person verb forms: in some cases this would be a severe distortion of sense.
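
The two steps, and the need for exceptions, can be sketched in a few lines of Python. Everything here is hypothetical: the small lookup tables, the `READER_IS_AGENT' set that decides when the second-person rendering is safe, and the function name `translate_erfolgen'. The adverbial is assumed to have been translated already. This is a sketch of the kind of rule required, not a description of any existing system.

```python
# Hypothetical lookup tables; a real system would need full lexicons.
DEVERBAL = {"Löschung": "löschen", "Behandlung": "behandeln"}  # noun -> source verb
VERBS = {"löschen": "delete", "behandeln": "treat"}            # German -> English verb
NOUNS = {"Löschung": "deletion", "Behandlung": "treatment"}    # German -> English noun

# Assumption: nouns for which the reader can safely be taken as the agent.
READER_IS_AGENT = {"Löschung"}


def translate_erfolgen(subject, adverbial):
    """Translate `Die <subject> erfolgt <adverbial>'.

    subject is the German deverbal noun; adverbial is already in English.
    """
    if subject in READER_IS_AGENT:
        # Step 1: recover the underlying verb and render it in the second person.
        # Step 2: leave `erfolgen' untranslated but keep its dependent adverbial.
        verb = VERBS[DEVERBAL[subject]]
        return f"You {verb} {adverbial}"
    # Fallback: keep the nominal subject and render `erfolgen' impersonally.
    return f"The {NOUNS[subject]} is carried out {adverbial}"


print(translate_erfolgen("Löschung", "with the F key"))
print(translate_erfolgen("Behandlung", "in the hospital"))
```

The first call yields `You delete with the F key', the second `The treatment is carried out in the hospital'. The sketch simply moves the difficulty into the question of how such a set of safe nouns could be established for a given text type, which is exactly where a general rule breaks down.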


5. Conclusion

In conclusion then: while MT already has a wide range of facilities at its disposal for translating complex verb-noun combinations, this range needs to be extended to make possible more subtle and precise transformations. Two of these we might term INSERTION and EXTRAPOLATION transformations, whereby words can be inserted within lexical units or extracted from within a phrase destined for deletion and then translated. It remains my hope that it will one day indeed be possible to equip a machine translation system with the algorithmic equipment to effect the complex test and transformation mechanisms often required. Of course we will always come up against contexts where the tests and transformations prove inadequate. Tests can be too limited or too unspecific. Similarly, transformations impose translations on words and phrases which will work nicely in one context but sound awkward and wooden in another. But machine translation is not about creating perfect translations, since the output is generally post-edited to an acceptable standard or used in unedited form as a source of rough information (`Informationsübersetzung'). It is, however, about approximation, about finding ways and means to get as close as possible in as many cases as possible, a goal which can only be achieved by a combination of refinement and a highly-developed feeling for wide-ranging applicability. It is the search for this difficult and yet tantalising balance that is the challenge faced by MT software engineers and computer linguists.


