Conjunctions (new style) in the Link Grammar Parser
1.1. Conjunction overview
1.2. Miscellaneous Idiomatic Conjoining Expressions
1.3. Known problems
Earlier versions of the Link Grammar parser (those prior to version 4.7.0) used a different, special technique for parsing sentences containing coordinating conjunctions and other correlative structures. This technique, known as "fat links" and described separately, proved to have a number of disadvantages, and is now in the process of being deprecated and removed. What follows is a description of the new mechanism.
Conjunction refers to the joining together of different parts of a sentence using coordinating conjunctions, correlative conjunctions, or subordinating conjunctions. The most common coordinating conjunctions are "and", "or", "nor" and "but"; they are used to join together two grammatically similar or identical words or phrases. Correlative conjunctions and subordinating conjunctions are set phrases enforcing more complex, long-range structure. Examples include "either ... or ...", "not only ... but also ...", "if ... then ...", and so on.
Conjunctions of any of these types are (mostly) handled by the same technique within link grammar, although there is some variation. The strongest link grammar patterns occur for coordinating conjunctions, which are supported by a half-dozen distinct link types, roughly aligned with the part of speech or grammatical type of the words or phrases involved. These links include:
- AJ, used for coordinating adjectives, such as "The BLACK AND WHITE cat sleeps."
- CC, used for connecting clauses, such as "John arrived and Mary left".
- MJ, used for coordinating prepositions and other post-nominal modifiers, such as "It is hidden IN OR NEAR the house."
- NI, used for numerical ranges, such as "It takes 2 to 3 times the effort."
- QJ, used for coordinating question words, such as "WHEN AND WHERE is the party?"
- RJ, used for coordinating adverbs and other miscellaneous constructions, such as "She did it QUICKLY AND QUIETLY."
- SJ, used for coordinating nouns, such as "JOHN AND MARY are coming to the party."
- VJ, used for coordinating verbs, such as "He RAN AND JUMPED."
- XJ, used for miscellaneous idiomatic coordinating expressions, such as "... NOT ONLY x, BUT y": "You should NOT ONLY ask for your money back, BUT demand it."
- V, used for miscellaneous adpositions, such as "... TOOK ... FOR GRANTED".
Many of these link types share several recurring themes in their structure and use. In general, these links only connect words with the same part of speech. Thus, for example, "The black and white cat sleeps" gives the following parse:
     +-------------Ds-------------+
     |             +-------A------+
     |     +--AJl--+--AJr--+      +---Ss--+
     |     |       |       |      |       |
    the black.a and.j-a white.a cat.n sleeps.v
Here, the AJ link connects each adjective to the conjunction "and". The resulting conjunction behaves as if it were an adjective itself: so "and" will link to "cat" with the A link, just as an ordinary adjective would. This arrangement is sometimes referred to as a "Prague style" dependency. It imposes a fundamentally hierarchical ordering: "and" acts as a head-word, with two dependent words, "black" and "white"; the combined phrase "black and white" acts as a single adjective. This style for handling coordination is used for other parts of speech as well; so, for example, SJ links combine nouns into a noun phrase:
              +------Spx------+       +----Js----+
       +-SJls-+--SJrs-+       +--MVa--+    +--Ds-+
       |      |       |       |       |    |     |
    Jack.b and.j-n Jill.f fell.v-d down.r the hill.n
Here, the conjoined "Jack and Jill" acts as a (plural) subject.
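The head/dependent reading of these links can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple representation of a link and the helper name conjunction_heads are invented here and are not part of the Link Grammar API; the link list is copied by hand from the "black and white cat" parse.

```python
# Coordination link types from the list above, assumed here to carry
# an "l" or "r" subtype as their third character.
COORD_TYPES = {"AJ", "MJ", "QJ", "RJ", "SJ", "VJ"}

# Hand-copied from the parse: (left word index, right word index, label).
WORDS = ["the", "black.a", "and.j-a", "white.a", "cat.n", "sleeps.v"]
LINKS = [(0, 4, "Ds"), (2, 4, "A"), (1, 2, "AJl"), (2, 3, "AJr"), (4, 5, "Ss")]

def conjunction_heads(links):
    """Map each conjunction's word index to the indices of its conjuncts."""
    heads = {}
    for left, right, label in links:
        base, side = label[:2], label[2:3]
        if base not in COORD_TYPES or side not in ("l", "r"):
            continue
        # An "l" link runs from a left conjunct to the conjunction on its
        # right; an "r" link runs from the conjunction to a right conjunct.
        head, dep = (right, left) if side == "l" else (left, right)
        heads.setdefault(head, []).append(dep)
    return {head: sorted(deps) for head, deps in heads.items()}

for head, deps in conjunction_heads(LINKS).items():
    print(WORDS[head], "<-", [WORDS[d] for d in deps])
```

With the links above, the only head found is "and.j-a", with "black.a" and "white.a" as its dependents; the A link from "and" to "cat" is left alone, since it is an ordinary adjective link.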
The subtypes AJl and AJr, standing for "left" and "right", are used to maintain proper sequential ordering; this is useful for properly managing comma-conjoined lists:
     +-------------------Ds------------------+
     |          +-----AJl-----+-------A------+
     |     +-AJl+-AJr-+       +--AJr--+      +---Ss--+
     |     |    |     |       |       |      |       |
    the black.a , orange.a and.j-a white.a cat.n sleeps.v
The l and r subtypes are also used for most of the other links (so, for example, VJl and VJr), with a few exceptions: the NI link uses NIf and NIt subtypes for numerical ranges, indicating the "from" and "to" ends of the range.
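The subtype convention can be made concrete with a small sketch. It assumes, as in the labels shown on this page, that a label is a run of upper-case letters (the link type) followed by lower-case letters or "*" (the subtypes). The function names split_label and side_of are hypothetical and are not taken from the parser.

```python
# Coordination link types assumed to use the l/r subtype.
LEFT_RIGHT_TYPES = {"AJ", "MJ", "QJ", "RJ", "SJ", "VJ"}

def split_label(label):
    """Split a link label into (link type, subtype string)."""
    i = 0
    while i < len(label) and label[i].isupper():
        i += 1
    return label[:i], label[i:]

def side_of(label):
    """Return 'left'/'right' or 'from'/'to' for a coordination label, else None."""
    base, sub = split_label(label)
    first = sub[:1]
    if base == "NI":
        # Numerical ranges mark the two ends of the range instead.
        return {"f": "from", "t": "to"}.get(first)
    if base in LEFT_RIGHT_TYPES:
        return {"l": "left", "r": "right"}.get(first)
    return None

for label in ["AJl", "AJr", "SJls", "NIf", "NIt", "Ds"]:
    print(label, split_label(label), side_of(label))
```

Any subtype letters after the first (such as the "s" in SJls) are returned untouched in the subtype string; they are not interpreted here.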
Other subtypes are used for enforcing agreement of various sorts, such as number and tense; so, for example: "cars and trucks are vehicles" but "*car and truck are vehicles", "*car and truck is vehicle". Agreement is discussed in greater detail in each individual link-type page.
The basic mechanism shown above can be extended to any sort of multi-word correlative conjunction, such as "neither ... nor ..." or "not only ... but also ...". Agreement between the various parts is enforced by means of the XJ link. So, for example:
        +------XJn------+------S*x-----+
        |        +-SJls-+--SJrs-+      +---I--+
        |        |      |       |      |      |
    neither.r Jack.b nor.j-n Jill.f will.v come.v
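As a minimal sketch of how the XJ link ties the two halves of a correlative together, the parse above can be written down as data and queried. The tuple layout and the helper name correlative_pairs are assumptions made for this illustration only; the links are copied by hand from the "neither ... nor" parse.

```python
WORDS = ["neither.r", "Jack.b", "nor.j-n", "Jill.f", "will.v", "come.v"]

# (left word index, right word index, label), copied from the parse.
LINKS = [(0, 2, "XJn"), (1, 2, "SJls"), (2, 3, "SJrs"), (2, 4, "S*x"), (4, 5, "I")]

def correlative_pairs(words, links):
    """Return (opener, conjunction) word pairs joined by an XJ link."""
    return [(words[left], words[right])
            for left, right, label in links
            if label.startswith("XJ")]

print(correlative_pairs(WORDS, LINKS))
```

Here the single XJ link pairs "neither.r" with "nor.j-n", which is the same word that heads the SJ links to "Jack" and "Jill".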
Miscellaneous Idiomatic Correlating Expressions
Although "and" and "or" are the conjunctions most commonly used to correlate words and phrases, there are also many more complex idiomatic expressions that are used for coordination. These include:
- Not only ... but also ...
- First ... next ...
- If ... then ...
- If ... only ... "If there were only more like you"
- ought ... if ... "That ought to be the case, if John is not lying"
- Someone ... who ... "Someone is outside who wants to see you"
- It ... that ... "It seemed likely that John would go"
- ...from X and from Y
- By X, and by Y, ...
Few of the above are implemented; those that are, are handled with the XJ, RJ and V links.
Known problems
The implementation of the CC link is irregular, as compared to how AJ, SJ, and VJ are implemented, and needs to be redesigned.
The V link deals with adpositions, but somewhat irregularly. Notice how the adpositions have a conjoining-like behaviour, but different ... The overall theory should be clarified and unified. See the README file that is shipped with the distribution for more information.
The problem of complex verbs
The "Prague style" dependency works well when the conjoined words or phrases have a simple relationship to the rest of the sentence, but is problematic for more complex verb-phrase conjunctions. Consider the sentence "I taught these mice to jump, and those mice to freeze." The first part of the sentence
         +-------TOo-----+
         +---Op------+   |
    +-Sp-+     +-Dmc-+   +-I-+
    |    |     |     |   |   |
    I taught these mice to jump
shows that "taught" acts as a head-word, and has two links to the right: the object link Op and the TOo link. A similar linkage is desired for the second half of the sentence; but how should this be achieved? There are two possibilities; neither is implemented by the current version of link-grammar, for reasons explained below.
The first possibility is to imprint the verb-linking rules onto the conjunction, asking "and" to behave as if it were a verb:
         +-----------V*taught----+
         +------TOo--            +-------TOo----+
         +---Op--                +---Op-----+   |
    +-Sp-+                       |    +-Dmc-+   +-I-+
    |    |                       |    |     |   |   |
    I taught these mice to jump and those mice to freeze
In the above, the verb "taught" links as usual, but also sports a custom VJ link to "and" (drawn above as V*taught) that makes the "and" behave just like the verb. The problem with this scheme is that there are many exceptional verb linkages, and this requires a custom VJ link for each exceptional case (so that the correct linkage for "and" can be forced).
The second possibility is to attach all of the links to the conjunction, as below:
         +-----V*super*and-------+
         |           +--Op-------+-------TOo----+
         |           |   +--TOo--+---Op-----+   |
    +-Sp-+           |   |       |    +-Dmc-+   +-I-+
    |    |           |   |       |    |     |   |   |
    I taught these mice to jump and those mice to freeze
The above somewhat resembles the simple "Prague style" conjunction linkage, but is also unsatisfying: by forcing both left- and right-directed links onto the "and", it makes it difficult or even impossible to determine the verb-phrase structure of the sentence. Do all left-facing links form one verb-phrase, and all right-facing links form another? Or perhaps some of the left- and right-facing links should be grouped into one verb-phrase, and the remainder into another? This could be clarified by adding "phantom nodes" to indicate the desired grouping; however, the current link grammar has absolutely no concept of a "phantom node".
Given these two choices, the first possibility appears to be the more natural, as it offers the greater fidelity to verb-phrase structure. It does introduce the concept of "link transference onto the conjunction" in order to make it workable. Although this can be implemented within the current theory of link-grammar, it does force the "and" to be festooned with a large variety of linkage rules, one for each exceptional verb linkage. There is currently no way of instructing the parser to treat the "and" "just like the verb before it".
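The cost of "link transference" can be illustrated with a toy sketch. Everything in it is hypothetical: the table VERB_PATTERNS, the helper rules_for_and and the entries for "saw" and "ran" are invented for this example. Only the pattern for "taught" (Op and TOo links to the right, plus a custom V*taught link) comes from the diagram for the first possibility. The trailing "+" and "-" follow the dictionary convention for right- and left-pointing connectors.

```python
# Hypothetical, simplified table: verb -> right-pointing connectors it needs.
# Only the "taught" entry is taken from the text; the rest are made up.
VERB_PATTERNS = {
    "taught": ("Op+", "TOo+"),
    "saw": ("Op+",),
    "ran": (),
}

def rules_for_and(verb_patterns):
    """Build one custom rule for 'and' per verb linkage pattern.

    Each rule has a left-pointing custom link back to the verb, followed
    by a copy of that verb's right-pointing connectors.
    """
    rules = {}
    for verb, pattern in verb_patterns.items():
        link_name = "V*" + verb
        rules[link_name] = (link_name + "-",) + pattern
    return rules

for name, rule in rules_for_and(VERB_PATTERNS).items():
    print(name, "->", " & ".join(rule))
```

The number of rules attached to "and" grows with the number of verb patterns in the table, which is the "festooning" described above.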
Linas Vepstas Last modified: 8 September 2010