Link Grammar Parser

The original homepage, hosted at Carnegie Mellon University, lists an extensive bibliography (mirror) referencing several dozen older (pre-2004) papers on the Link Grammar Parser. More recent publications and announcements are listed below.

Recent Applications and Publications

Some miscellaneous facts

  • Any categorical grammar can be easily converted to a link grammar; see section 6 of Daniel Sleator and Davy Temperley. 1993. "Parsing English with a Link Grammar." Third International Workshop on Parsing Technologies.
  • Link grammars can be learned by performing a statistical analysis on a large corpus: see John Lafferty, Daniel Sleator, and Davy Temperley. 1992. "Grammatical Trigrams: A Probabilistic Model of Link Grammar." Proceedings of the AAAI Conference on Probabilistic Approaches to Natural Language, October, 1992.

Psycholinguistic research on dependency

There are a number of interesting psychological and experimental analyses of the dependency properties of languages. Below is a selection that offers insight.

  • It turns out that writing an algorithm for a no-crossing minimum spanning tree is surprisingly painful; enforcing the no-crossing constraint requires treatment of a number of special cases. But perhaps this is not actually required! R. Ferrer i Cancho, in “Why do syntactic links not cross?”, EPL (Europhysics Letters) 76, 6 (2006), pp. 1228-1234, shows that, when arranging a random set of points on a line so as to minimize the Euclidean distances between connected points, one ends up with trees whose links almost never cross!
  • Crossings are rare: Havelka, J. (2007). Beyond projectivity: multilingual evaluation of constraints and measures on non-projective structures. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL-07): 608-615. Prague, Czech Republic: Association for Computational Linguistics.
  • Hubiness is a better model of sentence complexity than mean dependency distance: Ramon Ferrer-i-Cancho (2013). “Hubiness, length, crossings and their relationships in dependency trees”, ArXiv 1304.4086. The paper also states that the maximum number of crossings is bounded above by the mean dependency length, and that the mean dependency length is bounded below by the variance of the vertex degrees (i.e. the variance in the number of connectors a word can have).
  • Language tends to stay close to the theoretical minimum possible dependency distance, that is, the minimum achievable if words could be re-arranged arbitrarily. See Temperley, D. (2008). Dependency length minimization in natural and artificial languages. Journal of Quantitative Linguistics, 15(3):256-282.
  • Park, Y. A. and Levy, R. (2009). Minimal-length linearizations for mildly context-sensitive dependency trees. In Proceedings of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT) conference.
  • Sentences with long dependencies are hard to understand: the original claim is from Yngve (1960), having to do with phrase-structure depth. See Gibson, E. (2000). The dependency locality theory: A distance-based theory of linguistic complexity. In Marantz, A., Miyashita, Y., and O'Neil, W., editors, Image, Language, Brain. Papers from the first Mind Articulation Project Symposium. MIT Press, Cambridge, MA.
  • Mean dependency distance is a good measure of sentence complexity, demonstrated for 20 languages; Haitao Liu gives an overview starting from Yngve. [Liu2008] Haitao Liu, “Dependency distance as a metric of language comprehension difficulty”, 2008, Journal of Cognitive Science, 9(2), pp. 159-191. http://www.lingviko.net/JCS.pdf
  • Sentences with long dependencies are rarely spoken: Hawkins, J. A. (1994). A Performance Theory of Order and Constituency. Cambridge University Press, Cambridge, UK. Hawkins, J. A. (2004). Efficiency and Complexity in Grammars. Oxford University Press, Oxford, UK. Wasow, T. (2002). Postverbal Behavior. CSLI Publications, Stanford, CA. Distributed by University of Chicago Press.
  • Dependency-length minimization is universal: Richard Futrell, Kyle Mahowald, and Edward Gibson, “Large-scale evidence of dependency length minimization in 37 languages” (2015), doi: 10.1073/pnas.1502134112
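The quantities discussed above, mean dependency distance and the number of link crossings, are straightforward to compute once a parse is in hand. Below is a minimal sketch, assuming a sentence's parse is represented as a list of (head, dependent) word-position pairs; the example sentence and its links are purely illustrative, not the output of any particular parser.

```python
# Sketch: mean dependency distance and crossing count for a parse
# given as (head, dependent) pairs of 0-based word positions.

def mean_dependency_distance(links):
    """Mean absolute distance between linked word positions."""
    return sum(abs(h - d) for h, d in links) / len(links)

def count_crossings(links):
    """Count pairs of links that cross when drawn as arcs above the
    sentence.  Two links cross iff exactly one endpoint of one link
    lies strictly inside the span of the other."""
    spans = [tuple(sorted(link)) for link in links]
    crossings = 0
    for i in range(len(spans)):
        a, b = spans[i]
        for j in range(i + 1, len(spans)):
            c, d = spans[j]
            if a < c < b < d or c < a < d < b:
                crossings += 1
    return crossings

# Hypothetical parse of "the big dog chased a cat":
# determiners and adjectives link to their nouns, nouns to the verb.
links = [(2, 0), (2, 1), (3, 2), (3, 5), (5, 4)]
print(mean_dependency_distance(links))  # → 1.4
print(count_crossings(links))           # → 0 (projective tree)
```

The psycholinguistic claims above concern exactly these two numbers: natural sentences tend to keep the first one small, and the second one is almost always zero.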

Of related interest

Genia tagger
The Genia tagger is useful for named-entity extraction. The source is available under a BSD license.
After the Deadline
After the Deadline is a GPL-licensed language-checking tool. If you just want to have your text proofread, this is probably a good choice.