Parts of Speech: Understanding Grammar Rules and Tagging in English

HMMs
• Hidden Markov Models provide a mechanism for assigning labels to items in a sequence
• Training
  • Parameter estimation (supervised)
  • Forward-backward algorithm (unsupervised)
• Decoding
  • Viterbi algorithm
  • Beam search
• All we need now are some labels . . .

Tagging
• Part-of-speech tagging assigns a grammatical category to each token in a corpus
• Since a word may occur as more than one part of speech, tagging is a limited kind of disambiguation:

  The  representative  put   the  chairs  on    the  table  .
  DET  NOUN            VERB  DET  NOUN    PREP  DET  NOUN   PERIOD

• Tagging can be done by hand, automatically, or by a combination of the two.

Parts of speech
• Grammar rules (syntax) govern how words are put together into phrases and sentences
• Traditional grammar is based on parts of speech
• Parts of speech were a central feature in classical grammars:
  • Aristotle (384–322 BC), Dionysius Thrax (c. 170–c. 90 BC), Aelius Donatus (fl. 353 AD)
• Grammarians extended classical theories to English (and other vernaculars): Aelfric of Eynsham (c. 955–1020 AD)

Parts of speech
• Traditional parts of speech are defined by a mix of distributional and semantic properties
• Schoolhouse Rock:

  Well every person you can know,
  And every place that you can go,
  And any thing that you can show,
  You know they’re nouns.
  A noun’s a special kind of word,
  It’s any name you ever heard.
  I find it quite interesting,
  A noun’s a person, place or thing.

• Other classes don’t have a distinct meaning, but are traditionally defined by their function:

  An adverb is a word . . .
  That modifies a verb . . .
  It modifies an adjective,
  Or else another adverb.
  And so you see that it’s positively, very, very, necessary.

  Or:

  Conjunction Junction, what’s your function?
  Hooking up phrases and clauses that balance, like:
  Out of the frying pan and into the fire.

Parts of speech
• “A part of speech outside of the limitations of syntactic form is but a will o’ the wisp. For this reason, no logical scheme of the parts of speech – their number, nature, and necessary confines – is of the slightest interest to the linguist.” (Sapir 1921)
• “The term ‘parts of speech’ is traditionally applied to the most inclusive and fundamental word-classes of a language, and then . . . the syntactic form classes are described in terms of the parts of speech that appear in them. However, it is impossible to set up a fully consistent set of parts of speech, because the word-classes overlap and cross each other.” (Bloomfield 1933)

Parts of speech
• “The question of substantive representation in the case of the grammatical formatives and the category symbols is, in effect, the traditional question of universal grammar. I shall assume that these elements too are selected from a fixed, universal vocabulary, although this assumption will actually have no significant effect on any of the descriptive material to be presented.” (Chomsky 1965)
• McCawley (1982) “avoids the notion of syntactic category as such, operating instead directly in terms of a number of distinct factors which syntactic phenomena can be sensitive to; in this view, syntactic category names will merely be informal abbreviations for combinations of these factors.”
• Once we look a little deeper, the semantic definitions don’t work very well:
  • running denotes an action, but is a noun
  • sick denotes a state of being, but is an adjective
• The only reliable way to define parts of speech is by reference to the other parts of speech that they combine with
• Major distinction is between open class and closed class words (also called content words and function words)
• Major open class categories: noun, verb, adjective, adverb
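
The distributional view can be made concrete with the hand-tagged sentence from the tagging slide. The following is a minimal Python sketch, not part of the original slides: the function name `preceding_tag_counts` and the sentence-start marker `<s>` are illustrative assumptions. It records, for each tag, which tags occur immediately before it.

```python
from collections import Counter, defaultdict

# The hand-tagged example sentence from the tagging slide.
tokens = ["The", "representative", "put", "the", "chairs", "on", "the", "table", "."]
tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "PREP", "DET", "NOUN", "PERIOD"]


def preceding_tag_counts(tag_sequence, start="<s>"):
    """For each tag, count the tags that occur immediately before it.

    `start` is a hypothetical marker standing in for the sentence boundary.
    """
    contexts = defaultdict(Counter)
    previous = start
    for tag in tag_sequence:
        contexts[tag][previous] += 1
        previous = tag
    return contexts


contexts = preceding_tag_counts(tags)

# Print each tag with the counts of the tags seen directly before it.
for tag in sorted(contexts):
    print(tag, dict(contexts[tag]))
```

In this one sentence, NOUN is preceded by DET all three times, while DET appears after the sentence start, after VERB, and after PREP. A single sentence is far too little data to draw conclusions from, but counts of this kind, collected over a large hand-tagged corpus and normalized, are the sort of statistic that supervised parameter estimation for an HMM tagger would typically rely on.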