CMSC 723/LING 723 Computational Linguistics I
Parts-of-Speech Tagging
Lecture 4, September 24th, 2008
Prof. Saif Mohammad

Parts-of-Speech
• Schachter (1985) provides more details
• These classes occur in almost every language
• Defined primarily in terms of syntactic and morphological criteria (not semantic):
• Syntactic distribution: what occurs nearby?
• Morphological properties: what affixes does it take?
• Syntactic function: what does it act as?
• Semantic cohesion is incidental, not guaranteed [nouns: people/places, adjectives: properties]
Note: Think back to the comic (verb is actually a noun)
Schachter, P. (1985). Parts-of-speech systems. In Language Typology and Syntactic Description.

Parts-of-Speech
• Two broad categories
• Closed Class:
• Relatively fixed membership
• Generally function words (of, to, as, since, ...)
• Short and used primarily for structuring
• Open Class:
• Frequent neologisms (borrowed/coined)

Closed Class POS
• Idiosyncratic
• Not all languages have the same classes
• English:
• Prepositions: on, under, over, near, ...
• Conjunctions: and, but, or, if, ...
• Particles: up, down, off, in, ...
• Auxiliaries: can, may, should, are, ...
• Determiners: a, an, the, ...
• Pronouns: she, who, I, ...
Closed Class POS: Particles vs Prepositions
• He came by the office in a hurry (by = preposition)
• He came by his fortune honestly (by = particle)
• We ran up the phone bill (up = particle)
• We ran up the small hill (up = preposition)
• He lived down the block (down = preposition)
• He never lived down the nicknames (down = particle)
Very difficult to differentiate!

Closed Class POS
Prepositions & Particles from CELEX (frequency counts):

of       540,085      through   14,964
in       331,235      after     13,670
for      142,421      between   13,275
to       125,691      under      9,525
with     124,965      per        6,515
on       109,129      among      5,090
at       100,169      within     5,030
by        77,794      towards    4,700
from      74,843      above      3,056
about     38,428      near       2,026
than      20,210      off        1,695
over      18,071      past       1,575

Lower-frequency items (counts not recoverable): worth, toward, plus, till, amongst, via, amid, underneath, versus, amidst, sans, circa, pace, nigh, re, mid, o'er, but, ere, less, midst, thru, vice.
Closed Class POS
Pronouns (Personal, Possessive & Wh-) from CELEX (frequency counts):

it       199,920      how         13,137
I        198,139      another     12,551
he       158,366      where       11,857
you      128,688      same        11,841
his       99,820      something   11,754
they      88,416      each        11,320
this      84,927      both        10,930
that      82,603      last        10,816
she       73,966      every        9,788
her       69,004      himself      9,113
we        64,846      nothing      9,026
all       61,767      when         8,336
which     61,399      one          7,423
their     51,922      much         7,237
what      50,116      anything     6,937
my        46,791      next         6,047
him       45,024      themselves   5,990
me        43,071      most         5,115
who       42,881      itself       5,032
them      42,099      myself       4,819
no        33,458      everything   4,662
some      32,863      several      4,306
other     29,391      less         4,278
your      28,923      herself      4,016
its       27,783      whose        4,005
our       23,029      someone      3,755
these     22,697      certain      3,345
any       22,666      anyone       3,318
more      21,873      whom         3,229
many      17,343      enough       3,197
such      16,880      half         3,065
those     15,819      few          2,933
own       15,741      everyone     2,812
us        15,724      whatever     2,571

everybody    1,474    no one         106
ourselves    1,428    wherein         58
mine         1,426    double          39
somebody     1,322    thine           30
former       1,177    summat          22
past           984    suchlike        18
plenty         940    fewest          15
either         848    thyself         14
yours          826    whomever        11
neither        618    whosoever       10
fewer          536
hers           482
ours           458
whoever        391
least          386
twice          382
theirs         303
wherever       289
oneself        239
thou           229
'un            227
ye             192
thy            191
whereby        176
thee           166
yourselves     148
latter         142
whichever      121

Counts not recoverable for: yourself, why, little, none, nobody, further, whomsoever, wherefore, whereat, whatsoever, whereon, whoso, aught, howsoever, thrice, wheresoever, you-all, additional, anybody, each other, once, one another, overmuch, such and such, whate'er, whenever, whereof, whereto, whereunto, whichsoever.
Closed Class POS
• Modal verbs (part of the Auxiliaries class)

Open Class POS
• Nouns
• Verbs
• Adjectives
• Adverbs
• All languages have nouns and verbs but may not have the other two

Open Class POS
• Adjectives:
• Words referring to properties/qualities
• Not present in all languages (e.g., Korean)
• Adverbs:
• A semantic and formal potpourri
• Usually modify verbs
• Actually, John walked home extremely slowly yesterday

Tagsets
• Several English tagsets have been developed
• Vary in number of tags:
• Penn Treebank (45)
• Brown Tagset (87)
• Language specific: simple morphology = more ambiguity = smaller tagset
• Size depends on language and purpose

Penn Treebank Tagset
• Developed at UPenn
• Culled from the Brown Tagset
• Leaves out some information, e.g., information one can get from the word itself or from the parse tree
• Applied to the Brown Corpus, WSJ, Switchboard

POS Tagging
"The process of assigning one POS or other lexical class marker to each word in a corpus" (Jurafsky & Martin)
Example: the/DT girl/NN kissed/VBD the/DT boy/NN on/IN the/DT cheek/NN

Why do POS tagging?
• Corpus-based linguistic analysis & lexicography
• Information retrieval & question answering
• Automatic speech synthesis
• Word sense disambiguation
• Shallow syntactic parsing
• Machine translation

Why is it hard?
• Not really a lexical problem: it is a sequence labeling problem
• Treating it as a lexical problem runs us smack into the wall of ambiguity:
• I thought that you ... (that: CS)
• That day was nice (that: DT)
• You can go that far (that: RB)

Rule-based POS Tagging
• Klein & Simmons (1963)
• One of the first rule-based taggers (a "grammar coder")
• Two-stage architecture:
• Use a dictionary to tag function words directly OR find which tests to run in the second stage
• Run the chosen handwritten tests to find the "right" candidate
Klein, S. and Simmons, R. F. (1963). A Computational Approach to Grammatical Coding of English Words. Journal of the ACM 10(3), 334-347.
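The two-stage idea can be sketched in a few lines of Python. Everything here (the dictionary, the suffix and capitalization tests, the tag names) is an invented stand-in for illustration, not the original 1963 rules:

```python
# Toy two-stage "grammar coder" in the spirit of Klein & Simmons (1963).
# Stage 1: a small dictionary tags function words directly.
# Stage 2: handwritten tests each propose a set of candidate tags;
# the final answer is the intersection of all test answers.
# All word lists, tests, and tags are invented stand-ins.

FUNCTION_WORDS = {"the": "DET", "a": "DET", "of": "PREP", "and": "CONJ"}
OPEN_CLASS = {"NOUN", "VERB", "ADJ", "ADV"}

def suffix_test(word):
    """Candidate tags suggested by the word's ending."""
    if word.endswith("ly"):
        return {"ADV"}
    if word.endswith(("ed", "ing")):
        return {"VERB", "ADJ"}
    return set(OPEN_CLASS)

def capitalization_test(word, sentence_initial):
    """A capitalized, non-sentence-initial word is likely a proper noun."""
    if word[0].isupper() and not sentence_initial:
        return {"NOUN"}
    return set(OPEN_CLASS)

def tag(word, sentence_initial=False):
    # Stage 1: dictionary lookup handles function words and numerals.
    if word.lower() in FUNCTION_WORDS:
        return {FUNCTION_WORDS[word.lower()]}
    if word.isdigit():
        return {"NUM"}
    # Stage 2: intersect the answers of all tests.
    return suffix_test(word) & capitalization_test(word, sentence_initial)

print(tag("the"))       # dictionary hit: {'DET'}
print(tag("slowly"))    # suffix test: {'ADV'}
print(tag("Newman"))    # capitalization test: {'NOUN'}
```

When the tests disagree, the intersection narrows the candidates; when every test is uninformative, the word stays ambiguous among the open classes, which is exactly the situation the second-stage handwritten tests were designed to resolve.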
Rule-based POS Tagging
• Klein & Simmons (1963)
• Tagset size: 30
• Fits in ~15,000 IBM 7090 machine words (gasp!)
• Dictionary: about 2,000 English words
• Tests: capitalization, suffixes, numerals
• Final answer: intersection of all test answers
• Evaluated manually on "several pages" of text; 90% accuracy (half via dictionary)

Rule-based POS Tagging
• Constraint Grammar approach✝
• More recent rule-based method
• Similar two-stage architecture
• Vastly larger dictionaries and rulesets
• Most popular implementation: EngCG
✝ Fred Karlsson et al. Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text.

Rule-based POS Tagging
Sample EngCG Lexicon:

Word        POS    Additional POS features
smaller     ADJ    COMPARATIVE
entire      ADJ    ABSOLUTE ATTRIBUTIVE
fast        ADV    SUPERLATIVE
that        DET    CENTRAL DEMONSTRATIVE SG
all         DET    PREDETERMINER SG/PL QUANTIFIER
dog's       N      GENITIVE SG
furniture   N      NOMINATIVE SG NOINDEFDETERMINER
one-third   NUM    SG
she         PRON   PERSONAL FEMININE NOMINATIVE SG3
show        V      PRESENT -SG3 VFIN
show        N      NOMINATIVE SG
shown       PCP2   SVOO SVO SV
occurred    PCP2   SV
occurred    V      PAST VFIN SV
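A lexicon of this shape maps each word form to all of its possible readings, which is what makes the overgeneration in the next example explicit. A minimal sketch in Python, with entries abridged from the sample lexicon above:

```python
# EngCG-style lexicon: each word form maps to ALL of its possible
# readings; disambiguation constraints later eliminate readings in
# context. Entries abridged from the sample lexicon above.
lexicon = {
    "that":     [("DET", "CENTRAL DEMONSTRATIVE SG")],
    "she":      [("PRON", "PERSONAL FEMININE NOMINATIVE SG3")],
    "show":     [("V", "PRESENT -SG3 VFIN"),
                 ("N", "NOMINATIVE SG")],
    "occurred": [("PCP2", "SV"),
                 ("V", "PAST VFIN SV")],
}

def readings(word):
    """All candidate (POS, features) readings for a word form."""
    return lexicon.get(word, [])

def is_ambiguous(word):
    return len(readings(word)) > 1

print(is_ambiguous("show"))      # True: both V and N readings
print(is_ambiguous("she"))       # False: a single reading
```

The tagger's job is then exactly the one shown in the example that follows: keep every reading at first, and let constraints strike out the ones the context rules out.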
Rule-based POS Tagging
Example sentence: Newman had originally practiced that ...
Overgenerated taggings:
Newman      NEWMAN N NOM SG PROPER
had         HAVE <SVO> V PAST VFIN
            HAVE <SVO> PCP2
originally  ORIGINAL ADV
practiced   PRACTICE <SVO> <SV> V PAST VFIN
            PRACTICE <SVO> <SV> PCP2
that        ADV
            PRON DEM SG
            DET CENTRAL DEM SG
            CS

One possible disambiguation constraint (the ADVERBIAL-THAT rule):
Given input: that
if (+1 A/ADV/QUANT); (+2 SENT-LIM); (NOT -1 SVOC/A);
then eliminate non-ADV tags
else eliminate ADV tag

Rule-based POS Tagging
• Accuracy about 96% (very good at the time)
• A lot of effort to write the rules and create the lexicon
• Probably not worth it today given how easy it is to bootstrap stochastic methods
• Could try to learn rules automatically
• Moving on!

TBL Illustration (Training)
• Given: an empty canvas and a goal painting; error starts at 100%
• Most common color: BLUE
• Initial step: apply the broadest transformation (paint everything blue); error drops to 44%
• Apply the next transformation (change B to G if touching A); error drops to 11%
• Apply the next transformation (change B to R if shape is [pictured]); error drops to 0%
• Finished!

TBL Illustration
TBL Illustration (Testing)
• Initial step: make everything B
• Apply the ordered transformations learned in training:
1. change B to G if touching A
2. change B to R if shape is [pictured]

TBL Painting Algorithm
function TBL-Paint (given: empty canvas with goal painting)
begin
apply initial transformation to canvas
repeat
try all color transformation rules
find the transformation rule that would yield the most improved painting
apply that color transformation rule to the canvas
until improvement below some threshold
end
Now, substitute: 'tag' for 'color', 'corpus' for 'canvas', 'untagged' for 'empty', 'tagging' for 'painting'

TBL Tagging Algorithm
function TBL-Tag (given: untagged corpus with goal tagging)
begin
apply initial transformation to corpus
repeat
try all tag transformation rules  ← Impossible! (far too many conceivable rules)
find the transformation rule that would yield the most improved tagging
apply that tag transformation rule to the corpus
until improvement below some threshold
end

TBL Templates
Non-lexicalized: change tag t1 to tag t2 when:
• w-1 (w+1) is tagged t3
• w-2 (w+2) is tagged t3
• w-1 is tagged t3 and w+1 is tagged t4
• w-1 is tagged t3 and w+2 is tagged t4
Lexicalized: change tag t1 to tag t2 when:
• w-1 (w+1) is foo
• w-2 (w+2) is bar
• w is foo and w-1 is bar
• w is foo, w-2 is bar and w+1 is baz
Only try instances of these templates (and their combinations)

TBL Example Rules
He/PRP is/VBZ as/RB tall/JJ as/IN her/PRP$
Change from IN to RB if w+2 is "as"
He/PRP is/VBZ expected/VBN to/TO race/NN today/NN
Change from NN to VB if w-1 is tagged TO

HMM Teaser
• Supervised learning requires tagged data
• Wouldn't it be great if we could:
• Learn from untagged data (unsupervised)
• Get the best tag sequence
• Also benefit from tagged data, if available
• Achieve accuracies >95%
• We can! Read Ch. 6 & show up next week!
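Looking back at the TBL tagging algorithm above, the greedy rule-selection loop can be sketched in a few lines. This toy version uses only the single non-lexicalized template "change t1 to t2 when w-1 is tagged t3", and the four-word "corpus" and its initial mis-tagging are invented for illustration:

```python
# Toy transformation-based learning loop for tagging, using one
# non-lexicalized template: change t1 -> t2 when w-1 is tagged t3.
# The mini "corpus", gold tags, and initial tags are invented.
from itertools import product

# Gold tagging (the goal) for a tiny corpus in which "race" is a verb
# after "to" but a noun after "the".
words = ["to", "race", "the", "race"]
gold  = ["TO", "VB",   "DT",  "NN"]

def apply_rule(tags, rule):
    """Apply (t1, t2, prev): change t1 to t2 when previous tag is prev."""
    t1, t2, prev = rule
    out = list(tags)
    for i in range(1, len(out)):
        if out[i] == t1 and out[i - 1] == prev:
            out[i] = t2
    return out

def errors(tags):
    return sum(a != b for a, b in zip(tags, gold))

def tbl_train(tags, max_rules=5):
    """Greedily pick the rule that most reduces error, until no gain."""
    tagset = sorted(set(gold) | set(tags))
    learned = []
    for _ in range(max_rules):
        best = min(product(tagset, tagset, tagset),
                   key=lambda r: errors(apply_rule(tags, r)))
        if errors(apply_rule(tags, best)) >= errors(tags):
            break  # improvement below threshold: stop
        tags = apply_rule(tags, best)
        learned.append(best)
    return learned, tags

# Most-frequent-tag initialization wrongly tags "race" as NN everywhere.
initial = ["TO", "NN", "DT", "NN"]
rules, final = tbl_train(initial)
print(rules)   # [('NN', 'VB', 'TO')]  -- the classic TO-race rule
print(final)   # ['TO', 'VB', 'DT', 'NN']
```

The learned rule is exactly the example rule above ("change from NN to VB if w-1 is tagged TO"), and the loop stops as soon as no candidate rule improves on the current tagging, which is the "improvement below some threshold" test in the pseudocode.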