Lillian Lee - ACL TeachNLP Workshop, 2002

Slide 1
A freshman-level, rigorous, non-programming, computer-science intro to NLP, IR, & AI

Lillian Lee
Department of Computer Science
Cornell University
http://www.cs.cornell.edu/home/llee

[This presentation was given at the 2002 ACL Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, and hence was directed towards an NLP, rather than a computer science, audience.]

Slide 2
Computation, Information, and Intelligence

A new computer-science course on CL/NLP, IR, and AI (henceforth “NLP”). Three main design decisions:
(1) For entering college freshmen. Usually junior+ courses (at least at Cornell).
(2) A rigorous, technical focus, including recent research results. Not “Philosophy of AI” or “The Information Society”.
(3) No programming. Neither required nor taught.

Slide 5
“If you want to truly understand something, try to change it.” – Kurt Lewin

The course format was fairly traditional. Course material was introduced almost entirely in lecture (no obvious textbook; research papers not suitable; lecture notes available this fall). Homework involved challenging pencil-and-paper problem sets. Problems typically investigated implications of lecture material rather than simply testing recall: e.g., students explored the consequences of changing definitions, assumptions, or settings. Exams were similar to the homework, but emphasized the basic concepts.

Slide 6
Syllabus Outline

“Knowledge without appropriate procedures for its use is [mute], and procedure without suitable knowledge is blind.” – Herb Simon, 1977

Computation: [15 lecs]
Search; game-playing; perceptron and nearest-neighbor learning; the halting problem.
Used later: graphs, inner products, Turing machines

Information:
• Document retrieval [3 lecs]
• The World Wide Web [4 lecs]
• Language structure [7 lecs]
• Statistical NLP [6 lecs]

Intelligence:
• The Turing Test [2 lecs]

Slide 7
Document Retrieval [3 lecs]

IR was treated as a subfield of NLP using reduced models of language:
• Tighter integration of the syllabus
• Search engines are a highly visible “NLP app”

Topics: Boolean query retrieval, indexing structures (arrays, B-trees, binary search), the vector space model, term weighting.

Notes: inner products and related geometric notions were introduced in the previous perceptron unit.

Slide 10
Statistical NLP [6 lecs]

Explorations of sub-sentential, distributional language structure:
• Word counts, Zipf’s law, and Miller’s [1957] monkeys. Same type of argument as the rich-get-richer hyperlink power-law derivation.
• IBM-style statistical MT. Alignments : translations :: hubs : authorities.
• ...plus other topics ...

Notes: The statistical paradigm was introduced in the unit on learning. Kevin Knight’s [1999] tutorial was very helpful.
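Although the course itself involved no programming, the vector space model and term weighting from the Document Retrieval unit are easy to illustrate in code. The sketch below is editorial, not from the talk: the documents, query, and the particular tf-idf weighting are invented for illustration, and real systems use many weighting variants. It ranks documents by the cosine of the angle between sparse term-weight vectors, computed from inner products (the geometric machinery the slides note was introduced in the perceptron unit).

```python
import math
from collections import Counter

def doc_freq(docs):
    """Number of documents in the collection containing each term."""
    return Counter(t for d in docs for t in set(d))

def tfidf(tokens, df, n):
    """Toy tf-idf weights (term frequency times log inverse document
    frequency); terms unseen in the collection are dropped."""
    tf = Counter(tokens)
    return {t: tf[t] * math.log(n / df[t]) for t in tf if df.get(t)}

def cosine(u, v):
    """Cosine similarity of two sparse vectors via their inner product."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical three-document collection, chosen for illustration only.
docs = [
    "search engines index the web".split(),
    "the web is a graph of pages".split(),
    "the perceptron is a learning machine".split(),
]
df, n = doc_freq(docs), len(docs)
vecs = [tfidf(d, df, n) for d in docs]

query = tfidf("web search engines".split(), df, n)
scores = [cosine(query, v) for v in vecs]
best = max(range(n), key=scores.__getitem__)  # index of the top-ranked doc
```

Note that the ubiquitous word "the" appears in every document, so its idf weight is log(3/3) = 0: term weighting automatically discounts uninformative terms, which is part of why the model works with such a reduced view of language.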
Slide 12
Statistical NLP (cont)

• Japanese segmentation [Ando/Lee 2000]: more multilingual considerations
• The Federalist Papers [Mosteller/Wallace 1984]: historical applications
• Infant statistical segmentation learning [Saffran et al. 1996]: cf. Ando/Lee
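Miller's [1957] monkeys argument, mentioned in the Statistical NLP unit, can be checked with a short simulation. The sketch below is editorial, not course material; the corpus size and the 27-key alphabet are arbitrary choices. A "monkey" hits keys uniformly at random; because every word of a given length is equally likely and each extra letter multiplies a word's probability by 1/27, short words dominate the frequency counts, producing a Zipf-like skew without any linguistic structure at all.

```python
import random
from collections import Counter, defaultdict

random.seed(0)  # reproducible "monkey"
keys = "abcdefghijklmnopqrstuvwxyz "  # 26 letters plus space, all equally likely

# The monkey types a long random character stream; spaces delimit "words".
text = "".join(random.choice(keys) for _ in range(200_000))
words = [w for w in text.split(" ") if w]
counts = Counter(words)

# Average frequency of the observed word types, grouped by word length.
by_len = defaultdict(list)
for w, c in counts.items():
    by_len[len(w)].append(c)
avg = {k: sum(v) / len(v) for k, v in by_len.items()}
```

Inspecting `avg` shows one-letter words occurring far more often on average than two- or three-letter words, and sorting `counts` by frequency gives the roughly power-law rank-frequency curve. This is the same rich-get-richer flavor of argument the slides connect to the hyperlink power-law derivation: a heavy-tailed distribution arising from a trivial generative process.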