CSE 628 Introduction to Natural Language Processing

Fall 2011

Tue/Thu 3:50pm - 5:10pm at [Earth & Space 069]


Instructor:

Course Description:

  • Prerequisites: Familiarity with either Artificial Intelligence or Machine Learning is strongly recommended, but not strictly necessary.

  • Tentative Grading:
  • Late submission: Each student may extend homework deadlines by up to 7 days in total over the semester without penalty (not 7 days for each assignment, but 7 days cumulatively for the entire semester). After that, 10% of the score will be subtracted for each day late. This rule does not apply to critique submissions, which are due at the beginning of class for the paper discussions. If your critique is late, it is treated as a decision to skip the corresponding session. This policy is meant to encourage students to submit quality work rather than poorly composed work done in a hurry. For the purpose of counting late days, fractional values are rounded up: for instance, a submission that is 1 hour late counts as 1 day late. If the application of this rule is ambiguous in some situation, I reserve the right to apply it as I see appropriate. If in doubt, consult with me first rather than making your own assumption.

  • Assignments are posted to Blackboard
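The late-submission rule above involves a little arithmetic, so here is a minimal sketch of one way to read it. It is not an official grading script. The function names are hypothetical, and two points are assumptions the policy leaves open: that free late days are used up before any penalty applies, and that each penalized day removes 10% of the score earned on that assignment.

```python
import math

FREE_LATE_DAYS = 7        # from the policy: 7 days cumulatively per semester
PENALTY_PERCENT = 10      # from the policy: 10% of score per day


def late_days(hours_late):
    """Round lateness up to whole days (1 hour late counts as 1 day)."""
    if hours_late <= 0:
        return 0
    return math.ceil(hours_late / 24)


def apply_late_policy(score, hours_late, free_days_left=FREE_LATE_DAYS):
    """Return (adjusted_score, free_days_remaining) for one homework.

    Assumption: free late days are consumed first, then each remaining
    late day removes 10% of the earned score, never going below zero.
    """
    days = late_days(hours_late)
    free_used = min(days, free_days_left)
    penalized_days = days - free_used
    penalty = min(100, PENALTY_PERCENT * penalized_days)
    adjusted = score * (100 - penalty) / 100
    return adjusted, free_days_left - free_used
```

For example, under these assumptions a homework scored 90 and submitted 49 hours late by a student with one free day left counts as 3 days late: one day is covered by the free allowance and the other two are penalized.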

    Announcements:

    1. There will be a quiz at the beginning of the Sep/8 (Thu) lecture. Check the "recap" slides to prepare. (Disclaimer: not all quizzes will be announced in advance.)
    2. Homework-1 is due Oct/2 (SUN) 11:59pm.
    3. Project Proposal is due Oct/9 (SUN) 11:59pm.
    4. Project Update Report is due Nov/13 (SUN) 11:59pm. (Note that the presentation is due sooner; see the schedule below.)
    5. Project Final Report is due Dec/18 (SUN) 11:59pm. (Note that the presentation is due sooner; see the schedule below.)

    Tentative Syllabus: (subject to change depending on the students' backgrounds and interests)

    Date Topics References
    --- Slides (provided below for convenience) correspond to roughly 70% of the actual class material. The rest will be delivered on the chalkboard. ---
    01 Tue 08/30 Introduction [slides *updated*]
    • J&M Chapter 1
    • [pdf] Lillian Lee, 2001. I'm sorry Dave, I'm afraid I can't do that: Linguistics, Statistics, and Natural Language Processing circa 2001. The National Academies' study on the Fundamentals of Computer Science
    02 Thu 09/01 Language Models [slides *updated*]
    • J&M Chapter 4
    03 Tue 09/06 Language Models & Information Theory [slides *updated*]
    • J&M Chapter 4
    04 Thu 09/08 Text Categorization & Machine Learning Basics [slides *updated*]
    05 Tue 09/13 Text Categorization & Machine Learning Basics [slides *updated*]
    06 Thu 09/15 Part-of-Speech Tagging & Sequence Tagging [slides *updated*]
    • J&M Chapter 5
    07 Tue 09/20 Paper Discussion Session I Prepare for both papers below:
    • Gender Attribution: Tracing Stylometric Evidence Beyond Topic and Genre. Ruchita Sarawgi, Kailash Gajulapalli and Yejin Choi. Computational Natural Language Learning (CoNLL), 2011. [pdf]
    • Finding Deceptive Opinion Spam by Any Stretch of the Imagination. Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey Hancock. Association for Computational Linguistics (ACL), 2011. [pdf]
    08 Thu 09/22 Evaluation Techniques (chalkboard)
    & Discussion on Project Topics
    (Slides to be posted on Blackboard)
    09 Tue 09/27 Hidden Markov Models [slides *updated*]
    • J&M Chapter 6
    10 Thu 09/29 NO CLASS (Rosh Hashanah)
    11 Tue 10/04 Paper Discussion Session II
    • Web-Scale N-gram Models for Lexical Disambiguation. Shane Bergsma, Dekang Lin, Randy Goebel. In Proc. IJCAI 2009. [pdf]
    12 Thu 10/06 Hidden Markov Models (mostly chalkboard) + [slides *updated*]
    • J&M Chapter 6
    13 Tue 10/11 Maximum Entropy Models & Conditional Random Fields [slides]
    • J&M Chapter 6
    14 Thu 10/13 NO CLASS (instructor out of town)
    15 Tue 10/18 Context Free Grammars [slides]
    • J&M Chapter 12
    16 Thu 10/20 Parsing [slides]
    • J&M Chapter 13
    17 Tue 10/25 Statistical Parsing [slides]
    • J&M Chapter 14
    18 Thu 10/27 Guest Lecture by Dr. Veselin Stoyanov
    on Graphical Models [slides]
    19 Tue 11/01 Beyond CFG [slides]

    Also, Paper Discussion Session III
    Prepare for both papers below:
    • A Web-based English Proofing System for English as a Second Language Users. Xing Yi, Jianfeng Gao, and William B. Dolan. IJCNLP 2008. [pdf]
    • Automatic Collocation Suggestion in Academic Writing. Jian-Cheng Wu, Yu-Chia Chang, Teruko Mitamura, and Jason S. Chang. ACL 2010. [pdf]
    20 Thu 11/03 Guest Lecture by Dr. Shane Bergsma
    on Coreference Resolution [slides]
    21 Tue 11/08 Project Update Presentation I
    22 Thu 11/10 Project Update Presentation II
    23 Tue 11/15 Paper Discussion Session IV Prepare for both papers below:
    • Revisiting Readability: A Unified Framework for Predicting Text Quality. Emily Pitler and Ani Nenkova. Conference on Empirical Methods in Natural Language Processing (EMNLP), 2008. [pdf]
    • Cognitively Motivated Features for Readability Assessment. Lijun Feng, Noemie Elhadad, and Matt Huenerfauth. Conference of the European Chapter of the ACL (EACL), 2009. [pdf]
    24 Thu 11/17 Machine Translation [slides *updated*]
    • J&M Chapter 25
    25 Tue 11/22 Machine Translation II [slides *updated*]
    • J&M Chapter 25
    26 Thu 11/24 NO CLASS (Thanksgiving Break)
    27 Tue 11/29 Machine Translation III [slides *updated*]
    • J&M Chapter 25
    28 Thu 12/01 Information Extraction [slides]
    • J&M Chapter 22
    29 Tue 12/06 Final Project Presentation I
    30 Thu 12/08 Final Project Presentation II