CS 11-737: Multilingual Natural Language Processing
Course Description
CS 11-737 is an advanced graduate-level course on natural language processing techniques applicable to many languages. Students who take this course should be able to develop linguistically motivated solutions to core and applied NLP tasks for any language. This includes understanding and mitigating the difficulties posed by lack of data in low-resourced languages or language varieties, and the necessity to model particular properties of the language of interest such as complex morphology or syntax. The course will introduce modeling solutions to these issues such as multilingual or cross-lingual methods, linguistically informed NLP models, and methods for effectively bootstrapping systems with limited data or human intervention. The project work will involve building an end-to-end NLP pipeline in a language you don’t know.
Instructor
Lei Li
(Office Hour: GHC 6403, book a slot here)
Teaching Assistants
TA Mailing list: cs11-737-fa2023-tas@cs.cmu.edu
- Simran Khanuja (Office Hour: xxx)
Time and Location
Tuesday and Thursday, 2-3:20pm, DH 1212
Prerequisites
You must have previously taken an NLP course (11-411, 11-611, or 11-711) and a Deep Learning course (11-685 or 11-785). Assignments involve building neural network models, and examples will be provided in PyTorch. If you are not familiar with PyTorch, we suggest working through online tutorials (for example, Deep Learning for NLP with PyTorch) before the class starts.
Class Format
For each class there will be:
- Reading: Most classes will have associated reading material that we recommend you read before the class to familiarize yourself with the topic.
- Lecture: The first part of the class will feature a lecture giving an overview of the topic of the day, during which you can ask questions to clarify any of the material.
- Language in 10: Groups in the class will give a 10-minute presentation on one of the languages of the world.
- Discussion: There will be an open-ended discussion in which we will split into small groups and discuss a question regarding the class.
Homework Submission & Grading
Please submit your homework on Canvas. Each assignment will be given a grade of A+ (100), A (96), A- (92), B+ (88), B (85), B- (82), or below. Final grades will be determined by the weighted average of discussion participation, assignments, and the project. Cutoffs for final grades will be approximately 97+ A+, 93+ A, 90+ A-, 87+ B+, 83+ B, 80+ B-, etc., although we reserve some flexibility to change these thresholds slightly.
- Participation: Worth 15% of the grade. Your lowest 3 participation grades will be dropped.
- Assignments: There will be 4 assignments (the final one being the project), worth 15%, 20%, 20%, and 30% of the grade, respectively.
The details of the assignments are elaborated on the assignments page.
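To make the weighting above concrete, here is a minimal Python sketch of how a final score could be computed from the stated percentages and approximate cutoffs. It is an illustration only, not the official grading script: the function names are hypothetical, and it assumes that each participation session is scored from 0 to 100 and that sessions are averaged equally after the lowest 3 are dropped. The syllabus does not list cutoffs below B-, so the sketch reports those simply as "below B-".

```python
# Weights from the syllabus: participation 15%, assignments 15/20/20/30%.
PARTICIPATION_WEIGHT = 0.15
ASSIGNMENT_WEIGHTS = [0.15, 0.20, 0.20, 0.30]

# Approximate cutoffs from the syllabus; the instructors may adjust them.
CUTOFFS = [(97, "A+"), (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-")]


def participation_score(scores, n_dropped=3):
    """Average the participation scores after dropping the lowest n_dropped.

    Assumption (not stated in the syllabus): every session is scored 0-100
    and the remaining sessions count equally.
    """
    kept = sorted(scores)[n_dropped:]
    if not kept:
        raise ValueError("need more scores than the number dropped")
    return sum(kept) / len(kept)


def final_score(participation, assignments):
    """Weighted average of participation and the four assignment scores."""
    if len(assignments) != len(ASSIGNMENT_WEIGHTS):
        raise ValueError("expected one score per assignment")
    weighted = sum(w * s for w, s in zip(ASSIGNMENT_WEIGHTS, assignments))
    return PARTICIPATION_WEIGHT * participation + weighted


def letter_grade(score):
    """Map a numeric score to a letter using the approximate cutoffs."""
    for cutoff, letter in CUTOFFS:
        if score >= cutoff:
            return letter
    return "below B-"  # lower cutoffs are not given in the syllabus


if __name__ == "__main__":
    # Hypothetical student: three missed sessions, assignments graded A, A-, B+, A.
    part = participation_score([100, 100, 0, 0, 0, 100])
    total = final_score(part, [96, 92, 88, 96])
    print(round(total, 1), letter_grade(total))
```

For the hypothetical student in the example, the three zeros are dropped, so participation contributes its full 15 points and the total comes to about 94.2, which falls in the A range under the approximate cutoffs.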
Discussion Forum
We will use the Ed platform for discussions (sign up here), but emailing the TA mailing list and coming to office hours are also encouraged.
Policy
Please read the following link carefully!
Syllabus