| Date | Topic | Comments |
|------|-------|----------|
| Jan 9 | Course Introduction | |
| Jan 11 | Transformers / Quiz | Basic Concepts |
| Jan 16 | Holiday | |
| Jan 18 | Topic: Time Series Forecasting (Shiyang). Paper: DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks. Paper: ConvTran: Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting. Comments: Attention in Time Series (Xifeng) | |
| Jan 23 | Topic: GPT, Codex, ChatGPT (Weizhi). Paper: GPT: Improving Language Understanding by Generative Pre-Training. Paper: GPT-3: Language Models are Few-Shot Learners. Paper: InstructGPT: Training Language Models to Follow Instructions with Human Feedback. Reading: How does GPT Obtain its Ability? | |
| Jan 25 | Topic: Retrieval-Augmented Pre-training and Fine-tuning for Knowledge-Intensive NLP Tasks: REALM, RAG, DPR, FiD (Hong). Paper: REALM: Retrieval-Augmented Language Model Pre-Training. Paper: RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. Paper: FiD: Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering | |
| Jan 30 | Topic: T5 and BART (Shiyang). Paper: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension. Paper: T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | |
| Feb 1 | Topic: Prefix, Adapter, In-Context Learning (Hong). Paper: Prefix-Tuning: Optimizing Continuous Prompts for Generation. Paper: Adapters: Parameter-Efficient Transfer Learning for NLP | Project Proposal Due |
| Feb 6 | Topic: Multimodality (Weizhi). Paper: CLIP: Learning Transferable Visual Models From Natural Language Supervision. Paper: Weizhi's paper | |
| Feb 8 | Topic: Scanned Document Analysis. Paper: LayoutLM: Pre-training of Text and Layout for Document Image Understanding (Krushna). Paper: LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding (Rajan). Paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking | |
| Feb 13 | Topic: Dialogue and Limitations of Language Models. Paper: A Simple Language Model for Task-Oriented Dialogue (Erwan). Paper: Limitations of Language Models in Arithmetic and Symbolic Induction (Xifeng) | |
| Feb 15 | Topic: Pretrained Models for Long Documents. Paper: Longformer: The Long-Document Transformer (Saastha). Paper: Big Bird: Transformers for Longer Sequences (Ross) | Paper review due |
| Feb 20 | Holiday | |
| Feb 22 | Topic: Make It Smaller. Paper: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Danish). Paper: TinyBERT: Distilling BERT for Natural Language Understanding (Kyle) | |
| Feb 27 | Topic: Architecture Ideas. Paper: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (Sid). Paper: PaLM: Scaling Language Modeling with Pathways (Alex Mei) | |
| Mar 1 | Topic: Speech Recognition and Image Recognition Applications. Paper: wav2vec: Unsupervised Pre-Training for Speech Recognition (Marius). Paper: ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Daniel). Paper: MAE: Masked Autoencoders Are Scalable Vision Learners (Ian) | |
| Mar 6 | Topic: Transformer-Based Reinforcement Learning. Paper: Decision Transformer: Reinforcement Learning via Sequence Modeling (Rhys) | Midterm Quiz (covers Jan 11 - Feb 22) |
| | Topic: Dreamer. Paper: Dreamer: Dream to Control: Learning Behaviors by Latent Imagination. Paper: DreamerV2: Mastering Atari with Discrete World Models. Paper: DayDreamer: World Models for Physical Robot Learning | Just for your reading, no presentation |
| Mar 8 | Project Presentation | |
| Mar 13 | Project Presentation | |
| Mar 15 | Project Presentation | |
| Mar 20 | Project Final Report Due | |