CS291A - Schedule


 

Jan 9 Course Introduction  
Jan 11 Transformers / Quiz: Basic Concepts
Jan 16 Holiday  
Jan 18 Topic: Time Series Forecasting (Shiyang)
Paper: DeepAR: Probabilistic Forecasting with Autoregressive Recurrent Networks
Paper: ConvTran: Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

Comments: Attention in Time Series (Xifeng)
Jan 23

Topic: GPT, Codex, ChatGPT (Weizhi)

Paper: GPT: Improving Language Understanding by Generative Pre-Training

Paper: GPT-3:  Language Models are Few-Shot Learners

Paper: InstructGPT: Training Language Models to Follow Instructions with Human Feedback

Reading: How does GPT Obtain its Ability?

 
Jan 25

Topic: Retrieval-Augmented Pre-training and Fine-tuning for Knowledge-Intensive NLP Tasks: REALM, RAG, DPR, FiD (Hong)

Paper: REALM: Retrieval-Augmented Language Model Pre-Training 

Paper: RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

Paper: FiD: Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering

 
Jan 30

Topic: T5 and BART (Shiyang)

Paper: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Paper: T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

 
Feb 1

Topic: Prefix-Tuning, Adapters, In-Context Learning (Hong)
Paper: Prefix-Tuning: Optimizing Continuous Prompts for Generation

Paper: Adapters: Parameter-Efficient Transfer Learning for NLP

Project Proposal Due
Feb 6

Topic: Multimodality (Weizhi)

Paper: CLIP: Learning Transferable Visual Models From Natural Language Supervision

Paper: Weizhi's paper

 
Feb 8

Topic: Scanned Document Analysis

Paper: LayoutLM: Pre-training of Text and Layout for Document Image Understanding (Krushna)

Paper: LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding (Rajan)
Paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
 
Feb 13 Topic: Dialogue and Limitation of Language Model
Paper: A Simple Language Model for Task-Oriented Dialogue (Erwan)
Paper: Limitations of Language Models in Arithmetic and Symbolic Induction (Xifeng)
 
Feb 15

Topic: Pretrained Models for Long Documents

Paper: Longformer: The Long-Document Transformer (Saastha)

Paper: Big Bird: Transformers for Longer Sequences (Ross)
Paper review due
Feb 20 Holiday  
Feb 22

Topic: Make it smaller

Paper: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Danish)

Paper: TinyBERT: Distilling BERT for Natural Language Understanding (Kyle)

Feb 27

Topic: Architecture Ideas

Paper: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (Sid)
Paper: PaLM: Scaling Language Modeling with Pathways (Alex Mei)
Mar 1

Topic: Speech Recognition and Image Recognition Applications
Paper: wav2vec: Unsupervised Pre-Training for Speech Recognition (Marius)
Paper: ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Daniel)
Paper: MAE: Masked Autoencoders Are Scalable Vision Learners (Ian)

Mar 6 Topic: Transformer-based Reinforcement Learning
Paper: Decision Transformer: Reinforcement Learning via Sequence Modeling (Rhys)
Midterm Quiz (covers Jan 11 - Feb 22)
Topic: Dreamer
Paper: Dreamer: Dream to Control: Learning Behaviors by Latent Imagination
Paper: DreamerV2: Mastering Atari with Discrete World Models
Paper: DayDreamer: World Models for Physical Robot Learning
For your own reading only; no presentation.
Mar 8 Project Presentation  
Mar 13 Project Presentation  
Mar 15 Project Presentation  
Mar 20 Project Final Report Due