CS291K - Schedule


 

Date

Topic

Comments

Jan 8 Course Introduction  
Jan  10 Transformers / Quiz Basic Concepts
Jan 15 Holiday  
Jan 17

Topic: T5 and BART (Ross)

Paper: BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

Paper: T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

Paper: Finetuned Language Models Are Zero-Shot Learners

Jan 22

Topic: GPT, ChatGPT (Weizhi)

Paper: GPT: Improving Language Understanding by Generative Pre-Training

Paper: GPT-3:  Language Models are Few-Shot Learners

Paper: InstructGPT: Training Language Models to Follow Instructions with Human Feedback

Reading:  How does GPT Obtain its Ability?

Reading:  Sparks of Artificial General Intelligence: Early experiments with GPT-4

 
Jan 24

Topic: Retrieval-Augmented Generation

Paper: REALM: Retrieval-Augmented Language Model Pre-Training  (Shinda)

Paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Mehak)

Paper: FiD: Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering

Practice: Try LangChain to build a question answering chatbot

 
Jan 29 Topic: Chain of Thought
Paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Peiyang)
Paper: Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Callie)
Paper: PAL: Program-aided Language Models (Alfonso)
 
Jan 31 Topic: Actionable LLMs
Paper: ReAct: Synergizing Reasoning and Acting in Language Models (Deepark)
Paper: ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (Esha)
Practice: Read tutorials, e.g., this one, and try GPT-4 function calling
Read: Augmented Language Models: a Survey
Project Proposal Due
Feb 5

Topic: Intelligent Agents
Paper: AutoGPT (Tanay)
Paper: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation (Ruiquan)
Paper: Agents: An Open-source Framework for Autonomous Language Agents  (Hwajung)
Practice:  Try one of these packages

 
Feb 7 Topic: Small models guiding large models
Paper: REPLUG: Retrieval-Augmented Black-Box Language Models  (Aditya)
Paper: Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering (Luke)
Paper: Guiding Large Language Models via Directional Stimulus Prompting (xuan)
 
Feb 12

Topic: Prefix and Adapter in Context Learning
Paper: Prefix-Tuning: Optimizing Continuous Prompts for Generation (Sahil)

Paper: Adapters: Parameter-Efficient Transfer Learning for NLP (Alvin)

Paper: LoRA: Low-Rank Adaptation of Large Language Models (Kenan)

 
Feb 14

Topic: Pretrained Models for Long Context

Paper: Longformer: The Long-Document Transformer (Haarika)

Paper: BigBird: Big Bird: Transformers for Longer Sequences (Rutvik)
Paper: Extending Context Window of Large Language Models via Positional Interpolation (Ethan)
Paper review or System Play due
Feb 19 Holiday  
Feb 21

Topic: Make it smaller

Paper: DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Zihan)

Paper: TinyBERT: Distilling BERT for Natural Language Understanding (Shanxiu)

Paper: Distilling step-by-step: Outperforming larger language models with less training data and smaller model sizes (Parker)

Feb 26

Topic: Multimodality

Paper: CLIP: Learning Transferable Visual Models From Natural Language Supervision (zifeng)

Paper: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models (Ivan)

Paper: Visual Instruction Tuning (Noa)

Feb 28

Topic: Speech Recognition and Image Recognition Application
Paper: wav2vec: Unsupervised Pre-Training for Speech Recognition (Eren)

Paper: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Laasya)
Paper: ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Yuchen)
Paper: MAE: Masked Autoencoders Are Scalable Vision Learners (Vihaan)

Mar 4 Topic: Transformer based Reinforcement Learning
Paper: Decision Transformer: Reinforcement Learning via Sequence Modeling (Trysten)
Exam (covers Jan 10 - Feb 26)
  Topic: Dreamer
Paper: Dreamer: Dream to Control: Learning Behaviors by Latent Imagination
Paper: DreamerV2: Mastering Atari with Discrete World Models
Paper: DayDreamer: World Models for Physical Robot Learning
Just for your reading, no presentation
 

Topic: Scanned Document Analysis

Paper: LayoutLM: Pre-training of Text and Layout for Document Image Understanding

Paper: LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding

Paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Paper: Linformer: Self-Attention with Linear Complexity
Just for your reading, no presentation
Mar 6 Project Presentation  
Mar 11 Project Presentation  
Mar 13 Project Presentation  
Mar 18 Project Final Report Due