Date
|
Topic
|
Comments
|
Jan 8 |
Course Introduction |
|
Jan 10 |
Transformers / Quiz |
Basic Concepts |
Jan 15 |
Holiday |
|
Jan 17 |
Topic: T5 and BART (Ross)
Paper: BART: Denoising
Sequence-to-Sequence Pre-training for Natural Language Generation,
Translation, and Comprehension
Paper: T5: Exploring the Limits of
Transfer Learning with a Unified Text-to-Text Transformer
Paper: Finetuned Language Models
Are Zero-Shot Learners
|
|
Jan 22 |
Topic: GPT, ChatGPT (Weizhi)
Paper: GPT:
Improving Language Understanding by Generative Pre-Training
Paper: GPT-3: Language Models
are Few-Shot Learners
Paper: InstructGPT: Training
Language Models to Follow Instructions with Human Feedback
Reading:
How does GPT Obtain its Ability?
Reading: Sparks of Artificial
General Intelligence: Early experiments with GPT-4
|
|
Jan 24 |
Topic: Retrieval-Augmented Generation
Paper: REALM: Retrieval-Augmented
Language Model Pre-Training (Shinda)
Paper: Retrieval-Augmented
Generation for Knowledge-Intensive NLP Tasks (Mehak)
Paper: FiD: Leveraging Passage
Retrieval with Generative Models for Open Domain Question Answering
Practice: Try LangChain
to build a question answering chatbot
|
|
Jan 29 |
Topic: Chain of Thoughts
Paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Peiyang)
Paper: Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Callie)
Paper: PAL: Program-aided Language Models (Alfonso) |
|
Jan 31 |
Topic: Actionable LLMs
Paper: ReAct: Synergizing Reasoning and Acting in Language Models (Deepark)
Paper: ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (Esha)
Practice: Read tutorials, e.g., this one, and try GPT-4 function calling
Read: Augmented Language Models: a Survey |
Project Proposal Due |
Feb 5 |
Topic: Intelligent Agents
Paper: AutoGPT (Tanay)
Paper: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation (Ruiquan)
Paper: Agents: An Open-source Framework for Autonomous Language Agents (Hwajung)
Practice: Try one of these packages
|
|
Feb 7 |
Topic: Small models guiding large models
Paper: REPLUG: Retrieval-Augmented Black-Box Language Models (Aditya)
Paper: Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering (Luke)
Paper: Guiding Large Language Models via Directional Stimulus Prompting (xuan) |
|
Feb 12 |
Topic: Prefix and Adapter in Context Learning
Paper: Prefix-Tuning: Optimizing Continuous Prompts for Generation
(Sahil)
Paper: Adapters:
Parameter-Efficient Transfer Learning for NLP (Alvin)
Paper: LoRA: Low-Rank
Adaptation of Large Language Models (Kenan)
|
|
Feb 14 |
Topic: Pretrained Models for Long Context
Paper: Longformer: The Long-Document Transformer (Haarika)
Paper: BigBird: Big Bird: Transformers for Longer Sequences (Rutvik)
Paper: Extending Context Window of Large Language Models via Positional Interpolation (Ethan) |
Paper review or System Play due |
Feb 19 |
Holiday |
|
Feb 21 |
Topic: Make it smaller
Paper:
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
(Zihan)
Paper:
TinyBERT: Distilling BERT for Natural Language Understanding (Shanxiu)
Paper: Distilling step-by-step:
Outperforming larger language models with less training data and smaller model
sizes (Parker)
|
|
Feb 26 |
Topic: Multimodality
Paper: CLIP:
Learning Transferable Visual Models From Natural Language Supervision
(zifeng)
Paper:
BLIP-2: Bootstrapping
Language-Image Pre-training with Frozen Image Encoders and Large Language
Models (Ivan)
Paper: Visual Instruction
Tuning (Noa)
|
|
Feb 28 |
Topic: Speech Recognition and Image Recognition Application
Paper: wav2vec: Unsupervised Pre-Training for Speech Recognition (Eren)
Paper: wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Laasya)
Paper: ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Yuchen)
Paper: MAE: Masked Autoencoders Are
Scalable Vision Learners (Vihaan)
|
|
Mar 4 |
Topic: Transformer-based Reinforcement Learning
Paper: Decision Transformer: Reinforcement Learning via Sequence Modeling (Trysten) |
Exam (covers Jan 10 - Feb 26) |
|
Topic: Dreamer
Paper: Dreamer: Dream to Control: Learning Behaviors by Latent Imagination
Paper: DreamerV2: Mastering Atari with Discrete World Models
Paper: DayDreamer: World Models for Physical Robot Learning
|
Just for your reading, no presentation |
|
Topic: Scanned Document Analysis
Paper: LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Paper: LayoutLMv2: Multi-modal Pre-training for Visually-rich Document Understanding
Paper: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Paper: Linformer: Self-Attention with Linear Complexity |
Just for your reading, no presentation |
Mar 6 |
Project Presentation |
|
Mar 11 |
Project Presentation |
|
Mar 13 |
Project Presentation |
|
Mar 18 |
Project Final Report Due |
|