CS291A - Schedule


 

Mar 31 Course Introduction
The Bitter Lesson, Rich Sutton
 
Apr 4 Transformers / Quiz: Basic Concepts
Apr 7

Topic: GPT, ChatGPT (Weizhi)

Paper: GPT: Improving Language Understanding by Generative Pre-Training (Weizhi)

Paper: GPT-3: Language Models are Few-Shot Learners (Weizhi)

Paper: InstructGPT: Training Language Models to Follow Instructions with Human Feedback (Weizhi)

 
Apr 11

Topic: Retrieval-Augmented Generation

Paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Xuan)

Paper: From Local to Global: A Graph RAG Approach to Query-Focused Summarization (Xuan)

Paper: SimCSE: Simple Contrastive Learning of Sentence Embeddings (Xuan)

Practice: Try LangChain to build a question-answering chatbot (see the sketch below)
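
A minimal sketch of what the LangChain exercise could look like, assuming the classic LangChain API (import paths differ across releases, e.g., langchain_community / langchain_openai), an OpenAI API key in the environment, and a placeholder document notes.txt; the file name, model choice, and chunking parameters are illustrative, not part of the assignment.

```python
# RAG-style QA chatbot sketch using classic LangChain APIs.
# Assumes: pip install langchain openai faiss-cpu, and OPENAI_API_KEY is set.
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load the source text and split it into overlapping chunks.
docs = TextLoader("notes.txt").load()          # "notes.txt" is a placeholder corpus
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)

# 2. Embed the chunks and index them in an in-memory FAISS vector store.
index = FAISS.from_documents(chunks, OpenAIEmbeddings())

# 3. Combine a retriever and a chat model into a question-answering chain.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0),
    retriever=index.as_retriever(search_kwargs={"k": 4}),
)

# 4. Ask a question; the top-k retrieved chunks are stuffed into the prompt.
print(qa.run("What are the main topics covered in these notes?"))
```

The default "stuff" chain concatenates the retrieved chunks directly into the prompt; for longer contexts LangChain also provides map_reduce and refine chain types.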

Apr 14 Topic: Intelligent Agents
Paper: Reflexion: Language Agents with Verbal Reinforcement Learning
Paper: OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Survey: Agent frameworks: AutoGen, CrewAI, Swarm, and Mica (our own) (Sirui)
 
Apr 18 Topic: Chain of Thought
Paper: Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Paper: Self-Consistency Improves Chain of Thought Reasoning in Language Models
Paper: Tree of Thoughts: Deliberate Problem Solving with Large Language Models
 
Apr 21 Topic: Prompt Engineering and Actionable LLMs
Blog:  Prompt Engineering
Paper: ReAct: Synergizing Reasoning and Acting in Language Models
Paper: Toolformer: Language Models Can Teach Themselves to Use Tools
Practice: Read tutorials, e.g., this one, and try GPT-4 function calling (see the sketch below)
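
A minimal sketch of GPT-4 function calling with the OpenAI Python SDK (v1-style client); the get_weather tool, its JSON schema, and the example query are illustrative assumptions, not taken from the linked tutorial.

```python
# Function-calling round trip: the model asks for a tool call, we run it,
# then return the result so the model can compose the final answer.
# Assumes: pip install openai, and OPENAI_API_KEY is set in the environment.
import json
from openai import OpenAI

client = OpenAI()

def get_weather(city: str) -> str:
    """Hypothetical local tool; a real app would call a weather API here."""
    return json.dumps({"city": city, "forecast": "sunny", "temp_c": 21})

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Santa Barbara?"}]
response = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
msg = response.choices[0].message

if msg.tool_calls:                      # the model chose to call our function
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    messages.append(msg)                # keep the assistant's tool-call turn
    messages.append({"role": "tool", "tool_call_id": call.id,
                     "content": get_weather(**args)})
    final = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(msg.content)
```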
 
Apr 25

Topic: Multimodality

Paper: CLIP: Learning Transferable Visual Models From Natural Language Supervision

Paper: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models

Paper: Visual Instruction Tuning

Project Proposal Due
May 2

Topic: Long Context

Paper: Longformer: The Long-Document Transformer

Paper: Big Bird: Transformers for Longer Sequences
Paper: Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
 
May 5

Topic: Prefix and Adapter Tuning
Paper: Adapters: Parameter-Efficient Transfer Learning for NLP

Paper: Prefix-Tuning: Optimizing Continuous Prompts for Generation

Paper: LoRA: Low-Rank Adaptation of Large Language Models

 
May 9

Topic: Make it smaller

Paper: DeepSeek-V3 Technical Report (knowledge distillation part)

Paper: Fast Inference from Transformers via Speculative Decoding

Paper: FlexiDepth: Dynamic Layer-skipping in Pre-trained LLMs (brand new, Xuan)

 
May 12 Topic: Mixture of Experts
Paper: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Paper: Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
Paper: DeepSeek-V3 Technical Report (MoE part)
Paper review or System Play due
May 19 Topic: Misc.
Paper: DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper: A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
 
May 23 Exam  
May 26

Holiday

May 30 Project Presentation
Jun 2 Project Presentation
 
Jun 6 Project Presentation