Topic: Transformers
Paper: Attention Is All You Need
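The core operation of the Transformers paper above is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal pure-Python sketch (toy list-of-lists vectors, no batching or masking):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.

    Q, K, V are lists of vectors (lists of floats); K and V have equal length.
    """
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out
```

With identical keys, the weights are uniform, so each output row is the plain average of the values.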
Topic: GPT, ChatGPT
Paper: GPT: Improving Language Understanding by Generative Pre-Training
Paper: GPT-3: Language Models are Few-Shot Learners
Paper: InstructGPT: Training Language Models to Follow Instructions with Human Feedback
Topic: Retrieval-Augmented Generation
Paper: Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
Paper: The Faiss library
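The RAG pipeline in the papers above boils down to: embed the question, retrieve the top-k passages from a vector index (e.g. Faiss), and prepend them to the prompt. A minimal pure-Python sketch with a brute-force stand-in for the index; the `embed` parameter is a hypothetical embedding function supplied by the caller:

```python
def search(index_vectors, query, k=2):
    """Brute-force top-k retrieval by dot-product score
    (a toy stand-in for a Faiss index such as IndexFlatIP)."""
    scores = [(sum(a * b for a, b in zip(vec, query)), i)
              for i, vec in enumerate(index_vectors)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:k]]

def rag_prompt(question, passages, embed, index_vectors, k=2):
    """Retrieve the k best-matching passages and build an augmented prompt.

    `embed` is assumed to map a string to a vector in the same space as
    `index_vectors`; the prompt template is illustrative only.
    """
    hits = search(index_vectors, embed(question), k)
    context = "\n".join(passages[i] for i in hits)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
```

In a real system the brute-force `search` is replaced by a Faiss index, which gives the same top-k semantics at scale.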
Topic: More LLM Agents
Paper: Toolformer: Language Models Can Teach Themselves to Use Tools
Paper: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation
Paper: OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning
Practice: OpenAI Tool Calling, OpenClaw, NanoClaw
Topic: Multimodality
Paper: CLIP: Learning Transferable Visual Models From Natural Language Supervision
Paper: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Paper: Visual Instruction Tuning
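A key capability from the CLIP paper above is zero-shot classification: embed the image and one text prompt per label into a shared space, then pick the label with the highest cosine similarity. A minimal sketch assuming the embeddings are already computed (toy vectors below, not real CLIP features):

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot_label(image_emb, text_embs, labels):
    """CLIP-style zero-shot classification: return the label whose text
    embedding is most cosine-similar to the image embedding."""
    sims = [cosine(image_emb, t) for t in text_embs]
    return labels[max(range(len(sims)), key=sims.__getitem__)]
```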
Topic: Long Context / KV Cache
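The idea behind the KV cache is that autoregressive decoding revisits the same prefix at every step, so each layer can store the keys and values it already computed and project only the newest token. A minimal single-query sketch (no batching, no heads):

```python
import math

class KVCache:
    """Key/value cache for autoregressive decoding (a minimal sketch).

    Without a cache, step t recomputes K and V for all t prefix tokens;
    with the cache, each step appends only the newest token's K and V.
    """
    def __init__(self):
        self.keys, self.values = [], []
        self.projections = 0  # count of K/V entries actually computed

    def step(self, q, k_new, v_new):
        # Store only the newest token's key and value.
        self.keys.append(k_new)
        self.values.append(v_new)
        self.projections += 1
        # Attend the new query over the whole cached prefix.
        d = len(q)
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d)
                  for key in self.keys]
        m = max(scores)
        ws = [math.exp(s - m) for s in scores]
        z = sum(ws)
        ws = [w / z for w in ws]
        return [sum(w * v[j] for w, v in zip(ws, self.values))
                for j in range(len(v_new))]
```

After t steps the cache holds exactly t key/value pairs, so per-step work grows with the prefix length only through the attention sum, not through recomputing projections.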
Topic: Fine Tuning
Paper: Adapters: Parameter-Efficient Transfer Learning for NLP
Paper: LoRA: Low-Rank Adaptation of Large Language Models
Paper: Direct Preference Optimization: Your Language Model is Secretly a Reward Model
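LoRA, from the fine-tuning block above, freezes the pretrained weight W and learns a low-rank update: the effective weight is W + (alpha/r) * B A, where A is r x d_in, B is d_out x r, and B starts at zero so training begins exactly at the base model. A minimal pure-Python sketch on list-of-lists matrices:

```python
def lora_apply(W, A, B, alpha):
    """Return the effective weight W + (alpha/r) * (B @ A).

    W: d_out x d_in (frozen base weight), A: r x d_in, B: d_out x r.
    With B all zeros, the result equals W, matching LoRA's initialization.
    """
    r = len(A)
    scale = alpha / r
    d_out, d_in = len(W), len(W[0])
    delta = [[scale * sum(B[i][k] * A[k][j] for k in range(r))
              for j in range(d_in)] for i in range(d_out)]
    return [[W[i][j] + delta[i][j] for j in range(d_in)] for i in range(d_out)]
```

Since only A and B are trained, a rank-r adapter stores r * (d_in + d_out) parameters instead of d_in * d_out.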
Topic: Make it smaller
Paper: Fast Inference from Transformers via Speculative Decoding
Paper: EAGLE-3: Scaling up Inference Acceleration of Large Language Models via Training-Time Test (please also discuss the multi-token prediction (MTP) part of the DeepSeek-V3 Technical Report)
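Speculative decoding, covered by the two papers above, has a cheap draft model propose several tokens which the target model then verifies, keeping the longest agreeing prefix. A greedy simplification (the papers use probabilistic acceptance over full distributions); both model arguments are hypothetical callables mapping a token sequence to the next token:

```python
def speculative_step(prefix, draft_model, target_model, gamma=4):
    """One round of greedy speculative decoding (a simplified sketch).

    The draft model proposes `gamma` tokens; the target model checks them
    in order, keeping matches and substituting its own token on the first
    mismatch. If everything matches, the target contributes one bonus token.
    """
    # Draft phase: cheap model extends the prefix by gamma tokens.
    proposal = list(prefix)
    for _ in range(gamma):
        proposal.append(draft_model(proposal))
    # Verify phase: target model accepts or corrects each proposed token.
    accepted = list(prefix)
    for tok in proposal[len(prefix):]:
        t = target_model(accepted)
        if t == tok:
            accepted.append(tok)
        else:
            accepted.append(t)  # correct the first disagreement and stop
            break
    else:
        accepted.append(target_model(accepted))  # bonus token: all accepted
    return accepted
```

When draft and target agree often, each round emits up to gamma + 1 tokens for a single pass of the expensive model, which is where the speedup comes from.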
Holiday