Yu Su

About Me

I'm currently a researcher at Microsoft Semantic Machines in Berkeley. I will join the Department of Computer Science and Engineering at the Ohio State University in January 2020. I got my PhD from the Department of Computer Science at the University of California, Santa Barbara, advised by Xifeng Yan. Before coming to UC Santa Barbara, I received my bachelor's degree from the Department of Computer Science and Technology, Tsinghua University in 2012. I have spent some fun time interning at IBM T.J. Watson Research Center (2015), Microsoft Research Redmond (2016 and 2017), and the U.S. Army Research Laboratory (2015 and 2016).

I have broad interests in understanding human languages, formal knowledge, and their interplay. Specifically, I have been working on natural language interfaces (an umbrella term for techniques like semantic parsing and dialog systems) for a wide range of backend data and services such as knowledge bases, web tables, and APIs. I'm also interested in knowledge base construction and general text mining.

I'm looking for highly motivated students. Drop me an email if you are interested in natural language processing, machine learning, or data mining (unfortunately, due to the high volume, I may not be able to reply to every email).

What's New

  • 05/2019: Short paper on general-purpose textual relation embedding got accepted to ACL 2019
  • 05/2019: Received Outstanding Dissertation Award of Computer Science from UCSB. Thank you UCSB!
  • 05/2019: Check out what we are doing at Microsoft Semantic Machines (highlighted in Microsoft Build 2019)!
  • 02/2019: Full paper on vocabulary selection got accepted to NAACL 2019
  • 02/2019: Talk at Stanford NLP Seminar on democratizing data science with knowledge engines
  • 11/2018: Full paper on zero-shot video captioning got accepted to AAAI 2019
  • 10/2018: Started as a researcher at Microsoft Semantic Machines in Berkeley, working on conversational AI.
  • 08/2018: Full paper on concept mining from text got accepted to ICDM 2018.
  • 08/2018: Two long papers on dialog/semantic parsing got accepted to EMNLP 2018.
  • 07/2018: Our work on natural language interfaces to APIs highlighted in Microsoft Research Blog!
  • 06/2018: Serving as a PC member for ACL'18, EMNLP'18, CoNLL'18, NLPCC'18, and AAAI'19.
  • 04/2018: Paper "DialSQL: Dialogue Based Structured Query Generation" accepted to ACL'18 as long paper: Improve semantic parsing with dialog.
  • 04/2018: Paper "Natural Language Interfaces with Fine-Grained User Interaction: A Case Study on Web APIs" accepted to SIGIR'18 as long paper.
  • 03/2018: Awarded the Best Distinguished Graduate Student Lecture of UCSB CS Summit.
  • 02/2018: Paper "Global Relation Embedding for Relation Extraction" accepted to NAACL-HLT'18: Robust relation extraction from text with global statistics.
  • 02/2018: Talk about "Bridging the Gap between Human and Data with AI" at the University of Massachusetts, Amherst.
  • 02/2018: Successfully organized the first Workshop on Knowledge Base Construction, Reasoning and Mining in Los Angeles. Check out the great invited talks and accepted papers!
  • 01/2018: Talk about "Bridging the Gap between Human and Data with AI" at the Ohio State University.
  • 12/2017: I will serve in the Program Committee (Research Track) of KDD'18
  • 12/2017: Paper "Unsupervised Neural Categorization for Scientific Publications" accepted to SDM'18.
  • 11/2017: Attended CIKM'17 in Singapore and gave a talk on natural language interfaces and a tutorial on construction and querying of large-scale knowledge bases.
  • 10/2017: Upcoming visits in China: 10.09-10.15 (Alibaba, Hangzhou), 10.10 (Fudan University, Shanghai), 10.11 (The Computing Conference, Hangzhou), 10.16 (Tsinghua University, Beijing), 10.17 (Toutiao AI Lab, Beijing)
  • 09/2017: I'm co-organizing the First Workshop on Knowledge Base Construction, Reasoning and Mining (KBCOM'18), co-located with WSDM'18 on Feb 9, 2018 in Los Angeles. CFP is out!
  • 09/2017: Finished summer internship at MSR. Flying to Copenhagen for EMNLP.
  • 08/2017: I will serve in the Program Committee of WWW'18
  • 08/2017: Paper on building natural language interfaces to web APIs, starting from zero users and zero data, accepted to CIKM'17.
  • 07/2017: Tutorial on Construction and Querying of Large-Scale Knowledge Bases accepted to CIKM'17. See you in Singapore!
  • 06/2017: Three papers on semantic parsing/QA accepted to EMNLP'17. Thanks to my collaborators!
  • 06/2017: Started summer internship at Microsoft Research
  • 04/2017: I will serve in the Program Committee of CIKM'17
  • 03/2017: Attended a project meeting at UIUC and gave a talk on unsupervised document categorization
  • 03/2017: I will serve in the Program Committee of NLPCC'17
  • 02/2017: I will serve in the Program Committee of EMNLP'17
  • 01/2017: I will serve in the Program Committee of ACL'17
  • 11/2016: Attended EMNLP'16 in Austin, US
  • 09/2016: Our QA dataset GraphQuestions v1 is released. Check it out!
  • 09/2016: Two papers on knowledge base question answering got accepted to EMNLP'16!
  • 09/2016: Attended the Bay Area Deep Learning School, Stanford
  • 06/2016: Started summer internship at Microsoft Research, Redmond

Publications

     Refereed Publications

  • Global Textual Relation Embedding for Relational Understanding
    Zhiyu Chen, Hanwen Zha, Honglei Liu, Wenhu Chen, Xifeng Yan and Yu Su. To appear in the Annual Conference of the Association for Computational Linguistics, 2019, short paper (ACL’19) [paper coming soon] [code coming soon] [data coming soon]
  • How Large A Vocabulary Does Text Classification Need? A Variational Approach on Vocabulary Selection
    Wenhu Chen, Yu Su, Yilin Shen, Zhiyu Chen, Xifeng Yan and William Yang Wang. To appear in the Annual Conference of the North American Chapter of the Association for Computational Linguistics, 2019 (NAACL-HLT’19) [paper]
  • Learning to Compose Topic-Aware Mixture of Experts for Zero-Shot Video Captioning
    Xin Wang, Jiawei Wu, Da Zhang, Yu Su, William Yang Wang. In Proc. of the AAAI Conference on Artificial Intelligence, 2019 (AAAI’19) [paper]
  • Concept Mining via Embedding
    Keqian Li, Hanwen Zha, Yu Su, Xifeng Yan. In Proc. of the IEEE International Conference on Data Mining, 2018 (ICDM’18) [paper]
  • XL-NBT: A Cross-lingual Neural Belief Tracking Framework
    Wenhu Chen, Jianshu Chen, Yu Su, Xin Wang, Dong Yu, Xifeng Yan and William Yang Wang. In Proc. of the Conference on Empirical Methods in Natural Language Processing, 2018 (EMNLP’18) [paper]
  • What It Takes to Achieve 100% Condition Accuracy on WikiSQL
    Semih Yavuz, Izzeddin Gur, Yu Su and Xifeng Yan. In Proc. of the Conference on Empirical Methods in Natural Language Processing, 2018 (EMNLP’18) [paper]
  • DialSQL: Dialogue Based Structured Query Generation
    Izzeddin Gur, Semih Yavuz, Yu Su, Xifeng Yan. In Proc. of the Annual Meeting of the Association for Computational Linguistics, 2018, oral (ACL’18) [paper]
  • Natural Language Interfaces with Fine-Grained User Interaction: A Case Study on Web APIs
    Yu Su, Ahmed Hassan Awadallah, Miaosen Wang, Ryen White. In Proc. of the International ACM SIGIR Conference on Research and Development in Information Retrieval, 2018, oral (SIGIR’18) [paper] [Microsoft Research Blog]
  • Global Relation Embedding for Relation Extraction
    Yu Su*, Honglei Liu*, Semih Yavuz, Izzeddin Gur, Huan Sun, Xifeng Yan. In Proc. of the Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018 (NAACL-HLT’18) [paper] [code] (*: Equal Contribution)
  • Unsupervised Neural Categorization for Scientific Publications
    Keqian Li, Hanwen Zha, Yu Su, Xifeng Yan. In Proc. of the SIAM International Conference on Data Mining, 2018, oral (SDM’18) [paper]
  • Building Natural Language Interfaces to Web APIs
    Yu Su, Ahmed Hassan Awadallah, Madian Khabsa, Patrick Pantel, Michael Gamon, Mark Encarnacion. In Proc. of the ACM International Conference on Information and Knowledge Management, 2017, oral (CIKM’17) [paper]
  • Cross-domain Semantic Parsing via Paraphrasing
    Yu Su, Xifeng Yan. In Proc. of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP’17) [paper] [code]
  • An End-to-End Deep Framework for Answer Triggering with a Novel Group-Level Objective
    Jie Zhao, Yu Su, Ziyu Guan, Huan Sun. In Proc. of the 2017 Conference on Empirical Methods in Natural Language Processing, short paper (EMNLP’17) [paper]
  • Recovering Question Answering Errors via Query Revision
    Semih Yavuz, Izzeddin Gur, Yu Su, Xifeng Yan. In Proc. of the 2017 Conference on Empirical Methods in Natural Language Processing, short paper (EMNLP’17) [paper]
  • On Generating Characteristic-rich Question Sets for QA Evaluation
    Yu Su, Huan Sun, Brian Sadler, Mudhakar Srivatsa, Izzeddin Gur, Zenghui Yan, Xifeng Yan. In Proc. of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP’16) [paper] [appendix] [data]
  • Improving Semantic Parsing via Answer Type Inference
    Semih Yavuz, Izzeddin Gur, Yu Su, Mudhakar Srivatsa, Xifeng Yan. In Proc. of the 2016 Conference on Empirical Methods in Natural Language Processing, oral (EMNLP’16) [paper]
  • A Fast Kernel for Attributed Graphs
    Yu Su, Fangqiu Han, Richard E. Harang, Xifeng Yan. In Proc. of the SIAM International Conference on Data Mining, 2016, oral (SDM’16) [paper] [appendix] [slides] [poster]
  • Table Cell Search for Question Answering
    Huan Sun, Hao Ma, Xiaodong He, Wen-Tau Yih, Yu Su, Xifeng Yan. In Proc. of the International World Wide Web Conference, 2016, oral (WWW’16) [paper]
  • Visual Graph Query Formulation and Exploration: A New Perspective on Information Retrieval at the Edge
    Sue Kase, Michelle Vanni, Joanne Knight, Yu Su, Xifeng Yan. In Proc. of SPIE 9851, Next-Generation Analyst IV, 2016 (SPIE Defense+Security’16)
  • Exploiting Relevance Feedback in Knowledge Graph Search
    Yu Su, Shengqi Yang, Huan Sun, Mudhakar Srivatsa, Sue Kase, Michelle Vanni, Xifeng Yan. In Proc. of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015, oral (KDD’15) [paper] [slides] [poster] [data]
  • On the Validity of Geosocial Mobility Traces
    Zengbin Zhang, Lin Zhou, Xiaohan Zhao, Gang Wang, Yu Su, Miriam Metzger, Haitao Zheng, and Ben Y. Zhao. In Proc. of the ACM Workshop on Hot Topics in Networks, 2013 (HotNets’13) [paper]

     Tutorials

  • Scalable Construction and Querying of Massive Knowledge Bases
    Xiang Ren, Yu Su, Pedro Szekely, Xifeng Yan. In Proc. of the International Conference on World Wide Web, 2018 (WWW’18) [website]
  • Construction and Querying of Large-scale Knowledge Bases
    Xiang Ren, Yu Su, Xifeng Yan. In Proc. of the ACM International Conference on Information and Knowledge Management, 2017 (CIKM’17) [website]

     Patents

  • Natural Language Interface to Web API (co-inventor)
    US Patent Pending 15/582,242
    A framework for building natural language interfaces to web APIs via crowdsourcing, with a hierarchical probabilistic model and an optimization algorithm for the crowdsourcing process.

Experience

  • 10/2018 - present: Researcher at Microsoft Semantic Machines
  • 08/2018 - present: Visiting assistant professor at The Ohio State University
  • 06/2017 - 09/2017: Research intern at Microsoft Research, Redmond
  • 06/2016 - 09/2016: Research intern at Microsoft Research, Redmond
  • 05/2015 - 06/2015, 09/2015 - 11/2015, 05/2016 - 06/2016: Visiting researcher at U.S. Army Research Laboratory
  • 06/2015 - 09/2015: Research intern at IBM T.J. Watson Research Center

Service

  • Co-organizer: KBCOM'18
  • Session Chair: CIKM'17 (Representation Learning)
  • Program Committee Member: KDD'19, NAACL'19, AAAI'19, KDD'18, ACL'18, EMNLP'18, WWW'18, NLPCC'18, CoNLL'18, ACL'17, EMNLP'17, NLPCC'17, CIKM'17
  • Reviewer: IEEE TNNLS, IEEE TKDE, ACM TKDD

Contact

  • Email: %s@osu.edu % 'su.809'