Ambuj K Singh

Department of Computer Science
Science and Engineering

3119 Engineering I
University of California at Santa Barbara

CA 93106-5110

Email: ambuj at

Office phone: (805)-893-3236
Main department phone: (805)-893-4321
Dept fax: (805)-893-8553





My research interests are broadly in the areas of network science, cheminformatics & bioinformatics, graph querying and mining, and databases (recent papers).

Network Science

Network science is an emerging scientific discipline that examines the interconnections among diverse physical or engineered networks, information networks, biological networks, cognitive and semantic networks, and social networks. The field seeks to discover the common principles that govern network behavior, along with algorithms and tools for studying it. The National Research Council defines Network Science as "the study of network representations of physical, biological, and social phenomena leading to predictive models of these phenomena." My group is developing the methodologies, algorithms, and implementations needed for scalable, dynamic, and resilient networks. Specific problems include querying composite networks, modeling dynamic networks, sentiment analysis, analysis of content and user behavior, discovering unusual patterns, and sampling in composite networks.

Graph Querying and Mining

A number of scientific endeavors are generating data that can be modeled as graphs: high-throughput genome analysis, screening of chemical compounds, social networks, and ecological networks and food webs. Mining and analysis of these annotated and probabilistic graphs are crucial for advancing scientific research, for accurately modeling and analyzing existing systems, and for engineering new systems. The goal of this research project is to develop a set of scalable querying and mining tools for graph databases by integrating techniques from the fields of databases, bioinformatics, machine learning, and algorithms.


Intensive investigations over several decades have revealed the functions of many individual genes, proteins, and pathways. There has been an explosion of data of widely diverse types, arising from genome-wide characterization of transcriptional profiles, protein-protein interactions, genomic structure, genetic phenotype, gene interactions, gene expression, and proteomics. We are developing techniques that can efficiently integrate and analyze data from multiple sources and models. One research thrust quantifies phenotypic variation using image analysis and pattern recognition tools, develops a causal model for gene regulatory processes, and validates the model experimentally. In another research thrust, high-resolution images of molecules and cells are being analyzed to understand complex systems, including the localization of specific neuron types, the branching patterns of dendritic trees, and the localization of molecules at the subcellular level. These efforts are being augmented by a unique distributed digital library of bio-molecular image data. Such searchable databases will make it easier to understand and interpret the data, leading to a more complete and integrated understanding of cellular structure, function, and regulation.

Data Mining in Chemoinformatics and Drug Discovery

The increased availability of large repositories of chemical compounds and other biochemical data has created new challenges and opportunities for data mining in chemical informatics and drug discovery: identification of active substructures and compounds, prediction of physicochemical properties and structure-activity relationships, diversity analysis of compound collections, drug repurposing, and pathway mining to identify the network fragments responsible for disease progression. My group has developed several graph-based and 3D-based methods for such analyses. These ideas are being pursued by Acelot, Inc., a local drug discovery startup.

Students and Projects

Current Research Group

Current Research Projects

Past Research Group

Current Funding

NIGMS, NIH: Integrative Modeling of Regulatory Processes using High-throughput Genetic Data

Computational Challenges in the Discovery and Understanding of Complex Biological Structures through Multimodal Imaging

Working with Uncertain Data in Exploring Scientific Images

Modeling, Querying, and Mining of Dynamic Graphs

Institute for Collaborative Biotechnologies

Network Science Collaborative Technological Alliance

Other Links


Center for BioImage Informatics