Foundations of Data Modeling

Department of Computer Science
Spring 2019

Data management arose from, and continues to be driven by, application needs for storing, accessing, and maintaining data consistently. Data modeling refers to the activity of organizing data to facilitate these functions. Early approaches to data modeling were physically inspired (network and hierarchical structures), and management system implementations were ad hoc. The emergence of the relational data model in the 1970s brought a sweeping change to data modeling and spawned many database research areas. The goal of this course is to explore fundamental concepts and learn fundamental techniques in data modeling concerning query languages, query optimization, and possibly other topics.
In the database literature, query languages (including algebraic, calculus, and deductive paradigms) for the relational, nested relational, and object-oriented data models have been well studied and are well understood. These models and languages incorporate many concepts and techniques from logic, artificial intelligence, and programming languages into the database framework. Based on the relational modeling approach, this course explores fundamental techniques and advanced features of data access languages. For the relational languages, we will examine how different structures and operations affect query processing and optimization in terms of complexity (lower and upper bounds) and expressiveness. For advanced information systems, we will discuss some recent developments in data modeling and related query optimization issues.

Instructor: Professor Jianwen Su, Department of Computer Science

Units: 4

Lectures: T & Th 9:00am-10:50am, Phelps 2510

Recommended textbook/references:

Course requirements:

  1. Approximately 3 to 4 homework assignments
  2. One exam
  3. Active participation

Online information: Gauchospace

Tentative course outline:
  1. Relational data model and relational query languages
  2. Recursive languages and Datalog
  3. Evaluation and optimization of Datalog programs
  4. Selected topics from data integration, incomplete information, consistent query answering, and structured and semi-structured data (nested relations and XML)