Techniques for Data Integration

Department of Computer Science
Winter 2016

Enterprise systems are large-scale software applications that support business operations; they typically include software systems for data management, business process/workflow management, information flows, reporting, and data analytics. Focusing only on the data management aspect, a typical enterprise has to struggle with many data integration difficulties, since its data are usually spread across many database systems, workflow systems, file systems, etc., in a variety of forms and possibly with no coherent semantics. In this course, we plan to discuss some fundamental data modeling and manipulation techniques that will be useful in tackling these data integration problems. Topics covered include conjunctive queries, Datalog, data integration frameworks (GAV, LAV, GLAV), data exchange formalisms, and views and updates.

Instructor: Professor Jianwen Su, Department of Computer Science

Units: 4

Lectures: Mondays and Wednesdays, 9:00am-10:50am, Harold Frank Hall 1132

Recommended textbook/references:

Course requirements:
  1. Approximately 3-4 homework assignments
  2. One exam
  3. A course project
  4. Active participation

Course homepage:

Tentative course outline:
  1. Conjunctive queries
  2. Deductive databases (Datalog and optimization)
  3. Data integration frameworks (GAV, LAV, GLAV)
  4. Data exchange
  5. Updates and views

Homework/Project Assignments

Lecture Notes