CSE622

Course CSE622
Title Advanced Database Systems
Credits 3
Course Coordinator

Dr. Anita Wasilewska

Description

The course will cover selected topics on the cutting edge of database technology, such as deductive database query languages and systems, object-oriented data models, persistent programming languages, heterogeneous databases, and advanced transaction models.

Course Outcomes
  • Data Mining, also called Knowledge Discovery in Databases (KDD), is a new multidisciplinary field. It brings together research and ideas from database technology, machine learning, neural networks, statistics, pattern recognition, knowledge-based systems, information retrieval, high-performance computing, and data visualization. Its main focus is the automated extraction of patterns representing knowledge implicitly stored in large databases, data warehouses, and other massive information repositories.
  • The course will closely follow the textbook. It is designed to give a broad yet in-depth overview of the Data Mining field and to examine the most recognized techniques in more rigorous detail.
Textbook

Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001.

Major Topics Covered in Course
  • General overview: 
    What is Data Mining; which data and what kinds of patterns can be mined.
  • Data Warehouse and OLAP technology for Data Mining.
  • Data preprocessing:
    Data cleaning; data integration and transformation; data reduction; discretization and concept hierarchy generation.
  • Data Mining primitives, languages and system architectures.
  • Concept descriptions:
    Characteristic and discriminant rules; data generalization.
  • Mining association rules in large databases; transactional databases and the Apriori algorithm.
  • Classification and prediction: 
    Decision tree induction; rough sets; Bayesian classification; classification based on concepts from association rule mining; classifiers; genetic algorithms.
  • Cluster analysis; a categorization of major clustering methods.
Laboratory
Course Webpage