

Integration of a Data Mining System with a Data Warehouse




A data mining system is integrated with a database or data warehouse system so that it can perform its tasks effectively. It operates in an environment where it must communicate with other data systems, such as a database or data warehouse system.

There are several possible integration (coupling) schemes:

  • No Coupling
  • Loose Coupling
  • Semi-Tight Coupling
  • Tight Coupling

No Coupling

No coupling means that a Data Mining system does not use any function of a Database or Data Warehouse system.

It may fetch data from a particular source (such as a file system), process data using some data mining algorithms, and then store the mining results in another file.
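The file-in, file-out flow described above can be sketched as follows. This is only an illustration under stated assumptions: the file names, the `mine_item_counts` helper, and the choice of simple item counting as the "mining algorithm" are hypothetical and not taken from the original text.

```python
import csv
import os
import tempfile
from collections import Counter


def mine_item_counts(input_path, output_path):
    """Count how often each item occurs across the transactions in a flat file."""
    counts = Counter()
    with open(input_path, newline="") as src:
        for row in csv.reader(src):
            # With no coupling, the mining system does its own cleaning:
            # trim whitespace, normalize case, and drop empty cells.
            items = {cell.strip().lower() for cell in row if cell.strip()}
            counts.update(items)

    # The mining results are written to another plain file.
    with open(output_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["item", "count"])
        for item, count in sorted(counts.items()):
            writer.writerow([item, count])
    return dict(counts)


# Hypothetical sample data, one transaction per line.
workdir = tempfile.mkdtemp()
input_path = os.path.join(workdir, "transactions.csv")
output_path = os.path.join(workdir, "item_counts.csv")
with open(input_path, "w", newline="") as f:
    f.write("bread,milk\nbread, butter\nMilk,bread,\n")

no_coupling_counts = mine_item_counts(input_path, output_path)
print(no_coupling_counts)
```

Note that finding, cleaning, and transforming the data all happen inside the mining code itself, which is the cost this scheme carries.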

Drawbacks of No Coupling

  • First, without using a Database/Data Warehouse system, a Data Mining system may spend a substantial amount of time finding, collecting, cleaning, and transforming data.
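  • Second, there are many tested, scalable algorithms and data structures implemented in Database and Data Warehouse systems. A Data Mining system with no coupling cannot take advantage of them and has to reimplement equivalent functionality itself.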

Loose Coupling

In loose coupling, the data mining system uses some facilities and services of a database or data warehouse system. The data is fetched from a data repository managed by these (DB/DW) systems.

Data mining algorithms then process the data, and the mining results are saved either in a file or in a designated area of the database or data warehouse.

Loose coupling is better than no coupling because it can fetch any portion of data stored in Databases or Data Warehouses by using query processing, indexing, and other system facilities.
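A minimal sketch of loose coupling, assuming an in-memory SQLite database stands in for the DB/DW system. The `purchases` and `mined_pairs` tables and the pair-counting step are hypothetical examples: the database selects the portion of data, the mining runs in application memory, and the results go back into a designated table.

```python
import sqlite3
from collections import Counter
from itertools import combinations

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (txn_id INTEGER, region TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, "east", "bread"), (1, "east", "milk"),
        (2, "east", "bread"), (2, "east", "milk"), (2, "east", "butter"),
        (3, "west", "bread"), (3, "west", "jam"),
    ],
)

# Step 1: let the database's query processing fetch only the needed portion.
rows = conn.execute(
    "SELECT txn_id, item FROM purchases WHERE region = ? ORDER BY txn_id, item",
    ("east",),
).fetchall()

# Step 2: the mining itself runs outside the database, in application memory.
baskets = {}
for txn_id, item in rows:
    baskets.setdefault(txn_id, []).append(item)

pair_counts = Counter()
for items in baskets.values():
    pair_counts.update(combinations(sorted(items), 2))

# Step 3: store the mining results in a designated area of the database.
conn.execute("CREATE TABLE mined_pairs (item_a TEXT, item_b TEXT, support INTEGER)")
conn.executemany(
    "INSERT INTO mined_pairs VALUES (?, ?, ?)",
    [(a, b, n) for (a, b), n in pair_counts.items()],
)
conn.commit()

top_pair = conn.execute(
    "SELECT item_a, item_b, support FROM mined_pairs "
    "ORDER BY support DESC, item_a, item_b LIMIT 1"
).fetchone()
print(top_pair)
```

Because step 2 holds every fetched basket in memory, this design inherits the memory limits of the application rather than the scalability of the database.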

Drawbacks of Loose Coupling

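  • It is difficult for loose coupling to achieve high scalability and good performance with large data sets, because many loosely coupled mining systems are memory-based and do not exploit the data structures and query optimization methods that DB/DW systems provide.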

Semi-Tight Coupling

Semi-tight coupling means that, besides linking a Data Mining system to a Database/Data Warehouse system, efficient implementations of a few essential data mining primitives are provided in the DB/DW system. These primitives can include sorting, indexing, aggregation, histogram analysis, multiway join, and precomputation of some essential statistical measures, such as sum, count, max, min, and standard deviation.
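One way to picture these primitives is to push aggregation and precomputation into the database and let the mining code read only the summaries. The sketch below assumes SQLite as the DB/DW system; the `sales` and `sales_stats` tables and the `region_summary` helper are hypothetical. SQLite has no built-in standard deviation, so the example precomputes the sum of squares and derives the population standard deviation from it.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 20.0), ("east", 30.0), ("west", 5.0), ("west", 15.0)],
)

# Indexing and aggregation are carried out by the database engine.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")

# Precompute essential statistical measures once and store them in the database.
conn.execute(
    """
    CREATE TABLE sales_stats AS
    SELECT region,
           COUNT(*)             AS n,
           SUM(amount)          AS total,
           MIN(amount)          AS lo,
           MAX(amount)          AS hi,
           SUM(amount * amount) AS total_sq
    FROM sales
    GROUP BY region
    """
)


def region_summary(conn, region):
    """Read the precomputed measures; only the final arithmetic runs in Python."""
    n, total, lo, hi, total_sq = conn.execute(
        "SELECT n, total, lo, hi, total_sq FROM sales_stats WHERE region = ?",
        (region,),
    ).fetchone()
    mean = total / n
    # Population variance from the precomputed sums, clamped against rounding error.
    variance = max(total_sq / n - mean * mean, 0.0)
    return {"count": n, "sum": total, "min": lo, "max": hi,
            "mean": mean, "std": math.sqrt(variance)}


east = region_summary(conn, "east")
print(east)
```

The mining code never scans the raw `sales` rows here; it reuses the stored measures each time they are needed.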

Advantage of Semi-Tight Coupling

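  • Because these primitives, and some frequently used intermediate mining results, are precomputed or computed efficiently inside the DB/DW system, this coupling enhances the performance of Data Mining systems.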

Tight Coupling

Tight coupling means that a Data Mining system is smoothly integrated into the Database/Data Warehouse system. The data mining subsystem is treated as one functional component of an information system. Data mining queries and functions are optimized based on mining query analysis, data structures, indexing schemes, and query processing methods of a DB or DW system.
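A full tightly coupled system is beyond a short example, but one aspect of it can be approximated: making a mining function callable from inside the query language. The sketch below assumes SQLite and registers a hypothetical `most_frequent` aggregate through Python's `sqlite3.Connection.create_aggregate`. It does not show the optimizer-level integration described above; it only illustrates mining logic invoked as part of a query instead of after it.

```python
import sqlite3
from collections import Counter


class MostFrequentItem:
    """Hypothetical mining aggregate: the most frequent value in each group."""

    def __init__(self):
        self.counts = Counter()

    def step(self, value):
        # Called by the database engine once per row in the group.
        if value is not None:
            self.counts[value] += 1

    def finalize(self):
        # Called once per group; ties are broken alphabetically.
        if not self.counts:
            return None
        return min(self.counts, key=lambda item: (-self.counts[item], item))


conn = sqlite3.connect(":memory:")
conn.create_aggregate("most_frequent", 1, MostFrequentItem)

conn.execute("CREATE TABLE purchases (region TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        ("east", "bread"), ("east", "milk"), ("east", "bread"), ("east", "butter"),
        ("west", "jam"), ("west", "jam"), ("west", "bread"),
    ],
)

# The mining function is used like any other SQL function, inside the query itself.
result = conn.execute(
    "SELECT region, most_frequent(item) FROM purchases "
    "GROUP BY region ORDER BY region"
).fetchall()
print(result)
```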

