A data mining task can be specified in the form of a data mining query, which is input to the data mining system. A data mining query is defined in terms of data mining task primitives. These primitives allow the user to interactively communicate with the data mining system during the mining process to discover interesting patterns.
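The data mining task primitives are: the set of task-relevant data to be mined, the kind of knowledge to be mined, the background knowledge to be used in the discovery process, the interestingness measures for evaluating patterns, and the representation for visualizing the discovered patterns.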
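To make the idea of a query defined by primitives concrete, here is a minimal sketch in Python. The class name MiningQuery, its field names, and the threshold values are all hypothetical and chosen only for illustration; they are not the syntax of any real data mining system.

```python
from dataclasses import dataclass, field


@dataclass
class MiningQuery:
    """Hypothetical container with one field per task primitive."""
    relevant_data: dict          # which records and attributes to mine
    knowledge_kind: str          # the mining function, e.g. "association"
    background_knowledge: dict = field(default_factory=dict)  # e.g. concept hierarchies
    interestingness: dict = field(default_factory=dict)       # measure name -> threshold
    presentation: str = "rules"  # how discovered patterns are displayed


# Example values are invented; a real query would use the system's own vocabulary.
query = MiningQuery(
    relevant_data={"country": "Canada", "attributes": ["age", "income", "item"]},
    knowledge_kind="association",
    interestingness={"support": 0.05, "confidence": 0.7},
)
print(query.knowledge_kind, query.presentation)
```

Grouping the primitives in one record keeps each part of the query explicit, so the user can change one primitive (for example, the interestingness thresholds) without restating the others.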
Set of task relevant data to be mined
This specifies the portions of the database or the set of data in which the user is interested.
This portion includes the database attributes or data warehouse dimensions of interest.
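For example, suppose that you are a manager of All Electronics in charge of sales in the United States and Canada, and you would like to study the buying trends of customers in Canada. Rather than mining the entire database, you can specify that only the data relating to customer purchases in Canada be retrieved, along with the attributes that describe those customers. These are referred to as relevant attributes.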
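The selection described above can be sketched as a filter on records followed by a projection onto the relevant attributes. The records, field names, and the choice of relevant attributes below are invented for illustration and do not come from any real All Electronics database.

```python
# Hypothetical sales records; the field names and values are invented.
sales = [
    {"customer": "C1", "country": "Canada", "age": 34, "item": "laptop", "store_id": 7},
    {"customer": "C2", "country": "USA", "age": 51, "item": "camera", "store_id": 2},
    {"customer": "C3", "country": "Canada", "age": 27, "item": "phone", "store_id": 7},
]

relevant_attributes = ["customer", "age", "item"]  # assumed attributes of interest


def select_relevant(records, country, attributes):
    """Keep only records for the given country, projected onto the chosen attributes."""
    return [
        {name: record[name] for name in attributes}
        for record in records
        if record["country"] == country
    ]


task_relevant = select_relevant(sales, "Canada", relevant_attributes)
for row in task_relevant:
    print(row)
```

Restricting the input this way means the mining step only sees the Canadian customers and the attributes that matter for the study.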
The kind of knowledge to be mined specifies the data mining functions to be performed, such as association analysis, classification, or clustering.
For instance, if studying the buying habits of customers in Canada, you may choose to mine associations between customer profiles and the items that these customers like to buy.
Users can specify background knowledge, or knowledge about the domain to be mined. This knowledge is useful for guiding the knowledge discovery process and for evaluating the patterns found. User beliefs about relationships in the data are another form of background knowledge.
There are several kinds of background knowledge. Concept hierarchies are a popular form of background knowledge, which allow data to be mined at multiple levels of abstraction.
Example:
An example of a concept hierarchy for the attribute (or dimension) age is shown in the following Figure.
In this hierarchy, the root node represents the most general abstraction level, denoted as all.
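A concept hierarchy can be sketched in code as a mapping from raw values to progressively more general concepts. In the sketch below, only the root level "all" is taken from the text; the intermediate group names and age ranges are assumptions made for illustration, as is the function name generalize_age.

```python
# Assumed intermediate levels; only the root "all" is taken from the text.
AGE_LEVELS = [
    (0, 39, "young"),
    (40, 59, "middle_aged"),
    (60, 200, "senior"),
]


def generalize_age(age, level):
    """Map a raw age to the concept at the requested abstraction level.

    Level 0 is the root ("all"), level 1 is the assumed age group,
    and level 2 is the raw value itself.
    """
    if level == 0:
        return "all"
    if level == 1:
        for low, high, label in AGE_LEVELS:
            if low <= age <= high:
                return label
        raise ValueError(f"age {age} is outside the hierarchy")
    return age


# Show the same three ages at each level, from most specific to most general.
for level in (2, 1, 0):
    print(level, [generalize_age(a, level) for a in (23, 45, 67)])
```

Mining at level 1 rather than level 2 groups many distinct ages into a few concepts, which is what allows patterns to be found at a higher level of abstraction.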
Interestingness measures are used to separate interesting patterns from uninteresting ones. They may be used to guide the mining process or, after discovery, to evaluate the discovered patterns. Different kinds of knowledge may have different interestingness measures.
For example, interestingness measures for association rules include support and confidence.
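For a rule A => B, support is the fraction of all transactions that contain both A and B, and confidence is the fraction of transactions containing A that also contain B. The sketch below computes both for a small set of transactions; the transactions and item names are invented for illustration.

```python
# Hypothetical transactions; each set lists the items bought together.
transactions = [
    {"computer", "software"},
    {"computer", "printer"},
    {"computer", "software", "printer"},
    {"phone"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of the antecedent."""
    joint = support(antecedent | consequent, transactions)
    return joint / support(antecedent, transactions)


# Evaluate the rule computer => software.
rule_support = support({"computer", "software"}, transactions)
rule_confidence = confidence({"computer"}, {"software"}, transactions)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

A mining system would typically compare these values against user-specified minimum thresholds and discard rules that fall below them.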
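The representation for visualizing discovered patterns refers to the form in which the patterns are to be displayed. Users can choose from different forms of knowledge presentation, such as rules, tables, charts, graphs, decision trees, and cubes.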