Introduction
Project management is becoming an integral part of decision making in the world today, and decision tree analysis plays an important role in helping managers accomplish this daunting task. The technique is applicable where decisions must be made under uncertainty. When a manager has to choose from a range of available options, the outcome of each choice is often ambiguous and difficult to know in advance. A decision tree is therefore used to analyze which solution is best suited for implementation. Decision trees are an effective way of making decisions because they set out the problem clearly and allow a full analysis of the risks involved in a decision. Another benefit of decision trees is that they help in making the most feasible decision based on existing information (Mind Tools, 2011).
Result of a decision tree analysis
In performing a decision tree analysis, the resulting solution is a classification.
Explanation
The resulting solution is a classification because the data set used in a decision tree is known to the user. Decision tree analysis works as classification since the data set carries a known class assignment. In carrying out the analysis, classification proceeds in two steps. First, a specific model is built from the known data, and then new data is assigned to classes using that model (Mind Tools, 2011).
Describe and discuss the concept of clustering, including its use with respect to data mining
Clustering is a process used in sorting data into groups called clusters, based on a measure of distance between elements, so that elements in the same group lie close to one another. In establishing clusters, the number of clusters is often a given number known to the user. Clustering has several types, such as hierarchical clustering, which creates a tree-like structure that represents data as a hierarchy of clusters. At the root is a single cluster that represents all the observations made.
Hierarchical clustering commonly uses agglomerative algorithms in analyzing data. The analysis takes each element as a separate cluster of its own and then successively merges the clusters into larger ones. Another type is subspace clustering, which looks for clusters that are visible in a projection of the data. This type of clustering discards most of the irrelevant attributes of the data. Clustering techniques in data mining fall into a group called undirected data mining tools, whose main objective is to establish the structure in the data as a whole. Clustering techniques help to combine observed examples into clusters according to two main criteria (Athman, 1996).
- Within each cluster, the examples are assumed to be similar to one another, so that the cluster is homogeneous.
- Each cluster is distinct, and its examples differ from those in other clusters.
Each clustering technique represents its clusters in its own way. The representations that can be applied are:
- Exclusive representation, in which each example belongs to exactly one cluster.
- Probabilistic representation, in which examples belong to clusters with a given probability.
- Hierarchical representation, in which clusters are arranged in a hierarchy.
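The agglomerative procedure described above, in which every element starts as its own cluster and the closest clusters are merged step by step, can be sketched in a few lines of Python. This is only an illustrative sketch: the sample points, the function names, and the choice of single linkage (distance between the closest pair of points) are assumptions made for the example and do not come from the cited sources.

```python
from itertools import combinations
from math import dist


def single_linkage(a, b):
    """Distance between two clusters: the closest pair of points."""
    return min(dist(p, q) for p in a for q in b)


def agglomerate(points, k):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    # Start with every observation as its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: single_linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        # i < j always holds here, so removing j leaves index i valid.
        clusters[i] = clusters[i] + clusters.pop(j)
    return clusters


# Hypothetical 2-D observations that form two visible groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.3, 7.9)]
clusters = agglomerate(points, k=2)
for cluster in clusters:
    print(cluster)
```

Stopping at k clusters corresponds to cutting the hierarchy at one level. Letting the loop run until a single cluster remains would reach the root that represents all the observations.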
Uses of Clustering in data mining
Clustering in data mining is very useful since it can identify groups with similar characteristics and provide starting points for exploring further relations within a cluster. The technique is well suited to data mining since it supports population segmentation models. Another use of clustering is determining the attributes of a segment in relation to a desired outcome. For example, market segments can be examined for their buying habits and compared in order to target a new sales campaign (Ian, 2007).
Describe in detail, the differences between classifying and predicting. Provide examples based on scholarly research
Classification
Classification is a method of constructing models that describe data classes. In data mining, classification is a function that assigns items in a collection to target classes. The main objective of classification is to predict the target class for each case in the data. A classification task starts from a data set in which the class assignments are known, and it proceeds in two steps. The first step is creating a specific model through the evaluation of training data. The second step is applying the developed model to new data (Chapple, 2011).
Example
An example of classification is a model that predicts credit risk based on data collected about loan applicants over a period of time. The data might track an applicant's years of employment, how many types of investments the applicant owns, and how many years he or she has resided in a given area. The target in this scenario is the credit rating, and the data associated with each customer is a case. Another example of classification is grouping students according to the grades they attained, such as grade A, grade B, or grade C.
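The two steps of classification, building a model from cases with known classes and then applying it to new data, can be illustrated with the credit example. The sketch below uses a nearest-centroid classifier, chosen here only because it is short; it is not a decision tree and not the method of any cited source. All figures, labels, and function names are hypothetical.

```python
from collections import defaultdict
from math import dist


def train(cases):
    """Step 1: build a model (one centroid per class) from labelled cases."""
    grouped = defaultdict(list)
    for features, label in cases:
        grouped[label].append(features)
    # The centroid is the per-attribute mean of all cases in a class.
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def classify(model, features):
    """Step 2: assign a new case to the class with the nearest centroid."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical training cases:
# (years employed, number of investments, years at address) -> credit rating
training = [
    ((10, 4, 8), "low risk"),
    ((12, 3, 10), "low risk"),
    ((1, 0, 1), "high risk"),
    ((2, 1, 0), "high risk"),
]
model = train(training)
rating = classify(model, (9, 2, 7))
print(rating)
```

Here each tuple of attributes is one case and the credit rating is the target, matching the terms used in the example above.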
Prediction
Prediction is a method used in data mining to predict missing or unavailable numerical data values. Classification and prediction are similar in that both are predictive methods. They differ in that classification predicts the class label of data, while prediction estimates a numerical value that is missing. The major method used in prediction is regression.
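Since regression is named as the major method of prediction, a minimal sketch of simple linear regression may help. It fits a straight line to known numerical values by ordinary least squares and uses the line to estimate a missing value. The data and the function name are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical records in which y is known for x = 1..4 but missing for x = 5.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

Unlike the classification case, the result is a number on a continuous scale rather than a class label.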
Example
An example of prediction is forecasting voter preferences. The voters give their opinions about a candidate, and a prediction is then made based on those opinions (Mind Tools, 2011).
Decision Tree
An example of a decision tree is one drawn for a product that is retailing in the market. A decision about the product is vital, and the options are to consolidate the existing product or to develop a new one. In the first stage, the decision tree is drawn and checked to confirm that all outcomes have been considered. The second stage in decision tree analysis is an evaluation of the tree. This is done by weighing which option best applies to the situation at hand and estimating the worth of each outcome. The third stage is calculating the values of the tree, working out the values of the outcomes that matter for decision-making. The next stage is calculating the value of the uncertain outcome nodes, and the last is calculating the value associated with the decision nodes. After the benefit of each decision has been calculated, the option with the largest benefit is the one selected (Project Management Institute, 2004).
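The calculation stages described above can be sketched in Python. Under the usual expected-value approach, an uncertain node is worth the probability-weighted average of its branches, and a decision node is worth its best option after costs. Every number below (costs, probabilities, and outcome values), along with the dictionary layout and function names, is a hypothetical assumption made for this sketch and does not come from the cited guide.

```python
def node_value(node):
    """Roll the tree back from the outcomes toward the root."""
    kind = node["kind"]
    if kind == "outcome":
        # Leaf: the estimated worth of this outcome.
        return node["value"]
    if kind == "chance":
        # Uncertain node: probability-weighted average of its branches.
        return sum(p * node_value(child) for p, child in node["branches"])
    if kind == "decision":
        # Decision node: value of the best option after subtracting its cost.
        return max(node_value(child) - cost for _, cost, child in node["options"])
    raise ValueError(f"unknown node kind: {kind}")


def best_option(decision):
    """Return (name, net value) of the option with the largest net value."""
    return max(
        ((name, node_value(child) - cost) for name, cost, child in decision["options"]),
        key=lambda item: item[1],
    )


def outcome(value):
    return {"kind": "outcome", "value": value}


# Hypothetical figures (in thousands) for the product decision.
tree = {
    "kind": "decision",
    "options": [
        ("develop new product", 100,
         {"kind": "chance", "branches": [(0.5, outcome(400)), (0.5, outcome(50))]}),
        ("consolidate", 20,
         {"kind": "chance", "branches": [(0.75, outcome(120)), (0.25, outcome(40))]}),
    ],
}
name, value = best_option(tree)
print(name, value)
```

With these assumed figures, developing a new product has an expected value of 225 and a net value of 125, while consolidating has an expected value of 100 and a net value of 80, so the first option is selected.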
Conclusion
Project decisions often become complicated since their implications for the outcome of a project are not certain. Each decision made in project management may have a clearly set solution, yet it can have severe consequences if the outcome of the decision is not well projected. In dealing with uncertainties in project management, the use of probability in decision tree analysis is of vital importance (Chapple, 2011).
References
Athman, B. (1996). IEEE Transaction on Knowledge and Data Engineering, Volume 8, No. 2. Web.
Chapple, M. (2011). Clustering. About.com Guide. Web.
Ian, D. (2007). Knowledge Discovery and Data Mining: Challenges and Realities. New York. Hershey. Web.
Mind Tools. (2011). Decision Trees: Choosing by projecting “expected outcomes”. Web.
Project Management Institute. (2004). A guide to the project management body of knowledge. Newtown Square, PA. Web.