Summary of C4.5 Algorithm: Data Mining Essay (Critical Writing)


C4.5 is an algorithm for building decision trees in which a node may have any number of branches. Because it works only with a discrete dependent attribute, it can solve only classification tasks. C4.5 is one of the best-known and most widely used algorithms for generating decision trees. The following requirements must be met to work with the C4.5 algorithm:

  1. Each record in the data set should be associated with one of the predefined classes; that is, one of the attributes should serve as the class label. Each sample must belong to exactly one class, otherwise errors are inevitable.
  2. The classes should be discrete: a sample either belongs to a given class or it does not.
  3. The number of classes should be much smaller than the number of samples in the data set under consideration.

One should keep in mind that the C4.5 algorithm works slowly on very large data sets.

Like the ID3 algorithm, C4.5 builds decision trees from a set of data using the concept of information entropy. The files that C4.5 reads and writes have names of the form filestem.ext, where filestem is the file name and ext is an extension that identifies the file type. To work with the program, one needs at least two files: the first (filestem.names) contains the definitions of the classes and attributes, and the second (filestem.data) contains the data, a set of objects each described by its attribute values and its class. A decision tree built by C4.5 is either a leaf, which identifies a class, or a decision node with a number of branches and subtrees, one for each possible outcome of the test (Quinlan 5).
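The entropy-based choice of a test can be illustrated with a minimal Python sketch. The essay names only information entropy; the gain ratio used here is the splitting criterion usually attributed to C4.5, and the toy records, attribute names, and function names are assumptions made for illustration, not part of Quinlan's program.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attr, target):
    """Information gain from splitting rows on attr, divided by the split information."""
    total = len(rows)
    # Group the class labels by the value each record takes for attr.
    groups = {}
    for row in rows:
        groups.setdefault(row[attr], []).append(row[target])
    base = entropy([row[target] for row in rows])
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = sum(-(len(g) / total) * log2(len(g) / total) for g in groups.values())
    gain = base - remainder
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two discrete attributes and a discrete class "play".
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

# The attribute with the highest gain ratio becomes the test at the decision node.
best = max(["outlook", "windy"], key=lambda a: gain_ratio(rows, a, "play"))
print(best)
```

In this toy set, splitting on "outlook" leaves two of the three subsets pure, so it scores higher than "windy" and would be chosen as the root test.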

The algorithm can generate decision trees in two ways: batch mode and iterative mode. Batch mode, the default, generates a single decision tree from all the available data. Iterative mode starts from a randomly selected subset of the data. A decision tree is generated from this subset, and the objects that the tree misclassifies are then added to the subset.

These steps are repeated, with a new tree built each time, until the tree classifies the remaining objects correctly or no further progress is made. Because iterative mode starts from a randomly selected subset, several trials can be run to generate different decision trees from the same data. Since multiple trials can produce many different trees, the file filestem.unpruned is needed to collect the trees generated along the way. If the same data is used to generate decision trees again, the most recent version of the tree is used. The program saves the best generated decision tree in the file filestem.tree.
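The iterative procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not Quinlan's code: the function and parameter names are hypothetical, a one-attribute lookup table stands in for a real C4.5 tree, and the "no progress" condition is simplified to a cap on the number of rounds.

```python
import random
from collections import Counter


def windowing(records, train, predict, label, window_size, max_rounds=10, seed=0):
    """Grow a training subset until the model classifies all other records correctly."""
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    window = set(indices[:window_size])  # randomly selected initial subset
    model = None
    for _ in range(max_rounds):
        model = train([records[i] for i in sorted(window)])
        # Records outside the subset that the current model gets wrong.
        missed = [i for i in indices
                  if i not in window and predict(model, records[i]) != records[i][label]]
        if not missed:
            break  # everything outside the subset is classified correctly
        window.update(missed)  # add the misclassified objects and rebuild
    return model, len(window)


def train_stump(rows, attr="outlook", label="play"):
    """Stand-in learner (assumption): majority class for each value of one attribute."""
    by_value = {}
    for row in rows:
        by_value.setdefault(row[attr], Counter())[row[label]] += 1
    default = Counter(row[label] for row in rows).most_common(1)[0][0]
    leaves = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
    return {"attr": attr, "default": default, "leaves": leaves}


def predict_stump(model, row):
    """Look up the leaf for the row's attribute value, falling back to the default class."""
    return model["leaves"].get(row[model["attr"]], model["default"])


# Hypothetical data in which the class depends only on "outlook".
records = [{"outlook": o, "play": "no" if o == "sunny" else "yes"}
           for o in ["sunny", "overcast", "rain"] * 4]

model, used = windowing(records, train_stump, predict_stump, "play", window_size=3)
print(used, "of", len(records), "records were needed for training")
```

Running several trials with different seeds would give different initial subsets and possibly different models, which is the situation the text describes for multiple trials.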

Works Cited

Quinlan, John Ross. C4.5: Programs for Machine Learning. Burlington: Morgan Kaufmann, 1993.
