Summary of C4.5 Algorithm: Data Mining Essay (Critical Writing)


C4.5 is an algorithm that builds decision trees whose nodes may have any number of branches. It works only with a discrete dependent attribute, which is why it can solve only classification tasks. C4.5 is considered one of the best-known and most widely used algorithms for generating decision trees. The following requirements must be met in order to work with the C4.5 algorithm:

  1. Each record in the data set should be associated with one of the predefined classes; that is, one of the attributes should serve as the class label. Every sample must therefore carry a label from this set of classes; otherwise classification errors are inevitable.
  2. The classes should be discrete. Each sample should belong to exactly one of the classes.
  3. The number of classes should be much smaller than the number of samples in the data set under consideration.
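The three requirements above can be checked before training. The following is only a minimal sketch: the function name `check_training_set` and the `min_ratio` threshold are hypothetical, and the essay gives no numeric meaning for "much smaller", so the factor of 10 is an arbitrary assumption.

```python
def check_training_set(labels, classes, min_ratio=10):
    """Return a list of problems found; an empty list means the checks pass.

    labels    -- one class label per record (None if the label is missing)
    classes   -- the predefined, discrete set of class names
    min_ratio -- assumed (arbitrary) minimum number of records per class
    """
    problems = []
    allowed = set(classes)
    for i, label in enumerate(labels):
        if label is None:
            # Requirement 1: every record needs a class label.
            problems.append(f"record {i} has no class label")
        elif label not in allowed:
            # Requirement 2: the label must be one of the discrete classes.
            problems.append(f"record {i} has undeclared class {label!r}")
    if len(labels) < min_ratio * len(allowed):
        # Requirement 3: far fewer classes than records.
        problems.append(f"only {len(labels)} records for {len(allowed)} classes")
    return problems
```

Calling `check_training_set(["yes", "no"] * 10, ["yes", "no"])` returns an empty list, because every record carries a declared label and there are ten records per class.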

One should keep in mind that the C4.5 algorithm works slowly on very large data sets.

Like the ID3 algorithm, C4.5 builds decision trees from a set of data using the concept of information entropy. The files read and written by C4.5 have names of the form filestem.ext, where filestem is the file name and ext is an extension that identifies the file type. To work with the program, one needs at least two files: the first defines the names of the classes and attributes, and the second holds the data, a set of objects each described by its attribute values and its class. A decision tree built by C4.5 is either a leaf, which identifies a class, or a decision node with a number of branches and subtrees, one for each possible outcome of the test (Quinlan 5).
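To make the role of information entropy concrete, the sketch below computes the entropy of a set of class labels and scores one discrete attribute as a candidate split. The essay does not spell out the splitting criterion; the gain ratio used here is the criterion usually associated with C4.5 and is included as an assumption about what is meant. The function names `entropy` and `gain_ratio` are illustrative only.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Score a discrete attribute as a split.

    values -- the attribute value of each record
    labels -- the class label of each record (same order as values)
    """
    n = len(labels)
    # Group the class labels by attribute value (one group per branch).
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy that remains after splitting on the attribute.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0
```

An attribute whose values separate the classes perfectly, such as `gain_ratio(["a", "a", "b", "b"], ["yes", "yes", "no", "no"])`, gets the highest score, while an attribute that leaves every branch as mixed as the whole set scores zero.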

The algorithm can generate decision trees in two ways: batch mode and iterative mode. Batch mode (the default) generates a single decision tree using all the available data. Iterative mode relies on random selection: a subset of the data is selected at random, a decision tree is generated from it, and the objects that this tree misclassifies are then added to the subset.

These steps are repeated, and a new tree is built each time, until all the objects are classified correctly or it turns out that no further progress is being made. Because iterative mode starts from a randomly selected subset, several trials may be run on the same data, each producing a possibly different decision tree. For this reason the file filestem.unpruned is needed: it collects the decision trees generated along the way. If the same data is used to generate decision trees again, the most recent version of the tree is used. The program saves the best generated decision tree in the file filestem.tree.
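The iterative mode described above can be sketched as a loop over a growing subset. This is not the C4.5 program itself: `train_stump` is a hypothetical one-level stand-in for the real tree builder, `windowed_fit` and its parameters are illustrative names, and the `max_rounds` cap is an assumed substitute for the "no further progress" check.

```python
import random
from collections import Counter, defaultdict


def train_stump(rows, labels):
    """Toy stand-in for the tree builder: a one-level tree on the
    attribute that makes the fewest errors on the given records."""
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for attr in range(len(rows[0])):
        by_value = defaultdict(Counter)
        for row, label in zip(rows, labels):
            by_value[row[attr]][label] += 1
        # Each attribute value predicts its majority class.
        mapping = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(mapping[r[attr]] != y for r, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, attr, mapping)
    _, attr, mapping = best
    return lambda row: mapping.get(row[attr], default)


def windowed_fit(rows, labels, window_size, build=train_stump,
                 max_rounds=20, seed=0):
    """Build a model from a random subset, add the misclassified
    records to the subset, and repeat. Returns (best model, its errors)."""
    rng = random.Random(seed)
    window = set(rng.sample(range(len(rows)), window_size))
    best_model, best_errors = None, len(rows) + 1
    for _ in range(max_rounds):
        idx = sorted(window)
        model = build([rows[i] for i in idx], [labels[i] for i in idx])
        # Records outside the subset that the current model gets wrong.
        missed = [i for i in range(len(rows))
                  if i not in window and model(rows[i]) != labels[i]]
        errors = sum(model(r) != y for r, y in zip(rows, labels))
        if errors < best_errors:
            best_model, best_errors = model, errors
        if not missed:
            break  # nothing outside the subset is misclassified
        window.update(missed)
    return best_model, best_errors
```

Running `windowed_fit` with different `seed` values corresponds to the multiple trials mentioned above: each trial starts from a different random subset and may end with a different tree, and only the best one is kept.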

Works Cited

Quinlan, John Ross. C4.5: Programs for Machine Learning. Burlington: Morgan Kaufmann, 1993.
