C4.5: Programs for Machine Learning Paperback – Oct 1 1992
From the Back Cover
Classifier systems play a major role in machine learning and knowledge-based systems, and Ross Quinlan's work on ID3 and C4.5 is widely acknowledged to have made some of the most significant contributions to their development. This book is a complete guide to the C4.5 system as implemented in C for the UNIX environment. It contains a comprehensive guide to the system's use, the source code (about 8,800 lines), and implementation notes. The source code and sample datasets are also available for download (see below).
C4.5 starts with large sets of cases belonging to known classes. The cases, described by any mixture of nominal and numeric properties, are scrutinized for patterns that allow the classes to be reliably discriminated. These patterns are then expressed as models, in the form of decision trees or sets of if-then rules, that can be used to classify new cases, with emphasis on making the models understandable as well as accurate. The system has been applied successfully to tasks involving tens of thousands of cases described by hundreds of properties. The book starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting. Advantages and disadvantages of the C4.5 approach are discussed and illustrated with several case studies.
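As a rough illustration of how a property that discriminates the classes might be picked, the sketch below scores nominal properties with the gain ratio criterion that C4.5 is generally described as using. It is not taken from the book's C source: the function names, the toy cases, and the restriction to nominal properties with no missing values are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a list of symbols."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(cases, attribute, target):
    """Information gain from splitting on a nominal attribute,
    divided by the entropy of the split itself (split information)."""
    total = len(cases)
    # Group the class labels by the value each case has for the attribute.
    subsets = {}
    for case in cases:
        subsets.setdefault(case[attribute], []).append(case[target])
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy([c[target] for c in cases]) - remainder
    split_info = entropy([c[attribute] for c in cases])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy cases: "outlook" separates the classes, "windy" does not.
cases = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]

# Pick the property with the highest gain ratio as the root test.
best = max(["outlook", "windy"], key=lambda a: gain_ratio(cases, a, "play"))
print(best)
```

A full tree learner would apply this choice recursively to each subset of cases; handling numeric properties, missing values, and pruning is left out of this sketch.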
This book and software should be of interest to developers of classification-based intelligent systems and to students in machine learning and expert systems courses.
About the Author
J. Ross Quinlan, University of New South Wales
Most Helpful Customer Reviews on Amazon.com (beta)
C5.0 and See5 are built on C4.5, which is open source and free. However, since C5.0 and See5 are commercial products, the code and the internals of the See5/C5 algorithms are not public. This is why this book is still so valuable. The first half of the book explains how C4.5 works and describes its features (for example, partitioning, pruning, and windowing) in detail. The book also discusses how C4.5 should be used, and potential problems with overfitting and non-representative data. The second half of the book gives a complete listing of the source code: 8,800 lines of C code.
C5.0 is faster and more accurate than C4.5 and has features that C4.5 lacks, such as cross-validation, variable misclassification costs, and boosting. However, since minor misuse of See5 could have cost our company tens of millions of dollars, it was important that we knew as much as possible about what we were doing, which is why this book was so valuable.
The reasons we did not use, for example, neural networks were:
(1) We had a lot of nominal data (in addition to numeric data)
(2) We had unknown attributes
(3) Our data sets were typically not very large and still we had a lot of attributes
(4) Unlike neural networks, decision trees and rule sets are human readable, possible to comprehend, and can be modified manually if necessary. Since we had problems with non-representative data but understood these problems as well as our system quite well, it was sometimes advantageous for us to modify the decision trees.
If you are in a similar situation, I recommend See5/C5 as well as this book.
Overall, it is a good book to learn about the C4.5 algorithm.